Share this page
Share this page E-mail this page Print this page RSS feeds
Home > News > Microsoft Research Silicon Valley Efforts Pay Distributed Dividends
Microsoft Research Silicon Valley Efforts Pay Distributed Dividends
By Rob Knies
May 23, 2008 3:00 AM PT

For Roy Levin and his colleagues at Microsoft Research Silicon Valley, rewarding work is never in short supply.

Roy Levin during SVC Road Show 2008
Roy Levin addresses a packed house during the SVC Road Show 2008 on May 22.

The lab focuses on distributed computing, a complex field with a broad array of hard problems to tackle. The facility’s 60 or so accomplished computer scientists have plenty to do, as hundreds of movers and shakers from the Bay Area technology community heard May 22 during the fourth biannual Silicon Valley Road Show, held on the Microsoft campus in Mountain View, Calif.

Levin, Microsoft distinguished engineer and director of Microsoft Research Silicon Valley, delivered a 25-minute keynote address that included a wry observation that underscored the challenges posed by the nature of the work his facility is addressing.

“In the modern world,” he remarked, “as [lab principal researcher] Leslie Lamport famously said, a distributed system is one in which a computer whose name you don’t even know can prevent you from getting your work done.”

The chuckles the line engendered hardly diminished the truth it represented. The degree of difficulty in the projects Microsoft Research Silicon Valley pursues is daunting, but Levin expressed full confidence that his researchers are up to the task.

“The expertise in the lab spans a quite considerable spectrum, from theory to practice,” he noted. “We have people who write lots of systems and debug them, we have people who prove theorems, and everything in between.”

Following introductory remarks by Dan’l Lewin, Microsoft corporate vice president of Strategic and Emerging Business Development, and a 20-minute overview by Rick Rashid, Microsoft Research senior vice president, Levin briefly sketched a portrait of his organization’s structure and focus before examining a handful of projects selected to provide a glimpse into the depth and the breadth of what the lab has to offer.

Levin described a research environment that thrives on a flat organizational structure that fosters cross-disciplinary collaboration.

“We don’t divide researchers into any particular formalized subgroups,” he said, “and that works out extremely well, producing results that wouldn’t happen if you focused your research within individual groups.”

But while subgroups at the lab don’t exist, there are focus areas, six of them:

  • Algorithms and Theory.
  • Distributed Systems.
  • Security and Privacy.
  • Software Tools.
  • System Architecture.
  • Web Search and Data Mining.

“Making these categorizations is a little bit arbitrary,” Levin stipulated, “and there is work we do that will fit into multiple categories, but at least it’s a way of thinking about what goes on.”

His sketch complete, he then gave examples of what goes on at Microsoft Research Silicon Valley―and how deep, theoretical studies can deliver real-life benefits for computer users worldwide.

10,000 TIMES FASTER

The first, finding the shortest paths in a graph, is a classical problem in computer science.

“The problem is focusing on very large graphs, a problem most of us are familiar with,” Levin said. “I’m in a car, and I want to drive from Point A to Point B. What’s the shortest route? Typically that would be in time, but it could be in distance, if I cared about that. That’s a problem that has a classical solution, but the classical solution is not particularly attractive when the graphs get as large as a road map of all the United States or all of Europe, for example.”

Some have tried heuristic techniques, which do make the computation faster, but they’re not always completely accurate, generating an approximate shortest path.

“Approximately might be OK in many cases,” Levin said, “but often, you want what’s provably optimal. The work that we’ve done here is really maintaining that prime consideration, that you get an optimal route but very, very fast, much faster than the traditional algorithms.”

In fact, the Shortest Path Algorithms project, which already has provided technology for Microsoft’s MapPoint products, provides results 10,000 times faster than the traditional approach. This is achieved by the use of “landmarks,” points on the periphery of the graph, and preprocessing techniques.

“We do a lot of precomputing to figure out distances between the source destination and these landmarks,” Levin explained, “and then exploit that precomputation to make the individual searches very fast, and it’s done in a way that can be implemented on a server or on a desktop or on a portable device, so that this becomes a real-time computation, which might be important to you as you’re driving down the freeway and you discover that the traffic is bad and you want to find an alternate route right now.”

MAXIMIZING MARKET EFFICIENCIES

Then there’s the lab’s burgeoning work on Electronic Market Design, which addresses marketplace competition, such as ad auctions for sponsored search, though online examples proliferate.

“Designing these algorithms in a way that, in economic terms, maximizes social welfare is actually a quite difficult problem,” Levin said, “and there are extremely subtle issues that arise in doing this. It is, in some sense, a kind of modern rocket science, in my view.”

The trick is to place elements, such as keyword-related ads, in the most efficient order.

“That turns out to be a quite unintuitive problem to solve,” he added, “and you get somewhat anomalous things happening if you do it in the obvious way. This is where having people who can deal with this stuff in a mathematically precise sense becomes extremely important, and that’s one of the areas that we’ve come to focus on.”

DOUBLY RELEVANT SEARCH

Another Web-related project is Query-Dependent Ranking, an attempt to improve relevance in Web queries. It’s a critical element in search.

“This is obviously a very important thing,” Levin said. “You go to a search engine, you type in a bunch of words, you get back a bunch of query results, and you’d very much like to have those ordered by relevance so that the ones at the top are the ones that are most important or interesting to you.”

Query results are ranked using algorithms, such as PageRank. That one, though, relies on the static structure of the Web graph, computing relative page interest regardless of query. Other attempts examine the query itself to try to improve on those results.

“There have been various algorithms designed where you take the query results and rank them based on some relationship of the pages that came back from the search engine and their relationship on the Web graph,” Levin added. “But actually implementing these things at scale and understanding how well they do better is fairly hard.”

The Query-Dependent Ranking project implements a specialized infrastructure to analyze a variety of such algorithms at scale, making it efficient to measure and to analyze algorithmic performance to see which ones work best.

In recent tests, this infrastructure has indicated that some search algorithms can double relevance performance.

“This is a huge improvement,” Levin said.”People rejoice for a few percent in this area. Something that actually doubles the relevance is extremely attractive, and what that tells us is that Query-Dependent Ranking really does work. Now you can afford to invest in building infrastructure that will support that on a production scale.”

PRIVATE MEANS PRIVATE

Keeping confidential information private has become a key concern in an era when individuals’ personal data often is hosted online, and a project called Privacy Integrated Queries is showing plenty of promise in being able to control access to such data.

“This issue of privacy is one that’s gotten a lot of attention, but it hasn’t gotten a lot of what I would call solid technical work,” Levin said. “We’re making some significant efforts to try to put a technical foundation under the notion of privacy. The problem in most systems that exist today is that the guarantees they offer for the privacy of individual data are too weak.”

That’s alarming for data providers, such as those who host medical data. They are bound by law to ensure that customers can obtain only the information they’re supposed to get.

“The idea here, in the work that we’re doing,” he said, “is to start over with a rigorous definition, rigorous in the mathematical sense, from first principles that are general enough to encompass not just what we see in the privacy world today, but what we can plausibly anticipate will happen in the future.

“Imagine that a query is thrown at the database and some result comes back. If the particular result that comes back is equally likely whether your data has been added to the database or not, then, in a sense, that result doesn’t really reveal anything about you. If the database could make that assurance to you, we think that would go a long way toward satisfying the desire for privacy.

“We think this is a first step in what will become a very important area for the modern world.”

QUICK SOLUTIONS

The Boolean satisfiability problem is another classic issue in computer science, a decision problem with no efficient general solutions. People have been working for decades on algorithms to solve this problem.

“But in special cases,” Levin said, “and, as it turns out, special cases arise a lot in practice, you can do a lot better. The work that we’ve done most recently looks at an old technique, which is to do some special-purpose hardware and couple it with general-purpose software as a way of getting a performance enhancement, and the particular way we’re doing this is with a custom FPGA [field-programmable gate array] arrangement hooked onto a standard CPU. But the architecture of the solution is a little unusual.”

Usually, the technique involves running a problem through a special compiler that creates code for a FPGA. It’s cumbersome, at best.

“What we’ve done,” Levin said, “is to build an architecture where you can solve sat problems and do this compilation process once per application. If you’re doing circuit verification, you do the compilation once, and now you have an engine that will deal with any instance of a circuit-verification problem. The result of that is that you can load problems into it very quickly and solve them very quickly.”

The project, called Accelerating SAT Solving, could prove a boon to financial markets.

“The solution,” he said, “enables you to solve problems that are an order of magnitude bigger than the ones that standard software solvers can do, five to 20 times faster—a very significant acceleration.”

BUGS BEWARE

Even the best-laid plans don’t always proceed according to intention.

“If you’ve ever had the occasion to try to work with or build a distributed system,” Levin said, “you know that lots of unexpected things happen.”

When a system is implemented on a collection of computers, they communicate via various protocols. Things can go awry.

But MoDist can help.

“This is a tool,” Levin said, “that goes in and intercepts, on each of those computers, the place where they call through a core interface, the Windows API in this case, and then there’s a controlling machine that basically intercepts all those requests and drives unexpected events back. It causes packets to be dropped, or it causes wrong answers to happen, or simply simulates a crash or whatever—and then observes what happens to the system in the process of doing this. And it explores those possibilities in a systematic way.”

The tool uses model checking to exploring the state space, along with guidance from the programmer about the system’s high-level properties, then checks the results for failure.

“This doesn’t give you a proof of correctness,” Levin said. “Model checking can never do that. But what it can do is to drive the system into a state that you might have a very difficult time generating by yourself with normal testing.

“You get a sense of a much more robust system as a result of having done that state-space exploration.”

The technique has found bugs in systems in production use with many computers for a couple of years, demonstrating that it can locate latent bugs.

“We want to make sure that we get those things out,” Levin said, “so that they don’t happen at, by Murphy’s Law, the worst possible time.”

DEPENDENCY-BASED DIAGNOSIS

This has happened to everybody: You try to do something on a computer that you’ve done a thousand times before, but this time it doesn’t work. You’re left scratching your head. What happened?

The Constellation project from Microsoft Research Silicon Valley attempts to automate solutions to that question.

“Manual solution is pretty unpleasant,” Levin said.”It typically involves getting down into the detailed packet, logs, something that’s really only for experts, and even then, it’s pretty puzzling sometimes to sort out what’s going on.”

Constellation uses machine learning to observe network behavior and, over time, builds a set of dependency inferences about what depends on what. When something goes wrong, Constellation can analyze its information to determine what’s broken.

“Constellation uses that graph of dependencies to probe particular computers,” Levin said, “bypassing all the ones it believes are not relevant, based on this dependency information, and analyzes what’s working and what’s not and gives you a nice little report that says, ‘Well, your e-mail failed because the e-mail server tried to talk to this folder server and the folder server tried to talk to a domain controller and it’s broken―something that would have taken you forever to figure out by hand if you’re an amateur and a long time if you’re an expert.

“Our idea is that this system can really be a very helpful diagnosis tool for anyone running a large network, which these days means just about every company.”

That’s just a sampling of the challenges that intrigue researchers at Microsoft Research Silicon Valley. It’s a formidable collection, but Levin wouldn’t have it any other way.

“We take a particular focus down here,” he said. “Distributed computing is still a big field.”