Microsoft Research hosted the 2009 TechFair on June 24 in Washington, D.C. Computer scientists from Microsoft Research demonstrated groundbreaking innovations under development in the organization’s six labs worldwide and gave guests an opportunity to interact with Microsoft Research executives and computer scientists to learn how they are helping to turn ideas into reality for Microsoft and for technology users around the world.
- Closed-Loop Control Systems for the Data Center
- Code Name Viveri: A Platform for Search Incubation
- Commute UX: Dialog System for In-Car Infotainment
- Dryad and DryadLINQ
- Interactions with an Omni-Directional Projector
- Kodu: Lightweight Programming for Kids
- Large-scale Spamming Botnet Detection
- PINQ: Privacy-preserving data analysis for privacy non-experts
- Recommendation Systems
- Research Desktop Activity
- Social Desktop
- Social Views of E-Mail
Closed-Loop Control Systems for the Data Center
Navendu Jain, Researcher, Microsoft Research, Redmond
This demonstration shows a closed-loop, adaptive control system for Live Search that aims to minimize energy usage while guaranteeing a service-level agreement (SLA) for search response time. Power is a central issue in the design and management of data centers. Power consumption accounts for as much as 30 percent of a data center’s operating costs. Idle machines consume a good fraction of this power: only about 7.5 percent of CPU cycles executed in Microsoft’s data centers perform useful work. Minimizing power usage during periods of low workload could save money. Applications deployed in the data centers, however, require a strong guarantee of their performance. For example, Live Search requires the response time of at least 99 percent of queries to be less than 300 milliseconds. A key challenge is to minimize power usage while meeting the desired SLAs. To address this challenge, we present an energy-aware prototype built using 100 low-power Atom processors that execute a Live Search benchmark with a scaled-down, 1-GB search index per node. To meet the 300-millisecond response time, we apply machine-learning techniques that model performance as a function of workload and that set power states (idle, sleep, hibernate) across nodes to save energy. Because transitions between different power states incur a latency of 15-30 seconds, our prototype provides a predictor module that switches processors to different power states in advance of workload transitions.
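The idea of switching power states ahead of workload transitions can be illustrated with a minimal Python sketch. It is not the prototype's implementation: it replaces the machine-learned performance model with an assumed fixed per-node capacity, and the names nodes_needed, plan_power_states, qps_per_node, headroom, and lead_steps are all hypothetical.

```python
import math


def nodes_needed(predicted_qps, qps_per_node, headroom=0.2, total_nodes=100):
    """Return how many nodes must be awake to serve a predicted load.

    qps_per_node is an assumed capacity at which one node still meets the
    response-time target; the demo learns this relationship instead.
    """
    if qps_per_node <= 0:
        raise ValueError("qps_per_node must be positive")
    needed = math.ceil(predicted_qps * (1 + headroom) / qps_per_node)
    # Always keep at least one node awake and never exceed the cluster size.
    return max(1, min(total_nodes, needed))


def plan_power_states(forecast_qps, qps_per_node, lead_steps=1, **kwargs):
    """Size the cluster for the peak load expected within the lead time.

    Waking a node takes time, so at step t we provision for the largest
    forecast in steps t .. t + lead_steps rather than for step t alone.
    """
    plan = []
    for t in range(len(forecast_qps)):
        window = forecast_qps[t : t + lead_steps + 1]
        plan.append(nodes_needed(max(window), qps_per_node, **kwargs))
    return plan


# Hypothetical forecast: a load spike at step 2 is provisioned for at step 1.
example_plan = plan_power_states(
    [100, 100, 500, 100], qps_per_node=50, lead_steps=1, headroom=0.0
)
```

In this sketch the nodes not counted in the plan would be the candidates for sleep or hibernate; choosing among those states is left out.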
Code Name Viveri: A Platform for Search Incubation
Scott Imig, Senior Software Design Engineer, Microsoft Corporation, Redmond
We demonstrate a search engine that serves as a platform for incubation of new search ideas and a showcase for search innovation. Experimental search interfaces can be too unusual or jarring to trial directly on unprepared users of our primary search engine, but refinement and improvement rely on interaction with real users. Our technology aggregates content from multiple sites, presenting diverse user-interface elements and types of information for each query, and enabling the exploration of multiple experimental ideas at once. A high-level API facilitates rapid deployment of existing technology, while Silverlight provides scalability and low latency.
We will present several experiments, including intelligent federation of search results through standards such as OpenSearch and RSS. Allowing the user to view topic-specific results together with general-purpose results communicates diverse and relevant information efficiently. In this way, web search evolves into an intelligent ecosystem, rather than a monolithic application.
Commute UX: Dialog System for In-Car Infotainment
Mike Seltzer, Researcher, Microsoft Research, Redmond
Y.C. Ju, Senior Research Software Design Engineer, Microsoft Research, Redmond
Following the deployment of Blue&Me for Fiat and Sync for Ford, in-car dialog systems are morphing from cool gadgets that amaze people and sell more cars to integral parts of in-car infotainment. This raises the bar for the functionality, usability, and reliability of these systems. The presented in-car infotainment system contains novel technologies from Microsoft Research that enable natural-language input; expose a multimodal user interface including speech, a GUI, touch, and buttons; and use state-of-the-art sound-capture and processing technologies for improved speech recognition and sound quality.
Dryad and DryadLINQ
Roger Barga, Principal Architect, Microsoft External Research, Redmond
In this demo we showcase efforts in MSR to collaborate with external researchers to explore the application of new technologies, specifically Dryad and DryadLINQ, to big data research problems in science. We also highlight our efforts to provide software and services to academics across the world, through the release of Dryad and DryadLINQ free of charge to the research community, along with associated programming guides, user documentation, and code libraries.
Dryad is a general-purpose distributed computing engine, more flexible than MapReduce or Hadoop, that was designed to simplify the task of implementing distributed applications on clusters of Windows computers. DryadLINQ is an abstraction layer that simplifies the process of implementing Dryad-based applications. Microsoft Research is acutely aware of the ubiquity of big data and the challenges it presents to researchers, and we are offering researchers the tools, resources, and collaboration to explore this new area.
Interactions with an Omni-Directional Projector
Andy Wilson, Senior Researcher, Microsoft Research, Redmond
Hrvoje Benko, Researcher, Microsoft Research, Redmond
We will present a combination of a standard projector with a wide-angle lens capable of projecting data onto the entire, 360-degree surrounding environment from a single position. This setup provides an immersive experience similar to existing, much more expensive planetarium projectors or Virtual Reality CAVE projectors, on which all of the surfaces in the room can receive projections. We have added an infrared camera that shares the wide-angle lens with the projector and is capable of detecting a user’s hands and tracking freehand gestures in mid-air, without additional gloves or tracking objects. This demo integrates several Microsoft technologies into a stunning presentation: We will offer a hemispherical dome in which users can interact with data from Virtual Earth and WorldWide Telescope.
Kodu: Lightweight Programming for Kids
Matthew MacLaurin, Principal Program Manager, Microsoft Research, Redmond
Kodu is an entertaining educational tool used to teach programming and simulation to students ages 9 and up. Kodu uses a new visual programming system invented at Microsoft Research to allow kids to create sophisticated games and simulations in an intuitive way within a richly detailed 3D world. All programming in Kodu is done with an Xbox 360 game controller, which provides a comfortable, non-intimidating experience for kids familiar with video games. Kodu is currently in use in academic institutions around the world, in countries such as New Zealand, Finland, Russia, Brazil, Australia, and Canada. Kodu is designed to be easy to integrate with K-12 curricula by teachers who may have little computer science background. With Kodu, we empower students with pragmatic and powerful software development capabilities at a very young age: existing Kodu programs begin in the 4th grade.
Large-scale Spamming Botnet Detection
Yinglian Xie, Researcher, Microsoft Research, Silicon Valley
Fang Yu, Researcher, Microsoft Research, Silicon Valley
We design and implement a novel system, called BotGraph, to detect a new type of botnet spamming attack. Unlike traditional attacks, in which botnet hosts were set up as spam email servers, this attack leverages botnet hosts to sign up millions of email accounts at major Web-email service providers (e.g., Hotmail, Yahoo!, Gmail) and then uses these accounts to send billions of spam emails across the world. Its large scale and severe impact have repeatedly caught the public media's attention.
BotGraph uncovers the correlations among botnet activities by constructing large user-user graphs and looking for tightly connected subgraph components. This enables us to identify stealthy botnet accounts that are hard to detect when viewed in isolation. To deal with the huge data volume, we implement BotGraph as a distributed application on a computer cluster. To our knowledge, BotGraph is one of the first solutions to combat this new attack. It successfully detected over 26 million botnet accounts using two months of Hotmail data. More generally, we believe both our graph-based approach and our implementations are applicable to a wide class of security applications for analyzing large datasets.
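The graph-based idea can be sketched in a few lines of Python. This is a single-machine illustration under stated assumptions, not BotGraph itself: the edge rule (two accounts are linked when they share at least min_shared login IP addresses), the function name suspicious_groups, and the thresholds are all hypothetical, and plain connected components stand in for whatever notion of tight connectivity the real system uses.

```python
from collections import defaultdict
from itertools import combinations


def suspicious_groups(logins, min_shared=2, min_group=3):
    """Group accounts that are linked through shared login IP addresses.

    logins is an iterable of (account, ip) pairs. Accounts are nodes; an
    edge joins two accounts sharing at least min_shared IPs (an assumed
    rule). Connected components of size >= min_group are returned.
    """
    ips_by_user = defaultdict(set)
    for user, ip in logins:
        ips_by_user[user].add(ip)

    users = sorted(ips_by_user)
    parent = {u: u for u in users}

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    # All-pairs comparison is quadratic; fine for a sketch, not for
    # the data volumes the text describes.
    for a, b in combinations(users, 2):
        if len(ips_by_user[a] & ips_by_user[b]) >= min_shared:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for u in users:
        groups[find(u)].add(u)
    return [g for g in groups.values() if len(g) >= min_group]


# Hypothetical data: a and c never share an IP, but both are tied to b.
example_logins = [
    ("a", "ip1"), ("a", "ip2"),
    ("b", "ip1"), ("b", "ip2"), ("b", "ip3"), ("b", "ip4"),
    ("c", "ip3"), ("c", "ip4"),
    ("d", "ip5"),
    ("e", "ip1"),
]
example_groups = suspicious_groups(example_logins)
```

Accounts a and c look unrelated when compared directly, yet they land in the same component through b, which is the sense in which accounts that are hard to spot in isolation become visible in the graph.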
PINQ: Privacy-preserving data analysis for privacy non-experts
Cynthia Dwork, Principal Researcher, Microsoft Research, Silicon Valley
Frank McSherry, Researcher, Microsoft Research, Silicon Valley
Recent research on privacy-preserving data mining and analysis (done largely at Microsoft Research) has produced several very exciting results, demonstrating the possibility of a broad class of data analyses that provide mathematical guarantees on the privacy of the underlying records. Specifically, these analyses are guaranteed to unfold identically with and without any one user’s records; neither the user, nor anyone else, can even tell if the user participated in the data set, much less the contents of their records. This privacy guarantee is called “Differential Privacy”.
We present a programming interface to data, much like a standard database, that automatically guarantees differential privacy. The analyst describes the computation they wish to perform (e.g., “count the number of patients admitted from a certain zip code, with the following symptoms, last month”) and our execution platform provides a response certain to respect the strong formal guarantees above. Importantly, the system itself provides the privacy guarantees, and does not require privacy expertise of the users, or the participation of a privacy expert. The users, who are likely expert in other areas (e.g., epidemiology, sociology, public policy), can focus on applying their expertise to the task at hand, without stumbling over privacy constraints, and without the risk of unintended disclosure of sensitive information.
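To make the counting example concrete, here is a minimal Python sketch of one standard way to answer a count with differential privacy: the Laplace mechanism. It is an illustration only and does not reproduce PINQ's interface; the names noisy_count and laplace_noise, the record layout, and the choice of epsilon are assumptions. A count changes by at most 1 when one person's record is added or removed, so Laplace noise with scale 1/epsilon suffices for an epsilon-differentially-private answer.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon, rng=None):
    """Return a differentially private count of records matching predicate.

    A counting query has sensitivity 1, so the noise scale is 1 / epsilon.
    Smaller epsilon means stronger privacy and a noisier answer.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical patient records; the analyst never sees the exact count.
patients = [
    {"zip": "98052", "symptom": "fever"},
    {"zip": "98052", "symptom": "cough"},
    {"zip": "98052", "symptom": "fever"},
    {"zip": "10001", "symptom": "fever"},
]
answer = noisy_count(
    patients,
    lambda p: p["zip"] == "98052" and p["symptom"] == "fever",
    epsilon=0.5,
)
```

The analyst supplies only the predicate and a privacy parameter; the noise is added by the function itself, which mirrors the point that the platform, not the user, is responsible for the guarantee.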
Recommendation Systems
Adam Kalai, Researcher, Microsoft Research, New England
Recommendation systems and social networks are rapidly proliferating. People would like high quality recommendations for products, doctors, restaurants, etc. How can one use a network to give better recommendations? We use the “axiomatic” approach to design a recommendation system. In particular, we consider several natural properties that one may like a recommendation system to satisfy. While no system satisfies each and every property, the desired properties for a particular application can often “uniquely” determine which system to use.
In this example, we have a social network in which each node represents a person, an edge from a to b means that a trusts b, and a thumbs up/thumbs down indicates first-hand experience with some item in question. For example, the item may be an Xbox game and the network may be the network of Xbox players. The remaining nodes are then colored with green or red indicating whether the system thinks that the person would or wouldn’t like the item in question.
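As a hedged illustration of the setup, the Python sketch below colors a node by following trust edges outward and taking the majority of the nearest first-hand opinions. This particular rule is an assumption chosen for simplicity; it is not claimed to be the system that the axiomatic analysis selects. The function name recommend and the +1/-1/0 encoding are hypothetical.

```python
def recommend(trust, votes, person):
    """Predict +1 (green), -1 (red), or 0 (no opinion) for person.

    trust maps each person to the people they trust (edge a -> b).
    votes maps people with first-hand experience to +1 or -1.
    The rule here: search breadth-first along trust edges and take the
    majority among voters found at the nearest depth that has any.
    """
    if person in votes:
        return votes[person]
    seen = {person}
    frontier = [person]
    while frontier:
        next_frontier = []
        tally = 0
        found_voter = False
        for p in frontier:
            for q in trust.get(p, ()):
                if q in seen:
                    continue
                seen.add(q)  # each person is counted at most once
                if q in votes:
                    tally += votes[q]
                    found_voter = True
                else:
                    next_frontier.append(q)
        if found_voter:
            return (tally > 0) - (tally < 0)
        frontier = next_frontier
    return 0


# Hypothetical network of players and their opinions of one game.
example_trust = {
    "ann": ["bob", "cat"],
    "bob": ["dan", "fay"],
    "cat": ["dan", "eve"],
    "gus": [],
}
example_votes = {"dan": 1, "eve": -1, "fay": 1}
ann_color = recommend(example_trust, example_votes, "ann")
```

Here ann has no direct experience, and neither do the two people she trusts, so the sketch looks one step further and finds two thumbs up and one thumbs down.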
Research Desktop Activity
Natasa Milic-Frayling, Director of Research Partnerships, Microsoft Research, Cambridge
Could people use tagging to manage day-to-day work in their personal computing environment? Could tagging be sufficiently generic and lightweight to support diverse ways of working and, perhaps, support new and efficient practices for managing applications and accessing documents?
We enhanced the MS Desktop environment with the TAGtivity system that enables users to tag resources in the context of their ongoing work. By providing several features, TAGtivity enables users to create a new tag and apply it to documents, e-mail messages, Web pages, images, and other files that they wish to associate with the tag. The tagging is sufficiently generic to support flexible gathering of resources during a variety of users’ tasks. Thus, the tags can be used to designate ongoing projects or planned activities, provide alternative organizations of the file system, designate objects that require specific handling, e.g., printing or sorting, or represent ‘to do’ lists.
TAGtivity creates and stores comprehensive metadata about the created tags and the user’s interactions with the system. Its UI features include the TAGtivity Manager, a centralized place for accessing and managing tags, and the TAGtivity Toolbar extension of the main MS Office 2007 applications to support tagging in Word, Excel, PowerPoint and Outlook, and Internet Explorer 7 (IE7). In order to leverage the existing user practices, TAGtivity is integrated with the file system and enables users to associate files and entire folders with tags through a simple drag and drop interaction.
TAGtivity is part of the Research Desktop project that investigates new concepts for enhancing working practices in the PC environment. Concept videos of Activities, Tools, Library, and Notes can be viewed from the project site: http://research.microsoft.com/researchdesktop/.
Social Desktop
Cezary Marcjan, Principal Software Design Engineer, Microsoft Research, Redmond
Today, it’s easy to share a Web page or a blog post, because items on the Web have unique IDs: URLs. We don’t have this on the desktop. Social Desktop adds URLs to the files and folders on your desktop, letting you share anything on your computer with anyone who can click on a URL. People who receive links can either open them from e-mail or comment on, tag, and search across all shared items via our Web page. We implement this by using a .NET service, but it is possible to create a universal namespace for every device and data source for a user, providing a universally addressable namespace with:
- Universal access. The same URL works from any device in the world.
- Universal sharing.
- Universal tagging and commenting.
- Freedom from legacy paths.
Data isn’t limited by file-system concepts. You can have a URL drill into a sub-portion of a document or a PowerPoint deck, or data could come from a Web service or a database. Social Desktop is a local service that maps the user’s local data into a .NET service bus service, enabling local data to be accessible through firewalls. Social Desktop also provides a Web-service view over the same data, with inherent RSS event streams for any container. New data sources can be mapped into the URL hierarchy, enabling a distributed view to be built. There are simple sharing paradigms that enable URLs to be shared temporarily or permanently.
Social Views of E-Mail
Steve Ickman, Senior Software Design Engineer, Microsoft Research, Redmond
Incoming feeds of information become increasingly overwhelming as more people use social networks, e-mail, instant messaging, and other forms of communication. Our work automatically analyzes a user’s communication and organizes the feed into groups. Depending on when and with whom a user is communicating, a different stream of information can be presented as contextually appropriate. The goal of automatic group discovery is not only to detect the initial grouping, but also to discover slow changes to groups over time, thus freeing the user from manual group management. We also will show different ways to visualize an incoming stream of information, from a condensed overview for a small screen to a lush, immersive experience. Peripheral information can be squeezed into less space by employing time-sharing presentation: a ticker. Users can zoom into additional pages to get older messages in the thread or context. A search for information also can be done via a timeline-based presentation.
Roger Barga, Principal Architect, Microsoft External Research, Redmond
Trident is an open source workbench for scientific workflow that is implemented on top of Microsoft’s Windows Workflow Foundation, leveraging existing functionality of a commercial workflow engine. Trident provides a workflow environment in which scientists can visually design and execute scientific workflows by specifying the desired sequence of computational actions and the appropriate data flow, including required data transformations, between these steps. The scientific workflow approach offers a number of advantages over traditional scripting-based approaches, including ease of configuration, improved reusability and maintenance of workflows and components, automated provenance management, "smart" re-running of different versions of workflow instances, on-the-fly updateable parameters, monitoring of long-running tasks, and support for fault tolerance and recovery from failures. Trident has been developed in collaboration with researchers for several large-scale eScience projects, ranging from oceanography and astronomy to the atmospheric sciences.
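The notion of a workflow as a set of steps plus the data flow between them can be sketched generically in Python. This is not Trident's API or Windows Workflow Foundation code; run_workflow, the step names, and the dictionary-based wiring are hypothetical, and the sketch assumes Python 3.9 or later for the standard-library graphlib module.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, deps):
    """Run each step after the steps it depends on and collect results.

    steps maps a step name to a function that receives a dict of its
    upstream results. deps maps a step name to the names it depends on.
    A cyclic dependency raises graphlib.CycleError.
    """
    graph = {name: deps.get(name, ()) for name in steps}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        upstream = {d: results[d] for d in deps.get(name, ())}
        results[name] = steps[name](upstream)
    return results


# Hypothetical three-step pipeline: load readings, clean them, summarize.
example_steps = {
    "load": lambda up: [3.0, None, 1.0, 2.0],
    "clean": lambda up: sorted(x for x in up["load"] if x is not None),
    "stats": lambda up: {"min": up["clean"][0], "max": up["clean"][-1]},
}
example_deps = {"clean": ["load"], "stats": ["clean"]}
example_results = run_workflow(example_steps, example_deps)
```

Because every intermediate result is kept by step name, a structure like this is also a natural place to hang the provenance records and selective re-runs that the text lists as advantages, although the sketch implements neither.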