At TechFest 2007, the company’s annual showcase of emerging technologies, Microsoft Research is unveiling more than 100 innovations. Researchers have been working hard for a chance to display their latest research wares in a mind-boggling variety of disciplines.
Emerging Markets and Research Partners
Split-Screen UIs for Small Businesses
In developing nations and emerging markets, there are situations, especially in small-business settings, where a single computer is shared among multiple users at the same time. A lot of juggling of control takes place, and people share without changing sessions; rather, they manage with minimization and maximization of relevant windows. Typical applications include word processing, accounting, image editing, and browsing. We provide shared access around the same single display, with multiple mice and multiple keyboards, by splitting the screen into separate sections for each user, optimally for two users. The sections are adjustable, with permissions, and separate applications or operating systems can run independently in each area. Various interactions are enabled by features such as a common area for sharing common files and resources, and joint editing of documents.
The SMS Tidal Wave
In emerging markets, Short Messaging Service (SMS) is one of the most popular modes of communication. We will demonstrate a variety of ways in which SMS can be used via mobile phones to enhance small businesses, microfinance, and agricultural production. Mobile phones linked to a PC acting as an SMS server can improve data collection and sharing, which in turn boosts organizational efficiency and rural economic development. SMS can also be made more relevant for specific Web-based applications, such as blogging, searching, chatting, instant messaging, and Outlook® Access. Such uses can help a small business run an SMS server as if it were running a Web site, and we have a software-development kit that makes it simple to build custom SMS servers.
Digital Assistance for Emerging Markets
Microsoft Research India is investigating ways in which digital technology can help enrich the lives of illiterate or rural residents in emerging markets. One such technique is to use a text-free UI to enable illiterate, first-time computer users to access relevant health information to make educated healthcare decisions. Another empowers users unable to read or write to e-mail their loved ones, also using text-free navigation. And a third seeks to assist poor rural farmers by providing specific, targeted advice about relevant farming practices using appropriate digital technologies.
Hardware, Devices, and Mobile Computing
Telescopic Pixel and New UI Devices
Today’s digital images are enabled through arrays of small light gates (LCD, DMD) or light emitters (LED, plasma). The most popular display technology, the LCD, is inefficient, allowing less than 10 percent of backlight to reach the viewing surface. LCDs also are slow, making separate R, G and B necessary. The telescopic pixel is a microminiature reflecting telescope in which the focus controls the amount of light passing through. With a theoretical light efficiency of 75 percent, the telescopic pixel is much more energy-efficient, and its faster switching time enables a single pixel to serve R, G and B functions. We also will demonstrate a capacitance touch-pad control, a low-cost X/Y touch pad for a mobile device, and a gesture-sensing keyboard, in which sensitive capacitance-to-digital converters enable an X/Y position sensor for resolving hand gestures above the keyboard.
Many consumers carry portable electronic devices (smartphones, personal digital assistants, or laptops) that can connect to Wi-Fi networks. Location-sensitive advertisements, which are targeted to a Wi-Fi user based in part on that user’s physical location, will be an important market in the near future. We have developed a scheme for distributing location-sensitive ads to Wi-Fi devices. Our approach has three advantages:
- We do not require information from the client device to deliver ads to the client.
- We do not require the client to have Internet connectivity. In fact, we can deliver ads even when the client is connected to a competitor’s Wi-Fi network.
- We can supply dynamic information to consumers in real time. For example, a restaurant can continuously advertise an expected wait time to all wireless clients in its vicinity.
Location-Based Enterprise Wi-Fi Management
The physical locations of clients and access points in a wireless LAN have a large impact on network performance. We demonstrate a scalable, easy-to-deploy WLAN performance-management system that includes a self-configuring location-estimation engine. Our system displays the location of all the WLAN clients, and it tracks those access points with which clients associate, along with a variety of performance metrics that characterize the client’s experience. Using our system to observe the WLAN usage in our building, we show that information about client locations is crucial for understanding WLAN performance.
Surface computing uses sensing and display technology to imbue everyday surfaces with interaction. PlayAnywhere is a compact surface-computing system shown at TechFest last year. This year, we will show PlayTogether: two networked PlayAnywhere units exchanging video of each other’s desktop surface, including hands, game pieces, and drawing surfaces. PlayTogether offers interesting combinations of the real world with the virtual world: Playing chess across the network, you see your opponent’s hands and pieces superimposed on your own real pieces and desktop. We will show other technologies, including an application of depth-sensing video cameras, which work like a normal video camera but also calculate how far away the imaged surface is at each pixel, resulting in (R, G, B, Z)-valued images. We will show a game that combines the surface-computing idea with this exciting new technology.
Search, Interaction, and Collaboration
VIBE Team Demos
The VIBE research group will showcase:
- DynaVis: A visualization framework for Dynamics UX that supports animated transitions, direct manipulation of data, and compositing.
- CandidTree: A visualization of structural uncertainty in merged trees.
- For PP and agile dev teams: a peripheral display that shows software-development teams where team members are in the code, the methods on which they are working, and who may need help.
- A novel UX for the smartphone.
- Courier: Take your documents with you, and share them on a large display using a smartphone.
Tango: Find Your Circle, Enjoy your Social
Tango enables users not only to manage their own cyber traces—tags, rankings, and comments—but also to see what other people in the same “circle”—friends, favorite users, people sharing the same interest—are doing and what's popular on a more global scale. Tango users can browse any URL and find tags or comments left by others and therefore expand their social network by exploring common interests. All content can be filtered based on circle, ranking, or tag so a user can find the desired information, supported by a trusted relationship. By supporting various social activities, Tango adds interaction to a social network and, thus, is more fun.
Tag Booster: A System for Ranking, Suggesting Tags
Finding things on your PC or via the Internet is hard. Search engines provide a solution, provided that sufficient features about information items are available. But increasingly, we wish to navigate and search items such as images, video, music, and even a person’s reputation. This is where user-generated tagging helps. We propose a tag-recommendation solution that helps user consensus emerge more rapidly.
Recognition and Disambiguation of Entities in Text
Our project proposes a substantial change in the way we interact with text and information. This includes instant access to relevant data on the Web, as well as contextualized bookmarks and search. The core of the system is a powerful, named-entity recognition and disambiguation technology. The system identifies and disambiguates the named entities and the most important concepts in text based on information extracted from a large, encyclopedic collection and search-query logs. It also enables a user to create context-dependent bookmarks and to share them with other users. The system then employs such data as user feedback to improve its performance. In addition, the system enables a user to perform context-aware Web searches. For this, the system disambiguates the user’s queries by using the information extracted from the documents the user has been reading or editing.
Searching Maps Using Street-Side Photos
A picture is worth a thousand words. In this demo, we show a Web service that can be used to match your photo against millions of street-side-view photos in our database. An efficient, distributed, high-dimensional index is developed to speed the query performance. In our system, which supports both PC and mobile interfaces, each query can be answered in mere seconds. We will use Seattle as an example city to illustrate the performance of our system.
New Concepts for the Home
We will present nine new technologies aimed at enriching home life, under four themes:
- New messaging concepts: We will show a “digital postcard” device for the living room, a “visual answering machine” for the kitchen, and the Epigraph—a kitchen display supporting family presence and identity.
- New mobile concepts: We will show Glancephone, a way of turning a cellphone into a Webcam, and Grab & Share, a system for “trafficking” TV clips through your cellphone.
- New image displays: We will present the Photo Shoebox, which shows a tangible way of archiving and displaying photos in the home, and three variations of Time-Mill, an interactive mirror that captures and reflects photos in the home.
- Paper-digital concepts: We will show two concepts: one using paper to send remote messages into the home, and another that enables a family to message on paper from the home.
Community Buzz is a new window into online communities! Interesting and useful conversations, authors, and groups are discovered easily using this tool, jointly developed by Microsoft Research Redmond's Community Technologies group and Microsoft Research Cambridge's Integrated Systems team, with sponsorship from Live Labs. Community Buzz combines text mining, social accounting (Netscan/MSR-Halo), and new visualization techniques to study and present the content of communication threads in online discussion groups. The merging of these research technologies results in a system that gives great value to community participants, enables highly directed advertising, and supplies rich metrics to product managers.
Pictures of Search Relevance
The link structure of the Web plays an important role in today's search engines, with techniques such as PageRank. These analyses typically work at the level of the entire Web. Our work examines characteristics of key subsets of the Web graph. In particular, we characterize the subgraphs induced by projecting the results of a search onto the larger Web graph. We represent the subgraphs using a rich variety of graphical properties—number of nodes and edges, graph diameter, connected components, triads—and use this representation to predict behavior on several search-related tasks. For example, we can predict the overall quality of a set of search results, when a user will reformulate a query and whether a user will specialize or generalize a query.
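The projection step can be illustrated with a minimal Python sketch. It assumes a toy Web graph given as a list of (source, target) links and computes three of the properties named above: node count, edge count, and connected components. The function names and the sample graph are hypothetical, and this is not the project's own code.

```python
from collections import deque

def induced_subgraph(edges, result_pages):
    """Keep only links whose source and target are both in the result set."""
    nodes = set(result_pages)
    return nodes, {(u, v) for (u, v) in edges if u in nodes and v in nodes}

def connected_components(nodes, edges):
    """Count weakly connected components (link direction is ignored)."""
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, count = set(), 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        queue = deque([start])
        while queue:  # breadth-first search over one component
            node = queue.popleft()
            for nxt in neighbors[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
    return count

def subgraph_features(edges, result_pages):
    """Summarize the subgraph induced by a set of search results."""
    nodes, sub_edges = induced_subgraph(edges, result_pages)
    return {
        "nodes": len(nodes),
        "edges": len(sub_edges),
        "components": connected_components(nodes, sub_edges),
    }

# Hypothetical toy Web graph: (source page, target page).
WEB_EDGES = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e"), ("e", "x"), ("a", "x")]
features = subgraph_features(WEB_EDGES, ["a", "b", "c", "d", "e"])
```

For this toy input, the links to page "x" drop out because "x" is not among the results, leaving five nodes, four edges, and two components. A feature vector like this could then feed whatever predictor is used for the search-related tasks.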
Wearable Sensors for Health, Sports, and Community
We will present four projects that utilize a variety of sensors, such as electrocardiograms, blood oximetry, and GPS, in conjunction with Windows Mobile and SPOT devices, to provide feedback to users and online communities:
- The iPox project uses two multisensor, wearable devices to investigate how easy access to one’s physiological data influences individuals and communities.
- SlamXR is a system supporting outdoor sports communities through GPS traces annotated with sensor data such as heart rate and altitude.
- HealthGear utilizes sensor inputs such as blood oximetry to assist with a variety of personal health issues, such as sleep apnea.
- Mobile-sensor-extraction technology for Windows Mobile devices developed at the European Microsoft Innovation Centre is presented in the context of an application for an enhanced SPOT watch to assist diabetics.
Using E-Mail to Query Structured Business Apps
Users of business applications such as CRM or ERP respond to incoming e-mails by manually navigating through the UI of the app. We help users become more efficient by using incoming e-mail as queries against the underlying database of the app. We will show an example of such a query system, called Business Context Expediter (BCE), operating on a CRM database. BCE will find entities within e-mail and offer the users actions related to these entities. BCE also automatically extracts the category of the e-mail and summarizes each e-mail with three sentences. The technology underlying BCE, a joint project of Office Labs and Microsoft Research's Knowledge Tools group, is not tied to CRM: It should be applicable to many structured business scenarios. Come see the demo to learn more!
InSite Live! is a tool for visualizing the structure of Web sites and intranets. It assists users in orientating themselves during navigation, enabling them to jump easily to subsites of interest. It uses a novel link-structure-graph technique to infer the structure from the layout of hyperlinks on site pages. InSite can expand as the community of users browses the site, or it can present a static view of crawled site pages.
Gazing into Web Search
We use eye-tracking technology to help us understand how people use Web-search interfaces. How do people scan search results? Does it depend on what they’re doing? Can we improve our interfaces based on this information?
Wikis and blogs have greatly facilitated the lightweight creation of collaborative documents. A wiki is a type of Web site that makes it easy for users to add, remove, or otherwise edit all content. Wikis, however, are primarily textual in nature. We propose a system, called VIKI, that uses a pasteboard metaphor for collaboratively creating Web-based documents. Users can easily add, remove, or rearrange images or text blocks, placing them anywhere on a page. As on a wiki page, anyone, or only those with appropriate credentials, can edit a page, and new pages can be added and linked to a current page. Because VIKI maintains a strict model/view separation, both manual and data-driven views can be represented. We will demonstrate the VIKI system and show the kinds of projects that can be built with it, from photo scrapbooks to note taking and to-do lists.
Software, Theory, and Security
Backstory: Find the Story Behind the Code
What were they thinking when they wrote this code? This is a common question and one difficult to answer, because relevant information can be scattered across bugs, e-mails, check-in messages, and elsewhere. We have built a multisearch investigative UI to help you dig for answers.
The Yogi Project
Yogi is a research project on software-property checking from the Rigorous Software Engineering group at Microsoft Research India. Our goal is to build a scalable software-property checker by directly analyzing program binaries. This involves a new algorithm for property checking that systematically combines static analysis with testing. We will show that this synergy of static analysis and testing can be harnessed for effectively finding bugs in system software.
Asirra: Securing Web Services with Cute Kittens
Can you tell a dog from a cat? Perhaps you’ve seen Web services that require you to solve a small challenge to prove you are not an automated script. This is known as a CAPTCHA, and it commonly involves looking at distorted text and typing it into a box. Since OCR software can identify distorted characters quite well, CAPTCHAs add visual clutter to their images, but this also makes the challenges harder and more annoying for humans. We are developing a system, called Asirra, that challenges users to classify images of dogs and cats, a task difficult for software but easy and even fun for humans. Because software is of little help to us, Asirra needs a large source of classified pet images. We obtain them through an alliance with Petfinder.com, a nationwide pet-adoption site, which benefits because every challenge implicitly advertises adoptable pets.
Competitive Online Algorithms and Ad Auctions
There are many situations in which one must make decisions before all the input data has arrived. Algorithms that work in this setting are called online algorithms. For example, how do you choose which ads to display for a given search query when the advertisers have budgets and the number of queries is not known in advance? How do you schedule continually arriving jobs to processors? How do you even evaluate the performance of an online algorithm? We describe online algorithms for auctions and scheduling, and evaluate their performance using the notion of the competitive ratio, which compares an online algorithm with an algorithm that knows all the inputs in advance.
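To make the comparison concrete, here is a small Python sketch under stated assumptions. It pits a simple greedy online rule against a brute-force offline optimum on one hypothetical ad instance. The greedy rule, the function names, and the bids and budgets are illustrative only; they are not the algorithms presented in the demo. The competitive ratio proper is the worst case of this quotient over all inputs, whereas the sketch evaluates a single instance.

```python
from itertools import product

def greedy_revenue(queries, budgets):
    """Online rule: give each arriving query to the highest bidder that can still pay."""
    remaining = dict(budgets)
    revenue = 0
    for bids in queries:  # bids maps advertiser -> bid for this query
        affordable = {a: b for a, b in bids.items() if remaining[a] >= b}
        if affordable:
            winner = max(affordable, key=affordable.get)
            remaining[winner] -= affordable[winner]
            revenue += affordable[winner]
    return revenue

def offline_optimum(queries, budgets):
    """Brute force over every assignment, knowing all queries in advance."""
    choices = list(budgets) + [None]  # None means the query goes unsold
    best = 0
    for assignment in product(choices, repeat=len(queries)):
        remaining = dict(budgets)
        revenue = 0
        feasible = True
        for bids, advertiser in zip(queries, assignment):
            if advertiser is None:
                continue
            bid = bids.get(advertiser)
            if bid is None or remaining[advertiser] < bid:
                feasible = False
                break
            remaining[advertiser] -= bid
            revenue += bid
        if feasible:
            best = max(best, revenue)
    return best

# Hypothetical instance: two advertisers, two queries arriving in order.
BUDGETS = {"A": 2, "B": 2}
QUERIES = [{"A": 2, "B": 1}, {"A": 2}]
online = greedy_revenue(QUERIES, BUDGETS)
optimum = offline_optimum(QUERIES, BUDGETS)
ratio = online / optimum
```

Here greedy spends advertiser A's whole budget on the first query and earns 2, while the offline optimum gives the first query to B and saves A for the second, earning 3. The exponential brute force is only practical for tiny instances; it is there to define the benchmark, not to suggest an implementation.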
From Physics and Geometry to Algorithms
Simple—and not so simple—techniques and results from physics and geometry are incredibly useful in the analysis of applied problems in computer science. One example is sphere packing, a fundamental problem in geometry that has applications to communication over noisy channels. Sphere packings arise in nature as materials minimize their energy. Other examples are algorithms for quickly finding good matchings using Coulombic forces, or fair allocations using gravity.
Pex: Dynamic Analysis and Test Generation for .NET
Pex enables a new development experience in Visual Studio® Team System, taking test-driven development to the next level. Pex analyzes .NET applications. From a parameterized unit test, it automatically produces traditional unit-test cases with high code coverage. Moreover, when a generated test fails, Pex often can suggest a bug fix. Pex performs a systematic program analysis, recording detailed execution traces of existing test cases. Pex learns program behavior from the execution traces, and a constraint solver produces new test cases with different behavior. The result is a minimal test suite with maximal code coverage. When a test fails, Pex uses detailed data-flow information to determine the root cause and a potential bug fix.
MemRay: Viewing Memory-Reference Locality
The performance tools we use are code-centric and produce summary information about the entire program execution. Consequently, performance problems, such as poor data locality or bad performance during a specific phase of program execution, often go undetected. Research prototypes of more sophisticated tools for investigating data locality and time-specific program behavior exist, but they are complex, limiting their utilization. We present MemRay, a memory-access-animation tool that makes understanding memory-reference behavior accessible to a much broader audience. MemRay displays a memory-access movie that can be viewed to identify memory bottlenecks, as well as locality and scalability problems. It pinpoints the code and data structures responsible and shows when during execution the problem arises. We show how MemRay can be used to understand and optimize memory performance.
Concurrent Programming: A New Approach
We’ll describe a new technique that enables programmers to create correctly synchronized, efficient, concurrent programs without having to write synchronization code. We’ll explain how this magic happens, supported by examples and an outline of an implementation.
Biometric Authentication via Fingerprint Hashing
We present a new technique for generating biometric fingerprint hashes, or summaries of information contained in human fingerprints. Our method calculates and aggregates various key-determined metrics over fingerprint images, producing short hash strings that cannot be used to reconstruct the source fingerprints without knowledge of the key. This can be considered a randomized form of the Radon transform in which a custom metric replaces the standard, line-based metric. Resistant to minor distortions and noise, the resulting fingerprint hashes are useful for secure biometric authentication, either augmenting or replacing traditional password hashes. As shown in our hands-on demo, this approach can help increase the security and usability of Web services and other client-server systems.
Systems, Networking, and Databases
Example-Driven Design of Record Matching Queries
Matching records from two relations is an important component of data-cleaning processes and of Extract, Transform, and Load (ETL). The goal is to identify pairs of records that represent the same real-world entity but may differ because of representational differences and errors. Searching through the large space of possible queries, evaluating each, and finding the most accurate is difficult. In this demo, we will illustrate tools to develop a record-matching package in SQL Server™ Integration Services (SSIS). A user has to mark a set of example record pairs as matches or non-matches. We then suggest an accurate package, using a set of SSIS transforms, which can be reviewed and used as a foundation for further analysis.
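The example-driven idea can be sketched in a few lines of Python. The sketch below is a deliberate simplification, not the SSIS tooling described above: it assumes a single string-similarity score from the standard library's `difflib.SequenceMatcher` and learns one threshold from the labeled pairs. The function names and the sample records are hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Case-insensitive similarity in [0, 1] between two record strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def learn_threshold(examples):
    """Pick the score threshold that classifies the most labeled pairs correctly."""
    scored = [(similarity(a, b), is_match) for a, b, is_match in examples]
    candidates = sorted({score for score, _ in scored})

    def correct(threshold):
        return sum((score >= threshold) == is_match for score, is_match in scored)

    return max(candidates, key=correct)

def is_match(a, b, threshold):
    """Apply the learned rule to a new pair of records."""
    return similarity(a, b) >= threshold

# Hypothetical labeled pairs: (record A, record B, marked as a match?).
EXAMPLES = [
    ("Microsoft Corp.", "Microsoft Corporation", True),
    ("Intl. Business Machines", "International Business Machines", True),
    ("Microsoft Corp.", "Micron Technology", False),
    ("Apple Inc.", "Applied Materials", False),
]
threshold = learn_threshold(EXAMPLES)
```

A real package would search over many candidate transforms and similarity functions rather than one threshold, but the workflow is the same: the user labels pairs, and the tool proposes the rule that best fits them.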
Efficient Point-to-Point Shortest Paths
Considerable progress has recently been made in point-to-point shortest-path algorithms. In particular, highly practical algorithms have been developed for computing driving directions. We demonstrate our recent codes for this application. These codes work well on servers, desktops, and handheld devices.
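For readers unfamiliar with the problem, the sketch below shows the textbook baseline: Dijkstra's algorithm with early termination once the target is reached. The source does not say which algorithms the demo uses, so this is an assumption for illustration only, and the road network is hypothetical.

```python
import heapq

def shortest_path(graph, source, target):
    """Dijkstra's algorithm, stopping as soon as the target is settled.

    graph maps each node to a list of (neighbor, non-negative weight) pairs.
    Returns (distance, path) or None if the target is unreachable.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    settled = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in settled:
            continue  # stale heap entry
        settled.add(node)
        if node == target:
            path = [target]
            while path[-1] != source:
                path.append(prev[path[-1]])
            return d, path[::-1]
        for nxt, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(nxt, float("inf")):
                dist[nxt] = candidate
                prev[nxt] = node
                heapq.heappush(heap, (candidate, nxt))
    return None

# Hypothetical road network with travel times as edge weights.
ROADS = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 3)],
    "C": [("B", 2), ("D", 7)],
}
route = shortest_path(ROADS, "A", "D")
```

On this network the best route from A to D is A, C, B, D with total cost 6. Stopping at the target is what makes the query point-to-point: the search never settles nodes farther away than the destination.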
Scaling P2P Games in Low-Bandwidth Environments
First-person shooter games such as Halo and Quake are limited to a few players, but we wish to scale such fast-paced games to massive battles with many players. Since this requires more outbound bandwidth than any home machine has, we partition the game state among all machines. But even so, at extremely large scales, there is still not enough bandwidth for each machine to update all others in every frame. We thus send updates infrequently and use guidable AI to emulate remote avatars’ behavior between updates. We estimate which remote avatars are most interesting to the local player and ensure a higher update rate for them. And we ensure consistent interaction when necessary, such as when one player damages another. A user study shows that these techniques make Quake III over low-bandwidth connections nearly as much fun as on a LAN. Come play the game and see for yourself!
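The interest-based prioritization can be illustrated with a short Python sketch. It assumes, purely for illustration, that interest falls off with distance from the local player and that a fixed number of avatars can receive updates in each frame; the scoring function, names, and positions are hypothetical, and the actual system's estimator is not described here.

```python
import math

def interest(local_pos, avatar_pos):
    """Hypothetical interest score: nearer avatars matter more."""
    return 1.0 / (1.0 + math.dist(local_pos, avatar_pos))

def pick_updates(local_pos, avatars, budget):
    """Return the ids of the `budget` most interesting avatars for this frame.

    Avatars left out would be driven by the guidable AI until their next update.
    """
    ranked = sorted(
        avatars,
        key=lambda avatar_id: interest(local_pos, avatars[avatar_id]),
        reverse=True,
    )
    return ranked[:budget]

# Hypothetical avatar positions relative to a local player at the origin.
AVATARS = {"near": (1.0, 0.0), "mid": (5.0, 0.0), "far": (50.0, 0.0)}
chosen = pick_updates((0.0, 0.0), AVATARS, budget=2)
```

With a budget of two, the near and mid avatars are refreshed and the far one is left to the AI. A fuller scheduler would also guarantee every avatar an occasional update and would weigh factors such as aim direction or recent interaction.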
Automatically Finding Network and Server Problems
Does your browser sometimes temporarily hang while loading a Web page, even from intranet sites? You are not alone. We have observed that 10 percent of requests take 10 times longer than expected. We will demo the Analysis of Network Dependencies system, which automatically finds the causes of these hangs, whether it be an overloaded Web or SQL server, a delay in the Domain Name System, a congested network link, or a scheduling delay in the client operating system. The system uses software running on the clients to observe the behavior of the IT infrastructure and to determine dependencies among components. It then uses tomography and Bayesian inference to find problems. Our goal is for the system’s output to support helpdesk staff answering user issues, IT managers planning for capacity upgrades, architects modeling system deployment, and, potentially, data-center operations staff.
UI, Graphics and Media
Boku: Lightweight Programming for Kids
Boku uses a novel, high-level programming paradigm within a 3-D gaming world on the Xbox 360® to introduce children to creative use of the computer. Boku’s programming model is extremely simple as it does not use a textual language or wiring diagrams. Kids use simple behavior cards to enable a small virtual robot to navigate its world and achieve specific tasks. The goal is to provide a gentle introduction to some of the foundational elements of creative programming to children who may not yet be ready for the complexity of classical computer languages. The user is exposed to behavior arbitration, generality, representation of an abstract state, real-time experimentation and feedback, simulation, sensors, physics, and message passing. The programming environment is integrated in an attractive gaming world and controlled entirely via an Xbox 360 game controller.
Mix: Search-Based Authoring
Search, aggregators, and RSS enable people to draw information from many dynamic streams of information on their desktop. People are getting used to reading dynamic content, but there are limited tools today to author and share dynamic content. Mix enables people to build and share dynamic documents with rich structure and visualizations on top of first-class query objects that draw from desktop, intranets, and Web-based search. Mix explores new user interfaces with regard to privacy and security. Sharing a query presents challenges, because the recipient of the query may not have the same access permissions as the publisher. This involves new notions of publishing and privacy control in the user interface.
Linking the World Through Pictures
Have you ever wanted to know more about a DVD, a painting, or a rock-band poster? Take a picture of it. Our prototype connects the world through pictures, providing relevant Web pages and comments from a community of users. Discover if the DVD got good reviews or if you like the music of the rock band. This works as a smartphone application on camera phones: capturing a picture, sending it to our servers, and retrieving relevant information. Content for our system is supplied by the community. Users can add new images, as well as add links and comments to existing images. Using our Web site, users also can search using photos from any digital camera. Our technology is based on a new image-matching technique. Pictures of flat objects, such as signs, posters, and advertisements, can be matched reliably without the need for special bar codes.
Personal Audio Space
A Personal Audio Space is a semi-private, energy-efficient system for real-time communication. We recreate the headset experience without using a headset. Only the intended user can hear the system. Using multiple speakers, we focus the sound into a region around the user. To anyone outside of this personal audio space, the sound is inaudible. By focusing the sound, we can achieve any absolute sound level with less power than a conventional system.
HDView: IE Plug-in for Viewing Very Large Images
New imaging modalities range from photo collections arranged in 3-D to super-high-resolution (gigapixel) images to 360-degree panoramic video. This is revolutionizing the way that people view and interact with their photos. We will demonstrate a new viewer that can be embedded in any application or Web page. It merges traditional slide shows, super-high-resolution panoramas, high-dynamic-range imagery, and 360-degree animations to create an incredibly rich photo-viewing and -browsing experience. During TechFest, we will demonstrate these features. A version with a subset of these features will be made available both internally and externally. We also will demonstrate a prototype authoring tool that generates HDView content.
For more information, visit the HDView Web site.
Digital Effects for Internet Video Clips
We will present a set of offline video-editing tools that make videos more fun. Existing video-editing tools provide filters, such as de-noising and adjustment of color and contrast, or transitions, such as fade in and out. These tools are useful, but they provide only slight improvement to videos. We will show fun video-editing tools that can improve a user's video experiences significantly. Our tools perform three operations on a video:
- Add—adding objects such as 3-D synthetic objects and video hyperlinks into a video.
- Separate—separating a video’s foreground from its background, to achieve cut and paste of video objects.
- Browse—browsing video in the form of montage, summarizing the video in a space-time manner.
We will demonstrate how our technologies make Internet video clips even more enjoyable.
Using Touch to Operate Stylus-Based Devices
Operating a personal digital assistant or other pen-based device with bare fingers is often faster than retrieving the stylus. I will present extensions for Windows Mobile that help users operate applications by touch, even when an application was designed to be stylus-based.
Improved Podcast Authoring with Speech Recognition
Creating audio/video content, podcasts in particular, presents challenges. Editing long podcasts can be tedious. The author must precisely identify the boundaries of the material he wishes to delete, move, or manipulate. This is time-consuming, because it requires marking boundaries while listening to or watching the content and then checking or modifying those boundaries by repeating the process multiple times. Automatic Speech Recognition instead recognizes the words and aligns them with the podcast content. The author then can manipulate the raw audio content by manipulating words in a GUI. Words can be processed further to extract keywords or summaries automatically.
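Once words are aligned to timestamps, a word-level edit reduces to simple interval bookkeeping. The Python sketch below assumes the recognizer has already produced (word, start, end) triples in seconds; the function name and the sample alignment are hypothetical, and no real speech-recognition API is involved.

```python
def kept_segments(aligned_words, deleted_indices):
    """Return merged (start, end) audio ranges, in seconds, that survive an edit.

    aligned_words is a list of (word, start, end) triples in playback order.
    Consecutive kept words are merged so the pauses between them are preserved.
    """
    deleted = set(deleted_indices)
    segments = []
    previous_kept = None
    for i, (_word, start, end) in enumerate(aligned_words):
        if i in deleted:
            continue
        if segments and previous_kept == i - 1:
            segments[-1] = (segments[-1][0], end)  # extend the current range
        else:
            segments.append((start, end))  # a deletion created a cut here
        previous_kept = i
    return segments

# Hypothetical alignment output for a short podcast opening.
WORDS = [
    ("welcome", 0.0, 0.5),
    ("um", 0.5, 0.8),
    ("to", 0.8, 1.0),
    ("the", 1.0, 1.2),
    ("show", 1.2, 1.7),
]
segments = kept_segments(WORDS, deleted_indices=[1])
```

Deleting the filler word "um" in the text view yields two audio ranges, 0.0 to 0.5 and 0.8 to 1.7, which an audio tool could then splice together. Moving text works the same way, by reordering the resulting ranges.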
Dynamic Noise Reduction
Speech enhancement is used to improve the quality of recorded speech and to remove non-speech sounds from a recording. We’ve developed a new, simple, strong model for the structure of speech audio. We use this model to identify the user’s speech and to remove everything else. It is much better than conventional techniques at removing non-stationary noises such as restaurant and traffic noise. Speech enhancement is especially important in applications such as preventing environmental noise from leaking into a conference call, creating a professional-sounding podcast, and polishing recordings taken under less-than-ideal conditions.
Relaxed Internet Video Exploration and Discovery
The ultimate challenge for Internet video is to bring it into the living room. But the active, disruptive discovery modes of the lean-forward Internet used in today’s video portals, such as keyword search and hierarchical list browsing, do not translate well into the relaxed culture of the living room. Creating new technologies for realizing relaxed discovery modes is the subject of this project. We will show a system for exploring the immense collection of Internet video from the comfort of the living room. Speech/audio-based content analysis and document similarity are combined with collaborative filtering to organize, select, and recommend video content. Familiar TV metaphors such as channel zapping and headline bars are used to enable a low-interaction, more passive perusal of Internet video on the television.
In the News
- Microsoft Links Technology, Common Tools
- Searching for Michael Jordan? Microsoft Wants a Better Way
- Microsoft TechFest provides glimpse of future
- Microsoft TechFest goes glam
- Sociology at Microsoft
- Where are the programmers?
- Microsoft's gadgets of the future get their day in the limelight
- At TechFest, Microsoft gives everyday objects a tech makeover
- Microsoft TechFest: Top R&D, From Deeply Geeky to Cute and Fuzzy
- Microsoft looks beyond search
- Microsoft developing Xbox game to teach programming
- Want to Buy Some Cool Stuff from Microsoft? Not this Stuff, Not Yet
- Microsoft Researchers Look to the Future of Search
- Outsiders get the first peek at Microsoft's TechFest