Catapult is a Microsoft project investigating the use of field-programmable gate arrays (FPGAs) to improve performance, reduce power, and provide new capabilities in the datacenter.
Deep Structured Semantic Model / Deep Semantic Similarity Model
Website for the CIKM2014 tutorial on Deep Learning for Natural Language Processing: Theory and Practice (more content to be added)
Using the Internet as a (noisy) knowledge base to mine semantics for multimedia data.
We built the Sketch2Cartoon system, an automatic cartoon-making system. It enables users to sketch the major curves of the characters and props they have in mind, and real-time search results from millions of clipart images can be selected to compose the cartoon images. The selected components are vectorized and can thus be further edited. By enabling sketch-based input, even a child who is too young to read or write can draw whatever he/she imagines and get interesting cartoon images.
We built the Sketch2Tag system for hand-drawn sketch recognition. Due to the large variations present in hand-drawn sketches, most existing work is limited to a particular domain or to a limited set of pre-defined classes. Unlike existing work, Sketch2Tag is a general sketch recognition system that aims to recognize any semantically meaningful object that a child can recognize. The system enables a user to draw a sketch on the query panel and then provides real-time recognition results.
Microsoft Research in partnership with Bing is happy to launch the second MSR-Bing Challenge on Image Retrieval. Do you have what it takes to build the best image retrieval system? Enter the MSR-Bing Image Retrieval Challenge in ACM Multimedia and/or ICME to develop an image scoring system for a search query. Last Challenge: MSR-Bing IRC @ ACM Multimedia 2014. Current Challenge: MSR-Bing IRC @ ICME 2015. Next Challenge: MSR-Bing IRC @ ACM Multimedia 2015
We argue that the massive amount of click data from commercial search engines provides a data set that is uniquely suited to bridging the semantic and intent gaps. Search engines generate millions of click records (a.k.a. image-query pairs), which provide almost "unlimited" yet strong connections between semantics and images, as well as connections between users' intents and queries. This site introduces such a dataset, Clickture.
Mobile video is quickly becoming a mass consumer phenomenon. More and more people are using their smartphones to search and browse video content while on the move. This project aims to develop an innovative instant mobile video search system through which users can discover videos by simply pointing their phones at a screen to capture a few seconds of what they are watching.
Stroke Recovery with Kinect is an interactive rehabilitation system that helps stroke patients improve their upper-limb motor functioning in the comfort of their own home. By using the Microsoft Kinect sensor’s gesture recognition technology, the system recognizes and interprets the user’s movements, assesses their rehabilitation progress, and adjusts the level of difficulty for subsequent therapy sessions.
MSRA Mood Board
Exploratory queries on a database often return too few or too many results (e.g., a home search query on a database of available homes). In such cases, the user faces the challenges of (i) navigating through too many results and/or (ii) refining the query. This project focuses on innovative ways to help users when they face these challenges.
The same entity is often referred to in a variety of ways. For example, the camera Canon 600d is also referred to as "canon rebel t3i", the celebrity Jennifer Lopez is also referred to as "jlo" and Seattle Tacoma International Airport is also referred to as "sea tac". These are known as synonyms. Without knowledge of synonyms, many applications like e-commerce search will fail to return relevant results. We leverage the data assets amassed by Bing to automatically mine such synonyms.
We are working toward a theoretical foundation for developing large-scale human-machine systems that combine human intelligence with the computing power of machines to solve problems that are difficult for either humans or machines to solve alone.
AutoTag 'n Search My Photos, a Microsoft Garage project, uses photos tagged in your Facebook account to learn face models of your friends. It can then automatically tag faces in your personal photo collection in Pictures Library, including OneDrive Camera roll. The app supports the ability to search for people tags across your photo collection. AutoTag 'n Search My Photos adds new people tags to your photos and does not overwrite any existing tags.
Tempe is a web service for exploratory data analysis.
Search TrailBlazer is a project that aims to redefine the way people think about search. We propose to model user search behavior using tasks rather than the traditional queries or sessions. Our framework contains components that impact multiple core areas of search engines, including relevance ranking, metric design, user satisfaction prediction, DSAT mining, competitive analysis, and more.
In this project, we investigate near-duplicate document detection, focusing primarily on the detection of evolving news stories. These stories often consist primarily of syndicated information, with local replacement of headlines, captions, and the addition of locally-relevant content. By detecting near-duplicates, we can offer users only those stories with content materially different from previously-viewed versions of the story.
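As a rough illustration of the near-duplicate idea described above, the sketch below compares two documents by the overlap of their word shingles (Jaccard similarity). This is a generic textbook technique chosen for illustration, not necessarily the method used in the project; the function names, the shingle size `k`, and the `threshold` value are all assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (as tuples) in a text."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts yield a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(doc_a, doc_b, k=3, threshold=0.5):
    """Hypothetical decision rule: high shingle overlap means near-duplicate."""
    return jaccard(shingles(doc_a, k), shingles(doc_b, k)) >= threshold
```

Under these assumptions, a syndicated story with a locally replaced headline would still share most of its shingles with the original and be flagged, while a story with materially different content would fall below the threshold.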
Our team from Microsoft Research is studying social information seeking behavior.
We investigated heuristics for automatically identifying "spam" web pages, i.e. pages that are created to enrich the publisher rather than to provide utility to the consumer.
The Scalable Hyperlink Store is a specialized "database" for the web graph. SHS maintains the web graph in main memory, distributed over many machines. The system is available as C# source code as well as precompiled binaries.
A web page is not atomic but rich in structure. In this project, we take advantage of the HTML DOM structure and associated visual features, such as the font size, width, and height of a DOM element, to understand the author's purpose in creating a page. We model the importance of blocks in a page; we extract structured data from pages across websites; we learn templates from a set of mixed pages from a website; and we identify the article title, body, and images in pages to improve the reading experience.
An automated, unsupervised, scalable solution to language identification based on publicly available data.
Our research team is studying how users seek health information using both traditional search engines and emerging social platforms, and how the experience of health information seeking can be improved.
There is some evidence that a gap exists between the neural network research and software development communities. Source code examples available to software developers are often incomplete, misleading, or just plain incorrect. The goal of this project is to bridge that gap by providing a series of high quality demo programs.