Our research
Content type
Downloads (455)
Events (487)
Groups (150)
News (2850)
People (720)
Projects (1160)
Publications (13023)
Videos (6121)
Labs
Research areas
Algorithms and theory (89)
Communication and collaboration (108)
Computational linguistics (54)
Computational sciences (76)
Computer systems and networking (294)
Computer vision (88)
Data mining and data management (22)
Economics and computation (18)
Education (36)
Gaming (45)
Graphics and multimedia (141)
Hardware and devices (103)
Health and well-being (34)
Human-computer interaction (309)
Machine learning and intelligence (193)
Mobile computing (19)
Quantum computing (1)
Search, information retrieval, and knowledge management (207)
Security and privacy (92)
Social media (14)
Social sciences (103)
Software development, programming principles, tools, and languages (203)
Speech recognition, synthesis, and dialog systems (15)
Technology for emerging markets (6)
Showing 1–25 of 141 projects.
Holoportation is a new type of 3D capture technology that allows high-quality 3D models of people to be reconstructed, compressed, and transmitted anywhere in the world in real time. When combined with mixed reality displays such as HoloLens, this technology allows users to see and interact with remote participants in 3D as if they were actually present in their physical space. Communicating and interacting with remote users becomes as simple as face-to-face communication.
Project details
Labs: Redmond
We are developing a system for the acquisition, transmission, and display of real-time 3D digital content. Our goal is to enable live, immersive 3D communications and entertainment experiences. Our strategy is to acquire the 3D signal within a cubical volume. We represent this signal using high-resolution colored voxels, and we have created algorithms for the acquisition, encoding, decoding, streaming, and display of this voxel data, designed specifically for modern massively data-parallel GPUs.
Project details
Labs: Redmond
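As a rough illustration of the colored-voxel representation mentioned above, the Python sketch below stores a cubical grid of RGB voxels with an occupancy mask and run-length encodes the mask, hinting at how such a volume might be compressed for streaming. The resolution and the encoding scheme are illustrative assumptions only; the actual pipeline runs on massively data-parallel GPUs.

```python
# Toy colored-voxel volume: occupancy mask plus per-voxel RGB, with a trivial
# run-length encoder for the mask. All sizes and formats are illustrative.
import numpy as np

RES = 128                                             # voxels per side of the cube
occupancy = np.zeros((RES, RES, RES), dtype=bool)     # which voxels contain surface
color = np.zeros((RES, RES, RES, 3), dtype=np.uint8)  # RGB per voxel

def run_length_encode(mask):
    """Run-length encode the flattened occupancy mask as [(value, run_length), ...]."""
    flat = mask.ravel()
    change = np.flatnonzero(np.diff(flat.view(np.int8))) + 1
    starts = np.concatenate([[0], change])
    ends = np.concatenate([change, [flat.size]])
    return [(bool(flat[s]), int(e - s)) for s, e in zip(starts, ends)]

# Mark a small block as occupied, give it a color, and encode the mask.
occupancy[60:68, 60:68, 60:68] = True
color[occupancy] = (200, 160, 120)
print(run_length_encode(occupancy)[:3])
```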
Room2Room is a life-size telepresence system that leverages projected augmented reality to enable co-present interaction between two remote participants. We enable a face-to-face conversation by performing 3D capture of the local user with color + depth cameras and projecting their virtual copy into the remote space at life-size scale. This creates an illusion of the remote person’s presence in the local space, as well as a shared understanding of verbal and non-verbal cues (e.g., gaze).
Project details
Labs: Redmond
The ability to manage personal photos is becoming crucial. In this work, we address the following pain points for mobile users: 1) intelligent photo tagging, best-photo selection, event segmentation, and album naming; 2) speech recognition and parsing of user intent regarding time, location, people attributes, and objects; 3) search by arbitrary queries.
Project details
Labs: Asia
We propose a novel learning scheme called network morphism. It morphs a parent network into a child network, allowing fast knowledge transfer. The child network achieves the performance of the parent network immediately, and its performance continues to improve as training goes on. The proposed scheme allows network morphism in an expanding mode for arbitrary non-linear neurons, covering depth, width, kernel-size, and subnet morphing operations.
Project details
Labs: Asia
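To make the width-morphing case concrete, here is a minimal NumPy sketch of a function-preserving widening step for a two-layer ReLU network: replicated hidden units copy their incoming weights, and the corresponding outgoing weights are split among the copies so the child computes exactly the same function as the parent. The layer sizes and replication rule shown are illustrative assumptions, not the project's implementation.

```python
import numpy as np

def widen_hidden_layer(W1, b1, W2, new_width, rng=None):
    """Widen the hidden layer of y = W2 @ relu(W1 @ x + b1) to `new_width`
    units while preserving the network's function exactly."""
    if rng is None:
        rng = np.random.default_rng(0)
    old_width = W1.shape[0]
    assert new_width >= old_width
    # Choose which existing hidden units to replicate.
    extra = rng.integers(0, old_width, size=new_width - old_width)
    mapping = np.concatenate([np.arange(old_width), extra])
    # Copies inherit the incoming weights and biases of their source unit.
    W1_new, b1_new = W1[mapping], b1[mapping]
    # Split each unit's outgoing weights evenly among its copies so the
    # summed contribution, and hence the output, is unchanged.
    counts = np.bincount(mapping, minlength=old_width)
    W2_new = W2[:, mapping] / counts[mapping]
    return W1_new, b1_new, W2_new

# Sanity check: the morphed child matches the parent on a random input.
rng = np.random.default_rng(1)
W1, b1, W2 = rng.normal(size=(8, 4)), rng.normal(size=8), rng.normal(size=(3, 8))
x = rng.normal(size=4)
parent = W2 @ np.maximum(W1 @ x + b1, 0)
W1n, b1n, W2n = widen_hidden_layer(W1, b1, W2, new_width=12)
child = W2n @ np.maximum(W1n @ x + b1n, 0)
assert np.allclose(parent, child)
```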
We study the problem of image captioning, i.e., automatically describing an image with a sentence. This is a challenging problem because, unlike other computer vision tasks such as image classification and object detection, image captioning requires not only understanding of the image but also knowledge of natural language. We formulate this problem as a multimodal translation task and develop novel algorithms to solve it.
Project details
Labs: Asia
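A minimal PyTorch sketch of the multimodal-translation framing follows: the image feature is projected into the word-embedding space and fed to an LSTM decoder that predicts the caption word by word. The module names, dimensions, and vocabulary size are assumptions for illustration, not the project's actual architecture.

```python
import torch
import torch.nn as nn

class CaptionDecoder(nn.Module):
    def __init__(self, image_dim=2048, embed_dim=256, hidden_dim=512, vocab_size=10000):
        super().__init__()
        self.img_proj = nn.Linear(image_dim, embed_dim)   # image feature -> "first token"
        self.embed = nn.Embedding(vocab_size, embed_dim)  # word ids -> vectors
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)      # hidden state -> word logits

    def forward(self, image_features, captions):
        # Prepend the projected image feature to the embedded caption tokens,
        # then let the LSTM predict each next word.
        img = self.img_proj(image_features).unsqueeze(1)   # (B, 1, E)
        words = self.embed(captions)                       # (B, T, E)
        inputs = torch.cat([img, words], dim=1)            # (B, T+1, E)
        hidden, _ = self.lstm(inputs)
        return self.out(hidden)                            # (B, T+1, V)

# Usage: cross-entropy between these logits and the shifted caption tokens.
model = CaptionDecoder()
feats = torch.randn(4, 2048)             # e.g. CNN features for 4 images
caps = torch.randint(0, 10000, (4, 12))  # 4 captions of 12 token ids
print(model(feats, caps).shape)          # torch.Size([4, 13, 10000])
```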
We study the problem of food image recognition via deep learning techniques. Our goal is to develop a robust service that recognizes thousands of popular Asian and Western foods. Several prototypes have been developed to support diverse applications. We are also developing a prototype called Im2Calories, which automatically calculates the calories and performs nutrition analysis for a dish image.
Project details
Labs: Asia
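The kind of deep-learning classifier described above could be prototyped roughly as below, by fine-tuning a pretrained CNN on food categories. The backbone choice (ResNet-50) and the number of categories are assumptions for illustration only.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_FOOD_CLASSES = 2000                           # assumed number of food categories
backbone = models.resnet50(weights="IMAGENET1K_V1")
backbone.fc = nn.Linear(backbone.fc.in_features, NUM_FOOD_CLASSES)

optimizer = torch.optim.SGD(backbone.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def training_step(images, labels):
    """One SGD step on a batch of food photos (images: B x 3 x 224 x 224)."""
    optimizer.zero_grad()
    loss = criterion(backbone(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```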
Automatically describing video content with natural language is a fundamental challenge of computer vision. Recurrent Neural Networks (RNNs), which model sequence dynamics, have attracted increasing attention for visual interpretation. In this project, we present a novel unified framework, named Long Short-Term Memory with visual-semantic Embedding (LSTM-E), which simultaneously explores the learning of the LSTM and of a visual-semantic embedding.
Project details
Labs: Asia
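The sketch below suggests how the two ingredients of LSTM-E could be trained jointly: a relevance term that pulls video and sentence features together in a shared visual-semantic embedding space, combined with the usual LSTM captioning (coherence) loss. The projection layers, dimensions, and weighting are illustrative assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F

video_dim, sent_dim, embed_dim = 2048, 512, 256
video_proj = nn.Linear(video_dim, embed_dim)   # maps video features into the joint space
sent_proj = nn.Linear(sent_dim, embed_dim)     # maps sentence features into the joint space

def joint_loss(video_feat, sent_feat, lstm_caption_loss, weight=0.5):
    """Combine the embedding (relevance) loss with the LSTM captioning loss."""
    v = F.normalize(video_proj(video_feat), dim=-1)
    s = F.normalize(sent_proj(sent_feat), dim=-1)
    relevance = (1.0 - (v * s).sum(dim=-1)).mean()   # cosine distance in the joint space
    return weight * relevance + (1.0 - weight) * lstm_caption_loss
```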
NUIgraph is a prototype Windows 10 app for visually exploring data in order to discover and share insight.
Project details
Labs: Redmond
The RoomAlive Toolkit is an open source SDK that enables developers to calibrate a network of multiple Kinect sensors and video projectors. The toolkit also provides a simple projection mapping sample that can be used as a basis to develop new immersive augmented reality experiences similar to those of the IllumiRoom and RoomAlive research projects.
Project details
Labs: Redmond
Mano-a-Mano is a unique spatial augmented reality system that combines dynamic projection mapping, multiple perspective views, and device-less interaction to support face-to-face, or dyadic, interaction with 3D virtual objects. Its main advantage over more traditional AR approaches is that users can interact with 3D virtual objects, and with each other, without cumbersome devices that obstruct face-to-face interaction.
Project details
Labs: Redmond
Animated computer graphics are projected onto the base of a fiber optic tree to create a sparse 3D display within the tree. This was created as an entry for Microsoft Research's MakeFest and demonstrated to the MSR MakeFest community on 1/10/2014.
Project details
Labs: Redmond
Using the Internet as a (noisy) knowledge base to mine semantics for multimedia data.
Project details
Labs: Redmond
This paper presents a method for acquiring dense nonrigid shape and deformation from a single monocular depth sensor. We focus on modeling the human hand, and assume that a single rough template model is available. We combine and extend existing work on model-based tracking, subdivision surface fitting, and mesh deformation to acquire detailed hand models from as few as 15 frames of depth data.
Project details
Labs: Cambridge
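As a much-simplified illustration of the alternating correspond-and-solve loop behind model-based fitting, the sketch below rigidly aligns a template point set to observed depth points with a few iterations of nearest-neighbour ICP. The actual method fits a deforming subdivision-surface hand template rather than a rigid point cloud; this toy only shows the overall structure of such an optimization.

```python
import numpy as np

def icp(template, observed, iters=20):
    """Rigidly align `template` (N, 3) to `observed` (M, 3); returns R, t."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        moved = template @ R.T + t
        # Correspond: nearest observed point for each template point.
        d2 = ((moved[:, None, :] - observed[None, :, :]) ** 2).sum(-1)
        target = observed[d2.argmin(axis=1)]
        # Solve: best rigid transform via the Kabsch / Procrustes SVD.
        mu_m, mu_t = moved.mean(0), target.mean(0)
        U, _, Vt = np.linalg.svd((moved - mu_m).T @ (target - mu_t))
        R_step = Vt.T @ U.T
        if np.linalg.det(R_step) < 0:            # guard against reflections
            Vt[-1] *= -1
            R_step = Vt.T @ U.T
        # Compose the incremental transform with the running estimate.
        R, t = R_step @ R, R_step @ (t - mu_m) + mu_t
    return R, t

# Toy check: recover a small known rotation and translation of the same points.
rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 3))
a = 0.1
R_true = np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])
R_est, t_est = icp(pts, pts @ R_true.T + [0.05, 0.0, 0.02])
print(np.round(t_est, 3))
```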
Online 3D reconstruction is gaining newfound interest due to the availability of real-time consumer depth cameras. The basic problem takes live overlapping depth maps as input and incrementally fuses them into a single 3D model. This is particularly challenging when real-time performance is desired without trading off quality or scale. We contribute an online system for large- and fine-scale volumetric reconstruction based on a memory- and speed-efficient data structure.
Project details
Labs: Cambridge
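A toy version of the underlying fusion step is sketched below: each depth sample updates a truncated signed distance value stored per voxel as a running weighted average. The orthographic camera, dense grid, and truncation band are deliberate simplifications; the actual system uses a far more memory- and speed-efficient structure on the GPU.

```python
import numpy as np

GRID, VOXEL, TRUNC = 64, 0.05, 0.15          # 64^3 grid, 5 cm voxels, 15 cm truncation
tsdf = np.ones((GRID, GRID, GRID))           # signed distances, initialised to "far"
weight = np.zeros((GRID, GRID, GRID))        # per-voxel confidence weights

def integrate(depth_along_z):
    """Fuse one (orthographic, z-aligned) depth map of shape (GRID, GRID)."""
    z = (np.arange(GRID) + 0.5) * VOXEL                   # voxel-centre depths
    sdf = depth_along_z[..., None] - z[None, None, :]     # surface distance per voxel
    d = np.clip(sdf / TRUNC, -1.0, 1.0)                   # truncate the distance
    valid = sdf > -TRUNC                                  # skip voxels far behind the surface
    w_new = weight + valid
    tsdf[...] = np.where(valid, (tsdf * weight + d) / np.maximum(w_new, 1), tsdf)
    weight[...] = w_new

integrate(np.full((GRID, GRID), 1.2))        # fuse a flat wall 1.2 m away
```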
We built Sketch2Cartoon, an automatic cartoon-making system. It lets users sketch the major curves of the characters and props they have in mind, and real-time search results drawn from millions of clipart images can be selected to compose the cartoon. The selected components are vectorized and can therefore be further edited. By enabling sketch-based input, even a child too young to read or write can draw whatever they imagine and get interesting cartoon images.
Project details
Labs: Asia
Microsoft Research is happy to continue hosting this series of Image Recognition (Retrieval) Grand Challenges. Do you have what it takes to build the best image recognition system? Enter these MSR Image Recognition Challenges at ACM Multimedia and/or IEEE ICME to develop your image recognition system based on real-world, large-scale data.
Project details
Labs: Redmond
We argue that the massive amount of click data from commercial search engines provides a dataset that is unique in bridging the semantic and intent gaps. Search engines generate millions of click records (i.e., image-query pairs), which provide almost "unlimited" yet strong connections between semantics and images, as well as between users' intents and queries. This site introduces one such dataset, Clickture.
Project details
Labs: Redmond
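As a small illustration of how click data of this kind might be consumed, the sketch below aggregates (image, query, clicks) triads into per-image query labels weighted by click counts. The tab-separated triad format shown is an assumption for illustration, not necessarily the dataset's exact schema.

```python
from collections import defaultdict

def load_click_triads(lines):
    """Aggregate 'image_id <tab> query <tab> clicks' lines into image -> {query: clicks}."""
    clicks_by_image = defaultdict(dict)
    for line in lines:
        image_id, query, clicks = line.rstrip("\n").split("\t")
        bucket = clicks_by_image[image_id]
        bucket[query] = bucket.get(query, 0) + int(clicks)
    return clicks_by_image

# Usage: the most-clicked query acts as a noisy semantic label for the image.
sample = ["img_001\tgolden gate bridge\t12", "img_001\tbridge at sunset\t3"]
labels = load_click_triads(sample)
print(max(labels["img_001"], key=labels["img_001"].get))   # -> "golden gate bridge"
```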
Mobile video is quickly becoming a mass consumer phenomenon. More and more people use their smartphones to search and browse video content while on the move. This project develops an innovative instant mobile video search system through which users can discover videos by simply pointing their phones at a screen to capture a few seconds of what they are watching.
Project details
Labs: Asia
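One simplistic way to picture the matching step is sketched below: hash a compact fingerprint from each captured frame and vote over an index of known videos. The difference-hash fingerprint and the flat index are illustrative stand-ins, not the system's actual descriptors or index.

```python
import numpy as np

def frame_fingerprint(gray_frame, size=8):
    """Difference-hash a grayscale frame (2-D array) into a compact byte key."""
    h, w = gray_frame.shape
    ys = np.linspace(0, h - 1, size).astype(int)
    xs = np.linspace(0, w - 1, size + 1).astype(int)
    small = gray_frame[np.ix_(ys, xs)]                       # coarse size x (size+1) grid
    bits = (small[:, 1:] > small[:, :-1]).astype(np.uint8)   # horizontal gradient signs
    return bits.ravel().tobytes()

index = {}  # fingerprint -> (video_id, timestamp), built offline over the catalogue

def match_capture(captured_frames):
    """Vote over the fingerprints of the few captured seconds of video."""
    votes = {}
    for frame in captured_frames:
        hit = index.get(frame_fingerprint(frame))
        if hit is not None:
            video_id, _timestamp = hit
            votes[video_id] = votes.get(video_id, 0) + 1
    return max(votes, key=votes.get) if votes else None
```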
Gigapixel ArtZoom is an interactive panoramic image of Seattle that captures artists and performers in action throughout a 360-degree view of the city. You can zoom into the image to see dancers, acrobats, painters, performance artists, actors, jugglers, and sculptors—all appearing simultaneously within a single 20-gigapixel image. Visit the web site to explore the panorama and find out more.
Project details
Labs: Redmond
We address the fundamental challenge of scalability for real-time volumetric surface reconstruction methods. We present a memory-efficient, streamable, hierarchical GPU data structure for 3D reconstruction of large-scale scenes with fine geometric details in real time. The system fuses live depth maps from a moving Kinect camera to generate high-quality 3D models of unbounded size.
Project details
Labs: Cambridge
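The memory-efficiency idea can be pictured with the CPU-side toy below: rather than a dense grid, small voxel blocks are allocated on demand and addressed through a hash map keyed by block coordinates. The block size and flat hashing are simplifications; the actual data structure is hierarchical, streamable, and lives on the GPU.

```python
import numpy as np

BLOCK = 8                                    # 8^3 voxels per block
blocks = {}                                  # (bx, by, bz) -> (tsdf, weight) arrays

def voxel_ref(x, y, z):
    """Return the block arrays and local index for world voxel (x, y, z)."""
    key = (x // BLOCK, y // BLOCK, z // BLOCK)
    if key not in blocks:                    # allocate lazily, only near observed surfaces
        blocks[key] = (np.ones((BLOCK,) * 3), np.zeros((BLOCK,) * 3))
    return blocks[key], (x % BLOCK, y % BLOCK, z % BLOCK)

def update_voxel(x, y, z, sdf_sample):
    """Fold one truncated signed-distance sample into the running average."""
    (tsdf, weight), idx = voxel_ref(x, y, z)
    tsdf[idx] = (tsdf[idx] * weight[idx] + sdf_sample) / (weight[idx] + 1)
    weight[idx] += 1

update_voxel(100, 37, 250, 0.2)
print(len(blocks), "block(s) allocated")     # only 1, despite the large coordinates
```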
The goal of the Spin project is to enable users to capture photorealistic 3D models of objects using just an ordinary camera -- with no special lighting, sensors, or other equipment. Our approach works equally well for a mobile phone, a point-and-shoot camera, or a digital SLR camera. The results can be shared and viewed on a phone, in a web browser, or in a desktop application.
Project details
Labs: Redmond