Quick interaction between a human teacher and a learning machine presents numerous benefits and challenges when working with web-scale data. The human teacher guides the machine towards accomplishing the task of interest. The system leverages big data to find examples that maximize the training value of its interaction with the teacher.
The proliferation of connected devices can in theory enable a range of applications that make rich inferences about users and their environment. In practice, however, developing such applications today is arduous: they are constructed as monolithic silos, tightly coupled to sensing devices, and must implement all sensing and inference logic themselves, even as devices move or become temporarily disconnected. Our goal is to break down these restrictive device-application silos and simplify app development.
The Kamino project explores ways in which systems should adopt new memory technologies including SSDs (NAND-Flash), battery-backed DRAM and emerging non-volatile memory technologies (phase change memory, memristors, spin-torque transfer memory, etc.) for increased performance and efficiency. The project explores how to best leverage such new memory technologies inside systems of all sizes and shapes: from mobile to data center scale.
We introduce a novel approach for automatically generating image descriptions. Visual detectors, language models, and deep multimodal similarity models are learned directly from a dataset of image captions. Our system is state-of-the-art on the official Microsoft COCO benchmark, producing a BLEU-4 score of 29.1%. Human judges consider the captions to be as good as or better than those written by people 34% of the time.
Microsoft believes the Surface Hub will be as empowering and as transformative to teams and the shared work environment as the PC was to individuals and the desk. The Surface Hub creates new modalities for creating and brainstorming with its unique large-screen productivity apps and capabilities. We believe it will be a critical component for the modern workplace, home, or other venue where people need to come together to think, ideate, and produce.
We envision a future Internet of Things where every human-created artifact in the world that uses electricity will be connected to the internet. We are creating new experiences and technologies for the coming convergence of digital and physical systems that this future will enable.
The Eye Gaze keyboard project aims to enable people who are unable to speak or use a physical keyboard to communicate using only their eyes. Our initial prototypes are based on an on-screen QWERTY keyboard, very similar to the 'taptip' keyboard built into Windows 8, extended to respond to eye-gaze input from a sensor bar such as the Tobii EyeX. Our goal is to improve communication speed by 25% compared to experienced users of off-the-shelf Speech Generating Devices.
This project aims to enable people to converse with their devices. We are trying to teach devices to engage with humans using human language in ways that appear seamless and natural to humans. Our research focuses on statistical methods by which devices can learn from human-human conversational interactions and can situate responses in the verbal context and in physical or virtual environments.
Mobile devices are severely battery constrained. While smartphone capabilities have increased manifold in the last ten years, the battery energy density has only doubled. In the Prana project, we have been exploring several techniques to improve the battery life of mobile devices towards a vision of having a phone last a week without recharge under normal usage.
The Logical Form (LF) analysis produced by the NLPwin parser is very close in spirit to the level of semantic representation defined in AMR (Abstract Meaning Representation). The "NLPwin parses AMR" project converts LF to AMR in order to facilitate (1) evaluation of the NLPwin LF and (2) contribution to the ongoing discussion of the specification of AMR. In this project, we include publications, as well as links to our LF training data converted to AMR and to the LF-AMR parser for English.
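For readers unfamiliar with the notation, AMR encodes a sentence's meaning as a rooted, labeled graph written in PENMAN form; the canonical example from the AMR specification represents "The boy wants to go" as:

```
(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-01
            :ARG0 b))
```

Note how the reentrant variable `b` captures that the boy is both the wanter and the (intended) goer, a level of abstraction the NLPwin LF similarly aims for.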
This project investigates the design and evaluation of efficient, deployable algorithms for assigning complex workloads to resources in modern cloud service platforms.
The Indoor Patrol Robot is a low-cost, holonomic-drive robot that navigates autonomously indoors using just RGB and ultrasonic sensors. The robot leverages the Better Together Framework to allow real-time remote viewing of its video, and includes a feature to upload photos to OneDrive. The robot can self-navigate to a charging base, enabling 24/7 maintenance-free operation.
The aim of this project is to develop a personalized recommendation system for the timelines of Twitter users, in which tweets are ranked by the user's home location and personal interests. TRUPIL addresses the challenge that Twitter users tend to post short messages of 140 characters reflecting a variety of topics; the large volume of posts across many topics is overwhelming to users who may be interested in only a few of them.
Labs: ATL Cairo
The ability to detect human actions in real-time is fundamental to several applications such as surveillance, gaming, and sign language detection. These applications demand accurate and robust localization of actions at low latencies, which remains a very challenging computer vision task. In this project we present efficient descriptors for action detection on RGBD sequences.
Labs: ATL Cairo
Project Blush explores the materiality of digital ephemera and people's receptiveness to 'digital jewellery', examining the materials and aesthetics that may allow wearables to become jewellables. Project Blush originates from the Human Experience and Design group (HXD), which specialises in designing and fabricating new human experiences with computing. These play on many different kinds of human values, from amplifying efficiency and effectiveness to creating delight.
Code Hunt is a serious educational game. The Code Hunt community is interested in all aspects of research and development around the game, including analysis of the data and development of the platform.
Code Hunt is a serious gaming platform for coding contests and practicing programming skills. It is based on the white-box symbolic execution engine Pex. Code Hunt is unique as an online coding platform in that each puzzle is presented with test cases only, no specification. Players have to first work out the pattern and then code the answer. Code Hunt has been used by over 100,000 players as of February 2015.
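To make the format concrete, here is a hypothetical Code Hunt-style puzzle sketched in Python (the actual platform presents puzzles in languages such as C# and generates its test cases with Pex): the player sees only input/output pairs and must infer the hidden pattern.

```python
# Hypothetical puzzle: the player is shown only these test cases --
#   puzzle(1) -> 1, puzzle(2) -> 4, puzzle(5) -> 25
# -- and must first work out the pattern, then code the answer.
def puzzle(x: int) -> int:
    # The inferred pattern: the result is the square of the input.
    return x * x

# The platform then checks the player's code against a hidden
# reference solution on automatically generated inputs.
for x in (1, 2, 5, 10):
    assert puzzle(x) == x * x
```

Because the specification is withheld, solving a puzzle is an act of discovery rather than translation, which is what makes the game format engaging.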
Catapult is a Microsoft project investigating the use of field-programmable gate arrays (FPGAs) to improve performance, reduce power, and provide new capabilities in the datacenter.
Deep Structured Semantic Model / Deep Semantic Similarity Model
FaST-LMM (Factored Spectrally Transformed Linear Mixed Models) is a set of tools for performing genome-wide association studies (GWAS) on large data sets. FaST-LMM runs on both Windows and Linux, and contains code for (1) univariate GWAS, (2) testing sets of SNPs, (3) feature selection for background correction, (4) epistatic association scans, and (5) correcting for cellular heterogeneity in methylation and similar data.
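The "Factored Spectrally Transformed" part of the name refers to the algorithmic core: a single spectral decomposition of the genetic similarity (kinship) matrix diagonalizes the mixed model's covariance, so every SNP test reduces to a cheap weighted least-squares problem. A minimal NumPy sketch of that trick, on simulated data with a fixed heritability `h2` (illustrative only, not the FaST-LMM API):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Simulated kinship matrix K (symmetric PSD), one SNP x, phenotype y.
G = rng.standard_normal((n, 200))
K = G @ G.T / 200.0
x = rng.standard_normal(n)
X = np.column_stack([np.ones(n), x])      # intercept + SNP
y = 0.5 * x + rng.standard_normal(n)

h2 = 0.5                                  # assumed genetic variance fraction
V = h2 * K + (1 - h2) * np.eye(n)         # phenotype covariance under the LMM

# Direct generalized least squares: beta = (X' V^-1 X)^-1 X' V^-1 y.
Vinv = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)

# The spectral trick: K = U diag(S) U', so V is diagonal in the rotated
# basis and each SNP test becomes weighted least squares.
S, U = np.linalg.eigh(K)
w = 1.0 / (h2 * S + (1 - h2))             # inverse variances per rotated sample
Xr, yr = U.T @ X, U.T @ y
beta_fast = np.linalg.solve(Xr.T @ (w[:, None] * Xr), Xr.T @ (w * yr))

print(np.allclose(beta_gls, beta_fast))   # the two estimates agree
```

The rotation is computed once and reused for every SNP, which is what makes genome-wide scans on large cohorts tractable.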
We envision using Eye Gaze technology to bring independent mobility to people living with disabilities who are unable to use a joystick.
Parasail is a novel approach to parallelizing a large class of seemingly sequential applications wherein dependencies are, at runtime, treated as symbolic values. The efficiency of parallelization, then, depends on the efficiency of the symbolic computation, an active area of research in static analysis, verification, and partial evaluation. This is exciting as advances in these fields can translate to novel parallel algorithms for sequential computation.
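As a toy illustration of the idea (not Parasail itself), consider a running-sum loop: it looks inherently sequential because each step needs the previous total, but each chunk's effect on an unknown incoming total t is the symbolic linear function t -> t + s. Those symbolic summaries can be computed for all chunks in parallel and then composed cheaply in order:

```python
from concurrent.futures import ThreadPoolExecutor

data = list(range(1, 101))
chunks = [data[i:i + 25] for i in range(0, len(data), 25)]

def summarize(chunk):
    # Symbolic summary of the chunk: it maps an unknown incoming
    # total t to t + s, so only s needs to be recorded.
    return sum(chunk)

with ThreadPoolExecutor() as pool:
    summaries = list(pool.map(summarize, chunks))  # parallel phase

total = 0
for s in summaries:       # cheap sequential composition of summaries
    total = total + s

print(total)              # 5050, same as the purely sequential loop
```

The efficiency of the scheme rests on the symbolic summaries being small and cheap to compose, which is exactly where the static-analysis and partial-evaluation techniques mentioned above come in.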
NLPwin is a software project at Microsoft Research that aims to provide Natural Language Processing tools for Windows (hence, NLPwin). The project was started in 1991, just as Microsoft inaugurated the Microsoft Research group; while active development of NLPwin ended in 2002, it is still updated regularly, primarily in service of Machine Translation.
We explore grip and motion sensing to afford new techniques that leverage how users naturally manipulate tablet and stylus devices during pen-and-touch interaction. We can detect whether the user holds the pen in a writing grip or tucked between their fingers. We can distinguish bare-handed inputs, such as drag and pinch gestures, from touch gestures produced by the hand holding the pen; we can also sense which hand grips the tablet and determine the screen's relative orientation to the pen.