We aim to enable people with mobile devices to receive continuously updated information about their surroundings simply by pointing the camera. The system uses image recognition to augment what a person sees on the screen with 2D or 3D graphics that track the environment in real time.
We demonstrate this with a treasure hunt game that guides the user along a previously authored path, indoors or outdoors, using geo-located arrows or floating 3D bubbles. Applications include games, city tours, and self-localization for mobile robotics. At its core, the system is based on a technology for rapidly extracting distinctive features from images and matching them against a database of locations. This technology has already formed the backbone of released products such as Live Labs Photosynth and Microsoft Image Composite Editor.
Our current technology extracts "interest points" and "invariant descriptors" from images to provide characteristic information about a visual scene, which allows matching from one scene to another. We used this to automatically stitch together many photographs in Microsoft Image Composite Editor, a tool for assembling panoramas. We also used it together with 3D reconstruction methods to create 3D scenes from collections of photographs in Live Labs Photosynth. Now that hand-held devices have video cameras and powerful processors, we are developing real-time solutions that continuously match what the camera of a mobile device sees against a database of known views of locations, obtained, for example, from Windows Live Street-side imagery. This localizes the device with significantly more detail than any information gained from GPS.
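The matching step described above can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration (not the project's actual pipeline): it assumes each image yields fixed-length descriptor vectors and matches a query frame's descriptors against a database of stored-view descriptors using nearest-neighbor search with Lowe's ratio test to reject ambiguous matches.

```python
import numpy as np

def match_descriptors(query, database, ratio=0.8):
    """Match query descriptors against a database of stored-view descriptors.

    query: (Q, D) array of descriptors from the live camera frame.
    database: (N, D) array of descriptors from known location views.
    Returns a list of (query_index, database_index) pairs.
    """
    matches = []
    for qi, d in enumerate(query):
        # Euclidean distance from this descriptor to every database entry.
        dists = np.linalg.norm(database - d, axis=1)
        nearest, second = np.argsort(dists)[:2]
        # Lowe's ratio test: accept only if the best match is clearly
        # better than the runner-up, which suppresses ambiguous matches.
        if dists[nearest] < ratio * dists[second]:
            matches.append((qi, int(nearest)))
    return matches
```

A real system would replace the linear scan with an approximate nearest-neighbor index so that matching stays fast as the location database grows.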
Augmented reality is the process of adding computer graphics elements to images of real scenes to provide all kinds of information to the user. Because we track the moment-by-moment mapping between the camera view and stored views, we can add any location-related data to the user's screen and make it appear to be part of the world. Imagine, for example, a virtual tour guide who points out the details of the city as we walk around. This is just one of many applications.
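To make the "moment-by-moment mapping" idea concrete, here is a minimal sketch under a simplifying assumption not stated in the text: if the annotated scene region is roughly planar, the mapping between a stored reference view and the current frame is a 3x3 homography, and a label pinned in the reference view can be transferred into the live image by mapping its anchor point through that homography.

```python
import numpy as np

def transfer_point(H, point):
    """Map a 2D anchor point through a 3x3 homography H.

    H relates the stored reference view to the current camera frame,
    so a label anchored at `point` in the reference view lands at the
    returned pixel location in the live image.
    """
    x, y = point
    p = H @ np.array([x, y, 1.0])   # lift to homogeneous coordinates
    return p[:2] / p[2]             # perspective divide back to pixels
```

In the full system, H (or a full 3D camera pose) would be re-estimated every frame from the descriptor matches, so the overlay stays locked to the scene as the camera moves.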
- Simon Winder, Gang Hua, and Matthew Brown, Picking the Best Daisy, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2009.
- Gang Hua, Matthew Brown, and Simon Winder, Discriminant Embedding for Local Image Descriptors, in IEEE International Conference on Computer Vision (ICCV), October 2007.
- Simon Winder and Matthew Brown, Learning Local Image Descriptors, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2007.
- Augmented Reality for Core Tools demonstration
Simon Winder demonstrates Core Tools for Augmented Reality at the Enabling Innovation Through Research 2009 conference at Microsoft Research Cambridge.
- Treasure hunt demonstration using Core Tools for Augmented Reality
Microsoft Research Principal Researcher Michael Cohen explains his TechFest 2009 treasure hunt demonstration using Core Tools for Augmented Reality.