The Interactive Visual Media group conducts state-of-the-art research on a variety of topics related to computer vision, computer graphics and computational photography. Our goal is to develop new applications for manipulating, reasoning about and communicating with visual media. Core areas of research include:

. Object recognition
. 3D reconstruction and image segmentation
. Image and video enhancement


Each year we hire exceptional PhD students for summer internships. Offers are generally made from December through March for the following summer. For more information, please visit our intern webpage.

The Team

Chris Buehler

Celso Gomes

Neel Joshi

Sing Bing Kang

Krishnan Ramnath

Chris Sienkiewicz

Sudipta Sinha

Eric Stollnitz

Larry Zitnick

Sample Projects

Adopting Abstract Images for Semantic Scene Understanding

Relating visual information to its linguistic semantic meaning remains an open and challenging area of research.

First-person Hyperlapse Videos

Converting first-person videos into hyperlapse videos.

Image-Based Rendering in the Gradient Domain

Standard image-based rendering synthesizes novel views of a scene by reprojecting the input image.

Efficient High-Resolution Stereo Matching using Local Plane Sweeps

A multi-perspective street slide panorama with navigational aids and a mini-map.

Piecewise Planar Stereo for Image-based Rendering

Multiple scene planes are robustly extracted from sparse 3D.

Structured Edge Detection Toolbox

Very fast edge detector (up to 60 fps depending on parameter settings) that achieves excellent accuracy.


Microsoft COCO Common Objects in Context

A new image recognition, segmentation, and captioning dataset.

3D Video Data

This data includes a sequence of 100 images captured from 8 cameras.

Bringing Semantics Into Focus Using Visual Abstraction

We create 1,002 sets of 10 semantically similar abstract scenes with corresponding written descriptions.

MSR-V3D (Microsoft Research Stereo Video + Depth) Dataset

We offer MATLAB code to take a normal, 2-D video and automatically compute depth at every frame.

Middlebury Computer Vision Datasets

Datasets and sample code for evaluating pairwise and multi-view stereo correspondence, optical flow, and MRF inference.

Downloads & Applications

Click here to see all available downloads from our team, including applications for Windows and Windows Phone and plug-ins for Visual Studio and Photoshop.

All Publications

Click here for a list of all our publications (for the most up-to-date lists, please see the individual members' webpages).

Recent Publications

. Anitha Kannan and Simon Baker, Identifying Presentation Styles in Online Educational Videos, no. MSR-TR-2014-141, 6 November 2014

. T.Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, Microsoft COCO: Common Objects in Context, in ECCV, European Conference on Computer Vision, September 2014

. C. Lawrence Zitnick and Piotr Dollár, Edge Boxes: Locating Object Proposals from Edges, in ECCV, European Conference on Computer Vision, September 2014

. Jaesik Park, Sudipta N. Sinha, Yasuyuki Matsushita, Yu-Wing Tai, and In So Kweon, Calibrating a Non-isotropic Near Point Light Source using a Plane, in CVPR, Computer Vision and Pattern Recognition, 21 June 2014

. Sudipta Sinha, Daniel Scharstein, and Richard Szeliski, Efficient High-Resolution Stereo Matching using Local Plane Sweeps, in CVPR, Computer Vision and Pattern Recognition, 21 June 2014

. Bharath Hariharan, C. Lawrence Zitnick, and Piotr Dollár, Detecting Objects using Deformation Dictionaries, in CVPR, Computer Vision and Pattern Recognition, June 2014

. Neel Joshi and C. Lawrence Zitnick, Micro-Baseline Stereo, no. MSR-TR-2014-73, 22 May 2014

. Annegret L. Falkner, Piotr Dollár, Pietro Perona, David J. Anderson, and Dayu Lin, Decoding Ventromedial Hypothalamic Neural Activity during Male Mouse Aggression, in JNS, Journal of Neuroscience, April 2014

. Piotr Dollár, Ron Appel, Serge Belongie, and Pietro Perona, Fast Feature Pyramids for Object Detection, in PAMI, Pattern Analysis and Machine Intelligence, April 2014

. Krishnan Ramnath, Sudipta N. Sinha, Richard Szeliski, and Edward Hsiao, Car Make and Model Recognition using 3D Curve Alignment, in IEEE Winter Conference on Applications of Computer Vision (WACV 2014), IEEE Computer Society, March 2014

Group Alumni

. P. Anandan (Microsoft Research)
. Kentaro Toyama (University of Michigan)
. Zhengyou Zhang (Microsoft Research)
. Harry Shum (Microsoft Technology and Research)
. Antonio Criminisi (Microsoft Research)
. Sumit Basu (Microsoft Research)
. Nebojsa Jojic (Microsoft Research)
. Chuck Jacobs (Microsoft Research)
. David Salesin (Adobe Research)
. Steve Seitz (University of Washington and Google)
. Shai Avidan (Tel-Aviv University)
. Phil Torr (Oxford University)
. Ying Shan (Microsoft)
. Yaron Caspi (Weizmann)
. Chris Pal (École Polytechnique de Montréal)
. Matthew Brown (University of Bath)
. Simon Winder
. Piotr Dollár (Facebook AI Research)
. Simon Baker (NVIDIA)
. Rick Szeliski
. Michael Cohen
. Matt Uyttendaele
. Johannes Kopf
. Ross Girshick