Our research
Sudipta N. Sinha, Daniel Scharstein, and Richard Szeliski

We present a stereo algorithm designed for speed and efficiency that uses local slanted plane sweeps to propose disparity hypotheses for a semi-global matching algorithm. Our local plane hypotheses are derived from initial sparse feature correspondences followed by an iterative clustering step. Local plane sweeps are then performed around each slanted plane to produce out-of-plane parallax and matching-cost estimates. A final global optimization stage, implemented using semi-global matching, assigns...

Publication details
Date: 1 June 2014
Type: Inproceeding
Publisher: IEEE Computer Society
Krishnan Ramnath, Sudipta N. Sinha, Richard Szeliski, and Edward Hsiao

We present a new approach for recognizing the make and model of a car from a single image. While most previous methods are restricted to fixed or limited viewpoints, our system is able to verify a car's make and model from an arbitrary view. Our model consists of 3D space curves obtained by backprojecting image curves onto silhouette-based visual hulls and then refining them using three-view curve matching. These 3D curves are then matched to 2D image curves using a 3D view-based alignment technique....

Publication details
Date: 1 March 2014
Type: Inproceeding
Publisher: IEEE Computer Society
Edward Hsiao, Sudipta Sinha, Krishnan Ramnath, Larry Zitnick, Simon Baker, and Richard Szeliski

We present a new approach for recognizing the make and model of a car from a single image. While most previous methods are restricted to fixed or limited viewpoints, our system is able to verify a car's make and model from an arbitrary view. Our model consists of 3D space curves obtained by backprojecting image curves onto silhouette-based visual hulls and then refining them using three-view curve matching. We also build an appearance model of taillights which is used as an additional cue. Our approach...

Publication details
Date: 1 February 2014
Type: Technical report
Number: MSR-TR-2014-9
Johannes Kopf, Fabian Langguth, Daniel Scharstein, Richard Szeliski, and Michael Goesele

We propose a novel image-based rendering algorithm for handling complex scenes that may include reflective surfaces. Our key contribution lies in treating the problem in the gradient domain. We use a standard technique to estimate scene depth, but assign depths to image gradients rather than pixels. A novel view is obtained by rendering the horizontal and vertical gradients, from which the final result is reconstructed through Poisson integration using an approximate solution as a data term. Our...

Publication details
Date: 1 December 2013
Type: Article
Number: 5
Dilip Krishnan, Raanan Fattal, and Richard Szeliski

We present a new multi-level preconditioning scheme for discrete Poisson equations that arise in various computer graphics applications such as colorization, edge-preserving decomposition for two-dimensional images, and geodesic distances and diffusion on three-dimensional meshes. Our approach interleaves the selection of fine- and coarse-level variables with the removal of weak connections between potential fine-level variables (sparsification) and the compensation for these changes by strengthening...

Publication details
Date: 1 July 2013
Type: Article
Publisher: ACM SIGGRAPH
Number: 4
Sudipta N. Sinha, Krishnan Ramnath, and Richard Szeliski

We present a system that detects 3D mirror-symmetric objects in images and then reconstructs their visible symmetric parts. Our detection stage is based on matching mirror symmetric feature points and descriptors and then estimating the symmetry direction using RANSAC. We enhance this step by augmenting feature descriptors with their affine deformed versions and matching these extended sets of descriptors. The reconstruction stage uses a novel edge matching algorithm that matches symmetric pairs of...

Publication details
Date: 8 October 2012
Type: Inproceeding
Publisher: Springer
Adarsh Kowdle, Sudipta N. Sinha, and Richard Szeliski

We present an automatic approach to segment an object in calibrated images acquired from multiple viewpoints. Our system starts with a new piecewise planar layer-based stereo algorithm that estimates a dense depth map that consists of a set of 3D planar surfaces. The algorithm is formulated using an energy minimization framework that combines stereo and appearance cues, where for each surface, an appearance model is learnt using an unsupervised approach. By treating the planar surfaces as structural...

Publication details
Date: 8 October 2012
Type: Inproceeding
Publisher: Springer Verlag
Varsha Hedau, Sudipta N. Sinha, C. Lawrence Zitnick, and Richard Szeliski

We propose a visual recognition approach aimed at fast recognition of urban landmarks on a GPS-enabled mobile device. While most existing methods offload their computation to a server, the latency of an image upload over a slow network can be a significant bottleneck. In this paper, we investigate a new approach to mobile visual recognition that would involve uploading only GPS coordinates to a server, following which a compact location specific classifier would be downloaded to the client and...

Publication details
Date: 7 October 2012
Type: Inproceeding
Publisher: Springer Verlag
Yekeun Jeong, David Nistér, Drew Steedly, Richard Szeliski, and In-So Kweon

In this paper, we present results and experiments with several methods for bundle adjustment, producing the fastest bundle adjuster ever published in terms of computation and convergence. From a computational perspective, the fastest methods naturally handle the block-sparse pattern that arises in a reduced camera system. Adapting to the naturally arising block-sparsity allows the use of BLAS3, efficient memory handling, fast variable ordering, and customized sparse solving, all simultaneously. We...

Publication details
Date: 1 August 2012
Type: Article
Publisher: IEEE Computer Society
Number: 8
Sudipta N. Sinha, Johannes Kopf, Michael Goesele, Daniel Scharstein, and Richard Szeliski

We present a system for image-based modeling and rendering of real-world scenes containing reflective and glossy surfaces. Previous approaches to image-based rendering assume that the scene can be approximated by 3D proxies that enable view interpolation using traditional back-to-front or z-buffer compositing. In this work, we show how these can be generalized to multiple layers that are combined in an additive fashion to model the reflection and transmission of light that occurs at specular surfaces...

Publication details
Date: 1 August 2012
Type: Inproceeding
Publisher: ACM SIGGRAPH
Taeg Sang Cho, Neel Joshi, C. Lawrence Zitnick, Sing Bing Kang, Richard Szeliski, and William T. Freeman

The restoration of a blurry or noisy image is commonly performed with a MAP estimator, which maximizes a posterior probability to reconstruct a clean image from a degraded image. A MAP estimator, when used with a sparse gradient image prior, reconstructs piecewise smooth images and typically removes textures that are important for visual realism. We present an alternative deconvolution method called iterative distribution reweighting (IDR) which imposes a global constraint on gradients so that a...

Publication details
Date: 1 April 2012
Type: Article
Publisher: IEEE
Number: 4
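For readers unfamiliar with the term, the MAP estimator named in the abstract above can be written in its standard textbook form. The symbols are assumptions chosen for illustration and do not come from this listing: $y$ is the degraded (blurry or noisy) image, $x$ a candidate clean image, $p(y \mid x)$ the likelihood of the degradation model, and $p(x)$ the image prior.

```latex
\hat{x}_{\mathrm{MAP}}
  = \arg\max_{x} \; p(x \mid y)
  = \arg\max_{x} \; p(y \mid x)\, p(x)
```

The second equality follows from Bayes' rule, because the evidence $p(y)$ does not depend on $x$. The sparse gradient prior mentioned in the abstract plays the role of $p(x)$ here.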
Publication details
Date: 1 December 2011
Type: Inproceeding
Publisher: ACM SIGGRAPH Asia
Number: 5
Sameer Agarwal, Yasutaka Furukawa, Noah Snavely, Ian Simon, Brian Curless, Steven M. Seitz, and Richard Szeliski
Publication details
Date: 1 October 2011
Type: Article
Publisher: ACM
Number: 10
Richard Roberts, Sudipta N. Sinha, Richard Szeliski, and Drew Steedly

Most existing structure from motion (SFM) approaches for unordered images cannot handle multiple instances of the same structure in the scene. When image pairs containing different instances are matched based on visual similarity, the pairwise geometric relations as well as the correspondences inferred from such pairs are erroneous, which can lead to catastrophic failures in the reconstruction.

In this paper, we investigate the geometric ambiguities caused by the presence of repeated or...

Publication details
Date: 1 June 2011
Type: Inproceeding
Publisher: IEEE Computer Society
Richard Szeliski, Matthew Uyttendaele, and Drew Steedly

We present a technique for fast Poisson blending and gradient domain compositing. Instead of using a single piecewise-smooth offset map to perform the blending, we associate a separate map with each input source image. Each individual offset map is itself smoothly varying and can therefore be represented using a low-dimensional spline. The resulting linear system is much smaller than either the original Poisson system or the quadtree spline approximation of a single (unified) offset map. We...

Publication details
Date: 1 April 2011
Type: Inproceeding
Publisher: IEEE
Simon Baker, Daniel Scharstein, J.P. Lewis, Stefan Roth, Michael Black, and Richard Szeliski

The quantitative evaluation of optical flow algorithms by Barron et al. (1994) led to significant advances in performance. The challenges for optical flow algorithms today go beyond the datasets and evaluation methods proposed in that paper. Instead, they center on problems associated with complex natural scenes, including nonrigid motion, real sensor noise, and motion discontinuities. We propose a new set of benchmarks and evaluation methods for the next generation of optical flow algorithms. To that...

Publication details
Date: 1 March 2011
Type: Article
Publisher: Springer Verlag
Number: 1
Sameer Agarwal, Noah Snavely, Steven M. Seitz, and Richard Szeliski

We present the design and implementation of a new inexact Newton type algorithm for solving large-scale bundle adjustment problems with tens of thousands of images. We explore the use of Conjugate Gradients for calculating the Newton step and its performance as a function of some simple and computationally efficient preconditioners. We show that the common Schur complement trick is not limited to factorization-based methods and that it can be interpreted as a form of preconditioning. Using photos from...

Publication details
Date: 1 October 2010
Type: Inproceeding
Publisher: Springer Verlag
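As a rough illustration of the idea in the abstract above (solving for a step with Conjugate Gradients plus a cheap preconditioner), here is a minimal preconditioned CG sketch. It is a generic textbook version, not the authors' implementation: the Jacobi (diagonal) preconditioner is an assumed choice, and `preconditioned_cg`, `A`, `b`, and `inv_diag` are hypothetical names. `A` stands in for a small symmetric positive definite system.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, v) for row in A]


def preconditioned_cg(A, b, inv_diag, tol=1e-12, max_iter=100):
    """Solve A x = b for symmetric positive definite A.

    inv_diag holds 1 / A[i][i]; applying it to the residual is the
    Jacobi preconditioner (an assumed, illustrative choice).
    """
    x = [0.0] * len(b)
    r = list(b)                                  # residual b - A x, with x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]   # preconditioned residual
    p = list(z)                                  # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:               # stop once the residual is tiny
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)                  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Tiny hypothetical system; the exact solution is x = [1/11, 7/11].
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
inv_diag = [1.0 / A[i][i] for i in range(len(A))]
x = preconditioned_cg(A, b, inv_diag)
```

The solver only needs matrix-vector products, which is the property that makes CG attractive when the system is too large to factorize.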
Noah Snavely, Ian Simon, Michael Goesele, Richard Szeliski, and Steven M. Seitz

There are billions of photographs on the Internet, representing an extremely large, rich, and nearly comprehensive visual record of virtually every famous place on Earth. Unfortunately, these massive community photo collections are almost completely unstructured, making it very difficult to use them for applications such as the virtual exploration of our world. Over the past several years, advances in computer vision have made it possible to automatically reconstruct 3-D geometry including camera...

Publication details
Date: 1 August 2010
Type: Article
Publisher: IEEE
Number: 8
Michael Goesele, Jens Ackermann, Simon Fuhrmann, Carsten Haubold, Ronny Klowsky, Drew Steedly, and Richard Szeliski

View interpolation and image-based rendering algorithms often produce visual artifacts in regions where the 3D scene geometry is erroneous, uncertain, or incomplete. We introduce ambient point clouds constructed from colored pixels with uncertain depth, which help reduce these artifacts while providing non-photorealistic background coloring and emphasizing reconstructed 3D geometry. Ambient point clouds are created by randomly sampling colored points along the viewing rays associated with uncertain...

Publication details
Date: 1 July 2010
Type: Article
Publisher: Association for Computing Machinery, Inc.
Number: 4
Neel Joshi, Sing Bing Kang, C. Lawrence Zitnick, and Richard Szeliski

We present a deblurring algorithm that uses a hardware attachment coupled with a natural image prior to deblur images from consumer cameras. Our approach uses a combination of inexpensive gyroscopes and accelerometers in an energy optimization framework to estimate a blur function from the camera’s acceleration and angular velocity during an exposure. We solve for the camera motion at a high sampling rate during an exposure and infer the latent image using a joint optimization. Our method is completely...

Publication details
Date: 1 July 2010
Type: Article
Publisher: Association for Computing Machinery, Inc.
Number: 3
Johannes Kopf, Billy Chen, Richard Szeliski, and Michael F. Cohen

Systems such as Google Street View and Bing Maps Streetside enable users to virtually visit cities by navigating between immersive 360° panoramas, or bubbles. The discrete moves from bubble to bubble enabled in these systems do not provide a good visual sense of a larger aggregate such as a whole city block. Multi-perspective "strip" panoramas can provide a visual summary of a city street but lack the full realism of immersive panoramas.

We present Street Slide, which combines the best aspects...

Publication details
Date: 1 July 2010
Type: Article
Publisher: Association for Computing Machinery, Inc.
Number: 4
Yasutaka Furukawa, Brian Curless, Steven M. Seitz, and Richard Szeliski

This paper introduces an approach for enabling existing multi-view stereo methods to operate on extremely large unstructured photo collections. The main idea is to decompose the collection into a set of overlapping sets of photos that can be processed in parallel, and to merge the resulting reconstructions. This overlapping clustering problem is formulated as a constrained optimization and solved iteratively. The merging algorithm, designed to be parallel and out-of-core, incorporates robust filtering...

Publication details
Date: 1 June 2010
Type: Inproceeding
Publisher: IEEE
Yekeun Jeong, David Nistér, Drew Steedly, Richard Szeliski, and In-So Kweon
Publication details
Date: 1 June 2010
Type: Inproceeding
Publisher: IEEE
Taeg Sang Cho, Neel Joshi, Charles Larry Zitnick, Sing Bing Kang, Richard Szeliski, and William Freeman

In image restoration tasks, a heavy-tailed gradient distribution of natural images has been extensively exploited as an image prior. Most image restoration algorithms impose a sparse gradient prior on the whole image, reconstructing an image with piecewise smooth characteristics. While the sparse gradient prior removes ringing and noise artifacts, it also tends to remove mid-frequency textures, degrading the visual quality. We can attribute such degradations to imposing an incorrect image prior. The...

Publication details
Date: 1 June 2010
Type: Inproceeding
Publisher: IEEE
Sameer Agarwal, Yasutaka Furukawa, Noah Snavely, Brian Curless, Steven M. Seitz, and Richard Szeliski

Community photo collections like Flickr offer a rich, ever-growing record of the world around us. New computer vision techniques can use photographs from these collections to rapidly build detailed 3D models.

Publication details
Date: 1 June 2010
Type: Article
Publisher: IEEE Computer Society
Number: 6
Simon Baker, Eric Bennett, Sing Bing Kang, and Richard Szeliski

We present an algorithm to remove wobble artifacts from a video captured with a rolling shutter camera undergoing large accelerations or jitter. We show how estimating the rapid motion of the camera can be posed as a temporal super-resolution problem. The low-frequency measurements are the motions of pixels from one frame to the next. These measurements are modeled as temporal integrals of the underlying high-frequency jitter of the camera. The estimated high-frequency motion of the camera is then used...

Publication details
Date: 1 June 2010
Type: Inproceeding
Publisher: IEEE Computer Society
Simon Baker, Eric Bennett, Sing Bing Kang, and Richard Szeliski

We present an algorithm to remove wobble artifacts from a video captured with a rolling shutter camera undergoing large accelerations or jitter. We show how estimating the rapid motion of the camera can be posed as a temporal super-resolution problem. The low-frequency measurements are the motions of pixels from one frame to the next. These measurements are modeled as temporal integrals of the underlying high-frequency jitter of the camera. The high-frequency estimated motion of the camera is then used...

Publication details
Date: 1 March 2010
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2010-28
Richard Szeliski, Simon Winder, and Matt Uyttendaele

This paper develops a family of multi-pass image resampling algorithms that use one-dimensional filtering stages to achieve high-quality results at low computational cost. Our key insight is to perform a frequency-domain analysis to ensure that very little aliasing occurs at each stage in the multi-pass transform and to insert additional stages where necessary to ensure this. Using one-dimensional resampling enables the use of small resampling kernels, thus producing highly efficient algorithms. We...

Publication details
Date: 1 February 2010
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2010-10
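The abstract above builds on resampling an image with one-dimensional passes. The sketch below shows only the simplest version of that idea, a two-pass (rows, then columns) resize using linear interpolation. It is an assumption-laden illustration: it omits the frequency-domain analysis and the extra stages the report describes, and `resample_1d` and `resample_2d` are hypothetical names.

```python
def resample_1d(samples, new_len):
    """Linearly resample a 1D sequence to new_len samples (endpoints aligned)."""
    n = len(samples)
    if n == 1 or new_len == 1:
        return [float(samples[0])] * new_len
    scale = (n - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        pos = i * scale                  # position in source coordinates
        lo = min(int(pos), n - 2)        # left neighbour, clamped at the end
        t = pos - lo                     # interpolation weight in [0, 1]
        out.append((1.0 - t) * samples[lo] + t * samples[lo + 1])
    return out


def resample_2d(image, new_h, new_w):
    """Resize a 2D image (list of rows) using two 1D passes."""
    # Pass 1: resample every row horizontally.
    rows = [resample_1d(row, new_w) for row in image]
    # Pass 2: resample every column vertically, then transpose back.
    cols = [resample_1d(list(col), new_h) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


# Hypothetical 2 x 2 image upsampled to 3 x 3.
image = [[0, 2], [4, 6]]
result = resample_2d(image, 3, 3)
```

Each pass touches only one dimension, so the kernel applied at every output sample stays small. Plain linear interpolation like this can alias when shrinking an image, which is the kind of problem the report's analysis is aimed at.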
Simon Baker, Daniel Scharstein, J.P. Lewis, Stefan Roth, Michael J. Black, and Richard Szeliski

The quantitative evaluation of optical flow algorithms by Barron et al. (1994) led to significant advances in performance. The challenges for optical flow algorithms today go beyond the datasets and evaluation methods proposed in that paper. Instead, they center on problems associated with complex natural scenes, including nonrigid motion, real sensor noise, and motion discontinuities. We propose a new set of benchmarks and evaluation methods for the next generation of optical flow algorithms. To that...

Publication details
Date: 1 December 2009
Type: Technical report
Publisher: Microsoft Research
Number: MSR-TR-2009-179
Sudipta N. Sinha, Drew Steedly, and Richard Szeliski

We present a novel multi-view stereo method designed for image-based rendering that generates piecewise planar depth maps from an unordered collection of photographs.

First, a discrete set of 3D plane candidates is computed based on a sparse point cloud of the scene (recovered by structure from motion) and sparse 3D line segments reconstructed from multiple views. Next, evidence is accumulated for each plane using 3D point and line incidence and photo-consistency cues. Finally, a piecewise...

Publication details
Date: 29 September 2009
Type: Inproceeding
Publisher: IEEE
Sameer Agarwal, Noah Snavely, Ian Simon, Steven M. Seitz, and Richard Szeliski

We present a system that can match and reconstruct 3D scenes from extremely large collections of photographs such as those found by searching for a given city (e.g., Rome) on Internet photo sharing sites. Our system uses a collection of novel parallel distributed matching and reconstruction algorithms, designed to maximize parallelism at each stage in the pipeline and minimize serialization bottlenecks. It is designed to scale gracefully with both the size of the problem and the amount of available...

Publication details
Date: 1 September 2009
Type: Inproceeding
Publisher: IEEE
Yasutaka Furukawa, Brian Curless, Steven M. Seitz, and Richard Szeliski

This paper proposes a fully automated 3D reconstruction and visualization system for architectural scenes (interiors and exteriors). The reconstruction of indoor environments from photographs is particularly challenging due to texture-poor planar surfaces such as uniformly-painted walls. Our system first uses structure-from-motion, multiview stereo, and a stereo algorithm specifically designed for Manhattan-world scenes (scenes consisting predominantly of piece-wise planar surfaces with dominant...

Publication details
Date: 1 September 2009
Type: Inproceeding
Publisher: IEEE
Ryan S. Kaminsky, Noah Snavely, Steven M. Seitz, and Richard Szeliski

We address the problem of automatically aligning structure-from-motion reconstructions to overhead images, such as satellite images, maps and floor plans, generated from an orthographic camera. We compute the optimal alignment using an objective function that matches 3D points to image edges and imposes free space constraints based on the visibility of points in each camera. We demonstrate the accuracy of our alignment algorithm on several outdoor and indoor scenes using both satellite and floor plan...

Publication details
Date: 21 June 2009
Type: Inproceeding
Publisher: IEEE Computer Society
Sudipta N. Sinha, Drew Steedly, Richard Szeliski, Maneesh Agrawala, and Marc Pollefeys

We present an interactive system for generating photorealistic, textured, piecewise-planar 3D models of architectural structures and urban scenes from unordered sets of photographs. To reconstruct 3D geometry in our system, the user draws outlines overlaid on 2D photographs. The 3D structure is then automatically computed by combining the 2D interaction with the multi-view geometric information recovered by performing structure from motion analysis on the input photographs. We utilize vanishing point...

Publication details
Date: 1 December 2008
Type: Inproceeding
Publisher: Association for Computing Machinery, Inc.
Noah Snavely, Steven M. Seitz, and Richard Szeliski

There are billions of photographs on the Internet, comprising the largest and most diverse photo collection ever assembled. How can computer vision researchers exploit this imagery? This paper explores this question from the standpoint of 3D scene modeling and visualization. We present structure-from-motion and image-based rendering algorithms that operate on hundreds of images downloaded as a result of keyword-based image search queries like “Notre Dame” or “Trevi Fountain.” This approach, which we...

Publication details
Date: 1 November 2008
Type: Article
Publisher: Springer-Verlag
Number: 2
Richard Szeliski, Matthew Uyttendaele, and Drew Steedly

We present a technique for fast Poisson blending and gradient domain compositing. Instead of using a single (piecewise-smooth) offset map to perform the blending, we associate a separate map with each input source image. Each individual offset map is itself smoothly varying and can therefore be represented using a low-dimensional spline. The resulting linear system is much smaller than either the original Poisson system or the quadtree spline approximation of a single (unified) offset map. We...

Publication details
Date: 1 April 2008
Type: Technical report
Number: MSR-TR-2008-58
Simon Baker, Daniel Scharstein, J.P. Lewis, Stefan Roth, Michael Black, and Richard Szeliski

The quantitative evaluation of optical flow algorithms by Barron et al. led to significant advances in the performance of optical flow methods. The challenges for optical flow today go beyond the datasets and evaluation methods proposed in that paper and center on problems associated with nonrigid motion, real sensor noise, complex natural scenes, and motion discontinuities. Our goal is to establish a new set of benchmarks and evaluation methods for the next generation of optical flow algorithms. To...

Publication details
Date: 1 October 2007
Type: Inproceeding
Publisher: IEEE Computer Society
Xiangyang Lan, Larry Zitnick, and Richard Szeliski

In this paper, we describe a model-based approach to object recognition. Spatial relationships between matching primitives are modeled using a purely local bi-gram representation consisting of transition probabilities between neighboring primitives. For matching primitives, sets of one, two or three features are used. The addition of doublets and triplets provides a highly discriminative matching primitive and a reference frame that is invariant to similarity or affine transformations. The recognition...

Publication details
Date: 1 May 2007
Type: Technical report
Number: MSR-TR-2007-54
Larry Zitnick, Jie Sun, Richard Szeliski, and Simon Winder

Known object recognition is the task of recognizing specific objects, such as cereal boxes or soda cans. Millions of such objects exist, and finding a computationally feasible method for recognition can be difficult. Ideally, the computational costs should scale with the complexity of the testing image, and not the size of the object database. To accomplish this goal we propose a method for detection and recognition based on triplets of feature descriptors. Each feature is given a label based on a...

Publication details
Date: 1 April 2007
Type: Technical report
Number: MSR-TR-2007-53
Ce Liu, Richard Szeliski, Sing Bing Kang, C. Lawrence Zitnick, and William T. Freeman

Most existing image denoising work assumes additive white Gaussian noise (AWGN) and removes the noise independent of the RGB channels. Therefore, the current approaches are not fully automatic and cannot effectively remove color noise produced by a CCD digital camera. In this paper, we propose a framework for two tasks, automatically estimating and removing color noise from a single image using piecewise smooth image models. We estimate a noise level function (NLF), a continuous function of noise level to...

Publication details
Date: 1 December 2006
Type: Technical report
Number: MSR-TR-2006-180
Dani Lischinski, Zeev Farbman, Matt Uyttendaele, and Richard Szeliski

This paper presents a new interactive tool for making local adjustments of tonal values and other visual parameters in an image. Rather than carefully selecting regions or hand-painting layer masks, the user quickly indicates regions of interest by drawing a few simple brush strokes and then uses sliders to adjust the brightness, contrast, and other parameters in these regions. The effects of the user's sparse set of constraints are interpolated to the entire image using an edge-preserving energy...

Publication details
Date: 1 August 2006
Type: Proceedings
Publisher: Association for Computing Machinery, Inc.
Samuel W. Hasinoff, Sing Bing Kang, and Richard Szeliski

In the last few years, new view synthesis has emerged as an important application of 3D stereo reconstruction. While the quality of stereo has improved, it is still imperfect, and a unique depth is typically assigned to every pixel. This is problematic at object boundaries, where the pixel colors are mixtures of foreground and background colors. Interpolating views without explicitly accounting for this effect results in objects with a "cut-out" appearance. To produce seamless view interpolation, we...

Publication details
Date: 1 July 2006
Type: Article
Publisher: Elsevier Science Inc.
Number: 1
Richard Szeliski, Ramin Zabih, Daniel Scharstein, Olga Veksler, Vladimir Kolmogorov, Aseem Agarwala, Marshall Tappen, and Carsten Rother

One of the most exciting advances in early vision has been the development of efficient energy minimization algorithms. Many early vision tasks require labeling each pixel with some quantity such as depth or texture. While these problems can be elegantly expressed in the language of Markov Random Fields (MRF's), the resulting energy minimization problems were widely viewed as intractable. Recently, algorithms such as graph cuts and Loopy Belief Propagation (LBP) have proven to be very powerful: for...

Publication details
Date: 1 June 2006
Type: Inproceeding
Publisher: Springer-Verlag
Anat Levin and Richard Szeliski

One of the important goals of image and video registration is accurate motion and structure estimation. Good motion estimation is also an important requirement for most mosaicing and novel view generation techniques. While it has been well known for a while that narrow field-of-view cameras have a hard time distinguishing between certain kinds of rotations and translations, it has been recently observed that using omni-directional cameras significantly decreases those ambiguities....

Publication details
Date: 1 May 2006
Type: Technical report
Number: MSR-TR-2006-37
Richard Szeliski

This paper develops locally adapted hierarchical basis functions for effectively preconditioning large optimization problems that arise in computer vision, computer graphics, and computational photography applications such as surface interpolation, optic flow, tone mapping, gradient-domain blending, and colorization. By looking at the local structure of the coefficient matrix and performing a recursive set of variable eliminations, combined with a simplification of the resulting coarse level problems,...

Publication details
Date: 1 May 2006
Type: Technical report
Number: MSR-TR-2006-38
Patrick Baudisch, Desney Tan, Drew Steedly, Eric Rudolph, Matt Uyttendaele, Chris Pal, and Richard Szeliski

Image stitching allows users to combine multiple regular-sized photographs into a single wide-angle picture, often referred to as a panoramic picture. To create such a panoramic picture, users traditionally first take all the photographs, then upload them to a PC and stitch. During stitching, however, users often discover that the produced panorama contains artifacts or is incomplete. Fixing these flaws requires retaking individual images, which is often difficult by this time. In this paper, we present...

Publication details
Date: 1 January 2006
Type: Inproceeding
Patrick Baudisch, Desney Tan, Drew Steedly, Eric Rudolph, Matt Uyttendaele, Chris Pal, and Richard Szeliski

Image stitching allows users to combine multiple regular-sized photographs into a single wide-angle picture, often referred to as a panoramic picture. To create such a panoramic picture, users traditionally first take all the photographs, then upload them to a PC and stitch. During stitching, however, users often discover that the produced panorama contains artifacts or is incomplete. Fixing these flaws requires retaking individual images, which is often difficult by this time. In this paper, we present...

Publication details
Date: 1 November 2005
Type: Inproceeding
Matthew Brown, Simon Winder, and Richard Szeliski

This paper describes a novel multi-view matching framework based on a new type of invariant feature. Our features are located at Harris corners in discrete scale-space and oriented using a blurred local gradient. This defines a rotationally invariant frame in which we sample a feature descriptor, which consists of an 8 × 8 patch of bias/gain normalised intensity values. The density of features in the image is controlled using a novel adaptive non-maximal suppression algorithm, which gives a better...

Publication details
Date: 1 June 2005
Type: Inproceeding
Publisher: IEEE Computer Society
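The descriptor in the abstract above is an 8 × 8 patch of bias/gain normalised intensity values. One common reading of that phrase, assumed here rather than taken from the listing, is to subtract the patch mean (bias) and divide by its standard deviation (gain). The sketch below illustrates that reading only; `normalize_patch` and the sample patch are hypothetical, and the sampling of the patch from the oriented feature frame is not shown.

```python
from statistics import fmean, pstdev


def normalize_patch(patch, eps=1e-8):
    """Flatten a 2D patch and normalise it to zero mean and unit std.

    eps guards against division by zero for a constant patch.
    """
    values = [float(v) for row in patch for v in row]
    mean = fmean(values)     # bias estimate
    std = pstdev(values)     # gain estimate
    return [(v - mean) / (std + eps) for v in values]


# Hypothetical 8 x 8 patch of intensities.
patch = [[(3 * r + 5 * c) % 17 + 10 * r for c in range(8)] for r in range(8)]
descriptor = normalize_patch(patch)   # 64 values

# The same patch under an affine intensity change (gain 1.7, bias 12).
brighter = [[1.7 * v + 12.0 for v in row] for row in patch]
descriptor_brighter = normalize_patch(brighter)
```

Under this normalisation the two descriptors agree up to rounding, so a global change in brightness or contrast does not affect matching.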
Antonio Criminisi, Sing Bing Kang, Rahul Swaminathan, Richard Szeliski, and P. Anandan

Despite progress in stereo reconstruction and structure from motion, three-dimensional scene reconstruction from multiple images still faces many difficulties, especially in dealing with occlusions, partial visibility, textureless regions and specular reflections. Moreover, the problem of recovering a spatially dense three-dimensional representation from many views has not been adequately treated. This document addresses the problems of achieving a dense reconstruction from a sequence of images and...

Publication details
Date: 1 January 2005
Type: Article
Publisher: Elsevier
Number: 1
Sing Bing Kang, Charles Lawrence Zitnick, Matthew Uyttendaele, Simon Winder, and Richard Szeliski
Publication details
Date: 1 December 2004
Type: Inproceeding
Matthew Brown, Richard Szeliski, and Simon Winder

This paper describes a novel multi-view matching framework based on a new type of invariant feature. Our features are located at Harris corners in discrete scale-space and oriented using a blurred local gradient. This defines a rotationally invariant frame in which we sample a feature descriptor, which consists of an 8x8 patch of bias/gain normalised intensity values. The density of features in the image is controlled using a novel adaptive non-maximal suppression algorithm, which gives a better spatial...

Publication details
Date: 1 December 2004
Type: Technical report
Number: MSR-TR-2004-133
Richard Szeliski

This tutorial reviews image alignment and image stitching algorithms. Image alignment (registration) algorithms can discover the large-scale (parametric) correspondence relationships among images with varying degrees of overlap. They are ideally suited for applications such as video stabilization, summarization, and the creation of large-scale panoramic photographs. Image stitching algorithms take the alignment estimates produced by such registration algorithms and blend the images in a seamless manner,...

Publication details
Date: 1 October 2004
Type: Technical report
Number: MSR-TR-2004-92
C. Zitnick, S.B. Kang, M. Uyttendaele, S. Winder, and R. Szeliski

The ability to interactively control viewpoint while watching a video is an exciting application of image-based rendering. The goal of our work is to render dynamic scenes with interactive viewpoint control using a relatively small number of video cameras. In this paper, we show how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms. Once these video streams have been processed,...

Publication details
Date: 1 August 2004
Type: Inproceeding
Publisher: Association for Computing Machinery, Inc.
Number: 3
Zhouchen Lin and Heung-Yeung Shum

Recently, many image-based modeling and rendering techniques have been successfully designed to render photo-realistic images without the need for explicit 3D geometry. However, these techniques (e.g., light field rendering (Levoy, M. and Hanrahan, P., 1996. In SIGGRAPH 1996 Conference Proceedings, Annual Conference Series, Aug. 1996, pp. 31–42) and Lumigraph (Gortler, S.J., Grzeszczuk, R., Szeliski, R., and Cohen, M.F., 1996. In SIGGRAPH 1996 Conference Proceedings, Annual Conference Series, Aug. 1996,...

Publication details
Date: 1 July 2004
Type: Article
Publisher: Kluwer Academic
Sing Bing Kang, Richard Szeliski, and Matthew Uyttendaele

A major problem in stitching a partially overlapping set of images taken from different viewpoints is the presence of parallax, which causes ghosting. We propose a new technique, Multi-Perspective Plane Sweep (MPPS), to handle this problem. Given a pair of images to stitch, we rectify them, find their area of intersection, and estimate a disparity map by plane sweeping different columns referenced at different virtual camera positions. This minimizes object distortion across the stitched area while...

Publication details
Date: 1 June 2004
Type: Technical report
Number: MSR-TR-2004-48
Matthew Uyttendaele, Antonio Criminisi, Sing Bing Kang, Simon Winder, Richard Hartley, and Richard Szeliski

Interactive scene walk-throughs have long been an important computer graphics application area. Starting with Fred Brooks's pioneering work, efficient rendering algorithms have emerged for visualizing large architectural databases. More recently, researchers have developed techniques for constructing photorealistic 3D architectural models from real-world images. Real-world tours based on panoramic images also exist, as we describe in the "Panoramic Imaging" sidebar. These systems all aim to create a real...

Publication details
Date: 1 May 2004
Type: Article
Publisher: IEEE Computer Society
Number: 3
Publication details
Date: 1 October 2003
Type: Technical report
Daniel Scharstein and Richard Szeliski

Recent progress in stereo algorithm performance is quickly outpacing the ability of existing stereo data sets to discriminate among the best-performing algorithms, motivating the need for more challenging scenes with accurate ground truth information. This paper describes a method for acquiring high-complexity stereo image pairs with pixel-accurate correspondence information using structured light. Unlike traditional range-sensing approaches, our method does not require the calibration of the light...

Publication details
Date: 1 June 2003
Type: Inproceeding
Publisher: IEEE Computer Society
Rahul Swaminathan, Sing Bing Kang, Antonio Criminisi, and Richard Szeliski

Real scenes are full of specularities (highlights and reflections), and yet most vision algorithms ignore them. In order to capture the appearance of realistic scenes, we need to model specularities as separate layers. In this paper, we study the behavior of specularities in static scenes as the camera moves, and describe their dependence on varying surface geometry, orientation, and scene point and camera locations. For a rectilinear camera motion with constant velocity, we study how the specular...

Publication details
Date: 1 May 2002
Type: Inproceeding
Publisher: Springer Verlag
Daniel Scharstein and Richard Szeliski

Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods. Our taxonomy is designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments...

Publication details
Date: 1 May 2002
Type: Article
Publisher: Kluwer Academic
Number: 1
Publication details
Date: 1 January 2002
Type: Technical report
Yung-Yu Chuang, Brian Curless, David H. Salesin, and Richard Szeliski

This paper proposes a new Bayesian framework for solving the matting problem, i.e., extracting a foreground element from a background image by estimating an opacity for each pixel of the foreground element. Our approach models both the foreground and background color distributions with spatially varying sets of Gaussians, and assumes a fractional blending of the foreground and background colors to produce the final output. It then uses a maximum-likelihood criterion to estimate the optimal opacity,...
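
The fractional blending assumed in the abstract above is commonly written as C = αF + (1 − α)B for an observed color C, foreground F, background B, and opacity α. The sketch below is not the paper's Bayesian estimator: it assumes F and B are already known and recovers α by a plain least-squares projection. The function names are hypothetical.

```python
def composite(fg, bg, alpha):
    """Blend foreground and background colors: C = alpha * F + (1 - alpha) * B."""
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


def estimate_alpha(observed, fg, bg):
    """Least-squares opacity for known F and B, clamped to [0, 1].

    Projects (C - B) onto (F - B). This is only a simplified stand-in for the
    maximum-likelihood estimate described in the abstract.
    """
    diff = [f - b for f, b in zip(fg, bg)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        raise ValueError("foreground and background colors are identical")
    num = sum((c - b) * d for c, b, d in zip(observed, bg, diff))
    return min(1.0, max(0.0, num / denom))


red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
mixed = composite(red, blue, 0.25)
alpha = estimate_alpha(mixed, red, blue)
```

When F and B are themselves uncertain, as in real matting problems, the projection alone is not enough, which is where a probabilistic model of the color distributions comes in.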

Publication details
Date: 1 December 2001
Type: Inproceeding
Publisher: IEEE Computer Society
Richard Szeliski and Daniel Scharstein

Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the...

Publication details
Date: 1 November 2001
Type: Technical report
Number: MSR-TR-2001-81
Sing Bing Kang, Richard Szeliski, and Jinxiang Chai

While stereo matching was originally formulated as the recovery of 3D shape from a pair of images, it is now generally recognized that using more than two images can dramatically improve the quality of the reconstruction. Unfortunately, as more images are added, the prevalence of semi-occluded regions (pixels visible in some but not all images) also increases. In this paper, we propose some novel techniques to deal with this problem. Our first idea is to use a combination of shiftable windows and a...

Publication details
Date: 1 September 2001
Type: Technical report
Number: MSR-TR-2001-80
Ian Buck, Adam Finkelstein, Chuck Jacobs, Allison W. Klein, David Salesin, Joshua Seims, Richard Szeliski, and Kentaro Toyama

We present a novel method for generating performance-driven, "hand-drawn" animation in real-time. Given an annotated set of hand-drawn faces for various expressions, our algorithm performs multi-way morphs to generate real-time animation that mimics the expressions of a user. Our system consists of a vision-based tracking component and a rendering component. Together, they form an animation system that can be used in a variety of applications, including teleconferencing, multi-user virtual worlds,...

Publication details
Date: 1 January 2000
Type: Inproceeding
Publisher: Association for Computing Machinery, Inc.
Publication details
Date: 1 November 1999
Type: Article
Publisher: Association for Computing Machinery, Inc.
Number: 4
Ken Hinckley, Mike Sinclair, Erik Hanson, Richard Szeliski, and Matt Conway

The VideoMouse is a mouse that uses a camera as its input sensor. A real-time vision algorithm determines the six degree-of-freedom mouse posture, consisting of 2D motion, tilt in the forward/back and left/right axes, rotation of the mouse about its vertical axis, and some limited height sensing. Thus, a familiar 2D device can be extended for three-dimensional manipulation, while remaining suitable for standard 2D GUI tasks. We describe techniques for mouse functionality, 3D manipulation, navigating...

Publication details
Date: 1 November 1999
Type: Inproceeding
Publisher: Association for Computing Machinery, Inc.
Frédéric Pighin, David H. Salesin, and Richard Szeliski

Given video footage of a person's face, we present new techniques to automatically recover the face position and the facial expression from each frame in the video sequence. A 3D face model is fitted to each frame using a continuous optimization technique. Our model is based on a set of 3D face models that are linearly combined using 3D morphing. Our method has the advantages over previous techniques of fitting directly a realistic 3-dimensional face model and of recovering parameters that can be used...

Publication details
Date: 1 September 1999
Type: Inproceeding
Publisher: IEEE Computer Society
P. H. S. Torr, Richard Szeliski, and P. Anandan

This paper describes a Bayesian approach for modeling 3D scenes as a collection of approximately planar layers that are arbitrarily positioned and oriented in the scene. In contrast to much of the previous work on layer-based motion modeling, which computes layered descriptions of 2D image motion, our work leads to a 3D description of the scene. We focus on the key problem of automatically segmenting the scene into layers based on stereo disparity data from multiple images. The prior assumptions about...

Publication details
Date: 1 September 1999
Type: Inproceeding
Publisher: IEEE Computer Society
Richard Szeliski

This paper reviews a number of recently developed stereo matching algorithms and representations. It focuses on techniques that are especially well suited for image-based rendering applications such as novel view generation and the mixing of live imagery with synthetic computer graphics. The paper begins by reviewing some recent approaches to the classic problem of recovering a depth map from two or more images. It then describes a number of newer representations (and their associated reconstruction...

Publication details
Date: 1 September 1999
Type: Inproceeding
Publisher: Springer Verlag
Richard Szeliski

This paper presents a new methodology for evaluating the quality of motion estimation and stereo correspondence algorithms. Motivated by applications such as novel view generation and motion-compensated compression, we suggest that the ability to predict new views or frames is a natural metric for evaluating such algorithms. Our new metric has several advantages over comparing algorithm outputs to true motions or depths. First of all, it does not require the knowledge of ground truth data, which may be...

Publication details
Date: 1 September 1999
Type: Inproceeding
Publisher: IEEE Computer Society
Heung-Yeung Shum and Richard Szeliski

This paper presents a new approach to computing depth maps from a large collection of images where the camera motion has been constrained to planar concentric circles. We resample the resulting collection of regular perspective images into a set of multiperspective panoramas, and then compute depth maps directly from these resampled images. Only a small number of multiperspective panoramas is needed to obtain a dense and accurate 3D reconstruction, since our panoramas sample uniformly in three...

Publication details
Date: 1 September 1999
Type: Inproceeding
Publisher: Institute of Electrical and Electronics Engineers, Inc.
Richard Szeliski

This paper presents a new approach to computing dense depth and motion estimates from multiple images. Rather than computing a single depth or motion map from such a collection, we associate motion or depth estimates with multiple images in the collection. This has the advantage that the depth or motion of regions occluded in one image will still be represented in some other image. Thus, tasks such as novel view interpolation or motion-compensated prediction can be solved with greater fidelity. It also...

Publication details
Date: 1 July 1999
Type: Technical report
Publisher: Institute of Electrical and Electronics Engineers, Inc.
Number: MSR-TR-99-19
Richard Szeliski, P. Anandan, and Simon Baker

We propose a framework for extracting structure from stereo which represents the scene as a collection of approximately planar layers. Each layer consists of an explicit 3D plane equation, a colored image with per-pixel opacity, and a per-pixel depth offset relative to the plane. Initial estimates of the layers are recovered using techniques taken from parametric motion estimation. These initial estimates are then refined using a re-synthesis algorithm which takes into account both occlusions and mixed...

Publication details
Date: 1 June 1999
Type: Inproceeding
Publisher: IEEE Computer Society
R. Szeliski and P. Golland

This paper formulates and solves a new variant of the stereo correspondence problem: simultaneously recovering the disparities, true colors, and opacities of visible surface elements. This problem arises in newer applications of stereo reconstruction, such as view interpolation and the layering of real imagery with synthetic graphics for special effects and virtual studio applications. While this problem is intrinsically more difficult than traditional stereo correspondence, where only the disparities...

Publication details
Date: 1 May 1999
Type: Article
Publisher: Kluwer Academic
Number: 1
Richard Szeliski and P.H.S. Torr

Structure from motion algorithms typically do not use external geometric constraints, e.g., the coplanarity of certain points or known orientations associated with such planes, until a final post-processing stage. In this paper, it is shown how such geometric constraints can be incorporated early on in the reconstruction process, thereby improving the quality of the estimates. The approaches studied include hallucinating extra point matches in planar regions, computing fundamental matrices directly from...

Publication details
Date: 1 November 1998
Type: Technical report
Publisher: Springer-Verlag
Number: MSR-TR-98-64
Frederic Pighin, Jamie Hecker, Dani Lischinski, David H. Salesin, and Richard Szeliski

We present new techniques for creating photorealistic textured 3D facial models from photographs of a human subject, and for creating smooth transitions between different facial expressions by morphing between these different models. Starting from several uncalibrated views of a human subject, we employ a user-assisted technique to recover the camera poses corresponding to the views as well as the 3D coordinates of a sparse set of chosen locations on the subject’s face. A scattered data interpolation...

Publication details
Date: 1 July 1998
Type: Inproceeding
Publisher: Association for Computing Machinery, Inc.
Jonathan Shade, Steven Gortler, Li-wei He, and Richard Szeliski

In this paper we present a set of efficient image-based rendering methods capable of rendering multiple frames per second on a PC. The first method warps Sprites with Depth representing smooth surfaces without the gaps found in other techniques. A second method for more general scenes performs warping from an intermediate representation called a Layered Depth Image (LDI). An LDI is a view of the scene from a single input camera view, but with multiple pixels along each line of sight. The size of the...

Publication details
Date: 1 July 1998
Type: Inproceeding
Publisher: Association for Computing Machinery, Inc.
Daniel Scharstein and Richard Szeliski

One of the central problems in stereo matching (and other image registration tasks) is the selection of optimal window sizes for comparing image regions. This paper addresses this problem with some novel algorithms based on iteratively diffusing support at different disparity hypotheses, and locally controlling the amount of diffusion based on the current quality of the disparity estimate. It also develops a novel Bayesian estimation technique, which significantly outperforms techniques based on...

Publication details
Date: 1 June 1998
Type: Article
Publisher: Kluwer Academic
Number: 2
Richard Szeliski and Phil Torr

Structure from motion algorithms typically do not use external geometric constraints, e.g., the coplanarity of certain points or known orientations associated with such planes, until a final post-processing stage. In this paper, we show how such geometric constraints can be incorporated early on in the reconstruction process, thereby improving the quality of the estimates. The approaches we study include hallucinating extra point matches in planar regions, computing fundamental matrices directly...

Publication details
Date: 1 June 1998
Type: Inproceeding
Publisher: Springer Verlag
S. Baker, R. Szeliski, and P. Anandan

We propose a framework for extracting structure from stereo which represents the scene as a collection of approximately planar layers. Each layer consists of an explicit 3D plane equation, a colored image with per-pixel opacity (a sprite), and a per-pixel depth offset relative to the plane. Initial estimates of the layers are recovered using techniques taken from parametric motion estimation. These initial estimates are then refined using a re-synthesis algorithm which takes into account both occlusions...

Publication details
Date: 1 June 1998
Type: Inproceeding
Publisher: IEEE Computer Society
Heung-Yeung Shum, Mei Han, and Richard Szeliski

This paper presents an interactive modeling system that constructs 3D models from a collection of panoramic image mosaics. A panoramic mosaic consists of a set of images taken around the same viewpoint, and a transformation matrix associated with each input image. Our system first recovers the camera pose for each mosaic from known line directions and points, and then constructs the 3D model using all available geometrical constraints. We partition constraints into soft and hard linear constraints so...

Publication details
Date: 1 June 1998
Type: Inproceeding
Publisher: IEEE Computer Society
Richard Szeliski and Richard Weiss

Recovering the shape of an object from two views fails at occluding contours of smooth objects because the extremal contours are view dependent. For three or more views, shape recovery is possible, and several algorithms have recently been developed for this purpose. We present a new approach to the multiframe stereo problem that does not depend on differential measurements in the image, which may be noise sensitive. Instead, we use a linear smoother to optimally combine all of the measurements...

Publication details
Date: 1 June 1998
Type: Article
Publisher: Kluwer Academic
Number: 1
Heung-Yeung Shum, Richard Szeliski, Simon Baker, Mei Han, and P. Anandan

We present some recent progress in designing and implementing two interactive image-based 3D modeling systems. The first system constructs 3D models from a collection of panoramic image mosaics. A panoramic mosaic consists of a set of images taken around the same viewpoint, and a camera matrix associated with each input image. The user first interactively specifies features such as points, lines, and planes. Our system recovers the camera pose for each mosaic from known line directions and reference...

Publication details
Date: 1 June 1998
Type: Inproceeding
Publisher: Springer Verlag
Heung-Yeung Shum and Richard Szeliski

This paper presents a new approach to computing depth maps from a large collection of images where the camera motion has been constrained to planar concentric circles. We resample the resulting collection of regular perspective images into a set of multiperspective panoramas, and then compute depth maps directly from these resampled images. Only a small number of multiperspective panoramas is needed to obtain a dense and accurate 3D reconstruction, since our panoramas sample uniformly in three...

Publication details
Date: 1 January 1998
Type: Inproceeding
Publisher: IEEE Computer Society
Richard Szeliski and Polina Golland

This paper formulates and solves a new variant of the stereo correspondence problem: simultaneously recovering the disparities, true colors, and opacities of visible surface elements. This problem arises in newer applications of stereo reconstruction, such as view interpolation and the layering of real imagery with synthetic graphics for special effects and virtual studio applications. While this problem is intrinsically more difficult than traditional stereo correspondence, where only the disparities...

Publication details
Date: 1 January 1998
Type: Inproceeding
Publisher: IEEE Computer Society
Heung-Yeung Shum and Richard Szeliski

This paper presents some techniques for constructing panoramic image mosaics from sequences of images. Our mosaic representation associates a transformation matrix with each input image, rather than explicitly projecting all of the images onto a common surface (e.g., a cylinder). In particular, to construct a full view panorama, we introduce a rotational mosaic representation that associates a rotation matrix (and optionally a focal length) with each input image. A patch-based alignment algorithm is...

Publication details
Date: 1 September 1997
Type: Technical report
Number: MSR-TR-97-23
Steven J. Gortler, Radek Grzeszczuk, Richard Szeliski, and Michael F. Cohen

This paper discusses a new method for capturing the complete appearance of both synthetic and real world objects and scenes, representing this information, and then using this representation to render images of the object from new camera positions. Unlike the shape capture process traditionally used in computer vision and the rendering process traditionally used in computer graphics, our approach does not rely on geometric representations. Instead we sample and reconstruct a 4D function, which we call a...

Publication details
Date: 1 August 1996
Type: Inproceeding
Publisher: Association for Computing Machinery, Inc.
Publication details
Date: 1 January 1994
Type: Technical report