Shape and Appearance Models from Multiple Images
Richard Szeliski
Microsoft Research
Workshop on Image-Based Modeling and Rendering
Stanford University, March 24, 1998.
Image-Based Modeling
Create 3D models from one or more images
calibration and camera pose recovery
correspondence (matching, tracking, stereo)
3D model construction
appearance extraction
Mix with graphics & re-render (new views)
Image-Based Rendering
Render 3D graphics from images
sprite-based 3D rendering (Talisman)
view interpolation: warp and blend (morph) between several images
Lumigraph: full 2D manifold of images
layered depth images (LDIs): voxel-based representation
Applications
"Desktop scanning" for 3D world and object building ("3D home page")
Collaborative design ("3D fax")
Virtual environment construction (virtual tourism, home sales/redesign)
Video editing and special effects:
uncalibrated, uncontrolled video
Shape and Appearance Representations
Depth maps
Volumetric models
Surface models
View-based representations
Scene decompositions: layers/sprites
Outline
Volumes from silhouettes
Surface meshes from matched curves
Depth maps from stereo
Range data merging and surface modeling
Appearance recovery (texture maps)
(Sub-)pixel-accurate, multi-view stereo
Discussion and summary
Volumes from Silhouettes
Start with collection of "calibrated" images
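The carving idea behind this slide can be sketched as follows: a voxel survives only if it projects inside the object silhouette in every calibrated image. This is a minimal illustration, not the talk's implementation. It assumes a uniform voxel grid rather than an octree, and the two orthographic "cameras" and the `carve` helper are hypothetical stand-ins for real calibrated projections.

```python
def carve(voxels, views):
    """Keep voxels whose projection lies inside the silhouette in every view.

    voxels: iterable of (x, y, z) points.
    views:  list of (project, silhouette) pairs; project maps a 3D point to
            an integer pixel (row, col), silhouette is a 2D list of booleans
            (True = object pixel).
    """
    kept = []
    for p in voxels:
        inside_all = True
        for project, sil in views:
            r, c = project(p)
            in_image = 0 <= r < len(sil) and 0 <= c < len(sil[0])
            if not (in_image and sil[r][c]):
                inside_all = False  # outside one silhouette: carve it away
                break
        if inside_all:
            kept.append(p)
    return kept


# Toy setup (assumed): a 4x4x4 grid seen by two orthographic cameras.
N = 4
grid = [(x, y, z) for x in range(N) for y in range(N) for z in range(N)]
square = [[1 <= r <= 2 and 1 <= c <= 2 for c in range(N)] for r in range(N)]
views = [
    (lambda p: (p[1], p[0]), square),  # looking along z: pixel = (y, x)
    (lambda p: (p[1], p[2]), square),  # looking along x: pixel = (y, z)
]
hull = carve(grid, views)
```

With only two views the result is the intersection of two extruded squares; adding views tightens the hull but can never recover concavities.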
Volumes from Silhouettes
Cup on turntable example
Volumes from Silhouettes
Advantages:
simple to implement, fairly robust
fast execution
complete (closed) surface
Limitations:
only produces line hull (visual hull); concavities are not recovered
limited resolution
sensitive to classification (thresholding)
3D Curves from Edges
"Feature-based" stereo matching
Extract extremal and internal edges
Match curves along epipolar lines
Reconstruct 3D curves…
Silhouette curves don’t match in 3D
3D Curves from Edges
Coffee jar example
3D Curves from Edges
Advantages:
correct estimates at occluding contours
good for smoothly curved objects
provides intrinsic surface estimates
works on interior surface markings
Limitations:
fails in highly textured regions
fails in textureless interior areas
incomplete surface (not closed)
Dense Stereo Matching
Compute depth map using correlation
Move correlation windows along epipolar lines
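A minimal sketch of window-based matching, assuming a rectified image pair so that epipolar lines coincide with image rows. The matching score here is a sum of squared differences (SSD), chosen for brevity; the slides say only "correlation". The function names and the synthetic test images are assumptions for illustration.

```python
import random


def ssd(left, right, r, cl, cr, half):
    """Sum of squared differences between two (2*half+1)^2 windows."""
    total = 0.0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            diff = left[r + dr][cl + dc] - right[r + dr][cr + dc]
            total += diff * diff
    return total


def disparity_map(left, right, max_disp, half=1):
    """For each left pixel, slide a window along the same row of the right
    image and keep the disparity with the lowest SSD."""
    rows, cols = len(left), len(left[0])
    disp = [[0] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            best_cost, best_d = None, 0
            for d in range(max_disp + 1):
                if c - d - half < 0:  # window would leave the right image
                    break
                cost = ssd(left, right, r, c, c - d, half)
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[r][c] = best_d
    return disp


# Synthetic pair (assumed): the left image is the right image shifted by 2 pixels.
rng = random.Random(0)
rows, cols, shift = 6, 12, 2
right = [[rng.random() for _ in range(cols)] for _ in range(rows)]
left = [[right[r][c - shift] if c >= shift else 0.0 for c in range(cols)]
        for r in range(rows)]
disp = disparity_map(left, right, max_disp=4)
```

On this random texture the interior pixels recover the 2-pixel shift; on a textureless image every disparity would score equally, which is the failure mode listed under the limitations of dense stereo.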
Dense Stereo Matching
View extrapolation results
[Figure panels: input | depth image | novel view]
[Matthies, Szeliski, Kanade ’88]
Dense Stereo Matching
Newer view extrapolation results
[Figure panels: input | depth image | novel view]
Dense Stereo Matching
Compute certainty map from correlations
[Figure panels: input | depth map | certainty map]
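The slides do not say how the certainty map is derived. One common choice, shown here as an assumption, is to fit a parabola to the matching cost at the best disparity and its two neighbours: the curvature of the fit serves as a certainty measure (a sharp minimum is trustworthy, a flat one is not), and the vertex gives a sub-pixel disparity offset. The helper name `refine_minimum` is hypothetical.

```python
def refine_minimum(c_prev, c_best, c_next):
    """Fit a parabola through the costs at disparities d-1, d, d+1.

    Returns (offset, curvature): offset is the sub-pixel shift of the
    minimum relative to d, curvature is the second difference of the cost,
    used here as the certainty. A flat or non-convex cost gets zero certainty.
    """
    curvature = c_prev - 2.0 * c_best + c_next
    if curvature <= 0.0:
        return 0.0, 0.0
    offset = (c_prev - c_next) / (2.0 * curvature)
    return offset, curvature


# Costs sampled from 3 * (x - 0.25)**2 + 1 at x = -1, 0, 1.
offset, certainty = refine_minimum(5.6875, 1.1875, 2.6875)
```

For these samples the fit recovers the true minimum at an offset of 0.25 with curvature 6.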
Range Data Merging
Convert sparse depths to 3D points
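Converting depths to 3D points amounts to back-projecting each pixel along its viewing ray. The sketch below assumes a pinhole camera with focal lengths `fx`, `fy` and principal point `(cx, cy)`, and marks pixels without a depth estimate as `None`; these names and conventions are illustrative and are not taken from the talk.

```python
def backproject(depth, fx, fy, cx, cy):
    """Turn a sparse depth map into camera-frame 3D points.

    depth: 2D list indexed [row v][column u]; None where no depth is known.
    Returns a list of (X, Y, Z) tuples, one per valid pixel.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z is None:  # no reliable match at this pixel
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


sparse = [[None, 2.0],
          [4.0, None]]
pts = backproject(sparse, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

Points from several views would then be moved into a common world frame using each camera's pose before merging.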
Dense Stereo Matching
Advantages:
gives detailed surface estimates
multi-view aggregation improves accuracy
Limitations:
narrow baseline ⇒ noisy estimates
fails in textureless areas
sparse, incomplete surface
sensitive to non-Lambertian effects
3D Surface Fitting
Convert 3D points into smooth surface
physically-based oriented particles
[Szeliski, Tonnesen & Terzopoulos]
triangulation and mesh simplification
[Hoppe et al., ...]
distance functions and isosurface extraction
[Curless & Levoy]
Oriented Particles
Use a collection of small surface elements
local coordinates: position, normal, curvature
interaction potentials enforce smoothness
simulate motions using dynamics
local triangulation/interpolation scheme
topology changes occur automatically
Oriented Particles
Interactive particle-based surface modeling
Oriented Particles
Advantages:
can conform to any topology
good for interactive shaping, design
intrinsic surface representation
Limitations:
hard to get dynamics right
slow convergence to energy minimum
non-local interactions sometimes undesirable
Texture Map Recovery
For each model patch:
determine visibility (item buffer)
blend together textures (weight by view)
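The per-patch loop above can be sketched as a weighted average over the views in which the patch is visible. The weighting used here, the cosine between the patch normal and the direction toward the camera, is an assumed choice; the slide says only "weight by view". The visibility flags stand in for the result of an item-buffer test.

```python
def blend_patch_color(normal, samples):
    """Blend one patch's colour from several views.

    normal:  unit surface normal of the patch.
    samples: list of (view_dir, visible, color); view_dir is a unit vector
             from the patch toward the camera, color is an (r, g, b) tuple.
    Returns the weighted-average colour, or None if no view sees the patch.
    """
    total_w = 0.0
    acc = [0.0, 0.0, 0.0]
    for view_dir, visible, color in samples:
        if not visible:  # occluded in this view
            continue
        # Head-on views get more weight than grazing ones.
        w = max(0.0, sum(n * v for n, v in zip(normal, view_dir)))
        total_w += w
        for i in range(3):
            acc[i] += w * color[i]
    if total_w == 0.0:
        return None
    return tuple(a / total_w for a in acc)


samples = [
    ((0.0, 0.0, 1.0), True, (100.0, 0.0, 0.0)),   # head-on view
    ((0.6, 0.0, 0.8), True, (0.0, 100.0, 0.0)),   # oblique view
    ((0.0, 0.0, 1.0), False, (0.0, 0.0, 255.0)),  # occluded, ignored
]
blended = blend_patch_color((0.0, 0.0, 1.0), samples)
```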
Texture Map Recovery
3D model building example
[Figure panels: octree | 3D curves | texture-mapped]
Beyond Texture-Mapped Models
Capture view-dependent appearance
recovering BRDF [Sato et al., Yu & Malik]
view-dependent texture maps [Debevec et al.]
view interpolation [Chen & Williams, …, Seitz & Dyer]
lightfield and Lumigraph [Levoy & Hanrahan, Gortler et al.]
Lumigraph Example
[Figure panels: acquisition stage | volumetric model | novel view]
Multi-Image Scene Recovery
Problems with "classical" approach
narrow baseline ⇒ noisy results
single depth map misses information
ignores (or improperly treats) occlusions
ignores mixed (partially transparent) pixels
Multi-Image Scene Recovery
Goals of new stereo algorithm
simultaneously recover disparities, colors, and opacities (cf. blue-screen matting)
explicitly handle occlusions
true multi-frame setting [Collins]
details in [Szeliski & Golland, ICCV’98]
Plane Sweep Stereo
Sweep family of planes through volume
Plane Sweep Stereo
For each depth plane
compute composite (mosaic) image — mean
compute error image — variance
convert to confidence and aggregate spatially
Select winning depth at each pixel
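The steps above can be sketched as follows, assuming the input images have already been warped onto each depth plane (the `warped` input is hypothetical; the warp itself depends on camera calibration and is omitted). For simplicity the variance is box-filtered directly and the lowest aggregated variance wins, rather than first converting it to a confidence as the slide describes.

```python
def mean_and_variance(images):
    """Per-pixel mean (composite) and variance (error) across images."""
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    mean = [[sum(img[r][c] for img in images) / n for c in range(cols)]
            for r in range(rows)]
    var = [[sum((img[r][c] - mean[r][c]) ** 2 for img in images) / n
            for c in range(cols)] for r in range(rows)]
    return mean, var


def box_aggregate(img, radius):
    """Average each pixel with its neighbours in a (2*radius+1)^2 window."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [img[rr][cc]
                    for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                    for cc in range(max(0, c - radius), min(cols, c + radius + 1))]
            out[r][c] = sum(vals) / len(vals)
    return out


def plane_sweep(warped, radius=1):
    """warped[k][i] is image i resampled onto depth plane k.
    Returns (depth index map, colour map taken from the winning plane)."""
    means, costs = [], []
    for images in warped:
        mean, var = mean_and_variance(images)
        means.append(mean)
        costs.append(box_aggregate(var, radius))
    rows, cols = len(means[0]), len(means[0][0])
    depth = [[0] * cols for _ in range(rows)]
    color = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            k = min(range(len(warped)), key=lambda j: costs[j][r][c])
            depth[r][c] = k
            color[r][c] = means[k][r][c]
    return depth, color


# Two planes, two 1x2 images: the images agree at pixel 0 on plane 0
# and at pixel 1 on plane 1.
warped = [
    [[[5.0, 1.0]], [[5.0, 9.0]]],
    [[[2.0, 7.0]], [[8.0, 7.0]]],
]
depth, color = plane_sweep(warped, radius=0)
```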
Plane Sweep Stereo
"Stack of acetates" model (related to LDI...)
Plane Sweep Stereo
Compute visibility for each input/layer pair
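Under the "stack of acetates" model, the visibility of a layer at a pixel is the fraction of light that passes through every layer in front of it. The sketch below works on a single pixel and assumes the layer opacities have already been resampled into the input view and are ordered front to back; colours are not premultiplied by alpha. The function names are illustrative.

```python
def layer_visibility(alphas):
    """Visibility of each layer, given opacities ordered front to back.

    Layer k is seen through all layers in front of it, so its visibility is
    the product of (1 - alpha) over those layers.
    """
    vis, transmitted = [], 1.0
    for a in alphas:
        vis.append(transmitted)
        transmitted *= (1.0 - a)
    return vis


def composite(colors, alphas):
    """Re-synthesize the pixel: sum of colour * opacity * visibility."""
    vis = layer_visibility(alphas)
    return sum(c * a * v for c, a, v in zip(colors, alphas, vis))


alphas = [0.5, 1.0, 0.3]      # half-transparent, opaque, hidden layer
colors = [10.0, 20.0, 30.0]
vis = layer_visibility(alphas)
pixel = composite(colors, alphas)
```

The third layer sits behind an opaque one, so its visibility is zero and it contributes nothing to the re-synthesized pixel.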
Voxel Coloring
Generalizes plane sweep camera geometry
replace plane sweep with surface sweep
[Seitz & Dyer] [Seitz & Kutulakos]
Voxel Coloring
Results for dinosaur and rose
Stereo with Matting
Estimate fractional opacities for pixels
adjust layer "sprites" (colors and opacities) to best match input images
optimization criteria:
re-synthesis error
color and opacity smoothness
prior distribution on opacities
corresponds to MAP Bayesian estimator
Stereo with Matting
SRI Trees sequence example
[Figure panels: input images | stereo layers]
Stereo with Matting
Advantages:
true multi-image matching
deals with occlusions and mixed pixels
Limitations:
too many degrees of freedom (volume)
breaks up surfaces into "voxels"
no "sub-pixel" depths
Layered Stereo
Use arbitrarily oriented sprites [Baker, Szeliski, Anandan ’98]
Layered Stereo Demo
SpriteViewer: renders sprites with depth
Layered Stereo
Assign pixels to different "layers" (objects, sprites)
Layered Stereo
Track each layer from frame to frame; compute its plane equation and composite mosaic
Re-compute pixel assignment by comparing original images to sprites
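The re-assignment step can be sketched as a per-pixel comparison: each pixel goes to the layer whose sprite, warped back into the original image, predicts its value best. The `predictions` input (one warped sprite per layer) and the absolute-difference residual are assumptions for illustration; the slides do not specify the comparison measure.

```python
def assign_layers(image, predictions):
    """Label each pixel with the index of the best-matching layer.

    image:       2D list of intensities from one original frame.
    predictions: list of 2D lists, predictions[k] being sprite k warped
                 into the frame of `image`.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            labels[r][c] = min(
                range(len(predictions)),
                key=lambda k: abs(image[r][c] - predictions[k][r][c]),
            )
    return labels


image = [[10.0, 50.0],
         [12.0, 48.0]]
predictions = [
    [[11.0, 11.0], [11.0, 11.0]],  # dark layer
    [[49.0, 49.0], [49.0, 49.0]],  # bright layer
]
labels = assign_layers(image, predictions)
```

Alternating this step with the per-layer tracking and mosaicing gives an iterative refinement of both the sprites and the pixel labels.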
Layered Stereo
Resulting sprite collection
Layered Stereo
Estimated depth map
Layered Stereo
Re-synthesize original or novel images from collection of sprites
Layered Stereo
Per-pixel residual depth estimation
plane plus parallax [Anandan et al.]
model-based stereo [Debevec et al.]
better accuracy / fidelity
makes forward warping more difficult
Layered Stereo
Advantages:
can represent occluded regions
can represent transparent and border (mixed) pixels (sprites have alpha value per pixel)
works on texture-less interior regions
Limitations:
fails for high depth-complexity scenes
may need manual initialization / control
Image-Based Modeling & Rendering
Grand Unified Theory of Image-Based Modeling and Rendering
Modeling & Rendering
Silhouettes → volume
Curves → 3D mesh
Stereo → depth map
Range data merging
3D surface modeling
Texture recovery
Multi-view stereo
3D texture-mapped model
View-dependent texture maps
Sprites with depth
Layered Depth Images
Colored depth maps
Lumigraph
Lightfield
Open Problems
Automatic scene segmentation
Complex scenes: forests…
Non-static scenes
Non-rigid motion
Moving illumination, specularities, …
…
… but potential of IBMR looks great
Acknowledgements
Colleagues
CMU: Takeo Kanade, Geoffrey Hinton, Larry Matthies
DEC CRL: Demetri Terzopoulos, David Tonnesen, Sing Bing Kang, James Coughlan, Richard Weiss
Microsoft: Michael Cohen, Steven Gortler, Radek Grzeszczuk, Polina Golland, Heung-Yeung Shum, Simon Baker, Anandan, Mei Han
Bibliography
see http://www.research.microsoft.com/~szeliski/IBMR
© Microsoft Corp., 1998