A Crash Course on Vision Based Modeling

The ubiquity of digital cameras and the ease of collecting video data open the possibility of modeling 3D scenes and actions from images and video. In this course, we will provide an overview of the problems and techniques in Vision Based Modeling, an active area of research in computer vision. We will describe the principles and methods used for analyzing multiple images of a 3D scene taken from different viewpoints (“stereo”), and for analyzing a sequence of images from video (“motion”). Specifically, over the course of four lectures, the following topics will be reviewed: (i) techniques for computing visual motion and stereo correspondence, (ii) the interpretation of visual motion and its applications, (iii) the geometric relationships between multiple images, (iv) techniques for calibrating the camera and recovering the camera pose and motion, (v) techniques for recovering the 3D scene structure based on 2D image correspondences,

Speaker Details

Rick Szeliski has been at Microsoft Research for sixteen years, where he currently leads the Interactive Visual Media group; he is also an Affiliate Professor at the University of Washington. He co-authored the most widely cited tutorial on the topic of stereo matching, has taught computer vision courses at UW and Stanford, and recently completed a textbook on computer vision. He wrote his first stereo matching algorithm as a graduate student at CMU in 1985, and believes that the last word on stereo has yet to be written.