CVPR 2001 Short Course

 

Developing Computer Vision Applications in Windows

 

Instructor: Ross Cutler (Microsoft Research)

 

Duration: 4 hours

 

ABSTRACT:

 

 

Motivation

 

Windows-based workstations are the most popular and cost-effective platforms for developing computer vision applications.  This course provides detailed and essential information on how to develop computer vision applications on this platform.  We will review the relevant APIs and provide example applications.  The course is geared toward beginning developers and researchers, as well as experienced developers who want to maximize performance on the Windows platform and take advantage of new APIs.

 

 

Prerequisites

 

Participants should be experienced in C and C++ programming.  A brief introduction to Visual C++, which is used throughout the course, will be given.

 

 

Topics

 

DirectShow

 

DirectShow is used to efficiently capture and process video and audio streams.  We provide example filters and applications that demonstrate how DirectShow is used for audio/video capture, processing, display, and storage, along with a Visual Studio wizard that allows easy creation of transform filters.  We also discuss DirectX Media Objects.

 

Direct3D

 

Direct3D is used for efficiently displaying 2D and 3D graphics.  It can also be used to accelerate many image processing functions, such as image warps.

 

GDI+

 

GDI+ (graphics device interface) is used for drawing text and 2D graphics to the display and to images.  It has been significantly enhanced over GDI, and includes anti-aliased lines, alpha blending, gradient and texture fills, enhanced graphics file support, enhanced typography support, and image processing functions such as rotation, cropping, brightness/contrast adjustment, and color balance.

 

Win32

 

The Win32 API is used for many important system-level tasks.  We discuss process and thread creation, priorities, events, timers, shared memory, file access, bitmaps, and window management.

 

Vision SDK

 

The Vision SDK was developed for writing computer vision applications on the Windows platform.  We discuss the basic features of the Vision SDK, including video capture, display, and image classes.  Advanced topics discussed include sequence classes, property lists, self-describing streams, CLAPACK, and extending the Vision SDK.

 

Real-time applications

 

There are many important issues to consider when developing real-time applications in Windows.  We discuss methods of benchmarking, profiling, and optimizing your application.  We also discuss scheduling, timers, events, and multiprocessor issues for real-time applications.
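
As a rough illustration of the benchmarking idea, the sketch below times a piece of work over several iterations and reports the average.  It is not taken from the course materials: the helper name average_ms is hypothetical, and it uses the portable std::chrono::steady_clock as a stand-in for a Windows high-resolution timer such as QueryPerformanceCounter.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not from the course): returns the average wall-clock
// time, in milliseconds, of one call to `work` over `iterations` calls.
// Assumption: std::chrono::steady_clock stands in for a Windows-specific
// high-resolution timer.
template <typename Work>
double average_ms(Work work, int iterations) {
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;

    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();  // the code being measured
    }
    const std::chrono::duration<double, std::milli> elapsed =
        clock::now() - start;

    return elapsed.count() / iterations;
}
```

Averaging over many iterations smooths out scheduler noise; for a per-frame vision routine one would typically pass a lambda that processes a single frame.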

 

Intel Pentium III/4 Architecture

 

Knowing the system architecture is critical for developing many computer vision applications.  We provide a brief overview of the Intel Pentium III/4 architecture.  This includes the memory system, AGP, PCI, SIMD operations, and multiprocessor issues.
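
To make the SIMD point concrete, here is a minimal sketch of a brightness-style gain applied to floating-point pixels four at a time using SSE intrinsics, the SIMD extension introduced with the Pentium III.  The function name scale_pixels_sse and the float pixel layout are assumptions made for illustration, not part of the course materials.

```cpp
#include <cassert>
#include <cstddef>
#include <xmmintrin.h>  // SSE intrinsics

// Hypothetical helper (not from the course): dst[i] = src[i] * gain.
// Processes four floats per SSE instruction, then finishes any leftover
// elements with scalar code.
void scale_pixels_sse(const float* src, float* dst, std::size_t n, float gain) {
    assert(src != nullptr && dst != nullptr);

    const __m128 g = _mm_set1_ps(gain);  // broadcast gain to all four lanes
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        const __m128 v = _mm_loadu_ps(src + i);    // unaligned load of 4 floats
        _mm_storeu_ps(dst + i, _mm_mul_ps(v, g));  // multiply and store
    }
    for (; i < n; ++i) {
        dst[i] = src[i] * gain;  // scalar tail
    }
}
```

The unaligned load and store keep the sketch simple; with 16-byte-aligned buffers, the aligned variants _mm_load_ps and _mm_store_ps could be used instead.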

 

 

Biography

 

Ross Cutler is a researcher in the Collaboration and Multimedia Systems group at Microsoft Research.  His areas of interest include computer vision, video indexing, multimedia databases, multimedia authoring, multi-view imaging, face recognition, speaker detection, and real-time systems.  He received B.S. degrees in Mathematics, Physics, and Computer Science (1992), an M.S. in Computer Science (1996), and a Ph.D. in Computer Science (2000) from the University of Maryland, College Park.  He has over 15 years of experience as a professional developer in a variety of areas, including neuroscience, data acquisition and real-time control, risk analysis, and real-time computer vision applications.

 

 

Contact information:

 

Ross Cutler

voice: 425-703-9236

fax:   425-936-5329

email: rcutler@microsoft.com

 

 
