CVPR 2001 Short Course
Developing Computer Vision Applications in Windows
Instructors:
Ross Cutler (Microsoft Research)
Duration:
4 hours
ABSTRACT:
Motivation
Windows-based workstations are the most popular and cost-effective platforms
for developing computer vision applications. This course provides detailed,
essential information on how to develop computer vision applications on this
platform. We will review the relevant APIs and provide example applications.
The course is geared toward beginning developers and researchers, as well as
experienced developers who want to maximize performance on the Windows
platform and take advantage of new APIs.
Prerequisites
Participants should be experienced in C and C++ programming. A brief
introduction to Visual C++, which is used in the course, will be given.
Topics
DirectShow
DirectShow is used to efficiently capture and process video and audio streams.
We provide example filters and applications that demonstrate how DirectShow is
used for audio/video capture, processing, display, and storage. We also provide
a Visual Studio wizard that allows easy creation of transform filters, and we
discuss DirectX Media Objects.
Direct3D
Direct3D is used for efficiently displaying 2D and 3D graphics. It can also be
used to accelerate many image processing functions, such as image warps.
GDI+
GDI+ (Graphics Device Interface) is used for drawing text and 2D graphics to
the display and to images. It is significantly enhanced over GDI and includes
anti-aliased lines, alpha blending, gradient and texture fills, enhanced
graphics file support, enhanced typography support, and image processing
functions such as rotation, cropping, brightness/contrast adjustment, and
color balance.
Win32
The Win32 API is used for many important system-level tasks. We discuss process
and thread creation, priorities, events, timers, shared memory, file access,
bitmaps, and window management.
Vision SDK
The Vision SDK was developed for writing computer vision applications on the
Windows platform. We discuss the basic features of the Vision SDK, including
video capture, display, and image classes. Advanced topics include sequence
classes, property lists, self-describing streams, CLAPACK, and extending the
Vision SDK.
Real-time applications
There are many important issues to consider when developing real-time
applications in Windows. We discuss methods of benchmarking, profiling, and
optimizing your application. We also discuss scheduling, timers, events, and
multiprocessor issues for real-time applications.
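As a rough illustration of the benchmarking topic above, the following sketch
times a workload several times and reports the median run time. It is not taken
from the course materials: `median_run_time_us` and `invert_image` are
hypothetical names, and the portable `std::chrono::steady_clock` (C++11) is an
assumption standing in for a Win32 high-resolution timer such as
`QueryPerformanceCounter`.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: calls `work` `runs` times and returns the median
// wall-clock time of one call, in microseconds. For an even number of runs
// the upper of the two middle samples is returned.
template <typename Work>
double median_run_time_us(Work work, int runs) {
    assert(runs > 0);
    using clock = std::chrono::steady_clock;  // monotonic clock
    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(runs));
    for (int i = 0; i < runs; ++i) {
        const auto start = clock::now();
        work();
        const auto stop = clock::now();
        samples.push_back(
            std::chrono::duration<double, std::micro>(stop - start).count());
    }
    // Partially sort so that the middle element is the median.
    auto mid = samples.begin() + samples.size() / 2;
    std::nth_element(samples.begin(), mid, samples.end());
    return *mid;
}

// Hypothetical workload: invert every pixel of an 8-bit grayscale image
// in place.
void invert_image(std::vector<std::uint8_t>& pixels) {
    for (auto& p : pixels) {
        p = static_cast<std::uint8_t>(255 - p);
    }
}
```

For example, `median_run_time_us([&] { invert_image(img); }, 5)` returns the
median time of five passes over `img`. Reporting the median rather than the
mean reduces the influence of runs that were disturbed by the scheduler.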
Intel Pentium III/4 Architecture
Knowing the system architecture is critical for developing many computer vision
applications. We provide a brief overview of the Intel Pentium III/4
architecture, including the memory system, AGP, PCI, SIMD operations, and
multiprocessor issues.
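To give a flavor of what SIMD operations look like in image code, here is a
minimal sketch of a saturating add of two 8-bit images. It is not from the
course materials: `add_saturate_u8` and the `SKETCH_HAVE_SSE2` macro are
hypothetical names, and the choice of SSE2 (available on the Pentium 4, not the
Pentium III) is an assumption. A scalar loop handles the tail and any build
without SSE2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  // SSE2 intrinsics
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

// Hypothetical helper: dst[i] = min(255, a[i] + b[i]) for n 8-bit pixels.
void add_saturate_u8(const std::uint8_t* a, const std::uint8_t* b,
                     std::uint8_t* dst, std::size_t n) {
    assert(n == 0 || (a && b && dst));
    std::size_t i = 0;
#if SKETCH_HAVE_SSE2
    // Process 16 pixels per iteration; unaligned loads and stores keep the
    // sketch free of alignment requirements.
    for (; i + 16 <= n; i += 16) {
        const __m128i va =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        const __m128i vb =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i),
                         _mm_adds_epu8(va, vb));  // unsigned saturating add
    }
#endif
    // Scalar tail (or the whole image when SSE2 is unavailable).
    for (; i < n; ++i) {
        const unsigned sum = unsigned(a[i]) + unsigned(b[i]);
        dst[i] = static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
    }
}
```

Both code paths compute the same result, so the scalar loop doubles as a
reference when checking the vectorized version.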
Biography
Ross Cutler is a researcher in the Collaboration and Multimedia Systems group
at Microsoft Research. His areas of interest include computer vision, video
indexing, multimedia databases, multimedia authoring, multi-view imaging, face
recognition, speaker detection, and real-time systems. He received B.S. degrees
in Mathematics, Physics, and Computer Science (1992), an M.S. in Computer
Science (1996), and a Ph.D. in Computer Science (2000) from the
Contact information:
Ross Cutler
voice: 425-703-9236
fax: 425-936-5329
email: rcutler@microsoft.com