Zhengyou Zhang
Microsoft Corp., One Microsoft Way, Redmond WA 98052-6399, USA
zhang@microsoft.com

IEEE TAMD has its first Impact Factor: 3.210. Wow!
SIGGRAPH Asia, Singapore, November 28 - December 1, 2012. New Program: Technical Briefs. Successfully completed!
CFP: IEEE TAMD Special Issue on Behavior Understanding and Developmental Robotics, Deadline: May 15, 2013
I am a Principal Researcher and Research Manager at Microsoft Research (MSR), Redmond,
USA. My research is in computer vision, speech signal processing, multi-sensory fusion, multimedia computing,
real-time collaboration and human-machine interaction.
I manage the Multimedia, Interaction, and Communication (MIC) Group. I
was affiliated with the Communication and Collaboration Systems Group and
the Speech Technology Group.
Research interests
- Computer vision and graphics: calibration, matching,
stereo, motion, 3D modeling, 3D display
- Audio processing and rendering, speech processing,
spatial audio, multichannel AEC
- Audio-visual fusion, active object detection and
tracking
- Multimedia, human-computer interaction, human-human
communication and collaboration
- Biology-inspired learning, autonomous mental
development
- Human information processing:
face/speaker recognition/verification, activity recognition
and understanding.
Full
publication list is available through Google Scholar.
Recent publications and some downloadable papers are available from
here.
Please visit my home page at INRIA for
information prior to my arrival (March 30, 1998) at Microsoft Research.
Education
-
B.S. degree in electronic engineering from the
Zhejiang University, China,
in 1985.
-
M.S. degree (DEA) in computer science from the
University of Nancy, France, in 1987. Advisor: Jean-Paul Haton
-
Ph.D. degree in computer science from the University
of Paris XI, Orsay, France, in 1990.
Advisor:
Olivier Faugeras
-
D.Sc. (Habilitation à diriger des recherches) from the
University of Paris XI, Orsay, France, in 1994.

Short Bio
The full version of his résumé is available
here.
Zhengyou Zhang is a Fellow
of the Institute of Electrical and Electronics Engineers (IEEE).
He is the Founding Editor-in-Chief of the newly established
IEEE Transactions on
Autonomous Mental Development (IEEE T-AMD), and is on the Editorial Board of
the International Journal of Computer Vision
(IJCV), Machine Vision and Applications,
and the Journal of Computer Science and Technology (JCST).
He was on the Editorial Board of the IEEE Transactions on Pattern Analysis and Machine Intelligence
(IEEE T-PAMI) from 1999 to 2005, the IEEE Transactions on Multimedia (IEEE T-MM)
from 2004 to 2009,
the International Journal of Pattern Recognition and Artificial Intelligence
(IJPRAI) from 1997 to 2008, among others.
He is listed in Who's Who in the World, Who's Who in America
and Who's Who in Science and Engineering.
Before joining Microsoft, Zhengyou worked at INRIA
(French National Institute for Research in Computer Science and Control) for 11 years, where
he was a Senior Research Scientist from 1991 and worked in the Computer
Vision and Robotics group. In 1996-1997, he spent
a one-year sabbatical as an Invited Researcher at the
Advanced Telecommunications Research Institute International (ATR), Kyoto, Japan.
He holds about 100 US patents and has about 20 patents pending. He also holds a few Japanese patents for his inventions during his sabbatical at
ATR.
He has published over 200 papers in refereed international journals and conferences, and is the author of the following books:
-
3D Dynamic Scene Analysis: A Stereo Based Approach (with
O. Faugeras) (Springer, Berlin, Heidelberg, 1992). ISBN
3-540-55429-7
& ISBN 0-387-55429-7.
Preview available here.
-
Epipolar
Geometry in Stereo, Motion and Object Recognition: A Unified Approach
(with G. Xu, forewords
by O. Faugeras and S. Tsuji; Telecom Systems Technical Award, The
Japan Telecommunications Advancement Foundation) (Kluwer Academic Publishers,
1996). ISBN 0-7923-4199-6.
Preview available here.
-
Computer Vision: Fundamentals of Computational Theory and Algorithms (in
Chinese) (with S. Ma)
(Chinese Academy of Sciences, 1998; Second edition, 2003). ISBN
7-03-006070-9. The experimental data described in the appendix can be downloaded here (7.84MB).
-
Face Detection and Adaptation
(with
C. Zhang) (Morgan and Claypool, 2010).
ISBN-10 160845133X.
Available from Amazon.
-
Face Geometry and Appearance Modeling
(with
Z. Liu;
Foreword by Demetri Terzopoulos) (Cambridge University Press, 2011).
Available at Amazon.
He is the Chair of the new
Technical
Briefs program of
SIGGRAPH Asia, Singapore,
November 28 - December 1, 2012.
He is a co-organizer of the Second
International Workshop on
Human Activity Understanding from 3D Data (HAU3D) 2012, in conjunction with
the IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), Providence, Rhode Island, June 16-21, 2012.
He was a General Co-Chair of the
International Workshop on Multimedia Signal Processing (MMSP 2011), October
2011, Hangzhou, China.
He was a co-organizer of the International Workshop on
Human Activity Understanding from 3D Data (HAU3D) 2011, in conjunction with
the IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), Colorado Springs, June 20-25, 2011.
He was a Program Co-Chair of the
International Conference on Multimedia and Expo (ICME), July 2010, a Program
Co-Chair of the ACM International Conference
on Multimedia (ACM MM), October 2010, and a Program Co-Chair of the
ACM International Conference on
Multimodal Interfaces (ICMI), November 2010. He was the Program Co-Chair of the 8th International Conference on Development
and Learning (ICDL09), June 5-7, 2009, Shanghai, China.
He was a Technical Co-Chair of the International Workshop on Multimedia Signal Processing (MMSP06), October 3-6, 2006, Victoria, BC, Canada.
He was the Program Co-Chair of the Asian Conference on Computer Vision
(ACCV2004), Jan. 27-30, 2004, Jeju Island, Korea;
a Demo Chair and an Area Chair of the
International Conference on Computer Vision (ICCV2003), Oct. 14-17, 2003, Nice, France; the Demo Chair of the
International Conference on Computer Vision (ICCV2005), Oct. 15-21, 2005, Beijing, China. He co-organized the International Workshop on
Multimedia Technologies in E-Learning and Collaboration, held in Nice, France, on October 17, 2003.
He served on the Program Committees of ICCV, CVPR, ECCV, ACCV and many other international conferences and workshops.
Zhengyou Zhang was a member of the IEEE Computer Society Fellows Committee
from 2005 to 2007 and again in 2010 and 2011, a member of the IEEE Technical Committee on Multimedia Signal Processing
(2006-2010), and
Chair of the IEEE Technical Committee on Autonomous Mental Development
(2007-2009). He is a member of ACM.
Interview by the
Computational Intelligence Magazine is available
here
(or go to
IEEE Xplore).
Interview by the
IEEE Signal Processing Magazine on "Telepresence:
Virtual Reality in the Real World" is available
here (or go to
IEEE Xplore). (November 2011)
Natural
User Interfaces: What's Next? Video on
3D
Photorealistic Talking Head. (February 2011)
IEEE Transactions on Autonomous Mental Development (IEEE TAMD)
-
Submission information available
here.
-
Table of Contents with Abstracts and Links to PDF files available
here.
Projects
Tutorial
Parameter Estimation Techniques: A Tutorial with Application to Conic Fitting.
PDF version, HTML version, and PostScript version
Published in Image and Vision Computing Journal, Vol.15, No.1, pages 59-76, 1997.
Recent publications: Click here
Full
publication list is available through Google Scholar.
Some downloadable publications:
-
C. Zhang, Q. Cai, P. Chou, Z. Zhang, and R. Martin-Brualla, "Viewport:
A Fully Distributed Immersive Teleconferencing System with Infrared
Dot Pattern", IEEE MultiMedia, Vol. 20, No. 1,
pages 17-27, 2013.
available
from
IEEE or from
http://research.microsoft.com/~zhang/Papers/Viewport-IEEE-Multimedia-2003.pdf
-
Z. Zhang, "Microsoft Kinect Sensor and Its Effect", IEEE
MultiMedia, Vol. 19, No. 2, pages 4-12, April-June 2012.
available
from
IEEE or from
http://research.microsoft.com/~zhang/Papers/Microsoft Kinect Sensor
and Its Effect - IEEE MM 2012.pdf
- Z. Zhang, "Estimating Projective Transformation Matrix (Collineation,
Homography)", Microsoft Research Technical Report MSR-TR-2010-63,
May 2010.
available at
http://research.microsoft.com/apps/pubs/?id=131928
- Z. Zhang, "Camera Calibration", Chapter 2, pages 4-43, in G. Medioni and S.B. Kang, eds.,
Emerging Topics in Computer Vision, Prentice Hall Professional Technical Reference, 2004.
This book chapter provides an overview of various camera calibration techniques, including those using a 3D calibration apparatus, a 2D planar pattern,
1D linear points, and just points from the environment (0D). Available at Camera Calibration - book chapter.pdf
- Z. Zhang, "A flexible new technique for camera calibration",
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.22, No.11, pages 1330-1334, 2000.
The calibration paper with free moving 2D pattern, available as Technical Report MSR-TR-98-71
- Z. Zhang, "Camera Calibration with One-Dimensional Objects",
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.26, No.7, pages 892-899, 2004.
doi:10.1109/TPAMI.2004.21, available from
IEEE or
from http://research.microsoft.com/~zhang/Papers/ZhangPAMI-04-07-Calib-1D.pdf
-
Z. Zhang, O. Faugeras, R. Deriche,
"An Effective Technique for Calibrating a Binocular Stereo Through Projective
Reconstruction Using Both a Calibration Object and the Environment",
Videre: Journal of Computer Vision Research, MIT Press,
Vol.1, No.1, pages 58-68, 1997.
available at
http://research.microsoft.com/~zhang/Papers/videre97-stereo-calibration.pdf
-
Z. Zhang and V. Schenk,
"Self-Maintaining Camera Calibration Over Time",
Proc. IEEE Conference on Computer Vision and Pattern Recognition, pages 231-236, Puerto Rico, June 1997.
available at
http://research.microsoft.com/~zhang/Papers/CVPR1997-Self-Maintaining Camera Calibration Over Time.pdf
-
Z. Zhang, "Determining the epipolar geometry and its uncertainty: A review",
International Journal of Computer Vision, Vol.27, No.2, pages 161-198, 1998.
available at
http://research.microsoft.com/~zhang/Papers/IJCV-Review.pdf
- Z. Zhang, "On the optimization criteria used in two-view motion analysis",
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.20, No.7, pages 717-729, 1998.
available here.
-
Z. Zhang and C. Loop, "Estimating the Fundamental Matrix by Transforming Image Points in Projective Space",
Computer Vision and Image Understanding, Vol.82, No.2, pages 174-180, 2001.
available from
CVIU link or from here.
-
Z. Zhang, O.D. Faugeras, "Determining motion from 3D line
segment matches: a comparative study", Image and Vision Computing,
9(1):10-19, 1991.
available from
Elsevier or at
http://research.microsoft.com/~zhang/Papers/Determining Motion from 3D Line
Segments-A Comparative Study.pdf
-
Z. Zhang, R. Deriche, O. Faugeras, Q.-T. Luong,
"A Robust Technique for Matching Two Uncalibrated Images Through
the Recovery of the Unknown Epipolar Geometry", Artificial
Intelligence Journal, Vol.78, pages 87-119, October 1995.
available at
http://research.microsoft.com/~zhang/Papers/ZhangAIJ95.pdf
-
Z. Zhang, G. Xu,
"A Unified Theory of Uncalibrated Stereo for Both Perspective
and Affine Cameras", Journal of Mathematical Imaging and Vision,
Vol.9, No.3, pages 213-229, November 1998.
available at
http://research.microsoft.com/~zhang/Papers/JMIV98.pdf
-
Z. Zhang, "Iterative Point Matching for Registration of Free-Form Curves
and Surfaces",
International Journal of Computer Vision, Vol.13, No.2, pages 119-152, 1994.
The ICP paper available at
http://research.microsoft.com/~zhang/Papers/IJCV-94-ICP.pdf
-
Z. Zhang, O. Faugeras,
"Estimation of Displacements from Two 3-D Frames Obtained From Stereo",
IEEE Transactions on Pattern Analysis and Machine Intelligence,
Vol.14, No.12, pages 1141-1156, 1992.
available at
http://research.microsoft.com/~zhang/Papers/ZhangPAMI-92-12.pdf
-
Z. Zhang, O. Faugeras,
"A 3D World Model Builder with a Mobile Robot",
International Journal of Robotics Research,
Vol.11, No.4, pages 269-285, August 1992.
available at
Papers/ZhangIJRR-92-RR-1546-World Model Builder.pdf
- Z. Zhang,
"Motion and Structure From Two Perspective Views: From Essential Parameters to Euclidean Motion Via Fundamental Matrix",
Journal of the Optical Society of America A,
Vol.14, no.11, pages 2938-2950, 1997.
available at
http://research.microsoft.com/~zhang/Papers/ZhangJOSA-97.pdf
- C. Wu, C. Liu, H.-Y. Shum, Y.-Q. Xu, and Z. Zhang, "Automatic
Eyeglasses Removal from Face Images", IEEE Trans. Pattern
Analysis and Machine Intelligence, 26(3):322-336, 2004.
available from
IEEE or at
http://research.microsoft.com/~zhang/Papers/PAMI - Eyeglasses
Removal.pdf
- Z. Zhang,
"Feature-Based Facial Expression Recognition: Sensitivity Analysis
and Experiments With a Multi-Layer Perceptron",
International Journal of Pattern Recognition and Artificial
Intelligence, Vol.13, No.6, pages 893-911, 1999.
available at
http://research.microsoft.com/~zhang/Papers/IJPRAI.pdf
or
ftp://ftp.inria.fr/INRIA/publication/dienst/RR-3354.pdf
- Z. Zhang, Z. Liu, D. Adler, M.F. Cohen, E. Hanson, and Y. Shan,
"Robust and Rapid Generation of Animated Faces From Video Images: A Model-Based Modeling Approach",
International Journal of Computer Vision, Vol.58, No.1, pages 93-119, June 2004.
available at
http://research.microsoft.com/~zhang/Papers/IJCV04-Face.pdf
- I. Shimizu, Z. Zhang, S. Akamatsu, and K. Deguchi, "Head pose determination from one image using a generic model",
Proc. 3rd International Conference on Automatic Face and Gesture Recognition (FG'98), pages 100-105, Nara, Japan, April 1998.
available at
http://research.microsoft.com/~zhang/Papers/FG1998-HeadPoseFromOneImage.pdf
- Z. Zhang, and L. He, "Whiteboard Scanning and Image Enhancement",
Digital Signal Processing, Vol.17, No.2,
pages 414-432, 2007.
available from
Elsevier or at
http://research.microsoft.com/~zhang/Papers/ZhangHeDSP07.pdf
- Z. Zhang,
"Image-based geometrically-correct photorealistic scene/object modeling (IBPhM): A review",
Proc. 3rd Asian Conf. on Computer Vision (ACCV 1998),
pages 340-349, Hong Kong, January 8-11, 1998.
available at
http://research.microsoft.com/~zhang/Papers/ACCV98.pdf
- K. Nishino, Z. Zhang, K. Ikeuchi,
"Determining Reflectance Parameters and Illumination Distribution
from a Sparse Set of Images for View-dependent Image Synthesis",
Proc. International Conference on Computer Vision (ICCV 2001),
Vol. I, pages 599-606, Vancouver, Canada, July 2001.
available at
http://research.microsoft.com/~zhang/Papers/ICCV01-Reflectance.PDF
- O. Faugeras, P. Fua, B. Hotz, R. Ma, L. Robert, M. Thonnat, and Z.
Zhang, "Quantitative and Qualitative Comparison of Some Area and
Feature-based Stereo Algorithms", In Wolfgang Förstner and Stephan
Ruwiedel, editors, Robust Computer Vision: Quality of Vision
Algorithms, pages 1-26. Wichmann, Karlsruhe, Germany, 1992.
available at
http://research.microsoft.com/~zhang/Papers/comp-stereo.pdf
- Z. Zhang, Y. Wu, and Z. Liu, "Side
Statistics and Maximum Discriminant Analysis for Real-Time Tracking",
Proc. Asian Conference on Computer Vision (ACCV 2002), pages
308-313, Melbourne, Australia, January 2002.
available
here
- Z. Liu, M. Cohen, D. Bhatnagar, R. Cutler, and Z. Zhang, "Head-size
Equalization for Improved Visual Perception in Video Conferencing",
IEEE Transactions on Multimedia, Vol.9, No.7, pages 1520-1527,
2007. Available at
http://research.microsoft.com/~zhang/papers/Headsize equalization -
TMM07.pdf
- Y. Wu, Z. Zhang, T.S. Huang, and J.Y. Lin, "Multibody
Grouping via Orthogonal Subspace Decomposition", Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR'01),
Kauai, Hawaii, Vol.II, pages 252-257, December 2001.
available from
IEEE or at
http://research.microsoft.com/~zhang/Papers/Multibody Grouping -
CVPR01.pdf
- C. Zhang, D. Florencio, D. Ba, and Z. Zhang, "Maximum
Likelihood Sound Source Localization and Beamforming for Directional
Microphone Arrays in Distributed Meetings", IEEE
Transactions on Multimedia, Vol.10, No.3, Pages 538-548, 2008.
Available at
http://research.microsoft.com/~zhang/Papers/ML-SSL-IEEE-TMM.pdf.
- C. Zhang, P. Yin, Y. Rui, R. Cutler, P. Viola, X. Sun, N. Pinto,
and Z. Zhang, "Boosting-Based
Multimodal Speaker Detection for Distributed Meeting Videos",
IEEE Transactions on Multimedia, Vol.10, No.8, pages 1541-1552,
2008. Available at
http://research.microsoft.com/~zhang/Papers/SpeakerDetection-IEEE-TMM.pdf.
- W. Li, Z. Zhang, and Z. Liu, "Action Recognition
Based on A Bag of 3D Points", in Proc.
IEEE International Workshop on CVPR for Human Communicative Behavior
Analysis (CVPR4HB), pages 9-14, San Francisco, CA, USA, June 18,
2010.
PDF file.
- W. Li, Z. Zhang, and Z. Liu, "Expandable Data-Driven Graphical Modeling
of Human Actions Based on Salient Postures", IEEE Transactions on Circuits and Systems for Video Technology,
Vol.18, No.11, pages 1499-1510, 2008.
PDF
file
- W. Lin, M.-T. Sun, R. Poovendran, Z. Zhang, "Group event
detection with a varying number of group members for video surveillance",
IEEE Trans. Circuits and Systems for Video Technology, vol. 20,
issue 8, pp. 1057-1067, 2010.
PDF File.
- W. Lin, M.-T. Sun, R. Poovendran, and Z. Zhang, "Activity Recognition Using
A Combination of Category Components And Local Models for Video Surveillance",
IEEE Transactions on Circuits and Systems for Video Technology,
Vol.18, No.8, pages 1128-1139, 2008.
PDF
file
- A. Subramanya, Z. Zhang, Z. Liu, and A. Acero, "Multisensory
Processing for Speech Enhancement and Magnitude-Normalized Spectra for
Speech Modeling", Speech Communication, Vol. 50, pp.
228-243, 2008.
PDF File.

Collaborators, Post-Doctoral Researchers and Students
Zicheng Liu
(Researcher, MSR)
Mike Sinclair
(Principal Researcher, MSR)
Li-wei He
(Research Engineer, MSR)
Cha Zhang
(Researcher, MSR)
Rajesh Hegde
(Research Engineer, MSR)
Dinei
Florencio (Researcher, MSR)
Qin Cai
(Research Engineer, MSR)
Wei-ge Chen
(Software Architect, MSR)
Phil Chou (Principal
Researcher, MSR)
Ying Shan
(Post-Doc, now Scientist at Microsoft Online)
Gang Hua
(Scientist at Nokia Research)
Ming-Ting Sun
(Professor, University of Washington)
Wanqing Li (Associate
Professor, University of Wollongong)
Chunhui Zhang (Researcher, MSR Asia, now at Alibaba)
John Hershey (Post-Doc, now at IBM Research)
Interns: Sasa Junuzovic (2008), Matt Luciw (2008), Aswin
Sankaranarayanan (2008), Xiaogang Wang (2008), Qing Zhang (2008), Raffay
Hamid (2007), Sasa Junuzovic (2007), Miao Liao (2007), Mingxuan Sun
(2007), Qi Zhao (2007), Amar Subramanya (2006), Sasa Junuzovic (2006), Ming Liu (2005), Gang Hua (2005), Amar Subramanya (2004), Ya Chang (2004), Yanli Zheng
(2003), Hanning Zhou (2003), Guodong Guo (2002), Ruigang Yang (2001),
Ying Wu (2000), Ko Nishino (1999, 2000), Qifa Ke (1998)
Supervision of researchers when I was at INRIA:
Nassir Navab (Ph.D., 1993), Michel Buffa (Ph.D., 1993), Gabriella Csurka
(Ph.D., 1996), Bernard Hotz (Research Engineer, 1991-1994), Serge
Saracco (Master, 1993), Jean-Francois Ponthieux (Master, 1993),
Veit Schenk (Master, 1996), Laurence Lucido (Ph.D., 1997), Sylvain
Bougnoux (Ph.D., 1998).
What's new?
-
Interview by the
IEEE Signal Processing Magazine on "Telepresence:
Virtual Reality in the Real World" is available
here (or go to
IEEE Xplore). (November 2011)
-
Natural
User Interfaces: What's Next? Video on
3D
Photorealistic Talking Head. (February 2011)
-
Editorial for the inaugural issue of the IEEE Transactions on Autonomous Mental
Development (or go to
IEEE Xplore). (May 2009)
- Microsoft Research Feature Story:
Making Virtual Meeting Feel Real. (March 2009)
- Interview by the
Computational Intelligence Magazine is available
here
(or go to
IEEE Xplore). (February 2009)
- Recent publications:
- J. Weng, B. Scassellati, and Z. Zhang, editors, Special Issue on Autonomous Mental Development: Mind, Body and Beyond, International Journal of Humanoid Robotics (IJHR), Vol. 4, No. 2, June 2007.
- M. Liao, R. Yang, and Z. Zhang, "Robust and Accurate Visual Echo Cancelation in a Full-duplex Projector-camera System", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 30, No. 10, pages 1831-1840, 2008.
- A. Subramanya, Z. Zhang, Z. Liu, and A. Acero, "Multisensory Processing for Speech Enhancement and Magnitude-Normalized Spectra for Speech Modeling", Speech Communication, Vol. 50, pp. 228-243, 2008.
- C. Zhang, D. Florencio and Z. Zhang, "Maximum Likelihood Sound Source Localization and Beamforming for Directional Microphone Arrays in Distributed Meetings", IEEE Transactions on Multimedia, Vol. 10, No. 3, pages 538-548, Apr. 2008.
- Z. Liu, M. Cohen, D. Bhatnagar, R. Cutler, and Z. Zhang, "Head-size Equalization for Improved Visual Perception in Video Conferencing", IEEE Transactions on Multimedia, Vol.9, No.7, pages 1520-1527, 2007.
- Report "Whiteboard Scanning and Image Enhancement"
available in PDF (3.5MB).
- Paper "Vision-based Interaction with Fingers and Papers"
available in PDF (945KB).
- Report "Why Take Notes? Use the Whiteboard Capture System"
available in PDF (2.2MB).
-
Report "Camera Calibration with One-Dimensional Objects"
available in PDF (2.4MB).
-
Report "Eye Gaze Correction with Stereovision for Video Tele-Conferencing"
available in PDF (4.6MB).
-
Report "Model-based Head Pose Tracking With Stereovision"
available in PDF (3.6MB).
-
Report "Robust and Rapid Generation of Animated Faces From Video Images: A
Model-Based Modeling Approach" available
in PDF (13MB)
-
Report "Visual Panel: From an ordinary paper to a wireless and mobile input
device" available in PDF
(0.95MB)
Misc links