Eye-Gaze Correction for Video Telecommunications

Producing a virtual video to maintain correct eye-gaze awareness

The lack of eye contact in desktop video teleconferencing substantially reduces the effectiveness of video communication. While expensive and bulky hardware that corrects eye gaze is available on the market, researchers have been trying to provide a practical software-based solution to bring video teleconferencing one step closer to the mass market. This paper presents a novel approach based on stereo analysis combined with rich domain knowledge (a personalized face model). This marriage is mutually beneficial. The personalized face model greatly improves the accuracy and robustness of the stereo analysis by substantially reducing the search range; the stereo techniques, using both feature matching and template matching, allow us to extract 3D information about objects other than the face and to determine the head pose far more reliably than is possible with a single camera. Thus we enjoy the versatility of stereo techniques without suffering from their vulnerability to, e.g., the lack of texture on faces. By emphasizing the 3D description of the face region of the scene, we synthesize virtual views that maintain eye contact using graphics hardware. Our current system generates an eye-gaze-corrected video stream at about 5 frames per second on a commodity PC.
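To illustrate one ingredient of the approach described above, the sketch below shows normalized-cross-correlation template matching along a rectified scan line with a restricted disparity search range. The function names, patch size, and disparity bounds are illustrative assumptions, not the authors' implementation; in the paper, the personalized face model is what supplies the narrowed search range.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equal-sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def match_along_scanline(left, right, row, col, half=3, d_min=0, d_max=10):
    """Find the disparity of pixel (row, col) of the left image by NCC
    template matching along the same row of the right image. The search
    is restricted to [d_min, d_max]; a prior such as a face model would
    narrow this interval and make the match faster and more robust."""
    tmpl = left[row - half:row + half + 1, col - half:col + half + 1]
    best_d, best_score = d_min, -1.0
    for d in range(d_min, d_max + 1):
        c = col - d  # candidate column in the right image
        if c - half < 0:
            break
        cand = right[row - half:row + half + 1, c - half:c + half + 1]
        score = ncc(tmpl, cand)
        if score > best_score:
            best_score, best_d = score, d
    return best_d, best_score
```

On synthetic rectified images where the right view is the left view shifted by a known amount, the function recovers that shift as the disparity with a correlation score near 1.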


R. Yang and Z. Zhang. "Eye Gaze Correction with Stereovision for Video Tele-Conferencing". In Proc. 7th European Conference on Computer Vision (ECCV2002), Volume II, pages 479-494, Copenhagen, Denmark, May 28-31, 2002. Also available as Technical Report MSR-TR-01-119.

R. Yang and Z. Zhang. "Model-based Head Pose Tracking With Stereovision". In Proc. Fifth IEEE International Conference on Automatic Face and Gesture Recognition (FG2002), pages 255-260, Washington, DC, May 20-21, 2002. Also available as Technical Report MSR-TR-01-102.

R. Yang and Z. Zhang. "Eye Gaze Correction With Stereovision for Video-Teleconferencing". IEEE Trans. Pattern Analysis and Machine Intelligence, 26(7):956-960, 2004.