Z. Wen, Z. Liu, M. Cohen, J. Li, K. Zheng, and T. Huang
Face-to-face video teleconferencing is very important for real-time
communication. Current teleconferencing applications use standard
video codecs, such as MPEG-1/2/4, to compress
face video. These codecs either require high bandwidth for high-quality video
transmission, or the transmitted face video is blurred at low bit-rates.
In this paper, we present a system for real-time coding of
face video at low bit-rate. There are two main contributions. First,
we improve the technique of long-term memory prediction by selecting
frames into the database in an optimal way. A new frame
is selected into the database only when it is significantly different
from the frames already in the database. In this way,
the database can cover a wider range of images. Second, we incorporate
prior knowledge about faces into the long-term memory
prediction framework. The prior knowledge includes: (1) facial
motions are repetitive such that most of them can be reconstructed
from multiple reference frames; and (2) different components of
the face and the background can tolerate different levels of error
because of different perceptual importance. Experiments show
that, at similar PSNR, the proposed system runs much faster and
achieves better visual quality than the standard H.264/JVT codec.
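
The frame-selection rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the mean-squared-error metric, the `threshold` value, the per-pixel `weights` (standing in for the differing perceptual importance of face components and background), and the names `ReferenceDatabase`, `maybe_add`, and `best_match` are all assumptions introduced for the example.

```python
from math import inf
from typing import List, Optional, Sequence, Tuple

# Assumed representation: a frame is a flat sequence of grayscale pixel values.
Frame = Sequence[float]


def mean_squared_error(a: Frame, b: Frame) -> float:
    """Mean squared difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def weighted_error(a: Frame, b: Frame, weights: Sequence[float]) -> float:
    """Squared error averaged with per-pixel weights.

    Larger weights (hypothetical) mark perceptually important regions,
    such as the eyes and mouth, so errors there count for more.
    """
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("frames and weights must have the same length")
    total = sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights))
    return total / sum(weights)


class ReferenceDatabase:
    """Stores reference frames for long-term memory prediction (sketch)."""

    def __init__(self, threshold: float,
                 weights: Optional[Sequence[float]] = None) -> None:
        self.threshold = threshold  # assumed tuning parameter
        self.weights = weights      # optional perceptual weights
        self.frames: List[Frame] = []

    def _error(self, a: Frame, b: Frame) -> float:
        if self.weights is None:
            return mean_squared_error(a, b)
        return weighted_error(a, b, self.weights)

    def best_match(self, frame: Frame) -> Tuple[Optional[int], float]:
        """Return (index, error) of the closest stored frame."""
        best_index, best_error = None, inf
        for i, ref in enumerate(self.frames):
            err = self._error(frame, ref)
            if err < best_error:
                best_index, best_error = i, err
        return best_index, best_error

    def maybe_add(self, frame: Frame) -> bool:
        """Add the frame only if it differs enough from every stored frame."""
        _, err = self.best_match(frame)
        if err > self.threshold:
            self.frames.append(frame)
            return True
        return False
```

With this rule, a frame that is nearly identical to a stored reference is skipped, so the database spends its capacity on distinct appearances rather than near-duplicates.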
| Published in | Proc. Int. Conference on Image Processing |
| Publisher | Institute of Electrical and Electronics Engineers, Inc. |
© 2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.