Z. Wen, Z. Liu, M. Cohen, J. Li, K. Zheng, and T. Huang
June 2004
Face-to-face video teleconferencing is very important for real-time
communication. Current teleconferencing applications use standard
video codecs, such as MPEG-1/2/4, to compress face video. They either
require high bandwidth for high-quality video transmission, or the
transmitted face video is blurred at low bit rates.
In this paper, we present a system for real-time coding of
face video at low bit rates. There are two main contributions. First,
we improve the technique of long-term memory prediction by selecting
frames into the database in an optimal way. A new frame
is selected into the database only when it is significantly different
from the frames that are already in the database. In this way,
the database can cover a wider range of images. Second, we incorporate
prior knowledge about faces into the long-term memory
prediction framework. The prior knowledge includes: (1) facial
motions are repetitive, so most of them can be reconstructed
from multiple reference frames; and (2) different components of
the face and the background can tolerate different levels of error
because of their different perceptual importance. Experiments show
that, at similar PSNR, the proposed system runs much faster and
achieves better visual quality than the standard H.264/JVT codec.
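
The frame-selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how "significantly different" is measured, so the mean-squared-error distance, the `threshold` parameter, and the names `mse`, `maybe_add`, and `best_reference` are all assumptions made for illustration. Frames are treated as flat sequences of pixel intensities.

```python
from typing import List, Sequence

# Hypothetical representation: a frame flattened to pixel intensities.
Frame = Sequence[float]


def mse(a: Frame, b: Frame) -> float:
    """Mean squared error between two equally sized frames (assumed distance)."""
    if len(a) != len(b):
        raise ValueError("frames must have the same size")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def maybe_add(database: List[Frame], frame: Frame, threshold: float) -> bool:
    """Store `frame` only if it differs from every stored frame by more than
    `threshold`. An empty database accepts the first frame."""
    if all(mse(frame, ref) > threshold for ref in database):
        database.append(frame)
        return True
    return False


def best_reference(database: List[Frame], frame: Frame) -> int:
    """Index of the stored frame closest to `frame`, i.e. the candidate
    reference for long-term memory prediction in this sketch."""
    if not database:
        raise ValueError("database is empty")
    return min(range(len(database)), key=lambda i: mse(frame, database[i]))


# Usage: near-duplicate frames are rejected, distinct ones are kept.
db: List[Frame] = []
added = [
    maybe_add(db, [0, 0, 0, 0], threshold=10.0),
    maybe_add(db, [1, 1, 1, 1], threshold=10.0),
    maybe_add(db, [10, 10, 10, 10], threshold=10.0),
]
closest = best_reference(db, [9, 9, 9, 9])
```

Rejecting near-duplicates keeps the database small while letting it span more distinct facial appearances, which is the effect the abstract attributes to the selection step. A region-weighted distance (for example, weighting eye and mouth pixels more heavily than the background) would be one way to reflect the second contribution, but the weights would again be an assumption.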
In Proc. of the IEEE Int. Conf. on Multimedia and Expo
| Type | Inproceedings |