Z. Wen, Z. Liu, M. Cohen, J. Li, K. Zheng, and T. Huang
Face-to-face video teleconferencing is very important for real-time communication. Current teleconferencing applications use standard video codecs, such as MPEG-1/2/4, to compress face video. Such codecs either require high bandwidth for high-quality video transmission or produce blurred face video at low bit rates. In this paper, we present a system for real-time coding of face video at low bit rates. There are two main contributions. First, we improve the technique of long-term memory prediction by selecting frames into the database in an optimal way. A new frame is selected into the database only when it is significantly different from the frames already in the database. In this way, the database can cover a wider range of images. Second, we incorporate prior knowledge about faces into the long-term memory prediction framework. The prior knowledge includes: (1) facial motions are repetitive, so most of them can be reconstructed from multiple reference frames; and (2) different components of the face and the background can tolerate different levels of error because of their different perceptual importance. Experiments show that, at similar PSNR, the proposed system runs much faster and achieves better visual quality than the standard H.264/JVT codec.
|Published in||Proc. of the IEEE Int. Conf. on Multimedia and Expo|
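
The frame-selection rule described in the abstract (admit a new frame only when it differs significantly from every frame already stored) can be illustrated with a minimal sketch. The abstract does not state the dissimilarity measure or the threshold, so the mean squared error, the `threshold` parameter, and the names `mse` and `maybe_add_frame` are assumptions made for illustration, not the authors' method.

```python
from typing import List, Sequence

# Assumption: a frame is represented as a flat sequence of pixel intensities.
Frame = Sequence[float]


def mse(a: Frame, b: Frame) -> float:
    """Mean squared error between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same size")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def maybe_add_frame(database: List[Frame], frame: Frame, threshold: float) -> bool:
    """Append `frame` only if it differs from every stored frame by more
    than `threshold` (hypothetical criterion). Returns True if it was added."""
    if all(mse(frame, ref) > threshold for ref in database):
        database.append(frame)
        return True
    return False


# Usage: near-duplicates of stored frames are skipped.
db: List[Frame] = []
for f in ([0, 0, 0, 0], [1, 1, 1, 1], [10, 10, 10, 10], [0, 1, 0, 1]):
    maybe_add_frame(db, f, threshold=4.0)
```

With this rule the database grows only when a frame adds something new, which is how it can cover a wider range of images for the same number of stored frames.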