HOMEPAGE OF FENG WU

 

Feng Wu
Research Manager
Internet Media Group
Microsoft Research Asia

 

Dr. Feng Wu received his B.S. in Electrical Engineering from Xidian University in 1992. He received his M.S. and Ph.D. in Computer Science from Harbin Institute of Technology in 1996 and 1999, respectively. He joined Microsoft Research China as an associate researcher in 1999. He has been a researcher with Microsoft Research Asia since 2001.

His research interests include image and video representation, media compression and communication, machine learning, and computer vision and graphics. He has developed efficient technologies for scalable video coding, ranging from progressive fine granularity scalable (PFGS) coding to 3D wavelet video coding. He is also working on vision-based image and video coding, distributed video coding, multi-view video, stream switching, intermedia, graphics and texture compression, and super-resolution. He has been an active contributor to the ISO/MPEG and ITU-T standardization efforts. Some of his techniques have been adopted by MPEG-4 FGS, H.264/MPEG-4 AVC and the upcoming H.264 SVC standard. He served as chairman of the China AVS video group from 2002 to 2004 and led the development of the China AVS video standard 1.0.

He has been an IEEE member since 1999 and a senior member since July 2006. He serves as a reviewer for IEEE Transactions on Circuits and Systems for Video Technology, IEEE Transactions on Multimedia, IEEE Signal Processing Letters and other international journals. He also serves as a technical program committee member for several international conferences (e.g., ICME 2006, PCS 2006, VCIP 2005). He has authored or co-authored over 100 conference and journal papers and has about 30 U.S. patents granted or pending in video and image coding.

Research


1.    Scalable Video Coding (SVC)

Ø      PFGS
Progressive fine granularity scalable (PFGS) coding improves on MPEG-4 FGS by using two motion-compensated predictions to code each frame. Each base-layer frame is always predicted from the previous base-layer frame, whereas each enhancement-layer frame is predicted from the previous frame at either the base layer or the enhancement layer. PFGS defines three coding modes that control the reference used for prediction and reconstruction of each enhancement-layer macroblock. Furthermore, a drifting model is proposed to estimate the drifting errors at the encoder.

Ø     3D subband video coding

In 3D subband video coding, subband transforms are applied in the horizontal, vertical and temporal directions. The resulting coefficients of each subband are scanned bit-plane by bit-plane and coded with either variable-length tables or arithmetic coding in an SNR-scalable form. For high coding efficiency, motion alignment is incorporated into the temporal transform. Several techniques have been developed to improve 3D subband video coding, such as Barbell lifting, 3D EBCOT, scalable motion vector coding and the in-scale structure.
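
The temporal transform and the bit-plane scan described above can be illustrated with a small sketch. It is only a toy under stated assumptions: it uses a plain integer Haar lifting step along the time axis with no motion alignment, frames are flat lists of integers, and all function names are hypothetical rather than taken from any codec.

```python
def temporal_haar_forward(frames):
    """One level of integer Haar lifting along the time axis (no motion alignment).

    frames: even-length list of frames, each a flat list of integer pixels.
    Returns (low, high): temporal low-pass and high-pass subband frames.
    """
    low, high = [], []
    for even, odd in zip(frames[0::2], frames[1::2]):
        # Predict step: high-pass = odd frame minus its prediction (the even frame).
        h = [o - e for e, o in zip(even, odd)]
        # Update step: low-pass = even frame plus half of the high-pass (floored).
        l = [e + (d >> 1) for e, d in zip(even, h)]
        low.append(l)
        high.append(h)
    return low, high


def temporal_haar_inverse(low, high):
    """Undo temporal_haar_forward exactly; integer lifting steps are reversible."""
    frames = []
    for l, h in zip(low, high):
        even = [a - (d >> 1) for a, d in zip(l, h)]
        odd = [e + d for e, d in zip(even, h)]
        frames.extend([even, odd])
    return frames


def bit_planes(coeffs, num_planes):
    """Magnitude bit-planes of a coefficient list, most significant plane first."""
    return [[(abs(c) >> p) & 1 for c in coeffs]
            for p in range(num_planes - 1, -1, -1)]


def keep_top_planes(coeffs, num_planes, kept):
    """Approximate coefficients from only the `kept` most significant planes."""
    drop = num_planes - kept
    return [(1 if c >= 0 else -1) * ((abs(c) >> drop) << drop) for c in coeffs]


# Hypothetical 3-pixel frames, chosen only for illustration.
frames = [[10, 20, 30], [12, 18, 33], [50, 50, 50], [40, 60, 50]]
low, high = temporal_haar_forward(frames)
restored = temporal_haar_inverse(low, high)
```

Dropping the least significant planes with `keep_top_planes` gives a coarser approximation of the same coefficients, which is the sense in which a bit-plane scan is SNR scalable. An actual coder would also entropy-code the planes and align the temporal filtering with motion.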


2.   Stream switching
Streaming video over the Internet requires highly efficient adaptation to channel bandwidth. Switching among non-scalable and/or scalable streams is challenging because a mismatch between the reconstructed references may cause severe visual artifacts and PSNR degradation. Several techniques have been developed and reported, such as improvements to the switching method of H.264/MPEG-4 AVC, switching among scalable streams, and switching through distributed coding.
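
The reference mismatch behind switching artifacts can be seen in a deliberately simplified model. The sketch below assumes a scalar closed-loop DPCM coder in place of a real video codec (no motion compensation, one sample per "frame"); the step sizes, sample values and function names are all hypothetical.

```python
def encode_dpcm(samples, step):
    """Closed-loop DPCM: quantize each prediction residual with a uniform step.

    Returns (indices, recon), where recon holds the encoder-side
    reconstructions that serve as prediction references.
    """
    indices, recon = [], []
    ref = 0
    for x in samples:
        q = round((x - ref) / step)   # quantized residual index
        ref = ref + q * step          # reconstruction, used as the next reference
        indices.append(q)
        recon.append(ref)
    return indices, recon


def decode_with_switch(idx_a, idx_b, step_a, step_b, switch_at):
    """Decode stream A up to `switch_at`, then continue with stream B's residuals.

    After the switch the decoder still holds A's reconstruction as its
    reference, while B's residuals were computed against B's own reference.
    """
    ref, out = 0, []
    for t in range(len(idx_a)):
        if t < switch_at:
            ref += idx_a[t] * step_a
        else:
            ref += idx_b[t] * step_b
        out.append(ref)
    return out


samples = [10, 23, 37, 41, 58, 62]   # hypothetical source values
step_a, step_b = 4, 16               # fine stream A, coarse stream B
switch_at = 3

idx_a, recon_a = encode_dpcm(samples, step_a)
idx_b, recon_b = encode_dpcm(samples, step_b)
decoded = decode_with_switch(idx_a, idx_b, step_a, step_b, switch_at)

# Mismatch between what the switching decoder shows and what B's encoder assumed.
drift = [decoded[t] - recon_b[t] for t in range(switch_at, len(samples))]
```

In this toy model the mismatch introduced at the switching point is never corrected, so every later output carries the same offset. Real codecs behave less simply, but the cause is the same: residuals are applied to a reference the encoder did not use.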

 

3.   Directional transform

The 2D DCT and the 2D wavelet transform, which are extensively used in video and image coding, are implemented as two separable 1D transforms in the horizontal and vertical directions. A serious drawback of these transforms is that they are ill suited to approximating image features whose orientation is neither vertical nor horizontal. We have developed a directional wavelet transform and a directional DCT. Both are carried out along the direction of image edges and textures within a local window, which is not necessarily horizontal or vertical.
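
The drawback of separable transforms can be shown with a small experiment. The sketch below builds a 2D DCT from two passes of an orthonormal 1D DCT-II (rows, then columns) and counts the coefficients above a threshold for a horizontal edge and for a diagonal edge. The block size, test patterns, threshold and function names are assumptions made for illustration, and the directional transforms themselves are not implemented here.

```python
import math


def dct_1d(x):
    """Orthonormal 1D DCT-II of a list of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)))
    return out


def dct_2d_separable(block):
    """2D DCT as two separable 1D passes: along rows, then along columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]   # transform each column
    return [list(r) for r in zip(*cols)]           # transpose back to row-major


def count_significant(coeffs, threshold=1.0):
    """Number of coefficients whose magnitude exceeds the threshold."""
    return sum(1 for row in coeffs for c in row if abs(c) > threshold)


N = 8
# Edge aligned with the transform axes: top half dark, bottom half bright.
horizontal_edge = [[100 if i >= N // 2 else 0 for j in range(N)] for i in range(N)]
# Edge at 45 degrees: bright above the main diagonal.
diagonal_edge = [[100 if j > i else 0 for j in range(N)] for i in range(N)]

n_horizontal = count_significant(dct_2d_separable(horizontal_edge))
n_diagonal = count_significant(dct_2d_separable(diagonal_edge))
```

For the horizontal edge, every row is constant, so the significant coefficients fall in a single column. The diagonal edge spreads its energy over many more coefficients, which is the inefficiency that transforming along the edge direction is meant to reduce.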

 

4.   Vision-based image and video coding

Mainstream signal-processing-based compression schemes share a common architecture, namely a transform followed by entropy coding, in which only the statistical redundancy among pixels is targeted for removal. After two decades of development, it has become difficult to keep improving coding performance under this architecture. Based on newly developed vision technologies (e.g., inpainting and hallucination), we have proposed a new image and video coding architecture that combines signal processing techniques with vision techniques.

 

5.   Distributed video coding (DVC)

The most attractive feature of DVC is that the complexity of motion estimation and compensation can be shifted from the encoder to the decoder. This makes a lightweight encoder possible for portable and battery-powered devices. However, distributed video coding also faces many challenges, such as low coding efficiency caused by inaccurate side information, inefficient entropy coding, and deciding how many Wyner-Ziv (WZ) bits should be sent to the decoder. We are investigating DVC schemes from these aspects and also exploring killer application scenarios for DVC.

 

6.   Multi-view and stereo video

With the development of glasses-free display devices for multi-view and stereo video, immersive media applications such as 3D TV and immersive conferencing are attracting more and more attention. Our research covers two aspects. One is how to represent multi-view video efficiently in compressed form, since the correlation among views offers more room for compression. The other is how to generate multi-view and stereo content from existing video and images.

 

7.   Intermedia

There are two different approaches to network and device adaptation in video coding: scalable video coding and transcoding. The resolution, frame rate and bit rate of streams generated by scalable video coding are easy to modify by truncation. However, scalable video coding suffers from degraded coding efficiency. Transcoding can provide high performance when converting a stream from one combination of resolution, frame rate and bit rate to another, but it is computationally intensive. Intermedia aims to generate a representation of the source video and/or compressed stream that facilitates network and device adaptation in an easy and efficient way.


PUBLICATIONS

Ø      English papers

Ø      Chinese papers

Ø      Proposals to ISO/MPEG and ITU-T

 

CONTACT INFORMATION

MICROSOFT RESEARCH CHINA
5F Sigma, No49 Zhichun Rd
Haidian, Beijing, 100080, China
Email: fengwu@microsoft.com
Phone: (86-10) 58963119
Fax: (86-10) 88097306