Efficient Dense Stereo and Novel-view Synthesis for Gaze Manipulation in One-to-one Teleconferencing

  • Antonio Criminisi,
  • Jamie Shotton,
  • Carsten Rother,
  • Philip H.S. Torr

MSR-TR-2003-59

A new algorithm is proposed for novel-view synthesis, with particular application to teleconferencing. Given the video streams acquired by two cameras placed on either side of a computer monitor, the proposed algorithm synthesises images from a virtual camera in an arbitrary position (typically located within the monitor area) to facilitate eye contact. The new technique is based on an improved dynamic-programming stereo algorithm for efficient novel-view generation. The two main contributions of this paper are: i) a new four-layer matching graph for dense-stereo dynamic programming that supports accurate occlusion labeling; ii) a compact geometric derivation for novel-view synthesis by direct projection of the minimum-cost surface. Furthermore, the paper presents an algorithm for the temporal maintenance of a background model, to enhance the rendering of occlusions and reduce temporal artefacts (flicker), and a cost aggregation algorithm that acts directly in the three-dimensional matching cost space. The proposed algorithm has been designed to work with input images with a large disparity range, a common situation in one-to-one video-conferencing. The enhanced occlusion-handling capabilities of the new DP algorithm are evaluated against those of the most powerful state-of-the-art dynamic-programming and graph-cut techniques. A number of examples demonstrate the robustness of the algorithm to artefacts in stereo video streams. These include demonstrations of cyclopean view synthesis in extended conversational sequences, synthesis from a freely translating virtual camera and, finally, basic 3D scene editing.
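To give a rough feel for what dynamic-programming stereo with occlusion labeling involves, the sketch below aligns a single pair of scanlines. It is not the four-layer matching graph proposed in the report: it is the simpler, classic three-move formulation (match, left-only pixel, right-only pixel), shown here only as an illustration. The function name `scanline_dp`, the `occlusion_cost` parameter and the absolute-intensity-difference matching cost are all assumptions made for this example.

```python
import numpy as np


def scanline_dp(left, right, occlusion_cost=10.0):
    """Align one left/right scanline pair with a three-move DP.

    Illustrative only: a simplified stand-in, not the report's
    four-layer graph. Returns (matches, left_only, right_only), where
    matches is a list of (i, j) index pairs and the other two lists
    hold indices of pixels visible in only one scanline.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n, m = len(left), len(right)

    # cost[i, j]: cheapest alignment of left[:i] with right[:j].
    cost = np.zeros((n + 1, m + 1))
    cost[:, 0] = occlusion_cost * np.arange(n + 1)
    cost[0, :] = occlusion_cost * np.arange(m + 1)

    # move[i, j]: 0 = match, 1 = left pixel unmatched, 2 = right pixel unmatched.
    move = np.zeros((n + 1, m + 1), dtype=int)
    move[1:, 0] = 1
    move[0, 1:] = 2

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = (
                cost[i - 1, j - 1] + abs(left[i - 1] - right[j - 1]),
                cost[i - 1, j] + occlusion_cost,
                cost[i, j - 1] + occlusion_cost,
            )
            move[i, j] = int(np.argmin(candidates))
            cost[i, j] = candidates[move[i, j]]

    # Backtrack from the far corner to recover the minimum-cost path.
    matches, left_only, right_only = [], [], []
    i, j = n, m
    while i > 0 or j > 0:
        if move[i, j] == 0:
            matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i, j] == 1:
            left_only.append(i - 1)
            i -= 1
        else:
            right_only.append(j - 1)
            j -= 1
    return matches[::-1], left_only[::-1], right_only[::-1]
```

For example, `scanline_dp([10, 20, 50, 90, 70, 30], [20, 50, 90, 70, 30, 40])` pairs each left pixel `i` from 1 to 5 with right pixel `i - 1` (a constant disparity of one pixel) and labels left pixel 0 and right pixel 5 as visible in one view only. Each matched pair `(i, j)` could then be projected to a cyclopean position such as `(i + j) / 2`, although the report derives its own projection of the minimum-cost surface.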