Chen Qian (1,2), Xiao Sun (1), Yichen Wei (1), Xiaoou Tang (2), Jian Sun (1)
(1) Visual Computing Group, Microsoft Research Asia
(2) Multimedia Laboratory, Chinese University of Hong Kong
We present a realtime hand tracking system using a depth sensor. It tracks a fully articulated hand under large viewpoint changes in realtime (25 FPS on a desktop without using a GPU) and with high accuracy (error below 10 mm). To our knowledge, it is the first system that achieves such robustness, accuracy, and speed simultaneously, as verified on challenging real data.
Our system combines several novel techniques. We model the hand simply as a number of spheres and define a fast cost function; both are critical for realtime performance. We propose a hybrid method that combines gradient-based and stochastic optimization to achieve fast convergence and good accuracy. We present new finger detection and hand initialization methods that greatly enhance the robustness of tracking.
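To make the sphere model and the hybrid optimization idea more concrete, here is a minimal toy sketch in Python. It is not the implementation from the paper: the cost (mean distance from each depth point to the surface of its nearest sphere), the translation-only pose, the three-sphere model, and all function names and parameters (`sphere_cost`, `hybrid_minimize`, `spread`, `step`, and so on) are assumptions made for illustration. The real system tracks a fully articulated hand and uses its own cost terms and optimizer.

```python
import math
import random


def sphere_cost(points, spheres):
    """Mean distance from each depth point to the surface of its nearest sphere."""
    total = 0.0
    for p in points:
        total += min(abs(math.dist(p, center) - radius) for center, radius in spheres)
    return total / len(points)


def translate(spheres, t):
    """Shift every sphere center by t; the toy 'pose' is a 3-D translation only."""
    return [((c[0] + t[0], c[1] + t[1], c[2] + t[2]), r) for c, r in spheres]


def numeric_gradient(f, x, eps=1e-4):
    """Central-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad


def hybrid_minimize(f, x0, n_particles=8, n_iters=30, spread=10.0, step=1.0, seed=0):
    """Alternate a gradient step per particle with a random move toward the best one."""
    rng = random.Random(seed)
    particles = [list(x0)]
    for _ in range(n_particles - 1):
        particles.append([v + rng.uniform(-spread, spread) for v in x0])
    best = min(particles, key=f)
    best_cost = f(best)
    for _ in range(n_iters):
        for i, x in enumerate(particles):
            # Gradient part: keep the step only if it lowers the cost.
            g = numeric_gradient(f, x)
            cand = [v - step * gv for v, gv in zip(x, g)]
            if f(cand) < f(x):
                x = cand
            if f(x) < best_cost:
                best, best_cost = list(x), f(x)
            # Stochastic part: drift toward the best particle with random jitter.
            w = rng.random()
            particles[i] = [v + w * (b - v) + rng.gauss(0.0, 0.05 * spread)
                            for v, b in zip(x, best)]
    return best, best_cost


# Hypothetical model: three spheres given as (center in mm, radius in mm).
model = [((0.0, 0.0, 0.0), 10.0), ((0.0, 25.0, 0.0), 8.0), ((0.0, 45.0, 0.0), 6.0)]
true_offset = (4.0, -3.0, 2.0)

# Synthetic "depth points" on the surfaces of the model placed at the true offset.
rng = random.Random(1)
points = []
for center, radius in translate(model, true_offset):
    for _ in range(40):
        d = [rng.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(v * v for v in d))
        points.append(tuple(c + radius * v / n for c, v in zip(center, d)))


def cost(t):
    """Cost of placing the model at translation t against the synthetic points."""
    return sphere_cost(points, translate(model, t))


initial_cost = cost([0.0, 0.0, 0.0])
true_cost = cost(list(true_offset))
best_offset, final_cost = hybrid_minimize(cost, [0.0, 0.0, 0.0])
```

The sphere representation keeps each cost evaluation to a handful of distance computations per point, which is the kind of property that makes a cost function cheap enough for realtime use. In this sketch the best particle is tracked separately, so the returned cost can never be worse than the cost at the starting pose.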
The paper and the database were slightly revised on 5/28/2014 to fix a minor bug that we found after the official submission. The error statistics in the paper now differ slightly from the official CVPR version but remain mostly consistent with it. Please use our updated version; we apologize for any confusion.
Accepted as an oral presentation, Computer Vision and Pattern Recognition (CVPR), June 2014. [pdf, 1 MB]