The Microsoft Research Cambridge-12 Kinect gesture data set consists of sequences of human movements, represented as body-part locations, and the associated gesture to be recognized by the system. The data set includes 594 sequences and 719,359 frames (approximately six hours and 40 minutes) collected from 30 people performing 12 gestures. In total, there are 6,244 gesture instances. The motion files contain tracks of 20 joints estimated using the Kinect Pose Estimation pipeline. The body poses are captured at a sample rate of 30 Hz with an accuracy of about two centimeters in joint positions.
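The quoted recording time can be checked from the frame count and the sample rate. The following is a minimal sketch of that arithmetic, assuming every frame was captured at a constant 30 Hz with no dropped frames; the constant and function names are illustrative and not part of the data set.

```python
# Figures quoted in the description above.
TOTAL_FRAMES = 719_359
SAMPLE_RATE_HZ = 30
NUM_SEQUENCES = 594
NUM_GESTURE_INSTANCES = 6_244


def duration_hms(frames: int, rate_hz: int) -> tuple[int, int, float]:
    """Convert a frame count at a fixed sample rate to (hours, minutes, seconds).

    Assumes a constant sample rate with no dropped frames.
    """
    total_seconds = frames / rate_hz
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return int(hours), int(minutes), seconds


hours, minutes, seconds = duration_hms(TOTAL_FRAMES, SAMPLE_RATE_HZ)
print(f"Total duration: {hours} h {minutes} min {seconds:.1f} s")
print(f"Mean frames per sequence: {TOTAL_FRAMES / NUM_SEQUENCES:.0f}")
print(f"Mean gesture instances per sequence: {NUM_GESTURE_INSTANCES / NUM_SEQUENCES:.1f}")
```

Under that assumption the total comes to about 6 h 39 min 39 s, consistent with the rounded figure of six hours and 40 minutes. The averages are roughly 1,211 frames (about 40 seconds) and 10.5 gesture instances per sequence.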

The data set and the way it was produced are described in detail in the publication linked below. We believe it will be useful for evaluating gesture recognition systems, and as a database of Kinect skeletal track recordings and their variation across different people.


The full data set was released on 5 June 2012 and can be downloaded from Microsoft Research Download. Additional details can be found in this document. If you use the data set, please cite the following paper, which describes it (paper, PDF).

@InProceedings{ msrc12,
    title = "Instructing people for training gestural interactive systems",
    author = "Simon Fothergill and Helena M. Mentis and Pushmeet Kohli and Sebastian Nowozin",
    booktitle = "CHI",
    publisher = "ACM",
    year = "2012",
    editor = "Joseph A. Konstan and Ed H. Chi and Kristina H{\"o}{\"o}k",
    pages = "1737--1746",
}


