We introduce an efficient camera relocalization approach which can be easily integrated into real-time 3D reconstruction methods, such as KinectFusion. Our approach makes use of a compact encoding of whole image frames which enables both online harvesting of keyframes in tracking mode, and fast retrieval of pose proposals when tracking is lost. The encoding scheme is based on randomized ferns and simple binary feature tests. Each fern generates a small block code, and the concatenation of codes yields a compact representation of each camera frame. Based on these representations we introduce an efficient frame dissimilarity measure which is defined via the block-wise Hamming distance (BlockHD). We illustrate how BlockHDs between a query frame and a large set of keyframes can be simultaneously evaluated by traversing the nodes of the ferns and counting image co-occurrences in corresponding code tables. In tracking mode, this mechanism allows us to consider every frame/pose pair as a potential keyframe. A new keyframe is added only if it is sufficiently dissimilar from all previously stored keyframes. For tracking recovery, camera poses are retrieved that correspond to the keyframes with the smallest BlockHDs. The pose proposals are then used to reinitialize the tracking algorithm. Harvesting of keyframes and pose retrieval are computationally efficient, with only a small impact on the run-time performance of the 3D reconstruction. Integrating our relocalization method into KinectFusion allows seamless continuation of mapping even when tracking is frequently lost. Additionally, we demonstrate how marker-free augmented reality in particular can benefit from this integration, enabling a smoother and continuous AR experience.
- Ben Glocker, Shahram Izadi, Jamie Shotton, and Antonio Criminisi, Real-Time RGB-D Camera Relocalization, in International Symposium on Mixed and Augmented Reality (ISMAR), IEEE, October 2013
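
The encoding and retrieval pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the fern count, tests per fern, frame resolution, and the dissimilarity threshold in `maybe_add` are all hypothetical parameters, and frames are assumed to be small single-channel intensity arrays.

```python
from collections import Counter, defaultdict

import numpy as np


class RandomizedFerns:
    """Fern-based frame encoding (illustrative parameters, not the paper's)."""

    def __init__(self, num_ferns=32, tests_per_fern=4, frame_shape=(60, 80), seed=0):
        rng = np.random.default_rng(seed)
        # Each fern is a handful of binary pixel tests: compare the intensity
        # at a fixed random location against a fixed random threshold.
        self.rows = rng.integers(0, frame_shape[0], (num_ferns, tests_per_fern))
        self.cols = rng.integers(0, frame_shape[1], (num_ferns, tests_per_fern))
        self.thresholds = rng.uniform(0, 255, (num_ferns, tests_per_fern))
        self.num_ferns = num_ferns
        self.tests_per_fern = tests_per_fern

    def encode(self, frame):
        """Return one small block code per fern (an int in [0, 2^tests_per_fern))."""
        bits = frame[self.rows, self.cols] > self.thresholds  # (ferns, tests)
        weights = 1 << np.arange(self.tests_per_fern)
        return bits @ weights  # the concatenation of codes is the frame's encoding


def block_hamming(code_a, code_b):
    """BlockHD: fraction of ferns whose block codes differ."""
    return float(np.mean(code_a != code_b))


class KeyframeDB:
    """Code tables: one table per fern mapping a block code to keyframe ids."""

    def __init__(self, ferns):
        self.ferns = ferns
        self.tables = [defaultdict(set) for _ in range(ferns.num_ferns)]
        self.poses = []  # pose stored alongside each harvested keyframe

    def query(self, frame):
        """Count code co-occurrences across all keyframes at once. A keyframe
        matching the query in m ferns has BlockHD = 1 - m / num_ferns, so the
        highest counts identify the nearest keyframes (pose proposals)."""
        code = self.ferns.encode(frame)
        counts = Counter()
        for table, c in zip(self.tables, code):
            for kid in table.get(int(c), ()):
                counts[kid] += 1
        return counts, code

    def maybe_add(self, frame, pose, min_dissimilarity=0.2):
        """Harvest the frame as a keyframe only if it is sufficiently
        dissimilar (in BlockHD) from every previously stored keyframe."""
        counts, code = self.query(frame)
        best = max(counts.values(), default=0)
        if 1.0 - best / self.ferns.num_ferns >= min_dissimilarity:
            kid = len(self.poses)
            self.poses.append(pose)
            for table, c in zip(self.tables, code):
                table[int(c)].add(kid)
            return kid
        return None  # too similar to an existing keyframe
```

During tracking, `maybe_add` is called on each frame/pose pair; on tracking loss, `query` returns match counts whose top entries index the stored poses used to reinitialize tracking. Storing keyframe ids in per-fern tables is what lets one pass over the ferns score every keyframe simultaneously, rather than comparing the query code against each keyframe code in turn.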