Varsha Hedau, Sudipta N. Sinha, C. Lawrence Zitnick, and Richard Szeliski
7 October 2012
We propose a visual recognition approach aimed at fast recognition of urban landmarks on a GPS-enabled mobile device. While most existing methods offload their computation to a server, the latency of an image upload over a slow network can be a significant bottleneck. In this paper, we investigate a new approach to mobile visual recognition that involves uploading only GPS coordinates to a server, after which a compact location-specific classifier is downloaded to the client and recognition is computed entirely on the client. To achieve this goal, we have developed a supervised learning approach that trains very compact random forest classifiers on labeled geo-tagged images. Our approach selects highly discriminative yet repeatable visual features in the database images during offline processing. Classification is efficient at query time: we first rectify the image based on vanishing points and then use random binary patterns to densely match a small set of downloaded features, with min-hashing used to speed up the search. We evaluate our method on two public benchmarks and on two streetside datasets, where we outperform standard bag-of-words retrieval as well as direct feature matching approaches, both of which are infeasible for client-side query processing.
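The abstract mentions min-hashing over binary patterns as a way to speed up feature search. As a rough illustration of the general min-hash idea only (not the authors' implementation), the sketch below treats a binary descriptor as the set of its active bit positions and estimates Jaccard similarity from short signatures. All function names, the descriptor length, and the number of hashes are assumptions chosen for the example.

```python
import random


def make_permutations(num_bits, num_hashes, seed=0):
    """Build `num_hashes` random permutations of the bit positions (hypothetical helper)."""
    rng = random.Random(seed)
    perms = []
    for _ in range(num_hashes):
        perm = list(range(num_bits))
        rng.shuffle(perm)
        perms.append(perm)
    return perms


def minhash_signature(bits, perms):
    """Min-hash signature of a binary descriptor given as a list of 0/1 values."""
    active = [i for i, b in enumerate(bits) if b]
    if not active:
        # Sentinel for an all-zero descriptor: larger than any permuted index.
        return tuple(len(bits) for _ in perms)
    # For each permutation, keep the smallest permuted index among active bits.
    return tuple(min(perm[i] for i in active) for perm in perms)


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature entries, an estimate of Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    perms = make_permutations(num_bits=16, num_hashes=8, seed=1)
    query = [1, 0, 1, 0] * 4
    candidate = [1, 0, 1, 1] * 4
    sig_q = minhash_signature(query, perms)
    sig_c = minhash_signature(candidate, perms)
    print(estimate_jaccard(sig_q, sig_c))
```

In a scheme like this, only candidates whose signatures agree in enough entries would need a full descriptor comparison, which is where the speedup would come from; how the paper actually integrates min-hashing with its random forest classifier is not specified in the abstract.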
Published in: Proceedings of the 1st Workshop on Visual Analysis and Geo-Localization of Large-Scale Imagery (in conjunction with ECCV 2012)