Share on Facebook Tweet on Twitter Share on LinkedIn Share by email
Optimal Parameters for Locality-Sensitive Hashing

M. Slaney, Y. Lifshits, and J. He

Abstract

Locality-sensitive hashing (LSH) is the basis of many algorithms that use a probabilistic approach to find nearest neighbors. We describe an algorithm for optimizing the parameters and use of LSH. Prior work ignores these issues or suggests a search for the best parameters. We start with two histograms: one that characterizes the distributions of distances to a point's nearest neighbors and the second that characterizes the distance between a query and any point in the data set. Given a desired performance level (the chance of finding the true nearest neighbor) and a simple computational cost model, we return the LSH parameters that allow an LSH index to meet the performance goal and have the minimum computational cost. We can also use this analysis to connect LSH to deterministic nearest-neighbor algorithms such as k d trees and thus start to unify the two approaches.

Details

Publication typeArticle
Published inProceedings of the IEEE
URL10.1109/JPROC.2012.2193849
Pages1 -20
Number99
> Publications > Optimal Parameters for Locality-Sensitive Hashing