A Geometric Approach to Lower Bounds for Approximate Near-Neighbor Search and Partial Match

This work investigates a geometric approach to proving cell probe lower bounds for data structure problems.

We consider the {\em approximate nearest neighbor search problem} on the Boolean hypercube $(\bool^d,\onenorm{\cdot})$ with $d=\Theta(\log n)$. We show that any (randomized) data structure for the problem that answers $c$-approximate nearest neighbor search queries using $t$ probes must use space at least $n^{1+\Omega(1/ct)}$. In particular, our bound implies that any data structure that uses space $\tilde{O}(n)$ with polylogarithmic word size, and with constant probability gives a constant approximation to nearest neighbor search queries must be probed $\Omega(\log n/ \log\log n)$ times. This improves on the lower bound of $\Omega(\log\log d/\log\log\log d)$ probes shown by Chakrabarti and Regev~\cite{ChakrabartiR04} for any polynomial space data structure, and the $\Omega(\log\log d)$ lower bound in \Patrascu and Thorup~\cite{PatrascuT07} for linear space data structures.

Our lower bound holds for the {\em near neighbor problem}, where the algorithm knows in advance a good approximation to the distance to the nearest neighbor.

Additionally, it is an {\em average case} lower bound for the natural distribution for the problem. Our approach also gives the same bound for $(2-\frac{1}{c})$-approximation to the farthest neighbor problem.

For the case of non-adaptive algorithms we can improve the bound slightly and show a $\Omega(\log n)$ lower bound on the time complexity of data structures with $O(n)$ space and logarithmic word size.

We also show similar lower bounds for the partial match problem: any randomized $t$-probe data structure that solves the partial match problem on $\{0,1,\star\}^d$ for $d=\Theta(\log n)$ must use space $n^{1+\Omega(1/t)}$. This implies an $\Omega(\log n/\log\log n)$ lower bound for time complexity of near linear space data structures, slightly improving the $\Omega(\log n /(\log \log n)^2)$ lower bound from~\cite{PatrascuT06a},\cite{JayramKKR03} for this range of $d$. Recently and independently \Patrascu achieved similar bounds \cite{patrascu08}. Our results also generalize to approximate partial match, improving on the bounds of \cite{BarkolR02,PatrascuT06a}.

anns-hp.pdf
PDF file

In  FOCS '08: Proceedings of the 49th annual IEEE Symposium on Foundations of Computer Science

Publisher  IEEE
© 2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. http://www.ieee.org/

Details

TypeInproceedings
Pages414–423
ISBN978-0-7695-3436-7
> Publications > A Geometric Approach to Lower Bounds for Approximate Near-Neighbor Search and Partial Match