Parikshit Gopalan and Jaikumar Radhakrishnan
Given a data stream of length n over an alphabet [m] where n > m, we consider the problem of finding a duplicate in a single pass. We give a randomized algorithm for this problem that uses O((log m)3) space. This answers a question of Muthukrishnan and Tarui, who asked if this problem could be solved using sub-linear space and one pass over the input. Our main tool is an Isolation Lemma that reduces this problem to the task of detecting and identifying a Dictatorial variable in a Boolean halfspace.
|Publisher||The British Computer Society|