We propose a method to exponentially speed up computation of various
fingerprints, such as the ones used to compute similarity and rarity in massive
data sets. Rather then maintaining the full stream of *b* items of a
universe *[u]*, such methods only maintain a concise fingerprint of the
stream, and perform computations using the fingerprints. The computations are
done approximately, and the required fingerprint size *k* depends on the
desired accuracy *ε* and confidence *δ*. Our technique
maintains a single bit per hash function, rather than a single integer, thus
requiring a fingerprint of length *k = O(frac ln frac
1δε ^{2})* bits, rather than