Continuous Distributed Counting for Non-monotonic Streams

MSR-TR-2011-128 |

We consider the continual count tracking problem in a distributed environment where the input is an aggregate stream originating from k distinct sites and the updates are allowed to be non-monotonic, i.e. both increments and decrements are allowed. The goal is to continually track the count within a prescribed relative accuracy ε at the lowest possible communication cost. Specifically, we consider an adversarial setting where the input values are selected and assigned to sites by an adversary but the order is according to a random permutation or is a random i.i.d process. The input stream of values is allowed to be non-monotonous with an unknown drift -1≤ μ ≤ 1 where the case μ = 1 corresponds to the special case of a monotonic stream of only non-negative updates. We show that a randomized algorithm guarantees to track the count accurately with high probability and has the expected communication cost ˜ O(min {√k/(|μ| ε), √k n/ε, n}), for an input stream of length n, and establish matching lower bounds. This improves upon previously best known algorithm whose expected communication cost is ˜ Θ(min {√k/ε,n}) that applies only to an important but more restrictive class of monotonic input streams, and our results are substantially more positive than the communication complexity of Ω(n) under fully adversarial input. We also show how our framework can also accommodate other types of random input streams, including fractional Brownian motion that has been widely used to model temporal long-range dependencies observed in many natural phenomena. Last but not least, we show how our non-monotonic counter can be applied to track the second frequency moment and to a Bayesian linear regression problem.