Ruoming Jin, Lin Liu, Bolin Ding, and Haixun Wang
Driven by the emerging network applications, querying and mining uncertain graphs has become increasingly important. In this paper, we investigate a fundamental problem concerning uncertain graphs, which we call the distance-constraint reachability (DCR) problem: Given two vertices s and t, what is the probability that the distance from s to t is less than or equal to a user-deﬁned threshold d in the uncertain graph? Since this problem is #P-Complete, we focus on efﬁciently and accurately approximating DCR online. Our main results include two new estimators for the probabilistic reachability. One is a Horvitz Thomson type estimator based on the unequal probabilistic sampling scheme, and the other is a novel recursive sampling estimator, which effectively combines a deterministic recursive computational procedure with a sampling process to boost the estimation accuracy. Both estimators can produce much smaller variance than the direct sampling estimator, which considers each trial to be either 1 or 0. We also present methods to make these estimators more computationally efﬁcient. The comprehensive experiment evaluation on both real and synthetic datasets demonstrates the efﬁciency and accuracy of our new estimators.
In Proceedings of the VLDB Endowment, the 37th International Conference on Very Large Data Bases (VLDB 2011)
Publisher Very Large Data Bases Endowment Inc.