Ruoming Jin, Lin Liu, Bolin Ding, and Haixun Wang
Driven by emerging network applications, querying and mining uncertain graphs has become increasingly important. In this paper, we investigate a fundamental problem concerning uncertain graphs, which we call the distance-constraint reachability (DCR) problem: Given two vertices s and t, what is the probability that the distance from s to t is less than or equal to a user-defined threshold d in the uncertain graph? Since this problem is #P-complete, we focus on efficiently and accurately approximating DCR online. Our main results include two new estimators for probabilistic reachability. One is a Horvitz-Thompson type estimator based on an unequal probabilistic sampling scheme, and the other is a novel recursive sampling estimator, which combines a deterministic recursive computational procedure with a sampling process to boost estimation accuracy. Both estimators can produce much smaller variance than the direct sampling estimator, which treats each sampled graph as a trial whose outcome is either 1 or 0. We also present methods to make these estimators more computationally efficient. A comprehensive experimental evaluation on both real and synthetic datasets demonstrates the efficiency and accuracy of our new estimators.
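For orientation, the direct sampling baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation, and it rests on several assumptions not stated in the abstract: edges are directed and exist independently with the given probabilities, distance is measured in hops, and the names `within_distance` and `direct_sampling_dcr` are hypothetical. Each trial samples one possible graph and records 1 if t lies within distance d of s, and 0 otherwise.

```python
import random
from collections import deque


def within_distance(adj, s, t, d):
    """Return True if t is reachable from s in at most d hops (BFS)."""
    if s == t:
        return True
    seen = {s}
    frontier = deque([(s, 0)])
    while frontier:
        u, dist = frontier.popleft()
        if dist == d:
            continue  # expanding u would exceed the distance threshold
        for v in adj.get(u, ()):
            if v == t:
                return True
            if v not in seen:
                seen.add(v)
                frontier.append((v, dist + 1))
    return False


def direct_sampling_dcr(edges, s, t, d, n_samples=10000, seed=0):
    """Estimate the DCR probability by averaging 0/1 trials.

    edges maps a directed pair (u, v) to its existence probability.
    Assumption: edges are independent and distance is the hop count.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Sample one possible graph: keep each edge with its probability.
        adj = {}
        for (u, v), p in edges.items():
            if rng.random() < p:
                adj.setdefault(u, []).append(v)
        hits += within_distance(adj, s, t, d)
    return hits / n_samples
```

On a toy graph with edges a→b, b→c, and a→c, each present with probability 0.5, the exact probability that c is within two hops of a is 1 − (1 − 0.5)(1 − 0.25) = 0.625, and `direct_sampling_dcr(edges, "a", "c", 2)` should return a value close to that. Because every trial contributes only 0 or 1, the variance of this estimator is that of a Bernoulli mean, which is the baseline the two proposed estimators aim to improve on.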
|Published in||Proceedings of the VLDB Endowment, the 37th International Conference on Very Large Data Bases (VLDB 2011)|
|Publisher||Very Large Data Bases Endowment Inc.|