Embedding, Distance Estimation and Object Location in Networks

Concurrent with numerous theoretical results on metric embeddings, a growing body of research in the networking community has studied the distance matrix defined by node-to-node latencies in the Internet, resulting in a number of recent approaches that approximately embed this distance matrix into low-dimensional Euclidean space. A fundamental distinction between the theoretical approaches to embeddings and this recent Internet-related work is that the latter operates under the additional constraint that it is only feasible to measure a linear number of node pairs, and typically in a highly structured way. Indeed, the most common framework here is a beacon-based approach: one randomly chooses a small number of nodes (’beacons’) in the network, and each node measures its distance to these beacons only. Moreover, beacon-based algorithms are also designed for the more basic problem of triangulation, in which one uses the triangle inequality to infer the distances that have not been measured.

We give algorithms with provable performance guarantees for triangulation and embedding. We show that in addition to multiplicative error in the distances, performance guarantees for beacon-based algorithms typically must include a notion of ”slack” – a certain fraction of all distances may be arbitrarily distorted.

For arbitrary metrics, we give a beacon-based embedding algorithm that achieves constant distortion on a-fraction of distances; this provides some theoretical justification for the success of the recent networking algorithms, and forms an interesting contrast with lower bounds showing that it is not possible to embed all distances with constant distortion. For doubling metrics (which have been proposed as a reasonable abstraction of Internet latencies), we show that triangulation with a constant number of beacons can achieve multiplicative error 1+ δ on a-fraction of distances, for arbitrarily small constants ,δ.

We extend these results in a number of directions: embeddings with slack that work for all distributed algorithms for triangulation and embedding with low overhead on all participating nodes; distributed triangulation with guarantees for all node pairs; node-labeling problems for graphs and metrics; systems project on location-aware node selection in a large-scale distributed network.