
Speaker Maria Florina Balcan Host Nikhil Devanur Affiliation Georgia Institute of Technology Duration 01:00:38 Date recorded 22 July 2010 There has been substantial work on approximation algorithms for clustering data under distancebased objective functions such as kmedian, kmeans, and minsum objectives. This work is fueled in part by the hope that approximating these objectives well will indeed yield more accurate solutions. That is, for problems such as clustering proteins by function, or clustering images by subject, there is some unknown correct “target” clustering and the implicit assumption is that clusterings that are approximately optimal in terms of these distancebased measures are also approximately correct in terms of error with respect to the target. In this work we show that if we make this implicit assumption explicit — that is, if we assume that any capproximation to the given clustering objective Phi is epsilonclose to the target — then we can produce clusterings that are O(epsilon)close to the target, even for values c for which obtaining a capproximation is NPhard. In particular, for the kmedian, kmeans, and minsum objectives, we show that we can achieve this guarantee for any constant c 1.
©2010 Microsoft Corporation. All rights reserved.
