Mode-detection via median-shift

Lior Shapira, Shai Avidan, and Ariel Shamir

Abstract

Median-shift is a mode seeking algorithm that relies on computing the median of local

neighborhoods, instead of the mean. We further combine median-shift with Locality

Sensitive Hashing (LSH) and show that the combined algorithm is

suitable for clustering large scale, high dimensional data sets. In particular,

we propose a new mode detection step that greatly accelerates performance. In

the past, LSH was used in conjunction with mean shift only to accelerate nearest

neighbor queries. Here we show that we can analyze the density of the LSH bins

to quickly detect potential mode candidates and use only them to initialize the

median-shift procedure. We use the median, instead of the mean (or its discrete

counterpart - the medoid) because the median is more robust and because the

median of a set is a point in the set. A median is well defined for scalars but

there is no single agreed upon extension of the median to high dimensional

data. We adopt a particular extension, known as the Tukey median, and show that

it can be computed efficiently using random projections of the high dimensional

data onto 1D lines, just like LSH, leading to a tightly integrated and

efficient algorithm.

Details

Publication typeInproceedings
Published inComputer Vision, 2009 IEEE 12th International Conference on
URL10.1109/ICCV.2009.5459423
Pages1909 -1916
> Publications > Mode-detection via median-shift