Statistical Physics of Clustering Algorithms

This thesis presents a principled approach to clustering by formulating the problem in a probabilistic autoencoder framework based on folded Markov chains. Deterministic annealing is introduced as a suitable optimization technique; it performs robust optimization based on an analogy to the cooling of a system in statistical physics. Applying deterministic annealing to the derived clustering cost functions leads to three algorithms: soft topographic vector quantization (STVQ), which performs topographic clustering on Euclidean feature vectors; kernel-based soft topographic mapping (STMK), which performs the clustering in a high-dimensional Euclidean feature space by means of the kernel trick; and soft topographic mapping for proximity data (STMP), which generalizes STVQ to arbitrary pairwise dissimilarity data in a mean-field fashion. All three algorithms are analysed with respect to the annealing process, and their application is demonstrated on both artificial and real-world data.
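
To make the annealing idea concrete, the following is a minimal sketch of deterministic annealing applied to plain soft clustering of Euclidean vectors. It is not the thesis's STVQ algorithm: the topographic neighbourhood coupling is omitted, and the function names (`soft_assign`, `anneal_clusters`), the schedule parameters (`beta0`, `beta_max`, `rate`, `iters`) and the small re-perturbation of the centers at each temperature are all assumptions chosen for illustration.

```python
import math
import random

def soft_assign(x, centers, beta):
    """Gibbs assignment probabilities of point x at inverse temperature beta."""
    d = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]
    dmin = min(d)  # shift by the minimum distance for numerical stability
    w = [math.exp(-beta * (di - dmin)) for di in d]
    z = sum(w)
    return [wi / z for wi in w]

def anneal_clusters(data, k, beta0=0.01, beta_max=100.0, rate=1.5,
                    iters=20, seed=0):
    """Soft clustering with a geometric cooling schedule (hypothetical defaults)."""
    rng = random.Random(seed)
    dim = len(data[0])
    mean = [sum(x[j] for x in data) / len(data) for j in range(dim)]
    # At high temperature (small beta) all centers sit at the data mean.
    centers = [list(mean) for _ in range(k)]
    beta = beta0
    while beta <= beta_max:
        # Assumption: a tiny perturbation lets coinciding centers split
        # once the temperature drops below a critical value.
        for c in centers:
            for j in range(dim):
                c[j] += 1e-3 * rng.gauss(0.0, 1.0)
        for _ in range(iters):
            probs = [soft_assign(x, centers, beta) for x in data]
            for r in range(k):
                mass = sum(p[r] for p in probs)
                if mass > 0.0:  # leave a center untouched if it owns no mass
                    centers[r] = [
                        sum(p[r] * x[j] for p, x in zip(probs, data)) / mass
                        for j in range(dim)
                    ]
        beta *= rate  # lower the temperature
    return centers

data = [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.1), (0.1, 0.2),
        (5.0, 5.0), (5.2, 4.9), (4.9, 5.1), (5.1, 5.2)]
centers = anneal_clusters(data, k=2)
```

At small `beta` every point is shared almost equally among the centers, so they stay together at the data mean; as `beta` grows the assignments sharpen and the centers separate, which is the cooling analogy the abstract refers to.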

File: diplom.ps.gz

Details

Type: Master's Thesis
Address: Berlin, Germany