Thore Graepel, Matthias Burger, and Klaus Obermayer
We offer three algorithms for the generation of topographic mappings to the practitioner of unsupervised data analysis. Each algorithm is based on the minimization of a cost function, carried out with an EM algorithm and deterministic annealing. The soft topographic vector quantization algorithm (STVQ), like the original Self-Organizing Map (SOM), provides a tool for the creation of self-organizing maps of Euclidean data. Its optimization scheme, however, offers an alternative to the heuristic stepwise shrinking of the neighborhood width in the SOM and makes it possible to use a fixed neighborhood function solely to encode desired neighborhood relations between nodes. The kernel-based soft topographic mapping (STMK) is a generalization of STVQ that introduces new distance measures in data space based on kernel functions. Using these distance measures corresponds to performing STVQ in a high-dimensional feature space that is related to data space by a nonlinear mapping. This preprocessing can reveal structure in the data that may go unnoticed if STVQ is performed in the standard Euclidean space. The soft topographic mapping for proximity data (STMP) is another generalization of STVQ; it enables the user to generate topographic maps for data that are given in terms of pairwise proximities. It thus offers a flexible alternative to multidimensional scaling methods and opens up a new range of applications for Self-Organizing Maps. Both STMK and STMP share the robust optimization properties of STVQ due to the application of deterministic annealing. In our contribution we discuss the algorithms together with their implementation and provide detailed pseudo-code and explanations.
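To make the optimization scheme concrete, the following is a minimal Python sketch of an STVQ-style procedure: an EM loop nested inside a deterministic annealing schedule, with a fixed neighborhood function. The abstract does not state the update equations, so the sketch assumes a cost of the form E = 1/2 sum_t sum_r m_tr sum_s h_rs ||x_t - w_s||^2 with soft assignments in place of the binary m_tr. The function name `stvq_sketch`, the one-dimensional chain of nodes, the Gaussian neighborhood, and all parameter defaults are illustrative assumptions, not the authors' pseudo-code.

```python
import numpy as np

def stvq_sketch(X, n_nodes=5, sigma=1.0, beta_start=0.1, beta_end=50.0,
                beta_factor=1.5, em_iters=20, seed=0):
    """Illustrative STVQ-style EM with deterministic annealing (assumed form)."""
    rng = np.random.default_rng(seed)
    n_dims = X.shape[1]

    # Fixed neighborhood function on a 1-D chain of nodes (assumption),
    # normalized so that each row sums to one. It is never shrunk.
    idx = np.arange(n_nodes)
    H = np.exp(-(idx[:, None] - idx[None, :]) ** 2 / (2.0 * sigma ** 2))
    H /= H.sum(axis=1, keepdims=True)

    # Start all weight vectors near the data mean, with a small perturbation.
    W = X.mean(axis=0) + 1e-3 * rng.standard_normal((n_nodes, n_dims))

    beta = beta_start  # inverse temperature, increased during annealing
    while beta <= beta_end:
        for _ in range(em_iters):
            # Squared Euclidean distances D[t, s] = ||x_t - w_s||^2.
            D = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
            # Neighborhood-averaged cost E[t, r] = 1/2 sum_s H[r, s] D[t, s].
            E = 0.5 * D @ H.T

            # E-step: soft assignments P[t, r] proportional to exp(-beta E[t, r]).
            logits = -beta * E
            logits -= logits.max(axis=1, keepdims=True)  # numerical stability
            P = np.exp(logits)
            P /= P.sum(axis=1, keepdims=True)

            # M-step: each w_s is a weighted mean of the data with weights
            # G[t, s] = sum_r P[t, r] H[r, s].
            G = P @ H
            W = (G.T @ X) / G.sum(axis=0)[:, None]
        beta *= beta_factor  # lower the temperature
    return W, P
```

In this sketch the neighborhood width `sigma` stays fixed throughout; the annealing acts only on `beta`, which moves the assignments gradually from nearly uniform to nearly hard. A call such as `W, P = stvq_sketch(X)` on an array `X` of shape `(n_samples, n_dims)` returns the node weight vectors and the final soft assignments.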