John C. Platt, CCSP Group, Microsoft Research
Proc. Advances in Neural Information Processing Systems, volume 16, (2004), to appear.
This paper applies fast sparse multidimensional scaling (MDS) to a large graph of music similarity, with 267K vertices that represent artists, albums, and tracks; and 3.22M edges that represent similarity between those entities. Once vertices are assigned locations in a Euclidean space, the locations can be used to browse music and to generate playlists.
MDS
on very large sparse graphs can be effectively performed by a family of algorithms
called Rectangular Dijsktra (RD) MDS algorithms. These RD algorithms operate on
a dense rectangular slice of the distance matrix, created by calling Dijsktra a
constant number of times. Two RD algorithms are compared: Landmark MDS, which
uses the Nyström approximation to perform MDS; and a new algorithm called Fast
Sparse Embedding, which uses FastMap. These algorithms compare favorably to
Laplacian Eigenmaps, both in terms of speed and embedding quality.
PDF file (147 KB)