Speaker Ping Li
Host Chris Burges
Date recorded 14 December 2006
Random projections are a popular method for estimating distances. Instead of computing L2 distances over a big data matrix, we multiply the big matrix with a very sparse random matrix (over [-1 0 1]) to produce a smaller matrix. Distances in the smaller matrix are close to distances in the big matrix, but the smaller matrix is easier to work with. We then generalize very sparse random projections by introducing a novel data structure for estimating L1 and Lp distances as opposed to L2 distances only. Finally, we end with a new method, conditional random sampling. The new method estimates a variety of distance functions (L1, L2, Lp, Chi-square and more) from a single data structure. A single data structure is more convenient than different data structures for different distance functions.
©2006 Microsoft Corporation. All rights reserved.