Estimating the Support of a High-Dimensional Distribution

Authors

Bernhard Schölkopf, Machine Learning and Perception Group, Microsoft Research Cambridge

John Platt, CCSP Group, Microsoft Research

John Shawe-Taylor, Department of Computer Science, Royal Holloway, University of London

Alex J. Smola, RSISE, Australian National University

Robert C. Williamson, RSISE, Australian National University

Reference

Neural Computation, v 13, no 7, pp. 1443-1472, (2001).

Abstract

Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1.

We propose a method to approach this problem by trying to estimate a function f that is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a theoretical analysis of the statistical performance of our algorithm.

The algorithm is a natural extension of the support vector algorithm to the case of unlabeled data.

Paper Link

The journal paper is only available on-line with a subscription to Neural Computation.

 

An earlier version of the paper is available as Microsoft Research Technical Report MSR-TR-99-87, (1999).

A conference version was Support Vector Method for Novelty Detection by B. Schölkopf, R. Williamson, A.J. Smola, J. Shawe-Taylor, J.C. Platt, Advances in Neural Information Processing Systems 12, pp. 582-588, (2000).