More Data Less Work: SVM Training in Time Decreasing with Larger Data Sets

Traditional runtime analysis of training Support Vector Machines, and indeed of most learning methods, shows how training runtime increases as more training examples become available. Considering the true objective of training, which is to obtain a good predictor, I will argue that training time should instead be studied as a decreasing function of training set size. I will then present both theoretical and empirical results demonstrating that a simple stochastic subgradient descent approach for training SVMs indeed displays such monotonically decreasing behavior.
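To make the approach concrete, below is a minimal sketch of a stochastic subgradient step on the regularized hinge loss, in the Pegasos style; it is illustrative only, and the names (svm_sgd, lam, n_iters) and the 1/(lam*t) step-size schedule are assumptions, not a reproduction of the talk's exact algorithm. The relevant point is that each step touches a single example, so per-step cost does not grow with the training set size.

import numpy as np

def svm_sgd(X, y, lam=0.01, n_iters=10000, seed=0):
    """Stochastic subgradient descent on (lam/2)||w||^2 + mean_i max(0, 1 - y_i <w, x_i>)."""
    rng = np.random.default_rng(seed)
    m, d = X.shape
    w = np.zeros(d)
    for t in range(1, n_iters + 1):
        i = rng.integers(m)           # one random training example per step
        eta = 1.0 / (lam * t)         # decaying step size (assumed schedule)
        if y[i] * X[i].dot(w) < 1:    # margin violated: hinge subgradient is -y_i x_i
            w = (1 - eta * lam) * w + eta * y[i] * X[i]
        else:                         # margin satisfied: only the regularizer contributes
            w = (1 - eta * lam) * w
    return w

# Toy usage on two synthetic Gaussian blobs with labels in {-1, +1}.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(+1.0, 1.0, (200, 2)), rng.normal(-1.0, 1.0, (200, 2))])
y = np.concatenate([np.ones(200), -np.ones(200)])
w = svm_sgd(X, y)
print("training accuracy:", np.mean(np.sign(X.dot(w)) == y))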

I will also discuss a similar phenomenon in the context of Gaussian mixture clustering, where it appears that excess data turns the problem from computationally intractable to computationally tractable.

Joint work with Shai Shalev-Shwartz, Yoram Singer, Greg Shakhnarovich and Sam Roweis.

Speaker Details

Nathan Srebro received his PhD from MIT in 2004 and, after visiting positions at the University of Toronto and the IBM Haifa Research Labs, is now an Assistant Professor at the Toyota Technological Institute–Chicago (TTI-Chicago). TTI-Chicago is a philanthropically endowed academic computer science institute dedicated to basic research and graduate education in computer science.

Speakers:
Nathan Srebro
Affiliation:
Toyota Technological Institute–Chicago