Robust Spectral Inference for Joint Stochastic Matrix Factorization and Topic Modeling

Spectral inference provides fast algorithms and provable optimality for latent topic analysis. On real data, however, these algorithms require additional ad-hoc heuristics, and even then often produce unusable results. We explain this poor performance by casting the problem of topic inference in the framework of Joint Stochastic Matrix Factorization (JSMF) and showing that previous methods violate the theoretical conditions necessary for a good solution to exist. We then propose a novel rectification method that learns high-quality topics and their interactions even on small, noisy data. This method achieves results comparable to probabilistic techniques in several domains while maintaining scalability and provable optimality.

Speaker Details

Moontae Lee is a Ph.D. student in Computer Science at Cornell University. His research involves designing new models and algorithms for machine learning by intertwining matrix theory and stochastic methods. His research interests include spectral inference, low-dimensional embedding, and non-parametric approaches. He is currently working with David Mimno, David Bindel, and Peter Frazier on topic modeling and non-parametric ranking/Bayesian methods.

Date:
Speakers:
Moontae Lee
Affiliation:
Cornell University
    • Jeff Running

Series: Microsoft Research Talks