Misha

Mikhail Bilenko


mbilenko@microsoft.com


Update: I lead the Machine Learning Algorithms team in the Cloud+Enterprise division; we are part of Microsoft Azure ML. Our learners, predictors, and tools are used by many product groups across the company, and we collaborate extensively with MSR and applied ML/data science groups. If you love both ML fundamentals and coding, and would enjoy a job where you do both with a fun group of incredible engineers and scientists, please ping me.

Before that, I was a researcher in the Machine Learning Department at Microsoft Research. I like building ML systems and tools, and working on large-scale prediction problems, such as those around behavioral, transactional, and textual data. Recent applications I have worked on include high-throughput ML, click probability prediction, advertisement selection, constructing user profiles for targeting, and improving search relevance by mining logs of browsing behavior. In the past, I worked on semi-supervised clustering and record linkage (entity resolution, de-duplication, etc.). I am generally interested in adaptive similarity/distance functions, implementing learning algorithms on parallel/distributed platforms, and creating tools for machine learning practitioners.

I completed my Ph.D. in the Department of Computer Science at the University of Texas at Austin in 2006, where I was a member of the Machine Learning Group. Along the way, I spent the summer of 2002 at IBM T.J. Watson Research Center, and the summer/fall of 2004 at Google.

Research

  • Learning from large datasets
  • Learnable similarity functions and their applications in information integration (e.g., record linkage/identity uncertainty) and text mining


  • Semi-supervised clustering

    • Probabilistic Semi-Supervised Clustering with Constraints
      Sugato Basu, Mikhail Bilenko, Arindam Banerjee, and Raymond J. Mooney. In Semi-Supervised Learning, O. Chapelle, B. Schölkopf, and A. Zien (eds.), MIT Press, 2006.
      Note: this chapter summarizes the KDD and ICML papers below
      [PDF]

    • A Probabilistic Framework for Semi-Supervised Clustering
      Sugato Basu, Mikhail Bilenko, and Raymond J. Mooney. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004), pp.59-68, Seattle, WA, August 2004.
      (Winner of Best Research Paper Award)
      [PDF]

    • Integrating Constraints and Metric Learning in Semi-Supervised Clustering
      Mikhail Bilenko, Sugato Basu, and Raymond J. Mooney. In Proceedings of the 21st International Conference on Machine Learning (ICML-2004), pp.81-88, Banff, Canada, July 2004.
      [PDF]

    • A Comparison of Inference Techniques for Semi-supervised Clustering with Hidden Markov Random Fields
      Mikhail Bilenko and Sugato Basu. In Proceedings of the ICML-2004 Workshop on Statistical Relational Learning and its Connections to Other Fields (SRL-2004), pp.17-22, Banff, Canada, July 2004.
      [PDF]

  • Indirect learning in information integration (record linkage, information extraction), text classification, and clustering

    • Two Approaches to Handling Noisy Variation in Text Mining
      Un Yong Nahm, Mikhail Bilenko, and Raymond J. Mooney. In Proceedings of the ICML-2002 Workshop on Text Learning (TextML'2002), pp.18-27, Sydney, Australia, July 2002.
      [PDF]

Personal

In my leisure time I enjoy applying hill-climbing search and gradient descent algorithms to real-world domains, which are almost as cool as the cool stuff that my sister does.

Contact Info

Email    mbilenko@microsoft.com

Postal   Microsoft Research
         One Microsoft Way
         Redmond, WA 98052
         USA