Misha

Mikhail Bilenko


mbilenko@microsoft.com



I am a Researcher in the Text Mining, Search, and Navigation (TMSN) group at Microsoft Research. I am broadly interested in machine learning, data mining, and information retrieval tasks that arise in the context of large textual and behavioral datasets. Specific problems on which I focus include record linkage, semi-supervised clustering, and improving information retrieval and targeted advertising via user behavior modeling. I am also interested in information extraction, recommender systems, and methods related to learning similarity (distance, kernel) functions.

I completed my Ph.D. in the Department of Computer Sciences at the University of Texas at Austin in the summer of 2006. I was a member of the Machine Learning group led by Prof. Raymond Mooney. Along the way, I spent the summer of 2002 at IBM T.J. Watson Research Center, and the summer/fall of 2004 at Google.

Research

  • Learning from large datasets of user behavior

    • Enhancing Web Search by Promoting Multiple Search Engine Usage
      Ryen W. White, Matthew Richardson, Mikhail Bilenko, and Allison Heath. To appear in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR-2008), Singapore, July 2008.
    • Talking the Talk vs. Walking the Walk: Salience of Information Needs in Querying vs. Browsing
      Mikhail Bilenko, Ryen W. White, Matthew Richardson, and G. Craig Murray. To appear in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR-2008), Singapore, July 2008.
      [PDF] [bib]
    • Mining the Search Trails of Surfing Crowds: Identifying Relevant Websites From User Activity
      Mikhail Bilenko and Ryen W. White. To appear in Proceedings of the 17th International World Wide Web Conference (WWW-2008), pp.51-60, Beijing, April 2008.
      [PDF] [bib]
    • Studying the Use of Popular Destinations to Enhance Web Search Interaction
      Ryen W. White, Mikhail Bilenko, and Silviu Cucerzan. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR-2007), pp.159-166, Amsterdam, July 2007.
      (Winner of Best Paper Award)
      [PDF] [PS.gz] [bib]

  • Learnable similarity functions and their applications in information integration (e.g., record linkage/identity uncertainty) and text mining


  • Semi-supervised clustering

    • Probabilistic Semi-Supervised Clustering with Constraints
      Sugato Basu, Mikhail Bilenko, Arindam Banerjee, and Raymond J. Mooney. In Semi-Supervised Learning, O. Chapelle, B. Schölkopf, and A. Zien (eds.), MIT Press, 2006.
      Note: this chapter summarizes the KDD-2004 and ICML-2004 papers listed below.
      [PDF] [PS.gz] [bib]

    • A Probabilistic Framework for Semi-Supervised Clustering
      Sugato Basu, Mikhail Bilenko, and Raymond J. Mooney. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004), pp.59-68, Seattle, WA, August 2004.
      (Winner of Best Research Paper Award)
      [PDF] [PS.gz] [bib]

    • Integrating Constraints and Metric Learning in Semi-Supervised Clustering
      Mikhail Bilenko, Sugato Basu, and Raymond J. Mooney. In Proceedings of the 21st International Conference on Machine Learning (ICML-2004), pp.81-88, Banff, Canada, July 2004.
      [PDF] [PS.gz] [bib]

    • A Comparison of Inference Techniques for Semi-supervised Clustering with Hidden Markov Random Fields
      Mikhail Bilenko and Sugato Basu. In Proceedings of the ICML-2004 Workshop on Statistical Relational Learning and its Connections to Other Fields (SRL-2004), pp.17-22, Banff, Canada, July 2004.
      [PDF] [PS.gz] [bib]

  • Indirect learning in information integration (record linkage, information extraction), text classification, and clustering

    • Two Approaches to Handling Noisy Variation in Text Mining
      Un Yong Nahm, Mikhail Bilenko, and Raymond J. Mooney. In Proceedings of the ICML-2002 Workshop on Text Learning (TextML'2002), pp.18-27, Sydney, Australia, July 2002.
      [PDF] [PS.gz] [bib]

Personal

In my leisure time I enjoy applying hill-climbing search and gradient descent algorithms to real-world domains.

Contact Info

Email:  mbilenko@microsoft.com

Postal: Microsoft Research
        One Microsoft Way
        Redmond, WA 98052
        USA