My undergraduate degree is from UC Berkeley, and my Ph.D. is from UCLA Linguistics. I joined MSR in 1992, and most of my work since then has focused on semantic processing. I am also deeply involved in our group's machine translation effort (see our MT blog and the Live Translator homepage), as well as the Microsoft Research ESL Assistant project, which helps non-native speakers write better English by showing them targeted usage examples culled from the web. For more details on this project, see our project site and blog.
During the 1990s I was preoccupied with building richly structured semantic networks from text data as part of the MindNet project. This work spurred my interest in "the paraphrase problem": when do superficially dissimilar strings of words convey essentially the same meaning?
Learning to identify and generate such paraphrase alternations is key to developing applications that appear to understand human language, and we've done some interesting work in this area. I have also been active in helping establish the Recognizing Textual Entailment challenges, which address a closely related problem.
- Mark Yatskar, Svitlana Volkova, Asli Celikyilmaz, Bill Dolan, and Luke Zettlemoyer, Learning to Relate Literal and Sentimental Descriptions of Visual Properties, North American Chapter of the Association for Computational Linguistics, 2013
- Xiaohu Liu, Ruhi Sarikaya, Chris Brockett, Chris Quirk, and William Dolan, Paraphrase features to improve natural language understanding, in Interspeech 2013, 2013
- Svitlana Volkova, Pallavi Choudhury, Chris Quirk, Bill Dolan, and Luke Zettlemoyer, Lightly Supervised Learning of Procedural Dialog Systems, Association for Computational Linguistics, 2013
- Haifeng Wang, Idan Szpektor, Shiqi Zhao, and Bill Dolan, Special Issue on Paraphrasing, in ACM Transactions on Intelligent Systems and Technology (TIST), ACM, 2013
- Svitlana Volkova, Bill Dolan, and Theresa Wilson, CLex: A Lexicon for Exploring Color, Concept and Emotion Associations in Language, Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2012
- Wei Xu, Alan Ritter, William B. Dolan, Ralph Grishman, and Colin Cherry, Paraphrasing for Style, Conference on Computational Linguistics, 2012
- David L. Chen and William B. Dolan, Building a Persistent Workforce on Mechanical Turk for Multilingual Data Collection, Human Computer Interaction International Conference, 2011
- Alan Ritter, Colin Cherry, and William B. Dolan, Data-Driven Response Generation in Social Media, Empirical Methods in Natural Language Processing (EMNLP), 2011
- David Chen and William B. Dolan, Collecting Highly Parallel Data for Paraphrase Evaluation, Association for Computational Linguistics, 2011
- Ido Dagan, Bill Dolan, Bernardo Magnini, and Dan Roth, Recognizing textual entailment: Rational, evaluation and approaches, in Journal of Natural Language Engineering, 2010
- Danilo Giampiccolo, Hoa Dang, Bernardo Magnini, Ido Dagan, and Bill Dolan, The Fourth Pascal Recognizing Textual Entailment Challenge, in Journal of Natural Language Engineering, 2010
- Alan Ritter, Colin Cherry, and Bill Dolan, Unsupervised Modeling of Twitter Conversations, Human Language Technologies - North American Chapter of the Association for Computational Linguistics (HLT-NAACL), 2010
- Michael Gamon, Claudia Leacock, Chris Brockett, William B. Dolan, Jianfeng Gao, Dmitriy Belenko, and Alexandre Klementiev, Using Statistical Techniques and Web Search to Correct ESL Errors, in CALICO Journal, Vol. 26, No. 3, June 2009
- Xiaodong He, Jianfeng Gao, Chris Quirk, Patrick Nguyen, Arul Menezes, Robert Moore, Kristina Toutanova, Mei Yang, Bill Dolan, Mu Li, Chi-Ho Li, Dongdong Zhang, Long Jiang, and Ming Zhou, The MSR-MSRA MT System for NIST Open Machine Translation 2008 Evaluation, in The 2008 NIST Open Machine Translation Evaluation Workshop, 2008
- Xiaodong He, Jianfeng Gao, Chris Quirk, Patrick Nguyen, Arul Menezes, Robert Moore, Kristina Toutanova, Mei Yang, Bill Dolan, Mu Li, Chi-Ho Li, Dongdong Zhang, Long Jiang, Ming Zhou, George Foster, Roland Kuhn, Jing Zheng, Wen Wang, Necip Fazil Ayan, Dimitra Vergyri, Nicolas Scheffer, and Andreas Stolcke, The MSR-NRC-SRI MT System for NIST Open Machine Translation 2008 Evaluation, in The 2008 NIST Open Machine Translation Evaluation Workshop, 2008
- Xing Yi, Jianfeng Gao, and William B. Dolan, A Web-based English Proofing System for English as a Second Language Users, International Joint Conference on Natural Language Processing (IJCNLP), 2008
- Michael Gamon, Jianfeng Gao, Chris Brockett, Alexander Klementiev, William Dolan, Dmitriy Belenko, and Lucy Vanderwende, Using Contextual Speller Techniques and Language Modeling for ESL Error Correction, in Proceedings of IJCNLP, Hyderabad, India, Asia Federation of Natural Language Processing, January 2008
- Danilo Giampiccolo, Bernardo Magnini, Ido Dagan, and Bill Dolan, The Third PASCAL Recognizing Textual Entailment Challenge, Workshop on Textual Entailment and Paraphrasing, 2007
- Lucy Vanderwende and William B. Dolan, What syntax can contribute in entailment task, Springer-Verlag, June 2006
- Roy Bar-Haim, Ido Dagan, Bill Dolan, Lisa Ferro, Danilo Giampiccolo, and Bernardo Magnini, The Second PASCAL Recognising Textual Entailment Challenge, Workshop on Recognising Textual Entailment, 2006
- Chris Brockett, William B. Dolan, and Michael Gamon, Correcting ESL Errors Using Phrasal SMT Techniques, in 21st International Conference on Computational Linguistics and 44th Annual Meeting of the ACL, Sydney, Australia, Association for Computational Linguistics, 2006
- William B. Dolan and Chris Brockett, Automatically Constructing a Corpus of Sentential Paraphrases, in Third International Workshop on Paraphrasing (IWP2005), Asia Federation of Natural Language Processing, 2005
- Chris Brockett and William B. Dolan, Support Vector Machines for Paraphrase Identification and Corpus Construction, in Third International Workshop on Paraphrasing (IWP2005), Asia Federation of Natural Language Processing, 2005
- William Dolan, Chris Quirk, and Chris Brockett, Unsupervised Construction of Large Paraphrase Corpora: Exploiting Massively Parallel News Sources, International Conference on Computational Linguistics, August 2004
- Chris Quirk, Chris Brockett, and William B. Dolan, Monolingual Machine Translation for Paraphrase Generation, Association for Computational Linguistics, July 2004
