I am a Principal Researcher in Microsoft Research, where I manage the Natural Language Processing group. My undergraduate degree is from UC Berkeley, and my Ph.D. is from UCLA Linguistics. I joined MSR in 1992, and most of my work since then has focused on semantic processing.
A fundamental research interest has been "the paraphrase problem": when do superficially dissimilar strings of words convey essentially the same meaning?
- On its way to an extended mission at Saturn, the Cassini probe on Friday makes its closest rendezvous with Saturn's dark moon Phoebe.
- The Cassini spacecraft, which is en route to Saturn, is about to make a close pass of the ringed planet's mysterious moon Phoebe.
Learning to identify and generate such alternations is key to developing applications that appear to understand human language, and we've done some interesting work in this area. I have also been active in helping establish the Recognizing Textual Entailment challenges, which address a closely related problem. In addition, I've worked extensively on Machine Translation, managing the Microsoft Translator team from its inception until 2011.
Most recently, my work has focused on modeling language "grounded" or "situated" in the real world:
- Learning dialog models that tie linguistic utterances to specific changes in machine state
- Linking crowdsourced natural language descriptions to visual objects and actions
I'm particularly interested in imbuing machines with the linguistic means to react to environmental changes, and in allowing humans to alter the state of the world (real or virtual) with language.
- Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, and Bill Dolan, A Diversity-Promoting Objective Function for Neural Conversation Models, in NAACL HLT 2016 (forthcoming), March 2016.
- Alessandro Sordoni, Michel Galley, Michael Auli, Chris Brockett, Yangfeng Ji, Meg Mitchell, Jian-Yun Nie, Jianfeng Gao, and Bill Dolan, A Neural Network Approach to Context-Sensitive Generation of Conversational Responses, Conference of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies (NAACL-HLT 2015), 1 June 2015.
- Xiaohu Liu, Ruhi Sarikaya, Chris Brockett, Chris Quirk, and William Dolan, Paraphrase features to improve natural language understanding, in Interspeech 2013, 2013.
- Mark Yatskar, Svitlana Volkova, Asli Celikyilmaz, Bill Dolan, and Luke Zettlemoyer, Learning to Relate Literal and Sentimental Descriptions of Visual Properties, North American Chapter of the Association for Computational Linguistics, 2013.
- Svitlana Volkova, Pallavi Choudhury, Chris Quirk, Bill Dolan, and Luke Zettlemoyer, Lightly Supervised Learning of Procedural Dialog Systems, Association for Computational Linguistics, 2013.
- Haifeng Wang, Idan Szpektor, Shiqi Zhao, and Bill Dolan, Special Issue on Paraphrasing, in ACM Transactions on Intelligent Systems and Technology (TIST), ACM, 2013.
- Svitlana Volkova, Bill Dolan, and Theresa Wilson, CLex: A Lexicon for Exploring Color, Concept and Emotion Associations in Language, Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2012.
- Wei Xu, Alan Ritter, William B. Dolan, Ralph Grishman, and Colin Cherry, Paraphrasing for Style, Conference on Computational Linguistics, 2012.
- Alan Ritter, Colin Cherry, and William B. Dolan, Data-Driven Response Generation in Social Media, Empirical Methods in Natural Language Processing (EMNLP), 2011.
- David L. Chen and William B. Dolan, Building a Persistent Workforce on Mechanical Turk for Multilingual Data Collection, Human Computer Interaction International Conference, 2011.
- David Chen and William B. Dolan, Collecting Highly Parallel Data for Paraphrase Evaluation, Association for Computational Linguistics, 2011.
- Ido Dagan, Bill Dolan, Bernardo Magnini, and Dan Roth, Recognizing textual entailment: Rational, evaluation and approaches, in Journal of Natural Language Engineering, 2010.
- Danilo Giampiccolo, Hoa Dang, Bernardo Magnini, Ido Dagan, and Bill Dolan, The Fourth Pascal Recognizing Textual Entailment Challenge, in Journal of Natural Language Engineering, 2010.
- Alan Ritter, Colin Cherry, and Bill Dolan, Unsupervised Modeling of Twitter Conversations, Human Language Technologies - North American Chapter of the Association for Computational Linguistics (HLT-NAACL), 2010.
- Michael Gamon, Claudia Leacock, Chris Brockett, William B. Dolan, Jianfeng Gao, Dmitriy Belenko, and Alexandre Klementiev, Using Statistical Techniques and Web Search to Correct ESL Errors, in CALICO Journal, Vol 26, No. 3, June 2009.
- Xiaodong He, Jianfeng Gao, Chris Quirk, Patrick Nguyen, Arul Menezes, Robert Moore, Kristina Toutanova, Mei Yang, Bill Dolan, Mu Li, Chi-Ho Li, Dongdong Zhang, Long Jiang, and Ming Zhou, The MSR-MSRA MT System for NIST Open Machine Translation 2008 Evaluation, in The 2008 NIST Open Machine Translation Evaluation Workshop, 2008.
- Michael Gamon, Jianfeng Gao, Chris Brockett, Alexandre Klementiev, William Dolan, Dmitriy Belenko, and Lucy Vanderwende, Using Contextual Speller Techniques and Language Modeling for ESL Error Correction, in Proceedings of IJCNLP, Hyderabad, India, Asia Federation of Natural Language Processing, January 2008.
- Xing Yi, Jianfeng Gao, and William B. Dolan, A Web-based English Proofing System for English as a Second Language Users, International Joint Conference on Natural Language Processing (IJCNLP), 2008.
- Xiaodong He, Jianfeng Gao, Chris Quirk, Patrick Nguyen, Arul Menezes, Robert Moore, Kristina Toutanova, Mei Yang, Bill Dolan, Mu Li, Chi-Ho Li, Dongdong Zhang, Long Jiang, Ming Zhou, George Foster, Roland Kuhn, Jing Zheng, Wen Wang, Necip Fazil Ayan, Dimitra Vergyri, Nicolas Scheffer, and Andreas Stolcke, The MSR-NRC-SRI MT System for NIST Open Machine Translation 2008 Evaluation, in The 2008 NIST Open Machine Translation Evaluation Workshop, 2008.
- Danilo Giampiccolo, Bernardo Magnini, Ido Dagan, and Bill Dolan, The Third PASCAL Recognizing Textual Entailment Challenge, Workshop on Textual Entailment and Paraphrasing, 2007.
- Lucy Vanderwende and William B. Dolan, What Syntax Can Contribute in the Entailment Task, Springer-Verlag, June 2006.
- Roy Bar-Haim, Ido Dagan, Bill Dolan, Lisa Ferro, Danilo Giampiccolo, and Bernardo Magnini, The Second PASCAL Recognising Textual Entailment Challenge, Workshop on Recognising Textual Entailment, 2006.
- Chris Brockett, William B. Dolan, and Michael Gamon, Correcting ESL Errors Using Phrasal SMT Techniques, in 21st International Conference on Computational Linguistics and 44th Annual Meeting of the ACL, Sydney, Australia, Association for Computational Linguistics, 2006.
- William B. Dolan and Chris Brockett, Automatically Constructing a Corpus of Sentential Paraphrases, in Third International Workshop on Paraphrasing (IWP2005), Asia Federation of Natural Language Processing, 2005.
- Chris Brockett and William B. Dolan, Support Vector Machines for Paraphrase Identification and Corpus Construction, in Third International Workshop on Paraphrasing (IWP2005), Asia Federation of Natural Language Processing, 2005.
- Avatar Dataset
- Microsoft Research Video Description Corpus
- Microsoft Research Question-Answering Corpus
- NLP Data Sets for Comparative Study of Parameter-Estimation Methods
- Microsoft Research Paraphrase Phrase Tables
- ESL 123 Mass Noun Examples
- Microsoft Research IME Corpus
- Microsoft Research Paraphrase Corpus
- Unification Grammar Sentence Realization Algorithms