Senior Researcher, Natural Language Processing Group
Bio
After studying Computer Science and Mathematics at Carnegie Mellon University, I joined Microsoft in 2000 to work on the Intentional Programming project, an extensible compiler and development framework. I moved to the Natural Language Processing group in 2001, where my research has mostly focused on the statistical machine translation behind Microsoft Translator, especially on several generations of a syntax-directed translation system that powers over half of the translation systems. I am also interested in semantic parsing, paraphrase methods, and very practical problems such as spelling correction and transliteration.
Tools and Services
- Traveling? Worried about expensive data plans? Try our updated translator app that includes OCR and on-device translation.
- Try out our parsing and analysis tools online.
- Browse tools released through the MT labs page here.
Collaborators and Interns
Mostly I work with folks in the NLP and Machine Translation groups.
I've had the privilege of mentoring or working with a number of interns over the years, including Katharina Probst, Colin Cherry, Pavel Pecina, Ethan Phelps-Goodman, Vivek Srikumar, Arne Mauser, Hao Zhang, Jason Smith, Mohit Bansal, Arianna Bisazza, Mayank Srivastava, Jenny Lin, Joern Wuebker, Juri Ganitkevitch, and Hui Zhang.
Apply for internships at MSR -- it's a fantastic program!
I just came back from a Summer Workshop at Johns Hopkins working on Domain Adaptation for Statistical Machine Translation.
Teaching
- Fall 2012: LING 570 Shallow Methods in Natural Language Processing with Will Lewis
- Spring 2011: Kristina Toutanova and I taught a Statistical Machine Translation course at the University of Washington.
Academic Service
Reviewer for ACL, EMNLP, COLING, and MT Summit consistently over the past 5+ years. Area Chair in MT: ACL 2009, EMNLP 2009, 2012.
2013
- Svitlana Volkova, Pallavi Choudhury, Chris Quirk, Bill Dolan, and Luke Zettlemoyer, Lightly Supervised Learning of Procedural Dialog Systems, Association for Computational Linguistics, 2013
2012
- Colin Cherry, Robert C. Moore, and Chris Quirk, On Hierarchical Re-ordering and Permutation Parsing for Phrase-based Decoding, in WMT, 2012
- Chris Quirk, Book review of "Linguistic Structure Prediction" by Noah A. Smith, in Computational Linguistics, pp. 1–6, MIT Press, 2012
- Chris Quirk, Pallavi Choudhury, Jianfeng Gao, Hisami Suzuki, Kristina Toutanova, Michael Gamon, Wen-tau Yih, Lucy Vanderwende, and Colin Cherry, MSR SPLAT, a language analysis toolkit, in NAACL-HLT 2012, 2012
- Joern Wuebker, Mei-Yuh Hwang, and Chris Quirk, Leave-One-Out Phrase Model Training for Large-Scale Deployment, in WMT, 2012
2011
- Pallavi Choudhury, Hisami Suzuki, and Chris Quirk, From pecher to pêcher... or pécher: Simplifying French Input by Accent Prediction, in Proceedings of the Workshop on Advances in Text Input Methods (WTIM 2011), Asia Federation of Natural Language Processing, November 2011
- Qin Gao, Will Lewis, Chris Quirk, and Mei-Yuh Hwang, Incremental Training and Intentional Over-fitting of Word Alignment, in Proceedings of MT Summit XIII, Asia-Pacific Association for Machine Translation, September 2011
- Spencer Rarrick, Chris Quirk, and William Lewis, MT Detection in Web-Scraped Parallel Corpora, in Proceedings of MT Summit XIII, Asia-Pacific Association for Machine Translation, September 2011
- Michel Galley and Chris Quirk, Optimal Search for Minimum Error Rate Training, in Proc. of Empirical Methods in Natural Language Processing, July 2011
- Mohit Bansal, Chris Quirk, and Robert C. Moore, Gappy phrasal alignment by agreement, in ACL 2011, 2011
- Chris Quirk, Pallavi Choudhury, Michael Gamon, and Lucy Vanderwende, MSR-NLP Entry in BioNLP Shared Task 2011, 2011
- Markus Saers, Dekai Wu, and Chris Quirk, On the Expressivity of Linear Transductions, 2011
2010
- Minwoo Jeong, Kristina Toutanova, Hisami Suzuki, and Chris Quirk, A Discriminative Lexicon Model for Complex Morphology, in The Ninth Conference of the Association for Machine Translation in the Americas, Association for Computational Linguistics, 1 November 2010
- Jianfeng Gao, Xiaolong Li, Daniel Micol, Chris Quirk, and Xu Sun, A Large Scale Ranker-Based System for Search Query Spelling Correction, in The 23rd International Conference on Computational Linguistics, International Conference on Computational Linguistics, 23 August 2010
- Jason R. Smith, Chris Quirk, and Kristina Toutanova, Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment, in Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the ACL, Association for Computational Linguistics, 1 June 2010
- Xu Sun, Jianfeng Gao, Daniel Micol, and Chris Quirk, Learning phrase-based spelling error models from clickthrough data, in ACL 2010, 2010
- Adam Pauls, Dan Klein, and Chris Quirk, Top-down k-best A* parsing, in ACL 2010, 2010
2009
- Robert C. Moore and Chris Quirk, Less is more: significance-based N-gram selection for smaller, better language models, in EMNLP 2009, 2009
- Hao Ma, Raman Chandrasekar, Chris Quirk, and Abhishek Gupta, Page hunt: using human computation games to improve web search, in SIGKDD Workshop on Human Computation, 2009
- Hao Ma, Raman Chandrasekar, Chris Quirk, and Abhishek Gupta, Page hunt: improving search engines using human computation games, in SIGIR 2009, 2009
- Robert C. Moore and Chris Quirk, Improved smoothing for N-gram language models based on ordinary counts, in ACL-IJCNLP 2009, 2009
- Hao Ma, Raman Chandrasekar, Chris Quirk, and Abhishek Gupta, Improving search engines using human computation games, in ACM Conference on Information and Knowledge Management, 2009
2008
- Colin Cherry and Chris Quirk, Discriminative, Syntactic Language Modeling through Latent SVMs, in Proceedings of AMTA, Association for Machine Translation in the Americas, 23 October 2008
- Arul Menezes and Chris Quirk, Syntactic Models for Structural Word Insertion and Deletion during Translation, in Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Honolulu, Hawaii, October 2008
- Robert C. Moore and Chris Quirk, Random Restarts in Minimum Error Rate Training for Statistical Machine Translation, in Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), Coling 2008 Organizing Committee, Manchester, UK, August 2008
- Hao Zhang, Chris Quirk, Robert C. Moore, and Daniel Gildea, Bayesian Learning of Non-Compositional Phrases with Synchronous Parsing, in Proceedings of ACL-08: HLT, Association for Computational Linguistics, Columbus, Ohio, June 2008
- Xiaodong He, Jianfeng Gao, Chris Quirk, Patrick Nguyen, Arul Menezes, Robert Moore, Kristina Toutanova, Mei Yang, Bill Dolan, Mu Li, Chi-Ho Li, Dongdong Zhang, Long Jiang, Ming Zhou, George Foster, Roland Kuhn, Jing Zheng, Wen Wang, Necip Fazil Ayan, Dimitra Vergyri, Nicolas Scheffer, and Andreas Stolcke, The MSR-NRC-SRI MT System for NIST Open Machine Translation 2008 Evaluation, in The 2008 NIST Open Machine Translation Evaluation Workshop, 2008
- Xiaodong He, Jianfeng Gao, Chris Quirk, Patrick Nguyen, Arul Menezes, Robert Moore, Kristina Toutanova, Mei Yang, Bill Dolan, Mu Li, Chi-Ho Li, Dongdong Zhang, Long Jiang, and Ming Zhou, The MSR-MSRA MT System for NIST Open Machine Translation 2008 Evaluation, in The 2008 NIST Open Machine Translation Evaluation Workshop, 2008
2007
- Chris Quirk, Raghavendra Udupa, and Arul Menezes, Generative Models of Noisy Translations with Applications to Parallel Fragment Extraction, in Proceedings of MT Summit XI, European Association for Machine Translation, September 2007
- Robert C. Moore and Chris Quirk, Faster Beam-Search Decoding for Phrasal Statistical Machine Translation, in Proceedings of MT Summit XI, European Association for Machine Translation, September 2007
- Robert C. Moore and Chris Quirk, An Iteratively-Trained Segmentation-Free Phrase Translation Model for Statistical Machine Translation, in Proceedings of the Second Workshop on Statistical Machine Translation at ACL 2007, Association for Computational Linguistics, July 2007
- Arul Menezes and Chris Quirk, Using Dependency Order Templates to Improve Generality in Translation, in Proceedings of the Second Workshop on Statistical Machine Translation at ACL 2007, Association for Computational Linguistics, July 2007
2006
- Chris Quirk and Simon Corston-Oliver, The impact of parse quality on syntactically-informed statistical machine translation, in Proceedings of EMNLP 2006, ACL/SIGPARSE, July 2006
- Chris Quirk and Arul Menezes, Do we need phrases? Challenging the conventional wisdom in Statistical Machine Translation, in Proceedings of HLT-NAACL 2006, ACL/SIGPARSE, May 2006
- Chris Quirk and Arul Menezes, Dependency Treelet Translation: The convergence of statistical and example-based machine translation?, in Machine Translation, vol. 20, pp. 43–65, March 2006
- Xiaodong He, Arul Menezes, Chris Quirk, Anthony Aue, Simon Corston-Oliver, Jianfeng Gao, and Patrick Nguyen, Microsoft Research Treelet Translation System: NIST MT Evaluation 06, National Institute of Standards and Technology, March 2006
- Arul Menezes, Kristina Toutanova, and Chris Quirk, Microsoft Research Treelet Translation System: NAACL 2006 Europarl Evaluation, in WMT 2006, 2006
2005
- Arul Menezes and Chris Quirk, Microsoft Research Treelet Translation System: IWSLT Evaluation, in Proceedings of the International Workshop on Spoken Language Translation, October 2005
- Chris Quirk, Arul Menezes, and Colin Cherry, Dependency Treelet Translation: Syntactically Informed Phrasal SMT, in Proceedings of ACL, Association for Computational Linguistics, June 2005
- Arul Menezes and Chris Quirk, Dependency treelet translation: the convergence of statistical and example-based machine-translation, in Proceedings of the 10th Machine Translation Summit Workshop on Example-Based Machine Translation, pp. 99–108, 2005
2004
- Chris Quirk, Arul Menezes, and Colin Cherry, Dependency Tree Translation: Syntactically Informed Phrasal SMT, no. MSR-TR-2004-113, November 2004
- Anthony Aue, Arul Menezes, Robert Moore, Chris Quirk, and Eric Ringger, Statistical Machine Translation Using Labeled Semantic Dependency Graphs, ACL/SIGPARSE, October 2004
- William Dolan, Chris Quirk, and Chris Brockett, Unsupervised Construction of Large Paraphrase Corpora: Exploiting Massively Parallel News Sources, International Conference on Computational Linguistics, August 2004
- Chris Quirk, Chris Brockett, and William B. Dolan, Monolingual Machine Translation for Paraphrase Generation, Association for Computational Linguistics, July 2004
- Chris Quirk, Training a Sentence-Level Machine Translation Confidence Measure, European Language Resources Association, May 2004
2003
- Takako Aikawa, Chris Quirk, and Lee Schwartz, Learning prepositional attachment from sentence aligned bilingual corpora, Association for Machine Translation in the Americas, September 2003
2002
- Chris Brockett, Takako Aikawa, Anthony Aue, Arul Menezes, Chris Quirk, and Hisami Suzuki, English-Japanese Example-Based Machine Translation Using Abstract Semantic Representations, International Conference on Computational Linguistics, October 2002

