Will has worked on the MSR Machine Translation (MT) incubation team since July 2007. Prior to joining the MT team, he worked as an Assistant Professor in the Computational Linguistics Master’s Program at the University of Washington, where he was founding faculty in the program and continues to hold an Affiliate Position. He received his PhD from the University of Arizona in 2002, and graduated Magna Cum Laude from the University of California at Davis in 1996.
Chris Quirk and I are teaching Shallow Methods in Statistical Natural Language Processing this term at the University of Washington. It's actually being taught in a classroom at MSR, with a live connection between the two classrooms using Conference XP. See the course website here: http://courses.washington.edu/ling570/.
2012
- Amittai Axelrod, QingJun Li, and Will Lewis, Applications of Data Selection via Cross-Entropy Difference for Real-World Statistical Machine Translation, in Proceedings of the International Workshop on Spoken Language Translation (IWSLT 2012), International Workshop on Spoken Language Translation (IWSLT), December 2012
- William D. Lewis and Phong Yang, Building MT for a Severely Under-Resourced Language: White Hmong, Association for Machine Translation in the Americas, October 2012
- Ryan Georgi, Fei Xia, and William Lewis, Measuring the Divergence of Dependency Structures Cross-Linguistically to Improve Syntactic Projection Algorithms, in Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), European Language Resources Association, May 2012
2011
- Qin Gao, Will Lewis, Chris Quirk, and Mei-Yuh Hwang, Incremental Training and Intentional Over-fitting of Word Alignment, in Proceedings of MT Summit XIII, Asia-Pacific Association for Machine Translation, September 2011
- Spencer Rarrick, Chris Quirk, and William Lewis, MT Detection in Web-Scraped Parallel Corpora, in Proceedings of MT Summit XIII, Asia-Pacific Association for Machine Translation, September 2011
- William D. Lewis, Robert Munro, and Stephan Vogel, Crisis MT: Developing A Cookbook for MT in Crisis Situations, in Proceedings of the Sixth Workshop on Statistical Machine Translation, Association for Computational Linguistics, July 2011
2010
- William Lewis and Fei Xia, Developing ODIN: A Multilingual Repository of Annotated Language Data for Hundreds of the World's Languages, in Literary and Linguistic Computing, Oxford University Press, September 2010
- Ryan Georgi, Fei Xia, and William Lewis, Comparing Language Similarity Across Genetic and Typologically-Based Groupings, in Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010), International Conference on Computational Linguistics, August 2010
- Robert C. Moore and William Lewis, Intelligent Selection of Language Model Training Data, in Proceedings of the ACL 2010 Conference Short Papers, Association for Computational Linguistics, Uppsala, Sweden, July 2010
- Fei Xia, Carrie Lewis, and William Lewis, The Problems of Language Identification within Hugely Multilingual Data Sets, in Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), European Language Resources Association, May 2010
- William Lewis, Chris Wendt, and David Bullock, Achieving Domain Specificity in SMT without Overt Siloing, in Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), European Language Resources Association, May 2010
- William Lewis, Haitian Creole: How to Build and Ship an MT Engine from Scratch in 4 Days, 17 Hours, & 30 Minutes, in EAMT 2010: Proceedings of the 14th Annual Conference of the European Association for Machine Translation, European Association for Machine Translation, May 2010
- Fei Xia, Carrie Lewis, and William Lewis, Language ID for a Thousand Languages, in eLanguage, LSA Annual Meeting Extended Abstracts, Linguistic Society of America, January 2010
2009
- Chris Wendt and Will Lewis, Improving the quality of a customized SMT system using shared training data, in MT Summit 2009, 29 August 2009
- William Lewis and Fei Xia, Parsing, Projecting & Prototypes: Repurposing Linguistic Data on the Web, in Proceedings of the 12th Conference of the European Chapter of the ACL (EACL-2009), Association for Computational Linguistics, April 2009
- Fei Xia and William Lewis, Applying NLP Technologies to the Collection and Enrichment of Language Data on the Web to Aid Linguistic Research, in Proceedings of the EACL 2009 Workshop on Language Technology and Resources for Cultural Heritage, Social Sciences, Humanities, and Education (LaTeCH-SHELT&R 2009), Association for Computational Linguistics, April 2009
- Fei Xia, William Lewis, and Hoifung Poon, Language ID in the Context of Harvesting Language Data off the Web, in Proceedings of the 12th Conference of the European Chapter of the ACL (EACL-2009), Association for Computational Linguistics, April 2009
- William Lewis, Simin Karimi, Heidi Harley, and Scott Farrar, Time and Again: Theoretical Perspectives on Formal Linguistics (Edited Volume), John Benjamins Publishing Company, 2009
2008
- William D. Lewis and Fei Xia, Automatically Identifying Computationally Relevant Typological Features, in Proceedings of The Third International Joint Conference on Natural Language Processing (IJCNLP), Asian Federation of Natural Language Processing, January 2008
- Fei Xia and William D. Lewis, Repurposing Theoretical Linguistic Data for Tool Development and Search, in Proceedings of The Third International Joint Conference on Natural Language Processing (IJCNLP), Asian Federation of Natural Language Processing, January 2008
