Areas of focus for Microsoft Research’s inquiries into computational linguistics are threefold: machine translation, to create systems and technologies that cater to today’s multitude of translation scenarios; multilingual systems, to develop a natural-language-neutral approach to all aspects of linguistic computing; and natural-language processing, to design and build software that will analyze, understand, and generate languages that humans use naturally, with the goal of enabling a user to address a computer as though addressing another person.
- Spoken Language Understanding: Spoken language understanding (SLU) is an emerging field between speech processing and natural language processing. The term was coined largely for the targeted understanding of human speech directed at machines. This project covers our research on SLU tasks such as domain detection, intent determination, and slot filling, using data-driven methods.
- Automated Problem Generation for Education: Intelligent Tutoring Systems (ITS) can significantly enhance the educational experience, both in the classroom and online. A key aspect of ITS is the ability to automatically generate problems at a given difficulty level that exercise specific concepts. This can help avoid copyright or plagiarism issues and help generate personalized workflows. This project develops technologies for problem generation in various subject domains, including math, logic, and even language learning.
- MSRA Knowledge Service: The goals of the MSRA Knowledge Service system and team are to: (1) build a large, high-quality, fresh, and easy-to-use knowledge layer; (2) provide knowledge services to the utility layer and the application layer; and (3) coordinate the various knowledge extraction and refinement efforts at MSRA to reduce duplicated work.
- Recurrent Neural Networks for Language Processing: This project focuses on advancing the state of the art in language processing with recurrent neural networks. We are currently applying these to language modeling, machine translation, speech recognition, language understanding, and meaning representation. A special interest is adding side channels of information as input, to model phenomena that are not easily handled in other frameworks.
Rohan Ramanath, Monojit Choudhury, Kalika Bali, and Rishiraj Saha Roy, Crowd Prefers the Middle Path: A New IAA Metric for Crowdsourcing Reveals Turker Biases in Query Segmentation, in Proceedings of ACL, Association for Computational Linguistics, July 2013
Munmun De Choudhury, Scott Counts, Eric Horvitz, and Michael Gamon, Predicting Depression via Social Media, AAAI, July 2013
Heeyoung Lee, Andreas Stolcke, and Elizabeth Shriberg, Using Out-of-Domain Data for Lexical Addressee Detection in Human-Human-Computer Dialog, in Proceedings NAACL, Association for Computational Linguistics, June 2013
Mark Liberman, Jiahong Yuan, Andreas Stolcke, Wen Wang, and Vikramjit Mitra, Using multiple versions of speech input in phone recognition, in Proceedings IEEE ICASSP, IEEE SPS, May 2013
Michael Gamon, Martin Chodorow, Claudia Leacock, and Joel Tetreault, Grammatical Error Detection in Automatic Essay Scoring and Feedback, in Handbook of Automated Essay Evaluation, Routledge, May 2013