Machine translation, multilingual systems, and natural-language processing
Microsoft Research’s work in computational linguistics focuses on three areas: machine translation, to create systems and technologies that serve today’s many translation scenarios; multilingual systems, to develop a language-neutral approach to all aspects of linguistic computing; and natural-language processing, to design and build software that analyzes, understands, and generates the languages humans use naturally, with the goal of enabling a user to address a computer as though addressing another person.
Li Dong, Furu Wei, Shujie Liu, Ming Zhou, and Ke Xu, A Statistical Parsing Framework for Sentiment Classification, Computational Linguistics, December 2015.
Alessandro Sordoni, Michel Galley, Michael Auli, Chris Brockett, Yangfeng Ji, Meg Mitchell, Jian-Yun Nie, and Bill Dolan, A Neural Network Approach to Context-Sensitive Generation of Conversational Responses, Conference of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies (NAACL-HLT 2015), June 2015.
Jialu Liu, Jingbo Shang, Chi Wang, Xiang Ren, and Jiawei Han, Mining Quality Phrases from Massive Text Corpora, ACM – Association for Computing Machinery, June 2015.
Hao Fang, Saurabh Gupta, Forrest Iandola, Rupesh Srivastava, Li Deng, Piotr Dollar, Jianfeng Gao, Xiaodong He, Margaret Mitchell, John Platt, Lawrence Zitnick, and Geoffrey Zweig, From Captions to Visual Concepts and Back, in Proceedings of CVPR, IEEE – Institute of Electrical and Electronics Engineers, June 2015.
Xiaodong Liu, Jianfeng Gao, Xiaodong He, Li Deng, Kevin Duh, and Ye-Yi Wang, Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval, in NAACL, May 2015.
- Avatar Dataset
- Query Representation and Understanding Set
- Term Vectors of 60,730 Comparable English/Spanish Wikipedia Articles
- Speller Challenge TREC Data
- Data-Driven Conversation
- Deep Learning for Natural Language Processing: Theory and Practice (CIKM2014 Tutorial)
- Colloquial to Arabic Converter
- Part of Speech (POS) Tagger
- Named Entity Recognizer (NER)
- SARF (morphological analyzer)
- Arabic Toolkit Service (ATKS)
- Catalyst: Center for Sustainable Development
- Lexical Semantics Toolkit & Dataset
- Spoken Language Understanding
- Automated Problem Generation for Education
- MSRA Knowledge Service
- Recurrent Neural Networks for Language Processing
- Postpartum Mood Study