Machine translation, multilingual systems, and natural-language processing
Microsoft Research’s work in computational linguistics focuses on three areas: machine translation, to create systems and technologies that cater to today’s multitude of translation scenarios; multilingual systems, to develop a natural-language-neutral approach to all aspects of linguistic computing; and natural-language processing, to design and build software that analyzes, understands, and generates the languages humans use naturally, with the goal of enabling a user to address a computer as though addressing another person.
Li Dong, Furu Wei, Shujie Liu, Ming Zhou, and Ke Xu, A Statistical Parsing Framework for Sentiment Classification, Computational Linguistics, December 2015.
Young-Bum Kim, Karl Stratos, Ruhi Sarikaya, and Minwoo Jeong, New Transfer Learning Techniques for Disparate Label Sets, in Association for Computational Linguistics (ACL), ACL – Association for Computational Linguistics, August 2015.
Wen-tau Yih, Ming-Wei Chang, Xiaodong He, and Jianfeng Gao, Semantic Parsing via Staged Query Graph Generation: Question Answering with Knowledge Base, in Proceedings of the Joint Conference of the 53rd Annual Meeting of the ACL and the 7th International Joint Conference on Natural Language Processing of the AFNLP, ACL – Association for Computational Linguistics, July 2015.
Igor Labutov, Sumit Basu, and Lucy Vanderwende, Deep Questions without Deep Understanding, to appear in: Proceedings of ACL 2015, July 2015.
Hao Fang, Saurabh Gupta, Forrest Iandola, Rupesh Srivastava, Li Deng, Piotr Dollar, Jianfeng Gao, Xiaodong He, Margaret Mitchell, John Platt, Lawrence Zitnick, and Geoffrey Zweig, From Captions to Visual Concepts and Back, in the Proceedings of CVPR, IEEE – Institute of Electrical and Electronics Engineers, June 2015.
- Avatar Dataset
- Query Representation and Understanding Set
- Term Vectors of 60,730 Comparable English/Spanish Wikipedia Articles
- Speller Challenge TREC Data
- Language to Code
- From Captions to Visual Concepts and Back
- Data-Driven Conversation
- NLPwin parses AMR
- Deep Learning for Natural Language Processing: Theory and Practice (CIKM2014 Tutorial)
- Colloquial to Arabic Converter
- Part of Speech (POS) Tagger
- Named Entity Recognizer (NER)
- SARF (morphological analyzer)
- Arabic Toolkit Service (ATKS)
- Catalyst: Center for Sustainable Development
- Lexical Semantics Toolkit & Dataset
- Spoken Language Understanding
- Automated Problem Generation for Education