My primary research interest is applying machine learning to text processing problems. Currently, I am working on problems related to continuous space semantic representations and question answering.
Selected Publications by Topics
Below is a list of some of my recent papers, categorized by topic. The complete list of my publications is also available.
- Questions vs. Queries in Informational Search Tasks. R. White, M. Richardson & W. Yih. MSR-TR-2014-96, July 2014.
- Semantic Parsing for Single-Relation Question Answering. W. Yih, X. He & C. Meek. In ACL-14.
- Question Answering Using Enhanced Lexical Semantic Models. W. Yih, M. Chang, C. Meek & A. Pastusiak. In ACL-13.
- Mapping Dependencies Trees: An Application to Question Answering, with V. Punyakanok & D. Roth. In Proceedings of AI&Math 2004 (Special session: Intelligent Text Processing).
- Typed Tensor Decomposition of Knowledge Bases for Relation Extraction. K. Chang, W. Yih, B. Yang & C. Meek. In EMNLP-14.
- Learning Continuous Phrase Representations for Translation Modeling. J. Gao, X. He, W. Yih & L. Deng. In ACL-14.
- Multi-Relational Latent Semantic Analysis. K. Chang, W. Yih & C. Meek. In EMNLP-13.
- Combining Heterogeneous Models for Measuring Relational Similarity. A. Zhila, W. Yih, C. Meek, G. Zweig & T. Mikolov. In NAACL-HLT-13.
- Linguistic Regularities in Continuous Space Word Representations. T. Mikolov, W. Yih & G. Zweig. In NAACL-HLT-13.
- Polarity Inducing Latent Semantic Analysis. W. Yih, G. Zweig & J. Platt. In EMNLP-CoNLL-12.
- Measuring Word Relatedness Using Heterogeneous Vector Space Models. W. Yih & V. Qazvinian. In NAACL-HLT-12.
- Clickthrough-Based Latent Semantic Models for Web Search, with J. Gao & K. Toutanova. In SIGIR-11.
- Learning Discriminative Projections for Text Similarity Measures. W. Yih, K. Toutanova, J. Platt & C. Meek. In CoNLL-11. [Best Paper]
- Animacy Detection with Voting Models. J. Moore, C. Burges, E. Renshaw & W. Yih. In EMNLP-13.
Semantic Similarity and Relevance
- Similarity Models for Ad Relevance Measures. W. Yih & N. Jiang. In MLOAD-10.
- Translingual Document Representations from Discriminative Projections. J. Platt, K. Toutanova & W. Yih. In EMNLP-10.
- Adaptive Near-Duplicate Detection via Similarity Learning. H. Hajishirzi, W. Yih & A. Kolcz. In SIGIR-10.
- Learning Term-weighting Functions for Similarity Measures. W. Yih. In EMNLP-09.
- Consistent Phrase Relevance Measures. W. Yih & C. Meek. In ADKDD-08.
- Improving Similarity Measures for Short Segments of Text. W. Yih & C. Meek. In AAAI-07.
- Finding Advertising Keywords on Web Pages. W. Yih, J. Goodman & V. Carvalho. In WWW-06.
- Domain Adaptation with Ensemble of Feature Groups. R. Samdani & W. Yih. In IJCAI-11.
Spam Filtering and Email Applications
- Extracting Product Information from Email Receipts Using Markov Logic. S. Kok & W. Yih. In CEAS-09.
- Partitioned Logistic Regression for Spam Filtering. M. Chang, W. Yih & C. Meek. In KDD-08.
- Personalized Spam Filtering for Gray Mail. M. Chang, W. Yih & R. McCann. In CEAS-08.
- Raising the Baseline for High-Precision Text Classifiers. A. Kolcz & W. Yih. In KDD-07.
- Improving Spam Filtering by Detecting Gray Mail. W. Yih, R. McCann & A. Kolcz. In CEAS-07.
- Learning at Low False Positive Rates. W. Yih, J. Goodman & G. Hulten. In CEAS-06.
- Online Discriminative Spam Filter Training. J. Goodman & W. Yih. In CEAS-06.
- Multi-Document Summarization by Maximizing Informative Content-Words. W. Yih, J. Goodman, L. Vanderwende & H. Suzuki. In IJCAI-07.
Integer Linear Programming (ILP) Inference
- Integer Linear Programming Inference for Conditional Random Fields, with D. Roth. In ICML-05.
- A Linear Programming Formulation for Global Inference in Natural Language Tasks, with D. Roth. In CoNLL-04.
Structured Output Learning
- Dual Coordinate Descent Algorithms for Efficient Large Margin Structured Prediction. M. Chang & W. Yih. In Transactions of ACL (TACL), 2013.
- Improved Discriminative Bilingual Word Alignment. R. Moore, W. Yih & A. Bode. In ACL-COLING-06.
- Learning and Inference over Constrained Output, with V. Punyakanok, D. Roth & D. Zimak. In IJCAI-05.
- Probabilistic Reasoning for Entity & Relation Recognition, with D. Roth. In COLING-02.
Semantic Role Labeling
- Automatic Semantic Role Labeling (Tutorial Handout for AAAI-07). W. Yih & K. Toutanova.
- Automatic Semantic Role Labeling (Tutorial Handout for HLT-NAACL-06). W. Yih & K. Toutanova.
- The Importance of Syntactic Parsing and Inference in Semantic Role Labeling. In Computational Linguistics 2008, with V. Punyakanok & D. Roth.
- Generalized Inference with Multiple Semantic Role Labeling Systems. CoNLL-05 (shared task), with P. Koomen, V. Punyakanok & D. Roth.
- The Necessity of Syntactic Parsing for Semantic Role Labeling. IJCAI-05, with V. Punyakanok & D. Roth.
- Semantic Role Labeling via Integer Linear Programming Inference. COLING-04, with V. Punyakanok, D. Roth & D. Zimak.
- Semantic Role Labeling via Generalized Inference over Classifiers, with V. Punyakanok, D. Roth, D. Zimak & Y. Tu. In CoNLL-04 (shared task).
- Relational Learning via Propositional Algorithms: An Information Extraction Case Study, with D. Roth. In IJCAI-01.
- Template-Based Information Mining from HTML Documents. J. Hsu & W. Yih. In AAAI-97.
- CoNLL-14 Program Co-chair
- ICML-2014 Workshop on Knowledge-Powered Deep Learning for Text Mining
- The second Workshop on Continuous Vector Space Models and their Compositionality
- IJCNLP-13 Workshop Co-chair
- CEAS-09 Program Co-chair
- ICML-07 Workshop on Constrained Optimization and Learning with Structured Outputs
Editorial Board Member
- Journal of Artificial Intelligence Research (JAIR), 2013-2016
Program Committee Member
- Area Chair: HLT-NAACL-12, ACL-14
- Senior Program Committee: IJCAI-09, AAAI-11, AAAI-14, AAAI-15
- CEAS: 2004, 2005, 2006, 2007, 2008, 2009 (Program Co-chair), 2010
- ICML: 2006, 2008, 2009, 2012, 2013, 2014
- NIPS: 2006, 2007, 2008, 2009, 2012, 2013 (Reviewer Award), 2014
- AAAI: 2006, 2008, 2011 (SPC), 2014 (SPC), 2015 (SPC)
- IJCAI: 2009 (SPC)
- ACL: 2007, 2008 (w/ HLT), 2009 (w/ IJCNLP), 2010, 2011 (w/ HLT), 2012, 2013, 2014 (Area Chair)
- EMNLP: 2005, 2007 (w/ CoNLL), 2008, 2010, 2011, 2013, 2014
- HLT-NAACL: 2004, 2009, 2010, 2012 (Area Co-chair), 2013
- CIKM-08, CoNLL-09, ILPNLP-WS-09, EACL-12
Demo & Web Service
- MSR Continuous-Space Text Representation
- Semantic Word Relatedness
- Word Relation Measures: Synonym, Antonym, Hyponym
- Relational Similarity (Analogy)
- Text Similarity