This project focuses on advancing the state of the art in language processing with recurrent neural networks. We are currently applying these models to language modeling, machine translation, speech recognition, language understanding, and meaning representation. A special interest is adding side-channels of information as input, to model phenomena that are not easily handled in other frameworks.
A toolkit for RNN language modeling with side information is included in the associated download. Sample word vectors for use with this toolkit can be found here (be sure to unzip), along with training and test scripts. These vectors are for the Penn Treebank vocabulary and achieve a perplexity of 128; removing the context dependence raises the perplexity to 144. The Penn Treebank data is available here.
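To make the side-channel idea concrete, the sketch below shows one common way to condition an RNN language model on side information: concatenate a fixed context vector (for example, a topic distribution over the current document) to the word embedding at every time step. This is a minimal PyTorch illustration, not the toolkit's implementation; the class name `ContextRNNLM` and all dimensions are made up for the example.

```python
import torch
import torch.nn as nn

class ContextRNNLM(nn.Module):
    """RNN language model conditioned on a side-channel context vector,
    concatenated to the word embedding at every time step."""
    def __init__(self, vocab_size, embed_dim, context_dim, hidden_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.RNN(embed_dim + context_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, words, context, hidden=None):
        # words:   (batch, seq_len) token ids
        # context: (batch, context_dim) side information, e.g. a topic vector
        emb = self.embed(words)                                 # (B, T, E)
        ctx = context.unsqueeze(1).expand(-1, emb.size(1), -1)  # (B, T, C)
        output, hidden = self.rnn(torch.cat([emb, ctx], dim=-1), hidden)
        return self.out(output), hidden  # next-word logits at each position
```

Feeding a zero context vector recovers a plain RNN language model, which corresponds to the "removing the context dependence" comparison behind the 128 vs. 144 perplexity numbers above.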
As described in the NAACL-2013 paper "Linguistic Regularities in Continuous Space Word Representations," we have found that the word representations capture many linguistic regularities. A test set for quantifying the degree to which syntactic regularities are modeled can be found here.
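The regularities are tested in the paper with a vector-offset method: an analogy "a is to b as c is to ?" is answered by the word whose vector is closest, in cosine similarity, to vec(b) − vec(a) + vec(c). Below is a minimal sketch of that method, assuming `vectors` is a plain dictionary from words to NumPy arrays:

```python
import numpy as np

def analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' via the vector-offset method:
    pick the word whose embedding has the highest cosine similarity
    to vectors[b] - vectors[a] + vectors[c]."""
    target = vectors[b] - vectors[a] + vectors[c]
    target /= np.linalg.norm(target)
    best_word, best_sim = None, -1.0
    for word, vec in vectors.items():
        if word in (a, b, c):  # the input words themselves are excluded
            continue
        sim = np.dot(vec, target) / np.linalg.norm(vec)
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word

# With good representations: analogy(vectors, "man", "king", "woman") == "queen"
```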
- Suman Ravuri and Andreas Stolcke, Recurrent Neural Network and LSTM Models for Lexical Utterance Classification, in Proc. Interspeech, ISCA, Dresden, September 2015.
- Suman Ravuri and Andreas Stolcke, Neural Network Models for Lexical Addressee Detection, in Proc. Interspeech, ISCA, Singapore, September 2014.
- Kaisheng Yao, Baolin Peng, Geoffrey Zweig, Dong Yu, Xiaolong Li, and Feng Gao, Recurrent Conditional Random Field for Language Understanding, in Proc. ICASSP, IEEE, 2014.
- Zhiheng Huang, Geoffrey Zweig, and Benoit Dumoulin, Cache Based Recurrent Neural Network Language Model Inference for First Pass Speech Recognition, in Proc. ICASSP, IEEE, 2014.
- Michael Auli, Michel Galley, Chris Quirk, and Geoffrey Zweig, Joint Language and Translation Modeling with Recurrent Neural Networks, in Proc. EMNLP, Association for Computational Linguistics, Seattle, Washington, October 2013.
- Kaisheng Yao, Geoffrey Zweig, Mei-Yuh Hwang, Yangyang Shi, and Dong Yu, Recurrent Neural Networks for Language Understanding, in Proc. Interspeech, ISCA, August 2013.
- Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig, Linguistic Regularities in Continuous Space Word Representations, in Proc. NAACL-HLT, Association for Computational Linguistics, May 2013.
- Geoffrey Zweig and Konstantin Makarychev, Speed Regularization and Optimality in Word Classing, in Proc. ICASSP, IEEE, 2013.
- Anoop Deoras, Tomas Mikolov, Stefan Kombrink, and Ken Church, Approximate Inference: A Sampling Based Modeling Technique to Capture Complex Dependencies in a Language Model, in Speech Communication, Elsevier, August 2012.
- Geoffrey Zweig, John C. Platt, Christopher Meek, Christopher J.C. Burges, Ainur Yessenalina, and Qiang Liu, Computational Approaches to Sentence Completion, in Proc. ACL, ACL/SIGPARSE, July 2012.