This project focuses on advancing the state of the art in language processing with recurrent neural networks. We are currently applying these networks to language modeling, machine translation, speech recognition, language understanding and meaning representation. A special interest is adding side-channels of information as input, to model phenomena which are not easily handled in other frameworks.
A toolkit for doing RNN language modeling with side-information is in the associated download. Sample word vectors for use with this toolkit can be found here (be sure to unzip), along with training and test scripts. These are for Penn Treebank words, and achieve a perplexity of 128; removing the context dependence results in a perplexity of 144. The Penn Treebank data is available here.
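For readers unfamiliar with the metric, the following is a minimal sketch of how perplexity is conventionally computed from the per-word probabilities a language model assigns to a test set, and of the relative reduction implied by the two figures above. It is not code from the toolkit; the function names `perplexity` and `relative_reduction` are hypothetical, and natural-log probabilities are assumed as input.

```python
import math


def perplexity(log_probs):
    """Perplexity from the natural-log probability assigned to each test word.

    Hypothetical helper, not part of the toolkit: exp of the negative
    average log-probability per word.
    """
    if not log_probs:
        raise ValueError("need at least one word")
    return math.exp(-sum(log_probs) / len(log_probs))


def relative_reduction(baseline, improved):
    """Fraction by which `improved` lowers the `baseline` perplexity."""
    return (baseline - improved) / baseline


# Sanity check: a model that gives every word probability 1/128
# has a perplexity of 128 (up to floating-point rounding).
uniform_log_probs = [math.log(1.0 / 128)] * 1000
print(perplexity(uniform_log_probs))

# Going from 144 (no context dependence) to 128 (with context).
print(round(relative_reduction(144, 128), 3))  # 0.111
```

Under these definitions, moving from 144 to 128 corresponds to a relative perplexity reduction of roughly 11%.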
As described in the NAACL-2013 paper "Linguistic Regularities in Continuous Space Word Representations," we have found that the word representations capture many linguistic regularities. A test set for quantifying the degree to which syntactic regularities are modeled can be found here.
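One common way to probe such regularities is a vector offset approach: to answer "a is to b as c is to ?", form the vector b - a + c and return the vocabulary word whose vector is most similar to it by cosine similarity. The sketch below illustrates the idea under stated assumptions. The three-dimensional vectors are invented toy values (not the downloadable word vectors), the function names are hypothetical, and excluding the three question words from the candidates is a convention assumed here rather than something this page specifies.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' by nearest neighbor to b - a + c."""
    target = [xb - xa + xc
              for xa, xb, xc in zip(vectors[a], vectors[b], vectors[c])]
    # Assumption: the question words themselves are not valid answers.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Invented toy vectors: the second dimension plays the role of a
# "comparative" direction shared by both word pairs.
toy_vectors = {
    "good":    [1.0, 0.0, 0.2],
    "better":  [1.0, 1.0, 0.2],
    "rough":   [0.0, 0.0, 1.0],
    "rougher": [0.0, 1.0, 1.0],
    "table":   [0.5, -1.0, 0.5],
}

print(solve_analogy(toy_vectors, "good", "better", "rough"))  # rougher
```

With real vectors, accuracy on a test set of such questions gives one number summarizing how consistently a given relation is encoded as a constant offset.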
- Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig, Linguistic Regularities in Continuous Space Word Representations, in Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT-2013), Association for Computational Linguistics, 27 May 2013
- Geoffrey Zweig and Konstantin Makarychev, Speed Regularization and Optimality in Word Classing, in ICASSP, IEEE, 2013
- Tomas Mikolov and Geoffrey Zweig, Context Dependent Recurrent Neural Network Language Model, no. MSR-TR-2012-92, July 2012
- Tomas Mikolov and Geoffrey Zweig, Context Dependent Recurrent Neural Network Language Model, in Spoken Language Technologies, IEEE, 2012