Phone sequence modeling with recurrent neural networks

Nicolas Boulanger-Lewandowski, Jasha Droppo, Mike Seltzer, and Dong Yu

Abstract

In this paper, we investigate phone sequence modeling with recurrent neural networks in the context of speech recognition. We introduce a hybrid architecture that combines a phonetic model with an arbitrary frame-level acoustic model and we propose efficient algorithms for training, decoding and sequence alignment. We evaluate the advantage of our phonetic model on the TIMIT and Switchboard-mini datasets in complementarity to a powerful context-dependent deep neural network (DNN) acoustic classifier and a higher-level 3-gram language model. Consistent improvements of 2–10% in phone accuracy and 3% in word error rate suggest that our approach can readily replace HMMs in current state-of-the-art systems.

Details

Publication type: Inproceedings
Published in: ICASSP
Publisher: IEEE SPS