G. Dahl, Dong Yu, Li Deng, and Alex Acero
May 2011
The context-independent deep belief network (DBN)-hidden
Markov model (HMM) hybrid architecture has recently achieved
promising results for phone recognition. In this work, we propose
a context-dependent DBN-HMM system that dramatically outperforms
strong Gaussian mixture model (GMM)-HMM baselines
on a challenging, large-vocabulary, spontaneous speech recognition
dataset from the Bing mobile voice search task. Our system achieves
absolute sentence accuracy improvements of 5.8% and 9.2% over
GMM-HMMs trained using the minimum phone error rate (MPE)
and maximum likelihood (ML) criteria, respectively, which translate
to relative error reductions of 16.0% and 23.2%.
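
The link between the absolute and relative figures can be checked with a short calculation. The sketch below assumes the usual definition, relative error reduction = absolute accuracy gain / baseline sentence error rate. The baseline error rates it prints are only implied by the rounded numbers in the abstract, not values quoted from the paper, and the function names are illustrative.

```python
def implied_baseline_error(absolute_gain_pts, relative_reduction):
    """Baseline sentence error rate (in %) implied by an absolute accuracy
    gain (in percentage points) and a relative error reduction (a fraction)."""
    return absolute_gain_pts / relative_reduction


def relative_error_reduction(baseline_error_pct, absolute_gain_pts):
    """Fraction of the baseline's errors removed by the new system."""
    return absolute_gain_pts / baseline_error_pct


# Figures taken from the abstract (already rounded there).
mpe_error = implied_baseline_error(5.8, 0.160)  # MPE-trained GMM-HMM baseline
ml_error = implied_baseline_error(9.2, 0.232)   # ML-trained GMM-HMM baseline

# Sentence accuracy of the new system implied by each baseline.
accuracy_via_mpe = 100.0 - mpe_error + 5.8
accuracy_via_ml = 100.0 - ml_error + 9.2

print(f"implied MPE baseline sentence error: {mpe_error:.1f}%")
print(f"implied ML baseline sentence error:  {ml_error:.1f}%")
print(f"implied DBN-HMM accuracy (via MPE):  {accuracy_via_mpe:.1f}%")
print(f"implied DBN-HMM accuracy (via ML):   {accuracy_via_ml:.1f}%")
```

Under these assumptions the two baselines imply nearly the same sentence accuracy for the context-dependent DBN-HMM system, which suggests that the four percentages in the abstract are mutually consistent.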
In Proc. ICASSP, Prague
Publisher IEEE
Type Inproceedings