Large Vocabulary Continuous Speech Recognition With Context-Dependent DBN-HMMs

The context-independent deep belief network (DBN) hidden Markov model (HMM) hybrid architecture has recently achieved promising results for phone recognition. In this work, we propose a context-dependent DBN-HMM system that dramatically outperforms strong Gaussian mixture model (GMM)-HMM baselines on a challenging, large vocabulary, spontaneous speech recognition dataset from the Bing mobile voice search task. Our system achieves absolute sentence accuracy improvements of 5.8% and 9.2% over GMM-HMMs trained using the minimum phone error rate (MPE) and maximum likelihood (ML) criteria, respectively, which translate to relative error reductions of 16.0% and 23.2%.
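The link between the absolute and relative figures can be checked with a little arithmetic. The sketch below is not taken from the paper: it assumes that "relative error reduction" means the absolute accuracy gain divided by the baseline sentence error rate, and it back-computes the baseline error rates that the quoted numbers would then imply. The function name `implied_baseline_error` is hypothetical, and the derived values are rounded estimates, not reported results.

```python
def implied_baseline_error(absolute_gain_pct, relative_reduction_pct):
    """Baseline sentence error rate (%) implied by the two quoted figures.

    Assumes relative_reduction = absolute_gain / baseline_error,
    which is an interpretation, not a definition stated in the abstract.
    """
    return 100.0 * absolute_gain_pct / relative_reduction_pct


# Figures quoted in the abstract (absolute gain %, relative reduction %).
mpe_error = implied_baseline_error(5.8, 16.0)   # MPE-trained GMM-HMM baseline
ml_error = implied_baseline_error(9.2, 23.2)    # ML-trained GMM-HMM baseline

# Subtracting the absolute gain from each baseline error should give
# the same DBN-HMM error rate if the quoted figures are consistent.
dbn_error_via_mpe = mpe_error - 5.8
dbn_error_via_ml = ml_error - 9.2

print(f"implied MPE baseline error: {mpe_error:.1f}%")
print(f"implied ML baseline error:  {ml_error:.1f}%")
print(f"implied DBN-HMM error:      {dbn_error_via_mpe:.1f}% / {dbn_error_via_ml:.1f}%")
```

Under this assumption, both baselines lead to nearly the same implied DBN-HMM sentence error rate, which suggests the absolute and relative figures in the abstract are mutually consistent up to rounding.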

PDF file: CD-DNN-HMM-ICASSP2011.pdf

In: Proc. ICASSP, Prague

Publisher: IEEE

Details

Type: Inproceedings