Tied-State Based Discriminative Training of Context-Expanded Region-Dependent Feature Transforms for LVCSR

Zhi-Jie Yan, Qiang Huo, Jian Xu, and Yu Zhang

Abstract

We present a new discriminative feature transform approach to large vocabulary continuous speech recognition (LVCSR) using Gaussian mixture density hidden Markov models (GMM-HMMs) for acoustic modeling. The feature transform is formulated with a set of context-expanded region-dependent linear transforms (RDLTs) utilizing both long-span features and contextual weight expansion. The RDLTs are estimated by lattice-free, tied-state based discriminative training using the maximum mutual information (MMI) criterion, while the GMM-HMMs are trained by conventional lattice-based, boosted MMI training. Compared with two baseline systems, which use RDLTs with either long-span features or weight expansion only and are trained using conventional lattice-based discriminative training for both RDLTs and HMMs, the proposed approach achieves relative word error rate reductions of 10% and 6%, respectively, on the Switchboard-1 conversational telephone speech transcription task.

Details

Publication type: Inproceedings
Published in: IEEE International Conference on Acoustics, Speech and Signal Processing, 2013, ICASSP 2013
Publisher: International Conference on Acoustics, Speech, and Signal Processing (ICASSP)