Jasha Droppo, Milind Mahajan, Asela Gunawardana, and Alex Acero
This paper presents a general discriminative training method for the front end of an automatic speech recognition system. The SPLICE parameters of the front end are trained using stochastic gradient descent (SGD) of a maximum mutual information (MMI) objective function. SPLICE is chosen for its ability to approximate both linear and non-linear transformations of the feature space. SGD is chosen for its simplicity of implementation. Results are presented on both the Aurora 2 small vocabulary task and the WSJ Nov-92 medium vocabulary task. It is shown that the discriminative front end is able to consistently increase system accuracy across different front end configurations and tasks.
In Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding
Publisher Institute of Electrical and Electronics Engineers, Inc.
© 2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.