Maximum Mutual Information SPLICE Transform for Seen and Unseen Conditions

SPLICE is a front-end technique for automatic speech recognition systems. It is a non-linear feature space transformation meant to increase recognition accuracy. Our previous work has shown how to train SPLICE to perform speech feature enhancement. This paper evaluates a maximum mutual information (MMI) based discriminative training method for SPLICE. Discriminative techniques tend to excel when the training and testing data are similar, and to degrade performance significantly otherwise. This paper explores both cases in detail using the Aurora 2 corpus. The overall recognition accuracy of the MMI-SPLICE system is slightly better than the Advanced Front End standard from ETSI, and much better than previous SPLICE training algorithms. Most notably, it achieves this without explicitly resorting to the standard techniques of environment modeling, noise modeling or spectral subtraction.

In  Proc. Interspeech Conference

Publisher  International Speech Communication Association
© 2007 ISCA. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the ISCA and/or the author.

Details

Type  Inproceedings