Maximum Mutual Information SPLICE Transform for Seen and Unseen Conditions

Jasha Droppo and Alex Acero

Abstract

SPLICE is a front-end technique for automatic speech recognition systems. It is a non-linear feature space transformation meant to increase recognition accuracy. Our previous work has shown how to train SPLICE to perform speech feature enhancement. This paper evaluates a maximum mutual information (MMI) based discriminative training method for SPLICE. Discriminative techniques tend to excel when the training and testing data are similar, and to degrade performance significantly otherwise. This paper explores both cases in detail using the Aurora 2 corpus. The overall recognition accuracy of the MMI-SPLICE system is slightly better than the Advanced Front End standard from ETSI, and much better than previous SPLICE training algorithms. Most notably, it achieves this without explicitly resorting to the standard techniques of environment modeling, noise modeling or spectral subtraction.
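To make the phrase "non-linear feature space transformation" concrete, here is a minimal sketch of a SPLICE-style transform. The abstract does not give the formula, so the code assumes the bias-only form commonly described for SPLICE: a Gaussian mixture model over the input features supplies posteriors, and the output is the input plus a posterior-weighted sum of per-component correction vectors. The function names, the diagonal-covariance choice, and all parameter values are illustrative assumptions, not the authors' implementation, and the MMI training of the correction vectors is not shown.

```python
import numpy as np

def gmm_posteriors(y, weights, means, variances):
    """Posterior p(k | y) under a diagonal-covariance GMM (assumed model form)."""
    diff = y[None, :] - means  # shape (K, D)
    log_lik = -0.5 * np.sum(diff**2 / variances + np.log(2 * np.pi * variances), axis=1)
    log_post = np.log(weights) + log_lik
    log_post -= np.max(log_post)  # subtract the max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()

def splice_transform(y, weights, means, variances, biases):
    """Assumed bias-only SPLICE form: x_hat = y + sum_k p(k | y) * r_k."""
    post = gmm_posteriors(y, weights, means, variances)
    return y + post @ biases  # biases has shape (K, D)

# Hypothetical two-component model in a 2-dimensional feature space.
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [10.0, 10.0]])
variances = np.ones((2, 2))
biases = np.array([[1.0, -1.0], [5.0, 5.0]])

x_hat = splice_transform(np.array([0.0, 0.0]), weights, means, variances, biases)
```

Because the posteriors depend on the input, the mapping is piecewise-smooth rather than a single linear transform: an input close to one mixture component receives essentially that component's correction vector.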

Details

Publication type: Inproceedings
Published in: Proc. Interspeech Conference
Publisher: International Speech Communication Association