Training Wideband Acoustic Models using Mixed-Bandwidth Training Data via Feature Bandwidth Extension

Michael Seltzer and Alex Acero

Abstract

One serious difficulty in deploying wideband speech recognition systems for new tasks is the expense, in both time and money, of obtaining sufficient training data. A more economical approach is to collect telephone speech and then restrict the application to operate at the telephone bandwidth. However, this generally results in sub-optimal performance. In this paper, we propose a new algorithm for training wideband acoustic models that requires only a small amount of wideband speech augmented by a larger amount of narrowband speech. The algorithm operates by first converting the narrowband features to wideband features through a process called Feature Bandwidth Extension. The bandwidth-extended features are then combined with the available wideband data to train the acoustic models using a modified version of the conventional forward-backward algorithm. Experiments performed using wideband speech and telephone speech demonstrate that the proposed mixed-bandwidth training algorithm results in significant improvements in recognition accuracy over conventional training strategies when the amount of wideband data is limited.

Details

Publication type: Inproceedings
Published in: Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing
Publisher: Institute of Electrical and Electronics Engineers, Inc.