A Discriminative Training Framework using N-Best Speech Recognition Transcriptions and Scores for Spoken Utterance Classification

Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, and Alex Acero

Abstract

In this paper, we propose a novel discriminative training approach to spoken utterance classification (SUC). The ultimate objective of the SUC task, originally developed to map a spoken utterance into the most appropriate semantic class, is to minimize the classification error rate (CER). Conventionally, a two-phase approach is adopted, in which the first phase is the ASR transcription phase and the second phase is the semantic classification phase. In the proposed framework, the classification error rate is approximated as a differentiable function of the language and classifier model parameters. Furthermore, in order to exploit all the available information from the first phase, class-specific discriminant functions are defined based on score functions derived from the N-best lists. Our experimental results on the standard ATIS database indicate a notable reduction in CER relative to the earlier best result on the identical task: the proposed framework reduced CER from 4.92% to 4.04%.
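The abstract does not give the exact formulas, so the following is only an illustrative sketch of the general idea: pool N-best scores into a class-specific discriminant, then replace the 0/1 classification error with a smooth surrogate. It assumes a sigmoid-smoothed misclassification measure in the style of minimum classification error (MCE) training, which may differ from the authors' actual formulation. All function names, the `lm_weight`, `gamma`, and `eta` parameters, and the toy scores are hypothetical.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def class_discriminant(nbest, class_idx, lm_weight=1.0):
    """Pool evidence for one class over all N-best hypotheses (assumed form).

    nbest is a list of (asr_log_score, classifier_log_scores) pairs, where
    classifier_log_scores holds one log score per semantic class.
    """
    return logsumexp(
        [lm_weight * asr + cls_scores[class_idx] for asr, cls_scores in nbest]
    )


def smoothed_error(nbest, true_class, num_classes, gamma=1.0, eta=1.0):
    """Differentiable surrogate for the 0/1 classification error (MCE-style)."""
    g_true = class_discriminant(nbest, true_class)
    competitors = [
        class_discriminant(nbest, c) for c in range(num_classes) if c != true_class
    ]
    # Soft maximum over the competing classes.
    g_comp = (
        logsumexp([eta * g for g in competitors]) - math.log(len(competitors))
    ) / eta
    # Misclassification measure: positive when a competitor outscores the truth.
    d = g_comp - g_true
    return sigmoid(gamma * d)


# Toy 2-best list over two classes (made-up numbers, not from the paper).
example_nbest = [(-1.0, [-0.1, -3.0]), (-2.0, [-0.5, -2.0])]
loss_if_class0_is_true = smoothed_error(example_nbest, true_class=0, num_classes=2)
loss_if_class1_is_true = smoothed_error(example_nbest, true_class=1, num_classes=2)
```

Because the surrogate is smooth in every score, its gradient with respect to the underlying language and classifier model parameters could in principle be followed by any gradient-based optimizer; a larger `gamma` makes the surrogate approach the hard 0/1 error.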

Details

Publication type: Inproceedings
Published in: Proc. of the International Conference on Acoustics, Speech and Signal Processing
Pages: IV-5 to IV-8
Volume: IV
Address: Honolulu, Hawaii, U.S.A.
Publisher: Institute of Electrical and Electronics Engineers, Inc.