Estimating Speech Recognition Error Rate without Acoustic Test Data

Yonggang Deng, Milind Mahajan, and Alex Acero


We address the problem of estimating the word error rate (WER) of an
automatic speech recognition (ASR) system without using acoustic test
data. This is an important problem faced by designers of new
applications that use ASR. A quick estimate of WER early in the design
cycle can guide decisions about dialog strategy and grammar design. Our
approach estimates the probability distribution of the word hypotheses
produced by the underlying ASR system given the text test corpus. A
critical component of this system is a phonemic confusion model, which
seeks to capture the errors made by ASR on the acoustic data at the
phonemic level. We use a confusion model composed of probabilistic
phoneme sequence conversion rules, which are learned from phonemic
transcription pairs obtained by leave-one-out decoding of the training
set. We show reasonably close estimation of WER when applying the
system to test sets from different domains.
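
To make the estimated quantity concrete, the following is a minimal sketch, not the authors' implementation. It assumes that some model (for example, a phonemic confusion model) has already produced, for each reference sentence in the text corpus, a probability distribution over word hypotheses. The expected WER is then the probability-weighted word edit distance divided by the number of reference words. The function names, the data layout, and the toy corpus are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def expected_wer(corpus: List[Tuple[str, Dict[str, float]]]) -> float:
    """Expected WER over (reference, hypothesis distribution) pairs.

    Each distribution maps a hypothesis string to its probability and is
    assumed (hypothetically) to come from a simulated recognizer.
    """
    total_errors = 0.0
    total_words = 0
    for reference, hyp_dist in corpus:
        ref_words = reference.split()
        total_words += len(ref_words)
        for hyp, prob in hyp_dist.items():
            total_errors += prob * edit_distance(ref_words, hyp.split())
    return total_errors / total_words


# Toy, made-up distributions; no acoustic data is involved.
toy_corpus = [
    ("call john smith", {"call john smith": 0.80,
                         "call joan smith": 0.15,
                         "call john": 0.05}),
    ("send mail", {"send mail": 0.90, "send male": 0.10}),
]
print(f"expected WER: {expected_wer(toy_corpus):.3f}")
```

In this sketch the hypothesis distributions are given as input. In the approach described above they would be derived from the text corpus by the confusion model and the recognizer's own lexicon and language model.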


Publication type: Inproceedings
Published in: Proc. of the European Conference on Speech Communication
Publisher: International Speech Communication Association