A study on lattice rescoring with knowledge scores for automatic speech recognition

We study lattice rescoring with knowledge scores for automatic speech recognition. A frame-based log likelihood ratio is adopted as the score measuring the goodness of fit between a speech segment and the knowledge sources. We evaluate the approach in two applications: phone recognition and continuous connected-digit recognition. By incorporating knowledge scores obtained from 15 attribute detectors for place and manner of articulation, we reduced the phone error rate from 40.52% to 35.16% using monophone models. The error rate can be further reduced to 33.42% with triphone models. The same lattice rescoring algorithm is extended to connected digit recognition on the TIDIGITS database, without using any digit-specific training data. We observed that the digit error rate is reduced from 4.54%, obtained with the conventional Viterbi decoding algorithm and no knowledge scores, to 4.03%.
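The abstract does not spell out the scoring formula or the rescoring procedure, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes the knowledge score of a segment is the average per-frame difference between the log likelihood under a target model and under a competing (anti) model, and that rescoring adds this score to each arc's original decoder score with a tunable weight before a best-path search. The names `frame_llr`, `Arc`, `rescore`, and `weight` are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


def frame_llr(target_loglik: Sequence[float], anti_loglik: Sequence[float]) -> float:
    """Average per-frame log likelihood ratio over one speech segment.

    Assumption: each input holds one log likelihood per frame, from a
    target model and from a competing (anti) model respectively.
    """
    if len(target_loglik) != len(anti_loglik) or not target_loglik:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(t - a for t, a in zip(target_loglik, anti_loglik))
    return total / len(target_loglik)


@dataclass
class Arc:
    start: int        # source node id
    end: int          # destination node id
    label: str        # e.g. a phone or digit label
    score: float      # original decoder log score for this arc
    knowledge: float  # knowledge score for this arc (e.g. from frame_llr)


def rescore(arcs: List[Arc], source: int, sink: int,
            weight: float) -> Tuple[float, List[str]]:
    """Best path through a lattice under score + weight * knowledge.

    Assumption: the lattice is acyclic and node ids are topologically
    ordered, i.e. every arc satisfies start < end.
    """
    best: Dict[int, Tuple[float, List[str]]] = {source: (0.0, [])}
    for arc in sorted(arcs, key=lambda a: a.start):
        if arc.start not in best:
            continue  # node not reachable from the source
        prev_score, prev_labels = best[arc.start]
        cand = prev_score + arc.score + weight * arc.knowledge
        if arc.end not in best or cand > best[arc.end][0]:
            best[arc.end] = (cand, prev_labels + [arc.label])
    if sink not in best:
        raise ValueError("sink is not reachable from source")
    return best[sink]
```

With `weight` set to zero the search returns the path preferred by the original decoder scores; a positive weight lets arcs with strong knowledge scores overtake acoustically favored competitors. How the weight is chosen in the paper is not stated here.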


In Proc. Interspeech

Details

Type: Inproceedings