Efficient Decoding Strategies for Conversational Speech Recognition Using a Constrained Nonlinear State-Space Model

In this paper, we present two efficient strategies for likelihood computation and decoding in a continuous speech recognizer using an underlying nonlinear state-space dynamic model for the hidden speech dynamics. The state-space model has been specially constructed so as to be suitable for the conversational or casual style of speech where phonetic reduction abounds. Two specific decoding algorithms, based on optimal state-sequence estimation for the nonlinear state-space model, are derived, implemented, and evaluated. They successfully overcome the exponential growth in the original search paths by using the path-merging approaches derived from Bayes’ rule. We have tested and compared the two algorithms using the speech data from the Switchboard corpus, confirming their effectiveness. Conversational speech recognition experiments using the Switchboard corpus further demonstrated that the use of the new decoding strategies is capable of reducing the recognizer’s word error rate compared with two baseline recognizers, including the HMM system and the nonlinear state-space model using the HMM-produced phonetic boundaries, under identical test conditions.

2003-deng-transb.pdf
PDF file

In: IEEE Trans. on Speech and Audio Processing

Details

Type: Article
Pages: 590-602
Volume: 11
Number: 6