Variational Inference and Learning for Segmental Switching State Space Models of Hidden Speech Dynamics

This paper describes novel and powerful variational EM algorithms for the segmental switching state space models used in speech applications, which are capable of capturing key internal (or hidden) dynamics of natural speech production. Hidden dynamic models (HDMs) have recently become a promising class of acoustic models that incorporate crucial speech-specific knowledge and overcome many inherent weaknesses of traditional HMMs. However, the lack of powerful and efficient statistical learning algorithms is one of the main obstacles preventing them from being well studied and widely used. Since exact inference and learning are intractable, a variational approach is taken to develop effective approximate algorithms. We have implemented the segmental constraint crucial for modeling speech dynamics and present algorithms for recovering hidden speech dynamics and discrete speech units from acoustic data only. The effectiveness of the algorithms developed is verified by experiments on simulation and Switchboard speech data.

PDF file: 2003-lee-icassp.pdf

In Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing

Details

Type: Inproceedings