H. Attias, L. Lee, and Li Deng
April 2003
This paper describes novel and powerful variational EM
algorithms for the segmental switching state space models
used in speech applications, which are capable of capturing
key internal (or hidden) dynamics of natural speech
production. Hidden dynamic models (HDMs) have recently
become a class of promising acoustic models to incorporate
crucial speech-specific knowledge and overcome many
inherent weaknesses of traditional HMMs. However, the lack of
powerful and efficient statistical learning algorithms is one
of the main obstacles preventing them from being well
studied and widely used. Since exact inference and learning are
intractable, a variational approach is taken to develop
effective approximate algorithms. We have implemented the
segmental constraint crucial for modeling speech dynamics
and present algorithms for recovering hidden speech
dynamics and discrete speech units from acoustic data only.
The effectiveness of the algorithms developed is verified by
experiments on simulation and Switchboard speech data.
In Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing
| Type | Inproceedings |