Formant Analysis and Synthesis using Hidden Markov Models

Alex Acero

Abstract

This paper describes a unifying framework for both formant tracking and speech synthesis using Hidden Markov Models (HMMs). The feature vector in the HMM is composed of the first three formant frequencies, their bandwidths, and their deltas over time. Speech is synthesized by generating the most likely sequence of feature vectors from an HMM trained on a set of sentences from a given speaker. Higher formant tracking accuracy can be achieved by finding the most likely formant track given a distribution of the formants of every sound. This data-driven formant synthesizer bridges the gap between rule-based formant synthesizers and concatenative synthesizers by synthesizing speech that is both smooth and resembles the speaker in the training data.
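
As a rough illustration of the feature vector described in the abstract, the sketch below stacks three formant frequencies, their three bandwidths, and the deltas of all six values into one 12-dimensional vector per frame. The function name `build_feature_vectors`, the ordering `[F1, F2, F3, B1, B2, B3]`, the use of a simple first difference as the delta, and the sample values are all assumptions made for illustration; the abstract does not specify how the deltas are computed.

```python
from typing import List, Sequence


def build_feature_vectors(static_frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Append first-difference deltas to each 6-value static frame.

    Assumption: each frame is [F1, F2, F3, B1, B2, B3] in Hz.
    Assumption: delta at frame t is x[t] - x[t-1]; the first frame gets zeros.
    """
    features: List[List[float]] = []
    previous = None
    for frame in static_frames:
        frame = list(frame)
        if len(frame) != 6:
            raise ValueError("expected 6 static values per frame")
        if previous is None:
            deltas = [0.0] * 6
        else:
            deltas = [cur - prev for cur, prev in zip(frame, previous)]
        # Static values first, then their deltas: 12 values in total.
        features.append(frame + deltas)
        previous = frame
    return features


# Hypothetical formant/bandwidth values for two consecutive frames.
frames = [
    [500.0, 1500.0, 2500.0, 60.0, 90.0, 120.0],
    [520.0, 1480.0, 2500.0, 62.0, 90.0, 118.0],
]
vectors = build_feature_vectors(frames)
```

Each resulting vector could serve as one HMM observation; a real system would likely use a smoother delta estimate over several frames.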

Details

Publication type: Inproceedings
Published in: Proc. of the Eurospeech Conference