Whistler: A Trainable Text-to-Speech System

Xuedong Huang, Alex Acero, J. Adcock, J. Goldsmith, and J. Liu

Abstract

We introduce Whistler, a trainable Text-to-Speech (TTS) system that automatically learns its model parameters from a corpus. Both prosody parameters and concatenative speech units are derived using probabilistic learning methods that have been successfully used for speech recognition. Whistler can produce synthetic speech that sounds very natural and resembles the acoustic and prosodic characteristics of the original speaker. The underlying technologies used in Whistler can significantly facilitate the process of creating generic TTS systems for a new language, a new voice, or a new speech style.

Details

Publication type: Inproceedings
Published in: Proc. of the Int. Conf. on Spoken Language Processing
Publisher: International Speech Communication Association