Alex Acero
December 1998
This paper presents a time-scale pitch-scale modification
technique for concatenative speech synthesis. The method is
based on a frequency domain source-filter model, where the
source is modeled as a mixed excitation. This model is highly
coupled with a compression scheme that results in compact
acoustic inventories. Compared with the approach in the
Whistler system, which uses no mixed excitation, the new method
shows improvement in voiced fricatives and over-stretched
voiced sounds. In addition, it allows for spectral manipulation
such as smoothing of discontinuities at unit boundaries, voice
transformations or loudness equalization.
In Proc. of the Int. Conf. on Spoken Language Processing
Type: Inproceedings