This paper presents two time-scale pitch-scale modification techniques to be used in speech synthesis systems. They have been applied to Microsoft’s Whistler system, which is based on concatenative synthesis. Both methods are based on a sourcefilter model, one of them using LPC parameters and the other one using cepstral parameters. The proposed methods achieve high quality prosody modification, retain the characteristics of the donor speaker, allow for spectral manipulation (to reduce spectral discontinuities at unit boundaries), yield compact acoustic inventories and improved voiced fricatives.
|Published in||Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing|