M. R. P. Thomas, J. Gudnason, and P. A. Naylor
Accurate estimation of glottal closure instants (GCIs) and glottal opening instants (GOIs) is important for speech processing applications that benefit from glottal-synchronous processing. This paper proposes a novel improvement to the DYPSA framework, based upon a multiscale analysis technique and an accurate estimation of glottal volume velocity. This replaces the linear prediction residual for candidate selection and enables the reliable detection of both GCI and GOI candidates. A two-stage dynamic programming process then detects the GCIs and removes them from the candidate set, before detecting GOIs from the remaining candidates. A postprocessing step improves GOI detection using the estimated GCIs. Evaluation against hand-labelled data on a large speech database shows that GCI detection, at 96%, is marginally improved compared with the original DYPSA and, more importantly, that GOI detection can be achieved with a similar accuracy of 95%.
Published in: Proc. European Signal Processing Conf. (EUSIPCO)