Wen Wang, Andreas Stolcke, Jiahong Yuan, and Mark Liberman
We investigate two systems for automatic disfluency detection on English and Mandarin conversational speech data. The first system combines various lexical and prosodic features in a conditional random field (CRF) model for detecting edit disfluencies. The second system combines acoustic and language model scores for detecting filled pauses through constrained speech recognition. We compare the contributions of different knowledge sources to detection performance across the two languages.
In Proc. NAACL
Publisher: Association for Computational Linguistics