Jieun Oh, Eunjoon Cho, and Malcolm Slaney
26 August 2013
Automatically detecting laughter and other nonlinguistic events in speech raises a fundamental question: is it appropriate to simply adopt acoustic features that have traditionally been used for analyzing linguistic events? We therefore take a step back and propose syllabic-level features that may show a contrast between laughter and speech in their intensity, pitch, and timbral contours and in their rhythmic patterns. We motivate and define our features and evaluate their effectiveness in distinguishing laughter from speech. Including our features in the baseline feature set for the Social Signals Sub-Challenge of the Computational Paralinguistics Challenge yielded an improvement of 2.4% in Unweighted Average Area Under the Curve (UAAUC). Beyond objective metrics, analyzing laughter at a phonetically meaningful level has allowed us to examine the characteristic contours of laughter and to recognize the importance of the shape of its intensity envelope.
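The abstract reports its result in UAAUC without defining it. As a rough illustration only, the sketch below assumes UAAUC is the plain (unweighted) mean of the per-class areas under the ROC curve, with each class scored one-vs-rest. The function names, the class names `laughter` and `filler`, and the toy scores are hypothetical and are not taken from the paper.

```python
def auc(scores, labels):
    """Area under the ROC curve for one class, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def uaauc(per_class):
    """Assumed definition: unweighted mean of the per-class AUC values.
    `per_class` maps a class name to a (scores, labels) pair."""
    aucs = [auc(scores, labels) for scores, labels in per_class.values()]
    return sum(aucs) / len(aucs)


# Toy data (illustrative only): one detector score per frame, 1 = class present.
toy = {
    "laughter": ([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]),
    "filler": ([0.6, 0.7, 0.4, 0.2], [1, 0, 1, 0]),
}
print(uaauc(toy))
```

Averaging the per-class values without weighting keeps a frequent class from dominating the score, which is the usual motivation for an unweighted average when classes are imbalanced.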
Published in: Proceedings of Interspeech 2013