A Robust Audio Classification and Segmentation Method

Hao Jiang; Lie Lu; Hong-Jiang Zhang

MSR-TR-2001-79 | September 2001

In this paper, we present a robust algorithm that segments an audio stream and classifies each segment as speech, music, environment sound, or silence. Classification proceeds in two steps, which makes the method suitable for different applications. The first step discriminates speech from non-speech, using a novel algorithm based on k-nearest-neighbor (KNN) classification and line spectral pair vector quantization (LSP VQ). The second step further divides the non-speech class into music, environment sound, and silence with a rule-based classification scheme. New features, such as the noise frame ratio and band periodicity, are introduced and discussed in detail. Our experiments in the context of video structure parsing show that the algorithms produce very satisfactory results.
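
The two-step structure described in the abstract might be sketched as follows. This is a minimal illustration, not the authors' implementation: the first stage uses a plain KNN majority vote and omits the LSP VQ component, and the second stage takes the energy, noise frame ratio, and band periodicity as precomputed numbers, because the abstract does not define how they are calculated. All function names, thresholds, and the direction of each rule (for example, that music shows a low noise frame ratio and a high band periodicity) are assumptions made for illustration.

```python
from collections import Counter
from math import dist


def knn_is_speech(features, training_set, k=3):
    """Step 1 (assumed form): majority vote among the k nearest labeled vectors."""
    nearest = sorted(training_set, key=lambda item: dist(features, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0] == "speech"


def classify_non_speech(energy, noise_frame_ratio, band_periodicity,
                        silence_energy=0.01, max_noise_ratio=0.3,
                        min_periodicity=0.5):
    """Step 2 (assumed rules): every threshold here is a hypothetical placeholder."""
    if energy < silence_energy:
        return "silence"
    if noise_frame_ratio <= max_noise_ratio and band_periodicity >= min_periodicity:
        return "music"
    return "environment sound"


def classify_segment(features, energy, noise_frame_ratio, band_periodicity,
                     training_set):
    """Run step 1; only segments judged non-speech are passed on to step 2."""
    if knn_is_speech(features, training_set):
        return "speech"
    return classify_non_speech(energy, noise_frame_ratio, band_periodicity)


# Toy labeled feature vectors, invented purely to exercise the sketch.
training = [
    ((0.0, 0.0), "speech"), ((0.1, 0.0), "speech"), ((0.0, 0.1), "speech"),
    ((1.0, 1.0), "non-speech"), ((0.9, 1.0), "non-speech"), ((1.0, 0.9), "non-speech"),
]
print(classify_segment((0.95, 0.95), 0.5, 0.1, 0.8, training))
```

Keeping the two steps as separate functions mirrors the abstract's point that the two-step design suits different applications: a caller that only needs speech/non-speech discrimination can stop after the first step.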