The Audio Epitome: A New Representation for Modeling and Classifying Auditory Phenomena

Ashish Kapoor and Sumit Basu

Abstract

This paper presents a novel representation for auditory environments that can be used to classify events of interest, such as speech and cars, and potentially the environments themselves. We propose a discriminative framework based on the audio epitome, an audio extension of the image representation developed by Jojic et al. [3]. We also develop an informative patch sampling procedure for training the epitomes, which reduces computational complexity and improves the quality of the epitome. For classification, the training data is used to learn distributions over the epitomes that model the different classes; the distributions for new inputs are then compared to these models. On a task of distinguishing among four classes of environmental sounds (car, speech, birds, utensils), our method outperforms the conventional nearest-neighbor and mixture-of-Gaussians approaches on three of the four classes.
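
The classification step described in the abstract (learn a per-class distribution over the epitome, then compare the distribution of a new input against each class model) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each audio patch has already been mapped to a single best-matching epitome position, represents each distribution as a smoothed histogram, and uses KL divergence as the comparison measure. The abstract does not state which measure the paper uses, and the function names, the smoothing constant, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def position_distribution(positions, num_positions, smoothing=1e-3):
    """Smoothed, normalized histogram over epitome positions.

    `positions` holds one (assumed, precomputed) epitome index per audio patch.
    """
    counts = Counter(positions)
    total = len(positions) + smoothing * num_positions
    return [(counts.get(i, 0) + smoothing) / total for i in range(num_positions)]


def kl_divergence(p, q):
    """KL(p || q); both inputs are strictly positive thanks to smoothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def classify(positions, class_models, num_positions):
    """Return the label whose class distribution is closest to the input's."""
    p = position_distribution(positions, num_positions)
    return min(class_models, key=lambda label: kl_divergence(p, class_models[label]))


# Toy data (hypothetical): epitome positions hit by training patches per class.
NUM_POSITIONS = 8
training = {
    "speech": [0, 1, 1, 2, 0, 1],
    "car": [5, 6, 6, 7, 5, 6],
}
class_models = {
    label: position_distribution(pos, NUM_POSITIONS)
    for label, pos in training.items()
}

# A new input whose patches mostly land near the "speech" region of the epitome.
predicted = classify([1, 1, 0, 2], class_models, NUM_POSITIONS)
print(predicted)
```

A hard assignment of patches to positions keeps the sketch short; a posterior over positions for each patch could be accumulated in the same way.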

Details

Publication type: Inproceedings
URL: http://www.ieee.org/
Publisher: Institute of Electrical and Electronics Engineers, Inc.