C.J.C. Burges, J.C. Platt, and S. Jana
A key problem faced by audio identification, classification, and retrieval systems is the mapping of high-dimensional audio input data into informative lower-dimensional feature vectors. This paper explores an automatic dimensionality reduction algorithm called Distortion Discriminant Analysis (DDA). Each layer of DDA projects its input onto directions that maximize the signal-to-noise ratio (SNR) for a given set of distortions. Stacking multiple layers allows features to be extracted efficiently over a wide temporal window. To further suppress distortions, the audio input to DDA undergoes perceptually relevant preprocessing and de-equalization. We apply DDA to the task of identifying audio clips in an incoming audio stream by matching against stored audio fingerprints. We show excellent test results on matching input fingerprints against 36 hours of stored audio data.
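To make the idea of a single SNR-maximizing projection concrete, here is a minimal sketch in plain Python. It rests on assumptions that the abstract does not state: "signal" is summarized by a second-moment matrix of clean feature vectors, "noise" by a second-moment matrix of the differences between distorted and clean vectors, and the best direction is found by power iteration on the resulting generalized eigenproblem. The function names (`second_moment`, `solve`, `snr`, `top_snr_direction`) are hypothetical and chosen for illustration only. The sketch finds one direction for one layer and does not cover the preprocessing, de-equalization, or multi-layer stacking.

```python
import math

def second_moment(rows):
    """Return the d x d matrix (1/m) * sum_i r_i r_i^T for a list of d-dim rows."""
    m, d = len(rows), len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) / m for b in range(d)]
            for a in range(d)]

def matvec(mat, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vec)) for row in mat]

def solve(mat, rhs):
    """Solve mat @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(mat, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def snr(w, signal_cov, noise_cov):
    """Ratio of signal energy to noise energy along direction w."""
    num = sum(a * b for a, b in zip(w, matvec(signal_cov, w)))
    den = sum(a * b for a, b in zip(w, matvec(noise_cov, w)))
    return num / den

def top_snr_direction(signal_cov, noise_cov, iters=200):
    """Assumed formulation: power iteration on noise_cov^{-1} signal_cov.

    Requires noise_cov to be invertible; converges to the unit-length
    direction with the largest SNR when that direction is unique.
    """
    d = len(signal_cov)
    w = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = solve(noise_cov, matvec(signal_cov, w))
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return w

# Toy data (made up for illustration): noise is strong along axis 0
# and weak along axis 1, so the best direction lies along axis 1.
clean = [[2.0, 1.0], [-2.0, 1.0], [2.0, -1.0], [-2.0, -1.0]]
noise = [[2.0, 0.5], [-2.0, 0.5], [2.0, -0.5], [-2.0, -0.5]]
w = top_snr_direction(second_moment(clean), second_moment(noise))
print(w, snr(w, second_moment(clean), second_moment(noise)))
```

In this toy case the SNR is 1 along the first axis and 4 along the second, so the iteration settles on the second axis. A practical system would extract several directions at once and would typically rely on a numerical library's generalized eigensolver rather than hand-written power iteration.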
In Proc. IEEE Conference on Acoustics, Speech and Signal Processing
Publisher: IEEE Signal Processing Society
© 2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.