Data Driven Suppression Rule for Speech Enhancement

Ivan Tashev; Malcolm Slaney

Information Theory and Applications Workshop | February 2013

Published by University of California - San Diego

Audio signal enhancement often involves applying a time-varying filter, or suppression rule, to the frequency-domain transform of a corrupted signal. Classic approaches use rules derived under Gaussian models and interpret them as spectral estimators in a Bayesian statistical framework. This mathematical approach yields rules that satisfy certain optimization criteria, such as maximum likelihood or minimum mean square error. In this paper, we propose to learn the suppression rule from a representative training corpus and make it optimal in the sense of best perceived quality. Perceived quality can be measured, for example, with the wideband PESQ algorithm, for which we cannot derive an analytic estimator. The proposed suppression rule is evaluated in a controlled environment and shows improvements in the range of 0.1–0.2 PESQ points on a data corpus with SNRs ranging from -10 to +50 dB.
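To make the idea of a suppression rule concrete, the following minimal Python sketch contrasts a classic analytic rule with a table-based rule that could be tuned from data. It is an illustration under stated assumptions, not the method of the paper: the Wiener rule stands in for the "classic" Gaussian-derived rules, and the function names (`wiener_gain`, `table_gain`, `apply_rule`), the choice to index the table by a priori SNR alone, and the dB grid are all hypothetical. No training loop or PESQ computation is shown.

```python
import numpy as np

def wiener_gain(prior_snr):
    """Classic Wiener rule: G = xi / (1 + xi), with xi the a priori SNR (linear scale)."""
    prior_snr = np.asarray(prior_snr, dtype=float)
    return prior_snr / (1.0 + prior_snr)

def table_gain(prior_snr_db, table, grid_db):
    """Hypothetical data-driven rule: read the gain from a table by
    linear interpolation over a grid of a priori SNR values in dB.
    Values outside the grid are clamped to the end entries."""
    return np.interp(prior_snr_db, grid_db, table)

def apply_rule(noisy_spectrum, gain):
    """Scale each frequency bin by a real gain; the phase is left unchanged."""
    return np.asarray(gain) * np.asarray(noisy_spectrum)

# Assumed grid: a priori SNR from -30 dB to +30 dB in 1 dB steps.
grid_db = np.linspace(-30.0, 30.0, 61)

# Start the table from the Wiener rule. A learning procedure (not shown)
# would adjust these entries to maximise a quality score on a training corpus.
table = wiener_gain(10.0 ** (grid_db / 10.0))

# One toy frame of three complex frequency bins with assumed per-bin SNRs.
frame = np.array([1.0 + 1.0j, 0.5 - 0.2j, 2.0 + 0.0j])
snr_db = np.array([0.0, -10.0, 20.0])
enhanced = apply_rule(frame, table_gain(snr_db, table, grid_db))
```

Storing the rule as a table is one simple way to represent a gain function that has no closed form: each entry becomes a free parameter that an optimiser can move away from its analytic starting value.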