Data Driven Suppression Rule for Speech Enhancement

Ivan Tashev and Malcolm Slaney

Abstract

Audio signal enhancement often involves the application of a time-varying filter, or suppression rule, to the frequency-domain transform of a corrupted signal. Classic approaches use rules derived under Gaussian models and interpret them as spectral estimators in a Bayesian statistical framework. This mathematical approach provides rules that satisfy certain optimization criteria, such as maximum likelihood or mean square error. In this paper we propose to learn the suppression rule from a representative training corpus and make it optimal in the sense of best perceived quality. Perceived quality can be measured, for example, with the wideband PESQ algorithm, for which we cannot derive an analytic estimator. The proposed suppression rule is evaluated in a controlled environment and shows improvements in the range of 0.1–0.2 PESQ points on a data corpus with SNRs ranging from -10 to +50 dB.
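
To make the notion of a suppression rule concrete, the sketch below applies a real-valued gain to each frequency bin of one frame of a noisy spectrum. It uses the classic Wiener gain, xi / (1 + xi), as a stand-in for an analytic rule; it is not the learned rule proposed in the paper. The function names, the power-subtraction estimate of the a priori SNR xi, and the assumption that the noise power per bin is already known are all illustrative choices, not details taken from the paper.

```python
import numpy as np

def wiener_gain(prior_snr):
    """Wiener suppression rule: gain = xi / (1 + xi), with xi the linear a priori SNR."""
    prior_snr = np.asarray(prior_snr, dtype=float)
    return prior_snr / (1.0 + prior_snr)

def apply_suppression(noisy_spectrum, noise_power, gain_rule=wiener_gain):
    """Apply a per-bin suppression gain to one frame of a complex spectrum.

    noisy_spectrum: complex array, one value per frequency bin.
    noise_power:    positive array, assumed-known noise power per bin.
    gain_rule:      hypothetical hook; any function mapping SNR to a gain.
    """
    noisy_power = np.abs(noisy_spectrum) ** 2
    # Assumption: crude a priori SNR estimate by power subtraction, floored at zero.
    prior_snr = np.maximum(noisy_power / noise_power - 1.0, 0.0)
    # The gain is real-valued, so the noisy phase is left unchanged.
    return gain_rule(prior_snr) * noisy_spectrum

frame = np.array([2.0 + 0.0j, 0.5 + 0.5j])
noise = np.array([1.0, 1.0])
enhanced = apply_suppression(frame, noise)
```

In this sketch a data-driven rule would replace `wiener_gain` with a function fitted on a training corpus, for example a lookup table indexed by SNR; how that function is trained is outside the scope of this example.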

Details

Publication type: Inproceedings
Published in: Information Theory and Applications Workshop
Publisher: University of California - San Diego