Speaker: Nicolas Maurice Boulanger-Lewandowski
Affiliation: University of Montréal
Host: Jasha Droppo
Date recorded: 24 February 2014
Humans commonly understand sequential events by giving importance to what they expect rather than exclusively to what they actually observe. The ability to fill in the blanks, useful in speech recognition to favor words that make sense in the current context, is particularly important in noisy conditions. In this talk, we present a probabilistic model of symbolic sequences based on a recurrent neural network that can serve as a powerful prior during information retrieval. We show that conditional distribution estimators can describe much more realistic output distributions, and we devise inference procedures to efficiently search for the most plausible annotations when the observations are partially destroyed or distorted. We demonstrate improvements in the state of the art in polyphonic music transcription, chord recognition, speech recognition and audio source separation.
©2014 Microsoft Corporation. All rights reserved.