Samuel Thomas, Patrick Nguyen, Geoffrey Zweig, and Hynek Hermansky
Phoneme posterior probabilities estimated using Multi-Layer Perceptrons (MLPs) are extensively used both as acoustic scores and as features for speech recognition. In this paper we explore a different application of these posteriors: as phonetic event detectors for speech recognition. We show how these detectors can be built to reliably capture phonetic events in the acoustic signal by integrating both acoustic and phonetic information about sound classes. These event detectors are used along with Segmental Conditional Random Fields (SCRFs) to improve the performance of speech recognition systems on the Broadcast News task.