Continuous Speech Recognition with a TF-IDF Acoustic Model

Geoffrey Zweig, Patrick Nguyen, Jasha Droppo, and Alex Acero

Abstract

Information retrieval methods are frequently used for indexing and retrieving spoken documents, and more recently have been proposed for voice-search amongst a pre-defined set of business entries. In this paper, we show that these methods can be used in an even more fundamental way, as the core component in a continuous speech recognizer. Speech is initially processed and represented as a sequence of discrete symbols, specifically phoneme or multi-phone units. Recognition then operates on this sequence. The recognizer is segment-based, and the acoustic score for labeling a segment with a word is based on the TF-IDF similarity between the subword units detected in the segment, and those typically seen in association with the word. We present promising results on both a voice search task and the Wall Street Journal task. The development of this method brings us one step closer to being able to do speech recognition based on the detection of sub-word audio attributes.
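
The segment-scoring idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the choice of phoneme unigrams and bigrams as features, the smoothed IDF formula, the use of cosine similarity, and all function names (`ngrams`, `build_model`, `score_segment`) are assumptions made for illustration. Each word is treated as a "document" made of the subword units seen with it, and a segment is scored against every word by the similarity of their TF-IDF vectors.

```python
import math
from collections import Counter


def ngrams(units, max_n=2):
    """Return all n-grams of order 1..max_n over a unit sequence, as tuples."""
    return [tuple(units[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(units) - n + 1)]


def build_model(examples, max_n=2):
    """Build per-word TF-IDF vectors.

    examples maps each word to the list of unit sequences observed with it.
    """
    counts = {word: Counter(g for seq in seqs for g in ngrams(seq, max_n))
              for word, seqs in examples.items()}
    n_words = len(counts)
    # Document frequency: in how many words does each n-gram occur?
    df = Counter(g for c in counts.values() for g in c)
    # Smoothed IDF (an assumption; other weightings are possible).
    idf = {g: math.log((1 + n_words) / (1 + d)) + 1.0 for g, d in df.items()}
    vectors = {word: {g: tf * idf[g] for g, tf in c.items()}
               for word, c in counts.items()}
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_segment(units, idf, vectors, max_n=2):
    """Score one segment of detected units against every word in the model."""
    tf = Counter(ngrams(units, max_n))
    # N-grams never seen in training carry no weight in this sketch.
    segment = {g: c * idf[g] for g, c in tf.items() if g in idf}
    return {word: cosine(segment, vec) for word, vec in vectors.items()}


# Hypothetical toy lexicon: one phoneme sequence per word.
examples = {
    "cat": [["k", "ae", "t"]],
    "cab": [["k", "ae", "b"]],
    "dog": [["d", "ao", "g"]],
}
idf, vectors = build_model(examples)
scores = score_segment(["k", "ae", "t"], idf, vectors)
best = max(scores, key=scores.get)
print(best, scores)
```

In this toy setting the segment matches "cat" most closely, partially overlaps with "cab", and shares nothing with "dog". A full recognizer would also have to search over segmentations and combine these scores with a language model, which the sketch does not attempt.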

Details

Publication type: Inproceedings
Publisher: International Speech Communication Association