J. Silva, C. Chelba, and Alex Acero
The paper presents the Position Specific Posterior Lattice (PSPL), a novel lossy representation of automatic speech recognition lattices that naturally lends itself to efficient indexing and subsequent relevance ranking of spoken documents. Two pruning techniques for generating word lattices are explored in this framework. Experiments on a collection of lecture recordings (the MIT iCampus database) show that spoken document ranking accuracy, measured by mean average precision, improved by 20% relative to the commonly used baseline of indexing the 1-best output of an automatic speech recognizer (ASR).
Published in: Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing