J. Silva, C. Chelba, and Alex Acero
The paper presents the Position Specific Posterior Lattice
(PSPL), a novel lossy representation of automatic speech
recognition lattices that naturally lends itself to efficient indexing
and subsequent relevance ranking of spoken documents.
Two pruning techniques for generating word lattices are
explored in this framework. Experiments performed on
a collection of lecture recordings (the MIT iCampus database)
show that spoken document ranking accuracy improved
by 20% relative, in terms of mean average precision,
over the commonly used baseline of indexing the 1-best
output from an automatic speech recognizer (ASR).
Published in: Proc. of the Int. Conf. on Acoustics, Speech, and Signal Processing