Position Specific Posterior Lattices for Indexing Speech

C. Chelba and Alex Acero


The paper presents the Position Specific Posterior Lattice (PSPL), a novel representation of automatic speech recognition lattices that naturally lends itself to efficient indexing of position information and subsequent relevance ranking of spoken documents using proximity.

In experiments performed on a collection of lecture recordings (MIT iCampus data), the spoken document ranking accuracy was improved by 20% relative over the commonly used baseline of indexing the 1-best output from an automatic speech recognizer. The Mean Average Precision (MAP) increased from 0.53 when using 1-best output to 0.62 when using the new lattice representation. The reference used for evaluation is the output of a standard retrieval engine working on the manual transcription of the speech collection.

Albeit lossy, the PSPL lattice is also much more compact than the ASR 3-gram lattice from which it is computed, which translates into a reduced inverted index size as well, at virtually no degradation in word-error-rate performance. Since new paths are introduced in the lattice, the ORACLE accuracy increases over the original ASR lattice.
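
For readers unfamiliar with the MAP metric quoted above, the following is a minimal sketch of the standard Mean Average Precision definition. It is not the authors' evaluation code: the abstract does not say which scoring tool was used, and the function names, query IDs, and document IDs here are hypothetical. The sketch assumes that, for each query, the set of relevant documents is whatever the reference retrieval engine returned on the manual transcripts.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked result list against a relevant set."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only at relevant hits.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], reference: Dict[str, Set[str]]
) -> float:
    """Mean of per-query average precision over all queries in `runs`."""
    if not runs:
        return 0.0
    total = sum(
        average_precision(ranked, reference.get(query, set()))
        for query, ranked in runs.items()
    )
    return total / len(runs)


# Hypothetical toy data: system rankings and reference relevant sets.
runs = {"q1": ["d1", "d3", "d2"], "q2": ["d5", "d4"]}
reference = {"q1": {"d1", "d2"}, "q2": {"d4"}}
map_score = mean_average_precision(runs, reference)
```

Under this definition, a system is rewarded for placing reference documents near the top of each ranked list, which is why a representation that recovers words missed by the 1-best output can raise the score.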


Publication type: Inproceedings
Published in: Proc. of the Association for Computational Linguistics