Kris Demuynck, Dino Seppi, Dirk Van Compernolle, Patrick Nguyen, and Geoffrey Zweig
2011
Exemplar-based recognition systems do not abstract large amounts of
data into compact models. Instead, they store the observed data
enriched with annotations and infer on the fly by finding
the exemplars that best resemble the input speech. One advantage of
exemplar-based systems is that, in addition to identifying the current
phone or word, one can easily derive a wealth of meta-information
about the chunk of
audio under investigation. In this work we harvest meta-information
from the set of best-matching exemplars that is thought to be relevant
to recognition, such as word boundary predictions and
speaker entropy. Integrating this meta-information into the recognition
framework using segmental conditional random fields reduced
the word error rate (WER) of the exemplar-based system on the WSJ Nov92 20k task
from 8.2% to 7.6%. Adding the HMM score and multiple HMM
phone detectors as features further reduced the error rate to 6.6%.
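
As a rough illustration of one piece of meta-information named in the abstract, the sketch below computes a speaker entropy over a set of best-matching exemplars. It assumes, without confirmation from the abstract, that speaker entropy means the Shannon entropy of the speaker labels attached to those exemplars; the function name `speaker_entropy` and the speaker labels are hypothetical.

```python
import math
from collections import Counter


def speaker_entropy(speaker_ids):
    """Shannon entropy (in bits) of the speaker labels of the matched exemplars.

    Assumption: each best-matching exemplar carries a speaker annotation and
    the entropy is taken over the empirical label distribution. The abstract
    does not spell out the exact definition used in the paper.
    """
    counts = Counter(speaker_ids)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum p * log2(1/p) over the observed speaker labels.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical speaker labels for the best-matching exemplars of two segments.
one_speaker = ["spk_a", "spk_a", "spk_a", "spk_a"]
four_speakers = ["spk_a", "spk_b", "spk_c", "spk_d"]

low = speaker_entropy(one_speaker)     # all matches come from a single speaker
high = speaker_entropy(four_speakers)  # matches are spread evenly over speakers
```

Under this reading, a low value means the matches are dominated by one speaker and a high value means they are spread over many. How such a value enters the segmental conditional random field as a feature is not described in the abstract and is not modeled here.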
In: ICASSP
Publisher: IEEE
Type: Inproceedings