On Designing and Evaluating Speech Event Detectors

Jinyu Li and Chin-Hui Lee


We study issues related to designing speech event detectors for
automatic speech recognition. Event detection is a critical
component of a recently proposed automatic speech attribute
transcription (ASAT) paradigm for speech research. Similar to
keyword spotting and non-keyword rejection, a good detector
needs to effectively detect speech attributes of interest while
rejecting extraneous events. We compare frame-based and
segment-based detectors, study their properties in detecting manners
of articulation, and propose new performance measures. We test
these detectors on the TIMIT database with several evaluation
criteria. Our results indicate that segment-based detectors
outperform frame-based detectors in several key aspects of
speech detector design. We also show that the performance
can be significantly enhanced by incorporating discriminative
training into designing speech event detectors.


Publication type: Inproceedings
Published in: Proc. Interspeech