Handset-Dependent Background Models for Robust Text-Independent Speaker Recognition

Larry Heck and Mitchel Weintraub

Abstract

This paper studies the effects of handset distortion on telephone-based speaker recognition performance, resulting in the following observations: (1) the major factor in speaker recognition errors is whether the handset type (e.g., electret, carbon) differs between training and testing, not whether the telephone lines are mismatched, (2) the distribution of speaker recognition scores for true speakers is bimodal, with one mode dominated by matched-handset tests and the other by mismatched-handset tests, (3) cohort-based normalization methods derive much of their performance gain from implicitly selecting cohorts trained with the same handset type as the claimant, and (4) using a handset-dependent background model matched to the handset type of the claimant's training data sharpens and separates the true and false speaker score distributions. Results on the 1996 NIST Speaker Recognition Evaluation corpus show that using handset-matched background models reduces false acceptances (at a 10% miss rate) by more than 60% over previously reported (handset-independent) approaches.
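The idea in observation (4) can be illustrated with a minimal sketch. The abstract does not specify the model family or the exact scoring rule, so everything here is an assumption for illustration: speakers and background models are diagonal-covariance Gaussian mixtures, the score is an average log-likelihood ratio, and the names `gmm_avg_loglik`, `handset_matched_score`, and the `train_handset` key are hypothetical.

```python
import numpy as np

def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) feature vectors; weights: (K,); means, variances: (K, D).
    The GMM form is an assumption; the abstract does not name the model.
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    diff = frames[:, None, :] - means[None, :, :]            # (T, K, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)                # Mahalanobis term
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)    # normalizer, (K,)
    )                                                        # (T, K)
    log_weighted = log_comp + np.log(weights)

    # Numerically stable log-sum-exp over mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(frame_ll.mean())

def handset_matched_score(frames, claimant, backgrounds):
    """Log-likelihood ratio against the background model whose handset type
    matches the claimant's TRAINING data (hypothetical data layout)."""
    background = backgrounds[claimant["train_handset"]]
    return (gmm_avg_loglik(frames, *claimant["gmm"])
            - gmm_avg_loglik(frames, *background))
```

In this sketch the background model is chosen from the handset label of the claimant's enrollment data, not from the test utterance, which mirrors the wording of observation (4). A handset-independent baseline would instead use a single background model for every claimant.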

Details

Publication type: Proceedings
Publisher: International Conference on Acoustics, Speech, and Signal Processing (ICASSP)