Larry Heck and Yochai Konig
This paper presents a new training procedure for speaker verification systems. The procedure extends previous speaker verification work by (1) developing a new discriminative a posteriori-based training algorithm, and (2) extending the algorithm to directly optimize speaker verification performance. A key feature of the new training algorithm is that it leverages current state-of-the-art technology by initializing the system with Bayesian-adapted Gaussian mixture models. The discriminative training algorithm then adjusts the parameters of these models to directly minimize a verification cost function (VCF) representing the expected costs of falsely accepting impostors and falsely rejecting true claimants. Results on the 1997 NIST Speaker Recognition Evaluation corpus indicate that VCF performance can be improved with this procedure, but at the expense of reduced system performance at other operating points (that is, under different false alarm and false rejection costs).
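As a rough illustration of the kind of cost the VCF represents, the following minimal sketch computes an empirical expected cost of false rejections and false acceptances at a fixed decision threshold. It is not the training algorithm of this paper: the function name `verification_cost`, its parameters, and the default values of `c_miss`, `c_fa`, and `p_target` are hypothetical placeholders chosen for illustration, and the weighted-sum form is an assumed, commonly used way of combining the two error rates.

```python
from typing import Sequence


def verification_cost(
    target_scores: Sequence[float],
    impostor_scores: Sequence[float],
    threshold: float,
    c_miss: float = 10.0,    # assumed cost of rejecting a true claimant
    c_fa: float = 1.0,       # assumed cost of accepting an impostor
    p_target: float = 0.01,  # assumed prior probability of a true claimant
) -> float:
    """Empirical expected cost of verification errors at one threshold.

    A trial is accepted when its score is >= threshold. All default
    cost and prior values are illustrative assumptions.
    """
    if not target_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    # False rejection rate: true-claimant trials scored below the threshold.
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    # False acceptance rate: impostor trials scored at or above the threshold.
    p_fa = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)

    # Weight each error rate by its cost and by the prior of its trial type.
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)


if __name__ == "__main__":
    targets = [2.0, 1.0, -0.5, 3.0]
    impostors = [-1.0, 0.5, -2.0, -3.0]
    # Same scores and threshold, two different operating points.
    print(verification_cost(targets, impostors, threshold=0.0))
    print(verification_cost(targets, impostors, threshold=0.0,
                            c_miss=1.0, c_fa=1.0, p_target=0.5))
```

Changing `c_miss`, `c_fa`, or `p_target` moves the operating point, which is why a system tuned to minimize the cost at one setting need not perform as well at another.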