Larry Heck and Dominique Genoud
This paper presents a novel approach to integrating a speech recognizer and a speaker recognizer in order to automatically capture a user's identity claim. The approach incorporates the speaker recognition score into the search process of the speech recognizer, yielding a best hypothesis that jointly optimizes the probability of the word sequence and of the speaker. This makes a natural speech-based interface practical even when the identity claim is ambiguous and relatively difficult to recognize (e.g., names). The paper develops a theoretical framework for this integration and reports experimental results showing a 35% reduction in the NL-error rate on an over-the-telephone speech recognition task, where the test set consists of users from a US city of 1 million people identifying themselves simply by speaking their name.
Published in: Proceedings of Odyssey Speaker Recognition Workshop
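
The joint optimization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper integrates the speaker score into the recognizer's search itself, whereas this sketch substitutes a simpler rescoring of a short hypothesis list. The names `Hypothesis`, `joint_best`, and `weight`, the additive log-score combination, and all numeric scores are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str              # recognized word sequence (here, a spoken name)
    speech_logprob: float  # log-score from the speech recognizer (made-up value)


def joint_best(hypotheses, speaker_logscore, weight=1.0):
    """Return the hypothesis maximizing speech log-score + weight * speaker log-score.

    `speaker_logscore` is a hypothetical callable mapping a claimed name to a
    log-score of how well the voice matches that identity's speaker model.
    """
    def joint(h):
        return h.speech_logprob + weight * speaker_logscore(h.name)
    return max(hypotheses, key=joint)


# Toy data: two acoustically confusable names (scores are invented).
hyps = [Hypothesis("john smith", -10.0), Hypothesis("jon smyth", -9.5)]
speaker_scores = {"john smith": -1.0, "jon smyth": -4.0}

# Speech recognizer alone prefers the hypothesis with the higher speech score.
best_speech_only = max(hyps, key=lambda h: h.speech_logprob)
# Adding the speaker score can change which identity claim wins.
best_joint = joint_best(hyps, lambda name: speaker_scores[name])
print(best_speech_only.name, "->", best_joint.name)
```

In this toy case the speech score alone favors one name, while the combined score favors the other because the voice matches that identity's model better. Setting `weight=0.0` recovers the speech-only decision.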