Asli Celikyilmaz, Dilek Hakkani-Tur, and Gokhan Tur
This paper presents a semi-latent topic model for semantic domain detection in spoken language understanding systems. We use labeled utterance information to capture latent topics, which directly correspond to semantic domains. Additionally, we introduce an ’informative prior’ for Bayesian inference that can simultaneously segment utterances of known domains into classes and divide them from out-of-domain utterances. We show that our model generalizes well on the task of classifying spoken language utterances and compare its results to those of an unsupervised topic model, which does not use labeled information.
Publisher Annual Conference of the International Speech Communication Association (Interspeech)