Asli Celikyilmaz, Dilek Hakkani-Tur, Gokhan Tur, Ashley Fidler, and Dustin Hillard
One of the main components of spoken language understanding is intent detection, which allows user goals to be identified. A challenging sub-task of intent detection is the identification of intent bearing phrases from a limited amount of training data, while maintaining the ability to generalize well. We present a new probabilistic topic model for jointly identifying semantic intents and common phrases in spoken language utterances. Our model jointly learns a set of intent dependent phrases and captures semantic intent clusters as distributions over these phrases based on a distance dependent sampling method. This sampling method uses proximity of words utterances when assigning words to latent topics. We evaluate our method on labeled utterances and present several examples of discovered semantic units. We demonstrate that our model outperforms standard topic models based on bag-of-words assumption.
Publisher IEEE Automatic Speech Recognition and Understanding Workshop