Ye-Yi Wang, John Lee, Milind Mahajan, and Alex Acero
Spoken Language Understanding (SLU) addresses the problem of extracting the semantic meaning conveyed in an utterance. The traditional knowledge-based approach to this problem is very expensive: it requires joint expertise in natural language processing and speech recognition, as well as best practices in language engineering, for every new domain. The statistical learning approach, on the other hand, needs a large amount of annotated data for model training, which is seldom available in practical applications outside large research labs. A generative HMM/CFG composite model, which integrates easy-to-obtain domain knowledge into a data-driven statistical learning framework, has previously been introduced to reduce the data requirement. The major contribution of this paper is the investigation of integrating prior knowledge and statistical learning in a conditional model framework. We also study and compare conditional random fields (CRFs) with perceptron learning for SLU. Experimental results show that the conditional models achieve more than 20% relative reduction in slot error rate over the generative HMM/CFG model, which had already achieved SLU accuracy at the same level as the best results reported on the ATIS data.
Publisher: Association for Computational Linguistics