Semantic-frame-based spoken language understanding involves two decisions: frame classification and slot filling. The two decisions can be made either separately or jointly. This paper compares these strategies and presents empirical results in the conditional model framework when only a small amount of training data is available. The two-pass classification/slot-filling solution yields much better frame classification accuracy, whereas the joint model yields better slot-filling results. Application developers therefore need to choose the strategy that suits their application scenario.
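To make the difference between the two strategies concrete, the following is a minimal sketch and not the paper's implementation. It assumes each candidate frame and each candidate slot labeling already has a log-domain score. The frame names, slot labels, scores, and the helper functions `two_pass` and `joint` are all hypothetical and chosen only for illustration. The two-pass strategy commits to the best frame first and then picks the best slot labeling for that frame, while the joint strategy picks the frame and slot labeling that maximize the combined score.

```python
# Hypothetical frame scores (log-domain) for one utterance; not from the paper.
FRAME_SCORES = {"ShowFlight": -0.4, "GroundTransport": -1.1}

# Hypothetical slot-labeling scores per frame for the same utterance.
# Each key is one candidate labeling of the utterance's tokens.
SLOT_SCORES = {
    "ShowFlight": {("O", "DepartCity"): -2.0, ("O", "ArriveCity"): -1.5},
    "GroundTransport": {("O", "City"): -0.3},
}


def two_pass(frame_scores, slot_scores):
    """Pick the best frame first, then the best slot labeling for that frame."""
    frame = max(frame_scores, key=frame_scores.get)
    labels = max(slot_scores[frame], key=slot_scores[frame].get)
    return frame, labels


def joint(frame_scores, slot_scores):
    """Pick the (frame, slot labeling) pair with the highest combined score."""
    candidates = [(f, l) for f in frame_scores for l in slot_scores[f]]
    return max(
        candidates,
        key=lambda c: frame_scores[c[0]] + slot_scores[c[0]][c[1]],
    )


if __name__ == "__main__":
    print("two-pass:", two_pass(FRAME_SCORES, SLOT_SCORES))
    print("joint:   ", joint(FRAME_SCORES, SLOT_SCORES))
```

With these toy scores the two strategies disagree, which shows only that the choice of strategy can change the output. It says nothing about which strategy is more accurate; that is what the paper's experiments address.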
|Published in|Proc. of Interspeech|
|Publisher|International Speech Communication Association|