Constructing Accurate Beliefs in Spoken Dialog Systems

2005 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), San Juan, Puerto Rico

We propose a novel approach for constructing more accurate beliefs over concept values in spoken dialog systems by integrating information across multiple turns in the conversation. In particular, we focus our attention on updating the confidence score of the top hypothesis for a concept, in light of subsequent user responses to system confirmation actions. Our data-driven approach bridges previous work in confidence annotation and correction detection, providing a unified framework for belief updating. The approach significantly outperforms heuristic rules currently used in most spoken dialog systems.

Subsequent experiments with the machine learning infrastructure used in this work have revealed a small defect in the model construction and evaluation. During the stepwise model-building process, candidate features were scored by assessing performance on the entire dataset (both the training and development folds), instead of exclusively on the training folds. Nevertheless, once a feature was selected for addition to the model, the model was trained exclusively on the training folds, i.e., the corresponding feature weight in the max-ent model was determined based only on the training data, and the evaluation was done on the held-out development fold. Subsequent experiments with a correct setup (where feature scoring uses only the training folds) on several problems show that this bug does not significantly affect results. While the numbers reported in cross-validation might differ by small amounts under a correct setup, we believe the general results reported in this paper stand.
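The corrected protocol described above can be sketched as a greedy forward-selection loop in which candidate features are scored using only the training folds, and the held-out development fold is touched only once, for the final evaluation. This is a hypothetical illustration of the data-splitting discipline, not the paper's actual max-ent infrastructure; `score_fn` and the data layout are placeholders.

```python
def forward_select(features, train_rows, dev_rows, score_fn, n_select):
    """Greedy forward feature selection, scored on training data only.

    features   -- iterable of candidate feature names
    train_rows -- training-fold data passed to score_fn
    dev_rows   -- held-out development-fold data (used once, at the end)
    score_fn   -- score_fn(selected_features, data) -> float (higher is better)
    n_select   -- number of features to select
    """
    selected = []
    remaining = list(features)
    for _ in range(n_select):
        # The fix: each candidate is scored on the TRAINING folds only.
        # The buggy setup would have scored on train + dev data here.
        best = max(remaining,
                   key=lambda f: score_fn(selected + [f], train_rows))
        selected.append(best)
        remaining.remove(best)
    # The development fold is consulted only for held-out evaluation.
    dev_score = score_fn(selected, dev_rows)
    return selected, dev_score
```

Keeping the development fold out of the selection loop matters because any choice made by peeking at held-out data leaks information into the model, inflating cross-validation estimates.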