Partially Observable Markov Decision Processes for Spoken Dialogue Systems

Current spoken dialogue systems (SDS) typically employ hand-crafted decision networks or flow-charts to determine what action to take at each point in a conversation. The result is a system that is brittle in the face of speech recognition errors and unable to adapt or learn from experience.

There are two key features needed to build robust and adaptable spoken dialogue systems. Firstly, the system must have an explicit mechanism for modelling uncertainty and, secondly, the system must have an objective measure of dialogue success which can be used as the basis of policy optimisation. The framework of Partially Observable Markov Decision Processes (POMDPs) provides both of these.

The talk will begin with a simple example to illustrate the underlying principles and potential advantage of the POMDP approach. POMDPs provide a Bayesian model of belief and a principled mathematical framework for modelling uncertainty. They can be trained from real data and they yield policies which can be optimised using reinforcement learning. However, exact belief update and policy optimisation algorithms are intractable and as a result there are many issues inherent in scaling POMDP-based systems to handle real-world tasks. Therefore, the main part of the talk will focus on some of the recent work conducted in the Dialogue Systems Group at Cambridge to develop ways of scaling POMDPs to allow them to be used in practical real-world dialogue systems. Two specific systems will be outlined and performance results from user trials will be presented. The talk will conclude by summarising the main issues which need further work.
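The Bayesian belief model mentioned above can be illustrated with a minimal sketch. It is not taken from the talk or from any Cambridge system: the two user goals, the transition matrix and the recogniser confusion probabilities are all hypothetical values chosen for illustration, and the dependence on the system action is omitted for brevity. It shows the standard discrete belief update, in which the new belief in each state is proportional to the observation likelihood times the predicted probability of that state.

```python
def belief_update(belief, transition, observation_likelihood):
    """Discrete Bayesian belief update (system action omitted for brevity).

    belief[s]                  -- current probability of hidden state s
    transition[s][s2]          -- P(next state s2 | current state s)
    observation_likelihood[s2] -- P(observation received | next state s2)
    Returns the normalised posterior belief over next states.
    """
    n = len(belief)
    # Prediction step: push the current belief through the transition model.
    predicted = [
        sum(transition[s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    # Correction step: weight each state by how well it explains the observation.
    unnormalised = [observation_likelihood[s2] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalised)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalised]


# Hypothetical example: the user wants to travel to one of two cities.
STATES = ["london", "leeds"]
# Assumption: the user's goal does not change during the dialogue.
TRANSITION = [[1.0, 0.0],
              [0.0, 1.0]]
# Assumption: P(recogniser outputs "london" | true goal), a made-up error model.
OBS_HEARD_LONDON = [0.7, 0.3]

initial = [0.5, 0.5]
after_one = belief_update(initial, TRANSITION, OBS_HEARD_LONDON)
after_two = belief_update(after_one, TRANSITION, OBS_HEARD_LONDON)

for label, b in [("prior", initial), ("1 turn", after_one), ("2 turns", after_two)]:
    print(label, {s: round(p, 3) for s, p in zip(STATES, b)})
```

Under these assumed numbers, a single noisy recognition result leaves substantial probability on the alternative goal, and a second consistent result sharpens the belief further. Keeping the whole distribution, rather than committing to the top hypothesis, is what lets a policy choose between confirming and proceeding. Enumerating states this way does not scale to realistic tasks, which is the tractability problem the abstract refers to.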

Speaker Details

Steve Young received a BA in Electrical Sciences from Cambridge University in 1973 and a PhD in Speech Processing in 1978. He held lectureships at both Manchester and Cambridge Universities before being elected to the Chair of Information Engineering at Cambridge University in 1994. He was a co-founder and Technical Director of Entropic Ltd from 1995 until 1999 when the company was taken over by Microsoft. After a short period as an Architect at Microsoft, he returned full-time to the University in January 2001 where he is now Professor of Information Engineering.

His research interests include speech recognition, language modelling, spoken dialogue and multi-media applications. He is the inventor and original author of the HTK Toolkit for building hidden Markov model-based recognition systems (see http://htk.eng.cam.ac.uk), and with Phil Woodland, he developed the HTK large vocabulary speech recognition system which has figured strongly in DARPA/NIST evaluations since it was first introduced in the early nineties. More recently he has developed statistical dialogue systems and pioneered the use of Partially Observable Markov Decision Processes for modelling them. He also has active research in voice transformation, emotion generation and HMM synthesis.

He has written and edited books on software engineering and speech processing, and he has published, as author and co-author, more than 200 papers in these areas. He is a Fellow of the UK Royal Academy of Engineering, the Institution of Electrical Engineers and the Royal Society of Arts. He served as the senior editor of Computer Speech and Language from 1993 to 2004 and is now a member of the editorial board. He is a Fellow of the IEEE and Chair of the Speech and Language Technical Committee. He has also served on the technical committees of numerous workshops and conferences. He was the recipient of an IEEE Signal Processing Society Technical Achievement Award in 2004 and he was made a Fellow of the International Speech Communication Association in 2008.

Date:
Speakers:
Steve Young
Affiliation:
Cambridge University