A primary motivation for work within my group is the notion of autonomous agents that can interact, robustly and over the long term, with an incompletely known environment that continually changes. In this talk I will describe results from a few different projects that attempt to address key aspects of this broad challenge.
I will begin by looking at how task encodings can be made effective using qualitative (geometric) structure in the strategy space. Using examples that may be familiar to many machine learning researchers – such as control of an inverted pendulum and bipedal walkers – we will explore the connection between the geometric structure of solutions and strategies for dealing with a continually changing task context. The key result here concerns ways to combine exploitation of 'natural' dynamics with the benefits of active planning.
Can there be similarly flexible encodings for more general decision problems, beyond the domain of robot control? I will describe recent results from our work on policy reuse and transfer learning, demonstrating that it is possible to construct agents that learn to adapt to a changing task context through a process of belief updating based on policy performance. This includes the case where the change is induced by other decision-making agents.
Finally, building on this theme of making decisions in the presence of other decision-making agents, I will briefly describe results from our recent experiments in human-robot interaction, where agents must learn to influence the behaviour of other agents in order to achieve their task. These experiments are a step towards general and implementable models of ad hoc interaction, in which agents learn from experience to shape aspects of that interaction without the benefit of prior coordination and related knowledge. I will conclude with some remarks on the potential practical uses of such models and learning methods in a wide variety of applications, ranging from personal robotics to intelligent user interfaces.