Submission website: http://mc.manuscriptcentral.com/tamd-ieee
Volume: 2 Issue: 2 Date: June 2010
(Previous issue: Vol. 2, No. 1, March 2010)
Abstract: The six papers in this special issue focus on active learning and intrinsically motivated exploration in robots. The issue presents novel contributions on intentional exploration, i.e., internal mechanisms and constraints that explicitly foster organized exploration.
Full Text from IEEE: PDF (1151 KB); Contact the author by email
Abstract: There is great interest in building intrinsic motivation into artificial systems using the reinforcement learning framework. Yet, what intrinsic motivation may mean computationally, and how it may differ from extrinsic motivation, remains a murky and controversial subject. In this paper, we adopt an evolutionary perspective and define a new optimal reward framework that captures the pressure to design good primary reward functions that lead to evolutionary success across environments. The results of two computational experiments show that optimal primary reward signals may yield both emergent intrinsic and extrinsic motivation. The evolutionary perspective and the associated optimal reward framework thus lead to the conclusion that there are no hard and fast features distinguishing intrinsic and extrinsic reward computationally. Rather, the directness of the relationship between rewarding behavior and evolutionary success varies along a continuum.
Full Text from IEEE: PDF (1979 KB); Contact the author by email
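The optimal reward framework described in the abstract above can be sketched as a two-level search: an inner agent learns from a candidate primary reward, while an outer loop scores that reward by the fitness the agent accrues across sampled environments. The bandit environment, the epsilon-greedy learner, and all names below are illustrative assumptions, not the paper's actual setup.

```python
import random

# Illustrative sketch (not the paper's setup): an outer search over candidate
# primary reward functions, each scored by the "evolutionary fitness" an inner
# learning agent achieves across a distribution of bandit environments.

def run_agent(reward_fn, env_payoffs, steps=200, eps=0.1, rng=None):
    """Inner loop: an epsilon-greedy bandit learner driven by reward_fn.
    Fitness is measured by the environment's true payoff, which the agent
    never optimizes directly -- it only sees its internal reward."""
    rng = rng or random.Random(0)
    q = [0.0] * len(env_payoffs)
    n = [0] * len(env_payoffs)
    fitness = 0.0
    for _ in range(steps):
        a = (rng.randrange(len(q)) if rng.random() < eps
             else max(range(len(q)), key=q.__getitem__))
        payoff = env_payoffs[a] + rng.gauss(0, 0.1)
        fitness += payoff                  # true evolutionary success
        r = reward_fn(a, payoff)           # internal (primary) reward
        n[a] += 1
        q[a] += (r - q[a]) / n[a]
    return fitness

def fitness_of_reward(reward_fn, n_envs=20, seed=0):
    """Outer loop: average fitness across sampled environments."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_envs):
        payoffs = [rng.uniform(0, 1) for _ in range(5)]
        total += run_agent(reward_fn, payoffs, rng=random.Random(rng.random()))
    return total / n_envs

# Two toy candidates: a reward mirroring the extrinsic payoff, and its inverse.
# "Designing a good primary reward" here just means keeping the better one.
candidates = {
    "extrinsic": lambda a, payoff: payoff,
    "inverted":  lambda a, payoff: -payoff,
}
best = max(candidates, key=lambda k: fitness_of_reward(candidates[k]))
```

In this caricature the winning reward happens to mirror the extrinsic payoff, but nothing in the outer loop requires that; richer candidate sets could favor rewards with intrinsic components, which is the paper's point.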
Abstract: Reward functions in reinforcement learning have largely been assumed given as part of the problem being solved by the agent. However, the psychological notion of intrinsic motivation has recently inspired inquiry into whether there exist alternate reward functions that enable an agent to learn a task more easily than the natural task-based reward function allows. This paper presents a genetic programming algorithm to search for alternate reward functions that improve agent learning performance. We present experiments that show the superiority of these reward functions, demonstrate the possible scalability of our method, and define three classes of problems where reward function search might be particularly useful: distributions of environments, nonstationary environments, and problems with short agent lifetimes.
Full Text from IEEE: PDF (656 KB); Contact the author by email
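The reward-function search in the abstract above can be illustrated with a drastically simplified evolutionary loop: candidate rewards here are linear in the task payoff plus a count-based novelty bonus, and truncation selection with Gaussian mutation stands in for the paper's genetic programming over reward-program trees. The bandit task and all parameters are assumptions.

```python
import random

# Simplified stand-in for reward-function search: evolve the weights of a
# parameterized internal reward, selecting on the agent's task performance.

def learning_performance(w, seed=0, steps=300):
    """Score reward r = w[0]*payoff + w[1]/(1 + visit_count) by the task
    payoff an epsilon-greedy bandit agent accumulates while learning from r."""
    rng = random.Random(seed)
    payoffs = [0.1, 0.5, 0.9]
    q, n, total = [0.0] * 3, [0] * 3, 0.0
    for _ in range(steps):
        a = (rng.randrange(3) if rng.random() < 0.1
             else max(range(3), key=q.__getitem__))
        p = payoffs[a] + rng.gauss(0, 0.05)
        total += p                                   # task-based performance
        r = w[0] * p + w[1] / (1 + n[a])             # evolved internal reward
        n[a] += 1
        q[a] += (r - q[a]) / n[a]
    return total

def evolve(pop_size=8, gens=10, seed=1):
    """Truncation selection + Gaussian mutation over reward weights."""
    rng = random.Random(seed)
    pop = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=learning_performance, reverse=True)
        parents = pop[:pop_size // 2]                # keep the best half
        pop = parents + [(w0 + rng.gauss(0, 0.2), w1 + rng.gauss(0, 0.2))
                         for (w0, w1) in parents]    # mutated offspring
    return max(pop, key=learning_performance)

best_w = evolve()
```

Real genetic programming would evolve expression trees over reward primitives rather than two weights, but the selection pressure — reward functions that make the agent learn the task faster — is the same.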
Abstract: Recently, infomax methods of optimal control have begun to reshape how we think about active information gathering. We show how such methods can be used to formulate the problem of choosing where to look. We show how an optimal eye movement controller can be learned from subjective experiences of information gathering, and we explore in simulation properties of the optimal controller. This controller outperforms other eye movement strategies proposed in the literature. The learned eye movement strategies are tailored to the specific visual system of the learner; we show that agents with different kinds of eyes should follow different eye movement strategies. Then we use these insights to build an autonomous computer program that follows this approach and learns to search for faces in images faster than current state-of-the-art techniques. The context of these results is search in static scenes, but the approach extends easily, and gives further efficiency gains, to dynamic tracking tasks. A limitation of infomax methods is that they require probabilistic models of uncertainty of the sensory system, the motor system, and the external world. In the final section of this paper, we propose future avenues of research by which autonomous physical agents may use developmental experience to subjectively characterize the uncertainties they face.
Full Text from IEEE: PDF (2251 KB); Contact the author by email
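The infomax principle behind the eye movement controller above can be sketched as greedy fixation selection by expected information gain over a discrete belief about target location. The binary detector model and all constants are illustrative assumptions, not the paper's controller.

```python
import math

# Illustrative infomax fixation selection: choose the fixation that maximizes
# the expected reduction in entropy of a belief over target locations.

def entropy(b):
    return -sum(p * math.log(p) for p in b if p > 0)

def p_detect(fix, loc, fovea=1.0):
    """Assumed sensor: detection probability decays with distance from the
    fovea, with a 0.05 false-alarm floor."""
    return 0.05 + 0.9 * math.exp(-((fix - loc) ** 2) / (2 * fovea ** 2))

def expected_info_gain(belief, fix):
    """Expected entropy reduction from a binary detect/no-detect observation
    made while fixating cell `fix`."""
    h0 = entropy(belief)
    gain = 0.0
    for obs in (True, False):
        likes = [p_detect(fix, loc) if obs else 1 - p_detect(fix, loc)
                 for loc in range(len(belief))]
        p_obs = sum(l * b for l, b in zip(likes, belief))
        if p_obs == 0:
            continue
        post = [l * b / p_obs for l, b in zip(likes, belief)]
        gain += p_obs * (h0 - entropy(post))
    return gain

def best_fixation(belief):
    return max(range(len(belief)), key=lambda f: expected_info_gain(belief, f))

# A belief concentrated near cell 7: fixations that interrogate the high-mass
# region carry far more expected information than fixations far from it.
belief = [0.02] * 10
belief[7] = 1 - 0.02 * 9
```

The paper's contribution goes well beyond this greedy one-step rule (the controller is learned, and tailored to the agent's visual system), but the quantity being optimized is of this expected-information-gain form.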
Abstract: This paper addresses some of the problems that arise when applying active learning to the context of human-robot interaction (HRI). Active learning is an attractive strategy for robot learners because it has the potential to improve the accuracy and the speed of learning, but it can cause issues from an interaction perspective. Here we present three interaction modes that enable a robot to use active learning queries. The three modes differ in when they make queries: the first makes a query every turn, the second makes a query only under certain conditions, and the third makes a query only when explicitly requested by the teacher. We conduct an experiment in which 24 human subjects teach concepts to our upper-torso humanoid robot, Simon, in each interaction mode, and we compare these modes against a baseline mode using only passive supervised learning. We report results from both a learning and an interaction perspective. The data show that the three modes using active learning are preferable to the mode using passive supervised learning both in terms of performance and human subject preference, but each mode has advantages and disadvantages. Based on our results, we lay out several guidelines that can inform the design of future robotic systems that use active learning in an HRI setting.
Full Text from IEEE: PDF (786 KB); Contact the author by email
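The three interaction modes described above reduce, at their core, to a per-turn query decision. A minimal sketch, assuming an illustrative confidence threshold and mode names (not the paper's exact conditions):

```python
# Sketch of the three active-learning query modes: when does the robot pose
# a query to its human teacher? Threshold and mode names are illustrative.

def should_query(mode, confidence, teacher_requested=False, threshold=0.6):
    """Return True if the robot should pose an active-learning query this turn."""
    if mode == "every_turn":      # mode 1: query on every turn
        return True
    if mode == "conditional":     # mode 2: query only when the learner is uncertain
        return confidence < threshold
    if mode == "on_request":      # mode 3: query only when the teacher asks for one
        return teacher_requested
    raise ValueError(f"unknown mode: {mode}")
```

The experiment's trade-off shows up directly in this sketch: `every_turn` maximizes label information per turn but dominates the interaction, while `on_request` cedes control of pacing to the teacher.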
Abstract: A range of different value systems have been proposed for self-motivated agents, including biologically and cognitively inspired approaches. Likewise, these value systems have been integrated with different behavioral systems, including reflexive architectures, reward-based learning, and supervised learning. However, there is little literature comparing the performance of different value systems for motivating exploration and learning by robots. This paper proposes a neural network architecture for integrating different value systems with reinforcement learning. It then presents an empirical evaluation and comparison of four value systems for motivating exploration by a Lego Mindstorms NXT robot. Results reveal the different exploratory properties of novelty-seeking motivation, interest, and competence-seeking motivation.
Full Text from IEEE: PDF (1396 KB); Contact the author by email
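One of the simplest value systems of the kind compared above is count-based novelty, where the intrinsic reward for a state decays with how often it has been visited. A minimal sketch; the class and decay schedule are assumptions, not the paper's neural architecture:

```python
import math
from collections import defaultdict

# Illustrative novelty-seeking value system: intrinsic reward decays with
# visit count, so a reward-maximizing learner is pushed toward unvisited
# states, i.e., toward broad exploration.

class NoveltyValueSystem:
    def __init__(self):
        self.counts = defaultdict(int)

    def reward(self, state):
        """Intrinsic reward for visiting `state`: 1/sqrt(visit count)."""
        self.counts[state] += 1
        return 1.0 / math.sqrt(self.counts[state])  # novel states pay more
```

Plugging this in place of (or alongside) an extrinsic reward in any reinforcement learner yields novelty-seeking behavior; interest- and competence-based value systems replace the visit count with measures of learning progress or mastery.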
Abstract: We present a framework for intrinsically motivated developmental learning of abstract skill hierarchies by reinforcement learning agents in structured environments. Long-term learning of skill hierarchies can drastically improve an agent's efficiency in solving ensembles of related tasks in a complex domain. In structured domains composed of many features, understanding the causal relationships between actions and their effects on different features of the environment can greatly facilitate skill learning. Using Bayesian network structure learning techniques and structured dynamic programming algorithms, we show that reinforcement learning agents can learn incrementally and autonomously both the causal structure of their environment and a hierarchy of skills that exploit this structure. Furthermore, we present a novel active learning scheme that employs intrinsic motivation to maximize the efficiency with which this structure is learned. As new structure is acquired using an agent's current set of skills, more complex skills are learned, which in turn allow the agent to discover more structure, and so on. This bootstrapping property makes our approach a developmental learning process that results in steadily increasing domain knowledge and behavioral complexity as an agent continues to explore its environment.
Full Text from IEEE: PDF (906 KB); Contact the author by email
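The intrinsic motivation driving the structure learning above can be caricatured as rewarding learning progress: the agent earns intrinsic reward when an experience changes its predictive model of action effects, so exploration is drawn to parts of the environment whose structure is not yet known. The tabular majority-vote model below is an illustrative stand-in for the paper's Bayesian network structure learning.

```python
from collections import defaultdict

# Illustrative learning-progress motivation: intrinsic reward is paid only
# when an observed transition changes the agent's prediction of what an
# action does, so fully modeled transitions stop being rewarding.

class ProgressMotivation:
    def __init__(self):
        # counts[(state, action)][next_state] -> observation count
        self.counts = defaultdict(lambda: defaultdict(int))

    def _predict(self, key):
        """Current prediction: majority-vote next state, or None if unseen."""
        c = self.counts[key]
        return max(c, key=c.get) if c else None

    def intrinsic_reward(self, state, action, next_state):
        key = (state, action)
        before = self._predict(key)
        self.counts[key][next_state] += 1
        after = self._predict(key)
        # Pay out only when the new experience changed the model's prediction.
        return 1.0 if after != before else 0.0
```

As in the paper's bootstrapping loop, once a region of the dynamics is fully modeled its intrinsic reward dries up, pushing the agent on to transitions (and, in the full framework, skills) it has not yet mastered.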
Abstract: In the above-titled paper (ibid., vol. 1, no. 3, pp. 187-195, Oct. 2009), the acknowledgement of financial support was incompletely displayed. The correct acknowledgement is presented here.
Full Text from IEEE: PDF (24 KB); Contact the author by email