How to force unsupervised neural networks to discover the right representation of images

One appealing way to design an object recognition system is to define objects recursively in terms of their parts and the required spatial relationships between the parts and the whole. These relationships can be represented by the coordinate transformation between an intrinsic frame of reference embedded in the part and an intrinsic frame embedded in the whole. This transformation is unaffected by the viewpoint, so this form of knowledge about the shape of an object is viewpoint-invariant. A natural way for a neural network to implement this knowledge is to use a matrix of weights to represent each part-whole relationship and a vector of neural activities to represent the pose of each part or whole relative to the viewer. The pose of the whole can then be predicted from the pose of each part and, if the predictions agree, the whole is present. This leads to neural networks that can recognize objects over a wide range of viewpoints using neural activities that are “equivariant” rather than invariant: as the viewpoint varies, the neural activities all vary, even though the knowledge is viewpoint-invariant. The “capsules” that implement the lowest-level parts in the shape hierarchy need to extract explicit pose parameters from pixel intensities, and these pose parameters need to have the right form to allow coordinate transformations to be implemented by matrix multiplications. These capsules are quite easy to learn from pairs of transformed images if the neural net has direct, non-visual access to the transformations, as it would if it controlled them. (Joint work with Sida Wang and Alex Krizhevsky)
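
The agreement test described in the abstract can be illustrated with a minimal sketch. It is not the speaker's implementation, and it makes several assumptions: poses are 2D rigid transforms stored as 3x3 homogeneous matrices rather than vectors of neural activities, the part-whole matrices are fixed by hand rather than learned, and every name (`pose`, `predict_whole`, `whole_present`, `NOSE_IN_FACE`, `MOUTH_IN_FACE`) is hypothetical. Each part predicts the pose of the whole by multiplying its own pose by the inverse of its fixed part-whole matrix, and the whole is declared present when the predictions coincide.

```python
import math

def pose(angle, tx, ty):
    """3x3 homogeneous matrix: rotate by `angle` (radians), then translate by (tx, ty)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rigid_inverse(m):
    """Inverse of a rigid transform [R t; 0 1], which is [R^T, -R^T t; 0 1]."""
    rt = [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]  # transpose of the rotation
    tx = -(rt[0][0] * m[0][2] + rt[0][1] * m[1][2])
    ty = -(rt[1][0] * m[0][2] + rt[1][1] * m[1][2])
    return [[rt[0][0], rt[0][1], tx], [rt[1][0], rt[1][1], ty], [0.0, 0.0, 1.0]]

def max_abs_diff(a, b):
    """Largest element-wise difference between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

def predict_whole(part_pose, part_in_whole):
    """part_pose = whole_pose @ part_in_whole, so whole_pose = part_pose @ inverse."""
    return matmul(part_pose, rigid_inverse(part_in_whole))

def whole_present(part_poses, parts_in_whole, tol=1e-6):
    """Return (agreement flag, first prediction) over all part predictions."""
    preds = [predict_whole(p, m) for p, m in zip(part_poses, parts_in_whole)]
    agree = all(max_abs_diff(preds[0], q) < tol for q in preds[1:])
    return agree, preds[0]

# Hypothetical viewpoint-invariant knowledge: where each part sits in the whole.
NOSE_IN_FACE = pose(0.0, 0.0, -1.0)
MOUTH_IN_FACE = pose(0.0, 0.0, -3.0)
RELATIONS = [NOSE_IN_FACE, MOUTH_IN_FACE]

# A face seen from some viewpoint, and the part poses that viewpoint implies.
face = pose(0.3, 5.0, 2.0)
parts = [matmul(face, m) for m in RELATIONS]
agree, predicted_face = whole_present(parts, RELATIONS)

# Change the viewpoint: every pose changes, but the relations do not.
view = pose(-1.1, 4.0, -7.0)
moved_parts = [matmul(view, p) for p in parts]
agree_moved, moved_face = whole_present(moved_parts, RELATIONS)

# Same nose, but the mouth is in the wrong place relative to the face.
scrambled = [parts[0], matmul(face, pose(0.8, 2.0, 0.0))]
agree_scrambled, _ = whole_present(scrambled, RELATIONS)
```

In this sketch the predicted pose of the face moves with the viewpoint (`moved_face` matches `view` applied to `predicted_face`), which is the equivariance the abstract refers to, while `RELATIONS` stays unchanged. The misplaced mouth makes the two predictions disagree, so the face is not reported as present.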

Speaker Details

Geoffrey Hinton received his BA in experimental psychology from Cambridge in 1970 and his PhD in Artificial Intelligence from Edinburgh in 1978. He did postdoctoral work at Sussex University and the University of California San Diego and spent five years as a faculty member in the Computer Science department at Carnegie Mellon University. He then became a fellow of the Canadian Institute for Advanced Research and moved to the Department of Computer Science at the University of Toronto. He spent three years, from 1998 until 2001, setting up the Gatsby Computational Neuroscience Unit at University College London and then returned to Toronto.

Geoffrey Hinton is a fellow of the Royal Society, the Royal Society of Canada, and the American Association for Artificial Intelligence and a former president of the Cognitive Science Society. He received an honorary doctorate from the University of Edinburgh in 2001. He was awarded the first David E. Rumelhart prize (2001), the IEEE Neural Network Pioneer award (1998) and the ITAC/NSERC award for contributions to information technology (1992).

A simple introduction to Geoffrey Hinton’s research can be found in his articles in Scientific American in September 1992 and October 1993. He investigates ways of using neural networks for learning, memory, perception and symbol processing and has over 150 publications in these areas. He was one of the researchers who introduced the back-propagation algorithm that has been widely used for practical applications. His other contributions to neural network research include Boltzmann machines, distributed representations, time-delay neural nets, mixtures of experts, Helmholtz machines and products of experts. His current main interest is in unsupervised learning procedures for neural networks with rich sensory input.

Date: June 23, 2011
Speakers: Geoffrey Hinton
Affiliation: University of Toronto

- Jeff Running
