Towards Understandable Neural Networks for High Level AI Tasks
Current AI software relies increasingly on neural networks (NNs). The universal data structure of NNs is the numerical vector of activity levels of model neurons, typically with activity distributed widely over many neurons. Can NNs in principle achieve human-like performance in higher cognitive domains – such as inference, planning, grammar – where theories in AI, cognitive science, and linguistics have long argued that abstract, structured symbolic representations are necessary? The work I will present seeks to determine whether, and precisely how, distributed vectors can be functionally isomorphic to symbol structures for computational purposes relevant to AI – at least in certain idealized limits such as unbounded network size. This work – defining and exploring Gradient Symbolic Computation (GSC) – has shown:
- how recursive structures built of symbols can be compositionally encoded as distributed numerical vectors: tensor product representations (TPRs)
- how TPRs can be used to compute recursive symbolic functions with massive parallelism
- how certain symbolic constraint-based grammars can be encoded as interconnection-weight matrices which asymptotically compute the TPRs of grammatical structures
- how certain symbolic Maxent models can be encoded as weight matrices of networks that produce the TPRs of alternative structures with a log-linear probability distribution
- how generative models can be used to reverse-engineer a trained network to determine whether that network has learned a TPR scheme
- how networks deploying TPRs go beyond the capabilities of symbol processing because their representations include TPRs not only of purely discrete structures, but also of structures built of blends of numerically-weighted (‘gradient’) symbols.
These results on GSC are purely theoretical. Current work at MSR is exploring the use of GSC to address large-scale practical problems using NNs that can be understood because they operate under the explanatory principles of GSC.
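To make the idea of a tensor product representation concrete, here is a minimal sketch of binding symbols (fillers) to positions (roles) and recovering them again. It is an illustration under stated assumptions, not code from the talk: the filler and role vectors, the names `bind` and `unbind`, and the flat two-position string are all hypothetical choices, and the recursive case (roles that are themselves built from tensor products) is not covered.

```python
# Illustrative sketch only: all vectors and names below are assumptions.

def outer(f, r):
    """Outer (tensor) product of a filler vector and a role vector."""
    return [[fi * rj for rj in r] for fi in f]

def add(a, b):
    """Elementwise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def bind(pairs):
    """TPR of a structure: the sum of outer(filler, role) over all pairs."""
    pairs = list(pairs)
    total = outer(*pairs[0])
    for f, r in pairs[1:]:
        total = add(total, outer(f, r))
    return total

def unbind(tpr, role):
    """Recover the filler bound to `role` (exact when roles are orthonormal)."""
    return [sum(t * r for t, r in zip(row, role)) for row in tpr]

# Hypothetical distributed (not one-hot) vectors for the symbols A and B.
FILLERS = {"A": [1.0, 2.0, 0.0], "B": [0.0, 1.0, -1.0]}

# Hypothetical orthonormal role vectors for string positions 1 and 2.
S = 2 ** -0.5
ROLES = {"pos1": [S, S], "pos2": [S, -S]}

# Encode the string "AB": A in position 1, B in position 2.
tpr_ab = bind([(FILLERS["A"], ROLES["pos1"]),
               (FILLERS["B"], ROLES["pos2"])])

# A 'gradient' blend: position 1 holds the weighted mix 0.7*A + 0.3*B.
blend = [0.7 * a + 0.3 * b for a, b in zip(FILLERS["A"], FILLERS["B"])]
tpr_blend = bind([(blend, ROLES["pos1"])])
```

In this sketch every entry of `tpr_ab` mixes contributions from both symbols, yet `unbind(tpr_ab, ROLES["pos1"])` returns the vector for A up to floating-point error, because the role vectors were chosen to be orthonormal. The same arithmetic accepts the blended filler in `tpr_blend`, which is the sense in which such vectors can hold structures that no purely discrete symbol string can.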
Speaker Details
Paul Smolensky is Krieger-Eisenhower Professor of Cognitive Science at Johns Hopkins University. His research addresses mathematical unification of the continuous and the discrete facets of cognition: principally, the development of grammar formalisms that are grounded in cognitive and neural computation. A member of the Parallel Distributed Processing (PDP) Research Group at UCSD (1986), he developed Harmony Theory, proposing what is now known as the ‘Restricted Boltzmann Machine’ architecture. He then developed Tensor Product Representations (1990), a compositional, recursive technique for encoding symbol structures as real-valued activation vectors. Combining these two theories, he co-developed Harmonic Grammar (1990) and Optimality Theory (1993), general grammatical formalisms now widely used in phonological theory. His publications include the books Mathematical perspectives on neural networks (1996, with M. Mozer, D. Rumelhart), Optimality Theory: Constraint interaction in generative grammar (1993/2004, with A. Prince), Learnability in Optimality Theory (2000, with B. Tesar), and The harmonic mind: From neural computation to optimality-theoretic grammar (2006, with G. Legendre). He was awarded the 2005 David E. Rumelhart Prize for Outstanding Contributions to the Formal Analysis of Human Cognition, a Blaise Pascal Chair in Paris (2008-9), and the 2015 Sapir Professorship of the Linguistic Society of America.
Webpage: http://cogsci.jhu.edu/people/smolensky.html
- Series: Microsoft Research Talks
- Speakers: Paul Smolensky
- Affiliation: Johns Hopkins University