The Linear Algebraic Structure of Word Meanings
Word embeddings are often constructed with discriminative models such as deep networks and word2vec. Mikolov et al. (2013) showed that these embeddings exhibit linear structure that is useful for solving “word analogy tasks”. Subsequently, Levy and Goldberg (2014) and Pennington et al. (2014) tried to explain why such linear structure should arise in embeddings derived from nonlinear methods.
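The analogy tasks in question exploit the observation that relationships are approximately linear in embedding space: `vec(king) - vec(man) + vec(woman)` lands near `vec(queen)`. A minimal sketch of that nearest-neighbor search, using hypothetical toy vectors chosen only so the arithmetic works out (real word2vec or GloVe vectors have hundreds of dimensions):

```python
import math

# Hypothetical 3-d toy embeddings for illustration only.
emb = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.8, 0.1, 0.6],
    "man":   [0.2, 0.7, 0.1],
    "woman": [0.2, 0.2, 0.6],
}

def cosine(u, v):
    # Cosine similarity: dot product over the product of norms.
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv)

def analogy(a, b, c):
    """Solve a : b :: c : ? by finding the vocabulary word whose vector
    is closest (in cosine similarity) to vec(b) - vec(a) + vec(c)."""
    target = [tb - ta + tc for ta, tb, tc in zip(emb[a], emb[b], emb[c])]
    candidates = (w for w in emb if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(emb[w], target))

print(analogy("man", "king", "woman"))  # "queen" with these toy vectors
```

With these vectors the target `vec(king) - vec(man) + vec(woman)` equals `vec(queen)` exactly; in practice the match is only approximate, which is what makes the observed linear structure surprising.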
We provide a new generative-model “explanation” for various word embedding methods, as well as for the above-mentioned linear structure. It also gives a generative explanation of older vector space methods such as the PMI method of Church and Hanks (1990). The model makes surprising predictions (e.g., the spatial isotropy of word vectors), which we verify empirically. It also leads directly to a linear algebraic understanding of how a word embedding behaves when the word is polysemous (has multiple meanings), and to a method for recovering the different meanings from the embedding.
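For reference, the pointwise mutual information (PMI) statistic of Church and Hanks (1990) measures how much more often two words co-occur than independence would predict. A minimal sketch over an assumed toy list of (word, context-word) pairs:

```python
import math
from collections import Counter

# Assumed toy co-occurrence data for illustration; Church and Hanks
# compute these counts from a large corpus.
pairs = [
    ("strong", "tea"), ("strong", "tea"), ("strong", "coffee"),
    ("powerful", "computer"), ("powerful", "engine"), ("strong", "coffee"),
]

pair_counts = Counter(pairs)
left_counts = Counter(x for x, _ in pairs)
right_counts = Counter(y for _, y in pairs)
n = len(pairs)

def pmi(x, y):
    """PMI(x, y) = log[ p(x, y) / (p(x) p(y)) ]; positive when x and y
    co-occur more often than chance."""
    p_xy = pair_counts[(x, y)] / n
    p_x = left_counts[x] / n
    p_y = right_counts[y] / n
    return math.log(p_xy / (p_x * p_y))

print(round(pmi("strong", "tea"), 3))  # log(1.5) ≈ 0.405: positive association
```

Assembling PMI values for all word-context pairs into a matrix gives the vector space representation that the generative model above also explains.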
This methodology and generative model may be useful for other NLP tasks and neural models.
Joint work with Sanjeev Arora, Yuanzhi Li, Yingyu Liang, and Andrej Risteski (listed in alphabetical order).
Speaker Details
Tengyu Ma is currently a fourth-year graduate student at Princeton University, advised by Prof. Sanjeev Arora. He received the Simons Award for Graduate Students in Theoretical Computer Science in 2014 and an IBM Ph.D. Fellowship in 2015. Ma’s work seeks to develop efficient algorithms with provable guarantees for machine learning problems. His research interests include non-convex optimization, deep learning, natural language processing, distributed optimization, and convex relaxations (e.g., the sum-of-squares hierarchy) for machine learning problems.
- Series: Microsoft Research Talks
- Date:
- Speakers: Tengyu Ma