Speaker: Eric Rozell
Affiliation: Microsoft Research Intern
Host: Evelyne Viegas
Date recorded: 4 August 2011
Microsoft Research Connections intern Eric Rozell presents the results of his research on feature generation techniques for unstructured data sources. He applies Probase, a web-scale knowledge base developed by Microsoft Research Asia and generated from the Bing index, search query logs, and other sources, to extract concepts from text. He compares the performance of features generated from Probase with that of features from two other forms of semantic analysis: Explicit Semantic Analysis (using Wikipedia) and Latent Dirichlet Allocation. He evaluates these semantic analysis techniques on two tasks: recommendation, using Matchbox (a platform for probabilistic recommendations from Microsoft Research Cambridge), and clustering, using K-Means.
©2011 Microsoft Corporation. All rights reserved.