Zhongyuan Wang


Zhongyuan Wang (王仲远)

Email: zhy.wang @ microsoft . com (without any space in the email address)
          zhywangchina @ 163 . com, wzhy AT outlook . com
Tel: +86-138-109-72076, +86-10-59174328

Zhongyuan Wang is an associate researcher at Microsoft Research Asia (MSRA) and a PhD candidate at Renmin University of China (RUC), where his PhD advisors are Haixun Wang and Ji-Rong Wen. He received his master’s degree (advised by Xiaofeng Meng) and his bachelor’s degree in computer science from Renmin University in 2010 and 2007, respectively. While at the university, he won the Wu Yuzhang Scholarship (the top-level scholarship at Renmin University), the Kwang-Hua Scholarship, and the ACM SIGMOD07 Undergraduate Scholarship (one of seven winners worldwide). After graduating from RUC, he joined MSRA as a Research Software Development Engineer. To date, he has published several papers at leading international conferences such as VLDB and ICDE. He is also the translator of the book “Windows Phone 7 Programming for Android and iOS Developers”, published in 2012. His research interests include knowledge bases, web data mining, online advertising, machine learning, and natural language processing.

Currently, Zhongyuan Wang leads the Probase project. He focuses on acquiring web tables, attributes, and knowledge facts from more than 7 billion web documents on the MS Cloud platform, addressing entity disambiguation and attribute synonyms in Probase, understanding web documents by reasoning over uncertain data, and building applications (such as short text understanding, ads matching, and query recommendation) on top of the knowledge base.

Probase: a knowledgebase that knows our mental world

My personal blog: 仲子说

Research Projects:

  • Short Text Understanding / Conceptualization

The goal of this project is to provide better text understanding.

A large variety of applications need to handle short texts such as search queries, ads keywords, tweets, and image captions. Understanding short texts is a big challenge for machines. Unlike long texts and documents, which can be analyzed with “bag of words” statistical approaches, short texts do not contain enough information or statistical signals to make such analysis meaningful. Furthermore, short texts are usually not well-formed sentences. For example, queries submitted to search engines usually do not follow grammar rules. Consequently, approaches based on sentence structure analysis do not work well either. Human beings are good at deriving meaning from noisy, ambiguous, and sparse input. We understand short texts because the knowledge in our minds enriches the input to produce meaning. Thus, for machines to understand short texts, we need to supply such knowledge to them so that the gap between insufficient input and understanding can be bridged.

We have been continuously improving our conceptualization mechanism, which is at the core of our short text understanding services. We leverage the co-occurrence network to enhance sense disambiguation. We also generate mappings between auxiliary words and concept clusters, so that auxiliary words in the context can help disambiguate the sense of a term.
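The idea of context-driven conceptualization can be illustrated with a toy sketch. This is not the Probase implementation: the tables `ISA`, `COOCCUR`, and `AUX`, all of their weights, and the function `conceptualize` are hypothetical and exist only to show how a prior over concepts might be re-weighted by co-occurring terms and auxiliary words.

```python
# Toy sketch of context-aware conceptualization (all data is made up).

# Hypothetical is-a knowledge: term -> {concept: P(concept | term)}.
ISA = {
    "apple": {"fruit": 0.6, "company": 0.4},
    "ipad": {"device": 0.9, "product": 0.1},
    "pie": {"food": 0.8, "dessert": 0.2},
}

# Hypothetical symmetric co-occurrence weights between concepts.
COOCCUR = {
    frozenset(["company", "device"]): 0.9,
    frozenset(["company", "product"]): 0.8,
    frozenset(["fruit", "food"]): 0.9,
    frozenset(["fruit", "dessert"]): 0.7,
    frozenset(["fruit", "device"]): 0.05,
    frozenset(["company", "food"]): 0.05,
}

# Hypothetical mapping from auxiliary words (e.g. verbs) to concept affinity.
AUX = {
    "eat": {"fruit": 0.9, "company": 0.05},
    "buy": {"fruit": 0.5, "company": 0.5},
}

DEFAULT_WEIGHT = 0.01  # smoothing value for unseen pairs (an assumption)


def cooccur(a, b):
    """Return the co-occurrence weight of two concepts, smoothed."""
    return COOCCUR.get(frozenset([a, b]), DEFAULT_WEIGHT)


def conceptualize(term, context):
    """Return a normalized score for each concept of `term` given context words."""
    scores = {}
    for concept, prior in ISA[term].items():
        score = prior
        for word in context:
            if word in ISA:
                # Context term with its own concepts: expected co-occurrence.
                score *= sum(p * cooccur(concept, c) for c, p in ISA[word].items())
            elif word in AUX:
                # Auxiliary word: use its direct affinity to the concept.
                score *= AUX[word].get(concept, DEFAULT_WEIGHT)
            # Unknown words carry no signal and are skipped.
        scores[concept] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


if __name__ == "__main__":
    for ctx in (["ipad"], ["pie"], ["eat"], []):
        result = conceptualize("apple", ctx)
        print(ctx, max(result, key=result.get))
```

In this sketch, "apple" leans toward "company" next to "ipad" and toward "fruit" next to "pie" or "eat". With no context it falls back to the prior.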

Research Interests:

  • Knowledgebase, Graph
  • Database, Data Mining
  • Machine Learning
  • Web Search and Mining
  • Natural Language Processing

Talks:

Community Services:

  • Program Committee member, WAIM 2013
  • Program Committee member, CIKM 2012
  • Program Committee member, WAIM 2011

Tech Transfers to Products:

  • Bing Ads System

–Added semantic features based on semantic similarity between queries and ads keywords

–Shipped to Bing ads system, Oct. 2012

  • Query Recommendation on MSN US

–Using article titles of each channel to train a classifier based on conceptualization techniques

–Compared with the previous QAS-based approach, our model increased CTR by 36.8% and 80.0% in the US Movie and US Music channels, respectively

  • Related Topics for Bing Image Search

–Using is-a data to improve related topics in Bing image search

–Constructing and weighting an entity linkage graph to improve the related topics

–Shipped to Bing Image Search in June 2013, with a gain of ~200% in total query share

  • Microsoft Power Query for Excel

–Microsoft Power Query is an Excel add-in that enhances the self-service Business Intelligence experience in Excel by simplifying data discovery and access. Power Query enables users to easily discover, combine, and refine data for better analysis in Excel. Power Query includes a public search feature that is currently intended for use in the United States only.

–Download link: http://office.microsoft.com/en-us/excel/download-microsoft-power-query-for-excel-FX104018616.aspx

Awards:

  • 2009 Wu Yuzhang Scholarship (Top-level Scholarship of Renmin University of China. Top 10/22000)
  • 2008/2009 Kwang-Hua Scholarship (Twice)
  • 2008 HP Distinguished Chinese Student Scholarship
  • 2007 Excellent Graduate Student Award of Renmin University
  • 2007 ACM SIGMOD07 Undergraduate Scholarship (one of seven winners worldwide)
  • 2006 China Computer World Scholarship
  • 2005~2006 The Outstanding Students Scholarship
  • 2005 First Prize, Beijing Division, China Undergraduate Mathematical Contest in Modeling (CUMCM2005)
  • 2005~2006 First-Class Scholarship
  • 2003~2004 Fan Zhi’an Scholarship
  • 2003~2004 Excellent League Member of RUC
Publications