Zaiqing Nie (聂再清) 

      Lead Researcher

      Web Search & Mining

      Microsoft Research Asia

 

      4F, Beijing Sigma Center

      No. 49, Zhichun Road, Haidian District

      Beijing, P.R.China, 100080

 

       Email: znie AT microsoft DOT com

       Tel: (86-10) 5896-3309

       Fax: (86-10) 8809-7306



Zaiqing Nie joined the Web Search & Mining Group in April 2004. He graduated in May 2004 with a Ph.D. in Computer Science from Arizona State University. He received his Master of Engineering degree in Computer Applications from Tsinghua University in 1998, and his Bachelor of Engineering degree in Computer Science and Technology from Tsinghua University in 1996. His research interests include Web Search, Data Mining, Information Retrieval, and Machine Learning.
 

News

Check out EntityCube, Libra Academic Search, and Renlifang Guanxi Search (人立方关系搜索) (Language: Chinese)


 

Recent Publications

 


2009

·         Closing the Loop in Webpage Understanding

Chunyu Yang, Yong Cao, Zaiqing Nie, Jie Zhou, Ji-Rong Wen

To appear in IEEE Transactions on Knowledge and Data Engineering (TKDE).

 

·         Query Result Clustering for Object-level Search
Jongwuk Lee, Seung-won Hwang, Zaiqing Nie, Ji-Rong Wen

To appear in the Proceedings of SIGKDD 2009.

 

·         StatSnowball: a Statistical Approach to Extracting Entity Relationships

Jun Zhu, Zaiqing Nie, Xiaojiang Liu, Bo Zhang, Ji-Rong Wen.

To appear in the Proceedings of the 18th international World Wide Web conference (WWW 2009).

 


2008

·         WebPage Understanding: Beyond Page-Level Search

Zaiqing Nie, Ji-Rong Wen, and Wei-Ying Ma.

SIGMOD Record, December 2008 (Vol. 37, No. 4).  Special Issue on Managing Information Extraction.

 

·         Scalable Community Discovery on Textual Data with Relations.

Huajing Li, Zaiqing Nie, Wang-Chien Lee, C. Lee Giles, and Ji-Rong Wen

CIKM 2008 (1203-1212).

 

·         Dynamic Hierarchical Markov Random Fields for Integrated Web Data Extraction

Jun Zhu, Zaiqing Nie, Bo Zhang, Ji-Rong Wen.

In the Journal of Machine Learning Research (JMLR), 9(Jul):1583--1614, 2008.

 


2007

·         Webpage Understanding: An Integrated Approach

Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Hsiao-Wuen Hon.

To appear in the Proceedings of SIGKDD 2007.

 

·         Name Disambiguation Using Web Connection
Yiming Lu, Zaiqing Nie, Taoyuan Cheng, Ying Gao, Ji-Rong Wen.

To appear in the AAAI-07 Workshop on Information Integration on the Web (IIWeb 2007).

 

Jun Zhu, Zaiqing Nie, Bo Zhang, Ji-Rong Wen.

To appear in the Proceedings of the 24th International Conference on Machine Learning  (ICML 2007).

 

·         Web Object Retrieval

Zaiqing Nie, Yunxiao Ma, Shuming Shi, Ji-Rong Wen, Wei-Ying Ma.

In the Proceedings of the 16th international World Wide Web conference (WWW 2007).

 

·         Object-Level Vertical Search

Zaiqing Nie, Ji-Rong Wen, Wei-Ying Ma.

In the Third Biennial Conference on Innovative Data Systems Research (CIDR 2007, research paper).

 


2006

·         Simultaneous Record Detection and Attribute Labeling in Web Data Extraction

Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Wei-Ying Ma.

In the 12th International Conference on Knowledge Discovery and Data Mining  (SIGKDD 2006, full paper).

 

Honghua(Kathy) Dai, Zaiqing Nie, Lee Wang, Lingzhi Zhao, Ji-Rong Wen, Ying Li.

In Proceedings of the 15th international World Wide Web conference (WWW 2006, industry track).

  

·         Extracting Objects from the Web

Zaiqing Nie, Fei Wu, Ji-Rong Wen, Wei-Ying Ma.

In the 22nd International Conference on Data Engineering (ICDE 2006, poster paper).

 


2005

Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Wei-Ying Ma.

In the 22nd International Conference on Machine Learning (ICML 2005).
 

·         Object-Level Ranking: Bringing Order to Web Objects.
Zaiqing Nie, Yuanzhi Zhang, Ji-Rong Wen, and Wei-Ying Ma.

In Proceedings of the 14th international World Wide Web conference (WWW 2005),

May 10-14, 2005, in Chiba, Japan.

·         Effectively mining and using coverage and overlap statistics for data integration.
Zaiqing Nie, Subbarao Kambhampati and Ullas Nambiar.
IEEE Transactions on Knowledge and Data Engineering (TKDE). Vol. 17, No. 5, May 2005.


2004

·         A Frequency-based Approach for Mining Coverage Statistics in Data Integration.
Zaiqing Nie and Subbarao Kambhampati
In Proceedings of the 20th International Conference on Data Engineering (ICDE 2004).
 

·         Optimizing Recursive Information Gathering Plans in EMERAC.
S. Kambhampati, E. Lambrecht, U. Nambiar, Z. Nie and G. Senthil.
Journal of Intelligent Information Systems. Volume 22, Number 2, March 2004.


2001 - 2003

·         BibFinder/StatMiner: Effectively Mining and Using Coverage and Overlap Statistics in Data Integration (system demo). 
Zaiqing Nie, Subbarao Kambhampati and Thomas Hernandez.
In Proceedings of the 29th International Conference on Very Large Data Bases (VLDB 2003).
 

·         Frequency-Based Coverage Statistics Mining for Data Integration.
Zaiqing Nie and Subbarao Kambhampati.
IJCAI 2003 Workshop on Information Integration on the Web.

·         Mining Coverage Statistics for Websource Selection in a Mediator.
Z. Nie, U. Nambiar, S. Lakshmi and S. Kambhampati.
ASU CSE TR 02-009 (A short version of this paper appears in Proceedings of CIKM 2002).
 

·         Scalable Delivery of Streaming Data on the Internet through Customizable Approximate Caching.
Zaiqing Nie and Wen-Syan Li. NEC Technical Report, 2002.
 

·         Mining Source Coverage Statistics for Data Integration.
Zaiqing Nie, Subbarao Kambhampati, Ullas Nambiar and Sreelakshmi Vaddi.
In 3rd ACM International Workshop on Web Information and Data Management (WIDM), Atlanta, Georgia, USA, November 2001.
 

·         Joint optimization of cost and coverage of query plans in data integration.
Zaiqing Nie and Subbarao Kambhampati.
In Proceedings of the 10th ACM International Conference on Information and Knowledge Management (CIKM), Atlanta, Georgia, November 2001.
 

·         AltAlt: Combining Graphplan and Heuristic State Search.
Biplav Srivastava, XuanLong Nguyen, Subbarao Kambhampati, Minh B. Do, Ullas Nambiar, Zaiqing Nie, Romeo Nigenda, Terry Zimmerman.
In AI Magazine, American Association for Artificial Intelligence, Fall 2001.
 


 

Professional Service

 

Deployed Object-Level Search Systems

EntityCube

EntityCube is a research prototype and test bed for exploring object-level search technologies. It automatically summarizes the Web for entities (such as people, locations, and organizations) with a substantial presence.

The need for collecting and understanding Web information about a real-world entity (such as a person or a product) is currently fulfilled manually through search engines. However, information about a single entity might appear in thousands of Web pages. Even if a search engine could find all the relevant Web pages about an entity, the user would need to sift through all these pages to get a complete view of the entity. EntityCube generates summaries of Web entities from billions of public Web pages that contain information about people, locations, and organizations, and allows for exploration of their relationships. For example, users can use EntityCube to find an automatically generated biography page and social-network graph for a person, and use it to discover a relationship path between two people.

Please note that we are still working on improving accuracy on the key machine learning problems, including entity extraction, name disambiguation, entity ranking, and relationship extraction, as well as looking at better ways of incorporating user feedback. Some of the known potential problems include:

·         The prototype currently contains only information extracted from 3 billion Web pages, so some information for people with a substantial Web presence may still be missing from our index;

·         Some names and relationships could be incorrect, and the information may not be up to date;

·         Name disambiguation is still largely unsolved. Some people with popular or common names may find that their information has been mixed with that of other people with the same name;

·         Some of the summarization features are currently available only for people. We are working on these for other entities.

 

Renlifang (人立方)

Renlifang is the Chinese version of EntityCube (the name EntityCube is the English translation of Renlifang). It currently receives millions of daily page views on peak days.

 

Libra Academic Search

Using our object-level search technologies, we have created the Libra academic search engine to facilitate the exchange of ideas and communication between academic communities. A user entering search queries in Libra can retrieve relevant information on academic papers, scientists, conferences, journals, and interest groups, and so gets more accurate, relevant, and efficient results than with document-level ranking. Features of this search engine include the ability to:

·         Find top scientists, conferences, and journals in a specific field;

·         Locate top research papers;

·         Identify rising stars or hot topics in your field.

 

Work Experiences

Research Links

Web Search and Mining Group

ASU's Yochan Database Group
YOCHAN