EntityCube

EntityCube is an entity search and summarization engine that automatically summarizes the Web for long-tail entities, not just celebrities. The project's Chinese name is Renlifang.

Collecting and understanding Web information about a real-world entity (such as a person or a product) is currently done manually through search engines. But the information about a single entity might be spread across thousands of Web pages. Even if a search engine could find every relevant page about an entity, the user would still need to sift through all of them to get a complete view of it. EntityCube is an entity search and summarization system that efficiently generates summaries of Web entities from billions of crawled Web pages. The summarized information is used to build an object-level search engine for people, locations, and organizations and to explore the relationships among them.

Specifically, EntityCube automatically generates:

  • A biography page for a person.
  • A social-network graph for a person.
  • A shortest-relationship path between two people.
  • All titles of a person that are found on the Web.
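
The shortest-relationship path in the list above can be pictured as a shortest-path search over a graph of people. The sketch below is an illustration only, not EntityCube's implementation: it assumes an unweighted, undirected relationship graph held in a plain Python dict and uses breadth-first search. The function name `shortest_relationship_path` and the sample names are hypothetical.

```python
from collections import deque


def shortest_relationship_path(graph, source, target):
    """Return the shortest chain of people linking source to target, or None.

    Assumption: `graph` maps each person to the people they are directly
    related to, and every relationship counts as one hop (unweighted).
    """
    if source == target:
        return [source]
    parents = {source: None}  # person -> person we reached them from
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for neighbor in graph.get(person, ()):
            if neighbor in parents:
                continue  # already reached by a path that is no longer
            parents[neighbor] = person
            if neighbor == target:
                # Walk the parent links back to the source, then reverse.
                path = [target]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None  # the two people are not connected


# Hypothetical sample data, not taken from EntityCube.
graph = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Alice", "Dave"],
    "Carol": ["Alice", "Dave"],
    "Dave": ["Bob", "Carol", "Erin"],
    "Erin": ["Dave"],
}
print(shortest_relationship_path(graph, "Alice", "Erin"))
```

Breadth-first search fits this case because every relationship counts as a single hop. If relationships carried strengths or confidence scores, a weighted search such as Dijkstra's algorithm would be the natural replacement.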

 

Related Systems

Renlifang (人立方关系搜索):

Renlifang (http://renlifang.msra.cn) is the Chinese version of EntityCube (and the name EntityCube is the English translation of Renlifang). Renlifang has been well received by Chinese Internet users and by mainstream media in China (including CCTV and Phoenix TV), drawing positive comments and millions of daily page views on peak days.

Renlifang is a new generation of search engine, one that lets users navigate through search results and explore relationships between entities. In Renlifang, users can submit a query about any person, location, or organization and then explore its relationships. Renlifang applies automatic algorithms to more than 1 billion Chinese Web pages to extract entity information and detect relationships, covering both everyday individuals and well-known people, locations, and organizations. At this point Renlifang serves Chinese-language content only.

Libra Academic Search

Using the latest object-level technologies, we have created the Libra academic search engine to facilitate the exchange of ideas and communication among academic communities. A user who enters a search query in Libra can retrieve relevant information on academic papers, scientists, conferences, journals, and interest groups, which yields more accurate, relevant, and efficient results than document-level ranking. Features of this search engine include the ability to:

  • Find top scientists, conferences, and journals in a specific field;
  • Witness the growth and evolution of research communities;
  • Locate top research papers;
  • Identify rising stars or hot topics in your field.

 

Web Product Extractor

We extract metadata about real-world products from every product page on the Web by using a single information extraction model. Specifically, for each crawled Web page, we first use a classifier to decide whether it is a product page, and then extract the name, image, price, and description of each product from the pages detected as product pages.
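
The two-stage flow described above (classify the page, then extract fields) can be sketched as follows. This is a toy illustration under loud assumptions, not the project's extraction model: simple keyword and regular-expression heuristics stand in for both the classifier and the extractor, it handles only one product per page, and the names `Product`, `is_product_page`, and `extract_product` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Product:
    name: str
    image: Optional[str]
    price: Optional[str]
    description: Optional[str]


# Assumption: a price looks like a currency symbol followed by digits.
PRICE_RE = re.compile(r"[$€£]\s?\d+(?:[.,]\d{2})?")


def is_product_page(html: str) -> bool:
    """Stage 1: heuristic stand-in for the product-page classifier."""
    return bool(PRICE_RE.search(html)) and "add to cart" in html.lower()


def _first(pattern: str, html: str) -> Optional[str]:
    """Return the first captured group for pattern, or None if absent."""
    match = re.search(pattern, html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def extract_product(html: str) -> Optional[Product]:
    """Stage 2: extract fields only from pages that pass stage 1."""
    if not is_product_page(html):
        return None
    price = PRICE_RE.search(html)
    return Product(
        name=_first(r"<h1\b[^>]*>(.*?)</h1>", html) or "",
        image=_first(r'<img\b[^>]*\bsrc="([^"]+)"', html),
        price=price.group(0) if price else None,
        description=_first(r"<p\b[^>]*>(.*?)</p>", html),
    )


# Hypothetical sample pages for illustration.
product_html = (
    '<h1>Blue Kettle</h1><img src="kettle.jpg">'
    "<p>A 1.5 litre kettle.</p><span>$24.99</span>"
    "<button>Add to cart</button>"
)
other_html = "<h1>About us</h1><p>We sell things.</p>"

print(extract_product(product_html))
print(extract_product(other_html))
```

Running the classifier first means the extraction step is never applied to pages that are unlikely to contain a product, which keeps spurious records out of the results.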

 

Related Publications

  • StatSnowball: A Statistical Approach to Extracting Entity Relationships. Jun Zhu, Zaiqing Nie, Xiaojiang Liu, Bo Zhang, and Ji-Rong Wen. In Proceedings of the 18th International World Wide Web Conference (WWW 2009).
  • Web Object Retrieval. Zaiqing Nie, Yunxiao Ma, Shuming Shi, Ji-Rong Wen, and Wei-Ying Ma. In Proceedings of the 16th International World Wide Web Conference (WWW 2007).
  • Object-Level Vertical Search. Zaiqing Nie, Ji-Rong Wen, and Wei-Ying Ma. In the Third Biennial Conference on Innovative Data Systems Research (CIDR 2007, research paper).
  • Object-Level Ranking: Bringing Order to Web Objects. Zaiqing Nie, Yuanzhi Zhang, Ji-Rong Wen, and Wei-Ying Ma. In Proceedings of the 14th International World Wide Web Conference (WWW 2005), May 10-14, 2005, Chiba, Japan.