EntityCube

EntityCube is a research prototype for exploring object-level search technologies, which automatically summarizes the Web for entities (such as people, locations and organizations) with a modest web presence. The Chinese-language version is called Renlifang.

Today, Web information about a real-world entity (such as a person or a product) is mostly collected and collated manually through search engines. However, information about a single entity might appear in thousands of Web pages. Even if a search engine could find all the relevant Web pages about an entity, the user would need to sift through all these pages to get a complete view of the entity. EntityCube generates summaries of Web entities from billions of public Web pages that contain information about people, locations, and organizations, and allows users to explore the relationships among them. For example, users can use EntityCube to find an automatically generated biography page and social-network graph for a person, or to discover a relationship path between two people.
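To make the idea of a "relationship path between two people" concrete, here is a minimal sketch using breadth-first search over a small social graph. It is not EntityCube's actual method, which this page does not describe; the function name `relationship_path`, the adjacency-list representation, and the sample people are all hypothetical and chosen only for illustration.

```python
from collections import deque


def relationship_path(graph, source, target):
    """Return a shortest chain of people linking source to target, or None.

    `graph` is a hypothetical adjacency list: person -> list of related people.
    """
    if source == target:
        return [source]
    parents = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for neighbor in graph.get(person, ()):
            if neighbor in parents:
                continue  # already reached by a path that is at least as short
            parents[neighbor] = person
            if neighbor == target:
                # Walk back through the parents to rebuild the path.
                path = [neighbor]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None  # no chain of relationships connects the two people


# Hand-written sample data (an assumption, not extracted from the Web).
graph = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Alice", "Dave"],
    "Carol": ["Alice", "Dave"],
    "Dave": ["Bob", "Carol", "Erin"],
    "Erin": ["Dave"],
}

print(relationship_path(graph, "Alice", "Erin"))
# ['Alice', 'Bob', 'Dave', 'Erin']
```

Breadth-first search is used here because it returns a path with the fewest intermediate people. A real system would presumably also need to weigh how reliable each extracted relationship is, which this sketch ignores.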

Please note that we are still working on improving accuracy on the key machine learning problems, including entity extraction, name disambiguation, entity ranking, and relationship extraction, and on finding a better way to incorporate user feedback. Known potential problems include:

  • The prototype currently contains information extracted from only 3 billion Web pages, so some information, particularly for people with a modest Web presence, may still be missing from our index;
  • Some names and relationships could be incorrect, and the information may not be up to date;
  • Name disambiguation is still largely unsolved. Some people with popular/common names may find that their information has been mixed with that of other people with the same name;
  • Some of the summarization features are currently available only for people. We are working on extending them to other entities.

 


 

Related Systems

Renlifang:

Renlifang (http://renlifang.msra.cn) is the Chinese version of EntityCube; the name EntityCube is the English translation of Renlifang. It currently receives millions of page views per day on peak days.

Libra Academic Search:

Using similar technologies, we have created the Libra academic search service (http://academic.research.microsoft.com) to facilitate the exchange of ideas and communication among academic communities. A user entering search queries in Libra can retrieve relevant information on academic papers, scientists, conferences, and journals; Libra thus generates more accurate, relevant, and efficient results than document-level ranking does. Features of this search service include the ability to:

  • Find top scientists, conferences, and journals in a specific field;
  • Locate top research papers;
  • Identify rising stars or hot topics in your field.

 

Related Publications