EntityCube is a research prototype for exploring object-level search technologies. It automatically summarizes the Web for entities (such as people, locations, and organizations) with a modest Web presence. The Chinese-language version is called Renlifang.
Today, Web information about a real-world entity (such as a person or a product) is mostly collected and pieced together manually through search engines. However, information about a single entity might appear in thousands of Web pages. Even if a search engine could find all the relevant Web pages about an entity, the user would need to sift through all these pages to get a complete view of the entity. EntityCube generates summaries of Web entities from billions of public Web pages that contain information about people, locations, and organizations, and allows users to explore their relationships. For example, users can use EntityCube to find an automatically generated biography page and social-network graph for a person, or to discover a relationship path between two people.
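To make the idea of a relationship path concrete, here is a minimal sketch in Python. It assumes the extracted relationships are available as an undirected graph stored in a dictionary of adjacency lists, and it uses breadth-first search to find a shortest chain of people between two names. The function name `relationship_path`, the graph layout, and the sample names are all hypothetical illustrations; the source does not describe how EntityCube actually computes these paths.

```python
from collections import deque


def relationship_path(graph, source, target):
    """Return a shortest chain of names linking source to target, or None.

    `graph` maps each person to a list of directly related people.
    This layout is an assumption made for illustration only.
    """
    if source == target:
        return [source]

    parents = {source: None}  # also serves as the visited set
    queue = deque([source])

    while queue:
        person = queue.popleft()
        for neighbor in graph.get(person, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = person
            if neighbor == target:
                # Walk back through the parent links to rebuild the path.
                path = [neighbor]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)

    return None  # the two people are not connected in this graph


# Hypothetical toy graph; the names are invented.
toy_graph = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Alice", "Dave"],
    "Carol": ["Alice", "Dave"],
    "Dave": ["Bob", "Carol", "Erin"],
    "Erin": ["Dave"],
    "Frank": [],
}

print(relationship_path(toy_graph, "Alice", "Erin"))
print(relationship_path(toy_graph, "Alice", "Frank"))
```

Breadth-first search is chosen here because it returns a path with the fewest intermediate people. A system working over billions of pages would likely need something more scalable, such as a bidirectional search or precomputed indexes.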
Please note that we are still working to improve accuracy on the key machine learning problems, including entity extraction, name disambiguation, entity ranking, and relationship extraction, and we are looking at better ways of incorporating user feedback. Known potential problems include:
- The prototype currently contains information extracted from only 3 billion Web pages, so some information about people with a substantial Web presence may still be missing from our index;
- Some names and relationships could be incorrect, and the information may not be up to date;
- Name disambiguation is still largely unsolved. Some people with popular/common names may find that their information has been mixed with other people of the same name;
- Some of the summarization features are currently only available for people. We are currently working on these for other entities.
Renlifang (http://renlifang.msra.cn) is the Chinese version of EntityCube (the name EntityCube is the English translation of Renlifang). It currently receives millions of daily page views on peak days.
Using similar technologies, we have created the Libra academic search service (http://academic.research.microsoft.com) to facilitate the exchange of ideas and communication among academic communities. A user entering search queries in Libra can retrieve relevant information on academic papers, scientists, conferences, and journals. Libra ranks these objects directly, and thus generates more accurate, relevant, and efficient results than document-level ranking. Features of this search service include the ability to:
- Find top scientists, conferences, and journals in a specific field;
- Locate top research papers;
- Identify rising stars or hot topics in your field.
Related publications:
- StatSnowball: a Statistical Approach to Extracting Entity Relationships. Jun Zhu, Zaiqing Nie, Xiaojiang Liu, Bo Zhang, and Ji-Rong Wen. In Proceedings of the 18th International World Wide Web Conference (WWW 2009).
- Web Object Retrieval. Zaiqing Nie, Yunxiao Ma, Shuming Shi, Ji-Rong Wen, and Wei-Ying Ma. In Proceedings of the 16th International World Wide Web Conference (WWW 2007).
- Object-Level Vertical Search. Zaiqing Nie, Ji-Rong Wen, and Wei-Ying Ma. In the Third Biennial Conference on Innovative Data Systems Research (CIDR 2007, research paper).
- Object-Level Ranking: Bringing Order to Web Objects. Zaiqing Nie, Yuanzhi Zhang, Ji-Rong Wen, and Wei-Ying Ma. In Proceedings of the 14th International World Wide Web Conference (WWW 2005), May 10-14, 2005, Chiba, Japan.