
What is it?

NeedleSeek is a project of the Knowledge Computing group at Microsoft Research Asia for open-domain semantic mining and serving. In this project, we:

    • Mine open-domain semantic knowledge from web-scale data sources;
    • Answer and serve user requests based on the mined semantic knowledge.

Our fundamental goal in this project is to explore how, and to what extent, a computer system can understand the world and the meaning of text in order to better meet the information needs of users.

In this online research prototype, a web interface lets users search and browse the semantic knowledge base we have built (via the “Semantic Card” and “Semantic Map” tabs).

Semantic Card:

On the “Semantic Card” tab, we show the mapping from the input word or phrase to one or more concepts, with each card representing a concept. For example, the word “apple” can be mapped to the company Apple, the fruit apple, and the apple tree. As another example, the phrase “Harry Potter” can represent a movie, a book, a character, a game, etc. In our prototype, only the top three concepts of a term are shown at the moment. Each card shows the following information about its concept: labels, attributes, key sentences, and related concepts.
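
As an illustration only, the term-to-concept mapping described above could be modeled as follows. This is a minimal sketch and not NeedleSeek's actual implementation: the `ConceptCard` class, the `top_cards` function, the relevance scores, and the sample index are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ConceptCard:
    """One card: a single concept a term can refer to (hypothetical model)."""
    name: str
    score: float  # assumed relevance of this concept to the term
    labels: List[str] = field(default_factory=list)
    attributes: List[str] = field(default_factory=list)
    key_sentences: List[str] = field(default_factory=list)
    related_concepts: List[str] = field(default_factory=list)


def top_cards(index: Dict[str, List[ConceptCard]], term: str, k: int = 3) -> List[ConceptCard]:
    """Return up to k highest-scoring cards for a term (case-insensitive lookup)."""
    cards = index.get(term.lower(), [])
    return sorted(cards, key=lambda card: card.score, reverse=True)[:k]


# Made-up sample data; the scores are invented for illustration.
index = {
    "harry potter": [
        ConceptCard("Harry Potter (game)", 0.3, labels=["game"]),
        ConceptCard("Harry Potter (book)", 0.9, labels=["book"]),
        ConceptCard("Harry Potter (character)", 0.6, labels=["character"]),
        ConceptCard("Harry Potter (movie)", 0.8, labels=["movie"]),
    ]
}

for card in top_cards(index, "Harry Potter"):
    print(card.name, card.labels)
```

With four candidate concepts in the sample index, only the three with the highest assumed scores are returned, mirroring the prototype's top-three limit.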

Semantic Map:

On the “Semantic Map” tab, concepts are organized into semantic categories (or semantic classes), and the semantic relations between concepts are shown. For example, {apple, orange, banana…} is a semantic class of fruits. We pay special attention to semantic classes because the concepts in one semantic class tend to share similar semantic characteristics (for example, they have similar attributes). Only the top three semantic categories are shown for (the concepts of) a term in our prototype at the moment.
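
A semantic class can be thought of as a named set of concepts. The sketch below is a hypothetical illustration, not the project's data model: the `semantic_classes` dictionary, the `classes_of` and `peers` functions, and the "companies" class with its members are assumptions made for the example.

```python
from typing import Dict, List, Set

# Hypothetical sample: semantic class name -> member concepts.
semantic_classes: Dict[str, Set[str]] = {
    "fruits": {"apple", "orange", "banana"},
    "companies": {"apple", "microsoft"},  # invented for illustration
}


def classes_of(concept: str, classes: Dict[str, Set[str]]) -> List[str]:
    """List the names of all classes that contain the concept."""
    return sorted(name for name, members in classes.items() if concept in members)


def peers(concept: str, classes: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each class containing the concept to the class's other members."""
    return {
        name: sorted(members - {concept})
        for name, members in classes.items()
        if concept in members
    }


print(classes_of("apple", semantic_classes))
print(peers("banana", semantic_classes))
```

An ambiguous term such as "apple" belongs to more than one class, while its peers within each class are the concepts expected to share similar attributes.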


Research prototype (language: English):

Related Research Papers

1. Unsupervised Template Mining for Semantic Category Understanding.
    Lei Shi, Shuming Shi, Chin-Yew Lin, Yi-Dong Shen, and Yong Rui
    EMNLP 2014.

2. Ensemble Semantics for Large-scale Unsupervised Relation Extraction.
    Bonan Min, Shuming Shi, Ralph Grishman, and Chin-Yew Lin
    EMNLP-CoNLL 2012.

3. Nonlinear Evidence Fusion and Propagation for Hyponymy Relation Mining.
    Fan Zhang, Shuming Shi, Jing Liu, Shuqi Sun, and Chin-Yew Lin

4. Corpus-based Semantic Class Mining: Distributional vs. Pattern-Based Approaches.
    Shuming Shi, Huibin Zhang, Xiaojie Yuan, and Ji-Rong Wen
    In the 23rd International Conference on Computational Linguistics (COLING'10).

5. Comparable Entity Mining from Comparative Questions.
    Shasha Li, Chin-Yew Lin, Young-In Song, and Zhoujun Li
    In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL'10), 2010.

6. Employing Topic Models for Pattern-based Semantic Class Discovery.
    Huibin Zhang, Mingjie Zhu, Shuming Shi, and Ji-Rong Wen
    In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL'09), Singapore, August 2009.

7. Pattern-based Semantic Class Discovery with Multi-Membership Support.
    Shuming Shi, Xiaokang Liu, and Ji-Rong Wen
    In ACM 17th Conference on Information and Knowledge Management (CIKM'08), Napa Valley, California, USA, 2008 (Poster).