Corpus-based Semantic Class Mining: Distributional vs. Pattern-Based Approaches
- Shuming Shi
- Huibin Zhang
- Xiaojie Yuan
- Ji-Rong Wen
Proceedings of COLING 2010
The main approaches to corpus-based semantic class mining are distributional similarity (DS) and pattern-based (PB) methods. In this paper, we compare the two empirically on a publicly available dataset of 500 million web pages, using several categories of queries. We further propose a frequency-based rule for selecting the appropriate approach for different types of terms.
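
To make the contrast between the two families concrete, the following is a toy sketch and not the paper's implementation. It assumes DS means comparing terms by the cosine similarity of their context-word counts, and PB means grouping terms that are listed together after a lexical pattern. The function names (`context_vector`, `cosine`, `pattern_peers`), the window size of 2, and the single "such as" pattern are all illustrative assumptions.

```python
import math
import re
from collections import Counter


def context_vector(term, sentences, window=2):
    """DS side: count the words within `window` tokens of each occurrence of `term`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok == term:
                lo, hi = max(0, i - window), i + window + 1
                # Add every token in the window except the term itself.
                counts.update(t for j, t in enumerate(tokens[lo:hi], lo) if j != i)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def pattern_peers(sentences):
    """PB side: treat terms enumerated after 'such as' as members of one class."""
    groups = []
    for sentence in sentences:
        match = re.search(r"such as ([\w ,]+)", sentence.lower())
        if match:
            # Split on commas (optionally followed by 'and') or on a bare 'and'.
            items = re.split(r",\s*(?:and\s+)?|\s+and\s+", match.group(1))
            groups.append([item.strip() for item in items if item.strip()])
    return groups


corpus = [
    "i drink tea daily",
    "i drink coffee daily",
    "the dog barks loudly",
    "she likes fruits such as apples, pears, and plums",
]

tea = context_vector("tea", corpus)
coffee = context_vector("coffee", corpus)
dog = context_vector("dog", corpus)

print(cosine(tea, coffee))    # identical contexts, so similarity is close to 1
print(cosine(tea, dog))       # no shared context words
print(pattern_peers(corpus))  # one group of peer terms from the 'such as' sentence
```

In this sketch, DS needs enough occurrences of a term to build a meaningful context vector, whereas PB needs the term to appear in at least one matching pattern. That difference in data requirements is one plausible reason a frequency-based selection rule could be useful.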