Zhongyuan Wang, Haixun Wang, Yanghua Xiao, and Ji-Rong Wen
Words and phrases associate with each other to form a semantic network. Characterizing such associations is a first step toward enabling machines to understand natural language. Psychologists and linguists have used concepts such as typicality and basic level conceptualization to characterize such associations. However, how to quantify these concepts remains an open problem. Recently, much work has focused on constructing semantic networks from web-scale text corpora, which makes it possible for the first time to analyze such networks using a data-driven approach. In this paper, we introduce measures such as typicality, basic level conceptualization, vagueness, ambiguity, and similarity to systematically characterize the associations in a semantic network. We use these measures as the basis for probabilistic semantic inferencing, which enables a wide range of applications such as word sense disambiguation and short text understanding. We conduct extensive experiments to show the effectiveness of the models and measures we introduce for the semantic network.