Minimally Supervised Learning of Semantic Knowledge from Query Logs

  • Mamoru Komachi
  • Hisami Suzuki

Proceedings of IJCNLP, Hyderabad, India

We propose a method for learning semantic categories of words with minimal supervision from web search query logs. Our method is based on the Espresso algorithm (Pantel and Pennacchiotti, 2006) for extracting binary lexical relations, but makes important modifications to handle query log data for the task of acquiring semantic categories. We present experimental results comparing our method with two state-of-the-art minimally supervised lexical knowledge extraction systems using Japanese query log data, and show that our method achieves higher precision than the previously proposed methods. We also show that the proposed method offers an additional advantage for knowledge acquisition in an Asian language for which word segmentation is an issue, as the method utilizes no prior knowledge of word segmentation, and is able to harvest new terms with correct word segmentation.
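To make the general idea concrete, the following is a minimal sketch of a bootstrapping loop over query strings, in the spirit of pattern-based extractors such as Espresso. It is not the method of the paper: the scoring here simply counts distinct co-occurring seeds and patterns, whereas Espresso uses PMI-based reliability scores, and the paper's modifications for query logs are not reproduced. The function names, the `#` slot symbol, and the toy English queries are all assumptions made for illustration. Patterns are matched on raw character strings, so no word segmentation is needed.

```python
from collections import defaultdict

SLOT = "#"  # hypothetical placeholder marking where an instance occurred


def extract_pairs(queries, instances):
    """Return (instance, pattern) pairs; a pattern is a query with the
    instance substring replaced by SLOT (pure string matching)."""
    pairs = []
    for q in queries:
        for inst in instances:
            if inst in q and q != inst:
                pairs.append((inst, q.replace(inst, SLOT, 1)))
    return pairs


def match_pattern(query, pattern):
    """Return the text filling the slot if the query fits the pattern, else None."""
    prefix, suffix = pattern.split(SLOT, 1)
    long_enough = len(query) > len(prefix) + len(suffix)
    if long_enough and query.startswith(prefix) and query.endswith(suffix):
        return query[len(prefix):len(query) - len(suffix)]
    return None


def bootstrap(queries, seeds, iterations=1, top_patterns=2, top_instances=1):
    """Alternate between ranking patterns and ranking candidate instances.

    Simplified scoring (an assumption, not Espresso's PMI-based reliability):
    a pattern is scored by how many known instances it occurs with, and a
    candidate by how many selected patterns it occurs with.
    """
    instances = set(seeds)
    for _ in range(iterations):
        # 1. Score patterns by the number of distinct known instances.
        support = defaultdict(set)
        for inst, pat in extract_pairs(queries, instances):
            support[pat].add(inst)
        ranked = sorted(support, key=lambda p: (-len(support[p]), p))
        patterns = ranked[:top_patterns]

        # 2. Score unseen slot fillers by the number of matching patterns.
        votes = defaultdict(set)
        for q in queries:
            for pat in patterns:
                filler = match_pattern(q, pat)
                if filler is not None and filler not in instances:
                    votes[filler].add(pat)

        # 3. Keep only the best-supported candidates to limit drift.
        best = sorted(votes, key=lambda c: (-len(votes[c]), c))[:top_instances]
        instances.update(best)
    return instances


# Toy query log (invented data, not from the paper).
QUERIES = [
    "tokyo hotels", "tokyo weather",
    "kyoto hotels", "kyoto weather",
    "osaka hotels", "osaka weather",
    "nagoya hotels", "cheap hotels", "pizza recipe",
]

result = bootstrap(QUERIES, {"tokyo", "kyoto"})
print(sorted(result))
```

On this toy log, one iteration adds only "osaka", because it is the sole candidate supported by both selected patterns. Running further iterations with the same settings would start admitting weakly supported fillers such as "cheap", which illustrates why a bootstrapping extractor needs careful scoring and filtering of generic patterns.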