Towards Accurate Distant Supervision for Relational Facts Extraction

Xingxing Zhang, Jianwen Zhang, Junyu Zeng, Jun Yan, Zheng Chen, and Zhifang Sui

Abstract

Distant supervision (DS) is an appealing learning method that uses existing relational facts to extract more facts from a text corpus. However, its accuracy is still unsatisfactory. In this paper, we identify and analyze several critical factors in DS that have a great impact on accuracy: valid entity type detection, negative training example construction, and ensembles. We propose an approach to handle these factors. By experimenting on Wikipedia articles to extract facts in Freebase (the top 92 relations), we show the impact of these three factors on the accuracy of DS and the remarkable improvement achieved by the proposed approach.

Details

Publication type: Inproceedings
Published in: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics
Publisher: Association for Computational Linguistics