Enhancing Single-Document Summarization by Combining RankNet and Third-Party Sources

We present a new approach to automatic summarization based on neural nets, called NetSum. We extract a set of features from each sentence that helps identify its importance in the document. We apply novel features based on news search query logs and Wikipedia entities. Using the RankNet learning algorithm, we train a pair-based sentence ranker to score every sentence in the document and identify the most important sentences. We apply our system to documents gathered from CNN.com, where each document includes highlights and an article. Our system significantly outperforms the standard baseline in the ROUGE-1 measure on over 70% of our document set.
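
The pair-based ranking idea in the abstract can be illustrated with a small sketch. This is not the NetSum implementation: it swaps the neural net for a linear scorer to stay short, and the two features (a first-sentence flag and a title-overlap value), the toy sentences, and the names `score`, `pairwise_loss`, and `train` are all hypothetical. It only shows the RankNet-style pairwise logistic loss, which pushes the score of the more important sentence in each pair above the score of the less important one.

```python
import math

def score(weights, features):
    # Linear scorer (an assumption for brevity; the paper uses a neural net).
    return sum(w * x for w, x in zip(weights, features))

def pairwise_loss(s_hi, s_lo):
    # RankNet-style cross-entropy when the target says "hi outranks lo":
    # log(1 + exp(-(s_hi - s_lo))).
    return math.log1p(math.exp(-(s_hi - s_lo)))

def train(pairs, n_features, lr=0.1, epochs=200):
    # Plain stochastic gradient descent over (more_important, less_important) pairs.
    weights = [0.0] * n_features
    for _ in range(epochs):
        for hi, lo in pairs:
            diff = score(weights, hi) - score(weights, lo)
            # Derivative of the loss with respect to the score difference.
            grad = -1.0 / (1.0 + math.exp(diff))
            for i in range(n_features):
                weights[i] -= lr * grad * (hi[i] - lo[i])
    return weights

# Hypothetical feature vectors: [is_first_sentence, overlap_with_title].
sentences = {
    "s1": [1.0, 0.8],
    "s2": [0.0, 0.5],
    "s3": [0.0, 0.1],
}
# Each pair lists the sentence that should rank higher first.
pairs = [
    (sentences["s1"], sentences["s2"]),
    (sentences["s1"], sentences["s3"]),
    (sentences["s2"], sentences["s3"]),
]

weights = train(pairs, n_features=2)
ranked = sorted(sentences, key=lambda k: score(weights, sentences[k]), reverse=True)
print(ranked)
```

After training, every sentence in a document is scored once and the top-scoring sentences are taken as the summary candidates, which mirrors the "score every sentence" step described above.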

In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

Publisher: Association for Computational Linguistics
All copyrights reserved by ACL 2007

Details

Type: Inproceedings