Text Summarization Based on Sentence Selection with Semantic Representation

Chi Zhang,Lei Zhang,Chong-Jun Wang,Junyuan Xie
DOI: https://doi.org/10.1109/ICTAI.2014.93
2014-01-01
Abstract:Text summarization is of great importance to solve information overload. Salience and coverage are two most important issues for summaries. Most existing models extract summaries by selecting the top sentences with highest scores without using the relationships between sentences, and usually represent the sentences simply basing on lexical or statistical features. As a result, those models can not achieve salience or coverage very well. In this paper, we propose a novel summarization model called Sentence Selection with Semantic Representation (SSSR). SSSR ensures both salience and coverage by learning semantic representations for sentences and applying a well-designed selection strategy to select summary sentences. The selection strategy used in SSSR is to select sentences that can reconstruct the original document with least distortion by means of linear combination. Besides, we improve our selection strategy by reducing redundant information. Then we learn two semantic representations for sentences: (1) weighted mean of word embeddings, (2) deep coding. Both of them are semantic and compact, and can capture similarities between sentences. Extensive experiments on datasets DUC2006 and DUC2007 validate our model.
What problem does this paper attempt to address?