A New Method for Extractive Text Summarization Using Neural Networks

Chowdhury, Sohini Roy; Sarkar, Kamal
DOI: https://doi.org/10.1007/s42979-023-01806-0
2023-05-10
SN Computer Science
Abstract: Summarization aims at extracting the salient information from a document and presenting it in a condensed form. Most existing methods for extractive text summarization generate a summary from a document using a two-stage process. In the first stage, the sentences are ranked by their saliency scores. In the second stage, summary generation starts with the top-ranked sentence and then considers the remaining sentences one by one from the ranked list. To improve summary diversity, a sentence is included in the summary only if it is sufficiently dissimilar from the already selected sentences. Sentence selection continues until the summary reaches the desired length. The second stage is greedy in nature and uses a predefined similarity threshold to check the dissimilarity of a sentence from the already selected sentences. Because this fixed similarity threshold is manually tuned, the approach in most cases fails to manage the diversity in a summary. This article proposes a summarization approach that uses a neural network-based learning model that learns to include a sentence in a summary by taking into account both the saliency of the sentence and the diversity in the summary. For this purpose, the model is trained using two types of features: saliency features and diversity features. We have evaluated the proposed approach on two open benchmark datasets: the DUC dataset and the Daily Mail dataset. Experimental results show that the proposed neural summarization approach is effective in producing better non-redundant, informative summaries and outperforms many of the existing summarization approaches to which it is compared.
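The conventional two-stage baseline that the abstract criticizes can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `cosine` and `greedy_select`, the bag-of-words cosine similarity, the sentence-count length limit, and the default threshold of 0.5 are all assumptions chosen for the example. The saliency scores are taken as given, since the abstract does not say how they are computed.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_select(sentences, scores, max_sentences, sim_threshold=0.5):
    """Stage 2 of the baseline: greedy selection with a fixed threshold.

    `scores` holds precomputed stage-1 saliency scores, one per sentence.
    `sim_threshold` is the manually tuned value the abstract refers to.
    """
    vectors = [Counter(s.lower().split()) for s in sentences]
    # Stage 1 result: sentence indices ranked by saliency, highest first.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)

    chosen = []
    for i in ranked:
        if len(chosen) >= max_sentences:
            break
        # Keep the sentence only if it is dissimilar to every selected one.
        if all(cosine(vectors[i], vectors[j]) < sim_threshold for j in chosen):
            chosen.append(i)

    # Assumption: present the summary in original document order.
    return [sentences[i] for i in sorted(chosen)]
```

With a threshold of 0.5, a near-duplicate of the top-ranked sentence is skipped in favor of a lower-scored but different sentence. With a very permissive threshold, the near-duplicate is kept. This sensitivity to one hand-set value is the weakness that the proposed model addresses by learning the inclusion decision from saliency and diversity features.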