Stacked Denoising Variational Auto Encoder Model for Extractive Web Text Summarization

Yadav, Madhuri
DOI: https://doi.org/10.1007/s40998-024-00751-9
2024-09-14
Iranian Journal of Science and Technology Transactions of Electrical Engineering
Abstract:Extracting and concatenating distilled content from a corpus into a summary is a technique known as extractive summarization. In recent days, extractive summarization of web text has become popular due to the wide usage of social media. Hence various researches have been conducted on extractive summarization of web text, but the processing of huge amounts of web text and understanding the context is difficult due to the requirement of a lot of storage and time. To solve this issue, the continuous bag of words text vectorization model has been used that reduce the processing time by producing a distributed combination of words in vector arrangement. Moreover, the polysemous words are unable to be captured, which makes extraction difficult. Hence a novel Hierarchical Attention pointer Stacked Denoising Variational Autoencoder Model has been proposed in which the SDVAE model forms latent distribution for contextualized words and bidirectional attention mechanism extracts keywords and features from sentences thereby capturing polysemic words. Furthermore, the summary is obtained with dangling anaphora whereas antecedent morphological expression and verb referents are not considered in the summary. Hence a novel Multilayered Competitive Probable Modular Perception Model has been proposed in which the competitive layer scores the sentence and the scored sentences are ranked using string kernel and class conditional probability thereby considering the antecedent morphological expression and then, Graph based Quadruplicate Lexicon Summarization is used that forms quadruplicate lexicon chain in graphical format to eliminate dangling anaphoric expressions. The experimental results obtained show that the proposed model has achieved a comparatively high accuracy of 98.3% and recall, precision, and F-measure of 98%.
engineering, electrical & electronic
What problem does this paper attempt to address?