IWM-LSTM encoder for abstractive text summarization

Ravindra Gangundi, Rajeswari Sridhar
DOI: https://doi.org/10.1007/s11042-024-19091-1
IF: 2.577
2024-04-12
Multimedia Tools and Applications
Abstract: Sequence-to-sequence models are fundamental building blocks for abstractive text summarization and can produce precise, coherent summaries. Recently proposed summarization models have aimed to improve performance through copying mechanisms, reinforcement learning, and multi-level encoders. However, little research has examined improving summarization output by modifying the structure of the long short-term memory (LSTM) cell. We introduce an improved version of LSTM called improved working memory LSTM (IWM-LSTM). IWM-LSTM removes the output gate and enhances the input and forget gates by incorporating cell state information into them. In our sequence-to-sequence model for text summarization, we replace the LSTM encoder with a bidirectional IWM-LSTM, which yields better summaries with minimal training time and lower computational cost. Additionally, we use Bidirectional Encoder Representations from Transformers (BERT) embeddings to improve the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score. The model is trained and tested on the CNN/DailyMail dataset. The proposed model achieves better ROUGE scores than state-of-the-art models.
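The abstract describes the cell only in words: no output gate, and input and forget gates that also see the cell state. The sketch below is one possible reading of that description, not the authors' formulation. It assumes peephole-style elementwise weights from the previous cell state into both gates and takes the hidden state as tanh of the new cell state. All function names, the parameter layout, and the initialization are hypothetical, and the BERT embeddings are stood in for by plain input vectors.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, x, U, h, b):
    """Return W @ x + U @ h + b for plain nested lists."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + b_k
        for w_row, u_row, b_k in zip(W, U, b)
    ]


def init_params(input_size, hidden_size, seed=0):
    """Small random weights for the forget (f), input (i) and candidate (g) blocks."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for name in ("f", "i", "g"):
        params[name] = {
            "W": mat(hidden_size, input_size),
            "U": mat(hidden_size, hidden_size),
            "b": [0.0] * hidden_size,
        }
    # Assumption: elementwise (peephole-style) weights feed c_{t-1} into both gates.
    for name in ("f", "i"):
        params[name]["p"] = [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
    return params


def cell_step(params, x, h_prev, c_prev):
    """One time step of the assumed cell; returns (h, c)."""

    def gate(name):
        blk = params[name]
        pre = affine(blk["W"], x, blk["U"], h_prev, blk["b"])
        # The gate sees the previous cell state as well as x and h_prev.
        return [sigmoid(a + p * c) for a, p, c in zip(pre, blk["p"], c_prev)]

    f = gate("f")
    i = gate("i")
    blk = params["g"]
    g = [math.tanh(a) for a in affine(blk["W"], x, blk["U"], h_prev, blk["b"])]
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # No output gate: the hidden state is taken directly from the cell state.
    h = [math.tanh(ck) for ck in c]
    return h, c


def encode(params, xs, hidden_size):
    """Run the cell over a sequence and collect the hidden state at each step."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    states = []
    for x in xs:
        h, c = cell_step(params, x, h, c)
        states.append(h)
    return states


def bidirectional_encode(fwd_params, bwd_params, xs, hidden_size):
    """Concatenate forward and backward hidden states position by position."""
    fwd = encode(fwd_params, xs, hidden_size)
    bwd = encode(bwd_params, xs[::-1], hidden_size)[::-1]
    return [f + b for f, b in zip(fwd, bwd)]
```

Under these assumptions the cell has three weight blocks instead of the four in a standard LSTM, which is consistent with the abstract's claim of lower computational cost. The paper itself should be consulted for the actual gate equations.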
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering