Combining Temporal Event Relations and Pre-Trained Language Models for Text Summarization

Divyanshu Daiya
DOI: https://doi.org/10.1109/ICMLA51294.2020.00106
2020-12-01
Abstract: In this paper, we introduce a high-performing deep learning architecture for text summarization that uses pre-trained language models. Language model (LM) pre-training has delivered impressive performance on many language understanding tasks. However, how to use these pre-trained models efficiently for specific tasks remains largely open to experimentation. We propose a novel architecture (ENEMAbst) that effectively utilizes the pre-trained LM MASS for summarization. We also propose that using temporal relations across event representations can significantly improve language generation and inference. We demonstrate experimentally that using events provides a more in-depth, comprehensive understanding of the text. We show how MASS, together with event representations, can be used for summarization. We couple our Event-Network-Encoders (ENE) with MASS-based document-level encoders (MASSEnc) by attention-augmenting MASSEnc with ENE. We devise a fine-tuning schedule to train our pre-trained encoders and pre-trained decoders with ENE efficiently. We show empirically that our model outperforms state-of-the-art baseline methods for abstractive and extractive summarization. On ROUGE-1, our model improves over the best existing models by 0.85 points (from 43.331 to 44.179) for abstractive summarization and by 0.63 points (from 43.85 to 44.48) for extractive summarization.
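For readers unfamiliar with the metric, the following is a minimal sketch of a ROUGE-1 style score (unigram-overlap F1), followed by a check of the improvement arithmetic quoted in the abstract. It is an illustration only: the function name `rouge1_f1`, the whitespace tokenization, and the choice of F1 are assumptions made here, not details taken from the paper, and the official ROUGE toolkit applies additional preprocessing such as stemming.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary.

    Hypothetical helper: lowercases and splits on whitespace, which is a
    simplification of what standard ROUGE implementations do.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each unigram counts at most as often as in the reference.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Improvement figures as stated in the abstract (ROUGE-1 points).
abstractive_gain = round(44.179 - 43.331, 3)  # 0.848, reported as 0.85
extractive_gain = round(44.48 - 43.85, 2)     # 0.63
```

The abstractive gain works out to 0.848 points, which the abstract rounds to 0.85; the extractive gain is 0.63 points as stated.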
Computer Science