A Text Abstraction Summary Model Based on BERT Word Embedding and Reinforcement Learning

Qicai Wang, Peiyu Liu, Zhenfang Zhu, Hongxia Yin, Qiuyue Zhang, Lindong Zhang
DOI: https://doi.org/10.3390/app9214701
2019-11-04
Applied Sciences
Abstract: As a core task of natural language processing and information retrieval, automatic text summarization is widely applied in many fields. There are currently two main approaches to the text summarization task: abstractive and extractive. Building on both, we propose a novel hybrid extractive-abstractive model that combines BERT (Bidirectional Encoder Representations from Transformers) word embedding with reinforcement learning. First, we convert the human-written abstractive summaries into ground-truth labels. Second, we use BERT word embedding as the text representation and pre-train the two sub-models separately. Finally, the extraction network and the abstraction network are bridged by reinforcement learning. To verify the performance of the model, we compare it with currently popular automatic text summarization models on the CNN/Daily Mail dataset, using the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metrics for evaluation. Extensive experimental results show a clear improvement in the accuracy of the model.
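The abstract does not spell out how the human-written summaries are converted into ground-truth labels, nor which ROUGE variants are reported. As a rough illustration only, the sketch below assumes a common heuristic: each summary sentence marks the document sentence with the highest ROUGE-N recall against it as "extract". The function names (`rouge_n_recall`, `extraction_labels`), the whitespace tokenization, and the labeling rule are all assumptions made for this example, not the authors' implementation.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Simplified ROUGE-N recall: the fraction of reference n-grams that
    also appear in the candidate (clipped counts, whitespace tokens)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def extraction_labels(doc_sentences, summary_sentences):
    """Hypothetical labeling rule (an assumption, not from the paper):
    for each summary sentence, label the best-matching document
    sentence 1; all other sentences keep label 0."""
    if not doc_sentences:
        return []
    labels = [0] * len(doc_sentences)
    for summ in summary_sentences:
        scores = [rouge_n_recall(sent, summ) for sent in doc_sentences]
        best = max(range(len(doc_sentences)), key=scores.__getitem__)
        labels[best] = 1
    return labels


if __name__ == "__main__":
    doc = [
        "the cat sat on the mat",
        "stocks fell sharply on monday",
        "the weather was mild",
    ]
    summary = ["stocks fell on monday"]
    print(extraction_labels(doc, summary))
```

Labels of this kind could serve as targets when pre-training an extraction network, while the same overlap score, computed between a generated summary and the reference, is the sort of signal a reinforcement learning stage might use as a reward. Both uses are stated here as plausible readings of the abstract rather than confirmed details.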