Unsupervised extractive multi-document summarization method based on transfer learning from BERT multi-task fine-tuning

Salima Lamsiyah, Abdelkader El Mahdaouy, Saïd El Alaoui Ouatik, Bernard Espinasse
DOI: https://doi.org/10.1177/0165551521990616
2021-02-15
Journal of Information Science
Abstract: Text representation is a fundamental cornerstone that impacts the effectiveness of several text summarization methods. Transfer learning using pre-trained word embedding models has shown promising results. However, most of these representations do not consider the order of, or the semantic relationships between, the words in a sentence, and thus they do not carry the meaning of a full sentence. To overcome this issue, the current study proposes an unsupervised method for extractive multi-document summarization based on transfer learning from the BERT sentence embedding model. Moreover, to improve sentence representation learning, we fine-tune the BERT model on supervised intermediate tasks from the GLUE benchmark datasets using single-task and multi-task fine-tuning methods. Experiments are performed on the standard DUC’2002–2004 datasets. The obtained results show that our method significantly outperforms several baseline methods and achieves comparable, and sometimes better, performance than recent state-of-the-art deep learning–based methods. Furthermore, the results show that fine-tuning BERT using multi-task learning considerably improves the performance.
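To make the idea of unsupervised extractive selection over sentence embeddings concrete, the following is a minimal sketch and not the authors' scoring function, which the abstract does not specify. It assumes each sentence of the document cluster has already been mapped to a fixed-size vector (for example by a fine-tuned BERT encoder), ranks sentences by cosine similarity to the cluster centroid, and skips near-duplicates. The names `select_sentences` and `redundancy_threshold`, and the centroid heuristic itself, are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def select_sentences(embeddings, k, redundancy_threshold=0.95):
    """Pick up to k sentence indices for an extractive summary.

    Assumption: `embeddings` holds one precomputed vector per sentence.
    Sentences are ranked by similarity to the cluster centroid (a generic
    relevance heuristic), and a candidate is skipped when it is too similar
    to a sentence that was already chosen.
    """
    center = centroid(embeddings)
    ranked = sorted(
        range(len(embeddings)),
        key=lambda i: cosine(embeddings[i], center),
        reverse=True,
    )
    chosen = []
    for i in ranked:
        if len(chosen) == k:
            break
        if all(cosine(embeddings[i], embeddings[j]) < redundancy_threshold for j in chosen):
            chosen.append(i)
    return sorted(chosen)  # restore original sentence order
```

In this sketch the returned indices are sorted so that the selected sentences appear in their original order; any real system would also need a summary length budget in words rather than a fixed sentence count.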
computer science, information systems; information science & library science