HetTreeSum: A Heterogeneous Tree Structure-based Extractive Summarization Model for Scientific Papers

Jintao Zhao, Libin Yang, Xiaoyan Cai
DOI: https://doi.org/10.1016/j.eswa.2022.118335
IF: 8.5
2022-01-01
Expert Systems with Applications
Abstract: Scientific paper summarization aims to generate a short, concise digest while preserving the important information of the original document. Currently, scientific paper summarization faces two main challenges. First, inter-sentence relations are hard to learn, especially in long-form scientific papers. Second, the structural information of well-structured scientific papers has not been fully exploited. To overcome these two challenges, we propose a novel Heterogeneous Tree structure-based extractive Summarization (HetTreeSum) model. Each document is modeled as a tree structure to learn inter-sentence relations, and the structural information of the original document is incorporated, giving the tree structure a global perspective of the whole document. An iterative updating strategy is then presented to interactively refine the nodes of the tree structure for better contextualized representations, which can further enhance summarization performance. Experimental results on the PubMed and arXiv datasets show that the proposed HetTreeSum model achieves significantly better performance than various scientific paper summarization models.