A topic modeling‐based bibliometric exploration of automatic summarization research
Xieling Chen, Haoran Xie, Xiaohui Tao, Lingling Xu, Jingjing Wang, Hong‐Ning Dai, Fu Lee Wang
DOI: https://doi.org/10.1002/widm.1540
2024-04-27
WIREs Data Mining and Knowledge Discovery
Abstract: This work presents a topic modeling‐based bibliometric analysis of the scientific output on automatic summarization from 2010 to 2022, covering research topics and trends, top sources, countries/regions, institutions, researchers, and scientific collaborations. The surge in text data has driven extensive research into diverse automatic summarization approaches for handling vast amounts of textual information. Several reviews of this topic exist, yet no large‐scale quantitative analysis has been conducted. To provide a comprehensive overview of the field, this study analyzes 3108 papers published during that period. We identify the following trends. First, the number of papers has grown by 65%, with the majority published in computer science conferences. Second, Asian countries and institutions are active in this field and show a strong inclination toward inter‐regional international collaboration; China and India in particular contribute more than 24% and 20% of the output, respectively. Third, researchers show a high level of interest in multihead and attention mechanisms, graph‐based semantic analysis, and topic modeling and clustering techniques, each with a prevalence of over 10%. Finally, scholars have become increasingly interested in self‐supervised and zero/few‐shot learning, multihead and attention mechanisms, and temporal analysis and event detection. This study helps scholars and practitioners understand the current hotspots and future directions in automatic summarization. This article is categorized under: Algorithmic Development > Text Mining