A Systematic Survey of Text Summarization: From Statistical Methods to Large Language Models

Haopeng Zhang, Philip S. Yu, Jiawei Zhang
2024-06-17
Abstract: Text summarization research has undergone several significant transformations with the advent of deep neural networks, pre-trained language models (PLMs), and recent large language models (LLMs). This survey thus provides a comprehensive review of the research progress and evolution in text summarization through the lens of these paradigm shifts. It is organized into two main parts: (1) a detailed overview of datasets, evaluation metrics, and summarization methods before the LLM era, encompassing traditional statistical methods, deep learning approaches, and PLM fine-tuning techniques, and (2) the first detailed examination of recent advancements in benchmarking, modeling, and evaluating summarization in the LLM era. By synthesizing existing literature and presenting a cohesive overview, this survey also discusses research trends, open challenges, and proposes promising research directions in summarization, aiming to guide researchers through the evolving landscape of summarization research.
Computation and Language
What problem does this paper attempt to address?
### Problems the Paper Aims to Address

This paper aims to provide a comprehensive review of the research progress and evolution in the field of text summarization, focusing on the development from statistical methods to large language models (LLMs). Specifically, the paper seeks to address the following key issues:

1. **Systematic Review**:
   - Provide a comprehensive literature review covering the major development stages in text summarization: traditional statistical methods, deep learning methods, fine-tuning of pre-trained language models (PLMs), and current LLMs.
   - Demonstrate the evolution of text summarization research by comparing methods and techniques from different stages.
2. **Detailed Analysis**:
   - Conduct a detailed analysis of representative methods at each development stage, including datasets, evaluation metrics, and summarization methods.
   - Pay special attention to new advancements in the LLM era, including benchmarking, modeling, and evaluation methods.
3. **Research Trends and Challenges**:
   - Discuss current research trends and open challenges, especially in the LLM era.
   - Propose future research directions to further advance the field of text summarization.
4. **Filling the Gap**:
   - Existing reviews mostly focus on traditional statistical methods and deep learning-based methods and lack a comprehensive survey of the LLM era.
   - This paper aims to fill this gap by providing an integrated overview of the latest advancements.
5. **Guiding Researchers**:
   - Help researchers better understand the current state and development trends in text summarization through systematic review and analysis.
   - Provide practical guidelines for researchers conducting text summarization research in the evolving LLM era.

In summary, through systematic review and analysis, this paper aims to give researchers a comprehensive perspective that helps them understand and address the latest challenges and opportunities in text summarization.