Abstract: Automatic Text Summarization (ATS), utilizing Natural Language Processing (NLP) algorithms, aims to create concise and accurate summaries, thereby significantly reducing the human effort required in processing large volumes of text. ATS has drawn considerable interest in both academic and industrial circles. Many studies have been conducted in the past to survey ATS methods; however, they generally lack practicality for real-world implementations, as they often categorize previous methods from a theoretical standpoint. Moreover, the advent of Large Language Models (LLMs) has altered conventional ATS methods. In this survey, we aim to 1) provide a comprehensive overview of ATS from a "Process-Oriented Schema" perspective, which is best aligned with real-world implementations; 2) comprehensively review the latest LLM-based ATS works; and 3) deliver an up-to-date survey of ATS, bridging the two-year gap in the literature. To the best of our knowledge, this is the first survey to specifically investigate LLM-based ATS methods.

What problem does this paper attempt to address?

This paper surveys Automatic Text Summarization (ATS) from a process-oriented perspective and examines approaches based on Large Language Models (LLMs). Existing ATS surveys usually classify methods into theoretical categories, such as extractive or abstractive, and these categories do not map cleanly onto real-world implementations. The advent of LLMs has also altered conventional ATS methods. The main objectives of the paper are as follows:

1. Provide a comprehensive overview of ATS organized around a "Process-Oriented Schema", which aligns more closely with real-world implementations.
2. Review the latest LLM-based ATS work.
3. Deliver an up-to-date survey of ATS that bridges a two-year gap in the literature. To the authors' knowledge, it is the first survey to specifically investigate LLM-based ATS methods.

The paper points out that, with the rapid growth of the internet, the volume of textual data has made ATS a key technology for information processing. Unlike earlier surveys, it organizes its content according to the implementation process of ATS: data acquisition, preprocessing, modeling methods, and evaluation metrics. In addition, the paper discusses the impact of LLMs on ATS, as these models can significantly improve the accuracy and coherence of summaries. The paper reviews existing open-source datasets, analyzes their characteristics, and explores methods for creating new datasets, including rule-based and LLM-based annotation techniques. Finally, the paper summarizes the challenges and limitations in the ATS field and outlines directions for future research.

Overall, the study aims to provide a comprehensive roadmap for ATS that helps researchers and engineers understand and apply the relevant techniques.
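To make the process-oriented view concrete, the stages named above (preprocessing, modeling, evaluation) can be sketched as a tiny pipeline. This is an illustrative sketch only and is not taken from the paper: the word-frequency extractive scorer, the unigram-recall metric (a simplified stand-in for ROUGE-1 recall), and all function names (`split_sentences`, `tokenize`, `summarize`, `rouge1_recall`) are assumptions chosen for brevity.

```python
import re
from collections import Counter


def split_sentences(text):
    """Preprocessing: naive sentence split on ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def tokenize(text):
    """Preprocessing: lowercase and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def summarize(text, k=1):
    """Modeling (assumed baseline): pick the k sentences with the highest
    average word frequency and return them in their original order."""
    sentences = split_sentences(text)
    freq = Counter(tok for s in sentences for tok in tokenize(s))

    def score(sentence):
        toks = tokenize(sentence)
        return sum(freq[t] for t in toks) / len(toks) if toks else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return " ".join(sentences[i] for i in sorted(ranked[:k]))


def rouge1_recall(candidate, reference):
    """Evaluation (simplified): fraction of reference unigrams that also
    appear in the candidate, with counts clipped per token."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    total = sum(ref.values())
    overlap = sum(min(count, cand[tok]) for tok, count in ref.items())
    return overlap / total if total else 0.0


if __name__ == "__main__":
    document = "Cats sleep a lot. Cats and dogs are pets. Birds fly."
    summary = summarize(document, k=1)
    print(summary)
    print(rouge1_recall(summary, "Cats sleep."))
```

In a real system, each stage would be swapped for something stronger (for example, an LLM in the modeling stage and a full ROUGE implementation in the evaluation stage), but the stage boundaries stay the same.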

A Comprehensive Survey on Process-Oriented Automatic Text Summarization with Exploration of LLM-Based Methods

A Comprehensive Survey of Abstractive Text Summarization Based on Deep Learning

A comprehensive review of automatic text summarization techniques: method, data, evaluation and coding

A Systematic Survey of Text Summarization: From Statistical Methods to Large Language Models

Hierarchical Human-Like Deep Neural Networks for Abstractive Text Summarization

Enhancements of Attention-Based Bidirectional LSTM for Hybrid Automatic Text Summarization

LANS: Large-scale Arabic News Summarization Corpus

A Comparative Study of Quality Evaluation Methods for Text Summarization

Surveying the Landscape of Text Summarization with Deep Learning: A Comprehensive Review

A Survey for Biomedical Text Summarization: From Pre-trained to Large Language Models

An End-to-End Speech Summarization Using Large Language Model

Knowledge-guided Unsupervised Rhetorical Parsing for Text Summarization

Topic-Aware Abstractive Text Summarization

Joint learning of text alignment and abstractive summarization for long documents via unbalanced optimal transport

From Standard Summarization to New Tasks and Beyond: Summarization with Manifold Information

GATSum: Graph-Based Topic-Aware Abstract Text Summarization

A Survey on Neural Network-Based Summarization Methods

Automatic Text Summarization Methods: A Comprehensive Review

A Survey on Large Language Model based Autonomous Agents

A Survey of Automatic Source Code Summarization