Abstract: Generating high-quality summaries for chat dialogs often requires large labeled datasets. We propose a method to efficiently use unlabeled data for extractive summarization of customer-agent dialogs. In our method, we frame summarization as a question-answering problem and use state-of-the-art large language models (LLMs) to generate pseudo-labels for a dialog. We then use these pseudo-labels to fine-tune a chat summarization model, effectively transferring knowledge from the LLM into a smaller specialized model. We demonstrate our method on the TweetSumm dataset, and show that using 10% of the original labeled dataset we can achieve 65.9/57.0/61.0 ROUGE-1/-2/-L, whereas the current state-of-the-art trained on the entire training dataset obtains 65.16/55.81/64.37 ROUGE-1/-2/-L. In other words, in the worst case (i.e., ROUGE-L) we still effectively retain 94.7% of the performance while using only 10% of the data.
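The retention figure can be checked directly from the scores quoted in the abstract. The short sketch below divides each 10%-data score by the corresponding full-data state-of-the-art score; the variable names are illustrative and not taken from the paper. The ROUGE-L ratio comes out at roughly 0.9476, which the abstract reports as 94.7%, while the ROUGE-1 and ROUGE-2 ratios are slightly above 1.

```python
# ROUGE scores quoted in the abstract (ROUGE-1, ROUGE-2, ROUGE-L).
# Dictionary names are illustrative, not from the paper.
ours_10pct = {"ROUGE-1": 65.9, "ROUGE-2": 57.0, "ROUGE-L": 61.0}
sota_full = {"ROUGE-1": 65.16, "ROUGE-2": 55.81, "ROUGE-L": 64.37}

# Fraction of full-data performance retained, per metric.
retention = {m: ours_10pct[m] / sota_full[m] for m in ours_10pct}

# The "worst case" is the metric with the lowest retained fraction.
worst_metric = min(retention, key=retention.get)

for metric, ratio in retention.items():
    print(f"{metric}: {ratio:.4f}")
print("worst case:", worst_metric)
```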

What problem does this paper attempt to address?

The paper addresses how to train a high-quality extractive summarization model for customer-support dialogs when only a small amount of labeled data is available, by making efficient use of unlabeled data. Specifically, it proposes a semi-supervised method: a large language model (LLM) generates pseudo-labels for unlabeled dialogs, and those pseudo-labels are then used to fine-tune a smaller specialized model, effectively transferring the LLM's knowledge into that model. The goal is to reduce dependence on large labeled datasets while matching or approaching the performance of current state-of-the-art methods.

The main contributions of the paper include:

- Introducing a semi-supervised extractive summarization method that distills knowledge from general-purpose large language models into smaller specialized summarization models.
- Demonstrating that, with only 10% of the labeled data, this method achieves ROUGE scores comparable to or better than those of the current state-of-the-art model trained on all of the labeled data.
- Using large language models such as GPT-3.5 effectively for weak supervision, by framing the extractive summarization task as a question-answering problem.

The method improves model performance when labeled data is scarce and also reduces the cost of constructing large labeled datasets, which matters for practical applications.
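The pseudo-labeling step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the prompt wording, the function names (`build_prompt`, `parse_answer`, `pseudo_label`), the per-turn binary label format, and the `fake_llm` stand-in are all hypothetical. In practice `llm` would be a call to a real model such as GPT-3.5, and the resulting labels would feed the fine-tuning of the smaller summarizer.

```python
import re
from typing import Callable, List


def build_prompt(turns: List[str]) -> str:
    """Frame extractive summarization as a question over numbered dialog turns.

    The prompt wording is a hypothetical example, not the paper's prompt.
    """
    numbered = "\n".join(f"[{i}] {t}" for i, t in enumerate(turns))
    return (
        "Dialog:\n" + numbered + "\n\n"
        "Question: Which turns best summarize the customer's issue and the "
        "agent's response? Answer with turn numbers separated by commas."
    )


def parse_answer(answer: str, n_turns: int) -> List[int]:
    """Keep only valid, unique turn indices mentioned in the LLM answer."""
    picked: List[int] = []
    for token in re.findall(r"\d+", answer):
        idx = int(token)
        if idx < n_turns and idx not in picked:
            picked.append(idx)
    return picked


def pseudo_label(turns: List[str], llm: Callable[[str], str]) -> List[int]:
    """Return one 0/1 pseudo-label per turn (1 = turn belongs in the summary)."""
    picked = set(parse_answer(llm(build_prompt(turns)), len(turns)))
    return [1 if i in picked else 0 for i in range(len(turns))]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: the model replies with indices)."""
    return "0, 2"


dialog = [
    "Customer: My order arrived damaged.",
    "Agent: Sorry to hear that!",
    "Agent: We will ship a replacement today.",
]
labels = pseudo_label(dialog, fake_llm)
print(labels)
```

Constraining the answer to turn indices keeps the output extractive by construction and makes malformed LLM replies easy to filter before they reach the smaller model's training set.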

LLM aided semi-supervision for Extractive Dialog Summarization
