Abstract: With recent advancements in the area of Natural Language Processing, the focus is slowly shifting from a purely English-centric view towards more language-specific solutions, including German. Text summarization systems, which transform long input documents into compressed and more digestible summary texts, are especially practical for businesses that need to analyze their growing amount of textual data. In this work, we assess the particular landscape of German abstractive text summarization and investigate the reasons why practically useful solutions for abstractive text summarization are still absent in industry. Our focus is two-fold, analyzing a) training resources, and b) publicly available summarization systems. We are able to show that popular existing datasets exhibit crucial flaws in their assumptions about the original sources, which frequently lead to detrimental effects on system generalization and to evaluation biases. We confirm that for the most popular training dataset, MLSUM, over 50% of the training set is unsuitable for abstractive summarization purposes. Furthermore, available systems frequently fail to compare to simple baselines, and ignore more effective and efficient extractive summarization approaches. We attribute poor evaluation quality to a variety of different factors, which are investigated in more detail in this work: a lack of high-quality (and diverse) gold data considered for training, understudied (and untreated) positional biases in some of the existing datasets, and the lack of easily accessible and streamlined pre-processing strategies or analysis tools. We provide a comprehensive assessment of available models on the cleaned datasets, and find that this can lead to a reduction of more than 20 ROUGE-1 points during evaluation. 
The code for dataset filtering and reproducing results can be found online at https://github.com/dennlinger/summaries
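To make the evaluation terms above concrete, the following is a minimal sketch of a ROUGE-1 F1 score and a lead-k extractive baseline. It is not taken from the linked repository: the function names `tokenize`, `rouge1_f1`, and `lead_k` are hypothetical, the tokenizer and sentence splitter are deliberately naive, and the choice of lead-k as the "simple baseline" is an assumption, since the abstract does not name one.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and extract word tokens (assumption: no stemming or stopword removal)."""
    return re.findall(r"\w+", text.lower())


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate summary and a reference summary."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # clipped count of shared unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def lead_k(article: str, k: int = 3) -> str:
    """Hypothetical baseline: return the first k sentences of the article.

    Sentences are split naively after '.', '!' or '?' followed by whitespace.
    """
    sentences = re.split(r"(?<=[.!?])\s+", article.strip())
    return " ".join(sentences[:k])
```

Under these assumptions, scoring `lead_k(article)` against the gold summary gives a reference point that an abstractive system would be expected to beat. A dataset with strong positional bias tends to make such a baseline look unusually strong. ROUGE is commonly reported on a 0 to 100 scale, so the value returned here would be multiplied by 100 for comparison.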

Mevaker: Conclusion Extraction and Allocation Resources for the Hebrew Language

HeSum: a Novel Dataset for Abstractive Text Summarization in Hebrew

LexSumm and LexT5: Benchmarking and Modeling Legal Summarization Tasks in English

HebDB: a Weakly Supervised Dataset for Hebrew Speech Processing

HeRo: RoBERTa and Longformer Hebrew Language Models

SummVis: Interactive Visual Analysis of Models, Data, and Evaluation for Text Summarization

Building a Hebrew Semantic Role Labeling Lexical Resource from Parallel Movie Subtitles

L3Cube-MahaSum: A Comprehensive Dataset and BART Models for Abstractive Text Summarization in Marathi

Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs

ivrit.ai: A Comprehensive Dataset of Hebrew Speech for AI Research and Development

Multisumm: Towards A Unified Model For Multi-Lingual Abstractive Summarization

Legal Extractive Summarization of U.S. Court Opinions

LexAbSumm: Aspect-based Summarization of Legal Decisions

From News to Summaries: Building a Hungarian Corpus for Extractive and Abstractive Summarization

On the State of German (Abstractive) Text Summarization

Evaluation of Abstractive Summarisation Models with Machine Translation in Deliberative Processes

ConVerSum: A Contrastive Learning based Approach for Data-Scarce Solution of Cross-Lingual Summarization Beyond Direct Equivalents

Automated Extraction of Sentencing Decisions from Court Cases in the Hebrew Language

Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities

GAE-ISumm: Unsupervised Graph-Based Summarization of Indian Languages

UniSumEval: Towards Unified, Fine-Grained, Multi-Dimensional Summarization Evaluation for LLMs