Abstract: We present the results and the main findings of SemEval-2024 Task 8: Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection. The task featured three subtasks. Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine. This subtask has two tracks: a monolingual track focused solely on English texts and a multilingual track. Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM. Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine. The task attracted a large number of participants: subtask A monolingual (126), subtask A multilingual (59), subtask B (70), and subtask C (30). In this paper, we present the task, analyze the results, and discuss the system submissions and the methods they used. For all subtasks, the best systems used LLMs.

What problem does this paper attempt to address?

This paper presents SemEval-2024 Task 8, a shared task on detecting machine-generated text across multiple generators, domains, and languages. The task consists of three subtasks: A) binary classification, deciding whether a text was written by a human or generated by a machine, with a monolingual (English-only) track and a multilingual track; B) source detection, identifying whether a text was written by a human or generated by a specific large language model (LLM); C) change point detection, locating the position in a mixed text where authorship switches from human to machine. The task aims to address the misuse enabled by the widespread use of LLMs, help ensure information accuracy, and promote the development of machine-generated text detection technology. Participation was substantial: 126 teams in the monolingual track of subtask A, 59 in the multilingual track, 70 in subtask B, and 30 in subtask C. The best-performing systems in every subtask used LLMs. The paper analyzes the methods behind the submitted systems, which include supervised and unsupervised techniques, provides large evaluation datasets, discusses the results, and outlines directions for future research.
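
To make the change point formulation of subtask C concrete, the following is a minimal sketch of how a single boundary index can be expanded into per-word human/machine labels. The class and function names, the whitespace tokenization, the convention that the index points at the first machine-generated word, and the absolute-distance score are all illustrative assumptions, not the task's official data format or metric.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MixedText:
    """A text that starts human-written and continues machine-generated (hypothetical type)."""
    words: List[str]
    change_point: int  # assumed convention: index of the first machine-generated word


def word_labels(sample: MixedText) -> List[int]:
    """Expand the change point into per-word labels: 0 = human, 1 = machine."""
    n = len(sample.words)
    cp = min(max(sample.change_point, 0), n)  # clamp the index into [0, n]
    return [0] * cp + [1] * (n - cp)


def boundary_distance(predicted: int, gold: int) -> int:
    """Illustrative score: how many words the predicted boundary is off by."""
    return abs(predicted - gold)


# Toy example with whitespace tokenization (an assumption for illustration only).
sample = MixedText(
    words="I wrote this part and a model wrote the rest".split(),
    change_point=4,
)
labels = word_labels(sample)
```

Under these assumptions, a text that is entirely human-written would have a change point equal to its length, and a fully machine-generated one a change point of 0, so the same representation also covers the two classes of subtask A.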

SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection

Fine-tuning Large Language Models for Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection

TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques

KInIT at SemEval-2024 Task 8: Fine-tuned LLMs for Multilingual Machine-Generated Text Detection

AISPACE at SemEval-2024 task 8: A Class-balanced Soft-voting System for Detecting Multi-generator Machine-generated Text

HU at SemEval-2024 Task 8A: Can Contrastive Learning Learn Embeddings to Detect Machine-Generated Text?

Mast Kalandar at SemEval-2024 Task 8: On the Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text

Transformer and Hybrid Deep Learning Based Models for Machine-Generated Text Detection

M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection

PetKaz at SemEval-2024 Task 8: Can Linguistics Capture the Specifics of LLM-generated Text?

MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark

DeepPavlov at SemEval-2024 Task 8: Leveraging Transfer Learning for Detecting Boundaries of Machine-Generated Texts

MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection

MAGE: Machine-generated Text Detection in the Wild

Team QUST at SemEval-2024 Task 8: A Comprehensive Study of Monolingual and Multilingual Approaches for Detecting AI-generated Text

MultiSocial: Multilingual Benchmark of Machine-Generated Text Detection of Social-Media Texts

Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text

TM-TREK at SemEval-2024 Task 8: Towards LLM-Based Automatic Boundary Detection for Human-Machine Mixed Text

LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection

RKadiyala at SemEval-2024 Task 8: Black-Box Word-Level Text Boundary Detection in Partially Machine Generated Texts

RFBES at SemEval-2024 Task 8: Investigating Syntactic and Semantic Features for Distinguishing AI-Generated and Human-Written Texts