Abstract: A comprehensive benchmark is yet to be established in the Image Manipulation Detection & Localization (IMDL) field. The absence of such a benchmark leads to insufficient and misleading model evaluations, severely undermining the development of this field. However, the scarcity of open-sourced baseline models and inconsistent training and evaluation protocols make conducting rigorous experiments and faithful comparisons among IMDL models challenging. To address these challenges, we introduce IMDL-BenCo, the first comprehensive IMDL benchmark and modular codebase. IMDL-BenCo: i) decomposes the IMDL framework into standardized, reusable components and revises the model construction pipeline, improving coding efficiency and customization flexibility; ii) fully implements or incorporates training code for state-of-the-art models to establish a comprehensive IMDL benchmark; and iii) conducts deep analysis based on the established benchmark and codebase, offering new insights into IMDL model architecture, dataset characteristics, and evaluation standards. Specifically, IMDL-BenCo includes common processing algorithms, 8 state-of-the-art IMDL models (1 of which is reproduced from scratch), 2 sets of standard training and evaluation protocols, 15 GPU-accelerated evaluation metrics, and 3 kinds of robustness evaluation. This benchmark and codebase represent a significant leap forward in calibrating the current progress in the IMDL field and inspiring future breakthroughs. Code is available at: https://github.com/scu-zjz/IMDLBenCo

What problem does this paper attempt to address?

This paper addresses several key problems in the field of Image Manipulation Detection & Localization (IMDL):

1. **Lack of a unified benchmark**: No comprehensive benchmark has yet been established in the IMDL field, which has led to insufficient and misleading model evaluation and seriously hindered the field's development. Specifically, the lack of open-source baseline models and the inconsistent training and evaluation protocols make it very difficult to conduct rigorous experiments and fair comparisons between models.
2. **Inconsistency in model training and evaluation protocols**: Existing IMDL models are trained and evaluated under inconsistent protocols. These inconsistencies make comparisons incompatible and unfair, which in turn produces insufficient and misleading experimental results. For example, different models may use different pre-training datasets, or apply different data augmentation methods during training and testing.
3. **Difficulty in model reproduction**: The training code of many state-of-the-art (SoTA) IMDL models is not publicly available. Even when the source code of a model is partially public, efficient reproduction often requires highly customized model architectures and decoupled pipeline designs. Existing frameworks (such as OpenMMLab and Detectron2) rely on registration mechanisms and tightly coupled pipelines, so reproducing IMDL models under them is inefficient: the supported model architectures are too uniform, the coding burden is heavy, and extensibility is poor.

To solve these problems, the paper introduces IMDL-BenCo, a comprehensive IMDL benchmark and modular codebase. Its main contributions are:

- **Modular codebase**: IMDL-BenCo decomposes the IMDL framework into standardized, reusable components, improving coding efficiency and customization flexibility. It includes four main components: data loader, model library, training script, and evaluator.
- **Comprehensive implementation or integration of SoTA models**: IMDL-BenCo fully implements or integrates the training code of 8 SoTA IMDL models and establishes a comprehensive IMDL benchmark.
- **In-depth analysis**: Based on the established benchmark and codebase, IMDL-BenCo conducts in-depth analysis and provides new insights into IMDL model architectures, dataset characteristics, and evaluation criteria.

Through these contributions, IMDL-BenCo not only calibrates the current progress in the IMDL field but also provides inspiration for future breakthroughs.
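The component split described above (data loader, model library, training script, evaluator) can be illustrated with a minimal sketch. Nothing here is taken from the IMDL-BenCo codebase: `Sample`, `threshold_model`, `pixel_f1`, and `evaluate` are hypothetical names, the "model" is a trivial thresholding stand-in, the training step is omitted, and pixel-level F1 is used only as one plausible example of a localization metric. The point is that each piece is a plain callable that can be swapped without touching the others.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

Image = List[List[float]]   # grayscale pixel intensities in [0, 1]
Mask = List[List[int]]      # binary mask, 1 = manipulated pixel


@dataclass
class Sample:
    """One item a data loader would yield (hypothetical structure)."""
    image: Image
    mask: Mask


def threshold_model(image: Image, threshold: float = 0.5) -> Mask:
    """Stand-in for an entry in a model library: predicts a binary mask."""
    return [[1 if px > threshold else 0 for px in row] for row in image]


def pixel_f1(pred: Mask, target: Mask) -> float:
    """Pixel-level F1 between a predicted and a ground-truth mask."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    denom = 2 * tp + fp + fn
    # Returning 0.0 when both masks are empty is an assumed convention.
    return 2 * tp / denom if denom else 0.0


def evaluate(
    model: Callable[[Image], Mask],
    loader: Iterable[Sample],
    metric: Callable[[Mask, Mask], float],
) -> float:
    """Evaluator: averages a metric over every sample from the loader."""
    scores = [metric(model(sample.image), sample.mask) for sample in loader]
    if not scores:
        raise ValueError("loader yielded no samples")
    return sum(scores) / len(scores)


# Toy data: the model flags two pixels, only one of which is manipulated.
samples = [Sample(image=[[0.9, 0.1], [0.8, 0.2]], mask=[[1, 0], [0, 0]])]
score = evaluate(threshold_model, samples, pixel_f1)
print(f"mean pixel F1: {score:.3f}")
```

In the toy example the prediction has one true positive and one false positive, so F1 = 2·1 / (2·1 + 1 + 0) = 2/3. Because `evaluate` only depends on the callables it receives, changing the model or the metric requires no registry entry and no change to the evaluation loop, which is the kind of decoupling the summary contrasts with registration-based frameworks.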

IMDL-BenCo: A Comprehensive Benchmark and Codebase for Image Manipulation Detection & Localization

GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization

A New Benchmark and Model for Challenging Image Manipulation Detection

Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline

IM-IAD: Industrial Image Anomaly Detection Benchmark in Manufacturing

Towards Open-ended Visual Quality Comparison

MIBench: Evaluating Multimodal Large Language Models over Multiple Images

CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models

II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models

Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models

IW-Bench: Evaluating Large Multimodal Models for Converting Image-to-Web

Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision

TrainFors: A Large Benchmark Training Dataset for Image Manipulation Detection and Localization

LIME: Less Is More for MLLM Evaluation

CDTD: A Large-Scale Cross-Domain Benchmark for Instance-Level Image-to-Image Translation and Domain Adaptive Object Detection

3DBench: A Scalable 3D Benchmark and Instruction-Tuning Dataset

Omni-IML: Towards Unified Image Manipulation Localization

Towards Fair and Comprehensive Comparisons for Image-Based 3D Object Detection

MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models

Image Matching Across Wide Baselines: From Paper to Practice