Abstract: Finding tampered regions in images is a common research topic in machine learning and computer vision. Although many image manipulation localization algorithms have been proposed, most of them focus only on RGB images in different color spaces, and the frequency information that contains potential tampering clues is often ignored. Moreover, splicing and copy-move are two frequently used manipulation operations, but because their characteristics are quite different, methods have been designed specifically for detecting either splicing or copy-move, which makes them difficult to apply widely in practice. To solve these issues, this work proposes a novel end-to-end two-stream boundary-aware network (abbreviated as TBNet) for generic image manipulation localization, in which the RGB stream, the frequency stream, and the boundary artifact location are explored in a unified framework. Specifically, we first design an adaptive frequency selection module (AFS) to adaptively select the appropriate frequency to mine inconsistent statistics and eliminate the interference of redundant statistics. Then, an adaptive cross-attention fusion module (ACF) is proposed to adaptively fuse the RGB feature and the frequency feature. Finally, the boundary artifact location network (BAL) is designed to locate boundary artifacts; its parameters are jointly updated by the outputs of the ACF, and its results are further fed into the decoder. Thus, the parameters of the RGB stream, the frequency stream, and the boundary artifact location network are jointly optimized, and their latent complementary relationships are fully mined.
Extensive experiments on six public benchmarks for image manipulation localization, namely CASIA1.0, COVER, Carvalho, In-The-Wild, NIST-16, and IMD2020, demonstrate that the proposed TBNet substantially outperforms state-of-the-art generic image manipulation localization methods in terms of MCC, F1, and AUC while remaining robust to various attacks. Compared with DeepLabV3+ on the CASIA1.0, COVER, Carvalho, In-The-Wild, and NIST-16 datasets, the improvements in MCC/F1 reach 11%/11.1%, 8.2%/10.3%, 10.2%/11.6%, 8.9%/6.2%, and 13.3%/16.0%, respectively. Moreover, on IMD2020, the AUC improvement reaches 14.7%.
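For readers unfamiliar with two of the reported metrics, the following is a minimal sketch of how pixel-level F1 and MCC (Matthews correlation coefficient) are commonly computed from a predicted binary tampering mask and its ground truth. It is not taken from the TBNet code. The function names, the toy masks, and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration, and the abstract does not state how the authors threshold or average their scores.

```python
import math

import numpy as np


def confusion_counts(pred, gt):
    """Count TP, TN, FP, FN between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    tp = int(np.sum(pred & gt))      # tampered pixels correctly flagged
    tn = int(np.sum(~pred & ~gt))    # authentic pixels correctly left alone
    fp = int(np.sum(pred & ~gt))     # authentic pixels wrongly flagged
    fn = int(np.sum(~pred & gt))     # tampered pixels that were missed
    return tp, tn, fp, fn


def f1_score(pred, gt):
    """Pixel-level F1 = 2TP / (2TP + FP + FN); 0.0 if undefined (assumed convention)."""
    tp, _, fp, fn = confusion_counts(pred, gt)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc_score(pred, gt):
    """Pixel-level MCC; 0.0 if the denominator vanishes (assumed convention)."""
    tp, tn, fp, fn = confusion_counts(pred, gt)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy 2x4 masks (hypothetical data): 1 marks a tampered pixel.
gt_mask = [[1, 1, 0, 0],
           [0, 0, 0, 0]]
pred_mask = [[1, 0, 1, 0],
             [0, 0, 0, 0]]

print(f1_score(pred_mask, gt_mask))
print(mcc_score(pred_mask, gt_mask))
```

Unlike F1, MCC also takes true negatives into account, which is one reason it is often reported alongside F1 when tampered regions cover only a small fraction of the image.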

UnionFormer: Unified-Learning Transformer with Multi-View Representation for Image Manipulation Detection and Localization

ObjectFormer for Image Manipulation Detection and Localization

UVL2: A Unified Framework for Video Tampering Localization

Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-modal Manipulation

Unified Video and Image Representation for Boosted Video Face Forgery Detection

Unveiling tampering traces: Enhancing image reconstruction errors for visualization

Image Manipulation Localization Using Multi-Scale Feature Fusion and Adaptive Edge Supervision

Deep Localization of Mixed Image Tampering Techniques

Multi-modality boundary-guided network for generalizable image manipulation localization

Image Manipulation Localization Using Spatial–Channel Fusion Excitation and Fine-Grained Feature Enhancement

Image Tampering Detection Method Based on Multi-Feature Fusion

Effective Image Tampering Localization with Multi-Scale ConvNeXt Feature Fusion

Image Manipulation Detection by Multiple Tampering Traces and Edge Artifact Enhancement

UMMAFormer: A Universal Multimodal-adaptive Transformer Framework for Temporal Forgery Localization

MVSS-Net: Multi-View Multi-Scale Supervised Networks for Image Manipulation Detection

Image Manipulation Localization Using Attentional Cross-Domain CNN Features

Spatiotemporal Trident Networks: Detection and Localization of Object Removal Tampering in Video Passive Forensics

UniTR: A Unified TRansformer-based Framework for Co-object and Multi-modal Saliency Detection

TBNet: A Two-Stream Boundary-Aware Network for Generic Image Manipulation Localization

CECL-Net: Contrastive Learning and Edge-Reconstruction-Driven Complementary Learning Network for Image Forgery Localization

Multi-scale segmentation strategies in PRNU-based image tampering localization