Multi-modality boundary-guided network for generalizable image manipulation localization

Yanyan Jiang, Yongping Huang, Haipeng Chen, Yingda Lyu
DOI: https://doi.org/10.1007/s00530-024-01583-7
IF: 3.9
2024-12-06
Multimedia Systems
Abstract: The main concern in image manipulation localization is the development of a feature representation that can effectively detect various manipulation techniques such as copy-move, removal, and splicing. However, many existing techniques focus on identifying specific tampering techniques, which limits their applicability in real-world scenarios. To address this challenge, we propose a new approach called the Multi-Modality Boundary-Guided Network (M2BG-Net) to enhance the generalization capacity of the model. Our approach includes several key components. Firstly, we introduce a multi-modality feature learning module that combines high-frequency features with RGB features. This allows us to detect subtle tampering traces that may not be visible in the RGB domain alone. Additionally, we propose an edge-aware module (EAM) to enhance the boundary features of the tampered region. By focusing on the edges, we can improve the accuracy of tampering localization. Furthermore, we introduce a multi-dilation context aggregation module (MCAM) that jointly optimizes RGB streams, frequency streams, and boundary artifacts. This enables us to improve the manipulation localization capability on unknown forgery patterns. Through extensive experiments, we demonstrate that our proposed M2BG-Net outperforms existing image tampering localization methods in terms of generalization and robustness to various tampering means. Our approach represents a significant step towards more effective and reliable image manipulation detection in real-world scenarios.
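The idea of pairing RGB input with a high-frequency view can be illustrated with a minimal sketch. This is not the M2BG-Net implementation: the abstract does not say how the high-frequency features are computed, so the 3x3 Laplacian kernel, the replicated borders, and the function names `high_pass` and `stack_modalities` are all assumptions chosen for illustration.

```python
# Hypothetical illustration only; the paper's actual high-frequency
# extractor is not described in the abstract.

# Assumed 3x3 Laplacian kernel. Its weights sum to zero, so flat regions
# produce a zero response and only local intensity changes survive.
LAPLACIAN = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]


def high_pass(gray):
    """Return the Laplacian residual of a 2D grayscale image (list of rows).

    Border pixels are handled by replicating the nearest edge pixel.
    """
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    acc += LAPLACIAN[dy + 1][dx + 1] * gray[yy][xx]
            out[y][x] = acc
    return out


def stack_modalities(rgb):
    """Append a high-frequency channel to every (r, g, b) pixel.

    rgb is a list of rows of (r, g, b) tuples; the result holds
    (r, g, b, hf) tuples, i.e. a simple two-modality input.
    """
    gray = [[(r + g + b) / 3.0 for (r, g, b) in row] for row in rgb]
    hf = high_pass(gray)
    return [[(*px, hf[y][x]) for x, px in enumerate(row)]
            for y, row in enumerate(rgb)]
```

In this sketch a uniform patch yields a zero high-frequency channel, while a sharp intensity step yields responses of opposite sign on either side of the step, which is the kind of boundary cue a second stream could pick up when it is faint in the RGB values.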
computer science, information systems, theory & methods