UnionFormer: Unified-Learning Transformer with Multi-View Representation for Image Manipulation Detection and Localization

Jianwei Guo, Wei Ma, Shuaibo Li, Shibiao Xu, Xiaopeng Zhang, Benchong Li
DOI: https://doi.org/10.1109/CVPR52733.2024.01190
2024-06-16
Computer Vision and Pattern Recognition
Abstract: We present UnionFormer, a novel framework that integrates tampering clues across three views by unified learning for image manipulation detection and localization. Specifically, we construct a BSFI-Net to extract tampering features from RGB and noise views, achieving enhanced responsiveness to boundary artifacts while modulating spatial consistency at different scales. Additionally, to explore the inconsistency between objects as a new view of clues, we combine object consistency modeling with tampering detection and localization into a three-task unified learning process, allowing them to promote and improve mutually. Therefore, we acquire a unified manipulation discriminative representation under multi-scale supervision that consolidates information from three views. This integration facilitates highly effective concurrent detection and localization of tampering. We perform extensive experiments on diverse datasets, and the results show that the proposed approach outperforms state-of-the-art methods in tampering detection and localization.
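The abstract does not give the training objective, so the following is only a minimal sketch of what a three-task unified objective could look like: a weighted sum of a detection loss, a localization loss, and an object-consistency loss. The use of binary cross-entropy for every task, the function names (`bce`, `mean_bce`, `unified_loss`), and the equal default weights are all assumptions made for illustration, not the formulation used in the paper.

```python
import math

def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def mean_bce(probs, labels):
    """Average binary cross-entropy over paired probabilities and labels."""
    return sum(bce(p, y) for p, y in zip(probs, labels)) / len(probs)

def unified_loss(det_prob, det_label,
                 mask_probs, mask_labels,
                 obj_probs, obj_labels,
                 weights=(1.0, 1.0, 1.0)):
    """Hypothetical three-task objective (an assumption, not the paper's loss).

    det_prob / det_label     : image-level "is this image tampered?" prediction
    mask_probs / mask_labels : flattened per-pixel tampering mask
    obj_probs / obj_labels   : per-object "is this object consistent?" scores
    weights                  : hypothetical per-task weights
    """
    l_det = bce(det_prob, det_label)
    l_loc = mean_bce(mask_probs, mask_labels)
    l_obj = mean_bce(obj_probs, obj_labels)
    w_det, w_loc, w_obj = weights
    total = w_det * l_det + w_loc * l_loc + w_obj * l_obj
    return total, {"detection": l_det, "localization": l_loc, "object": l_obj}

# Toy usage with made-up numbers.
total, parts = unified_loss(
    det_prob=0.9, det_label=1,
    mask_probs=[0.8, 0.1, 0.7, 0.2], mask_labels=[1, 0, 1, 0],
    obj_probs=[0.6, 0.3], obj_labels=[1, 0],
)
```

Summing the three terms into one scalar means a single backward pass updates the shared representation for all three tasks at once, which is one common way to let related tasks inform each other.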
Computer Science