Removing Watermarks for Image Processing Networks Via Referenced Subspace Attention

Yuliang Xue, Yuhao Zhu, Zhiying Zhu, Sheng Li, Zhenxing Qian, Xinpeng Zhang
DOI: https://doi.org/10.1093/comjnl/bxac190
2024-01-01
Abstract: A deep neural network model extraction attack retrains a surrogate model on the outputs that a target model produces for a given set of inputs. Such attacks are hard to defend against, which threatens model owners' interests. Recent work proposes a model watermarking scheme for image processing networks that can prove ownership of a deep model's intellectual property even after a model extraction attack. The scheme ensures that, once the target model (an image processing network) is watermarked, the watermark can be extracted from the output of the surrogate model. In this paper, we propose a new model extraction attack that defeats this latest method. Instead of directly using the output images of the target model, we use their reconstructed versions for model retraining, with an asymmetrical UNet performing the image reconstruction. To thoroughly remove the watermark traces, we incorporate a referenced subspace attention module into the asymmetrical UNet, which removes the watermark by projecting the outputs of the target model into the subspaces of the reference image. Various experiments demonstrate the effectiveness of our attack.
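The abstract does not spell out how the referenced subspace attention module computes its projection, so the following is only a minimal linear-algebra sketch of the general idea of projecting a signal into a subspace derived from a reference. It is not the authors' module. The function name `project_onto_subspace`, the array shapes, and the random stand-in data are all assumptions made for illustration.

```python
import numpy as np

def project_onto_subspace(x, basis):
    """Orthogonally project vector x onto the column span of `basis`.

    Illustrative only: `basis` stands in for directions derived from a
    reference image; the paper's actual module is not described here.
    """
    # Reduced QR yields orthonormal columns spanning the same subspace
    # (assumes `basis` has full column rank).
    q, _ = np.linalg.qr(basis)
    return q @ (q.T @ x)

rng = np.random.default_rng(0)
basis = rng.standard_normal((64, 8))  # hypothetical reference-derived basis
x = rng.standard_normal(64)           # hypothetical flattened target-model output
p = project_onto_subspace(x, basis)
residual = x - p                      # component lying outside the reference subspace
```

In this toy setting, whatever part of `x` lies outside the reference subspace ends up in `residual` and is discarded by the projection. Intuitively, a watermark component that is not shared with the reference would be suppressed in the same way, though how the paper builds and uses its subspaces is not stated in the abstract.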