A Unique Identification-Oriented Black-Box Watermarking Scheme for Deep Classification Neural Networks

Mouke Mo, Chuntao Wang, Shan Bian

DOI: https://doi.org/10.3390/sym16030299

2024-03-05

Symmetry

Abstract: Given the substantial value and considerable training costs associated with deep neural network models, the field of deep neural network model watermarking has come to the forefront. While black-box model watermarking has made commendable strides, the methods used in the existing literature to construct poisoned images are simplistic and susceptible to forgery. Notably, few black-box model watermarking techniques can identify a unique user in a multi-user model distribution setting. For this reason, this paper proposes a novel black-box model watermarking method for unique user identification, denoted as the ID watermarking of neural networks (IDwNet). Specifically, to enhance the distinguishability of deep neural network models in multi-user scenarios and reduce the likelihood of poisoned image counterfeiting, this study develops a symmetrical embedding method based on the discrete cosine transform (DCT) and singular value decomposition (SVD) to form the poisoned image. As this ID embedding method leads to indistinguishable deep features, the study constructs a poisoned adversary training strategy that trains a deep neural network by simultaneously inputting clean images, poisoned images with the correct ID, and poisoned adversary images with incorrect IDs. Extensive simulation experiments show that the proposed scheme achieves excellent invisibility for the concealed ID, remarkably surpassing the state of the art. In addition, the proposed scheme obtains a validation success rate exceeding 99% for the poisoned images at the cost of a marginal classification accuracy reduction of less than 0.5%. Moreover, even when two IDs differ by only 1 bit, the proposed scheme still validates user copyright accurately. These results indicate that the proposed scheme is promising.
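To make the idea of hiding ID bits in the DCT and SVD domain concrete, here is a minimal sketch. It is not the paper's embedding rule, which the abstract does not specify: it assumes a generic scheme that takes the DCT of each 8x8 block and quantizes the largest singular value so that its bin parity encodes one bit. The names `dct_matrix`, `embed_id`, `extract_id` and the parameters `block` and `step` are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ x is the DCT of a length-n vector x."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def embed_id(image, id_bits, block=8, step=24.0):
    """Embed one ID bit per block (assumed scheme, not the paper's exact rule)."""
    c = dct_matrix(block)
    out = image.astype(np.float64).copy()
    cols = image.shape[1] // block
    for idx, bit in enumerate(id_bits):
        r, q = divmod(idx, cols)
        sl = (slice(r * block, (r + 1) * block), slice(q * block, (q + 1) * block))
        coeffs = c @ out[sl] @ c.T              # 2-D DCT of the block
        u, s, vt = np.linalg.svd(coeffs)        # SVD of the DCT coefficients
        k = np.floor(s[0] / step)               # quantization bin of the top singular value
        if int(k) % 2 != int(bit):
            k += 1                              # move to a bin whose parity equals the bit
        s[0] = (k + 0.5) * step                 # bin center gives a margin of step / 2
        out[sl] = c.T @ ((u * s) @ vt) @ c      # inverse SVD, then inverse 2-D DCT
    return out

def extract_id(image, n_bits, block=8, step=24.0):
    """Read the bits back from the parity of the top singular value's bin."""
    c = dct_matrix(block)
    cols = image.shape[1] // block
    bits = []
    for idx in range(n_bits):
        r, q = divmod(idx, cols)
        blk = image[r * block:(r + 1) * block, q * block:(q + 1) * block].astype(np.float64)
        s = np.linalg.svd(c @ blk @ c.T, compute_uv=False)
        bits.append(int(np.floor(s[0] / step)) % 2)
    return bits
```

In this sketch a larger `step` makes the bits more tolerant to pixel rounding but increases the visible distortion, which is the kind of invisibility trade-off the abstract refers to.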

multidisciplinary sciences

What problem does this paper attempt to address?

The paper addresses two weaknesses of current black-box model watermarking: poisoned images are constructed too simply and are therefore easy to forge, and few techniques can identify a unique user when a model is distributed to multiple users. To address this, the paper proposes IDwNet, a new black-box model watermarking method for unique identity recognition in deep classification neural networks. The method tackles the existing problems in the following ways:

1. **Enhancing unforgeability**: Poisoned images are constructed with a symmetric embedding method that combines the discrete cosine transform (DCT) and the singular value decomposition (SVD). This makes the embedded ID hard for human observers to detect and reduces the possibility of forgery.
2. **Improving unique identification in a multi-user environment**: Poisoned adversarial images with wrong IDs are introduced during training, which strengthens the model's ability to distinguish different IDs. User copyright can be verified accurately even if two IDs differ by only 1 bit.
3. **Maintaining high fidelity and reliability**: Embedding the watermark does not significantly affect the classification accuracy of the model, and the watermark can be extracted and verified efficiently.
4. **Improving robustness**: The watermark is not easily removed or destroyed maliciously. An attacker who forcibly removes or destroys the watermark seriously degrades the accuracy of the model.

The main contributions of the paper include:

- Proposing the concept of black-box ID model watermarking, which can uniquely identify users in a multi-user distribution framework.
- Designing a method for constructing poisoned images based on digital watermarking techniques, which can dynamically adjust embedding parameters, enhance the concealment of poisoned images, and maintain the fidelity and reliability of black-box model watermarks.
- Introducing a poisoned adversarial image training strategy to enhance the model's ability to recognize the embedded ID, thereby effectively distinguishing the owner of the model.

The experimental results of IDwNet on multiple datasets show that the method performs well as a black-box model watermark, with a high degree of unforgeability, fidelity, and reliability.
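The training strategy described above mixes three kinds of samples. The following is a minimal sketch of how such a batch might be assembled, under stated assumptions: images poisoned with the correct ID are relabeled to a target class, while images poisoned with a wrong ID keep their true labels so that only the correct ID triggers the watermark response. The labeling rule, the 1-bit flip used to create wrong IDs, and the names `toy_embed`, `flip_one_bit`, `build_batch`, and `target_label` are illustrative and not taken from the paper.

```python
import numpy as np

def toy_embed(image, id_bits):
    """Placeholder for a real ID embedding; adds a faint ID-dependent pattern."""
    pattern = np.resize(np.asarray(id_bits, dtype=np.float64), image.shape)
    return image + 0.01 * pattern

def flip_one_bit(id_bits, rng):
    """Return a copy of the ID with exactly one randomly chosen bit flipped."""
    wrong = np.array(id_bits, dtype=np.int64)
    pos = rng.integers(len(wrong))
    wrong[pos] ^= 1
    return wrong

def build_batch(images, labels, correct_id, target_label, embed, rng):
    """Concatenate clean, correctly poisoned, and poisoned adversary samples."""
    # Poisoned images carrying the correct ID are mapped to the target label (assumption).
    poison_x = np.stack([embed(img, correct_id) for img in images])
    poison_y = np.full(len(images), target_label)
    # Adversary images carry a wrong ID and keep their true labels (assumption).
    adv_x = np.stack([embed(img, flip_one_bit(correct_id, rng)) for img in images])
    adv_y = labels.copy()
    x = np.concatenate([images, poison_x, adv_x])
    y = np.concatenate([labels, poison_y, adv_y])
    return x, y
```

A call such as `build_batch(images, labels, correct_id, target_label=9, embed=toy_embed, rng=np.random.default_rng(0))` returns a batch three times the size of the clean input, ready to be fed to any classifier's training step.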

Leveraging Unlabeled Data for Watermark Removal of Deep Neural Networks

A Novel Robust Black-Box Fingerprinting Scheme for Deep Classification Neural Networks

On Function-Coupled Watermarks for Deep Neural Networks

Active intellectual property protection for deep neural networks through stealthy backdoor and users’ identities authentication

Deep Neural Network Watermarking Against Model Extraction Attack

Universal BlackMarks: Key-Image-Free Blackbox Multi-Bit Watermarking of Deep Neural Networks

Deep neural networks watermark via universal deep hiding and metric learning

Protecting the Intellectual Property of Deep Neural Networks with Watermarking: The Frequency Domain Approach

Free Fine-tuning: A Plug-and-Play Watermarking Scheme for Deep Neural Networks

Deep Model Intellectual Property Protection Via Deep Watermarking

Deep Serial Number: Computational Watermarking for DNN Intellectual Property Protection

Watermarking Neural Networks with Watermarked Images

Fragile Neural Network Watermarking with Trigger Image Set

Robust Black-box Watermarking for Deep Neural Network using Inverse Document Frequency

BlackMarks: Blackbox Multibit Watermarking for Deep Neural Networks

Semi-Fragile Neural Network Watermarking Based on Adversarial Examples

Protecting the Intellectual Properties of Deep Neural Networks with an Additional Class and Steganographic Images

Rethinking White-Box Watermarks on Deep Learning Models under Neural Structural Obfuscation

Subnetwork-Lossless Robust Watermarking for Hostile Theft Attacks in Deep Transfer Learning Models

Reversible Quantization Index Modulation for Static Deep Neural Network Watermarking