IDEA: An Inverse Domain Expert Adaptation Based Active DNN IP Protection Method

Chaohui Xu, Qi Cui, Jinxin Dong, Weiyang He, Chip-Hong Chang
2024-09-29
Abstract: Illegitimate reproduction, distribution and derivation of Deep Neural Network (DNN) models can inflict economic loss, reputation damage and even privacy infringement. Passive DNN intellectual property (IP) protection methods such as watermarking and fingerprinting attempt to prove ownership upon IP violation, but they are often too late to stop the catastrophic damage of IP abuse and too feeble against strong adversaries. In this paper, we propose IDEA, an Inverse Domain Expert Adaptation based proactive DNN IP protection method featuring active authorization and source traceability. IDEA generalizes active authorization as an inverse problem of domain adaptation. The multi-adaptive optimization is solved by a mixture-of-experts (MoE) model with one real and two fake experts. The real expert re-optimizes the source model to correctly classify test images with a unique model user key steganographically embedded. The fake experts are trained to output random predictions on test images without or with an incorrect user key embedded by minimizing their mutual information (MI) with the real expert. The MoE model is knowledge distilled into a unified protected model to avoid leaking the expert model features by maximizing their MI with additional multi-layer attention and contrastive representation loss optimization. IDEA not only prevents unauthorized users without the valid key from accessing the functional model, but also enables the model owner to validate the deployed model and trace the source of IP infringement. We extensively evaluate IDEA on five datasets and four DNN models to demonstrate its effectiveness in authorization control, culprit tracing success rate, and robustness against various attacks.
Cryptography and Security, Artificial Intelligence, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The paper addresses intellectual property (IP) protection for deep neural network (DNN) models. Specifically, it proposes an active DNN IP protection method that prevents unauthorized copying, distribution, and derivative use, thereby avoiding economic loss, reputation damage, and even privacy violations.

### Problem Background

DNNs are now widely applied across many fields, and training a high-performance DNN model requires extensive data collection and annotation, computing resources, and professional expertise. These carefully trained models have become attractive targets for piracy and misuse. When a trained DNN model is distributed directly to end users, its internal structure and parameters can easily be copied by competitors or dishonest consumers. Even when the model is deployed in the cloud and offers online inference through APIs, research shows that attackers can still achieve performance comparable to the source model by training proxy models, at a cost far lower than designing the model from scratch.

### Limitations of Existing Methods

Existing DNN IP protection methods can be divided into passive and active types:

- **Passive Protection Methods**: Techniques such as watermarking and fingerprinting are mainly used to prove ownership after IP infringement has occurred. They usually come too late to prevent catastrophic damage and are too weak against strong adversaries.
- **Active Protection Methods**: These aim to restrict unauthorized users from accessing the full functionality of the model through multi-task optimization. Constrained by multiple optimization objectives, they mainly change the decision boundaries of the last or second-to-last layer. The shallow feature distribution therefore remains almost unchanged, which leaves them vulnerable to fine-tuning attacks.
### Solution Proposed in the Paper

The paper proposes IDEA, an Inverse Domain Expert Adaptation based active DNN IP protection method. Its main features include:

1. **Active Authorization**: IDEA treats active authorization as the inverse problem of domain adaptation and solves it with a Mixture of Experts (MoE) model containing one real expert and two fake experts. The real expert re-optimizes the source model to correctly classify test images in which a unique user key is steganographically embedded. The fake experts are trained to output random predictions on test images with no key or an incorrect key embedded, by minimizing their mutual information (MI) with the real expert.
2. **Knowledge Distillation**: To avoid leaking the expert model features, the MoE model is distilled into a single unified protected model by maximizing its MI with the experts, with additional multi-layer attention and contrastive representation loss optimization.
3. **Function Locking and Traceability**: IDEA not only prevents users without a valid key from accessing the functional model, but also allows the model owner to validate the deployed model and trace the source of IP infringement.

### Summary

The paper evaluates IDEA on five datasets and four DNN models through extensive experiments, demonstrating its effectiveness in authorization control, culprit tracing success rate, and robustness against various attacks.
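
To make the mutual-information objective mentioned in the solution more concrete, the following is a minimal toy sketch, not the paper's method. It assumes discrete class predictions and uses a simple count-based MI estimate. The paper's actual MI estimator, networks, and loss terms are not given in this summary. The function name `mutual_information` and the two prediction lists are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information, in bits, between two label sequences.

    Toy count-based estimate for discrete predictions (an assumption made
    for illustration; it is not the estimator used in the paper).
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical predictions of a "real expert" on eight keyed test images.
real_preds = [0, 1, 0, 1, 0, 1, 0, 1]
# Hypothetical predictions of a "fake expert" that carry no information
# about the real expert's outputs (every label pair occurs equally often).
fake_preds = [0, 0, 1, 1, 0, 0, 1, 1]

mi_self = mutual_information(real_preds, real_preds)  # fully dependent
mi_fake = mutual_information(real_preds, fake_preds)  # independent

print(f"MI(real, real) = {mi_self:.3f} bits")
print(f"MI(real, fake) = {mi_fake:.3f} bits")
```

In this toy setting, a fake expert whose predictions are statistically independent of the real expert's reaches the minimum MI of zero. That is the behavior the summary describes as outputting random predictions for missing or incorrect keys. Conversely, a distilled model that reproduces the experts' outputs has high MI with them.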