Perceptual Hashing of Deep Convolutional Neural Networks for Model Copy Detection
Haozhe Chen,Hang Zhou,Jie Zhang,Dongdong Chen,Weiming Zhang,Kejiang Chen,Gang Hua,Nenghai Yu
DOI: https://doi.org/10.1145/3572777
IF: 4.094
2023-01-01
ACM Transactions on Multimedia Computing Communications and Applications
Abstract:In recent years, many model intellectual property (IP) proof methods for IP protection have been proposed, such as model watermarking and model fingerprinting. However, with the increasing number of models transmitted and deployed on the Internet, quickly finding the suspect model among thousands of models on model-sharing platforms such as GitHub is in great demand, which concurrently triggers the new security problem of model copy detection for IP protection. As an important part of the model IP protection system, the model copy detection task has not received enough attention. Due to the high computational complexity, both model watermarking and model fingerprinting lack the capability to efficiently find suspected infringing models among tens of millions of models. In this article, inspired by the hash-based image retrieval methods, we introduce a novel model copy detection mechanism: perceptual hashing for convolutional neural networks (CNNs). The proposed perceptual hashing algorithm can convert the weights of CNNs to fixed-length binary hash codes so that the lightly modified version has the similar hash code as the original model. By comparing the similarity of a pair of hash codes between a query model and a test model in the model library, similar versions of a query model can be retrieved efficiently. To the best of our knowledge, this is the first perceptual hashing algorithm for deep neural network models. Specifically, we first select the important model weights based on the model compression theory, then calculate the normal test statistics (NTS) on the segments of important weights, and finally encode the NTS features into hash codes. The experiment performed on a model library containing 3,565 models indicates that our perceptual hashing scheme has a superior copy detection performance.