Holistic Implicit Factor Evaluation of Model Extraction Attacks

Anli Yan,Hongyang Yan,Li Hu,Xiaozhang Liu,Teng Huang
DOI: https://doi.org/10.1109/tdsc.2022.3231271
2023-01-01
IEEE Transactions on Dependable and Secure Computing
Abstract:Model extraction attacks (MEAs) allow adversaries to replicate a surrogate model analogous to the target model's decision pattern. While several attacks and defenses have been studied in-depth, the underlying reasons behind our susceptibility to them often remain unclear. Analyzing these implication influence factors helps to promote secure deep learning (DL) systems, it requires studying extraction attacks in various scenarios to determine the success of different attacks and the hallmarks of DLs. However, understanding, implementing, and evaluating even a single attack requires extremely high technical effort, making it impractical to study the vast number of unique extraction attack scenarios. To this end, we present a first-of-its-kind holistic evaluation of implication factors for MEAs which relies on the attack process abstracted from state-of-the-art MEAs. Specifically, we concentrate on four perspectives. we consider the impact of the task accuracy, model architecture, and robustness of the target model on MEAs, as well as the impact of the model architecture of the surrogate model on MEAs. Our empirical evaluation includes an ablation study over sixteen model architectures and four image datasets. Surprisingly, our study shows that improving the robustness of the target model via adversarial training is more vulnerable to model extraction attacks.
What problem does this paper attempt to address?