Stolen Risks of Models with Security Properties

Yue Qin,Zhuoqun Fu,Chuyun Deng,Xiaojing Liao,Jia Zhang,Haixin Duan
DOI: https://doi.org/10.1145/3576915.3616653
2023-01-01
Abstract:Verifiable robust machine learning, as a new trend of ML security defense, enforces security properties (e.g., Lipschitzness, Monotonicity) on machine learning models and achieves satisfying accuracy-security trade-off. Such security properties identify a series of evasion strategies of ML security attackers and specify logical constraints on their effects on a classifier (e.g., the classifier is monotonically increasing along some feature dimensions). However, little has been done so far to understand the side effect of those security properties on the model privacy. In this paper, we aim at better understanding the privacy impacts on security properties of robust ML models. Particularly, we report the first measurement study to investigate the model stolen risks of robust models satisfying four security properties (i.e., Local-Invariance, Lipschitzness, SmallNeighborhood, and Monotonicity). Our findings bring to light the factors that influence model stealing attacks and defense performance on models trained with security properties. In addition, to train an ML model satisfying goals in accuracy, security, and privacy, we propose a novel technique, called BoundaryFuzz, which introduces a privacy property into verifiable robust training frameworks to defend against model stealing attacks on robust models. Experimental results demonstrate the defense effectiveness of BoundaryFuzz.
What problem does this paper attempt to address?