Adversarial Adaptive Neighborhood With Feature Importance-Aware Convex Interpolation
J. Dong, Saiyu Qi, Qian Li, Qingyuan Hu, Yong Qi, Yun Lin
DOI: https://doi.org/10.1109/TIFS.2020.3047752
IF: 7.231
IEEE Transactions on Information Forensics and Security
Abstract: Adversarial examples threaten to fool deep learning models into outputting erroneous predictions with high confidence. Optimization-based methods for constructing such samples have been studied extensively. While effective as attacks, they typically lack a clear interpretation of, and constraints on, their underlying generation process, which hinders us from leveraging the produced adversarial samples in the reverse direction, for model protection. Ideally, such samples would repair bugs in pre-trained models by serving as additional training data with strong attack ability, avoiding time-consuming full re-training from scratch. To address these issues, we first study the black-box behavior of previous optimization-based adversarial attacks and the intrinsic deficiency of neighborhood information in previous defenses. We then introduce a new method dubbed FeaCP, which uses correctly predicted samples from disjoint classes to guide the generation of more explainable adversarial samples in the ambiguous region around the decision boundary, instead of in uncontrolled “blind spots”. It does so via a feature component-wise convex combination that takes the individual importance of each feature component into account. Our method incorporates the prior that, for well-separated samples, the path connecting them passes through the model’s decision boundary, which lies in a low-density region where adversarial examples are nonetheless spread with high probability and thus affect the final trained model. In our work, the path is constructed by the proposed inhomogeneous feature-wise convex interpolation rather than by operating at the sample-wise level, which limits the search space of FeaCP and yields an adaptive neighborhood. Finally, we provide detailed insights, extend our method to adversarial fine-tuning using the vicinity distribution to optimize the approximated decision boundary, and validate the significance of FeaCP to model performance.
The experimental results show that our method provides competitive performance on various datasets and networks.
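The idea of feature-wise (rather than sample-wise) convex interpolation can be illustrated with a minimal NumPy sketch. This is not the FeaCP implementation: the abstract does not state how feature importance is turned into interpolation coefficients, so the function name `featurewise_interpolate`, the parameter `lam`, and the rule "coefficient proportional to importance" are assumptions made only for illustration.

```python
import numpy as np

def featurewise_interpolate(x_a, x_b, importance, lam=0.5):
    """Mix x_a and x_b with a separate convex coefficient per feature.

    ASSUMPTION: scaling the coefficient by importance / max(importance)
    is a hypothetical rule chosen for this sketch; the abstract does not
    specify the actual mapping used by FeaCP.
    """
    x_a = np.asarray(x_a, dtype=float)
    x_b = np.asarray(x_b, dtype=float)
    w = np.asarray(importance, dtype=float)
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if np.any(w < 0) or w.max() <= 0:
        raise ValueError("importance must be non-negative and not all zero")
    # Per-feature coefficient in [0, lam]; the most important feature
    # moves the full fraction lam of the way from x_a toward x_b.
    alpha = lam * w / w.max()
    # Each component is a convex combination of the two inputs.
    x_mix = (1.0 - alpha) * x_a + alpha * x_b
    return x_mix, alpha

# Two samples assumed to come from different, correctly predicted classes.
x_a = np.zeros(3)
x_b = np.ones(3)
x_mix, alpha = featurewise_interpolate(x_a, x_b, importance=[1, 2, 4], lam=0.8)
# alpha is approximately [0.2, 0.4, 0.8]; x_mix equals alpha here
# because x_a is all zeros and x_b is all ones.
```

With uniform importance every coefficient equals `lam`, so the sketch reduces to ordinary sample-wise interpolation; non-uniform importance is what makes the path between the two samples inhomogeneous.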
Mathematics,Computer Science