EEJE: Two-Step Input Transformation for Robust DNN Against Adversarial Examples

Seok-Hwan Choi, Jinmyeong Shin, Peng Liu, Yoon-Ho Choi
DOI: https://doi.org/10.1109/tnse.2020.3008394
IF: 6.6
2021-04-01
IEEE Transactions on Network Science and Engineering
Abstract: Adversarial examples are human-imperceptible perturbations to inputs to machine learning models. While attacking machine learning models, adversarial examples cause the model to make a false positive or a false negative. So far, two representative defense architectures have shown a significant effect: (1) model retraining architecture; and (2) input transformation architecture. However, previous defense methods belonging to these two architectures do not produce good outputs for every input, i.e., adversarial examples and legitimate inputs. Specifically, model retraining methods generate false negatives for unknown adversarial examples, and input transformation methods generate false positives for legitimate inputs. To produce good-enough outputs for every input, we propose and evaluate a new input transformation architecture based on two-step input transformation. To solve the limitations of the previous two defense methods, we intend to answer the following question: How to maintain the performance of Deep Neural Network (DNN) models for legitimate inputs while providing good robustness against various adversarial examples? From the evaluation results under various conditions, we show that the proposed two-step input transformation architecture provides good robustness to DNN models against state-of-the-art adversarial perturbations, while maintaining the high accuracy even for legitimate inputs.
engineering, multidisciplinary; mathematics, interdisciplinary applications