Aster: Encoding Data Augmentation Relations into Seed Test Suites for Robustness Assessment and Fuzzing of Data-Augmented Deep Learning Models

Haipeng Wang, Zhengyuan Wei, Qilin Zhou, Bo Jiang, Wing Kwong Chan
DOI: https://doi.org/10.1109/qrs60937.2023.00044
2023-01-01
Abstract: Data-augmented deep learning models are widely used in real-world applications. However, many state-of-the-art loss-based or coverage-based fuzzing techniques fail to produce fuzzing samples for them from many seeds. This paper proposes Aster, a novel technique that addresses this problem and enhances the effectiveness of these fuzzing techniques on deep learning models trained with multi-sample data augmentation methods. Aster formulates a novel reachability-based strategy that systematically encodes the insights of every seed’s direct and indirect data augmentation relation instances into a replacement seed for that seed. Our experiment shows that Aster is highly effective. On average, after their original seeds are replaced with the replacement seeds generated by Aster, loss-based and coverage-based fuzzing techniques generate 166% and 110% more fuzzing samples and reduce unsuccessful seeds by 31% and 22%, respectively. Their improved models also become up to 55% and 40% on average more robust against FGSM and PGD attacks in the experiment.