Seed Selection for Testing Deep Neural Networks

Yuhan Zhi, Xiaofei Xie, Chao Shen, Jun Sun, Xiaoyu Zhang, Xiaohong Guan
DOI: https://doi.org/10.1145/3607190
IF: 3.685
2024-01-01
ACM Transactions on Software Engineering and Methodology
Abstract: Deep learning (DL) has been applied in many applications, and the quality of DL systems is becoming a major concern. To evaluate the quality of DL systems, a number of DL testing techniques have been proposed. These techniques require a set of initial seed inputs to generate test cases. Existing testing techniques usually construct the seed corpus by randomly selecting inputs from the training or test dataset. To date, no study has examined how initial seed inputs affect the performance of DL testing or how to construct an optimal seed corpus. To fill this gap, we conduct the first systematic study to evaluate the impact of seed selection strategies on DL testing. Specifically, considering three popular goals of DL testing (i.e., coverage, failure detection, and robustness), we develop five seed selection strategies, including three based on single-objective optimization (SOO) and two based on multi-objective optimization (MOO). We evaluate these strategies on seven testing tools. Our results demonstrate that the selection of initial seed inputs greatly affects the testing performance. SOO-based selection can construct the best seed corpus for boosting DL testing with respect to a specific testing goal. MOO-based selection strategies can construct seed corpora that achieve balanced improvement on multiple objectives.