Abstract: Since neuron coverage was introduced as a testing criterion for deep neural networks (DNNs), covering more neurons to exercise more of a DNN's internal logic has become the main goal of many research studies. Although some works have made progress, testing methods based on neuron coverage face new challenges: establishing better neuron selection and activation strategies, which affect not only the neuron coverage achieved but also testing efficiency; validating testing results automatically; and labeling generated test cases without manual work. In this article, we put forward Test4Deep, an effective white-box DNN testing approach based on neuron coverage. It builds on a differential testing framework to automatically verify inconsistent behavior of DNNs. We designed a strategy that tracks inactive neurons and repeatedly triggers them in each iteration to maximize neuron coverage. Furthermore, we devised an optimization function that guides the DNN under test to produce different predictions for the original input and the generated test data, while keeping the generated perturbations imperceptible so that test oracles need not be checked manually. We conducted comparative experiments with two state-of-the-art white-box testing methods, DLFuzz and DeepXplore. Empirical results on three popular datasets with nine DNNs show that, compared with DLFuzz and DeepXplore, Test4Deep achieved on average 32.87% and 35.69% higher neuron coverage while reducing testing time by 58.37% and 53.24%, respectively. Test4Deep also produced 58.37% and 53.24% more test cases with 23.81% and 98.40% fewer perturbations. Even compared with the two DLFuzz strategies that achieve the highest neuron coverage, Test4Deep still improved neuron coverage by 4.34% and 23.23% and achieved 94.48% and 85.67% higher generation time efficiency. Furthermore, Test4Deep could improve the accuracy and robustness of DNNs by retraining them with the generated test cases merged in.
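To make the idea of tracking inactive neurons concrete, the following minimal sketch records which neurons have exceeded an activation threshold and lists those still uncovered. It is not the Test4Deep implementation. The class name `CoverageTracker`, the per-layer min-max scaling, and the threshold of 0.5 are illustrative assumptions, not details from the abstract.

```python
import numpy as np


def scale(layer_out):
    """Min-max scale one layer's activations to [0, 1] (assumed convention)."""
    lo, hi = layer_out.min(), layer_out.max()
    if hi == lo:
        return np.zeros_like(layer_out, dtype=float)
    return (layer_out - lo) / (hi - lo)


class CoverageTracker:
    """Hypothetical helper: remembers which neurons were ever activated."""

    def __init__(self, layer_sizes, threshold=0.5):
        self.threshold = threshold
        # One boolean mask per layer; True means the neuron has been covered.
        self.covered = [np.zeros(n, dtype=bool) for n in layer_sizes]

    def update(self, layer_outputs):
        """Mark neurons whose scaled activation exceeds the threshold."""
        for mask, out in zip(self.covered, layer_outputs):
            mask |= scale(np.asarray(out, dtype=float)) > self.threshold

    def coverage(self):
        """Fraction of all neurons covered so far."""
        total = sum(m.size for m in self.covered)
        return sum(int(m.sum()) for m in self.covered) / total

    def inactive(self):
        """(layer, neuron) indices not yet covered: candidates to trigger next."""
        return [(i, int(j))
                for i, m in enumerate(self.covered)
                for j in np.flatnonzero(~m)]


# Activations of a toy two-layer network for one input.
tracker = CoverageTracker([3, 2])
tracker.update([np.array([0.0, 1.0, 2.0]), np.array([5.0, 1.0])])
print(tracker.coverage(), tracker.inactive())
```

In a coverage-guided loop, the indices returned by `inactive()` would be the ones whose activations are added to the optimization objective in the next iteration. How Test4Deep actually selects and weights them is not specified in the abstract.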

Are Coverage Criteria Meaningful Metrics for DNNs?

There is Limited Correlation Between Coverage and Robustness for Deep Neural Networks

An Empirical Study on Correlation between Coverage and Robustness for Deep Neural Networks

Structural coverage criteria for neural networks could be misleading

HashC: Making DNNs' Coverage Testing Finer and Faster

Revisiting Deep Neural Network Test Coverage from the Test Effectiveness Perspective

Correlations Between Deep Neural Network Model Coverage Criteria and Model Quality

Increasing the Confidence of Deep Neural Networks by Coverage Analysis

Quantifying safety risks of deep neural networks

HashC: Making deep learning coverage testing finer and faster

Robust Black-box Testing of Deep Neural Networks using Co-Domain Coverage

Coverage Testing of Deep Learning Models using Dataset Characterization

Evaluating Deep Neural Networks in Deployment (A Comparative and Replicability Study)

Dependable Neural Networks for Safety Critical Tasks

Coverage-enhanced fault diagnosis for Deep Learning programs: A learning-based approach with hybrid metrics

A White-Box Testing for Deep Neural Networks Based on Neuron Coverage

Verifying Safety of Neural Networks from Topological Perspectives

DeepPath: Path-driven Testing Criteria for Deep Neural Networks

SoK: Certified Robustness for Deep Neural Networks

Assessing Systematic Weaknesses of DNNs using Counterfactuals

Inspect, Understand, Overcome: A Survey of Practical Methods for AI Safety