Deep Learning Framework Testing Via Hierarchical and Heuristic Model Generation.

Yinglong Zou, Haofeng Sun, Chunrong Fang, Jiawei Liu, Zhenping Zhang
DOI: https://doi.org/10.1016/j.jss.2023.111681
IF: 3.5
2023-01-01
Journal of Systems and Software
Abstract: Deep learning frameworks are the foundation of deep learning model construction and inference. Many testing methods that use deep learning models as test inputs have been proposed to ensure the quality of deep learning frameworks. However, critical challenges remain in model generation, model instantiation, and result analysis. To bridge this gap, we propose Ramos, a hierarchical heuristic deep learning framework testing method. To generate diversified models, we design a novel hierarchical structure to represent the building blocks of a model. Based on this structure, new models are generated by mutation. To trigger more precision bugs in deep learning frameworks, we design a heuristic method that increases the error triggered by models and guides subsequent model generation. To reduce false positives, we propose an API mapping rule between different frameworks to aid model instantiation. Further, we design separate test oracles for crashes and for precision bugs. We conduct experiments on three widely used frameworks (TensorFlow, PyTorch, and MindSpore) to evaluate the effectiveness of Ramos. The results show that Ramos can effectively generate diversified models and detect more deep learning framework bugs, including crashes and precision bugs, with fewer false positives. Additionally, 14 of 15 bugs are confirmed by developers.
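The abstract mentions separate test oracles for crashes and precision bugs but gives no details. The following is only a minimal sketch of how such a differential oracle might look, not the Ramos implementation. It assumes each framework is wrapped as a plain Python callable that maps a list of floats to a list of floats; the name `differential_oracle`, the backend labels, and the tolerance `tol` are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# A finding is (kind, where, detail); kind is "crash" or "precision".
Finding = Tuple[str, str, str]
Backend = Callable[[Sequence[float]], Sequence[float]]


def differential_oracle(
    backends: Dict[str, Backend],
    inputs: Sequence[float],
    tol: float = 1e-4,  # hypothetical threshold, not taken from the paper
) -> List[Finding]:
    """Run the same input on every backend and compare the results."""
    outputs: Dict[str, List[float]] = {}
    findings: List[Finding] = []

    # Crash oracle: any exception raised by a backend is recorded.
    for name, run in backends.items():
        try:
            outputs[name] = list(run(inputs))
        except Exception as exc:
            findings.append(("crash", name, repr(exc)))

    # Precision oracle: pairwise comparison of the surviving outputs.
    names = sorted(outputs)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            a, b = outputs[first], outputs[second]
            pair = f"{first} vs {second}"
            if len(a) != len(b):
                findings.append(("precision", pair, "output length mismatch"))
                continue
            diffs = [abs(x - y) for x, y in zip(a, b)]
            # NaN differences (e.g. inf - inf) are flagged as well.
            if any(math.isnan(d) or d > tol for d in diffs):
                findings.append(("precision", pair, f"diffs={diffs}"))
    return findings


if __name__ == "__main__":
    def broken(xs: Sequence[float]) -> Sequence[float]:
        raise ValueError("unsupported operator")  # simulated crash

    demo = {
        "ref": lambda xs: [2.0 * x for x in xs],
        "drift": lambda xs: [2.0 * x + 0.01 for x in xs],  # simulated drift
        "broken": broken,
    }
    for finding in differential_oracle(demo, [1.0, 2.0]):
        print(finding)
```

In a real setting the callables would wrap the same model instantiated in each framework, and the tolerance would need tuning per operator and data type.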