Abstract:
Context: Constructing an effective defect prediction model relies on a substantial number of labeled program modules. Unfortunately, labeling program modules is often time-consuming and error-prone. Semi-supervised software defect prediction (SSDP) can alleviate this issue by combining a few labeled modules with the remaining unlabeled modules from the same project.
Objective: However, previous SSDP methods ignore the significant influence of dependencies between software modules. Moreover, the potential of knowledge distillation to use labeled instances to guide the learning process, and to exploit information from unlabeled instances to improve SSDP performance, has not been fully investigated.
Method: We propose a novel approach, SeDPGK. First, to exploit graph-structured knowledge, we construct the program dependence graph to extract control and data dependencies among modules. We then use graph neural networks (GNNs) to learn a graph representation of the module relationships and, for diversity, combine it with the statement semantics of the abstract syntax tree and traditional static features. Second, we jointly train multiple GNNs as teacher models, ensembling various styles of graph-based networks to generate trustworthy labels for unlabeled modules. Finally, to preserve sufficient structural and semantic knowledge from the teacher models, we adopt trainable label propagation and a multi-layer perceptron as the student model, and we reduce the differences between the teacher and student models using two widely used knowledge distillation functions.
Results: We conducted experiments on 17 real-world projects. The results show that SeDPGK outperforms semi-supervised baselines, with average improvements of 16.9% in PD, 42.5% in FAR, and 8.9% in AUC. Moreover, the performance improvement is consistently significant across multiple statistical tests.
Conclusion: The effectiveness of SeDPGK comes from aggregating heterogeneous GNNs. Moreover, the graph structure and semantic features hidden in the source code play a crucial role in the distillation framework.
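
The teacher-to-student step described in the Method can be illustrated with a minimal sketch. The abstract does not name the two distillation functions, so everything below is an assumption made for illustration: the teachers' outputs are combined by a plain average of temperature-softened probabilities, the distillation function is a KL divergence, and the function names, the temperature value, and the logits are all hypothetical rather than taken from SeDPGK.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_labels(teacher_logits, temperature=2.0):
    """Average the softened predictions of several teachers for one module.

    Assumption: a plain mean; the paper may weight or combine teachers differently.
    """
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Assumed distillation term: KL between ensemble soft labels and student output."""
    target = ensemble_soft_labels(teacher_logits, temperature)
    prediction = softmax(student_logits, temperature)
    return kl_divergence(target, prediction)


# Hypothetical logits for one unlabeled module, ordered [clean, defective].
teachers = [[0.2, 1.4], [0.1, 1.0], [-0.3, 0.9]]
agreeing_student = [0.0, 1.1]      # leans "defective", like the teachers
disagreeing_student = [1.5, -0.5]  # leans "clean", unlike the teachers

loss_agree = distillation_loss(agreeing_student, teachers)
loss_disagree = distillation_loss(disagreeing_student, teachers)
print(f"agreeing student loss:    {loss_agree:.4f}")
print(f"disagreeing student loss: {loss_disagree:.4f}")
```

In this sketch the student that agrees with the teacher ensemble receives the smaller loss, which is the signal a distillation term is meant to provide on unlabeled modules. A full training objective would typically add a supervised loss on the labeled modules as well.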

SeDPGK: Semi-supervised software defect prediction with graph representation learning and knowledge distillation