Abstract: Spectrum-based fault localization (SBFL) techniques are widely studied and have been shown to be effective in locating faults. Recent studies also show that developers in industry value automated SBFL techniques. However, their effectiveness is still limited for two main reasons. First, the test coverage information used to construct the spectrum does not directly reflect the root cause. Second, SBFL suffers from the tie issue, so buggy code entities cannot be well differentiated from non-buggy ones. To address these challenges, we propose to leverage version histories in fault localization, based on the following two intuitions. First, version histories record how bugs are introduced into software projects, and this information directly reflects the root cause of bugs. Second, the evolution histories of code can help differentiate suspicious code entities that SBFL ranks in a tie. Our intuitions are also inspired by observations of debugging practices in large open source projects and in industry. Based on these intuitions, we propose a novel technique, HSFL (historical spectrum based fault localization). Specifically, HSFL first identifies bug-inducing commits from the version history. It then constructs a historical spectrum (denoted as Histrum) based on the bug-inducing commits, which is another dimension of spectrum orthogonal to the coverage-based spectrum used in SBFL. Finally, HSFL ranks the suspicious code elements based on our proposed Histrum and the conventional spectrum. HSFL significantly outperforms the state-of-the-art SBFL techniques on the Defects4J benchmark. Specifically, it locates and ranks the buggy statement at Top-1 for 77.8 percent more bugs than SBFL, and at Top-5 for 33.9 percent more bugs. In addition, for the metrics MAP and MRR, HSFL achieves average improvements of 28.3 and 40.8 percent over all bugs, respectively. Moreover, HSFL also outperforms six other families of fault localization techniques, and our proposed Histrum model can be integrated with different families of techniques to boost their performance.
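The tie issue mentioned in the abstract can be illustrated with a minimal sketch. It assumes the Ochiai formula, a commonly used SBFL suspiciousness formula that the abstract itself does not name, and uses invented coverage data. The `history` scores and the secondary-key tie-break are purely hypothetical stand-ins for evidence drawn from version history; they are not HSFL's actual Histrum model or ranking formula.

```python
import math
from collections import defaultdict


def ochiai(ef, ep, nf):
    """Ochiai suspiciousness: ef / sqrt((ef + nf) * (ef + ep))."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0


# Hypothetical coverage data: statement -> tests that execute it.
coverage = {
    "s1": {"t1", "t2", "t3"},
    "s2": {"t3"},
    "s3": {"t3"},
    "s4": {"t1", "t3"},
}
failing = {"t3"}
passing = {"t1", "t2"}


def sbfl_scores(coverage, failing, passing):
    """Compute an Ochiai score per statement from the coverage spectrum."""
    scores = {}
    for stmt, tests in coverage.items():
        ef = len(tests & failing)   # failing tests that execute stmt
        ep = len(tests & passing)   # passing tests that execute stmt
        nf = len(failing - tests)   # failing tests that do not execute stmt
        scores[stmt] = ochiai(ef, ep, nf)
    return scores


def tie_groups(scores):
    """Return groups of statements that share the same suspiciousness."""
    groups = defaultdict(list)
    for stmt, score in scores.items():
        groups[round(score, 9)].append(stmt)
    return [sorted(g) for g in groups.values() if len(g) > 1]


scores = sbfl_scores(coverage, failing, passing)
ties = tie_groups(scores)  # s2 and s3 are indistinguishable by coverage alone

# Hypothetical history-derived scores (an assumption for illustration only):
# here s3 is imagined to be touched by a suspected bug-inducing commit.
history = {"s2": 0.0, "s3": 1.0}

# Illustrative tie-break: SBFL score first, history score second.
ranked = sorted(
    scores,
    key=lambda s: (scores[s], history.get(s, 0.0)),
    reverse=True,
)
print(ties)
print(ranked)
```

In this toy data, s2 and s3 are covered by exactly the same tests, so any coverage-based formula gives them the same score; only a second source of evidence can separate them.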

Graph Neural Network Based Two-Phase Fault Localization Approach

Just-In-Time Defect Identification and Localization: A Two-Phase Framework

Software Fault Localization Based on Network Spectrum and Graph Neural Network

A Dynamic Fault Localization Technique with Noise Reduction for Java Programs

A General Noise-Reduction Framework for Fault Localization of Java Programs

Towards Better Graph Neural Network-based Fault Localization Through Enhanced Code Representation

Automatic Belief Network Modeling Via Policy Inference for SDN Fault Localization

ALBFL: A Novel Neural Ranking Model for Software Fault Localization Via Combining Static and Dynamic Features

An effective fault localization approach based on PageRank and mutation analysis

ABFL: an Autoencoder Based Practical Approach for Software Fault Localization

A fault localization approach based on fault propagation context

Novel Fault Localization Approach Based on Lagrangian Relaxation and Subgradient Method

Path Analysis for Effective Fault Localization in Deep Neural Networks

Software Fault Localization Based on Multi-objective Feature Fusion and Deep Learning

Historical Spectrum Based Fault Localization

A Lightweight Fault Localization Approach based on XGBoost

A Hybrid Approach to Fine-grained Automated Fault Localization

Global-and-local-structure-based neural network for fault detection

Effective Fault Localization using Probabilistic and Grouping Approach

A Novel Label-Aware Global Graph Construction Method and Spiking-Coded Graph Neural Network for Intelligent Process Fault Diagnosis

A Preliminary Investigation on the Performance of SBFL Techniques and Distance Metrics in Parallel Fault Localization