Software Fault Localization Based on Multi-objective Feature Fusion and Deep Learning

Xiaolei Hu, Dongcheng Li, W. Eric Wong, Ya Zou

2024-11-26

Abstract: Software fault localization remains challenging due to limited feature diversity and low precision in traditional methods. This paper proposes a novel approach that integrates multi-objective optimization with deep learning models to improve both accuracy and efficiency in fault localization (FL). By framing feature selection as a multi-objective optimization problem (MOP), we extract three critical fault-related feature sets (spectrum-based, mutation-based, and text-based features) and fuse them into a comprehensive feature fusion model. These features are then embedded within a deep learning architecture comprising a multilayer perceptron (MLP) and a gated recurrent network (GRN), which together enhance localization accuracy and generalizability. Experiments on the Defects4J benchmark dataset with 434 faults show that the proposed algorithm reduces processing time by 78.2% compared to single-objective methods. Additionally, our MLP and GRN models achieve a 94.2% improvement in localization accuracy compared to traditional FL methods, outperforming a state-of-the-art deep learning-based FL method by 7.67%. Further validation using the PROMISE dataset demonstrates the generalizability of the proposed model, showing a 4.6% accuracy improvement in cross-project tests over a state-of-the-art deep learning-based FL method.
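To make the idea of framing feature selection as a multi-objective optimization problem more concrete, the following minimal sketch filters candidate feature subsets down to their Pareto front. It is an illustration under stated assumptions, not the paper's algorithm: the two objectives (localization error and feature count, both minimized), the candidate names, and all numbers are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical objectives per candidate: (localization error, number of features).
# Both are minimized; the values are illustrative and not taken from the paper.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: Dict[str, Candidate]) -> List[str]:
    """Return the sorted names of candidates that no other candidate dominates."""
    return sorted(
        name
        for name, obj in candidates.items()
        if not any(
            dominates(other, obj)
            for other_name, other in candidates.items()
            if other_name != name
        )
    )


# Hypothetical feature subsets built from the three feature families.
candidates = {
    "spectrum_only": (0.40, 5),
    "spectrum+mutation": (0.30, 9),
    "spectrum+text": (0.35, 12),
    "all_three": (0.25, 15),
    "all_three_unpruned": (0.25, 30),
}

front = pareto_front(candidates)
print(front)
```

A single-objective method would collapse these criteria into one score and return one subset. The Pareto view instead keeps every trade-off that is not strictly beaten, which is the sense in which feature selection becomes a MOP.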

Software Engineering

What problem does this paper attempt to address?

This paper addresses two major challenges in software fault localization (FL):

1. **Insufficient feature extraction**: Existing feature extraction techniques cannot fully capture software fault information, which leads to inaccurate fault localization. Traditional methods usually rely on a single type of feature as guidance, leaving a large amount of useful feature information under-utilized.
2. **Low model accuracy**: Many existing models are not accurate enough at fault localization. On large-scale datasets of real faults in particular, processing time increases significantly, which hurts the time efficiency of the model.

To solve these problems, the paper proposes a new method that combines multi-objective optimization with deep learning models to improve the accuracy and efficiency of fault localization. The main contributions of the paper are as follows:

- **Multi-objective feature fusion algorithm**: Effective features are selected from three dimensions (spectrum-based, mutation-based, and text-based features) and combined by a multi-objective feature fusion algorithm that uses voting and weighting, which addresses static feature loss and feature information redundancy.
- **Deep learning-based fault localization model**: A fault localization model based on a multilayer perceptron (MLP) and a gated recurrent network (GRN) is designed and implemented, improving the accuracy and generalization ability of fault localization.

The experimental results show that, on the Defects4J benchmark dataset, the proposed algorithm reduces processing time by 78.2% compared with the single-objective method, and improves fault localization accuracy by 94.2% compared with traditional methods and by 7.67% compared with state-of-the-art deep learning methods. In addition, validation on the PROMISE dataset shows the generalization ability of the model, with a 4.6% accuracy improvement in cross-project testing over state-of-the-art deep learning methods.
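The voting-and-weighting fusion mentioned above could look roughly like the sketch below. The summary does not give the paper's actual procedure, so everything here is an assumption for illustration: several hypothetical selection runs each nominate features, a feature is kept only if it receives at least `min_votes` votes, and each kept feature is weighted by its share of the surviving votes. The feature and run names are invented.

```python
from collections import Counter
from typing import Dict, List


def fuse_by_voting(selections: Dict[str, List[str]], min_votes: int = 2) -> Dict[str, float]:
    """Keep features nominated by at least min_votes runs; weight them by vote share."""
    # Each run votes at most once per feature.
    votes = Counter(f for chosen in selections.values() for f in set(chosen))
    kept = {f: n for f, n in votes.items() if n >= min_votes}
    total = sum(kept.values())
    if total == 0:
        return {}
    # Normalize so the weights of the surviving features sum to 1.
    return {f: n / total for f, n in sorted(kept.items())}


# Hypothetical nominations from three selection runs (names are illustrative only).
selections = {
    "objective_a": ["ochiai", "mutant_kill_ratio", "text_similarity"],
    "objective_b": ["ochiai", "mutant_kill_ratio"],
    "objective_c": ["ochiai", "text_similarity", "line_count"],
}

weights = fuse_by_voting(selections)
for feature, weight in weights.items():
    print(f"{feature}: {weight:.3f}")
```

In this toy setup, a feature nominated by only one run is dropped, which is one simple way to limit redundancy, while features that every run agrees on receive the largest weight.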


Fault Localization Based on Wide & Deep Learning Model by Mining Software Behavior.

A Multiple-Criteria Ensemble Weight Strategy to Increase the Effectiveness of Deep Learning-based Fault Localization

Fault Localization in Deep Learning-based Software: A System-level Approach

A Fault Localization Approach Based on BiRNN and Multi-Dimensional Features.

ALBFL: A Novel Neural Ranking Model for Software Fault Localization Via Combining Static and Dynamic Features

DeepFD: Automated Fault Diagnosis and Localization for Deep Learning Programs

Enhancing Fault Localization Through Ordered Code Analysis with LLM Agents and Self-Reflection

Integrating Neural Mutation into Mutation-Based Fault Localization: A Hybrid Approach

HetFL: Heterogeneous Graph-based Software Fault Localization

Mulr4FL: Effective Fault Localization of Evolution Software Based on Multivariate Logistic Regression Model

Fault Localization Analysis Based on Deep Neural Network

A LambdaMart-Based High-Accuracy Approach for Software Automatic Fault Localization

Code-Aware Fault Localization with Pre-Training and Interpretable Machine Learning

ABFL: an Autoencoder Based Practical Approach for Software Fault Localization.

A Study of Effectiveness of Deep Learning in Locating Real Faults.

Feature-FL: Feature-Based Fault Localization

GMBFL: Optimizing Mutation-Based Fault Localization Via Graph Representation

Fault Diagnostic Method Based on Deep Learning and Multimodel Feature Fusion for Complex Industrial Processes

Dynamic Data Fault Localization for Deep Neural Networks

AgentFL: Scaling LLM-based Fault Localization to Project-Level Context