Abstract: We propose a method combining machine learning with a static analysis tool (i.e., Infer) to automatically repair source code. Machine learning methods perform well at producing idiomatic source code. However, their output is sometimes difficult to trust, as language models can output incorrect code with high confidence. Static analysis tools are trustworthy, but also less flexible, and they produce non-idiomatic code. In this paper, we propose to fix resource leak bugs in IR space, and to use a sequence-to-sequence model to propose fixes in source code space. We also study several decoding strategies, and use Infer to filter the output of the model. On a dataset of CodeNet submissions with potential resource leak bugs, our method is able to find a function with the same semantics that does not raise a warning with around 97% precision and 66% recall.
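To make the bug class concrete, here is a minimal Java sketch of a resource leak and an idiomatic repair. It is an illustration only, not an example taken from the paper or its dataset: the class and method names are hypothetical, and whether a given analyzer reports the first method depends on the tool and its configuration.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ResourceLeakExample {

    // Leaky variant: if readLine() throws, close() is never reached,
    // so the underlying file handle may stay open.
    static String firstLineLeaky(Path path) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader(path.toFile()));
        String line = reader.readLine();
        reader.close();
        return line;
    }

    // Repaired variant: try-with-resources closes the reader on every path,
    // including the exceptional one.
    static String firstLineFixed(Path path) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(path.toFile()))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical input file created only for this demonstration.
        Path tmp = Files.createTempFile("leak-demo", ".txt");
        Files.write(tmp, "hello\nworld".getBytes());
        System.out.println(firstLineLeaky(tmp));
        System.out.println(firstLineFixed(tmp));
        Files.delete(tmp);
    }
}
```

Both methods return the same value on the normal path, which is the sense in which a repair should keep the semantics of the original function while removing the warning.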

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper aims to address the issue of vulnerability detection and repair in source code. Specifically, it proposes a method that combines machine learning with static analysis tools (such as Infer) to automatically fix resource leak vulnerabilities in source code.

#### Main Contributions:

1. **Generating Parallel Dataset**: The paper generates a parallel dataset containing Java code and Infer IR, and trains a model to decompile Infer IR back to source code.
2. **Automatic Repair of Resource Leak Vulnerabilities**: It proposes a method to reliably and automatically fix resource leak vulnerabilities in Infer IR.
3. **Combining Automatic Repair Tools and Decompiler**: It combines the new automatic repair tool with a decompiler to fix resource leak vulnerabilities and recover the repaired Java source code. Experimental results show that this method can fix over 66% of Infer warnings in the dataset, and 96.9% of the submissions passed unit tests.

### Research Background and Challenges

Current models struggle to capture the reasoning capabilities within the structure of source code without a sufficiently large high-quality dataset. Additionally, machine learning models may produce results that are difficult to trust, while static analysis tools, although reliable, are not flexible enough and struggle to propose fixes that align with programming practices.

### Method Overview

1. **Dataset**: Experiments are conducted using the GitHub BigQuery and CodeNet datasets.
2. **Model Architecture**: A sequence-to-sequence (seq2seq) Transformer model is used, consisting of 6 layers each for the encoder and decoder, with approximately 312 million parameters.
3. **Pre-training**: The model is pre-trained using a denoising autoencoder task to improve its ability to understand IR and generate Java source code.
4. **Automatic Repair**: Vulnerabilities are automatically repaired at the IR level, and a decompiler is used to convert the repaired IR back to source code.
5. **Evaluation Metrics**: These include edit distance, compilation accuracy, IR matching degree, and unit test accuracy.

### Conclusion and Future Work

The proposed method can suggest appropriate fixes in 66.4% of code segments, with an accuracy ranging from 96.4% to 97.9%. Future work plans include extending to other types of vulnerabilities, developing automated methods to compare the semantic equivalence of IR, and proposing semantic distances between programs.
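The filtering idea described above (generate several candidate repairs, then keep only those a checker accepts) can be sketched as follows. This is a hedged illustration, not the paper's implementation: `CandidateFilter`, `firstAccepted`, and the predicates in `main` are hypothetical stand-ins, and the checks are simple string tests rather than real calls to a compiler or to Infer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

public class CandidateFilter {

    // Returns the first candidate accepted by every check, preserving the
    // order in which candidates were supplied (for example, the decoder's
    // ranking). The result is empty if no candidate survives.
    static Optional<String> firstAccepted(List<String> candidates,
                                          List<Predicate<String>> checks) {
        return candidates.stream()
                .filter(c -> checks.stream().allMatch(check -> check.test(c)))
                .findFirst();
    }

    public static void main(String[] args) {
        // Assumption: these predicates stand in for "compiles" and
        // "the static analyzer raises no warning"; real checks would
        // invoke external tools instead of inspecting strings.
        Predicate<String> compiles = src -> !src.contains("SYNTAX_ERROR");
        Predicate<String> noWarning = src -> src.contains("try (");

        List<String> candidates = Arrays.asList(
                "reader = open(); SYNTAX_ERROR",
                "reader = open(); read(reader);",
                "try (reader = open()) { read(reader); }");

        Optional<String> repair =
                firstAccepted(candidates, Arrays.asList(compiles, noWarning));
        System.out.println(repair.orElse("no acceptable candidate"));
    }
}
```

Returning an empty result when every candidate is rejected reflects the trade-off reported in the summary: a strict filter declines to propose a fix for some inputs, which lowers recall but keeps the accepted fixes more trustworthy.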
