Abstract: Vulnerabilities are disclosed with corresponding patches so that users can remediate them in time. However, patches are sometimes not released alongside the disclosed vulnerabilities, which creates hidden risks, especially when dependent software is not informed about the affected code repository. Hence, it is crucial to automatically locate security patches for disclosed vulnerabilities among a multitude of commits. Despite the promising performance of existing learning-based localization approaches, they still suffer from the following limitations: (1) They do not perform well in data scarcity scenarios. Most neural models require extensive datasets to capture the semantic correlations between the vulnerability description and code commits, while the number of disclosed vulnerabilities with patches is limited. (2) They struggle to capture the deep semantic correlations between the vulnerability description and code commits, because code changes and commit messages differ inherently in semantics and characters, which makes it difficult for a single model to relate both to the vulnerability description. To mitigate these two limitations, we propose a novel security patch localization approach named PromVPat, which uses dual prompt tuning channels to capture the semantic correlation between vulnerability descriptions and commits, especially in data scarcity (i.e., few-shot) scenarios. We first input the commit message and the code changes, each paired with the vulnerability description, into the prompt generator to produce two new inputs with prompt templates. Then, we adopt a pre-trained language model (PLM) as the encoder, fine-tune the encoder with prompt tuning, and generate two correlation probabilities as the semantic features. In addition, we extract 26 handcrafted features from the vulnerability descriptions and the code commits. Finally, we use an attention mechanism to fuse the handcrafted and semantic features, which are fed into the classifier to predict the correlation probability and locate the security patch. To evaluate the performance of PromVPat, we compare it with five baselines on two datasets. Experimental results demonstrate that PromVPat performs best in the security patch localization task, improving on the best baseline by 14.42% and 86.57% in Recall@1 on the two datasets. Moreover, PromVPat remains effective even in data scarcity scenarios.
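
The abstract reports results in terms of Recall@1 but does not define the metric. As a minimal sketch, assuming the usual ranking formulation (candidate commits for each vulnerability are sorted by predicted correlation probability, and each vulnerability has exactly one true security patch), Recall@K could be computed as below. The function name `recall_at_k`, the data layout, and the example identifiers and scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict


def recall_at_k(
    scores: Dict[str, Dict[str, float]],
    true_patch: Dict[str, str],
    k: int = 1,
) -> float:
    """Fraction of vulnerabilities whose true patch is ranked in the top k.

    scores maps a vulnerability id to {commit id: predicted probability}.
    true_patch maps a vulnerability id to its single true patch commit id
    (an assumption of this sketch).
    """
    if not scores:
        return 0.0
    hits = 0
    for vuln_id, commit_scores in scores.items():
        # Rank candidate commits from highest to lowest predicted probability.
        ranked = sorted(commit_scores, key=commit_scores.get, reverse=True)
        if true_patch[vuln_id] in ranked[:k]:
            hits += 1
    return hits / len(scores)


# Hypothetical predictions for two vulnerabilities (illustrative values only).
example_scores = {
    "VULN-A": {"c1": 0.91, "c2": 0.40, "c3": 0.12},
    "VULN-B": {"c4": 0.35, "c5": 0.80, "c6": 0.77},
}
example_truth = {"VULN-A": "c1", "VULN-B": "c6"}

print(recall_at_k(example_scores, example_truth, k=1))  # 0.5
print(recall_at_k(example_scores, example_truth, k=2))  # 1.0
```

In this toy example the true patch of VULN-A is ranked first, while that of VULN-B is ranked second, so Recall@1 is 0.5 and Recall@2 is 1.0.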

Automated Software Vulnerability Patching using Large Language Models

VulnLLMEval: A Framework for Evaluating Large Language Models in Software Vulnerability Detection and Patching

Harnessing Large Language Models for Software Vulnerability Detection: A Comprehensive Benchmarking Study

How Far Have We Gone in Vulnerability Detection Using Large Language Models

Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities

VulDetectBench: Evaluating the Deep Capability of Vulnerability Detection with Large Language Models

Outside the Comfort Zone: Analysing LLM Capabilities in Software Vulnerability Detection

Chain-of-Thought Prompting of Large Language Models for Discovering and Fixing Software Vulnerabilities

Can LLMs Patch Security Issues?

Multitask-based Evaluation of Open-Source LLM on Software Vulnerability

Just-in-Time Detection of Silent Security Patches

LLM-Enhanced Software Patch Localization

Dual Prompt-Based Few-Shot Learning for Automated Vulnerability Patch Localization

DLAP: A Deep Learning Augmented Large Language Model Prompting Framework for Software Vulnerability Detection

Fixing Security Vulnerabilities with AI in OSS-Fuzz

A Study of Vulnerability Repair in JavaScript Programs with Large Language Models

PenHeal: A Two-Stage LLM Framework for Automated Pentesting and Optimal Remediation

LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning

A Preliminary Study on Using Large Language Models in Software Pentesting

Code Vulnerability Repair with Large Language Model using Context-Aware Prompt Tuning