Abstract: Vulnerabilities are disclosed with corresponding patches so that users can remediate them in time. However, patches are sometimes not released with the disclosed vulnerabilities, which creates hidden dangers, especially if dependent software remains uninformed about the affected code repository. Hence, it is crucial to automatically locate security patches for disclosed vulnerabilities among a multitude of commits. Despite the promising performance of existing learning-based localization approaches, they suffer from two limitations: (1) They do not perform well in data scarcity scenarios. Most neural models require extensive datasets to capture the semantic correlations between the vulnerability description and code commits, while the number of disclosed vulnerabilities with patches is limited. (2) They struggle to capture the deep semantic correlations between the vulnerability description and code commits, because code changes and commit messages differ inherently in semantics and characters, which makes it difficult for a single model to capture these correlations. To mitigate these two limitations, we propose a novel security patch localization approach named PromVPat, which uses a dual prompt tuning channel to capture the semantic correlation between vulnerability descriptions and commits, especially in data scarcity (i.e., few-shot) scenarios. We first input the commit message and the code changes, each together with the vulnerability description, into the prompt generator to generate two new inputs with prompt templates. Then, we adopt a pre-trained language model (PLM) as the encoder, fine-tune the encoder with prompt tuning, and generate two correlation probabilities as the semantic features. In addition, we extract 26 handcrafted features from the vulnerability descriptions and the code commits. Finally, we use an attention mechanism to fuse the handcrafted and semantic features, which are fed into a classifier to predict the correlation probability and locate the security patch. To evaluate PromVPat, we compare it with five baselines on two datasets. Experimental results show that PromVPat performs best on the security patch localization task, improving on the best baseline by 14.42% and 86.57% in Recall@1 on the two datasets. Moreover, PromVPat remains effective in data scarcity scenarios.
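The data flow described in the abstract (two prompted inputs per commit, a correlation score, and a Recall@k evaluation over ranked commits) might be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the template wording, the `[MASK]` token, the `Commit` type, and all function names are assumptions, and `toy_score` is only a placeholder for the prompt-tuned PLM, the 26 handcrafted features, and the attention-based fusion, none of which are reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

MASK = "[MASK]"  # assumed mask token; the real one depends on the PLM tokenizer


@dataclass
class Commit:
    sha: str
    message: str
    diff: str


def build_prompts(description: str, commit: Commit) -> Dict[str, str]:
    """Build one prompted input per channel (hypothetical template wording)."""
    message_prompt = (
        f"Vulnerability: {description} Commit message: {commit.message} "
        f"The commit message is {MASK} to the vulnerability."
    )
    code_prompt = (
        f"Vulnerability: {description} Code change: {commit.diff} "
        f"The code change is {MASK} to the vulnerability."
    )
    return {"message": message_prompt, "code": code_prompt}


def toy_score(prompts: Dict[str, str]) -> float:
    """Placeholder scorer: counts repeated words in each prompt.

    Words shared by the description and the commit text appear twice,
    so they raise the score. A real system would use the tuned PLM's
    correlation probabilities and the fused features instead.
    """
    total = 0
    for text in prompts.values():
        words = text.lower().split()
        total += len(words) - len(set(words))
    return float(total)


def rank_commits(
    description: str,
    commits: Sequence[Commit],
    score: Callable[[Dict[str, str]], float],
) -> List[str]:
    """Return commit SHAs ordered from most to least correlated."""
    scored = [(score(build_prompts(description, c)), c.sha) for c in commits]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sha for _, sha in scored]


def recall_at_k(rankings: Sequence[List[str]], patches: Sequence[str], k: int = 1) -> float:
    """Fraction of vulnerabilities whose true patch is within the top k commits."""
    hits = sum(1 for ranked, patch in zip(rankings, patches) if patch in ranked[:k])
    return hits / len(patches)


if __name__ == "__main__":
    description = "heap buffer overflow in png parser"
    commits = [
        Commit("a1", "update readme", "+ docs line"),
        Commit("b2", "fix heap buffer overflow in png parser", "+ if (len > cap) return -1;"),
        Commit("c3", "refactor logging", "+ log_debug(msg);"),
    ]
    ranking = rank_commits(description, commits, toy_score)
    print(ranking[0], recall_at_k([ranking], ["b2"], k=1))
```

Keeping the scorer as a pluggable callable separates the ranking and Recall@k logic from whatever model produces the correlation score, so the toy scorer could be swapped for a real model without touching the evaluation code.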

ProRLearn: boosting prompt tuning-based vulnerability detection by reinforcement learning

DLAP: A Deep Learning Augmented Large Language Model Prompting Framework for Software Vulnerability Detection

Dual Prompt-Based Few-Shot Learning for Automated Vulnerability Patch Localization

Exploring the Universal Vulnerability of Prompt-based Learning Paradigm

How the Training Procedure Impacts the Performance of Deep Learning-based Vulnerability Patching

Your Instructions Are Not Always Helpful: Assessing the Efficacy of Instruction Fine-tuning for Software Vulnerability Detection

Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models

Towards prompt tuning-based software vulnerability assessment with continual learning

Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection

Learning diverse attacks on large language models for robust red-teaming and safety tuning

Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training

Vulnerability Detection with Representation Learning

Deep Learning based Vulnerability Detection: Are We There Yet?

Enhancing the Capability and Robustness of Large Language Models through Reinforcement Learning-Driven Query Refinement

Chain-of-Thought Prompting of Large Language Models for Discovering and Fixing Software Vulnerabilities

Tiny Refinements Elicit Resilience: Toward Efficient Prefix-Model Against LLM Red-Teaming

Large Language Model for Vulnerability Detection: Emerging Results and Future Directions

Game Rewards Vulnerabilities: Software Vulnerability Detection with Zero-Sum Game and Prototype Learning

Knowledge-Informed Auto-Penetration Testing Based on Reinforcement Learning with Reward Machine

Reinforcement Learning for Aligning Large Language Models Agents with Interactive Environments: Quantifying and Mitigating Prompt Overfitting

Automated Software Vulnerability Patching using Large Language Models