SVulDetector: Vulnerability Detection Based on Similarity Using Tree-Based Attention and Weighted Graph Embedding Mechanisms

Weining Zheng, Xiaohong Su, Hongwei Wei, Wenxin Tao
DOI: https://doi.org/10.1016/j.cose.2024.103930
IF: 5.105
2024-01-01
Computers & Security
Abstract: Vulnerability detection by comparing code against known vulnerable code is an important method for improving code security, and is particularly effective at detecting vulnerabilities caused by code reuse. However, detection is made difficult by two factors: code fragments that share the same vulnerability pattern often differ in statements unrelated to the vulnerability, and vulnerable code often differs only slightly from its fixed, non-vulnerable version. To address these challenges, we argue that more attention should be paid to the core syntactic and semantic information about vulnerabilities, which helps models identify vulnerable code more accurately. Hence, we propose a novel code-similarity-based vulnerability detection approach named SVulDetector. First, it introduces a new code representation, called Sliced Composite Graphs (SCGs), which captures rich syntactic and semantic information related to vulnerable statements while minimizing interference from similar but vulnerability-irrelevant information. Next, a tree-based attention mechanism highlights key syntactic information in vulnerable code and fixed non-vulnerable code. Finally, SVulDetector highlights key vulnerable node information in the graph-based code representation via a weighted graph embedding mechanism. We extensively evaluated SVulDetector on an improved real-world dataset using both binary classification and multi-class vulnerability detection tasks, and it outperformed existing state-of-the-art detection methods.
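To make the idea of a weighted graph embedding followed by a similarity comparison more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify how node weights are computed or how similarity is measured, so this sketch assumes softmax-normalized per-node importance scores, a weighted sum of node vectors as the graph embedding, and cosine similarity as the comparison. The function names, node vectors, and scores are all hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw importance scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_graph_embedding(node_vectors, node_scores):
    """Pool node vectors into one graph vector using softmax weights.

    Assumption: nodes with higher scores (e.g. vulnerability-related
    statements) should contribute more to the graph-level embedding.
    """
    weights = softmax(node_scores)
    dim = len(node_vectors[0])
    return [
        sum(w * vec[i] for w, vec in zip(weights, node_vectors))
        for i in range(dim)
    ]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy data: two graphs with three nodes each, 2-d node vectors.
known_vulnerable = weighted_graph_embedding(
    [[1.0, 0.0], [0.2, 0.8], [0.1, 0.9]],
    [3.0, 0.5, 0.5],  # first node is treated as the key vulnerable node
)
candidate = weighted_graph_embedding(
    [[0.9, 0.1], [0.3, 0.7], [0.0, 1.0]],
    [3.0, 0.5, 0.5],
)
similarity = cosine_similarity(known_vulnerable, candidate)
print(f"similarity to known vulnerable code: {similarity:.3f}")
```

In a learned model the node scores would come from a trained attention layer rather than being set by hand; the fixed scores above only illustrate how weighting lets a few key nodes dominate the comparison.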