Snopy: Bridging Sample Denoising with Causal Graph Learning for Effective Vulnerability Detection

Sicong Cao, Xiaobing Sun, Xiaoxue Wu, David Lo, Lili Bo, Bin Li, Xiaolei Liu, Xingwei Lin, Wei Liu
DOI: https://doi.org/10.1145/3691620.3695057
2024-01-01
Abstract: Deep Learning (DL) has emerged as a promising means for vulnerability detection due to its ability to automatically derive features from vulnerable code. Unfortunately, current solutions struggle to focus on the vulnerability-related parts of vulnerable functions and tend to exploit spurious correlations for prediction, which undermines their effectiveness in practice. In this paper, we propose Snopy, a novel DL-based approach that bridges sample denoising with causal graph learning to capture real vulnerability patterns from noisy vulnerable samples for effective detection. Specifically, Snopy adopts a change-based sample denoising approach to automatically weed out vulnerability-irrelevant code elements in vulnerable functions without sacrificing label accuracy. Then, Snopy constructs a novel Causality-Aware Graph Attention Network (CA-GAT) with a Feature Caching Scheme (FCS) to learn causal vulnerability features while maintaining efficiency. Experiments on three public benchmark datasets show that Snopy outperforms state-of-the-art baselines by an average of 27.22%, 85.89%, and 75.50% in F1-score, respectively.
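The abstract does not spell out how change-based denoising works, so the following is only a rough sketch of the general idea, not Snopy's actual algorithm. It assumes that both the vulnerable and the patched version of a function are available, treats the lines the patch touched as vulnerability-relevant, and keeps a small window around them. The function names, the `context` parameter, and the rule for pure insertions are all hypothetical choices made for illustration.

```python
import difflib
from typing import List, Set


def changed_line_indices(vulnerable_src: str, patched_src: str) -> Set[int]:
    """Return 0-based indices of lines in the vulnerable version touched by the patch.

    Assumption: a line counts as touched if the patch deleted or replaced it.
    For a pure insertion, the line just before the insertion point is marked
    (a hypothetical heuristic so that add-only patches still yield an anchor).
    """
    before = vulnerable_src.splitlines()
    after = patched_src.splitlines()
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    changed: Set[int] = set()
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            changed.update(range(i1, i2))
        elif tag == "insert" and i1 > 0:
            changed.add(i1 - 1)
    return changed


def denoise(vulnerable_src: str, patched_src: str, context: int = 1) -> List[str]:
    """Keep only lines within `context` lines of a patch-touched line."""
    lines = vulnerable_src.splitlines()
    keep: Set[int] = set()
    for idx in changed_line_indices(vulnerable_src, patched_src):
        lo = max(0, idx - context)
        hi = min(len(lines), idx + context + 1)
        keep.update(range(lo, hi))
    return [lines[i] for i in sorted(keep)]


if __name__ == "__main__":
    vulnerable = (
        "void copy(char *dst, char *src, int n) {\n"
        "    log(\"copy\");\n"
        "    memcpy(dst, src, n);\n"
        "    return;\n"
        "}"
    )
    patched = vulnerable.replace("memcpy(dst, src, n);", "memcpy(dst, src, min(n, MAX));")
    for line in denoise(vulnerable, patched, context=0):
        print(line)
```

A line-level diff is the simplest possible stand-in here; an approach operating on code elements in a graph, as the abstract suggests, would presumably map the changed lines onto graph nodes rather than keep raw text.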