Detecting Vulnerable Software Functions Via Text and Dependency Features

Wenlin Xu,Tong Li,Jinsong Wang,Yahui Tang
DOI: https://doi.org/10.1007/s00500-022-07775-5
IF: 3.732
2023-01-01
Soft Computing
Abstract:Detecting vulnerabilities in software is crucial to guarantee the security of software systems. Most previous methods focus on training a classification or regression model on the text feature of the source code to predict vulnerabilities. However, it is not always easy to obtain the labeled vulnerabilities in practical applications, and using only the text feature is insufficient to find the vulnerabilities in complex software systems. To address these problems, in this paper, we propose an unsupervised method to detect vulnerable software functions, which uses both text and dependency features of the source code to improve the detection accuracy. Specifically, we first extract the text and dependency features from the source code and concatenate them to the combined feature. We then learn a deep autoencoder to transform the combined feature into low-dimensional embedding. We finally apply an outlier detection method on the embedding to predict the vulnerable functions. We extensively evaluated the proposed method on seven C/C++ program datasets, and the results illustrate that our method improves F1 score on average of 88 and 66% over comparison methods Rats and Joern, which verifies the effectiveness of our method.
What problem does this paper attempt to address?