HiddenCPG: Large-Scale Vulnerable Clone Detection Using Subgraph Isomorphism of Code Property Graphs

Seongil Wi, Sijae Woo, Joyce Jiyoung Whang, Sooel Son
DOI: https://doi.org/10.1145/3485447.3512235
2022-04-25
Abstract: A code property graph (CPG) is a joint representation of the syntax, control flows, and data flows of a target application. Recent studies have demonstrated the promising efficacy of leveraging CPGs to identify vulnerabilities. This approach recasts the problem of implementing a specific static analysis for a target vulnerability as a graph query composition problem: it requires devising coarse-grained graph queries that model vulnerable code patterns. Unfortunately, such coarse-grained queries often leave vulnerabilities caused by faulty input sanitization undetected. In this paper, we propose HiddenCPG, a scalable system designed to identify various web vulnerabilities, including bugs that stem from incorrect sanitization. We designed HiddenCPG to find a subgraph in a target CPG that matches a given CPG query containing a known vulnerability, a task known as the subgraph isomorphism problem. To address the scalability challenge that stems from the NP-complete nature of this problem, HiddenCPG leverages optimization techniques designed to boost the efficiency of matching vulnerable subgraphs. HiddenCPG found confirmed vulnerabilities, including CVEs, among 2,464 potential vulnerabilities in real-world CPGs having a combined total of 1 billion nodes and 1.2 billion edges.
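
The matching step described in the abstract can be illustrated with a minimal sketch. It assumes a toy graph encoding (integer node IDs, string node labels, and labeled directed edges) and uses plain backtracking search. The label strings, the edge kind `data_flow`, and the function name `find_match` are hypothetical and are not taken from the paper. The optimizations that make the search scale to billion-node CPGs are not shown here.

```python
from typing import Dict, List, Optional, Set, Tuple

Edge = Tuple[int, int, str]  # (source node, destination node, edge kind)


def find_match(
    query_nodes: Dict[int, str],
    query_edges: Set[Edge],
    target_nodes: Dict[int, str],
    target_edges: Set[Edge],
) -> Optional[Dict[int, int]]:
    """Return one injective mapping from query nodes to target nodes, or None.

    Naive backtracking: exponential in the worst case, which is why a
    large-scale system needs additional pruning and optimization.
    """
    order: List[int] = sorted(query_nodes)

    def consistent(mapping: Dict[int, int]) -> bool:
        # Every query edge with both endpoints mapped must exist in the target.
        return all(
            (mapping[s], mapping[d], kind) in target_edges
            for (s, d, kind) in query_edges
            if s in mapping and d in mapping
        )

    def extend(mapping: Dict[int, int]) -> Optional[Dict[int, int]]:
        if len(mapping) == len(order):
            return dict(mapping)
        q = order[len(mapping)]
        used = set(mapping.values())
        for t, label in sorted(target_nodes.items()):
            # Skip targets already used or with a different node label.
            if t in used or label != query_nodes[q]:
                continue
            mapping[q] = t
            if consistent(mapping):
                found = extend(mapping)
                if found is not None:
                    return found
            del mapping[q]  # backtrack
        return None

    return extend({})


# Hypothetical query: user input flows directly into an output call.
query_nodes = {0: "read:$_GET", 1: "call:echo"}
query_edges = {(0, 1, "data_flow")}

# Hypothetical target: one sanitized flow (10 -> 11 -> 12), one direct flow (13 -> 14).
target_nodes = {
    10: "read:$_GET",
    11: "call:htmlspecialchars",
    12: "call:echo",
    13: "read:$_GET",
    14: "call:echo",
}
target_edges = {
    (10, 11, "data_flow"),
    (11, 12, "data_flow"),
    (13, 14, "data_flow"),
}

print(find_match(query_nodes, query_edges, target_nodes, target_edges))
```

In this toy target, the flow that passes through the sanitizer call does not match the query because no direct edge joins its input and output nodes, while the unsanitized flow does. A query that embeds a specific faulty sanitization routine as extra nodes and edges would be matched in the same way, which is the case the abstract says coarse-grained queries miss.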