Modeling and Discovering Data Race with Concurrent Code Property Graphs.

Yang Cao, Yunwei Dong
DOI: https://doi.org/10.1109/QRS-C60940.2023.00074
2023-01-01
Abstract: Concurrent programs have brought about higher resource utilization and faster response times. However, the non-determinism in concurrent programs introduces concurrency bugs that are harder to detect than traditional program bugs. These concurrency bugs pose serious threats to the safety and security of software systems. Data races are the most common type of concurrency bug. Previous static analysis techniques either lack precision or face difficulties when applied to large-scale programs. The vulnerability detection tool Joern provides a new static approach that uses a novel source code representation, called the code property graph, and effectively mines various types of source code vulnerabilities in large-scale programs through graph traversal. However, because the code property graph does not model the concurrency relationships between threads, Joern does not support the detection of concurrency bugs. Therefore, this paper proposes a property graph-based data race detection method called RaceQuery. This method uses property graphs to represent concurrent programs. It constructs the concurrent code property graph from the control flow graph and enhances it with an analysis of synchronization operations to improve detection precision. RaceQuery then generates a set of concurrent thread pairs for the program from the concurrent code property graph and models data races as graph traversals, which enables it to locate data races in the graph effectively. The method is implemented as a prototype system using the popular graph database Neo4j and successfully detects 2 data races in an open-source project.
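The idea of pairing concurrent threads and then searching for conflicting accesses can be illustrated with a small sketch. This is not RaceQuery's implementation: the abstract does not give the graph schema or the traversal queries, so the `Access` record, the `ordered_pairs` parameter, and the lock-set check below are all assumptions chosen for illustration, and the graph is held in plain Python objects rather than in Neo4j. The sketch reports two accesses as a potential race when they come from threads that may run concurrently, touch the same variable, include at least one write, and share no lock.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Access:
    """One memory-access node in a toy property graph (hypothetical schema)."""
    thread: str
    variable: str
    kind: str                        # "read" or "write"
    line: int
    locks: frozenset = frozenset()   # locks held at this access (assumed property)


def concurrent_thread_pairs(threads, ordered_pairs=frozenset()):
    """Return thread pairs that may run concurrently.

    ordered_pairs is an assumed input: a set of frozensets naming thread
    pairs already known to be ordered, for example by a join.
    """
    return [
        (a, b)
        for a, b in combinations(sorted(threads), 2)
        if frozenset((a, b)) not in ordered_pairs
    ]


def find_races(accesses, ordered_pairs=frozenset()):
    """Report access pairs that look like data races under the assumptions above."""
    threads = {a.thread for a in accesses}
    races = []
    for t1, t2 in concurrent_thread_pairs(threads, ordered_pairs):
        for a in (x for x in accesses if x.thread == t1):
            for b in (x for x in accesses if x.thread == t2):
                same_variable = a.variable == b.variable
                has_write = "write" in (a.kind, b.kind)
                no_common_lock = not (a.locks & b.locks)
                if same_variable and has_write and no_common_lock:
                    races.append((a, b))
    return races


# Toy program: "counter" is accessed without locks, "total" under lock "m".
accesses = [
    Access("T1", "counter", "write", 10),
    Access("T2", "counter", "read", 20),
    Access("T1", "total", "write", 12, frozenset({"m"})),
    Access("T2", "total", "write", 22, frozenset({"m"})),
]
races = find_races(accesses)
for a, b in races:
    print(f"possible race on {a.variable}: line {a.line} vs line {b.line}")
```

In this toy input only the unprotected accesses to `counter` are reported; the accesses to `total` are filtered out because both hold the same lock. A graph-database version would presumably express the same conditions as a pattern match over access nodes, but the query form used by the paper is not stated in the abstract.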