Efficient Construction of Practical Python Call Graphs with Entity Knowledge Base

Yulu Cao, Lin Chen, Zhifei Chen, Jiacheng Zhong, Xiaowei Zhang, Linzhang Wang
DOI: https://doi.org/10.1142/s0218194024500104
IF: 1.007
2024-01-01
International Journal of Software Engineering and Knowledge Engineering
Abstract: Call graphs facilitate various tasks in software engineering. However, for the dynamic language Python, complex language features and external library dependencies pose enormous challenges for building the call graphs of real projects. Some program analysis techniques used for call graph construction in other languages are impractical for Python. In this paper, we present STAR, a practical technique for constructing static call graphs for Python. We reformulate call graph construction as an entity identification task. STAR leverages inter-module summaries and cross-project dependencies to build a fine-grained entity knowledge base, uses it to identify the possible nodes and edges of the call graph in the code, and then constructs the call graph. Our evaluation on three benchmarks shows that (1) STAR improves recall on the three benchmarks relative to three baseline tools. In particular, STAR improves the recall of reachable nodes and reachable edges over the state-of-the-art tool by 11.3% and 9.8%, respectively; (2) STAR achieves execution time and memory usage comparable to the three baseline tools and is more efficient on large projects; (3) STAR can be used effectively to detect vulnerability propagation in real-world cases. We expect our results to attract more exploration of practical methods and to improve the application of Python call graphs.
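To make the idea of "call graph construction as entity identification" concrete, the following is a minimal sketch and not STAR's actual algorithm. It assumes a toy setting in which only top-level functions are entities, the knowledge base is a plain set of qualified names, and call targets are resolved through import aliases. The names `build_entity_kb`, `resolve`, and `build_call_graph` are hypothetical and do not come from the paper.

```python
import ast


def build_entity_kb(trees):
    """Collect qualified names (module.function) of top-level functions.

    Hypothetical stand-in for an entity knowledge base.
    """
    return {
        f"{mod}.{node.name}"
        for mod, tree in trees.items()
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }


def resolve(expr, aliases):
    """Map a call expression to a qualified name, or None if unknown."""
    if isinstance(expr, ast.Name):
        return aliases.get(expr.id)
    if isinstance(expr, ast.Attribute) and isinstance(expr.value, ast.Name):
        base = aliases.get(expr.value.id)
        return f"{base}.{expr.attr}" if base else None
    return None


def build_call_graph(modules):
    """modules: dict mapping module name -> source text."""
    trees = {mod: ast.parse(src) for mod, src in modules.items()}
    kb = build_entity_kb(trees)
    graph = {}
    for mod, tree in trees.items():
        # Local names visible in this module: imports and own functions.
        aliases = {}
        for node in tree.body:
            if isinstance(node, ast.Import):
                for a in node.names:
                    aliases[a.asname or a.name] = a.name
            elif isinstance(node, ast.ImportFrom) and node.module:
                for a in node.names:
                    aliases[a.asname or a.name] = f"{node.module}.{a.name}"
            elif isinstance(node, ast.FunctionDef):
                aliases[node.name] = f"{mod}.{node.name}"
        # An edge is kept only if the callee is a known entity.
        for func in tree.body:
            if not isinstance(func, ast.FunctionDef):
                continue
            caller = f"{mod}.{func.name}"
            for node in ast.walk(func):
                if isinstance(node, ast.Call):
                    target = resolve(node.func, aliases)
                    if target in kb:
                        graph.setdefault(caller, set()).add(target)
    return graph


# Illustrative two-module "project" (made-up source text).
modules = {
    "util": "def helper():\n    return 1\n",
    "app": (
        "import util\n"
        "from util import helper as h\n"
        "\n"
        "def main():\n"
        "    h()\n"
        "    util.helper()\n"
        "    print('x')\n"
    ),
}
print(build_call_graph(modules))  # {'app.main': {'util.helper'}}
```

The call to `print` produces no edge because it is not in the toy knowledge base. A practical tool would also need to cover classes, methods, nested scopes, and entities defined in external libraries, which this sketch deliberately omits.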