Towards Fully Declarative Program Analysis via Source Code Transformation

Rijnard van Tonder
DOI: https://doi.org/10.48550/arXiv.2112.12398
2021-12-23
Software Engineering
Abstract:Advances in logic programming and increasing industrial uptake of Datalog-inspired approaches demonstrate the emerging need to express powerful code analyses more easily. Declarative program analysis frameworks (e.g., using logic programming like Datalog) significantly ease defining analyses compared to imperative implementations. However, the declarative benefits of these frameworks only materialize after parsing and translating source code to generate facts. Fact generation remains a non-declarative precursor to analysis where imperative implementations first parse and interpret program structures (e.g., abstract syntax trees and control-flow graphs). The procedure of fact generation thus remains opaque and difficult for non-experts to understand or modify. We present a new perspective on this analysis workflow by proposing declarative fact generation to ease specification and exploration of lightweight declarative analyses. Our approach demonstrates the first venture towards fully declarative analysis specification across multiple languages. The key idea is to translate source code directly to Datalog facts in the analysis domain using declarative syntax transformation. We then reuse existing Datalog analyses over generated facts, yielding an end-to-end declarative pipeline. As a first approximation we pursue a syntax-driven approach and demonstrate the feasibility of generating and using lightweight versions of liveness and call graph reachability properties. We then discuss the workability of extending declarative fact generation to also incorporate semantic information.
What problem does this paper attempt to address?