NativeSummary: Summarizing Native Binary Code for Inter-language Static Analysis of Android Apps

Jikai Wang, Haoyu Wang
DOI: https://doi.org/10.1145/3650212.3680335
2024-01-01
Abstract: With the prosperity of Android app research over the last decade, many static analysis techniques have been proposed. They generally target the DEX bytecode in Android apps. Beyond DEX bytecode, native code (usually written in C/C++) is prevalent in modern Android apps, yet its analysis is overlooked by most existing analysis frameworks. Although a few recent works have attempted to handle native code, they suffer from scalability and accuracy issues. In this paper, we propose NativeSummary, a novel inter-language static analysis framework for Android apps with high accuracy, scalability, and compatibility. Our key idea is to extract a semantic summary of the native binary code, then convert common usage patterns of JNI functions into Java bytecode operations, and additionally transform native library function calls into bytecode calls. With this effort, we can empower legacy Java static analysis frameworks with the ability to perform inter-language data flow analysis without tampering with their inherent logic. Extensive evaluation suggests that NativeSummary outperforms state-of-the-art techniques in terms of accuracy, scalability, and compatibility. NativeSummary sheds light on the promising direction of inter-language analysis, and thousands of existing app analysis works can be boosted atop NativeSummary with almost no effort.
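To make the key idea concrete, the following is a minimal hand-written sketch, not output of NativeSummary itself. It assumes a hypothetical native method `leak` whose C body uses the standard JNI functions `FindClass`, `GetStaticMethodID`, and `CallStaticVoidMethod` to call back into Java. The class name, `source`, `sink`, `leakSummary`, and `SINK_LOG` are all illustrative assumptions. The point is that once the JNI call pattern is expressed as an ordinary Java call, a Java-only data flow analysis could follow the value from source to sink.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: all names here are hypothetical.
class NativeSummarySketch {
    // Records what reaches the sink so the flow can be observed.
    public static final List<String> SINK_LOG = new ArrayList<>();

    // Stand-in for a sensitive source (e.g. a device identifier).
    public static String source() {
        return "secret-device-id";
    }

    // Stand-in for a sink (e.g. a network send).
    public static void sink(String data) {
        SINK_LOG.add(data);
    }

    // In the original app this would be declared as:
    //     public static native void leak(String data);
    // with an assumed C implementation along these lines:
    //     jclass cls = (*env)->FindClass(env, "NativeSummarySketch");
    //     jmethodID mid = (*env)->GetStaticMethodID(
    //         env, cls, "sink", "(Ljava/lang/String;)V");
    //     (*env)->CallStaticVoidMethod(env, cls, mid, data);
    // A Java-only analyzer cannot see that body, so the flow is lost.

    // Assumed summary: the JNI call pattern rewritten as a plain Java call.
    public static void leakSummary(String data) {
        sink(data);
    }

    public static void main(String[] args) {
        leakSummary(source());
        System.out.println(SINK_LOG);
    }
}
```

With the summarized body in place of the native declaration, the path `source()` to `leakSummary` to `sink` is visible entirely at the Java level, which is the property the abstract relies on when it says existing Java frameworks need no changes to their own logic.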