SparrowHawk - Memory Safety Flaw Detection via Data-Driven Source Code Annotation.

Yunlong Lyu, Wang Gao, Siqi Ma, Qibin Sun, Juanru Li
DOI: https://doi.org/10.1007/978-3-030-88323-2_7
2021-01-01
Abstract: Detecting code flaws in programs is a vital aspect of software maintenance and security. Classic code flaw detection techniques rely on program analysis to check whether the code logic violates certain predefined rules. In many cases, however, program analysis falls short of understanding the semantics of a function (e.g., the functionality of an API), and thus struggles to judge whether the function and its related behaviors would lead to a security bug. In response, we propose an automated data-driven annotation strategy to enhance the understanding of function semantics during flaw detection. Our SparrowHawk source code analysis system uses a programming-language-aware text similarity comparison to efficiently annotate the attributes of functions. With the annotation results, SparrowHawk makes use of the Clang static analyzer to guide security analyses. To evaluate its performance, we tested SparrowHawk on memory corruption detection, which relies on the annotation of customized memory allocation/release functions. The experimental results show that by introducing function annotation to the original source code analysis, SparrowHawk achieves more effective and efficient flaw detection, and successfully discovers 51 new memory corruption vulnerabilities in popular open source projects such as FFmpeg and the kernel of the OpenHarmony IoT operating system.
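To make the annotation idea concrete, the following is a minimal sketch of name-based function annotation. It is not the method used by SparrowHawk: it assumes a plain Jaccard overlap of identifier subwords as a stand-in for the paper's text similarity comparison, and the seed names, the threshold, and the helper functions `subwords`, `jaccard`, and `annotate` are all hypothetical.

```python
import re

# Hypothetical seed names per annotation label (illustrative only,
# not taken from the paper).
SEEDS = {
    "alloc": ["malloc", "calloc", "mem_alloc", "buffer_new"],
    "release": ["free", "mem_free", "buffer_release", "destroy"],
}


def subwords(identifier):
    """Split an identifier on underscores and camelCase boundaries."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return {p.lower() for p in parts}


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def annotate(name, threshold=0.3):
    """Return the label of the most similar seed name, or None if no
    seed reaches the (assumed) similarity threshold."""
    tokens = subwords(name)
    best_label, best_score = None, 0.0
    for label, seeds in SEEDS.items():
        for seed in seeds:
            score = jaccard(tokens, subwords(seed))
            if score > best_score:
                best_label, best_score = label, score
    return best_label if best_score >= threshold else None


if __name__ == "__main__":
    # Hypothetical project-specific function names.
    for fn in ["pool_mem_alloc", "ctxBufferFree", "parse_header"]:
        print(fn, "->", annotate(fn))
```

In a pipeline of the kind the abstract describes, labels like these would then tell a static analyzer which project-specific functions to treat as memory allocators or deallocators.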