A refined decompiler to generate C code with high readability
Gengbiao Chen, Zhengwei Qi, Shiqiu Huang, Kangqi Ni, Yudi Zheng, Walter Binder, Haibing Guan
DOI: https://doi.org/10.1002/spe.2138
2013-01-01
Abstract: As a key part of reverse engineering, decompilation plays a very important role in software security and maintenance. A number of tools, such as Boomerang and IDA Hex_rays, have been developed to translate executable programs into source code in a relatively high-level language. Unfortunately, most existing decompilation tools suffer from low accuracy in identifying variables, functions, and composite structures, resulting in poor readability. To address these limitations, we present a practical decompiler called C-Decompiler for Windows C programs that (i) uses a shadow stack to perform refined data flow analysis, (ii) adopts inter-basic-block register propagation to reduce redundant variables, and (iii) recognizes library (i.e., Standard Template Library) functions by signatures. We evaluate and compare the decompilation quality of C-Decompiler with two existing tools, Boomerang and IDA Hex_rays, considering four aspects: function analysis, variable expansion rate, total percentage reduction, and cyclomatic complexity. Our experimental results show that, on average, C-Decompiler has the highest total percentage reduction (55.91%), the lowest variable expansion rate (55.79%), and the same cyclomatic complexity as the original source code for each considered application. Furthermore, in our experiments, C-Decompiler is able to recognize functions with lower false positive and false negative rates than the other decompilers. A case study and our evaluation results confirm that C-Decompiler is a practical tool to produce highly readable C-style code. Copyright (c) 2012 John Wiley & Sons, Ltd.