EnBinDiff: Identifying Data-only Patches for Binaries

J Lin, D Wang, R Chang, L Wu, Y Zhou, K Ren
DOI: https://doi.org/10.1109/TDSC.2021.3133500
2021-01-01
IEEE Transactions on Dependable and Secure Computing
Abstract: In this article, we focus on data-only patches, a specific type of security patch that does not incur any structural changes. As one of the most significant causes of false negatives, data-only patches are a fundamental problem that affects all state-of-the-art binary diffing approaches and tools. To this end, we first systematically study data-only patches and illustrate their essence and their adverse effect on existing tools. Based on these observations, we propose and implement a system named EnBinDiff, built on Value Set Analysis (VSA), to effectively identify data-only patches. Specifically, EnBinDiff first precisely identifies functions in binaries and then efficiently locates all "matched" function pairs using structural binary diffing. After that, EnBinDiff performs data-only patch analysis, which consists of stack frame matching and constant value matching, to identify data-only patches among the matched functions. To demonstrate the effectiveness of EnBinDiff, we conduct an extensive evaluation with multiple datasets. The results show that the proposed system outperforms state-of-the-art binary diffing tools, reducing the false negative rate from 11.02% to 1.63%. Furthermore, we apply EnBinDiff to real-world binaries and successfully identify 20 1-day vulnerabilities.
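
To make the idea of a data-only patch concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that a structural differ has already paired two functions and that a per-function summary (frame size, stack variable sizes, immediate constants) has somehow been extracted. The names FunctionSummary and data_only_diff are hypothetical, and the sketch compares plain values rather than performing Value Set Analysis.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionSummary:
    """Hypothetical per-function summary; not a structure from the paper."""
    name: str
    frame_size: int      # bytes reserved on the stack for locals
    local_sizes: tuple   # sizes of stack variables, in layout order
    constants: tuple     # immediate operands, in instruction order


def data_only_diff(old: FunctionSummary, new: FunctionSummary) -> list:
    """List data differences between two structurally matched functions.

    An empty list means the toy check found no data-only change.
    """
    findings = []

    # Stack frame comparison: a resized buffer changes the frame or its layout.
    if old.frame_size != new.frame_size:
        findings.append(f"frame size {old.frame_size} -> {new.frame_size}")
    if old.local_sizes != new.local_sizes:
        findings.append(f"stack layout {old.local_sizes} -> {new.local_sizes}")

    # Constant comparison: treat the constants as multisets so that
    # reordering alone is ignored but a changed bound is reported.
    removed = Counter(old.constants) - Counter(new.constants)
    added = Counter(new.constants) - Counter(old.constants)
    if removed or added:
        findings.append(
            f"constants removed {sorted(removed.elements())}, "
            f"added {sorted(added.elements())}"
        )
    return findings


if __name__ == "__main__":
    # Same control flow in both versions; only a length bound changes.
    vulnerable = FunctionSummary("parse_header", 80, (64, 8), (0, 128))
    patched = FunctionSummary("parse_header", 80, (64, 8), (0, 64))
    for finding in data_only_diff(vulnerable, patched):
        print(finding)
```

In this example the two versions of the hypothetical parse_header have identical structure, so a purely structural differ would report them as unchanged; only the comparison of constants reveals that a length bound was lowered from 128 to 64.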