GenLog: Accurate Log Template Discovery for Stripped X86 Binaries

Maosheng Zhang, Ying Zhao, Zengmingyu He
DOI: https://doi.org/10.1109/compsac.2017.137
2017-01-01
Abstract: Log analysis plays an important role in computer failure diagnosis. With the ever-increasing size and complexity of logs, analyzing them manually has become cumbersome. For this reason, recent research has focused on automatic analysis techniques for large log files. However, log messages are texts with certain formats, and it is very challenging for automatic analysis to understand their semantic meaning. The current state-of-the-art approaches depend on the quality of the observed log messages or on the source code that produces them. In this paper, we propose GenLog, a method that can extract log templates from stripped executables (neither source code nor debugging information needs to be available). GenLog finds all log-related functions in a binary through a combined bottom-up and top-down slicing method, reconstructs the memory buffers where log messages were constructed, and identifies the components of log messages using data flow analysis and taint propagation analysis. GenLog can be used to analyze large binaries and is suitable for commercial off-the-shelf (COTS) software and dynamic libraries. We evaluated GenLog on four X86 executables, one of which is Nginx. The experiments show that GenLog can identify the templates of log messages in the test log files with a precision of 99.9%.
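To make the notion of a log template concrete, the sketch below shows how a printf-style format string, once recovered, could be matched against observed log lines to separate the constant text from the variable fields. This is only an illustration and not GenLog's implementation: the abstract does not say how templates are represented, so the printf-style representation, the specifier table `SPEC_TO_REGEX`, the helper names `template_to_regex` and `match_template`, and the sample messages are all assumptions.

```python
import re

# Hypothetical mapping from a few printf conversion specifiers to regex
# fragments; real format strings support many more variants than this.
SPEC_TO_REGEX = {
    "d": r"[-+]?\d+",
    "u": r"\d+",
    "x": r"[0-9a-fA-F]+",
    "s": r".+?",
    "c": r".",
}

# Matches %d, %ld, %lld, %s, ... and the escaped percent sign %%.
SPEC_PATTERN = re.compile(r"%(?:l{0,2})([duxsc%])")


def template_to_regex(fmt):
    """Turn a printf-style template into an anchored regex.

    Constant text is escaped literally; each specifier becomes a capture group.
    """
    parts = []
    pos = 0
    for m in SPEC_PATTERN.finditer(fmt):
        parts.append(re.escape(fmt[pos:m.start()]))
        spec = m.group(1)
        if spec == "%":
            parts.append("%")  # "%%" stands for a literal percent sign
        else:
            parts.append("(" + SPEC_TO_REGEX[spec] + ")")
        pos = m.end()
    parts.append(re.escape(fmt[pos:]))
    return re.compile("^" + "".join(parts) + "$")


def match_template(fmt, line):
    """Return the variable fields of `line` if it fits `fmt`, else None."""
    m = template_to_regex(fmt).match(line)
    return list(m.groups()) if m else None


if __name__ == "__main__":
    # Made-up template and message, for illustration only.
    fmt = "worker process %d exited with code %d"
    print(match_template(fmt, "worker process 1234 exited with code 0"))
```

A log line is counted as explained by a template when `match_template` returns a list of fields rather than `None`; under this assumed setup, precision would be measured over how often such matches pick the correct template.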