Automatic Protocol Reverse Engineering for Industrial Control Systems with Dynamic Taint Analysis
Ma Rongkuan,Zheng Hao,Wang Jingyi,Wang Mufeng,Wei Qiang,Wang Qingxian
DOI: https://doi.org/10.1631/fitee.2000709
IF: 2.526
2022-01-01
Frontiers of Information Technology & Electronic Engineering
Abstract:Proprietary (or semi-proprietary) protocols are widely adopted in industrial control systems (ICSs). Inferring protocol format by reverse engineering is important for many network security applications, e.g., program tests and intrusion detection. Conventional protocol reverse engineering methods have been proposed which are considered time-consuming, tedious, and error-prone. Recently, automatical protocol reverse engineering methods have been proposed which are, however, neither effective in handling binary-based ICS protocols based on network traffic analysis nor accurate in extracting protocol fields from protocol implementations. In this paper, we present a framework called the industrial control system protocol reverse engineering framework (ICSPRF) that aims to extract ICS protocol fields with high accuracy. ICSPRF is based on the key insight that an individual field in a message is typically handled in the same execution context, e.g., basic block (BBL) group. As a result, by monitoring program execution, we can collect the tainted data information processed in every BBL group in the execution trace and cluster it to derive the protocol format. We evaluate our approach with six open-source ICS protocol implementations. The results show that ICSPRF can identify individual protocol fields with high accuracy (on average a 94.3% match ratio). ICSPRF also has a low coarse-grained and overly fine-grained match ratio. For the same metric, ICSPRF is more accurate than AutoFormat (88.5% for all evaluated protocols and 80.0% for binary-based protocols).