Inference of Protocol State Machine Using a Semantic-Related Labeling Method Based on Dynamic Taint Analysis

Yulong Liu, Peidai Xie, Yongjun Wang, Mengzu Liu
DOI: https://doi.org/10.1109/compcomm.2017.8322552
2017-01-01
Abstract: A protocol state machine describes the essential behaviors exhibited when protocol messages are exchanged between network applications. The inference of a protocol state machine (IPSM for short) is the process of learning the inner workings of a protocol. Existing methods are mainly based on passive monitoring: they infer the protocol state machine by constructing an APTA (augmented prefix tree acceptor) and simplifying it. In the state-labeling phase, however, they distinguish states only by message similarity, so different states can be treated as the same state and merged by mistake. The resulting state machine is over-generalized and cannot describe the protocol's behavior accurately. This paper proposes a method that improves the accuracy of the inferred protocol state machine through semantic-related labeling based on dynamic taint analysis. We rely on DECAF to analyze the network applications, then construct an APTA and use the semantic information to distinguish states, merging similar states to obtain the final result. Finally, the method is tested on TCP and the Agobot control protocol. The results show that the method overcomes the shortcomings of existing methods and yields more accurate state machines.
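The difference between the two labeling strategies can be illustrated with a minimal sketch. It is not the authors' implementation: the traces, the message types, and the label names (`handshake_pending`, `connected`, `closing`, `closed`) are hypothetical placeholders for whatever semantic information the taint analysis would provide, and the merge rule (collapse all tree nodes that share a label) is a deliberate simplification. The code builds a prefix tree from the observed sessions, merges it once using semantic labels, and once using only the message type as the label.

```python
def build_apta(traces):
    """Build a prefix tree from traces of (message_type, state_label) pairs.

    Node 0 is the root. If two traces share a prefix, the label seen first
    is kept (a simplification made for this sketch).
    """
    children = [{}]      # children[node][message_type] -> child node index
    labels = ["INIT"]    # labels[node] -> label of the state at that node
    for trace in traces:
        node = 0
        for msg_type, label in trace:
            if msg_type not in children[node]:
                children[node][msg_type] = len(children)
                children.append({})
                labels.append(label)
            node = children[node][msg_type]
    return children, labels


def merge_by_label(children, labels):
    """Merge all tree nodes with the same label; return the transition set."""
    transitions = set()
    for node, edges in enumerate(children):
        for msg_type, child in edges.items():
            transitions.add((labels[node], msg_type, labels[child]))
    return transitions


# Hypothetical sessions: each message carries an assumed semantic label.
traces = [
    [("SYN", "handshake_pending"), ("ACK", "connected"),
     ("FIN", "closing"), ("ACK", "closed")],
    [("SYN", "handshake_pending"), ("ACK", "connected"),
     ("DATA", "connected"), ("FIN", "closing"), ("ACK", "closed")],
]

# Semantic-related labeling: states are told apart by their semantic label.
children, labels = build_apta(traces)
semantic = merge_by_label(children, labels)

# Message-only labeling: the state label is just the last message type.
naive_traces = [[(msg, msg) for msg, _ in trace] for trace in traces]
naive = merge_by_label(*build_apta(naive_traces))

for name, machine in (("semantic", semantic), ("message-only", naive)):
    print(name)
    for src, msg, dst in sorted(machine):
        print(f"  {src} --{msg}--> {dst}")
```

In this toy input, message-only labeling puts the state after the handshake `ACK` and the state after the final `ACK` into one state, so the merged machine contains a `FIN`/`ACK` loop and accepts sequences that never occurred in the traces. With the assumed semantic labels, no transition leaves `closed`, and the loop does not appear.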