Inferring State Machine from the Protocol Implementation via Large Language Model

Haiyang Wei, Zhengjie Du, Haohui Huang, Yue Liu, Guang Cheng, Linzhang Wang, Bing Mao
DOI: https://doi.org/10.48550/arxiv.2405.00393
2024-01-01
Abstract: State machines play a pivotal role in augmenting the efficacy of protocol analysis to unveil more vulnerabilities. However, the task of inferring state machines from network protocol implementations presents significant challenges. Traditional methods based on dynamic analysis often overlook crucial state transitions due to limited coverage, while static analysis faces difficulties with complex code structures and behaviors. To address these limitations, we propose an innovative state machine inference approach powered by Large Language Models (LLMs). Utilizing text-embedding technology, this method allows LLMs to dissect and analyze the intricacies of protocol implementation code. Through targeted prompt engineering, we systematically identify and infer the underlying state machines. Our evaluation across six protocol implementations demonstrates the method's high efficacy, achieving an accuracy rate exceeding 90% [...] implementations of the same protocol. Importantly, integrating this approach with protocol fuzzing has notably enhanced AFLNet's code coverage by 10% [...] RFCNLP, showcasing the considerable potential of LLMs in advancing network protocol security analysis. Our proposed method not only marks a significant step forward in accurate state machine inference but also opens new avenues for improving the security and reliability of protocol implementations.