Ferry: State-Aware Symbolic Execution for Exploring State-Dependent Program Paths

Shunfan Zhou, Zhemin Yang, Dan Qiao, Peng Liu, Min Yang, Zhe Wang, Chenggang Wu
2022-01-01
Abstract: Symbolic execution and fuzz testing are effective approaches for program analysis, thanks to their evolving path exploration strategies. State-of-the-art symbolic execution and fuzzing techniques can generate valid program inputs that satisfy conditional statements. However, they have very limited ability to explore the finite-state-machine models implemented by real-world programs. This is because such state machines contain program-state-dependent branches (state-dependent branches in this paper), which depend on earlier program execution rather than on the current program inputs. This paper is the first attempt to thoroughly explore the state-dependent branches in real-world programs. We introduce program-state-aware symbolic execution, a novel technique that guides symbolic execution engines to efficiently explore state-dependent branches. As we show in this paper, state-dependent branches are prevalent in many important programs because they implement state machines to fulfill their application logic. Symbolically executing arbitrary programs with state-dependent branches is difficult, since there is no unified specification for their state machine implementations. Faced with this challenging problem, this paper recognizes a widely existing data dependency between current program states and previous inputs in a class of important programs. Our insights into these programs help us take a successful first step on this task. We design and implement a tool, Ferry, which efficiently guides a symbolic execution engine by automatically recognizing program states and exploring state-dependent branches. Applied to 13 different real-world programs and the comprehensive dataset Google FuzzBench, Ferry achieves higher block and branch coverage than two state-of-the-art symbolic execution engines and locates three 0-day vulnerabilities in jhead. Our further investigation shows that Ferry covers more hard-to-reach code than existing symbolic executors and fuzzers. Further, using 15 collected state-dependent vulnerabilities and a test suite of six prominent programs, we show that Ferry reaches more program-state-dependent vulnerabilities than existing symbolic executors and fuzzing approaches. Finally, we test Ferry on the LAVA-M dataset to understand its strengths and limitations.
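To make the notion of a state-dependent branch concrete, the following is a minimal illustrative sketch and is not taken from the paper or from Ferry itself. The toy record format, the function name `parse`, and the flag `seen_header` are all hypothetical. The inner `if seen_header` check is the state-dependent branch: its outcome is fixed by an earlier input byte, not by the byte currently being processed.

```python
def parse(data):
    """Toy parser (hypothetical format): each input byte is a one-byte record tag."""
    seen_header = False  # program state, set by an earlier input byte
    events = []
    for tag in data:
        if tag == 0x01:
            # Input-dependent branch: decided by the current byte alone.
            seen_header = True
            events.append("header")
        elif tag == 0x02:
            # State-dependent branch: decided by seen_header, which an
            # earlier byte set, rather than by the current byte.
            if seen_header:
                events.append("payload")
            else:
                events.append("error: payload before header")
        else:
            events.append("ignored")
    return events


if __name__ == "__main__":
    print(parse(bytes([0x01, 0x02])))  # header arrives first
    print(parse(bytes([0x02])))        # payload without a header
```

Reaching the "payload" path requires an input whose earlier bytes first drive the program into the right state, which is the kind of dependency between current program state and previous inputs that the abstract describes.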