Abstract: Symbolic execution and fuzz testing are effective approaches for program analysis, thanks to their evolving path exploration strategies. State-of-the-art symbolic execution and fuzzing techniques can generate valid program inputs that satisfy conditional statements. However, they have very limited ability to explore the finite-state-machine models implemented by real-world programs. This is because such state machines contain program-state-dependent branches (state-dependent branches in this paper), which depend on earlier program execution rather than on the current program inputs. This paper is the first attempt to thoroughly explore the state-dependent branches in real-world programs. We introduce program-state-aware symbolic execution, a novel technique that guides symbolic execution engines to efficiently explore state-dependent branches. As we show in this paper, state-dependent branches are prevalent in many important programs because these programs implement state machines to fulfill their application logic. Symbolically executing arbitrary programs with state-dependent branches is difficult, since there is no unified specification for how state machines are implemented. To address this problem, this paper identifies a widely existing data dependency between current program states and previous inputs in a class of important programs. Our insights into these programs help us take a successful first step on this task. We design and implement a tool, Ferry, which efficiently guides a symbolic execution engine by automatically recognizing program states and exploring state-dependent branches. Applied to 13 different real-world programs and the comprehensive Google FuzzBench dataset, Ferry achieves higher block and branch coverage than two state-of-the-art symbolic execution engines and locates three 0-day vulnerabilities in jhead.
Our further investigation shows that Ferry covers more hard-to-reach code than existing symbolic executors and fuzzers. Further, using 15 collected state-dependent vulnerabilities and a test suite of six prominent programs, we show that Ferry reaches more program-state-dependent vulnerabilities than existing symbolic executors and fuzzing approaches. Finally, we test Ferry on the LAVA-M dataset to understand its strengths and limitations.
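
To make the notion of a state-dependent branch concrete, here is a minimal Python sketch. It is not taken from Ferry: the toy command set (`OPEN`, `AUTH`, `DUMP`) and the helpers `run` and `reaching_sequences` are hypothetical and only illustrate the idea. The guarded block is reached only when earlier inputs have already moved the program into the right state, so the current input alone cannot satisfy the branch.

```python
from itertools import product

# Hypothetical toy program: the command names and states are illustrative only.
def run(commands):
    """Process a command sequence; return True if the guarded block is reached."""
    state = "INIT"
    for cmd in commands:
        if cmd == "OPEN" and state == "INIT":
            state = "OPENED"
        elif cmd == "AUTH" and state == "OPENED":
            state = "AUTHED"
        elif cmd == "DUMP":
            # State-dependent branch: the outcome depends on `state`, which was
            # set by earlier commands, not on the current command alone.
            if state == "AUTHED":
                return True
    return False


def reaching_sequences(length):
    """Brute-force every command sequence of the given length that reaches the block."""
    alphabet = ("OPEN", "AUTH", "DUMP")
    return [seq for seq in product(alphabet, repeat=length) if run(seq)]
```

In this sketch, `run(["DUMP"])` returns `False` even though `DUMP` is the input that triggers the check, while `run(["OPEN", "AUTH", "DUMP"])` returns `True`. The brute-force search in `reaching_sequences` stands in for the exploration problem the abstract describes: without knowledge of the program state, an explorer has to find the right history of inputs by trial.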

Path Exploration Strategy for Symbolic Execution Based on Multi-strategy Active Learning

Steering Symbolic Execution to Less Traveled Paths

Symbolic Execution with Test Cases Generated by Large Language Models

Machine Learning Steered Symbolic Execution Framework for Complex Software Code

Python Symbolic Execution with LLM-powered Code Generation

Symbolic Execution of Complex Program Driven by Machine Learning Based Constraint Solving

Fitness-guided Path Exploration in Dynamic Symbolic Execution

Speculative Symbolic Execution

Learning to Accelerate Symbolic Execution via Code Transformation

Dependence Guided Symbolic Execution

SAFL: Increasing and Accelerating Testing Coverage with Symbolic Execution and Guided Fuzzing

Loop Transparency for Scalable Dynamic Symbolic Execution

Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration

Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search

Towards Symbolic Pointers Reasoning in Dynamic Symbolic Execution

SCSE: Boosting Symbolic Execution Via State Concretization

Ferry: State-Aware Symbolic Execution for Exploring State-Dependent Program Paths

Improving exploration in policy gradient search: Application to symbolic optimization

Visualizing Path Exploration to Assist Problem Diagnosis for Structural Test Generation

Abstracting Path Conditions for Effective Symbolic Execution

Boosting Symbolic Execution Via Constraint Solving Time Prediction (Experience Paper)