Abstract: In this paper we take four features of the SAT solver CaDiCaL (blocked clause elimination, vivification, on-the-fly self-subsumption, and increasing the bound of variable elimination) and evaluate them on the SAT Competition benchmarks from 2009 to 2022. We study these features by both activating them one by one and deactivating them one by one. We have three hypotheses regarding the experiments: (i) disabling features is always harmful; (ii) the life span of the techniques is limited; and (iii) features simulate each other. Our experiments cannot confirm any of the hypotheses.
What problem does this paper attempt to address?
This paper examines the effectiveness and life span of specific techniques in SAT (Boolean satisfiability) solvers. Specifically, it selects four features of the SAT solver CaDiCaL: Blocked Clause Elimination (BCE), Vivification, On-the-Fly Self-Subsumption (OTFS), and increasing the bound of Variable Elimination (BVE+), and runs experiments to test the following three hypotheses:
1. **Hypothesis 1** - Disabling a feature is always harmful. That is, turning off any one of the features should reduce the number of instances the solver solves, which would indicate that every feature is necessary.
2. **Hypothesis 2** - The life span of a technique is limited. That is, a feature has a significant effect on the instances solved in the first few years after its invention, but its effect gradually weakens over time until it disappears.
3. **Hypothesis 3** - Features can simulate each other. That is, some features perform similarly, which would indicate that one feature can reproduce the effect of another.
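Hypothesis 1 can be read as a simple predicate over solved-instance counts. The following is a minimal sketch, not the paper's evaluation code: the function name, its arguments, and the choice of an "all features on" run as the comparison baseline are assumptions made for illustration.

```python
from typing import Mapping


def disabling_always_harmful(
    solved_all_on: int,
    solved_without: Mapping[str, int],
) -> bool:
    """Check Hypothesis 1 on one benchmark set.

    solved_all_on:  instances solved with every feature enabled
                    (assumed baseline, see lead-in).
    solved_without: feature name -> instances solved when only that
                    feature is disabled.

    Returns True only if every single-feature-off run solves strictly
    fewer instances than the baseline.
    """
    return all(count < solved_all_on for count in solved_without.values())
```

A single feature whose removal leaves the count unchanged, or raises it, is enough to make the predicate false, which is why "always harmful" is a strong claim to confirm.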
To test these hypotheses, the paper defines three reference configurations:
- **Base Configuration**: Disable all four features and keep all other CaDiCaL options at their defaults.
- **Default Configuration**: Use the default settings of CaDiCaL, which enable OTFS, Vivification, and BVE+ but not BCE.
- **Everything Configuration**: Enable all four features.
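The three configurations, together with the one-by-one activation and deactivation runs mentioned in the abstract, can be enumerated as feature sets. This is a rough sketch under stated assumptions: the feature labels are plain strings rather than real CaDiCaL options, and it assumes that features are activated one by one on top of the base configuration and deactivated one by one from the everything configuration, which the summary does not spell out.

```python
from typing import Dict, FrozenSet

# Labels for the four studied features (illustrative names, not CaDiCaL flags).
FEATURES = ("BCE", "Vivification", "OTFS", "BVE+")


def build_configurations() -> Dict[str, FrozenSet[str]]:
    """Map a configuration name to the set of enabled features."""
    base: FrozenSet[str] = frozenset()
    default = frozenset({"Vivification", "OTFS", "BVE+"})  # BCE stays off
    everything = frozenset(FEATURES)

    configs = {"base": base, "default": default, "everything": everything}
    for feature in FEATURES:
        # Assumption: activate one feature on top of the base configuration.
        configs[f"base+{feature}"] = base | {feature}
        # Assumption: deactivate one feature from the everything configuration.
        configs[f"everything-{feature}"] = everything - {feature}
    return configs


if __name__ == "__main__":
    for name, enabled in build_configurations().items():
        print(f"{name:24s} {sorted(enabled)}")
```

Under these assumptions, "everything minus BCE" enables exactly the same features as the default configuration, so the eleven named configurations contain only ten distinct feature sets.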
These experiments measure how the features perform on the SAT Competition benchmarks of different years, and thereby test the hypotheses above. In the end, the experimental results could not confirm any of the hypotheses, which suggests that the interactions between these features are complex and that measuring a single feature in isolation may give a misleading picture.