DFix: Automatically Fixing Timing Bugs in Distributed Systems

Guangpu Li, Haopeng Liu, Xianglan Chen, Haryadi S. Gunawi, Shan Lu
DOI: https://doi.org/10.1145/3314221.3314620
2019-01-01
Abstract: Distributed systems are now the backbone of computing society and are expected to be highly available. Unfortunately, distributed timing bugs, a type of bug triggered by the non-deterministic timing of messages and node crashes, are widespread. They lead to many production-run failures and are difficult to reason about and patch. Although recently proposed techniques can detect these bugs automatically, how to fix them automatically and correctly remains an open problem. This paper presents DFix, a tool that automatically processes distributed timing bug reports, statically analyzes the buggy system, and produces patches. Our evaluation shows that DFix is effective in fixing real-world distributed timing bugs.