Software Rejuvenation Based Fault Tolerance Scheme for Cloud Applications

Jing Liu, Jiantao Zhou, Rajkumar Buyya
DOI: https://doi.org/10.1109/CLOUD.2015.164
2015-01-01
Abstract: Cloud applications are typically composed of multiple cloud service components that communicate with each other through web service interfaces, with each component fulfilling a specified functionality. The lack of an effective fault tolerance scheme is one of the major obstacles to improving the availability and efficiency of complex, aging cloud application systems. In this paper, we propose a holistic software rejuvenation based fault tolerance scheme for cloud applications, which comprises three indispensable parts: adaptive failure detection, aging degree evaluation, and component rejuvenation based on checkpointing with trace replay. A preliminary, qualitative evaluation shows that the new fault tolerance scheme yields a promising improvement in the availability of cloud applications.