A Survey of Fault-tolerance in Cloud Computing: Concepts and Practice

Ameen Alkasem, Hongwei Liu
DOI: https://doi.org/10.19026/rjaset.11.2244
2015-01-01
Abstract: Fault tolerance is an important property for achieving the required levels of a system's dependability, reliability, availability and Quality of Service (QoS). This survey presents a comprehensive review of representative works on fault tolerance in cloud computing, giving general readers an overview of the concepts and practices of fault-tolerant computing. Cloud computing service providers will rise and fall based on their ability to deliver a satisfactory QoS in primary areas such as dependability. Many enterprise users are wary of the dependability limitations of public clouds, yet curious about adopting cloud technologies, designs and best practices in their own data centers, for example as private clouds. The situation is evolving rapidly across public, private and hybrid clouds, as vendors and users struggle to keep up with new developments.