The Cloud's Cloudy Moment: A Systematic Survey of Public Cloud Service Outage

Zheng Li, Mingfei Liang, Liam O'Brien, He Zhang
DOI: https://doi.org/10.11591/closer.v2i5.5125
2013-01-01
International Journal of Cloud Computing and Services Science (IJ-CLOSER)
Abstract: Inadequate service availability is the top concern when employing Cloud computing. It has been recognized that zero downtime is impossible for large-scale Internet services. Nevertheless, by learning from their own past mistakes and those of others, Cloud vendors can minimize the risk of future downtime or at least keep the downtime short. To facilitate summarizing lessons for Cloud providers, we performed a systematic survey of public Cloud service outage events. This paper reports the results of that survey. In addition to a set of findings, our work generated a lessons framework by classifying the outage root causes. The framework can in turn be used to organize outage lessons for reference by Cloud providers. By incorporating potentially new root causes, this lessons framework will be smoothly expanded in our future work.