A Selective Mirrored Task Based Fault Tolerance Mechanism for Big Data Application Using Cloud
Hao Wu, Qinggeng Jin, Chenghua Zhang, He Guo
DOI: https://doi.org/10.1155/2019/4807502
2019-01-01
Wireless Communications and Mobile Computing
Abstract: With the wide deployment of cloud computing in big data processing and the growing scale of big data applications, managing the reliability of resources has become a critical issue. Unfortunately, because directed-acyclic-graph (DAG) based applications are highly intricate and processors (virtual machines) on cloud platforms can be used flexibly, existing fault-tolerant approaches fail to balance the parallelism and the topology of a DAG-based application while using the processors. The result is a longer makespan for the application and more consumed processor time (computation cost). To address these issues, this paper presents a novel fault-tolerant framework named Fault Tolerance Algorithm using Selective Mirrored Tasks Method (FAUSIT) for running a big data application on the cloud. First, we provide comprehensive theoretical analyses of how to improve fault-tolerance performance when running a single task on a processor. Second, considering the balance between the parallelism and the topology of an application, we present a selective mirrored task method. Finally, by employing the selective mirrored task method, FAUSIT is designed to improve the fault tolerance of DAG-based applications while pursuing two important objectives: minimizing the makespan and minimizing the computation cost. The approach is evaluated through a rigorous performance study using real-world workflows, and the results show that FAUSIT outperforms existing algorithms in terms of makespan and computation cost.
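The two objectives named in the abstract, makespan and computation cost, can be made concrete with a small sketch. The code below is not FAUSIT and does not reproduce the paper's selection rule; it only illustrates, under simplifying assumptions (unlimited processors, no failures, no communication delay), why mirroring a task raises the computation cost without necessarily changing the makespan. The function name `makespan_and_cost` and the example tasks are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def makespan_and_cost(runtime, deps, mirrored=frozenset()):
    """Return (makespan, computation cost) for a failure-free DAG run.

    runtime  : dict mapping task -> execution time
    deps     : dict mapping task -> set of predecessor tasks
    mirrored : tasks that also run as a second (mirror) copy in parallel

    Assumptions (not from the paper): unlimited processors, no failures,
    no communication delay between tasks.
    """
    finish = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(deps).static_order():
        start = max((finish[p] for p in deps.get(task, ())), default=0.0)
        finish[task] = start + runtime[task]

    # Makespan: completion time of the last task to finish.
    makespan = max(finish.values())
    # Computation cost: total processor time, counting mirror copies.
    cost = sum(runtime.values()) + sum(runtime[t] for t in mirrored)
    return makespan, cost


runtime = {"a": 2, "b": 3, "c": 1, "d": 4}
deps = {"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}

print(makespan_and_cost(runtime, deps))                  # no mirrored tasks
print(makespan_and_cost(runtime, deps, mirrored={"b"}))  # mirror task "b"
```

In this toy model, mirroring every task would double the computation cost, which is why a selective choice of mirrored tasks matters; how the paper makes that choice is not specified in the abstract.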