A Data Placement Strategy for Scientific Workflow in Hybrid Cloud

Zhanghui Liu, Tao Xiang, Bing Lin, Xinshu Ye, Haijiang Wang, Ying Zhang, Xing Chen
DOI: https://doi.org/10.1109/cloud.2018.00077
2018-01-01
Abstract: In cloud computing environments, data centers can provide high-performance computing resources and distributed storage space. Scientific workflows often need to be executed across multiple data centers, where large amounts of application data are stored. Moving data across geographically distributed data centers leads to intolerable delays and hinders the efficient execution of scientific workflows, which are large-scale and data-intensive. Reasonable data placement can effectively reduce data scheduling between data centers. In this paper, an adaptive discrete particle swarm optimization (PSO) algorithm based on the genetic algorithm is proposed to decrease the number of data transmissions across data centers. The algorithm overcomes the premature convergence defect of PSO by introducing the mutation and crossover operators of the genetic algorithm, which also improves diversity during population evolution. Simulation results show that, compared with previous work, the proposed strategy greatly reduces the volume of data transferred while also reducing the number of data movements across data centers.
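The idea of a discrete PSO whose update step borrows mutation and crossover from a genetic algorithm can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no encoding, cost model, or parameter values, so everything below is an assumption made for illustration. In particular, the names `count_transfers`, `evaluate`, `mutate`, `crossover`, and `place_data`, the per-center `capacity` limit with a penalty term, the rule that a task runs where most of its inputs reside, and the fixed probabilities `w`, `c1`, `c2` (the paper describes an adaptive scheme, which is not reproduced here) are all hypothetical.

```python
import random
from collections import Counter

def count_transfers(placement, tasks):
    """Cross-center transfers, assuming each task runs where most of its inputs are."""
    total = 0
    for inputs in tasks:
        if not inputs:
            continue
        counts = Counter(placement[d] for d in inputs)
        total += len(inputs) - max(counts.values())
    return total

def evaluate(placement, tasks, n_centers, capacity, penalty=1000):
    """Transfers plus a penalty for every dataset beyond a center's (assumed) capacity."""
    load = Counter(placement)
    overflow = sum(max(0, load[c] - capacity) for c in range(n_centers))
    return count_transfers(placement, tasks) + penalty * overflow

def mutate(particle, n_centers, rng):
    """GA mutation: move one randomly chosen dataset to a random center."""
    child = list(particle)
    child[rng.randrange(len(child))] = rng.randrange(n_centers)
    return child

def crossover(particle, guide, rng):
    """GA two-point crossover: copy a segment of the guide into the particle."""
    i, j = sorted(rng.sample(range(len(particle) + 1), 2))
    return particle[:i] + guide[i:j] + particle[j:]

def place_data(tasks, n_datasets, n_centers, capacity,
               swarm=20, iters=100, w=0.3, c1=0.5, c2=0.5, seed=0):
    """Discrete PSO where inertia is a mutation and the cognitive and
    social terms are crossovers with the personal and global bests."""
    rng = random.Random(seed)
    cost = lambda p: evaluate(p, tasks, n_centers, capacity)
    particles = [[rng.randrange(n_centers) for _ in range(n_datasets)]
                 for _ in range(swarm)]
    pbest = [list(p) for p in particles]
    pcost = [cost(p) for p in particles]
    k_best = min(range(swarm), key=pcost.__getitem__)
    gbest, gcost = list(pbest[k_best]), pcost[k_best]
    for _ in range(iters):
        for k in range(swarm):
            p = particles[k]
            if rng.random() < w:      # inertia component as mutation
                p = mutate(p, n_centers, rng)
            if rng.random() < c1:     # cognitive component: learn from personal best
                p = crossover(p, pbest[k], rng)
            if rng.random() < c2:     # social component: learn from global best
                p = crossover(p, gbest, rng)
            particles[k] = p
            c = cost(p)
            if c < pcost[k]:
                pbest[k], pcost[k] = list(p), c
            if c < gcost:
                gbest, gcost = list(p), c
    return gbest, gcost

# Hypothetical workflow: three tasks over four datasets, two centers of capacity 2.
tasks = [{0, 1}, {2, 3}, {1, 2}]
placement, cost = place_data(tasks, n_datasets=4, n_centers=2, capacity=2)
print(placement, cost)
```

Here a particle is a list mapping each dataset index to a data center index, so the usual velocity arithmetic of continuous PSO is replaced by discrete operators. Mutation keeps injecting new placements even after the swarm has clustered around the global best, which is one plausible way to counter premature convergence.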