A YARN-based Energy-Aware Scheduling Method for Big Data Applications under Deadline Constraints

Fatemeh Shabestari, Amir Masoud Rahmani, Nima Jafari Navimipour, Sam Jabbehdari
DOI: https://doi.org/10.1007/s10723-022-09627-w
2022-11-05
Journal of Grid Computing
Abstract: Hadoop is a distributed framework for processing big data. One of its critical components is YARN, which carries out scheduling and resource management. A scheduling algorithm should consider multiple objectives; however, the YARN schedulers consider neither the Service Level Agreement (SLA) nor energy-related issues. This paper proposes an energy-efficient, deadline-aware model for the scheduling problem. Scheduling applications so that they meet their deadlines while reducing energy consumption is an NP-hard problem. Hence, an Energy-efficient Deadline-aware Scheduling Algorithm based on the Moth-Flame Optimization algorithm (EDSA-MFO) is proposed to minimize energy consumption while executing each application within a given soft deadline. Moreover, an earliest-deadline-first-based (EDF-based) heuristic is proposed to decode a moth into a scheduling solution. The algorithm is implemented for both static and dynamic scheduling. Extensive simulations are conducted to evaluate its performance. The results show that the proposed method can find near-optimal schedules. It outperforms the YARN default FIFO scheduler, EDF, the energy-aware greedy algorithm (EAGA), and the deadline-aware energy-efficient MapReduce scheduling algorithm for YARN (EMRSAY) in both total cluster energy consumption and meeting job deadlines.
computer science, information systems, theory & methods
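
The abstract names earliest deadline first (EDF) both as a baseline and as the basis of the decoding heuristic. As a point of reference only, plain EDF on a single resource can be sketched as below. This is not the paper's EDSA-MFO algorithm or its decoder. The `Job` class, the `edf_schedule` function, and the sample runtimes and deadlines are all hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical job description; fields are illustrative assumptions.
    name: str
    runtime: float   # time the job needs on the resource
    deadline: float  # absolute soft deadline, measured from time 0


def edf_schedule(jobs):
    """Run jobs one at a time in earliest-deadline-first order.

    Returns a list of (name, start, finish, met_deadline) tuples.
    """
    clock = 0.0
    timeline = []
    # EDF: always pick the pending job whose deadline is closest.
    for job in sorted(jobs, key=lambda j: j.deadline):
        start = clock
        clock += job.runtime
        timeline.append((job.name, start, clock, clock <= job.deadline))
    return timeline


# Made-up workload for illustration.
jobs = [Job("a", 4, 10), Job("b", 2, 3), Job("c", 3, 20)]
timeline = edf_schedule(jobs)
for name, start, finish, met in timeline:
    print(name, start, finish, "met" if met else "missed")
```

Plain EDF orders jobs by deadline alone and ignores energy, which is the gap the abstract says the proposed method targets.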