Bi-Objective Scheduling Algorithm for Hybrid Workflow in JointCloud

Rui Li, Huaimin Wang, Peichang Shi
DOI: https://doi.org/10.1109/jcc62314.2024.00014
2024-01-01
Abstract: Big data workflows are widely used in IoT, recommender systems, and real-time vision applications, and they continue to grow in complexity. These hybrid workflows consist of both resource-intensive batch jobs and latency-sensitive stream jobs. Examples include the data analytics workflow, which incorporates batch data transformations and low-latency querying, and the machine learning workflow, which performs feature extraction on stream data before batch training and low-latency inference. However, existing research on workflow scheduling primarily focuses on either stream or batch workflows, neglecting the efficient scheduling of hybrid workflows that respects their diverse resource requirements and the costly data transfers between them. In this article, we propose a hybrid workflow model that defines the optimal placement of hybrid workflows (OHWP) as a bi-objective optimization problem. Our proposed model takes into account parameters related to inter-communication between stream and batch jobs, as well as the heterogeneous resources in the JointCloud environment. Additionally, we present OHWP-PS (OHWP on a Pruned Space), a scheduling algorithm for hybrid workflows that minimizes both cost and latency by improving the initial population and dynamically updating the search space. The results demonstrate that the proposed OHWP-PS algorithm is effective and competitive across all experiments.
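To make the bi-objective framing concrete, the following is a minimal sketch of how candidate placements could be compared when both cost and latency are minimized. It is not the OHWP-PS algorithm: the abstract does not specify how solutions are ranked, so the use of Pareto dominance here is an assumption, and the names `dominates`, `pareto_front`, and the sample placements are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation of a candidate placement:
# (label, cost, latency). Both objectives are to be minimized.
Placement = Tuple[str, float, float]


def dominates(a: Placement, b: Placement) -> bool:
    """Return True if a is no worse than b in both objectives
    and strictly better in at least one (Pareto dominance)."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Placement]) -> List[Placement]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Illustrative values only; they are not taken from the paper.
candidates: List[Placement] = [
    ("p1", 10.0, 50.0),
    ("p2", 12.0, 40.0),
    ("p3", 15.0, 45.0),  # worse than p2 in both cost and latency
    ("p4", 20.0, 30.0),
]

front = pareto_front(candidates)
print([label for label, _, _ in front])
```

In this sketch no single placement is best in both objectives, so the result is a set of trade-offs between cost and latency rather than one winner.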