CLIC: An Extensible and Efficient Cross-Platform Data Analytics System

Qixiang Chen, Zhijun Chen, Kai Zhang, X. Sean Wang
DOI: https://doi.org/10.1109/tpds.2023.3298038
IF: 5.3
2023-01-01
IEEE Transactions on Parallel and Distributed Systems
Abstract: With the ever-increasing data volume and application diversity, a modern data analytics job is generally built as a workflow consisting of multiple tasks. For either specific functionalities or higher performance, tasks in a workflow may need to be deployed on different data processing platforms. This paper proposes CLIC, a highly extensible system for efficient cross-platform data analytics. To leverage the advantage of diverse platforms while alleviating development efforts, we propose an embedding-based operator encoding scheme and a Graph Convolutional Network model for efficient platform selection. Aiming at flexibly integrating new operators and platforms, CLIC is designed with a highly extensible system architecture that decouples the core functionalities from backend platforms. Experiments show that CLIC can significantly improve the performance of modern data analysis workflows with fast platform selection.
Computer Science, Theory & Methods; Engineering, Electrical & Electronic
What problem does this paper attempt to address?