A Time-Cost Based Automatic Scheduling Framework for Matrix Computation on Various Distributed Computing Platforms

Rong Gu, Zhiqiang Liu, Chunfeng Yuan, Yihua Huang
DOI: https://doi.org/10.1109/ipdpsw.2016.53
2016-01-01
Abstract: Matrix computation is considered to be the core of many machine learning and graph algorithm workloads. In the traditional single-node era, numerical analysis platforms such as R and Matlab provide a matrix programming model natively. As data scales up in the Big Data era, there is an increasing demand to seamlessly integrate large-scale matrix computation into distributed data-parallel computing systems. Therefore, a variety of matrix computation libraries have been implemented on distributed computing platforms such as MPI, Hadoop and Spark. However, a given matrix-based algorithm can perform quite differently across platforms, and it is very challenging for data scientists to choose the platform or combination of platforms that achieves the best performance for a given algorithm workflow. To solve this problem, we put forward a time-cost based scheduling framework that automatically selects the best platforms for the matrix operations and schedules the execution workflow. We have implemented a system prototype that uses R as the user language and MPI, R and Spark as the backend computing platforms. The experimental results show that our time-cost based model has good accuracy, with an error rate of less than 10% on average. Moreover, the scheduling framework built on it achieves efficient performance in applications.
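To make the idea of time-cost based platform selection concrete, here is a minimal sketch, not the authors' implementation. It assumes a linear workflow of matrix operations, a table of estimated run times per platform (all numbers below are invented for illustration), and a fixed, hypothetical `switch_cost` charged whenever consecutive operations run on different platforms. Under those assumptions, a simple dynamic program over the chain picks the cheapest plan. The function name `schedule_chain` and the cost values are illustrative only; the abstract does not specify the actual cost model or scheduling algorithm.

```python
from typing import Dict, List, Tuple

# Backend platforms named in the abstract.
PLATFORMS = ["R", "MPI", "Spark"]


def schedule_chain(
    op_costs: List[Dict[str, float]], switch_cost: float
) -> Tuple[float, List[str]]:
    """Choose one platform per operation in a linear workflow.

    Minimizes the sum of estimated run times plus a fixed penalty
    (an assumption of this sketch) for every platform switch.
    """
    # best[p] = (total cost, plan) of the cheapest plan ending on platform p.
    best = {p: (op_costs[0][p], [p]) for p in PLATFORMS}
    for costs in op_costs[1:]:
        nxt = {}
        for p in PLATFORMS:
            candidates = []
            for q in PLATFORMS:
                prev_cost, prev_plan = best[q]
                penalty = 0.0 if q == p else switch_cost
                candidates.append((prev_cost + penalty + costs[p], prev_plan + [p]))
            nxt[p] = min(candidates, key=lambda c: c[0])
        best = nxt
    return min(best.values(), key=lambda c: c[0])


# Hypothetical estimated run times (seconds) for three chained operations.
example_costs = [
    {"R": 1.0, "MPI": 5.0, "Spark": 8.0},
    {"R": 50.0, "MPI": 10.0, "Spark": 11.0},
    {"R": 40.0, "MPI": 9.0, "Spark": 3.0},
]

total, plan = schedule_chain(example_costs, switch_cost=2.0)
print(total, plan)
```

With a nonzero switch penalty the plan stays on one platform for the last two operations instead of taking the cheapest platform at every step, which illustrates why a workflow-level schedule can differ from per-operation choices.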