BigDL: A Distributed Deep Learning Framework for Big Data

Jason Dai,Yiheng Wang,Xin Qiu,Ding Ding,Yao Zhang,Yanzhang Wang,Xianyan Jia,Cherry Zhang,Yan Wan,Zhichao Li,Jiao Wang,Shengsheng Huang,Zhongyuan Wu,Yang Wang,Yuhao Yang,Bowen She,Dongjie Shi,Qi Lu,Kai Huang,Guoqiong Song
DOI: https://doi.org/10.1145/3357223.3362707
2019-11-05
Abstract:This paper presents BigDL (a distributed deep learning framework for Apache Spark), which has been used by a variety of users in the industry for building deep learning applications on production big data platforms. It allows deep learning applications to run on the Apache Hadoop/Spark cluster so as to directly process the production data, and as a part of the end-to-end data analysis pipeline for deployment and management. Unlike existing deep learning frameworks, BigDL implements distributed, data parallel training directly on top of the functional compute model (with copy-on-write and coarse-grained operations) of Spark. We also share real-world experience and "war stories" of users that have adopted BigDL to address their challenges(i.e., how to easily build end-to-end data analysis and deep learning pipelines for their production data).
Distributed, Parallel, and Cluster Computing,Artificial Intelligence,Machine Learning
What problem does this paper attempt to address?