A Model-Driven Parallel Processing System for IoT Data Based on User-Defined Functions

Jianqiao Luo, Li Zhang, Xuan Li
DOI: https://doi.org/10.1109/icccbda49378.2020.9095646
2020-01-01
Abstract: Internet of Things (IoT) devices have produced large volumes of data rapidly in recent years. Although parallel computing architectures such as MapReduce have been successful in processing massive data, they are not sufficient for practical problems. The variety of IoT sensors and usage scenarios requires many different applications; these applications may involve complex logic that is unsuitable for the ordinary MapReduce model, and managing them can be difficult. This paper presents a novel model-driven parallel processing system based on user-defined functions to solve these problems. The system uses a model-driven method to define and manage parallel computation tasks, separates the procedures of data collection and data computation, and presents a generalized user-defined function (UDF) abstraction for data computation that allows users to implement applications simply and greatly reduces programming cost. The system runtime is based on Apache Spark, and computation tasks are translated into Apache Spark applications. Experiments show that the system's performance is in line with theoretical expectations.
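To make the idea of a task model with a pluggable UDF more concrete, here is a minimal sketch in plain Python. It is not the paper's API and does not use Apache Spark: the names `TaskModel`, `run_task`, `collect_readings`, and `to_fahrenheit` are hypothetical, the sensor readings are made-up illustration data, and a thread pool stands in for the Spark runtime. It only illustrates the separation the abstract describes, where data collection and data computation are declared independently and the user supplies just the per-partition function.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]  # hypothetical record shape for this sketch


@dataclass
class TaskModel:
    """Hypothetical declarative description of one computation task."""
    name: str
    collect: Callable[[], List[Record]]           # data collection step
    udf: Callable[[List[Record]], List[Record]]   # user-defined computation per partition
    partitions: int = 2                           # assumed to be >= 1


def run_task(model: TaskModel) -> List[Record]:
    """Collect the data, split it into partitions, and apply the UDF in parallel."""
    records = model.collect()
    # Round-robin split into the requested number of partitions.
    chunks = [records[i::model.partitions] for i in range(model.partitions)]
    with ThreadPoolExecutor(max_workers=model.partitions) as pool:
        results = list(pool.map(model.udf, chunks))
    # Flatten the per-partition results into one list.
    return [row for part in results for row in part]


def collect_readings() -> List[Record]:
    # Made-up sensor readings, standing in for a real data source.
    return [
        {"sensor": "t1", "celsius": 20.0},
        {"sensor": "t2", "celsius": 25.0},
        {"sensor": "t3", "celsius": 30.0},
    ]


def to_fahrenheit(partition: List[Record]) -> List[Record]:
    # The only application logic the user writes: one function over a partition.
    return [
        {"sensor": r["sensor"], "fahrenheit": r["celsius"] * 9 / 5 + 32}
        for r in partition
    ]


model = TaskModel(name="convert", collect=collect_readings, udf=to_fahrenheit)
output = sorted(run_task(model), key=lambda r: r["sensor"])
```

In this sketch, swapping in a different application means replacing only `udf` (and possibly `collect`) in the model; the runner stays unchanged, which is the kind of programming-cost reduction the abstract claims for its UDF abstraction.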