Dataflow Model and Its Applications in Big Data Processing

BI Nifei, DING Guangyao, CHEN Qihang, XU Chen, ZHOU Aoying
DOI: https://doi.org/10.11959/j.issn.2096-0271.2020025
IF: 3.3
2020-01-01
Big Data Research
Abstract: Unbounded, unordered, and large-scale datasets have become increasingly common in recent years. Meanwhile, the processing requirements of data consumers are becoming more sophisticated, involving concerns such as event time, windowing, and latency. To address these evolving requirements on unbounded, unordered, and large-scale datasets, the dataflow model in big data processing is introduced. On the one hand, the dataflow graph of the dataflow model is analyzed at the level of the execution engine. On the other hand, the dataflow programming model is analyzed at the level of unified programming. Furthermore, the different implementations of the dataflow graph and the dataflow programming model in multiple execution engines are analyzed, including Spark, a batch processing engine, and Flink, a stream processing engine.
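To make the notions of event time and windowing mentioned in the abstract concrete, the following is a minimal sketch in plain Python, not code from the paper and not the Spark or Flink API. It assumes events are `(event_time, value)` pairs that arrive out of order, and it groups them into fixed-size windows by event time rather than by arrival order. The function names `assign_fixed_window` and `sum_by_event_time_window` and the sample data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def assign_fixed_window(event_time, window_size):
    """Return the half-open window [start, end) that contains event_time."""
    start = (event_time // window_size) * window_size
    return (start, start + window_size)


def sum_by_event_time_window(events, window_size):
    """Sum values per fixed window, keyed by event time, not arrival order."""
    totals = defaultdict(int)
    for event_time, value in events:
        totals[assign_fixed_window(event_time, window_size)] += value
    # Sort by window start so the output is ordered by event time.
    return dict(sorted(totals.items()))


# Hypothetical events listed in arrival order; event times are unordered.
events = [(12, 5), (3, 1), (27, 4), (8, 2), (15, 3)]
result = sum_by_event_time_window(events, window_size=10)
print(result)
```

This sketch processes a finite list in one pass. An engine handling an unbounded stream would additionally need a rule for deciding when a window's result can be emitted, which is where the latency concern in the abstract comes in.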