Real-time data stream processing and key techniques oriented to large-scale sensor data

Kaiyuan Qi, Yanbo Han, Zhuofeng Zhao, MA Qiang
DOI: https://doi.org/10.13196/j.cims.2013.03.195.qiky.018
2013-01-01
Abstract: With the development of the Internet of Things, realizing real-time computation over high-speed data streams on the basis of large-scale historical sensor data has become a new challenge for cloud manufacturing. A processing method named Real-Time MapReduce (RTMR), oriented to large-scale historical data, was proposed; it improves the data stream processing capacity of MapReduce through intermediate result caching, pipelining and localization. To construct RTMR sets, a model analysis method was used to configure node types and the topological structure according to application characteristics and the cluster environment. Furthermore, to achieve cluster load balancing, idle nodes and overloaded nodes were grouped by computing the load state transition relation. The NP-hard dynamic load balancing problem was thereby decomposed into small-scale sub-problems, with execution time and data cost combined as each sub-problem's optimization objective. Experimental results showed that the proposed method and techniques could ensure scalability for data stream processing over large-scale historical data.
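The idea of an intermediate result cache can be illustrated with a minimal sketch. It is not the RTMR implementation from the paper; it only assumes that a reduce-side aggregate over historical readings is kept in memory, so that each new stream record updates it incrementally instead of triggering a recomputation over the full history. The names `map_reading`, `combine` and `CachedReducer`, and the running-average workload, are hypothetical choices made for illustration.

```python
def map_reading(record):
    """Map step: turn one (sensor_id, value) reading into a partial aggregate."""
    sensor_id, value = record
    return sensor_id, (value, 1)


def combine(a, b):
    """Reduce step: merge two partial (sum, count) aggregates."""
    return a[0] + b[0], a[1] + b[1]


class CachedReducer:
    """Keeps per-key intermediate results so stream records update them in place."""

    def __init__(self):
        # Hypothetical in-memory cache: sensor_id -> (sum, count)
        self.cache = {}

    def load_history(self, records):
        # Build the cached intermediate results once from historical data.
        for record in records:
            self.update(record)

    def update(self, record):
        # Merge one record into the cached aggregate and return the new average.
        key, partial = map_reading(record)
        self.cache[key] = combine(self.cache.get(key, (0.0, 0)), partial)
        total, count = self.cache[key]
        return key, total / count


history = [("s1", 10.0), ("s1", 20.0), ("s2", 5.0)]
reducer = CachedReducer()
reducer.load_history(history)

# A new stream record touches only the cached aggregate for its own key.
result = reducer.update(("s1", 30.0))
```

In this sketch the cost of handling a stream record is independent of the size of the history, which is the property a cached intermediate result is meant to provide; pipelining, localization and load balancing are not modeled here.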