An edge streaming data processing framework for autonomous driving
Hang Zhao, LinBin Yao, ZhiXin Zeng, DongHua Li, JinLiang Xie, WeiLing Zhu, Jie Tang
DOI: https://doi.org/10.1080/09540091.2020.1782840
2020-06-25
Connection Science
Abstract: In recent years, with the rapid development of sensing technology and the Internet of Things (IoT), sensors have played increasingly important roles in traffic control, medical monitoring, industrial production, and other fields. They generate high volumes of streaming data that often need to be processed in real time. Streaming data computing technology therefore plays an indispensable role in processing sensor data in real time with high throughput and low latency. However, deploying streaming data processing in a cloud computing data centre raises two problems. First, massive numbers of sensor nodes upload data simultaneously to the remote cloud data centre, which requires a large amount of bandwidth, and the existing network infrastructure cannot provide enough bandwidth at a reasonable price. Second, because cloud data centres are geographically distributed, data transmission inevitably incurs large delays. Such end-to-end delay is intolerable for mobile applications, especially latency-sensitive tasks. To address these problems, this paper proposes an edge streaming data processing framework oriented to autonomous driving, which migrates computing and storage capability from the remote cloud data centre to the edge data centre. It focuses on changes in vehicle flow within a specific geographical area and uses the computing power pushed down to the edge node to process the massive streaming data generated by nearby autonomous vehicles. The framework is implemented on top of Spark Streaming and comprises a gray-model-based traffic flow monitor, a traffic prediction layer, and a fuzzy-control-based layer that dynamically adjusts the Spark Streaming Batch Interval. It can forecast variations in the sensor data arrival rate, adjust the streaming Batch Interval in advance, and process streams in real time at the edge.
Therefore, it can monitor and predict changes in the flow of autonomous vehicle sensor data within the geographical coverage area of an edge computing node, while minimising end-to-end latency and still satisfying the application's throughput requirements. Experiments show that it predicts short-term traffic with a relative error of no more than 4% over a whole day. By keeping the batch consumption rate close to the data generation rate, it maintains system stability even when the data arrival rate changes rapidly. When the data arrival rate doubles, the Batch Interval converges to a suitable value within two minutes. Compared with vanilla Spark Streaming, which suffers serious task accumulation and large delays, it reduces latency by 35% by shrinking the Batch Interval when the data arrival rate is low; when the data arrival rate is high, it significantly improves system throughput with a Batch Interval increase of at most 25%.
computer science, artificial intelligence, theory & methods
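Illustrative note: the abstract mentions a gray-model-based traffic monitor but does not say which gray model is used or how it is parameterised. The sketch below assumes the common GM(1,1) form and shows how a short history of data arrival rates could be extrapolated one or more steps ahead. The function name `gm11_forecast`, the minimum history length, and the sample values are hypothetical and are not taken from the paper.

```python
import math


def gm11_forecast(series, steps=1):
    """Forecast future values with a GM(1,1) gray model.

    Illustrative only: assumes a short, positive series such as
    recent data arrival rates. Not the paper's implementation.
    """
    n = len(series)
    if n < 4:
        raise ValueError("GM(1,1) is conventionally fitted on at least 4 points")

    # 1. Accumulated generating operation: x1[k] = x0[0] + ... + x0[k].
    x1, total = [], 0.0
    for value in series:
        total += value
        x1.append(total)

    # 2. Background values: mean of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = [float(v) for v in series[1:]]

    # 3. Least-squares fit of x0[k] = -a * z[k] + b (simple linear regression).
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    denom = m * szz - sz * sz
    if denom == 0:
        raise ValueError("degenerate series: cannot fit GM(1,1)")
    slope = (m * szy - sz * sy) / denom
    b = (sy - slope * sz) / m
    a = -slope  # development coefficient

    # A flat series gives a == 0; the model then reduces to a constant.
    if abs(a) < 1e-12:
        return [b] * steps

    # 4. Time response of the accumulated series, then difference it
    #    to recover forecasts of the original series.
    c = series[0] - b / a

    def x1_hat(k):
        return c * math.exp(-a * k) + b / a

    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


if __name__ == "__main__":
    # Hypothetical arrival rates (records per second) over five intervals.
    history = [100.0, 110.0, 121.0, 133.1, 146.41]
    print(gm11_forecast(history, steps=2))
```

In a setup like the one the abstract describes, a forecast of this kind could feed the controller that adjusts the Batch Interval before the load actually changes. How the paper couples the two layers is not stated in the abstract.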