Extraction and Analysis of Hotspot Region of Parallel Taxi Trajectory Based on Spark

Y. Sheng, Xiaoji Lan, Xueli Li
DOI: https://doi.org/10.12677/CSA.2018.89161
Abstract: Taxi GPS trajectory data contain rich information about residents' travel patterns, but growing data volumes place new demands on the accuracy and efficiency of data mining. This paper takes Chengdu taxi GPS trajectory data as its research object. First, distorted records and redundant fields are removed from the raw data, the data are filtered by time period, and the trajectories are map-matched. Then the K-means|| algorithm is implemented on the Spark big data processing platform, and working days and rest days are analyzed separately to obtain the hotspot areas of Chengdu residents' travel and their spatiotemporal distribution characteristics. Finally, the performance of K-means and K-means|| is compared; the results show that K-means|| on Spark is superior in both accuracy and time efficiency to the single-machine approach.
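To make the K-means|| step concrete, the following is a minimal single-process sketch of the k-means|| seeding idea (oversample candidate centers over a few rounds, weight them, then reduce to k), followed by ordinary Lloyd iterations. It is not the paper's Spark implementation. The function names (`kmeans_parallel_init`, `lloyd`, `cost`), the default oversampling factor of 2k, the five sampling rounds, and the use of planar squared Euclidean distance on 2-D points are all assumptions made for illustration. In Spark MLlib, the built-in `KMeans` estimator offers a `k-means||` initialization mode that plays the same role on distributed data.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points (planar assumption)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def nearest(p, centers):
    """Index of the center closest to p."""
    return min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))

def cost(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(sq_dist(p, centers[nearest(p, centers)]) for p in points)

def kmeans_parallel_init(points, k, oversample=None, rounds=5, seed=0):
    """k-means||-style seeding; may return fewer than k seeds on degenerate data."""
    rng = random.Random(seed)
    ell = 2 * k if oversample is None else oversample  # assumed oversampling factor
    candidates = [rng.choice(points)]
    for _ in range(rounds):
        dists = [sq_dist(p, candidates[nearest(p, candidates)]) for p in points]
        total = sum(dists)
        if total == 0:
            break
        # Each point is sampled independently, in proportion to its current cost.
        # On a cluster this is the step that runs in parallel over partitions.
        candidates += [p for p, d in zip(points, dists)
                       if rng.random() < min(1.0, ell * d / total)]
    # Weight each candidate by the number of points closest to it.
    weights = [0] * len(candidates)
    for p in points:
        weights[nearest(p, candidates)] += 1
    # Reduce the small candidate set to k seeds with weighted k-means++ sampling.
    seeds = [rng.choices(candidates, weights=weights)[0]]
    while len(seeds) < k:
        scores = [w * sq_dist(c, seeds[nearest(c, seeds)])
                  for c, w in zip(candidates, weights)]
        if sum(scores) == 0:
            break
        seeds.append(rng.choices(candidates, weights=scores)[0])
    return seeds

def lloyd(points, centers, iterations=10):
    """Standard Lloyd iterations; an empty cluster keeps its previous center."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers
```

A typical call would be `lloyd(points, kmeans_parallel_init(points, k=3))`, where `points` is a list of `(x, y)` tuples. Because the seeding needs only a few passes over the data and each pass samples points independently, it is well suited to distribution, unlike the k sequential passes of plain k-means++.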
Computer Science