Identifying Commuters Based on Random Forest of Smartcard Data

Zhenyu Mei, Wenchao Ding, Chi Feng, Liting Shen
DOI: https://doi.org/10.1049/iet-its.2019.0414
IF: 2.7
2020-01-01
IET Intelligent Transport Systems
Abstract: Commuter flow is an important part of metro passenger flow. The aim of this study is to develop an efficient and effective method to identify the spatiotemporal commuting patterns of Metro Line 2 in Hangzhou. Using one week of transit smartcard data and a questionnaire survey of Metro Line 2, the authors characterised the spatiotemporal regularity of individual commuters, including first travel time, last travel time, and the number of travelling days on weekdays. These data can be used to identify transit commuters through ensemble learning. The random forest algorithm was adopted as a low-cost, high-efficiency analysis method, and the classification model was built from travel time, number of travelling days, and the unique tag information in the questionnaire survey data. Numerical tests showed that the precision and recall of the proposed model reach as high as 0.96 and 0.92, respectively. Finally, the validated random forest model was applied to identify metro commuters from the smartcard data. The results show that fewer than one-third of passengers are commuters and that their trips are mainly concentrated during peak hours. These extracted personal-level commute models can serve as valuable information for the design and optimisation of public transportation networks.
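As a rough illustration of the features and metrics named in the abstract, the sketch below derives first travel time, last travel time, and number of weekday travelling days per card, and computes precision and recall from labelled predictions. It is not the authors' implementation. The record layout `(card_id, timestamp)`, the function names `extract_features` and `precision_recall`, the sample records, and the choice to average the first and last tap-in times across weekdays are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def extract_features(records):
    """Build per-card features from tap-in records.

    records: iterable of (card_id, "YYYY-MM-DD HH:MM:SS") tuples (assumed layout).
    Returns {card_id: (mean_first_minute, mean_last_minute, n_weekdays)},
    where minutes are counted from midnight and weekends are ignored.
    """
    by_card_day = defaultdict(lambda: defaultdict(list))
    for card_id, ts in records:
        t = datetime.fromisoformat(ts)
        if t.weekday() < 5:  # Monday=0 .. Friday=4
            by_card_day[card_id][t.date()].append(t.hour * 60 + t.minute)

    features = {}
    for card_id, days in by_card_day.items():
        firsts = [min(minutes) for minutes in days.values()]
        lasts = [max(minutes) for minutes in days.values()]
        # Averaging across days is an assumption, not taken from the paper.
        features[card_id] = (mean(firsts), mean(lasts), len(days))
    return features


def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = commuter, 0 = other)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical sample records: card A travels on two weekdays,
# card B once on a weekday and once on a Saturday (ignored).
sample = [
    ("A", "2019-06-03 08:00:00"),
    ("A", "2019-06-03 18:00:00"),
    ("A", "2019-06-04 08:10:00"),
    ("A", "2019-06-04 18:20:00"),
    ("B", "2019-06-05 13:30:00"),
    ("B", "2019-06-08 10:00:00"),
]
features = extract_features(sample)
```

Feature rows of this shape, joined with survey labels, could then be passed to any random forest implementation, for example scikit-learn's `RandomForestClassifier`; the abstract does not say which software the authors used.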