Unveiling the influence of behavioural, built environment and socio-economic features on the spatial and temporal variability of bus use using explainable machine learning

Sui Tao, Francisco Rowe, Hongyu Shan
2024-02-06
Abstract: Understanding the variability of people's travel patterns is key to transport planning and policy-making. However, the extent to which daily transit use displays geographic and temporal variability, and the factors that contribute to it, have not been fully addressed. Drawing on smart card data in Beijing, China, this study seeks to address these gaps by adopting new indices to capture the spatial and temporal variability of bus use during peak hours and investigating their associations with relevant contextual features. Using explainable machine learning, our findings reveal non-linear interaction between spatial and temporal variability and trip frequency. Furthermore, greater distance to the urban centres (>10 kilometres) is associated with increased spatial variability of bus use, while greater separation of trip origins and destinations from the subcentres reduces both spatial and temporal variability. Higher availability of bus routes is linked to higher spatial variability but lower temporal variability. Meanwhile, both lower and higher road density are associated with higher spatial variability of bus use, especially in the morning. These findings indicate that different built environment features moderate the flexibility of travel time and locations. Implications are derived to inform more responsive and reliable operation and planning of transit systems.
Computers and Society
What problem does this paper attempt to address?
This paper attempts to address the following problems:

1. **Spatial and Temporal Variability in Public Transport Use**: The research aims to characterize the spatial and temporal variability of bus use in different areas of the city and at different times (such as the morning and evening rush hours). This includes exploring the patterns of this variability and its influencing factors.
2. **Behavioral, Built-environment and Socio-economic Factors Affecting the Variability of Bus Use**: The research attempts to understand which behavioral characteristics (such as travel frequency), built-environment characteristics (such as distance from the city center, road density, number of bus routes) and socio-economic characteristics (such as population density, income level) have a significant impact on the temporal and spatial variability of bus use, and how these factors interact.
3. **Explaining Non-linear Relationships**: By using interpretable machine-learning methods (XGBoost with SHAP values), the research attempts to reveal the non-linear relationships between these influencing factors and the variability of bus use. For example, it finds that a greater distance from the city center (>10 kilometers) is associated with higher spatial variability, while higher bus-route availability is associated with higher spatial variability but lower temporal variability.

### Research Background

With the wide application of smart-card data (SCD), researchers are able to gain a deeper understanding of people's travel behaviors, especially the use of public transport. However, most existing research focuses on variability in only one dimension, spatial or temporal, and rarely considers both dimensions and their influencing factors simultaneously.
In addition, there is still a lack of systematic understanding of how these influencing factors specifically affect the temporal and spatial variability of bus use.

### Research Methods

To answer the above questions, the research adopts the following methods:

- **Data Sources**: One month of smart-card data from Beijing (June 2016) was used, covering individual-level bus-travel records.
- **Index Definitions**: Spatial variability \( SV \) and temporal variability \( TV \) were defined, representing the spatial and temporal distances between different trips of the same person, respectively.
  - Spatial distance formula:
    \[ D_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + (x'_i - x'_j)^2 + (y'_i - y'_j)^2} \]
  - Temporal distance formula:
    \[ T_{ij} = \sqrt{(t_i - t_j)^2 + (t'_i - t'_j)^2} \]
  - Spatial variability formula:
    \[ SV = \frac{1}{C_n^2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} D_{ij} \]
  - Temporal variability formula:
    \[ TV = \frac{1}{C_n^2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} T_{ij} \]
- **Model Selection**: The XGBoost model was adopted to quantify the relationships between the variability of bus use and various features, and SHAP values were used to explain the non-linear characteristics of these relationships.

Through these methods, the research aims to provide more responsive and reliable suggestions for public-transport planning and operation to better meet the diverse needs of passengers.
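
As a rough illustration of the index definitions above, the following Python sketch evaluates the \( SV \) and \( TV \) formulas as written for one traveller. It is not the authors' code. Several things are assumptions: the function name `variability` is hypothetical; \( (x, y) \) and \( (x', y') \) are read as trip origin and destination coordinates and \( t \), \( t' \) as trip start and end times; \( C_n^2 \) is read as the binomial coefficient "n choose 2"; and the sample coordinates, times and their units are invented for the example.

```python
import math


def variability(trips):
    """Evaluate (1 / C(n, 2)) * sum_i sum_j dist(trip_i, trip_j).

    Each trip is a tuple of numbers. With 4-tuples (x, y, x', y') the
    pairwise distance is D_ij and the result is SV; with 2-tuples
    (t, t') the pairwise distance is T_ij and the result is TV.
    The tuple layout is an assumption made for this sketch.
    """
    n = len(trips)
    if n < 2:
        raise ValueError("at least two trips are needed")
    # Double sum over all ordered pairs (i, j), as in the formulas;
    # the i == j terms contribute zero.
    total = sum(math.dist(a, b) for a in trips for b in trips)
    return total / math.comb(n, 2)


# Hypothetical data: two trips of one card holder.
spatial_trips = [(0.0, 0.0, 0.0, 0.0), (3.0, 4.0, 0.0, 0.0)]  # assumed km
temporal_trips = [(8.0, 8.5), (8.5, 9.0)]                     # assumed hours

sv = variability(spatial_trips)
tv = variability(temporal_trips)
print(sv, tv)
```

Because the double sum visits every unordered pair of trips twice, the value computed this way is twice the mean pairwise distance. Restricting the sum to \( i < j \) would give the mean pairwise distance itself.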