Predicting Traffic Congestion at Urban Intersections Using Data-Driven Modeling

Tara Kelly, Jessica Gupta
2024-05-30
Abstract: Traffic congestion at intersections is a significant issue in urban areas, leading to increased commute times, safety hazards, and operational inefficiencies. This study aims to develop a predictive model for congestion at intersections in major U.S. cities, utilizing a dataset of trip-logging metrics from commercial vehicles across 4,800 intersections. The dataset encompasses 27 features, including intersection coordinates, street names, time of day, and traffic metrics (Kashyap et al., 2019). Additional features, such as rainfall/snowfall percentage, distance from downtown and outskirts, and road types, were incorporated to enhance the model's predictive power. The methodology involves data exploration, feature transformation, and handling missing values through low-rank models and label encoding. The proposed model has the potential to assist city planners and governments in anticipating traffic hot spots, optimizing operations, and identifying infrastructure challenges.
Machine Learning
What problem does this paper attempt to address?
This paper addresses the problem of predicting traffic congestion at urban intersections. The researchers build a data-driven predictive model from a large dataset of trip records logged by commercial vehicles at 4,800 intersections. The dataset includes 27 features, such as intersection coordinates, street names, time of day, and traffic metrics. Additional features, namely rainfall/snowfall percentage, distance from downtown and the outskirts, and road type, are added to improve the model's predictive power.
The paper first describes the problems caused by urban traffic congestion and reviews how the existing literature applies machine learning and predictive modeling to urban traffic challenges. The methodology covers data exploration, feature engineering, handling of missing values, and model development and evaluation with several machine learning algorithms, including linear regression, decision trees, random forests, and neural networks. The study also examines how well the models generalize and how robust they are, using cross-validation and hyperparameter tuning to optimize performance.
The results show that polynomial linear regression, K-nearest neighbors (K-NN), and gradient boosting each have strengths and weaknesses in predicting traffic congestion. Linear regression is simple and easy to interpret but may not capture nonlinear relationships. K-NN performs well but has higher computational complexity at prediction time. Gradient boosting models perform strongly on complex relationships with lower computational complexity, making them suitable for real-time prediction. The paper concludes that these models can help urban planners and governments predict traffic hotspots, optimize operations, and identify infrastructure challenges.
Future research could explore ensemble methods that combine these techniques, as well as deep learning models that handle spatiotemporal patterns. The paper also emphasizes the importance of data security and privacy protection, especially when machine learning models are applied in traffic management systems.
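
To make the evaluation steps in the summary concrete, the following is a minimal sketch, not the authors' code, of label encoding a categorical feature and then using k-fold cross-validation to tune the number of neighbors in a K-NN regressor. It uses only the Python standard library. The data is synthetic, and the road-type names, the hour-of-day feature, the target formula, and all function names are assumptions made for illustration.

```python
import random
from statistics import mean


def label_encode(values):
    """Map each distinct category to an integer, in order of first appearance."""
    mapping = {}
    for v in values:
        mapping.setdefault(v, len(mapping))
    return [mapping[v] for v in values], mapping


def knn_predict(train_X, train_y, x, k):
    """Average the targets of the k training rows closest to x.

    Every call scans the full training set, which is why K-NN is
    comparatively expensive at prediction time.
    """
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), target)
        for row, target in zip(train_X, train_y)
    )[:k]
    return mean(target for _, target in nearest)


def cross_val_rmse(X, y, k, n_folds=5):
    """Mean RMSE of K-NN across n_folds contiguous held-out folds."""
    n = len(X)
    scores = []
    for fold in range(n_folds):
        lo, hi = fold * n // n_folds, (fold + 1) * n // n_folds
        train_X, train_y = X[:lo] + X[hi:], y[:lo] + y[hi:]
        sq_err = [
            (knn_predict(train_X, train_y, X[i], k) - y[i]) ** 2
            for i in range(lo, hi)
        ]
        scores.append(mean(sq_err) ** 0.5)
    return mean(scores)


# Synthetic stand-in for the real dataset (all values are made up).
rng = random.Random(0)
road_types = [rng.choice(["Street", "Avenue", "Boulevard"]) for _ in range(200)]
hours = [rng.randrange(24) for _ in range(200)]

# Note: integer codes impose an artificial ordering on the categories,
# which a distance-based model such as K-NN will treat as meaningful.
road_codes, road_mapping = label_encode(road_types)

# Hypothetical target: waiting time that peaks around 8:00 and 17:00.
y = [
    10
    + 8 * max(0, 1 - abs(h - 8) / 3)
    + 8 * max(0, 1 - abs(h - 17) / 3)
    + 3 * c
    + rng.gauss(0, 1)
    for h, c in zip(hours, road_codes)
]
# Scale both features to [0, 1] so neither dominates the distance.
X = [[h / 23, c / 2] for h, c in zip(hours, road_codes)]

# Hyperparameter tuning: pick the k with the lowest cross-validated RMSE.
results = {k: cross_val_rmse(X, y, k) for k in (1, 3, 5, 9, 15)}
best_k = min(results, key=results.get)
print("cross-validated RMSE by k:", results)
print("selected k:", best_k)
```

The same loop could wrap any of the other models named in the summary. Only the prediction function changes, while the fold splitting and the selection by lowest held-out error stay the same.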