A novel machine learning-based spatialized population synthesis framework

Khachman, Mohamed; Morency, Catherine; Ciari, Francesco
DOI: https://doi.org/10.1007/s11116-024-10534-0
IF: 4.814
2024-08-28
Transportation
Abstract: Synthetic populations are increasingly required in transportation demand modelling practice to feed the large-scale agent-based microsimulation platforms that are gaining in popularity. The quality of a synthetic population, i.e., how well it represents the sociodemographic and spatial distributions of the real population, is a determining factor in the reliability of the microsimulation it feeds. While many research works have focused on improving the sociodemographic accuracy of synthetic populations, the quality of their spatial distribution has received less attention. This paper proposes a new explicitly spatialized population synthesis framework. It leverages the high-performing Clustering Large Applications (CLARA) and Random Forest algorithms, together with rich spatial information collected as part of surveys, to accurately predict synthetic households' locations directly at the building scale. In addition to preserving optimal sociodemographic accuracy and achieving realistic explicit spatialization, the new framework shows acceptable transferability thanks to CLARA's efficiency. An explicitly spatialized synthetic population for Montreal Island is generated using the proposed clustering + classification framework. The four components of the framework generated satisfactory results: the zonal synthetic population shows a 2.85% average relative error, the selected building clustering has a 0.48 average silhouette width, the classification model achieves a 0.79 macro-average F1 score, and 78.9% of the synthetic households are assigned to their preferred building cluster.
Keywords: transportation; transportation science & technology; engineering, civil
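
The abstract reports an average relative error for the zonal synthetic population and a macro-average F1 score for the classification model, but it does not spell out how either is computed. The sketch below shows the standard textbook definitions in plain Python, as an illustration only: the function names, the choice to skip zero reference counts, and the toy inputs are assumptions, not the authors' implementation.

```python
def average_relative_error(synthetic, reference):
    """Mean of |synthetic - reference| / reference over the categories.

    Assumption: categories whose reference count is zero are skipped,
    since the relative error is undefined for them.
    """
    pairs = [(s, r) for s, r in zip(synthetic, reference) if r != 0]
    return sum(abs(s - r) / r for s, r in pairs) / len(pairs)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two zonal category counts and four cluster labels.
are = average_relative_error([105, 190], [100, 200])
f1 = macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

Because the macro average weights every building cluster equally, a small cluster that is classified poorly lowers the score as much as a large one would, which makes it a reasonable summary when cluster sizes are unbalanced.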