Spatiotemporal K-Nearest Neighbors Algorithm and Bayesian Approach for Estimating Urban Link Travel Time Distribution From Sparse GPS Trajectories

Wenwen Qin, Mingfeng Zhang, Wu Li, Yunyi Liang
DOI: https://doi.org/10.1109/mits.2023.3296331
IF: 5.293
2023-01-01
IEEE Intelligent Transportation Systems Magazine
Abstract: Travel time distribution (TTD) estimation on urban arterial links with sparse trajectory data is a practically important yet substantially challenging subject. Although several methods have been proposed to estimate link TTDs, their application is often limited by shortcomings such as the need for extra road geometric features, signal control plans, and model assumptions. As an alternative, this article makes full use of ubiquitous incomplete trajectories that traverse only part of the link and introduces a novel bilevel Bayesian sampling method to alleviate the data sparsity problem. The focus of this study is to develop a framework for estimating link TTDs from incomplete and complete trajectories by using the spatiotemporal K-nearest neighbors (KNN) algorithm and a Bayesian approach. Three major steps are involved:
• First, we consider a straightforward trajectory imputation method for missing GPS points to improve the input data quality and to serve as a basis for measuring the similarity between incomplete and complete trajectories.
• Then, a spatiotemporal KNN algorithm is proposed to estimate virtual link travel times of incomplete trajectories, thereby increasing the travel time sample size.
• Finally, a bilevel Bayesian sampling method comprising an improved particle filter and Gibbs sampling is introduced to approximate the posterior distribution of link travel times based on the enhanced data.
A case study was conducted on a major arterial in Nanjing, China. The results indicate that the proposed approach with the augmented data can achieve promising performance compared with the competing methods in terms of effectiveness and adaptiveness.
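The first two steps of the abstract (imputing missing points, then borrowing travel times from similar complete trajectories) can be illustrated with a short Python sketch. The abstract does not state the distance measure, the weighting scheme, or the value of K, so everything below is an assumption made for illustration: linear interpolation stands in for the imputation step, the distance combines a difference in segment travel time with a difference in entry time, and the neighbors are averaged with inverse-distance weights. The names `time_at`, `virtual_travel_time`, `time_scale`, and `tt_scale` are hypothetical and do not come from the paper.

```python
import math
from bisect import bisect_left


def time_at(traj, pos):
    """Linearly interpolate the timestamp at which a trajectory reaches `pos`.

    `traj` is a list of (timestamp_s, position_m) pairs sorted by time,
    with strictly increasing positions along the link.
    """
    positions = [p for _, p in traj]
    if not positions[0] <= pos <= positions[-1]:
        raise ValueError("position outside the observed span")
    i = bisect_left(positions, pos)
    if positions[i] == pos:
        return traj[i][0]
    (t0, p0), (t1, p1) = traj[i - 1], traj[i]
    return t0 + (t1 - t0) * (pos - p0) / (p1 - p0)


def virtual_travel_time(partial, completes, link_length, k=3,
                        time_scale=300.0, tt_scale=10.0):
    """Estimate a full-link travel time for a trajectory covering part of the link.

    Assumed similarity (not from the paper): Euclidean distance over
    (a) the scaled difference in travel time across the shared segment and
    (b) the scaled difference in the time the segment was entered.
    """
    s0, s1 = partial[0][1], partial[-1][1]
    t_in = partial[0][0]
    partial_tt = partial[-1][0] - t_in

    scored = []
    for traj in completes:
        seg_start = time_at(traj, s0)
        seg_tt = time_at(traj, s1) - seg_start
        full_tt = time_at(traj, link_length) - time_at(traj, 0.0)
        d_space = (seg_tt - partial_tt) / tt_scale
        d_time = (seg_start - t_in) / time_scale
        scored.append((math.hypot(d_space, d_time), full_tt))

    # Keep the K most similar complete trajectories.
    nearest = sorted(scored, key=lambda item: item[0])[:k]
    # Inverse-distance weights; the small constant avoids division by zero.
    weights = [1.0 / (d + 1e-6) for d, _ in nearest]
    return sum(w * tt for w, (_, tt) in zip(weights, nearest)) / sum(weights)


# Toy data: two complete trajectories on a 100 m link and one partial trajectory.
complete_a = [(0.0, 0.0), (10.0, 50.0), (20.0, 100.0)]
complete_b = [(100.0, 0.0), (130.0, 50.0), (160.0, 100.0)]
partial = [(5.0, 25.0), (10.0, 50.0)]

print(virtual_travel_time(partial, [complete_a, complete_b], 100.0, k=2))
```

In this toy setup the partial trajectory closely matches `complete_a` in both timing and speed, so the estimate is pulled toward that trajectory's full-link travel time. The resulting virtual travel times would then be pooled with the observed ones before any posterior sampling; the bilevel particle filter and Gibbs sampling step is not sketched here because the abstract gives no detail on its structure.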