Abstract: In recent years, passively recorded probe traffic volumes have increasingly been used to estimate traffic volumes. However, it is not always possible to count probe traffic volume in a spatial dataset when probe trajectories cannot be fully reconstructed from raw probe point location data due to sparse recording intervals, lack of pseudonyms or timestamps. As a result, the application of such probe point location data has been limited in traffic volume estimation. To relax these constraints, we present the exact distribution of the estimated probe traffic volume in a road segment based on probe point location data without trajectory reconstruction. The distribution of the estimated probe traffic volume can exhibit multimodality, without necessarily being line-symmetric with respect to the true probe traffic volume. As more probes are present, the distribution approaches a normal distribution. The conformity of the distribution was visualised through numerical simulations. Sometimes, there exists a local optimal cordon length that maximises estimation precision. The theoretical variance of estimated probe traffic volume can address heteroscedasticity in the modelling of traffic volume estimates.

### What problem does this paper attempt to solve?

This paper addresses how to estimate the probe traffic volume on a road segment from sparse, non-chronological probe point data without reconstructing trajectories. Specifically, the author proposes a method to estimate the number of probes passing through a road segment without relying on complete trajectory information. This fills a gap in existing methods, which cannot make effective use of probe data that is sparse or that lacks pseudonyms or timestamps.

#### Background and problem description

In recent years, passively recorded probe traffic volume data has increasingly been used to estimate traffic volume. However, when recording intervals are sparse or pseudonyms or timestamps are missing, probe trajectories cannot be fully reconstructed, so the probe traffic volume cannot be counted directly. This has limited the use of such probe point location data in traffic volume estimation. To relax these constraints, the paper proposes a probe traffic volume estimation method based on probe location data without trajectory reconstruction.

#### Main problems

1. **Data sparsity**: Probe recording intervals are long, so trajectories cannot be reconstructed accurately.
2. **Privacy protection**: Pseudonyms and timestamps may be absent from the data in order to protect user privacy.
3. **Limitations of traditional methods**: Existing traffic volume estimation methods rely on conventional devices at fixed locations (such as pneumatic tubes, inductive loops, and radars), whose deployment is limited by space, time and budget.

#### Solution

The author proposes a mathematical model that gives the exact distribution of the estimated probe traffic volume and examines how the estimate behaves under different conditions.
This model allows traffic volume to be estimated from probe location data without complete trajectory information, and its conformity was checked through numerical simulations. The author also examines how the length of the virtual cordon affects estimation precision and finds that, in some cases, there is a locally optimal cordon length that maximises precision.

#### Key formulas

1. **Unbiased estimator**:
   \[ \hat{m}=\frac{t}{d}\sum_{a = 1}^{n}s_{a} \]
   where \(\hat{m}\) is the estimated probe traffic volume, \(t\) is the recording interval, \(d\) is the length of the virtual cordon, \(n\) is the number of probe points recorded within the cordon, and \(s_{a}\) is the speed associated with the \(a\)-th probe point.
2. **Variance**:
   \[ \text{Var}[\hat{m}]=\frac{mt^{2}}{d^{2}}\int_{0}^{\infty}b(s, d, t)g(s)\,ds \]
   where \(m\) is the true probe traffic volume, \(b(s, d, t)=s^{2}p(1 - p)\), \(p\) is the fractional part of \(d/(st)\), and \(g(s)\) is the probability density function of the probe speed.
3. **Normal distribution approximation**: as \(m\) grows, the distribution \(f(\hat{m}; m)\) of the estimate approaches a normal distribution with mean \(m\) and the variance given above:
   \[ \lim_{m\rightarrow\infty}f(\hat{m}; m)=N\left(m,\frac{mt^{2}}{d^{2}}\int_{0}^{\infty}b(s, d, t)g(s)\,ds\right) \]
4. **Optimal cordon length**:
   \[ \operatorname*{arg\,min}_{0 < d\leq\max(d)}\text{obj}(d) \]
   where \(\text{obj}(d)\) can be the coefficient of variation (CV) or the variance-to-mean ratio (VMR) of \(\hat{m}\).

#### Conclusion

The model proposed in this paper offers a way to estimate traffic volume from probe location data without reconstructing trajectories. It makes otherwise unusable probe data usable and allows traffic volume to be estimated while user privacy is protected.
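
The estimator and variance formula listed under "Key formulas" can be checked numerically. The following is a minimal sketch, not the paper's own simulation. It assumes that each probe crosses the cordon at a constant speed, records a point every \(t\) seconds with a uniformly random phase, and that \(p\) is the fractional part of \(d/(st)\). The uniform speed distribution, the parameter values, and all function names are hypothetical choices made for illustration.

```python
import numpy as np


def sample_speeds(rng, size):
    """Hypothetical speed density g(s): uniform on [8, 16] m/s (an assumption)."""
    return rng.uniform(8.0, 16.0, size=size)


def simulate_estimates(m, t, d, n_trials, rng):
    """Draw n_trials values of m_hat for m probes crossing a cordon [0, d)."""
    speeds = sample_speeds(rng, (n_trials, m))
    spacing = speeds * t                                # distance between successive points
    offset = rng.uniform(size=speeds.shape) * spacing   # position of the first point
    # Points sit at offset + k * spacing (k = 0, 1, ...); count those below d.
    counts = np.maximum(np.ceil((d - offset) / spacing), 0.0)
    # Each recorded point contributes its speed: m_hat = (t / d) * sum(s_a).
    return (t / d) * np.sum(counts * speeds, axis=1)


def theoretical_variance(m, t, d, speed_samples):
    """Monte Carlo evaluation of (m t^2 / d^2) * E[s^2 p (1 - p)]."""
    ratio = d / (speed_samples * t)
    p = ratio - np.floor(ratio)                         # fractional part of d / (s t)
    return m * t**2 / d**2 * np.mean(speed_samples**2 * p * (1.0 - p))


rng = np.random.default_rng(0)
m, t, d = 50, 5.0, 100.0    # probes, seconds, metres (illustrative values)

estimates = simulate_estimates(m, t, d, n_trials=20_000, rng=rng)
speed_samples = sample_speeds(rng, 200_000)
var_theory = theoretical_variance(m, t, d, speed_samples)

print("mean of m_hat:", estimates.mean())
print("empirical variance:", estimates.var())
print("theoretical variance:", var_theory)

# Scan cordon lengths and compare the coefficient of variation sqrt(Var) / m.
d_grid = np.linspace(20.0, 400.0, 77)
cv_values = np.array(
    [np.sqrt(theoretical_variance(m, t, dd, speed_samples)) / m for dd in d_grid]
)
print("cordon length with smallest CV on the grid:", d_grid[np.argmin(cv_values)])
```

Under these assumptions, the mean of the simulated estimates should lie close to \(m\) and the empirical variance close to the theoretical value. The final scan only illustrates how an objective such as the CV can be compared across candidate cordon lengths. The grid bounds are arbitrary and do not come from the paper.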

On the Distribution of Probe Traffic Volume Estimated without Trajectory Reconstruction

Real-time Detection of Traffic Congestion Based on Trajectory Data

Estimating Traffic Flow in Large Road Networks Based on Multi-Source Traffic Data

Estimation of Queue Lengths, Probe Vehicle Penetration Rates, and Traffic Volumes at Signalized Intersections using Probe Vehicle Trajectories

Spatial random modeling of vehicular traffic in VANETs

How Many Probe Vehicles Are Enough for Identifying Traffic Congestion?—a Study from a Streaming Data Perspective

Characterising Scattering Features in Flow–density Plots Using a Stochastic Platoon Model

Urban Network-Wide Traffic Speed Estimation with Massive Ride-Sourcing GPS Traces

Real-time Estimation of Vehicle Counts on Signalized Intersection Approaches Using Probe Vehicle Data

Analysis of Time- and Space-domain Sampling for Probe Vehicle-based Traffic Information System.

Tracing Road Network Bottleneck By Data Driven Approach

A Real-time Traffic Information System Using Probe Vehicles

Development of a Machine-Learning-Based Novel Framework for Travel Time Distribution Determination Using Probe Vehicle Data

Traffic Volume Estimate Based on Low Penetration Connected Vehicle Data at Signalized Intersections: A Bayesian Deduction Approach

Estimating Historical Hourly Traffic Volumes via Machine Learning and Vehicle Probe Data: A Maryland Case Study

Network-wide Traffic Flow Estimation with Insufficient Volume Detection and Crowdsourcing Data

Estimation of urban traffic state with probe vehicles

Maximum Likelihood Estimation of Probe Vehicle Penetration Rates and Queue Length Distributions From Probe Vehicle Data

Network-wide identification of turn-level intersection congestion using only low-frequency probe vehicle data

An Improved Method For Estimating Urban Traffic State Via Probe Vehicle Tracking

Urban Road Traffic Speed Estimation for Missing Probe Vehicle Data Based on Multiple Linear Regression Model