Application of 2D Homography for High Resolution Traffic Data Collection using CCTV Cameras

Linlin Zhang, Xiang Yu, Abdulateef Daud, Abdul Rashid Mussah, Yaw Adu-Gyamfi
2024-01-14
Abstract: Traffic cameras remain the primary data source for surveillance activities such as congestion and incident monitoring. To date, state agencies continue to rely on manual effort to extract data from networked cameras because current automatic vision systems require complex camera calibration and cannot generate high-resolution data. This study implements a three-stage video analytics framework for extracting high-resolution traffic data, such as vehicle counts, speed, and acceleration, from infrastructure-mounted CCTV cameras. The key components of the framework are object recognition, perspective transformation, and vehicle trajectory reconstruction. First, a state-of-the-art vehicle recognition model is implemented to detect and classify vehicles. Next, to correct for camera distortion and reduce partial occlusion, an algorithm inspired by two-point linear perspective is used to extract the region of interest (ROI) automatically, while a 2D homography technique transforms the CCTV view to a bird's-eye view (BEV). Cameras are calibrated with a two-layer matrix system that converts image coordinates to real-world measurements, enabling the extraction of speed and acceleration. Individual vehicle trajectories are constructed and compared in BEV using two time-space-feature-based object trackers, namely Motpy and BYTETrack. The results show an error rate of about +/- 4.5% for directional traffic counts and an MSE of less than 10% for the speed bias between camera estimates and estimates from probe data sources. Extracting high-resolution data from traffic cameras has several implications, ranging from improvements in traffic management to the identification of dangerous driving behavior, high-risk areas for accidents, and other safety concerns, enabling proactive measures to reduce accidents and fatalities.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses how to extract high-resolution traffic data, such as vehicle counts, speed, and acceleration, from closed-circuit television (CCTV) cameras already installed on existing infrastructure. State agencies still rely on manual effort to extract data from networked cameras because current automatic vision systems require complex camera calibration and cannot generate high-resolution data. To overcome these limitations, the paper proposes a three-stage video analytics framework:

1. **Object Recognition**: Detect and classify vehicles using a state-of-the-art vehicle recognition model.
2. **Perspective Transformation**: Automatically extract the region of interest (ROI) using an algorithm inspired by two-point linear perspective, and transform the CCTV view into a bird's-eye view (BEV) with a 2D homography to correct for camera distortion and reduce partial occlusion.
3. **Vehicle Trajectory Reconstruction**: Construct individual vehicle trajectories in BEV and compare two object trackers based on time-space features (Motpy and BYTETrack) for traffic data collection.

With these methods, the researchers aim to improve the quality and resolution of traffic data, thereby supporting better traffic management, the identification of dangerous driving behavior, high-risk accident areas, and other safety concerns, and ultimately proactive measures to reduce accidents and fatalities.