Abstract: Classifying and counting vehicles in road traffic has numerous applications in the transportation engineering domain. However, the wide variety of vehicles (two-wheelers, three-wheelers, cars, buses, trucks, etc.) plying the roads of developing regions without any lane discipline makes vehicle classification and counting a hard problem to automate. In this paper, we use state-of-the-art Convolutional Neural Network (CNN) based object detection models and train them for multiple vehicle classes using data from Delhi roads. We get up to 75% MAP on an 80-20 train-test split using 5562 video frames from four different locations. As robust network connectivity is scarce in developing regions for continuous video transmission from the road to cloud servers, we also evaluate the latency, energy and hardware cost of embedded implementations of our CNN model based inferences.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is to automatically classify and count vehicles on roads without lane markings. Specifically, researchers are concerned with how to use computer vision technology to automatically classify and count multiple types of vehicles (including two-wheelers, three-wheelers, cars, buses, and trucks, etc.) in developing-country cities with complex traffic conditions and no clear lane markings (such as Delhi in India and its surrounding areas).

### Main Problems and Challenges

1. **Diversity**: There are a large number of different types of vehicles in the road traffic of developing countries, and these vehicles are very different in appearance from those in developed countries (for example, three-wheeled motorcycles, electric tricycles, rickshaws, etc.).
2. **Occlusion**: Due to the lack of lane discipline, large vehicles are often occluded by small vehicles, increasing the detection difficulty.
3. **Road Structure**: The road intersection designs in developing countries are irregular, different from the rectangular grid-like road structures in developed countries, resulting in different viewing angles and traffic flow patterns.
4. **Dataset Difference**: Most of the existing labeled datasets are from developed countries, and the vehicle types and traffic scenes in these datasets do not match the situations in developing countries. Direct application will lead to poor model performance.

### Solutions

To address the above challenges, the authors took the following measures:

- **Create a Local Dataset**: Collected and labeled video frames from Delhi and its surrounding areas, and constructed a dataset containing 5,562 image frames, with a total of 32,088 labeled boxes.
- **Train a CNN Model**: Used the YOLO (You Only Look Once) convolutional neural network model and fine-tuned it on the self-built dataset. The experimental results show that under an 80-20 training-test split, the model reached a maximum MAP (Mean Average Precision) value of 75%.
- **Embedded Platform Evaluation**: Considering the problem of unstable broadband network connections, the researchers also evaluated the inference performance of the model on three embedded platforms (Nvidia Jetson TX2, Raspberry PI Model 3B, and Intel Movidius Neural Compute Stick), and analyzed the trade-offs between latency, energy consumption, and hardware cost.

### Application Value

This research has a wide range of application prospects, mainly including:

- **Infrastructure Planning**: Calculate the number of different types of vehicles to evaluate road capacity and help plan new overpasses, underpasses, pedestrian overpasses, and other facilities.
- **Policy Evaluation**: Monitor the impact of specific policies (such as "odd-even license plate restrictions") on road traffic and provide data support to optimize policies.
- **Real-time Traffic Management**: Detect speeding or illegally-driving heavy vehicles and impose penalties in a timely manner.
- **Public Transport Monitoring**: Track the arrival time of buses and improve the predictability of public transport.

In summary, this paper aims to solve key problems in urban traffic management in developing countries through deep learning and computer vision technology, and provide technical support for future intelligent transportation systems.
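As a rough illustration of two steps mentioned in the summary, the sketch below shows an 80-20 split of frame identifiers and a per-class tally of detections in a single frame. It is not the authors' code: the function names `split_frames` and `count_by_class`, the `(label, confidence)` detection format, the random seed, and the 0.5 confidence threshold are all assumptions made for this example. The summary does not say how the split was actually drawn (for instance, whether frames from the same location were kept together), and counting vehicles across a video would additionally need some way to avoid counting the same vehicle in consecutive frames.

```python
import random
from collections import Counter


def split_frames(frame_ids, train_fraction=0.8, seed=0):
    """Shuffle frame ids reproducibly and split them into train and test lists.

    The seed and the purely random shuffle are assumptions for illustration.
    """
    ids = list(frame_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


def count_by_class(detections, min_confidence=0.5):
    """Count detections per class label in one frame.

    Each detection is a hypothetical (label, confidence) pair, as a detector
    such as YOLO might produce after post-processing. Boxes below the
    (assumed) confidence threshold are ignored.
    """
    return Counter(
        label for label, confidence in detections if confidence >= min_confidence
    )


if __name__ == "__main__":
    # 5562 frames, matching the dataset size quoted in the summary.
    train, test = split_frames(range(5562))
    print(len(train), len(test))

    # Made-up detections for a single frame.
    frame_detections = [("car", 0.9), ("bus", 0.4), ("car", 0.6), ("truck", 0.5)]
    print(count_by_class(frame_detections))
```

With 5,562 frames, an 80% cut rounds down to 4,449 training frames, leaving 1,113 for testing. The per-class counts from a function like `count_by_class` are the kind of figures the applications listed above (road capacity estimates, policy evaluation) would consume.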

Embedded CNN based vehicle classification and counting in non-laned road traffic
