Abstract: In recent years, we have witnessed a paradigm shift in the field of Computer Vision with the advent of the transformer architecture. Detection Transformers (DETR) have become a state-of-the-art solution to object detection and are a potential candidate for Road Object Detection in Autonomous Vehicles. Despite the abundance of object detection schemes, real-time DETR models are shown to perform significantly better on inference times, with minimal loss of accuracy and performance. In our work, we used Real-Time DETR (RTDETR) object detection on the BadODD Road Object Detection dataset based in Bangladesh, and performed necessary experimentation and testing. Our results gave a mAP50 score of 0.41518 on the public 60% test set and 0.28194 on the private 40% test set.
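The mAP50 figures quoted in the abstract count a predicted box as correct when its intersection-over-union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that idea for a single class on a single image, assuming boxes in `(x1, y1, x2, y2)` pixel format. The function names `iou` and `average_precision_50` are illustrative only, and the exact matching and interpolation rules used to produce the scores above are not stated in the source, so they may differ from this sketch.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision_50(preds, gts, thr=0.5):
    """AP at IoU >= thr for one class.

    preds: list of (confidence, box); gts: list of ground-truth boxes.
    Assumption: each ground-truth box can be matched at most once, and
    predictions are matched greedily in order of decreasing confidence.
    """
    if not gts:
        return 0.0
    preds = sorted(preds, key=lambda p: p[0], reverse=True)
    matched, tp_flags = set(), []
    for _, box in preds:
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            v = iou(box, gt)
            if v > best_iou:
                best_iou, best_j = v, j
        if best_iou >= thr:
            matched.add(best_j)
            tp_flags.append(True)
        else:
            tp_flags.append(False)

    # Running precision and recall after each prediction.
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / k)
        recalls.append(tp / len(gts))

    # Interpolate: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap
```

mAP50 is then the mean of this per-class AP over all object classes, computed over the whole test set rather than one image.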

What problem does this paper attempt to address?

The problem this paper attempts to solve is achieving real-time object detection on Bangladeshi roads to support the safe operation of autonomous vehicles. Specifically, the authors used a real-time detection model based on the Transformer architecture (Real-Time DETR, RTDETR) and conducted experiments and tests on the Bangladeshi Autonomous Driving Object Detection Dataset (BadODD), aiming to improve the speed and accuracy of object detection.

### Problem Background

1. **Importance of Computer Vision**
   - Computer vision (CV) plays a crucial role in modern technology, especially in the field of autonomous vehicles. Accurate and fast object detection is vital for the safety and decision-making ability of autonomous vehicles.
2. **Limitations of Existing Methods**
   - Traditional object detection methods such as Region-based Convolutional Neural Networks (R-CNN) and You-Only-Look-Once (YOLO) have advantages in speed, but their accuracy, particularly in complex scenarios, is somewhat lacking.
   - DETR (Detection Transformer) has superior performance, but its inference time is long, so it is not suitable for real-time applications.
3. **Advantages of RTDETR**
   - RTDETR combines the speed advantage of YOLO with the expressive power of the Transformer. It can significantly reduce inference time while maintaining high accuracy, which makes it suitable for real-time object detection tasks.

### Research Objectives

- **Improve Real-Time Performance**: By optimizing the model structure and parameter configuration, ensure that the model can achieve real-time detection in practical applications.
- **Improve Accuracy**: Conduct training and testing on a specific dataset (BadODD) to ensure the detection accuracy of the model in complex road environments.
- **Meet Challenges**: Deal with various challenges in the dataset, such as class imbalance and image quality problems (halos, night-time images, windshield stains, etc.), to enhance the robustness of the model.

### Main Contributions

- **Model Selection and Optimization**: Select the RTDETR model and optimize the inference speed by adjusting the number of decoder layers and other hyper-parameters.
- **Data Pre-processing**: Conduct detailed pre-processing on the dataset, including image size adjustment, label correction, and the application of multiple data augmentation methods.
- **Experimental Verification**: Verify the performance of the model on the public and private test sets through a large number of experiments, obtaining mAP50 scores of 0.41518 and 0.28194 respectively.

### Conclusions and Future Work

Although RTDETR performs well in terms of real-time performance and accuracy, some challenges remain, such as the detection of small and occluded objects and performance under extreme conditions. Future work can focus on further optimizing the model structure, improving data pre-processing methods, and exploring solutions suited to different scenarios. Through these efforts, this research provides new ideas and technical support for autonomous vehicles to achieve safer and more efficient object detection in complex road environments.
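The data pre-processing contribution mentions image size adjustment. When an image is resized, any bounding-box labels stored in pixel coordinates have to be scaled by the same factors, or they no longer line up with the objects. The sketch below illustrates this under the assumption that labels are `(x1, y1, x2, y2)` pixel boxes; the helper name `resize_boxes` and the example sizes are hypothetical and are not taken from the paper. Labels stored in normalized form (coordinates divided by image width and height) would not need this step.

```python
def resize_boxes(boxes, old_size, new_size):
    """Rescale pixel-space boxes to match a resized image.

    boxes:    list of (x1, y1, x2, y2) in pixels of the original image
    old_size: (width, height) of the original image
    new_size: (width, height) of the resized image
    """
    old_w, old_h = old_size
    new_w, new_h = new_size
    sx = new_w / old_w  # horizontal scale factor
    sy = new_h / old_h  # vertical scale factor
    return [(x1 * sx, y1 * sy, x2 * sx, y2 * sy) for x1, y1, x2, y2 in boxes]


# Example: halving a 1280x720 frame halves every box coordinate.
scaled = resize_boxes([(100, 50, 200, 150)], (1280, 720), (640, 360))
```

If the resize does not preserve the aspect ratio, `sx` and `sy` differ and boxes are stretched along with the image, which is why the two axes are scaled separately.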

A Real-Time DETR Approach to Bangladesh Road Object Detection for Autonomous Vehicles
