Abstract: Object detection is one of the predominant and most challenging problems in computer vision. Over the past decade, with the rapid evolution of deep learning, researchers have experimented extensively and contributed to improving the performance of object detection and related tasks such as object classification, localization, and segmentation using deep models. Broadly, object detectors fall into two categories: two stage and single stage detectors. Two stage detectors rely on a selective region proposal strategy and a complex architecture, whereas single stage detectors consider all spatial regions for possible detections in one shot, using a relatively simpler architecture. The performance of an object detector is evaluated through its detection accuracy and inference time. Generally, two stage detectors achieve higher detection accuracy than single stage detectors, while single stage detectors offer faster inference. Moreover, with the advent of YOLO (You Only Look Once) and its architectural successors, the detection accuracy of single stage detectors has improved significantly and is sometimes better than that of two stage detectors. YOLOs are adopted in various applications mainly for their faster inference rather than their detection accuracy. As an example, the detection accuracies of YOLO and Fast-RCNN are 63.4 and 70 respectively, yet YOLO's inference is around 300 times faster. In this paper, we present a comprehensive review of single stage object detectors, especially YOLOs, covering their regression formulation, architectural advancements, and performance statistics. We also summarize comparisons between two stage and single stage object detectors and among different versions of YOLO, applications based on two stage detectors and on different YOLO versions, and future research directions.
