Abstract: YOLO has become a central real-time object detection system for robotics, driverless cars, and video monitoring applications. We present a comprehensive analysis of YOLO's evolution, examining the innovations and contributions in each iteration from the original YOLO up to YOLOv8, YOLO-NAS, and YOLO with Transformers. We start by describing the standard metrics and postprocessing; then, we discuss the major changes in network architecture and training tricks for each model. Finally, we summarize the essential lessons from YOLO's development and provide a perspective on its future, highlighting potential research directions to enhance real-time object detection systems.

What problem does this paper attempt to address?

This paper comprehensively reviews the development of the YOLO (You Only Look Once) architecture in computer vision, from the original YOLOv1 to YOLOv8, YOLO-NAS, and YOLO with Transformers. It focuses on four aspects:

1. **Requirements for real-time object detection**: With the rapid development of fields such as self-driving cars, robotics, video surveillance, and augmented reality, real-time object detection has become a key component. The YOLO series stands out for its balance between speed and accuracy, although performance differs across versions.
2. **Evolution of the YOLO architecture**: Since the release of YOLOv1, each subsequent version has been revised to overcome the limitations of its predecessors and improve performance. The paper analyzes in detail the main changes in each iteration, including innovations in network structure and training techniques.
3. **Evaluation metrics and post-processing methods**: To make the performance of the YOLO models easier to understand, the paper introduces commonly used evaluation metrics such as AP (Average Precision) and how it is calculated, and discusses post-processing techniques such as Non-Maximum Suppression (NMS).
4. **Future development directions**: Building on its analysis of existing YOLO architectures, the paper explores possible research directions for further improving real-time object detection systems.

In covering these points, the paper summarizes the development of the YOLO framework, gives readers guidance on choosing the YOLO model best suited to a specific application scenario, and points out potential research paths.
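The Non-Maximum Suppression step mentioned in item 3 can be illustrated with a minimal sketch. It assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates and uses a greedy, single-class variant with an illustrative threshold of 0.5; the function names `box_iou` and `nms` are hypothetical, and individual YOLO versions may use different variants or thresholds.

```python
def box_iou(a, b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) form (assumed layout)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first.

    A box is kept only if its IoU with every already-kept box
    is at or below iou_threshold (an illustrative default).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(box_iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Two heavily overlapping boxes and one separate box (made-up values).
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

In this made-up example the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap anything and is kept.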
### Formula presentation

- **Average Precision (AP)**:
  \[ AP=\sum_{r}\text{Precision}(r)\times\Delta\text{Recall}(r) \]
  where $\text{Precision}(r)$ is the precision at recall level $r$, and $\Delta\text{Recall}(r)$ is the change in recall from the previous level. No further division by the total number of relevant instances is needed, because recall is already normalized by that count.
- **Intersection over Union (IoU)**:
  \[ IoU = \frac{\text{Area of Overlap}}{\text{Area of Union}}=\frac{|A\cap B|}{|A\cup B|} \]
  where $A$ and $B$ are the predicted box and the ground-truth box, respectively.

These formulas measure the performance of object detection models. In particular, they quantify how changes to network structure and training methods across YOLO versions affect detection accuracy.
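As a rough illustration of the two formulas, the sketch below computes IoU for axis-aligned boxes and AP as the sum of precision times the change in recall. It assumes boxes in `(x1, y1, x2, y2)` form and detections already sorted by descending confidence and already labeled as true or false positives; the function names are hypothetical. It is the plain, uninterpolated sum, whereas benchmark suites such as PASCAL VOC and COCO use interpolated variants.

```python
def iou(box_a, box_b):
    """IoU = |A ∩ B| / |A ∪ B| for boxes given as (x1, y1, x2, y2) (assumed layout)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(is_true_positive, num_ground_truth):
    """AP = sum over ranks of Precision(r) * delta Recall(r), uninterpolated.

    is_true_positive: booleans for detections sorted by descending confidence.
    num_ground_truth: total number of ground-truth objects for the class.
    """
    if num_ground_truth == 0:
        return 0.0
    true_positives = 0
    prev_recall = 0.0
    ap = 0.0
    for rank, hit in enumerate(is_true_positive, start=1):
        if hit:
            true_positives += 1
        precision = true_positives / rank
        recall = true_positives / num_ground_truth
        ap += precision * (recall - prev_recall)  # zero when recall is unchanged
        prev_recall = recall
    return ap


if __name__ == "__main__":
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))            # overlap 1, union 7
    print(average_precision([True, False, True], 2))  # 1.0*0.5 + (2/3)*0.5
```

Only ranks where a true positive appears contribute to the sum, since recall does not change on a false positive.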

A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS

A Comprehensive Review of YOLO: From YOLOv1 to YOLOv8 and Beyond

Yolo Versions Architecture: Review

YOLOv10 to Its Genesis: A Decadal and Comprehensive Review of The You Only Look Once Series

YOLOv1 to YOLOv10: The fastest and most accurate real-time object detection systems

A review of the development of YOLO object detection algorithm

YOLO-based Object Detection Models: A Review and its Applications

YOLOv5, YOLOv8 and YOLOv10: The Go-To Detectors for Real-time Vision

YOLOv1 to v8: Unveiling Each Variant–A Comprehensive Review of YOLO

Overview of Research on Object Detection Based on YOLO

YOLOv8: A Novel Object Detection Algorithm with Enhanced Performance and Robustness

A Review of YOLO Object Detection Algorithms based on Deep Learning

What is YOLOv8: An In-Depth Exploration of the Internal Features of the Next-Generation Object Detector

YOLOv11: An Overview of the Key Architectural Enhancements

Object detection using YOLO: challenges, architectural successors, datasets and applications

Real Time Object Detection System with YOLO and CNN Models: A Review

Evaluating the Evolution of YOLO (You Only Look Once) Models: A Comprehensive Benchmark Study of YOLO11 and Its Predecessors

Traffic Sign Detection and Recognition Using YOLO Object Detection Algorithm: A Systematic Review

YOLOv10: Real-Time End-to-End Object Detection

Real-time object detection and segmentation technology: an analysis of the YOLO algorithm