Improved YOLOv5 Based on Attention Mechanism and FasterNet for Foreign Object Detection on Railway and Airway tracks

Zongqing Qi, Danqing Ma, Jingyu Xu, Ao Xiang, Hedi Qu
DOI: https://doi.org/10.48550/arXiv.2403.08499
2024-03-13
Computer Vision and Pattern Recognition
Abstract: In recent years, foreign objects have frequently intruded onto railway tracks and airport runways. These objects can include pedestrians, vehicles, animals, and debris. This paper introduces an improved YOLOv5 architecture incorporating FasterNet and attention mechanisms to enhance the detection of foreign objects on railways and airport runways. This study proposes a new dataset, AARFOD (Aero and Rail Foreign Object Detection), which combines two public datasets for detecting foreign objects in aviation and railway systems. The dataset aims to improve the recognition capabilities of foreign object targets. Experimental results on this large dataset demonstrate that the proposed model improves on the baseline YOLOv5 model while reducing computational requirements. The improved YOLO model raises precision by 1.2%, recall by 1.0%, and mAP@.5 by 0.6%, while mAP@.5-.95 remains unchanged. The number of parameters is reduced by approximately 25.12%, and GFLOPs by about 10.63%. The ablation experiments show that the FasterNet module significantly reduces the number of model parameters, and that introducing the attention mechanism mitigates the performance loss caused by making the model lightweight.
What problem does this paper attempt to address?
The paper addresses the detection of foreign objects intruding onto railways and airport runways. Specifically, it proposes an improved version of the YOLOv5 architecture that combines the FasterNet module with an attention mechanism (NAM) to strengthen foreign object detection in complex environments. The main contributions of the paper are:

1. **An improved YOLOv5 architecture**: Introducing the FasterNet module and NAM improves the model's precision, recall, and mean Average Precision at an IoU threshold of 0.5 (mAP@.5), while reducing the number of parameters and the computational load.
2. **A new dataset, AARFOD**: This dataset integrates two public datasets, RailFOD23 and FOD-A, and contains 48,409 high-resolution images covering 35 categories, with annotations for 74,334 foreign objects. The images span various weather conditions and perspectives, which helps the model generalize.

Experimental results show that, compared with the baseline YOLOv5, the improved model achieves higher precision, recall, and mAP@.5, with mAP@.5-.95 unchanged, while requiring fewer computational resources. This suggests that the improved YOLOv5 offers a better trade-off between accuracy and efficiency in practical applications.
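To give a feel for why a FasterNet-style module can reduce parameters, here is a minimal sketch of the weight and multiply-accumulate counts for a regular convolution versus a partial convolution (PConv), the building block FasterNet is known for. The text above does not describe the module's configuration, so the function names, the channel width of 64, the 3x3 kernel, the 40x40 feature map, and the 1/4 partial ratio are illustrative assumptions, not values taken from the paper.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Number of weights in a k x k convolution without bias."""
    return k * k * c_in * c_out


def conv_macs(c_in: int, c_out: int, k: int, h: int, w: int) -> int:
    """Multiply-accumulates for a stride-1 convolution producing an h x w map."""
    return h * w * conv_params(c_in, c_out, k)


def pconv_params(channels: int, k: int, ratio: int = 4) -> int:
    """Weights in a partial convolution.

    Only channels // ratio channels are convolved; the remaining channels
    are passed through unchanged. ratio=4 is an assumed, illustrative value.
    """
    c_p = channels // ratio
    return conv_params(c_p, c_p, k)


def pconv_macs(channels: int, k: int, h: int, w: int, ratio: int = 4) -> int:
    """Multiply-accumulates for a partial convolution on an h x w map."""
    return h * w * pconv_params(channels, k, ratio)


if __name__ == "__main__":
    channels, k, h, w = 64, 3, 40, 40  # hypothetical layer shape
    regular = conv_params(channels, channels, k)
    partial = pconv_params(channels, k)
    print(f"regular conv weights: {regular}")
    print(f"partial conv weights: {partial}")
    print(f"weight ratio: {regular // partial}x")
    print(f"MAC ratio: {conv_macs(channels, channels, k, h, w) // pconv_macs(channels, k, h, w)}x")
```

With a partial ratio of 1/4, a single PConv layer uses 1/16 of the weights of the regular convolution it replaces. The whole-model reduction reported in the abstract (about 25% of parameters) is much smaller than that per-layer figure, presumably because only part of the network is replaced and the surrounding layers keep their full cost.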