A Real-Time and High-Accuracy Railway Obstacle Detection Method Using Lightweight CNN and Improved Transformer

Zongyang Zhao, Jiehu Kang, Zefeng Sun, Tao Ye, Bin Wu
DOI: https://doi.org/10.1016/j.measurement.2024.115380
IF: 5.6
2024-01-01
Measurement
Abstract: With the sustained development of railway transportation, the urgent need to improve train operation safety has made obstacle detection a research focus. However, existing railway obstacle detectors still struggle to balance detection accuracy and speed during the shunting process. In addition, they are not robust enough in real-world railway environments, especially in complex scenes involving small obstacles. To address these problems, this paper presents a real-time and high-accuracy railway obstacle detection model using a lightweight CNN and an improved transformer (RH-Net), which detects railway obstacles efficiently to help guarantee traffic safety. First, the Lightweight Feature Extraction Module (LEM) is designed to minimize the model’s computational load while maintaining its feature extraction ability. Then, the Improved Transformer Module (IFM) is developed to boost the model’s ability to stably extract global contextual information. Finally, the Enhanced Multi-Scale Feature Fusion Module (EFM) is proposed to optimize the detection of obstacles of different sizes, especially small objects. In experiments on a railway dataset, RH-Net achieves the best detection performance on both a GeForce GTX 1080Ti (96.99% mAP and 135 FPS) and a Jetson Xavier NX (97.02% mAP and 43 FPS), significantly outperforming existing detection models. These results show that RH-Net can detect obstacles accurately and efficiently in complicated railway environments. Moreover, experiments on MS COCO indicate that RH-Net achieves better detection performance than existing state-of-the-art methods. The proposed model can therefore be applied to more complex real-world scenes involving multiple object detection.