Voice-Assisted Real-Time Traffic Sign Recognition System Using Convolutional Neural Network
Mayura Manawadu, Udaya Wijenayake
2024-04-11
Abstract: Traffic signs are important in communicating information to drivers. Comprehension of traffic signs is therefore essential for road safety, and ignoring them may result in road accidents. Traffic sign detection has been a research focus over the past few decades. Real-time and accurate detection is a prerequisite for a robust traffic sign detection system, which has yet to be achieved. This study presents a voice-assisted real-time traffic sign recognition system that is capable of assisting drivers. The system consists of two subsystems. First, the detection and recognition of traffic signs are carried out using a trained Convolutional Neural Network (CNN). Once a specific traffic sign is recognized, it is narrated to the driver as a voice message using a text-to-speech engine. An efficient CNN model for a benchmark dataset is developed for real-time detection and recognition using deep learning techniques. The advantage of this system is that even if the driver misses a traffic sign, does not look at it, or is unable to comprehend it, the system detects the sign and narrates it to the driver. A system of this type is also important in the development of autonomous vehicles.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper proposes a solution to the problem of real-time traffic sign recognition. Ignoring or misinterpreting traffic signs may lead to accidents, yet existing traffic sign detection systems still fall short in real-time performance and accuracy. To address this, the researchers designed a real-time traffic sign recognition system with voice assistance. The system uses a Convolutional Neural Network (CNN) for detection and recognition, and then announces each recognized traffic sign to the driver as a voice message through a text-to-speech engine. Even if the driver misses, does not notice, or cannot understand a traffic sign, the system can detect and announce it. The system is also relevant to the development of autonomous vehicles.
In the paper, the researchers adopted the YOLO (You Only Look Once) CNN architecture and optimized it for real-time detection and recognition. The experimental results show a trade-off among YOLO versions: some detect faster but are less accurate. The researchers ultimately selected the YOLOv4-tiny model, which achieves higher recognition accuracy while maintaining fast detection speed. The system also incorporates an audio feedback component: when a traffic sign is detected, the information is immediately converted to speech and played to the driver.
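The hand-off from detector to voice output described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the class name `SignAnnouncer`, the `MESSAGES` table, the confidence threshold, and the cooldown that keeps the same sign from being repeated on every frame are all hypothetical, and `speak` stands in for whatever text-to-speech call the system uses.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical mapping from detector class labels to spoken messages.
MESSAGES: Dict[str, str] = {
    "stop": "Stop sign ahead",
    "speed_limit_50": "Speed limit fifty",
}


@dataclass
class SignAnnouncer:
    """Turns per-frame detections into voice messages (illustrative only)."""

    speak: Callable[[str], None]   # stand-in for a text-to-speech call
    min_confidence: float = 0.5    # assumed threshold, not from the paper
    cooldown_s: float = 5.0        # assumed: suppress repeats within this window
    _last_spoken: Dict[str, float] = field(default_factory=dict)

    def process_frame(
        self, detections: Iterable[Tuple[str, float]], now: float
    ) -> List[str]:
        """Announce each confident, not-recently-announced sign in one frame.

        `detections` holds (class_label, confidence) pairs from the detector;
        `now` is the frame timestamp in seconds.
        """
        spoken: List[str] = []
        for label, confidence in detections:
            if confidence < self.min_confidence:
                continue  # ignore weak detections
            last = self._last_spoken.get(label)
            if last is not None and now - last < self.cooldown_s:
                continue  # this sign was announced recently
            # Fall back to the raw label if no message is defined for it.
            message = MESSAGES.get(label, label.replace("_", " "))
            self.speak(message)
            self._last_spoken[label] = now
            spoken.append(message)
        return spoken
```

Passing `speak=print` is enough to try the logic without audio hardware. Some form of repeat suppression is likely needed in practice, because a detector running at tens of frames per second sees the same sign in many consecutive frames.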
To train the model, the researchers used the German Traffic Sign Detection Benchmark (GTSDB) and the Mapillary Traffic Sign Dataset. After adjustment and optimization, the final model detects traffic signs well across different environments and lighting conditions, with an average frame rate of 55 FPS and an average accuracy of 64.71%. Planned future work includes using more powerful GPU devices to extend model training and improving the YOLO architecture to further increase accuracy.
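The summary does not say how the average frame rate was measured. One common convention is to divide the number of processed frames by the total processing time, as in the sketch below. The function name `average_fps` and its input format are assumptions for illustration.

```python
from typing import Sequence


def average_fps(frame_times_s: Sequence[float]) -> float:
    """Average frames per second from per-frame processing times in seconds.

    Computed as frame count divided by total time (an assumed convention).
    """
    total = sum(frame_times_s)
    if not frame_times_s or total <= 0:
        raise ValueError("need at least one frame with positive total time")
    return len(frame_times_s) / total
```

This differs from averaging the per-frame rates. For frame times of 0.01 s and 0.03 s, frames over total time gives 50 FPS, while the mean of the two instantaneous rates (100 and about 33.3 FPS) is about 66.7 FPS, so it matters which convention a reported figure uses.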