SmartRSD: An Intelligent Multimodal Approach to Real-Time Road Surface Detection for Safe Driving

Adnan Md Tayeb, Mst Ayesha Khatun, Mohtasin Golam, Md Facklasur Rahaman, Ali Aouto, Oroceo Paul Angelo, Minseon Lee, Dong-Seong Kim, Jae-Min Lee, Jung-Hyeon Kim
2024-06-14
Abstract: Precise and prompt identification of road surface conditions enables vehicles to adjust their actions, like changing speed or using specific traction control techniques, to lower the chance of accidents and potential danger to drivers and pedestrians. However, most existing methods for detecting road surfaces rely solely on visual data, which may be insufficient in certain situations, such as when the roads are covered by debris, in low light conditions, or in the presence of fog. Therefore, we introduce a multimodal approach for the automated detection of road surface conditions by integrating audio and images. The robustness of the proposed method is tested on a diverse dataset collected under various environmental conditions and road surface types. Through extensive evaluation, we demonstrate the effectiveness and reliability of our multimodal approach in accurately identifying road surface conditions in real-time scenarios. Our findings highlight the potential of integrating auditory and visual cues for enhancing road safety and minimizing accident risks.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper aims to address the issue of real-time detection of road conditions to improve driving safety. Traditional methods mostly rely on visual data to identify road conditions, but in certain situations (such as low light, road covered with debris, or fog), relying solely on visual information may not be sufficient to accurately judge road conditions. Therefore, the researchers proposed a multimodal approach that combines audio and image data to automatically detect road conditions.

Specifically, the research team developed a system called SmartRSD, which integrates audio and image data to improve the accuracy and reliability of road condition detection. They tested this method on a diverse dataset containing various environmental conditions and road types, and extensive evaluations demonstrated its effectiveness in accurately identifying road conditions in real-time scenarios.

The core contributions of the research include:

1. **Multimodal Fusion Technology**: Introduced three main multimodal fusion strategies (feature-level fusion, decision-level fusion, and model-level fusion) and detailed how these strategies are applied to the task of road condition detection.
2. **Data Collection and Preprocessing**: Described in detail the process of collecting image and audio data, as well as the steps for standardizing and transforming these data.
3. **Improved Model Architecture**: Proposed improved versions of the MobileNet and YAMNet models to enhance the accuracy of image and audio classification.
4. **Weighted Fusion Algorithm**: Designed a weighted fusion algorithm that assigns different weights based on the varying accuracy of image and audio prediction results to optimize overall performance.

Experimental results show that the multimodal classifier combining the improved MobileNet and improved YAMNet achieved the highest accuracy (94.91%), significantly outperforming other combinations. This indicates that the proposed multimodal approach can effectively detect road conditions in various challenging environments, thereby contributing to improved driving safety.
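The weighted fusion idea in the summary can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a decision-level scheme in which each modality's class probabilities are weighted in proportion to that modality's validation accuracy. The class labels, function names, and accuracy values below are all hypothetical and chosen only for illustration.

```python
from typing import Sequence

# Hypothetical class labels; the summary does not list the paper's actual classes.
CLASSES = ["dry", "wet", "snowy"]


def accuracy_weights(image_acc: float, audio_acc: float) -> tuple[float, float]:
    """Normalize two accuracies into weights that sum to 1 (an assumed scheme)."""
    total = image_acc + audio_acc
    if total <= 0:
        raise ValueError("at least one accuracy must be positive")
    return image_acc / total, audio_acc / total


def weighted_fusion(
    image_probs: Sequence[float],
    audio_probs: Sequence[float],
    image_acc: float,
    audio_acc: float,
) -> list[float]:
    """Combine per-class probabilities from both modalities with accuracy-based weights."""
    if len(image_probs) != len(audio_probs):
        raise ValueError("both models must score the same set of classes")
    w_img, w_aud = accuracy_weights(image_acc, audio_acc)
    return [w_img * p + w_aud * q for p, q in zip(image_probs, audio_probs)]


def predict(fused: Sequence[float]) -> int:
    """Return the index of the highest fused score."""
    return max(range(len(fused)), key=lambda i: fused[i])


# Illustrative inputs only: the image model slightly favors "dry",
# while the audio model favors "wet".
image_probs = [0.5, 0.4, 0.1]
audio_probs = [0.2, 0.7, 0.1]
fused = weighted_fusion(image_probs, audio_probs, image_acc=0.9, audio_acc=0.6)
label = CLASSES[predict(fused)]
```

In this sketch the more accurate modality contributes more to the final score, but a confident prediction from the weaker modality can still change the outcome, which is the behavior one would want when, for example, the camera view is degraded by fog or low light.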