Abstract: Maintaining roads is crucial to economic growth and citizen well-being because roads are a vital means of transportation. In various countries, road surfaces are still inspected manually; to automate this, research interest is now focused on detecting road surface defects from visual data. However, previous research has focused on deep learning methods that tend to process the entire image, which leads to heavy computational cost. In this study, we focus on improving classification performance while keeping the computational cost of our solution low. Instead of processing the whole image, we introduce a segmentation model so that the downstream classification model focuses only on the road surface in the image. Furthermore, we employ contrastive learning during model training to improve road surface condition classification. Our experiments on the public RTK dataset demonstrate a significant improvement of our proposed method over previous works.

### What problem does this paper attempt to solve?

This paper addresses road surface condition classification. Specifically, the authors ask how to improve classification performance while keeping the computational cost low. Traditional deep-learning methods usually process the entire image, which is computationally expensive and brings in content irrelevant to the road (such as buildings and vehicles) that can reduce classification accuracy. To address these problems, the authors propose a method with two main steps:

1. **Road area extraction**: Extract only the road area from the original image with a segmentation model to reduce interference from irrelevant information.
2. **Contrastive learning**: Introduce contrastive learning during model training to improve the consistency of the semantic embedding features, thereby enhancing classification performance.

### Method overview

1. **Road area extraction**:
   - An encoder-decoder model performs binary segmentation, dividing the image into road and non-road areas.
   - The segmentation model is trained with the binary cross-entropy loss \( L_{\text{seg}} \):

     \[ L_{\text{seg}} = -\frac{1}{N} \sum_{i = 1}^{N} \left( y_i \log(P_i) + (1 - y_i) \log(1 - P_i) \right) \]

     where \( N \) is the number of pixels in the training batch, and \( y_i \) and \( P_i \) are the labeled and predicted pixel confidences, respectively.
2. **Classification model**:
   - The extracted road area is passed as input to the classification model, which assigns it to one of the categories \( C_1, C_2, \ldots, C_n \).
   - During training, contrastive learning is used to improve the classification task. For a pair of samples \( (x_i, x_j) \) with embedding features \( p_i \) and \( p_j \), respectively, the contrastive loss \( L_{\text{ct}} \) is defined as:

     \[ L_{\text{ct}}(x_i, x_j) = -\log \frac{\exp\left(\frac{\text{SIM}(p_i, p_j)}{\tau}\right)}{\sum_{k = 1}^{K} I(x_i, x_k) \cdot \exp\left(\frac{\text{SIM}(p_i, p_k)}{\tau}\right)} \]

     where \( I(x_i, x_k) \) is an indicator function:

     \[ I(x_i, x_k) = \begin{cases} 0 & \text{if } x_i \text{ and } x_k \text{ are in the same class} \\ 1 & \text{otherwise} \end{cases} \]

     The similarity function \( \text{SIM}(p_i, p_j) \) is the cosine similarity:

     \[ \text{SIM}(p_i, p_j) \approx \cos(p_i, p_j) = \frac{p_i^{T} p_j}{\|p_i\| \, \|p_j\|} \]
3. **Total loss function**:
   - The total loss \( L \) combines the classification cross-entropy loss \( L_{\text{ce}} \) and the contrastive loss \( L_{\text{ct}} \):

     \[ L = L_{\text{ce}} + \lambda L_{\text{ct}} \]
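
The three loss terms above can be sketched in plain Python to make the arithmetic concrete. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the default `tau` and `lam` values, the `eps` clamp, and the use of plain lists instead of tensors are all hypothetical choices made for the example. The contrastive term follows the formula as written in this summary, so its denominator sums only over samples whose class differs from that of \( x_i \).

```python
import math


def bce_segmentation_loss(labels, probs, eps=1e-12):
    """L_seg: mean binary cross-entropy over N pixels (flattened lists).

    labels: 0/1 ground-truth values y_i; probs: predicted confidences P_i.
    eps is an assumption added here to avoid log(0).
    """
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(labels)


def cosine_similarity(p, q):
    """SIM(p, q): dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


def contrastive_loss(i, j, embeddings, classes, tau=0.1):
    """L_ct(x_i, x_j) as written in the summary above.

    The indicator I(x_i, x_k) is 1 only when x_k is in a different class
    from x_i, so same-class samples are skipped in the denominator.
    tau=0.1 is an assumed default, not a value from the paper.
    """
    numerator = math.exp(cosine_similarity(embeddings[i], embeddings[j]) / tau)
    denominator = sum(
        math.exp(cosine_similarity(embeddings[i], embeddings[k]) / tau)
        for k in range(len(embeddings))
        if classes[k] != classes[i]
    )
    if denominator == 0.0:
        raise ValueError("no sample from a different class in the batch")
    return -math.log(numerator / denominator)


def total_loss(l_ce, l_ct, lam=1.0):
    """L = L_ce + lambda * L_ct; lam=1.0 is an assumed default."""
    return l_ce + lam * l_ct


if __name__ == "__main__":
    # Toy batch: two samples of class 0 and one of class 1 (made-up values).
    embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    classes = [0, 0, 1]
    l_seg = bce_segmentation_loss([1, 0], [0.5, 0.5])
    l_ct = contrastive_loss(0, 1, embeddings, classes, tau=0.5)
    print(l_seg, l_ct, total_loss(0.3, l_ct, lam=0.5))
```

Because same-class samples are excluded from the denominator in this form, the value of `contrastive_loss` can be negative when the positive pair is much more similar than every negative pair; the sketch keeps that behavior rather than changing the formula.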

Improving classification of road surface conditions via road area extraction and contrastive learning

Road Defect Detection from On-Board Cameras with Scarce and Cross-Domain Data

Multiple data sources and domain generalization learning method for road surface defect classification

An Iterative Semi-Supervised Approach with Pixel-wise Contrastive Loss for Road Extraction in Aerial Images

A Lightweight High-Resolution RS Image Road Extraction Method Combining Multi-Scale and Attention Mechanism

Multi-class Road Defect Detection and Segmentation using Spatial and Channel-wise Attention for Autonomous Road Repairing

Extracting roads from satellite images via enhancing road feature investigation in learning

Winter Road Surface Condition Recognition Using A Pretrained Deep Convolutional Network

Scribble-Based Weakly Supervised Deep Learning for Road Surface Extraction From Remote Sensing Images

Improving Road Segmentation in Challenging Domains Using Similar Place Priors

Application of the Semi-Supervised Learning Approach for Pavement Defect Detection

RoadScan: A Novel and Robust Transfer Learning Framework for Autonomous Pothole Detection in Roads

Image Enhancement Technology in Pavement Disease Detection System

Road Surface Defect Detection Algorithm Based on YOLOv8

Epurate-Net: Efficient Progressive Uncertainty Refinement Analysis for Traffic Environment Urban Road Detection

A Comprehensive Implementation of Road Surface Classification for Vehicle Driving Assistance: Dataset, Models, and Deployment

Research on a Road Defect Detection Method based on Improved YOLOv8

Automated Road Extraction from Satellite Imagery Integrating Dense Depthwise Dilated Separable Spatial Pyramid Pooling with DeepLabV3+

3D lidar point-cloud projection operator and transfer machine learning for effective road surface features detection and segmentation

Lightweight remote sensing road detection with an attention-augmented transformer

Automatic Road Extraction From Remote Sensing Imagery Using Ensemble Learning and Postprocessing