Abstract: Deep learning (DL)-based visual recognition algorithms are widely investigated as a way to improve the accuracy, efficiency, and objectivity of the bridge inspection process, which is largely manual today. These algorithms typically require a large amount of training data consisting of images and corresponding annotations. Preparing such data sets manually is time-consuming, while more automated data generation approaches that rely on synthetic environments suffer from domain gaps, which result in poor performance on real-world tasks. This study investigates an unsupervised domain adaptation (UDA) approach for visual recognition in bridge inspection scenes, with the aim of reducing and eventually eliminating the need for time-consuming and inaccurate manual image annotations. A state-of-the-art UDA framework, DAFormer, is applied to synthetic source domain data with full annotations and real-world target domain data with no or partial annotations. The synthetic data set in this study is designed to correlate with real-world data by incorporating the relevant design standards and practices into the modeling step. Compared with the source-only supervised learning approach, which performed poorly on real-world data, the UDA approach improved performance to a level close to that of supervised learning on real-world data with manual annotations (the difference in Intersection over Union (IoU) was only 1.03%). Furthermore, the UDA approach outperformed supervised learning on target domain data when a small amount of annotated target domain data was mixed with the synthetic source domain data to guide the network's learning of patterns that exist only in the real-world environment (the IoU improvement was 5.03%). The UDA approach presented in this study facilitates the application of DL-based visual recognition algorithms to bridge inspection tasks with limited manual effort.

Transformer-Based Domain Adaptation for Event Data Classification

Learning cross-domain representations by vision transformer for unsupervised domain adaptation

Joint Feature-Level And Pixel-Level Domain Adaption For Object Detection In The Wild

Unsupervised Domain Adaptation for Remote Sensing Semantic Segmentation with Transformer

Towards Unsupervised Domain Adaptation via Domain-Transformer

Relating Events and Frames Based on Self-Supervised Learning and Uncorrelated Conditioning for Unsupervised Domain Adaptation

Domain Adaptation Transformer for Unsupervised Driving-Scene Segmentation in Adverse Conditions

Exploiting Temporal Coherence for Self-Supervised Visual Tracking by Using Vision Transformer

TransConv: Transformer Meets Contextual Convolution for Unsupervised Domain Adaptation

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

TransAdapter: Vision Transformer for Feature-Centric Unsupervised Domain Adaptation

Bridging the Gap between Events and Frames through Unsupervised Domain Adaptation

Safe Self-Refinement for Transformer-based Domain Adaptation

Feature Fusion Transferability Aware Transformer for Unsupervised Domain Adaptation

Vision Transformer-based Adversarial Domain Adaptation

EUDA: An Efficient Unsupervised Domain Adaptation via Self-Supervised Vision Transformer

Event-based Monocular Dense Depth Estimation with Recurrent Transformers

DA-DETR: Domain Adaptive Detection Transformer with Information Fusion

Universal Domain Adaptation via Compressive Attention Matching

Domain Adaptation via Bidirectional Cross-Attention Transformer

Unsupervised Domain Adaptation Approach for Vision-Based Semantic Understanding of Bridge Inspection Scenes Without Manual Annotations