Lightweight remote sensing road detection with an attention-augmented transformer

Feng Deng, Hongyan Tian, Xu Zhao, Duo Han
DOI: https://doi.org/10.1504/ijsnet.2024.142717
2024-11-27
International Journal of Sensor Networks
Abstract: Road extraction from remote sensing imagery is a critical task in computer vision. However, accurate road delineation is hindered by factors such as object occlusions and visually similar entities. This study proposes a lightweight road detection model with an attention-augmented transformer, combining an effective encoder-decoder with a semantic extractor to improve road extraction precision. The encoder optimises MobileNetV3 by improving its squeeze-and-excitation module and bottleneck structure. This modification improves the efficiency of global road feature extraction while reducing parameters and computational demands. Moreover, we present an attention-augmented semantic extractor built from enhanced transformer blocks that merge depth-wise separable convolutions with an improved multi-head attention mechanism and an efficient channel attention mechanism, thus strengthening the model's ability to capture long-range dependencies within road semantics. Empirical assessments on the Massachusetts and DeepGlobe road datasets demonstrate that our method outperforms alternative state-of-the-art solutions, attaining mean intersection over union scores of 80.41% and 79.14%, respectively.
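For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection over union can be computed for a road/background segmentation. It is illustrative only: the function name mean_iou, the flat label-list input, and the choice to average over both classes (background = 0, road = 1) are assumptions, since the abstract does not state the exact evaluation protocol used by the authors.

```python
def mean_iou(pred, target, num_classes=2):
    """Mean intersection over union for flat lists of integer class labels.

    Assumption (not from the paper): labels are 0 = background, 1 = road,
    and the score is the unweighted average of the per-class IoU values.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union == 0:
            # Class absent from both masks: skip it rather than divide by zero.
            continue
        ious.append(intersection / union)

    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


# Toy example: six pixels, flattened from a small mask.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = mean_iou(pred, target)
print(f"mean IoU = {score:.2f}")
```

In this toy example each class has two correctly labelled pixels out of a union of four, so both per-class scores, and therefore the mean, are 0.5. A real evaluation would typically accumulate intersections and unions over a whole test set before dividing.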
computer science, information systems, telecommunications