Abstract: Line segment detection is a fundamental low-level task in computer vision, and improvements in this task can impact more advanced methods that depend on it. Most new methods developed for line segment detection are based on Convolutional Neural Networks (CNNs). Our paper seeks to address challenges that prevent the wider adoption of transformer-based methods for line segment detection. More specifically, we introduce a new model called Deformable Transformer-based Line Segment Detection (DT-LSD) that supports cross-scale interactions, can be trained quickly, and addresses the drawbacks of its transformer-based predecessor, LETR. For faster training, we introduce Line Contrastive DeNoising (LCDN), a technique that stabilizes the one-to-one matching process and speeds up training by 34$\times$. We show that DT-LSD is faster and more accurate than LETR and outperforms all CNN-based models in terms of accuracy. On the Wireframe dataset, DT-LSD achieves 71.7 for $sAP^{10}$ and 73.9 for $sAP^{15}$; on the YorkUrban dataset, it achieves 33.2 for $sAP^{10}$ and 35.1 for $sAP^{15}$.
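
The abstract reports $sAP^{10}$ and $sAP^{15}$ without defining them. As a rough illustration, the sketch below shows how a structural-AP style true-positive test is commonly computed on these benchmarks: a prediction counts as correct when the sum of squared distances between its endpoints and those of its nearest ground-truth segment is below the threshold (10 or 15), and each ground-truth segment can be claimed only once. The function names are hypothetical, and the sketch assumes coordinates have already been rescaled to the evaluation resolution (commonly 128×128); it is not the paper's evaluation code.

```python
from typing import List, Tuple

Segment = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def endpoint_distance(a: Segment, b: Segment) -> float:
    """Sum of squared endpoint distances, minimised over both endpoint orderings."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    same = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2 + (ax2 - bx2) ** 2 + (ay2 - by2) ** 2
    swapped = (ax1 - bx2) ** 2 + (ay1 - by2) ** 2 + (ax2 - bx1) ** 2 + (ay2 - by1) ** 2
    return min(same, swapped)


def label_predictions(preds: List[Segment], gts: List[Segment], threshold: float) -> List[bool]:
    """Mark each prediction as a true positive (True) or false positive (False).

    `preds` is assumed to be sorted by descending confidence.
    """
    used = [False] * len(gts)
    labels = []
    for pred in preds:
        if not gts:
            labels.append(False)
            continue
        dists = [endpoint_distance(pred, gt) for gt in gts]
        nearest = min(range(len(gts)), key=dists.__getitem__)
        # A hit requires a close enough ground truth that no earlier prediction claimed.
        if dists[nearest] < threshold and not used[nearest]:
            used[nearest] = True
            labels.append(True)
        else:
            labels.append(False)
    return labels


gts = [(0, 0, 10, 0), (0, 5, 10, 5)]
preds = [(0, 1, 10, 1), (10, 0, 0, 0), (0, 5, 10, 8.5)]
labels_10 = label_predictions(preds, gts, threshold=10)
labels_15 = label_predictions(preds, gts, threshold=15)
```

In this toy example the second prediction is a duplicate of an already matched segment and is rejected at both thresholds, while the third is accepted only at the looser threshold of 15. Average precision is then obtained from the precision-recall curve built from these labels.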

What problem does this paper attempt to address?

This paper addresses several challenges in existing Transformer-based methods for line segment detection that hinder their wider adoption. Specifically, the paper points out:

1. **Limitations of Existing Methods**:
   - Most new line segment detection methods are based on Convolutional Neural Networks (CNNs). However, CNN-based detectors require additional post-processing steps to produce the final predictions.
   - Transformer-based methods (such as LETR) can capture long-range dependencies between pixels, but they fall short in training speed and performance. In particular, LETR only enhances single-scale feature maps and lacks cross-scale interaction, which results in slow convergence and high computational complexity.
2. **Proposed New Method**:
   - The paper introduces a new model, Deformable Transformer-based Line Segment Detection (DT-LSD), which supports cross-scale interaction and can be trained quickly.
   - To accelerate training, the paper proposes the Line Contrastive DeNoising (LCDN) technique, which stabilizes the one-to-one matching process and speeds up training by 34 times.
3. **Main Contributions**:
   - A new end-to-end Transformer framework is proposed that outperforms CNN-based line segment detectors in accuracy. This is achieved by using the deformable attention mechanism.
   - An efficient training technique, Line Contrastive DeNoising (LCDN), is introduced. It reduces the required number of training epochs, enabling DT-LSD to converge within a similar number of epochs as CNN-based models.
   - Experimental results on two datasets (Wireframe and YorkUrban) show that DT-LSD outperforms existing state-of-the-art methods in both structural and heatmap metrics.
   - The work gives line segment detectors an opportunity to remove hand-designed post-processing by leveraging end-to-end Transformer models.

In conclusion, this paper aims to improve the accuracy and training efficiency of line segment detection by improving Transformer-based methods, thereby promoting further development in this field.
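
The "one-to-one matching process" mentioned above refers to the set-prediction style of training used by DETR-like detectors, where each ground-truth line is assigned to exactly one predicted line and the remaining predictions are treated as "no object". The sketch below illustrates the idea on a tiny example. It is a simplification under stated assumptions: the cost is only the L1 distance between endpoint coordinates (real models also include a classification term), the optimal assignment is found by brute force rather than the Hungarian algorithm, and the function names are hypothetical rather than taken from the paper.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def l1_cost(pred: Sequence[float], gt: Sequence[float]) -> float:
    """L1 distance between two segments given as (x1, y1, x2, y2)."""
    return sum(abs(p - g) for p, g in zip(pred, gt))


def match_one_to_one(
    preds: List[Sequence[float]], gts: List[Sequence[float]]
) -> List[Tuple[int, int]]:
    """Return (pred_index, gt_index) pairs minimising the total matching cost.

    Brute force over assignments; only suitable for tiny inputs and
    requires len(preds) >= len(gts).
    """
    best_cost, best_pairs = float("inf"), []
    # perm[j] is the prediction index assigned to ground truth j.
    for perm in permutations(range(len(preds)), len(gts)):
        cost = sum(l1_cost(preds[i], gts[j]) for j, i in enumerate(perm))
        if cost < best_cost:
            best_cost = cost
            best_pairs = [(i, j) for j, i in enumerate(perm)]
    return best_pairs


preds = [(0, 0, 10, 0), (0, 5, 10, 5), (50, 50, 60, 60)]
gts = [(0, 5, 10, 6), (0, 1, 10, 0)]
pairs = sorted(match_one_to_one(preds, gts))
unmatched = sorted(set(range(len(preds))) - {i for i, _ in pairs})
```

Here prediction 0 is paired with ground truth 1, prediction 1 with ground truth 0, and prediction 2 is left unmatched. Because small changes in the predictions can flip such assignments between training iterations, techniques that stabilize the matching, as LCDN is described as doing, can shorten training.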

DT-LSD: Deformable Transformer-based Line Segment Detection

A Transformer-Based Object Detector with Coarse-Fine Crossing Representations

TP-LSD: Tri-Points Based Line Segment Detector

ELSD: Efficient Line Segment Detector and Descriptor

LACTNet: A Lightweight Real-Time Semantic Segmentation Network Based on an Aggregated Convolutional Neural Network and Transformer

SSD-MonoDETR: Supervised Scale-aware Deformable Transformer for Monocular 3D Object Detection

Deformable DETR: Deformable Transformers for End-to-End Object Detection

A Robust and Fast Line Segment Detector Based on Top-Down Smaller Eigenvalue Analysis

A Simplified Pipeline for Line Segment Detection

Sem-LSD: A Learning-based Semantic Line Segment Detector

Automatic Detection of Coseismic Landslides Using a New Transformer Method

CosineTR: A Dual-Branch Transformer-Based Network for Semantic Line Detection

MResTNet: A Multi-Resolution Transformer Framework with CNN Extensions for Semantic Segmentation

RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer

MCRformer: Morphological constraint reticular transformer for 3D medical image segmentation

L-DETR: A Light-Weight Detector for End-to-End Object Detection With Transformers

D-CONFORMER: Deformable Sparse Transformer Augmented Convolution for Voxel-Based 3D Object Detection

Li3DeTr: A LiDAR based 3D Detection Transformer

DVST: Deformable Voxel Set Transformer for 3D Object Detection from Point Clouds

D-former: a U-shaped Dilated Transformer for 3D medical image segmentation