Lung-DETR: Deformable Detection Transformer for Sparse Lung Nodule Anomaly Detection

Hooman Ramezani, Dionne Aleman, Daniel Létourneau
DOI: https://doi.org/10.48550/arXiv.2409.05200
2024-09-09
Abstract: Accurate lung nodule detection for computed tomography (CT) scan imagery is challenging in real-world settings due to the sparse occurrence of nodules and similarity to other anatomical structures. In a typical positive case, nodules may appear in as few as 3% of CT slices, complicating detection. To address this, we reframe the problem as an anomaly detection task, targeting rare nodule occurrences in a predominantly normal dataset. We introduce a novel solution leveraging custom data preprocessing and Deformable Detection Transformer (Deformable-DETR). A 7.5mm Maximum Intensity Projection (MIP) is utilized to combine adjacent lung slices into single images, reducing the slice count and decreasing nodule sparsity. This enhances spatial context, allowing for better differentiation between nodules and other structures such as complex vascular structures and bronchioles. Deformable-DETR is employed to detect nodules, with a custom focal loss function to better handle the imbalanced dataset. Our model achieves state-of-the-art performance on the LUNA16 dataset with an F1 score of 94.2% (95.2% recall, 93.3% precision) on a dataset sparsely populated with lung nodules that is reflective of real-world clinical data.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The paper aims to improve the accuracy of lung nodule detection in real-world settings, where nodules are sparse. It targets three challenges:

1. **Sparsity of lung nodules**: In clinical data, nodules are rare. In a typical positive case they may appear in as few as 3% of CT slices, so the model must process a large number of normal slices and still pick out the few that contain nodules.
2. **Similarity to other anatomical structures**: Nodules closely resemble other structures such as bronchioles and blood vessels, which makes them hard to tell apart.
3. **Class imbalance**: Because normal slices far outnumber slices containing nodules, the dataset is heavily imbalanced. A model trained on it tends to favor the majority class (normal tissue) and to miss the minority class (nodules).

To address these challenges, the paper proposes Lung-DETR, which combines the following techniques:

- **Deformable Detection Transformer (Deformable-DETR)**: The deformable attention mechanism lets the model focus dynamically on the most relevant regions of the image, which improves detection of small nodules.
- **Focal loss**: A custom focal loss function shifts the training signal toward hard-to-classify samples (such as nodules), which mitigates the class imbalance.
- **Maximum Intensity Projection (MIP)**: A 7.5mm MIP combines adjacent CT slices into a single 2D image. This reduces the slice count, decreases nodule sparsity, and adds spatial context that helps distinguish nodules from bronchioles and complex vascular structures.
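To make the MIP preprocessing and the focal loss idea concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The function names `mip_slabs` and `binary_focal_loss`, the 1.25mm slice spacing in the usage example, the choice of non-overlapping slabs, and the `alpha`/`gamma` defaults are all assumptions made for illustration. The loss shown is the standard binary focal loss, not the paper's custom variant, whose details are not given here.

```python
import numpy as np


def mip_slabs(volume, slice_spacing_mm, slab_mm=7.5):
    """Collapse a CT volume (slices, height, width) into MIP slabs.

    Each output image is the voxel-wise maximum over a group of adjacent
    slices covering roughly `slab_mm` millimetres. Non-overlapping slabs
    are an assumption of this sketch; trailing slices that do not fill a
    whole slab are dropped.
    """
    per_slab = max(1, int(round(slab_mm / slice_spacing_mm)))
    n_slabs = volume.shape[0] // per_slab
    trimmed = volume[: n_slabs * per_slab]
    grouped = trimmed.reshape(n_slabs, per_slab, *volume.shape[1:])
    return grouped.max(axis=1)


def binary_focal_loss(probs, targets, alpha=0.25, gamma=2.0, eps=1e-7):
    """Standard binary focal loss, returned per sample.

    FL = -alpha_t * (1 - p_t)**gamma * log(p_t), where p_t is the
    probability assigned to the true class. Well-classified samples
    (p_t near 1) are down-weighted by the (1 - p_t)**gamma factor.
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    targets = np.asarray(targets)
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)


# Usage: 12 slices at an assumed 1.25mm spacing give 6 slices per slab.
volume = np.zeros((12, 4, 4))
volume[7, 2, 2] = 1000.0          # one bright "nodule" voxel in slice 7
slabs = mip_slabs(volume, slice_spacing_mm=1.25)
print(slabs.shape)                # 12 slices collapse to 2 slab images

# An easy positive (p=0.95) contributes far less loss than a hard one (p=0.1).
losses = binary_focal_loss([0.95, 0.10], [1, 1])
print(losses)
```

In this toy volume the single bright voxel occupies 1 of 12 slices before projection and 1 of 2 images after it, which is the sparsity reduction the summary describes.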
Combined, these techniques let Lung-DETR reach state-of-the-art performance on the LUNA16 dataset, with an F1 score of 94.2% (95.2% recall, 93.3% precision) on data that is sparsely populated with nodules and reflective of real-world clinical data.
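As a quick consistency check, the reported F1 score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The helper name `f1_from_pr` is hypothetical and used only for this illustration.

```python
def f1_from_pr(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Values reported for Lung-DETR on LUNA16.
f1 = f1_from_pr(precision=0.933, recall=0.952)
print(f"{f1:.3f}")  # prints 0.942
```

The result agrees with the stated 94.2%.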