Abstract: Identifying and locating diseases in chest X-rays is very challenging, due to the low visual contrast between normal and abnormal regions and the distortions caused by other overlapping tissues. An interesting phenomenon is that there exist many similar structures in the left and right parts of the chest, such as ribs, lung fields and bronchial tubes. Such similarities can be used to identify diseases in chest X-rays, according to the experience of board-certified radiologists. Aiming to improve the performance of existing detection methods, we propose a deep end-to-end module that exploits contralateral context information to enhance the feature representations of disease proposals. First, under the guidance of the spine line, a spatial transformer network is employed to extract local contralateral patches, which provide valuable context information for disease proposals. Then, we build a dedicated module, based on both additive and subtractive operations, to fuse the features of the disease proposal and the contralateral patch. Our method can be integrated into both fully and weakly supervised disease detection frameworks. It achieves 33.17 AP50 on a carefully annotated private chest X-ray dataset which contains 31,000 images. Experiments on the NIH chest X-ray dataset indicate that our method achieves state-of-the-art performance in weakly-supervised disease localization.
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
This paper addresses the problem of identifying and locating diseases in chest X-rays. The task is difficult because the visual contrast between normal and abnormal regions is low and because overlapping tissues distort the image. The paper focuses on how the similarity between structures on the left and right sides of the chest, such as the ribs, lung fields, and bronchi, can be used to improve detection performance. According to the experience of board-certified radiologists, this similarity helps to identify diseases in chest X-rays.
### Main Contributions
1. **Propose a new deep module**: Contralaterally Enhanced Networks (CE-Nets) enhance the feature representation of disease proposals in chest X-ray detection. According to the paper, this is the first work to explicitly use contralateral context information between the left and right sides of the chest for this purpose.
2. **Develop effective methods to find contralateral reference patches**: For each disease proposal, a preliminary contralateral reference patch is extracted under the guidance of the spine line, and its position is then refined by a Spatial Transformer Network (STN). A new feature fusion module combines the features of the disease proposal and its contralateral reference patch through addition and subtraction operations.
3. **Significantly improve the performance of existing object detection baseline models**: On the fully-supervised chest X-ray dataset, the proposed method yields significant performance improvements over multiple baseline models. On the NIH chest X-ray dataset, it also achieves state-of-the-art performance in the weakly-supervised setting.
### Technical Details
- **Contralateral Patch Extraction**:
  - **Preliminary Contralateral Patch**: Use the spine line as the axis of symmetry and determine the preliminary contralateral patch for each disease proposal by solving a system of linear equations.
  - **Optimized Contralateral Patch**: Use the Spatial Transformer Network (STN) to adjust the position of the preliminary contralateral patch and obtain a better-aligned contralateral patch.
- **Feature Fusion Module**:
  - Use ROI pooling to extract the feature representation of the disease proposal from the input feature map.
  - Generate a feature map for the contralateral patch and extract its feature representation through bilinear interpolation and ROI pooling.
  - Fuse the features of the disease proposal and its contralateral patch through addition and subtraction operations, then generate the final prediction through a fully-connected layer.
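
The mirroring and fusion steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the spine line is assumed to be given by two points, the preliminary patch is taken as the axis-aligned bounding box of the mirrored proposal corners, and the sum and difference features are assumed to be concatenated before the fully-connected layer. The function names `reflect_point`, `contralateral_box`, and `fuse_features` are hypothetical, and the STN refinement and ROI pooling steps are omitted.

```python
def reflect_point(p, a, b):
    """Reflect point p across the line through points a and b (the assumed spine line)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    norm2 = dx * dx + dy * dy
    if norm2 == 0:
        raise ValueError("the spine line needs two distinct points")
    # Projection parameter of p onto the line, then the foot of the perpendicular.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / norm2
    foot_x, foot_y = a[0] + t * dx, a[1] + t * dy
    # The mirror image lies as far behind the line as p lies in front of it.
    return (2 * foot_x - p[0], 2 * foot_y - p[1])


def contralateral_box(box, a, b):
    """Mirror an (x0, y0, x1, y1) box across the spine line.

    Assumption: the preliminary patch is the axis-aligned bounding box
    of the four mirrored corners.
    """
    x0, y0, x1, y1 = box
    corners = [(x0, y0), (x1, y0), (x0, y1), (x1, y1)]
    mirrored = [reflect_point(c, a, b) for c in corners]
    xs = [m[0] for m in mirrored]
    ys = [m[1] for m in mirrored]
    return (min(xs), min(ys), max(xs), max(ys))


def fuse_features(proposal_feat, contra_feat):
    """Combine two equal-length feature vectors by addition and subtraction.

    Assumption: the sum and difference are concatenated; the source only
    states that both operations are used before a fully-connected layer.
    """
    if len(proposal_feat) != len(contra_feat):
        raise ValueError("feature vectors must have the same length")
    added = [p + c for p, c in zip(proposal_feat, contra_feat)]
    diff = [p - c for p, c in zip(proposal_feat, contra_feat)]
    return added + diff
```

With a vertical spine line at x = 100, for example, a proposal box `(20, 50, 60, 90)` is mirrored to `(140, 50, 180, 90)`. A tilted spine line is handled by the same reflection formula, which is why two points are used rather than a single x coordinate.
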
### Experimental Results
- **Fully-Supervised Disease Detection**:
  - Experiments were carried out on a self-built dataset of 31,000 chest X-ray images covering 30 disease categories, with a total of 155,000 annotated lesion regions.
  - The proposed method significantly improved the performance of multiple baseline models, especially on the AP50 and AP75 metrics, and reaches 33.17 AP50 on this dataset.
- **Weakly-Supervised Disease Detection**:
  - Experiments were carried out on the NIH chest X-ray dataset, which contains 112,120 images and 14 disease categories.
  - The proposed method achieves state-of-the-art localization performance in this setting.
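
AP50 and AP75 are average-precision scores computed at intersection-over-union (IoU) thresholds of 0.5 and 0.75. The box-matching test behind them can be sketched as follows. The helper names `iou` and `is_match` are hypothetical, boxes are assumed to be `(x0, y0, x1, y1)` tuples, and whether the comparison is strict varies between evaluation toolkits.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True if a predicted box overlaps a ground-truth box enough to count as a hit."""
    return iou(pred_box, gt_box) >= threshold
```

A prediction that covers half of an equally sized ground-truth box has an IoU of 1/3, so it counts as a miss at both thresholds.
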
### Conclusion
By introducing contralateral context information, the paper significantly improves disease detection in chest X-rays, particularly where contrast is low and tissues overlap. The proposed method applies to fully-supervised disease detection frameworks and can also be integrated into weakly-supervised ones.