Classification of lung cancer subtypes on CT images with synthetic pathological priors

Wentao Zhu, Yuan Jin, Gege Ma, Geng Chen, Jan Egger, Shaoting Zhang, Dimitris N. Metaxas
DOI: https://doi.org/10.1016/j.media.2024.103199
2023-08-09
Abstract: Accurate diagnosis of the pathological subtypes of lung cancer is of significant importance for follow-up treatment and prognosis management. In this paper, we propose a self-generating hybrid feature network (SGHF-Net) for accurately classifying lung cancer subtypes on computed tomography (CT) images. Inspired by studies stating that cross-scale associations exist in the image patterns between the same case's CT images and its pathological images, we innovatively developed a pathological feature synthetic module (PFSM), which quantitatively maps cross-modality associations through deep neural networks, to derive the "gold standard" information contained in the corresponding pathological images from CT images. Additionally, we designed a radiological feature extraction module (RFEM) to directly acquire CT image information and integrated it with the pathological priors under an effective feature fusion framework, enabling the entire classification model to generate more indicative and specific pathologically related features and eventually output more accurate predictions. The superiority of the proposed model lies in its ability to self-generate hybrid features that contain multi-modality image information based on a single-modality input. To evaluate the effectiveness, adaptability, and generalization ability of our model, we performed extensive experiments on a large-scale multi-center dataset (i.e., 829 cases from three hospitals) to compare our model with a series of state-of-the-art (SOTA) classification models. The experimental results demonstrated the superiority of our model for lung cancer subtype classification, with significant improvements in accuracy (ACC), area under the curve (AUC), and F1 score.
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The paper aims to address the accurate classification of lung cancer subtypes on computed tomography (CT) images. Specifically, the research objectives include the following aspects:

1. **Propose a new Self-Generating Hybrid Feature Network (SGHF-Net)**: To improve the accuracy of lung cancer subtype classification, the researchers developed a novel deep learning model, SGHF-Net. The model uses CT images to classify lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC).
2. **Application of cross-modal correlation**: Building on the cross-scale correlation between CT images and pathological images reported in the existing literature, the study quantitatively maps these correlations through deep neural networks. This lets the model derive from CT images the "gold standard" information that is typically obtainable only from pathological images.
3. **Development of a Pathological Feature Synthetic Module (PFSM)**: To obtain pathological information from CT images, the researchers designed the PFSM, which generates high-order pathological features from CT images.
4. **Design of a Radiological Feature Extraction Module (RFEM)**: The RFEM extracts radiological information directly from CT images. Its output is combined with the pathological features generated by the PFSM to improve the performance of the overall classification model.
5. **Construction of a feature fusion framework**: By integrating the synthetic pathological features from the PFSM with the radiological features from the RFEM, the feature fusion framework guides the classification model to extract pathology-related features from CT inputs.
6. **Experimental validation**: To evaluate the effectiveness and generalization ability of the proposed method, the researchers ran extensive comparative experiments on a large-scale multi-center dataset of 829 cases from three hospitals. The results show that the proposed model achieved significant improvements in accuracy (ACC), area under the curve (AUC), and F1 score.

In summary, this study aims to improve the accuracy of lung cancer subtype classification on CT images by developing a new deep learning model. This matters especially at the early diagnosis stage, where accurate subtyping is crucial for formulating effective personalized treatment plans.
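
The two-branch data flow described above (one CT input, a synthetic pathological branch, a radiological branch, and a fused prediction) can be illustrated with a toy forward pass. This is only a minimal sketch under stated assumptions, not the authors' implementation: the class name `ToyHybridClassifier`, the feature sizes, the single dense layer standing in for each module, the use of plain concatenation as the fusion step, and the reading of the output as a LUSC-versus-LUAD probability are all hypothetical. The weights are random, and the training of the pathological branch against real pathological images is not shown.

```python
import math
import random


def make_linear(in_dim, out_dim, rng):
    """Build a toy dense layer: small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    biases = [0.0] * out_dim
    return weights, biases


def linear(layer, x):
    """Apply a dense layer to a feature vector given as a list of floats."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


class ToyHybridClassifier:
    """Hypothetical stand-in for the two-branch idea; not the paper's network."""

    def __init__(self, ct_dim=16, path_dim=8, rad_dim=8, seed=0):
        rng = random.Random(seed)
        # Stand-in for the PFSM: maps CT features to synthetic pathological features.
        self.pfsm = make_linear(ct_dim, path_dim, rng)
        # Stand-in for the RFEM: maps the same CT features to radiological features.
        self.rfem = make_linear(ct_dim, rad_dim, rng)
        # Classification head applied to the fused (hybrid) feature vector.
        self.head = make_linear(path_dim + rad_dim, 1, rng)

    def hybrid_features(self, ct):
        """Derive both feature sets from the single CT input and fuse them."""
        synthetic_path = relu(linear(self.pfsm, ct))
        radiological = relu(linear(self.rfem, ct))
        # Assumption: fusion by concatenation; the source does not specify it.
        return synthetic_path + radiological

    def predict_proba(self, ct):
        """Return a sigmoid score, read here as P(LUSC) by assumption."""
        logit = linear(self.head, self.hybrid_features(ct))[0]
        return 1.0 / (1.0 + math.exp(-logit))


# Usage: a made-up 16-value vector stands in for an encoded CT volume.
model = ToyHybridClassifier()
ct_vector = [0.5] * 16
print(len(model.hybrid_features(ct_vector)))
print(model.predict_proba(ct_vector))
```

The point of the sketch is the shape of the computation: both feature sets come from the same single-modality input, so no pathological image is needed at inference time.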