Classification of multiple primary lung cancer in patients with multifocal lung cancer: assessment of a machine learning approach using multidimensional genomic data

Guotian Pei, Kunkun Sun, Yingshun Yang, Shuai Wang, Mingwei Li, Xiaoxue Ma, Huina Wang, Libin Chen, Jiayue Qin, Shanbo Cao, Jun Liu, Yuqing Huang
DOI: https://doi.org/10.3389/fonc.2024.1388575
IF: 4.7
2024-05-04
Frontiers in Oncology
Abstract:
Background: Multiple primary lung cancer (MPLC) is an increasingly recognized clinical phenomenon. However, its molecular characteristics are poorly understood, and an effective method to distinguish it from intrapulmonary metastasis (IM) is still lacking. Herein, we propose an identification model based on multidimensional molecular analysis in order to optimize treatment.
Methods: A total of 112 Chinese patients with lung cancer, each harboring at least two tumors (n = 270), were enrolled. We retrospectively selected 74 patients with 121 tumor pairs and randomly divided the tumor pairs into a training cohort and a test cohort in a 7:3 ratio. A novel model was established in the training cohort, optimized for MPLC identification using comprehensive genomic profiling with a broad panel of 808 cancer-related genes, and evaluated in the test cohort and in a prospective validation cohort of 38 patients with 112 tumors.
Results: We found differences in molecular characteristics between the two diseases and rigorously selected among these characteristics to build an identification model. In the test cohort, the classifier achieved a percent agreement (PA) of 89.5% for MPLC and 100.0% for IM, with an area under the curve (AUC) of 0.947 and an overall accuracy of 91.3%. Similarly, the assay performed well in the independent validation set, with an AUC of 0.938. More importantly, the MPLC predictive value of the classification reached 100% in both the test set and the validation cohort. Compared with our previous mutation-based method, the classifier showed better κ consistency with clinical classification among all 112 patients (0.84 vs. 0.65, p < .01).
Conclusion: These data provide novel evidence of MPLC-specific genomic characteristics and demonstrate that our one-step molecular classifier can accurately classify multifocal lung tumors as MPLC or IM, suggesting that broad-panel NGS may be a useful tool for assisting with differential diagnosis.
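As a rough illustration of the agreement metrics named in the abstract (percent agreement, MPLC predictive value, and Cohen's κ), the following Python sketch computes them from paired reference and predicted labels. The function names and the toy labels are assumptions made purely for illustration; they are not the study's data and not the authors' implementation.

```python
from collections import Counter


def percent_agreement(reference, predicted, label):
    """Among cases whose reference class is `label`, the fraction the classifier also calls `label`."""
    calls = [p == label for r, p in zip(reference, predicted) if r == label]
    return sum(calls) / len(calls)


def predictive_value(reference, predicted, label):
    """Among cases the classifier calls `label`, the fraction whose reference class is `label`."""
    truths = [r == label for r, p in zip(reference, predicted) if p == label]
    return sum(truths) / len(truths)


def cohen_kappa(reference, predicted):
    """Cohen's kappa: observed agreement corrected for agreement expected by chance."""
    n = len(reference)
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    expected = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical toy labels (NOT taken from the paper): clinical reference vs. classifier call.
reference = ["MPLC"] * 6 + ["IM"] * 4
predicted = ["MPLC"] * 5 + ["IM"] + ["IM"] * 4

print("PA (MPLC):", percent_agreement(reference, predicted, "MPLC"))
print("PA (IM):", percent_agreement(reference, predicted, "IM"))
print("MPLC predictive value:", predictive_value(reference, predicted, "MPLC"))
print("Cohen's kappa:", cohen_kappa(reference, predicted))
```

In this toy example one MPLC case is called IM, so percent agreement for MPLC falls below 100% while the MPLC predictive value stays at 100%. This is the same pattern of figures the abstract reports, though with different numbers.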
oncology