Improving differential diagnosis of pulmonary large cell neuroendocrine carcinoma (LCNEC) and small cell lung cancer via a transcriptomic, biological pathway-based ridge regression model.

Jun Hong, Likun Hou, Wei Zhang, Zhengwei Dong, Zhan Huang, Wuzhou Yuan, Lei Zhang, Chunyan Wu
DOI: https://doi.org/10.1200/jco.2021.39.15_suppl.3065
IF: 45.3
2021-05-20
Journal of Clinical Oncology
Abstract: 3065 Background: In clinical practice, it can be challenging to correctly distinguish LCNEC from small cell lung cancer (SCLC) when tissue is insufficient, as with needle biopsies, or when morphology is poorly preserved. In this study, a reliable classifier was constructed from transcriptome data using machine learning (ridge regression) to improve diagnostic accuracy for LCNEC and SCLC. Methods: RNA-Seq data from 3 public cohorts were collected as the training set: 60 NSCLC cases from The Cancer Genome Atlas (TCGA), 66 LCNEC cases from Julie George et al., Nature Communications 2018, and 33 SCLC cases from Julie George et al., Nature 2015. Another 80 NSCLC, 30 LCNEC, and 15 SCLC cases published by Martin Peifer et al., Nature Genetics 2012 were used as the validation set. Additionally, RNA-Seq data from 27 borderline samples that were difficult to diagnose by histology and immunohistochemistry were used to test the accuracy of the prediction model. Results: 13,959 genes mapped to 186 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were included. The Gene Set Variation Analysis (GSVA) algorithm was used to compute an enrichment score for each KEGG pathway. A prediction model based on the GSVA score of each pathway was constructed via ridge regression. This GSVA score model achieved an ROC-AUC of 0.949 and an overall concordance rate of 0.75. Of the 27 borderline samples, 17/27 (63.0%) were predicted as LCNEC, 7/27 were predicted as SCLC, and the remainder were predicted as NSCLC. By contrast, pathologists diagnosed only 8 (29.6%) of these cases as LCNEC, a significantly lower proportion than the model predicted.
Furthermore, cases predicted by the model as LCNEC had significantly longer disease-free survival (DFS) than cases predicted as SCLC (median DFS, 59 months for LCNEC vs 5 months for SCLC, p = 0.0043), consistent with the known prognostic difference between these two types of neuroendocrine tumors. Conclusions: This GSVA-based prediction model was able to accurately diagnose LCNEC and SCLC, and it may provide valuable information to help clinicians choose the optimal therapeutic approach for patients with pulmonary neuroendocrine tumors. [Table: see text]
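The workflow described in the abstract (per-pathway scores, then a ridge regression classifier, then a concordance rate on a held-out cohort) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the gene sets are hypothetical blocks of 10 genes rather than KEGG pathways, `pathway_scores` is a simple mean z-score stand-in and is not the GSVA algorithm, and the penalty `alpha=1.0` is an arbitrary choice. All function names are illustrative.

```python
import numpy as np

CLASSES = ["NSCLC", "LCNEC", "SCLC"]

def pathway_scores(expr, gene_sets):
    """Stand-in for GSVA (NOT the GSVA algorithm): mean z-score per gene set.
    expr is (n_samples, n_genes); gene_sets is a list of gene index arrays."""
    z = (expr - expr.mean(axis=0)) / (expr.std(axis=0) + 1e-9)
    return np.column_stack([z[:, idx].mean(axis=1) for idx in gene_sets])

def fit_ridge(X, y, n_classes, alpha=1.0):
    """Closed-form ridge regression on one-hot class targets.
    Centering X and Y leaves the intercept unpenalized."""
    Y = np.eye(n_classes)[y]
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    W = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ Yc)
    b = y_mean - x_mean @ W
    return W, b

def predict(X, W, b):
    """Assign each sample to the class with the largest regression output."""
    return np.argmax(X @ W + b, axis=1)

def simulate_cohort(rng, n_per_class, gene_sets, n_genes, shift=2.0):
    """Synthetic expression: class k has gene set k shifted upward."""
    y = np.repeat(np.arange(len(CLASSES)), n_per_class)
    expr = rng.normal(size=(y.size, n_genes))
    for k in range(len(CLASSES)):
        rows = np.where(y == k)[0]
        expr[np.ix_(rows, gene_sets[k])] += shift
    return expr, y

rng = np.random.default_rng(0)
n_genes = 60
# Hypothetical gene sets: six non-overlapping blocks of 10 genes each.
gene_sets = [np.arange(i * 10, (i + 1) * 10) for i in range(6)]

train_expr, train_y = simulate_cohort(rng, 40, gene_sets, n_genes)
val_expr, val_y = simulate_cohort(rng, 20, gene_sets, n_genes)

X_train = pathway_scores(train_expr, gene_sets)
X_val = pathway_scores(val_expr, gene_sets)

W, b = fit_ridge(X_train, train_y, n_classes=len(CLASSES), alpha=1.0)
pred = predict(X_val, W, b)

# Concordance rate: fraction of validation samples whose predicted class
# matches the reference label.
concordance = float(np.mean(pred == val_y))
print(f"validation concordance: {concordance:.2f}")
```

Working on pathway-level scores rather than individual genes reduces the feature count from thousands of genes to a few hundred pathways, and the ridge penalty shrinks the coefficients of correlated pathway scores, which is one plausible reason for pairing the two in a setting with relatively few samples.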
oncology