Machine-learning and scRNA-Seq-based diagnostic and prognostic models illustrating survival and therapy response of lung adenocarcinoma

Qingyu Cheng, Weidong Zhao, Xiaoyuan Song, Tengchuan Jin
DOI: https://doi.org/10.1038/s41435-024-00289-0
2024-07-29
Abstract: Lung cancer is a leading cause of cancer-related mortality, with lung adenocarcinoma (LUAD) being the most prevalent subtype. Given the high clinical and cellular heterogeneity of LUAD, accurate diagnosis and prognosis are crucial to avoid overdiagnosis and overtreatment. Taking full advantage of scRNA-Seq data to resolve tumor heterogeneity, we explored the overall landscape of the LUAD microenvironment. Using stage-specific tumor cell markers, we developed diagnostic and prognostic models with high sensitivity and specificity. The diagnostic model, built with a random forest algorithm on a thirteen-gene signature, achieved an accuracy of 96.4% and an AUC of 0.993. These metrics were further supported by benchmarking against available models and scoring systems in independent cohorts. The prognostic model, formulated via Cox regression with a six-gene signature, effectively predicted overall survival; higher risk scores were associated with increased fractions of cancer-associated fibroblasts and a higher likelihood of immune escape and T-cell exclusion. Two nomograms were then developed to predict survival and drug responses, facilitating integration of the models into clinical practice. Overall, this study underscores the potential of our models for efficient, rapid, and cost-effective diagnosis and prognosis of LUAD, adaptable to multiple expression profiling platforms and quantification methods.
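As a rough illustration of how a Cox-regression gene signature is typically turned into a per-patient risk score, the sketch below computes the linear predictor (the sum of coefficient times expression over the signature genes) and splits a cohort at the median score. The gene names, coefficients, and median cutoff are hypothetical placeholders; the abstract does not list the six genes, their weights, or the stratification rule used in the study.

```python
from statistics import median

# Hypothetical placeholder coefficients: NOT the six genes or weights from the paper.
COEFS = {
    "GENE_A": 0.42, "GENE_B": -0.31, "GENE_C": 0.18,
    "GENE_D": 0.27, "GENE_E": -0.12, "GENE_F": 0.35,
}

def risk_score(expression, coefs=COEFS):
    """Cox linear predictor: sum of coefficient * expression over signature genes."""
    return sum(beta * expression[gene] for gene, beta in coefs.items())

def stratify(samples, coefs=COEFS):
    """Label each sample 'high' or 'low' risk relative to the cohort median score.

    The median split is an assumption made for this sketch.
    """
    scores = {sid: risk_score(expr, coefs) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    return {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}
```

Here `samples` would map a sample identifier to a dictionary of normalized expression values keyed by gene name. In practice the coefficients would come from a fitted Cox model rather than being set by hand.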