Abstract 7394: Delineating spatial expression signatures of lung adenocarcinoma subtypes from spatial single-cell transcriptomics using graph neural networks

Arun Das, Md Musaddaqul Hasib, Zhentao Liu, Wen Meng, Yu-Chiao Chiu, Gabriel Sica, Shou-Jiang Gao
DOI: https://doi.org/10.1158/1538-7445.am2024-7394
IF: 11.2
2024-03-28
Cancer Research
Abstract: Lung adenocarcinoma (LUAD) subtypes, including complex acinar (CA), micropapillary, and solid, are characterized by distinct morphological and biological features that bear significant implications for prognosis and treatment strategies. While histology-based subtype prediction has provided a structural understanding, it overlooks crucial biological insights related to molecular mechanisms and the influences of tumor microenvironments (TMEs). Recent advances in spatially resolved single-cell profiling enable precise measurements of gene expression at the cellular and sub-cellular levels within intact tissues. We hypothesize that spatial gene expression (SGE) patterns can effectively identify LUAD subtypes and characterize the influence of TMEs. Thus, to address the limitations of histology-based subtyping of LUAD, we developed an AI-driven approach that leverages Spatial Molecular Imaging (SMI) samples to predict LUAD subtypes and elucidate their underlying molecular landscape and associated TMEs. For this study, we used the NanoString CosMx SMI NSCLC FFPE datasets comprising 8 LUAD tissue samples, annotated by pathologists at UPMC for subtype identification. The dataset consists of approximately 98k cells per sample with 18 cell types and 960 genes per cell. We encoded the spatial organization of cells as a graph in which nodes represent cells with expression features and edges connect each cell to its 30 nearest neighbors (k = 30). We framed the task as predicting the subtype (micropapillary, CA, or solid) of each tumor cell and developed a Graph Convolutional Network (GCN) with semi-supervised training to account for the unlabeled nodes. A subgraph splitting strategy with cross-validation was developed to train on localized spatial regions and ensure balanced label proportions in each subgraph. Our GCN achieved 92.63% test accuracy, surpassing non-spatial deep learning models (77.45%).
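The graph construction and semi-supervised GCN described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract specifies only the 30-neighbor graph, the 960-gene features, the three subtype labels, and the presence of unlabeled nodes. The Euclidean distance metric, the two-layer architecture, the symmetric normalization with self-loops, the hidden size of 16, the log1p transform, and the synthetic data are all assumptions made for illustration, and only a forward pass with a masked loss is shown (no training loop).

```python
import numpy as np

def knn_edges(coords, k):
    """Directed edge list (i, j) linking each cell i to its k nearest neighbors j.
    Euclidean distance is an assumption; the abstract does not name a metric."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a cell is never its own neighbor
    nbrs = np.argsort(dist, axis=1)[:, :k]
    src = np.repeat(np.arange(coords.shape[0]), k)
    return np.stack([src, nbrs.ravel()], axis=1)

def normalized_adjacency(edges, n):
    """Symmetrize, add self-loops, and scale as D^{-1/2} (A + I) D^{-1/2}."""
    a = np.zeros((n, n))
    a[edges[:, 0], edges[:, 1]] = 1.0
    a = np.maximum(a, a.T)
    a += np.eye(n)
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(a_hat, x, w1, w2):
    """Two-layer GCN forward pass returning per-cell class logits."""
    h = np.maximum(a_hat @ x @ w1, 0.0)  # ReLU
    return a_hat @ h @ w2

def masked_cross_entropy(logits, labels, mask):
    """Cross-entropy averaged over labeled cells only (semi-supervised loss)."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    picked = log_p[np.arange(len(labels)), labels]
    return -(picked * mask).sum() / mask.sum()

# Synthetic stand-in data (hypothetical; the real samples have ~98k cells each).
rng = np.random.default_rng(0)
n_cells, n_genes, n_classes, k = 200, 960, 3, 30
coords = rng.uniform(0.0, 1.0, size=(n_cells, 2))        # cell centroids
expr = rng.poisson(1.0, size=(n_cells, n_genes)).astype(float)
labels = rng.integers(0, n_classes, size=n_cells)        # ignored where mask is False
mask = rng.random(n_cells) < 0.3                         # True = pathologist-labeled
mask[0] = True                                           # guarantee one labeled cell

edges = knn_edges(coords, k)
a_hat = normalized_adjacency(edges, n_cells)
w1 = rng.normal(0.0, 0.01, size=(n_genes, 16))
w2 = rng.normal(0.0, 0.1, size=(16, n_classes))
logits = gcn_forward(a_hat, np.log1p(expr), w1, w2)
loss = masked_cross_entropy(logits, labels, mask)
```

The dense distance and adjacency matrices are only workable for a toy example. At roughly 98k cells per sample, a spatial index for the neighbor search and sparse adjacency storage would be needed, which is one practical reason to train on localized subgraphs.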
Notably, explanation analysis of our trained model using the GNNExplainer algorithm identified high-attribution cells in CA and solid TMEs, revealing unique differentially expressed genes (DEGs) - JUNB in CA, and VIM, GPX1, PSAP, IFI27 in solid tumors, compared to low-attribution cells within the same subtype. Further analysis revealed an enrichment of EMT and interferon alpha/gamma, suggesting a more inflamed phenotype in high-attribution cells than in low-attribution cells. Differential expression analysis of high-attribution CA vs solid cells identified significant upregulation of NDRG1 in CA and COL1A1 in solid. Functional enrichment analysis of these DEGs revealed the PI3K-Akt signaling pathway in CA and the MAPK signaling pathway in solid, suggesting an aggressive solid TME and CA with high survival and metastasis. We further identified a distinct neutrophil-enriched TME associated with CA and revealed cell-cell communication through MIF interactions with the CD74/CXCR4 and CD44/CD74 axes.
Citation Format: Arun Das, Md Musaddaqul Hasib, Zhentao Liu, Wen Meng, Yu-Chiao Chiu, Gabriel Sica, Shou-Jiang Gao. Delineating spatial expression signatures of lung adenocarcinoma subtypes from spatial single-cell transcriptomics using graph neural networks [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 7394.
oncology