Identification of A 10-Gene Signature to Predict Efficacy of Neoadjuvant Therapy in Patients with HER2 Positive Breast Cancer.

Yusong Wang, Mozhi Wang, Xiangyu Sun, Litong Yao, Mengshen Wang, Haoran Dong, Xinyan Li, Mingcong He, Yingying Xu
DOI: https://doi.org/10.21203/rs.3.rs-154352/v1
2021-01-01
Abstract

Background: Patients with human epidermal growth factor receptor 2 (HER2)-positive breast cancer (BC) have a poor prognosis and are recommended to receive neoadjuvant therapy (NAT). The tumor immune microenvironment, especially tumor-infiltrating lymphocytes (TILs), has been shown to predict the efficacy of NAT. However, validated immune-related multi-gene signatures for HER2-positive BC are still lacking.

Methods: We collected gene expression arrays of pre-NAT samples from the National Center for Biotechnology Information Gene Expression Omnibus. In total, 4 studies were included (n=295; training set, n=207; validation set, n=95) to construct the signature. Single-sample gene set enrichment analysis (ssGSEA) and weighted gene co-expression network analysis (WGCNA) were used to quantify immune-infiltrating components in the tumor environment and to identify immune-related modules. We used spline regression to evaluate the non-linear effects of genes and to construct the signature.

Results: Immune infiltration status was significantly related to pathological complete response (pCR) (p=0.02). We filtered 80 differentially expressed genes according to immune infiltration status and identified two gene modules correlated with pCR and immune infiltration status. CCL5, CD72, PTGDS, CYTIP, PAX5, and estrogen receptor (ER) status were significantly related to pCR in linear multivariate analysis. In spline regression, non-linear aspects of MAP7, IL2RB, CD3G, PTPRC, and TRAC were relevant to pCR. We constructed a signature incorporating both linear and non-linear effects of genes, which was validated in 5-fold cross-validation (AUC=0.81) and in an external validation cohort (n=88) (AUC=0.797).

Conclusions: In HER2-positive BC, immune infiltration status should be taken into consideration when selecting optimal regimens. A ten-gene generalized non-linear signature including ER status could predict the efficacy of NAT.
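The abstract describes a signature that mixes linear gene effects with spline-modelled non-linear effects and is scored by 5-fold cross-validated AUC. The following is a minimal, dependency-free sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the data are synthetic, there is one hypothetical "linear" gene and one hypothetical "non-linear" gene, the spline is a simple linear hinge basis with arbitrarily chosen knots, and the logistic model is fitted by plain gradient descent. The abstract does not specify the spline family, the knots, or the fitting procedure.

```python
import math
import random

def hinge_basis(x, knots):
    """Linear spline basis: x itself plus one hinge term max(0, x - k) per knot."""
    return [x] + [max(0.0, x - k) for k in knots]

def build_features(linear_vals, nonlinear_vals, knots):
    """Raw values for linear genes, spline-expanded values for non-linear genes."""
    row = list(linear_vals)
    for v in nonlinear_vals:
        row.extend(hinge_basis(v, knots))
    return row

def linear_predictor(w, x):
    """Intercept (last weight) plus weighted sum of features."""
    return w[-1] + sum(wj * xj for wj, xj in zip(w, x))

def fit_logistic(X, y, lr=0.2, epochs=200):
    """Unregularized logistic regression via batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = max(-30.0, min(30.0, linear_predictor(w, xi)))  # avoid overflow
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def cross_val_auc(X, y, k=5):
    """Mean held-out AUC over k shuffled folds."""
    idx = list(range(len(X)))
    random.Random(0).shuffle(idx)
    aucs = []
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        train = [i for i in idx if i not in held]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        scores = [linear_predictor(w, X[i]) for i in fold]
        aucs.append(auc(scores, [y[i] for i in fold]))
    return sum(aucs) / k

# Synthetic cohort (assumption): response depends linearly on one gene
# and in a V-shaped, non-linear way on another.
rng = random.Random(42)
KNOTS = [-1.0, 0.0, 1.0]  # hypothetical knot positions
X, y = [], []
for _ in range(200):
    g_lin = rng.uniform(-2, 2)
    g_nl = rng.uniform(-2, 2)
    logit = 1.5 * g_lin + 3.0 * (abs(g_nl) - 1.0)
    y.append(1 if rng.random() < 1 / (1 + math.exp(-logit)) else 0)
    X.append(build_features([g_lin], [g_nl], KNOTS))

mean_auc = cross_val_auc(X, y)
print(f"5-fold cross-validated AUC: {mean_auc:.3f}")
```

The hinge term at knot 0 is what lets a model that is linear in its features represent the V-shaped effect of the second gene. In practice a smoother basis (for example cubic or natural splines from a statistics library) and real expression data would replace these simplifications.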