Clinical and multi-omic features differentiate young Black and White breast cancer cohorts derived by machine learning approaches

Kawther Abdilleh, Boris Aguilar, George Acquaah-Mensah
DOI: https://doi.org/10.1016/j.clbc.2024.11.015
IF: 3.078
2024-12-01
Clinical Breast Cancer
Abstract:

Background: There are documented differences in breast cancer (BrCA) presentations and outcomes between Black and White patients. In addition to molecular factors, socioeconomic, racial, and clinical factors result in disparities in outcomes for women in the United States. Using machine learning and unsupervised biclustering methods within a multi-omics framework, we sought to shed light on the biological and clinical underpinnings of observed differences between Black and White BrCA patients.

Methods: We examined The Cancer Genome Atlas (TCGA) BrCA samples from stage II patients aged 50 or younger who are Black (BAA50) or White (W50) (n=139 patients; 36 BAA50 and 103 W50). These patients were chosen because marked differences in survival were observed in an earlier study. A variety of multi-omic data sets were analyzed to further characterize the clinical and molecular disparities.

Results: We coupled RNAseq data with protein-protein interaction data and BrCA-specific protein co-expression network data to identify two novel biclusters. These biclusters are significantly associated with clinical features including race, number of lymph nodes involved with disease, estrogen receptor status, progesterone receptor status, and menopausal status. Differentially mutated genes were also identified. Using DNA methylation data, we identified differentially methylated genes. Machine learning algorithms were trained on differential methylation values of driver genes. The trained algorithms were successful in predicting the bicluster assignment of each sample.

Conclusions: These results demonstrate a significant association between cluster membership and the BAA50 and W50 cohorts, indicating that these biclusters accurately stratify these cohorts.

Micro-abstract: Breast cancer is more aggressive in Black patients relative to White patients. We used disparate data types to identify novel groups of genes that are significantly associated with race and other clinical features. We identified various molecular differences and potential biomarkers between the groups. These identified biomarkers can be useful for detection or diagnosis.
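
As an illustration of what a "significant association between cluster membership and cohort" can mean in practice, the sketch below runs a two-sided Fisher's exact test on a 2x2 table of cohort by bicluster. This is only an assumption about the kind of test involved: the abstract does not name the statistical test used, and the per-bicluster counts below are hypothetical placeholders. Only the cohort totals (36 BAA50, 103 W50) come from the abstract.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of every table that is no more likely than
    # the observed one (small tolerance guards against float rounding).
    p_value = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# HYPOTHETICAL counts, not taken from the paper. Rows are cohorts,
# columns are biclusters; row sums match the abstract (36 and 103).
baa50_in_bicluster1, baa50_in_bicluster2 = 25, 11
w50_in_bicluster1, w50_in_bicluster2 = 30, 73

p_value = fisher_exact_two_sided(
    baa50_in_bicluster1, baa50_in_bicluster2,
    w50_in_bicluster1, w50_in_bicluster2,
)
print(f"two-sided p-value (hypothetical counts): {p_value:.3g}")
```

A small p-value for a table like this would indicate that bicluster membership is not independent of cohort; with the real per-bicluster counts from the study, the same function could be applied unchanged.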
oncology