Abstract

Background. Breast invasive carcinoma (BRCA) is not a single disease: each subtype has a distinct morphological structure. Although several computational methods have been proposed for breast cancer subtype identification, the specific interaction mechanisms of the genes involved in each subtype remain incompletely understood. Identifying and exploring these interaction mechanisms for each subtype could have an important impact on personalized treatment.

Methods. We integrated the biological importance of genes, derived from gene regulatory networks, into the differential expression analysis to obtain weighted differentially expressed genes (weighted DEGs). A gene with a high weight regulates more target genes and is therefore considered more biologically important. In addition, we constructed gene coexpression networks for the control and experiment groups; the significant differences in their interaction structures motivated us to design a Gene Ontology (GO) enrichment analysis based on gene coexpression networks (GOEGCN). GOEGCN performs a two-side distinction analysis between the coexpression networks of the control and experiment groups, which allows us to study how the modulated coexpressed gene pairs affect biological functions at the GO level.

Results. We trained a binary classifier on the weighted DEGs for each subtype. The classifiers predicted unseen samples well, and the experimental results validated the effectiveness of the proposed approaches. The novel enriched GO terms obtained with GOEGCN for the control and experiment groups of each subtype explain, to some extent, the specific changes in biological function associated with the two-side distinction of coexpression network structures.

Conclusion. The weighted DEGs carry biological importance derived from the gene regulatory network. Based on the weighted DEGs, five binary classifiers were learned and performed well on the sensitivity, specificity, accuracy, F1, and AUC metrics. GOEGCN with weighted DEGs yielded novel GO enrichment results for the control and experiment groups, and the novel enriched GO terms may, to some extent, further unveil the changes in specific biological functions among the BRCA subtypes. The R code for this research is available at https://github.com/yxchspring/GOEGCN_BRCA_Subtypes.
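
The two ideas in the Methods can be illustrated with a minimal Python sketch on toy data. It is not the authors' implementation, which is the R code in the repository above. Everything specific here is an assumption made for illustration: the weight formula (1 plus the normalised out-degree in the regulatory network), the use of absolute log2 fold change as the differential expression score, the Pearson threshold of 0.8, and all gene names and values.

```python
from math import log2, sqrt

def out_degrees(edges):
    """Count distinct target genes per regulator in (regulator, target) edges."""
    targets = {}
    for regulator, target in edges:
        targets.setdefault(regulator, set()).add(target)
    return {gene: len(t) for gene, t in targets.items()}

def weighted_scores(mean_control, mean_experiment, edges):
    """Scale |log2 fold change| by an ASSUMED weight: 1 + normalised out-degree."""
    degree = out_degrees(edges)
    max_degree = max(degree.values(), default=1)
    scores = {}
    for gene in mean_control:
        lfc = log2(mean_experiment[gene] / mean_control[gene])
        weight = 1.0 + degree.get(gene, 0) / max_degree
        scores[gene] = abs(lfc) * weight
    return scores

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpressed_pairs(expr, threshold=0.8):
    """Gene pairs whose |Pearson r| reaches the (assumed) threshold."""
    genes = sorted(expr)
    return {
        (g, h)
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
        if abs(pearson(expr[g], expr[h])) >= threshold
    }

# Hypothetical toy data: mean expression per group and a tiny regulatory network.
mean_control = {"TF1": 4.0, "G2": 8.0, "G3": 2.0}
mean_experiment = {"TF1": 8.0, "G2": 4.0, "G3": 2.0}
edges = [("TF1", "G2"), ("TF1", "G3"), ("G2", "G3")]
scores = weighted_scores(mean_control, mean_experiment, edges)

# Hypothetical per-sample expression profiles for each group.
control = {"TF1": [1, 2, 3, 4], "G2": [2, 4, 6, 8], "G3": [4, 1, 3, 2]}
experiment = {"TF1": [1, 2, 3, 4], "G2": [4, 1, 3, 2], "G3": [2, 4, 6, 8]}

# Two-side distinction: edges present in one coexpression network only.
control_only = coexpressed_pairs(control) - coexpressed_pairs(experiment)
experiment_only = coexpressed_pairs(experiment) - coexpressed_pairs(control)
```

In this toy case the regulator `TF1` ranks above `G2` even though both have the same absolute fold change, because it regulates more targets. The genes appearing in `control_only` and `experiment_only` are the kind of group-specific coexpressed pairs that a GO enrichment step could then be run on separately for each side.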
