A Novel Two-Stage Stacking Model for Breast Cancer Survival Prediction

Haokai Gao, Xiangru Li, Ruizheng Shi
DOI: https://doi.org/10.1109/cscwd61410.2024.10580419
2024-01-01
Abstract: Predicting breast cancer survival rates is clinically important: it informs treatment decisions by attending physicians and treatment choices by patients. Based on the characteristics of the problem, this paper proposes a multilayer perceptron (MLP) based stacking ensemble learning approach, Stacking-MLP, and applies it to the analysis of survival rates among breast cancer patients over 1-year, 3-year, 5-year, and 10-year periods. Stacking-MLP consists of two stages. First, it trains a series of primary perceptron models on randomly undersampled subsets of the features, which mitigates the negative impact of inter-feature correlations. Second, it employs a fully connected neural network as the stacking ensemble learner to adaptively extract complementary information from the outputs of the primary perceptrons. Surveillance, Epidemiology, and End Results (SEER) is an authoritative cancer epidemiology database managed by the National Cancer Institute in the United States. In this study, we constructed a SEER dataset consisting of 19 variables for breast cancer survival prediction studies. We evaluated the proposed Stacking-MLP model on this dataset, and the experimental results show an improvement of approximately 10% in accuracy compared to similar studies. The experimental code and the constructed dataset used in this study are publicly available at https://github.com/GaoHaokai222/Stacking-MLP.
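The two-stage structure described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation (their code is at the repository linked above). To stay short it makes several assumptions: each primary learner is a single logistic unit rather than an MLP, the stage-two learner is also a single logistic unit standing in for the fully connected network, and the toy data, subset size, number of base learners, and all function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, epochs=100, lr=0.1, seed=0):
    """Train one logistic unit with plain SGD; returns (weights, bias)."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.1, 0.1) for _ in X[0]]
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def base_outputs(base, row):
    # Each base learner only sees its own feature subset.
    return [predict_proba(model, [row[c] for c in cols]) for cols, model in base]

def fit_stacking(X, y, n_base=5, subset_size=3, seed=0):
    """Stage 1: base learners on random feature subsets.
    Stage 2: a meta-learner trained on the base learners' outputs."""
    rng = random.Random(seed)
    n_features = len(X[0])
    base = []
    for k in range(n_base):
        # Random undersampling of features (without replacement).
        cols = sorted(rng.sample(range(n_features), subset_size))
        Xk = [[row[c] for c in cols] for row in X]
        base.append((cols, train_logistic(Xk, y, seed=seed + k)))
    # Simplification: the meta-features are in-sample predictions.
    # Out-of-fold predictions are commonly used to limit leakage.
    Z = [base_outputs(base, row) for row in X]
    meta = train_logistic(Z, y, seed=seed + n_base)
    return base, meta

def predict_stacking(base, meta, row):
    return predict_proba(meta, base_outputs(base, row))

def make_toy_data(n=200, n_features=6, seed=42):
    # Hypothetical synthetic data: the label depends on features 0 and 1 only.
    rng = random.Random(seed)
    X = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)] for _ in range(n)]
    y = [1 if row[0] + row[1] > 0 else 0 for row in X]
    return X, y

X, y = make_toy_data()
base, meta = fit_stacking(X, y)
hits = sum((predict_stacking(base, meta, r) > 0.5) == bool(t) for r, t in zip(X, y))
print(f"training accuracy on toy data: {hits / len(y):.2f}")
```

Because each base learner sees only part of the feature set, no single one can rely on every correlated feature at once; the second stage then learns how much weight to give each base output. Replacing the logistic units with real MLPs changes the learners but not this overall data flow.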