IOFS-SA: An interactive online feature selection tool for survival analysis

Xudong Zhao, Yuanyuan He, Youlin Wu, Tong Liu, Guohua Wang
DOI: https://doi.org/10.1016/j.compbiomed.2022.106121
Abstract: Background: Survival analysis is a primary concern when planning clinical treatment for cancer patients after surgery. Many tools have been proposed to simplify this kind of analysis. Although these tools are easy to use, two fatal flaws remain. First, sample grouping is commonly empirical and is wrongly based on the original gene expression values or on survival time. Second, their feature selection methods mostly depend on univariate semi-supervised regression, or on its multivariate counterpart, without accounting for the small sample size relative to the high dimensionality. Objective: To solve these two problems, we design an automatic feature selection web tool that also supports interactive sample grouping. Methods: Automatic feature selection is performed on user-defined data or TCGA data; users can also select features manually. Hierarchical clustering is then applied, and an automatic re-clustering strategy is proposed for use after the interactive risk score split. The Kaplan-Meier survival curve and the log-rank test are used for evaluation. Results: Experimental results on 53 datasets from TCGA demonstrate the effectiveness of our method. The tree view, heat map, and scatter map intuitively display the results for the selected genes to doctors for further research. Conclusions: This method is suitable for survival analysis of high-dimensional, small-sample datasets. It also provides a platform for researchers to analyze custom data. It addresses the problems of existing web tools and provides an effective feature selection method for survival analysis. Availability: The full code package is freely available for download at https://github.com/Yuan-23/IOFS-SA-ecp-data-main, and the online version at https://bioinfor.nefu.edu.cn/IOFS-SA/ is free to use.
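The abstract names the Kaplan-Meier survival curve and the log-rank test as the measurement used to compare sample groups. The following is a minimal, dependency-free sketch of those two standard techniques, not the IOFS-SA implementation. The function names `kaplan_meier` and `logrank_test`, the list-based inputs, and the toy data are all assumptions made for illustration.

```python
import math


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability).

    times  -- follow-up durations (list of numbers)
    events -- 1 if the event was observed, 0 if the sample was censored
    """
    surv = 1.0
    curve = []
    # Survival only drops at distinct times where an event was observed.
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value), 1 df."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    observed_a = expected_a = variance = 0.0
    for t in sorted({t for t, e in zip(all_times, all_events) if e}):
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance is undefined for n == 1
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Toy data (hypothetical): a high-risk and a low-risk group.
    high_t, high_e = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
    low_t, low_e = [10, 11, 12, 13, 14], [1, 1, 1, 1, 1]
    print(kaplan_meier(high_t, high_e))
    print(logrank_test(high_t, high_e, low_t, low_e))
```

In a workflow like the one described, the two groups passed to `logrank_test` would come from the risk score split and re-clustering step, and a small p-value would suggest that the selected genes separate patients with different survival outcomes.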