Controlled variable selection with nonconvex regularization for identifying biomarkers
Shoujiang Li, Hui Zhang, Yong Liang
DOI: https://doi.org/10.1016/j.bspc.2024.105965
IF: 5.1
2024-02-03
Biomedical Signal Processing and Control
Abstract: Biomedical big data has revolutionized biomarker identification and has become a key driver of precision medicine applications. Although existing computational methods can rapidly identify candidate biomarkers (variable selection), true validation of biomarkers is still hampered by low statistical power and poor reproducibility of results. To address these issues, we propose two knockoff-based nonconvex regularization methods for identifying biomarkers. Both methods perform variable selection while rigorously controlling the false discovery rate (FDR) at a desired level and retaining high statistical power. They combine the knockoff framework with two nonconvex penalties, Smoothly Clipped Absolute Deviation (SCAD) and Minimax Concave Penalty (MCP), respectively. Knockoff variables are first constructed to mimic the correlation structure of the original variables while maintaining independence from the response; the original and knockoff variables are then combined into an augmented design matrix for variable selection. Because nonconvex regularization has good theoretical properties such as unbiasedness, sparsity, and the oracle property, the proposed methods are better able to handle heavy-tailed distributions, high noise, and highly correlated data. Numerical simulation experiments show that the proposed methods achieve strong statistical power relative to the baseline method while controlling the FDR. We also apply the proposed methods to identify Human Immunodeficiency Virus (HIV) drug resistance-related gene mutations, Alzheimer's disease brain lesion regions, and purity-related genes in tumor samples, which can provide a reference for clinical diagnosis and treatment.
engineering, biomedical
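
Illustrative sketch (not taken from the paper): the abstract names the SCAD and MCP penalties and a knockoff-based selection step with FDR control, but gives no formulas or parameters. The Python below shows the textbook forms of the two penalties and the standard knockoff+ data-dependent threshold, under several assumptions: the function names are hypothetical, the defaults a = 3.7 and gamma = 3.0 are conventional choices rather than the authors' settings, and W is assumed to be a vector of feature statistics in which a large positive value means the original variable entered the model more strongly than its knockoff. How the authors actually compute W from the SCAD or MCP fit is not stated in the abstract.

```python
import numpy as np


def scad_penalty(beta, lam, a=3.7):
    """Textbook SCAD penalty, evaluated elementwise (a > 2; 3.7 is an assumed default)."""
    b = np.abs(np.asarray(beta, dtype=float))
    linear = lam * b                                              # |b| <= lam
    quadratic = (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1))  # lam < |b| <= a*lam
    constant = lam**2 * (a + 1) / 2                               # |b| > a*lam
    return np.where(b <= lam, linear, np.where(b <= a * lam, quadratic, constant))


def mcp_penalty(beta, lam, gamma=3.0):
    """Textbook MCP penalty, evaluated elementwise (gamma > 1; 3.0 is an assumed default)."""
    b = np.abs(np.asarray(beta, dtype=float))
    inner = lam * b - b**2 / (2 * gamma)   # |b| <= gamma*lam
    outer = 0.5 * gamma * lam**2           # |b| > gamma*lam
    return np.where(b <= gamma * lam, inner, outer)


def knockoff_plus_threshold(W, q):
    """Smallest t among the nonzero |W_j| whose estimated FDP is at most q.

    Estimated FDP at t: (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}).
    Returns np.inf when no candidate threshold qualifies.
    """
    W = np.asarray(W, dtype=float)
    for t in np.sort(np.abs(W[W != 0])):
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return float(t)
    return np.inf


def knockoff_select(W, q):
    """Indices of the variables whose statistic reaches the knockoff+ threshold."""
    W = np.asarray(W, dtype=float)
    return np.flatnonzero(W >= knockoff_plus_threshold(W, q))
```

For example, `knockoff_select(W, q=0.2)` returns the indices of the variables retained at a target FDR of 20%, and an empty array when no threshold meets the target. Both penalties flatten to a constant for large coefficients, which is why they shrink large effects less than the lasso does.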