Gaussian Graphical Model Estimation and Selection for High-Dimensional Incomplete Data Using Multiple Imputation and Horseshoe Estimators

Yunxi Zhang, Soeun Kim
DOI: https://doi.org/10.3390/math12121837
IF: 2.4
2024-06-14
Mathematics
Abstract: Gaussian graphical models have been widely used to measure association networks in high-dimensional data; however, most existing methods assume fully observed data. In practice, missing values are inevitable in high-dimensional data and should be handled carefully. Under the Bayesian framework, we propose a regression-based approach to estimating a sparse precision matrix for high-dimensional incomplete data. The proposed approach nests multiple imputation and precision matrix estimation with horseshoe estimators in a combined Gibbs sampling process. For fast and efficient selection using horseshoe priors, a post-iteration 2-means clustering strategy is employed. Through extensive simulations, we show the superior selection and estimation performance of our approach compared to several prevalent methods. We further demonstrate the proposed approach on incomplete genetics data and compare it with alternative methods applied to completed data.
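The abstract does not spell out how the 2-means selection step works, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes the clustering is applied to the absolute values of posterior summaries of the regression coefficients (for example, posterior means), splitting them into a "noise" group near zero and a "signal" group. The function name `two_means_select` and the example values are hypothetical and are not taken from the paper.

```python
import numpy as np

def two_means_select(abs_coefs, max_iter=100):
    """Split nonnegative magnitudes into 'noise' and 'signal' with 1-D 2-means.

    Returns a boolean array: True marks entries assigned to the cluster with
    the larger center (treated here as selected edges or coefficients).
    """
    x = np.asarray(abs_coefs, dtype=float)
    # Deterministic start: one center at the minimum, one at the maximum.
    lo, hi = x.min(), x.max()
    if lo == hi:
        # All magnitudes are identical, so there is nothing to separate.
        return np.zeros(x.shape, dtype=bool)
    for _ in range(max_iter):
        # Assign each point to the nearer center; ties go to the noise cluster.
        signal = np.abs(x - hi) < np.abs(x - lo)
        new_lo, new_hi = x[~signal].mean(), x[signal].mean()
        if new_lo == lo and new_hi == hi:
            break  # centers stopped moving, so assignments are stable
        lo, hi = new_lo, new_hi
    return signal

# Hypothetical posterior-mean coefficients for one node-wise regression.
posterior_means = np.array([0.01, -0.02, 0.9, 0.03, -1.1, 0.0])
selected = two_means_select(np.abs(posterior_means))
print(selected.tolist())
```

Because a horseshoe posterior shrinks coefficients toward zero without setting them exactly to zero, some thresholding rule of this kind is needed to turn continuous estimates into a sparse graph; clustering on magnitudes avoids choosing a fixed cutoff by hand.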