Sufficient dimension reduction for regression with metric space-valued responses

Abdul-Nasah Soale, Yuexiao Dong
DOI: https://doi.org/10.48550/arXiv.2310.12402
2024-05-25
Abstract: Data visualization and dimension reduction for regression between a general metric space-valued response and Euclidean predictors are proposed. Current Fréchet dimension reduction methods require that the response metric space be continuously embeddable into a Hilbert space, which restricts the choice of metric and kernel. We relax this assumption by proposing a Euclidean embedding technique that avoids the use of kernels. Under this framework, classical dimension reduction methods such as ordinary least squares and sliced inverse regression are extended. An extensive simulation experiment demonstrates the superior performance of the proposed method on synthetic data compared to existing methods where applicable. Real data analyses of the factors influencing the distribution of COVID-19 transmission in the U.S. and of the association between BMI and structural brain connectivity of healthy individuals are also presented.
Methodology, Human-Computer Interaction, Computation
What problem does this paper attempt to address?
This paper addresses data visualization and dimension reduction in regression where the response takes values in a general metric space and the predictors are Euclidean. Current Fréchet dimension reduction methods require that the response metric space be continuously embeddable into a Hilbert space, which limits both the metrics that can be used and the choice of kernel function. The authors relax this assumption by proposing a Euclidean embedding technique that avoids kernel functions altogether. Under this framework, classical dimension reduction methods such as ordinary least squares (OLS) and sliced inverse regression (SIR) are extended to metric space-valued responses. The paper demonstrates the superior performance of the proposed method on synthetic data through extensive simulation experiments, and applies it to two real data sets: the factors influencing the distribution of COVID-19 transmission in the United States, and the association between BMI and structural brain connectivity in healthy individuals.
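For readers unfamiliar with sliced inverse regression, the following is a minimal sketch of the classical method for a scalar Euclidean response, which is the starting point the paper extends. It is not the authors' metric space extension, and the Euclidean embedding step is not shown because the abstract does not specify it. The function name `sir_directions` and its parameters are hypothetical, chosen only for this illustration.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_directions=1):
    """Classical sliced inverse regression (SIR) for a scalar response.

    Illustrative sketch only: this is the textbook Euclidean-response
    procedure, not the metric space-valued extension from the paper.
    Returns a (p, n_directions) array with unit-norm columns.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Standardize predictors: Z = (X - mean) Sigma^{-1/2}.
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mean) @ inv_sqrt

    # Partition observations into slices of roughly equal size by sorted y.
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)

    # Weighted covariance matrix of the within-slice means of Z.
    M = np.zeros((p, p))
    for idx in slices:
        slice_mean = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(slice_mean, slice_mean)

    # Leading eigenvectors of M, mapped back to the original X scale.
    _, vecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_directions]
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)
```

Calling `sir_directions(X, y)` on data generated from a single-index model such as `y = X @ beta + noise` should return a direction closely aligned with `beta`, up to sign.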