Distribution-in-distribution-out Regression

Xiaoyu Chen,Mengfan Fu,Yujing,Huang,Xinwei Deng
2024-05-20
Abstract:Regression analysis with probability measures as input predictors and output response has recently drawn great attention. However, it is challenging to handle multiple input probability measures due to the non-flat Riemannian geometry of the Wasserstein space, hindering the definition of arithmetic operations, hence additive linear structure is not well-defined. In this work, a distribution-in-distribution-out regression model is proposed by introducing parallel transport to achieve provable commutativity and additivity of newly defined arithmetic operations in Wasserstein space. The appealing properties of the DIDO regression model can serve a foundation for model estimation, prediction, and inference. Specifically, the Fréchet least squares estimator is employed to obtain the best linear unbiased estimate, supported by the newly established Fréchet Gauss-Markov Theorem. Furthermore, we investigate a special case when predictors and response are all univariate Gaussian measures, leading to a simple close-form solution of linear model coefficients and $R^2$ metric. A simulation study and real case study in intraoperative cardiac output prediction are performed to evaluate the performance of the proposed method.
Methodology,Statistics Theory
What problem does this paper attempt to address?