Statistical Inference in High-dimensional Poisson Regression with Applications to Mediation Analysis

Prabrisha Rakshit, Zijian Guo
2024-10-28
Abstract: Large-scale datasets with count outcome variables are widely present in various applications, and the Poisson regression model is among the most popular models for handling count outcomes. This paper considers the high-dimensional sparse Poisson regression model and proposes bias-corrected estimators for both linear and quadratic transformations of high-dimensional regression vectors. We establish the asymptotic normality of the estimators, construct asymptotically valid confidence intervals, and conduct related hypothesis testing. We apply the devised methodology to high-dimensional mediation analysis with count outcome, with particular application of testing for the existence of interaction between the treatment variable and high-dimensional mediators. We demonstrate the proposed methods through extensive simulation studies and application to real-world epigenetic data.
Methodology, Statistics Theory
What problem does this paper attempt to address?
This paper addresses statistical inference in high-dimensional sparse Poisson regression models and applies it to high-dimensional mediation analysis. Specifically, it constructs bias-corrected estimators for the linear functional \(x^{\top}\beta\) and the quadratic functional \(\beta_G^{\top}A\beta_G\) of the subvector \(\beta_G\), establishes the asymptotic normality of these estimators, constructs asymptotically valid confidence intervals, and conducts the related hypothesis tests.

### Main contributions of the paper:

1. **Constructing bias-corrected estimators**:
   - Proposes bias-corrected estimators for the linear functional \(x^{\top}\beta\) and the quadratic functional \(\beta_G^{\top}A\beta_G\).
   - These estimators are designed for sparse high-dimensional regression and do not rely on sparsity assumptions for the inverse of the Hessian matrix.
2. **Asymptotic normality and confidence intervals**:
   - Establishes the asymptotic normality of the bias-corrected estimators, so that asymptotically valid confidence intervals can be constructed.
   - Proposes hypothesis tests for linear and quadratic functionals in high-dimensional Poisson regression.
3. **Application to mediation analysis**:
   - Applies the proposed inference methods to high-dimensional mediation analysis with count outcome variables.
   - Evaluates the methods through simulation studies and an analysis of real-world epigenetic data.

### Specific problem-solving:

- **Inference of linear functions**:
  - Constructs a bias-corrected estimator \(\widehat{x^{\top}\beta}\) of \(x^{\top}\beta\) and establishes its asymptotic normality.
  - Proposes the corresponding confidence interval and a test of \(H_0: x^{\top}\beta = 0\).
- **Inference of quadratic functions**:
  - Constructs a bias-corrected estimator \(\hat{Q}_A\) of \(\beta_G^{\top}A\beta_G\) and establishes its asymptotic normality.
  - Proposes the corresponding confidence interval and a test of \(H_0: \beta_G = 0\).

### Application examples:

- **Interaction test in mediation analysis**:
  - Considers the interaction term \(\beta_2\) in the mediation model and proposes a test of \(H_0: \beta_2 = 0\).
  - Recasts the test of the interaction term as a test of \(\beta_G = 0\), so that the quadratic-functional inference method developed in the paper applies.

### Conclusion:

The paper develops inference procedures for linear and quadratic functionals of the regression vector in high-dimensional Poisson regression and applies them to high-dimensional mediation analysis with count outcomes. The resulting bias-corrected estimators, confidence intervals, and tests give practitioners a way to quantify uncertainty and test for treatment-mediator interaction when the number of covariates is large.
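To make the idea of bias-corrected inference for a linear functional \(x^{\top}\beta\) more concrete, here is a minimal Python sketch. It is not the paper's algorithm. It fits an L1-penalized Poisson regression by proximal gradient descent and then applies a generic one-step correction, \(x^{\top}\hat\beta + \hat u^{\top}\frac{1}{n}\sum_i X_i\,(y_i - e^{X_i^{\top}\hat\beta})\). The direction \(\hat u\) is taken as the inverse sample Hessian applied to \(x\), which only works when \(p < n\). In the high-dimensional case the projection direction would instead come from a constrained optimization. All function names, the tuning parameter, and the simulated data are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def poisson_loss(X, y, beta):
    """Average negative Poisson log-likelihood (log link), up to a constant."""
    eta = X @ beta
    return np.mean(np.exp(eta) - y * eta)

def fit_l1_poisson(X, y, lam, n_iter=500, tol=1e-8):
    """L1-penalized Poisson regression via proximal gradient with backtracking."""
    n, p = X.shape
    beta, step = np.zeros(p), 1.0
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(n_iter):
            grad = X.T @ (np.exp(X @ beta) - y) / n
            f_old = poisson_loss(X, y, beta)
            while True:  # shrink the step until the quadratic upper bound holds
                cand = soft_threshold(beta - step * grad, step * lam)
                diff = cand - beta
                bound = f_old + grad @ diff + diff @ diff / (2 * step)
                if poisson_loss(X, y, cand) <= bound:
                    break
                step *= 0.5
            beta = cand
            if np.max(np.abs(diff)) < tol:
                break
    return beta

def debiased_linear(X, y, beta_hat, x_new, ridge=1e-8):
    """One-step bias correction for x_new' beta (low-dimensional stand-in).

    Assumption: the direction u is H^{-1} x_new with H the sample Hessian;
    this replaces the constrained-optimization direction needed when p > n.
    """
    n, p = X.shape
    mu = np.exp(X @ beta_hat)
    H = (X * mu[:, None]).T @ X / n            # sample Hessian of the loss
    u = np.linalg.solve(H + ridge * np.eye(p), x_new)
    score = X.T @ (y - mu) / n                 # negative gradient at beta_hat
    est = x_new @ beta_hat + u @ score         # plug-in estimate + correction
    se = np.sqrt(u @ H @ u / n)                # uses Var(y | X) = exp(X beta)
    return est, se

# Simulated example (illustrative values only).
rng = np.random.default_rng(0)
n, p = 1000, 20
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:2] = [0.5, -0.5]
y = rng.poisson(np.exp(X @ beta_true))
x_new = 0.3 * rng.normal(size=p)

beta_hat = fit_l1_poisson(X, y, lam=0.5 * np.sqrt(np.log(p) / n))
est, se = debiased_linear(X, y, beta_hat, x_new)
lo, hi = est - 1.96 * se, est + 1.96 * se      # approximate 95% interval
z_stat = est / se                              # statistic for H0: x' beta = 0
truth = x_new @ beta_true
print(f"truth={truth:.3f} est={est:.3f} CI=({lo:.3f}, {hi:.3f}) z={z_stat:.2f}")
```

The correction term matters because the L1 penalty shrinks \(\hat\beta\) toward zero. Adding the weighted score removes the first-order part of that shrinkage bias, which is what allows a normal approximation for the interval and the test statistic.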