Optimal Statistical Inference for Individualized Treatment Effects in High-dimensional Models

Tianxi Cai, Tony Cai, Zijian Guo
DOI: https://doi.org/10.48550/arXiv.1904.12891
2020-08-08
Abstract: The ability to predict individualized treatment effects (ITEs) based on a given patient's profile is essential for personalized medicine. We propose a hypothesis testing approach to choosing between two potential treatments for a given individual in the framework of high-dimensional linear models. The methodological novelty lies in the construction of a debiased estimator of the ITE and establishment of its asymptotic normality uniformly for an arbitrary future high-dimensional observation, while the existing methods can only handle certain specific forms of observations. We introduce a testing procedure with the type-I error controlled and establish its asymptotic power. The proposed method can be extended to making inference for general linear contrasts, including both the average treatment effect and outcome prediction. We introduce the optimality framework for hypothesis testing from both the minimaxity and adaptivity perspectives and establish the optimality of the proposed procedure. An extension to high-dimensional approximate linear models is also considered. The finite sample performance of the procedure is demonstrated in simulation studies and further illustrated through an analysis of electronic health records data from patients with rheumatoid arthritis.
Methodology, Statistics Theory
What problem does this paper attempt to address?
This paper addresses how to estimate and make inference about the individualized treatment effect (ITE) in high-dimensional models. Specifically, it proposes a hypothesis-testing method for choosing between two potential treatments given a patient's characteristics. The method works within the high-dimensional linear model framework. Its novelty lies in constructing a debiased estimator of the ITE and proving that this estimator is asymptotically normal, uniformly for an arbitrary future high-dimensional observation. The paper also introduces a testing procedure that controls the type-I error rate and establishes its asymptotic power. The approach is not limited to the ITE: it extends to inference for general linear contrasts, including the average treatment effect (ATE) and outcome prediction. The main contributions of the paper are as follows:
1. It proposes the first unified inference procedure for the general linear contrasts \(x^\top_{\text{new}}(\beta_1 - \beta_2)\) and \(x^\top_{\text{new}}\beta_k\) that requires no structural assumptions on the high-dimensional loading vector \(x_{\text{new}}\).
2. It establishes the optimal detection boundary when the exact sparsity level is unknown, an open problem noted in the earlier literature.
With these methods, researchers can test, with controlled type-I error, which of two treatments is more effective for a specific patient, which supports the development of personalized medicine.
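The general shape of such a test can be illustrated with a minimal Python sketch. It is not the paper's procedure: the function names `debiased_contrast` and `ite_test`, the `ridge` parameter, the plug-in noise variance, and the simulated data are all assumptions made for illustration. In particular, the projection direction here is a ridge-regularized inverse of the sample covariance applied to \(x_{\text{new}}\), a simple stand-in for whatever direction the paper constructs. The sketch corrects an initial estimate of \(x^\top_{\text{new}}\beta_k\) in each arm using the residuals, then applies a one-sided z-test to the difference.

```python
import math
from statistics import NormalDist

import numpy as np


def debiased_contrast(X, y, beta_init, x_new, ridge=0.0):
    """Bias-corrected estimate of x_new' beta for one treatment arm.

    Assumption: the projection direction u is a ridge-regularized inverse
    of the sample covariance applied to x_new. This is a stand-in, not the
    paper's construction. Use ridge > 0 when p >= n (singular covariance).
    """
    n, p = X.shape
    sigma_hat = X.T @ X / n
    u = np.linalg.solve(sigma_hat + ridge * np.eye(p), x_new)
    resid = y - X @ beta_init
    # Plug-in estimate plus a residual-based correction along u.
    estimate = x_new @ beta_init + u @ (X.T @ resid) / n
    # Assumption: noise variance estimated by the mean squared residual.
    noise_var = resid @ resid / n
    variance = noise_var * (u @ sigma_hat @ u) / n
    return float(estimate), float(variance)


def ite_test(arm1, arm2, x_new, alpha=0.05, ridge=0.0):
    """One-sided z-test of H0: ITE <= 0 against H1: ITE > 0.

    Each arm is a tuple (X, y, beta_init); the ITE is x_new'(beta_1 - beta_2).
    """
    est1, var1 = debiased_contrast(*arm1, x_new, ridge=ridge)
    est2, var2 = debiased_contrast(*arm2, x_new, ridge=ridge)
    z = (est1 - est2) / math.sqrt(var1 + var2)
    normal = NormalDist()
    return {
        "ite": est1 - est2,
        "z": z,
        "p_value": 1.0 - normal.cdf(z),
        "reject": z > normal.inv_cdf(1.0 - alpha),
    }


# Synthetic low-dimensional data, for illustration only.
rng = np.random.default_rng(0)
n, p = 200, 5
beta1 = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
beta2 = np.zeros(p)
X1, X2 = rng.normal(size=(n, p)), rng.normal(size=(n, p))
y1 = X1 @ beta1 + rng.normal(size=n)
y2 = X2 @ beta2 + rng.normal(size=n)
x_new = np.array([1.0, 0.0, 0.0, 0.0, 0.0])

# Initial estimators: ordinary least squares here; a sparse estimator
# such as the Lasso would be the natural choice when p is large.
init1 = np.linalg.lstsq(X1, y1, rcond=None)[0]
init2 = np.linalg.lstsq(X2, y2, rcond=None)[0]
result = ite_test((X1, y1, init1), (X2, y2, init2), x_new)
print(result)
```

With `ridge=0` and fewer covariates than observations, the correction term makes the estimate coincide with the ordinary least squares prediction whatever initial estimator is supplied, which is a convenient sanity check. The high-dimensional setting the paper targets is exactly where that shortcut fails and the choice of projection direction becomes the hard part.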