Abstract: Consider the problem of estimating the local average treatment effect with an instrument variable, where the instrument unconfoundedness holds after adjusting for a set of measured covariates. Several unknown functions of the covariates need to be estimated through regression models, such as instrument propensity score and treatment and outcome regression models. We develop a computationally tractable method in high-dimensional settings where the numbers of regression terms are close to or larger than the sample size. Our method exploits regularized calibrated estimation, which involves Lasso penalties but carefully chosen loss functions for estimating coefficient vectors in these regression models, and then employs a doubly robust estimator for the treatment parameter through augmented inverse probability weighting. We provide rigorous theoretical analysis to show that the resulting Wald confidence intervals are valid for the treatment parameter under suitable sparsity conditions if the instrument propensity score model is correctly specified, but the treatment and outcome regression models may be misspecified. For existing high-dimensional methods, valid confidence intervals are obtained for the treatment parameter if all three models are correctly specified. We evaluate the proposed methods via extensive simulation studies and an empirical application to estimate the returns to education.

What problem does this paper attempt to address?

This paper addresses the difficulty of using an instrumental variable (IV) to estimate the local average treatment effect (LATE) in high-dimensional settings. Specifically, when the number of regression terms is close to or exceeds the sample size, existing high-dimensional methods yield valid confidence intervals only if all three models (the instrument propensity score model, the treatment regression model, and the outcome regression model) are correctly specified. The paper proposes a computationally tractable method that yields valid Wald confidence intervals even when the treatment and outcome regression models may be misspecified, as long as the instrument propensity score model is correctly specified.

### Method Overview

The proposed method consists of the following steps:

1. **Regularized Calibrated Estimation in High-Dimensional Data**:
   - Lasso penalties are used to estimate the coefficient vectors in the regression models, but with carefully chosen loss functions rather than the usual likelihood or least-squares losses.
   - Specifically, for the instrument propensity score model, regularized calibrated estimation is used, and the parameters are estimated by minimizing a Lasso-penalized objective function.
2. **Doubly Robust Estimator**:
   - A doubly robust estimator of the treatment parameter is constructed through augmented inverse probability weighting (AIPW).
   - Inference based on this estimator remains valid even when the treatment and outcome regression models are misspecified, provided that the instrument propensity score model is correctly specified.
3. **Theoretical Analysis**:
   - A rigorous theoretical analysis shows that the proposed Wald confidence intervals are valid under suitable sparsity conditions.
   - The analysis establishes convergence rates for the regularized calibrated estimators and shows that the estimated LATE admits the desired asymptotic expansion when the instrument propensity score model is correctly specified.

### Key Contributions

- **Model-Assisted Confidence Intervals**: The paper proposes a model-assisted method that provides valid confidence intervals in high-dimensional settings even when the treatment and outcome regression models are misspecified.
- **Computational Feasibility**: The method is computationally tractable in practice and is implemented by sequentially constructing regularized calibrated estimators.
- **Theoretical Guarantees**: A rigorous theoretical analysis establishes the validity and asymptotic properties of the method.

### Application Examples

The paper evaluates the proposed method through extensive simulation studies and an empirical application estimating the returns to education.

### Conclusion

The paper proposes a method for estimating the local average treatment effect in high-dimensional settings whose confidence intervals remain valid even when the treatment and outcome regression models are misspecified, provided the instrument propensity score model is correct. The method is supported by rigorous theory and is evaluated in simulations and an empirical application.
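As a rough illustration of the doubly robust step in the method overview, the sketch below computes a textbook ratio-form AIPW estimate of the LATE and a Wald confidence interval from already-fitted nuisance values. It rests on several assumptions: a binary instrument and binary treatment, a generic AIPW score that is not necessarily the exact estimator used in the paper, and nuisance fits passed in as plain sequences (in the paper they would come from regularized calibrated estimation, which is not reproduced here). The function name `aipw_late` and all argument names are hypothetical.

```python
import math
from statistics import NormalDist, fmean, pstdev
from typing import NamedTuple


class LateResult(NamedTuple):
    estimate: float
    std_error: float
    ci_lower: float
    ci_upper: float


def aipw_late(y, d, z, pi_hat, m_y1, m_y0, m_d1, m_d0, level=0.95):
    """Ratio-form AIPW estimate of the LATE with a Wald interval (sketch).

    y, d, z     : outcome, binary treatment, binary instrument
    pi_hat      : fitted P(Z=1 | X), assumed strictly inside (0, 1)
    m_y1, m_y0  : fitted E[Y | Z=1, X] and E[Y | Z=0, X]
    m_d1, m_d0  : fitted E[D | Z=1, X] and E[D | Z=0, X]
    """
    def aipw_contrast(v, m1, m0):
        # Per-unit AIPW score for the effect of the instrument on v.
        return [
            (zi * (vi - a) / p + a) - ((1 - zi) * (vi - b) / (1 - p) + b)
            for vi, zi, p, a, b in zip(v, z, pi_hat, m1, m0)
        ]

    num = aipw_contrast(y, m_y1, m_y0)  # instrument effect on the outcome
    den = aipw_contrast(d, m_d1, m_d0)  # instrument effect on the treatment
    den_mean = fmean(den)
    theta = fmean(num) / den_mean

    # Influence-function (delta-method) standard error for the ratio.
    psi = [(a - theta * b) / den_mean for a, b in zip(num, den)]
    se = pstdev(psi) / math.sqrt(len(psi))
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return LateResult(theta, se, theta - crit * se, theta + crit * se)


# Toy usage: constant propensity 0.5 and regression fits set to zero, so the
# estimate reduces to the simple Wald ratio of mean differences.
z = [1, 1, 0, 0]
d = [1, 1, 0, 0]
y = [2.0, 4.0, 1.0, 1.0]
zeros = [0.0] * 4
result = aipw_late(y, d, z, [0.5] * 4, zeros, zeros, zeros, zeros)
print(result)
```

In the toy data the instrument shifts the mean outcome from 1 to 3 and the treatment rate from 0 to 1, so the point estimate equals the Wald ratio (3 - 1) / (1 - 0) = 2. Taking the nuisance fits as inputs keeps the sketch independent of how they are estimated; the validity guarantees described above depend on the paper's calibrated fitting procedure, not on this formula alone.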

High-dimensional Model-assisted Inference for Local Average Treatment Effects with Instrumental Variables
