Doubly robust inference with censoring unbiased transformations

Oliver Lunding Sandqvist
2024-11-08
Abstract: This paper extends doubly robust censoring unbiased transformations to a broad class of censored data structures under the assumption of coarsening at random and positivity. This includes the classic survival and competing risks setting, but also encompasses multiple events. A doubly robust representation for the conditional bias of the transformed data is derived. This leads to rate double robustness and oracle efficiency properties for estimating conditional expectations when combined with cross-fitting and linear smoothers. Simulation studies demonstrate favourable performance of the proposed method relative to existing approaches. An application of the methods to a regression discontinuity design with censored data illustrates its practical utility.
Methodology, Statistics Theory
What problem does this paper attempt to address?
The paper addresses how to estimate conditional expectations more accurately when the data are censored. Specifically, it studies how the Doubly Robust Censoring Unbiased Transformation (DRCUT) can handle a range of censored data structures, including the classic survival and competing risks settings as well as multiple-event settings, under the assumptions of coarsening at random (CAR) and positivity. By using the DRCUT, the paper aims to avoid the bias that existing methods can incur when estimating conditional expectations, and to improve the efficiency and robustness of the estimates.

### Main Contributions

1. **Generalize DRCUT**: Generalize the DRCUT from survival data to any censored data structure that satisfies coarsening at random and positivity.
2. **Doubly Robust Representation of Conditional Bias**: Derive a doubly robust representation of the conditional bias of the DRCUT, which is used to establish the large-sample properties of DRCUT-based estimators.
3. **Large-Sample Properties**: Using the framework of Kennedy (2023), establish the large-sample properties of DRCUT-based estimators, including rate double robustness and oracle efficiency.
4. **Cross-Fitting**: Extend the sample-splitting method of Kennedy (2023) to cross-fitting and explore results for estimators with slower convergence rates.

### Method Overview

- **Definition of DRCUT**: A new DRCUT is defined in terms of two probability measures \(P_1\) and \(P_2\). The transformation is doubly robust when \(P_1\) and \(P_2\) satisfy coarsening at random and positivity.
- **Representation of Conditional Bias**: Theorem 1 gives a doubly robust representation of the conditional bias of the DRCUT, which shows that the conditional bias is zero if either \(P_1\) or \(P_2\) is correctly specified.
- **Large-Sample Properties**: Using sample splitting and cross-fitting, DRCUT-based estimators are shown to have rate double robustness and oracle efficiency.

### Application Examples

- **Simulation Study**: Simulation experiments demonstrate the superior performance of the proposed method compared to existing methods.
- **Practical Application**: The method is applied to the regression discontinuity design (RDD) in the Longitudinal Study of Young People in England (LSYPE) to infer the conditional average treatment effect (CATE).

### Formula Presentation

- **Definition of DRCUT**:

\[ Y^*_{P_1, P_2}(C, X_C)=\frac{Y(X)\,\mathbb{1}(C \geq \eta)}{P_1(C \geq \eta|X)}+\int_0^\eta \frac{E_2[Y(X)|X_u]}{P_1(C > u|X)}\left\{d\mathbb{1}(C \leq u)-\mathbb{1}(C \geq u)\frac{P_1(C \in du|X)}{P_1(C > u|X)}\right\} \]

- **Doubly Robust Representation of Conditional Bias**:

\[ E[Y^*_{P_1, P_2}(C, X_C)-Y(X)|W]=E\left[\int_0^\eta\left\{E[Y(X)|X_u]-E_2[Y(X)|X_u]\right\}\left\{\gamma_1(u|X)-\gamma(u|X)\right\}\frac{P(C \geq u|X)}{P_1(C > u|X)}\,d\mu(u)\,\bigg|\,W\right] \]

These formulas show the definition of DRCUT and its doubly robust representation of conditional bias, which are helpful for understanding the core contributions and methods of the paper.
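
To make the double robustness claim concrete, the following is a minimal simulation sketch in the classic right-censored survival special case. It is not the paper's implementation. It assumes no covariates, an exponential event time \(T\), an independent exponential censoring time \(C\), and the outcome \(Y=\min(T,\tau)\), and it writes the transformation in the familiar augmented inverse-probability-of-censoring-weighted form rather than transcribing the general formula. The working censoring model plays the role of \(P_1\) and the working event model the role of \(P_2\). All function names (`cond_mean`, `make_drcut`, `mean_transformed`) and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def cond_mean(u, rate, tau):
    """E[min(T, tau) | T > u] for T ~ Exponential(rate), with 0 <= u <= tau."""
    return u + (1.0 - math.exp(-rate * (tau - u))) / rate


def make_drcut(tau, cens_rate, event_rate, steps=2000):
    """Build the transformation under the working models
    C ~ Exp(cens_rate) (role of P_1) and T ~ Exp(event_rate) (role of P_2).
    Both exponential working models are assumptions of this sketch."""
    h = tau / steps

    def integrand(u):
        # Q_2(u) / G_1(u) times the working censoring hazard,
        # where G_1(u) = exp(-cens_rate * u).
        return cond_mean(u, event_rate, tau) * math.exp(cens_rate * u) * cens_rate

    # Cumulative trapezoid rule on a grid over [0, tau].
    cum = [0.0]
    for k in range(steps):
        cum.append(cum[-1] + 0.5 * h * (integrand(k * h) + integrand((k + 1) * h)))

    def compensator(t):
        # Linear interpolation of the cumulative integral at t in [0, tau].
        k = min(int(t / h), steps - 1)
        return cum[k] + (t / h - k) * (cum[k + 1] - cum[k])

    def transform(t_obs, uncensored):
        inv_g = math.exp(cens_rate * t_obs)  # 1 / G_1(t_obs)
        if uncensored:
            lead = t_obs * inv_g  # Y is observed and equals t_obs
        else:
            lead = cond_mean(t_obs, event_rate, tau) * inv_g  # jump at censoring
        return lead - compensator(t_obs)

    return transform


def mean_transformed(n, tau, true_event, true_cens, work_event, work_cens, seed):
    """Monte Carlo mean of the transformed outcome under the true rates."""
    rng = random.Random(seed)
    transform = make_drcut(tau, work_cens, work_event)
    total = 0.0
    for _ in range(n):
        y = min(rng.expovariate(true_event), tau)
        c = rng.expovariate(true_cens)
        total += transform(min(y, c), c >= y)
    return total / n


TAU, EVENT, CENS, N = 2.0, 1.0, 0.5, 100_000
truth = cond_mean(0.0, EVENT, TAU)  # E[min(T, tau)] = 1 - exp(-2)

# Censoring model correct, event model wrong.
cens_ok = mean_transformed(N, TAU, EVENT, CENS, work_event=3.0, work_cens=CENS, seed=1)
# Event model correct, censoring model wrong.
event_ok = mean_transformed(N, TAU, EVENT, CENS, work_event=EVENT, work_cens=1.0, seed=2)
# Both working models wrong.
both_wrong = mean_transformed(N, TAU, EVENT, CENS, work_event=3.0, work_cens=1.0, seed=3)

print(f"truth={truth:.3f} cens_ok={cens_ok:.3f} "
      f"event_ok={event_ok:.3f} both_wrong={both_wrong:.3f}")
```

Under these assumptions, the first two averages should land close to the true value \(1-e^{-2}\approx 0.865\), while the third should be visibly biased. This mirrors the product structure of the bias representation: the error involves the product of the event-model error and the censoring-hazard error, so it vanishes when either factor is zero.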