Study designs for extending causal inferences from a randomized trial to a target population

Issa J. Dahabreh, Sebastien J-P.A. Haneuse, James M. Robins, Sarah E. Robertson, Ashley L. Buchanan, Elisabeth A. Stuart, Miguel A. Hernán
DOI: https://doi.org/10.48550/arXiv.1905.07764
2019-05-20
Abstract: We examine study designs for extending (generalizing or transporting) causal inferences from a randomized trial to a target population. Specifically, we consider nested trial designs, where randomized individuals are nested within a sample from the target population, and non-nested trial designs, including composite dataset designs, where a randomized trial is combined with a separately obtained sample of non-randomized individuals from the target population. We show that the causal quantities that can be identified in each study design depend on what is known about the probability of sampling non-randomized individuals. For each study design, we examine identification of potential outcome means via the g-formula and inverse probability weighting. Last, we explore the implications of the sampling properties underlying the designs for the identification and estimation of the probability of trial participation.
Methodology, Applications
What problem does this paper attempt to address?
The main problem this paper addresses is **how to extend (generalize or transport) causal inferences from a randomized trial to a target population**. The authors examine two families of study designs, nested trial designs and non-nested trial designs. A central point is that the causal quantities that can be identified in each design depend on what is known about the probability of sampling non-randomized individuals. For each design, the authors examine identification of potential outcome means via the g-formula and inverse probability weighting, and they explore how the sampling properties of the designs affect the identification and estimation of the probability of trial participation.

### Main Research Questions

1. **Generalization of Causal Inference**: How to extend the results of a randomized trial to a larger target population, especially when trial participants differ from the target population.
2. **Choice of Research Design**: How nested and non-nested trial designs compare for extending causal inferences, especially depending on whether the sampling probability of non-randomized individuals is known or unknown.
3. **Identification Conditions**: Which conditions are needed, under each design, to identify potential outcome means via the g-formula and inverse probability weighting.
4. **Model Estimation**: How to estimate the probability of trial participation in practice, especially when non-randomized individuals are subsampled in nested trial designs.

### Key Concepts

- **Nested Trial Design**: The randomized individuals are nested within a sample from the target population.
- **Non-Nested Trial Design**: Randomized trial data are combined with a separately obtained sample of non-randomized individuals from the target population.
- **g-formula**: A method for identifying potential outcome means.
- **Inverse Probability Weighting**: Reweights the sample so that it reflects the characteristics of the target population.
- **Sampling Probability**: The probability that a non-randomized individual is sampled, which is crucial for determining which causal quantities can be identified.

### Methods and Results

- **Nested Trial Design**:
  - **Census**: When data are available for the entire actual population, the potential outcome means in the target population can be identified directly.
  - **Sub-sampling**: When only a subset of non-randomized individuals is sampled, the sampling probability \( c \) must be taken into account, and the potential outcome means can be identified through weighting.
- **Non-Nested Trial Design**:
  - When the sampling probability \( u \) of non-randomized individuals is unknown, identifying potential outcome means is more complex, but under certain conditions it can still be achieved through inverse probability weighting.

### Conclusions

The paper gives researchers guidance on extending causal inferences from randomized trials under different study designs. In nested trial designs, appropriate weighting can account for the subsampling of non-randomized individuals. In non-nested trial designs, where the sampling probability of non-randomized individuals is unknown, potential outcome means can still be identified under certain conditions through inverse probability weighting.
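As a rough illustration of the g-formula and inverse probability weighting ideas summarized above, the following Python sketch simulates a nested trial design in which covariates are observed for the whole population (the census case) and compares both estimators with the unadjusted trial-arm mean. Everything in it is an assumption made for illustration and is not taken from the paper: the simulated data, the single binary covariate `x`, the function names, and the use of the standard identification forms \( E[Y^a] = E\{E[Y \mid X, S=1, A=a]\} \) and \( E[Y^a] = E\{S\,I(A=a)\,Y / (\Pr[S=1 \mid X]\,\Pr[A=a \mid X, S=1])\} \), which rely on conditional exchangeability and positivity.

```python
import numpy as np


def simulate_nested_trial(n=200_000, seed=0):
    """Simulate a hypothetical population with a trial nested inside it.

    x: binary covariate, s: trial participation, a: randomized treatment
    (-1 for non-participants), y: outcome (NaN for non-participants).
    By construction E[Y^0] = 1.5 and E[Y^1] = 2.5 in the whole population.
    """
    rng = np.random.default_rng(seed)
    x = rng.binomial(1, 0.5, size=n)
    # Participation depends on x, so the trial over-represents x == 1.
    s = rng.binomial(1, np.where(x == 1, 0.6, 0.2))
    a_trial = rng.binomial(1, 0.5, size=n)
    y_full = 1.0 + x + 2.0 * a_trial * x + rng.normal(size=n)
    a = np.where(s == 1, a_trial, -1)
    y = np.where(s == 1, y_full, np.nan)
    return x, s, a, y


def g_formula_mean(x, s, a, y, arm):
    """Average the trial-arm outcome mean within each x stratum over the
    covariate distribution of the whole population."""
    total = 0.0
    for level in np.unique(x):
        in_stratum = x == level
        cell = in_stratum & (s == 1) & (a == arm)
        total += in_stratum.mean() * y[cell].mean()
    return total


def ipw_mean(x, s, a, y, arm):
    """Weight trial outcomes in the given arm by the inverse of the estimated
    probability of participating and of receiving that arm."""
    p_part = np.zeros(len(x))
    p_trt = np.zeros(len(x))
    for level in np.unique(x):
        in_stratum = x == level
        in_trial = in_stratum & (s == 1)
        p_part[in_stratum] = in_trial.sum() / in_stratum.sum()
        p_trt[in_stratum] = (in_trial & (a == arm)).sum() / in_trial.sum()
    contributes = (s == 1) & (a == arm)
    weights = np.where(contributes, 1.0 / (p_part * p_trt), 0.0)
    # Replace NaN outcomes of non-participants by 0; their weight is 0 anyway.
    return np.sum(weights * np.where(contributes, y, 0.0)) / len(x)


x, s, a, y = simulate_nested_trial()
for arm in (0, 1):
    trial_only = y[(s == 1) & (a == arm)].mean()
    print(
        f"arm={arm} trial-only={trial_only:.3f} "
        f"g-formula={g_formula_mean(x, s, a, y, arm):.3f} "
        f"ipw={ipw_mean(x, s, a, y, arm):.3f}"
    )
```

Because the covariate is discrete and both estimators use the same stratum-specific proportions, the two estimates coincide here up to floating-point error; with continuous covariates and parametric models they would generally differ. The sketch does not cover subsampling of non-randomized individuals or non-nested designs, where the sampling probability would also have to enter the weights.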