Learning the Covariance of Treatment Effects Across Many Weak Experiments

Aurélien Bibaut, Winston Chou, Simon Ejdemyr, Nathan Kallus
2024-02-28
Abstract: When primary objectives are insensitive or delayed, experimenters may instead focus on proxy metrics derived from secondary outcomes. For example, technology companies often infer long-term impacts of product interventions from their effects on weighted indices of short-term user engagement signals. We consider meta-analysis of many historical experiments to learn the covariance of treatment effects on different outcomes, which can support the construction of such proxies. Even when experiments are plentiful and large, if treatment effects are weak, the sample covariance of estimated treatment effects across experiments can be highly biased and remains inconsistent even as more experiments are considered. We overcome this by using techniques inspired by weak instrumental variable analysis, which we show can reliably estimate parameters of interest, even without a structural model. We show the Limited Information Maximum Likelihood (LIML) estimator learns a parameter that is equivalent to fitting total least squares to a transformation of the scatterplot of estimated treatment effects, and that Jackknife Instrumental Variables Estimation (JIVE) learns another parameter that can be computed from the average of Jackknifed covariance matrices across experiments. We also present a total-covariance-based estimator for the latter estimand under homoskedasticity, which we show is equivalent to a $k$-class estimator. We show how these parameters relate to causal quantities and can be used to construct unbiased proxy metrics under a structural model with both direct and indirect effects subject to the INstrument Strength Independent of Direct Effect (INSIDE) assumption of Mendelian randomization. Lastly, we discuss the application of our methods at Netflix.
Methodology
What problem does this paper attempt to address?
This paper addresses the problem of using short-term proxy metrics to predict long-term treatment effects in experiments. When the primary objective is insensitive or delayed, experimenters may instead focus on proxy metrics derived from secondary outcomes. For example, technology companies often infer the long-term impact of product interventions from their effects on weighted indices of short-term user engagement signals. However, when treatment effects are weak relative to the noise in each experiment, the sample covariance of estimated treatment effects across experiments can be highly biased, and it remains inconsistent even as more experiments are added. To overcome this, the authors propose methods inspired by weak instrumental variable analysis that can reliably estimate the parameters of interest even without a structural model:

1. **Limited Information Maximum Likelihood (LIML)**: learns a parameter that is equivalent to fitting total least squares to a transformation of the scatterplot of estimated treatment effects.
2. **Jackknife Instrumental Variables Estimation (JIVE)**: learns another parameter that can be computed from the average of jackknifed covariance matrices across experiments.
3. **Total-covariance-based estimator**: under homoskedasticity, targets the same estimand as JIVE by subtracting the covariance matrix of unit-level noise from the empirical covariance matrix of estimated effects. The abstract states that this is equivalent to a $k$-class estimator.

These methods estimate the covariance of treatment effects across outcomes and support the construction of unbiased proxy metrics, under a structural model with both direct and indirect effects subject to the INSIDE assumption. Such proxies can then be used to predict long-term treatment effects in new experiments. The paper also discusses the application of these methods at Netflix.
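The noise-subtraction idea behind the total-covariance-based estimator can be illustrated with a small simulation. The sketch below is not the authors' implementation. It assumes two outcomes, Gaussian true effects, equal arm sizes, and a unit-level noise covariance that is known and identical across experiments (homoskedasticity). The function names and all numeric values are hypothetical, chosen only to make the bias visible.

```python
import numpy as np


def simulate_estimates(true_cov, unit_noise_cov, n_experiments, n_per_arm, rng):
    """Simulate difference-in-means effect estimates for many experiments.

    Assumption (illustrative only): true effects are drawn from N(0, true_cov),
    and each estimate adds sampling noise with covariance
    2 * unit_noise_cov / n_per_arm (two arms of n_per_arm units each).
    """
    d = true_cov.shape[0]
    true_effects = rng.multivariate_normal(np.zeros(d), true_cov, size=n_experiments)
    est_noise_cov = 2.0 * unit_noise_cov / n_per_arm
    noise = rng.multivariate_normal(np.zeros(d), est_noise_cov, size=n_experiments)
    return true_effects + noise, est_noise_cov


def naive_covariance(estimates):
    """Sample covariance of the estimated effects across experiments."""
    return np.cov(estimates, rowvar=False)


def noise_corrected_covariance(estimates, est_noise_cov):
    """Subtract the (assumed known) noise covariance from the naive covariance."""
    return naive_covariance(estimates) - est_noise_cov


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Weak, positively correlated true effects (hypothetical values).
    true_cov = np.array([[0.010, 0.005],
                         [0.005, 0.010]])
    # Negatively correlated unit-level noise (hypothetical values).
    unit_noise_cov = np.array([[1.0, -0.5],
                               [-0.5, 1.0]])
    estimates, est_noise_cov = simulate_estimates(
        true_cov, unit_noise_cov, n_experiments=5000, n_per_arm=100, rng=rng
    )
    print("true covariance:\n", true_cov)
    print("naive covariance:\n", naive_covariance(estimates))
    print("noise-corrected covariance:\n",
          noise_corrected_covariance(estimates, est_noise_cov))
```

In this setup the naive covariance converges to the true covariance plus the noise covariance, so adding more experiments does not remove the bias, and the off-diagonal entry can even take the wrong sign. In practice the noise covariance would have to be estimated from unit-level data rather than assumed known.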