Abstract: Interference exists when a unit's outcome depends on another unit's treatment assignment. For example, intensive policing on one street could have a spillover effect on neighboring streets. Classical randomization tests typically break down in this setting because many null hypotheses of interest are no longer sharp under interference. A promising alternative is to instead construct a conditional randomization test on a subset of units and assignments for which a given null hypothesis is sharp. Finding these subsets is challenging, however, and existing methods are limited to special cases or have limited power. In this paper, we propose valid and easy-to-implement randomization tests for a general class of null hypotheses under arbitrary interference between units. Our key idea is to represent the hypothesis of interest as a bipartite graph between units and assignments, and to find an appropriate biclique of this graph. Importantly, the null hypothesis is sharp within this biclique, enabling conditional randomization-based tests. We also connect the size of the biclique to statistical power. Moreover, we can apply off-the-shelf graph clustering methods to find such bicliques efficiently and at scale. We illustrate our approach in settings with clustered interference and show advantages over methods designed specifically for that setting. We then apply our method to a large-scale policing experiment in Medellin, Colombia, where interference has a spatial structure.
What problem does this paper attempt to address?
The paper addresses the breakdown of classical randomization tests under interference, that is, when one unit's outcome depends on another unit's treatment assignment. For example, intensive patrols on one street may affect the crime rate on neighboring streets. Classical randomization tests typically break down here because many null hypotheses of interest are no longer sharp: they do not pin down every unit's outcome under every possible assignment. In response, the paper proposes a graph-theoretic method for constructing valid conditional randomization tests under arbitrary interference between units.
### Main contributions of the paper:
1. **Proposing a general method**: The paper proposes valid and easy-to-implement randomization tests for a general class of null hypotheses under arbitrary interference between units. The key idea is to represent the hypothesis of interest as a bipartite graph between units and assignments and to find an appropriate biclique of this graph. The null hypothesis is sharp within the biclique, so a conditional randomization test can be carried out there.
2. **Linking biclique size to power**: The paper connects the size of the biclique to the statistical power of the resulting test.
3. **Applying off-the-shelf graph clustering methods**: The paper shows that existing graph clustering methods can find such bicliques efficiently, so the approach scales to large data sets.
4. **Examples of practical applications**: The paper illustrates the method in two settings. The first is clustered interference, where it shows advantages over methods designed specifically for that setting. The second is a large-scale policing experiment in Medellin, Colombia, where the interference has a spatial structure.
### Method overview:
- **Null Exposure Graph**: A bipartite graph whose nodes are the units and the assignments. An edge connects a unit to an assignment when the unit's exposure under that assignment belongs to the exposure set specified by the null hypothesis. A biclique is a complete bipartite subgraph: a subset of units and a subset of assignments in which every unit is connected to every assignment.
- **Biclique Decomposition**: The null exposure graph is searched for suitable bicliques. Within a biclique, every selected unit's exposure stays inside the null exposure set under every selected assignment, so the null hypothesis is sharp when restricted to those units and assignments.
- **Conditional Randomization Test**: The randomization test is run conditionally on a biclique, using only its units and assignments. Because the null hypothesis is sharp there, the test statistic can be computed for each assignment in the biclique and compared with its observed value.
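The conditional test described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `biclique_randomization_pvalue`, the two-level exposure coding (1 and 0 for the two exposures in the null set), the absolute difference in means as test statistic, and the convention of returning 0 when one exposure group is empty are all choices made here for illustration.

```python
import numpy as np


def biclique_randomization_pvalue(y, exposure, obs_col, probs=None):
    """Conditional randomization p-value within one biclique (illustrative sketch).

    y        : observed outcomes of the biclique's units, shape (n_units,)
    exposure : array of shape (n_units, n_assignments); entry [i, j] is 1 or 0,
               labelling which of the two null exposures unit i receives under
               assignment j (hypothetical coding)
    obs_col  : column index of the observed assignment
    probs    : design probabilities of the biclique's assignments
               (unnormalized is fine); uniform if None
    """
    y = np.asarray(y, dtype=float)
    exposure = np.asarray(exposure)
    n_assign = exposure.shape[1]
    if probs is None:
        probs = np.ones(n_assign)
    # Condition on the biclique: renormalize the design over its assignments.
    probs = np.asarray(probs, dtype=float) / np.sum(probs)

    def statistic(col):
        a, b = y[col == 1], y[col == 0]
        if len(a) == 0 or len(b) == 0:
            return 0.0  # assumed convention for an empty exposure group
        return abs(a.mean() - b.mean())

    # Under the null, outcomes do not change across the biclique's assignments,
    # so the observed y can be reused for every column.
    stats = np.array([statistic(exposure[:, j]) for j in range(n_assign)])
    at_least_as_extreme = stats >= stats[obs_col] - 1e-12
    return float(np.sum(probs * at_least_as_extreme))


# Toy biclique: 4 units, 4 assignments, observed assignment in column 0.
y = [1.0, 2.0, 3.0, 4.0]
exposure = np.array([
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
])
p_value = biclique_randomization_pvalue(y, exposure, obs_col=0)
print(p_value)
```

In the toy example, two of the four assignments give a statistic at least as large as the observed one, so the p-value under a uniform design is 0.5.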
### Formulas and symbols:
- \( U \): The set of all units.
- \( Z \): The set of all assignments.
- \( f_i(z) \): The exposure function of unit \( i \) under assignment \( z \).
- \( F \): The set of exposures specified by the null hypothesis.
- \( G_F^f=(V, E) \): The null exposure graph, where \( V = U\cup Z \), \( E=\{(i, z)\in U\times Z: f_i(z)\in F\} \).
- \( C=(U', Z') \): A biclique in the null exposure graph, where \( U'\subseteq U \) and \( Z'\subseteq Z \).
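To make the notation concrete, the following Python sketch builds the edge set \( E \) for a toy problem and checks the biclique condition. Everything specific here is assumed for illustration only: three units on a line, a three-level exposure function (`"treated"`, `"spillover"`, `"control"`), the null set \( F \) containing the last two, all \( 2^3 \) binary assignments as \( Z \), and the helper names `is_biclique` and `shared_assignments`.

```python
from itertools import product

import numpy as np

# Hypothetical interference structure: units 0 - 1 - 2 on a line.
neighbors = {0: [1], 1: [0, 2], 2: [1]}


def exposure(i, z):
    """Toy exposure function f_i(z)."""
    if z[i] == 1:
        return "treated"
    if any(z[j] == 1 for j in neighbors[i]):
        return "spillover"
    return "control"


units = sorted(neighbors)                               # U
assignments = list(product([0, 1], repeat=len(units)))  # Z (assumed: all binary vectors)
null_exposures = {"spillover", "control"}               # F

# E[i, j] is True when f_i(z_j) is in F, i.e. (i, z_j) is an edge of the graph.
E = np.array([[exposure(i, z) in null_exposures for z in assignments]
              for i in units])


def is_biclique(E, unit_idx, assign_idx):
    """True if every listed unit is connected to every listed assignment."""
    return bool(E[np.ix_(unit_idx, assign_idx)].all())


def shared_assignments(E, unit_idx):
    """All assignments connected to every listed unit.

    For a fixed unit subset U', this is the largest Z' such that (U', Z')
    is a biclique.
    """
    return np.flatnonzero(E[unit_idx].all(axis=0)).tolist()


focal_units = [0, 2]
focal_assignments = shared_assignments(E, focal_units)
print([assignments[j] for j in focal_assignments])
print(is_biclique(E, focal_units, focal_assignments))
```

With this exposure function an edge exists exactly when the unit itself is untreated, so the assignments shared by units 0 and 2 are those that leave both untreated. A real design would usually restrict \( Z \) to the assignments with positive probability.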
Taken together, these tools give a general framework for randomization-based causal inference in the presence of interference.