A p-value for Process Tracing and other N=1 Studies

Matias Lopez, Jake Bowers
2023-10-21
Abstract: The paper introduces a \(p\)-value that summarizes the evidence against a rival causal theory that explains an observed outcome in a single case. We show how to represent the probability distribution characterizing a theorized rival hypothesis (the null) in the absence of randomization of treatment and when relying on qualitative data, for instance when conducting process tracing. As in Fisher's (1935) original design, our \(p\)-value indicates how frequently one would find the same observations, or even more favorable ones, under a theory that is compatible with our observations but antagonistic to the working hypothesis. We also present an extension that allows researchers to assess the sensitivity of their results to confirmation bias. Finally, we illustrate the application of our hypothesis test using the study by Snow (1855) on the cause of cholera in Soho, a classic in process tracing, epidemiology, and microbiology. Our framework suits any type of case study and evidence, such as data from interviews, archives, or participant observation.
Methodology, Other Statistics
What problem does this paper attempt to address?
The paper addresses how to use p-values to evaluate causal explanations in single-case studies when no randomized experiment is available. The authors propose a method for summarizing the evidence against a competing causal theory that also explains the observed outcome; this rival theory plays the role of the null hypothesis. The method does not require randomized treatment and applies to studies that rely on qualitative data, such as process tracing. The main contribution is twofold: the paper shows how to represent the probability distribution implied by a competing hypothesis when the data come from sources such as interviews, archives, or participant observation, and it provides a conservative p-value that measures how strongly the evidence favors the working hypothesis over that rival. In addition, the paper proposes a sensitivity analysis that assesses how much the findings depend on confirmation bias.
The paper illustrates the framework by applying it to Snow's (1855) classic study on the cause of cholera. Snow's theory held that cholera was caused by contaminated water, and his strongest evidence came from a series of interviews, public registers, and a map. His conclusions were nonetheless questioned at the time, because no test was available to refute the then-popular miasma theory.
Structurally, the paper begins with an introduction that outlines the research motivation and background. The second part defines the main concepts in causal inference and sets out the problem of inferring unobserved counterfactuals in single-case studies. The third part introduces a null model that generates the null hypothesis used to weigh the evidence for the working hypothesis against competing explanations.
Finally, the fourth part discusses how to assess the sensitivity of the results to bias in how observations are collected. It introduces a biased null model based on the non-central hypergeometric distribution, in which some types of evidence are easier to obtain than others, and which therefore better reflects the conditions of real-world research.
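The urn logic behind such a test can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes Fisher's variant of the non-central hypergeometric distribution (the paper may use a different variant, such as Wallenius'), and the function name, argument names, and all counts are hypothetical. The null model is treated as an urn holding pieces of evidence that support the working theory and pieces that support the rival theory; the p-value is the probability of drawing at least as many supporting pieces as were actually observed. Setting the odds to 1 gives the unbiased (central) hypergeometric case, and odds above 1 make supporting evidence easier to find, which mimics confirmation bias.

```python
from math import comb


def biased_urn_p_value(k_obs, n_support, n_rival, n_drawn, odds=1.0):
    """Upper-tail p-value P(X >= k_obs) under an urn null model.

    Assumption: Fisher's non-central hypergeometric distribution.
    k_obs     -- observed number of pieces supporting the working theory
    n_support -- pieces in the urn that support the working theory
    n_rival   -- pieces in the urn that support the rival theory
    n_drawn   -- total number of pieces the researcher observed
    odds      -- relative ease of finding supporting evidence (1.0 = no bias)
    """
    # Feasible numbers of supporting pieces among the n_drawn observations.
    lowest = max(0, n_drawn - n_rival)
    highest = min(n_drawn, n_support)

    # Unnormalized probability weight of each feasible count x.
    weights = {
        x: comb(n_support, x) * comb(n_rival, n_drawn - x) * odds ** x
        for x in range(lowest, highest + 1)
    }
    total = sum(weights.values())

    # Probability of a result at least as favorable as the observed one.
    tail = sum(w for x, w in weights.items() if x >= k_obs)
    return tail / total


# Hypothetical example: the urn holds 5 supporting and 5 rival pieces,
# and all 4 observed pieces support the working theory.
p_unbiased = biased_urn_p_value(4, n_support=5, n_rival=5, n_drawn=4)
p_biased = biased_urn_p_value(4, n_support=5, n_rival=5, n_drawn=4, odds=3.0)
print(round(p_unbiased, 4))  # 0.0238
print(round(p_biased, 4))    # 0.1441
```

In this made-up example the p-value rises from about 0.02 to about 0.14 once supporting evidence is assumed to be three times easier to find, which is the kind of comparison a sensitivity analysis would report: the larger the bias a result can absorb before the p-value stops being small, the more robust the conclusion.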