Omitted Labels in Causality: A Study of Paradoxes

Bijan Mazaheri, Siddharth Jain, Matthew Cook, Jehoshua Bruck
2024-05-23
Abstract: We explore what we call "omitted label contexts," in which training data is limited to a subset of the possible labels. This setting is common among specialized human experts or specific focused studies. We lean on well-studied paradoxes (Simpson's and Condorcet) to illustrate the more general difficulties of causal inference in omitted label contexts. Contrary to the fundamental principles on which much of causal inference is built, we show that "correct" adjustments sometimes require non-exchangeable treatment and control groups. These pitfalls lead us to study networks of conclusions drawn from different contexts and the structures they form, proving an interesting connection between these networks and social choice theory.
Machine Learning, Artificial Intelligence, Information Theory, Social and Information Networks, Methodology
What problem does this paper attempt to address?
The paper addresses the challenges that "omitted label contexts" pose for causal inference. Specifically:

1. **Bias in Causal Inference**: In practice, data often covers only a subset of the possible labels rather than all of them. In such cases, traditional causal inference methods may fail or produce misleading results.
2. **Irreversibility Issue**: When certain labels have probability zero in the data, no reweighting can restore the general population distribution, so causal effects become unrecoverable.
3. **Simpson's Paradox and Condorcet Paradox**: The paper uses both paradoxes to expose counterintuitive behavior of causal inference in omitted label contexts. Simpson's paradox shows that the direction of a treatment effect can reverse depending on whether a covariate is adjusted for, while the Condorcet paradox illustrates how conclusions from different studies can contradict one another within a network structure.
4. **Decision Fusion**: The paper also discusses how to combine conclusions drawn from different models and shows that the structures these conclusions form in a network correspond to phenomena in social choice theory. This is relevant to future efforts to integrate information from large models.

In summary, the paper aims to reveal the complexity of causal inference in omitted label contexts and proposes a new perspective for understanding the relationships between different research conclusions.
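The Condorcet paradox mentioned above can be made concrete with a minimal Python sketch. It uses the textbook three-voter profile, not data or a construction from the paper, and the names `BALLOTS`, `pairwise_winner`, `majority_edges`, and `has_cycle` are hypothetical. Each pairwise comparison looks at only two of the three labels, loosely in the spirit of a study run in an omitted label context; how closely this matches the paper's formal setup is an assumption.

```python
from itertools import combinations

# Hypothetical three-voter profile (the textbook Condorcet example,
# not data from the paper). Each tuple ranks labels from best to worst.
BALLOTS = [
    ("A", "B", "C"),
    ("B", "C", "A"),
    ("C", "A", "B"),
]


def pairwise_winner(ballots, x, y):
    """Return the label ranked higher on a strict majority of ballots, or None on a tie."""
    x_votes = sum(1 for b in ballots if b.index(x) < b.index(y))
    y_votes = len(ballots) - x_votes
    if x_votes == y_votes:
        return None
    return x if x_votes > y_votes else y


def majority_edges(ballots):
    """Build directed edges (winner, loser), one per pair of labels with a strict winner."""
    labels = sorted(ballots[0])
    edges = set()
    for x, y in combinations(labels, 2):
        winner = pairwise_winner(ballots, x, y)
        if winner is not None:
            loser = y if winner == x else x
            edges.add((winner, loser))
    return edges


def has_cycle(edges):
    """Detect a directed cycle in the network of pairwise conclusions via depth-first search."""
    graph = {}
    for winner, loser in edges:
        graph.setdefault(winner, []).append(loser)
        graph.setdefault(loser, [])
    state = {}  # node -> "visiting" while on the DFS stack, "done" afterwards

    def visit(node):
        state[node] = "visiting"
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                return True  # back edge: the conclusions contradict each other
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(visit(n) for n in graph if n not in state)


edges = majority_edges(BALLOTS)
print(sorted(edges))     # [('A', 'B'), ('B', 'C'), ('C', 'A')]
print(has_cycle(edges))  # True
```

Every pairwise conclusion here is individually well supported (a 2-to-1 majority), yet together they form the cycle A over B, B over C, C over A, so no consistent overall ranking exists. This is the kind of structure that can appear when conclusions drawn from different restricted contexts are assembled into one network.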