Causal Responsibility Attribution for Human-AI Collaboration

Yahang Qi, Bernhard Schölkopf, Zhijing Jin
2024-11-06
Abstract: As Artificial Intelligence (AI) systems increasingly influence decision-making across various fields, the need to attribute responsibility for undesirable outcomes has become essential, though complicated by the complex interplay between humans and AI. Existing attribution methods based on actual causality and Shapley values tend to disproportionately blame agents who contribute more to an outcome and rely on real-world measures of blameworthiness that may misalign with responsible AI standards. This paper presents a causal framework using Structural Causal Models (SCMs) to systematically attribute responsibility in human-AI systems, measuring overall blameworthiness while employing counterfactual reasoning to account for agents' expected epistemic levels. Two case studies illustrate the framework's adaptability in diverse human-AI collaboration scenarios.
Artificial Intelligence, Human-Computer Interaction, Applications
What problem does this paper attempt to address?
This paper addresses how to systematically attribute responsibility in human-AI collaborative decision-making systems when adverse outcomes occur. As AI systems gain influence across many fields, clearly defining and attributing responsibility for failures has become crucial. Shared decision-making between humans and algorithms complicates traditional accountability mechanisms. Humans can override or modify AI-driven decisions, which makes outcomes less predictable. AI decisions, in turn, usually rely on large-scale datasets or complex models that lack full transparency, making it difficult for humans to understand and predict AI behavior. The paper therefore proposes a causal framework based on Structural Causal Models (SCMs) that attributes responsibility systematically, measures overall blameworthiness, and uses counterfactual reasoning to account for the epistemic level agents are expected to have. Two case studies illustrate the framework's adaptability across different human-AI collaboration scenarios.
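To make the idea of counterfactual reasoning in an SCM concrete, here is a minimal sketch of a toy human-AI system. It is not the paper's attribution measure: the variables (`ai_rec`, `final`, `bad`), the noise terms (`u_ai_error`, `u_h_override`), the structural equations, and the simple but-for test are all assumptions made up for illustration. The sketch fixes the exogenous noise that produced an observed outcome, intervenes on one agent's action, and checks whether the bad outcome would have been avoided.

```python
# Toy SCM (hypothetical, for illustration only; not the paper's model).
# Exogenous noise: u_ai_error (AI recommends wrongly), u_h_override (human overrides).
# The correct action is assumed to be 1; any other final action is a bad outcome.

def simulate(u_ai_error, u_h_override, do=None):
    """Evaluate the structural equations, optionally under an intervention `do`."""
    do = do or {}
    # AI recommendation: wrong (0) when the AI errs, otherwise correct (1).
    ai_rec = do.get("ai_rec", 0 if u_ai_error else 1)
    # Human's final action: flips the recommendation on override, else follows it.
    if "final" in do:
        final = do["final"]
    else:
        final = 1 - ai_rec if u_h_override else ai_rec
    # Outcome: bad whenever the final action is not the correct one.
    bad = int(final != 1)
    return {"ai_rec": ai_rec, "final": final, "bad": bad}


def but_for(u, variable):
    """But-for test: with the noise u held fixed, would flipping `variable`
    have avoided a bad outcome that actually occurred?"""
    factual = simulate(**u)
    flipped = 1 - factual[variable]
    counterfactual = simulate(**u, do={variable: flipped})
    return factual["bad"] == 1 and counterfactual["bad"] == 0


# Scenario A: the AI errs and the human follows the recommendation.
scenario_a = {"u_ai_error": 1, "u_h_override": 0}
# Scenario B: the AI is correct but the human overrides it.
scenario_b = {"u_ai_error": 0, "u_h_override": 1}

for name, u in [("A", scenario_a), ("B", scenario_b)]:
    print(name, {v: but_for(u, v) for v in ("ai_rec", "final")})
```

In this toy model the but-for test flags both agents in both scenarios, including scenario B, where the AI's recommendation was correct. That suggests why a purely causal test may not settle blameworthiness on its own, and why a framework might additionally consider what each agent could reasonably have been expected to know.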