Abstract: Big data and algorithmic risk prediction tools promise to improve criminal justice systems by reducing human biases and inconsistencies in decision making. Yet different, equally-justifiable choices when developing, testing, and deploying these sociotechnical tools can lead to disparate predicted risk scores for the same individual. Synthesizing diverse perspectives from machine learning, statistics, sociology, criminology, law, philosophy and economics, we conceptualize this phenomenon as predictive inconsistency. We describe sources of predictive inconsistency at different stages of algorithmic risk assessment tool development and deployment and consider how future technological developments may amplify predictive inconsistency. We argue, however, that in a diverse and pluralistic society we should not expect to completely eliminate predictive inconsistency. Instead, to bolster the legal, political, and scientific legitimacy of algorithmic risk prediction tools, we propose identifying and documenting relevant and reasonable "forking paths" to enable quantifiable, reproducible multiverse and specification curve analyses of predictive inconsistency at the individual level.

What problem does this paper attempt to address?

The paper addresses predictive inconsistency in Algorithmic Risk Assessment Instruments (ARAIs) used in the criminal justice system. These tools aim to improve the system by reducing human bias and inconsistency in decision-making. However, different choices in the development, testing, and deployment of these sociotechnical tools, even when each choice is reasonable and well-founded, can lead to different predicted risk scores for the same person. The paper calls this phenomenon predictive inconsistency. The authors argue that in a diverse and pluralistic society we should not expect to eliminate predictive inconsistency completely. Instead, they propose identifying and documenting the relevant and reasonable "forking paths" so that predictive inconsistency can be examined at the individual level through quantifiable, reproducible multiverse and specification curve analyses, thereby bolstering the legal, political, and scientific legitimacy of algorithmic risk prediction tools.

### Main contributions of the paper

1. **Collecting and organizing relevant literature**: Issues related to ARAIs are collected and organized from machine learning, statistics, sociology, criminology, law, economics, and philosophy.
2. **Introducing the concept of predictive inconsistency**: The paper defines predictive inconsistency and explains why it occurs.
3. **Identifying and classifying the sources of predictive inconsistency**: The paper describes in detail the specific sources of predictive inconsistency at different stages of ARAI development and deployment.
4. **Proposing multiverse and specification curve analysis techniques**: The paper suggests adding these techniques to ARAI development and audit toolkits to estimate a lower bound on predictive inconsistency and to reveal the multiple sources of discretion and their impact on predicted risk scores.
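
To make the fourth contribution concrete, here is a minimal Python sketch of how a specification curve and an inconsistency range might be summarized for one individual. It is not the authors' implementation. The function names `specification_curve` and `inconsistency_range`, the specification labels, and the scores are all hypothetical and chosen only for illustration. The range is a lower bound because it covers only the specifications that were actually enumerated.

```python
def specification_curve(scores):
    """Return (specification, score) pairs sorted by score, lowest first.

    `scores` maps a specification label to the risk score that
    specification predicts for ONE individual.
    """
    return sorted(scores.items(), key=lambda item: item[1])


def inconsistency_range(scores):
    """Spread between the highest and lowest predicted score.

    This is only a lower bound on predictive inconsistency: any
    reasonable specification that was not enumerated could widen it.
    """
    values = list(scores.values())
    return max(values) - min(values)


# Hypothetical scores for one person under three defensible specifications.
example_scores = {
    "4-year window, all predictors": 0.75,
    "2-year window, all predictors": 0.5,
    "2-year window, age only": 0.25,
}

curve = specification_curve(example_scores)
spread = inconsistency_range(example_scores)

for label, score in curve:
    print(f"{score:.2f}  {label}")
print(f"range across specifications: {spread:.2f}")
```

Plotting the sorted scores against their specification labels gives the usual specification curve, and the printed range summarizes how far the same person's score moves across the enumerated choices.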
### Background of predictive inconsistency

- **Legal consistency**: In law, consistency refers to the uniformity and coherence of judicial decisions, which helps protect citizens from arbitrary and biased legislation.
- **Predictive inconsistency**: In science and algorithm development, inconsistency is usually regarded as undesirable, but in the criminal justice system a certain degree of inconsistency is tolerable, or even expected, given the diversity of society.

### ARAI construction process

1. **Data source selection**: Determine the data source, usually court or administrative records.
2. **Defining and operationalizing the target variable**: For example, define recidivism as a new conviction within four years of release.
3. **Selecting prediction variables**: Select predictors based on the available data and on theoretical or policy considerations.
4. **Model construction and evaluation**: Construct multiple models and evaluate their predictive accuracy and risk distributions.
5. **Model adjustment and deployment**: Adjust the model to fit institutional resource constraints and deploy it in practice.

By laying out these stages, the authors aim to improve the understanding of predictive inconsistency and thereby enhance the legitimacy and reliability of ARAIs in high-stakes public settings.
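
Each construction step above is a point where reasonable developers can choose differently. The Python sketch below shows one way such forking paths could be enumerated and applied to a single individual. It is a toy under stated assumptions, not the authors' method: the records in `RECORDS`, the two forks in `FORKS`, and the peer-group base-rate scorer `risk_score` are all hypothetical stand-ins for real data and real models.

```python
from itertools import product

# Hypothetical toy records: (age band, prior-record band,
# years until reconviction, or None if never reconvicted).
RECORDS = [
    ("young", "many", 1),
    ("young", "many", 3),
    ("young", "few", None),
    ("young", "few", 3),
    ("old", "many", 1),
    ("old", "many", 3),
    ("old", "few", None),
    ("old", "few", None),
]

# Hypothetical forking paths: one per construction decision.
FORKS = {
    "follow_up_years": [2, 4],                     # target definition
    "predictors": [("age",), ("age", "priors")],   # predictor selection
}

COLUMN = {"age": 0, "priors": 1}


def risk_score(person, follow_up_years, predictors):
    """Toy scorer: reconviction rate among records matching the person
    on the chosen predictors, within the chosen follow-up window."""
    peers = [
        r for r in RECORDS
        if all(r[COLUMN[p]] == person[COLUMN[p]] for p in predictors)
    ]
    if not peers:
        raise ValueError("no comparable records for this person")
    hits = sum(1 for r in peers if r[2] is not None and r[2] <= follow_up_years)
    return hits / len(peers)


def multiverse(person):
    """Score one person under every combination of forking paths."""
    names = list(FORKS)
    results = {}
    for combo in product(*(FORKS[name] for name in names)):
        results[combo] = risk_score(person, **dict(zip(names, combo)))
    return results


person = ("young", "many")
scores = multiverse(person)
for combo, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{score:.2f}  follow_up={combo[0]}y  predictors={combo[1]}")
```

In this toy data the same person's score ranges from 0.25 to 1.00 depending only on the follow-up window and the predictor set, which is the kind of individual-level spread that documenting forking paths is meant to expose.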

Forks Over Knives: Predictive Inconsistency in Criminal Justice Algorithmic Risk Assessment Tools

Almost Politically Acceptable Criminal Justice Risk Assessment

A statistical framework for fair predictive algorithms

Interventions over Predictions: Reframing the Ethical Debate for Actuarial Risk Assessment

Improving Fairness in Criminal Justice Algorithmic Risk Assessments Using Conformal Prediction Sets

Improving Fairness in Criminal Justice Algorithmic Risk Assessments Using Optimal Transport and Conformal Prediction Sets

Fairness Deconstructed: A Sociotechnical View of 'Fair' Algorithms in Criminal Justice

Uncertainty in Criminal Justice Algorithms: simulation studies of the Pennsylvania Additive Classification Tool

Accuracy and Fairness for Juvenile Justice Risk Assessments

The age of secrecy and unfairness in recidivism prediction

Predicting risk in criminal procedure: actuarial tools, algorithms, AI and judicial decision-making

Feedback Effects in Repeat-Use Criminal Risk Assessments

An algorithm for removing sensitive information: application to race-independent recidivism prediction

Pursuing Open-Source Development of Predictive Algorithms: The Case of Criminal Sentencing Algorithms

Algorithmic Bias in Recidivism Prediction: A Causal Perspective

Human Perceptions of Fairness in Algorithmic Decision Making: A Case Study of Criminal Risk Prediction

Achieving Fairness through Adversarial Learning: an Application to Recidivism Prediction

From fair predictions to just decisions? Conceptualizing algorithmic fairness and distributive justice in the context of data-driven decision-making

Counterfactual risk assessments, evaluation, and fairness

People Perceive Algorithmic Assessments as Less Fair and Trustworthy Than Identical Human Assessments

Technological Tethereds: Potential Impact of Untrustworthy Artificial Intelligence in Criminal Justice Risk Assessment Instruments