Abstract: Can machine learning improve human decision making? Bail decisions provide a good test case. Millions of times each year, judges make jail-or-release decisions that hinge on a prediction of what a defendant would do if released. The concreteness of the prediction task combined with the volume of data available makes this a promising machine-learning application. Yet comparing the algorithm to judges proves complicated. First, the available data are generated by prior judge decisions. We only observe crime outcomes for released defendants, not for those judges detained. This makes it hard to evaluate counterfactual decision rules based on algorithmic predictions. Second, judges may have a broader set of preferences than the variable the algorithm predicts; for instance, judges may care specifically about violent crimes or about racial inequities. We deal with these problems using different econometric strategies, such as quasi-random assignment of cases to judges. Even accounting for these concerns, our results suggest potentially large welfare gains: one policy simulation shows crime reductions up to 24.7% with no change in jailing rates, or jailing rate reductions up to 41.9% with no increase in crime rates. Moreover, all categories of crime, including violent crimes, show reductions; and these gains can be achieved while simultaneously reducing racial disparities. These results suggest that while machine learning can be valuable, realizing this value requires integrating these tools into an economic framework: being clear about the link between predictions and decisions; specifying the scope of payoff functions; and constructing unbiased decision counterfactuals.

JEL Codes: C10 (Econometric and statistical methods and methodology), C55 (Large datasets: Modeling and analysis), K40 (Legal procedure, the legal system, and illegal behavior).

How Aligned are Generative Models to Humans in High-Stakes Decision-Making?

Investigating Human + Machine Complementarity for Recidivism Predictions

In Pursuit of Interpretable, Fair and Accurate Machine Learning for Criminal Recidivism Prediction

Accuracy, Fairness, and Interpretability of Machine Learning Criminal Recidivism Models

Interpretable Classification Models for Recidivism Prediction

LLM Voting: Human Choices and AI Collective Decision Making

The Application of Machine Learning to a General Risk–Need Assessment Instrument in the Prediction of Criminal Recidivism

Achieving Fairness through Adversarial Learning: an Application to Recidivism Prediction

Human Decisions and Machine Predictions

Modeling Legal Reasoning: LM Annotation at the Edge of Human Agreement

The Moral Turing Test: Evaluating Human-LLM Alignment in Moral Decision-Making

Evaluating and Mitigating Discrimination in Language Model Decisions

Predicting Recidivism With Machine Learning: An Analysis of Risk Factors and Proposal of Preventions

A Comparative User Study of Human Predictions in Algorithm-Supported Recidivism Risk Assessment

Can Language Models Use Forecasting Strategies?

Alignment Between the Decision-Making Logic of LLMs and Human Cognition: A Case Study on Legal LLMs

Reducing race-based bias and increasing recidivism prediction accuracy by using past criminal history details

Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function

On Predicting Recidivism: Epistemic Risk, Tradeoffs, and Values in Machine Learning

Exploring the psychology of LLMs' Moral and Legal Reasoning

Fair-by-design explainable models for prediction of recidivism