Can ChatGPT Make Explanatory Inferences? Benchmarks for Abductive Reasoning

Paul Thagard
2024-09-19
Abstract: Explanatory inference is the creation and evaluation of hypotheses that provide explanations, and is sometimes known as abduction or abductive inference. Generative AI is a new set of artificial intelligence models based on novel algorithms for generating text, images, and sounds. This paper proposes a set of benchmarks for assessing the ability of AI programs to perform explanatory inference, and uses them to determine the extent to which ChatGPT, a leading generative AI model, is capable of making explanatory inferences. Tests on the benchmarks reveal that ChatGPT performs creative and evaluative inferences in many domains, although it is limited to verbal and visual modalities. Claims that ChatGPT and similar models are incapable of explanation, understanding, causal reasoning, meaning, and creativity are rebutted.
Artificial Intelligence
What problem does this paper attempt to address?
The paper examines whether generative AI can perform explanatory inference. The author proposes a set of benchmarks and uses them to evaluate how well ChatGPT, a leading generative AI model, makes both creative and evaluative explanatory inferences. Through case studies across multiple domains, including science, medicine, law, and technology, the paper tests ChatGPT's performance and rebuts claims that generative models are incapable of explanation, understanding, and causal reasoning. The central question is whether ChatGPT can make explanatory inferences effectively across these domains. The reported tests indicate that ChatGPT can both generate new hypotheses and evaluate them to reach reasonable conclusions. The paper also discusses how ChatGPT handles different modalities of information (such as visual and verbal), identifies gaps between its capabilities and human cognition, and points to directions for future improvement.