Benchmarking Defeasible Reasoning with Large Language Models -- Initial Experiments and Future Directions

Ilias Tachmazidis, Sotiris Batsakis, Grigoris Antoniou
2024-10-16
Abstract: Large Language Models (LLMs) have gained prominence in the AI landscape due to their exceptional performance. It is therefore essential to better understand their capabilities and limitations, including in nonmonotonic reasoning. This paper proposes a benchmark that covers various defeasible rule-based reasoning patterns. We adapted an existing benchmark for defeasible logic reasoners by translating its defeasible rules into text suitable for LLMs. We conducted preliminary experiments on nonmonotonic rule-based reasoning using ChatGPT and compared its reasoning with the reasoning patterns defined by defeasible logic.
Artificial Intelligence