Abstract: Background: The field of Artificial Intelligence (AI) has seen a major shift in recent years due to the development of new Machine Learning (ML) models such as the Generative Pre-trained Transformer (GPT). GPT has achieved previously unheard-of levels of accuracy in most computerized language processing tasks and in their chat-based variations. Aim: The aim of this study was to investigate the problem-solving abilities of ChatGPT using two sets of verbal insight problems, with a known performance level established by a sample of human participants. Materials and methods: A total of 30 problems, labeled as "practice problems" and "transfer problems", were administered to ChatGPT. ChatGPT's answers received a score of "0" for each incorrectly answered problem and a score of "1" for each correct response. The highest possible score was 15 out of 15 for the practice problems and 15 out of 15 for the transfer problems. The solution rate for each problem (based on a sample of 20 subjects) was used to assess and compare the performance of ChatGPT with that of human subjects. Results: The study highlighted that ChatGPT can be trained in out-of-the-box thinking and demonstrated potential in solving verbal insight problems. The global performance of ChatGPT equalled the most probable outcome for the human sample in the practice problems, in the transfer problems, and in the two sets combined. Additionally, ChatGPT's answer combinations were among the 5% most probable outcomes for the human sample, both for the practice problems and for the pooled problem sets. These findings demonstrate that ChatGPT's performance on both sets of problems was in line with the mean success rate of human subjects, indicating that it performed reasonably well. Conclusions: The use of transformer architecture and self-attention in ChatGPT may have helped to prioritize inputs while predicting, contributing to its potential in verbal insight problem-solving.
ChatGPT has shown potential in solving insight problems, thus highlighting the importance of incorporating AI into psychological research. However, open challenges remain, and further research is required to fully understand AI's capabilities and limitations in verbal problem-solving.
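One way to read the comparison described in the abstract, where a total score is set against the "most probable outcome" implied by per-problem human solution rates, is sketched below. This is a minimal illustration under stated assumptions, not the authors' actual procedure: it assumes problems are solved independently of one another, the function names are hypothetical, and the solution rates and answers in the usage lines are invented placeholders rather than data from the study.

```python
def score_distribution(solution_rates):
    """Return P(total score = k) for k = 0..n.

    Assumes each problem is solved independently with its own
    probability (a Poisson binomial distribution), built up one
    problem at a time by dynamic programming.
    """
    dist = [1.0]  # with zero problems, the score is 0 with certainty
    for p in solution_rates:
        nxt = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            nxt[k] += prob * (1.0 - p)  # this problem answered incorrectly
            nxt[k + 1] += prob * p      # this problem answered correctly
        dist = nxt
    return dist


def most_probable_score(dist):
    """Return the total score with the highest probability (first one on ties)."""
    return max(range(len(dist)), key=dist.__getitem__)


def pattern_probability(solution_rates, answers):
    """Probability of one specific 0/1 answer pattern under the same assumption."""
    prob = 1.0
    for p, a in zip(solution_rates, answers):
        prob *= p if a == 1 else (1.0 - p)
    return prob


# Hypothetical solution rates (fraction of human subjects solving each problem).
rates = [0.25, 0.60, 0.45, 0.80, 0.10]
# Hypothetical 0/1 scores for the model on the same problems.
model_answers = [0, 1, 1, 1, 0]

dist = score_distribution(rates)
print("most probable human total:", most_probable_score(dist))
print("model total:", sum(model_answers))
print("probability of model's answer pattern:", pattern_probability(rates, model_answers))
```

Ranking all possible answer patterns by `pattern_probability` would be one way to check whether a given pattern falls among the most probable outcomes, although enumerating every pattern for 15 or 30 problems is costly and the study may have used a different approach.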

Challenging large language models' "intelligence" with human tools: A neuropsychological investigation in Italian language on prefrontal functioning

How to Measure the Intelligence of Large Language Models?

Artificial Neuropsychology: Are Large Language Models Developing Executive Functions?

Thinking Fast and Slow in Large Language Models

Testing theory of mind in large language models and humans

Large Language Models and the Reverse Turing Test

GPT-4 Surpassing Human Performance in Linguistic Pragmatics

Human-like problem-solving abilities in large language models using ChatGPT

Do large language models show decision heuristics similar to humans? A case study using GPT-3.5.

LLM4DS: Evaluating Large Language Models for Data Science Code Generation

Cognitive Effects in Large Language Models

Language models and psychological sciences

Assessing the nature of large language models: A caution against anthropocentrism

Evaluating Large Language Models in Theory of Mind Tasks

Are Large Language Models Strategic Decision Makers? A Study of Performance and Bias in Two-Player Non-Zero-Sum Games

Show, Don't Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay

Using cognitive psychology to understand GPT-3

LLM Cognitive Judgements Differ From Human

Testing AI on language comprehension tasks reveals insensitivity to underlying meaning

Human-like intuitive behavior and reasoning biases emerged in large language models but disappeared in ChatGPT

Can large language models help predict results from a complex behavioural science study?