Automating psychological hypothesis generation with AI: when large language models meet causal graph

Song Tong, Kai Mao, Zhen Huang, Yukun Zhao, Kaiping Peng

DOI: https://doi.org/10.31234/osf.io/7ck9m

2024-07-16

Abstract: Leveraging the synergy between causal knowledge graphs and a large language model (LLM), our study introduces a groundbreaking approach for computational hypothesis generation in psychology. We analyzed 43,312 psychology articles using an LLM to extract causal relation pairs. This analysis produced a specialized causal graph for psychology. Applying link prediction algorithms, we generated 130 potential psychological hypotheses focusing on "well-being", then compared them against research ideas conceived by doctoral scholars and those produced solely by the LLM. Interestingly, our combined approach of an LLM and causal graphs mirrored the expert-level insights in terms of novelty, clearly surpassing the LLM-only hypotheses (t(59) = 3.34, p = 0.007 and t(59) = 4.32, p < 0.001, respectively). This alignment was further corroborated using deep semantic analysis. Our results show that combining LLMs with machine learning techniques such as causal knowledge graphs can revolutionize automated discovery in psychology, extracting novel insights from the extensive literature. This work stands at the crossroads of psychology and artificial intelligence, championing a new enriched paradigm for data-driven hypothesis generation in psychological research.

Artificial Intelligence, Computers and Society

What problem does this paper attempt to address?

The paper addresses the automation of hypothesis generation in psychological research by combining large language models (LLMs) with causal graphs. The research team carried out three steps:

1. **Literature Analysis**: Collected 43,312 psychology-related articles from public databases and used a large language model to extract causal relationship pairs from them, building a specialized causal graph for psychology.
2. **Hypothesis Generation**: Applied link prediction algorithms to the causal graph to generate potential psychological hypotheses, focusing on hypotheses related to "well-being."
3. **Comparative Evaluation**: Compared the generated hypotheses with research ideas conceived by doctoral students and with hypotheses generated solely by a large language model, to assess their novelty and practicality.

The results showed that the method combining a large language model with causal graphs matched expert-level novelty and significantly outperformed the hypotheses generated by the large language model alone. In-depth semantic analysis further confirmed the method's advantages in conceptual integration and semantic scope.

The study demonstrates that causal knowledge can be extracted from a large body of literature, and it offers a new tool and methodology for data-driven hypothesis generation in psychology. By combining traditional theory-driven research with emerging data-centric paradigms, it enriches the understanding of psychological factors, particularly in social psychology.
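The hypothesis-generation step can be illustrated with a minimal sketch of link prediction on a small causal graph. Everything in it is an assumption made for illustration: the concept pairs are invented toy data rather than pairs extracted in the study, the function names are hypothetical, and Jaccard similarity over shared neighbors is only one common link prediction heuristic, since the summary does not say which algorithms the authors used.

```python
# Hypothetical toy causal pairs (cause, effect). Invented for illustration;
# these are NOT pairs extracted from the 43,312 articles.
CAUSAL_PAIRS = [
    ("gratitude", "social support"),
    ("gratitude", "positive affect"),
    ("exercise", "positive affect"),
    ("exercise", "sleep quality"),
    ("social support", "well-being"),
    ("positive affect", "well-being"),
    ("sleep quality", "positive affect"),
]


def build_neighbors(pairs):
    """Map each concept to the set of concepts it is directly linked to.

    Edge direction is ignored here, which is a simplifying assumption.
    """
    neighbors = {}
    for cause, effect in pairs:
        neighbors.setdefault(cause, set()).add(effect)
        neighbors.setdefault(effect, set()).add(cause)
    return neighbors


def jaccard(neighbors, a, b):
    """Jaccard similarity of the neighbor sets of concepts a and b."""
    union = neighbors[a] | neighbors[b]
    if not union:
        return 0.0
    return len(neighbors[a] & neighbors[b]) / len(union)


def predict_links(pairs, target, top_k=3):
    """Rank concepts not yet linked to `target` by neighbor overlap.

    Each returned (concept, score) pair is a candidate hypothesis of the
    form "concept is causally related to target".
    """
    neighbors = build_neighbors(pairs)
    candidates = [
        concept
        for concept in neighbors
        if concept != target and concept not in neighbors[target]
    ]
    scored = [(c, jaccard(neighbors, c, target)) for c in candidates]
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


if __name__ == "__main__":
    for concept, score in predict_links(CAUSAL_PAIRS, "well-being"):
        print(f"{concept} -> well-being (score {score:.2f})")
```

In this toy graph, "gratitude" ranks first because it shares both of its neighbors with "well-being" while having no direct edge to it. A missing but plausible link of this kind is what a link prediction step would surface as a candidate hypothesis for researchers to evaluate.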

Causal Reasoning and Large Language Models: Opening a New Frontier for Causality

Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models

Large Language Models for Constrained-Based Causal Discovery

Can large language models build causal graphs?

From Query Tools to Causal Architects: Harnessing Large Language Models for Advanced Causal Discovery from Data

An Explainable AI Approach to Large Language Model Assisted Causal Model Auditing and Development

Causality for Large Language Models

Large Language Models are Effective Priors for Causal Graph Discovery

Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey

Using GPT-4 to guide causal machine learning

Causal Reasoning in Large Language Models: A Knowledge Graph Approach

Efficient Causal Graph Discovery Using Large Language Models

Improving Causal Reasoning in Large Language Models: A Survey

Zero-shot Causal Graph Extrapolation from Text via LLMs

Counterfactual Causal Inference in Natural Language with Large Language Models

Integrating Large Language Models in Causal Discovery: A Statistical Causal Approach

Hypothesis Generation with Large Language Models

CausalChat: Interactive Causal Model Development and Refinement Using Large Language Models

Large Language Model for Causal Decision Making

Causal Dataset Discovery with Large Language Models