Utilizing LLMs for Enhanced Argumentation and Extraction of Causal Knowledge from Scientific Literature

Shuang Wang, Wenjie Chen, Yang Zhang, Ting Chen, Jian Du
DOI: https://doi.org/10.1101/2024.03.20.24304652
2024-10-17
Abstract: Current semantic extraction tools perform poorly at identifying causal relations and neglect variations in argument quality, especially differences in persuasive strength across sentences. This study proposes a causal knowledge mining framework built on five elements (evidence cogency, concept, relation stance, claim-context relevance, and conditional information) and implements it automatically with large language models (LLMs) to improve the understanding of disease causal mechanisms. For cogency evaluation, the fine-tuned Llama2-7b reaches an accuracy of 0.84, substantially exceeding that of GPT-3.5 Turbo with few-shot prompting. For causal extraction, which combines PubTator and ChatGLM, the entity-first, relation-later approach (recall 0.85) outperforms the relation-first, entity-later approach (recall 0.76); it performs well on three external validation sets (a gestational diabetes-relevant dataset and two general biomedical datasets) and aligns entities for subsequent causal graph construction. LLM-enabled scientific causality mining is promising for delineating the causal argument structure and understanding the underlying mechanisms of a given exposure-outcome pair.
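To make the five elements and the recall metric concrete, the following is a minimal Python sketch, not the authors' implementation. The class name `CausalClaim`, its field names, the label strings, and the example triples are all hypothetical; only the five element names and the use of recall come from the abstract.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

# Hypothetical record for one extracted causal claim. The five fields mirror
# the five elements named in the abstract; the value vocabularies used in the
# example below (e.g. "strong", "supports") are assumptions for illustration.
@dataclass(frozen=True)
class CausalClaim:
    cogency: str                      # evidence cogency of the sentence
    concepts: Tuple[str, str]         # (exposure, outcome) concept pair
    stance: str                       # relation stance
    relevance: str                    # claim-context relevance
    condition: Optional[str] = None   # conditional information, if any

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

def recall(predicted: Set[Triple], gold: Set[Triple]) -> float:
    """Fraction of gold-standard triples that the extractor recovered."""
    if not gold:
        return 0.0
    return len(predicted & gold) / len(gold)

# Made-up example data, not taken from the paper.
claim = CausalClaim(
    cogency="strong",
    concepts=("obesity", "gestational diabetes"),
    stance="supports",
    relevance="relevant",
    condition="in women over 35",
)
gold = {
    ("obesity", "causes", "gestational diabetes"),
    ("physical activity", "prevents", "gestational diabetes"),
}
predicted = {
    ("obesity", "causes", "gestational diabetes"),
    ("smoking", "causes", "gestational diabetes"),
}
score = recall(predicted, gold)
print(claim.concepts, score)
```

Here recall is computed over exact-match triples; a real evaluation would first need the entity alignment step that the abstract mentions, which this sketch omits.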
Health Informatics
What problem does this paper attempt to address?