Abstract: The Semantic MEDLINE Database (SemMedDB) has limited performance in identifying entities and relations, and it neglects variations in argument quality, especially persuasive strength, across different sentences. The present study aims to use large language models (LLMs) to evaluate the contextual argument quality of triples in SemMedDB to improve the understanding of disease mechanisms. Using argument mining methods, we first design a quality evaluation framework with four major dimensions (triple accuracy, triple-sentence correlation, research object, and evidence cogency) to evaluate the argument quality of each triple-based claim according to its contextual sentences. We then select a sample of 66 triple-sentence pairs for repeated annotation and framework optimization. GPT-3.5 and GPT-4 perform well, with an accuracy of up to 0.90 in the complex cogency evaluation task. A tentative case study evaluating whether there is an association between gestational diabetes and periodontitis also yields accurate predictions (GPT-4 accuracy: 0.88).
LLM-enabled argument quality evaluation is promising for evidence integration in understanding disease mechanisms, especially for showing how evidence for two stances, with varying levels of cogency, evolves over time.

### Competing Interest Statement

The authors have declared no competing interest.

### Funding Statement

This study was funded by the National Key R&D Program for Young Scientists (Project number 2022YFF0712000 to JD) and the National Natural Science Foundation of China (Project number 72074006 to JD).

### Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes

All data produced in the present study are available upon reasonable request to the authors.

Utilizing LLMs for Enhanced Argumentation and Extraction of Causal Knowledge from Scientific Literature

From Query Tools to Causal Architects: Harnessing Large Language Models for Advanced Causal Discovery from Data

LLM4Causal: Democratized Causal Tools for Everyone via Large Language Model

Causal Reasoning and Large Language Models: Opening a New Frontier for Causality

CausalGraph2LLM: Evaluating LLMs for Causal Queries

Causality extraction from medical text using Large Language Models (LLMs)

Large Language Model for Causal Decision Making

Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey

Evaluating Large Language Models for Causal Modeling

Causality for Large Language Models

ALCM: Autonomous LLM-Augmented Causal Discovery Framework

Utilizing LLMs to Evaluate the Argument Quality of Triples in SemMedDB for Enhanced Understanding of Disease Mechanisms

A Large Language Model Approach to Extracting Causal Evidence across Study Designs for Evidence Triangulation

From Pre-training Corpora to Large Language Models: What Factors Influence LLM Performance in Causal Discovery Tasks?

CausalBench: A Comprehensive Benchmark for Causal Learning Capability of LLMs

Causal Dataset Discovery with Large Language Models

Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models

Large Language Models are Effective Priors for Causal Graph Discovery

LLMs Are Prone to Fallacies in Causal Inference

Applying Large Language Models for Causal Structure Learning in Non Small Cell Lung Cancer

Multi-Agent Causal Discovery Using Large Language Models