Abstract:

Purpose: Large Language Models (LLMs) have shown exceptional performance in various natural language processing tasks, benefiting from their language generation capabilities and their ability to acquire knowledge from unstructured text. In the biomedical domain, however, LLMs face limitations that lead to inaccurate and inconsistent answers. Knowledge Graphs (KGs) have emerged as valuable resources for organizing structured information, and Biomedical Knowledge Graphs (BKGs) have gained significant attention for managing diverse, large-scale biomedical knowledge. The objective of this study is to assess and compare the capabilities of ChatGPT and existing BKGs in question answering, knowledge discovery, and reasoning within the biomedical domain.

Methods: We conducted a series of experiments to assess the performance of ChatGPT and the BKGs in querying existing biomedical knowledge, knowledge discovery, and knowledge reasoning. First, we tasked ChatGPT with answering questions sourced from the "Alternative Medicine" sub-category of Yahoo! Answers and recorded the responses. We also queried the BKG to retrieve the knowledge records relevant to the same questions and assessed them manually. In a second experiment, we formulated a prediction scenario to assess ChatGPT's ability to suggest potential drug and dietary supplement repurposing candidates, and used the BKG to perform link prediction for the same task. The outcomes of ChatGPT and the BKG were then compared and analyzed. Finally, we evaluated the ability of ChatGPT and the BKG to establish associations between pairs of proposed entities, in order to assess their reasoning abilities and the extent to which they can infer connections within the knowledge domain.

Results: ChatGPT with GPT-4.0 outperforms both GPT-3.5 and BKGs in providing existing information. However, BKGs demonstrate higher reliability in terms of information accuracy. Compared to BKGs, ChatGPT exhibits limitations in making novel discoveries and in reasoning, particularly in establishing structured links between entities.

Conclusions: To address the limitations observed, future research should focus on integrating LLMs and BKGs to leverage the strengths of both approaches. Such integration would optimize task performance and mitigate potential risks, leading to advancements in knowledge within the biomedical field and contributing to the overall well-being of individuals.

From Answers to Insights: Unveiling the Strengths and Limitations of ChatGPT and Biomedical Knowledge Graphs
