Abstract:Background Healthcare-related artificial intelligence (AI) is developing. The capacity of the system to carry out sophisticated cognitive processes, such as problem-solving, decision-making, reasoning, and perceiving, is referred to as higher cognitive thinking in AI. This kind of thinking requires more than just processing facts; it also entails comprehending and working with abstract ideas, evaluating and applying data relevant to the context, and producing new insights based on prior learning and experience. ChatGPT is an artificial intelligence-based conversational software that can engage with people to answer questions and uses natural language processing models. The platform has created a worldwide buzz and keeps setting an ongoing trend in solving many complex problems in various dimensions. Nevertheless, ChatGPT's capacity to correctly respond to queries requiring higher-level thinking in medical biochemistry has not yet been investigated. So, this research aimed to evaluate ChatGPT's aptitude for responding to higher-order questions on medical biochemistry. Objective In this study, our objective was to determine whether ChatGPT can address higher-order problems related to medical biochemistry. Methods This cross-sectional study was done online by conversing with the current version of ChatGPT (14 March 2023, which is presently free for registered users). It was presented with 200 medical biochemistry reasoning questions that require higher-order thinking. These questions were randomly picked from the institution's question bank and classified according to the Competency-Based Medical Education (CBME) curriculum's competency modules. The responses were collected and archived for subsequent research. Two expert biochemistry academicians examined the replies on a zero to five scale. The score's accuracy was determined by a one-sample Wilcoxon signed rank test using hypothetical values. Result The AI software answered 200 questions requiring higher-order thinking with a median score of 4.0 (Q1=3.50, Q3=4.50). Using a single sample Wilcoxon signed rank test, the result was less than the hypothetical maximum of five (p=0.001) and comparable to four (p=0.16). There was no difference in the replies to questions from different CBME modules in medical biochemistry (Kruskal-Wallis p=0.39). The inter-rater reliability of the scores scored by two biochemistry faculty members was outstanding (ICC=0.926 (95% CI: 0.814-0.971); F=19; p=0.001) Conclusion The results of this research indicate that ChatGPT has the potential to be a successful tool for answering questions requiring higher-order thinking in medical biochemistry, with a median score of four out of five. However, continuous training and development with data of recent advances are essential to improve performance and make it functional for the ever-growing field of academic medical usage.

Evaluating ChatGPT-3.5 and Claude-2 in Answering and Explaining Conceptual Medical Physiology Multiple-Choice Questions

Evaluating ChatGPT's Ability to Solve Higher-Order Questions on the Competency-Based Medical Education Curriculum in Medical Biochemistry

Analysing the Applicability of ChatGPT, Bard, and Bing to Generate Reasoning-Based Multiple-Choice Questions in Medical Physiology

Assessing the Efficacy of ChatGPT in Solving Questions Based on the Core Concepts in Physiology

Is ChatGPT 'ready' to be a learning tool for medical undergraduates and will it perform equally in different subjects? Comparative study of ChatGPT performance in tutorial and case-based learning questions in physiology and biochemistry

How Well Does ChatGPT Do When Taking the Medical Licensing Exams? The Implications of Large Language Models for Medical Education and Knowledge Assessment

Below average ChatGPT performance in medical microbiology exam compared to university students

ChatGPT Knowledge Evaluation in Basic and Clinical Medical Sciences: Multiple Choice Question Examination-Based Performance

Evaluation of ChatGPT's performance in Medical Education: A Comparative Analysis with Students in a Pulmonology Examination

Evaluation of responses to cardiac imaging questions by the artificial intelligence large language model ChatGPT

ChatGPT versus human in generating medical graduate exam multiple choice questions—A multinational prospective study (Hong Kong S.A.R., Singapore, Ireland, and the United Kingdom)

Assessment of the capacity of ChatGPT as a self-learning tool in medical pharmacology: a study using MCQs

Human versus Artificial Intelligence: ChatGPT-4 Outperforming Bing, Bard, ChatGPT-3.5, and Humans in Clinical Chemistry Multiple-Choice Questions

Evaluating the limits of AI in medical specialisation: ChatGPT’s performance on the UK Neurology Specialty Certificate Examination

How Does ChatGPT Perform on the United States Medical Licensing Examination (USMLE)? The Implications of Large Language Models for Medical Education and Knowledge Assessment

Can ChatGPT-3.5 Pass a Medical Exam? A Systematic Review of ChatGPT's Performance in Academic Testing

ChatGPT in medical school: how successful is AI in progress testing?

Applicability of ChatGPT in Assisting to Solve Higher Order Problems in Pathology

ChatGPT as a teaching tool: Preparing pathology residents for board examination with AI-generated digestive system pathology tests

Evaluating ChatGPT as a self-learning tool in medical biochemistry: A performance assessment in undergraduate medical university examination

Assessing ChatGPT's potential as a clinical resource for medical oncologists: An evaluation with board-style questions and real-world patient cases.