Abstract

Background and objective: Chat Generative Pre-trained Transformer (ChatGPT) is an artificial intelligence (AI)-based language processing model that uses deep learning to produce human-like text dialogue. It has become a popular source of information on a vast number of topics, including medicine. Patient education in head and neck cancer (HNC) is crucial for helping patients understand their medical condition, diagnosis, and treatment options. This study therefore examines the accuracy and reliability of ChatGPT in answering questions about HNC.

Methods: A total of 154 HNC-related questions were compiled from sources including professional societies, institutions, patient support groups, and social media. The questions were categorized into topics such as basic knowledge, diagnosis, treatment, recovery, operative risks, complications, follow-up, and cancer prevention. ChatGPT was queried with each question, and two experienced head and neck surgeons independently assessed each response for accuracy and reproducibility. Responses were rated on a four-point scale: (1) comprehensive/correct, (2) incomplete/partially correct, (3) a mix of accurate and inaccurate/misleading, and (4) completely inaccurate/irrelevant. Discrepancies in grading were resolved by a third reviewer. Reproducibility was evaluated by repeating the questions and analyzing grading consistency.

Results: ChatGPT gave “comprehensive/correct” responses to 133 of 154 questions (86.4%), whereas the rates of “incomplete/partially correct” and “mixed with accurate and inaccurate data/misleading” responses were 11% and 2.6%, respectively. There were no “completely inaccurate/irrelevant” responses. By category, the model provided “comprehensive/correct” answers to 80.6% of questions on “basic knowledge”, 92.6% on “diagnosis”, 88.9% on “treatment”, 80% on “recovery – operative risks – complications – follow-up”, 100% on “cancer prevention”, and 92.9% on “other”. There was no significant difference between categories in the grades of ChatGPT responses (p=0.88). The reproducibility rate was 94.1% (145 of 154 questions).

Conclusion: ChatGPT generated substantially accurate and reproducible information in response to diverse medical queries related to HNC. Despite its limitations, it can be a useful source of information for both patients and medical professionals. With further development of the model, ChatGPT can also play a crucial role in clinical decision support by providing clinicians with up-to-date information.
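The headline rates can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' analysis code. The count of 133 and the total of 154 come from the abstract; the counts of 17 and 4 are assumptions inferred from the reported 11% and 2.6%. The names `grade_counts` and `rate` are hypothetical.

```python
from collections import Counter

TOTAL = 154  # number of questions, as reported in the abstract

# Grade tallies. Only 133 and 0 are stated directly in the abstract;
# 17 and 4 are ASSUMED counts inferred from the reported 11% and 2.6%.
grade_counts = Counter({
    "comprehensive/correct": 133,
    "incomplete/partially correct": 17,              # assumption
    "mixed accurate and inaccurate/misleading": 4,   # assumption
    "completely inaccurate/irrelevant": 0,
})


def rate(count, total=TOTAL):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


rates = {grade: rate(count) for grade, count in grade_counts.items()}

# Reproducibility: 145 of 154 questions received consistent grades.
reproducibility = 100 * 145 / TOTAL

for grade, pct in rates.items():
    print(f"{grade}: {pct}%")
print(f"reproducible: {reproducibility:.2f}%")
```

Under these assumed counts the four grades sum to 154 and the rates agree with the reported figures. Note that 145/154 is roughly 94.16%, which the abstract gives as 94.1%.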

Is ChatGPT accurate and reliable in answering questions regarding head and neck cancer?
