Abstract

Objective: To evaluate the effectiveness and reasoning ability of ChatGPT in diagnosing retinal vascular diseases in the Chinese clinical environment.

Materials and Methods: We collected 1226 fundus fluorescein angiography reports and the corresponding diagnoses, all written in Chinese, and tested ChatGPT with four prompting strategies (direct diagnosis or diagnosis with explanation, each in Chinese or English).

Results: ChatGPT using an English prompt for direct diagnosis achieved the best performance, with an F1-score of 80.05%, which was inferior to ophthalmologists (89.35%) but close to ophthalmologist interns (82.69%). Although ChatGPT can produce a reasoning process with a low error rate, mistakes such as misinformation (1.96%) and hallucination (0.59%) still occur.

Discussion and Conclusions: ChatGPT can serve as a helpful medical assistant that provides diagnoses in non-English clinical environments, but performance gaps, language disparity, and errors remain compared with professionals. These findings demonstrate its potential limitations and the need to keep exploring more robust LLMs in ophthalmology practice.

### Competing Interest Statement

The authors have declared no competing interest.

### Funding Statement

This work is supported by the Natural Science Foundation of China (grant number: 82201195).

### Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The Ethics Committee/IRB of the Second Affiliated Hospital, School of Medicine, Zhejiang University gave ethical approval for this work (IRB: NCT04718532).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data will be made available for research purposes upon request. Data requests are to be directed to jinkai{at}zju.edu.cn.

Prompt engineering with ChatGPT3.5 and GPT4 to improve patient education on retinal diseases

Encouragement vs. liability: How prompt engineering influences ChatGPT-4's radiology exam performance

Prompt engineering on leveraging large language models in generating response to InBasket messages

Using Large Language Models to Generate Educational Materials on Childhood Glaucoma

Prompt engineering with a large language model to assist providers in responding to patient inquiries: a real-time implementation in the electronic health record

Uncovering Language Disparity of ChatGPT in Healthcare: Non-English Clinical Environment for Retinal Vascular Disease Classification (Preprint)

Uncovering Language Disparity of ChatGPT in Healthcare: Non-English Clinical Environment for Retinal Vascular Disease Classification

Prompt matters: evaluation of large language model chatbot responses related to Peyronie's disease

ChatGPT and retinal disease: a cross-sectional study on AI comprehension of clinical guidelines

Utility of ChatGPT for Automated Creation of Patient Education Handouts: An Application in Neuro-Ophthalmology

Large language models: a new frontier in paediatric cataract patient education

Evaluating prompt engineering on GPT-3.5's performance in USMLE-style medical calculations and clinical scenarios generated by GPT-4

Investigating the capabilities of advanced large language models in generating patient instructions and patient educational material

Advancing Patient Education in Idiopathic Intracranial Hypertension: The Promise of Large Language Models

Effectiveness of ChatGPT in explaining complex medical reports to patients

Enhancing Health Literacy: Evaluating the Readability of Patient Handouts Revised by ChatGPT's Large Language Model

Automated HEART score determination via ChatGPT: Honing a framework for iterative prompt development

Comparing the Ability of Google and ChatGPT to Accurately Respond to Oculoplastics-Related Patient Questions and Generate Customized Oculoplastics Patient Education Materials

Evaluating Chatbot responses to patient questions in the field of glaucoma

Evaluation of Generative Language Models in Personalizing Medical Information: Instrument Validation Study