Abstract: Introduction: Original research in radiology often involves handling large datasets, data manipulation, statistical tests, and coding. Recent studies show that large language models (LLMs) can solve bioinformatics tasks, suggesting their potential in radiology research. This study evaluates the ability of LLMs to provide statistical and deep learning solutions and code for radiology research. Materials and methods: We used the web-based chat interfaces available for ChatGPT-4o, ChatGPT-3.5, and Google Gemini. EXPERIMENT 1: BIOSTATISTICS AND DATA VISUALIZATION: We assessed each LLM's ability to suggest biostatistical tests and generate the corresponding R code using a Cancer Imaging Archive dataset. Prompts were based on statistical analyses from a peer-reviewed manuscript. The generated code was tested in RStudio for correctness, runtime errors, and the ability to generate the requested visualization. EXPERIMENT 2: DEEP LEARNING: We used the RSNA-STR Pneumonia Detection Challenge dataset to evaluate the ability of ChatGPT-4o and Gemini to generate Python code for transformer-based image classification models (Vision Transformer ViT-B/16). The generated code was tested in a Jupyter Notebook for functionality and runtime errors. Results: Out of the 8 statistical questions posed, correct statistical answers were suggested for 7 (ChatGPT-4o), 6 (ChatGPT-3.5), and 5 (Gemini) scenarios. The R code output by ChatGPT-4o had fewer runtime errors (6 out of the 7 total codes provided) compared to ChatGPT-3.5 (5/7) and Gemini (5/7). Both ChatGPT-4o and Gemini generated the requested visualization with a few runtime errors. Iteratively copying runtime errors from the code generated by ChatGPT-4o into the chat helped resolve them. Gemini initially hallucinated during code generation but provided accurate code when the experiment was restarted. ChatGPT-4o and Gemini successfully generated initial Python code for the deep learning tasks. Errors encountered during implementation were resolved through iterations in the chat interface, demonstrating the utility of LLMs in providing baseline code for further refinement and in resolving runtime errors. Conclusion: LLMs can assist in coding tasks for radiology research by providing initial code for data visualization, statistical tests, and deep learning models, which can help researchers who have foundational biostatistical knowledge. While LLMs can offer a useful starting point, users must refine and validate the code, and caution is necessary because of potential errors, the risk of hallucinations, and data privacy regulations. Summary statement: LLMs can help with coding and statistical problems in radiology research. This can help primary authors troubleshoot the coding needed in radiology research.

Large language models can help with biostatistics and coding needed in radiology research

Advancing radiology practice and research: harnessing the potential of large language models amidst imperfections

Large Language Models: A Guide for Radiologists

Impact of ChatGPT and Large Language Models on Radiology Education: Association of Academic Radiology—Radiology Research Alliance Task Force White Paper

Large language models (LLMs) in radiology exams for medical students: Performance and consequences

From Bench to Bedside With Large Language Models: AJR Expert Panel Narrative Review

Exploring the Potential of Large Language Models in Radiological Imaging Systems: Improving User Interface Design and Functional Capabilities

Large language models in radiology: fundamentals, applications, ethical considerations, risks, and future directions

ChatGPT and Large Language Models in Radiology: Perspectives From the Field

Programming Chatbots Using Natural Language: Generating Cervical Spine MRI Impressions

ChatGPT and Beyond: An overview of the growing field of large language models and their use in ophthalmology

The Application of LLMs for Radiologic Decision-Making

Evaluating Large Language Models on a Highly-specialized Topic, Radiation Oncology Physics

Establishing priorities for implementation of large language models in pathology and laboratory medicine

Large Language Models in Ophthalmology: Potential and Pitfalls

Assessing Large Language Models for Oncology Data Inference from Radiology Reports

ChatGPT, Bard, and Large Language Models for Biomedical Research: Opportunities and Pitfalls

Large language models reshaping molecular biology and drug development

Applications of Large Language Models (LLMs) in Breast Cancer Care

An Evaluation of Large Language Models in Bioinformatics Research

Large language models: a primer and gastroenterology applications