Comparing Scoring Consistency of Large Language Models with Faculty for Formative Assessments in Medical Education

Radhika Sreedhar, Linda Chang, Ananya Gangopadhyaya, Peggy Woziwodzki Shiels, Julie Loza, Euna Chi, Elizabeth Gabel, Yoon Soo Park
DOI: https://doi.org/10.1007/s11606-024-09050-9
IF: 5.7
2024-10-16
Journal of General Internal Medicine
Abstract: The Liaison Committee on Medical Education requires that medical students receive individualized feedback on their self-directed learning skills. Pre-clinical students are asked to complete multiple spaced critical appraisal assignments. However, providing this individual feedback requires significant faculty time. As large language models (LLMs) can score and generate feedback, we explored their use in grading formative assessments through validity and feasibility lenses.
medicine, general & internal; health care sciences & services
What problem does this paper attempt to address?