Early identification of Family Medicine residents at risk of failure using Natural Language Processing and Explainable Artificial Intelligence

Abhisht Joshi, Pouria Mortezaagha, Diana Inkpen, Edward Seale, Douglas Archibald, Kendall Noel, Arya Rahgozar
DOI: https://doi.org/10.1101/2024.12.07.24318566
2024-12-08
Abstract: Background: During residency, each resident is observed and receives feedback based on their performance. Residency training is demanding, and a few residents struggle academically. A competency-based residency training program's success depends on its ability to identify residents with difficulty during their first year of post-graduate education and to provide them with timely intervention and support. Objective: In large training programs such as Family Medicine, identifying residents at risk of failing their certification exams is difficult. We develop an AI system using state-of-the-art technologies in Machine Learning (ML), Deep Learning (DL), Natural Language Processing (NLP) and Explainable AI (XAI) to detect at-risk residents automatically. Methods: We implemented ML, DL and NLP models for the prediction and analyzed their performance. The target variable was whether the resident would pass or fail their certification exam. XAI was used to enhance the understanding of the model's inner workings. Results: The dataset comprised 1382 resident data points. The champion model, a Support Vector Machine (SVM), achieved an accuracy of 89.05% and an F1 score of 74.54 for the multiclass classification when multimodal (text and tabular) data was used. This model outperformed the models that used only qualitative or only quantitative data. Conclusion: Combining qualitative and quantitative data is a novel approach and yielded better classification results. This research demonstrates the feasibility of an automated AI system for the early identification of residents at risk of academic struggle.
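The gap between the reported accuracy and F1 score is typical when one outcome is much rarer than the others. The following is a minimal sketch of how the two metrics can diverge, assuming a macro-averaged F1 (the abstract does not state which averaging was used). The labels and predictions are invented toy values, not data from the study, and the helper names `accuracy` and `macro_f1` are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, imbalanced example (invented): 8 passing residents, 2 failing.
y_true = ["pass"] * 8 + ["fail"] * 2
# The hypothetical classifier misses one of the two failing residents.
y_pred = ["pass"] * 8 + ["pass", "fail"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.4f}, macro F1 = {f1:.4f}")
```

In this toy case a single missed at-risk resident barely lowers accuracy but pulls the macro F1 down noticeably, which is why F1 is the more informative figure when the failing class is small.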