Predicting medical specialty from text based on a domain-specific pre-trained BERT

Yoojoong Kim, Jong-Ho Kim, Young-Min Kim, Sanghoun Song, Hyung Joon Joo
DOI: https://doi.org/10.1016/j.ijmedinf.2022.104956
Abstract:
Background: Owing to the prevalence of the coronavirus disease (COVID-19), coping with clinical issues at the individual level has become important to the healthcare system. Accordingly, treatment must be initiated precisely after a hospital visit to expedite processes and diagnose outpatients effectively. To achieve this, applications of artificial intelligence in medical natural language processing (NLP), such as healthcare chatbots or clinical decision support systems, can be suitable tools for an advanced clinical system. Furthermore, support for deciding on the medical specialty from the initial visit can be helpful.
Materials and methods: In this study, we propose a model that predicts the medical specialty from patient-side medical question text, based on pre-trained bidirectional encoder representations from transformers (BERT). The dataset comprised pairs of medical question texts and labeled specialties scraped from a website offering a medical question-and-answer service. The model was fine-tuned to predict the required medical specialty, out of 27 labels, from medical question texts. To demonstrate feasibility, we conducted experiments on a real-world dataset and thoroughly evaluated the predictive performance against four deep learning NLP models through cross-validation and test set evaluation.
Results: The proposed model showed improved performance compared with the competing models across specialties overall. In addition, we demonstrate the usefulness of the proposed model through case studies of visualization applications.
Conclusion: The proposed model can benefit hospital patient management and provide reasonable specialty recommendations for patients.
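As an illustration of the prediction step described in the materials and methods, the following minimal sketch shows how the output scores of a 27-way classifier could be turned into a ranked list of recommended specialties. It is not the authors' implementation: the label names, the `top_k_specialties` helper, and the example logits are all hypothetical, and only the label count of 27 comes from the abstract.

```python
import math

NUM_SPECIALTIES = 27  # label count stated in the abstract

# Hypothetical placeholder names; the real specialty labels are not listed here.
LABELS = [f"specialty_{i:02d}" for i in range(NUM_SPECIALTIES)]


def softmax(logits):
    """Convert raw scores to probabilities, subtracting the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_specialties(logits, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    if len(logits) != NUM_SPECIALTIES:
        raise ValueError(f"expected {NUM_SPECIALTIES} logits, got {len(logits)}")
    probs = softmax(logits)
    ranked = sorted(zip(LABELS, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up logits standing in for the output of a fine-tuned classifier head.
example_logits = [0.0] * NUM_SPECIALTIES
example_logits[4] = 3.0
example_logits[11] = 1.5
for label, prob in top_k_specialties(example_logits):
    print(f"{label}: {prob:.3f}")
```

Returning several ranked candidates rather than a single label is an assumption made here for illustration; it reflects the recommendation use case mentioned in the conclusion, not a reported design choice.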