A tool for mapping medical narratives into medical ontologies in low resource settings: A case study for German

Juan G. Diaz Ochoa, Faizan E Mustafa
DOI: https://doi.org/10.1101/2024.06.11.24307163
2024-06-12
Abstract: Named Entity Recognition (NER) is highly relevant in the clinical field because it allows information such as diagnoses or medical procedures to be extracted from unstructured data (doctor's letters, vignettes, etc.) and coded according to international classification systems. Language models must therefore be trained to recognize and classify these entities accurately. Although Large Language Models (LLMs) such as ChatGPT can recognize medical entities in text, they do not perform this task reliably. Unlike English, for which a variety of resources support this task, other languages, such as German, lack appropriate language models. This study presents a methodology for generating high-quality, fully synthetic datasets and implements a workflow for identifying and classifying diseases, co-diseases, and medical procedures in oncology clinical narratives.
Health Informatics