Improving the Quality of Unstructured Cancer Data Using Large Language Models: A German Oncological Case Study

Yongli Mou, Jonathan Lehmkuhl, Nicolas Sauerbrunn, Anja Köchel, Jens Panse, Daniel Truh, Sulayman Sowe, Tim Brümmendorf, Stefan Decker
DOI: https://doi.org/10.3233/SHTI240507
2024-08-22
Abstract: With cancer being a leading cause of death globally, epidemiological and clinical cancer registration is paramount for enhancing oncological care and facilitating scientific research. However, the heterogeneous landscape of medical data presents significant challenges to the current manual process of tumor documentation. This paper explores the potential of Large Language Models (LLMs) for transforming unstructured medical reports into the structured format mandated by the German Basic Oncology Dataset. Our findings indicate that integrating LLMs into existing hospital data management systems or cancer registries can significantly enhance the quality and completeness of cancer data collection, a vital component for diagnosing and treating cancer and improving the effectiveness and benefits of therapies. This work contributes to the broader discussion on the potential of artificial intelligence, especially LLMs, to revolutionize medical data processing and reporting in general and cancer care in particular.