Toward an enhanced automatic medical report generator based on large transformer models
Olanda Prieto-Ordaz, Graciela Ramirez-Alonso, Manuel Montes-y-Gomez, Roberto Lopez-Santillan
DOI: https://doi.org/10.1007/s00521-024-10382-0
2024-11-22
Neural Computing and Applications
Abstract: Medical imaging interpretation plays a vital role in primary health care, and as workloads increase, artificial intelligence that automates this task can assist doctors in their daily work. In the present study, we develop a novel neural architecture based on Transformer models, called the enhanced transformer based-medical image interpretation (ETB-MII), to generate medical reports. The model is designed to demonstrate that an implementation relying solely on attention models, supported by effective data augmentation, can achieve competitive and state-of-the-art (SOTA) performance. To assess the effectiveness of our approach, we compare the medical reports generated by ETB-MII against SOTA methods on the IU X-ray and MIMIC-CXR datasets. The metrics commonly reported on these datasets are BLEU, METEOR, ROUGE-L, and CIDEr. ETB-MII achieves competitive results on BLEU and ROUGE-L while attaining SOTA performance on CIDEr. A comparative evaluation of the metrics revealed that CIDEr effectively penalizes medical reports with unclear wording, misspellings, and poor semantic structure. Therefore, we consider CIDEr the most appropriate metric for evaluating medical reports. In addition, a computational complexity analysis reveals that our approach has lower computational demands than a baseline SOTA model. This efficiency is critical for supporting decision-making processes in patient care.
computer science, artificial intelligence
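
As an illustration of one of the metrics named in the abstract, the following is a minimal sketch of a sentence-level ROUGE-L score, computed as an F-measure over the longest common subsequence (LCS) of tokens. It is not the authors' evaluation code. The function names, the lowercase whitespace tokenization, the default `beta=1.2`, and the two sample report sentences are all assumptions made for this example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # Dynamic programme that keeps only the previous row of the LCS table.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.2):
    """Sentence-level ROUGE-L F-measure between two strings.

    Assumptions: lowercase whitespace tokenization and beta=1.2;
    published evaluation toolkits may tokenize or weight differently.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    # Hypothetical report sentences, invented for illustration only.
    generated = "the heart size is normal and the lungs are clear"
    reference = "heart size is normal . the lungs are clear"
    print(f"ROUGE-L: {rouge_l(generated, reference):.3f}")
```

Because the score depends only on in-order token overlap, a report can score well while still containing awkward wording, which is consistent with the abstract's argument for comparing metrics rather than relying on a single one.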