Automated Spinal MRI Labelling from Reports Using a Large Language Model

Robin Y. Park, Rhydian Windsor, Amir Jamaludin, Andrew Zisserman
2024-10-23
Abstract: We propose a general pipeline to automate the extraction of labels from radiology reports using large language models, which we validate on spinal MRI reports. The efficacy of our labelling method is measured on five distinct conditions: spinal cancer, stenosis, spondylolisthesis, cauda equina compression and herniation. Using open-source models, our method equals or surpasses GPT-4 on a held-out set of reports. Furthermore, we show that the extracted labels can be used to train imaging models to classify the identified conditions in the accompanying MR scans. All classifiers trained using automated labels achieve comparable performance to models trained using scans manually annotated by clinicians. Code can be found at https://github.com/robinyjpark/AutoLabelClassifier.
Image and Video Processing, Computation and Language, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the time-consuming, expert-dependent annotation of medical imaging datasets. The authors propose a general pipeline that uses large language models (LLMs) to extract labels automatically from radiology reports, reducing the manual annotation workload and unlocking larger-scale training datasets for medical imaging problems.

### Main Issues:

1. **Time-consuming annotation of medical imaging datasets**: Annotating medical imaging datasets takes significant time and requires expert annotators, who are both expensive and in limited supply.
2. **Diverse medical conditions**: Many different medical conditions can appear in medical imaging, which leads to inconsistency in labels.
3. **Small-scale datasets**: For these reasons, researchers often have to work with small-scale datasets or rely on a few publicly available datasets that cover limited conditions and modalities.

### Solution:

The authors propose a general approach that adapts general-purpose LLMs to extract structured labels from clinical reports. The specific steps include:

1. **Model prompting**: The model is given a definition of the target condition and asked to summarise the report, then to produce a binary label based on that summary.
2. **Self-supervised fine-tuning**: The model undergoes self-supervised fine-tuning to familiarize it with the task of summary generation.
3. **Application validation**: The method is validated on spine MRI radiology reports for five conditions: spinal cancer, stenosis, spondylolisthesis, cauda equina compression, and disc herniation.

### Main Contributions:

- **Automated label extraction**: The method extracts labels from radiology reports automatically, significantly reducing the manual annotation workload.
- **Superior performance**: Using open-source models, the method equals or surpasses GPT-4 on a held-out set of reports.
- **Downstream applications**: The extracted labels can be used to train imaging models to classify the relevant conditions, with performance comparable to models trained on scans annotated by clinicians.

### Conclusion:

This paper presents a general method for automatically extracting labels from radiology reports without additional model training. Applied to spine MRI reports, the method matches or outperforms a strong GPT-4 baseline while offering privacy protection and cost-effectiveness. The extracted labels can also be used to train classifiers that achieve performance comparable to models trained on expert-annotated scans.
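The prompting step described under Solution (define the condition, summarise the report, then derive a binary label from the summary) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the condition definitions, prompt wording, function names, and the string-based parsing of the answer are all assumptions made for the example, and `fake_generate` is a stand-in for a real LLM call.

```python
from typing import Callable, Dict

# Hypothetical definitions; the wording used in the paper is not reproduced here.
CONDITION_DEFINITIONS: Dict[str, str] = {
    "stenosis": "narrowing of the spinal canal or neural foramina",
    "spondylolisthesis": "slippage of one vertebra relative to the one below",
}


def build_summary_prompt(report: str, condition: str) -> str:
    """Step 1 prompt: give the condition definition and ask for a focused summary."""
    definition = CONDITION_DEFINITIONS[condition]
    return (
        f"Definition of {condition}: {definition}.\n"
        f"Report:\n{report}\n"
        f"Summarise the findings in this report that relate to {condition}."
    )


def build_label_prompt(summary: str, condition: str) -> str:
    """Step 2 prompt: ask for a yes/no answer based only on the summary."""
    return (
        f"Summary:\n{summary}\n"
        f"Based on this summary, does the patient have {condition}? "
        "Answer yes or no."
    )


def parse_binary_label(answer: str) -> int:
    """Map a free-text answer to 1 (yes) or 0 (no) using its first word."""
    words = answer.strip().lower().split()
    first = words[0].strip(".,:;!") if words else ""
    if first == "yes":
        return 1
    if first == "no":
        return 0
    raise ValueError(f"Could not parse a yes/no answer from: {answer!r}")


def label_report(report: str, condition: str,
                 generate: Callable[[str], str]) -> int:
    """Run both steps with any text-in, text-out model wrapped as `generate`."""
    summary = generate(build_summary_prompt(report, condition))
    answer = generate(build_label_prompt(summary, condition))
    return parse_binary_label(answer)


def fake_generate(prompt: str) -> str:
    """Toy stand-in for an LLM so the sketch runs without a model."""
    if prompt.startswith("Summary:"):
        return "No" if "no evidence" in prompt.lower() else "Yes"
    # "Summarise" by echoing the report text back unchanged.
    return prompt.split("Report:\n", 1)[1].split("\nSummarise", 1)[0]


if __name__ == "__main__":
    print(label_report("Severe central canal stenosis at L4/5.",
                       "stenosis", fake_generate))
```

Passing the model in as a plain callable keeps the sketch independent of any particular inference library; in practice `generate` would wrap whichever open-source model is being used.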