TECRR: a benchmark dataset of radiological reports for BI-RADS classification with machine learning, deep learning, and large language model baselines

Sadam Hussain, Usman Naseem, Mansoor Ali, Daly Betzabeth Avendaño Avalos, Servando Cardona-Huerta, Beatriz Alejandra Bosques Palomo, Jose Gerardo Tamez-Peña
DOI: https://doi.org/10.1186/s12911-024-02717-7
IF: 3.298
2024-10-26
BMC Medical Informatics and Decision Making
Abstract: Recently, machine learning (ML), deep learning (DL), and natural language processing (NLP) have shown promising results in classifying free-form radiological reports. Classifying such reports reliably requires a high-quality, annotated, and curated dataset. Currently, no publicly available dataset of breast imaging radiological reports exists for classifying Breast Imaging Reporting and Data System (BI-RADS) categories and breast density scores, as defined by the American College of Radiology (ACR). To address this gap, we construct and annotate a dataset of breast imaging radiological reports and provide benchmark results for it.
medical informatics