The implementation of natural language processing to extract index lesions from breast magnetic resonance imaging reports

Yi Liu,Qing Liu,Chao Han,Xiaodong Zhang,Xiaoying Wang
DOI: https://doi.org/10.1186/s12911-019-0997-3
IF: 3.298
2019-01-01
BMC Medical Informatics and Decision Making
Abstract:Background There are often multiple lesions in breast magnetic resonance imaging (MRI) reports and radiologists usually focus on describing the index lesion that is most crucial to clinicians in determining the management and prognosis of patients. Natural language processing (NLP) has been used for information extraction from mammography reports. However, few studies have investigated NLP in breast MRI data based on free-form text. The objective of the current study was to assess the validity of our NLP program to accurately extract index lesions and their corresponding imaging features from free-form text of breast MRI reports. Methods This cross-sectional study examined 1633 free-form text reports of breast MRIs from 2014 to 2017. First, the NLP system was used to extract 9 features from all the lesions in the reports according to the Breast Imaging Reporting and Data System (BI-RADS) descriptors. Second, the index lesion was defined as the lesion with the largest number of imaging features. Third, we extracted the values of each imaging feature and the BI-RADS category from each index lesion. To evaluate the accuracy of our system, 478 reports were manually reviewed by two individuals. The time taken to extract data by NLP was compared with that by reviewers. Results The NLP system extracted 889 lesions from 478 reports. The mean number of imaging features per lesion was 6.5 ± 2.1 (range: 3–9; 95% CI: 6.362–6.638). The mean number of imaging features per index lesion was 8.0 ± 1.1 (range: 5–9; 95% CI: 7.901–8.099). The NLP system demonstrated a recall of 100.0% and a precision of 99.6% for correct identification of the index lesion. The recall and precision of NLP to correctly extract the value of imaging features from the index lesions were 91.0 and 92.6%, respectively. The recall and precision for the correct identification of the BI-RADS categories were 96.6 and 94.8%, respectively. NLP generated the total results in less than 1 s, whereas the manual reviewers averaged 4.47 min and 4.56 min per report. Conclusions Our NLP method successfully extracted the index lesion and its corresponding information from free-form text.
What problem does this paper attempt to address?