Automatic Detection and Classification of Rib Fractures on Thoracic CT Using Convolutional Neural Network: Accuracy and Feasibility
Qing-Qing Zhou,Jiashuo Wang,Wen Tang,Zhang-Chun Hu,Zi-Yi Xia,Xue-Song Li,Rongguo Zhang,Xindao Yin,Bing Zhang,Hong Zhang
DOI: https://doi.org/10.3348/kjr.2019.0651
IF: 7.109
2020-01-01
Korean Journal of Radiology
Abstract:Objective: To evaluate the performance of a convolutional neural network (CNN) model that can automatically detect and classify rib fractures, and output structured reports from computed tomography (CT) images. Materials and Methods: This study included 1079 patients (median age, 55 years; men, 718) from three hospitals, between January 2011 and January 2019, who were divided into a monocentric training set (n = 876; median age, 55 years; men, 582), five multicenter/multiparameter validation sets (n = 173; median age, 59 years; men, 118) with different slice thicknesses and image pixels, and a normal control set (n = 30; median age, 53 years; men, 18). Three classifications (fresh, healing, and old fracture) combined with fracture location (corresponding CT layers) were detected automatically and delivered in a structured report. Precision, recall, and F1-score were selected as metrics to measure the optimum CNN model. Detection/diagnosis time, precision, and sensitivity were employed to compare the diagnostic efficiency of the structured report and that of experienced radiologists. Results: A total of 25054 annotations (fresh fracture, 10089; healing fracture, 10922; old fracture, 4043) were labelled for training (18584) and validation (6470). The detection efficiency was higher for fresh fractures and healing fractures than for old fractures (F1-scores, 0.849, 0.856, 0.770, respectively, p = 0.023 for each), and the robustness of the model was good in the five multicenter/multiparameter validation sets (all mean F1-scores > 0.8 except validation set 5 [512 x 512 pixels; F1- score = 0.757]). The precision of the five radiologists improved from 80.3% to 91.1%, and the sensitivity increased from 62.4% to 86.3% with artificial intelligence-assisted diagnosis. On average, the diagnosis time of the radiologists was reduced by 73.9 seconds. Conclusion: Our CNN model for automatic rib fracture detection could assist radiologists in improving diagnostic efficiency, reducing diagnosis time and radiologists' workload.