Deep Learning Model for Automated Detection and Classification of Central Canal, Lateral Recess, and Neural Foraminal Stenosis at Lumbar Spine MRI
James Thomas Patrick Decourcy Hallinan,Lei Zhu,Kaiyuan Yang,Andrew Makmur,Diyaa Abdul Rauf Algazwi,Yee Liang Thian,Samuel Lau,Yun Song Choo,Sterling Ellis Eide,Qai Ven Yap,Yiong Huak Chan,Jiong Hao Tan,Naresh Kumar,Beng Chin Ooi,Hiroshi Yoshioka,Swee Tian Quek
DOI: https://doi.org/10.1148/radiol.2021204289
IF: 19.7
2021-07-01
Radiology
Abstract:
Background: Assessment of lumbar spinal stenosis at MRI is repetitive and time consuming. Deep learning (DL) could improve productivity and the consistency of reporting.
Purpose: To develop a DL model for automated detection and classification of lumbar central canal, lateral recess, and neural foraminal stenosis.
Materials and Methods: In this retrospective study, lumbar spine MRI scans obtained from September 2015 to September 2018 were included. Studies of patients with spinal instrumentation or studies with suboptimal image quality, as well as postgadolinium studies and studies of patients with scoliosis, were excluded. Axial T2-weighted and sagittal T1-weighted images were used. Studies were split into an internal training set (80%), validation set (9%), and test set (11%). Training data were labeled by four radiologists using predefined gradings (normal, mild, moderate, and severe). A two-component DL model was developed. First, a convolutional neural network (CNN) was trained to detect the region of interest (ROI), with a second CNN for classification. An internal test set was labeled by a musculoskeletal radiologist with 31 years of experience (reference standard) and two subspecialist radiologists (radiologist 1: A.M., 5 years of experience; radiologist 2: J.T.P.D.H., 9 years of experience). DL model performance on an external test set was evaluated. Detection recall (in percentage), interrater agreement (Gwet κ), sensitivity, and specificity were calculated.
Results: Overall, 446 MRI lumbar spine studies were analyzed (446 patients; mean age ± standard deviation, 52 years ± 19; 240 women), with 396 patients in the training (80%) and validation (9%) sets and 50 (11%) in the internal test set. For internal testing, DL model and radiologist central canal recall were greater than 99%, with reduced neural foramina recall for the DL model (84.5%) and radiologist 1 (83.9%) compared with radiologist 2 (97.1%) (P < .001). For internal testing, dichotomous classification (normal or mild vs moderate or severe) showed almost-perfect agreement for both radiologists and the DL model, with respective κ values of 0.98, 0.98, and 0.96 for the central canal; 0.92, 0.95, and 0.92 for lateral recesses; and 0.94, 0.95, and 0.89 for neural foramina (P < .001). External testing with 100 MRI scans of lumbar spines showed almost-perfect agreement for the DL model for dichotomous classification of all ROIs (κ, 0.95-0.96; P < .001).
Conclusion: A deep learning model showed comparable agreement with subspecialist radiologists for detection and classification of central canal and lateral recess stenosis, with slightly lower agreement for neural foraminal stenosis at lumbar spine MRI.
© RSNA, 2021. See also the editorial by Hayashi in this issue.
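The agreement and accuracy measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that "Gwet κ" refers to Gwet's AC1 coefficient for two raters, and the function names (`dichotomize`, `gwet_ac1`, `sensitivity_specificity`) and the toy grades are hypothetical, chosen only for illustration.

```python
def dichotomize(grade):
    """Map a four-level grade to 0 (normal or mild) or 1 (moderate or severe)."""
    return 1 if grade in ("moderate", "severe") else 0


def gwet_ac1(rater_a, rater_b):
    """Gwet's AC1 for two raters and two categories (0 and 1).

    Assumption: this is the statistic the abstract calls "Gwet kappa".
    """
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must grade the same non-empty set of items")
    # Observed agreement: fraction of items given the same label by both raters.
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Average prevalence of the positive category across the two raters.
    pi_pos = (sum(rater_a) + sum(rater_b)) / (2 * n)
    # Chance agreement for Q = 2 categories: sum_k pi_k * (1 - pi_k) / (Q - 1).
    p_chance = 2 * pi_pos * (1 - pi_pos)
    return (p_obs - p_chance) / (1 - p_chance)


def sensitivity_specificity(reference, predicted):
    """Sensitivity and specificity of binary predictions against a reference."""
    pairs = list(zip(reference, predicted))
    tp = sum(r == 1 and p == 1 for r, p in pairs)
    fn = sum(r == 1 and p == 0 for r, p in pairs)
    tn = sum(r == 0 and p == 0 for r, p in pairs)
    fp = sum(r == 0 and p == 1 for r, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Toy grades, invented for illustration; they are not data from the study.
reference_grades = ["normal", "mild", "normal", "mild", "severe",
                    "moderate", "severe", "moderate", "normal", "mild"]
model_grades = ["normal", "normal", "mild", "mild", "severe",
                "moderate", "moderate", "mild", "normal", "normal"]

reference = [dichotomize(g) for g in reference_grades]
model = [dichotomize(g) for g in model_grades]

ac1 = gwet_ac1(reference, model)
sensitivity, specificity = sensitivity_specificity(reference, model)
print(f"AC1={ac1:.3f} sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

Dichotomizing before computing agreement mirrors the normal or mild versus moderate or severe comparison reported in the Results. A weighted coefficient would be needed to score agreement on the full four-level grading.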
radiology, nuclear medicine & medical imaging