Natural Language Processing-Driven Artificial Intelligence Models for the Diagnosis of Lumbar Disc Herniation with L5 and S1 Radiculopathy: A Preliminary Evaluation

PeiYang Wang, Zhe Zhang, ZhiYang Xie, Lei Liu, GuanRui Ren, ZongJie Guo, Li Xu, XiangJie Yin, YiLi Hu, YunTao Wang, XiaoTao Wu
DOI: https://doi.org/10.1016/j.wneu.2024.06.041
Abstract: Objective: To develop and validate natural language processing-driven artificial intelligence (AI) models for the diagnosis of lumbar disc herniation (LDH) with L5 and S1 radiculopathy using electronic health records (EHRs). Methods: EHRs were collected from patients who underwent single-level percutaneous endoscopic lumbar discectomy for LDH at the L4/5 or L5/S1 level between June 1, 2013, and December 31, 2021. The primary outcome was LDH with L5 and S1 radiculopathy, defined as nerve root compression recorded in the operative notes. Two datasets were created, one from the history of present illness text and one from the positive symptom text, each labeled with the radiculopathy level (L5 or S1). Each dataset was randomly split into a training set and a testing set in a 7:3 ratio. Two machine learning models, a long short-term memory network and Extreme Gradient Boosting, were developed using the training set. Model performance on the testing set was evaluated using measures such as the receiver operating characteristic curve, area under the curve, accuracy, recall, F1-score, and precision. Results: The study included a total of 1681 patients, 590 with L5 radiculopathy and 1091 with S1 radiculopathy. Among the 4 models developed, the long short-term memory model based on positive symptom text showed the best discrimination in the testing set, with precision of 0.9054, recall of 0.9405, accuracy of 0.8950, F1-score of 0.9226, and area under the curve of 0.9485. Conclusions: This study provides preliminary validation of the concept that natural language processing-driven AI models can be used for the diagnosis of lumbar disease using EHRs. This study could pave the way for future research that may develop more comprehensive and clinically impactful AI-driven diagnostic systems.
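As a quick consistency check on the reported metrics, the F1-score can be recomputed from the precision and recall given in the abstract, since F1 is their harmonic mean. The sketch below is illustrative only and is not the authors' evaluation code; the function name `f1_score` and the variable names are assumptions introduced here, and only the three numeric values come from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1-score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the long short-term memory model
# trained on positive symptom text (testing set).
reported_precision = 0.9054
reported_recall = 0.9405
reported_f1 = 0.9226

# Recompute F1 from the reported precision and recall.
computed_f1 = f1_score(reported_precision, reported_recall)
print(round(computed_f1, 4))  # → 0.9226
```

The recomputed value agrees with the reported F1-score to four decimal places, so the three figures are mutually consistent.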