Predicting knee osteoarthritis severity: comparative modeling based on patient's data and plain X-ray images

Jaynal Abedin, Joseph Antony, Kevin McGuinness, Kieran Moran, Noel E O'Connor, Dietrich Rebholz-Schuhmann, John Newell
DOI: https://doi.org/10.1038/s41598-019-42215-9
2019-08-23
Abstract: Knee osteoarthritis (KOA) is a disease that impairs knee function and causes pain. A radiologist reviews knee X-ray images and grades the severity level of the impairments according to the Kellgren and Lawrence grading scheme, a five-point ordinal scale (0-4). In this study, we used Elastic Net (EN) and Random Forests (RF) to build predictive models using patient assessment data (i.e. signs and symptoms of both knees and medication use) and a convolutional neural network (CNN) trained using X-ray images only. Linear mixed effect models (LMM) were used to model the within-subject correlation between the two knees. The root mean squared error for the CNN, EN, and RF models was 0.77, 0.97, and 0.94 respectively. The LMM shows similar overall prediction accuracy to the EN regression but correctly accounts for the hierarchical structure of the data, resulting in more reliable inference. Useful explanatory variables were identified that could be used for patient monitoring before X-ray imaging. Our analyses suggest that the models trained for predicting the KOA severity levels achieve comparable results when modeling X-ray images and patient data. The subjectivity in the KL grade is still a primary concern.
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
This paper asks how well the severity of knee osteoarthritis (KOA) can be predicted from patients' questionnaire data and from knee X-ray images, and compares the prediction accuracy of the two approaches. Specifically, the researchers built Elastic Net (EN) and Random Forest (RF) models on patient assessment data (such as symptoms, signs, and medication use) and a Convolutional Neural Network (CNN) on X-ray image data, and evaluated how well each model predicts KOA severity.

### Research Background and Motivation

Knee osteoarthritis is a disease that affects the function of the knee joint and causes pain. Currently, the main method for evaluating the severity of KOA is for radiologists to visually examine knee X-ray images and grade them according to the Kellgren and Lawrence (KL) criteria, a five-level ordinal scale from 0 to 4. However, this method is highly subjective, and scores may differ among evaluators. An objective and reliable method for predicting the severity of KOA is therefore of great value.

### Research Objectives

1. **Compare prediction accuracy**: Compare the prediction models based on patients' questionnaire data with the model based on X-ray image data, and evaluate the accuracy of both in predicting the severity of KOA.
2. **Identify key variables**: Determine which variables have the strongest explanatory power in predicting the severity of KOA, so that these variables can be used for patient monitoring and early intervention.

### Method Overview

- **Data source**: The study used the Osteoarthritis Initiative (OAI) dataset, which comes from a multi-center longitudinal observational study that aims to better understand KOA.
- **Model selection**: The study adopted three models: Elastic Net regression, Random Forest, and Convolutional Neural Network.
- **Data processing**: The data was pre-processed, including missing value handling, variable selection, and data splitting (training set and validation set).
- **Performance evaluation**: The root mean squared error (RMSE) was used as the evaluation metric to compare the prediction performance of the different models.

### Main Findings

- **Prediction accuracy**:
  - The RMSE of the CNN model based on X-ray images is 0.77.
  - The RMSE of the Elastic Net regression model based on patients' questionnaire data is 0.97.
  - The RMSE of the Random Forest model based on patients' questionnaire data is 0.94.
  - The RMSE of the Linear Mixed-Effects Model (LMM) is 0.978.
- **Key variables**: The study identified variables that contribute substantially to predicting the severity of KOA, including patients' baseline X-ray status, surgical history, medication use, gender, and symptoms such as dysfunction and pain.

### Conclusion

The results show that the statistical models based on patients' questionnaire data have relatively high accuracy in predicting the severity of KOA, and that their performance is comparable to that of the CNN model based on X-ray images. In addition, the statistical models identify key variables that are helpful for designing early intervention measures and monitoring the progression of patients' conditions. Nevertheless, the KL score itself is subjective, and future research could consider combining patients' questionnaire data and X-ray image data to further improve prediction accuracy.
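The RMSE comparison described in the findings can be illustrated with a minimal sketch. It assumes each model outputs a continuous prediction of the KL grade for every knee in a validation set. The grades, predictions, and the model names `model_a` and `model_b` are hypothetical placeholders, not data or results from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted KL grades."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical validation-set KL grades (0-4), one per knee.
observed = [0, 1, 2, 3, 4, 2]

# Hypothetical continuous predictions from two placeholder models.
predictions = {
    "model_a": [0.5, 1.0, 2.5, 2.0, 3.5, 2.0],
    "model_b": [1.0, 1.0, 1.0, 2.0, 2.0, 2.0],
}

# Score every model on the same validation knees, then rank by RMSE.
scores = {name: rmse(observed, preds) for name, preds in predictions.items()}
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: RMSE = {score:.3f}")
```

A lower RMSE means predictions sit closer to the radiologist-assigned grade on average. Because the errors are squared, a prediction that misses by two grades is penalized four times as heavily as one that misses by a single grade, which suits an ordinal scale where large misclassifications matter most.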