Artificial intelligence applied to magnetic resonance imaging reliably detects the presence, but not the location, of meniscus tears: a systematic review and meta-analysis

Yi Zhao, Andrew Coppola, Urvi Karamchandani, Dimitri Amiras, Chinmay M. Gupte
DOI: https://doi.org/10.1007/s00330-024-10625-7
IF: 7.034
2024-02-22
European Radiology
Abstract

Objectives: To review and compare the accuracy of convolutional neural networks (CNN) for the diagnosis of meniscal tears in the current literature and analyze the decision-making processes utilized by these CNN algorithms.

Materials and methods: PubMed, MEDLINE, EMBASE, and Cochrane databases up to December 2022 were searched in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-analysis (PRISMA) statement. A risk of bias analysis was performed for all identified articles. Predictive performance values, including sensitivity and specificity, were extracted for quantitative analysis. The meta-analysis was divided between AI prediction models identifying the presence of meniscus tears and those identifying the location of meniscus tears.

Results: Eleven articles were included in the final review, with a total of 13,467 patients and 57,551 images. Heterogeneity was large and statistically significant for the sensitivity of the tear identification analysis (I² = 79%). A higher level of accuracy was observed in identifying the presence of a meniscal tear than in locating tears in specific regions of the meniscus (AUC, 0.939 vs 0.905). Pooled sensitivity and specificity were 0.87 (95% confidence interval (CI) 0.80–0.91) and 0.89 (95% CI 0.83–0.93) for meniscus tear identification and 0.88 (95% CI 0.82–0.91) and 0.84 (95% CI 0.81–0.85) for locating the tears.

Conclusions: AI prediction models achieved favorable performance in the diagnosis, but not location, of meniscus tears. Further studies on the clinical utility of deep learning should include standardized reporting, external validation, and full reports of the predictive performance of these models, with a view to localizing tears more accurately.

Clinical relevance statement: Meniscus tears are hard to diagnose on knee magnetic resonance images. AI prediction models may play an important role in improving the diagnostic accuracy of clinicians and radiologists.
Key Points
• Artificial intelligence (AI) provides great potential in improving the diagnosis of meniscus tears.
• The pooled diagnostic performance for artificial intelligence (AI) in identifying meniscus tears was better (sensitivity 87%, specificity 89%) than locating the tears (sensitivity 88%, specificity 84%).
• AI is good at confirming the diagnosis of meniscus tears, but future work is required to guide the management of the disease.
radiology, nuclear medicine & medical imaging
What problem does this paper attempt to address?
This paper evaluates the feasibility and accuracy of convolutional neural networks (CNN) in diagnosing meniscal tears. Through a systematic review and meta-analysis, it assesses how accurately CNN models diagnose meniscal tears on MRI and analyzes the decision-making algorithms reported in the included studies. It focuses on the difference in performance between identifying whether a tear is present and locating the tear.

### Research Background

Accurate and timely diagnosis of meniscal tears is crucial for patient management and treatment. Magnetic resonance imaging (MRI) is an effective tool for evaluating knee meniscal pathology and provides high-resolution images. However, interpreting MRI images requires specialist expertise and is subject to inter-observer variability. In recent years, artificial intelligence (AI), especially convolutional neural networks (CNN), has received extensive attention for its potential to improve diagnostic accuracy and efficiency.

### Research Objectives

1. **Systematic Review and Meta-analysis**: Evaluate the accuracy and feasibility of CNN in diagnosing meniscal tears.
2. **Decision-making Algorithm Analysis**: Analyze the CNN decision-making algorithms reported in the existing literature.

### Main Findings

- **Overall Performance**: The AI prediction models perform well in diagnosing the presence of meniscal tears (AUC = 0.939), but slightly worse in locating the tear (AUC = 0.905).
- **Sensitivity and Specificity**:
  - Identifying the presence of meniscal tears: sensitivity is 0.87 (95% CI 0.80–0.91), specificity is 0.89 (95% CI 0.83–0.93).
  - Locating the tear: sensitivity is 0.88 (95% CI 0.82–0.91), specificity is 0.84 (95% CI 0.81–0.85).
- **Heterogeneity**: For identifying the presence of tears, heterogeneity in sensitivity is significant (I² = 79%), whereas heterogeneity in specificity is not (p = 0.12). For locating tears, heterogeneity is low for sensitivity (I² = 16%) and moderate for specificity (I² = 60%).

### Conclusions

- **Main Conclusion**: CNN performs well in confirming the presence of meniscal tears, but there is still room for improvement in locating the tear.
- **Clinical Significance**: AI prediction models may play an important role in improving the diagnostic accuracy of clinicians and radiologists.
- **Future Research Directions**: Future research should include standardized reporting, external validation, and comprehensive reporting of predictive performance, with the aim of locating tears more accurately. Research should also explore the impact of different patterns of meniscal tears on patient management.

### Limitations

- **Heterogeneity**: The methods of the included studies and the MRI sequences they used vary considerably, which limits how readily the algorithms can be applied in real-world settings.
- **Sample Size**: The number of studies included in individual analyses is small, so the analyses may not fully capture the overall performance of the AI models.
- **Standardized Reporting**: No study follows AI-specific standardized reporting guidelines, which may limit the applicability of the results in clinical practice.

### Formulas

- **AUC** (area under the ROC curve), the integral of the true positive rate (TPR) over the false positive rate (FPR): \[ \text{AUC}=\int_{0}^{1}\text{TPR}\,d(\text{FPR}) \] For a single operating point, with the ROC curve interpolated linearly through (0, 0) and (1, 1), this reduces to the balanced accuracy, where TNR = 1 − FPR: \[ \text{AUC}_{\text{single point}}=\frac{\text{TPR}+\text{TNR}}{2} \]
- **Sensitivity**: \[ \text{Sensitivity}=\frac{\text{TP}}{\text{TP}+\text{FN}} \]
- **Specificity**: \[ \text{Specificity}=\frac{\text{TN}}{\text{TN}+\text{FP}} \]
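
The definitions above can be checked numerically. The following is a minimal Python sketch, not code from the paper: the function names, the confusion-matrix counts, and the Cochran's Q value are hypothetical and chosen only for illustration. It computes sensitivity and specificity from raw counts, integrates a ROC curve with the trapezoidal rule, and evaluates I² using the standard Higgins definition, I² = max(0, (Q − df) / Q) × 100%, which the summary quotes but does not define.

```python
from typing import Sequence, Tuple


def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)


def trapezoid_auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under a ROC curve given (FPR, TPR) points.

    The end points (0, 0) and (1, 1) are added automatically and the
    curve is interpolated linearly between consecutive points.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def i_squared(q: float, n_studies: int) -> float:
    """Higgins I^2 in percent, from Cochran's Q and the number of studies."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


if __name__ == "__main__":
    # Hypothetical counts, NOT data from the reviewed studies.
    sens = sensitivity(tp=87, fn=13)
    spec = specificity(tn=89, fp=11)
    print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")

    # One operating point: trapezoidal AUC equals (TPR + TNR) / 2.
    print(f"single-point AUC={trapezoid_auc([(1 - spec, sens)]):.3f}")
    print(f"balanced accuracy={(sens + spec) / 2:.3f}")

    # Hypothetical Q statistic for five studies.
    print(f"I^2={i_squared(q=19.05, n_studies=5):.1f}%")
```

With a single operating point the trapezoidal area and (TPR + TNR) / 2 coincide, so the balanced-accuracy shortcut is exact only in that case. An AUC reported in a meta-analysis is normally taken from a curve spanning many thresholds or studies, so it will generally differ from the value implied by one pooled sensitivity and specificity pair.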