Abstract: Background: Radiography (X-rays) is the dominant imaging modality in orthopedics, and improving the interpretation of radiographs is clinically relevant. Machine learning (ML) has revolutionized data analysis and has been applied to medicine, with some success, in the form of natural language processing (NLP) and artificial neural networks (ANNs). Latent Dirichlet allocation (LDA) is an NLP method that automatically categorizes documents into topics. Successfully applying ML to orthopedic radiography could enable the creation of computer-aided decision systems for use in the clinic. We studied how an automated ML pipeline could classify orthopedic trauma radiographs from radiologist reports. Methods: We used wrist and ankle radiographs taken at Danderyd Hospital in Sweden between 2002 and 2015, together with their radiologist reports. LDA was used to create image labels for the radiographs from the radiologist reports. The radiographs and labels were used to train an image recognition ANN. The ANN outcomes were manually reviewed to obtain an accurate estimate of the method's utility and accuracy. Results: Image labels generated via LDA could successfully train the ANN. The ANN reached an accuracy between 60% and 91% compared to a gold standard, depending on the label. Conclusions: We found that LDA was unsuited to labeling orthopedic radiographs from reports with high accuracy. Despite this, the ANN could learn to detect some features in radiographs with high accuracy. The study also illustrates how ML and ANNs can be applied to medical research.

What problem does this paper attempt to address?

This paper asks how machine learning (ML) techniques, specifically natural language processing (NLP) and artificial neural networks (ANNs), can be used to classify orthopedic X-ray films automatically. The researchers pursue two main objectives:

1. **Automatically generate image labels from radiologists' reports**:
   - Use the **Latent Dirichlet Allocation (LDA)** algorithm to analyze radiologists' text reports and generate labels that describe the content of the X-ray films. These labels are then used to train an ANN for image recognition, with the aim of classifying orthopedic X-ray films automatically.
2. **Evaluate whether the automatically generated labels can effectively train an ANN for image recognition**:
   - Verify whether the labels generated by LDA can train a convolutional neural network (CNN) to detect and classify pathological features in orthopedic X-ray films.

The ultimate goal is to develop a computer-aided decision-making (CAD) system to improve the accuracy and efficiency of clinical diagnosis.

### Background and Motivation

- **The importance of orthopedic X-ray films**: In orthopedics, X-ray films (radiographs) are the main basis for diagnosis and treatment decisions. However, interpretation is influenced by subjective factors, so different doctors may read the same film differently (the inter-observer reliability problem). In addition, as the volume of medical data grows, manually labeling large numbers of X-ray films becomes increasingly difficult.
- **The application potential of machine learning**: Machine learning techniques, especially deep convolutional neural networks (CNNs), have achieved remarkable success in image recognition. Applying them to the automatic classification of orthopedic X-ray films could improve diagnostic accuracy and reduce doctors' workload, especially in resource-limited settings (such as emergency rooms or natural disaster sites).

### Research Methods

To achieve these objectives, the researchers designed three experiments:

1. **Experiment 1: Calibrate LDA parameters**:
   - Objective: Determine the LDA model parameters best suited to radiologists' reports.
   - Method: Use 24,948 randomly selected wrist X-ray reports and select the optimal combination of LDA parameters with a linear regression model.
2. **Experiment 2: Generate image labels and train the ANN**:
   - Objective: Use LDA to generate labels that describe the content of the X-ray films, and use these labels to train a CNN.
   - Method: Apply LDA, with the best parameters from Experiment 1, to 88,026 wrist and ankle X-ray reports to generate image labels. Then train a VGG-16 CNN on these labels and evaluate its performance on the test set.
3. **Experiment 3: Evaluate ANN classification quality**:
   - Objective: Evaluate the quality of the CNN's classifications by comparing them with a gold standard.
   - Method: Select 5 labels from Experiment 2, randomly select 300 images for each label (150 correctly classified and 150 misclassified), and have radiologists review them manually to obtain the gold standard. Then calculate the true accuracy of the CNN.

### Conclusions

Although LDA performs poorly at generating high-precision labels, the study found that the CNN can still learn to detect some features in X-ray films even when the labels are imprecise.
This indicates that, although LDA may not be suitable for directly generating high-quality image labels, it can serve as an initial step in training more complex image recognition models. The study also shows how machine learning and artificial neural networks can be applied to medical research, providing a reference for future work.

### Formula Summary

1. **LDA joint probability distribution**:

   \[ p(\mathbf{w},\mathbf{z}|\alpha,\beta) = p(\mathbf{w}|\mathbf{z},\beta)p(\mathbf{z}|\alpha) \]

   where:
   - \( p(\mathbf{w},\mathb
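
As a rough illustration of the LDA joint distribution in the formula summary, the sketch below evaluates \( \log p(\mathbf{w},\mathbf{z}|\alpha,\beta) \) for a toy corpus. It assumes symmetric Dirichlet priors with the per-document topic proportions and per-topic word distributions integrated out (the collapsed form commonly used with Gibbs sampling). Because the symbol definitions in the summary are cut off, this reading of \( \alpha \) and \( \beta \) is an assumption. The function names, the toy corpus, and the hyperparameter values are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def log_multivariate_beta(values):
    """Log of B(x) = prod_i Gamma(x_i) / Gamma(sum_i x_i)."""
    return sum(math.lgamma(v) for v in values) - math.lgamma(sum(values))


def log_p_z(docs_z, num_topics, alpha):
    """log p(z | alpha): one Dirichlet-multinomial term per document."""
    prior = log_multivariate_beta([alpha] * num_topics)
    total = 0.0
    for topics in docs_z:
        counts = Counter(topics)  # tokens per topic in this document
        posterior = [counts[k] + alpha for k in range(num_topics)]
        total += log_multivariate_beta(posterior) - prior
    return total


def log_p_w_given_z(docs_w, docs_z, num_topics, vocab_size, beta):
    """log p(w | z, beta): one Dirichlet-multinomial term per topic."""
    prior = log_multivariate_beta([beta] * vocab_size)
    topic_word = [Counter() for _ in range(num_topics)]
    for words, topics in zip(docs_w, docs_z):
        for word, topic in zip(words, topics):
            topic_word[topic][word] += 1  # word counts per topic
    total = 0.0
    for counts in topic_word:
        posterior = [counts[v] + beta for v in range(vocab_size)]
        total += log_multivariate_beta(posterior) - prior
    return total


def log_joint(docs_w, docs_z, num_topics, vocab_size, alpha, beta):
    """log p(w, z | alpha, beta) = log p(w | z, beta) + log p(z | alpha)."""
    return (log_p_w_given_z(docs_w, docs_z, num_topics, vocab_size, beta)
            + log_p_z(docs_z, num_topics, alpha))


# Hypothetical toy "reports" as word ids:
# 0 = "fracture", 1 = "radius", 2 = "normal", 3 = "ankle"
docs_w = [[0, 1, 0], [2, 3, 2]]
coherent_z = [[0, 0, 0], [1, 1, 1]]  # each report uses a single topic
mixed_z = [[0, 1, 0], [1, 0, 1]]     # topics interleaved within each report

for name, docs_z in [("coherent", coherent_z), ("mixed", mixed_z)]:
    score = log_joint(docs_w, docs_z, num_topics=2, vocab_size=4,
                      alpha=0.1, beta=0.01)
    print(name, score)
```

With small hyperparameters such as these, topic assignments that keep each report within a single topic receive a higher joint probability than interleaved ones. That preference is what lets LDA group reports into topics that can then be treated as candidate labels.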

From Radiologist Report to Image Label: Assessing Latent Dirichlet Allocation in Training Neural Networks for Orthopedic Radiograph Classification
