Abstract: In this work, we present our solution for the MICCAI 2024 CXR-LT challenge, achieving 4th place in Subtask 2 and 5th in Subtask 1. We leveraged an ensemble of ConvNeXt V2 and MaxViT models, pretrained on an external chest X-ray dataset, to address the long-tailed distribution of chest findings. The proposed method combines state-of-the-art image classification techniques, asymmetric loss for handling class imbalance, and view-based prediction aggregation to enhance classification performance. Through experiments, we demonstrate the advantages of our approach in improving both detection accuracy and the handling of the long-tailed distribution in CXR findings. The code is available at https://github.com/yamagishi0824/cxrlt24-multiview-pp.

What problem does this paper attempt to address?

The paper addresses two problems in chest X-ray (CXR) classification: the long-tailed distribution of findings and the integration of multi-view information.

1. **Long-tailed distribution**: In chest X-ray data, some diseases or pathological conditions occur far less often than common ones, forming a so-called "long-tailed distribution". This imbalance makes training harder, especially for the accurate detection of rare findings.
2. **Multi-view information integration**: A chest X-ray examination usually includes multiple views (such as frontal, for example anteroposterior, and lateral views), and each view provides different information. How to integrate the information from these views to improve diagnostic accuracy is an important research direction.

To address these problems, the authors propose an ensemble of ConvNeXt V2 and MaxViT models, combined with the following techniques:

- **Asymmetric loss**: It is used to handle class imbalance by treating positive and negative labels differently, so that the many easy negatives do not dominate training and the bias toward common classes is reduced.
- **View-based prediction aggregation**: The predictions for frontal and lateral images are combined by a weighted average to obtain a more reliable study-level prediction.

With these methods, the authors aim to improve detection accuracy in chest X-ray classification, especially on long-tailed data.

### Formula summary

The formulas in the paper describe the view-based prediction aggregation process:

1. **Average the predictions within each view**: \[ P_f=\frac{1}{N_f}\sum_{i = 1}^{N_f}P_{f,i},\quad P_l=\frac{1}{N_l}\sum_{i = 1}^{N_l}P_{l,i} \] where \(P_f\) and \(P_l\) are the average predictions for the frontal and lateral views respectively, \(N_f\) and \(N_l\) are the numbers of frontal and lateral images, and \(P_{f,i}\) and \(P_{l,i}\) are the predictions for a single image.
2. **Combine the per-view predictions with a weighted average**: \[ P_{\text{final}}=\frac{w_fP_f + w_lP_l}{w_f + w_l} \] where \(P_{\text{final}}\) is the final prediction, and \(w_f\) and \(w_l\) are the weights of the frontal and lateral views respectively.

Together, these methods help the model perform better on complex and imbalanced chest X-ray data.
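The two aggregation steps can be illustrated with a minimal Python sketch. It is not taken from the authors' repository: the function names `mean_prediction` and `aggregate_views`, the default weights of 1.0, and the fallback used when a study lacks one of the views are all assumptions made for illustration. Each image prediction is assumed to be a list of per-class probabilities.

```python
from typing import List, Optional, Sequence


def mean_prediction(preds: Sequence[Sequence[float]]) -> Optional[List[float]]:
    """Average the per-class predictions over all images of one view.

    Corresponds to P_f or P_l. Returns None when the view has no images.
    """
    if not preds:
        return None
    n = len(preds)
    # zip(*preds) groups the values of each class across images.
    return [sum(class_values) / n for class_values in zip(*preds)]


def aggregate_views(
    frontal: Sequence[Sequence[float]],
    lateral: Sequence[Sequence[float]],
    w_f: float = 1.0,  # hypothetical default, not the paper's value
    w_l: float = 1.0,  # hypothetical default, not the paper's value
) -> List[float]:
    """Compute P_final = (w_f * P_f + w_l * P_l) / (w_f + w_l)."""
    p_f = mean_prediction(frontal)
    p_l = mean_prediction(lateral)
    if p_f is None and p_l is None:
        raise ValueError("a study needs at least one image")
    # Assumption: if one view is missing, fall back to the other view's mean.
    if p_l is None:
        return p_f
    if p_f is None:
        return p_l
    total = w_f + w_l
    return [(w_f * a + w_l * b) / total for a, b in zip(p_f, p_l)]


# Example study: two frontal images, one lateral image, two classes.
frontal_preds = [[0.75, 0.25], [0.25, 0.25]]
lateral_preds = [[1.0, 0.5]]
final = aggregate_views(frontal_preds, lateral_preds, w_f=3.0, w_l=1.0)
```

Averaging within each view first means that a study with many frontal images and a single lateral image does not let the frontal images outvote the lateral one by count alone; the relative influence of the views is controlled only by \(w_f\) and \(w_l\).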

Ensemble of ConvNeXt V2 and MaxViT for Long-Tailed CXR Classification with View-Based Aggregation

CheXFusion: Effective Fusion of Multi-View Features using Transformers for Long-Tailed Chest X-Ray Classification

Bag of Tricks for Long-Tailed Multi-Label Classification on Chest X-Rays

Distilling Sub-Space Structure Across Views for Cardiac Indices Estimation

LTCXNet: Advancing Chest X-Ray Analysis with Solutions for Long-Tailed Multi-Label Classification and Fairness Challenges

MMViT-Seg: A Lightweight Transformer and CNN Fusion Network for COVID-19 Segmentation.

Detecting Tuberculosis-Consistent Findings in Lateral Chest X-Rays Using an Ensemble of CNNs and Vision Transformers

Towards long-tailed, multi-label disease classification from chest X-ray: Overview of the CXR-LT challenge

Modeling Long-Range Dependencies for Weakly Supervised Disease Classification and Localization on Chest X-ray

Expanding the Horizon: Enabling Hybrid Quantum Transfer Learning for Long-Tailed Chest X-Ray Classification

Long-Tailed Classification of Thorax Diseases on Chest X-Ray: A New Benchmark Study

Long-tailed multi-label classification with noisy label of thoracic diseases from chest X-ray

CATS v2: Hybrid encoders for robust medical segmentation

Quantifying the Value of Lateral Views in Deep Learning for Chest X-rays

Low-Resolution Chest X-ray Classification via Knowledge Distillation and Multi-task Learning

SynthEnsemble: A Fusion of CNN, Vision Transformer, and Hybrid Models for Multi-Label Chest X-Ray Classification

Ensemble CNN models for Covid-19 Recognition and Severity Perdition From 3D CT-scan

Multi-task contrastive learning for automatic CT and X-ray diagnosis of COVID-19

Enhancing COVID-19 Severity Analysis through Ensemble Methods

FDVTS's Solution for 2nd COV19D Competition on COVID-19 Detection and Severity Analysis

An Ensemble Deep Learning Approach for COVID-19 Severity Prediction Using Chest CT Scans