Abstract: Pulmonary fibrosis (PF) is an incurable respiratory condition characterized by permanent fibrotic changes in lung tissue. It is therefore crucial to diagnose PF quickly and accurately. Existing research on deep learning-based PF detection has limitations, including limited dataset sample sizes and a lack of standardization in data preprocessing and evaluation metrics. This study presents a comparative analysis of four vision transformers, assessing how accurately they detect and classify patients with PF and how well they localize abnormalities in computed tomography (CT) images. The dataset consisted of 13,486 samples selected from the 24,647 in the Pulmonary Fibrosis dataset and included both PF-positive and normal CT images, all of which underwent preprocessing. The preprocessed images were divided into three sets: a training set (80% of the images), a validation set (10%), and a test set (10%). The vision transformer models ViT, MobileViT2, ViTMSN, and BEiT were trained and validated, with hyperparameters such as the learning rate and batch size tuned. The optimized architectures were then assessed using several performance metrics to demonstrate the consistency of the fine-tuned models. ViT showed the best validation and testing accuracy and the lowest loss on CT images when trained for a single epoch with a tuned learning rate of 0.0001: validation accuracy of 99.85%, testing accuracy of 100%, training loss of 0.0075, and validation loss of 0.0047.
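The 80/10/10 partition described above can be illustrated with a minimal sketch. The function name `split_indices`, the fixed seed, and the use of a plain random shuffle are assumptions made for illustration; the abstract does not state whether the split was stratified by class or grouped by patient.

```python
import random

def split_indices(n_samples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle sample indices and cut them into train/val/test partitions.

    Hypothetical helper: the test partition receives whatever remains
    after the train and validation fractions are taken.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

# 13,486 is the number of selected samples reported in the abstract.
train_idx, val_idx, test_idx = split_indices(13486)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because the fractions are truncated to whole counts, the three partitions always cover every sample exactly once, with any rounding remainder falling into the test set.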
Experimental evaluation on independently collected data provides empirical evidence that the optimized ViT architecture outperformed all other optimized architectures. It achieved a perfect score of 1.0 on several standard performance metrics, including sensitivity, specificity, accuracy, F1-score, precision, recall, Matthews correlation coefficient (MCC), area under the precision-recall curve (AUC PR), and area under the receiver operating characteristic curve (ROC-AUC). The optimized ViT can therefore serve as a reliable diagnostic tool for the automated classification of individuals with PF from chest CT scans.
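As a rough illustration of how the threshold-based metrics listed above relate to one another, the sketch below derives them from the four cells of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study. AUC PR and ROC-AUC are omitted because they require per-sample scores rather than counts.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts.

    Hypothetical helper; ratios with a zero denominator are reported as 0.0.
    """
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # identical to recall
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    pr_sum = precision + sensitivity
    f1 = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": sensitivity,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }

# Illustrative counts only: a classifier with no errors scores 1.0 everywhere.
perfect = binary_metrics(tp=50, fp=0, tn=50, fn=0)
# Illustrative counts only: a classifier with some errors.
imperfect = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print(perfect)
print(imperfect)
```

Sensitivity and recall are the same quantity, so the two always coincide. A score of 1.0 on every metric corresponds to a confusion matrix with no false positives and no false negatives.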

A vision transformer for emphysema classification using CT images

Classification of Pulmonary Emphysema in CT Images Based on Multi-Scale Deep Convolutional Neural Networks

Classification and Quantification of Emphysema Using a Multi-Scale Residual Network

Multi-scale Residual Network with Two Channels of Raw CT Image and Its Differential Excitation Component for Emphysema Classification

Joint Weber-Based Rotation Invariant Uniform Local Ternary Pattern for Classification of Pulmonary Emphysema in CT Images

FibroVit—Vision transformer-based framework for detection and classification of pulmonary fibrosis from chest CT images

Vision Transformer for Efficient Chest X-ray and Gastrointestinal Image Classification

Diff3Dformer: Leveraging Slice Sequence Diffusion for Enhanced 3D CT Classification with Transformer Networks

Implementing vision transformer for classifying 2D biomedical images

Improved EATFormer: A Vision Transformer for Medical Image Classification

Transformer-Based Disease Identification for Small-Scale Imbalanced Capsule Endoscopy Dataset

Vision transformer introduces a new vitality to the classification of renal pathology

Application of Vision-Series Transformer in Screening for Coronary Heart Diseases Using Coronary CT Angiography

COVID-19 detection using ViT transformer-based approach from Computed Tomography Images

Research and implementation of multi-disease diagnosis on chest X-ray based on vision transformer

Vision transformer with masked autoencoders for referable diabetic retinopathy classification based on large-size retina image

Conv-attention ViT for classification of multi-label class imbalanced data of lung thoracic diseases

HViT: Hybrid vision inspired transformer for the assessment of carotid artery plaque by addressing the cross-modality domain adaptation problem in MRI

Automated retinal disease classification using hybrid transformer model (SViT) using optical coherence tomography images

Towards efficient diagnostics: refining vision transformers for medical image multi-label classification

Multi-scale Deep Convolutional Neural Networks for Emphysema Classification and Quantification