Abstract: Brain tumor detection and classification are crucial steps in evaluating life-threatening abnormal tissue and planning appropriate treatment. For clinical assessment, magnetic resonance imaging (MRI) is normally used because of its excellent image quality and lack of ionizing radiation. However, as the volume of data grows, manual processing of MRI images becomes expensive, time-consuming, and error-prone. Traditional automated detection systems also struggle to handle complex image patterns, which reduces classification accuracy. To address these problems, this paper designs a reliable and effective brain tumor detection mechanism. The proposed "Vision Transformer with Attention and Linear Transformation module (VITALT)" system combines a Vision Transformer (ViT), a split bidirectional feature pyramid network (S-BiFPN), a linear transformation module (LTM), and soft quantization to extract features effectively from complex brain structures. First, preprocessing steps such as resizing and normalization are applied to mitigate training inaccuracies caused by dimension and quality constraints. The preprocessed images are then divided into a number of patches, and each patch is embedded into a high-dimensional vector to provide a more compact image representation. Subsequently, the ViT module captures the global and local features of the image by learning the relationships between image patches. The resulting multi-scale spatial features are fused using S-BiFPN to increase prediction accuracy. The LTM improves the linear expression capability of the design and thereby identifies the characteristics that are most important for brain tumor classification. Soft quantization is used to reduce the memory footprint while minimizing quantization errors in detection. Finally, a head module with a set of fully connected layers accurately classifies the different classes of brain tumors.
The experimental analysis, conducted on four benchmark brain tumor datasets and measured by multiple evaluation metrics, shows the viability and reliability of the proposed VITALT system in predicting brain tumors. The proposed system achieves classification accuracies of 99.08% for Dataset A, 98.97% for Dataset B, 98.82% for Dataset C, and 99.15% for Dataset D. This high classification accuracy highlights the system's potential in medical imaging applications and its ability to contribute to improved surgical treatments.
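
As a rough illustration of the front end of the pipeline described above (normalization, patch splitting, patch embedding, and attention across patches), the following minimal Python sketch may help. It is not the VITALT implementation: the function names, the min-max normalization, the random projection weights, the toy 8x8 image, and the single-head attention with identity query/key/value projections are all assumptions made for illustration, and resizing, S-BiFPN, LTM, soft quantization, and the classification head are omitted.

```python
import math
import random


def normalize(image):
    """Min-max scale pixel intensities to [0, 1] (assumed normalization)."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on constant images
    return [[(p - lo) / span for p in row] for row in image]


def split_into_patches(image, patch_size):
    """Cut a 2D image into flattened, non-overlapping square patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patches.append([image[top + dy][left + dx]
                            for dy in range(patch_size)
                            for dx in range(patch_size)])
    return patches


def embed(patches, weights):
    """Project every patch with one shared linear map (no bias).

    weights[j] holds the weight vector of output unit j.
    """
    return [[sum(x * w for x, w in zip(patch, unit)) for unit in weights]
            for patch in patches]


def self_attention(tokens):
    """Single-head scaled dot-product attention with identity Q/K/V maps."""
    dim = len(tokens[0])
    mixed = []
    for query in tokens:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        attn = [e / total for e in exps]
        mixed.append([sum(w * value[i] for w, value in zip(attn, tokens))
                      for i in range(dim)])
    return mixed


random.seed(0)
# Toy 8x8 grayscale "image"; a real MRI slice would be resized first.
image = [[random.randint(0, 255) for _ in range(8)] for _ in range(8)]
patches = split_into_patches(normalize(image), patch_size=4)
# Hypothetical projection weights: 6 output units over 16 pixels per patch.
weights = [[random.gauss(0.0, 0.1) for _ in range(16)] for _ in range(6)]
tokens = embed(patches, weights)
mixed = self_attention(tokens)
print(len(patches), len(patches[0]), len(tokens[0]), len(mixed))  # 4 16 6 4
```

Each output token is a weighted average of all patch embeddings, which is the sense in which attention lets every patch draw on every other patch. In a trained model the projection weights and the query, key, and value maps are learned, and the embedding dimension is typically much larger than in this toy setting.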

VITALT: a robust and efficient brain tumor detection system using vision transformer with attention and linear transformation