Abstract: In recent years, the integration of advanced imaging techniques and deep learning methods has significantly advanced computer-aided diagnosis (CAD) systems for breast cancer detection and classification. Transformers, which have shown great promise in computer vision, are now being applied to medical image analysis. However, applying them to histopathological images is challenging: these models need large amounts of data to work effectively, and producing the required manual annotations of whole-slide images (WSIs) is costly and time-consuming. Furthermore, the quadratic computational cost of Vision Transformers (ViTs) is particularly prohibitive for large, high-resolution histopathological images, especially on edge devices with limited computational resources. In this study, we introduce a novel lightweight transformer-based approach to breast cancer classification that operates effectively without large datasets. By incorporating parallel processing pathways for Discrete Cosine Transform (DCT) Attention and MobileConv, we convert image data from the spatial domain to the frequency domain, where high-frequency components can be filtered out, which reduces computational cost. Our proposed model achieves an accuracy of 96.00% $\pm$ 0.48% for binary classification and 87.85% $\pm$ 0.93% for multiclass classification, which is comparable to state-of-the-art models while significantly reducing computational costs. This demonstrates the potential of our approach to improve breast cancer classification in histopathological images, offering a more efficient solution with reduced reliance on extensive annotated datasets.
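The frequency-domain idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' DCT Attention module; it only shows the general principle of taking a 2D DCT of an image block, keeping the low-frequency corner, and counting how many values remain. The function names (`dct_matrix`, `dct2`, `low_pass`, `idct2_from_low`), the 8x8 block size, and the choice to keep a 4x4 corner are all assumptions made for illustration.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of lists."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def dct2(block):
    """2D DCT of a square block: C @ X @ C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def low_pass(coeffs, keep):
    """Keep only the top-left keep x keep (low-frequency) coefficients."""
    return [row[:keep] for row in coeffs[:keep]]

def idct2_from_low(low, n):
    """Zero-pad the kept coefficients to n x n and invert: C^T @ Y @ C."""
    keep = len(low)
    padded = [[low[r][c] if r < keep and c < keep else 0.0 for c in range(n)]
              for r in range(n)]
    c = dct_matrix(n)
    return matmul(matmul(transpose(c), padded), c)

# Hypothetical 8x8 block with a smooth intensity ramp.
block = [[float(r + c) for c in range(8)] for r in range(8)]
coeffs = dct2(block)
low = low_pass(coeffs, 4)

values_before = len(block) * len(block[0])  # 64 spatial values
values_after = len(low) * len(low[0])       # 16 low-frequency coefficients
# If each value were treated as one token, the pairwise attention term
# would shrink by (64 / 16) ** 2 = 16 under this assumption.
print(values_before, values_after, (values_before / values_after) ** 2)
```

Because the DCT basis used here is orthonormal, keeping all coefficients reconstructs the block exactly, and discarding the high-frequency ones gives a smoothed approximation. How the actual model selects and uses frequency components is not specified in the abstract.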

Vision Transformers (ViT) Pretraining on 3D ABUS Image and Dual-CapsViT: Enhancing ViT Decoding Via Dual-Channel Dynamic Routing

Semi-supervised vision transformer with adaptive token sampling for breast cancer classification

BUViTNet: Breast Ultrasound Detection via Vision Transformers

Identifying Malignant Breast Ultrasound Images Using ViT-Patch

Transformer-Based Disease Identification for Small-Scale Imbalanced Capsule Endoscopy Dataset

3D-EffiViTCaps: 3D Efficient Vision Transformer with Capsule for Medical Image Segmentation

Using Vision Transformers in 3-D Medical Image Classifications

EfficientUNetViT: Efficient Breast Tumor Segmentation Utilizing UNet Architecture and Pretrained Vision Transformer

Vision Transformer for Classification of Breast Ultrasound Images

Vision transformer-convolution for breast cancer classification using mammography images: A comparative study

Implementing vision transformer for classifying 2D biomedical images

MMMViT: Multiscale multimodal vision transformer for brain tumor segmentation with missing modalities

ViT-CB: Integrating hybrid Vision Transformer and CatBoost to enhanced brain tumor detection with SHAP

DctViT: Discrete Cosine Transform Meet Vision Transformers

ViR: the Vision Reservoir

DCT-HistoTransformer: Efficient Lightweight Vision Transformer with DCT Integration for histopathological image analysis

DDViT: Double-Level Fusion Domain Adapter Vision Transformer (Student Abstract)

Pretrained ViTs Yield Versatile Representations For Medical Images

Vision transformer introduces a new vitality to the classification of renal pathology

UNetFormer: A Unified Vision Transformer Model and Pre-Training Framework for 3D Medical Image Segmentation

MIL-ViT: A Multiple Instance Vision Transformer for Fundus Image Classification