Abstract: Autism spectrum disorder (ASD) is a condition observed in children who display abnormal patterns of interaction, behavior, and communication with others. Despite extensive research efforts, the underlying causes of this neurodevelopmental disorder and its biomarkers remain unknown. However, advancements in artificial intelligence and machine learning have improved clinicians' ability to diagnose ASD. This review paper investigates various MRI modalities to identify distinct features that characterize individuals with ASD compared to typical control subjects. The review then explores deep learning models for ASD diagnosis, including convolutional neural networks (CNNs), autoencoders, graph convolutions, attention networks, and other models. CNNs and their variations are particularly effective due to their capacity to learn structured image representations and identify reliable biomarkers for brain disorders. Computer vision transformers often employ CNN architectures with transfer learning techniques, such as fine-tuning and layer freezing, to enhance image classification performance, surpassing traditional machine learning models. This review paper contributes in three main ways. Firstly, it provides a comprehensive overview of a recommended architecture for using vision transformers in the systematic ASD diagnostic process. To this end, the paper investigates various pre-trained vision architectures, such as VGG, ResNet, Inception, InceptionResNet, DenseNet, and Swin models, that were fine-tuned for ASD diagnosis and classification. Secondly, it discusses vision transformers of the 2020s, such as BiT, ViT, MobileViT, and ConvNeXt, and the application of transfer learning methods, in relation to their prospective practicality in ASD classification. Thirdly, it explores brain transformers that are pre-trained on medically rich data and MRI neuroimaging datasets. The paper recommends a systematic architecture for ASD diagnosis using brain transformers. It also reviews recently developed brain transformer-based models, such as METAFormer, Com-BrainTF, Brain Network, ST-Transformer, STCAL, BolT, and BrainFormer, discussing their deep transfer learning architectures and results in ASD detection. Additionally, the paper summarizes and discusses brain-related transformers for various brain disorders, such as MSGTN, STAGIN, and MedTransformer, in relation to their potential usefulness in ASD. The study suggests that developing specialized transformer-based models, following the success of transformers in natural language processing (NLP), can offer new directions for image classification problems in learning and classifying ASD brain biomarkers. Brain transformers that incorporate the attention mechanism, treat MRI modalities as sequence prediction tasks, are trained on brain disorder classification problems, and are fine-tuned on ASD datasets can show great promise in ASD diagnosis.
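
To make the attention mechanism mentioned in the abstract concrete, the following is a minimal sketch of scaled dot-product self-attention over a short sequence of feature vectors, written in plain Python. It is an illustration only and is not the architecture of any model reviewed in the paper. The toy input (three "time windows" with four features each) and the names `softmax`, `self_attention`, and `tokens` are assumptions made for this example. Learned query, key, and value projections, multiple heads, and positional encodings are all omitted for brevity.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention.

    Simplifying assumption: queries, keys, and values are the input
    tokens themselves (identity projections), so nothing is learned.
    Returns the attended outputs and the attention weight matrix.
    """
    d = len(tokens[0])
    scale = math.sqrt(d)
    weights = []
    for q in tokens:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        weights.append(softmax(scores))
    # Each output token is a weighted average of all value vectors.
    output = [
        [sum(w * v[j] for w, v in zip(row, tokens)) for j in range(d)]
        for row in weights
    ]
    return output, weights


# Hypothetical toy sequence: 3 time windows, 4 features per window.
tokens = [
    [1.0, 0.0, 0.5, 0.2],
    [0.9, 0.1, 0.4, 0.3],
    [0.0, 1.0, 0.1, 0.8],
]
output, weights = self_attention(tokens)
```

Each row of `weights` sums to one and indicates how strongly one window attends to the others. In this toy input the first window places more weight on the second window, which it resembles, than on the third.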

ViTASD: Robust Vision Transformer Baselines for Autism Spectrum Disorder Facial Diagnosis

ASDFace: Face-based Autism Diagnosis Via Heterogeneous Domain Adaptation

Identifying Children with Autism Spectrum Disorder Via Transformer-Based Representation Learning from Dynamic Facial Cues

Computer Vision-Based Assessment of Autistic Children: Analyzing Interactions, Emotions, Human Pose, and Life Skills

Effective And Efficient Visual Stimuli Design For Quantitative Autism Screening: An Exploratory Study

Visual Attention Analysis and Prediction on Human Faces for Children with Autism Spectrum Disorder

Dynamic graph transformer network via dual-view connectivity for autism spectrum disorder identification

Vision-based activity recognition in children with autism-related behaviors

An Advanced Deep Learning Framework for Video-Based Diagnosis of ASD

Do it the transformer way: A comprehensive review of brain and vision transformers for autism spectrum disorder diagnosis and classification

Facial Features Detection System To Identify Children With Autism Spectrum Disorder: Deep Learning Models

Computer vision tools for the non-invasive assessment of autism-related behavioral markers

Deep Learning Approach for Screening Autism Spectrum Disorder in Children with Facial Images and Analysis of Ethnoracial Factors in Model Development and Application

Dynamic Viewing Pattern Analysis: Towards Large-Scale Screening of Children with ASD in Remote Areas

HSViT: Horizontally Scalable Vision Transformer

Advancing Plain Vision Transformer Toward Remote Sensing Foundation Model

TimeConvNets: A Deep Time Windowed Convolution Neural Network Design for Real-time Video Facial Expression Recognition

A Multiview Brain Network Transformer Fusing Individualized Information for Autism Spectrum Disorder Diagnosis

Pretraining is All You Need: A Multi-Atlas Enhanced Transformer Framework for Autism Spectrum Disorder Classification

Vision Transformers (ViT) Pretraining on 3D ABUS Image and Dual-CapsViT: Enhancing ViT Decoding Via Dual-Channel Dynamic Routing