Abstract: Objective: Bio-signals such as electroencephalography (EEG) and electromyography (EMG) are widely used for the rehabilitation of physically disabled people and for the characterization of cognitive impairments. Successful decoding of these bio-signals is, however, non-trivial because of their time-varying and non-stationary characteristics. Furthermore, the existence of short- and long-range dependencies in these time-series signals makes decoding even more challenging. State-of-the-art studies have proposed Convolutional Neural Network (CNN) based architectures for the classification of these bio-signals, which have proven useful for learning spatial representations. However, because of their fixed-size convolutional kernels and shared weights, CNNs pay only uniform attention to the input and are suboptimal at learning short- and long-term dependencies simultaneously, which could be pivotal in decoding EEG and EMG signals. It is therefore important to address these limitations of CNNs. Transformer-based architectures can play a significant role here, because they learn short- and long-range dependencies simultaneously and pay more attention to the more relevant parts of the input signal. Transformers, however, require a large corpus of training data, whereas EEG and EMG decoding studies produce only a limited amount of data. Consequently, standalone Transformer neural networks produce ordinary results. In this study, we ask whether we can address the limitations of CNNs and Transformer neural networks and provide a robust and generalized model that, with limited training data, can simultaneously learn spatial patterns, learn short- and long-term dependencies, and pay a variable amount of attention to a time-varying, non-stationary input signal. Approach: In this work, we introduce a novel hybrid model called ConTraNet, which is based on CNN and Transformer architectures and combines the strengths of both.
ConTraNet uses a CNN block to introduce inductive bias into the model and learn local dependencies, whereas the Transformer block uses the self-attention mechanism to learn the short- and long-range (global) dependencies in the signal and to pay different amounts of attention to different parts of the signal. Main results: We evaluated ConTraNet and compared it with state-of-the-art methods on four publicly available datasets (BCI Competition IV dataset 2b, Physionet MI-EEG dataset, Mendeley sEMG dataset, Mendeley sEMG V1 dataset), which belong to the EEG-HMI and EMG-HMI paradigms. ConTraNet outperformed its counterparts in all task categories (2-class, 3-class, 4-class, 7-class, and 10-class decoding tasks). Significance: With limited training data, ConTraNet significantly improves classification performance on four publicly available datasets for 2-, 3-, 4-, 7-, and 10-class tasks compared to its counterparts.
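
The division of labor between a convolutional block and a self-attention block can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function names (`conv1d`, `self_attention`, `hybrid_forward`), the hand-picked kernels, the identity query/key/value projections, the single attention head, and the mean pooling are all assumptions made for illustration, and a trained model would learn its weights instead of fixing them.

```python
import math
import random

def conv1d(signal, kernels):
    """Valid 1-D cross-correlation: one feature per kernel at each time step."""
    k = len(kernels[0])
    return [
        [sum(w * signal[t + i] for i, w in enumerate(kernel)) for kernel in kernels]
        for t in range(len(signal) - k + 1)
    ]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections (assumed)."""
    d = len(tokens[0])
    outputs, weights = [], []
    for q in tokens:
        # Similarity of this time step to every time step, near or far.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of all tokens: each step attends to the whole sequence.
        outputs.append([sum(w_j * v[c] for w_j, v in zip(w, tokens)) for c in range(d)])
    return outputs, weights

def hybrid_forward(signal, kernels):
    """Hypothetical hybrid pass: local features first, then global mixing."""
    tokens = conv1d(signal, kernels)             # CNN block: local dependencies
    attended, weights = self_attention(tokens)   # Transformer block: global dependencies
    pooled = [sum(col) / len(attended) for col in zip(*attended)]  # mean over time
    return pooled, weights

# Toy single-channel signal standing in for an EEG/EMG trace (synthetic data).
random.seed(0)
signal = [math.sin(0.3 * t) + 0.1 * random.gauss(0, 1) for t in range(32)]
kernels = [[0.25, 0.5, 0.25], [-1.0, 0.0, 1.0]]  # smoothing and difference filters
features, attn = hybrid_forward(signal, kernels)
```

Here each row of `attn` is a probability distribution over time steps, so different parts of the signal can receive different weights, while the fixed-width kernels only ever see three neighboring samples. In a classifier, `features` would typically be passed to a small output layer.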

CiTrus: Squeezing Extra Performance out of Low-data Bio-signal Transfer Learning

ConTraNet: A hybrid network for improving the classification of EEG and EMG signals with limited training data

Enhancing Eye-Tracking Performance through Multi-Task Learning Transformer

Parkinsonian Tremor Detection with Compact Convolutional Transformer from Bispectrum Representation of tri-Axial Accelerometer Signals

Masked Transformer for Electrocardiogram Classification

Self-Supervised Pretraining on Paired Sequences of fMRI Data for Transfer Learning to Brain Decoding Tasks

Enhancing ECG signal classification through pre-trained stacked-CNN embeddings: a transfer learning approach

HeartBEiT: Vision Transformer for Electrocardiogram Data Improves Diagnostic Performance at Low Sample Sizes

Fusing Pretrained ViTs with TCNet for Enhanced EEG Regression

On Efficient Transformer-Based Image Pre-training for Low-Level Vision

Frequency-Aware Masked Autoencoders for Multimodal Pretraining on Biosignals

Self-supervised Pretraining and Transfer Learning Enable Flu and COVID-19 Predictions in Small Mobile Sensing Datasets

Video and Synthetic MRI Pre-training of 3D Vision Architectures for Neuroimage Analysis

Echo-Vision-FM: A Pre-training and Fine-tuning Framework for Echocardiogram Videos Vision Foundation Model

Cross-dataset transfer learning for Motor Imagery signal classification via multi-task learning and pre-training

Transformer-based network with temporal depthwise convolutions for sEMG recognition

ViT2EEG: Leveraging Hybrid Pretrained Vision Transformers for EEG Data

Promoting cross-modal representations to improve multimodal foundation models for physiological signals

Toward Domain-Free Transformer for Generalized EEG Pre-Training

Neuro-BERT: Rethinking Masked Autoencoding for Self-supervised Neurological Pretraining

Improving Hybrid CTC/Attention End-to-end Speech Recognition with Pretrained Acoustic and Language Model