ADAPT: Multimodal Learning for Detecting Physiological Changes under Missing Modalities

Julie Mordacq, Leo Milecki, Maria Vakalopoulou, Steve Oudot, Vicky Kalogeiton

2024-07-04

Abstract: Multimodality has recently gained attention in the medical domain, where imaging or video modalities may be integrated with biomedical signals or health records. Yet, two challenges remain: balancing the contributions of modalities, especially in cases with a limited amount of data available, and tackling missing modalities. To address both issues, in this paper, we introduce the AnchoreD multimodAl Physiological Transformer (ADAPT), a multimodal, scalable framework with two key components: (i) aligning all modalities in the space of the strongest, richest modality (called anchor) to learn a joint embedding space, and (ii) a Masked Multimodal Transformer, leveraging both inter- and intra-modality correlations while handling missing modalities. We focus on detecting physiological changes in two real-life scenarios: stress in individuals induced by specific triggers and fighter pilots' loss of consciousness induced by $g$-forces. We validate the generalizability of ADAPT through extensive experiments on two datasets for these tasks, where we set the new state of the art while demonstrating its robustness across various modality scenarios and its high potential for real-life applications.
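To make component (i) concrete, here is a minimal sketch of aligning one modality to an anchor modality. The abstract does not state which loss is used, so the InfoNCE-style contrastive objective below is an assumption, as are the function names, the temperature value, and the toy embeddings. It illustrates the general idea and is not the authors' implementation.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def anchor_alignment_loss(anchor_embs, other_embs, temperature=0.1):
    """InfoNCE-style loss (assumed, not taken from the paper).

    anchor_embs[i] and other_embs[i] come from the same sample. Each
    non-anchor embedding is pulled toward its own anchor embedding and
    pushed away from the anchors of the other samples in the batch.
    """
    anchors = [l2_normalize(a) for a in anchor_embs]
    others = [l2_normalize(o) for o in other_embs]
    total = 0.0
    for i, o in enumerate(others):
        logits = [dot(o, a) / temperature for a in anchors]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]  # negative log-probability of the true pair
    return total / len(others)

# Hypothetical toy batch of two samples with 2-D embeddings.
anchor = [[1.0, 0.0], [0.0, 1.0]]            # anchor-modality embeddings
signal_aligned = [[0.9, 0.1], [0.1, 0.9]]    # close to the matching anchors
signal_shuffled = [[0.1, 0.9], [0.9, 0.1]]   # close to the wrong anchors

loss_aligned = anchor_alignment_loss(anchor, signal_aligned)
loss_shuffled = anchor_alignment_loss(anchor, signal_shuffled)
```

In this toy case the loss is lower when each signal embedding sits near its own anchor than when the pairing is shuffled. An alignment objective of this kind rewards exactly that behavior.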

Computer Vision and Pattern Recognition, Machine Learning

What problem does this paper attempt to address?

The paper addresses two issues that arise when detecting physiological changes from multimodal data: how to balance the contributions of different modalities, especially when data is limited, and how to handle missing modalities. It introduces ADAPT (AnchoreD multimodAl Physiological Transformer), a scalable multimodal learning framework built on two key components:

1. **Modality Alignment**: All modalities are aligned in the space of the strongest, richest modality (the "anchor" modality) to learn a joint embedding space. This is intended to balance the contribution of each modality so that the model can use information from all of them effectively.
2. **Masked Multimodal Transformer**: A transformer with a masking mechanism handles missing modalities while leveraging both inter- and intra-modality correlations for feature fusion.

The paper evaluates ADAPT in two real-life scenarios: detecting stress in individuals induced by specific triggers, and detecting fighter pilots' loss of consciousness induced by g-forces. According to the authors, ADAPT sets a new state of the art on the two datasets used for these tasks and remains robust across various modality scenarios, including missing modalities, which suggests high potential for real-life applications.
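The masking idea can be sketched with single-query scaled dot-product attention over one token per modality, where missing modalities are excluded from the softmax. This is an illustrative simplification under stated assumptions: the function name, the boolean `present` flags, and the toy tokens are hypothetical, and the actual Masked Multimodal Transformer may differ in its details.

```python
import math

def masked_attention(query, keys, values, present):
    """Attend over modality tokens, giving missing modalities zero weight.

    present[i] is False when modality i is unavailable. Its score is set
    to -inf before the softmax, so it contributes nothing to the fusion.
    """
    if not any(present):
        raise ValueError("at least one modality must be present")
    scale = math.sqrt(len(query))
    scores = [
        dot_product(query, key) / scale if ok else float("-inf")
        for key, ok in zip(keys, present)
    ]
    m = max(scores)                           # finite: some modality is present
    exps = [math.exp(s - m) for s in scores]  # exp(-inf) evaluates to 0.0
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(values[0])
    fused = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return fused, weights

def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical tokens for three modalities; the second one is missing
# and is represented by a zero placeholder that the mask hides.
tokens = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
present = [True, False, True]
query = [1.0, 1.0]

fused, weights = masked_attention(query, tokens, tokens, present)
```

Because the mask is applied before normalization, the attention weights of the available modalities still sum to one. The same model can therefore be run on any subset of modalities without changing its architecture.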

Multimodal Neurophysiological Transformer for Emotion Recognition

MDA: An Interpretable and Scalable Multi-Modal Fusion under Missing Modalities and Intrinsic Noise Conditions

Exploring Missing Modality in Multimodal Egocentric Datasets

Attention-based multimodal fusion with contrast for robust clinical prediction in the face of missing modalities

A transformer-based unified multimodal framework for Alzheimer's disease assessment

Accommodating Missing Modalities in Time-Continuous Multimodal Emotion Recognition

Hierarchical multimodal-fusion of physiological signals for emotion recognition with scenario adaption and contrastive alignment

Disentangled Adversarial Transfer Learning for Physiological Biosignals

Multimodal Representation Learning by Alternating Unimodal Adaptation

MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report

Cascaded Multi-Modal Mixing Transformers for Alzheimer's Disease Classification with Incomplete Data

Paying attention to uncertainty: A stochastic multimodal transformers for post-traumatic stress disorder detection using video

Spatial Adaptation Layer: Interpretable Domain Adaptation For Biosignal Sensor Array Applications

PhysioMTL: Personalizing Physiological Patterns using Optimal Transport Multi-Task Regression

Real-time emotional health detection using fine-tuned transfer networks with multimodal fusion

Promoting cross-modal representations to improve multimodal foundation models for physiological signals

An Improved ConvNeXt with Multimodal Transformer for Physiological Signal Classification

Multimodal Learning To Improve Cardiac Late Mechanical Activation Detection From Cine MR Images

HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data