Abstract: Most mainstream statistical models perform poorly in Out-Of-Distribution (OOD) generalization because they tend to learn spurious correlations in the data and fail when a domain shift occurs. For artificial intelligence (AI) to make real progress in practical applications, attention needs to shift to the OOD problem of deep learning models and to their ability to generalize to unknown environments. Domain generalization (DG) addresses OOD generalization by transferring knowledge extracted from multiple source domains to an unseen target domain. Inspired by the intuition that human intelligence relies on causality rather than on plain probabilistic correlations, we apply a novel causal perspective to DG that improves the OOD generalization ability of the trained model by mining the invariant causal mechanism. First, we construct a causal graph that covers most DG tasks through stepwise causal analysis of the data generation process in natural environments, and we introduce a corresponding Structural Causal Model (SCM). Second, based on counterfactual inference, we propose causal semantic representation learning with domain intervention (CSRDN) to train a robust model. Specifically, we generate counterfactual representations under different domain interventions, which help the model learn causal semantics and develop generalization capacity. In addition, we seek a Pareto optimal solution based on the loss function during optimization to obtain a better trained model. Extensive experiments on the Rotated MNIST, PACS, and VLCS datasets verify the effectiveness of the proposed CSRDN. The proposed method integrates causal inference into domain generalization, enhances interpretability and applicability, and improves performance on challenging OOD generalization problems.

Background no more: Action recognition across domains by causal interventions

ActionCLIP: Adapting Language-Image Pretrained Models for Video Action Recognition

Deep Causal Domain Generalization Network for Human Action Recognition in Internet of Behaviors

Learning Causal Representation for Training Cross-Domain Pose Estimator Via Generative Interventions

Human Action Recognition with Contextual Constraints Using a RGB-D Sensor

Unintentional Action Localization Via Counterfactual Examples

Domain-Specific Priors and Meta Learning for Few-Shot First-Person Action Recognition

Learning Causal Domain-Invariant Temporal Dynamics for Few-Shot Action Recognition

Causal Interventional Training for Image Recognition

Dynamic Video Mix-Up for Cross-Domain Action Recognition

OccludeNet: A Causal Journey into Mixed-View Actor-Centric Video Action Recognition under Occlusions

A Causality-Aware Perspective on Domain Generalization via Domain Intervention

Spatio-Temporal Context Prompting for Zero-Shot Action Detection

Implicit Affordance Acquisition via Causal Action-Effect Modeling in the Video Domain

GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap

Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition

Self-supervised pretext task collaborative multi-view contrastive learning for video action recognition

Human-Centric Transformer for Domain Adaptive Action Recognition

Cross-domain video action recognition via adaptive gradual learning

Disentanglement of Latent Representations via Causal Interventions

Interpretable Action Recognition on Hard to Classify Actions