Robust Domain Generalisation with Causal Invariant Bayesian Neural Networks

Gaël Gendron, Michael Witbrock, Gillian Dobbie
2024-10-09
Abstract: Deep neural networks can obtain impressive performance on various tasks under the assumption that their training domain is identical to their target domain. Performance can drop dramatically when this assumption does not hold. One explanation for this discrepancy is the presence of spurious domain-specific correlations in the training data that the network exploits. Causal mechanisms, on the other hand, can be made invariant under distribution changes as they allow disentangling the factors of distribution underlying the data generation. Yet, learning causal mechanisms to improve out-of-distribution generalisation remains an under-explored area. We propose a Bayesian neural architecture that disentangles the learning of the data distribution from the inference process mechanisms. We show theoretically and experimentally that our model approximates reasoning under causal interventions. We demonstrate the performance of our method, outperforming point-estimate counterparts, on out-of-distribution image recognition tasks where the data distribution acts as strong adversarial confounders.
Machine Learning, Methodology
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses the performance degradation of deep neural networks on out-of-distribution (o.o.d.) tasks. When the distribution of the training data differs from that of the test data, performance can decline significantly. One of the main reasons is the presence of domain-specific spurious correlations in the training data, which the network exploits but which may no longer hold in the test data.

### Solutions

The authors propose a Causal-Invariant Bayesian Neural Network (CIBNN). The method improves o.o.d. generalization through the following components:

1. **Separation of causal mechanisms**: A causal graph structure separates domain-specific mechanisms from domain-invariant mechanisms, so that the model can better adapt to new environments.
2. **Partially stochastic Bayesian neural network**: Variational inference is combined with a partially stochastic Bayesian neural network to model causal paths and perform inference in an interventional setting.
3. **Fusion of context information**: A label mixup technique incorporates context information into training to improve the robustness and training stability of the model.
4. **Regularization techniques**: A weight function regularization is introduced to enhance the diversity of the weights and align it with functional diversity.

### Experimental results

The authors ran experiments on standard out-of-distribution image recognition tasks, including the CIFAR-10 and OFFICEHOME datasets. The results show that the proposed CIBNN model outperforms the baseline models on both in-distribution (i.i.d.) and o.o.d. tasks. The gains are most pronounced on o.o.d. tasks, where CIBNN shows stronger generalization and stability.

### Main contributions

1. **Redefine the Bayesian inference problem**: Causal intervention and supervised learning mechanisms are incorporated into the Bayesian inference problem, yielding a factorized model that explicitly models domain-invariant mechanisms.
2. **Propose a new network architecture**: An architecture is built around an underlying neural network so that it can handle challenging tasks requiring domain transfer.
3. **Improve performance and reduce overfitting**: Adding the CIBNN architecture improves the i.i.d. and o.o.d. performance of the underlying neural network and reduces overfitting.
4. **Enhance robustness and training stability**: Adding unsupervised context information and Bayesian classifiers improves robustness and training stability, even with a small number of context and Bayesian weight samples.

### Conclusion

The paper addresses the performance degradation of deep neural networks on out-of-distribution tasks by introducing the Causal-Invariant Bayesian Neural Network. The experimental results support the effectiveness of the method and suggest a direction for future research.
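
Two of the ingredients listed under Solutions can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The first function assumes the generic mixup formulation (a convex combination of two examples and their one-hot labels with a Beta-distributed coefficient); the paper's label mixup variant may differ. The second part assumes a partially stochastic network whose feature extractor is deterministic and whose final layer has factorised Gaussian weights, with predictions averaged over a few weight samples. All names (`mixup_pair`, `deterministic_features`, `mc_predict`) and all numeric values are hypothetical.

```python
import math
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha=0.4, rng=random):
    """Generic mixup: convex combination of two inputs and their one-hot labels."""
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def deterministic_features(x):
    """Stand-in for the deterministic (point-estimate) part of the network."""
    return [math.tanh(v) for v in x]


def sample_weights(mu, sigma, rng=random):
    """Draw one weight matrix from a factorised Gaussian: w = mu + sigma * eps."""
    return [
        [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu_row, sigma_row)]
        for mu_row, sigma_row in zip(mu, sigma)
    ]


def mc_predict(x, mu, sigma, n_samples=8, rng=random):
    """Average class probabilities over several sampled final layers."""
    h = deterministic_features(x)
    mean_probs = [0.0] * len(mu)
    for _ in range(n_samples):
        w = sample_weights(mu, sigma, rng)
        logits = [sum(wi * hi for wi, hi in zip(row, h)) for row in w]
        probs = softmax(logits)
        mean_probs = [acc + p / n_samples for acc, p in zip(mean_probs, probs)]
    return mean_probs


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Mix two toy examples with one-hot labels for classes 0 and 1.
x_mix, y_mix, lam = mixup_pair([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], rng=rng)

# Hypothetical posterior parameters: 2 classes x 2 features.
mu = [[1.0, -1.0], [-1.0, 1.0]]
sigma = [[0.1, 0.1], [0.1, 0.1]]
probs = mc_predict(x_mix, mu, sigma, n_samples=16, rng=rng)
```

Averaging probabilities rather than logits keeps the result a valid distribution for any number of weight samples, which is why the sketch can be run with a small `n_samples`, in line with the summary's remark about few Bayesian weight samples.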