Abstract:The generalization of machine learning (ML) models to out-of-distribution (OOD) examples remains a key challenge in extracting information from upcoming astronomical surveys. Interpretability approaches are a natural way to gain insights into the OOD generalization problem. We use Centered Kernel Alignment (CKA), a similarity measure metric of neural network representations, to examine the relationship between representation similarity and performance of pre-trained Convolutional Neural Networks (CNNs) on the CAMELS Multifield Dataset. We find that when models are robust to a distribution shift, they produce substantially different representations across their layers on OOD data. However, when they fail to generalize, these representations change less from layer to layer on OOD data. We discuss the potential application of similarity representation in guiding model design, training strategy, and mitigating the OOD problem by incorporating CKA as an inductive bias during training.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is the generalization ability of machine learning (ML) models for out - of - distribution (OOD) samples in large - scale astronomical survey investigations. Specifically, the research aims to explore how to understand and improve the performance of models on OOD data by explaining the similarity of the internal representations of ML models. ### Research Background and Motivation 1. **Generalization Challenges**: Although machine learning methods have been widely applied in the fields of astronomy and cosmology, the generalization ability of models between data of different distributions (such as different simulation environments or real - world data) still poses challenges. This is because simulated data can only approximately reflect the observed universe, and different simulation models may provide different realizations of physical phenomena. Therefore, a model trained in one simulation environment may not be well - generalized to other simulation environments or actual observational data. 2. **Explanatory Requirements**: In order to better understand and improve these models, researchers need to deeply understand the working mechanisms of the models, especially their performance when facing OOD data. This not only helps to discover potential problems in the models but also reveals new features hidden in the trained models, which is of great significance for astronomical research. ### Solutions and Methods To solve the above problems, the author introduced a technique named "Centered Kernel Alignment" (CKA) to measure the representational similarity between different layers of a neural network. By comparing the representations of pre - trained convolutional neural networks (CNNs) on the same input data, CKA can help identify the behavioral differences of the model on data of different distributions: - **Models with Successful Generalization**: When a model can be successfully generalized to OOD data, the representations between its different layers will be significantly different. This means that as the data is transmitted in the network, the model can extract more discriminative features. - **Models with Failed Generalization**: Conversely, if a model fails to generalize to OOD data, the representation changes between its different layers are small or even tend to be consistent. In this case, the model shows a stagnant state when processing OOD data. ### Experimental Results Through experiments on the CAMELS multi - field data set (CMD), the author verified their hypothesis: - **Temperature Field**: For the case of training from the TNG simulation and testing on the SIMBA simulation, the model failed to generalize successfully, resulting in a significant drop in the R² score. The corresponding CKA matrix shows that at non - diagonal positions, the similarity between different layers is high, forming a block - like structure, indicating small representation changes. - **Total Matter Density Field (Mtot)**: In contrast, when the model is trained on the SIMBA simulation and tested on the TNG simulation, it can maintain a high R² score, indicating that the model is successfully generalized. At this time, the representation differences between different layers in the CKA matrix are large, and there is no obvious block - like structure. ### Conclusions and Future Work In summary, this research shows that successful OOD generalization is usually accompanied by significant changes in the representations of each layer of the model, while failed generalization is manifested as representation stagnation. Based on this finding, future research can consider using the CKA matrix to optimize the model structure, such as pruning unnecessary layers, thereby reducing the memory footprint of the model while maintaining performance. In addition, CKA can also be added to the loss function as an inductive bias to promote the generalization ability of the model on OOD data. In general, this research provides a new perspective for understanding the OOD generalization problem of ML models in astronomical data and lays the foundation for constructing more robust models.

Towards out-of-distribution generalization in large-scale astronomical surveys: robust networks learn similar representations

Towards Robust Out-of-Distribution Generalization: Data Augmentation and Neural Architecture Search Approaches

Toward Metrics for Differentiating Out-of-Distribution Sets

InvariantOODG: Learning Invariant Features of Point Clouds for Out-of-Distribution Generalization

Towards Out-Of-Distribution Generalization: A Survey

Predicting Out-of-Domain Generalization with Neighborhood Invariance

Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization

Minimizing Embedding Distortion for Robust Out-of-Distribution Performance

NeuralOOD: Improving Out-of-Distribution Generalization Performance with Brain-machine Fusion Learning Framework

Certifiable Out-of-Distribution Generalization.

Mixture Data for Training Cannot Ensure Out-of-distribution Generalization

Towards a Theoretical Framework of Out-of-Distribution Generalization

HYPO: Hyperspherical Out-of-Distribution Generalization

Characterizing out-of-distribution generalization of neural networks: application to the disordered Su-Schrieffer-Heeger model

Mahalanobis-Aware Training for Out-of-Distribution Detection

Out-of-Distribution Generalization in Text Classification: Past, Present, and Future

Out-of-Distribution Data: An Acquaintance of Adversarial Examples -- A Survey

Out-of-Distribution Generalization in Natural Language Processing: Past, Present, and Future

Bridging Multicalibration and Out-of-distribution Generalization Beyond Covariate Shift

An Empirical Study of Pre-trained Model Selection for Out-of-Distribution Generalization and Calibration

LCA-on-the-Line: Benchmarking Out-of-Distribution Generalization with Class Taxonomies