Abstract: Exploiting symmetry in dynamical systems is a powerful way to improve the generalization of deep learning. The model learns to be invariant to transformation and hence is more robust to distribution shift. Data augmentation and equivariant networks are two major approaches to injecting symmetry into learning. However, their exact role in improving generalization is not well understood. In this work, we derive the generalization bounds for data augmentation and equivariant networks, characterizing their effect on learning in a unified framework. Unlike most prior theories for the i.i.d. setting, we focus on non-stationary dynamics forecasting with complex temporal dependencies.

What problem does this paper attempt to address?

The paper asks how data augmentation and equivariant networks improve the generalization of deep learning models by exploiting symmetry when forecasting non-stationary dynamical systems. It studies both methods on non-stationary time-series data and addresses the following key questions and gaps:

1. **The roles of data augmentation and equivariant networks**: How does each method improve generalization by injecting symmetry? What are their relative advantages under different conditions?
2. **Lack of theoretical analysis**: Although there are many empirical studies of data augmentation and equivariant networks, theoretical explanations and comparisons of their behavior are lacking.
3. **Generalization bounds**: What are the generalization bounds of data augmentation and equivariant networks on non-stationary, non-mixing time-series data?

### Main contributions of the paper

1. **Formal description of symmetry in dynamics forecasting**: The paper assumes that the underlying dynamical system preserves some symmetry and formalizes the forecasting problem on that basis.
2. **Derivation of generalization bounds**: The paper derives generalization bounds for data augmentation and for equivariant networks (both strictly and approximately equivariant) that apply to non-stationary, non-mixing time-series data.
3. **Proof of the advantages of equivariant networks**: When the underlying dynamical system is symmetric, the generalization bound of the strictly equivariant network is tighter than that of data augmentation. When the data is only approximately symmetric, the approximately equivariant network attains a further improved bound.

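
As a toy illustration of the two approaches compared above, the following sketch fits a linear one-step forecaster on 2D data in three ways: without symmetry, with data augmentation over the full orbit of a rotation group, and with equivariance imposed by construction through group averaging. Everything here is an assumption made for illustration and is not taken from the paper: the group (90-degree rotations, C4), the linear dynamics `A`, the noise level, and the helper names `fit_linear`, `augment`, `symmetrize`, and `commutes_with_group`.

```python
import numpy as np

def rot(k):
    """Rotation by k * 90 degrees: one element of the cyclic group C4."""
    theta = k * np.pi / 2
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

GROUP = [rot(k) for k in range(4)]

def fit_linear(X, Y):
    """Least-squares W with Y ≈ X @ W.T (one sample per row)."""
    W_T, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W_T.T

def augment(X, Y):
    """Data augmentation: add every rotated copy (g x, g y) of each pair."""
    Xa = np.concatenate([X @ g.T for g in GROUP])
    Ya = np.concatenate([Y @ g.T for g in GROUP])
    return Xa, Ya

def symmetrize(W):
    """Group-average W so that W g = g W holds for every g in GROUP."""
    return sum(g.T @ W @ g for g in GROUP) / len(GROUP)

def commutes_with_group(W, tol=1e-8):
    """Check the equivariance condition W g = g W for all group elements."""
    return all(np.allclose(W @ g, g @ W, atol=tol) for g in GROUP)

# Hypothetical rotation-symmetric dynamics x_{t+1} = A x_t, observed with noise
# and only in the first quadrant (so the raw data does not cover the orbit).
rng = np.random.default_rng(0)
A = 0.9 * np.array([[np.cos(0.3), -np.sin(0.3)],
                    [np.sin(0.3),  np.cos(0.3)]])
X = rng.uniform(0.0, 1.0, size=(20, 2))
Y = X @ A.T + 0.05 * rng.standard_normal((20, 2))

W_plain = fit_linear(X, Y)           # no symmetry injected
W_aug = fit_linear(*augment(X, Y))   # symmetry injected through the data
W_eq = symmetrize(W_plain)           # symmetry imposed on the model itself

for name, W in [("plain", W_plain), ("augmented", W_aug), ("equivariant", W_eq)]:
    print(name, "commutes with C4:", commutes_with_group(W))
```

In this linear least-squares setting with full-orbit augmentation, the augmented fit also commutes with the group, because the augmented objective is invariant under conjugation by any group element and has a unique minimizer. That is a property of this toy setup only; it says nothing about deep networks or about the bounds derived in the paper.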
### Key formulas

- **Sequential Rademacher complexity**:
\[ R_{\text{sq}}^T(G)=\mathbb{E}_z\mathbb{E}_\sigma\left[\sup_{g\in G}\sum_{t = 1}^T\sigma_t q_t g(z_t(\sigma))\right] \]
- **Equivariance error**:
\[ \|f\|_{\text{EE}}=\sup_{x, g}\|f(\rho_{\text{in}}(g)(x))-\rho_{\text{out}}(g)f(x)\| \]
- **Generalization bound theorem**:
\[ \mathbb{E}[L(\hat{\theta}, Z_{T + 1})]-\mathbb{E}[L(\theta^*, Z_{T + 1})]\leq 2\,\text{disc}_T(q)+6M\sqrt{\frac{4\pi\log T}{N}}R_{\text{sq}}^T(L\circ\Theta)+\sqrt{\frac{2\log(2 / \sigma)}{N}}+\|q\|_2\left(M\sqrt{8\log\frac{1}{\delta}}+1\right) \]

### Summary

Through rigorous theoretical analysis, the paper characterizes how data augmentation and equivariant networks differ in generalization when forecasting non-stationary dynamical systems. In particular, it proves that when the dynamics are symmetric, strictly equivariant networks have the tighter generalization bound, and that when the symmetry is only approximate, approximately equivariant networks perform well. These results provide a theoretical basis for choosing between the methods.
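
The equivariance error \(\|f\|_{\text{EE}}\) listed among the key formulas is a supremum over inputs and group elements, so in practice it can only be estimated from below by sampling. The sketch below does this for two hypothetical maps on 2D vectors. The assumptions, none of which come from the paper, are: the group is C4 (90-degree rotations), \(\rho_{\text{in}}\) and \(\rho_{\text{out}}\) are both the standard rotation matrices, inputs are restricted to the square \([-1, 1]^2\), and `f_exact` and `f_approx` are made-up example functions.

```python
import numpy as np

def rot(theta):
    """2x2 rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def equivariance_error_estimate(f, group, xs):
    """Max over sampled x and listed g of ||f(g x) - g f(x)||.

    This is a lower bound on the supremum in the definition, since only
    finitely many inputs are checked.
    """
    worst = 0.0
    for g in group:
        for x in xs:
            gap = np.linalg.norm(f(g @ x) - g @ f(x))
            worst = max(worst, gap)
    return worst

def f_exact(x):
    """Rotation-equivariant map: scales x by its own norm."""
    return np.linalg.norm(x) * x

def f_approx(x, eps=0.1):
    """f_exact plus a small symmetry-breaking term along the first axis."""
    return f_exact(x) + eps * np.array([x[0], 0.0])

rng = np.random.default_rng(0)
group = [rot(k * np.pi / 2) for k in range(4)]   # C4
xs = rng.uniform(-1.0, 1.0, size=(200, 2))       # samples from [-1, 1]^2

err_exact = equivariance_error_estimate(f_exact, group, xs)
err_approx = equivariance_error_estimate(f_approx, group, xs)
print("exactly equivariant map:      ", err_exact)
print("approximately equivariant map:", err_approx)
```

For `f_approx`, the symmetry-breaking term contributes at most `eps * ||x||` under a 90-degree rotation, so on this domain the estimate cannot exceed `eps * sqrt(2)`. A strictly equivariant map has error zero up to floating-point rounding.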

Data Augmentation vs. Equivariant Networks: A Theory of Generalization on Dynamics Forecasting
