What problem does this paper attempt to address?

This paper seeks to explain why Residual Networks (ResNets) work so well. Although ResNets improve performance markedly through simple skip connections, there is still no widely accepted theoretical explanation for their success. The authors study the ResNet architecture and its components in depth to uncover the underlying mechanism and explain why ResNets perform well across a range of tasks.

### Main Problem Statements

1. **The Success Mechanism of ResNet**: ResNet quickly became a mainstream deep learning architecture after it was proposed, yet the specific mechanism behind its success remains unclear. The authors look for the key factors behind its performance gains by studying ResNet's internal structure and behavior.
2. **The Role of Skip Connections**: Skip connections are a core design element of ResNet, but their effect on model performance is not fully understood. The authors test experimentally whether skip connections are the key to ResNet's success and examine what role they play.
3. **The Geometric Structure of Intermediate Representations**: Do the intermediate-layer representations of a ResNet have a particular geometric structure, and does that structure contribute to generalization? The authors characterize the geometry of the intermediate representations by analyzing the Residual Jacobians of ResNet.
4. **The Relationship with Neural Collapse**: Does Neural Collapse occur at the same time during ResNet training? If so, is it intrinsically connected to the structure of the intermediate representations? The authors examine this relationship experimentally.
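For reference, the Residual Jacobians mentioned above can be written out as follows. The notation (\( h_i \) for the representation entering block \( i \), \( F \) for the residual branch, \( W_i \) for its weights, \( J_i \) for the Jacobian) is assumed here for illustration and does not come from this summary.

\[
h_{i+1} = h_i + F(h_i; W_i), \qquad
J_i = \frac{\partial F(h_i; W_i)}{\partial h_i}, \qquad
J_i = U_i S_i V_i^{\top}
\]

Linearizing block \( i \) around \( h_i \) means approximating \( F(h_i + \delta; W_i) \approx F(h_i; W_i) + J_i \delta \) for a small perturbation \( \delta \), so the singular vectors in \( U_i, V_i \) and the singular values in \( S_i \) describe how the block acts locally.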
### Overview of Research Methods

To answer these questions, the authors use the following methods:

- **Linearizing Residual Blocks**: Linearize the residual blocks of a ResNet to obtain their Residual Jacobians, then compute the singular value decomposition (SVD) of each Jacobian to analyze the geometric structure of the intermediate representations.
- **Empirical Research**: Train ResNet models on several benchmark datasets (such as MNIST and CIFAR10) while varying depth, width, and number of classes, to check whether the observed behavior is consistent.
- **Counterfactual Experiments**: Remove the skip connections or change other hyperparameters and observe the effect on ResNet, to test how much the observed behavior depends on skip connections.
- **Mathematical Model**: Propose an Unconstrained Jacobians Model and prove, within that model, the conditions under which Residual Alignment (RA) occurs.

### Main Phenomena Discovered

The authors identify a phenomenon they call "Residual Alignment" (RA), which has four characteristics:

1. **(RA1)**: For a given input, the intermediate representations are equispaced along a line embedded in high-dimensional space.
2. **(RA2)**: The top left and right singular vectors of the Residual Jacobians align with each other and across different depths.
3. **(RA3)**: For fully-connected ResNets, the Residual Jacobians have rank at most \( C \), the number of classes.
4. **(RA4)**: The top singular values of the Residual Jacobians scale inversely with depth.

Together, these characteristics reveal a highly ordered geometric structure in ResNet's internal representations, which may be one important reason for its success.

### Conclusions

Through detailed empirical study and theoretical analysis, the authors shed light on the mechanism behind ResNet's success, in particular the key role that skip connections play in it.
The authors also identify the Residual Alignment (RA) phenomenon and verify across multiple experiments that it occurs consistently and broadly. These findings deepen the understanding of ResNet and suggest new directions for future research.
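The Residual Jacobian analysis summarized above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the residual branch `F(h) = W2 @ tanh(W1 @ h)`, the helper names, the toy dimensions, and the random untrained weights are all hypothetical. It shows how one might compute the Jacobian of each block, take its SVD, project later Jacobians onto the singular vectors of the first block (an RA2-style check), and measure whether consecutive steps between representations are collinear and equal in length (an RA1-style check). The `use_skip` flag mimics the counterfactual of removing skip connections.

```python
import numpy as np

def branch(h, W1, W2):
    """Hypothetical residual branch F(h) = W2 @ tanh(W1 @ h)."""
    return W2 @ np.tanh(W1 @ h)

def residual_jacobian(h, W1, W2):
    """Analytic Jacobian dF/dh of the branch above, evaluated at h."""
    pre = W1 @ h
    # d tanh(z)/dz = 1 - tanh(z)^2, applied row-wise to W1
    return W2 @ ((1.0 - np.tanh(pre) ** 2)[:, None] * W1)

def forward(h0, params, use_skip=True):
    """Return all intermediate representations h_0, ..., h_L.

    use_skip=False drops the identity path (the counterfactual setting).
    """
    hs = [h0]
    for W1, W2 in params:
        out = branch(hs[-1], W1, W2)
        hs.append(hs[-1] + out if use_skip else out)
    return np.stack(hs)

def step_statistics(hs):
    """RA1-style diagnostics: cosines between consecutive steps, and step norms.

    Points equispaced on a line give cosines of 1 and identical norms.
    """
    steps = np.diff(hs, axis=0)
    norms = np.linalg.norm(steps, axis=1)
    unit = steps / norms[:, None]
    cosines = np.sum(unit[:-1] * unit[1:], axis=1)
    return cosines, norms

rng = np.random.default_rng(0)
dim, depth = 8, 6
# Random, untrained weights (an assumption): RA is not expected to appear here.
params = [(0.3 * rng.standard_normal((dim, dim)),
           0.3 * rng.standard_normal((dim, dim))) for _ in range(depth)]
h0 = rng.standard_normal(dim)
hs = forward(h0, params)

# Residual Jacobian and its SVD at every block
jacobians = [residual_jacobian(h, W1, W2) for h, (W1, W2) in zip(hs[:-1], params)]
svds = [np.linalg.svd(J) for J in jacobians]
top_singular_values = np.array([s[0] for _, s, _ in svds])

# RA2-style check: express every Jacobian in the singular bases of block 0.
# If singular vectors aligned across depth, these matrices would be near-diagonal.
U0, _, Vt0 = svds[0]
projected = [U0.T @ J @ Vt0.T for J in jacobians]

cosines, norms = step_statistics(hs)
```

For a trained model that exhibits RA, one would look for step cosines close to 1 with similar step norms, near-diagonal leading entries in the projected matrices, and top singular values that scale inversely with depth. With the random weights used here, none of these patterns should be expected.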

Residual Alignment: Uncovering the Mechanisms of Residual Networks
