Residual Alignment: Uncovering the Mechanisms of Residual Networks

Jianing Li, Vardan Papyan
2024-01-17
Abstract: The ResNet architecture has been widely adopted in deep learning due to its significant boost to performance through the use of simple skip connections, yet the underlying mechanisms leading to its success remain largely unknown. In this paper, we conduct a thorough empirical study of the ResNet architecture in classification tasks by linearizing its constituent residual blocks using Residual Jacobians and measuring their singular value decompositions. Our measurements reveal a process called Residual Alignment (RA) characterized by four properties:
Machine Learning
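
The measurement described in the abstract (linearize a residual block, then take the singular value decomposition of the resulting Jacobian) can be illustrated with a minimal sketch. It assumes that a "Residual Jacobian" is the Jacobian of the residual branch F with respect to the block input x, for a block of the form x + F(x). The toy two-layer tanh branch, the dimensions, and the names `residual_branch` and `residual_jacobian` are hypothetical and are not taken from the paper.

```python
import numpy as np


def residual_branch(x, w1, w2):
    """Hypothetical residual branch F(x) = W2 tanh(W1 x); the block output is x + F(x)."""
    return w2 @ np.tanh(w1 @ x)


def residual_jacobian(x, w1, w2):
    """Analytic Jacobian dF/dx = W2 diag(1 - tanh^2(W1 x)) W1 of the toy branch."""
    pre = w1 @ x
    return w2 @ ((1.0 - np.tanh(pre) ** 2)[:, None] * w1)


# Assumed toy sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
dim, hidden = 6, 8
w1 = rng.normal(size=(hidden, dim)) / np.sqrt(dim)
w2 = rng.normal(size=(dim, hidden)) / np.sqrt(hidden)
x = rng.normal(size=dim)

# Linearize the branch at x, then decompose: jac = u @ diag(s) @ vt.
jac = residual_jacobian(x, w1, w2)
u, s, vt = np.linalg.svd(jac)

# Sanity check: a central finite-difference Jacobian, built column by column.
eps = 1e-6
jac_fd = np.stack(
    [
        (residual_branch(x + eps * e, w1, w2) - residual_branch(x - eps * e, w1, w2))
        / (2 * eps)
        for e in np.eye(dim)
    ],
    axis=1,
)

print("singular values:", s)
print("max |analytic - finite difference|:", np.abs(jac - jac_fd).max())
```

Under the same assumption, the linearization of the full block at x is I + jac, because the skip connection contributes the identity. In a real network one would typically obtain the Jacobian through automatic differentiation rather than a hand-derived formula.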