A spring-block theory of feature learning in deep neural networks

Cheng Shi, Liming Pan, Ivan Dokmanić
2024-10-23
Abstract: Feature-learning deep nets progressively collapse data to a regular low-dimensional geometry. How this phenomenon emerges from the collective action of nonlinearity, noise, learning rate, and other choices that shape the dynamics has eluded first-principles theories built from microscopic neuronal dynamics. We exhibit a noise-nonlinearity phase diagram that identifies regimes where shallow or deep layers learn more effectively. We then propose a macroscopic mechanical theory that reproduces the diagram, explaining why some DNNs are lazy and some active, and linking feature learning across layers to generalization.
Disordered Systems and Neural Networks, Statistical Mechanics, Machine Learning