Abstract: These notes are based on lectures I gave at TASI 2024 on Physics for Machine Learning. The focus is on neural network theory, organized according to network expressivity, statistics, and dynamics. I present classic results such as the universal approximation theorem and neural network / Gaussian process correspondence, and also more recent results such as the neural tangent kernel, feature learning with the maximal update parameterization, and Kolmogorov-Arnold networks. The exposition on neural network theory emphasizes a field theoretic perspective familiar to theoretical physicists. I elaborate on connections between the two, including a neural network approach to field theory.

What problem does this paper attempt to address?

The paper addresses the problem of understanding the theoretical foundations of neural networks in machine learning (ML). It is organized around three aspects:

1. **Expressivity**
   - Investigates which functions neural networks can represent or approximate. For example, the Universal Approximation Theorem (UAT) states that a network with a single hidden layer can approximate any continuous function on a compact domain to arbitrary accuracy, given enough hidden units.
   - Explores the Kolmogorov-Arnold theorem, which shows that multivariable continuous functions can be represented as sums and compositions of one-dimensional continuous functions.
2. **Statistics**
   - Analyzes the statistical properties of neural networks at initialization, particularly how the distribution of the parameters determines the distribution of the network output.
   - Presents the neural network / Gaussian process (NNGP) correspondence: in the infinite-width limit, a randomly initialized neural network is a sample from a Gaussian process.
   - Discusses non-Gaussian processes, which arise when the assumptions of the central limit theorem are violated (for example, at finite width) and which correspond to interactions in field-theoretic language.
3. **Dynamics**
   - Explores how neural network parameters evolve over time, including the trajectory they follow during training.
   - Introduces the Neural Tangent Kernel (NTK) to describe how the network output changes during training.

Through these studies, the paper aims to provide a deeper understanding of neural networks and to lay a theoretical foundation for future research. It also explores the connection between neural networks and physics, using the perspective of field theory to analyze neural network behavior.
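The NNGP statement above can be checked numerically in a rough way. The following is a minimal sketch, not the paper's own code: it assumes a one-hidden-layer tanh network with i.i.d. Gaussian weights whose variances are scaled by fan-in, and the function names `sample_outputs` and `excess_kurtosis` are hypothetical. It draws many random networks, evaluates each at one fixed input, and measures the excess kurtosis of the outputs, which is zero for an exactly Gaussian distribution.

```python
import numpy as np

def sample_outputs(width, n_nets, x, rng):
    """Outputs f(x) of n_nets independent random one-hidden-layer tanh networks."""
    d = x.shape[0]
    # Assumed initialization: zero-mean Gaussian weights, variance 1 / fan-in.
    w1 = rng.normal(0.0, 1.0 / np.sqrt(d), size=(n_nets, width, d))
    w2 = rng.normal(0.0, 1.0 / np.sqrt(width), size=(n_nets, width))
    hidden = np.tanh(w1 @ x)            # shape (n_nets, width)
    return np.sum(w2 * hidden, axis=1)  # shape (n_nets,)

def excess_kurtosis(samples):
    """Fourth standardized moment minus 3; vanishes for a Gaussian."""
    z = (samples - samples.mean()) / samples.std()
    return np.mean(z**4) - 3.0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.array([1.0, -0.5])  # arbitrary fixed input, chosen for illustration
    for width in (1, 8, 128):
        k = excess_kurtosis(sample_outputs(width, 20000, x, rng))
        print(f"width={width:4d}  excess kurtosis={k:+.3f}")
```

Under these assumptions the excess kurtosis should shrink toward zero as the width grows, up to sampling noise, which is the finite-sample signature of the Gaussian limit. The residual non-Gaussianity at small width is one way to picture the "interactions" mentioned in the Statistics item.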

TASI Lectures on Physics for Machine Learning

Modern Machine Learning for LHC Physicists

Notes on Deep Learning Theory

Statistical physics, Bayesian inference and neural information processing

TASI Lectures: (No) Global Symmetries to Axion Physics

Sparse representations, inference and learning

A high-bias, low-variance introduction to Machine Learning for physicists

Information Theory and Statistical Physics - Lecture Notes

An Analysis of Physics-Informed Neural Networks

Six lectures on linearized neural networks

In Pursuit of New Paradigms: TASI 2024

Kernels, Data & Physics

Lattice physics approaches for neural networks

Machine Learning for New Physics Searches

Learning Curves for Deep Neural Networks: A Gaussian Field Theory Perspective

Toward an AI Physicist for Unsupervised Learning

Machine learning with neural networks

TASI Lectures on Collider Physics

Physics-informed machine learning