Mathematics of Neural Networks (Lecture Notes Graduate Course)

Bart M. N. Smets

2024-03-06

Abstract: These are the lecture notes that accompanied the course of the same name that I taught at the Eindhoven University of Technology from 2021 to 2023. The course is intended as an introduction to neural networks for mathematics students at the graduate level and aims to interest them in further research on neural networks. It consists of two parts. The first part is a general introduction to deep learning that focuses on presenting the field in a formal mathematical way. The second part provides an introduction to the theory of Lie groups and homogeneous spaces and how it can be applied to design neural networks with desirable geometric equivariances. The lecture notes were made to be as self-contained as possible so as to be accessible to any student with a moderate mathematics background. The course also included coding tutorials and assignments in the form of a set of Jupyter notebooks that are publicly available at https://gitlab.com/bsmetsjr/mathematics_of_neural_networks.

Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

This paper discusses the mathematical principles of neural networks, with a focus on the problems that arise in deep learning and their solutions. It starts with the basic concepts of supervised learning and explains how models are trained on data to accomplish specific tasks. It then covers deep neural networks (DNNs): feedforward networks, vanishing and exploding gradients, high-dimensional data processing, initialization methods (such as random initialization and Xavier initialization), and the components of convolutional neural networks (CNNs), such as discrete convolution, padding, max pooling, and convolutional layers. The paper also introduces automatic differentiation and the backpropagation algorithm, as well as adaptive learning rate algorithms such as Adagrad, RMSProp, and Adam.

In Chapter 3, the paper turns to group theory and homogeneous spaces, discussing how these geometric theories can be used to construct neural networks with built-in symmetries such as rotation and translation, namely group convolutional networks. The author introduces concepts such as the "lifting layer", the "group convolutional layer", and "projection", and briefly introduces tropical operators and semirings.

In summary, the problem this paper attempts to address is how to understand the workings of neural networks from a mathematical perspective, especially the challenges encountered in deep learning such as vanishing gradients, parameter initialization, model symmetries, and optimization strategies. By introducing concepts from geometry and group theory, the paper aims to provide a stronger theoretical foundation for neural networks.
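The translation symmetry mentioned in the summary can be illustrated with a small numerical check. The following is a minimal sketch, not taken from the lecture notes or their notebooks: it assumes a 1D signal with periodic boundary conditions, and the names `circular_correlate` and `translate` are hypothetical helpers chosen for this illustration. It checks that translating the input and then applying a discrete convolution (implemented here as a cross-correlation, as is common in CNNs) gives the same result as applying the convolution first and translating afterwards.

```python
import random


def circular_correlate(signal, kernel):
    """Periodic discrete cross-correlation of a 1D signal with a 1D kernel.

    Hypothetical helper: out[i] = sum_j kernel[j] * signal[(i + j) mod n].
    """
    n = len(signal)
    return [
        sum(w * signal[(i + j) % n] for j, w in enumerate(kernel))
        for i in range(n)
    ]


def translate(signal, s):
    """Cyclically translate a signal by s positions (hypothetical helper)."""
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]


random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(16)]  # arbitrary test signal
k = [random.gauss(0.0, 1.0) for _ in range(3)]   # arbitrary kernel

# Equivariance: correlate(translate(x)) should equal translate(correlate(x)).
lhs = circular_correlate(translate(x, 5), k)
rhs = translate(circular_correlate(x, k), 5)
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print("largest deviation from equivariance:", max_gap)
```

Group convolutional networks extend this property from translations to other transformations such as rotations. The sketch covers only the translation case.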

Machine learning with neural networks

Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning

Introduction to Machine Learning for the Sciences

Deep Learning and Computational Physics (Lecture Notes)

Lecture Notes in Lie Groups

Six lectures on linearized neural networks

Notes on Deep Learning Theory

TASI Lectures on Physics for Machine Learning

Lecture Notes: Neural Network Architectures

Geometric spectral theory of quantum graphs

Mathematical ideas and notions of quantum field theory

Mathematical Introduction to Deep Learning: Methods, Implementations, and Theory

Lie groups and Lie algebras

Mathematics of Isogeny Based Cryptography

A Study of the Mathematics of Deep Learning

Lecture notes on high-dimensional data

Lectures on Batalin-Vilkovisky formalism and its applications in topological quantum field theory

Lecture Notes: Optimization for Machine Learning

Deep Learning and Geometric Deep Learning: an introduction for mathematicians and physicists

Statistical physics, Bayesian inference and neural information processing