Analyzing the Neural Tangent Kernel of Periodically Activated Coordinate Networks

Hemanth Saratchandran, Shin-Fang Chng, Simon Lucey

2024-02-08

Abstract: Recently, neural networks utilizing periodic activation functions have been proven to demonstrate superior performance in vision tasks compared to traditional ReLU-activated networks. However, there is still a limited understanding of the underlying reasons for this improved performance. In this paper, we aim to address this gap by providing a theoretical understanding of periodically activated networks through an analysis of their Neural Tangent Kernel (NTK). We derive bounds on the minimum eigenvalue of their NTK in the finite width setting, using a fairly general network architecture which requires only one wide layer that grows at least linearly with the number of data samples. Our findings indicate that periodically activated networks are *notably more well-behaved*, from the NTK perspective, than ReLU activated networks. Additionally, we give an application to the memorization capacity of such networks and verify our theoretical predictions empirically. Our study offers a deeper understanding of the properties of periodically activated neural networks and their potential in the field of deep learning.

Machine Learning

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper aims to close a gap in theoretical understanding by analyzing the Neural Tangent Kernel (NTK) of periodically activated coordinate networks. Specifically:

1. **Theoretical Analysis**:
   - The study derives upper and lower bounds on the minimum eigenvalue of the NTK for periodically activated networks in the finite-width setting.
   - The paper shows that, from the NTK perspective, periodically activated networks are notably better behaved than traditional ReLU-activated networks.
2. **Experimental Validation**:
   - The authors verify the theoretical predictions empirically and compare the results with ReLU-activated networks.
   - The experiments show that the minimum eigenvalue of the NTK for periodically activated networks grows much faster than that of ReLU networks.
3. **Memorization Capacity**:
   - The study also examines the memorization capacity of periodically activated networks and proves that such networks can fit distinct data points arbitrarily closely, regardless of their labels.

In summary, through theoretical analysis and experimental validation, the paper shows the advantages of periodically activated networks over traditional ReLU networks in terms of NTK spectral properties and illustrates their potential in deep learning.
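To make the central quantity concrete, here is a minimal sketch of how the minimum eigenvalue of an empirical NTK Gram matrix could be computed for a sine-activated and a ReLU-activated network. It is an illustration only, not the paper's architecture or experimental setup: the toy two-layer network `f(x) = (1/sqrt(m)) * sum_k a_k * act(w_k * x + b_k)`, the 1-D input coordinates, the standard Gaussian initialization, the width of 64, and the helper names `ntk_gram` and `symmetric_eigenvalues` are all assumptions made for this example. The eigenvalues are obtained with a plain cyclic Jacobi iteration so that the block needs only the standard library.

```python
import math
import random


def ntk_gram(xs, width, act, act_grad, seed=0):
    """Empirical NTK Gram matrix K[i][j] = <grad f(x_i), grad f(x_j)> for the
    toy network f(x) = (1/sqrt(m)) * sum_k a_k * act(w_k * x + b_k).
    The network, its initialization, and this helper are illustrative assumptions."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(width)]
    b = [rng.gauss(0.0, 1.0) for _ in range(width)]
    a = [rng.gauss(0.0, 1.0) for _ in range(width)]
    scale = 1.0 / math.sqrt(width)
    grads = []
    for x in xs:
        row = []
        for k in range(width):
            z = w[k] * x + b[k]
            row.append(scale * act(z))                  # df/da_k
            row.append(scale * a[k] * act_grad(z) * x)  # df/dw_k
            row.append(scale * a[k] * act_grad(z))      # df/db_k
        grads.append(row)
    n = len(xs)
    return [[sum(p * q for p, q in zip(grads[i], grads[j])) for j in range(n)]
            for i in range(n)]


def symmetric_eigenvalues(matrix, max_sweeps=100, tol=1e-20):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations,
    returned in ascending order."""
    a = [list(r) for r in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-18:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.hypot(theta, 1.0))
                c = 1.0 / math.hypot(t, 1.0)
                s = t * c
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def relu(z):
    return z if z > 0.0 else 0.0


def relu_grad(z):
    return 1.0 if z > 0.0 else 0.0


# Eight evenly spaced 1-D coordinates in [-1, 1] (an arbitrary choice).
xs = [-1.0 + 2.0 * i / 7 for i in range(8)]
k_sine = ntk_gram(xs, 64, math.sin, math.cos)
k_relu = ntk_gram(xs, 64, relu, relu_grad)
lam_sine = symmetric_eigenvalues(k_sine)[0]
lam_relu = symmetric_eigenvalues(k_relu)[0]
print("min NTK eigenvalue (sine):", lam_sine)
print("min NTK eigenvalue (ReLU):", lam_relu)
```

Varying `width` or the number of sample points gives a rough way to probe how the smallest eigenvalue scales for each activation. This toy setup makes no claim to reproduce the paper's bounds or its empirical results.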

Convergence Analysis of General Neural Networks under Almost Periodic Stimuli

Neural Tangent Kernel Beyond the Infinite-Width Limit: Effects of Depth and Initialization

On the Empirical Neural Tangent Kernel of Standard Finite-Width Convolutional Neural Network Architectures

Spectral Analysis of the Neural Tangent Kernel for Deep Residual Networks

On the Disconnect Between Theory and Practice of Neural Networks: Limits of the NTK Perspective

Fast Finite Width Neural Tangent Kernel

The Positivity of the Neural Tangent Kernel

Analyzing Finite Neural Networks: Can We Trust Neural Tangent Kernel Theory?

A Revision of Neural Tangent Kernel-based Approaches for Neural Networks

New Insights into Graph Convolutional Networks using Neural Tangent Kernels

Tensor Programs II: Neural Tangent Kernel for Any Architecture

Dynamics of Deep Neural Networks and Neural Tangent Hierarchy

Towards an Understanding of Residual Networks Using Neural Tangent Hierarchy (NTH)

Characterizing the Spectrum of the NTK via a Power Series Expansion

Neural Tangent Kernels Motivate Graph Neural Networks with Cross-Covariance Graphs

Efficient NTK using Dimensionality Reduction

A Theory of Neural Tangent Kernel Alignment and Its Influence on Training

Exact Convergence Rates of the Neural Tangent Kernel in the Large Depth Limit

Fixing the NTK: From Neural Network Linearizations to Exact Convex Programs

Reverse Engineering the Neural Tangent Kernel