Abstract: Universal approximation capability, also referred to as universality, is an important property of deep neural networks, endowing them with the capacity to accurately represent the underlying target function in learning tasks. In practice, the architecture of a deep neural network largely determines the performance of the model. However, most existing methodologies for designing neural architectures, such as heuristic manual design or neural architecture search, ignore the universal approximation property and thus lose a potential safeguard on performance. In this paper, we propose a unified framework for designing deep neural network architectures with a universality guarantee based on first-order optimization algorithms, where the forward pass is interpreted as the updates of an optimization algorithm. The (explicit or implicit) network is designed by replacing each gradient term in the algorithm with a learnable module similar to a two-layer network or its derivatives. Specifically, we explore the realm of width-bounded neural networks, a common practical scenario, and establish their universality. Moreover, adding normalization, downsampling, and upsampling operations does not hurt universality. To the best of our knowledge, this is the first work to show that width-bounded networks with a universal approximation guarantee can be designed in a principled way. Our framework can inspire a variety of neural architectures, including renowned structures such as ResNet and DenseNet as well as novel designs. Experimental results on image classification problems demonstrate that the newly inspired networks are competitive and surpass the baselines of ResNet, DenseNet, and the more advanced ConvNeXt and ViT, testifying to the effectiveness of our framework.
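
To make the central idea concrete, here is a minimal sketch of a forward pass written as unrolled optimization updates. It assumes plain gradient descent, x_{k+1} = x_k - step * grad f(x_k), as the underlying first-order algorithm and swaps the gradient term for a small two-layer ReLU module. The function names, the step size, and the random initialization are illustrative assumptions, not the paper's actual architecture or code.

```python
import random


def two_layer_module(x, W1, b1, W2):
    """Hypothetical learnable stand-in for a gradient term: W2 * relu(W1 * x + b1)."""
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    return [sum(w * h for w, h in zip(row, hidden)) for row in W2]


def unrolled_gradient_descent(x, layers, step=0.1):
    """Forward pass as a sequence of updates x <- x - step * T_k(x).

    Each T_k replaces the gradient in a gradient-descent step, so every
    layer keeps the same width as the input (a width-bounded network).
    """
    for W1, b1, W2 in layers:
        update = two_layer_module(x, W1, b1, W2)
        x = [xi - step * ui for xi, ui in zip(x, update)]
    return x


def random_layer(dim, hidden, rng):
    """Randomly initialized parameters for one update (illustration only)."""
    W1 = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [[rng.gauss(0, 1) for _ in range(hidden)] for _ in range(dim)]
    return W1, b1, W2


rng = random.Random(0)
layers = [random_layer(dim=3, hidden=4, rng=rng) for _ in range(5)]
out = unrolled_gradient_descent([1.0, -2.0, 0.5], layers)
```

Under these assumptions each update has the form of a residual block, an identity path plus a learned correction, which is in the spirit of the abstract's remark that the framework can recover ResNet-style structures. Other first-order algorithms would presumably yield different skip-connection patterns.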

A unified and constructive framework for the universality of neural networks

An elementary proof of a universal approximation theorem

Normalized Activation Function: Toward Better Convergence

Universal Approximation to Nonlinear Operators by Neural Networks with Arbitrary Activation Functions and Its Application to Dynamical Systems

Universal approximation with complex-valued deep narrow neural networks

On the Expressive Power of Neural Networks

Universal approximation results for neural networks with non-polynomial activation function over non-compact domains

Universal Approximation Theorem for Neural Networks

A Unified Framework for Training Neural Networks

Unification of popular artificial neural network activation functions

Designing Universally-Approximating Deep Neural Networks: A First-Order Optimization Approach

A Minimal Control Family of Dynamical System for Universal Approximation

Noncompact uniform universal approximation

Universal Activation Function For Machine Learning

Activation Functions for "A Feedforward Unitary Equivariant Neural Network"

A Survey on Universal Approximation Theorems

Function Approximation with Randomly Initialized Neural Networks for Approximate Model Reference Adaptive Control

Activation Functions Not To Active: A Plausible Theory on Interpreting Neural Networks

New advances in universal approximation with neural networks of minimal width