Subhomogeneous Deep Equilibrium Models

Pietro Sittoni, Francesco Tudisco
2024-06-07
Abstract: Implicit-depth neural networks have grown as powerful alternatives to traditional networks in various applications in recent years. However, these models often lack guarantees of existence and uniqueness, raising stability, performance, and reproducibility issues. In this paper, we present a new analysis of the existence and uniqueness of fixed points for implicit-depth neural networks based on the concept of subhomogeneous operators and the nonlinear Perron-Frobenius theory. Compared to previous similar analyses, our theory allows for weaker assumptions on the parameter matrices, thus yielding a more flexible framework for well-defined implicit networks. We illustrate the performance of the resulting subhomogeneous networks on feedforward, convolutional, and graph neural network examples.
Machine Learning, Numerical Analysis, Optimization and Control
What problem does this paper attempt to address?
This paper studies the existence and uniqueness of fixed points in Deep Equilibrium Models (DEQs). DEQs are a class of deep learning models that define the feature embedding implicitly, as the solution of a nonlinear fixed-point equation, rather than as the output of a finite stack of layers. Such models often come with no guarantee that a fixed point exists or that it is unique, which can lead to stability, performance, and reproducibility issues.

The authors propose a new analysis, based on subhomogeneous operators and nonlinear Perron-Frobenius theory, of when a DEQ has a unique fixed point. Compared with previous analyses, it requires weaker assumptions on the parameter matrices and therefore yields a broader class of well-defined implicit networks. The paper points out that, under these assumptions, a DEQ whose activation function is subhomogeneous has a unique fixed point. This covers common activation functions such as sigmoid, SoftPlus, and tanh.

The authors also propose a model called SubDEQ and illustrate its performance on feedforward, convolutional, and graph neural network examples. In addition, the paper compares simple fully connected and convolutional DEQ architectures based on monotone operators with the proposed SubDEQ on benchmark image classification tasks.

In summary, the paper addresses the existence and uniqueness issues of DEQ models, proposes a new theoretical framework, and provides empirical support for building more stable and efficient DEQ models.
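To make the fixed-point formulation described above concrete, here is a minimal NumPy sketch of a DEQ-style layer solved by plain fixed-point iteration. It is an illustration under stated assumptions, not the SubDEQ architecture from the paper. The layer form `z = tanh(W z + U x + b)`, the function name `deq_fixed_point`, and all sizes are hypothetical. Uniqueness is obtained here by the classical contraction argument (rescaling `W` so its spectral norm is below 1), which is a stronger condition than the subhomogeneity-based one the paper develops.

```python
import numpy as np

def deq_fixed_point(W, U, b, x, z0=None, tol=1e-10, max_iter=1000):
    """Iterate z <- tanh(W z + U x + b) until successive iterates differ by less than tol."""
    z = np.zeros(W.shape[0]) if z0 is None else np.asarray(z0, dtype=float)
    inject = U @ x + b  # input injection, constant across iterations
    for k in range(1, max_iter + 1):
        z_next = np.tanh(W @ z + inject)
        if np.linalg.norm(z_next - z) < tol:
            return z_next, k
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")

rng = np.random.default_rng(0)
n, d = 8, 4  # hidden and input dimensions (arbitrary choices)

W = rng.standard_normal((n, n))
# Assumption: rescale W to spectral norm 0.5. Since tanh is 1-Lipschitz,
# the map is then a contraction, so a unique fixed point exists.
W *= 0.5 / np.linalg.norm(W, 2)
U = rng.standard_normal((n, d))
b = rng.standard_normal(n)
x = rng.standard_normal(d)

# Solve from two different starting points to check uniqueness numerically.
z_a, iters_a = deq_fixed_point(W, U, b, x)
z_b, iters_b = deq_fixed_point(W, U, b, x, z0=5.0 * rng.standard_normal(n))

# How far z_a is from satisfying the fixed-point equation.
residual = np.linalg.norm(z_a - np.tanh(W @ z_a + U @ x + b))
```

Both runs reach the same embedding regardless of the starting point, which is the reproducibility property that a uniqueness guarantee provides. Without such a guarantee, different initializations could in principle converge to different fixed points or fail to converge at all.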