On Minimal Depth in Neural Networks

Juan L. Valerdi
2024-10-06
Abstract: A characterization of the representability of neural networks is relevant to understanding their success in artificial intelligence. This study investigates two topics on ReLU neural network expressivity and their connection with a conjecture related to the minimum depth required for representing any continuous piecewise linear (CPWL) function. The topics are the minimal depth representation of the sum and max operations, as well as the exploration of polytope neural networks. For the sum operation, we establish a sufficient condition on the minimal depth of the operands to find the minimal depth of the operation. In contrast, regarding the max operation, a comprehensive set of examples is presented, demonstrating that no sufficient conditions, depending solely on the depth of the operands, would imply a minimal depth for the operation. The study also examines the minimal depth relationship between convex CPWL functions. On polytope neural networks, we investigate basic depth properties from Minkowski sums, convex hulls, number of vertices, faces, affine transformations, and indecomposable polytopes. More significant findings include depth characterization of polygons; identification of polytopes with an increasing number of vertices, exhibiting small depth and others with arbitrarily large depth; and most notably, the minimal depth of simplices, which is strictly related to the minimal depth conjecture in ReLU networks.
Machine Learning, Discrete Mathematics, Combinatorics
What problem does this paper attempt to address?
This paper studies the expressive power of ReLU neural networks, in particular the minimal depth needed to represent a function. It focuses on two topics:

1. **Minimal depth and the expressivity of ReLU neural networks**
   - Study the minimum number of layers a ReLU neural network needs to represent continuous piecewise linear (CPWL) functions.
   - Explore the minimal depth of the sum and max operations in terms of the depths of their operands.
2. **Depth properties of polytope neural networks**
   - Analyze basic depth properties of polytope neural networks with respect to Minkowski sums, convex hulls, number of vertices, number of faces, affine transformations, and indecomposable polytopes.
   - Pay special attention to the minimal depth of simplices, and show that cyclic polytopes with an increasing number of vertices have unbounded depth.

#### Key problems

- **Minimal depth conjecture**: The paper explores the conjecture proposed by Hertrich et al. that the minimum number of layers required to represent every CPWL function of \( n \) variables is \( M_n = \lceil\log_2(n + 1)\rceil \).
- **Minimal depth of the sum and max operations**: Determine the minimal depth of these operations when the minimal depths of the operands are known.
- **Depth of polytopes**: Analyze the depth of polytopes by geometric methods, with applications to simplices and cyclic polytopes.

#### Main contributions

- **Minimal depth of the sum operation**: Prove that when the operands have different minimal depths, the minimal depth of the sum is the larger of the two. When the operands have the same minimal depth, that depth is only an upper bound, since the sum can require fewer layers (for example, \( f + (-f) = 0 \)).
- **Minimal depth of the max operation**: Show, through a comprehensive set of examples, that the minimal depth of the max operation cannot be determined from the depths of the operands alone, so other factors must be taken into account.
- **Depth properties of polytopes**: Show that the minimal depth of \( n \)-simplices is exactly \( \lceil\log_2(n + 1)\rceil \), and that cyclic polytopes with an increasing number of vertices have unbounded depth.

Together, these results give new insight into the expressive power and minimal depth of neural networks and provide a theoretical basis for further research.
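
To make the bound \( \lceil\log_2(n + 1)\rceil \) more concrete, the following is a minimal sketch of the well-known upper-bound construction, not code from the paper. It computes the maximum of \( k \) numbers by pairwise ReLU comparisons arranged as a tournament, and counts the rounds used. The names `relu`, `max2`, and `max_tournament` are hypothetical. The sketch assumes each round corresponds to one hidden layer, with unpaired values passed through unchanged (in a real network, via \( x = \mathrm{ReLU}(x) - \mathrm{ReLU}(-x) \)).

```python
import math


def relu(x):
    """Rectified linear unit on a scalar."""
    return max(x, 0.0)


def max2(a, b):
    """max(a, b) written with one layer of three ReLU units.

    Uses max(a, b) = relu(a - b) + b and b = relu(b) - relu(-b).
    """
    return relu(a - b) + relu(b) - relu(-b)


def max_tournament(values):
    """Return (maximum, rounds), combining values pairwise with max2.

    Each round halves the number of candidates (rounding up), so k inputs
    need ceil(log2(k)) rounds. Assumption: one round = one hidden layer.
    """
    layer = list(values)
    rounds = 0
    while len(layer) > 1:
        nxt = [max2(layer[i], layer[i + 1]) for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2 == 1:
            # Unpaired value is carried to the next round unchanged.
            nxt.append(layer[-1])
        layer = nxt
        rounds += 1
    return layer[0], rounds


if __name__ == "__main__":
    n = 4  # hypothetical input dimension; the max then ranges over n + 1 terms
    value, rounds = max_tournament([3.0, -1.0, 7.0, 2.0, 5.0])
    print(value, rounds, math.ceil(math.log2(n + 1)))
```

With \( k = n + 1 \) terms the round count matches \( \lceil\log_2(n + 1)\rceil \). This only illustrates that this many layers suffice for a maximum of \( n + 1 \) terms; whether fewer layers can ever suffice is exactly what the conjecture addresses.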