Convex Dual Theory Analysis of Two-Layer Convolutional Neural Networks With Soft-Thresholding
Chunyan Xiong, Chaoxing Zhang, Mengli Lu, Xiaotong Yu, Jian Cao, Zhong Chen, Di Guo, Xiaobo Qu
DOI: https://doi.org/10.1109/tnnls.2024.3353795
IF: 14.255
2024-01-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract: Soft-thresholding has been widely used in neural networks. Its basic network structure is a two-layer convolutional neural network with soft-thresholding. Because this network is nonlinear and nonconvex, training depends heavily on an appropriate initialization of the network parameters, which makes a globally optimal solution difficult to obtain. To address this issue, a convex dual network is designed here. We theoretically analyze the convexity of the network and prove that strong duality holds. Extensive results on both simulated and real-world datasets show that strong duality holds, that the dual network does not depend on the initialization or the optimizer, and that it converges faster than the state-of-the-art two-layer network. This work provides a new way to convexify soft-thresholding neural networks. Furthermore, the convex dual network model of a deep soft-thresholding network with a parallel structure is derived.
computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, hardware & architecture
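
For readers unfamiliar with the operator named in the abstract, the following is a minimal sketch of soft-thresholding and of one possible two-layer convolutional forward pass built on it. The abstract does not specify the architecture, so the 1-D signals, the "valid" cross-correlation, the summation over filter pairs, and all function and parameter names (soft_threshold, conv1d_valid, two_layer_forward, lam) are assumptions made for illustration, not the authors' implementation.

```python
def soft_threshold(x, lam):
    """Shrink x toward zero by lam; any x with |x| <= lam maps to 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def conv1d_valid(signal, kernel):
    """'Valid' 1-D cross-correlation (no padding, no kernel flip)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def two_layer_forward(signal, first_kernels, second_kernels, lam):
    """Assumed structure: sum over filter pairs (u, v) of
    conv(soft_threshold(conv(signal, u)), v)."""
    output = None
    for u, v in zip(first_kernels, second_kernels):
        # First layer: convolution followed by element-wise soft-thresholding.
        hidden = [soft_threshold(h, lam) for h in conv1d_valid(signal, u)]
        # Second layer: a further convolution of the thresholded features.
        branch = conv1d_valid(hidden, v)
        if output is None:
            output = branch
        else:
            output = [a + b for a, b in zip(output, branch)]
    return output
```

The nonconvexity mentioned in the abstract comes from optimizing the first- and second-layer filters jointly through the thresholding nonlinearity; the sketch covers only the forward computation, not the convex dual formulation.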