Optimization-inspired Manual Architecture Design and Neural Architecture Search

Yibo Yang, Zhengyang Shen, Huan Li, Zhouchen Lin
DOI: https://doi.org/10.1007/s11432-021-3527-7
2023-01-01
Science China Information Sciences
Abstract: Neural architecture has been a research focus in recent years because of its importance in determining the performance of deep networks. Representative architectures include the residual network (ResNet) with skip connections and the dense network (DenseNet) with dense connections. However, theoretical guidance for manual architecture design and neural architecture search (NAS) is still lacking. In this paper, we propose a manual architecture design framework inspired by optimization algorithms. It is based on the conjecture that an optimization algorithm with a good convergence rate may imply a neural architecture with good performance. Concretely, we prove that, under certain conditions, forward propagation in a deep neural network is equivalent to the iterative procedure of the gradient descent algorithm minimizing a cost function. Inspired by this correspondence, we derive neural architectures from fast optimization algorithms, including the heavy ball algorithm and Nesterov’s accelerated gradient descent algorithm. Surprisingly, we find that ResNet and DenseNet can be regarded as special cases of the optimization-inspired architectures. These architectures offer not only theoretical guidance but also good performance in image recognition on multiple datasets, including CIFAR-10, CIFAR-100, and ImageNet. Moreover, we show that our method is also useful for NAS, either by offering a good initial search point or by guiding the search space.
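The abstract's conjecture rests on the fact that some optimization algorithms converge faster than plain gradient descent. The following is a minimal sketch of that gap, not the paper's method: it compares gradient descent with the heavy ball algorithm on a toy ill-conditioned quadratic. The test function, its eigenvalues, the iteration count, and all function names are assumptions chosen for illustration; the step sizes are the standard textbook choices for a quadratic with known smallest and largest eigenvalues. The mapping from these iterations to network layers is not implemented here.

```python
import math

# Hypothetical toy problem: f(x) = 0.5 * sum(a_i * x_i^2), minimized at x = 0.
EIGS = [1.0, 100.0]  # assumed eigenvalues; condition number is 100


def grad(x, eigs):
    """Gradient of the toy quadratic."""
    return [a * xi for a, xi in zip(eigs, x)]


def gradient_descent(x0, eigs, step, iters):
    """Plain gradient descent: x_{k+1} = x_k - step * grad f(x_k)."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x, eigs)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def heavy_ball(x0, eigs, step, beta, iters):
    """Heavy ball: x_{k+1} = x_k - step * grad f(x_k) + beta * (x_k - x_{k-1})."""
    x_prev = list(x0)
    x = list(x0)
    for _ in range(iters):
        g = grad(x, eigs)
        x_next = [
            xi - step * gi + beta * (xi - xpi)
            for xi, gi, xpi in zip(x, g, x_prev)
        ]
        x_prev, x = x, x_next
    return x


def norm(x):
    """Euclidean norm, used here as the distance to the minimizer at 0."""
    return math.sqrt(sum(v * v for v in x))


mu, L = min(EIGS), max(EIGS)
x0 = [1.0, 1.0]
iters = 100

# Textbook step size for gradient descent on a quadratic.
gd_err = norm(gradient_descent(x0, EIGS, 2.0 / (mu + L), iters))

# Textbook heavy ball parameters for a quadratic.
hb_step = 4.0 / (math.sqrt(L) + math.sqrt(mu)) ** 2
hb_beta = ((math.sqrt(L) - math.sqrt(mu)) / (math.sqrt(L) + math.sqrt(mu))) ** 2
hb_err = norm(heavy_ball(x0, EIGS, hb_step, hb_beta, iters))

print("gradient descent error:", gd_err)
print("heavy ball error:      ", hb_err)
```

The only difference between the two loops is the momentum term `beta * (x_k - x_{k-1})`, which reuses an earlier iterate. Read alongside the abstract, this is the kind of term that could plausibly be interpreted as a connection from an earlier layer, but that reading is an interpretation of the abstract rather than something this sketch demonstrates.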