Training Compact DNNs with ℓ1/2 Regularization

Anda Tang, Lingfeng Niu, Jianyu Miao, Peng Zhang
DOI: https://doi.org/10.1016/j.patcog.2022.109206
IF: 8
2023-04-01
Pattern Recognition
Abstract: Deep neural networks (DNNs) have achieved unprecedented success in many fields. However, their large number of parameters imposes a heavy burden on storage and computation, which hinders the development and application of DNNs. It is therefore worthwhile to compress the model to reduce its complexity. Sparsity-inducing regularizers are among the most common tools for compression. In this paper, we propose using the ℓ1/2 quasi-norm to zero out the weights of neural networks and to compress the networks automatically during the learning process. To our knowledge, this is the first work to apply a non-Lipschitz continuous regularizer to the compression of DNNs. The resulting sparse optimization problem is solved by a stochastic proximal gradient algorithm. For computational convenience, an approximation of the threshold-form solution to the proximal operator of the ℓ1/2 regularizer is also given. Extensive experiments with various datasets and baselines demonstrate the advantages of our new method.
Computer science, artificial intelligence; engineering, electrical & electronic
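
The proximal step mentioned in the abstract can be illustrated with a minimal sketch. It uses the classical closed-form "half thresholding" solution for the scalar problem min over x of (x - y)^2 + lam * |x|^(1/2), which is an assumption made here for illustration; it is not the approximation proposed in the paper, which the abstract does not spell out. The function names `half_threshold` and `prox_gradient_step`, and the scaling of the step size against the regularization weight, are hypothetical choices.

```python
import math


def half_threshold(y, lam):
    """Minimizer of (x - y)**2 + lam * sqrt(|x|) for a scalar y (lam >= 0).

    Classical half-thresholding form, used here as an illustrative stand-in;
    it is not the approximation proposed in the paper.
    """
    # Inputs at or below this magnitude are mapped exactly to zero.
    threshold = (54.0 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    if abs(y) <= threshold:
        return 0.0
    # Above the threshold the argument of acos is below 1/sqrt(2).
    phi = math.acos((lam / 8.0) * (abs(y) / 3.0) ** -1.5)
    return (2.0 / 3.0) * y * (1.0 + math.cos(2.0 * math.pi / 3.0 - 2.0 * phi / 3.0))


def prox_gradient_step(weights, grads, lr, lam):
    """One proximal gradient step on a flat list of weights.

    Assumes the objective loss(w) + lam * sum(sqrt(|w_i|)). The prox of
    lr * lam * sqrt(|x|) under 0.5 * (x - v)**2 equals half_threshold
    with regularization weight 2 * lr * lam.
    """
    return [half_threshold(w - lr * g, 2.0 * lr * lam) for w, g in zip(weights, grads)]


# Small weights are zeroed out, large weights are shrunk slightly.
new_weights = prox_gradient_step([0.05, 2.0], [0.0, 0.0], lr=0.1, lam=1.0)
```

In a stochastic setting, `grads` would be a minibatch gradient; weights that fall below the threshold after the gradient step become exactly zero, which is what produces the sparsity.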