Enhancing Accuracy and Parameter-Efficiency of Neural Representations for Network Parameterization

Hongjun Choi, Jayaraman J. Thiagarajan, Ruben Glatt, Shusen Liu

2024-06-29

Abstract: In this work, we investigate the fundamental trade-off regarding accuracy and parameter efficiency in the parameterization of neural network weights using predictor networks. We present a surprising finding that, when recovering the original model accuracy is the sole objective, it can be achieved effectively through the weight reconstruction objective alone. Additionally, we explore the underlying factors for improving weight reconstruction under parameter-efficiency constraints, and propose a novel training scheme that decouples the reconstruction objective from auxiliary objectives such as knowledge distillation, which leads to significant improvements compared to state-of-the-art approaches. Finally, these results pave the way for more practical scenarios, where one needs to achieve improvements on both model accuracy and predictor network parameter-efficiency simultaneously.

Machine Learning, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper examines the trade-off between accuracy and parameter efficiency when neural network weights are parameterized by a predictor network. Its first finding is that the original model's accuracy can be recovered using the weight reconstruction objective alone, and that repeating the reconstruction over multiple rounds improves performance further. The authors then propose a training scheme that splits training into a reconstruction stage and a knowledge distillation stage, decoupling the two objectives; the paper reports that this significantly outperforms existing methods and makes the predictor network more parameter-efficient while maintaining high accuracy. According to the paper, the reconstruction loss alone can even yield a network that performs better than the original model, multiple rounds of reconstruction allow higher compression rates without sacrificing performance, and a high-capacity teacher network further improves the balance between compression and performance. Experiments on multiple datasets and network architectures support these strategies.
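The two-stage idea described above (fit the weights first, then refine with distillation) can be illustrated with a toy sketch. This is not the authors' implementation: the "original model" here is assumed to be a small linear model, the predictor is assumed to be a handful of coefficients over fixed cosine features of the weight index, and distillation is assumed to mean matching the original model's outputs on random inputs. All names (`predict_weights`, `recon_loss`, `distill_loss`, `gradient_descent`) are hypothetical.

```python
import math
import random

random.seed(0)
D, K = 32, 8  # D original weights are represented by K < D predictor parameters

# Assumption: the "original model" is a toy linear model y = w . x.
w = [math.sin(0.3 * i) + 0.1 * random.gauss(0, 1) for i in range(D)]

# Assumption: the predictor maps a weight index i to a weight value through
# fixed cosine features and K learnable coefficients a[k].
basis = [[math.cos(math.pi * k * (i + 0.5) / D) for i in range(D)] for k in range(K)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def predict_weights(a):
    return [sum(a[k] * basis[k][i] for k in range(K)) for i in range(D)]


# Stage 1 objective: mean squared error between predicted and original weights.
def recon_loss(a):
    return sum((p - t) ** 2 for p, t in zip(predict_weights(a), w)) / D


def recon_grad(a):
    err = [p - t for p, t in zip(predict_weights(a), w)]
    return [2.0 / D * dot(err, basis[k]) for k in range(K)]


# Stage 2 objective: match the original (teacher) model's outputs on sample inputs.
inputs = [[random.gauss(0, 1) for _ in range(D)] for _ in range(64)]
teacher_out = [dot(w, x) for x in inputs]
# The reconstructed model's output on x equals dot(a, feats_x), since it is linear in a.
feats = [[dot(basis[k], x) for k in range(K)] for x in inputs]


def distill_loss(a):
    return sum((dot(a, f) - t) ** 2 for f, t in zip(feats, teacher_out)) / len(inputs)


def distill_grad(a):
    errs = [dot(a, f) - t for f, t in zip(feats, teacher_out)]
    return [2.0 / len(inputs) * sum(e * f[k] for e, f in zip(errs, feats)) for k in range(K)]


def gradient_descent(a, grad_fn, lr, steps):
    for _ in range(steps):
        a = [ai - lr * gi for ai, gi in zip(a, grad_fn(a))]
    return a


a0 = [0.0] * K
a1 = gradient_descent(a0, recon_grad, lr=0.5, steps=200)      # stage 1: reconstruction only
a2 = gradient_descent(a1, distill_grad, lr=0.002, steps=200)  # stage 2: distillation only

print("reconstruction loss:", recon_loss(a0), "->", recon_loss(a1))
print("distillation loss:  ", distill_loss(a1), "->", distill_loss(a2))
```

Each stage optimizes a single objective, so the learning rate and step count can be chosen per stage rather than balancing a weighted sum of two losses. The learning rates and step counts above are arbitrary choices for this toy problem.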

DyRep: Bootstrapping Training with Dynamic Re-parameterization

Expand-and-Cluster: Parameter Recovery of Neural Networks

Enhancing Deep Neural Network Training Efficiency and Performance through Linear Prediction

Parameter Prediction for Unseen Deep Architectures

Weight Reparametrization for Budget-Aware Network Pruning

Adaptive inertia weights: an effective way to improve parameter estimation of hidden layer in stochastic configuration networks

Efficient and sparse neural networks by pruning weights in a multiobjective learning approach

Toward Compact Parameter Representations for Architecture-Agnostic Neural Network Compression

Post-Training Quantization for Re-parameterization via Coarse & Fine Weight Splitting

Training compact neural networks via

PELA: Learning Parameter-Efficient Models with Low-Rank Approximation

Towards a Unified View of Parameter-Efficient Transfer Learning

Training Compact Neural Networks via Auxiliary Overparameterization

Sequencing the Neurome: Towards Scalable Exact Parameter Reconstruction of Black-Box Neural Networks

NEAT: Nonlinear Parameter-efficient Adaptation of Pre-trained Models

Over Parameterized Two-level Neural Networks Can Learn Near Optimal Feature Representations

Sine Activated Low-Rank Matrices for Parameter Efficient Learning

Neural reparameterization improves structural optimization

Parametric Enhancement of PerceptNet: A Human-Inspired Approach for Image Quality Assessment