Can Jin, Tianjin Huang, Yihua Zhang, Mykola Pechenizkiy, Sijia Liu, Shiwei Liu, Tianlong Chen
Abstract: The rapid development of large-scale deep learning models questions the affordability of hardware platforms, which necessitates the pruning to reduce their computational and memory footprints. Sparse neural networks as the product, have demonstrated numerous favorable benefits like low complexity, undamaged generalization, etc. Most of the prominent pruning strategies are invented from a model-centric perspective, focusing on searching and preserving crucial weights by analyzing network topologies. However, the role of data and its interplay with model-centric pruning has remained relatively unexplored. In this research, we introduce a novel data-model co-design perspective: to promote superior weight sparsity by learning important model topology and adequate input data in a synergetic manner. Specifically, customized Visual Prompts are mounted to upgrade neural Network sparsification in our proposed VPNs framework. As a pioneering effort, this paper conducts systematic investigations about the impact of different visual prompts on model pruning and suggests an effective joint optimization approach. Extensive experiments with 3 network architectures and 8 datasets evidence the substantial performance improvements from VPNs over existing state-of-the-art pruning algorithms. Furthermore, we find that subnetworks discovered by VPNs from pre-trained models enjoy better transferability across diverse downstream scenarios. These insights shed light on new promising possibilities of data-model co-designs for vision model sparsification.
What problem does this paper attempt to address?
This paper asks how data-model co-design can improve the sparsification of vision models. Specifically, it explores how to optimize model weights and input data simultaneously during pruning by introducing customized Visual Prompts, thereby improving the performance and efficiency of sparse neural networks.
### Background and Motivation
The rapid development of large-scale deep learning models places a growing burden on hardware platforms, which has prompted researchers to seek ways to reduce computation and memory usage. Sparse neural networks are one such solution, offering low complexity without damaging generalization. However, most existing pruning strategies take a model-centric perspective, searching for and retaining key weights by analyzing the network topology, while the role of data and its interaction with model pruning remains relatively unexplored.
### Research Objectives
This paper proposes a new data-model co-design perspective that aims to promote better weight sparsity by learning the important model topology and suitable input data together. Concretely, the proposed VPNs framework mounts customized Visual Prompts on the input to upgrade the network sparsification process.
### Main Contributions
1. **Pilot Study**: The authors conduct a systematic preliminary study of existing post-pruning prompts in sparse vision models and find that they are not effective at improving performance.
2. **Algorithm**: The authors propose VPNs, a new data-model co-designed sparsification paradigm that simultaneously optimizes weight masks and customized Visual Prompts.
3. **Experiments**: Extensive experiments across multiple datasets, architectures, and pruning strategies show that VPNs yields significant improvements in both performance and efficiency. For example, at a sparsity of 90%, VPNs outperforms HYDRA, BiP, and LTH by 3.41%, 1.69%, and 2.00% respectively on Tiny-ImageNet.
4. **Additional Findings**: The sparse masks generated by VPNs transfer better across multiple downstream tasks.
5. **Potential Practical Benefits**: VPNs can be seamlessly integrated into structured pruning methods to achieve greater real-time acceleration and memory reduction while maintaining competitive accuracy.
### Method Overview
1. **Design Appropriate Visual Prompts**: Visual prompts modify the input image by injecting a small number of learnable parameters. The specific form of the input prompt is:
\[
x'(\delta)=h(x, \delta), \quad x \in D =\{(x_1, y_1), \ldots,(x_n, y_n)\}
\]
where \(h(\cdot, \cdot)\) is an input transformation function that combines the original image \(x\) with the learnable input perturbation \(\delta\) to generate the modified data \(x'\).
2. **Use Visual Prompts to Improve Network Sparsification**: The visual prompt \(\delta\) and the weight mask \(m\) are optimized jointly to improve the performance of the pruned pre-trained source model on downstream tasks. The optimization problem is:
\[
\min_{m, \delta}\mathbb{E}_{(x, y) \in D}\left[L\left(f_{\theta_{\text{pre}} \odot m}(x'(\delta)), y\right)\right] \quad \text{s.t.} \quad \|m\|_0 \leq(1 - s)|\theta_{\text{pre}}|
\]
where \(\theta_{\text{pre}}\) denotes the pre-trained model weights, \(m\) is the binary weight mask, \(\odot\) is element-wise multiplication, and \(s\) is the desired sparsity, so at most a fraction \(1 - s\) of the weights is kept.
3. **Overall Process**: VPNs first creates the visual prompts, then jointly optimizes the visual prompts and the parameterized weight masks to find the sparse sub-network. Finally, it further fine-tunes the weights and visual prompts of that sparse sub-network.
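
The ingredients above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the padding-style form of \(h(x, \delta)\), the score-based top-k mask, the single linear layer standing in for \(f\), and all function names (`apply_pad_prompt`, `topk_mask`, `sparse_forward`) are assumptions made for illustration. The gradient updates of \(\delta\) and the mask scores, as well as the final fine-tuning stage, are omitted.

```python
import numpy as np

def apply_pad_prompt(x, delta, pad):
    """Assumed form of h(x, delta): center the image on a larger canvas
    and add the learnable perturbation delta on the border only.
    x: (C, H, W); delta: (C, H + 2*pad, W + 2*pad)."""
    c, h, w = x.shape
    canvas = np.zeros((c, h + 2 * pad, w + 2 * pad), dtype=np.float32)
    canvas[:, pad:pad + h, pad:pad + w] = x
    border = np.ones_like(canvas)
    border[:, pad:pad + h, pad:pad + w] = 0.0  # leave the image region untouched
    return canvas + border * delta

def topk_mask(scores, sparsity):
    """Binary mask keeping the floor((1 - s) * n) highest-scoring weights,
    so that ||m||_0 <= (1 - s) * |theta| holds. Score-based masking is an
    assumption; the summary only says the masks are parameterized."""
    n = scores.size
    keep = int(np.floor((1.0 - sparsity) * n))
    mask = np.zeros(n, dtype=np.float32)
    if keep > 0:
        mask[np.argsort(scores.ravel())[-keep:]] = 1.0
    return mask.reshape(scores.shape)

def sparse_forward(theta, mask, x_prompted):
    """f_{theta ⊙ m}(x'(delta)) for a toy linear model (hypothetical stand-in)."""
    return (theta * mask) @ x_prompted.ravel()

# Toy usage: a 3x4x4 image, a pad of 2, and a 10-class linear classifier.
rng = np.random.default_rng(0)
x = rng.random((3, 4, 4)).astype(np.float32)
delta = np.zeros((3, 8, 8), dtype=np.float32)         # prompt, would be learned
theta = rng.standard_normal((10, 3 * 8 * 8)).astype(np.float32)
scores = rng.random(theta.shape)                      # mask scores, would be learned
mask = topk_mask(scores, sparsity=0.9)
logits = sparse_forward(theta, mask, apply_pad_prompt(x, delta, pad=2))
```

In a full pipeline, `delta` and `scores` would be updated together by gradient descent on the downstream loss, which is what makes the optimization joint rather than a pruning step followed by a separate prompting step.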
### Experimental Results
1. **Main Results**: In multiple...