Ten Lessons We Have Learned in the New "Sparseland": A Short Handbook for Sparse Neural Network Researchers

Shiwei Liu, Zhangyang Wang
2023-06-25
Abstract: This article does not propose any novel algorithm or new hardware for sparsity. Instead, it aims to serve the "common good" for the increasingly prosperous Sparse Neural Network (SNN) research community. We attempt to summarize some most common confusions in SNNs, that one may come across in various scenarios such as paper review/rebuttal and talks - many drawn from the authors' own bittersweet experiences! We feel that doing so is meaningful and timely, since the focus of SNN research is notably shifting from traditional pruning to more diverse and profound forms of sparsity before, during, and after training. The intricate relationships between their scopes, assumptions, and approaches lead to misunderstandings, for non-experts or even experts in SNNs. In response, we summarize ten Q&As of SNNs from many key aspects, including dense vs. sparse, unstructured sparse vs. structured sparse, pruning vs. sparse training, dense-to-sparse training vs. sparse-to-sparse training, static sparsity vs. dynamic sparsity, before-training/during-training vs. post-training sparsity, and many more. We strive to provide proper and generically applicable answers to clarify those confusions to the best extent possible. We hope our summary provides useful general knowledge for people who want to enter and engage with this exciting community; and also provides some "mind of ease" convenience for SNN researchers to explain their work in the right contexts. At the very least (and perhaps as this article's most insignificant target functionality), if you are writing/planning to write a paper or rebuttal in the field of SNNs, we hope some of our answers could help you!
Machine Learning
What problem does this paper attempt to address?
This paper addresses common confusions and misunderstandings in Sparse Neural Network (SNN) research. The authors summarize ten common questions and their answers to help both established researchers and newcomers understand this rapidly developing field. The main problems the paper attempts to solve are:

1. **Clarifying the concept and application scenarios of sparse neural networks**:
   - The paper explains why sparse neural networks are studied instead of simply using smaller dense networks. Sparsifying large-scale dense models can significantly reduce computational and memory costs while maintaining performance.
2. **Distinguishing different types of sparsity**:
   - It describes the differences between unstructured and structured sparsity and explains how differently they behave under hardware acceleration.
   - It differentiates weight pruning from activation pruning and illustrates their respective application scenarios.
3. **Explaining different sparse training methods**:
   - It compares dense-to-sparse training with sparse-to-sparse training, covering their methodological differences, resource consumption, and applicable scenarios.
   - It explains the differences between Dynamic Sparse Training (DST) and Static Sparse Training (SST) and discusses their respective advantages and disadvantages.
4. **Answering questions about the Lottery Ticket Hypothesis (LTH)**:
   - It clarifies whether LTH counts as a sparse-to-sparse training method and explains how it relates to, and differs from, other sparse training methods.
5. **Providing a fair comparison framework**:
   - It proposes how to compare different sparse algorithms fairly and reliably, so that the lack of standardized evaluation protocols does not lead to unfair comparisons.
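The distinction between unstructured and structured sparsity mentioned above can be made concrete with a small sketch. It is an illustration only and is not taken from the paper. The function names `unstructured_mask` and `structured_mask`, the magnitude and row-L1-norm ranking criteria, and the toy weight matrix are all assumptions chosen for simplicity.

```python
def unstructured_mask(weights, sparsity):
    """Keep the largest-magnitude individual weights anywhere in the matrix."""
    magnitudes = sorted((abs(w) for row in weights for w in row), reverse=True)
    n_keep = round(len(magnitudes) * (1.0 - sparsity))
    if n_keep == 0:
        return [[0 for _ in row] for row in weights]
    threshold = magnitudes[n_keep - 1]  # smallest magnitude that survives
    return [[1 if abs(w) >= threshold else 0 for w in row] for row in weights]


def structured_mask(weights, sparsity):
    """Keep whole rows (e.g. output neurons), ranked by their L1 norm."""
    norms = [sum(abs(w) for w in row) for row in weights]
    n_keep = round(len(weights) * (1.0 - sparsity))
    ranked = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)
    kept_rows = set(ranked[:n_keep])
    return [[1 if i in kept_rows else 0 for _ in row]
            for i, row in enumerate(weights)]


# Toy 4x4 weight matrix (hypothetical values, distinct magnitudes).
toy_weights = [
    [0.90, -0.10, 0.05, 0.40],
    [0.02, 0.03, -0.80, 0.01],
    [0.30, -0.35, 0.25, 0.20],
    [-0.07, 0.06, 0.60, -0.50],
]

for name, mask_fn in [("unstructured", unstructured_mask),
                      ("structured", structured_mask)]:
    print(name)
    for mask_row in mask_fn(toy_weights, sparsity=0.5):
        print("  ", mask_row)
```

Both masks remove half of the weights, but the unstructured mask scatters zeros irregularly, while the structured mask removes entire rows, leaving a smaller dense matrix that is generally easier for commodity hardware to exploit.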
### Summary

By summarizing and answering ten common questions in sparse neural network research, the paper aims to help readers understand the basic concepts of sparse neural networks, the characteristics of different sparse methods, and how to conduct fair research comparisons. This helps researchers new to the field and also gives experienced researchers a reference for explaining and presenting their work more clearly.