NetBooster: Empowering Tiny Deep Learning By Standing on the Shoulders of Deep Giants

Zhongzhi Yu, Yonggan Fu, Jiayi Yuan, Haoran You, Yingyan Lin
2023-06-24
Abstract: Tiny deep learning has attracted increasing attention driven by the substantial demand for deploying deep learning on numerous intelligent Internet-of-Things devices. However, it is still challenging to unleash tiny deep learning's full potential on both large-scale datasets and downstream tasks due to the under-fitting issues caused by the limited model capacity of tiny neural networks (TNNs). To this end, we propose a framework called NetBooster to empower tiny deep learning by augmenting the architectures of TNNs via an expansion-then-contraction strategy. Extensive experiments show that NetBooster consistently outperforms state-of-the-art tiny deep learning solutions.
Machine Learning; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper "NetBooster: Empowering Tiny Deep Learning By Standing on the Shoulders of Deep Giants" addresses the challenges of deploying deep learning models on resource-constrained IoT devices. Specifically, it targets two main problems:

1. **Tiny Neural Networks (TNNs) perform poorly on large-scale datasets**:
   - **Problem Description**: Due to limited model capacity, tiny neural networks struggle to learn complex representative features, resulting in low accuracy on large-scale datasets (e.g., ImageNet).
   - **Impact**: This limits the performance of tiny neural networks in practical applications, especially in tasks requiring high precision.
2. **Tiny Neural Networks are limited in downstream tasks**:
   - **Problem Description**: Because they under-fit during training on large-scale datasets, tiny neural networks cannot fully leverage the pre-training-then-fine-tuning paradigm to improve accuracy in downstream tasks.
   - **Impact**: This further restricts the applicability of tiny neural networks in real-world scenarios, particularly in tasks requiring complex feature representations.

### Solution

To address the above issues, the paper proposes a framework called NetBooster, which augments the architecture of tiny neural networks through an expansion-then-contraction strategy. The two steps are as follows:

1. **Network Expansion**:
   - **Method**: Convert some layers of the original tiny neural network into multi-layer blocks, forming an expanded deep giant network. This helps in learning more complex features.
   - **Objective**: Increase the capacity of tiny neural networks, alleviating under-fitting during training, and thus achieving better feature learning on large-scale datasets.
2. **Progressive Linearization Tuning (PLT)**:
   - **Method**: Gradually remove the non-linear activation functions in the expanded blocks, then contract the expanded deep giant network back to the original tiny neural network structure.
   - **Objective**: Inherit the complex features learned by the deep giant network while maintaining the inference efficiency of the original tiny neural network.

### Experimental Results

The paper validates the effectiveness of NetBooster through extensive experiments:

- **Performance on large-scale datasets**: NetBooster improves the accuracy of tiny neural networks on the ImageNet dataset, outperforming existing methods by 1.3% to 2.5%.
- **Performance on downstream tasks**: NetBooster also excels in multiple downstream tasks (e.g., image classification and object detection), particularly on datasets such as CIFAR-100, Cars, Flowers102, Food101, and Pets, with accuracy improvements ranging from 0.46% to 4.75%.

### Conclusion

NetBooster addresses the performance issues of tiny neural networks on large-scale datasets and downstream tasks through an expansion-then-contraction strategy, improving their accuracy and practicality while maintaining their efficient inference capability.
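
The two steps in the Solution section can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation. It rests on several assumptions: fully connected layers stand in for the network's actual layers, the decayable activation takes the form `max(v, alpha * v)`, and `alpha` follows a linear ramp. All function names (`decayable_relu`, `alpha_schedule`, `expanded_block`, `contract`) are hypothetical. The sketch shows why contraction becomes possible once the non-linearity has been removed: two stacked linear layers compose into a single linear layer.

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def matmul(A, B):
    """Multiply matrix A by matrix B (both given as lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def decayable_relu(h, alpha):
    """Assumed activation max(v, alpha * v): ReLU at alpha=0, identity at alpha=1."""
    return [max(v, alpha * v) for v in h]


def alpha_schedule(step, total_steps):
    """Hypothetical linear ramp of alpha from 0 to 1 over total_steps."""
    return min(1.0, step / total_steps)


def expanded_block(x, W1, b1, W2, b2, alpha):
    """Expanded two-layer block with a decayable activation in between."""
    h = [a + b for a, b in zip(matvec(W1, x), b1)]
    h = decayable_relu(h, alpha)
    return [a + b for a, b in zip(matvec(W2, h), b2)]


def contract(W1, b1, W2, b2):
    """Fold both layers into one; exact only when the activation is the identity."""
    W = matmul(W2, W1)                                  # W = W2 @ W1
    b = [a + c for a, c in zip(matvec(W2, b1), b2)]     # b = W2 @ b1 + b2
    return W, b


# Toy weights: a 2 -> 3 -> 2 expanded block replacing a single 2 -> 2 layer.
W1 = [[1.0, -2.0], [0.5, 1.0], [-1.0, 0.0]]
b1 = [0.0, 1.0, -1.0]
W2 = [[1.0, 0.0, 2.0], [-1.0, 1.0, 0.5]]
b2 = [0.5, -0.5]
x = [1.0, 2.0]

# Progressive linearization: the block output drifts as alpha goes 0 -> 1.
for step in (0, 5, 10):
    alpha = alpha_schedule(step, 10)
    print(alpha, expanded_block(x, W1, b1, W2, b2, alpha))

# Contraction: one 2 -> 2 layer that reproduces the fully linearized block.
W, b = contract(W1, b1, W2, b2)
merged = [a + c for a, c in zip(matvec(W, x), b)]
print(merged)
```

At `alpha = 1` the expanded block and the merged layer give the same output, so the contracted network keeps the original layer's inference cost. In a real training loop the weights would also be fine-tuned while `alpha` increases, which this sketch omits.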