OCAP: On-device Class-Aware Pruning for personalized edge DNN models

Ye-Da Ma, Zhi-Chao Zhao, Di Liu, Zhenli He, Wei Zhou
DOI: https://doi.org/10.1016/j.sysarc.2023.102956
IF: 5.836
2023-08-04
Journal of Systems Architecture
Abstract: In this paper, we propose OCAP, a new on-device class-aware pruning method for edge systems. The motivation is that Deep Neural Network (DNN) models are usually trained on large datasets so that they learn diverse features and generalize to accurately predict numerous classes. Prior work reveals that some features (channels) are related only to certain classes. Edge systems, however, are usually deployed in a specific environment where the classes the system detects are limited. As a result, deploying a generally trained model in a specific edge environment leads to unnecessary redundancy. Meanwhile, transferring data and models to the cloud for personalization raises privacy issues. We therefore propose an on-device class-aware pruning method that removes the channels irrelevant to the classes the edge system mostly observes, thereby reducing the model's Floating Point Operations (FLOPs), memory footprint, and latency, and improving energy efficiency, while keeping relatively high accuracy for the observed classes and protecting in-situ data privacy. OCAP uses a novel class-aware pruning method based on the intermediate activations of input images to identify class-irrelevant channels. Moreover, we propose a KL-divergence-based method to select diverse and representative data for effectively fine-tuning the pruned model. The experimental results show the effectiveness and efficiency of OCAP. Compared with state-of-the-art class-aware pruning methods, OCAP achieves better accuracy and a higher compression ratio. Additionally, we evaluate OCAP on Nvidia Jetson Nano, Nvidia Jetson TX2, and Nvidia Jetson AGX Xavier in terms of efficiency; the results demonstrate the applicability of OCAP to edge systems. The code is available at https://github.com/mzd2222/OCAP.
computer science, software engineering, hardware & architecture
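The abstract names two ingredients: scoring channels by the intermediate activations they produce for the observed classes, and using KL-divergence when selecting fine-tuning data. The sketch below is only a rough illustration of those two ideas and is not the OCAP algorithm itself. The scoring rule (mean absolute activation), the function names, the `prune_ratio` parameter, and the toy numbers are all assumptions made for this example; the authors' actual criteria are in the paper and the linked repository.

```python
import math


def channel_scores(activations):
    """Mean absolute activation per channel (assumed scoring rule).

    activations: list of samples, each a list of per-channel values,
    e.g. spatially averaged feature-map responses for one input image.
    """
    n_channels = len(activations[0])
    totals = [0.0] * n_channels
    for sample in activations:
        for c, value in enumerate(sample):
            totals[c] += abs(value)
    return [t / len(activations) for t in totals]


def channels_to_prune(activations, prune_ratio):
    """Indices of the lowest-scoring channels (hypothetical criterion)."""
    scores = channel_scores(activations)
    n_prune = int(len(scores) * prune_ratio)
    ranked = sorted(range(len(scores)), key=lambda c: scores[c])
    return sorted(ranked[:n_prune])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Toy data (made up): 3 images from the observed classes, 4 channels.
acts = [
    [0.9, 0.0, 0.4, 0.01],
    [0.8, 0.1, 0.5, 0.00],
    [1.0, 0.0, 0.3, 0.02],
]
pruned = channels_to_prune(acts, prune_ratio=0.5)
print(pruned)  # channels 1 and 3 respond weakly to these inputs
```

In this toy setting, channels that barely respond to images of the observed classes are the pruning candidates. A KL-divergence such as `kl_divergence` could then compare the output distributions of candidate samples so that the data kept for fine-tuning is not redundant; how OCAP actually does this is not specified in the abstract.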
What problem does this paper attempt to address?