Condense: A Framework for Device and Frequency Adaptive Neural Network Models on the Edge.
Yifan Gong, Pu Zhao, Zheng Zhan, Yushu Wu, Chao Wu, Zhenglun Kong, Minghai Qin, Caiwen Ding, Yanzhi Wang
DOI: https://doi.org/10.1109/dac56929.2023.10247713
2023-01-01
Abstract: With the popularity of battery-powered edge computing, an important yet under-explored problem is how to support DNNs across diverse edge devices. On the one hand, different edge platforms have different runtime requirements and computation/memory capabilities. Deploying the same DNN model on all of them is unsatisfactory, while designing a specialized DNN for each platform is prohibitively expensive. On the other hand, on a single edge device, dynamic voltage and frequency scaling (DVFS) is used to prolong battery life, which causes significant inference speed variation for the same DNN and, consequently, a poor user experience. To tackle both problems, we propose CONDENSE, a framework that provides a single adaptive model that can be reconfigured instantly (by switching among sub-networks with different computation and parameter counts) for diverse devices and execution frequencies, without any retraining. Experiments demonstrate that CONDENSE simultaneously provides a large number of high-accuracy sub-networks, whose computation and parameter counts correspond to various sparsity ratios, to support diverse edge devices with different runtime requirements, and reduces the speed variation under varying frequencies on each device, at a memory cost of only one set of weights.
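The idea of switching among sub-networks that share one set of weights can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the magnitude-based masks, the helper names `magnitude_mask` and `pick_sparsity`, the sparsity ratios, and the simple model in which latency is proportional to the remaining compute divided by the clock frequency.

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Boolean mask that keeps the largest-magnitude (1 - sparsity) fraction.

    Hypothetical stand-in for a sparsity pattern; the paper's actual
    pruning scheme may differ.
    """
    k = int(round(sparsity * weights.size))      # number of weights to drop
    order = np.argsort(np.abs(weights), axis=None)
    mask = np.ones(weights.size, dtype=bool)
    mask[order[:k]] = False                      # drop the k smallest weights
    return mask.reshape(weights.shape)

def pick_sparsity(freq_ghz, max_freq_ghz, ratios):
    """Pick the smallest sparsity whose compute fits the current frequency.

    Assumes latency is proportional to (1 - sparsity) / frequency, so the
    affordable fraction of dense compute is freq_ghz / max_freq_ghz.
    """
    budget = freq_ghz / max_freq_ghz
    for s in sorted(ratios):
        if (1.0 - s) <= budget + 1e-9:
            return s
    return max(ratios)                           # fall back to the sparsest

# One shared weight matrix; each sub-network is just a mask over it.
rng = np.random.default_rng(0)
shared_w = rng.standard_normal((8, 16))
ratios = [0.0, 0.25, 0.5, 0.75]
masks = {s: magnitude_mask(shared_w, s) for s in ratios}

def forward(x, sparsity):
    """Run one linear layer using the sub-network for the given sparsity."""
    return (shared_w * masks[sparsity]) @ x

# Reconfigure when DVFS halves the clock: no retraining, no new weights.
x = rng.standard_normal(16)
s = pick_sparsity(freq_ghz=1.0, max_freq_ghz=2.0, ratios=ratios)
y = forward(x, s)
```

In this sketch only the masks differ between sub-networks, so the memory cost stays at one set of weights plus the mask bookkeeping, and switching is a dictionary lookup rather than a model reload.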