Edge intelligence in motion: Mobility-aware dynamic DNN inference service migration with downtime in mobile edge computing

P. Wang, T. Ouyang, G. Liao, J. Gong, S. Yu, X. Chen
DOI: https://doi.org/10.1016/j.sysarc.2022.102664
IF: 5.836
2022-01-01
Journal of Systems Architecture
Abstract: Edge intelligence (EI) has become a trend that pushes the deep learning frontier to the network edge, so that deep neural network (DNN) applications can run well on resource-constrained mobile devices with the benefits of edge computing. Because users move frequently among scattered edge servers in many scenarios, such as Internet of Vehicles applications, dynamic service migration is needed to maintain reliable and efficient quality of service (QoS). However, the inevitable service downtime incurred by migration can substantially degrade the real-time performance of delay-sensitive DNN inference services. To address this issue, we advocate user-centric management of dynamic DNN inference service migration with a flexible multi-exit mechanism, aiming to maximize overall user utility (e.g., DNN model inference accuracy) under varying service downtime. We first use dynamic programming to derive an optimal offline migration and exit point selection (OMEPS) algorithm for the case where complete future information about user behavior is available. For the more practical setting without complete future information, we incorporate the OMEPS algorithm into a model predictive control (MPC) framework and construct a mobility-aware service migration and DNN exit point selection (MOMEPS) algorithm, which improves long-term user utility using limited predicted future information. However, the heavy computation overhead of the MOMEPS algorithm burdens mobile devices, so we further propose a cost-efficient algorithm, named smart-MOMEPS, which introduces a neural-network-based migration judgment that controls when the MOMEPS algorithm is invoked by estimating whether the DNN service should be migrated. Extensive trace-driven simulation results demonstrate that smart-MOMEPS achieves significant overall utility improvements with low computation overhead compared with other online algorithms.
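To make the offline setting more concrete, the following is a minimal sketch of a dynamic program that jointly picks an edge server and a DNN exit point per time slot. It is a toy model under stated assumptions, not the paper's OMEPS formulation: the names `best_exit` and `plan_offline`, the integer units (milliseconds, accuracy in percent), the fixed per-migration `downtime`, and the rule that utility equals the accuracy of the deepest exit that still fits the deadline are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Exit = Tuple[int, int]  # (compute time in ms, accuracy in percent); hypothetical units


def best_exit(budget_ms: int, exits: List[Exit]) -> Tuple[int, int]:
    """Pick the most accurate exit whose compute time fits the budget.

    Returns (exit index, accuracy), or (-1, 0) when no exit fits.
    """
    best_idx, best_acc = -1, 0
    for idx, (compute_ms, accuracy) in enumerate(exits):
        if compute_ms <= budget_ms and accuracy > best_acc:
            best_idx, best_acc = idx, accuracy
    return best_idx, best_acc


def plan_offline(
    delay: List[List[int]],  # delay[t][s]: assumed network delay to server s in slot t
    exits: List[Exit],
    deadline: int,
    downtime: int,           # assumed time lost in a slot when the service migrates
) -> Tuple[int, List[Tuple[int, int]]]:
    """Return (total utility, [(server, exit index) per slot]) for a known trace."""
    num_slots, num_servers = len(delay), len(delay[0])

    # value[s]: best total utility of any plan that ends on server s.
    value: List[int] = []
    # back[t][s]: (previous server, exit index) behind value[s] at slot t.
    back: List[List[Tuple[int, int]]] = []

    first_row = []
    for s in range(num_servers):
        exit_idx, acc = best_exit(deadline - delay[0][s], exits)
        value.append(acc)
        first_row.append((-1, exit_idx))
    back.append(first_row)

    for t in range(1, num_slots):
        new_value, row = [], []
        for s in range(num_servers):
            best_total, best_prev, best_e = -1, -1, -1
            for p in range(num_servers):
                # Migrating (p != s) shrinks the time left for inference.
                penalty = downtime if p != s else 0
                exit_idx, acc = best_exit(deadline - delay[t][s] - penalty, exits)
                if value[p] + acc > best_total:
                    best_total, best_prev, best_e = value[p] + acc, p, exit_idx
            new_value.append(best_total)
            row.append((best_prev, best_e))
        value = new_value
        back.append(row)

    # Backtrack from the best final server to recover the per-slot decisions.
    server = max(range(num_servers), key=lambda s: value[s])
    total = value[server]
    plan: List[Tuple[int, int]] = []
    for t in range(num_slots - 1, -1, -1):
        prev, exit_idx = back[t][server]
        plan.append((server, exit_idx))
        server = prev
    plan.reverse()
    return total, plan
```

In this toy model a longer downtime leaves less time for inference in the migration slot, so the planner either falls back to an earlier (less accurate) exit or avoids migrating altogether. An online variant in the spirit of MPC would rerun such a planner over a short predicted window at each slot and apply only the first decision.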