Abstract: Motivated by the prospects of 5G communications and the industrial Internet of Things (IoT), recent years have seen the rise of a new computing paradigm, edge computing, which shifts data analytics to the network edge, close to big data sources. Although deep neural networks (DNNs) are used extensively across many platforms and scenarios, they are usually both compute- and memory-intensive and thus difficult to deploy on resource-limited edge devices and in performance-demanding edge applications. Hence, there is an urgent need for techniques that enable DNN models to fit on edge devices while ensuring acceptable execution costs and inference accuracy. This article proposes an on-demand DNN model inference system for industrial edge devices, called knowledge distillation and early exit on edge (EdgeKE). It focuses on two design knobs: first, DNN compression based on knowledge distillation, which trains compact edge models under the supervision of large, complex models to improve accuracy and speed; second, DNN acceleration based on early exit, which provides flexible choices for satisfying the distinct latency or accuracy requirements of edge applications. Extensive evaluations on the CIFAR100 dataset across three state-of-the-art edge devices demonstrate that EdgeKE significantly outperforms the baseline models in inference latency and memory footprint while maintaining competitive classification accuracy. Furthermore, EdgeKE is verified to adapt efficiently to application requirements on inference performance: the accuracy loss is within 4.84% under various latency constraints, and the speedup ratio is up to 3.30× under various accuracy requirements.
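
The two design knobs named in the abstract can be illustrated with a minimal Python sketch. It is not EdgeKE's implementation: the abstract does not give the loss function or the exit-selection rule. The sketch assumes the commonly used temperature-scaled distillation loss (hard-label cross-entropy plus a KL term against the teacher's softened outputs) and a simple lookup over a per-exit latency/accuracy profile. The names `distillation_loss`, `pick_exit`, `temperature`, `alpha`, and all profile numbers are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with optional temperature scaling."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Assumed loss form: alpha * CE(student, label)
    + (1 - alpha) * T^2 * KL(teacher_T || student_T)."""
    # Hard-label term: cross-entropy of the student at temperature 1.
    hard = -math.log(softmax(student_logits)[label])
    # Soft-label term: KL divergence between softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(p_teacher, p_student) if t > 0)
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft

def pick_exit(exits, max_latency_ms=None, min_accuracy=None):
    """Choose an exit from a profile of (name, latency_ms, accuracy) tuples.

    Under a latency bound, return the most accurate feasible exit;
    under an accuracy bound, return the fastest feasible exit;
    return None when no exit satisfies the requirement.
    """
    if max_latency_ms is not None:
        feasible = [e for e in exits if e[1] <= max_latency_ms]
        return max(feasible, key=lambda e: e[2]) if feasible else None
    if min_accuracy is not None:
        feasible = [e for e in exits if e[2] >= min_accuracy]
        return min(feasible, key=lambda e: e[1]) if feasible else None
    return exits[-1]  # no requirement given: run the full model

# Hypothetical profile; these numbers are made up, not results from the paper.
profile = [("exit1", 5.0, 0.60), ("exit2", 9.0, 0.68), ("final", 15.0, 0.72)]
print(pick_exit(profile, max_latency_ms=10.0))
print(pick_exit(profile, min_accuracy=0.65))
```

In this sketch the profile would be measured offline on each target device, so the same trained model can serve either a latency-bound or an accuracy-bound request by changing only the exit at which inference stops.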

Rethinking Machine Learning Development and Deployment for Edge Devices

Deep Learning in the Era of Edge Computing: Challenges and Opportunities

Understanding Sensor Data Using Deep Learning Methods on Resource-Constrained Edge Devices

Data-Intensive Application Deployment at Edge: A Deep Reinforcement Learning Approach

Privacy-Preserving Machine Learning Based Data Analytics on Edge Devices

Deep Learning With Edge Computing: A Review

Enabling Deep Learning on Edge Devices

Power Efficient Machine Learning Models Deployment on Edge IoT Devices

In-Edge AI: Intelligentizing Mobile Edge Computing, Caching and Communication by Federated Learning

EdgeKE: An On-Demand Deep Learning IoT System for Cognitive Big Data on Industrial Edge Devices

Edge Intelligence: On-Demand Deep Learning Model Co-Inference with Device-Edge Synergy

Wireless Network Intelligence at the Edge

TinyMLOps: Operational Challenges for Widespread Edge AI Adoption

Edge AI and On-Device Machine Learning For Real Time Processing

Edge-PRUNE: Flexible Distributed Deep Learning Inference

AI on the Edge: Rethinking AI-based IoT Applications Using Specialized Edge Architectures

AI Multi-Tenancy on Edge: Concurrent Deep Learning Model Executions and Dynamic Model Placements on Edge Devices

EdgeMove: Pipelining Device-Edge Model Training for Mobile Intelligence

EdgeLD: Locally Distributed Deep Learning Inference on Edge Device Clusters

Toward Democratized Generative AI in Next-Generation Mobile Edge Networks