Abstract: In recent years, deep neural networks (DNNs) have driven a boom in artificial intelligence Internet of Things applications with stringent demands for both high accuracy and low latency. A widely adopted solution is to process these computation-intensive DNN inference tasks with edge computing. Nevertheless, existing edge-based DNN processing methods still cannot achieve acceptable performance because of heavy data transmission and unnecessary computation. To address these limitations, we take advantage of multi-exit DNNs (ME-DNNs), which allow tasks to exit early at different depths of the DNN during inference, depending on input complexity. However, naively deploying ME-DNNs at the edge still fails to deliver fast and consistent inference in the wild. Specifically, 1) at the model level, unsuitable exit settings add computational overhead and lead to excessive queuing delay; 2) at the computation level, it is hard to sustain high performance consistently in a dynamic edge computing environment. In this paper, we present a Low Latency Edge Intelligence Scheme based on Multi-Exit DNNs (LEIME) to tackle these problems. At the model level, we propose an exit setting algorithm that automatically builds optimal ME-DNNs with lower time complexity; at the computation level, we present a distributed offloading mechanism that fine-tunes task dispatching at runtime to sustain high performance in the dynamic environment, with a close-to-optimal performance guarantee. Finally, we implement a prototype system and evaluate it extensively through testbed and large-scale simulation experiments. Experimental results demonstrate that LEIME significantly improves application performance, achieving a 1.1–18.7× speedup in different situations.
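
The early-exit idea mentioned in the abstract can be illustrated with a minimal sketch. It is not LEIME's exit setting algorithm or its offloading mechanism. It assumes a common confidence-threshold exit rule (exit at the first branch whose top softmax probability reaches a threshold), which the abstract does not specify. The names `early_exit_inference`, `stages`, `heads`, and `threshold` are hypothetical, and the "network" is a toy built from plain Python functions.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_inference(
    x: Vector,
    stages: Sequence[Callable[[Vector], Vector]],
    heads: Sequence[Callable[[Vector], Vector]],
    threshold: float = 0.9,
) -> Tuple[int, int]:
    """Return (predicted_class, exit_index).

    Assumed exit rule (not taken from the paper): stop at the first exit
    whose top softmax probability is >= threshold; the last exit always
    returns, so every input gets a prediction.
    """
    if not stages or len(stages) != len(heads):
        raise ValueError("need one exit head per backbone stage")
    h = x
    for i, (stage, head) in enumerate(zip(stages, heads)):
        h = stage(h)                      # run the next backbone segment
        probs = softmax(head(h))          # classify at this exit branch
        confidence = max(probs)
        if confidence >= threshold or i == len(stages) - 1:
            return probs.index(confidence), i
    raise AssertionError("unreachable: the last exit always returns")


# Toy three-stage "backbone": each stage doubles the features, so deeper
# exits see more separated logits. Exit heads are identity mappings.
def double(v: Vector) -> Vector:
    return [2.0 * a for a in v]


def identity(v: Vector) -> Vector:
    return list(v)


stages = [double, double, double]
heads = [identity, identity, identity]

easy = early_exit_inference([2.0, 0.0], stages, heads)   # confident at exit 0
hard = early_exit_inference([0.2, 0.0], stages, heads)   # runs to the last exit
print(easy, hard)
```

In this toy setup an "easy" input leaves at the first exit while a "hard" one traverses all stages, which is the behavior that makes the placement and number of exits matter for computation and queuing delay.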

Improving the Accuracy of Early Exits in Multi-Exit Architectures Via Curriculum Learning

Elastic DNN Inference with Unpredictable Exit in Edge Computing

Unlocking the Non-deterministic Computing Power with Memory-Elastic Multi-Exit Neural Networks

Consistency Training of Multi-exit Architectures for Sensor Data

Multi-Exit DNN Inference Acceleration Based on Multi-Dimensional Optimization for Edge Intelligence

Deep Feature Surgery: Towards Accurate and Efficient Multi-Exit Networks

To Exit or Not to Exit: Cost-Effective Early-Exit Architecture Based on Markov Decision Process

DynExit: A Dynamic Early-Exit Strategy for Deep Residual Networks

Single-layer vision transformers for more accurate early exits with less overhead

T-RECX: Tiny-Resource Efficient Convolutional neural networks with early-eXit

Multi-exit self-distillation with appropriate teachers

Early-Exit with Class Exclusion for Efficient Inference of Neural Networks

MMExit: Enabling Fast and Efficient Multi-modal DNN Inference with Adaptive Network Exits

LECO: Improving Early Exiting Via Learned Exits and Comparison-based Exiting Mechanism

QuickNets: Saving Training and Preventing Overconfidence in Early-Exit Neural Architectures

Simulating multi-exit evacuation using deep reinforcement learning

A Closer Look at Branch Classifiers of Multi-exit Architectures

Enabling Low Latency Edge Intelligence Based on Multi-exit DNNs in the Wild

Temporal Decisions: Leveraging Temporal Correlation for Efficient Decisions in Early Exit Neural Networks

Dynamic Path Based DNN Synergistic Inference Acceleration in Edge Computing Environment

Why should we add early exits to neural networks?