Abstract: Large pretrained models, coupled with fine-tuning, are slowly becoming established as the dominant architecture in machine learning. Even though these models offer impressive performance, their practical application is often limited by the prohibitive amount of resources required for every inference. Early-exiting dynamic neural networks (EDNN) circumvent this issue by allowing a model to make some of its predictions from intermediate layers (i.e., early-exit). Training an EDNN architecture is challenging as it consists of two intertwined components: the gating mechanism (GM) that controls early-exiting decisions and the intermediate inference modules (IMs) that perform inference from intermediate representations. As a result, most existing approaches rely on thresholding confidence metrics for the gating mechanism and strive to improve the underlying backbone network and the inference modules. Although successful, this approach has two fundamental shortcomings: 1) the GMs and the IMs are decoupled during training, leading to a train-test mismatch; and 2) the thresholding gating mechanism introduces a positive bias into the predictive probabilities, making it difficult to readily extract uncertainty information. We propose a novel architecture that connects these two modules. This leads to significant performance improvements on classification datasets and enables better uncertainty characterization capabilities.
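
The thresholded-confidence gating that the abstract attributes to most existing approaches can be sketched as follows. This is a minimal illustration and not the method proposed in the paper: the function names, the choice of maximum softmax probability as the confidence metric, and the threshold values are all assumptions made for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(exit_logits, threshold=0.9):
    """Baseline threshold gating (hypothetical helper).

    exit_logits: per-exit logit vectors, ordered from the shallowest
    inference module to the final layer. In a real network each vector
    would be computed lazily, so that exiting early saves computation.
    Returns (predicted_class, exit_index, confidence).
    """
    last = len(exit_logits) - 1
    for i, logits in enumerate(exit_logits):
        probs = softmax(logits)
        conf = max(probs)
        # Exit as soon as the confidence clears the threshold;
        # the final module always produces a prediction.
        if conf >= threshold or i == last:
            return probs.index(conf), i, conf


# Three exits, three classes: the second exit is the first confident one.
logits_per_exit = [[0.1, 0.2, 0.0], [3.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
pred, exit_idx, conf = early_exit_predict(logits_per_exit, threshold=0.8)
```

Because a sample only exits once its confidence exceeds the threshold, the reported confidence of every early-exited prediction is at least the threshold by construction, which is one way to see the positive bias mentioned in the abstract.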

What problem does this paper attempt to address?

The paper addresses two core issues in existing early-exiting dynamic neural networks (EDNNs):

1. **Train-test mismatch**: In most existing EDNN methods, the intermediate inference modules (IMs) and the gating mechanisms (GMs) are decoupled during training, so the data distribution seen during training differs from the one encountered at test time. This degrades model performance.
2. **Lack of reliable confidence information**: The threshold-based gating mechanism introduces an overconfidence bias, making it difficult to extract uncertainty information about the predictions. This is a serious problem for models whose outputs vary with the available computational resources, because users need confidence information to decide whether to accept an efficient but possibly less accurate result or to request a more reliable output.

To solve these problems, the authors propose a new framework, **JEI-DNN (Jointly-Learned Exit and Inference for a Dynamic Neural Network)**. Its main contributions are:

- **Joint training of IMs and GMs**: Training the IMs and GMs jointly avoids the train-test mismatch directly and yields good uncertainty characterization.
- **New method for modeling exit probability**: The paper introduces a new way to model the probability of exiting at a specific inference module.
- **Bilevel optimization task**: The loss function accounts for both accuracy and inference cost, and the optimization is formulated as a bilevel task. Each level is simpler than the overall problem, which makes training more stable.
- **Empirical verification**: Experiments show that the method significantly improves the trade-off between inference cost and performance on classification datasets and produces reliable uncertainty characterizations, such as conformal intervals and well-calibrated predictive probabilities.

In conclusion, the paper aims to improve the performance and reliability of EDNNs by improving how they are trained, particularly with respect to uncertainty characterization and resource allocation.
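
To make the idea of an exit probability and an accuracy/cost trade-off concrete, here is a small sketch. The summary does not give the paper's exact formulation, so the parameterization below is an assumption: each gate outputs the conditional probability of exiting at its module given that the sample reached it, and the final module always exits. The names `exit_distribution` and `expected_cost`, and the per-exit cost values, are hypothetical.

```python
def exit_distribution(gate_probs):
    """Turn conditional gate probabilities into a distribution over exits.

    gate_probs[l] is the (assumed) probability of exiting at module l,
    given that the sample was not exited earlier. The value for the last
    module is ignored and treated as 1, since every sample must exit.
    """
    remaining = 1.0  # probability mass that has not exited yet
    dist = []
    last = len(gate_probs) - 1
    for l, g in enumerate(gate_probs):
        if l == last:
            g = 1.0
        dist.append(remaining * g)
        remaining *= 1.0 - g
    return dist


def expected_cost(dist, costs):
    """Expected inference cost under the exit distribution."""
    return sum(p * c for p, c in zip(dist, costs))


# Three exits with illustrative relative costs 1, 2 and 3.
dist = exit_distribution([0.5, 0.5, 0.2])   # [0.5, 0.25, 0.25]
cost = expected_cost(dist, [1.0, 2.0, 3.0])  # 1.75
```

Under this assumed parameterization, a training objective can weigh the expected prediction loss against `expected_cost`, which is the kind of accuracy/cost trade-off described in the contributions above.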

Jointly-Learned Exit and Inference for a Dynamic Neural Network: JEI-DNN

Elastic DNN Inference with Unpredictable Exit in Edge Computing

Unlocking the Non-deterministic Computing Power with Memory-Elastic Multi-Exit Neural Networks

Early-Exit with Class Exclusion for Efficient Inference of Neural Networks

Joint or Disjoint: Mixing Training Regimes for Early-Exit Models

Boosted Dynamic Neural Networks

Binary Early-Exit Network for Adaptive Inference on Low-Resource Devices

Dynamic Multi-path Neural Network

Fast yet Safe: Early-Exiting with Risk Control

Energy-Aware Dynamic Neural Inference

Resource-aware Deployment of Dynamic DNNs over Multi-tiered Interconnected Systems

A Progressive Subnetwork Searching Framework for Dynamic Inference

Early-Exit Neural Networks with Nested Prediction Sets

Subnetwork-to-go: Elastic Neural Network with Dynamic Training and Customizable Inference

JointDNN: An Efficient Training and Inference Engine for Intelligent Mobile Cloud Computing Services

QuickNets: Saving Training and Preventing Overconfidence in Early-Exit Neural Architectures

DyCE: Dynamically Configurable Exiting for Deep Learning Compression and Real-time Scaling

Learning to Weight Samples for Dynamic Early-Exiting Networks

Grad-Instructor: Universal Backpropagation with Explainable Evaluation Neural Networks for Meta-learning and AutoML

SEENN: Towards Temporal Spiking Early-Exit Neural Networks

Efficient Post-Training Augmentation for Adaptive Inference in Heterogeneous and Distributed IoT Environments