Abstract: State-of-the-art approaches employ approximate computing to reduce the energy consumption of DNN hardware. Approximate DNNs then require extensive retraining to recover the accuracy lost through the use of approximate operations. However, retraining of complex DNNs does not scale well. In this paper, we demonstrate that efficient approximations can be introduced into the computational path of DNN accelerators while retraining can be avoided completely. The proposed method, ALWANN, provides highly optimized implementations of DNNs for custom low-power accelerators in which the number of computing units is lower than the number of DNN layers. First, a fully trained DNN is converted to operate with 8-bit weights and 8-bit multipliers in convolutional layers. A suitable approximate multiplier is then selected for each computing element from a library of approximate multipliers in such a way that (i) one approximate multiplier serves several layers, and (ii) the overall classification error and energy consumption are minimized. The optimization problems, including the multiplier selection problem, are solved by means of the multiobjective optimization algorithm NSGA-II. To completely avoid the computationally expensive retraining of the DNN, which is usually employed to improve the classification accuracy, we propose a simple weight updating scheme that compensates for the inaccuracy introduced by the approximate multipliers. The proposed approach is evaluated for two architectures of DNN accelerators with approximate multipliers from the open-source "EvoApprox" library. We report that the proposed approach saves 30% of the energy needed for multiplication in the convolutional layers of ResNet-50 while the accuracy is degraded by only 0.6%.
The proposed technique and approximate layers are available as an open-source extension of TensorFlow at https://github.com/ehw-fit/tf-approximate.
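The abstract does not spell out the weight updating scheme, so the following Python sketch shows only one plausible reading of it, under stated assumptions. An 8-bit approximate multiplier is tabulated as a 256x256 lookup table, and every weight value w is replaced by the value w' whose approximate products are closest to the exact products a*w over all inputs a. The unsigned operand range, the toy truncating multiplier, and all function names are hypothetical; none of them is taken from the EvoApprox library or from the tf-approximate code.

```python
import numpy as np


def build_lut(mult):
    """Tabulate an 8-bit unsigned multiplier as lut[a, w] (assumed operand range 0..255)."""
    v = np.arange(256, dtype=np.int64)
    return mult(v[:, None], v[None, :])


def truncated_mult(a, w, drop_bits=2):
    """Hypothetical approximate multiplier: clear the low bits of both operands."""
    mask = ~((1 << drop_bits) - 1)
    return (a & mask) * (w & mask)


def weight_remap(lut):
    """For each weight w, pick w' minimizing sum_a |lut[a, w'] - a * w|."""
    v = np.arange(256, dtype=np.int64)
    remap = np.empty(256, dtype=np.int64)
    for w in range(256):
        exact = (v * w)[:, None]               # exact products a * w, one row per input a
        err = np.abs(lut - exact).sum(axis=0)  # total error of every candidate w'
        remap[w] = err.argmin()
    return remap


def total_error(lut, remap):
    """Summed absolute product error over all (a, w) pairs after remapping the weights."""
    v = np.arange(256, dtype=np.int64)
    exact = v[:, None] * v[None, :]
    return int(np.abs(lut[:, remap] - exact).sum())


lut = build_lut(truncated_mult)
remap = weight_remap(lut)
identity = np.arange(256)
print("error without remapping:", total_error(lut, identity))
print("error with remapping:   ", total_error(lut, remap))
```

Because w' = w is always among the candidates, the remapped error can never exceed the original one. The remapping is a one-off table lookup over the quantized weights, which is what would make such a scheme cheap compared with retraining.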

Invocation-driven Neural Approximate Computing with a Multiclass-Classifier and Multiple Approximators

Neural Approximating Architecture Targeting Multiple Application Domains

AXNet: ApproXimate computing using an end-to-end trainable neural network

Reconfigurable Architecture for Neural Approximation in Multimedia Computing

ALWANN: Automatic Layer-Wise Approximation of Deep Neural Network Accelerators without Retraining

AxR-NN: Approximate Computation Reuse for Energy-Efficient Convolutional Neural Networks

CoAxNN: Optimizing on-device deep learning with conditional approximate neural networks

Neural Network Approximators for Marginal MAP in Probabilistic Circuits

Function Approximation with Randomly Initialized Neural Networks for Approximate Model Reference Adaptive Control

A Hardware/Software Co-Design Methodology for Adaptive Approximate Computing in Clustering and ANN Learning

Towards Energy-Efficient Collaborative Inference Using Multi-System Approximations

AxTrain

Energy-efficient DNN Inference on Approximate Accelerators Through Formal Property Exploration

QoS-Nets: Adaptive Approximate Neural Network Inference

Energy-efficient Neural Networks Using Approximate Computation Reuse

ApproxDARTS: Differentiable Neural Architecture Search with Approximate Multipliers

ApproxPilot: A GNN-based Accelerator Approximation Framework

Memristor-based Approximated Computation

Training Neural Networks for Execution on Approximate Hardware

INA: Incremental Network Approximation Algorithm for Limited Precision Deep Neural Networks

Accelerating TinyML Inference on Microcontrollers through Approximate Kernels