Abstract: Approximate computing is a promising design paradigm that introduces a new dimension, error, into the original design space. By allowing inexact computation in error-tolerant applications, approximate computing can gain both performance and energy efficiency. A neural network (NN) is a universal approximator in theory and possesses a high level of parallelism. Emerging deep neural network accelerators deployed with an NN-based approximator are thereby promising candidates for approximate computing. Nevertheless, the approximation result must satisfy the users' requirement, and the approximation result varies across different applications. We normally deploy an NN-based classifier to ensure the approximation quality: only the inputs predicted to meet the quality requirement are executed by the approximator. The potential of these two NNs, however, is not fully explored; the involvement of two NNs in approximate computing raises critical optimization questions, such as the two NNs' distinct views of the input data space, how to train the two correlated NNs, and what their topologies should be. In this article, we propose a novel NN-based approximate computing framework with quality assurance. We advocate a co-training approach that trains the classifier and the approximator alternately to maximize the agreement of the two NNs on the input space. In each iteration, we coordinate the training of the two NNs with a judicious selection of training data. Next, we explore different selection policies and propose to select training data from multiple iterations, which can increase the invocation of the approximate accelerator. In addition, we optimize the classifier by integrating a dynamic threshold tuning algorithm to further improve the invocation of the approximate accelerator. The increased invocation of the accelerator leads to higher energy efficiency under the same quality requirement.
We propose two efficient algorithms to explore the smallest topology of the NN-based approximator and the classifier that achieves the quality requirement. The first algorithm straightforwardly searches for the minimum topology using a greedy strategy, but it incurs too much training overhead. To solve this issue, the second algorithm gradually grows the topology of the NNs to match the quality requirement by transferring the learned parameters. Experimental results show significant improvement in quality and energy efficiency compared to existing NN-based approximate computing frameworks.
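The classifier-gated invocation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `gated_run` and `tune_threshold`, the scalar error metric, and the simple threshold sweep over a calibration set are all assumptions made for illustration, and the paper's dynamic threshold tuning algorithm may work quite differently. The classifier, approximator, and exact function are passed in as plain callables that stand in for the trained NNs and the original kernel.

```python
from typing import Callable, Optional, Sequence, Tuple

Fn = Callable[[float], float]


def gated_run(x: float, classifier_score: Fn, approximator: Fn,
              exact_fn: Fn, threshold: float) -> Tuple[float, bool]:
    """Run the approximator only when the classifier predicts it is safe.

    Returns (result, invoked), where invoked tells whether the
    approximate path was taken. All names here are hypothetical.
    """
    if classifier_score(x) >= threshold:
        return approximator(x), True
    return exact_fn(x), False


def tune_threshold(inputs: Sequence[float], classifier_score: Fn,
                   approximator: Fn, exact_fn: Fn, error_bound: float,
                   candidates: Sequence[float]) -> Optional[Tuple[float, float]]:
    """Pick the lowest candidate threshold that keeps quality on a calibration set.

    A lower threshold accepts more inputs, so the first candidate (in
    ascending order) whose accepted inputs all satisfy the error bound
    gives the highest invocation rate among the candidates.
    Returns (threshold, invocation_rate), or None if no candidate works.
    """
    for t in sorted(candidates):
        accepted = [x for x in inputs if classifier_score(x) >= t]
        within_bound = all(
            abs(approximator(x) - exact_fn(x)) <= error_bound for x in accepted
        )
        if within_bound:
            return t, len(accepted) / len(inputs)
    return None
```

In this sketch, a higher invocation rate means more inputs go through the approximate accelerator, which is the quantity the abstract ties to energy efficiency under a fixed quality requirement.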

Exploiting Network Loss for Distributed Approximate Computing with NetApprox

ApproSync: Approximate State Synchronization for Programmable Networks.

On Performance Optimization and Quality Control for Approximate-Communication-Enabled Networks-on-Chip

AXNet: ApproXimate computing using an end-to-end trainable neural network

Work in Progress: ACAC: an Adaptive Congestion-aware Approximate Communication Mechanism for Network-on-Chip Systems

Energy-Efficient and Quality-Assured Approximate Computing Framework Using a Co-Training Method.

Chapter Six - Approximate communication for energy-efficient network-on-chip.

A Machine Learning Based Approximate Computing Approach on Data Flow Graphs: Work-in-Progress

INA: Incremental Network Approximation Algorithm for Limited Precision Deep Neural Networks

A FPGA Friendly Approximate Computing Framework with Hybrid Neural Networks: (Abstract Only).

Nova: Towards On-Demand Equivalent Network View Abstraction For Network Optimization

From Network Inference Errors to Utility Suboptimality: How Much is the Impact?

A comprehensive exploration of approximate DNN models with a novel floating-point simulation framework

Concrete: A Per-layer Configurable Framework for Evaluating DNN with Approximate Operators

Exploiting Significance of Computations for Energy-Constrained Approximate Computing

Markov Approximation for Combinatorial Network Optimization

Resilient Approximation-Based Distributed Nonconvex Optimization

Approximate Byzantine Fault-Tolerance in Distributed Optimization

Approximation Algorithms for Minimizing Congestion in Demand-Aware Networks

Approximating Generalized Network Design under (Dis)economies of Scale with Applications to Energy Efficiency