Abstract: The posit number system aims to be a drop-in replacement for the existing IEEE floating-point standard. Its properties, tapered precision and high dynamic range, allow a smaller posit to nearly match the performance of a much larger floating-point format in representing real numbers. This is especially useful for error-tolerant tasks such as deep learning inference, where low latency and small area are priorities. Recent research has found that the performance of deep neural network models saturates beyond a certain level of accuracy in the multipliers used for convolutions. The extra hardware cost of precise arithmetic circuits for such applications is therefore an unnecessary overhead. This paper explores approximate posit multipliers in the convolutional layers of deep neural networks and attempts to find an ideal balance between hardware utilization and inference accuracy. Posit multiplication involves several steps, of which mantissa multiplication consumes the most hardware resources. To mitigate this, a posit multiplier circuit is proposed that uses an approximate hybrid-radix Booth encoding for mantissa multiplication, together with techniques such as truncation and bit masking based on input regime size. In addition, a novel Booth encoding control scheme that prevents unnecessary bits from switching has been devised to reduce dynamic power dissipation. Compared to existing literature, these optimizations contribute to a 23% decrease in power dissipation in the mantissa multiplication stage. Further, a novel area- and energy-efficient decoder architecture has been developed, with an 11% reduction in dynamic power dissipation and area compared to existing decoders. Overall, the proposed posit multiplier offers a 14% reduction in the power-delay product (PDP) over existing approximate posit multiplier designs. The proposed multiplier also achieves over 90% accuracy in inference with deep learning models such as ResNet20, VGG-19 and DenseNet.
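
To make the terms "regime" and "tapered precision" concrete, the following is a minimal software sketch of the decode step that precedes mantissa multiplication. It is an illustration only, not the decoder architecture proposed in the paper; the function name `decode_posit` and the default parameters (`n = 8`, `es = 1`) are assumptions chosen for the example.

```python
from fractions import Fraction

def decode_posit(bits: int, n: int = 8, es: int = 1):
    """Decode an n-bit posit with es exponent bits into an exact Fraction.

    Hypothetical helper for illustration; returns None for NaR.
    """
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return Fraction(0)
    if bits == 1 << (n - 1):
        return None  # NaR (not a real)

    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & mask  # negative posits are stored in two's complement

    # The n-1 bits after the sign, most significant first.
    body = [(bits >> i) & 1 for i in range(n - 2, -1, -1)]

    # Regime: a run of identical bits ended by the opposite bit (or the word end).
    first = body[0]
    run = 1
    while run < len(body) and body[run] == first:
        run += 1
    k = run - 1 if first == 1 else -run

    rest = body[run + 1:]  # skip the regime terminator, if present
    exp_bits = rest[:es]
    exp_bits = exp_bits + [0] * (es - len(exp_bits))  # truncated exponent bits read as 0
    e = int("".join(map(str, exp_bits)), 2) if es else 0

    # Whatever is left is the fraction; a longer regime leaves fewer fraction bits.
    frac_bits = rest[es:]
    f = Fraction(0)
    if frac_bits:
        f = Fraction(int("".join(map(str, frac_bits)), 2), 1 << len(frac_bits))

    scale = k * (1 << es) + e
    magnitude = (1 + f) * Fraction(2) ** scale
    return -magnitude if sign else magnitude
```

Under these assumptions, `decode_posit(0b01001000)` has a one-bit regime run and four fraction bits, while `decode_posit(0b01111111)` spends every bit on the regime and has none. That shrinking fraction field is what regime-based truncation and masking schemes can exploit: inputs with long regimes carry short mantissas.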