Floating-Point Quantization Analysis of Multi-Layer Perceptron Artificial Neural Networks

DOI: https://doi.org/10.1007/s11265-024-01911-0
2024-03-26
Journal of Signal Processing Systems
Abstract:The impact of quantization in Multi-Layer Perceptron (MLP) Artificial Neural Networks (ANNs) is presented in this paper. In this architecture, the constant increase in size and the demand to decrease bit precision are two factors that contribute to the significant enlargement of quantization errors. We introduce an analytical tool that models the propagation of Quantization Noise Power (QNP) in floating-point MLP ANNs. Contrary to the state-of-the-art approach, which compares the exact and quantized data experimentally, the proposed algorithm can predict the QNP theoretically when the effect of operation quantization and Coefficient Quantization Error (CQE) are considered. This supports decisions in determining the required precision during the hardware design. The algorithm is flexible in handling MLP ANNs of user-defined parameters, such as size and type of activation function. Additionally, a simulation environment is built that can perform each operation on an adjustable bit precision. The accuracy of the QNP calculation is verified with two publicly available benchmarked datasets, using the default precision simulation environment as a reference.
computer science, information systems,engineering, electrical & electronic
What problem does this paper attempt to address?