Exploring and Exploiting Runtime Reconfigurable Floating Point Precision in Scientific Computing: a Case Study for Solving PDEs

Cong "Callie" Hao
2024-09-23
Abstract: Scientific computing applications, such as computational fluid dynamics and climate modeling, typically rely on 64-bit double-precision floating-point operations, which are extremely costly in terms of computation, memory, and energy. While the machine learning community has successfully utilized low-precision computations, scientific computing remains cautious due to concerns about numerical stability. To tackle this long-standing challenge, we propose a novel approach to dynamically adjust the floating-point data precision at runtime, maintaining computational fidelity using lower bit widths. We first conduct a thorough analysis of data range distributions during scientific simulations to identify opportunities and challenges for dynamic precision adjustment. We then propose a runtime reconfigurable, flexible floating-point multiplier (R2F2), which automatically and dynamically adjusts multiplication precision based on the current operands, ensuring accurate results with lower bit widths. Our evaluation shows that 16-bit R2F2 significantly reduces error rates by 70.2% compared to standard half-precision, with resource overhead ranging from a 5% reduction to a 7% increase and no latency overhead. In two representative scientific computing applications, R2F2, using 16 or fewer bits, can achieve the same simulation results as 32-bit precision, while standard half precision will fail. This study pioneers runtime reconfigurable arithmetic, demonstrating great potential to enhance scientific computing efficiency. Code available at https://github.com/sharc-lab/R2F2.
Hardware Architecture
What problem does this paper attempt to address?
This paper addresses the trade-off between precision and efficiency of floating-point operations in scientific computing. Scientific computing applications (such as computational fluid dynamics and climate modeling) usually rely on 64-bit double-precision floating-point operations, which are extremely costly in terms of computation, memory, and energy. Although the machine learning community has successfully used low-precision computing to save resources, the scientific computing field remains cautious due to concerns about numerical stability. To address this long-standing challenge, the authors propose dynamically adjusting the precision of floating-point data at runtime so that computational fidelity is maintained at a lower bit width.

The main contents and contributions of the paper are as follows:

1. **Problem Background**:
   - Scientific computing applications (such as molecular dynamics, computational fluid dynamics, and climate modeling) usually require high numerical precision and mainly use 64-bit double-precision floating-point operations.
   - High-precision computing ensures accuracy, but it also incurs large computation, memory, and energy costs.
   - Reducing the computational precision can significantly reduce the number of computing nodes required and save resources, but in scientific computing low precision may lead to numerical instability and incorrect results.
2. **Research Motivation**:
   - The machine learning field has successfully used low-precision computing, but the scientific computing field is cautious about it, mainly due to concerns about numerical stability.
   - The research explores how to dynamically adjust floating-point precision in scientific computing to achieve high fidelity while improving efficiency.
3. **Solution**:
   - **Exploration Phase**: Analyze the data range distribution in scientific simulations to identify the opportunities and challenges of dynamic precision adjustment.
   - **Propose R2F2**: Design a runtime-reconfigurable flexible floating-point multiplier (R2F2) that automatically and dynamically adjusts the multiplication precision according to the current operands, ensuring accurate results with a lower bit width.
4. **Experimental Verification**:
   - Experiments show that 16-bit R2F2 reduces the error rate by 70.2% compared with standard half-precision, with resource overhead ranging from a 5% reduction to a 7% increase and no latency overhead.
   - In two representative scientific computing applications, R2F2 achieves the same simulation results as 32-bit precision using 16 or fewer bits, while standard half-precision fails.
5. **Main Contributions**:
   - **Exploration**: Conducts a first fine-grained exploration of data range distributions, finding that data cluster locally and that the dynamic range changes over time; both observations support mixed-precision and runtime precision adjustment.
   - **Develop R2F2**: Proposes R2F2, a runtime-reconfigurable floating-point multiplier that automatically adjusts precision according to the operands at runtime and supports custom precision formats.
   - **Evaluation**: Implements R2F2 on FPGA and compares it with standard single-precision and half-precision multiplication.

In summary, this paper proposes a method that improves the efficiency of scientific computing while preserving accuracy by dynamically adjusting floating-point precision, pointing toward efficient low-precision computing for future scientific applications.
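To make the idea of operand-dependent precision concrete, the following is a minimal software sketch, not the R2F2 hardware design. It assumes an IEEE-like format with one sign bit and a movable boundary between exponent and mantissa bits inside a fixed 16-bit budget. The names `quantize`, `pick_exp_bits`, `half_multiply`, and `flexible_multiply`, and the rule "use the smallest exponent width that covers both operands and their product", are hypothetical choices made for illustration; the paper's actual selection logic may differ.

```python
import math


def exponent(v: float) -> int:
    """Return floor(log2(|v|)) for a nonzero finite v."""
    return math.frexp(abs(v))[1] - 1


def quantize(x: float, exp_bits: int, man_bits: int) -> float:
    """Round x to the nearest value of an IEEE-like format with the given
    exponent and mantissa widths (subnormals kept, overflow becomes inf)."""
    if x == 0.0 or math.isnan(x) or math.isinf(x):
        return x
    bias = (1 << (exp_bits - 1)) - 1
    e_min, e_max = 1 - bias, bias          # normal exponent range
    e = max(exponent(x), e_min)            # subnormals share exponent e_min
    step = 2.0 ** (e - man_bits)           # spacing of representable values
    q = round(abs(x) / step) * step        # round-half-to-even on the grid
    max_val = (2.0 - 2.0 ** -man_bits) * 2.0 ** e_max
    if q > max_val:
        q = math.inf
    return math.copysign(q, x)


def half_multiply(a: float, b: float) -> float:
    """Multiply with fixed half-precision (5 exponent, 10 mantissa bits)."""
    return quantize(quantize(a, 5, 10) * quantize(b, 5, 10), 5, 10)


def pick_exp_bits(a: float, b: float, min_bits: int = 2, max_bits: int = 8) -> int:
    """ASSUMPTION: pick the smallest exponent width whose normal range covers
    both operands and the possible exponents of their product."""
    ea, eb = exponent(a), exponent(b)
    exps = [ea, eb, ea + eb, ea + eb + 1]
    lo, hi = min(exps), max(exps)
    for exp_bits in range(min_bits, max_bits + 1):
        bias = (1 << (exp_bits - 1)) - 1
        if 1 - bias <= lo and hi <= bias:
            return exp_bits
    return max_bits


def flexible_multiply(a: float, b: float, total_bits: int = 16):
    """Multiply in a 16-bit format whose exponent width depends on the operands.
    Returns (product, exponent bits used)."""
    exp_bits = pick_exp_bits(a, b)
    man_bits = total_bits - 1 - exp_bits   # one bit is reserved for the sign
    qa = quantize(a, exp_bits, man_bits)
    qb = quantize(b, exp_bits, man_bits)
    return quantize(qa * qb, exp_bits, man_bits), exp_bits


if __name__ == "__main__":
    for a, b in [(1e-4, 1e-4), (300.0, 300.0), (1.5, 1.25)]:
        flex, bits = flexible_multiply(a, b)
        print(f"{a} * {b}: exact={a * b}, half={half_multiply(a, b)}, "
              f"flexible={flex} (exponent bits={bits})")
```

In this sketch, fixed half-precision underflows `1e-4 * 1e-4` to zero and overflows `300.0 * 300.0` to infinity, while the flexible split widens the exponent to 6 bits and stays within 1% of the exact product. For operands near 1, such as `1.5 * 1.25`, it narrows the exponent to 2 bits and spends the remaining 13 bits on the mantissa.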