Training Neural Networks for Execution on Approximate Hardware

Tianmu Li, Shurui Li, Puneet Gupta
2023-04-09
Abstract: Approximate computing methods have shown great potential for deep learning. Due to the reduced hardware costs, these methods are especially suitable for inference tasks on battery-operated devices that are constrained by their power budget. However, approximate computing hasn't reached its full potential due to the lack of work on training methods. In this work, we discuss training methods for approximate hardware. We demonstrate how training needs to be specialized for approximate hardware, and propose methods to speed up the training process by up to 18X.
Machine Learning, Hardware Architecture
What problem does this paper attempt to address?
The paper addresses how to train neural networks so that they stay accurate when executed on approximate hardware. The authors point out that approximate computing shows great potential for deep learning, especially for reducing hardware cost and power consumption during inference on battery-powered devices, but that it has not reached its full potential because specialized training methods are lacking.

### Background and Problem Description of the Paper

1. **Potential of Approximate Computing Methods**
   - Approximate computing methods (such as stochastic computing, approximate arithmetic, and analog computing) improve performance by sacrificing some computational precision.
   - These methods are particularly suitable for inference tasks on battery-powered devices with limited power budgets.
2. **Existing Challenges**
   - Most current research focuses on approximate computing in the inference stage and ignores the training stage.
   - Because the errors introduced by approximate computing affect model accuracy, models trained for floating-point or fixed-point computation are difficult to use directly.
   - Without training methods designed for approximate hardware, approximate computing is usually limited to simple models and datasets, or must give up performance to maintain accuracy.

### Objectives of the Paper

To overcome these challenges, the paper sets the following objectives:

- **Develop Specialized Training Methods**: Enable neural networks to be trained efficiently for execution on approximate hardware while maintaining high accuracy.
- **Accelerate the Training Process**: Propose techniques that significantly reduce training time, namely activation function approximation, error injection, and gradient checkpointing.

### Main Contributions

The main contributions of the paper include:

1. **Activation Function Approximation**: Use activation functions to approximate computational errors during backpropagation.
2. **Error Injection**: Reduce the time of each training iteration by injecting errors into the computation, combined with accurate modeling to maintain model accuracy.
3. **Gradient Checkpointing**: Reduce memory consumption through gradient checkpointing, which supports larger-batch training and improves GPU utilization.
4. **Significantly Accelerated Training**: Experiments show that the proposed training methods shorten end-to-end training time by up to 18 times, making it possible to train complex models on a single consumer-grade GPU.

### Conclusion

Through these improvements, the paper demonstrates how to optimize neural network training for approximate hardware, which helps make approximate computing usable in practical applications.
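
To make the error-injection contribution more concrete, the following is a minimal sketch and not the paper's implementation. It assumes a hypothetical approximate multiplier (`approx_mul`, which truncates operands to a few fractional bits) as a stand-in for the hardware, and it assumes the resulting error can be summarized by a Gaussian with a calibrated mean and standard deviation. The slow path (`accurate_dot`) models every approximate multiplication; the fast path (`injected_dot`) computes the exact result and adds noise drawn from the calibrated distribution. All function names and parameters here are illustrative.

```python
import math
import random
import statistics


def approx_mul(a, b, frac_bits=4):
    """Hypothetical approximate multiplier: truncate both operands to
    `frac_bits` fractional bits, then multiply. A stand-in for real hardware."""
    scale = 1 << frac_bits
    qa = math.floor(a * scale) / scale
    qb = math.floor(b * scale) / scale
    return qa * qb


def exact_dot(xs, ws):
    """Fast path: ordinary floating-point dot product."""
    return sum(x * w for x, w in zip(xs, ws))


def accurate_dot(xs, ws):
    """Slow path: model every multiplication of the approximate hardware."""
    return sum(approx_mul(x, w) for x, w in zip(xs, ws))


def calibrate(rng, n_inputs, samples=2000):
    """Estimate mean and standard deviation of the hardware error
    by comparing the slow accurate model against the exact result."""
    errors = []
    for _ in range(samples):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        ws = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        errors.append(accurate_dot(xs, ws) - exact_dot(xs, ws))
    return statistics.mean(errors), statistics.stdev(errors)


def injected_dot(xs, ws, mean, std, rng):
    """Error injection: exact result plus noise with the calibrated
    statistics (the Gaussian shape is an assumption of this sketch)."""
    return exact_dot(xs, ws) + rng.gauss(mean, std)


rng = random.Random(0)
mean, std = calibrate(rng, n_inputs=16)

# Usage: one noisy forward computation on fixed inputs.
xs = [rng.uniform(-1.0, 1.0) for _ in range(16)]
ws = [rng.uniform(-1.0, 1.0) for _ in range(16)]
noisy = injected_dot(xs, ws, mean, std, rng)
```

In a training loop under these assumptions, the cheap `injected_dot` path would be used for most iterations, with the slower `accurate_dot` model reserved for calibration and for the accurate modeling that the summary says is combined with error injection.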