Hardware-Aware Softmax Approximation for Deep Neural Networks

Xue Geng, Jie Lin, Bin Zhao, Anmin Kong, Mohamed M. Sabry Aly, Vijay Chandrasekhar
DOI: https://doi.org/10.1007/978-3-030-20870-7_7
2019-01-01
Abstract: Custom hardware for accelerating the inference of deep neural networks (DNNs) has developed rapidly, explicitly incorporating hardware metrics (e.g., area and energy) as constraints alongside application accuracy. Recent efforts have mainly focused on linear functions (matrix multiplication) in convolutional (Conv) or fully connected (FC) layers, while there is no publicly available study on optimizing the inference of non-linear functions in DNNs under hardware constraints.