Efficient Implementation of Pooling Operation for AI Accelerators

Vijay Deep Bhatt, Ashutosh Pandey, Modini Ayyagari
DOI: https://doi.org/10.1109/ICWITE57052.2022.10176216
2022-12-01
Abstract: Deep neural networks (DNNs) are extensively used in artificial intelligence (AI) applications, which often have stringent design constraints. To enable large-scale deployment of DNNs in AI applications, it is imperative to design and implement efficient algorithms and hardware architectures for DNNs. Most neural network hardware accelerators prioritize optimizing the execution of the convolutional layers over the pooling layers, which degrades accelerator performance and increases system power consumption. This paper explores efficient hardware architectures for processing pooling layers, focusing on mitigating memory access bottlenecks by vectorizing the pooling operations along the channel dimension of the activation matrix. The proposed hardware provides a performance improvement of approximately 200x compared to the software implementation, subject to simulation environment constraints. A similar reduction in power consumption can be extrapolated for the proposed pooling hardware architecture.
Engineering, Computer Science
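
The channel-wise vectorization mentioned in the abstract can be illustrated in software. The sketch below is not the paper's hardware design; it is a minimal pure-Python illustration under the assumption of a channel-last (HWC) activation layout, where all channels of one pixel sit together and can be fetched and compared as a single vector. The function name `max_pool_hwc` and its parameters `k` and `stride` are hypothetical, chosen only for this example.

```python
from typing import List

# Activation tensor indexed as [row][col][channel] (HWC layout, an assumption
# made for this illustration; the paper's actual memory layout is not stated).
Tensor = List[List[List[float]]]


def max_pool_hwc(x: Tensor, k: int = 2, stride: int = 2) -> Tensor:
    """Max-pool an HWC tensor, using each pixel's channel vector as the unit of work."""
    h, w = len(x), len(x[0])
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    out: Tensor = []
    for oy in range(out_h):
        row = []
        for ox in range(out_w):
            # Seed the accumulator with the first pixel's channel vector.
            acc = list(x[oy * stride][ox * stride])
            for dy in range(k):
                for dx in range(k):
                    pixel = x[oy * stride + dy][ox * stride + dx]
                    # One element-wise max across all channels: this is the step
                    # a vector unit could perform in a single operation.
                    acc = [max(a, p) for a, p in zip(acc, pixel)]
            row.append(acc)
        out.append(row)
    return out


# A 2x2 spatial window with 3 channels reduces to one 3-channel output pixel.
x = [
    [[1, 5, 2], [3, 0, 9]],
    [[4, 4, 4], [0, 7, 1]],
]
print(max_pool_hwc(x))  # [[[4, 7, 9]]]
```

With this layout, each window position needs only `k * k` vector reads regardless of the channel count, which is one plausible reading of how vectorizing along the channel dimension reduces memory accesses.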