Improving overall parallelism in AES accelerator using BRAM and multiple input blocks

B.C. Manjith
DOI: https://doi.org/10.1109/i-pact44901.2019.8960016
2019-03-01
Abstract: Designing a custom hardware architecture for user applications on FPGAs allows throughput, power, and latency to be optimized. Accelerators deployed on FPGAs can accelerate one user application and later be reprogrammed to run another. Because FPGAs offer abundant resources, high parallelism, and reprogrammability, they are useful as accelerators in cloud infrastructure. The communication speed between the main processor and the accelerator should not become a bottleneck for the acceleration process. This paper discusses using BRAMs to increase communication speed and improve pipelining, thereby speeding up the acceleration process. The input is divided into blocks that are passed continuously to a BRAM, which the AES accelerator reads; pipelining multiple input blocks improves overall parallelism. After processing, the output is passed to another BRAM without waiting for the user process to read from the accelerator. Results show a 32% decrease in the number of clock cycles when BRAM is used with pipelining compared with when it is not.
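The effect of overlapping the input transfer, the AES computation, and the output transfer across blocks can be illustrated with a simple cycle-count model. This is a minimal sketch under stated assumptions, not the paper's design: the three-stage split, the function names, and the per-stage latencies are all hypothetical, and the numbers it produces are not the paper's measurements.

```python
# Hypothetical cycle-count model of a three-stage flow:
#   load    = cycles to move one input block into the input buffer (e.g. a BRAM)
#   compute = cycles the accelerator needs to process one block
#   store   = cycles to move one output block into the output buffer
# All latencies are illustrative assumptions, not values from the paper.

def cycles_serial(n_blocks, load, compute, store):
    """Each block is loaded, processed, and stored before the next one starts."""
    return n_blocks * (load + compute + store)


def cycles_pipelined(n_blocks, load, compute, store):
    """Stages overlap across blocks; the slowest stage sets the steady-state rate."""
    if n_blocks == 0:
        return 0
    fill = load + compute + store          # first block passes through every stage
    steady = max(load, compute, store)     # each later block adds one slowest-stage slot
    return fill + (n_blocks - 1) * steady


if __name__ == "__main__":
    n, load, compute, store = 64, 4, 10, 4  # assumed values for illustration only
    serial = cycles_serial(n, load, compute, store)
    piped = cycles_pipelined(n, load, compute, store)
    print(f"serial: {serial} cycles, pipelined: {piped} cycles")
```

In this model the benefit of pipelining grows with the number of blocks and is bounded by the slowest stage, which is why keeping the transfer stages from dominating matters.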