CUDA Based High Performance Implementation of RSA Algorithm

Zhiying Wang
2011-01-01
Computer Engineering and Applications Journal
Abstract: As a new architecture supporting general-purpose computing on GPUs, Compute Unified Device Architecture (CUDA) plays an important role in massively data-parallel computing. RSA is a computation-intensive public-key cryptographic algorithm. To improve the performance of RSA, this paper presents a high-performance CUDA-based implementation. The key to the CUDA implementation is a large number of independent, parallel Montgomery modular multiplication threads on the kernel side. The thread organization scheme and data structures of the implementation are described, along with a shared-memory-based method for improving performance. Using this implementation, the performance and throughput of RSA are measured on a CUDA GPU. The experimental results show that the CUDA implementation achieves a speedup of more than 40 times over a general CPU implementation of RSA.
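The abstract does not give the kernel itself, so the following is only a minimal CPU-side sketch, in Python, of the arithmetic that each Montgomery multiplication thread would carry out. The function names (`montgomery_setup`, `mont_mul`, `mont_pow`), the choice of R as the smallest power of two above the modulus, and the left-to-right square-and-multiply loop are assumptions made for illustration, not the paper's actual design. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
def montgomery_setup(n, k):
    """Precompute Montgomery constants for an odd modulus n, with R = 2**k > n."""
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r  # satisfies n * n_prime ≡ -1 (mod R)
    r2 = (r * r) % n                # R^2 mod n, used to enter Montgomery form
    return n_prime, r2


def mont_mul(a, b, n, k, n_prime):
    """Return a * b * R^(-1) mod n for 0 <= a, b < n (Montgomery reduction)."""
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask  # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> k               # exact division by R; result is below 2n
    return u - n if u >= n else u


def mont_pow(base, exp, n):
    """Compute base**exp mod n (n odd) using only Montgomery multiplications."""
    k = n.bit_length()                           # assumption: R = 2**k, just above n
    n_prime, r2 = montgomery_setup(n, k)
    x = mont_mul(base % n, r2, n, k, n_prime)    # base in Montgomery form
    acc = mont_mul(1, r2, n, k, n_prime)         # 1 in Montgomery form (R mod n)
    for bit in bin(exp)[2:]:                     # left-to-right square-and-multiply
        acc = mont_mul(acc, acc, n, k, n_prime)
        if bit == "1":
            acc = mont_mul(acc, x, n, k, n_prime)
    return mont_mul(acc, 1, n, k, n_prime)       # leave Montgomery form


def rsa_batch(messages, exp, n):
    """Process independent messages; on a GPU each item could map to its own thread."""
    return [mont_pow(m, exp, n) for m in messages]
```

For example, `rsa_batch([65, 66, 67], 17, 3233)` encrypts three messages under the toy key n = 3233 (61 × 53), e = 17. Because no message depends on another, the loop in `rsa_batch` is the part that a data-parallel device can spread across threads; how the paper actually divides the work among threads and shared memory is not stated in the abstract.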