Research on Multi-Level Parallel Algorithm of GPU Based QR Decomposition

MU Shuai, WANG Chen-xi, DENG Yang-dong
DOI: https://doi.org/10.3969/j.issn.1006-9348.2013.09.054
2013-01-01
Abstract: QR decomposition is widely used as a fundamental computational module in applications such as image processing, signal processing, and communications. Traditional parallel implementations of QR decomposition exploit only data parallelism. Based on the inherent characteristics of the Fast Givens Rotation algorithm, this paper proposes a multi-level parallel algorithm that exploits task parallelism and data parallelism concurrently and is suited to massively parallel processors such as Graphics Processing Units (GPUs). The parallel QR implementation on the GPU can also be reused by a variety of applications. Experimental results show that, compared with an OpenMP-based implementation on a CPU, the multi-level parallel algorithm implemented on a GPU achieves a 5X speedup, and that an SVD application invoking the GPU-based QR module achieves a 3X speedup.
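To make the two levels of parallelism concrete, the following is a minimal sequential sketch of QR decomposition by Givens rotations. It is not the authors' implementation: it uses standard Givens rotations rather than the square-root-free Fast Givens variant named in the abstract, it runs on the CPU, and the function name `givens_qr` is hypothetical. The comments mark where parallelism could plausibly be exploited: the inner loop over columns is the data-parallel part, and rotations that act on disjoint row pairs are independent of one another, which is one possible source of task parallelism.

```python
import math


def givens_qr(a):
    """Return (q, r) such that a = q * r, using standard Givens rotations.

    Illustrative sketch only; `a` is a list of rows (m x n, m >= 1, n >= 1).
    """
    m, n = len(a), len(a[0])
    r = [list(map(float, row)) for row in a]
    # q_t accumulates the applied rotations, i.e. it ends up holding Q^T.
    q_t = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]

    for j in range(min(m - 1, n)):
        # Eliminate column j from the bottom row up to the diagonal.
        for i in range(m - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            if y == 0.0:
                continue  # entry is already zero, no rotation needed
            h = math.hypot(x, y)
            c, s = x / h, y / h
            # Each rotation touches only rows i-1 and i. Rotations on
            # disjoint row pairs do not interact, so a parallel schedule
            # could run them as separate tasks (assumption: the schedule
            # must still respect the elimination order).
            for mat in (r, q_t):
                # Every column k is updated independently of the others:
                # this loop is the data-parallel part.
                for k in range(len(mat[0])):
                    top, bot = mat[i - 1][k], mat[i][k]
                    mat[i - 1][k] = c * top + s * bot
                    mat[i][k] = -s * top + c * bot
            r[i][j] = 0.0  # remove round-off left in the eliminated entry

    # Transpose Q^T to obtain Q.
    q = [[q_t[j][i] for j in range(m)] for i in range(m)]
    return q, r
```

Calling `givens_qr` on a small matrix returns an orthogonal `q` and an upper-triangular `r` whose product reproduces the input up to floating-point round-off.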