Novel GPU Data Partitioning Method to Overlap Communication and Computation

ZHANG Bao, CAO Haijun, DONG Xiaoshe, LI Dan, HU Leijun
DOI: https://doi.org/10.1016/j.egypro.2011.12.523
2011-01-01
Abstract: A novel data partitioning method is proposed for CPU+GPU heterogeneous parallel processing systems, which cannot fully utilize their resources when data is split into equal-sized blocks and processed in batches to hide the extra communication overhead. Application data is partitioned into blocks of different sizes, in a proportion determined by the communication bandwidth and the GPU computing capacity, before being processed by the GPU. The PCI-E bus and the GPU can therefore work in parallel for a period of time, overlapping communication and computation. The partitioned data blocks use system resources as fully as possible, which reduces the time that data transfer and computation spend waiting for each other. Experimental results show that application performance improves significantly when communication and computation are overlapped effectively. Compared with no partitioning and average partitioning, matrix multiplication performance improves by about 5% and 3%, and Fast Fourier Transform performance improves by about 30% and 6%, respectively.
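The idea of sizing blocks in proportion to bandwidth and computing capacity can be illustrated with a small timing model. The sketch below is not the authors' implementation: it assumes transfer and compute times are purely linear in block size (no fixed per-transfer overhead), and the names `proportional_blocks`, `pipeline_time`, `transfer_cost`, and `compute_cost` are hypothetical. Under that assumption, each block is scaled by the ratio of per-element compute time to per-element transfer time, so that transferring block i+1 takes as long as computing block i.

```python
def proportional_blocks(total, num_blocks, transfer_cost, compute_cost):
    """Split `total` elements into blocks whose sizes grow by a fixed ratio.

    transfer_cost: assumed seconds to move one element over the bus.
    compute_cost:  assumed seconds for the GPU to process one element.
    The ratio compute_cost / transfer_cost makes the transfer of block i+1
    last exactly as long as the computation of block i (linear model only).
    """
    ratio = compute_cost / transfer_cost
    weights = [ratio ** i for i in range(num_blocks)]
    scale = total / sum(weights)
    return [w * scale for w in weights]


def pipeline_time(blocks, transfer_cost, compute_cost):
    """Finish time when block transfers are serialized on the bus and each
    block is computed as soon as it has arrived and the GPU is free."""
    bus_free = 0.0  # time at which the bus finishes the current transfer
    gpu_free = 0.0  # time at which the GPU finishes the current block
    for size in blocks:
        bus_free += size * transfer_cost
        gpu_free = max(gpu_free, bus_free) + size * compute_cost
    return gpu_free


if __name__ == "__main__":
    total, k = 1000, 4
    t_c, t_p = 1.0, 2.0  # illustrative costs, not measured values

    no_partition = pipeline_time([total], t_c, t_p)
    average = pipeline_time([total / k] * k, t_c, t_p)
    proportional = pipeline_time(proportional_blocks(total, k, t_c, t_p), t_c, t_p)
    print(no_partition, average, proportional)
```

In this toy model the proportional split finishes earlier than the equal-sized split, which in turn finishes earlier than the unpartitioned transfer, because the GPU never idles waiting for the next block once the first one has arrived. Real systems add fixed transfer and kernel-launch overheads, so the sizes of the gains here say nothing about the percentages reported in the abstract.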