Massive parallel implementation of JPEG2000 decoding algorithm with multi-GPUs

Xianyun Wu, Yunsong Li, Kai Liu, Keyan Wang, Li Wang
DOI: https://doi.org/10.1117/12.2053007
2014-01-01
Abstract: JPEG2000 is an important image compression technique that has been used successfully in many fields. As the spatial, spectral, and temporal resolution of remotely sensed imagery data sets increases, fast decompression of remotely sensed data is becoming an important and challenging objective. In this paper, we develop an implementation of JPEG2000 decompression on graphics processing units (GPUs) for fast decoding of codeblock-based parallel compression streams. We use one CUDA block to decode one frame. Tier-2 is still decoded serially, while Tier-1 (T1) and the inverse discrete wavelet transform (IDWT) are processed in parallel. Because our encoded stream is block-based and parallel, meaning that each block is independent of the others, we process the blocks in T1 in parallel, with one thread per block. For the IDWT, we use one CUDA block to process one line and one CUDA thread to process one pixel. We investigate the speedups that can be gained by the GPU implementation relative to CPU-based serial implementations. Experimental results show that our implementation achieves significant speedups compared with serial implementations.
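The IDWT mapping described in the abstract (one CUDA block per line, one CUDA thread per pixel) can be illustrated with a small CPU-side sketch. This is not the authors' code. It assumes the reversible 5/3 lifting filter from JPEG2000, which the abstract does not name, and it emulates the GPU mapping with plain Python loops: each call to the hypothetical `inverse_53_line` stands in for one block, and each loop iteration stands in for one thread. The two-phase structure with a barrier between phases is also an assumption about how per-pixel threads could cooperate. The names `reflect`, `forward_53_line`, `inverse_53_line`, and `inverse_53_rows` are illustrative only.

```python
def reflect(i, n):
    """Whole-sample symmetric extension of index i into [0, n)."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def forward_53_line(x):
    """Reversible 5/3 forward lifting on one line (used here only to build test data).

    Output is interleaved: even indices hold low-pass samples,
    odd indices hold high-pass samples.
    """
    n = len(x)
    if n < 2:
        return list(x)
    y = list(x)
    # Predict step: high-pass samples at odd positions.
    for i in range(1, n, 2):
        y[i] = x[i] - ((x[reflect(i - 1, n)] + x[reflect(i + 1, n)]) >> 1)
    # Update step: low-pass samples at even positions.
    for i in range(0, n, 2):
        y[i] = x[i] + ((y[reflect(i - 1, n)] + y[reflect(i + 1, n)] + 2) >> 2)
    return y


def inverse_53_line(y):
    """Reversible 5/3 inverse lifting on one line.

    Stands in for one CUDA block. Each loop iteration reads only
    values that are final before its phase starts, so all iterations
    of a phase are independent (one "thread" per pixel).
    """
    n = len(y)
    if n < 2:
        return list(y)
    x = list(y)
    # Phase 1: "threads" on even pixels undo the update step.
    for i in range(0, n, 2):
        x[i] = y[i] - ((y[reflect(i - 1, n)] + y[reflect(i + 1, n)] + 2) >> 2)
    # A GPU version would need a block-wide barrier here (assumption).
    # Phase 2: "threads" on odd pixels undo the predict step.
    for i in range(1, n, 2):
        x[i] = y[i] + ((x[reflect(i - 1, n)] + x[reflect(i + 1, n)]) >> 1)
    return x


def inverse_53_rows(rows):
    """Apply the line transform to every row; rows are independent (one "block" each)."""
    return [inverse_53_line(row) for row in rows]
```

Because the 5/3 lifting steps are integer-exact, transforming a line with `forward_53_line` and then calling `inverse_53_line` should return the original samples, which gives a simple check of the sketch. A full 2D IDWT would also run the same pass over columns and repeat it per decomposition level, which this sketch leaves out.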