A Model of Parallel Mosaicking for Massive Remote Sensing Images Based on Self-defined RDD

Weipeng JING, Shuaiqi HUO
DOI: https://doi.org/10.3724/SP.J.1047.2017.01346
2017-01-01
Abstract: Image mosaicking is an important part of remote sensing image processing and plays a vital role in the analysis of trans-regional remote sensing images. To address the low node utilization and frequent data I/O of traditional parallel algorithms for remote sensing images, we propose a parallel mosaicking algorithm based on a self-defined RDD (Resilient Distributed Dataset), built on the Spark distributed in-memory computing framework. We take advantage of Spark's suitability for iterative data processing and build a parallel mosaicking model for remote sensing images from operations on Spark RDDs. First, exploiting the logical separability and data independence of the Fourier transform and inverse Fourier transform in the phase correlation method, we improve the traditional phase correlation method by executing a single instruction on multiple nodes in parallel across the cluster, so that the overlapping region estimation step of the algorithm runs as a multi-node parallel computation. Then, we override the compute and getPartitions methods of RDD to define a custom RDD for remote sensing image processing. The three key steps of image mosaicking (overlapping region estimation, image registration, and image fusion) are implemented as transformation-type operators of the self-defined RDD. These operators perform no computation during parallel mosaicking until the final mosaicked image must be written to disk or to a file system, which reduces the time consumed by parallel mosaicking. Finally, parallel image mosaicking is realized by calling the operators of the self-defined RDD through implicit conversion. Compared with a parallel mosaicking algorithm based on MPI, the experimental results show that the proposed algorithm for massive remote sensing images can effectively improve mosaicking efficiency for large data volumes while preserving mosaicking quality.
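The abstract does not give the formulas behind the overlap estimation step, so the following is only a minimal single-machine sketch of textbook phase correlation, not the authors' distributed implementation. All function names (dft1, dft2, circular_shift, phase_correlation_shift) are hypothetical, and the sketch assumes the second image is a pure circular shift of the first, which real partially overlapping scenes are not. The 2-D transform is written as independent row passes followed by independent column passes, which is the separability that would allow each pass to be spread across cluster nodes.

```python
import cmath
import random


def dft1(seq, inverse=False):
    """Naive 1-D discrete Fourier transform (O(n^2)), for illustration only."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(seq[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t in range(n))
        out.append(total / n if inverse else total)
    return out


def dft2(img, inverse=False):
    """2-D DFT built from independent 1-D passes over rows, then columns."""
    # Row pass: every row is transformed without reference to any other row.
    rows_done = [dft1(row, inverse) for row in img]
    # Column pass: every column is likewise independent of the others.
    cols_done = [dft1(list(col), inverse) for col in zip(*rows_done)]
    # Transpose back to row-major layout.
    return [list(row) for row in zip(*cols_done)]


def circular_shift(img, dy, dx):
    """Shift an image down by dy and right by dx with wrap-around."""
    rows, cols = len(img), len(img[0])
    return [[img[(y - dy) % rows][(x - dx) % cols] for x in range(cols)]
            for y in range(rows)]


def phase_correlation_shift(a, b):
    """Estimate (dy, dx) such that b is a circularly shifted by (dy, dx)."""
    fa, fb = dft2(a), dft2(b)
    rows, cols = len(a), len(a[0])
    cross = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            prod = fb[u][v] * fa[u][v].conjugate()
            mag = abs(prod)
            # Keep only the phase; skip bins with no usable energy.
            cross[u][v] = prod / mag if mag > 1e-12 else 0j
    corr = dft2(cross, inverse=True)
    # The correlation surface peaks at the displacement between the images.
    _, dy, dx = max((corr[y][x].real, y, x)
                    for y in range(rows) for x in range(cols))
    return dy, dx


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[rng.random() for _ in range(8)] for _ in range(8)]
    shifted = circular_shift(image, 3, 5)
    print(phase_correlation_shift(image, shifted))
```

In a Spark setting, the two list comprehensions in dft2 are the natural candidates for a map over partitions, since neither pass shares state between rows or between columns. How the paper actually partitions the images is not stated in the abstract.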