CuQ-RTM: A CUDA-based Code Package for Stable and Efficient Q-compensated Reverse Time Migration

Yufeng Wang, Hui Zhou, Xuebin Zhao, Qingchen Zhang, Poru Zhao, Xiance Yu, Yangkang Chen
DOI: https://doi.org/10.1190/geo2017-0624.1
IF: 3.264
2018-01-01
Geophysics
Abstract: Reverse time migration (RTM) in attenuating media should take absorption and dispersion effects into consideration. The recently proposed viscoacoustic wave equation with decoupled fractional Laplacians facilitates separate amplitude compensation and phase correction in Q-compensated RTM (Q-RTM). However, the intensive computation and enormous storage requirements of Q-RTM prevent it from being extended into practical application, especially for large-scale 2D or 3D cases. The emerging graphics processing unit (GPU) computing technology, built around a scalable array of multithreaded streaming multiprocessors, presents an opportunity for greatly accelerating Q-RTM by appropriately exploiting the GPU's architectural characteristics. We have developed cuQ-RTM, a CUDA-based code package that implements Q-RTM based on a set of stable and efficient strategies, such as streamed CUDA fast Fourier transform, checkpointing-assisted time-reversal reconstruction, and adaptive stabilization. The cuQ-RTM code package can run with multilevel parallelism, either synchronously or asynchronously, to take advantage of all the CPUs and GPUs available, while maintaining good stability and flexibility. We mainly outline the architecture of the cuQ-RTM code package and some program optimization schemes. The speedup ratio on a single GeForce GTX760 GPU card relative to a single core of an Intel Core i5-4460 CPU can exceed 80 in a large-scale simulation. The strong scaling property of multi-GPU parallelism is demonstrated by performing Q-RTM on a Marmousi model with one to six GPUs. Finally, we verify the feasibility and efficiency of cuQ-RTM on a field data set.