Abstract: Reverse time migration (RTM) in attenuating media should account for absorption and dispersion effects. The recently proposed viscoacoustic wave equation with decoupled fractional Laplacians facilitates separate amplitude compensation and phase correction in Q-compensated RTM (Q-RTM). However, the intensive computation and enormous storage requirements of Q-RTM prevent it from being extended to practical applications, especially for large-scale 2D or 3D cases. The emerging graphics processing unit (GPU) computing technology, built around a scalable array of multithreaded streaming multiprocessors, presents an opportunity to greatly accelerate Q-RTM by appropriately exploiting the GPU's architectural characteristics. We have developed cuQ-RTM, a CUDA-based code package that implements Q-RTM based on a set of stable and efficient strategies, such as streamed CUDA fast Fourier transform, checkpointing-assisted time-reversal reconstruction, and adaptive stabilization. The cuQ-RTM code package can run with multilevel parallelism, either synchronously or asynchronously, to take advantage of all available CPUs and GPUs while maintaining good stability and flexibility. We mainly outline the architecture of the cuQ-RTM code package and some program optimization schemes. The speedup ratio on a single GeForce GTX760 GPU card relative to a single core of an Intel Core i5-4460 CPU can exceed 80 in a large-scale simulation. The strong scaling property of multi-GPU parallelism is demonstrated by performing Q-RTM on a Marmousi model with one to six GPUs. Finally, we verify the feasibility and efficiency of cuQ-RTM on a field data set.

Accelerating the Training of HTK on GPU with CUDA

GPU Accelerated GMM Supervectors for Speaker and Language Recognition

GPU-based Acceleration of the Hyperspectral Band Selection by SNR Estimation Using Wavelet Transform

HPH: Hybrid Parallelism on Heterogeneous Clusters for Accelerating Large-scale DNNs Training

High throughput TCR sequence alignment using multi-GPU with inter-task parallelization

Exponential Moving Average Model in Parallel Speech Recognition Training

Accelerating Convolution-Based Detection Model on GPU

CUDAMPF++: A Proactive Resource Exhaustion Scheme for Accelerating Homologous Sequence Search on CUDA-enabled GPU

Efficient and Robust Parallel DNN Training through Model Parallelism on Multi-GPU Platform

A Practical Implementation of GPU based Accelerator for Deep Neural Networks

Multiple-GPU accelerated high-order gas-kinetic scheme on three-dimensional unstructured meshes

A Graphics Processing Unit Implementation and Optimization for Parallel Double-Difference Seismic Tomography

GPU-HADVPPM V1.0: a high-efficiency parallel GPU design of the piecewise parabolic method (PPM) for horizontal advection in an air quality model (CAMx V6.10)

A 1000-fold Acceleration of Hidden Markov Model Fitting using Graphical Processing Units, with application to Nonvolcanic Tremor Classification

Accelerating Haze Removal Algorithm Using CUDA

Implementation of Accelerated BCH Decoders on GPU

CuQ-RTM: A CUDA-based Code Package for Stable and Efficient Q-compensated Reverse Time Migration

gpuPairHMM: High-speed Pair-HMM Forward Algorithm for DNA Variant Calling on GPUs

Research of GPU Acceleration Techniques for Image Processing

A GPU-based Kalman Filter for Track Fitting

Accelerate Helical Cone-Beam CT with Graphics Hardware