Abstract: The moving particle semi-implicit (MPS) method performs well in simulating incompressible flows with free surfaces. Despite its applicability, the MPS method suffers from a fundamental instability problem and a high computational cost in practical applications. Substantial research has been conducted on improving the stability and accuracy of the MPS method. Moreover, graphics processing units (GPUs), which are multi-processors that execute many three-dimensional geometric processes at high speed, provide unprecedented capability for scientific computations. However, a single GPU card is not sufficient for engineering applications that require several million particles to predict the desired physical processes, because the available memory is insufficient. In this work, the dynamic stability (DS) algorithm and the particle shifting (PS) algorithm are used to overcome the instability and inaccuracy caused by tensile instability and non-uniform particle distribution, respectively. Based on the stabilized MPS method, a GPU-based MPS code written in the compute unified device architecture (CUDA) language has been developed. An efficient neighborhood particle search is performed using an indirect method, and the matrix for the pressure Poisson equation (PPE) is assembled in parallel. Based on the single-GPU version, a multi-GPU MPS code has been developed. The approach uses a non-geometric dynamic domain decomposition method that provides homogeneous load balancing, whereby different portions (subdomains) of the physical system under study are assigned to different GPUs. Communication between devices is achieved through the message passing interface (MPI). Based on the neighborhood particle search, the techniques for building and updating the “halo” are described in detail.
The speed-up of the single-GPU version is analyzed for different numbers of particles, and the scalability of the multi-GPU version is analyzed for different numbers of particles and different numbers of GPUs. Finally, an application with more than 10^7 particles is presented to show the capability of the code in handling large-scale simulations.
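
The neighborhood search and the halo mentioned above can be illustrated with a small serial sketch. It is not the code described in the abstract: it assumes a 2D cell-binning search as one possible reading of the "indirect method", and a simple slab split along x instead of the non-geometric dynamic decomposition. The function names `build_cells`, `neighbors`, and `halo_indices` are hypothetical.

```python
import math
from collections import defaultdict


def build_cells(positions, cell_size):
    """Bin particle indices by integer cell coordinates (2D, assumed layout)."""
    cells = defaultdict(list)
    for i, (x, y) in enumerate(positions):
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append(i)
    return cells


def neighbors(positions, radius):
    """For each particle, list the particles closer than `radius`.

    Cells have edge length `radius`, so only the 3x3 block of cells
    around a particle's own cell needs to be scanned.
    """
    cells = build_cells(positions, radius)
    result = [[] for _ in positions]
    r2 = radius * radius
    for (cx, cy), members in cells.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                # .get() does not create missing cells, so the dict is
                # not modified while it is being iterated.
                others = cells.get((cx + dx, cy + dy))
                if not others:
                    continue
                for i in members:
                    xi, yi = positions[i]
                    for j in others:
                        xj, yj = positions[j]
                        if i != j and (xi - xj) ** 2 + (yi - yj) ** 2 < r2:
                            result[i].append(j)
    return [sorted(r) for r in result]


def halo_indices(positions, x_lo, x_hi, radius):
    """Particles outside the slab [x_lo, x_hi) but within `radius` of it.

    These are the particles a subdomain would need copies of from its
    neighbors (assumed 1D slab decomposition along x).
    """
    return [
        i
        for i, (x, _) in enumerate(positions)
        if (x_lo - radius <= x < x_lo) or (x_hi <= x < x_hi + radius)
    ]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.5, 0.0), (2.0, 0.0), (2.4, 0.3)]
    print(neighbors(pts, 1.0))
    print(halo_indices(pts, 0.0, 1.5, 1.0))
```

In a multi-GPU setting, each device would run the search on its own particles plus the halo copies, and the halo would be refreshed through MPI exchanges whenever particles move; those steps are outside this sketch.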

A Load-Decoupling Parallel Strategy Based on Shared Memory Architecture for DSMC to Simulate Near-Continuum Gases

Two-level parallel load balancing strategy for accelerating DSMC simulations in near-continuum gases

A Parallel Simulator for Massive Reservoir Models Utilizing Distributed-Memory Parallel Systems

Parallel algorithm for simulation of altitude-control thruster nozzle flow with DSMC method

An Improved Hybrid Particle Scheme for Hypersonic Rarefied-Continuum Flow

The Hypersonic Chemical Reaction Flow of Flat-Nosed Cylinder with Hybrid DSMC/EPSM Method

An Undecomposed Hybrid Algorithm for Nonlinear Coupled Constitutive Relations of Rarefied Gas Dynamics

Implementation of the moving particle semi-implicit method for free-surface flows on GPU clusters

Parallel Data Partitioning Strategy In Solving Large Scale Electromagnetic Scattering Problems

Study on hybrid DSMC/EPSM method for chemical reacting gas flow

Load balancing strategies for the DSMC simulation of hypersonic flows using HPC

Parallelization of DSMC Method on Unstructured Grids for Hypersonic Rarefied Gas

Parallel SPH modeling using dynamic domain decomposition and load balancing displacement of Voronoi subdomains

MPI+X: Massive Parallelization and Dynamic Load Balance of a Production-level Unstructured DSMC Solver

Adaptable Parallel Acceleration Strategy For Dynamic Monte Carlo Simulations Of Polymerization With Microscopic Resolution

A hybrid parallel approach for fully resolved simulations of particle-laden flows in sediment transport

Development of the Integrated Parallelism Strategy for Large Scale Depletion Calculation in the Monte Carlo Code RMC

A Parallel Coupling Framework for DEM-MBD: Model Verification and Application

Hybrid Decomposition Method in Parallel Molecular Dynamics Simulation Based on SMP Cluster Architecture

Two-level dynamic load-balanced p-adaptive discontinuous Galerkin methods for compressible CFD simulations

Parallel Scheme for Multi-Layer Refinement Non-Uniform Grid Lattice Boltzmann Method Based on Load Balancing