Upload
others
View
9
Download
0
Embed Size (px)
Citation preview
A GPU Accelerated Continuous-based Discrete Element
Method for Elastodynamics
AnalysisZhaosong
Ma, Chun Feng, Tianping
Liu, Shihai
LiInstitute of Mechanics, Chinese Academy of Sciences, Beijing 100190, China
Email: [email protected]
Introduction to CDEM
Serial Algorithm in CPU
Solution Algorithm in Parallel
Unlike general CDEM algorithm which uses clone node strategy to ensure node consistency among elements, the parallel algorithm does the same
work by means of a “nodal force group”
strategy. The difference is out of reason of access conflict. For general CDEM, a node’s force is cloned to the associated nodes just when the node’s force calculation is done. This works well in serial execution, but may cause data access conflict in parallel execution, for various threads may clone their forces to a same node at the same time. Differently, the parallel algorithm uses a data structure to temporarily store the forces calculated. The nodal forces will be redistributed in a separate procedure after all of the node’s force calculation is done.
The element calculation procedure is designed according to the specialty of GPU. As is known, a GPU consists of a set of SIMT (single-instruction, multiple-thread) multi-
processors, which are mapped to CUDA blocks; each multi-processor has an instruction unit and several scalar processor cores, which are mapped to CUDA threads, and each scalar thread executes independently with its own instruction address and register state. Accordingly, the element calculation processes are divided into parallel CUDA blocks, each of which contains certain number of elements. In this work,
a CUDA block contains32 elements for best performance.
The Continuum-based Distinct Element Method (CDEM) is an approach to simulate the progressive failure of geological mass, which is mainly used in landslide stability evaluation, coal and gas outburst analysis, as well as mining and tunnel designing.
CDEM is the combination of Finite Element Method (FEM) and Distinct Element Method (DEM), It contains two kinds of elements, blocks and interfaces.
A discrete block is consisted of one or more FEM elements, all of which share the same nodes and faces. An interface contains several normal and tangent springs, each connects nodes in differentblocks. Inside the block the FEM is used, while for the interface, the DEM is adopted.
CDEM is an explicit time history-analysis FEM/DEM approach on finite difference principles and forward-difference approximation is adopted to calculate the progressive
process through a time marching scheme. During the calculation, the dynamic relaxation method is used to achieve convergence in a reasonable period time with small time steps, and the convergence is reached when the total magnitude of the kinetic energy is minimized.
Parallel Algorithm in GPU
`
Nodes
… Element 1
Nodes
Element 2
Block 1
Nodes
Element 3
Nodes
Element 4
Block 2
Nodes
Elem. n-1
Nodes
Element n
Block m
Thread 1 Nodal Calculation
…...
… Thread 2 Nodal Calculation
…...
Thread 3 Nodal Calculation
…...
Thread nnNodal Calculation
…...
Threads Threads Threads
Element
Group 1 Group 2
Group 3 Group 4
Element Element Element
Element Element
Element Element Element
`
Threads Threads Threads
Block 1 Block 2
…
Block nb
Thread 1
Process
Forces
Thread 2
Process
Forces
Thread 3
Process
Forces
Thread ng
Process
Forces
…
Thread Division Scheme for Element Calculation
Nodal Force Group
Case 2: Elastic Field Calculation of Great Wall Beacon
41232 Hex Elements4000 Steps
41232 Hex Elements4000 Steps
Time Cost for CDEM (GPU Version) 6.1 Sec
Time Cost for CDEM (CPU Version) 95 Min
Time Cost for FLAC3D 520 Sec
GeForce
GTX285
Core (TM) 21.83GHz
Core Duo3.0GHz
Case 3: Ground Settlement Induced by Deep Excavation
0.86 Million Tet
Elements0.86 Million Tet
Elements
Time Cost for Elastic Calculation 6 Min GeForce
GTX480
Case 1: Simple Model Comparison
Speed Up Ratio Is 252 / 0.554 = 455 GeForce
GTX285
A 1000 elements brick, with bottom fixed and gravity appliedA 1000 elements brick, with bottom fixed and gravity applied
CPU
Version
GPU
Version
http://english.imech.cas.cn/
Case 4: Elastic Calculation of Landslide
Time Cost for CDEM (GPU Version) 12.5 Sec
Time Cost for FLAC3D 1380Sec
GeForce
GTX480
Core Duo3.0GHz
85800 Hex elements 5000 Steps85800 Hex elements 5000 Steps
Element calculation
Nodal force redistribution
Global synchronization
Element calculation Element calculation…
Nodal force redistribution … Nodal force redistribution
Preprocessing
Copy data to GPU
Meet iteration conditions?
Post processing
End
Yes
No
Interface Force
Node Acceleration
Node Velocity
Node Displacement
Block Force
Constitutive Equation
Node Force
Damping ForceDamping
Equation
Is reached theconvergence ?
Kinetic Energy
No
FinishYes
Start