Implementation Of Deflated Preconditioned Conjugate...

Implementation of Deflated Preconditioned Conjugate Gradient

on the GPU

Rohit Gupta

Responsible Professor: Prof. Dr. Ir. Henk Sips

Supervisors: Prof. Dr. Ir. Kees Vuik and Ir. C.W.J. Lemmens

Two-Phase Fluid Flow

Model for Two Phase Computation

Boundary

Conditions

-1 4 -1

-1 4 0

0 4 -1

-1 4 -1

0 1.5015 -0.5005

-0.5005 2.002 -0.5005

-0.5005 2.002

GPU Architecture

Memory Bandwidth

Launch Configuration

•Coalescing

•Minimum Memory Transfers

•Minimum Divergence

•Caching

Key Optimizations

•SpMV Kernel

•Preconditioning Kernels

•Deflation Kernels

Related Work

Two-Phase Fluid Flow

•SpMV Kernel

•Preconditioning Kernels

Implementation GPU

•Meschach/ gotoBLAS

•Compiler Options

•Same Data Structures

Implementation CPU

•Conjugate Gradient (CG)

•CG with Preconditioning

•CG with Preconditioning and

Deflation

Flavors

•Diagonal

•Incomplete Cholesky

•Incomplete Poisson

Preconditioning

•Important Operations

•Subdomains

Deflation

•Step-by-Step Optimizations

•Profiler for Advice

•Modularity and Readability

Methodology

•DIA Format is best

•SpmV dominate GPU execution

Conjugate Gradient - Vanilla

23 10||||

||||10

kexact

•Preconditioning dominates execution

•More Blocks – Better SpeedUp

•Shared Memory Use restricted

Preconditioning

Block Incomplete Cholesky

•Speed Up recovers

•Preconditioning gets company in

•More Deflation Vectors – More Speed Up

Deflation

bEAZ 1

•Storage of optimized

•Speed Up Suffers – CPU performs better

•Optimizations on CPU

Deflated Preconditioned

Conjugate Gradient

•Incomplete Poisson Preconditioning

•Speed Up favoring move

Conjugate Gradient

kexact

•Matrix Vector Multiplication optimized

•Kernels utilize 85% of bandwidth

•Clubbing Kernels – Reducing Memory Traffic

Conjugate Gradient

•Density Contrast 1000:1

•False Stop

Two Phase Matrix

2 10||||

kexact

Performance Timeline

Iterative OptimizationsSpeedUp Factors

•Very close to possible peak performance

•Bandwidth bound Kernels

•Platform Utilization – Startling Facts

Analysis

•Multi-GPU Multi-CPU

•3D Domains, More Interfaces, Mixed Precision

•Different Preconditioners, Grid Types.

Future Work

Conclusions

•Deflation suits the many core platform

•Two Phase accuracy suffers

•Deflation with IP Preconditioning

Conclusions

Implementation Of Deflated Preconditioned Conjugate...

Documents

Statistically Preconditioned Accelerated Gradient Method ... · Statistically Preconditioned Accelerated Gradient Method for Distributed Optimization Hadrien Hendrikxz (hadrien.hendrikx@inria.fr)

ordering methods for preconditioned conjugate gradient methods

PRECONDITIONED CONJUGATE-GRADIENT 2 (PCG2), · preconditioned conjugate-gradient method to solve the matrix equations produced by the U.S. Geological Survey modular three-dimensional,

Preconditioned Multigrid Methods for Compressible Flow

COMPARISON OF THE DEFLATED PRECONDITIONED …smaclachlan/research/asphalt.pdf · require ne-scale nite-element (FE) meshes, which lead to large linear systems that are chal-lenging

Eﬃcient Two-Level Preconditioned Conjugate Gradient Methodon theGPUta.twi.tudelft.nl/nw/users/vuik/papers/Gup13aGV.pdf · 2013. 10. 18. · 0 50 100 150 200 250 10−5 10−4 10−3

Extending Preconditioned GMRES to Nonlinear Optimization

Time Complexity of a Parallel Conjugate Gradient Solver for Light … · 2016-07-18 · We describe parallelization for distributed memory computers of a preconditioned Conjugate

Preconditioned Iterative Methods for Linear Systems

Implementation of the Deflated Pre-conditioned Conjugate …homepage.tudelft.nl/d2b4e/numanal/gupta_presentation1.pdf · 2010. 1. 27. · •Successive Over- Relaxation G = CONJUGATE

Left-Preconditioned Communication-Avoiding Conjugate Gradient Methods ... · CG with Matrix Power Kernel compute Ar. sk+j. based on recurrence formula. using basis vectors given by

Preconditioned Conjugate Gradient Algorithms for Graph ... · ∗This work was supported by (i) the Fonds de la Recherche Scienti que { FNRS and the Fonds Wetenschappelijk Onderzoek

Deflated Conjugate Gradient Method for modeling Groundwater Flow Near Faults

TOWARD THE OPTIMAL PRECONDITIONED - CiteSeer

Preconditioned Iterative Methods White Paper

Preconditioned conjugate- and secant-Newton methods for ...users.ntua.gr/.../files/Preconditioned_Conjugate_and_Secant-Newton... · pTKp,=O holds for any i#j, where K is the linear

Study on Preconditioned Conjugate Gradient Methods for ... · Study on Preconditioned Conjugate Gradient Methods for Solving Large Sparse Matrix in CSR Format (Course Project of ECE697NA)

COMPUTER PROGRAM FOR SOLVING GROUND-WATER FLOW … · by the preconditioned conjugate gradient method ... flow chart ... computer program for solving ground-water flow equations by

5.4 CONJUGATE GRADIENT CONVERGENCE ...anitescu/CLASSES/2012/LECTURES/...Preconditioned conjugate gradient Preconditioner action M=CTC 5.5 NONLINEAR CONJUGATE GRADIENT The Fletcher-Reeves

GPU-Accelerated Preconditioned Iterative Linear Solvers