Initial experience with OpenCL programming and developing a GPU
solver for OpenFOAM
Presented by: Qingfeng Xia
School of MACE, University of Manchester
Date: 2011-05-21
Structure
• Part 1. Introduction to OpenCL tools
  • Coding, analyzing, debugging and profiling tools
  • ViennaCL: C++ template library for BLAS levels 1, 2 and 3
• Part 2. Introduction to OpenFOAM
  • Summary of GPU plugins
• Part 3. Profiling results
  • 3.1 Profiling the ViennaCL BLAS library
  • 3.2 Profiling OpenFOAM with GPU plugins
• Future work: real-time PIV and DSMC solver
AMD APP kernel analyzer and profiler
• Command-line tool: sprofile32 for Linux
• The GUI tool is a plugin for Visual Studio 2008/2010; the Professional edition is required
AMD APP kernel analyzer
AMD APP Profiler
NVIDIA Visual Profiler (cross-platform)
gDEBugger (cross-platform)
• (1) Powerful tool for OpenGL debugging and profiling, now also available for OpenCL
• (2) Cross-GPU platform: supports NVIDIA and AMD GPUs
• (3) Cross-OS: Windows, Linux and Mac
• Cons: so feature-rich that it takes time to get productive with it
IDE for C++ development (CodeLite)
• Cross-platform IDE
ViennaCL: OpenCL C++ BLAS library
• Brilliant library covering BLAS levels 1, 2 and 3
• Same API as Boost uBLAS; can fall back to the CPU
• However, this library could not be linked with OpenFOAM (error: segmentation fault)
Part 2
Introduction to OpenFOAM (CFD)
Installation (ideally your GPU supports double precision):
http://www.openfoam.org/archive/2.1.0/download/git.php
2.1 Quick Introduction to OpenFOAM
• Free Computational Fluid Dynamics (CFD) software
• (1) OpenFOAM is programmed in C++, without a GUI front end.
• (2) Code_Saturne (finite volume) is programmed in Fortran and has a GUI front end.
GPU solvers for OpenFOAM
• (1) OF plugin for OpenFOAM (before 2010)
  • Only free for single precision
• (2) ofgpu package (GPL, May 2011)
  • From Symscape.com, which ported OpenFOAM from *nix to Windows and developed a GUI (Caedium) for OpenFOAM
  • No preconditioner is implemented
  • No benchmark has been done
My work: clUtils, clFoam, vclFoam
(1) clUtils
• Written as practice in OpenCL programming; it also provides utilities for the clFoam solver.
• Mainly in C, so there is no template support.
• Single and double precision are switchable via a macro: #define scalar float
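The precision switch can be illustrated on the host side with a minimal sketch. It is not the clUtils source: the flag name CLUTILS_DOUBLE and the helper axpy are hypothetical, and a typedef stands in for the slide's `#define scalar float` so that the name cannot clash with later headers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical compile-time flag: build with -DCLUTILS_DOUBLE for double precision.
// The typedef plays the role of the slide's "#define scalar float".
#ifdef CLUTILS_DOUBLE
typedef double scalar;
#else
typedef float scalar;
#endif

// y <- a * x + y, written once against `scalar` so that the same source
// serves both precisions (hypothetical helper, not part of clUtils).
void axpy(scalar a, const std::vector<scalar>& x, std::vector<scalar>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] += a * x[i];
    }
}
```

For the OpenCL kernels themselves, the same switch could plausibly be passed as a build option such as "-D scalar=float" to clBuildProgram. Under OpenCL 1.0/1.1, double precision in kernels additionally requires the cl_khr_fp64 extension.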
(2) clFoam (PCG & PBiCG)
• (1) Converts the serial CPU code into parallel OpenCL code; all OpenFOAM preconditioners remain usable
• (2) My own PCG and BiCGStab solvers, implemented following textbook algorithms
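For reference, the textbook preconditioned conjugate gradient iteration can be sketched serially as below. This is not the clFoam code: the CSR layout, the Jacobi (diagonal) preconditioner and all names (CsrMatrix, spmv, dot, pcgJacobi) are assumptions chosen for brevity. In a GPU port, the sparse matrix-vector product, the dot products and the vector updates are the natural candidates for OpenCL kernels.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef double scalar;  // assumption: double precision

// Compressed sparse row storage (hypothetical layout, not OpenFOAM's matrix format).
struct CsrMatrix {
    std::size_t n;                    // number of rows and columns
    std::vector<std::size_t> rowPtr;  // size n + 1
    std::vector<std::size_t> col;     // column index of each stored entry
    std::vector<scalar> val;          // value of each stored entry
};

// y = A * x
void spmv(const CsrMatrix& A, const std::vector<scalar>& x, std::vector<scalar>& y) {
    for (std::size_t i = 0; i < A.n; ++i) {
        scalar sum = 0;
        for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
            sum += A.val[k] * x[A.col[k]];
        }
        y[i] = sum;
    }
}

scalar dot(const std::vector<scalar>& a, const std::vector<scalar>& b) {
    assert(a.size() == b.size());
    scalar s = 0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Solves A x = b for a symmetric positive-definite A with a nonzero diagonal.
// x holds the initial guess on entry and the solution on exit.
// Returns the number of iterations performed, or maxIter if the limit is reached.
int pcgJacobi(const CsrMatrix& A, const std::vector<scalar>& b,
              std::vector<scalar>& x, scalar tol, int maxIter) {
    const std::size_t n = A.n;
    std::vector<scalar> invDiag(n, 1), r(n), z(n), p(n), Ap(n);

    // Jacobi preconditioner: M^-1 = diag(A)^-1
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
            if (A.col[k] == i) invDiag[i] = 1 / A.val[k];
        }
    }

    spmv(A, x, Ap);
    for (std::size_t i = 0; i < n; ++i) r[i] = b[i] - Ap[i];      // r = b - A x
    for (std::size_t i = 0; i < n; ++i) z[i] = invDiag[i] * r[i]; // z = M^-1 r
    p = z;
    scalar rz = dot(r, z);

    const scalar bNorm = std::sqrt(dot(b, b));
    const scalar stop = tol * (bNorm > 0 ? bNorm : 1);

    for (int it = 0; it < maxIter; ++it) {
        if (std::sqrt(dot(r, r)) <= stop) return it;  // relative residual test
        spmv(A, p, Ap);
        const scalar alpha = rz / dot(p, Ap);
        for (std::size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        for (std::size_t i = 0; i < n; ++i) z[i] = invDiag[i] * r[i];
        const scalar rzNew = dot(r, z);
        const scalar beta = rzNew / rz;
        for (std::size_t i = 0; i < n; ++i) p[i] = z[i] + beta * p[i];
        rz = rzNew;
    }
    return maxIter;
}
```

Each iteration needs one sparse matrix-vector product and a few dot products. The dot products are reductions that return a single value to the host, so a GPU version has to synchronise at those points in every iteration.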
(3) vclFoam
• Wrappers that call the ViennaCL sparse matrix solvers.
• No preconditioner is implemented.
• Does not work yet (gcc 4.4, OpenFOAM 1.7)
Part 3: Profiling
• (1) Profiling tricks
Profiling method of ViennaCL
Vector addition via ViennaCL
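The original timing code and figures are not reproduced in this transcript. As a generic illustration of how such a measurement is usually set up, here is a minimal timing sketch, assuming a warm-up run followed by several timed repeats. The names bestTimeSeconds and vectorAdd are hypothetical, and the CPU vector addition stands in for the device operation being measured.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical helper: returns the best wall-clock time in seconds over
// `repeats` timed runs of `run`, calling `sync` after each run.
double bestTimeSeconds(const std::function<void()>& run,
                       const std::function<void()>& sync,
                       int repeats) {
    assert(repeats > 0);
    typedef std::chrono::steady_clock Clock;

    // Warm-up run, not timed: on a GPU the first call may include
    // kernel compilation and buffer transfers.
    run();
    sync();

    double best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < repeats; ++i) {
        const Clock::time_point start = Clock::now();
        run();
        sync();  // wait for queued device work before stopping the clock
        const std::chrono::duration<double> elapsed = Clock::now() - start;
        if (elapsed.count() < best) best = elapsed.count();
    }
    return best;
}

// CPU reference operation: c = a + b
void vectorAdd(const std::vector<float>& a, const std::vector<float>& b,
               std::vector<float>& c) {
    assert(a.size() == c.size() && b.size() == c.size());
    for (std::size_t i = 0; i < c.size(); ++i) c[i] = a[i] + b[i];
}
```

OpenCL kernel launches are asynchronous, so when timing device code the `sync` callback would need to block until the command queue is drained, for example by calling clFinish on it. For the CPU reference an empty callback is enough.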
SpeedIT classic PCG solver
• A Japanese study has already profiled this solver
• The results show that PCG on the GPU is 3 times slower than on the CPU.
clFoam profiling
Profiling platform
• Redqueen.rcs.manchester.ac.uk
• CPU: AMD, 4 cores, 2.3 GHz
• Using one core of the cluster node
• GPU: Tesla C2050
Conclusion
• (1) Looks promising: peak GFLOPS is hundreds of times higher than that of a single CPU.
• (2) But not yet powerful enough to speed up CFD simulation.
• Domain decomposition is still the most effective approach.
Future work
• (1) Real-time PIV vector processing at up to 10 Hz.
• Most of the calculation time is spent on Inter?? between 32×32-pixel spots, which can make the best use of the fast local cache on the GPU.
• (2) Direct simulation Monte Carlo (DSMC) method: particle tracking, etc.
Acknowledgement
• RCS: Dr Mike Bane, Dr Simon, etc.
• Tested on the NVIDIA GPU of the Redqueen cluster