Liszt: A DSL for Mesh-Based PDEs
Z. DeVito, M. Medina, N. Joubert, M. Barrientos, E. Elsen, S. Oakley, J. Alonso, E. Darve, F. Ham, P. Hanrahan
GPGPU RUNTIME
Liszt Code

val Flux = FieldWithConst[Cell,Float](0.f)
while (t < 2.f) {
  for (f <- interior_set) {
    val normal = face_unit_normal(f)
    val vDotN  = dot(globalVelocity, normal)
    val area   = face_area(f)
    var flux   = 0.f
    val cell   = if (vDotN >= 0.f) inside(f) else outside(f)
    flux = area * vDotN * Phi(cell)
    Flux(inside(f) : Cell)  -= flux
    Flux(outside(f) : Cell) += flux
  }
  for (f <- inlet_set) {
    val area  = face_area(f)
    val vDotN = dot(globalVelocity, face_unit_normal(f))
    Flux(outside(f) : Cell) += area * vDotN * phi_sine_function(t)
  }
}
val Flux = FieldWithConst[Cell,Float](0.f)
def determineInclusions() : Unit = {
  for (f <- inlet_set)    { isInlet_face(f) = 1 }
  for (f <- interior_set) { isInterior_face(f) = 1 }
}
for (c <- cells(mesh)) {
  for (f <- faces(c)) {
    if (isInterior_face(f) > 0) {
      val normal = face_unit_normal(f)
      val vDotN  = dot(globalVelocity, normal)
      val area   = face_area(f)
      val cell   = if (vDotN >= 0.f) inside(f) else outside(f)
      var flux   = area * vDotN * Phi(cell)
      if (ID(c) == insideID(f))  Flux(c) -= flux
      if (ID(c) == outsideID(f)) Flux(c) += flux
    }
  }
}
for (c <- cells(mesh)) {
  for (f <- faces(c)) {
    if (isInlet_face(f) > 0 && ID(c) == outsideID(f)) {
      val area  = face_area(f)
      val vDotN = dot(globalVelocity, face_unit_normal(f))
      Flux(c) += area * vDotN * phi_sine_function(t)
    }
  }
}
Liszt GPU Code
IMPLICIT METHODS
ARCHITECTURE
Liszt is a domain-specific language that exposes a high-level interface for building mesh-based PDE solvers. This frees scientists from architecture-specific implementations and greatly increases programmer productivity. Current PSAAP solvers are tied to a specific platform, while Liszt solvers are portable across architectures. Our compiler achieves this by using domain knowledge in its program-analysis stage to produce high-performance code for a variety of platforms.
Liszt has a stable implementation for finite difference methods with a fully functional MPI-based backend. Liszt now supports implicit methods by providing native sparse matrix operations, as used by our implementation of the Joe RANS solver. Program transformations for our GPU runtime are in development, and our preliminary GPU runtime provides explicit finite difference support. A full stack of debugging, visualization, and compiler tools is now available.
OVERVIEW
State-of-the-art finite element and finite difference methods use implicit solvers for stability and performance. Implicit methods depend on global solves of sparse linear systems. Liszt adds language-level support for sparse matrices and integrates the PETSc solver as a backend.
Sparse matrices are tied to the topology of the mesh, so matrix entries can be referenced directly by mesh elements. Implicit formulations of finite difference methods have a regular matrix structure, which Liszt currently supports. Higher-order finite element methods require multiple distinct submatrices per element in their matrix formulation; support for this is in development.
The implicit version of Joe has been ported to Liszt, reducing its codebase from 3106 lines to 1520 lines (disregarding the 20,000+ lines of MPI boilerplate code in C++ Joe). MPI performance of the Liszt port is comparable to the original for both the explicit and implicit versions of Joe.
The Liszt framework cross-compiles Scala-embedded DSL code to C++. Three implementations of the runtime exist: an MPI-based runtime for clusters, an OpenMP-based runtime for SMPs, and a preliminary GPU backend.
The GPU backend implements gathers and reductions in native NVIDIA CUDA code and manages mesh and field data on the GPU. The JIT phase for the GPU performs transformations that convert standard scatter-based operations into gathers, allowing arbitrary Liszt code to be executed on the GPU.
CURRENT AND FUTURE WORK
We are currently working on:
- DSL advances through polymorphic embedding
- GPGPU-specific loop transformations
- FEM & DG support through canonical elements
Future work:
- Release of a private beta at the upcoming Codeathon
- Uncertainty quantification support
- Transformations between scatters, gathers & reduces
- A hybrid runtime combining MPI and GPGPU
double (*A)[5][5] = new double[ncv][5][5];
double (*phi)[5]  = new double[ncv][5];
double (*rhs)[5]  = new double[ncv][5];
for (int ifa = 0; ifa < nfa; ifa++) {
  int icv0 = cvofa[ifa][0];
  int icv1 = cvofa[ifa][1];
  int noc00, noc01, noc11, noc10;
  getImplDependencyIndex(noc00, noc01, noc11, noc10, icv0, icv1);
  calcEulerFluxMatrices_HLLC(Apl, Ami);
  for (int i = 0; i < 5; i++)
    for (int j = 0; j < 5; j++) {
      A[noc00][i][j] += Apl[i][j];
      A[noc01][i][j] += Ami[i][j];
    }
}
int *nbocv_v_global = new int[ncv_g];
for (int icv = 0; icv < ncv; icv++) {
  nbocv_v_global[icv] = cvora[mpi_rank] + icv;
  updateCvData(nbocv_v_global, REPLACE_DATA);
}
PetscSolver petscSolver(..., cvora, nbocv_i, nbocv_v, 5);
petscSolver.solveGMRES(A, phi, rhs, cvora, nbocv_i, nbocv_v,
                       nbocv_v_global, 5, ...);
val A   = new SparseMatrix[Float]
val phi = new SparseVector[Float]
val rhs = new SparseVector[Float]
for (c <- cells(mesh)) {
  for (f <- faces(c)) {
    val Apl = AplMatrixStorageField(f)
    val Ami = AmiMatrixStorageField(f)
    val cc  = inside(f)
    A(c,c)  += Apl
    A(c,cc) += Ami
  }
}
phi = A/rhs
Joe Implicit Code
Liszt Implicit Code
[Architecture diagram: SC.scala is compiled by the Scala compiler to SC.class; the Liszt JIT (configured by liszt.cfg) runs program analysis and platform-specific transforms, then emits code through MPI, SMP, and GPU codegen paths: the MPI build (MPICXX/GCC, run with mpirun), the SMP build (GCC, Cocoa threads), the Viz build (GCC), and the GPU build (NVCC, PTX).]
!"
#"
$%"
&'"
(&"
!" #" $%" &'" (&"
Sp
eed
up
over
Scala
r
Number of nodes
Joe Explicit Euler
!"
#"
$%"
&'"
(&"
!" #" $%" &'" (&"
Sp
eed
up
over
Scala
r
Number of nodes
Joe Implicit Euler
LisztViz is an extension of our single-core runtime that provides mesh visualization of the simulation system. LisztViz eases debugging by making all symbols visible through watchpoints in the execution stream.
The GPU implementation demands a separation of code into CPU drivers and GPU kernels, manages memory transfers, and transforms types. This happens in two passes: "transform" and "codegen".
PERFORMANCE