Scientific visualization of HPC simulation data – introduction and overview on MPG projects – Elena Erastova, Markus Rampp, Klaus Reuter [email protected] Max Planck Computing and Data Facility (MPCDF) Interdisciplinary Cluster Workshop on Visualization, Excellence Cluster Universe 2015/11/04, Garching


Scientific visualization of HPC simulation data
– introduction and overview on MPG projects –

Elena Erastova, Markus Rampp, Klaus Reuter

[email protected]

Max Planck Computing and Data Facility (MPCDF)

Interdisciplinary Cluster Workshop on Visualization, Excellence Cluster Universe, 2015/11/04, Garching

K. Reuter, MPCDF Garching, 2015/11/04

Outline

● Introduction

● Some basics on grid-based visualization techniques

● Visualization workhorses: ParaView and VisIt

● Data handling

● Gallery of visualization projects

● Summary

Introduction

MPCDF provides visualization infrastructure and project support for the Max Planck Society

● support for adaptation and instrumentation of simulation codes

● guidance for selection, adoption and usage of analysis & visualization software

● dedicated support for individual (particularly demanding) visualization projects

Main challenges

● broad range of scientific fields: plasma physics, astrophysics, materials science, life sciences, ...

● variety of simulation codes: in-house developments, third-party, commercial, open source, closed source

● integration of legacy analysis pipelines

● complexity of datasets generated by HPC simulations

– data structures and file formats: often non-standardized and heterogeneous

– dimensionality: multidimensional (3D + time), multi-variate data

– memory requirements: massive amount of raw data

– topology: gridded data, mesh-free data, complex coordinates, ...

some basics on scientific visualization

Terminology: “Visualization” vs “Rendering”

● visualization: visual representation of data, e.g. for chemical structures: create a ball-and-stick model from atom positions (and compute an image or do a 3D print)

● rendering: generation of a 2D image from a 1D, 2D or 3D model, e.g. create a 2D projection of a 3D “balls and sticks” model

Popular techniques for 3D scalar fields

● volume rendering

– ray casting: follow straight lines starting from a camera model (view point), intersecting an image plane (2D grid), and hitting objects (3D model)

– splatting: “throw” voxels at the 2D image plane from back to front (like “snowballs”)

– transfer function and color table map from data space to color space, e.g. we may intuitively mimic opacity and emissivity of a gas

– provides qualitative and quantitative information

● pseudocolor plots

– color table maps from data space to color space

– 2D is straightforward, 3D requires clipping

– provides mostly quantitative information

● isosurfaces (surfaces of constant scalar value)
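The ray casting and transfer function ideas above can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: rays are orthographic and axis-aligned (no perspective camera), the transfer function is a made-up linear ramp, and all names (`transfer_function`, `cast_ray`, `render`) are hypothetical rather than taken from any tool.

```python
import math

def transfer_function(value):
    """Map a scalar in [0, 1] to (emitted intensity, opacity).

    Hypothetical linear ramp: larger values glow brighter and are more opaque.
    """
    v = min(max(value, 0.0), 1.0)
    return v, 0.5 * v

def cast_ray(samples):
    """Front-to-back compositing of the scalar samples along one view ray."""
    intensity = 0.0
    transmittance = 1.0  # fraction of light that still reaches the camera
    for s in samples:
        emission, alpha = transfer_function(s)
        intensity += transmittance * alpha * emission
        transmittance *= 1.0 - alpha
        if transmittance < 1e-3:  # early ray termination
            break
    return intensity

def render(volume):
    """Cast one ray along z per pixel; volume[z][y][x] -> image[y][x]."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    return [[cast_ray([volume[z][y][x] for z in range(nz)])
             for x in range(nx)]
            for y in range(ny)]

# Synthetic test volume: a Gaussian blob centred in a 16^3 grid.
n = 16
c = (n - 1) / 2
volume = [[[math.exp(-((x - c) ** 2 + (y - c) ** 2 + (z - c) ** 2) / 18.0)
            for x in range(n)]
           for y in range(n)]
          for z in range(n)]
image = render(volume)
```

The resulting `image` is brightest at the centre pixel, where the ray crosses the densest part of the blob. Production tools add perspective cameras, interpolation between grid points, and color tables on top of the same compositing loop.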

[Figures: image credits wikipedia.org, visitusers.org, MPCDF]

ray casting

[Figure: ray casting schematic with camera, image plane, and view ray; adapted from wikipedia.org. Example image: MPCDF]

ray tracing

[Figure: ray tracing schematic with camera, image plane, view ray, scene object, shadow ray, and light source; adapted from wikipedia.org. Example images: blender.org, K. Reuter]

some basics on scientific visualization (cont'd)

Popular techniques for 3D vector fields

● colored arrow plots

● streamlines, streaklines, pathlines

● contraction to 3D scalar field

– vector magnitude

– projection
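Two of the techniques listed above, contraction to a scalar (vector magnitude) and streamline tracing, can be illustrated with a short plain-Python sketch. The velocity field is a hypothetical analytic test case (rigid rotation about z plus a uniform drift), and the midpoint integrator is one simple choice among many, not the scheme used by any particular tool.

```python
import math

def velocity(x, y, z):
    """Hypothetical test field: rotation about the z axis plus drift along z."""
    return (-y, x, 0.5)

def magnitude(v):
    """Contract a 3D vector to a scalar that can be shown as a pseudocolor plot."""
    return math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)

def streamline(seed, step=0.01, n_steps=200):
    """Trace a streamline from `seed` with the midpoint (RK2) rule."""
    x, y, z = seed
    points = [seed]
    for _ in range(n_steps):
        k1 = velocity(x, y, z)
        # Evaluate the field half a step ahead, then take the full step with it.
        k2 = velocity(x + 0.5 * step * k1[0],
                      y + 0.5 * step * k1[1],
                      z + 0.5 * step * k1[2])
        x, y, z = x + step * k2[0], y + step * k2[1], z + step * k2[2]
        points.append((x, y, z))
    return points

line = streamline((1.0, 0.0, 0.0))
speed_at_seed = magnitude(velocity(1.0, 0.0, 0.0))
```

For this stationary field, streamlines, streaklines and pathlines coincide; they differ only when the field changes in time.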

[Figures: streamlines; colored arrow plot; streamwise vorticity; velocity magnitude. Image credits: visitusers.org, MPCDF, and by courtesy of V. Avsarkisov (TU Darmstadt)]

some basics on scientific visualization (cont'd)

Gridded vs ungridded data

● depending on the simulation approach, data values may be available at the vertices of a grid or at (independent) points in space

● gridded data (vertices connected by edges)

– connectivity information greatly simplifies interpolation and parallelization

– many of the aforementioned algorithms require gridded data

– topologies: unstructured, structured, rectilinear, tetrahedral, …

– parallelization is often easiest and works best on rectilinear grids

● ungridded data

– only suitable for certain visualization algorithms (→ talk by K. Dolag)

– triangulation (e.g. Delaunay) may be used to calculate a grid (expensive!)

Visualization tools

Overview of popular software tools for scientific visualization

● 1D, 2D (time-dependent) data: Python/SciPy/matplotlib, R, Matlab, gnuplot, IDL, ...

– mostly conventional plotting (2D vector graphics, publication-quality)

– suitable to implement data processing pipelines and (automated) quantitative analysis

● 3D (time-dependent) data: ParaView and VisIt

– implement extensive toolboxes of visualization algorithms

– enable interactive data exploration and quantitative analysis

– publication-quality plots, renderings, movies

– main workhorses when it comes to HPC visualization

[Figure: example plot; image credit: gnuplot.info]

ParaView

● comprehensive open-source toolbox for scientific visualization

● scalable parallel architecture, implementation based on VTK, MPI and TCP/IP

● mature code base, but still actively developed (started in 2000 by LANL & Kitware, since 2005 Sandia & Kitware)

● extensible via plugins (e.g. data readers, data filters, in-situ connectors)

● controllable via Python scripting

● widely used

● well documented → www.paraview.org

[Figures: ParaView's architecture enabling HPC-scale parallelism; ParaView's graphical user interface. Image credit: www.paraview.org]

VisIt

● comprehensive open-source toolbox for scientific visualization

● scalable parallel architecture, implementation based on VTK, MPI and TCP/IP

● mature code base, but still actively developed (started in 2000 at LLNL)

● extensible via plugins (e.g. data readers, data filters, in-situ connectors)

● controllable via Python scripting

● widely used

● well documented → visit.llnl.gov

[Figure; image credit: MPCDF]

Workflow (VisIt example)

[Figure: workflow diagram; image credit: MPCDF]

Data management

How to get data from an HPC code into your favourite visualization tool?

Criteria for selecting an IO library (and data format):

● performance: a parallel code requires parallel IO

● portability: code maturity, availability at HPC centers

● usability: code changes, debug options

For small data sets (time series of aggregate variables) it may work to

● write text or binary files in a proprietary format during a code run

● apply a data conversion pipeline as a second step (e.g. using Python)

However, for larger data sets (~ few hundred MB) this procedure is inefficient and eventually becomes the bottleneck.

VTK (www.vtk.org) provides, among others, an extensive framework to handle gridded and ungridded data of various kinds, including file formats that are compatible with ParaView and VisIt.

Pro: Python bindings are helpful to implement postprocessing including format conversion.

Caveat: VTK's (complex!) C++ code may be hard to integrate with existing HPC code.
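For small data sets, the two-step conversion route described above can be as simple as the following sketch. It writes a 3D scalar field in the legacy ASCII VTK "structured points" layout, which ParaView and VisIt can open. It deliberately avoids the VTK Python bindings and uses only the standard library; the function name, the variable name `density` and the output path are hypothetical.

```python
import os
import tempfile

def write_vtk_structured_points(path, name, data, spacing=(1.0, 1.0, 1.0)):
    """Write data[z][y][x] as a legacy ASCII VTK file on a uniform grid."""
    nz, ny, nx = len(data), len(data[0]), len(data[0][0])
    with open(path, "w") as f:
        f.write("# vtk DataFile Version 3.0\n")
        f.write("scalar field written by a conversion script\n")
        f.write("ASCII\n")
        f.write("DATASET STRUCTURED_POINTS\n")
        f.write(f"DIMENSIONS {nx} {ny} {nz}\n")
        f.write("ORIGIN 0 0 0\n")
        f.write(f"SPACING {spacing[0]} {spacing[1]} {spacing[2]}\n")
        f.write(f"POINT_DATA {nx * ny * nz}\n")
        f.write(f"SCALARS {name} float 1\n")
        f.write("LOOKUP_TABLE default\n")
        # The legacy format expects x to vary fastest, then y, then z.
        for plane in data:
            for row in plane:
                f.write(" ".join(f"{v:g}" for v in row) + "\n")

# Tiny synthetic field (4 x 3 x 2 points) standing in for simulation output.
data = [[[float(x + 10 * y + 100 * z) for x in range(4)]
         for y in range(3)]
        for z in range(2)]
out_path = os.path.join(tempfile.gettempdir(), "example_field.vtk")
write_vtk_structured_points(out_path, "density", data)
```

ASCII output like this is convenient for debugging but scales poorly, which is exactly the bottleneck noted above for larger data sets.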

Data management (cont'd)

Software solutions for HPC

● MPI-IO (low-level), HDF5 (library, tools), NetCDF (library, tools)

● popular strategy: HDF5 for data to be visualized, raw MPI-IO or HDF5 for checkpoints

● future: in-situ visualization

HDF5 in a nutshell

● Hierarchical Data Format: (group, dataset) corresponds to POSIX (directory, file)

● efficient parallel IO (MPI-IO, GPFS)

● → www.hdfgroup.org

● supported by tools such as ParaView and VisIt

● use XDMF to add XML grid information → www.xdmf.org
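As an illustration of the HDF5 + XDMF combination, the sketch below generates a small XDMF file that describes one scalar dataset on a uniform 3D grid stored in an HDF5 file. The file name `fields.h5`, the dataset `/pressure` and the helper `make_xdmf` are hypothetical, and the (z, y, x) ordering of the geometry values is an assumption that should be checked against the XDMF documentation for a real code.

```python
import xml.etree.ElementTree as ET

def make_xdmf(h5_file, dataset, dims, origin=(0.0, 0.0, 0.0), spacing=(1.0, 1.0, 1.0)):
    """Return XDMF XML describing one scalar on a uniform 3D grid.

    dims, origin and spacing are given in (z, y, x) order, i.e. from the
    slowest to the fastest index of the HDF5 dataset (an assumption here).
    """
    dim_str = " ".join(str(n) for n in dims)
    name = dataset.strip("/")

    xdmf = ET.Element("Xdmf", Version="2.0")
    domain = ET.SubElement(xdmf, "Domain")
    grid = ET.SubElement(domain, "Grid", Name="mesh", GridType="Uniform")
    ET.SubElement(grid, "Topology", TopologyType="3DCoRectMesh", Dimensions=dim_str)

    # Uniform grid geometry: one item for the origin, one for the spacing.
    geometry = ET.SubElement(grid, "Geometry", GeometryType="ORIGIN_DXDYDZ")
    for values in (origin, spacing):
        item = ET.SubElement(geometry, "DataItem", Dimensions="3", Format="XML")
        item.text = " ".join(str(v) for v in values)

    # The heavy data stays in the HDF5 file; XDMF only points to it.
    attribute = ET.SubElement(grid, "Attribute", Name=name,
                              AttributeType="Scalar", Center="Node")
    data = ET.SubElement(attribute, "DataItem", Dimensions=dim_str,
                         NumberType="Float", Precision="8", Format="HDF")
    data.text = h5_file + ":/" + name
    return ET.tostring(xdmf, encoding="unicode")

xdmf_text = make_xdmf("fields.h5", "/pressure", dims=(4, 8, 16))
```

The resulting text would be saved next to the HDF5 file (for example as `fields.xmf`) and opened in ParaView or VisIt instead of the raw `.h5` file.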

[Figure: Hierarchical Data Format; image credit: www.hdfgroup.org]

Data management: workflows

Wrap-up: How to get data from an HPC code into your favourite visualization tool?

Explicit data conversion

● allows some basic post-processing and/or data reduction of simulation output

● quick (& dirty) programming: copy/paste from I/O statements in simulation code

● duplication of data

● → which format? Silo (VisIt's "proprietary" data format), HDF5, VTK, ...

Development of a reader plugin for VisIt or ParaView

● no data duplication, no additional pre-processing step

● plugin is dynamically loaded (code may reside in $HOME)

● development requires C programming and compilation against a ParaView/VisIt installation

Adaptation of I/O in simulation code

● no data duplication, no additional pre-processing step

● may promote interoperability with other tools (depending on chosen format, e.g. hdfview)

● implications for software management (code policies, access to source code, …)

● → which format? HDF5 (may require XDMF for metadata), VTK, ...

MPCDF visualization projects

Project selection

● scientific domains: plasma physics, astrophysics, CFD, molecular dynamics, ...

● data structures/grids:

– regular: cartesian, polar (2D, 3D), block-structured (“Yin-Yang”)

– irregular: (mapped) point clouds

● data sizes, dimensions:

– up to 2048^3 (cartesian), 1000 × 180 × 360 (polar), 2048 × 769 × 1153 (cylindrical)

– up to 10^6 particles in 3D, 10^7 nodes in 3D unstructured mesh

– multi-variable (scalar, vector), time-dependent

● tools: parallel HDF5 (+XDMF), VisIt, ParaView, Splotch

● → http://www.rzg.mpg.de/services/visualisation/scientificdata/projects

Aims

● Sketch results & experiences from real-world visualization projects

● Answer user questions: What can be done? Can certain tools support my research?

Core-collapse supernova

Visualisation approach (M. Rampp, MPCDF)

● data: (1000×180×360) zones on a non-uniform, polar grid

● approx. 700 output files (time steps)

● proprietary output format was converted to VisIt's Silo format first using a simple FORTRAN code

● “multi-channel” volume rendering (non-standard use case for VisIt)

● elements Ni-56, O-16, C-12 shine in blue, green, red

● gained experience with stereo rendering

Simulations by N. Hammer, Th. Janka & E. Müller (MPA)

● supernova explosion of a 15 Msol star

● first 3D simulations of long-term evolution (Hammer et al., ApJ 714, 1371, 2010)

● instabilities & mixing of heavy elements

● simulation code: PROMETHEUS/HOTB (3D hydrodynamics, finite-volume, PPM)

→ movie [MPCDF]

Core-collapse supernova (cont'd)

Visualization techniques using VisIt

3D Volume rendering

● operators: box, ”spherical to cartesian” coordinate transform

● rendering algorithms: splatting (for interactive exploration), ray casting (for producing the final HQ result)

● movie was coded up using a Python script and then rendered non-interactively

● created individual image files for each of the 3 scalar variables

● finally applied RGB image composition using ImageMagick
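The final composition step can be pictured as follows. The slides state that ImageMagick was used for it; the plain-Python sketch below only illustrates the principle of stacking three grayscale renderings into the red, green and blue channels of one image. The tiny 2 × 2 input images and the helper names are hypothetical; the channel assignment (C-12 red, O-16 green, Ni-56 blue) follows the slides.

```python
def compose_rgb(red, green, blue):
    """Combine three grayscale images (rows of 0..255 values) into one RGB image."""
    return [[(r, g, b) for r, g, b in zip(red_row, green_row, blue_row)]
            for red_row, green_row, blue_row in zip(red, green, blue)]

def to_ppm(image):
    """Serialize an RGB image as a plain-text PPM (P3) string."""
    height, width = len(image), len(image[0])
    lines = ["P3", f"{width} {height}", "255"]
    for row in image:
        lines.append(" ".join(f"{r} {g} {b}" for r, g, b in row))
    return "\n".join(lines) + "\n"

# Hypothetical 2 x 2 grayscale renderings, one per scalar variable.
c12 = [[255, 0], [0, 0]]
o16 = [[0, 255], [0, 0]]
ni56 = [[0, 0], [255, 128]]

rgb = compose_rgb(red=c12, green=o16, blue=ni56)
ppm_text = to_ppm(rgb)
```

Rendering each variable separately and combining afterwards keeps the per-variable transfer functions independent, which is one way to work around a tool that volume-renders a single scalar at a time.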

Quantitative analysis

● plots: pseudocolor

● operators: box, ”spherical to cartesian” coordinate transform, isosurface, slice (2D projection)

Core-collapse supernova (cont'd)

Quantitative analysis using VisIt

● plots are taken directly from Hammer et al., ApJ 714, 1371, 2010

● multiple isosurfaces shed light on the morphology of the instability

● analyze different scalar fields in slice planes, in particular the chemical composition

Core-collapse supernova (another case)

Visualisation approach (E. Erastova, M. Rampp)

● data: (1000×180×360) zones on non-uniform, polar grid

● approx. 1000 output files (time steps)

● pseudo-color plots for data exploration and quantitative analysis

● combined volume renderings for HQ movies

● alternative technique: multiple, semi-transparent iso-surfaces

Simulations by Th. Janka et al. (MPA)

● neutrino-driven explosions of massive stars from first principles

● simulation code: VERTEX (3D, time-dependent radiation hydrodynamics with detailed microphysics)

● first 3D simulations of long-term evolution

● code writes HDF5 and XDMF

● a “spiral mode” was discovered with the help of 3D visualization (Melson et al., arXiv:1504.07631)

Core-collapse supernova (cont'd)

Interactive graphics with X3DOM in a web browser

● supplements publication of simulation results, e.g. by ApJ → http://iopscience.iop.org/0004-637X/793/2/127/media

● 3D data format and object model (→ www.x3dom.org)

● X3D(OM) file export supported by ParaView, VisIt (2.10)

● controls: mouse enables zoom and interaction

● plain HTML5, no browser plugin required

[Figures: .x3d file export; .x3d file reference. Image credit: MPCDF]

DNS of turbulence

Visualisation approach (M. Rampp, L. Shi)

● data: (2048×769×1153) zones on non-uniform, cylindrical grid

● approx. 1000 output files (time steps)

● developed an I/O and visualisation strategy

● changes to the simulation code NSCOUETTE

– parallel HDF5 output of physical variables, p(θ,z,r), (uθ,uz,ur)

– generation of XDMF metadata and output in separate XML files

● visualisation with VisIt

– applied ”swap coordinates” operator to transpose coordinates: (θ,z,r) → (r,θ,z)
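What the coordinate swap amounts to can be shown with a small plain-Python sketch: data stored in (θ, z, r) index order is reordered to (r, θ, z), and cylindrical coordinates are then mapped to Cartesian ones, as done later in the VisIt pipeline. The function names are hypothetical and the nested lists stand in for the real HDF5 arrays; VisIt performs these steps with its own operators.

```python
import math

def swap_axes(data_tzr):
    """Reorder data[theta][z][r] into data[r][theta][z]."""
    nt, nz, nr = len(data_tzr), len(data_tzr[0]), len(data_tzr[0][0])
    return [[[data_tzr[t][k][i] for k in range(nz)]
             for t in range(nt)]
            for i in range(nr)]

def cylindrical_to_cartesian(r, theta, z):
    """Map a point (r, theta, z) to Cartesian (x, y, z)."""
    return r * math.cos(theta), r * math.sin(theta), z

# Synthetic data: the value encodes its own (theta, z, r) indices.
data = [[[100 * t + 10 * k + i for i in range(4)]
         for k in range(3)]
        for t in range(2)]
swapped = swap_axes(data)
point = cylindrical_to_cartesian(2.0, math.pi / 2, 1.0)
```

After the swap, `swapped[i][t][k]` holds the value that was stored at `data[t][k][i]`, so no data are changed, only the index order.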

Simulations by L. Shi, M. Avila, B. Hof (MPI f. Dynamics and Self Organization, FAU Erlangen, IST Austria)

● DNS of fluids (pipe flows, Taylor-Couette flows)

● Code NSCOUETTE: solves the incompressible Navier-Stokes equations using a pseudospectral method (Shi, Rampp, Hof, Avila, Computers and Fluids, 2015)

● Basic research in turbulence: lab experiments, numerical simulations; astrophysics: accretion in cold discs (Hof et al., Science, 2010; Avila et al., Science, 2011)

● PRACE/DECI project HYDRAD

DNS of turbulence (cont'd)

Visualization pipeline implemented using VisIt

● expressions: vorticity(ur, uθ, uz) = ∂ur/∂z − ∂uz/∂r

● operator “swap coordinates”: (θ,z,r) → (r,θ,z)

● operator “transform coordinates”: (r,θ,z) → (x,y,z)

● plots: pseudocolor, volume, (+vector, ...)

● Python code
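The vorticity expression in the pipeline above, ∂ur/∂z − ∂uz/∂r, can be evaluated on a non-uniform (r, z) grid with simple finite differences. The sketch below is a plain-Python illustration, not the VisIt expression itself: the helper names are hypothetical, the difference quotient is only first-order accurate where the spacing is non-uniform, and the linear test field is synthetic (chosen so the exact answer is known), not a physical flow.

```python
def ddx(f, x, i):
    """Difference quotient of samples f at x[i]: central inside, one-sided at the ends."""
    lo = max(i - 1, 0)
    hi = min(i + 1, len(x) - 1)
    return (f[hi] - f[lo]) / (x[hi] - x[lo])

def azimuthal_vorticity(ur, uz, r, z):
    """Return omega[i][k] = d(ur)/dz - d(uz)/dr for fields indexed as [r][z]."""
    nr, nz = len(r), len(z)
    omega = [[0.0] * nz for _ in range(nr)]
    for i in range(nr):
        for k in range(nz):
            dur_dz = ddx(ur[i], z, k)
            duz_dr = ddx([uz[j][k] for j in range(nr)], r, i)
            omega[i][k] = dur_dz - duz_dr
    return omega

# Non-uniform radial grid, uniform axial grid.
r = [0.5, 0.6, 0.8, 1.1, 1.5]
z = [0.0, 0.25, 0.5, 0.75, 1.0]
# Synthetic linear field: ur = 2 z, uz = -3 r, so the exact result is 2 - (-3) = 5.
ur = [[2.0 * zk for zk in z] for _ in r]
uz = [[-3.0 * ri for _ in z] for ri in r]
omega = azimuthal_vorticity(ur, uz, r, z)
```

In VisIt the same quantity is defined once as an expression and then treated like any other scalar variable in pseudocolor or volume plots.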

[Figure: streamwise vorticity, ∂ur/∂z − ∂uz/∂r; image credit: MPCDF]

SPH visualization

Simulations by S. Kochfahr et al. (MPE)

● SPH simulations produce point clouds with a strongly varying particle density (SPH's “adaptive resolution”)

● Background: SPH "particles" sample scalar fields, particles carry size information (smoothing kernel)

● very limited support by standard software, special-purpose software does not cover full spectrum of features: interactivity, slicing, quantitative analysis
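To make the "particles sample scalar fields" idea concrete, the sketch below deposits a few SPH particles onto a regular 2D grid using the standard cubic spline kernel, which turns the point cloud into a smooth density field. It is a plain-Python illustration only: the particle list, grid and function names are hypothetical, and the brute-force loop over all grid points is far too slow for real data.

```python
import math

def cubic_spline_2d(dist, h):
    """Standard 2D cubic spline SPH kernel with smoothing length h (support radius 2h)."""
    q = dist / h
    sigma = 10.0 / (7.0 * math.pi * h * h)  # 2D normalisation constant
    if q <= 1.0:
        return sigma * (1.0 - 1.5 * q * q + 0.75 * q ** 3)
    if q <= 2.0:
        return sigma * 0.25 * (2.0 - q) ** 3
    return 0.0

def density_on_grid(particles, xs, ys):
    """Sum kernel-weighted particle masses; returns rho[j][i] at (xs[i], ys[j])."""
    rho = [[0.0] * len(xs) for _ in ys]
    for px, py, mass, h in particles:
        for j, y in enumerate(ys):
            for i, x in enumerate(xs):
                rho[j][i] += mass * cubic_spline_2d(math.hypot(x - px, y - py), h)
    return rho

# Two hypothetical particles (x, y, mass, smoothing length) with different sizes.
particles = [(0.0, 0.0, 1.0, 1.0), (1.0, 0.5, 2.0, 0.4)]
dx = 0.1
xs = [-3.0 + dx * i for i in range(61)]
ys = [-3.0 + dx * j for j in range(61)]
rho = density_on_grid(particles, xs, ys)
total_mass = sum(sum(row) for row in rho) * dx * dx
```

Summing the gridded density times the cell area approximately recovers the total particle mass (3 in this example), which is a quick sanity check for the kernel normalisation. The smaller smoothing length of the second particle shows the adaptive resolution: it contributes a narrower, higher peak.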

[Figures: visualization as particles vs. visualization as a smooth density field]

SPH visualization (cont'd)

Visualisation approach (C. Simion, MPE and K. Reuter, MPCDF)

● mapping to unstructured grids which can be handled by VisIt and ParaView

● approach: 3D Delaunay triangulation to create a tetrahedral mesh

– preserves resolution, avoids interpolation to a regular grid

● initially used inefficient VTK library implementation

– CPU time scales as N^2

– huge memory requirements

● implemented custom code using the faster Qhull library

[Figure: visualization as a smooth density field; image credit: MPCDF]

Synergies with a materials science project (FHI)

● principle/code was reused to analyze materials science data

[Figure; image credit: MPCDF]

SPH visualization (cont'd)

● Splotch is a special ray tracer to visualize SPH simulation data (without conversion).
→ Talk by K. Dolag (LMU) this afternoon will give you all the technical details.
→ K. Dolag et al., New Journal of Physics, Volume 10, Issue 12, pp. 125006 (2008)

● Code extension to tackle time-dependent visualizations of large simulation data sets (~10^10 particles) (K. Reuter, MPCDF, 2011).
→ Implemented MPI parallelization of particle handling and frame interpolation.

● Parallelization enabled visualization support for users at the MPE and MPA; movie rendering utilized the complete memory of the VIZ cluster at the MPCDF.

Geospatial data and bird migration

Visualisation approach (K. Reuter, MPCDF & K. Safi, MPI-Orn., 2012)

● visualization with ParaView, lots of Python scripting

● tedious generation and optimization of camera movement → better use artist's tools such as Blender?

Data by M. Wikelski (MPI for Ornithology)

● observational data

● a bird’s (gull) GPS track correlated with wind data

● topography data, earth's magnetic field data, …

● time-dependent data

Visualization adapted to wall projection at the “hennhouse”, i.e. the visitors and media center of MPI-Orn at Radolfzell/Bodensee.

[Figure: movie presented by M. Wikelski at the general assembly of the MPG, 2012; image credit: MPCDF]

Summary and Conclusions

● ParaView and VisIt provide (partly parallel) implementations of many state-of-the-art visualization algorithms

– sophisticated and mature open-source tools

– end-user ready

– strong in visualizing gridded data at HPC scale

● Use HDF5 (& XDMF) to implement IO, if possible

● Check out our visualization gallery at www.mpcdf.mpg.de