BG/Q vs BG/P: Applications Perspective from Early Science Program. Timothy J. Williams, Argonne Leadership Computing Facility. 2013 MiraCon Workshop, Monday 3/4/2013, Session: 3:45-4:30pm. BG/P applications should run, unchanged, on BG/Q, and faster.

BG/Q vs BG/P: Applications Perspective from Early Science Program
Timothy J. Williams
Argonne Leadership Computing Facility
2013 MiraCon Workshop
Monday 3/4/2013
Session: 3:45-4:30pm

BG/P applications should run, unchanged, on BG/Q, and faster

First in Mira Queue: Early Science Program
16 projects
– Large target allocations
– Postdoc
Proposed runs between Mira acceptance and start of production
2 billion core-hours to burn in a few months
http://esp.alcf.anl.gov

16 ESP Projects
7 National Lab PIs
9 University PIs
Algorithms/Methods
– Structured Grids
– Unstructured Grids
– FFT
– Dense Linear Algebra
– Sparse Linear Algebra
– Particles/N-Body
– Monte Carlo
Science Areas
– Astrophysics
– Biology
– CFD/Aerodynamics
– Chemistry
– Climate
– Combustion
– Cosmology
– Energy
– Fusion Plasma
– Geophysics
– Materials
– Nuclear Structure

How Much Effort to “Port” to BG/Q?
Next 2 slides, efforts characterized as S=small, M=medium, L=large
– S: zero to a few days of effort, modifications to 0%-3% of existing lines of code
– M: a few weeks of effort, modifications to 3%-10% of existing lines of code
– L: a few months of effort, modifications beyond 10% of existing lines of code
Ranking based on estimates by people who actually did the work
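
The lines-of-code half of the small/medium/large scale can be sketched as a small C helper. This is only an illustration: `effort_t` and `classify_effort` are hypothetical names, the time-of-effort criterion (days, weeks, months) is ignored, and because the slide's ranges share their endpoints at 3% and 10%, assigning a shared endpoint to the smaller category is an assumption of the sketch.

```c
#include <assert.h>

/* Effort categories from the slide: small, medium, large. */
typedef enum { EFFORT_S, EFFORT_M, EFFORT_L } effort_t;

/* Hypothetical helper: classify a port by the fraction of existing
 * lines of code that were modified.
 * Assumption: a value exactly on a shared boundary (3% or 10%)
 * falls into the smaller category. */
effort_t classify_effort(long lines_changed, long lines_total)
{
    assert(lines_total > 0 && lines_changed >= 0);

    /* Integer comparison avoids floating-point rounding at the boundaries. */
    if (lines_changed * 100 <= lines_total * 3)
        return EFFORT_S;   /* 0% to 3% of lines modified */
    if (lines_changed * 100 <= lines_total * 10)
        return EFFORT_M;   /* above 3%, up to 10% */
    return EFFORT_L;       /* beyond 10% */
}
```

For example, `classify_effort(50, 1000)` (5% of lines touched) falls in the medium category under these assumptions.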

How Much Effort?
PI/affiliation | Code(s) | Magnitude of Changes | Nature of Changes
Balaji/GFDL | HIRAM | L | Improve OpenMP implementation, reformulate divergence-damping
Curtiss/ANL | QMCPACK | M | S to port, L to use QPX in key kernels; plan: nested OpenMP
Frouzakis/ETH | Nek5000 | S | Optimized small matrix-matrix multiply using QPX
Gordon/Iowa State | GAMESS | M | 64-bit addressing, thread integral kernels with OpenMP
Habib/ANL, UC | HACC | M | Short-range-force only: tree code
Harrison/ORNL | MADNESS | S | Threading runtime tuning; kernel tuning to use QPX
Jansen/U Colorado | PHASTA | S | Unchanged MPI-only performs well; OpenMP threaded in testing
Jordan/USC | AWP-ODC, SORD | S, M | None, Threading

How Much Effort?
PI/affiliation | Code(s) | Magnitude of Changes | Nature of Changes
Khokhlov/UC | HSCD | S | Tune OpenMP parameters, link optimized math libs
Lamb/UC | FLASH/RTFlame | S | OpenMP threading
Mackenzie/Fermilab | MILC, Chroma, CPS | L | Full threading, QPX intrinsics/assembler, kernel on SPI comm.
Moser/UTexas | PSDNS | S | Compile dependency libs, add OpenMP directives for threading
Pieper/ANL | GFMC | S | Tune no. threads & ranks
Roux/UC | NAMD, Charm++ | L | Threads, PAMI implementation of Charm++
Tang/Princeton | GTC | S | Improve OpenMP implementation
Voth/UC, ANL | NAMD, LAMMPS, RAPTOR | M | OpenMP threads & serial optimizations in RAPTOR/LAMMPS
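
Several entries above come down to tuning the number of MPI ranks and threads. A minimal sketch of the search space for one node follows. It assumes 64 hardware threads per BG/Q node (16 application cores with 4 hardware threads each), a figure the slides do not state, and restricts the split to powers of two. `enumerate_splits` and `HW_THREADS_PER_NODE` are hypothetical names, not part of any BG/Q tool.

```c
#include <assert.h>

/* Assumption (not from the slides): 16 cores x 4 hardware threads per node. */
#define HW_THREADS_PER_NODE 64

/* Hypothetical helper: list every power-of-two split of one node into
 * ranks-per-node x threads-per-rank that uses all hardware threads.
 * Writes at most `max` entries and returns how many were written. */
int enumerate_splits(int ranks[], int threads[], int max)
{
    int n = 0;

    assert(max > 0);
    for (int r = 1; r <= HW_THREADS_PER_NODE && n < max; r *= 2) {
        ranks[n] = r;                          /* MPI ranks on the node */
        threads[n] = HW_THREADS_PER_NODE / r;  /* threads per rank */
        n++;
    }
    return n;
}
```

Each returned pair is a candidate configuration to benchmark; which one performs best depends on the application.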

Areas of Effort
Threads
Communications
– One-sided
– Beneath MPI
Kernel optimizations
– QPX
– Code restructuring
Parallel I/O
Algorithms targeting Blue Gene architecture
BG/Q Tuned libraries
– Linear algebra
– Math functions
– FFTs
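
Threading is the most common area of effort in this deck, and in the simplest case it means adding an OpenMP directive to an existing loop. The sketch below shows that pattern on a generic vector update. The `axpy` function is a stand-in chosen for illustration and is not taken from any of the ESP codes.

```c
#include <stddef.h>

/* Illustrative kernel: y[i] += a * x[i].
 * The directive divides the loop iterations among OpenMP threads.
 * When compiled without OpenMP support, the pragma is ignored and
 * the loop runs serially with the same result. */
void axpy(size_t n, double a, const double *x, double *y)
{
    long count = (long)n;  /* signed loop index, accepted by all OpenMP versions */

    #pragma omp parallel for
    for (long i = 0; i < count; i++) {
        y[i] += a * x[i];
    }
}
```

Iterations are independent here, so no further synchronization is needed. Loops with reductions or shared updates would need additional clauses.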