
Page 1: Compiler and Tools: User Requirements from ARSC

Ed Kornkven, Arctic Region Supercomputing Center DSRC

[email protected]

HPC User Forum September 10, 2009

Page 2: Outline

• ARSC and our user community
• User issues and eight needs they have in the HPC environment
• Conclusions

Page 3: About ARSC

• HPCMP DoD Supercomputing Resource Center, est. 1993
• Owned and operated by the University of Alaska Fairbanks
• Provides HPC resources & support
  – Cray XT5, 3456 cores
  – Sun cluster, 2312 cores
• Supports and conducts research

Page 4: ARSC User Community

• An HPCMP DoD Supercomputing Resource Center
  – Support of DoD computational research priorities
  – Open research (publishable in open research journals)
• Non-DoD academic research
  – ARSC supports high-performance computational research in science and engineering, with an emphasis on high latitudes and the Arctic
• In-house research
  – Oceanography, space physics
  – Heterogeneous computing technologies, multicore systems
• ARSC supports about 300 users; HPCMP about 4,000

Page 5: HPCMP Application Areas

• HPCMP projects are defined by ten Computational Technology Areas (CTAs)
  – Computational Structural Mechanics; Computational Fluid Dynamics; Computational Biology, Chemistry and Materials Science; Computational Electromagnetics and Acoustics; Climate/Weather/Ocean Modeling and Simulation; Signal/Image Processing; Forces Modeling and Simulation; Environmental Quality Modeling and Simulation; Electronics, Networking, and Systems/C4I; Integrated Modeling and Test Environments
  – These CTAs encompass many application codes
    • Mostly parallel, with varying degrees of scalability
    • Commercial, community-developed, and home-grown
    • Unclassified and classified

Page 6: HPCMP Application Suite

• The suite is used for benchmarking purposes including system health monitoring, procurement evaluation, and acceptance testing
  – Contains applications and test cases
• Composition of the suite fluctuates according to current and projected use
  – Past apps include WRF
• Significance: believed to represent the Program's workload

Page 7: HPCMP Application Suite

• AMR – gas dynamics using Adaptive Mesh Refinement
• GAMESS – quantum chemistry
• HYCOM – primitive equation ocean circulation
• LAMMPS – molecular dynamics
• AVUS – fluid flow and turbulence generated by air vehicles
• CTH – shock physics
• ICEPIC – particle-in-cell magnetohydrodynamics

Page 8: HPCMP Application Suite

• AMR – C++/Fortran, MPI, 40,000 SLOC
• GAMESS – Fortran, MPI, 330,000 SLOC
• HYCOM – Fortran, MPI, 31,000 SLOC
• LAMMPS – C++, MPI, 45,400 SLOC
• AVUS – Fortran, MPI, 19,000 SLOC
• CTH – ~43% Fortran / ~57% C, MPI, 436,000 SLOC
• ICEPIC – C, MPI, 60,000 SLOC

Page 9: ARSC Academic Codes

• Non-DoD users' codes have similar profiles
  – Many are community codes
    • E.g., WRF, ROMS, CCSM, Espresso, NAMD
  – Also some commercial (e.g., Fluent) and home-grown
• Predominantly MPI + Fortran/C; some OpenMP/hybrid (a sketch of the hybrid pattern follows)
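
To make that profile concrete, here is a minimal sketch of the hybrid MPI + OpenMP pattern such codes typically follow. It is illustrative only: the program name, array size, and the work (a simple sum) are invented for the example.

```fortran
! Minimal hybrid MPI + OpenMP sketch (illustrative; not from any ARSC code).
! MPI distributes work across nodes; OpenMP threads share memory within one.
program hybrid_sketch
   use mpi
   implicit none
   integer, parameter :: n = 1000000
   integer :: ierr, rank, nranks, provided, i
   real(8) :: local_sum, global_sum
   real(8), allocatable :: x(:)

   ! FUNNELED: only the master thread makes MPI calls.
   call MPI_Init_thread(MPI_THREAD_FUNNELED, provided, ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, nranks, ierr)

   allocate(x(n))
   x = real(rank + 1, 8)

   ! OpenMP threads reduce this rank's share of the data.
   local_sum = 0.0d0
   !$omp parallel do reduction(+:local_sum)
   do i = 1, n
      local_sum = local_sum + x(i)
   end do
   !$omp end parallel do

   ! MPI combines the per-rank partial sums across nodes.
   call MPI_Reduce(local_sum, global_sum, 1, MPI_DOUBLE_PRECISION, &
                   MPI_SUM, 0, MPI_COMM_WORLD, ierr)
   if (rank == 0) print *, 'global sum =', global_sum
   call MPI_Finalize(ierr)
end program hybrid_sketch
```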

Page 10: Need #1

• Protect our code investment by supporting our legacy code base
  – MPI-based codes will be around for a while
  – Some are scaling well, even to 10^4 cores (our largest machines)
  – Many are not – lots of users still use 10^2 cores or fewer
  – Some single-node codes might be able to take advantage of many cores

Page 11: Parallel Programming is Too Unwieldy

• Memory hierarchy stages have different "APIs"
  – CPU / registers – mostly invisible (handled by compiler)
  – Caches – code restructuring for reuse (see the tiling sketch after this list); possibly explicit cache management calls; may have to handle levels differently
  – Socket memory – maintain memory affinity of processes/threads
  – Node memory – explicit language features (e.g., Fortran refs/defs)
  – Off-node memory – different explicit language features (MPI calls)
  – Persistent storage – more language features (I/O, MPI-IO calls)
• Other things to worry about
  – TLB misses
  – Cache bank conflicts
  – New memory layers (e.g., SSD), effect of multicore on memory performance, …
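
As one illustration of the cache-level "API", here is a minimal loop-tiling sketch in Fortran. The routine and the tile size nb are invented for the example; the point is that cache reuse comes from restructuring loops, not from calling any actual API.

```fortran
! Minimal cache-blocking (loop tiling) sketch: transpose a into b.
! nb is an assumed, machine-dependent tile size; tune it per cache level.
subroutine tiled_transpose(a, b, n, nb)
   implicit none
   integer, intent(in) :: n, nb
   real(8), intent(in)  :: a(n, n)
   real(8), intent(out) :: b(n, n)
   integer :: i, j, ii, jj

   do jj = 1, n, nb
      do ii = 1, n, nb
         ! Work on one nb-by-nb tile at a time so the touched parts
         ! of both a and b stay cache-resident while being processed.
         do j = jj, min(jj + nb - 1, n)
            do i = ii, min(ii + nb - 1, n)
               b(j, i) = a(i, j)
            end do
         end do
      end do
   end do
end subroutine tiled_transpose
```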

Page 12: Need #2

• Help with the complexity of parallel programming, esp. managing memory
• State-of-the-art is to be an expert in
  – Architectural features (which constantly change)
  – Multiple languages (Fortran, MPI, OpenMP)
  – Performance analysis tools
  – Coding tricks (which depend on architecture)

Page 13

Q: Why do so few of our users use performance tools?
Does the average user have no incentive? – or –
Have they given up because it seems too difficult?

Page 14: Need #3

• Users need to understand what the "performance game" is, and they need tools to help them win.
  – Remember the days of "98% vectorized"?
  – What expectations (metrics) should users have for their code on today's machines? (It must not be utilization.) See the sketch below.
  – What will the rules be in a many-core world?
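
One candidate metric users can compute themselves is strong-scaling efficiency, T(1) / (P · T(P)) across runs at different rank counts. The sketch below times the region of interest with MPI_Wtime; solve_step is a hypothetical placeholder for the application's real work, not a routine from any ARSC code.

```fortran
! Sketch: measure the elapsed time T(P) of a code region on P ranks.
! Strong-scaling efficiency is then T(1) / (P * T(P)) across runs.
program scaling_probe
   use mpi
   implicit none
   integer :: ierr, rank, nranks
   real(8) :: t0, t1, tlocal, tmax

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, nranks, ierr)

   call MPI_Barrier(MPI_COMM_WORLD, ierr)   ! align start times
   t0 = MPI_Wtime()
   ! call solve_step()                      ! hypothetical region of interest
   t1 = MPI_Wtime()
   tlocal = t1 - t0

   ! The slowest rank determines elapsed time for the whole job.
   call MPI_Reduce(tlocal, tmax, 1, MPI_DOUBLE_PRECISION, &
                   MPI_MAX, 0, MPI_COMM_WORLD, ierr)
   if (rank == 0) print *, 'ranks =', nranks, ' time =', tmax, ' s'
   call MPI_Finalize(ierr)
end program scaling_probe
```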

Page 15: Beyond Fortran & MPI

1. We do have some codes based on other parallel models or languages, e.g.
  – Charm++ – NAMD, ChaNGa
  – Linda – Gaussian (as an optional feature)
  – PETSc – e.g., PISM (Parallel Ice Sheet Model)
• These illustrate some willingness (or need) in our community to break out of the Fortran/MPI box
• However: the pool of expertise outside the box is even smaller than for MPI.

Page 16: HPCMP Investments in Software

2. HPCMP is investing in new software and software development methodologies
• E.g., the PET and CREATE programs
  – User education
  – Modern software engineering methods
  – Transferable techniques and/or code
  – Highly scalable codes capable of speeding up decision-making ability

Page 17: "New" Programming Models

3. At ARSC, we are interested in PGAS languages for improving productivity in new development
  – Have a history with PGAS languages
    • Collaboration with GWU team (UPC)
    • Experience with Tsunami model: parallelization using CAF took days, vs. weeks with MPI (see the sketch below)
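
For readers who have not seen CAF, the fragment below sketches the halo-exchange pattern that in MPI would require matched send/receive pairs. It is illustrative only, not code from the tsunami model.

```fortran
! Minimal Coarray Fortran (CAF) halo-exchange sketch (illustrative only).
! Each image owns n interior cells plus two halo cells; neighbor data is
! read directly with coarray syntax instead of MPI send/receive calls.
program caf_halo
   implicit none
   integer, parameter :: n = 1000
   real(8) :: u(0:n+1)[*]             ! coarray: one copy per image
   integer :: me, np

   me = this_image()
   np = num_images()
   u = real(me, 8)                    ! initialize local copy, halos included

   sync all                           ! make sure neighbors' data is ready
   if (me > 1)  u(0)   = u(n)[me-1]   ! pull left neighbor's boundary cell
   if (me < np) u(n+1) = u(1)[me+1]   ! pull right neighbor's boundary cell
   sync all

   if (me == 1) print *, 'image 1 halos:', u(0), u(n+1)
end program caf_halo
```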

Page 18: Need #4

• High-performing implementations of new programming models
  – For starters, timely implementations of co-array features of Fortran 2008
  – Users need some confidence that their investments in these languages will be safe, since their codes will outlive several hardware generations and perhaps the languages themselves.

Page 19: Beyond Fortran & MPI

4. Heterogeneous processors
  – Had a Cray XD1 with FPGAs
    • Very little use
  – Cell processors and GPUs
    • PlayStation cluster
    • IBM QS22

Page 20: Need #5

• Easier code development for heterogeneous environments.
  – Cell processors, GPUs and FPGAs have tempting performance, but
  – For most users, the effort required to use these accelerators is too high.
  – Work underway in these areas is encouraging.

Page 21: 5. Multicore Research

• In collaboration with GWU, we are seeking to better understand multicore behavior on our machines.
• Codes based on Charm++ (NAMD and ChaNGa) performed better on our 16-core nodes than the MPI-based codes we tested.

Page 22: Need #6

• We need models and methods to effectively use many cores.
  – Who doesn't?
• Could the potential of many-core processors go untapped?
  – Vector processors weren't universally accepted because not all apps were a good fit.
  – If users don't find a fit with many cores, they will still need to compute.
  – It's up to CS, not users, to make multicore work.

Page 23: Need #7

• Corollary to other requirements: provide new avenues to productive development, but allow them to be adopted incrementally.
  – Probably implies good language interoperability (see the sketch below)
  – Tools for analyzing code and giving advice, not just statistics
  – Automatically fix code, or show where the new language will most help
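
One concrete form of that interoperability is sketched below: Fortran 2003's standard ISO_C_BINDING lets legacy Fortran call a routine written in C (or any C-callable language). The routine name c_saxpy is hypothetical.

```fortran
! Sketch of incremental adoption via Fortran/C interoperability.
! A legacy Fortran code calls one new routine through the standard
! ISO_C_BINDING interface. The routine name c_saxpy is hypothetical.
module interop_sketch
   use iso_c_binding
   implicit none
   interface
      ! Matches a C function:
      !   void c_saxpy(int n, float a, const float *x, float *y);
      subroutine c_saxpy(n, a, x, y) bind(C, name="c_saxpy")
         import :: c_int, c_float
         integer(c_int), value :: n
         real(c_float),  value :: a
         real(c_float), intent(in)    :: x(*)
         real(c_float), intent(inout) :: y(*)
      end subroutine c_saxpy
   end interface
end module interop_sketch
```

The appeal of this route is exactly the incrementality asked for above: a legacy code can adopt new-language components one routine at a time, leaving the rest of the Fortran untouched.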

Page 24: Users Run Codes

• Our users want to do science.
  – For many users, code development is a negligible part of their HPC use.
  – For all of them, it isn't the main part.
  – Most will spend more time running programs than writing them.

Page 25: Need #8

• Users need help with the process of executing their codes.
  – Setting up runs
  – Launching jobs
  – Monitoring job progress
  – Checkpoint/restart (see the sketch after this list)
  – Storing output (TBs on ARSC machines)
  – Cataloguing results
  – Data analysis and visualization
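
To make one of these concrete, here is a minimal application-level checkpoint/restart sketch in Fortran. The file name, step counter, and state array are all invented for the example; codes running at scale would more likely use MPI-IO or a parallel I/O library.

```fortran
! Minimal application-level checkpoint/restart sketch (illustrative only).
! The file name, step counter, and state array u are hypothetical.
module checkpoint_sketch
   implicit none
contains
   subroutine write_checkpoint(step, u)
      integer, intent(in) :: step
      real(8), intent(in) :: u(:)
      integer :: unit
      open(newunit=unit, file='restart.chk', form='unformatted', &
           status='replace', action='write')
      write(unit) step, size(u)      ! header: where we were, how much state
      write(unit) u                  ! the state itself
      close(unit)
   end subroutine write_checkpoint

   subroutine read_checkpoint(step, u)
      integer, intent(out) :: step
      real(8), allocatable, intent(out) :: u(:)
      integer :: unit, n
      open(newunit=unit, file='restart.chk', form='unformatted', &
           status='old', action='read')
      read(unit) step, n
      allocate(u(n))
      read(unit) u
      close(unit)
   end subroutine read_checkpoint
end module checkpoint_sketch
```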

Page 26: Conclusion

• We are ready for new parallel programming paradigms.

• Much science is being done with today’s machines, so “First do no harm” applies.

• There are still plenty of opportunities for innovators to make a difference in making HPC more productive for users.
