
HPC Technology Track: Foundations of Computational Science

Lecture 2

Dr. Greg Wettstein, Ph.D.

Research Support Group Leader, Division of Information Technology

Adjunct Professor, Department of Computer Science

North Dakota State University

What is High Performance Computing?

Definition:

The solution of problems involving high degrees of computational complexity or data analysis which require specialized hardware and software systems.

What is Parallel Computing?

Definition:

A strategy of decreasing the time to solution of a computational problem by carrying out multiple elements of the computation at the same time.

Does HPC imply Parallel Computing?

Typically, but not always. HPC solutions may require specialized systems due to memory and/or I/O performance issues.

Conversely, parallel computing does not necessarily imply high performance computing.

Flynn's Taxonomy: Classification Strategy for Concurrent Execution

SISD Single Instruction, Single Data

MISD Multiple Instruction, Single Data

SIMD * Single Instruction, Multiple Data

MIMD * Multiple Instruction, Multiple Data

* = Relevant to HPC

SIMD: The Origin of HPC

Architectural model at the heart of 'vector processors'.

Performance enhancement in machines at the origin of HPC: CDC STAR-100 and Cray-1.

Utility predicated on the fact that mathematical operations on vectors or vector spaces are at the heart of linear algebra.

Vector Processing Diagram

[Figure: two sets of vector elements (vector length = 8 'words') combined element-wise by parallel mathematical operations: +, -, *, /.]
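
As an illustrative sketch (not from the slides), the kind of loop a vector processor collapses into vector instructions is a plain element-wise computation over arrays; the function name below is hypothetical:

    #include <stddef.h>

    /* Element-wise addition of two vectors: the canonical linear
     * algebra kernel a vector processor executes as one vector
     * instruction instead of n scalar additions. */
    void vector_add(const double *a, const double *b, double *c, size_t n)
    {
            size_t i;

            for (i = 0; i < n; ++i)
                    c[i] = a[i] + b[i];
    }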

Current SIMD Examples

Embedded in modern x86 and x86_64 architectures; primarily focused on graphics/signal processing: MMX, PNI, SSE2-4, AVX.

Foundation for the current trend in 'GPGPU computing': NVIDIA Tesla architecture.

Component of the Larrabee architecture.

SSE Implementation

[Figure: vector elements loaded into two 128-bit XMM registers; 100+ parallel operations (SSE4); the stride length sets how many elements each register operation consumes.]
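
A minimal sketch of the diagram in code, assuming SSE intrinsics (xmmintrin.h) and a hypothetical function name: four 32-bit floats fit in each 128-bit XMM register, so an 8-element vector is consumed with a stride of 4 in 2 loop iterations:

    #include <xmmintrin.h>      /* SSE intrinsics */

    /* Add two 8-element float vectors using 128-bit XMM registers.
     * Each _mm_add_ps adds four packed floats at once, so the loop
     * executes vector_length / stride = 8 / 4 = 2 times. */
    void sse_add8(const float *a, const float *b, float *c)
    {
            int i;

            for (i = 0; i < 8; i += 4) {                /* stride = 4 */
                    __m128 va = _mm_loadu_ps(a + i);    /* load 4 floats */
                    __m128 vb = _mm_loadu_ps(b + i);
                    _mm_storeu_ps(c + i, _mm_add_ps(va, vb));
            }
    }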

MIMD: Multiple Instruction, Multiple Data

Characterized by multiple execution threads operating on separate data elements.

Threads may operate in shared or disjoint (distributed) memory configurations.

Implementation example: SMP (Symmetric Multi-Processing); a minimal sketch follows.
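
A minimal shared-memory MIMD sketch, assuming Pthreads (not code from the lecture): two threads run different instruction streams on different data within one address space:

    #include <pthread.h>
    #include <stdio.h>

    /* Two threads, two instruction streams, two data elements:
     * MIMD on a shared-memory (SMP) system. */
    static double sum, product;

    static void *do_sum(void *arg)     { sum = 2.0 + 3.0;     return NULL; }
    static void *do_product(void *arg) { product = 2.0 * 3.0; return NULL; }

    int main(void)
    {
            pthread_t t1, t2;

            pthread_create(&t1, NULL, do_sum, NULL);
            pthread_create(&t2, NULL, do_product, NULL);
            pthread_join(t1, NULL);
            pthread_join(t2, NULL);
            printf("sum=%f product=%f\n", sum, product);
            return 0;
    }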

SPMD: The Basis for Modern HPC

Defined as multiple processes executing a single common program, with each process at its own point in the program.

Different from SIMD in that execution is not in lockstep.

Common implementations:

shared memory: OpenMP, Pthreads

distributed memory: MPI
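
A minimal SPMD sketch, assuming MPI (illustrative only): every rank executes the same program but proceeds from its own point, selecting its work by rank:

    #include <mpi.h>
    #include <stdio.h>

    /* Single program, multiple data: every rank runs this same
     * source, but each works on a rank-dependent slice and no rank
     * is required to stay in lockstep with the others. */
    int main(int argc, char *argv[])
    {
            int rank, size;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            printf("Rank %d of %d working on slice %d\n", rank, size, rank);

            MPI_Finalize();
            return 0;
    }

Launched as, e.g., mpirun -np 4 ./a.out, each of the four ranks prints a different line.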

Characteristics of MD Models

MIMD/SPMD requires active participation by the programmer to implement 'orthogonalization'.

SIMD requires active participation by the compiler with consideration by the programmer to support orthogonalization.

Orthogonalization, defn: The isolation of a problem into discrete elements capable of being independently resolved.
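
One way to see the definition in code (an illustrative sketch): the first loop's iterations are discrete and independently resolvable; the second carries a dependence between elements and is not orthogonal as written:

    #include <stddef.h>

    /* Orthogonal: each iteration touches only its own element, so
     * the problem decomposes into independently resolvable pieces. */
    void orthogonal(const double *a, const double *b, double *c, size_t n)
    {
            for (size_t i = 0; i < n; ++i)
                    c[i] = a[i] + b[i];
    }

    /* Not orthogonal as written: iteration i needs the result of
     * iteration i - 1, so the elements cannot be resolved
     * independently without restructuring. */
    void dependent(const double *a, double *c, size_t n)
    {
            for (size_t i = 1; i < n; ++i)
                    c[i] = c[i - 1] + a[i];
    }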

The Real World - A Continuum

Practical programs do not exhibit strict model partitioning.

A more pragmatic model is to consider the 'dimensions' of parallelism available to a program.

Currently a total of four dimensions of parallelism are exploitable.

Dimensions of Parallelism

First dimension: standard sequential programming with processor-supplied ILP (Instruction Level Parallelism). Referred to as 'free' or 'invisible' parallelism.

Second dimension: SIMD or OpenMP loop parallelism. Characterized by isolation of the problem onto a single system image; primarily supported by the programming language or compiler.
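
A minimal sketch of second-dimension parallelism, assuming OpenMP: the programmer marks an orthogonal loop and the compiler/runtime distribute it across the cores of a single system image:

    #include <stddef.h>

    /* Second-dimension parallelism: one directive and the
     * compiler/runtime split the independent iterations among the
     * available cores (compile with -fopenmp). */
    void scale(double *a, double s, size_t n)
    {
            size_t i;

    #pragma omp parallel for
            for (i = 0; i < n; ++i)
                    a[i] *= s;
    }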

Dimensions of Parallelism - cont.

Third dimension – two subtypes:

Use of MPI to partition the problem into orthogonal elements; partitioning is frequently implemented across multiple system images.

MIMD threading on a single system image; separate threads are dispatched to handle separate tasks which can execute asynchronously. A common HPC example is to 'thread' computation and Input/Output (I/O), as sketched below.
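
A hedged sketch of the computation/I/O threading subtype, assuming Pthreads and a double-buffering scheme (both are assumptions, not the lecture's code):

    #include <pthread.h>
    #include <stdio.h>

    static double done[1024];   /* finished timestep, being written  */
    static double next[1024];   /* timestep currently being computed */

    /* I/O thread: write the finished timestep to disk. */
    static void *write_step(void *arg)
    {
            FILE *out = fopen("step.dat", "w");

            if (out != NULL) {
                    fwrite(done, sizeof(done), 1, out);
                    fclose(out);
            }
            return NULL;
    }

    int main(void)
    {
            pthread_t io;
            int i;

            pthread_create(&io, NULL, write_step, NULL);

            /* Computation of the next step overlaps the write of
             * the previous one; the threads touch separate buffers. */
            for (i = 0; i < 1024; ++i)
                    next[i] = i * 0.5;

            pthread_join(io, NULL);
            return 0;
    }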

Dimensions of Parallelism - cont.

Fourth dimension: partitioning of the problem into orthogonal elements which can be dispatched to a heterogeneous instruction architecture.

Examples: GPGPU/CUDA, PowerXcell SPU, FPGA.

Depth of Parallelism

Measure of the complexity of parallelism implemented.

Simplest metric is the count of the number of programmer implemented dimensions of parallelism on a single system image.

Example: MPI implementation with SIMD loop vectorization on each node. Parallelism depth is two.

Parallelism Analysis Example

Process-based MIMD application. Depth = 1

MPI simulation with OpenMP loop vectorization. Depth = 2

MPI partitioning with CUDA PTree offload and SIMD loop vectorization. Depth = 3

Escalation of Complexity

Architectural decisions must be based on cost/benefit analysis of performance returns.

[Figure: implementation complexity grows from least to most as the dimension rises from 1 to 4 and the depth rises from 1 to N.]

Exercise

Verify you have the changeset which adds experimental code for SSE/SIMD-based boolean PTree operators.

Study the class methods implementing the AND and OR operators.

Review and understand how vector and stride length affect the number of times a loop needs to be executed; a hypothetical sketch follows.
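
The changeset itself is course material; purely as a hedged sketch of the idea, a hypothetical SSE2-based PTree AND might look as follows, with the loop trip count equal to the PTree length in bits divided by the 128-bit stride:

    #include <emmintrin.h>      /* SSE2 integer intrinsics */
    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical SSE-based boolean PTree AND: each iteration ANDs
     * one 128-bit stride, so the loop runs bits / 128 times, e.g. an
     * 8192-bit PTree needs 64 iterations. */
    void ptree_and(const uint64_t *a, const uint64_t *b, uint64_t *out,
                   size_t bits)
    {
            size_t i, words = bits / 64;

            for (i = 0; i < words; i += 2) {     /* stride = 128 bits */
                    __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
                    __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
                    _mm_storeu_si128((__m128i *)(out + i),
                                     _mm_and_si128(va, vb));
            }
    }

The OR operator would be identical with _mm_and_si128 replaced by _mm_or_si128; doubling the stride (e.g. 256-bit AVX registers) halves the number of loop iterations.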

goto skills_lecture1;
