Victor Haydin Head of R&D, ELEKS

Fast & Furious: building HPC solutions in a nutshell

DESCRIPTION

Slides from IT Weekend Ukraine conference presentation

Page 1: Fast & Furious: building HPC solutions in a nutshell

Victor Haydin

Head of R&D, ELEKS

Page 2: Fast & Furious: building HPC solutions in a nutshell

Agenda

1. What is HPC?
2. Why does somebody need it?
3. How to do it?

Page 3: Fast & Furious: building HPC solutions in a nutshell

What?

Page 4: Fast & Furious: building HPC solutions in a nutshell

Definition

Wikipedia: “High-performance computing (HPC) uses supercomputers and computer clusters to solve advanced computation problems. Today, computer systems approaching the teraflops-region are counted as HPC-computers.”

Page 5: Fast & Furious: building HPC solutions in a nutshell

Definition

advanced computation problems

Page 6: Fast & Furious: building HPC solutions in a nutshell

Modeling and Simulation

Page 7: Fast & Furious: building HPC solutions in a nutshell

Low-latency processing

Page 8: Fast & Furious: building HPC solutions in a nutshell

Big Data

Page 9: Fast & Furious: building HPC solutions in a nutshell

A.I.

Page 10: Fast & Furious: building HPC solutions in a nutshell

Supercomputers

Computer clusters

Teraflops performance

Page 11: Fast & Furious: building HPC solutions in a nutshell

HPC systems comparison

[Chart, logarithmic axis from 1 to 100,000,000. Categories: CPU (Intel Ivy Bridge), 100xCPU, GPU (NVIDIA Kepler), 100xGPU, IBM Sequoia. Label: HPC]

Page 12: Fast & Furious: building HPC solutions in a nutshell

Why?

Page 13: Fast & Furious: building HPC solutions in a nutshell

Finances

Page 14: Fast & Furious: building HPC solutions in a nutshell

Healthcare

Page 15: Fast & Furious: building HPC solutions in a nutshell

Fluid- and Aerodynamics

Page 16: Fast & Furious: building HPC solutions in a nutshell

Genetics

Page 17: Fast & Furious: building HPC solutions in a nutshell

Computer Vision and Image Processing

Page 18: Fast & Furious: building HPC solutions in a nutshell

How?

Page 19: Fast & Furious: building HPC solutions in a nutshell

Disclaimer

Page 20: Fast & Furious: building HPC solutions in a nutshell

Commodity Hardware

Page 21: Fast & Furious: building HPC solutions in a nutshell

VS.

Specialized

Page 22: Fast & Furious: building HPC solutions in a nutshell

GPU-based

Page 23: Fast & Furious: building HPC solutions in a nutshell

Example 1: Financial Risk Analysis Using the Monte Carlo Method on GPGPU
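The slides give only the title of this example, so the following is a minimal sketch of what a Monte Carlo risk estimate can look like, not the system presented in the talk. It assumes a hypothetical one-period model in which the portfolio return is normally distributed, and it estimates Value at Risk as a quantile of simulated losses. The function name `monteCarloVaR` and all parameters are illustrative assumptions. The sketch runs on the CPU; the point it illustrates is that every simulated path is independent, which is what makes the method a natural fit for one-path-per-thread execution on a GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical model (an assumption, not from the slides):
// one-period portfolio return ~ Normal(mean, stddev).
// Returns the loss level that is exceeded with probability (1 - confidence).
double monteCarloVaR(double mean, double stddev, double confidence,
                     std::size_t paths, unsigned seed) {
    assert(paths > 0 && confidence > 0.0 && confidence < 1.0);
    std::mt19937_64 rng(seed);
    std::normal_distribution<double> ret(mean, stddev);

    // Each path is independent of the others. On a GPU, each thread
    // could simulate its own path(s) using its own random stream.
    std::vector<double> losses(paths);
    for (std::size_t i = 0; i < paths; ++i) {
        losses[i] = -ret(rng);  // loss is the negative of the return
    }

    // VaR is the empirical quantile of the simulated loss distribution.
    std::size_t k = static_cast<std::size_t>(confidence * static_cast<double>(paths));
    if (k >= paths) k = paths - 1;
    std::nth_element(losses.begin(),
                     losses.begin() + static_cast<std::ptrdiff_t>(k),
                     losses.end());
    return losses[k];
}
```

For example, `monteCarloVaR(0.0, 0.02, 0.95, 200000, 42u)` should land close to the analytic value for this model, roughly 1.645 times the standard deviation.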

Page 28: Fast & Furious: building HPC solutions in a nutshell

Distribute

Page 29: Fast & Furious: building HPC solutions in a nutshell

Run

Page 30: Fast & Furious: building HPC solutions in a nutshell

Define

Page 31: Fast & Furious: building HPC solutions in a nutshell

Store

Page 32: Fast & Furious: building HPC solutions in a nutshell

Feed

Page 33: Fast & Furious: building HPC solutions in a nutshell

Present

Page 34: Fast & Furious: building HPC solutions in a nutshell

Survive

Page 35: Fast & Furious: building HPC solutions in a nutshell

High-level architecture

Page 36: Fast & Furious: building HPC solutions in a nutshell

Middleware

Page 37: Fast & Furious: building HPC solutions in a nutshell

Worker

Page 38: Fast & Furious: building HPC solutions in a nutshell

Example 2: Image Search Platform Using Local Feature Detection on GPGPU

Page 52: Fast & Furious: building HPC solutions in a nutshell

High-level architecture

Page 53: Fast & Furious: building HPC solutions in a nutshell

Middleware

Page 54: Fast & Furious: building HPC solutions in a nutshell

Load Balancing

Page 57: Fast & Furious: building HPC solutions in a nutshell

Unicast

[Chart: Unicast, 9 workers vs. 18 workers, axis from 0 to 140]

• Computation time – 1 second
• Sending time – 120 seconds!
• More workers – slower speed

Page 58: Fast & Furious: building HPC solutions in a nutshell

Multicast

[Chart: Unicast vs. Multicast, axis from 0 to 140]

• Computation time – 1 second
• Sending time – 25 seconds
• Almost 5 times faster
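The arithmetic behind these two slides can be written out as a small sketch. The three constants are the figures quoted on the slides; the helper names are hypothetical. The linear unicast model in `unicastSendSeconds` is an idealized assumption (one full copy transmitted per worker), offered only as one plausible reading of "more workers, slower speed", not as the measured behaviour of the system in the talk.

```cpp
#include <cassert>

// Figures quoted on the slides, in seconds.
const double kComputeSeconds = 1.0;
const double kUnicastSendSeconds = 120.0;
const double kMulticastSendSeconds = 25.0;

// Ratio of sending times: how much faster data distribution becomes.
double sendSpeedup(double before, double after) {
    assert(after > 0.0);
    return before / after;
}

// Fraction of total job time spent sending data rather than computing.
double sendFraction(double send, double compute) {
    assert(send + compute > 0.0);
    return send / (send + compute);
}

// Idealized assumption (not from the slides): with unicast the sender
// transmits one full copy per worker, so sending time grows linearly
// with the number of workers.
double unicastSendSeconds(double secondsPerCopy, int workers) {
    assert(workers > 0);
    return secondsPerCopy * workers;
}
```

With the slide figures, `sendSpeedup(120.0, 25.0)` is 4.8, which matches "almost 5 times faster". Even after the switch to multicast, `sendFraction(25.0, 1.0)` is about 0.96, so data distribution rather than computation still dominates the run.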

Page 59: Fast & Furious: building HPC solutions in a nutshell

Middleware

Page 60: Fast & Furious: building HPC solutions in a nutshell

Worker

Page 62: Fast & Furious: building HPC solutions in a nutshell

ERROR: CUDA ERROR CODE 30 (“UNKNOWN ERROR”)

Page 63: Fast & Furious: building HPC solutions in a nutshell

Run same code on CPU and GPU

Page 64: Fast & Furious: building HPC solutions in a nutshell

Kernel

CUDA_KERNEL foo(…)
{
    CUDA_DEFINE_PARAMS;
    // your code here
}

CUDA_CALL(threads, blocks, foo(…))

Page 65: Fast & Furious: building HPC solutions in a nutshell

Generated code

// GPU mode
__global__ void foo(…)
{
    // your code here
}

// CUDA launch configuration is <<<number of blocks, threads per block>>>
foo<<<blocks, threads>>>(…)

// CPU mode
void foo(…)
{
    // same code here
}

// LOOP OVER threads and blocks
{
    foo(…)
}
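The slides name the macros but do not show their definitions, so the following is a minimal sketch of how the CPU mode could be built, under stated assumptions. The macro bodies, the `Dim` struct, the `g_*` globals, and the `addOne` and `runAddOne` functions are all hypothetical, and only a one-dimensional index space is modelled. Only the CPU branch is shown; in a GPU build the same macro names would instead expand to `__global__` and a real kernel launch.

```cpp
#include <cassert>
#include <vector>

// 1D stand-in for CUDA's built-in index types (hypothetical).
struct Dim { unsigned x; };

// CPU mode: the "current" indices live in globals set by the launch loop.
static Dim g_threadIdx{0}, g_blockIdx{0}, g_blockDim{1}, g_gridDim{1};

#define CUDA_KERNEL void

// Gives the kernel body local names that mirror CUDA's built-ins.
#define CUDA_DEFINE_PARAMS                 \
    const Dim threadIdx = g_threadIdx;     \
    const Dim blockIdx = g_blockIdx;       \
    const Dim blockDim = g_blockDim

// Emulates a launch by calling the kernel once per (block, thread) pair.
#define CUDA_CALL(threads, blocks, call)                                  \
    do {                                                                  \
        assert((threads) > 0);                                            \
        g_blockDim.x = (threads);                                         \
        g_gridDim.x = (blocks);                                           \
        for (g_blockIdx.x = 0; g_blockIdx.x < g_gridDim.x; ++g_blockIdx.x) \
            for (g_threadIdx.x = 0; g_threadIdx.x < g_blockDim.x;         \
                 ++g_threadIdx.x)                                         \
                call;                                                     \
    } while (0)

// Example kernel written once against the macros.
CUDA_KERNEL addOne(float* data, unsigned n) {
    CUDA_DEFINE_PARAMS;
    unsigned i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;  // range check, as in a real kernel
}

// Host-side helper that launches the kernel over the whole vector.
std::vector<float> runAddOne(std::vector<float> v) {
    const unsigned n = static_cast<unsigned>(v.size());
    const unsigned threads = 4;
    const unsigned blocks = (n + threads - 1) / threads;
    CUDA_CALL(threads, blocks, addOne(v.data(), n));
    return v;
}
```

Calling `runAddOne({1.0f, 2.0f, 3.0f})` increments every element on the CPU, and the kernel body can be stepped through in an ordinary debugger. A sequential loop like this cannot honor a barrier such as `__syncthreads()`, because each emulated thread runs to completion before the next one starts.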

Page 66: Fast & Furious: building HPC solutions in a nutshell

Pros & Cons

Pros:
• Same code for CPU and GPU
• Debugging
• Range checking
• No CUDA ERROR 30

Cons:
• Shared memory
• __syncthreads()

Page 68: Fast & Furious: building HPC solutions in a nutshell

@victor_haydin

linkedin.com/in/victorhaydin

[email protected]

Page 69: Fast & Furious: building HPC solutions in a nutshell

Got a question? Ask!