BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1

Age of parallelism

Single-CPU performance
  Doubled every 2 years for 30 years, until 5 years ago
  Marginal improvement in the last 5 years

The year 2005: hitting the walls
  Memory wall
  Power wall
  Processor design complexity

Sequential or parallel: this is the problem!
  More cores rather than a higher clock rate

Early parallel computing

It was not yet a big idea
  Confined to mainframes and supercomputers

And now GPUs

Stands for “Graphics Processing Unit”
Integration scheme: a card on the motherboard
Massively parallel computing power

A desktop supercomputer

History of parallel computing

GPUs: A Brief History

Stage 0: Graphics accelerators
  Early VGA cards accelerated 2D GUIs
  Configurable only

Stage 1: Fixed graphics hardware
  Graphics-only platform
  Very limited programmability

Stage 2: GPGPU
  Trick the GPU into doing general-purpose computing
  Programmable, but requires knowledge of computer graphics

Stage 3: Stream processing platforms
  High-level programming interface
  No knowledge of computer graphics is required
  Examples: NVIDIA’s CUDA, OpenCL

Stream Processing Characteristics

Fairly simple computation on huge amounts of data (streams)
Single Program, Multiple Data (SPMD)
Data parallelism, e.g., matrix operations, image processing
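
As a rough illustration of these characteristics, the sketch below applies one small program to every element of a large array, which is the SPMD pattern in its simplest form. It is only a sketch: the kernel name `saxpy`, the host helper `runSaxpy`, the array size, and the block size of 256 are assumptions chosen for the example and do not come from the slides. Unified memory (`cudaMallocManaged`) is used only to keep the code short.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Single program, multiple data: every thread runs this same code,
// but each one works on a different element of the stream.
__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // element owned by this thread
    if (i < n) {                                    // the last block may be partly idle
        y[i] = a * x[i] + y[i];
    }
}

// Hypothetical host helper: runs y = 3*x + y with x = 1 and y = 2, then
// returns the largest deviation from the expected value 5 (or -1 on error).
float runSaxpy(int n) {
    float *x = nullptr, *y = nullptr;
    if (cudaMallocManaged(&x, n * sizeof(float)) != cudaSuccess ||
        cudaMallocManaged(&y, n * sizeof(float)) != cudaSuccess) {
        cudaFree(x);
        cudaFree(y);
        return -1.0f;
    }
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    const int threads = 256;                          // assumed block size
    const int blocks = (n + threads - 1) / threads;   // enough blocks to cover n
    saxpy<<<blocks, threads>>>(n, 3.0f, x, y);
    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(x);
        cudaFree(y);
        return -1.0f;
    }

    float maxErr = 0.0f;
    for (int i = 0; i < n; ++i) maxErr = fmaxf(maxErr, fabsf(y[i] - 5.0f));
    cudaFree(x);
    cudaFree(y);
    return maxErr;
}

int main() {
    printf("max error: %f\n", runSaxpy(1 << 20));
    return 0;
}
```

The computation per element is trivial; the benefit comes from running it over about a million elements at once, which is the kind of workload (matrix operations, image processing) named above.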

Graphics accelerators to CUDA GPUs (cont.)

CUDA programming model

CPU + GPU heterogeneous programming
  Applications with sequential and parallel parts

Host: CPU
  Sequential threads

Device: GPU
  Parallel threads in a SIMT architecture
  Kernels that run on a grid of threads
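
The host/device split described above can be sketched as follows. The host (CPU) prepares the input sequentially, copies it to the device, launches a kernel over a grid of thread blocks, copies the result back, and finishes with a sequential sum. The names `squareKernel` and `sumOfSquares`, the problem size, and the block size of 128 are illustrative assumptions, not taken from the slides.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Device code: a kernel, executed once by every thread in the grid.
__global__ void squareKernel(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) out[i] = in[i] * in[i];
}

// Hypothetical host helper: returns the sum of i*i for i in [0, n),
// or -1 if a CUDA call fails.
double sumOfSquares(int n) {
    // Sequential part on the host: prepare the input.
    std::vector<float> hostIn(n), hostOut(n);
    for (int i = 0; i < n; ++i) hostIn[i] = static_cast<float>(i);

    // Host and device have separate memories, so copy the input over.
    float *devIn = nullptr, *devOut = nullptr;
    const size_t bytes = n * sizeof(float);
    if (cudaMalloc(&devIn, bytes) != cudaSuccess ||
        cudaMalloc(&devOut, bytes) != cudaSuccess) {
        cudaFree(devIn);
        cudaFree(devOut);
        return -1.0;
    }
    cudaMemcpy(devIn, hostIn.data(), bytes, cudaMemcpyHostToDevice);

    // Parallel part on the device: a grid of blocks, each with 128 threads.
    dim3 block(128);
    dim3 grid((n + block.x - 1) / block.x);
    squareKernel<<<grid, block>>>(devIn, devOut, n);

    // This blocking copy also waits for the kernel to finish.
    cudaError_t err =
        cudaMemcpy(hostOut.data(), devOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(devIn);
    cudaFree(devOut);
    if (err != cudaSuccess) return -1.0;

    // Back to sequential host code.
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += hostOut[i];
    return sum;
}

int main() {
    printf("sum of squares: %.0f\n", sumOfSquares(1000));
    return 0;
}
```

Explicit `cudaMalloc` and `cudaMemcpy` calls are used here on purpose, since they make the boundary between host memory and device memory visible in the code.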

CUDA programming model

CUDA programming model (cont.)

GPU Architecture (NVIDIA)

GPU Architecture (Fermi)

SM architecture

CUDA programming model

Memory types

Per thread
  Registers
  Local memory

Per block
  Shared memory

Per grid
  Global memory
  Constant memory
  Texture memory
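
To make these scopes concrete, the following sketch touches most of the memory types in one kernel: per-thread registers, per-block shared memory, and per-grid global and constant memory (texture memory is left out for brevity). The names `scale`, `blockSum`, `runBlockSum`, and `BLOCK_SIZE` are assumptions made for this example; the kernel also assumes it is launched with exactly `BLOCK_SIZE` threads per block and that `BLOCK_SIZE` is a power of two.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK_SIZE 256  // assumed block size; must be a power of two here

// Constant memory: read-only inside kernels, visible to the whole grid.
__constant__ float scale;

// Each block sums its slice of `in` (scaled) and writes one partial sum.
__global__ void blockSum(const float* in, float* blockSums, int n) {
    // Shared memory: one copy per block, visible to all threads of that block.
    __shared__ float tile[BLOCK_SIZE];

    // Scalars such as tid and i are private to each thread (registers).
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    // Global memory (in, blockSums) is visible to every thread in the grid.
    tile[tid] = (i < n) ? in[i] * scale : 0.0f;
    __syncthreads();

    // Tree reduction inside the block, carried out in shared memory.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) tile[tid] += tile[tid + stride];
        __syncthreads();
    }
    if (tid == 0) blockSums[blockIdx.x] = tile[0];
}

// Hypothetical host helper: sums n ones scaled by hostScale (-1 on error).
float runBlockSum(int n, float hostScale) {
    const int blocks = (n + BLOCK_SIZE - 1) / BLOCK_SIZE;
    float *in = nullptr, *blockSums = nullptr;
    if (cudaMallocManaged(&in, n * sizeof(float)) != cudaSuccess ||
        cudaMallocManaged(&blockSums, blocks * sizeof(float)) != cudaSuccess) {
        cudaFree(in);
        cudaFree(blockSums);
        return -1.0f;
    }
    for (int i = 0; i < n; ++i) in[i] = 1.0f;

    // Constant memory is written from the host before the launch.
    cudaMemcpyToSymbol(scale, &hostScale, sizeof(float));
    blockSum<<<blocks, BLOCK_SIZE>>>(in, blockSums, n);

    float total = -1.0f;
    if (cudaDeviceSynchronize() == cudaSuccess) {
        total = 0.0f;
        for (int b = 0; b < blocks; ++b) total += blockSums[b];
    }
    cudaFree(in);
    cudaFree(blockSums);
    return total;
}

int main() {
    // 1024 ones scaled by 2 should add up to 2048.
    printf("total: %.1f\n", runBlockSum(1024, 2.0f));
    return 0;
}
```

Staging the data in shared memory means the repeated reads and writes of the reduction stay on chip, and only one value per block is written back to global memory.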

Memory types (cont.)

Questions?