
Parallel Computers
Organizations and Architecture

Department of Computer Science
Southern Illinois University Edwardsville

Summer, 2015

Dr. Hiroshi Fujinoki
E-mail: [email protected]

CS 312 Computer Organization and Architecture


Mult_Sched/001


Four hardware architectures for “parallel computers”

Tightly-Coupled Multi-Processor System

Functionally-Specialized Multi-Processor System

Loosely-Coupled Multi-Processor System

Distributed Systems (“most loosely coupled systems”)



Mult_Sched/002

Tightly-Coupled Multi-Processor System

• Multi-Processor System (multi-processor motherboard)

• Single-Processor System with a multi-core processor

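[Figure: two motherboards side by side. Multi-Processor System: one motherboard carrying multiple processors. Single-Processor System with multi-core processor: one processor containing multiple processor cores (ALU and others).]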


Mult_Sched/002

Tightly-Coupled Multi-Processor System

• Multi-Processor System (multi-processor motherboard)


Two processors on a motherboard


Mult_Sched/002

Tightly-Coupled Multi-Processor System


• Single-Processor System with a multi-core processor

CPU cores


Mult_Sched/003

Functionally-Specialized Multi-Processor System

Examples:
• GPU on a graphics card (GPU = “Graphics Processing Unit”)
• Built-in processor on high-speed disk controllers or NICs (especially those using DMA)

[Figure: the processor on the motherboard connects through the graphics interface to a graphics card that holds the GPU, video RAM (“VRAM”), and a DAC; the graphics card drives the monitor (CRT or flat panel).]

• The processor sends graphics commands to the GPU
• The GPU processes image data in the graphics-card memory
• The graphics card performs D/A conversion using the DAC
• The graphics card sends analog image signals (RGB signals) to the monitor


Mult_Sched/003

Functionally-Specialized Multi-Processor System

Examples:
• GPU on a graphics card (GPU = “Graphics Processing Unit”)


[Figure: DMA SCSI I/O card with its own CPU and control program (in ROM)]


Mult_Sched/004

Loosely-Coupled Multi-Processor System

• Multi-Systemboard (multiple motherboard) computers

[Figure: several system boards (motherboards), each with its own processor and memory, connected by the computer system “bus”]

• A computer with multiple motherboards (“blades”)

• Blades communicate through the bus

• Each blade is a computer

• Communication delay over the bus: at least “μs” order


Mult_Sched/004

Loosely-Coupled Multi-Processor System

• Multi-Systemboard (multiple motherboard) computers


Mult_Sched/005

Distributed Systems (“most loosely coupled systems”)

[Figure: four systems (AS 1, AS 2, AS 3, AS 4) connected by a network]

Each system has its own:
• Processor
• Local Memory
• Secondary Storage
• Other I/O

Two kinds of migration over the network:
• Process Migration: a process (executable code) moves between systems
• Data Migration: a file (data) moves between systems


Mult_Sched/006

Three different types of tightly-coupled multi-processor systems

(1) “Fine-grained” multi-processor parallel computers

(2) “Medium-grained” multi-processor parallel computers

(3) “Coarse-grained” multi-processor parallel computers


Mult_Sched/007

Fine-Grained Multi-Process

• Fine-grained = instruction-level multi-processing

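Your program (binary executable):

A = B + C;
X = Y + Z;
W = A + X;

• A = B + C and X = Y + Z do not depend on each other, so two CPUs can execute them at the same time
• W = A + X depends on both results (dependency), so the two CPUs must synchronize before it executes

Granularity: 1~20 instructions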
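The dependency among the three statements above (A = B + C, X = Y + Z, W = A + X) can be sketched in Python. This is only an illustration of the ordering constraint: two threads stand in for the two CPUs, the input values and the helper names `compute_a` and `compute_x` are assumptions, and starting a thread costs far more than 1~20 instructions, so no speed-up should be expected from it.

```python
import threading

# Illustrative input values (assumptions, not from the slides).
B, C, Y, Z = 1, 2, 3, 4
results = {}

def compute_a():
    # A = B + C does not depend on X, so it can run on one CPU.
    results["A"] = B + C

def compute_x():
    # X = Y + Z does not depend on A, so it can run on the other CPU.
    results["X"] = Y + Z

t1 = threading.Thread(target=compute_a)
t2 = threading.Thread(target=compute_x)
t1.start()
t2.start()

# Synchronization point: W = A + X needs both results.
t1.join()
t2.join()

W = results["A"] + results["X"]
print(W)  # 10
```

The two `join()` calls play the role of the "synchronization" label in the slide: W is not computed until both independent statements have finished.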


Mult_Sched/008

Medium-Grained Multi-Process

• Medium-grained = thread-level multi-processing

[Figure: your program (binary executable) is divided into ThreadA, ThreadB, ThreadC, and ThreadD, which run on two processors]


Mult_Sched/009

Medium-Grained Multi-Process

• Example: Web Browser

ThreadA -- Display thread (text output & JPEG image processing)

ThreadB -- Taking user inputs (edit boxes, radio buttons in the browser window)

ThreadC -- Network input (receiving data from network)

ThreadD -- Network output (sending data to network)

[Figure: timeline for ThreadA, ThreadB, ThreadC, and ThreadD; activities shown: receiving data, displaying data, user makes inputs, receiving data, transmitting data]


Mult_Sched/010

Medium-Grained Multi-Process

• Example: Web Browser

ThreadA -- Display thread (text output & JPEG image processing)

ThreadB -- Taking user inputs (edit boxes, radio buttons in the browser window)

ThreadC -- Network input (receiving data from network)

ThreadD -- Network output (sending data to network)

[Figure: timeline with ThreadA, ThreadB, ThreadC, and ThreadD side by side; activities shown: receiving data, displaying data, user makes inputs, receiving data, transmitting data]

Browser execution with better responses

Granularity: 20~200 instructions
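A minimal Python sketch of the four browser threads, under stated assumptions: the two queues, the sample packets and inputs, and the idea that user inputs are handed to the network-output thread are all illustrative choices, not details given in the slides. It shows only how the four roles can run as separate threads that hand work to each other.

```python
import queue
import threading

received = queue.Queue()   # ThreadC -> ThreadA (assumed hand-off)
outgoing = queue.Queue()   # ThreadB -> ThreadD (assumed hand-off)
displayed, sent = [], []
STOP = None                # sentinel marking the end of a stream

def display_thread():          # ThreadA: display received data
    while (item := received.get()) is not STOP:
        displayed.append(f"display {item}")

def user_input_thread():       # ThreadB: take user inputs
    for text in ["click link", "submit form"]:
        outgoing.put(text)
    outgoing.put(STOP)

def network_input_thread():    # ThreadC: receive data from the network
    for packet in ["page.html", "photo.jpg"]:
        received.put(packet)
    received.put(STOP)

def network_output_thread():   # ThreadD: send data to the network
    while (item := outgoing.get()) is not STOP:
        sent.append(f"send {item}")

workers = (display_thread, user_input_thread,
           network_input_thread, network_output_thread)
threads = [threading.Thread(target=w) for w in workers]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(displayed)
print(sent)
```

Because each list is filled by exactly one thread reading a FIFO queue, the order within each list is fixed even though the four threads run concurrently.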


Mult_Sched/011

Coarse-Grained Multi-Process

• Coarse-grained = process-level multi-tasking

Process assignment to multiple processors in multi-tasking environment

[Figure: processes in memory assigned to processors, plotted against time]
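One way to picture process assignment to multiple processors is the small Python simulation below. It is a sketch under assumptions, not the scheduling policy of any particular operating system: the function name `assign`, the process names, and the run times are hypothetical, each process runs to completion without time slicing, and each process goes to whichever processor becomes free first.

```python
import heapq

def assign(processes, num_processors):
    """Give each (name, run_ms) to the processor that becomes free first."""
    # Heap of (time_when_free_ms, processor_id); all processors start idle.
    pool = [(0, cpu) for cpu in range(num_processors)]
    heapq.heapify(pool)
    schedule = []
    for name, run_ms in processes:
        start, cpu = heapq.heappop(pool)
        schedule.append((name, cpu, start, start + run_ms))
        heapq.heappush(pool, (start + run_ms, cpu))
    return schedule

# Hypothetical processes with run times in milliseconds.
processes = [("P1", 100), ("P2", 50), ("P3", 30), ("P4", 80)]
schedule = assign(processes, num_processors=2)
for name, cpu, start, end in schedule:
    print(f"{name}: processor {cpu}, {start} ms to {end} ms")
```

With these inputs, P1 occupies processor 0 for 100 ms while P2, P3, and P4 run one after another on processor 1.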


Mult_Sched/012

Coarse-Grained Multi-Process

• Coarse-grained = process-level multi-tasking

Process assignment to multiple processors in multi-tasking environment

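[Figure: processes in memory assigned to a processor pool, plotted against time]

Granularity = ms order (assuming about one instruction per clock cycle)

• 1 ms (@ 1 GHz) = 1 million instructions

• 100 ms (@ 1 GHz) = 100 M instructions

Granularity: 1~100 M instructions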
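The two granularity figures above follow from multiplying the time slice by the clock rate. A short Python check, assuming one instruction per clock cycle (the function name `instructions_in` and its parameters are illustrative):

```python
def instructions_in(duration_ms, clock_hz=1_000_000_000, instructions_per_cycle=1):
    """Instructions executed in duration_ms at the given clock rate."""
    # Integer arithmetic: cycles = clock_hz * (duration_ms / 1000).
    return duration_ms * clock_hz * instructions_per_cycle // 1000

one_ms = instructions_in(1)
hundred_ms = instructions_in(100)
print(one_ms)      # 1000000   (1 million)
print(hundred_ms)  # 100000000 (100 M)
```

A processor that completes more or fewer than one instruction per cycle would scale these numbers accordingly through `instructions_per_cycle`.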
