Multiprocessing :: Slide 1 of 26 :: David Rye :: MTRX 3700

Multiprocessor & Multicomputer Organisation

Parallel and Distributed Computing

Multiprocessors and Multicomputers

A multiprocessor system has more than one processor (CPU), with common memory shared between processors

A multicomputer system has more than one processor, with each processor having local memory

In either case, processors may be on a common bus (tightly coupled), or distributed on a network (loosely coupled)

Multiprocessing Systems

Generally accepted definition of a multiprocessing/multicomputing system:
  Multiple processors, each with its own CPU and memory
  Interconnection hardware
  Processors fail independently
  There exists a shared state
  Appears to users as a single system

Flynn’s Taxonomy

Computer system organisation is described by two characteristics:
  Number of instruction streams
  Number of data streams

SISD (PC)
SIMD (Supercomputer, MMX processor)
MISD (??)
MIMD (network of processors or network of computers)
  Tightly coupled (backplane)
  Loosely coupled (network)

Limited usefulness, but serves to categorise…
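As a minimal sketch of how the two characteristics combine into the four class names, the function below maps stream counts to a Flynn label. The name `flynn_class` and its parameters are hypothetical, introduced only for illustration.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn label for the given numbers of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("a system needs at least one stream of each kind")
    # 'S' for a single stream, 'M' for more than one.
    i = "S" if instruction_streams == 1 else "M"
    d = "S" if data_streams == 1 else "M"
    return f"{i}I{d}D"


for streams in [(1, 1), (1, 64), (4, 1), (4, 4)]:
    print(streams, flynn_class(*streams))
```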

SISD

Single Instruction stream, Single Data stream

All conventional uniprocessor systems are SISD, from PCs to mainframes

Examples: 8080, M6800, M68000, i8086, etc.

SISD

Can include Harvard memory organisation, pipelined units

May execute more than one instruction simultaneously (superscalar processor)

[Figure: processor ‘P’ with fetch, decode and execute units, connected to instruction memory Minstr, data memory Mdata, and to I/O]

SIMD

Single Instruction stream, Multiple Data stream

Often called “Array Processor” or “Vector Architecture”

One instruction unit that fetches an instruction, then commands many processing elements to execute the same instruction simultaneously on many different data sets
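The behaviour described above can be sketched in a few lines. This is an illustrative model only: the class `ProcessingElement` and the function `run_simd` are hypothetical names, and the inner loop stands in for what real hardware would do on all elements at the same time.

```python
class ProcessingElement:
    """One processing element holding a single value in local memory."""

    def __init__(self, value):
        self.acc = value

    def execute(self, op):
        self.acc = op(self.acc)


def run_simd(program, data):
    """Apply each instruction of one program to every data element."""
    elements = [ProcessingElement(x) for x in data]
    for op in program:           # the single instruction stream
        for pe in elements:      # conceptually simultaneous on real hardware
            pe.execute(op)
    return [pe.acc for pe in elements]


program = [lambda x: x * 2, lambda x: x + 1]
print(run_simd(program, [1, 2, 3, 4]))
```

Every element sees the same instruction in the same step; only the data differs.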

SIMD

Organisation is usually in the form of a network of processing elements with local memory

Various topologies are used, and may be dynamically configured - e.g. 64k processors in the CM-2

[Figure: master CPU with memory and I/O, driving an array of processing elements with local memory]

Nearest neighbour network

May be end-around connected

[Figure: x by y grid of processing elements P11 ... Pxy, each linked to its nearest neighbours]

3-cube network

[Figure: eight processing elements P000 ... P111 at the corners of a cube]
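A minimal sketch of how neighbours can be computed in these two networks. It assumes the grid is end-around connected in both directions and that cube nodes are linked when their binary labels differ in exactly one bit (the usual hypercube convention). The function names are hypothetical.

```python
def torus_neighbours(row, col, rows, cols):
    """Nearest neighbours in a grid with end-around connections."""
    return [
        ((row - 1) % rows, col),   # up
        ((row + 1) % rows, col),   # down
        (row, (col - 1) % cols),   # left
        (row, (col + 1) % cols),   # right
    ]


def cube_neighbours(node, dims=3):
    """Neighbours in a dims-cube: flip one bit of the node label at a time."""
    return [node ^ (1 << bit) for bit in range(dims)]


print(torus_neighbours(0, 0, 4, 4))
print([format(n, "03b") for n in cube_neighbours(0b101)])
```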

SIMD

Examples (mainly supercomputers in the mid-1990s):
  Goodyear Aerospace MPP (Massively Parallel Processor)
  ICL DAP (Distributed Array Processor)
  Thinking Machines Corp CM-1 and CM-2

Uses are computational rather than for control

In 2011, only 1 of the world’s top 500 supercomputers (TOP500) was SIMD (vector). Now there are none. Currently, 85% are clusters, 15% are MPP.

SIMD still used in current PCs for image processing and audio applications

Dead (Super) Computer Society

ACRI; Alliant; American Supercomputer; Ametek; Applied Dynamics; Astronautics; BBN; CDC; Convex; Cray Computer; Cray Research; Culler-Harris; Culler Scientific; Cydrome; Dana/Ardent/Stellar/Stardent; Denelcor; Elxsi; ETA Systems; Evans and Sutherland Computer Division; Floating Point Systems; Galaxy YH-1; Goodyear Aerospace MPP; Gould NPL; Guiltech; Intel Scientific Computers; International Parallel Machines; Kendall Square Research; Key Computer Laboratories; MasPar; Meiko; Multiflow; Myrias; Numerix; nCube; Prisma; Thinking Machines; Saxpy; Scientific Computer Systems (SCS); Soviet Supercomputers; Supertek; Supercomputer Systems (SSI); Suprenum; Vitesse Electronics

(from http://www.paralogos.com/DeadSuper/ ; see also their Architectural Themes page)

MISD

Multiple Instruction stream, Single Data stream

No true implementations

Pipelined processors are sometimes regarded as MISD (each data element is processed by sequential segments of the pipeline)

Examples: Cray-1, CDC Cyber 205, PIC18...

[Figure: four-stage pipeline: Fetch, Decode, Execute, Write]
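To see why a pipeline is sometimes viewed this way, the sketch below lists which item occupies each of the four stages on every clock cycle. It assumes an ideal pipeline with no stalls; `pipeline_schedule` is a hypothetical helper, not part of any real simulator.

```python
STAGES = ["Fetch", "Decode", "Execute", "Write"]


def pipeline_schedule(n_items, stages=STAGES):
    """Per clock cycle, the item index in each stage (None means idle)."""
    total_cycles = n_items + len(stages) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = []
        for stage_index in range(len(stages)):
            item = cycle - stage_index   # item i enters stage s at cycle i + s
            row.append(item if 0 <= item < n_items else None)
        schedule.append(row)
    return schedule


for cycle, row in enumerate(pipeline_schedule(3)):
    print(cycle, dict(zip(STAGES, row)))
```

Each item passes through all four segments in sequence, while different items occupy different segments in the same cycle.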

MIMD

Multiple Instruction stream, Multiple Data stream

Essentially a group of independent computers

All distributed systems are MIMD

Parallel and Distributed Computers

A taxonomy of parallel & distributed computer systems

Parallel & distributed computers
  Multiprocessors (shared memory)
    Bus: Sequent, Encore
    Switched: Ultracomputer, RP3
  Multicomputers (private memory)
    Bus: Workstations on a LAN
    Switched: Hypercube, Transputer

Tightly coupled (left of the diagram) through to loosely coupled (right)

Structural Classification

Computer system is essentially
  ‘p’ processing elements (each = CPU + registers + cache)
  ‘m’ memory units
  joined by an interconnection network

Memory may be local to a processor, shared or both

[Figure: ‘p’ processors P1, P2, ..., Pp and ‘m’ memories M1, M2, ..., Mm, all attached to an interconnection network]

Shared Memory (Multiprocessor)

[Figure: ‘p’ processors P1, P2, ..., Pp connected through interconnection network N to a shared memory M]

Distributed Memory (Multicomputer or distributed computer system)

[Figure: ‘c’ computers C1, C2, ..., Cc connected through interconnection network N; each computer is a processor P with its own local memory M]

Shared Memory

If processor A writes 0x55 to its address 2000, then processor B will read 0x55 from its address 2000. This is a multiprocessor

Obviously, some mechanism is needed to resolve contention for the shared resource
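A minimal sketch of both points, using Python threads as stand-ins for processors (an assumption: threads share one address space, which is what makes the analogy work). The `bytearray` named `memory`, the lock, and the function names are all hypothetical; the lock plays the role of the contention-resolving mechanism.

```python
import threading

memory = bytearray(4096)          # stands in for the shared memory
memory_lock = threading.Lock()    # resolves contention for the shared resource


def processor_a():
    with memory_lock:
        memory[2000] = 0x55       # A writes 0x55 to address 2000


def processor_b(seen):
    with memory_lock:
        seen.append(memory[2000])  # B reads the same address


def increment(address, times):
    for _ in range(times):
        with memory_lock:         # read-modify-write must not interleave
            memory[address] = (memory[address] + 1) % 256


def demo():
    memory[100] = 0
    a = threading.Thread(target=processor_a)
    a.start()
    a.join()
    seen = []
    b = threading.Thread(target=processor_b, args=(seen,))
    b.start()
    b.join()
    workers = [threading.Thread(target=increment, args=(100, 50)) for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return seen[0], memory[100]


print(demo())
```

Without the lock, the four incrementing threads could overwrite each other's updates to address 100.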

Multiprocessor interconnections may be

Bussed (time shared)
  only one bus write at any time
  must prevent bus contention at the bus interface ports (BREQ signals etc.)
  limited to about 64 processors

Switched
  multiple simultaneous writes
  requires fast (parallel) bus switches - not cheap!
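One way to picture time sharing on a bus is a small arbiter that grants the bus to a single requester per cycle. The sketch below assumes a fixed-priority policy (lowest processor number wins), which is only one of several possible schemes and is not taken from the slides; `arbitrate` and `run_bus` are hypothetical names.

```python
def arbitrate(breq):
    """Given one bus-request flag per processor, return the index granted."""
    for index, requesting in enumerate(breq):
        if requesting:
            return index          # fixed priority: lowest index wins
    return None                   # bus idle this cycle


def run_bus(pending_writes):
    """Serve all pending writes, one per cycle; return the grant order."""
    pending = list(pending_writes)
    order = []
    while any(pending):
        winner = arbitrate([count > 0 for count in pending])
        order.append(winner)
        pending[winner] -= 1
    return order


print(run_bus([2, 1, 1]))
```

Four writes take four cycles however many processors issue them, which is why a single bus stops scaling as processors are added.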

Bussed Systems

Single shared bus
  widely used

[Figure: ‘p’ processors P1, P2, ..., Pp and ‘m’ memories M1, M2, ..., Mm on one system bus B]

Multiple busses
  relieve bus contention
  provide some redundancy

[Figure: ‘p’ processors P1, P2, ..., Pp and ‘m’ memories M1, M2, ..., Mm, each attached to busses B1, B2, ..., Bb]

Switched Systems

Crossbar switch
  up to min(m, p) simultaneous writes at any time (each memory serves one processor at a time)
  requires a fast m × p bus switch

[Figure: crossbar network joining ‘p’ processors P1, P2, ..., Pp to ‘m’ memories M1, M2, ..., Mm]
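A minimal sketch of one crossbar cycle. Each memory can serve only one processor at a time, so the number of transfers granted in a cycle can never exceed the smaller of p and m. The conflict rule used here (lowest processor number wins) and the function names are assumptions for illustration.

```python
def crosspoints(p, m):
    """Number of switch points in a p-by-m crossbar."""
    return p * m


def crossbar_cycle(requests, m):
    """requests maps processor -> requested memory index; return the grants."""
    granted = {}
    busy = set()
    for proc in sorted(requests):          # assumed tie-break: lowest index first
        mem = requests[proc]
        if not 0 <= mem < m:
            raise ValueError(f"no memory {mem}")
        if mem not in busy:                # one processor per memory per cycle
            busy.add(mem)
            granted[proc] = mem
    return granted


print(crossbar_cycle({0: 0, 1: 1, 2: 1, 3: 2}, m=3))
print(crosspoints(4, 3))
```

Processor 2 loses the conflict over memory 1 and would retry in a later cycle; the other three transfers proceed in parallel.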

Switched Systems

Crosspoint switch
  cheaper but slower!!
  for n processors and n memories, have log2(n) stages
  used in “Omega” or “Banyan” networks

[Figure: processors P1 to P4 connected to memories M1 to M4 through stages of 2x2 switches]
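The sketch below traces a message through an omega network. It assumes n is a power of two, a perfect-shuffle connection before every stage, and destination-tag routing (each 2x2 switch picks its upper or lower output from one bit of the destination address). Those details are the conventional omega construction rather than anything stated on the slide, and the function names are hypothetical.

```python
def stages(n):
    """Number of switching stages for n inputs (n must be a power of two)."""
    k = n.bit_length() - 1
    if n < 2 or (1 << k) != n:
        raise ValueError("n must be a power of two, at least 2")
    return k


def switch_count(n):
    """Total 2x2 switches: n/2 per stage, log2(n) stages."""
    return (n // 2) * stages(n)


def omega_route(src, dst, n):
    """Return the line occupied after each stage when routing src -> dst."""
    k = stages(n)
    pos = src
    path = []
    for stage in range(k):
        # Perfect shuffle: rotate the k-bit line number left by one bit.
        pos = ((pos << 1) | (pos >> (k - 1))) & (n - 1)
        # The switch sets the low bit from the destination, most significant bit first.
        bit = (dst >> (k - 1 - stage)) & 1
        pos = (pos & ~1) | bit
        path.append(pos)
    return path


print(omega_route(5, 2, 8))
print(switch_count(8), "switches of size 2x2, versus", 8 * 8, "crossbar points")
```

The path always ends on the destination line after log2(n) stages, but two messages may need the same switch output at once, which is one reason this arrangement is slower than a crossbar.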

Interconnections (topology) may be either

Static – fixed by hardware

Dynamic – re-configurable in software, perhaps even during program execution

Static Topologies

Common arrangements are array, ring, star, cube, tree, and complete interconnection of processors.

[Figure: example topologies: linear, array, cube, ring, star, fully connected, tree]

Static Topology

Cube (or hypercube) gives a good balance between
  internode length (communications latency)
  number of neighbouring nodes (cost of switching circuitry)

Several commercial hypercube implementations exist
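The trade-off can be made concrete by measuring two quantities for a few 16-node topologies: the largest number of neighbours per node, and the diameter (the longest shortest path, in hops). The sketch below builds each topology as an adjacency map and measures the diameter by breadth-first search; all function names are hypothetical.

```python
from collections import deque


def ring(n):
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def star(n):
    graph = {i: {0} for i in range(1, n)}
    graph[0] = set(range(1, n))            # node 0 is the hub
    return graph


def fully_connected(n):
    return {i: set(range(n)) - {i} for i in range(n)}


def hypercube(dims):
    return {i: {i ^ (1 << b) for b in range(dims)} for i in range(1 << dims)}


def max_degree(graph):
    return max(len(neighbours) for neighbours in graph.values())


def diameter(graph):
    """Longest shortest path, found by breadth-first search from every node."""
    worst = 0
    for start in graph:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


topologies = {
    "ring": ring(16),
    "star": star(16),
    "fully connected": fully_connected(16),
    "hypercube": hypercube(4),
}
for name, graph in topologies.items():
    print(f"{name:16s} neighbours={max_degree(graph):2d} diameter={diameter(graph)}")
```

For 16 nodes the hypercube needs 4 links per node and at most 4 hops, sitting between the ring (few links, many hops) and the fully connected network (many links, one hop).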

Dynamic Topology

Single bus, multiple bus, crossbar-switched and omega networks are all examples of dynamic topologies.
