Multiprocessing :: Slide 1 of 26 :: David Rye :: MTRX 3700
Multiprocessor & Multicomputer Organisation
Parallel and Distributed Computing
Multiprocessors and Multicomputers
A multiprocessor system has more than one processor (CPU), with common memory shared between processors
A multicomputer system has more than one processor, with each processor having local memory
In either case, processors may share a common bus (tightly coupled) or be distributed across a network (loosely coupled)
Multiprocessing Systems
Generally accepted definition of a multiprocessing/multicomputing system:
- Multiple processors, each with its own CPU and memory
- Interconnection hardware
- Processors fail independently
- There exists a shared state
- Appears to users as a single system
Flynn’s Taxonomy
Computer system organisation is described by two characteristics:
- Number of instruction streams
- Number of data streams
The resulting classes:
- SISD (PC)
- SIMD (supercomputer, MMX processor)
- MISD (??)
- MIMD (network of processors or network of computers)
  - Tightly coupled (backplane)
  - Loosely coupled (network)
Limited usefulness, but serves to categorise…
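The classification rule can be sketched in a few lines of Python. This is only an illustration: the function name `flynn_class` and the idea of passing stream counts as integers are assumptions made for the example, not part of the taxonomy itself.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn category for the given numbers of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    # 'S' for a single stream, 'M' for more than one.
    instr = "S" if instruction_streams == 1 else "M"
    data = "S" if data_streams == 1 else "M"
    return f"{instr}I{data}D"


# Hypothetical machines, described only by their stream counts.
print(flynn_class(1, 1))    # SISD
print(flynn_class(1, 64))   # SIMD
print(flynn_class(4, 1))    # MISD
print(flynn_class(4, 4))    # MIMD
```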
SISD
Single Instruction stream, Single Data stream
All conventional uniprocessor systems are SISD, from PCs to mainframes
Examples: 8080, M6800, M68000, i8086, etc.
SISD
Can include Harvard memory organisation, pipelined units
May execute more than one instruction simultaneously (superscalar processor)
[Figure: processor ‘P’ with instruction memory M_instr and data memory M_data; fetch, decode and execute stages; connection to I/O]
SIMD
Single Instruction stream, Multiple Data stream
Often called “Array Processor” or “Vector Architecture”
One instruction unit that fetches an instruction, then commands many processing elements to execute the same instruction simultaneously on many different data sets
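A minimal Python sketch of this arrangement follows. The class names `ControlUnit` and `ProcessingElement` are hypothetical, and the loop only models the broadcast: real SIMD hardware executes the instruction on all elements in the same clock cycle rather than one after another.

```python
import operator


class ProcessingElement:
    """One processing element holding its own local data item."""

    def __init__(self, local_data):
        self.local_data = local_data

    def execute(self, op, operand):
        # Apply the broadcast instruction to this element's own data.
        self.local_data = op(self.local_data, operand)


class ControlUnit:
    """Single instruction unit that commands every processing element."""

    def __init__(self, elements):
        self.elements = elements

    def broadcast(self, op, operand):
        # Conceptually simultaneous; sequential here only because this is a model.
        for element in self.elements:
            element.execute(op, operand)


elements = [ProcessingElement(x) for x in [1, 2, 3, 4]]
control = ControlUnit(elements)
control.broadcast(operator.mul, 10)   # same instruction, four data sets
control.broadcast(operator.add, 1)
result = [element.local_data for element in elements]
print(result)  # [11, 21, 31, 41]
```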
SIMD
Organisation is usually in the form of a network of processing elements with local memory
Various topologies are used, and may be dynamically configured - e.g. 64k processors in the CM-2
[Figure: master CPU with memory and I/O, commanding an array of processing elements with local memory]
[Figure: nearest neighbour network, an x-by-y grid of processing elements P11 … Pxy; may be end-around connected]
[Figure: 3-cube network, eight processing elements P000 … P111 at the corners of a cube]
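The neighbour relationships in these two networks can be computed directly from the element labels. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the grid uses zero-based (row, column) indices rather than the one-based P11 … Pxy labels, and cube nodes are the integers whose binary forms are the labels P000 … P111.

```python
def torus_neighbours(row, col, rows, cols):
    """Four nearest neighbours in a rows x cols grid with end-around links."""
    return [
        ((row - 1) % rows, col),   # up (wraps to the last row)
        ((row + 1) % rows, col),   # down
        (row, (col - 1) % cols),   # left (wraps to the last column)
        (row, (col + 1) % cols),   # right
    ]


def cube_neighbours(node, dimensions=3):
    """Neighbours in a hypercube: labels that differ in exactly one bit."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


print(torus_neighbours(0, 0, 4, 4))
# [(3, 0), (1, 0), (0, 3), (0, 1)]
print([format(n, "03b") for n in cube_neighbours(0b000)])
# ['001', '010', '100']
```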
SIMD
Examples – mainly supercomputers from the 1980s to the mid-1990s
- Goodyear Aerospace MPP (Massively Parallel Processor)
- ICL DAP (Distributed Array Processor)
- Thinking Machines Corp CM-1 and CM-2
Uses are computational rather than for control
In 2011, only 1 of the world’s top 500 supercomputers (TOP500) was SIMD (vector). Now there are none. Currently, 85% are clusters, 15% are MPP.
SIMD is still used in current PCs for image processing and audio applications
Dead (Super) Computer Society
ACRI, Alliant, American Supercomputer, Ametek, Applied Dynamics, Astronautics, BBN, CDC, Convex, Cray Computer, Cray Research, Culler-Harris, Culler Scientific, Cydrome, Dana/Ardent/Stellar/Stardent, Denelcor, Elxsi, ETA Systems, Evans and Sutherland Computer Division, Floating Point Systems, Galaxy YH-1, Goodyear Aerospace MPP, Gould NPL, Guiltech, Intel Scientific Computers, International Parallel Machines, Kendall Square Research, Key Computer Laboratories, MasPar, Meiko, Multiflow, Myrias, Numerix, nCube, Prisma, Thinking Machines, Saxpy, Scientific Computer Systems (SCS), Soviet Supercomputers, Supertek, Supercomputer Systems (SSI), Suprenum, Vitesse Electronics
(from http://www.paralogos.com/DeadSuper/ ; see also their Architectural Themes page)
MISD
Multiple Instruction stream, Single Data stream
No true implementations
Pipelined processors are sometimes regarded as MISD (each data element is processed by sequential segments of the pipeline)
Examples: Cray-1, CDC Cyber 205, PIC18...
[Figure: pipeline stages Fetch, Decode, Execute, Write]
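The way each element moves through sequential pipeline segments can be sketched as a cycle-by-cycle occupancy table. This is a simplified model under stated assumptions: an ideal four-stage pipeline with no stalls, one new item entering per clock cycle, and a hypothetical helper named `pipeline_schedule`.

```python
STAGES = ["Fetch", "Decode", "Execute", "Write"]


def pipeline_schedule(n_items):
    """Return, for each clock cycle, a dict mapping stage name to item index."""
    total_cycles = n_items + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        occupancy = {}
        for depth, stage in enumerate(STAGES):
            item = cycle - depth          # item that entered 'depth' cycles ago
            if 0 <= item < n_items:
                occupancy[stage] = item
        schedule.append(occupancy)
    return schedule


schedule = pipeline_schedule(3)
for cycle, occupancy in enumerate(schedule):
    print(cycle, occupancy)
```

With three items the model needs 3 + 4 - 1 = 6 cycles, and in cycle 2 three different items occupy the Fetch, Decode and Execute stages at once.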
MIMD
Multiple Instruction stream, Multiple Data stream
Essentially a group of independent computers
All distributed systems are MIMD
Parallel and Distributed Computers
A taxonomy of parallel & distributed computer systems
Parallel & distributed computers
  Multiprocessors (shared memory)
    Bus: Sequent, Encore
    Switched: Ultracomputer, RP3
  Multicomputers (private memory)
    Bus: Workstations on a LAN
    Switched: Hypercube, Transputer
Axis: tightly coupled → loosely coupled
Structural Classification
Computer system is essentially
- ‘p’ processing elements = (CPU + registers + cache)
- ‘m’ memory units
joined by an interconnection network
Memory may be local to a processor, shared or both
[Figure: ‘p’ processors P1, P2, … Pp and ‘m’ memories M1, M2, … Mm joined by an interconnection network]
[Figure: Shared Memory (Multiprocessor): ‘p’ processors P1, P2, … Pp connected through interconnection network N to memory M]
[Figure: Distributed Memory (Multicomputer or distributed computer system): ‘c’ computers C1, C2, … Cc, each a processor with its own local memory (c = P and M), connected through interconnection network N]
Shared Memory
If processor A writes 0x55 to its address 2000, then processor B will read 0x55 from its address 2000, because both addresses refer to the same physical memory. This is a multiprocessor.
Obviously, some mechanism is needed to resolve contention for the shared resource
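As a rough software analogy, the sketch below uses Python threads to stand in for processors and a single list to stand in for the shared memory. The lock is one possible contention-resolving mechanism, chosen here for illustration; the names `shared_memory`, `memory_lock`, `write`, `read` and `increment` are assumptions of this example, not anything defined in the slides.

```python
import threading

shared_memory = [0] * 4096        # one memory visible to every "processor"
memory_lock = threading.Lock()    # assumed arbitration mechanism


def write(address, value):
    with memory_lock:             # only one access at a time
        shared_memory[address] = value


def read(address):
    with memory_lock:
        return shared_memory[address]


def increment(address, times):
    for _ in range(times):
        with memory_lock:         # read-modify-write must not be interleaved
            shared_memory[address] += 1


# "Processor A" writes 0x55 to address 2000; "processor B" then reads it.
processor_a = threading.Thread(target=write, args=(2000, 0x55))
processor_a.start()
processor_a.join()
value_seen_by_b = read(2000)

# Two "processors" contend for the same location.
workers = [threading.Thread(target=increment, args=(2001, 10_000)) for _ in range(2)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(hex(value_seen_by_b), shared_memory[2001])  # 0x55 20000
```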
Multiprocessor interconnections may be
- Bussed (time shared)
  - only one bus write at any time
  - must prevent bus contention at the bus interface ports
  - BREQ signals etc.
  - limited to about 64 processors
- Switched
  - multiple simultaneous writes
  - requires fast (parallel) bus switches - not cheap!
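The effect of time-sharing a single bus can be sketched with a toy arbiter. The fixed-priority policy here (lowest-numbered requester wins) is purely an assumption for illustration; the slides mention BREQ signals but do not specify an arbitration scheme, and real buses may use daisy-chain, round-robin or other policies. The function names are hypothetical.

```python
def arbitrate(bus_requests):
    """Grant the bus to the lowest-numbered requesting processor, or None."""
    for processor, requesting in enumerate(bus_requests):
        if requesting:
            return processor
    return None


def run_bus(pending_writes):
    """pending_writes[i] is the number of bus writes processor i wants to make.

    Returns the processor granted the bus in each bus cycle.
    """
    pending = list(pending_writes)
    grants = []
    while any(pending):
        # Each processor asserts its request line while it has work pending.
        winner = arbitrate([count > 0 for count in pending])
        grants.append(winner)     # exactly one write per bus cycle
        pending[winner] -= 1
    return grants


grants = run_bus([2, 1, 3])
print(grants)       # [0, 0, 1, 2, 2, 2]
print(len(grants))  # 6
```

Six writes take six bus cycles regardless of how many processors issue them, which is why a shared bus stops scaling beyond a modest processor count.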
Bussed Systems
Single shared bus
- widely used
Multiple busses
- relieve bus contention
- provide some redundancy
[Figure: single system bus B connecting ‘p’ processors P1, P2, … Pp and ‘m’ memories M1, M2, … Mm]
[Figure: multiple busses B1, B2, … Bb connecting ‘p’ processors P1, P2, … Pp and ‘m’ memories M1, M2, … Mm]
Switched Systems
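Crossbar switch
- up to min(m, p) simultaneous writes at any time (each processor and each memory can take part in only one transfer)
- requires a fast m × p bus switch
[Figure: crossbar network joining ‘p’ processors P1, P2, … Pp (rows) to ‘m’ memories M1, M2, … Mm (columns)]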
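A single switching cycle of a crossbar can be modelled as follows. Each processor and each memory can take part in only one transfer at a time, so no more than min(m, p) transfers proceed together. The conflict rule used here (lowest-numbered processor wins a contested memory) and the function name `crossbar_cycle` are assumptions made for the sketch.

```python
def crossbar_cycle(requests):
    """Resolve one cycle of crossbar requests.

    requests maps processor index -> requested memory index.
    Returns (granted, deferred) dictionaries of the same form.
    """
    granted, deferred, busy_memories = {}, {}, set()
    for processor in sorted(requests):
        memory = requests[processor]
        if memory in busy_memories:
            deferred[processor] = memory      # memory already in use this cycle
        else:
            busy_memories.add(memory)
            granted[processor] = memory       # close crosspoint (processor, memory)
    return granted, deferred


p, m = 4, 3
print(p * m)     # 12 crosspoints in the switch
granted, deferred = crossbar_cycle({0: 2, 1: 0, 2: 2, 3: 1})
print(granted)   # {0: 2, 1: 0, 3: 1}
print(deferred)  # {2: 2}
```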
Switched Systems
Crosspoint switch
- cheaper but slower!!
- for n processors and n memories, have log2(n) stages
- used in “Omega” or “Banyan” networks
[Figure: processors P1 to P4 connected to memories M1 to M4 through stages of 2x2 switches]
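To illustrate how a message crosses log2(n) stages of 2x2 switches, the sketch below uses destination-tag routing on an omega network: before each stage the lines are permuted by a perfect shuffle, and each switch picks its upper or lower output from one bit of the destination address. This routing scheme is a common textbook one and is assumed here; the slides do not describe routing. Ports are numbered from 0, and the function names are hypothetical.

```python
def omega_route(source, destination, n):
    """Route through an n-input omega network (n a power of two, n >= 2).

    Returns (final_line, path), where path lists (stage, switch, output) tuples.
    """
    stages = n.bit_length() - 1
    if n < 2 or 1 << stages != n:
        raise ValueError("n must be a power of two, at least 2")
    position = source
    path = []
    for stage in range(stages):
        # Perfect shuffle: rotate the line address left by one bit.
        position = ((position << 1) | (position >> (stages - 1))) & (n - 1)
        # The switch output is chosen by the destination bit, most significant first.
        bit = (destination >> (stages - 1 - stage)) & 1
        position = (position & ~1) | bit
        path.append((stage, position >> 1, "lower" if bit else "upper"))
    return position, path


def switch_count(n):
    """Number of 2x2 switches: log2(n) stages of n/2 switches each."""
    return (n.bit_length() - 1) * (n // 2)


final_line, path = omega_route(source=1, destination=2, n=4)
print(final_line, path)          # 2 [(0, 1, 'lower'), (1, 1, 'upper')]
print(switch_count(8), 8 * 8)    # 12 2x2 switches versus 64 crossbar crosspoints
```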
Interconnections (topology) may be either
Static – fixed by hardware
Dynamic – re-configurable in software, perhaps even during program execution
Static Topologies
Common arrangements are array, ring, star, cube, tree, and complete interconnection of processors.
[Figure: linear, array, cube, ring, star, fully connected and tree topologies]
Static Topology
Cube (or hypercube) gives a good balance between
- internode length (communications latency)
- number of neighbouring nodes (cost of switching circuitry)
Several commercial hypercube implementations exist
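The balance can be made concrete by comparing, for the same number of nodes, the links each node needs against the worst-case number of hops between two nodes (the diameter). The formulas are standard graph properties of each topology; the function name `topology_metrics` and the choice of 64 nodes are assumptions for this sketch.

```python
def topology_metrics(n):
    """Return {topology: (links per node, diameter in hops)} for n nodes.

    n is assumed to be a power of two (at least 4) so that a hypercube exists.
    """
    dimension = n.bit_length() - 1
    if n < 4 or 1 << dimension != n:
        raise ValueError("n must be a power of two, at least 4")
    return {
        "ring": (2, n // 2),                    # few links, long paths
        "hypercube": (dimension, dimension),    # log2(n) of each
        "fully connected": (n - 1, 1),          # short paths, many links
    }


metrics = topology_metrics(64)
for name, (links, diameter) in metrics.items():
    print(f"{name:16s} links per node = {links:2d}, diameter = {diameter:2d}")
```

For 64 nodes the ring needs 2 links per node but up to 32 hops, the fully connected network needs 63 links per node for 1 hop, and the hypercube needs 6 of each.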
Dynamic Topology
Single bus, multiple bus, crossbar-switched and omega networks are all examples of dynamic topologies.
References
Crichlow. An Introduction to Distributed and Parallel Computing. 2 ed., Prentice Hall, 1997.
Tanenbaum. Distributed Operating Systems. Pearson, 2009.