Parallel Processing and Minimum Spanning Trees
Prof. Sin-Min Lee
Dept. of Computer Science,
San Jose State University
Overview
• What is Parallel Processing
• Parallel Processing in Nature
• Parallel Processing vs. Multitasking
• Amdahl’s Law
• Challenges in Parallel Processing
What is Parallel Processing?
• How to make machines solve problems better and faster?
• Physical barriers limit the extent to which single processor performance can be improved—Clock Speed & Heat Dissipation
• The next most obvious solution is to distribute the computing load among several processors
What is Parallel Processing?
• Parallel Processing encompasses a wide variety of different things:
• Intel Core Duo and Quad, Cell multiprocessors, networked and distributed computer systems, SETI@Home, Folding@Home, and neural nets are all examples
Parallel Processing in Nature
• The world’s most powerful parallel processor comes standard: the human brain
• For everything else there’s American Express…
Parallel Processing in Nature
• For humans parallel processing comes easy
• Human Vision
Color, Motion, Depth, Shape
Parallel Processing in Nature
• Machines that are more like us, and hence more useful, will need to be able to process information more like us—in parallel.
• Parallel Processing is a key element in pattern recognition, which distinguishes human from machine intelligence
• This is not an easy hurdle
Parallel Processing vs. Multitasking
• Today, computers are great at multi-tasking, but…
• Multi-tasking only creates the illusion of parallel processing
• The processor must switch between activities—it is only the speed with which it does so that creates the illusion of simultaneous execution
• The illusion is most easily shattered when running virus scan while attempting to do anything else
Parallel Processing vs. Multitasking
• Think about it this way:
• Multi-tasking: one processor switching rapidly among tasks
• Parallel Processing: several processors working on tasks at the same time
Amdahl’s Law
• Consider a single processor
• Or two…
• We tend to think that 2x as much work will be done in the same time
• Or that the same amount of work will be done in half the time
Amdahl’s Law
• Do n processors imply that a computational job should complete in 1/n time?
• Sadly, no…
Amdahl’s Law
• In 1967, Gene Amdahl recognized the interrelationship of all of a system’s components
• Overall speedup of a system depends on the speedup of a particular component & how much that component is used
• S = 1 / ([1 − f] + f / k)
• S = overall system speedup
• f = fraction of the work performed by the faster component
• k = speedup of the new component
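The formula on this slide can be sketched as a one-line Python function (a minimal illustration; the function name is mine, not from the slides):

```python
def amdahl_speedup(f, k):
    """Overall speedup S when a fraction f of the work runs on a
    component that is k times faster: S = 1 / ((1 - f) + f / k)."""
    return 1.0 / ((1.0 - f) + f / k)

# Even as k grows without bound, S is capped at 1 / (1 - f):
# parallelizing 90% of a program can never yield more than a
# 10x overall speedup, no matter how many processors you add.
print(amdahl_speedup(0.9, 2))     # 2x component on 90% of the work
print(amdahl_speedup(0.9, 1000))  # approaches the 1/(1-f) = 10x ceiling
```

Note how quickly returns diminish: going from k = 2 to k = 1000 on 90% of the work improves overall speedup only from about 1.8x to about 9.9x.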
Amdahl’s Law
• Additionally, no matter how well (or much) you parallelize an application there will always be a small portion of work that must be done serially.
• Other processors must simply sit and wait in this interval
• Every algorithm has a sequential part that limits potential speedup
Challenges in Parallel Processing
• Not always obvious where to “split” workload or even possible.
• If you don’t use it, you lose it…programs not specifically written for a parallel architecture run no more efficiently on parallel systems
Challenges in Parallel Processing
• Connecting your CPUs
• Dynamic vs. Static—connections can change from one communication to the next
• Blocking vs. Nonblocking—can simultaneous connections be present?
• Connections can be complete, linear, star, grid, tree, hypercube, etc.
• Bus-based routing
• Crossbar switching—impractical for all but the most expensive supercomputers
• 2×2 switch—can route inputs to different destinations
Challenges in Parallel Processing
• Dealing with memory
• Various options:
– Global Shared Memory
– Distributed Shared Memory
– Global shared memory with separate cache for processors
• Potential Hazards:
– Individual CPU caches or memories can become out of sync with each other: the “Cache Coherence” problem
• Solutions:
– UMA/NUMA machines
– Snoopy cache controllers
– Write-through protocols
In the design of electronic circuitry, it is often necessary to make the pins of several components electrically equivalent by wiring them together. To interconnect a set of n pins, we can use an arrangement of n – 1 wires, each connecting two pins. Of all such arrangements, the one that uses the least amount of wire is usually the most desirable. We can model this wiring problem with a connected, undirected graph G = (V,E), where V is the set of pins, E is the set of possible interconnections between pairs of pins, and for each edge (u, v) ∈ E, we have a weight w(u,v) specifying the cost (amount of wire needed) to connect u and v.
We then wish to find an acyclic subset T ⊆ E that connects all of the vertices and whose total weight is minimized. Since T is acyclic and connects all of the vertices, it must form a tree, which we call a spanning tree since it “spans” the graph G. We call the problem of determining the tree T the minimum-spanning-tree problem.
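Using the notation from the paragraph above, the objective can be written compactly as:

```latex
\text{minimize } w(T) = \sum_{(u,v) \in T} w(u,v),
\qquad T \subseteq E,\ T \text{ acyclic and connecting all of } V.
```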
Spanning trees
• Suppose you have a connected undirected graph
– Connected: every node is reachable from every other node
– Undirected: edges do not have an associated direction
• ...then a spanning tree of the graph is a connected subgraph in which there are no cycles
A connected,undirected graph
Four of the spanning trees of the graph
Finding a spanning tree
• To find a spanning tree of a graph:
– pick a node and call it part of the spanning tree
– do a search from the initial node:
– each time you find a node that is not in the spanning tree, add to the spanning tree both the new node and the edge you followed to get to it
[Figures: an undirected graph; the result of a BFS starting from the top; the result of a DFS starting from the top]
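The search procedure described above can be sketched in Python using BFS (a minimal illustration; the function name and adjacency-dict representation are mine, not from the slides):

```python
from collections import deque

def bfs_spanning_tree(graph, start):
    """Grow a spanning tree from `start` by breadth-first search.
    `graph` maps each node to a list of its neighbors.
    Each time a new node is discovered, both the node and the edge
    followed to reach it are added to the spanning tree."""
    in_tree = {start}
    tree_edges = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in in_tree:   # new node: add node and edge
                in_tree.add(neighbor)
                tree_edges.append((node, neighbor))
                queue.append(neighbor)
    return tree_edges

graph = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
print(bfs_spanning_tree(graph, "A"))  # [('A', 'B'), ('A', 'C')]
```

Replacing the queue with a stack would give the DFS variant; both produce a spanning tree, just with different shapes, as the two figures suggest.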
Minimizing costs
• Suppose you want to supply a set of houses (say, in a new subdivision) with:
– electric power
– water
– sewage lines
– telephone lines
• To keep costs down, you could connect these houses with a spanning tree (of, for example, power lines)
– However, the houses are not all equal distances apart
• To reduce costs even further, you could connect the houses with a minimum-cost spanning tree
Minimum-cost spanning trees
• Suppose you have a connected undirected graph with a weight (or cost) associated with each edge
• The cost of a spanning tree would be the sum of the costs of its edges
• A minimum-cost spanning tree is that spanning tree that has the lowest cost
[Figure: a connected, undirected graph on vertices A–F with weighted edges]

[Figure: a minimum-cost spanning tree of the same graph]
Applications of Spanning Trees
• Minimal path routing in all kinds of settings– circuits, networks, roads and sewers
Kruskal’s algorithm
• Pick the cheapest edge that does not create a cycle in the tree
• Add the edge to the solution and remove it from the graph.
• Continue until all nodes are part of the tree.
Spanning Tree

[Figure: a road map connecting the towns Herman, Etna, Old Town, Bangor, Hampden, and Orono]

How do we plow the fewest roads so there will always be cleared roads connecting any two towns?
Kruskal’s Algorithm
• Start by initializing a graph K with all of G’s nodes and none of G’s edges.
• For each edge (x, y) in G (taken in increasing order of weight):
– If x and y are not in the same connected component
• Add edge (x, y) to K
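The steps above can be sketched in Python, using a simple union-find structure for the “same connected component” test (a minimal illustration, not code from the slides; names are mine). The edge list is the one used in the walkthrough that follows:

```python
def kruskal(nodes, edges):
    """edges: list of (weight, x, y) tuples. Returns the MST edges K."""
    parent = {n: n for n in nodes}          # union-find forest

    def find(n):                            # representative of n's component
        while parent[n] != n:
            parent[n] = parent[parent[n]]   # path halving
            n = parent[n]
        return n

    tree = []
    for w, x, y in sorted(edges):           # increasing order of weight
        rx, ry = find(x), find(y)
        if rx != ry:                        # different components: keep edge
            parent[rx] = ry                 # merge the two components
            tree.append((w, x, y))
    return tree

edges = [(1, "A", "B"), (2, "A", "D"), (5, "A", "E"), (5, "D", "F"),
         (7, "E", "D"), (8, "B", "C"), (9, "B", "D"), (12, "C", "D")]
mst = kruskal("ABCDEF", edges)
print(sum(w for w, _, _ in mst))  # total weight 1+2+5+5+8 = 21
```

As in the walkthrough, {E,D} 7, {B,D} 9, and {C,D} 12 are rejected because their endpoints are already connected.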
Kruskal’s Algorithm

Edges of G, in increasing order of weight:
{A,B} 1
{A,D} 2
{A,E} 5
{D,F} 5
{E,D} 7
{B,C} 8
{B,D} 9
{C,D} 12

[Figure: graph G on vertices A–F with the weighted edges above; K starts with the same vertices and no edges]
Kruskal’s Algorithm

[Figure: K after adding {A,B} 1, {A,D} 2, and {A,E} 5]
Kruskal’s Algorithm

[Figure: K after also adding {D,F} 5]
Kruskal’s Algorithm

{E,D} already connected: edge {E,D} 7 is skipped

[Figure: K unchanged, with the edges of weight 1, 2, 5, 5]
Kruskal’s Algorithm

[Figure: K after adding {B,C} 8]
Kruskal’s Algorithm

{B,D} already connected: edge {B,D} 9 is skipped
Kruskal’s Algorithm

{C,D} already connected: edge {C,D} 12 is skipped

All edges have been considered, so K is now a minimum spanning tree of G
Compare Prim and Kruskal
• Which one is better?
• Which one would you use if…
– you don’t know the entire graph at the beginning?
– you always want a tree in partial solutions?
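For contrast with Kruskal, here is a minimal sketch of Prim’s algorithm (not developed in the slides above; names and representation are mine). Unlike Kruskal, which sorts all edges up front and merges forests, Prim grows a single tree outward from a start node, so every partial solution is a tree and only edges touching the current tree need to be known:

```python
import heapq

def prim(graph, start):
    """graph: node -> list of (weight, neighbor). Returns MST edges.
    Repeatedly takes the cheapest edge leaving the current tree."""
    in_tree = {start}
    tree = []
    frontier = [(w, start, v) for w, v in graph[start]]
    heapq.heapify(frontier)                 # min-heap of candidate edges
    while frontier:
        w, u, v = heapq.heappop(frontier)
        if v in in_tree:                    # edge would form a cycle
            continue
        in_tree.add(v)
        tree.append((w, u, v))
        for w2, v2 in graph[v]:             # expose v's outgoing edges
            if v2 not in in_tree:
                heapq.heappush(frontier, (w2, v, v2))
    return tree

# Same example graph as the Kruskal walkthrough, as adjacency lists.
graph = {"A": [(1, "B"), (2, "D"), (5, "E")],
         "B": [(1, "A"), (8, "C"), (9, "D")],
         "C": [(8, "B"), (12, "D")],
         "D": [(2, "A"), (5, "F"), (7, "E"), (9, "B"), (12, "C")],
         "E": [(5, "A"), (7, "D")],
         "F": [(5, "D")]}
print(sum(w for w, _, _ in prim(graph, "A")))  # 21, same total as Kruskal
```

This shape is why Prim suits the second question: its partial solutions are always trees, and it can work without seeing the whole graph at once.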