Slide 1
Back to George One More Time
• Before they invented drawing boards, what did they go back to?
• If all the world is a stage, where is the audience sitting?
• If the #2 pencil is the most popular, why is it still #2?
• If work is so terrific, how come they have to pay you to do it?
• If you ate pasta and antipasto, would you still be hungry?
• If you try to fail, and succeed, which have you done?
• "People who think they know everything are a great annoyance to those of us who do." - Anon
Slide 2
O() Analysis
• Reasonable vs. Unreasonable Algorithms
• Using O() Analysis in Design
• Concurrent Systems
• Parallelism
Slide 3
Recipe for Determining O()
• Break the algorithm down into known pieces (we'll learn the Big-Os in this section)
• Identify relationships between pieces: sequential is additive; nested (loop / recursion) is multiplicative
• Drop constants
• Keep only the dominant factor for each variable
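As an illustration (not from the slides), the recipe can be applied to a small Python function whose operation count we can check by hand; `count_ops` is a hypothetical example:

```python
def count_ops(n):
    """Count basic operations to check an O() estimate directly."""
    ops = 0
    # Piece 1: a single loop -> O(N)
    for _ in range(n):
        ops += 1
    # Piece 2: a nested loop -> O(N) * O(N) = O(N^2)
    for _ in range(n):
        for _ in range(n):
            ops += 1
    # Sequential pieces add: N + N^2.
    # Drop constants, keep the dominant factor: O(N^2).
    return ops
```

For n = 10 this returns 10 + 100 = 110, and as n grows the N^2 piece dominates, which is why only the dominant factor is kept.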
Slide 4
Comparing Data Structures and Methods

Data Structure    Traverse  Search  Insert
Unsorted L List   N         N       1
Sorted L List     N         N       N
Unsorted Array    N         N       1
Sorted Array      N         Log N   N
Binary Tree       N         N       1
BST               N         N       N
F&B BST           N         Log N   Log N
Slide 5
Reasonable vs. Unreasonable Algorithms
Slide 6
Algorithmic Performance Thus Far
Some examples thus far:
O(1)        Insert to front of linked list
O(N)        Simple/Linear Search
O(N Log N)  MergeSort
O(N^2)      BubbleSort
But it could get worse: O(N^5), O(N^2000), etc.
Slide 7
An O(N^5) Example
For N = 256: N^5 = 256^5 ≈ 1,100,000,000,000
If we had a computer that could execute a million instructions per second, it would need 1,100,000 seconds = 12.7 days to complete.
But it could get worse...
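The arithmetic on this slide can be verified with a few lines of Python (an illustrative sketch, not part of the original deck):

```python
n = 256
instructions = n ** 5           # 256^5 ≈ 1.1 trillion instructions
seconds = instructions / 10**6  # at a million instructions per second
days = seconds / 86400          # 86,400 seconds in a day
# days comes out to roughly 12.7
```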
Slide 8
The Power of Exponents A rich king and a wise peasant
Slide 9
The Wise Peasant's Pay

Day (N)   Pieces of Grain (2^N)
1         2
2         4
3         8
4         16
...       ...
63        9,223,000,000,000,000,000
64        18,450,000,000,000,000,000
Slide 10
How Bad is 2^N?
Imagine being able to grow a billion (1,000,000,000) pieces of grain a second. It would take 585 years to grow enough grain just for the 64th day, and over a thousand years to fulfill the peasant's request!
Slide 11
So the King cut off the peasant's head.
Slide 12
The Towers of Hanoi (pegs A, B, C)
Goal: move the stack of rings to another peg.
Rule 1: may move only 1 ring at a time.
Rule 2: may never place a larger ring on top of a smaller ring.
Slides 13-28
The Towers of Hanoi (animation: successive frames showing the rings being moved among pegs A, B, and C)
Slide 29
Towers of Hanoi - Complexity
For 1 ring we have 1 operation. For 2 rings we have 3 operations. For 3 rings we have 7 operations. For 4 rings we have 15 operations. In general, the cost is 2^N - 1 = O(2^N). Each time we increment N, we double the amount of work. This grows incredibly fast!
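A minimal recursive solver (my sketch, not code from the slides) confirms the 2^N - 1 move count:

```python
def hanoi(n, src="A", dst="C", aux="B"):
    """Return the list of moves that transfers n rings from src to dst."""
    if n == 0:
        return []
    # Move n-1 rings out of the way, move the largest, then restack the rest.
    return (hanoi(n - 1, src, aux, dst)
            + [(src, dst)]
            + hanoi(n - 1, aux, dst, src))
```

len(hanoi(n)) gives 1, 3, 7, 15, ... for n = 1, 2, 3, 4, matching the 2^N - 1 operation counts above.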
Slide 30
Towers of Hanoi (2^N) Runtime
For N = 64: 2^N = 2^64 ≈ 18,450,000,000,000,000,000. If we had a computer that could execute a million instructions per second, it would take 584,000 years to complete.
But it could get worse...
Slide 31
The Bounded Tile Problem Match up the patterns in the tiles.
Can it be done, yes or no?
Slide 32
The Bounded Tile Problem Matching tiles
Slides 33-37
Tiling a 5x5 Area (animation: tiles are placed one at a time as the count of available tiles falls from 25 down to 2)
Slide 38
Analysis of the Bounded Tiling Problem
Tile a 5-by-5 area (N = 25 tiles). 1st location: 25 choices. 2nd location: 24 choices. And so on.
Total number of arrangements: 25 * 24 * 23 * 22 * 21 * ... * 3 * 2 * 1 = 25! (factorial) ≈ 15,500,000,000,000,000,000,000,000
The Bounded Tiling Problem is O(N!).
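The factorial and the runtime estimate are easy to check in Python (illustrative, not part of the deck):

```python
import math

arrangements = math.factorial(25)       # ≈ 1.55e25
seconds = arrangements / 10**6          # a million tiles per second
years = seconds / (60 * 60 * 24 * 365)  # hundreds of billions of years
```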
Slide 39
Tiling (N!) Runtime
For N = 25: 25! ≈ 15,500,000,000,000,000,000,000,000. If we could place a million tiles per second, it would take 470 billion years to complete.
Why not a faster computer?
Slide 40
A Faster Computer
If we had a computer that could execute a trillion instructions per second (a million times faster than our MIPS computer): the 5x5 tiling problem would take 470,000 years, and the 64-ring Tower of Hanoi problem would take 213 days.
Why not an even faster computer!
Slide 41
The Fastest Computer Possible?
What if instructions took ZERO time to execute, and CPU registers could be loaded at the speed of light? These algorithms are still unreasonable! The speed of light is only so fast!
Slide 42
Where Does this Leave Us?
Clearly, algorithms have varying runtimes. We'd like a way to categorize them: reasonable, so it may be useful; unreasonable, so why bother running it?
Slide 43
Performance Categories of Algorithms
Polynomial: Sub-linear O(Log N); Linear O(N); Nearly linear O(N Log N); Quadratic O(N^2)
Exponential: O(2^N); O(N!); O(N^N)
Slide 44
Reasonable vs. Unreasonable
Reasonable algorithms have polynomial factors: O(Log N), O(N), O(N^K) where K is a constant.
Unreasonable algorithms have exponential factors: O(2^N), O(N!), O(N^N).
Slide 45
Reasonable vs. Unreasonable
Reasonable algorithms may be usable depending upon the input size. Unreasonable algorithms are impractical (though useful to theorists) and demonstrate the need for approximate solutions. Remember, we're dealing with large N (input size).
Slide 46
Two Categories of Algorithms
(graph: Runtime vs. Size of Input (N), for N from 2 to 1024 and runtimes from 10 up to 10^35; the N and N^5 curves lie in the "Reasonable" region, while the 2^N curve climbs into the "Unreasonable - Don't Care!" region)
Slide 47
Summary Reasonable algorithms feature polynomial factors in
their O() and may be usable depending upon input size. Unreasonable
algorithms feature exponential factors in their O() and have no
practical utility.
Slide 48
Questions?
Slide 49
Using O() Analysis in Design
Slide 50
Air Traffic Control (radar display: tracks coast, are added, and are deleted; Conflict Alert)
Slide 51
Problem Statement
What data structure should be used to store the aircraft records for this system? Normal operations conducted are:
• Data Entry: adding new aircraft entering the area
• Radar Update: input from the antenna
• Coast: global traversal to verify that all aircraft have been updated [coast for 5 cycles, then drop]
• Query: controller requesting data about a specific aircraft by location
• Conflict Analysis: make sure no two aircraft are too close together
Slide 52
Air Traffic Control System

Program                 Algorithm           Freq
1. Data Entry / Exit    Insert              15
2. Radar Data Update    N * Search          12
3. Coast / Drop         Traverse            60
4. Query                Search              1
5. Conflict Analysis    Traverse * Search   12
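The design question can be worked through numerically. The sketch below weighs two candidate structures using the per-operation costs from the earlier comparison table and the frequencies above; the value N = 512 and the choice of candidates are my assumptions, not part of the slides:

```python
import math

N = 512                  # assumed number of aircraft tracks
log_n = math.log2(N)

# Per-operation costs drawn from the data-structure comparison table.
costs = {
    "unsorted linked list": {"traverse": N, "search": N,     "insert": 1},
    "sorted array":         {"traverse": N, "search": log_n, "insert": N},
}

def cost_per_cycle(c):
    # Weighted by the frequencies in the table above.
    return (15 * c["insert"]         # 1. Data Entry / Exit
            + 12 * N * c["search"]   # 2. Radar Data Update: N searches
            + 60 * c["traverse"]     # 3. Coast / Drop: traversal
            + 1  * c["search"]       # 4. Query
            + 12 * N * c["search"])  # 5. Conflict Analysis: traverse * search
```

The N * search terms dominate the total, so the structure with O(Log N) search wins here even though its insert costs O(N).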
Slide 53
Questions?
Slide 54
Concurrent Systems
Slide 55
Sequential Processing
All of the algorithms we've seen so far are sequential: they have one thread of execution, one step follows another in sequence, and one processor is all that is needed to run the algorithm.
Slide 56
A Non-sequential Example Consider a house with a burglar alarm
system. The system continually monitors: The front door The back
door The sliding glass door The door to the deck The kitchen
windows The living room windows The bedroom windows The burglar
alarm is watching all of these at once (at the same time).
Slide 57
Another Non-sequential Example
Your car has an onboard digital dashboard that simultaneously: calculates how fast you're going and displays it on the speedometer; checks your oil level; checks your fuel level and calculates consumption; monitors the heat of the engine and turns on a light if it is too hot; and monitors your alternator to make sure it is charging your battery.
Slide 58
Concurrent Systems A system in which: Multiple tasks can be
executed at the same time The tasks may be duplicates of each
other, or distinct tasks The overall time to perform the series of
tasks is reduced
Slide 59
Advantages of Concurrency Concurrent processes can reduce
duplication in code. The overall runtime of the algorithm can be
significantly reduced. More real-world problems can be solved than
with sequential algorithms alone. Redundancy can make systems more
reliable.
Slide 60
Disadvantages of Concurrency
Runtime is not always reduced, so careful planning is required. Concurrent algorithms can be more complex than sequential algorithms. Shared data can be corrupted. Communication between tasks is needed.
Slide 61
Achieving Concurrency
(diagram: CPU 1 and CPU 2 sharing Memory over a bus)
Many computers today have more than one processor (multiprocessor machines).
Slide 62
Achieving Concurrency
(diagram: one CPU juggling task 1, task 2, and task 3 while a suspended task sleeps)
Concurrency can also be achieved on a computer with only one processor: the computer juggles jobs, swapping its attention to each in turn. Time slicing allows many users to get CPU resources, and tasks may be suspended while they wait for something, such as device I/O.
Slide 63
Concurrency vs. Parallelism
Concurrency is the execution of multiple tasks at the same time, regardless of the number of processors. Parallelism is the use of multiple processors to execute a single task.
Slide 64
Types of Concurrent Systems Multiprogramming Multiprocessing
Multitasking Distributed Systems
Slide 65
Multiprogramming
Shares a single CPU among many users or tasks. May use a time-shared algorithm or a priority algorithm to determine which task to run next. Gives the illusion of simultaneous processing through rapid swapping of tasks (interleaving).
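Interleaving can be mimicked with Python generators, where each `next()` call stands in for one time slice; this is an illustrative sketch of the idea, not an implementation from the slides:

```python
def task(name, steps):
    # Each yield is one time slice of this task's work.
    for i in range(steps):
        yield f"{name}:{i}"

def round_robin(tasks):
    """Give each task one time slice in turn until all finish."""
    trace = []
    while tasks:
        t = tasks.pop(0)
        try:
            trace.append(next(t))
            tasks.append(t)        # not done: back to the end of the line
        except StopIteration:
            pass                   # task finished: drop it
    return trace
```

round_robin([task("A", 2), task("B", 2)]) produces ["A:0", "B:0", "A:1", "B:1"]: the two tasks appear to run simultaneously even though only one step executes at a time.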
Slide 66
Multiprogramming
(diagram: User 1 and User 2 programs in Memory, taking turns on a single CPU)
Slide 67
Multiprogramming
(diagram: one CPU shared among tasks/users 1-4)
Slide 68
Multiprocessing Executes multiple tasks at the same time Uses
multiple processors to accomplish the tasks Each processor may also
timeshare among several tasks Has a shared memory that is used by
all the tasks
Slide 69
Multiprocessing
(diagram: two CPUs sharing Memory, running User 1: Task1, User 1: Task2, and User 2: Task1)
Multitasking
A single user can have multiple tasks running at the same time. Can be done with one or more processors. It used to be rare and found only on expensive multiprocessing systems, but now most modern operating systems can do it.
Slide 72
Multitasking
(diagram: one CPU and Memory running User 1's Task1, Task2, and Task3)
Slide 73
Multitasking
(diagram: CPUs 1-4 running four tasks for a single user)
Slide 74
Distributed Systems
(diagram: a Central Bank connected to ATMs at Buford, Perimeter, Student Ctr, and North Ave)
Multiple computers working together with no central program in charge.
Slide 75
Distributed Systems
Advantages: no bottlenecks from sharing processors; no central point of failure; processing can be localized for efficiency.
Disadvantages: complexity; communication overhead; distributed control.
Slide 76
Questions?
Slide 77
Parallelism
Slide 78
Using multiple processors to solve a single task. Involves:
Breaking the task into meaningful pieces Doing the work on many
processors Coordinating and putting the pieces back together.
Slide 79
Parallelism
(diagram: processing nodes, each with a CPU, Memory, and Network Interface)
Slide 80
Parallelism
(diagram: CPUs 1-4 cooperating on tasks 1-4)
Slide 81
Pipeline Processing
Repeating a sequence of operations or pieces of a task. Allocating each piece to a separate processor and chaining them together produces a pipeline, completing tasks faster.
(diagram: stages A, B, C, D chained from input to output)
Slide 82
Example
Suppose you have a choice between a washer and a dryer, each with a 30-minute cycle, or a washer/dryer combo with a one-hour cycle. The correct answer depends on how much work you have to do.
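The laundry trade-off can be computed directly (my sketch, not from the slides; transfer overhead between washer and dryer is ignored here):

```python
def combo_time(loads, cycle=60):
    # One combined washer/dryer: loads run strictly one after another.
    return loads * cycle

def pipelined_time(loads, stage=30, stages=2):
    # Separate washer and dryer chained as a two-stage pipeline: the first
    # load finishes after stages * stage minutes, then one load per stage time.
    return stage * (stages + loads - 1)
```

With one load both options take 60 minutes, a tie; with three loads the pipeline finishes in 120 minutes versus 180 for the combo, which is why the answer depends on how much work you have.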
Slide 83
One Load
(timing diagram: separate wash and dry vs. combo, with Transfer Overhead marked)
Slide 84
Three Loads
(timing diagram: pipelined washer and dryer vs. wash/dry combo)
Slide 85
Examples of Pipelined Tasks
• Automobile manufacturing
• Instruction processing within a computer
(diagram: five-stage instructions overlapped on stations A-D across time steps 0-7)
Slide 86
Task Queues
(diagram: processors P1, P2, P3, ..., Pn drawing from a Super Task Queue)
A supervisor processor maintains a queue of tasks to be performed in shared memory. Each processor queries the queue, dequeues the next task, and performs it. Task execution may involve adding more tasks to the task queue.
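A shared task queue can be sketched with Python's thread-safe `queue.Queue`; this is an illustration under assumed details (squaring a number stands in for real work, and `run_workers` is my own helper name):

```python
import queue
import threading

def run_workers(tasks, num_workers):
    """Each worker repeatedly dequeues a task and performs it."""
    q = queue.Queue()
    for t in tasks:
        q.put(t)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                t = q.get_nowait()   # query the queue, dequeue the next task
            except queue.Empty:
                return               # no tasks left for this processor
            r = t * t                # "perform" the task (a stand-in)
            with lock:
                results.append(r)    # results land in shared memory

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return sorted(results)           # completion order is nondeterministic
```

In a real system a worker could also `q.put(...)` new tasks while running, matching the slide's note that task execution may add more tasks to the queue.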
Slide 87
Parallelizing Algorithms How much gain can we get from
parallelizing an algorithm?
Slide 88
Parallel Bubblesort
(diagram: the array 93 87 74 65 57 45 33 27 with pair-wise comparisons marked)
We can use N/2 processors to do all the comparisons at once, flip-flopping between the even and odd pair-wise comparisons.
Slide 89
Runtime of Parallel Bubblesort
(diagram: the array 93 87 74 65 57 45 33 27 shown after each successive time step)
Slide 90
Completion Time of Bubblesort
Sequential bubblesort finishes in N^2 time. Parallel bubblesort finishes in N time.
Bubble Sort: O(N^2); parallel: O(N)
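Parallel bubblesort of this kind is usually called odd-even transposition sort. The sequential Python sketch below simulates it (not code from the slides), with each outer iteration standing for one parallel time step:

```python
def parallel_bubblesort(a):
    """Simulate odd-even transposition sort; each pass = one time step."""
    a = list(a)
    n = len(a)
    for step in range(n):                   # N time steps suffice
        start = step % 2                    # alternate even and odd pairs
        for i in range(start, n - 1, 2):    # these compares run "at once"
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```

On the slide's array [93, 87, 74, 65, 57, 45, 33, 27] this sorts in 8 time steps, versus the roughly N^2 sequential comparisons of ordinary bubblesort.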
Slide 91
Product Complexity
We got done in O(N) time, better than O(N^2). But each time chunk does O(N) work, and there are N time chunks. Thus, the amount of work is still O(N^2). Product complexity is the amount of work per time chunk multiplied by the number of time chunks: the total work done.
Slide 92
Ceiling of Improvement
Parallelization can reduce time, but it cannot reduce work; the product complexity cannot change or improve. How much improvement can parallelization provide? Given an O(N Log N) algorithm and Log N processors, the algorithm will take at least O(N) time. Given an O(N^3) algorithm and N processors, the algorithm will take at least O(N^2) time.
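The two answers follow from dividing total work by processor count, which we can check numerically (an illustrative sketch, not from the slides):

```python
import math

def best_parallel_time(work, processors):
    # Time can never beat total work divided by processor count,
    # because parallelization reduces time but not work.
    return work / processors

n = 1024
# O(N Log N) work on Log N processors: at least O(N) time.
nlogn_bound = best_parallel_time(n * math.log2(n), math.log2(n))
# O(N^3) work on N processors: at least O(N^2) time.
cubic_bound = best_parallel_time(n ** 3, n)
```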
Slide 93
Number of Processors
Processors are limited by hardware. Typically, the number of processors is a power of 2. Usually the number of processors is a constant factor, 2^K. Conceivably, networked computers could be joined as needed (a la Borg?).
Slide 94
Adding Processors
A program that runs in X time on one processor will run in no more than X/2 time when another processor is added. Realistically, it will run in a bit more than X/2 time because of overhead. At some point, adding processors will not help and could even degrade performance.
Slide 95
Overhead of Parallelization
Parallelization is not free. Processors must be controlled and coordinated; we need a way to govern which processor does what work, and this involves extra work. Often the program must be written in a special programming language for parallel systems. Often a parallelized program for one machine (with, say, 2^K processors) doesn't work on other machines (with, say, 2^L processors).
Slide 96
What We Know about Tasks
Tasks are relatively isolated units of computation. They should be roughly equal in duration, and the duration of the unit of work must be much greater than the overhead time. Policy decisions and coordination are required for shared data. Simpler algorithms are the easiest to parallelize.
Slide 97
Questions?
Slide 98
More?
Slide 99
Matrix Multiplication
Slide 100
Inner Product Procedure
procedure inner_prod(a, b, c isoftype in/out Matrix, i, j isoftype in Num)
  // Compute inner product of a[i][*] and b[*][j]
  Sum isoftype Num
  k isoftype Num
  Sum
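The transcript cuts off mid-procedure, so the body of the loop is lost. The same inner-product idea, as described by the header comment, can be sketched in Python (my reconstruction, not the original pseudocode):

```python
def inner_prod(a, b, i, j):
    # Inner product of row i of a with column j of b.
    return sum(a[i][k] * b[k][j] for k in range(len(b)))

def mat_mult(a, b):
    # c[i][j] is the inner product of a's row i and b's column j.
    rows, cols = len(a), len(b[0])
    return [[inner_prod(a, b, i, j) for j in range(cols)]
            for i in range(rows)]
```

Each of the N^2 entries costs an O(N) inner product, giving the O(N^3) total that makes matrix multiplication a natural target for the parallelization ideas above.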