Upload
krishver2
View
48
Download
0
Tags:
Embed Size (px)
Citation preview
Randomized Data Structures
CSE 326 Autumn 20012
Motivation for
Randomized Data StructuresWe’ve seen many data structures with good average case performance on random inputs, but bad behavior on particular inputs
e.g. Binary Search Trees
Instead of randomizing the input (since we cannot!), consider randomizing the data structure
No bad inputs, just unlucky random numbers Expected case good behavior on any input
Randomized Data Structures
CSE 326 Autumn 20013
Average vs. Expected TimeAverage (1/N) xi
Expectation Pr(xi) xi
Deterministic with good average time If your application happens to always (or often) use the
“bad” case, you are in big trouble!
Randomized with good expected time Once in a while you will have an expensive operation, but
no inputs can make this happen all the time
Like an insurance policy for your algorithm!
Randomized Data Structures
CSE 326 Autumn 20014
Randomized Data Structures Define a property (or subroutine) in an algorithm Sample or randomly modify the property Use altered property as if it were the true
property
Can transform average case runtimes into expected runtimes (remove input dependency).
Sometimes allows substantial speedup in exchange for probabilistic unsoundness.
Randomized Data Structures
CSE 326 Autumn 20015
Randomization in Action Quicksort Randomized data structures
Treaps Randomized skip lists
Random number generation Primality testing
Randomized Data Structures
CSE 326 Autumn 20016
Treap Dictionary Data Structure
Treap is a BST binary tree property search tree
property
Treap is also a heap heap-order
property random priorities
1512
1030
915
78
418
67
29
prioritykey
Randomized Data Structures
CSE 326 Autumn 20017
Treap Insert Choose a random priority Insert as in normal BST Rotate up until heap order is restored
67
insert(15)
78
29
1412
67
78
29
1412
915
67
78
29
915
1412
Randomized Data Structures
CSE 326 Autumn 20018
Tree + Heap… Why Bother? Insert data in sorted order into a treap …
What shape tree comes out?
67
insert(7)
67
insert(8)
78
67
insert(9)
78
29
67
insert(12)
78
29
1512
Treap Delete Find the key Increase its value to Rotate it to the fringe Snip it off
delete(9)
67
78
29
915
1512
78
67
9
915
1512
78
67
915
9
1512
78
67
915
1512
9
78
67
915
1512
Randomized Data Structures
CSE 326 Autumn 200110
Treap Summary Implements Dictionary ADT
insert in expected O(log n) time delete in expected O(log n) time find in expected O(log n) time but worst case O(n)
Memory use O(1) per node about the cost of AVL trees
Very simple to implement little overhead – less than AVL trees
Randomized Data Structures
CSE 326 Autumn 200111
Perfect Binary Skip List Sorted linked list # of links of a node is its height The height i link of each node (that has one)
links to the next node of height i or greater
8
2
11
10
1913 20
22
2923
Randomized Data Structures
CSE 326 Autumn 200112
Find in a Perfect Binary Skip List Start i at the maximum height Until the node is found or i is one and
the next node is too large: If the next node along the i link is less than
the target, traverse to the next node Otherwise, decrease i by one
Runtime?
Randomized Data Structures
CSE 326 Autumn 200113
Randomized Skip List IntuitionIt’s far too hard to insert into a perfect skip
list
But is perfection necessary?
What matters in a skip list? We want way fewer tall nodes than short ones Make good progress through the list with each
high traverse
Randomized Data Structures
CSE 326 Autumn 200114
Randomized Skip List Sorted linked list # of links of a node is its height The height i link of each node (that has one)
links to the next node of height i or greater There should be about 1/2 as many height i+1
nodes as height i nodes
2 19 23
8
13
292010
22
11
Randomized Data Structures
CSE 326 Autumn 200115
Find in a RSL Start i at the maximum height Until the node is found or i is one and
the next node is too large: If the next node along the i link is less than
the target, traverse to the next node Otherwise, decrease i by one
Same as for a perfect skip list!
Runtime?
Randomized Data Structures
CSE 326 Autumn 200116
Insertion in a RSL Flip a coin until it comes up heads; that
takes i flips. Make the new node’s height i. Do a find, remembering nodes where we go
down Put the node at the spot where the find ends Point all the nodes where we went down (up
to the new node’s height) at the new node Point the new node’s links where those
redirected pointers were pointing
Randomized Data Structures
CSE 326 Autumn 200118
Range Queries and IterationRange query: search for everything that falls between two values
find start point walk list at the bottom output each node until the end point
Iteration: successively return (in order) each element in the structure
start at beginning, walk list to end Just like a linked list!
Runtimes?
Randomized Data Structures
CSE 326 Autumn 200119
Randomized Skip ListSummary
Implements Dictionary ADT insert in expected O(log n) find in expected O(log n)
Memory use expected constant memory per node about double a linked list
Less overhead than search trees for iteration over range queries
Randomized Data Structures
CSE 326 Autumn 200120
Random Number GenerationLinear congruential generator
xi+1 = Axi mod M
Choose A, M to minimize repetitions M is prime (Axi mod M) cycles through {1, … , M-1} before
repeating Good choices:
M = 231 – 1 = 2,147,483,647A = 48,271
Must handle overflow!
Randomized Data Structures
CSE 326 Autumn 200121
Some Properties of PrimesIf:
P is prime 0 < X < P
Then: XP-1 = 1 (mod P) the only solutions to X2 = 1 (mod P) are:
X = 1 and X = P - 1
Randomized Data Structures
CSE 326 Autumn 200122
Checking Non-PrimalityExponentiation (pow function, Weiss 2.4.4) can be modified to detect non-primality (composite-ness)
// Computes x^n mod(M)
HugeInt pow(HugeInt x, HugeInt n, HugeInt M) {
if (n == 0) return 1;
if (n == 1) return x;
HugeInt squared = x * x % M;
// 1 < x < M - 1 && squared == 1 --> M isn’t prime!
if (isEven(n)) return pow(squared, n/2, M);
else return (pow(squared, n/2, M) * x);
}
Randomized Data Structures
CSE 326 Autumn 200123
Checking PrimalitySystematic algorithm
For prime P, for all A such that 0 < A < PCalculate AP-1 mod P using powCheck at each step of pow and at end for primality conditions
Randomized algorithmUse one random A If the randomized algorithm reports failure, then P isn’t prime. If the randomized algorithm reports success, then P might be
prime. At most (P-9)/4 values of A fool this prime check if algorithm reports success, then P is prime with probability > ¾
Each new A has independent probability of false positive We can make probability of false positive arbitrarily small
Randomized Data Structures
CSE 326 Autumn 200124
Evaluating Randomized Primality TestingYour probability of being struck by lightning this
year: 0.00004%Your probability that a number that tests prime 11
times in a row is actually not prime: 0.00003%
Your probability of winning a lottery of 1 million people five times in a row: 1 in 2100
Your probability that a number that tests prime 50 times in a row is actually not prime: 1 in 2100