5.4 randomized datastructures

CSE 326Randomized Data Structures

David Kaplan

Dept of Computer Science & EngineeringAutumn 2001

Randomized Data Structures

CSE 326 Autumn 20012

Motivation for

Randomized Data StructuresWe’ve seen many data structures with good average case performance on random inputs, but bad behavior on particular inputs

e.g. Binary Search Trees

Instead of randomizing the input (since we cannot!), consider randomizing the data structure

No bad inputs, just unlucky random numbers Expected case good behavior on any input



Average vs. Expected TimeAverage (1/N) xi

Expectation Pr(xi) xi

Deterministic with good average time If your application happens to always (or often) use the

“bad” case, you are in big trouble!

Randomized with good expected time Once in a while you will have an expensive operation, but

no inputs can make this happen all the time

Like an insurance policy for your algorithm!



Randomized Data Structures Define a property (or subroutine) in an algorithm Sample or randomly modify the property Use altered property as if it were the true

property

Can transform average case runtimes into expected runtimes (remove input dependency).

Sometimes allows substantial speedup in exchange for probabilistic unsoundness.



Randomization in Action Quicksort Randomized data structures

Treaps Randomized skip lists

Random number generation Primality testing



Treap Dictionary Data Structure

Treap is a BST binary tree property search tree

property

Treap is also a heap heap-order

property random priorities

1512

1030

915

78

418

67

29

prioritykey



Treap Insert Choose a random priority Insert as in normal BST Rotate up until heap order is restored

67

insert(15)

78

29

1412

67

78

29

1412

915

67

78

29

915

1412



Tree + Heap… Why Bother? Insert data in sorted order into a treap …

What shape tree comes out?

67

insert(7)

67

insert(8)

78

67

insert(9)

78

29

67

insert(12)

78

29

1512

Treap Delete Find the key Increase its value to Rotate it to the fringe Snip it off

delete(9)

67

78

29

915

1512

78

67

9

915

1512

78

67

915

9

1512

78

67

915

1512

9

78

67

915

1512



Treap Summary Implements Dictionary ADT

insert in expected O(log n) time delete in expected O(log n) time find in expected O(log n) time but worst case O(n)

Memory use O(1) per node about the cost of AVL trees

Very simple to implement little overhead – less than AVL trees



Perfect Binary Skip List Sorted linked list # of links of a node is its height The height i link of each node (that has one)

links to the next node of height i or greater

8

2

11

10

1913 20

22

2923



Find in a Perfect Binary Skip List Start i at the maximum height Until the node is found or i is one and

the next node is too large: If the next node along the i link is less than

the target, traverse to the next node Otherwise, decrease i by one

Runtime?



Randomized Skip List IntuitionIt’s far too hard to insert into a perfect skip

list

But is perfection necessary?

What matters in a skip list? We want way fewer tall nodes than short ones Make good progress through the list with each

high traverse



Randomized Skip List Sorted linked list # of links of a node is its height The height i link of each node (that has one)

links to the next node of height i or greater There should be about 1/2 as many height i+1

nodes as height i nodes

2 19 23

8

13

292010

22

11



Find in a RSL Start i at the maximum height Until the node is found or i is one and

the next node is too large: If the next node along the i link is less than

the target, traverse to the next node Otherwise, decrease i by one

Same as for a perfect skip list!

Runtime?



Insertion in a RSL Flip a coin until it comes up heads; that

takes i flips. Make the new node’s height i. Do a find, remembering nodes where we go

down Put the node at the spot where the find ends Point all the nodes where we went down (up

to the new node’s height) at the new node Point the new node’s links where those

redirected pointers were pointing

RSL Insert Example

2 19 23

8

13

292010 11

insert(22)with 3 flips

2 19 23

8

13

292010

22

11

Runtime?



Range Queries and IterationRange query: search for everything that falls between two values

find start point walk list at the bottom output each node until the end point

Iteration: successively return (in order) each element in the structure

start at beginning, walk list to end Just like a linked list!

Runtimes?



Randomized Skip ListSummary

Implements Dictionary ADT insert in expected O(log n) find in expected O(log n)

Memory use expected constant memory per node about double a linked list

Less overhead than search trees for iteration over range queries



Random Number GenerationLinear congruential generator

xi+1 = Axi mod M

Choose A, M to minimize repetitions M is prime (Axi mod M) cycles through {1, … , M-1} before

repeating Good choices:

M = 231 – 1 = 2,147,483,647A = 48,271

Must handle overflow!



Some Properties of PrimesIf:

P is prime 0 < X < P

Then: XP-1 = 1 (mod P) the only solutions to X2 = 1 (mod P) are:

X = 1 and X = P - 1



Checking Non-PrimalityExponentiation (pow function, Weiss 2.4.4) can be modified to detect non-primality (composite-ness)

// Computes x^n mod(M)

HugeInt pow(HugeInt x, HugeInt n, HugeInt M) {

if (n == 0) return 1;

if (n == 1) return x;

HugeInt squared = x * x % M;

// 1 < x < M - 1 && squared == 1 --> M isn’t prime!

if (isEven(n)) return pow(squared, n/2, M);

else return (pow(squared, n/2, M) * x);

}



Checking PrimalitySystematic algorithm

For prime P, for all A such that 0 < A < PCalculate AP-1 mod P using powCheck at each step of pow and at end for primality conditions

Randomized algorithmUse one random A If the randomized algorithm reports failure, then P isn’t prime. If the randomized algorithm reports success, then P might be

prime. At most (P-9)/4 values of A fool this prime check if algorithm reports success, then P is prime with probability > ¾

Each new A has independent probability of false positive We can make probability of false positive arbitrarily small



Evaluating Randomized Primality TestingYour probability of being struck by lightning this

year: 0.00004%Your probability that a number that tests prime 11

times in a row is actually not prime: 0.00003%

Your probability of winning a lottery of 1 million people five times in a row: 1 in 2100

Your probability that a number that tests prime 50 times in a row is actually not prime: 1 in 2100



Cycles

Prime numbers

Random numbers

Randomized algorithms

Education

5.4 randomized datastructures