Randomized Algorithms

Pasi Fränti. Randomized Algorithms. 1.10.2014. PowerPoint presentation.

Pasi Fränti, 9.10.2015

Treasure island

Treasure worth 20.000 awaits

[Figure: map of the DAA expedition with travel costs of 5000 marked on the routes and two possible treasure locations marked "?"]

Map for sale: 3000

To buy or not to buy

Buy the map:
20000 – 5000 – 3000 = 12.000

Take a chance:
20000 – 5000 = 15.000 (right guess)
20000 – 5000 – 5000 = 10.000 (wrong guess)

Expected result:
0.5 ∙ 15000 + 0.5 ∙ 10000 = 12.500
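The comparison above can be checked with a few lines of Python. This is only a sketch: the function name `expected_profit` and its parameters are illustrative, and the 50/50 chance of guessing right is taken from the expected-result line.

```python
def expected_profit(treasure, trip, map_price, p_right=0.5):
    """Return (profit with map, expected profit without map)."""
    buy_map = treasure - trip - map_price          # one trip, map paid
    right_guess = treasure - trip                  # one trip, no map
    wrong_guess = treasure - 2 * trip              # two trips, no map
    take_chance = p_right * right_guess + (1 - p_right) * wrong_guess
    return buy_map, take_chance


with_map, without_map = expected_profit(20000, 5000, 3000)
print(with_map, without_map)
```

With these numbers, taking a chance has the higher expected result, although the outcome of a single expedition can be worse than buying the map.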

Three types of randomization

1. Las Vegas
- Output is always a correct result
- Result is not always found
- Probability of success p

2. Monte Carlo
- Result is always found
- Result can be inaccurate (or even false!)
- Probability of success p

3. Sherwood
- Balancing the worst case behavior

Las Vegas

Dining philosophers

Who eats?

Las Vegas

Input: Bit-vector A[1,n]
Output: Index of any 1-bit from A

LasVegas(A, n) → index
  REPEAT
    k ← Random(1, n);
  UNTIL A[k]=1;
  RETURN k

Example: 1 0 0 1 0 0 0 0 0 0 1 0 …
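A minimal Python sketch of the Las Vegas search above. It uses 0-based indices instead of the slide's 1-based ones, and the guard against an all-zero vector is an added assumption (without it the loop would never end).

```python
import random


def las_vegas(bits, rng=random):
    """Return the index of some 1-bit; the answer is always correct."""
    if 1 not in bits:
        # Assumption: fail fast instead of looping forever.
        raise ValueError("bit-vector contains no 1-bit")
    while True:
        k = rng.randrange(len(bits))   # random index 0..n-1
        if bits[k] == 1:
            return k


bits = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
print(las_vegas(bits))
```

The running time is random, but any returned index is guaranteed to point at a 1-bit.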

8-Queens puzzle

INPUT: Eight chess queens and an 8×8 chessboard
OUTPUT: Setup where no queens attack each other

8-Queens brute force

Brute force
• Try all positions
• Mark illegal squares
• Backtrack if dead-end
• 114 setups in total

Random
• Select positions randomly
• If dead-end, start over

Randomized
• Select k rows randomly
• Remaining rows by brute force

[Figure: partially filled chessboard. Where next…?]

Pseudo code

8-Queens(k)
  FOR i=1 TO k DO                      // k Queens randomly
    r ← Random[1,8];
    IF Board[i,r]=TAKEN THEN RETURN Fail;
    ELSE ConquerSquare(i,r);

  FOR i=k+1 TO 8 DO                    // Rest by Brute Force
    r ← 1; found ← NO;
    WHILE (r≤8) AND (NOT found) DO
      IF Board[i,r] NOT TAKEN THEN
        ConquerSquare(i,r); found ← YES;
      r ← r+1;
    IF NOT found THEN RETURN Fail;

ConquerSquare(i,j)
  Board[i,j] ← QUEEN;
  FOR z=i+1 TO 8 DO
    Board[z,j] ← TAKEN;
    Board[z,j-(z-i)] ← TAKEN;          // skip if the column is outside 1..8
    Board[z,j+(z-i)] ← TAKEN;          // skip if the column is outside 1..8
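The randomized 8-Queens idea can be sketched in Python as follows. Two assumptions differ from the pseudo code: the board is represented as a list of column indices (0-based) rather than a marked grid, and the remaining rows are filled by backtracking search, which is how the "brute force" bullet list describes it. The names `safe`, `complete` and `queens_las_vegas` are illustrative.

```python
import random


def safe(cols, row, col):
    """True if (row, col) is not attacked by the queens in rows 0..len(cols)-1."""
    return all(c != col and abs(c - col) != row - r
               for r, c in enumerate(cols))


def complete(cols, n=8):
    """Fill the remaining rows by backtracking; cols is extended in place."""
    row = len(cols)
    if row == n:
        return True
    for col in range(n):
        if safe(cols, row, col):
            cols.append(col)
            if complete(cols, n):
                return True
            cols.pop()                 # dead-end: backtrack
    return False


def queens_las_vegas(k, n=8, rng=random):
    """Place k queens randomly, the rest by brute force; None means Fail."""
    cols = []
    for row in range(k):
        col = rng.randrange(n)
        if not safe(cols, row, col):
            return None                # random square is taken: fail
        cols.append(col)
    return cols if complete(cols, n) else None


solution = None
while solution is None:                # repeat until success
    solution = queens_las_vegas(k=2)
print(solution)
```

A returned setup is always valid; only the number of restarts is random, which is what makes this a Las Vegas algorithm.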

Probability of success

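s = processing time in case of success
e = processing time in case of failure

p = probability of success
q = 1-p = probability of failure

Expected time:

$t = s + \frac{q}{p} e$

Derivation:

$t = p s + q (e + t)$

$t - q t = p s + q e$

$p t = p s + q e \;\Rightarrow\; t = s + \frac{q}{p} e$

Example:

s=e=1, p=1/6

$t = 1 + \frac{5/6}{1/6} \cdot 1 = 6$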
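The expected time t = s + (q/p)·e can be compared against a direct simulation of "repeat until success". The sketch below is illustrative; the function names and the number of simulated runs are assumptions.

```python
import random


def expected_time(s, e, p):
    """Closed form: t = s + (q / p) * e with q = 1 - p."""
    return s + (1 - p) / p * e


def simulate(s, e, p, runs=100_000, rng=random):
    """Average total time of repeating an attempt until it succeeds."""
    total = 0.0
    for _ in range(runs):
        while rng.random() >= p:       # failure with probability q
            total += e
        total += s                     # the final, successful attempt
    return total / runs


print(expected_time(1, 1, 1 / 6))
print(simulate(1, 1, 1 / 6))
```

For s=e=1 and p=1/6 the simulated average should land close to the closed-form value of 6.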

Experiments with varying k

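(K = number of randomly placed queens; S, E, T, P = s, e, t, p as defined above)

K   S      E      T      P
0   114    -      114    100%
1   39.6   -      39.6   100%
2   22.5   36.7   25.2   88%
3   13.5   15.1   29.0   49%
4   10.3   8.8    35.1   26%
5   9.3    7.3    46.9   16%
6   9.1    7      53.5   14%
7   9      7      56.0   13%
8   9      7      56.0   13%

Fastest expected time: K = 2 (T = 25.2)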

Random Swap Clustering

[Figure: clustering with misplaced centroids]

Two centroids, but only one cluster.

One centroid, but two clusters.

Swap-based clustering

Clustering by Random Swap

RandomSwap(X) → C, P
  C ← SelectRandomRepresentatives(X);
  P ← OptimalPartition(X, C);
  REPEAT T times
    (Cnew, j) ← RandomSwap(X, C);
    Pnew ← LocalRepartition(X, Cnew, P, j);
    Cnew, Pnew ← Kmeans(X, Cnew, Pnew);
    IF f(Cnew, Pnew) < f(C, P) THEN
      (C, P) ← Cnew, Pnew;
  RETURN (C, P);

P. Fränti and J. Kivijärvi, "Randomised local search algorithm for the clustering problem", Pattern Analysis and Applications, 3 (4), 358-369, 2000.

Select random neighbor

Accept only if it improves
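The random swap loop can be sketched in plain Python as below. This is not the authors' implementation: it assumes points are tuples, uses mean squared error as the cost function f, runs two k-means iterations per swap, and replaces the local repartition step with a full repartition for brevity. All function names are illustrative.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partition(X, C):
    """Assign every point to its nearest centroid."""
    return [min(range(len(C)), key=lambda j: sq_dist(x, C[j])) for x in X]


def mse(X, C, P):
    return sum(sq_dist(x, C[p]) for x, p in zip(X, P)) / len(X)


def kmeans(X, C, P, iters=2):
    """A few k-means iterations to fine-tune the swapped solution."""
    for _ in range(iters):
        for j in range(len(C)):
            members = [x for x, p in zip(X, P) if p == j]
            if members:                # keep the centroid if its cluster is empty
                C[j] = tuple(sum(v) / len(members) for v in zip(*members))
        P = partition(X, C)
    return C, P


def random_swap(X, M, T, rng=random):
    C = rng.sample(X, M)               # random representatives
    P = partition(X, C)
    for _ in range(T):
        C_new = list(C)
        C_new[rng.randrange(M)] = rng.choice(X)   # select random neighbor
        P_new = partition(X, C_new)    # assumption: full repartition
        C_new, P_new = kmeans(X, C_new, P_new)
        if mse(X, C_new, P_new) < mse(X, C, P):   # accept only if it improves
            C, P = C_new, P_new
    return C, P


X = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
C, P = random_swap(X, M=2, T=30, rng=random.Random(1))
print(P)
```

Because a swap is accepted only when the cost decreases, the solution never gets worse from one iteration to the next.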

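1. Random swap:

$c_j \leftarrow x_i, \quad j = \mathrm{random}(1, M), \; i = \mathrm{random}(1, N)$

2. Re-partition vectors from old cluster:

$p_i \leftarrow \arg\min_{1 \le k \le M} d(x_i, c_k)^2 \quad \forall i : p_i = j$

3. Create new cluster:

$p_i \leftarrow \arg\min_{k = j \,\lor\, k = p_i} d(x_i, c_k)^2 \quad \forall i \in [1, N]$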
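Steps 2 and 3 (the local repartition after a swap) might look like this in Python. It is a sketch under assumptions: indices are 0-based, points are tuples, `C` already contains the swapped-in centroid at index `j`, and `P` is the partition from before the swap. The function names are illustrative.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def local_repartition(X, C, P, j):
    """Update partition P after centroid j has been replaced."""
    M = len(C)
    P = list(P)                        # do not modify the caller's list
    for i, x in enumerate(X):
        if P[i] == j:
            # Step 2: vectors of the removed cluster search all M centroids.
            P[i] = min(range(M), key=lambda k: sq_dist(x, C[k]))
        elif sq_dist(x, C[j]) < sq_dist(x, C[P[i]]):
            # Step 3: other vectors only compare their current centroid
            # with the new one.
            P[i] = j
    return P


X = [(0, 0), (1, 0), (10, 0), (11, 0)]
P_old = [0, 1, 1, 1]                   # partition for old centroids (0,0), (1,0)
C_new = [(0, 0), (10, 0)]              # centroid 1 swapped to (10, 0)
print(local_repartition(X, C_new, P_old, j=1))
```

Only vectors of the removed cluster need the full search over all centroids; every other vector is handled with a single comparison.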

Clustering by Random Swap

Choices for swap

Swap is made from centroid rich area to centroid poor area.

O(M) clusters to be removed

O(M) clusters where to add

= O(M²) different choices in total

Select a proper centroid for removal:

– M clusters in total: p_removal = 1/M.

Select a proper new location:

– N choices: p_add = 1/N

– M of them significantly different: p_add = 1/M

In total:

– M² significantly different swaps.

– Probability of each is p_swap = 1/M²

– Open question: how many of these are good

– Theorem: α choices are good for add and removal.

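Probability for successful Swap

Probability of not finding a good swap when iterated T times:

$q = \left(1 - \frac{\alpha^2}{M^2}\right)^T$

$\log q = T \cdot \log\left(1 - \frac{\alpha^2}{M^2}\right)$

Estimated number of iterations:

$T = \frac{\log q}{\log\left(1 - \frac{\alpha^2}{M^2}\right)}$

Bounds for the iterations

Upper limit:

$T = \frac{\ln q}{\ln\left(1 - \alpha^2 / M^2\right)} \le \frac{\ln q}{-\alpha^2 / M^2} = -\ln q \cdot \frac{M^2}{\alpha^2}$

Lower limit similarly; resulting in:

$T \approx -\ln q \cdot \frac{M^2}{\alpha^2}$

Total time complexity

Number of iterations needed (T):

$T \approx -\ln q \cdot \frac{M^2}{\alpha^2}$

Time complexity of single step (t):

t = O(αN)

Total time:

$T \cdot t = O\left(-\ln q \cdot \frac{M^2}{\alpha^2} \cdot \alpha N\right) = O\left(-\ln q \cdot \frac{N M^2}{\alpha}\right)$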
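To get a feeling for the magnitudes, the iteration estimate T = ln q / ln(1 − α²/M²) and its upper limit −ln q · M²/α² can be evaluated numerically. The values of q, M, N and α below are hypothetical and chosen only for illustration; the sketch assumes α < M.

```python
import math


def iterations_exact(q, M, alpha):
    """T = ln q / ln(1 - alpha^2 / M^2); requires alpha < M."""
    return math.log(q) / math.log(1 - alpha ** 2 / M ** 2)


def iterations_bound(q, M, alpha):
    """Upper limit: T <= -ln q * M^2 / alpha^2."""
    return -math.log(q) * M ** 2 / alpha ** 2


def total_time(q, M, N, alpha):
    """Order of the total work: -ln q * N * M^2 / alpha."""
    return -math.log(q) * N * M ** 2 / alpha


# Hypothetical example: accept a 5 % failure probability with 15 clusters.
q, M, N, alpha = 0.05, 15, 5000, 1
print(round(iterations_exact(q, M, alpha)), round(iterations_bound(q, M, alpha)))
print(total_time(q, M, N, alpha))
```

For small α²/M² the exact value and the upper limit are nearly identical, which is why the simpler expression is used in the total time.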

Probability of success (p) depending on T

[Figure: probability of success p (0 to 100) plotted against the number of iterations (0 to 300)]

Time-distortion performance

[Figure: MSE (160 to 190) plotted against time (0.1 to 1000) for "Bridge", comparing Random Swap and Repeated k-means]

Monte Carlo

Input: Bit-vector A[1,n], max iterations z
Output: An index of any 1-bit in A (can be wrong if no 1-bit is found within z iterations).

MonteCarlo(A, n, z) → index
  i ← 0;
  REPEAT
    k ← Random(1, n);
    i ← i+1;
  UNTIL (A[k]=1) OR (i≥z);
  RETURN k

Example: 1 0 0 1 0 0 0 0 0 0 1 0 …
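A Python sketch of the Monte Carlo variant, using 0-based indices and assuming z ≥ 1. The helper `failure_probability` is an addition for illustration: it gives the chance that all z probes miss, assuming independent uniform probes.

```python
import random


def monte_carlo(bits, z, rng=random):
    """Probe at most z random positions; the returned index may be wrong."""
    k = 0
    for _ in range(z):
        k = rng.randrange(len(bits))
        if bits[k] == 1:
            break                      # found a 1-bit, stop early
    return k                           # always returns something


def failure_probability(bits, z):
    """Probability that all z probes hit a 0-bit."""
    ones = sum(bits)
    return (1 - ones / len(bits)) ** z


bits = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
k = monte_carlo(bits, z=5)
print(k, bits[k] == 1, failure_probability(bits, 5))
```

Unlike the Las Vegas version, the running time is bounded by z, at the price of a possibly incorrect answer.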

Monte Carlo

Potential problems to be considered:
• Detecting prime numbers
• Calculating integral of a function

To appear in … future…
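As an illustration of the second problem, the integral of a function can be estimated by averaging the function at random points. The sketch below is the standard sample-mean estimator, not material from the slides; the function name and sample count are assumptions.

```python
import random


def mc_integrate(f, a, b, samples=100_000, rng=random):
    """Estimate the integral of f over [a, b] from uniform random samples."""
    total = sum(f(rng.uniform(a, b)) for _ in range(samples))
    return (b - a) * total / samples


estimate = mc_integrate(lambda x: x * x, 0.0, 1.0)
print(estimate)                        # should be close to 1/3
```

A result is always produced, but it is only approximately correct; more samples reduce the error.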

Sherwood

• Used in Quicksort
• Problem case: data is already sorted
• Worst case always happens if the data is sorted

Partition sizes with a naïve pivot on sorted data:
N-1 | 1
N-2 | 1
N-3 | 1
…
O(N²)

Selection of pivot element
Naïve: first or last item

98 97 92 91 … 50 33 31 28 24 12 11 7 7 5 2 1

1 2 5 7 7 11 12 24 28 31 33 50 … 91 92 97 98

Selection of pivot element
Random item

[Figure: recursive splits into parts of 25% and 75%, giving O(N log N)]

1 2 5 7 7 11 12 24 28 31 33 50 …

• Worst case can still happen
• But with probability (1/n)^n
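A short Python sketch of quicksort with a random pivot. It builds new lists instead of partitioning in place, which is a simplification chosen for readability rather than efficiency.

```python
import random


def quicksort(items, rng=random):
    """Quicksort with a randomly selected pivot element."""
    if len(items) <= 1:
        return list(items)
    pivot = rng.choice(items)          # random item instead of first or last
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller, rng) + equal + quicksort(larger, rng)


already_sorted = [1, 2, 5, 7, 7, 11, 12, 24, 28, 31, 33, 50]
print(quicksort(already_sorted))
```

With a random pivot, already sorted input is no longer a special case: no particular input ordering triggers the quadratic behavior.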

Simulated dynamic linked list

1. Sorted array
- Search efficient: O(logN)
- Insert and Delete slow: O(N)

2. Dynamically linked list
- Insert and Delete fast: O(1)
- Search inefficient: O(N)

Simulated dynamic linked list
Example

Linked list:
Head → 1 → 2 → 4 → 5 → 7 → 15 → 21

Simulated by array (Head=4):

i      1  2  3   4  5  6   7
Value  2  4  15  1  5  21  7
Next   2  5  6   1  7  0   3

SEARCH (A, x)
  i := A.HEAD;
  max := A[i].VALUE;
  FOR k:=1 TO √N DO
    j:=RANDOM(1, N);
    y:=A[j].VALUE;
    IF (max<y) AND (y≤x) THEN
      i:=j; max:=y;
  RETURN LinearSearch(A, x, i);
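The search can be sketched in Python on the example array (values 2 4 15 1 5 21 7, Head=4). Assumptions in this sketch: the arrays are 1-based with index 0 unused, Next = 0 marks the end of the list, about √N random samples are drawn as in the analysis of the search, and a return value of 0 means "not found".

```python
import math
import random


def search(value, nxt, head, x, rng=random):
    """Return the index of x in the simulated sorted linked list, or 0."""
    n = len(value) - 1                 # index 0 is unused
    i, best = head, value[head]
    for _ in range(math.isqrt(n)):     # about sqrt(N) random breakpoints
        j = rng.randint(1, n)
        y = value[j]
        if best < y <= x:              # biggest breakpoint not exceeding x
            i, best = j, y
    while i != 0 and value[i] < x:     # linear search from the breakpoint
        i = nxt[i]
    return i if i != 0 and value[i] == x else 0


value = [None, 2, 4, 15, 1, 5, 21, 7]
nxt = [0, 2, 5, 6, 1, 7, 0, 3]
print(search(value, nxt, head=4, x=7))
```

The random samples only choose where the linear search starts, so the answer is always correct; randomization affects the running time alone.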

Simulated dynamic linked list
Divide-and-conquer with randomization

√N random breakpoints

Biggest breakpoint ≤ x

Value searched

Full search from breakpoint i

Analysis of the search

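[Figure: the list between the breakpoint (max) and the value searched for; √N samples, then √N steps of linear search (on average)]

• Divide into √N segments
• Each segment has N/√N = √N elements
• Linear search within one segment
• Expected time complexity = √N + √N = O(√N)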

Experiment with students

Data (N=100) consists of numbers from 1..100:

1 2 3 4 … 99 100

Select √N breaking points:

Searching for… 77
