307
Analysis of Algorithm Disjoint Set Representation Andres Mendez-Vazquez November 8, 2015 1 / 114

17 Disjoint Set Representation

Embed Size (px)

Citation preview

Page 1: 17 Disjoint Set Representation

Analysis of AlgorithmDisjoint Set Representation

Andres Mendez-Vazquez

November 8, 2015

1 / 114

Page 2: 17 Disjoint Set Representation

Outline1 Disjoint Set Representation

Definition of the ProblemOperations

2 Union-Find ProblemThe Main ProblemApplications

3 ImplementationsFirst Attempt: Circular ListOperations and CostStill we have a Problem

Weighted-Union HeuristicOperationsStill a Problem

Heuristic Union by Rank

4 Balanced UnionPath compressionTime ComplexityAckermann’s FunctionBoundsThe Rank ObservationProof of ComplexityTheorem for Union by Rank and Path Compression)

2 / 114

Page 3: 17 Disjoint Set Representation

Outline1 Disjoint Set Representation

Definition of the ProblemOperations

2 Union-Find ProblemThe Main ProblemApplications

3 ImplementationsFirst Attempt: Circular ListOperations and CostStill we have a Problem

Weighted-Union HeuristicOperationsStill a Problem

Heuristic Union by Rank

4 Balanced UnionPath compressionTime ComplexityAckermann’s FunctionBoundsThe Rank ObservationProof of ComplexityTheorem for Union by Rank and Path Compression)

3 / 114

Page 4: 17 Disjoint Set Representation

Disjoint Set Representation

Problem1 Items are drawn from the finite universe U = 1, 2, ...,n for some fixed

n.2 We want to maintain a partition of U as a collection of disjoint sets.3 In addition, we want to uniquely name each set by one of its items

called its representative item.

These disjoint sets are maintained under the following operations1 MakeSet(x)2 Union(A,B)3 Find(x)

4 / 114

Page 5: 17 Disjoint Set Representation

Disjoint Set Representation

Problem1 Items are drawn from the finite universe U = 1, 2, ...,n for some fixed

n.2 We want to maintain a partition of U as a collection of disjoint sets.3 In addition, we want to uniquely name each set by one of its items

called its representative item.

These disjoint sets are maintained under the following operations1 MakeSet(x)2 Union(A,B)3 Find(x)

4 / 114

Page 6: 17 Disjoint Set Representation

Disjoint Set Representation

Problem1 Items are drawn from the finite universe U = 1, 2, ...,n for some fixed

n.2 We want to maintain a partition of U as a collection of disjoint sets.3 In addition, we want to uniquely name each set by one of its items

called its representative item.

These disjoint sets are maintained under the following operations1 MakeSet(x)2 Union(A,B)3 Find(x)

4 / 114

Page 7: 17 Disjoint Set Representation

Disjoint Set Representation

Problem1 Items are drawn from the finite universe U = 1, 2, ...,n for some fixed

n.2 We want to maintain a partition of U as a collection of disjoint sets.3 In addition, we want to uniquely name each set by one of its items

called its representative item.

These disjoint sets are maintained under the following operations1 MakeSet(x)2 Union(A,B)3 Find(x)

4 / 114

Page 8: 17 Disjoint Set Representation

Disjoint Set Representation

Problem1 Items are drawn from the finite universe U = 1, 2, ...,n for some fixed

n.2 We want to maintain a partition of U as a collection of disjoint sets.3 In addition, we want to uniquely name each set by one of its items

called its representative item.

These disjoint sets are maintained under the following operations1 MakeSet(x)2 Union(A,B)3 Find(x)

4 / 114

Page 9: 17 Disjoint Set Representation

Disjoint Set Representation

Problem1 Items are drawn from the finite universe U = 1, 2, ...,n for some fixed

n.2 We want to maintain a partition of U as a collection of disjoint sets.3 In addition, we want to uniquely name each set by one of its items

called its representative item.

These disjoint sets are maintained under the following operations1 MakeSet(x)2 Union(A,B)3 Find(x)

4 / 114

Page 10: 17 Disjoint Set Representation

Outline1 Disjoint Set Representation

Definition of the ProblemOperations

2 Union-Find ProblemThe Main ProblemApplications

3 ImplementationsFirst Attempt: Circular ListOperations and CostStill we have a Problem

Weighted-Union HeuristicOperationsStill a Problem

Heuristic Union by Rank

4 Balanced UnionPath compressionTime ComplexityAckermann’s FunctionBoundsThe Rank ObservationProof of ComplexityTheorem for Union by Rank and Path Compression)

5 / 114

Page 11: 17 Disjoint Set Representation

Operations

MakeSet(x)Given x ∈ U currently not belonging to any set in the collection, create anew singleton set {x} and name it x.

I This is usually done at start, once per item, to create the initial trivialpartition.

Union(A,B)It changes the current partition by replacing its sets A and B with A ∪ B.Name the set A or B.

I The operation may choose either one of the two representatives as thenew representatives.

Find(x)It returns the name of the set that currently contains item x.

6 / 114

Page 12: 17 Disjoint Set Representation

Operations

MakeSet(x)Given x ∈ U currently not belonging to any set in the collection, create anew singleton set {x} and name it x.

I This is usually done at start, once per item, to create the initial trivialpartition.

Union(A,B)It changes the current partition by replacing its sets A and B with A ∪ B.Name the set A or B.

I The operation may choose either one of the two representatives as thenew representatives.

Find(x)It returns the name of the set that currently contains item x.

6 / 114

Page 13: 17 Disjoint Set Representation

Operations

MakeSet(x)Given x ∈ U currently not belonging to any set in the collection, create anew singleton set {x} and name it x.

I This is usually done at start, once per item, to create the initial trivialpartition.

Union(A,B)It changes the current partition by replacing its sets A and B with A ∪ B.Name the set A or B.

I The operation may choose either one of the two representatives as thenew representatives.

Find(x)It returns the name of the set that currently contains item x.

6 / 114

Page 14: 17 Disjoint Set Representation

Example

for x = 1 to 9 do MakeSet(x)98621 2 3 4 5 6 8 97

Then, you do a Union(1, 2)

Now, Union(3, 4); Union(5, 8); Union(6, 9)

7 / 114

Page 15: 17 Disjoint Set Representation

Example

for x = 1 to 9 do MakeSet(x)98621 2 3 4 5 6 8 97

Then, you do a Union(1, 2)

9863 4 5 6 8 9721 2

Now, Union(3, 4); Union(5, 8); Union(6, 9)

7 / 114

Page 16: 17 Disjoint Set Representation

Example

for x = 1 to 9 do MakeSet(x)98621 2 3 4 5 6 8 97

Then, you do a Union(1, 2)

9863 4 5 6 8 9721 2

Now, Union(3, 4); Union(5, 8); Union(6, 9)

21 2 3 4 5 678 9

7 / 114

Page 17: 17 Disjoint Set Representation

Example

Now, Union(1, 5); Union(7, 4)

21 2 3 45 678 9

Then, if we do the following operationsFind(1) returns 5Find(9) returns 9

Finally, Union(5, 9)

Then Find(9) returns 5

8 / 114

Page 18: 17 Disjoint Set Representation

Example

Now, Union(1, 5); Union(7, 4)

21 2 3 45 678 9

Then, if we do the following operationsFind(1) returns 5Find(9) returns 9

Finally, Union(5, 9)

Then Find(9) returns 5

8 / 114

Page 19: 17 Disjoint Set Representation

Example

Now, Union(1, 5); Union(7, 4)

21 2 3 45 678 9

Then, if we do the following operationsFind(1) returns 5Find(9) returns 9

Finally, Union(5, 9)

21 2 3 45 6 78 9

Then Find(9) returns 5

8 / 114

Page 20: 17 Disjoint Set Representation

Outline1 Disjoint Set Representation

Definition of the ProblemOperations

2 Union-Find ProblemThe Main ProblemApplications

3 ImplementationsFirst Attempt: Circular ListOperations and CostStill we have a Problem

Weighted-Union HeuristicOperationsStill a Problem

Heuristic Union by Rank

4 Balanced UnionPath compressionTime ComplexityAckermann’s FunctionBoundsThe Rank ObservationProof of ComplexityTheorem for Union by Rank and Path Compression)

9 / 114

Page 21: 17 Disjoint Set Representation

Union-Find Problem

ProblemS = be a sequence of m = |S | MakeSet, Union and Find operations(intermixed in arbitrary order):

I n of which are MakeSet.I At most n − 1 are Union.I The rest are Finds.

Cost(S) = total computational time to execute sequence s.Goal: Find an implementation that, for every m and n, minimizes theamortized cost per operation:

Cost (S)|S | (1)

for any arbitrary sequence S .

10 / 114

Page 22: 17 Disjoint Set Representation

Union-Find Problem

ProblemS = be a sequence of m = |S | MakeSet, Union and Find operations(intermixed in arbitrary order):

I n of which are MakeSet.I At most n − 1 are Union.I The rest are Finds.

Cost(S) = total computational time to execute sequence s.Goal: Find an implementation that, for every m and n, minimizes theamortized cost per operation:

Cost (S)|S | (1)

for any arbitrary sequence S .

10 / 114

Page 23: 17 Disjoint Set Representation

Union-Find Problem

ProblemS = be a sequence of m = |S | MakeSet, Union and Find operations(intermixed in arbitrary order):

I n of which are MakeSet.I At most n − 1 are Union.I The rest are Finds.

Cost(S) = total computational time to execute sequence s.Goal: Find an implementation that, for every m and n, minimizes theamortized cost per operation:

Cost (S)|S | (1)

for any arbitrary sequence S .

10 / 114

Page 24: 17 Disjoint Set Representation

Union-Find Problem

ProblemS = be a sequence of m = |S | MakeSet, Union and Find operations(intermixed in arbitrary order):

I n of which are MakeSet.I At most n − 1 are Union.I The rest are Finds.

Cost(S) = total computational time to execute sequence s.Goal: Find an implementation that, for every m and n, minimizes theamortized cost per operation:

Cost (S)|S | (1)

for any arbitrary sequence S .

10 / 114

Page 25: 17 Disjoint Set Representation

Union-Find Problem

ProblemS = be a sequence of m = |S | MakeSet, Union and Find operations(intermixed in arbitrary order):

I n of which are MakeSet.I At most n − 1 are Union.I The rest are Finds.

Cost(S) = total computational time to execute sequence s.Goal: Find an implementation that, for every m and n, minimizes theamortized cost per operation:

Cost (S)|S | (1)

for any arbitrary sequence S .

10 / 114

Page 26: 17 Disjoint Set Representation

Union-Find Problem

ProblemS = be a sequence of m = |S | MakeSet, Union and Find operations(intermixed in arbitrary order):

I n of which are MakeSet.I At most n − 1 are Union.I The rest are Finds.

Cost(S) = total computational time to execute sequence s.Goal: Find an implementation that, for every m and n, minimizes theamortized cost per operation:

Cost (S)|S | (1)

for any arbitrary sequence S .

10 / 114

Page 27: 17 Disjoint Set Representation

Outline1 Disjoint Set Representation

Definition of the ProblemOperations

2 Union-Find ProblemThe Main ProblemApplications

3 ImplementationsFirst Attempt: Circular ListOperations and CostStill we have a Problem

Weighted-Union HeuristicOperationsStill a Problem

Heuristic Union by Rank

4 Balanced UnionPath compressionTime ComplexityAckermann’s FunctionBoundsThe Rank ObservationProof of ComplexityTheorem for Union by Rank and Path Compression)

11 / 114

Page 28: 17 Disjoint Set Representation

Applications

Examples1 Maintaining partitions and equivalence classes.2 Graph connectivity under edge insertion.3 Minimum spanning trees (e.g. Kruskal’s algorithm).4 Random maze construction.

12 / 114

Page 29: 17 Disjoint Set Representation

Applications

Examples1 Maintaining partitions and equivalence classes.2 Graph connectivity under edge insertion.3 Minimum spanning trees (e.g. Kruskal’s algorithm).4 Random maze construction.

12 / 114

Page 30: 17 Disjoint Set Representation

Applications

Examples1 Maintaining partitions and equivalence classes.2 Graph connectivity under edge insertion.3 Minimum spanning trees (e.g. Kruskal’s algorithm).4 Random maze construction.

12 / 114

Page 31: 17 Disjoint Set Representation

Applications

Examples1 Maintaining partitions and equivalence classes.2 Graph connectivity under edge insertion.3 Minimum spanning trees (e.g. Kruskal’s algorithm).4 Random maze construction.

12 / 114

Page 32: 17 Disjoint Set Representation

Outline1 Disjoint Set Representation

Definition of the ProblemOperations

2 Union-Find ProblemThe Main ProblemApplications

3 ImplementationsFirst Attempt: Circular ListOperations and CostStill we have a Problem

Weighted-Union HeuristicOperationsStill a Problem

Heuristic Union by Rank

4 Balanced UnionPath compressionTime ComplexityAckermann’s FunctionBoundsThe Rank ObservationProof of ComplexityTheorem for Union by Rank and Path Compression)

13 / 114

Page 33: 17 Disjoint Set Representation

Circular lists

We use the following structuresData structure: Two arrays Set[1..n] and next[1..n].

Set[x] returns the name of the set that contains item x.A is a set if and only if Set[A] = Anext[x] returns the next item on the list of the set that contains itemx.

14 / 114

Page 34: 17 Disjoint Set Representation

Circular lists

We use the following structuresData structure: Two arrays Set[1..n] and next[1..n].

Set[x] returns the name of the set that contains item x.A is a set if and only if Set[A] = Anext[x] returns the next item on the list of the set that contains itemx.

14 / 114

Page 35: 17 Disjoint Set Representation

Circular lists

We use the following structuresData structure: Two arrays Set[1..n] and next[1..n].

Set[x] returns the name of the set that contains item x.A is a set if and only if Set[A] = Anext[x] returns the next item on the list of the set that contains itemx.

14 / 114

Page 36: 17 Disjoint Set Representation

Circular lists

We use the following structuresData structure: Two arrays Set[1..n] and next[1..n].

Set[x] returns the name of the set that contains item x.A is a set if and only if Set[A] = Anext[x] returns the next item on the list of the set that contains itemx.

14 / 114

Page 37: 17 Disjoint Set Representation

Circular lists

Example: n = 16,Partition: {{1, 2, 8, 9} , {4, 3, 10, 13, 14, 15, 16} , {7, 6, 5, 11, 12} }Set

next

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

1 1 4 4 7 7 7 1 1 4 7 7 4 4 4 4

2 8 10 3 12 5 6 9 1 13 7 11 14 15 16 4

Set Position 1

15 / 114

Page 38: 17 Disjoint Set Representation

Circular lists

Example: n = 16,Partition: {{1, 2, 8, 9} , {4, 3, 10, 13, 14, 15, 16} , {7, 6, 5, 11, 12} }Set

next

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

1 1 4 4 7 7 7 1 1 4 7 7 4 4 4 4

2 8 10 3 12 5 6 9 1 13 7 11 14 15 16 4

Set Position 11 2 8 91

15 / 114

Page 39: 17 Disjoint Set Representation

Circular lists

Set Position 77 6 5 12 117

Set Position 4

16 / 114

Page 40: 17 Disjoint Set Representation

Circular lists

Set Position 77 6 5 12 117

Set Position 44 3 10 13 14 15 164

16 / 114

Page 41: 17 Disjoint Set Representation

Operations and Cost

Make(x)1 Set[x] = x2 next[x] = x

ComplexityO (1) Time

Find(x)1 return Set[x]

ComplexityO (1) Time

17 / 114

Page 42: 17 Disjoint Set Representation

Operations and Cost

Make(x)1 Set[x] = x2 next[x] = x

ComplexityO (1) Time

Find(x)1 return Set[x]

ComplexityO (1) Time

17 / 114

Page 43: 17 Disjoint Set Representation

Operations and Cost

Make(x)1 Set[x] = x2 next[x] = x

ComplexityO (1) Time

Find(x)1 return Set[x]

ComplexityO (1) Time

17 / 114

Page 44: 17 Disjoint Set Representation

Operations and Cost

Make(x)1 Set[x] = x2 next[x] = x

ComplexityO (1) Time

Find(x)1 return Set[x]

ComplexityO (1) Time

17 / 114

Page 45: 17 Disjoint Set Representation

Operations and Cost

For the unionWe are assuming Set[A] = A 6=Set[B] = B

Union1(A,B)1 Set[B] = A2 x =next[B]3 while (x 6= B)4 Set[x] = A /* Rename Set B to A*/5 x =next[x]6 x =next[B] /* Splice list A and B */7 next[B] =next[A]8 next[A] = x

18 / 114

Page 46: 17 Disjoint Set Representation

Operations and Cost

For the unionWe are assuming Set[A] = A 6=Set[B] = B

Union1(A,B)1 Set[B] = A2 x =next[B]3 while (x 6= B)4 Set[x] = A /* Rename Set B to A*/5 x =next[x]6 x =next[B] /* Splice list A and B */7 next[B] =next[A]8 next[A] = x

18 / 114

Page 47: 17 Disjoint Set Representation

Operations and Cost

For the unionWe are assuming Set[A] = A 6=Set[B] = B

Union1(A,B)1 Set[B] = A2 x =next[B]3 while (x 6= B)4 Set[x] = A /* Rename Set B to A*/5 x =next[x]6 x =next[B] /* Splice list A and B */7 next[B] =next[A]8 next[A] = x

18 / 114

Page 48: 17 Disjoint Set Representation

Operations an Cost

Thus, we have in the Splice part

A

Bx

19 / 114

Page 49: 17 Disjoint Set Representation

We have a Problem

ComplexityO (|B|) Time

Not only that, if we have the following sequence of operations1 for x = 1 to n2 MakeSet(x)3 for x = 1 to n − 14 Union1(x + 1, x)

20 / 114

Page 50: 17 Disjoint Set Representation

We have a Problem

ComplexityO (|B|) Time

Not only that, if we have the following sequence of operations1 for x = 1 to n2 MakeSet(x)3 for x = 1 to n − 14 Union1(x + 1, x)

20 / 114

Page 51: 17 Disjoint Set Representation

Thus

Thus, we have the following number of aggregated steps

n +n−1∑i=1

i = n + n (n − 1)2

= n + n2 − n2

= n2

2 + n2

= Θ(n2)

21 / 114

Page 52: 17 Disjoint Set Representation

Thus

Thus, we have the following number of aggregated steps

n +n−1∑i=1

i = n + n (n − 1)2

= n + n2 − n2

= n2

2 + n2

= Θ(n2)

21 / 114

Page 53: 17 Disjoint Set Representation

Thus

Thus, we have the following number of aggregated steps

n +n−1∑i=1

i = n + n (n − 1)2

= n + n2 − n2

= n2

2 + n2

= Θ(n2)

21 / 114

Page 54: 17 Disjoint Set Representation

Thus

Thus, we have the following number of aggregated steps

n +n−1∑i=1

i = n + n (n − 1)2

= n + n2 − n2

= n2

2 + n2

= Θ(n2)

21 / 114

Page 55: 17 Disjoint Set Representation

Thus

Thus, we have the following number of aggregated steps

n +n−1∑i=1

i = n + n (n − 1)2

= n + n2 − n2

= n2

2 + n2

= Θ(n2)

21 / 114

Page 56: 17 Disjoint Set Representation

Aggregate Time

Thus, the aggregate time is as followAggregate Time = Θ

(n2)

ThereforeAmortized Time per operation = Θ (n)

22 / 114

Page 57: 17 Disjoint Set Representation

Aggregate Time

Thus, the aggregate time is as followAggregate Time = Θ

(n2)

ThereforeAmortized Time per operation = Θ (n)

22 / 114

Page 58: 17 Disjoint Set Representation

This is not exactly good

Thus, we need to have something betterWe will try now the Weighted-Union Heuristic!!!

23 / 114

Page 59: 17 Disjoint Set Representation

Outline1 Disjoint Set Representation

Definition of the ProblemOperations

2 Union-Find ProblemThe Main ProblemApplications

3 ImplementationsFirst Attempt: Circular ListOperations and CostStill we have a Problem

Weighted-Union HeuristicOperationsStill a Problem

Heuristic Union by Rank

4 Balanced UnionPath compressionTime ComplexityAckermann’s FunctionBoundsThe Rank ObservationProof of ComplexityTheorem for Union by Rank and Path Compression)

24 / 114

Page 60: 17 Disjoint Set Representation

Implementation 2: Weighted-Union Heuristic Lists

We extend the previous data structureData structure: Three arrays Set[1..n], next[1..n], size[1..n].

size[A] returns the number of items in set A if A == Set[A](Otherwise, we do not care).

25 / 114

Page 61: 17 Disjoint Set Representation

Operations

MakeSet(x)1 Set[x] = x2 next[x] = x3 size[x] = 1

ComplexityO (1) time

Find(x)1 return Set[x]

ComplexityO (1) time

26 / 114

Page 62: 17 Disjoint Set Representation

Operations

MakeSet(x)1 Set[x] = x2 next[x] = x3 size[x] = 1

ComplexityO (1) time

Find(x)1 return Set[x]

ComplexityO (1) time

26 / 114

Page 63: 17 Disjoint Set Representation

Operations

MakeSet(x)1 Set[x] = x2 next[x] = x3 size[x] = 1

ComplexityO (1) time

Find(x)1 return Set[x]

ComplexityO (1) time

26 / 114

Page 64: 17 Disjoint Set Representation

Operations

MakeSet(x)1 Set[x] = x2 next[x] = x3 size[x] = 1

ComplexityO (1) time

Find(x)1 return Set[x]

ComplexityO (1) time

26 / 114

Page 65: 17 Disjoint Set Representation

Operations

Union2(A,B)1 if size[set [A]] >size[set [B]]2 size[set [A]] =size[set [A]]+size[set [B]]3 Union1(A,B)4 else5 size[set [B]] =size[set [A]]+size[set [B]]6 Union1(B,A)

Note: Weight Balanced Union: Merge smaller set into large set

ComplexityO (min {|A| , |B|}) time.

27 / 114

Page 66: 17 Disjoint Set Representation

Operations

Union2(A,B)1 if size[set [A]] >size[set [B]]2 size[set [A]] =size[set [A]]+size[set [B]]3 Union1(A,B)4 else5 size[set [B]] =size[set [A]]+size[set [B]]6 Union1(B,A)

Note: Weight Balanced Union: Merge smaller set into large set

ComplexityO (min {|A| , |B|}) time.

27 / 114

Page 67: 17 Disjoint Set Representation

What about the operations eliciting the worst behaviorRemember

1 for x = 1 to n2 MakeSet(x)3 for x = 1 to n − 14 Union2(x + 1, x)

We have then

n +n−1∑i=1

1 = n + n − 1

= 2n − 1= Θ (n)

IMPORTANT: This is not the worst sequence!!!

28 / 114

Page 68: 17 Disjoint Set Representation

What about the operations eliciting the worst behaviorRemember

1 for x = 1 to n2 MakeSet(x)3 for x = 1 to n − 14 Union2(x + 1, x)

We have then

n +n−1∑i=1

1 = n + n − 1

= 2n − 1= Θ (n)

IMPORTANT: This is not the worst sequence!!!

28 / 114

Page 69: 17 Disjoint Set Representation

For this, notice the following worst sequence

Worst Sequence sMakeSet(x), for x = 1, ..,n. Then do n − 1 Unions in round-robin manner.

Within each round, the sets have roughly equal size.I Starting round: Each round has size 1.I Next round: Each round has size 2.I Next: ... size 4.I ...

We claim the followingAggregate time = Θ(n log n)Amortized time per operation = Θ(log n)

29 / 114

Page 70: 17 Disjoint Set Representation

For this, notice the following worst sequence

Worst Sequence sMakeSet(x), for x = 1, ..,n. Then do n − 1 Unions in round-robin manner.

Within each round, the sets have roughly equal size.I Starting round: Each round has size 1.I Next round: Each round has size 2.I Next: ... size 4.I ...

We claim the followingAggregate time = Θ(n log n)Amortized time per operation = Θ(log n)

29 / 114

Page 71: 17 Disjoint Set Representation

For this, notice the following worst sequence

Worst Sequence sMakeSet(x), for x = 1, ..,n. Then do n − 1 Unions in round-robin manner.

Within each round, the sets have roughly equal size.I Starting round: Each round has size 1.I Next round: Each round has size 2.I Next: ... size 4.I ...

We claim the followingAggregate time = Θ(n log n)Amortized time per operation = Θ(log n)

29 / 114

Page 72: 17 Disjoint Set Representation

For this, notice the following worst sequence

Worst Sequence sMakeSet(x), for x = 1, ..,n. Then do n − 1 Unions in round-robin manner.

Within each round, the sets have roughly equal size.I Starting round: Each round has size 1.I Next round: Each round has size 2.I Next: ... size 4.I ...

We claim the followingAggregate time = Θ(n log n)Amortized time per operation = Θ(log n)

29 / 114

Page 73: 17 Disjoint Set Representation

For this, notice the following worst sequence

Worst Sequence sMakeSet(x), for x = 1, ..,n. Then do n − 1 Unions in round-robin manner.

Within each round, the sets have roughly equal size.I Starting round: Each round has size 1.I Next round: Each round has size 2.I Next: ... size 4.I ...

We claim the followingAggregate time = Θ(n log n)Amortized time per operation = Θ(log n)

29 / 114

Page 74: 17 Disjoint Set Representation

For this, notice the following worst sequence

Worst Sequence sMakeSet(x), for x = 1, ..,n. Then do n − 1 Unions in round-robin manner.

Within each round, the sets have roughly equal size.I Starting round: Each round has size 1.I Next round: Each round has size 2.I Next: ... size 4.I ...

We claim the followingAggregate time = Θ(n log n)Amortized time per operation = Θ(log n)

29 / 114

Page 75: 17 Disjoint Set Representation

For this, notice the following worst sequence

Worst Sequence sMakeSet(x), for x = 1, ..,n. Then do n − 1 Unions in round-robin manner.

Within each round, the sets have roughly equal size.I Starting round: Each round has size 1.I Next round: Each round has size 2.I Next: ... size 4.I ...

We claim the followingAggregate time = Θ(n log n)Amortized time per operation = Θ(log n)

29 / 114

Page 76: 17 Disjoint Set Representation

For this, notice the following worst sequence

Worst Sequence sMakeSet(x), for x = 1, ..,n. Then do n − 1 Unions in round-robin manner.

Within each round, the sets have roughly equal size.I Starting round: Each round has size 1.I Next round: Each round has size 2.I Next: ... size 4.I ...

We claim the followingAggregate time = Θ(n log n)Amortized time per operation = Θ(log n)

29 / 114

Page 77: 17 Disjoint Set Representation

For this, notice the following worst sequence

Example n = 16Round 0: {1} {2} {3} {4} {5} {6} {7} {8} {9} {10} {11} {12}{13} {14} {15} {16}Round 1: {1, 2} {3, 4} {5, 6} {7, 8} {9, 10} {11, 12} {13, 14} {15,16}Round 2: {1, 2, 3, 4} {5, 6, 7, 8} {9, 10, 11, 12} {13, 14, 15, 16}Round 3: {1, 2, 3, 4, 5, 6, 7, 8} {9, 10, 11, 12, 13, 14, 15, 16}Round 4: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}

30 / 114

Page 78: 17 Disjoint Set Representation

For this, notice the following worst sequence

Example n = 16Round 0: {1} {2} {3} {4} {5} {6} {7} {8} {9} {10} {11} {12}{13} {14} {15} {16}Round 1: {1, 2} {3, 4} {5, 6} {7, 8} {9, 10} {11, 12} {13, 14} {15,16}Round 2: {1, 2, 3, 4} {5, 6, 7, 8} {9, 10, 11, 12} {13, 14, 15, 16}Round 3: {1, 2, 3, 4, 5, 6, 7, 8} {9, 10, 11, 12, 13, 14, 15, 16}Round 4: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}

30 / 114

Page 79: 17 Disjoint Set Representation

For this, notice the following worst sequence

Example n = 16Round 0: {1} {2} {3} {4} {5} {6} {7} {8} {9} {10} {11} {12}{13} {14} {15} {16}Round 1: {1, 2} {3, 4} {5, 6} {7, 8} {9, 10} {11, 12} {13, 14} {15,16}Round 2: {1, 2, 3, 4} {5, 6, 7, 8} {9, 10, 11, 12} {13, 14, 15, 16}Round 3: {1, 2, 3, 4, 5, 6, 7, 8} {9, 10, 11, 12, 13, 14, 15, 16}Round 4: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}

30 / 114

Page 80: 17 Disjoint Set Representation

For this, notice the following worst sequence

Example n = 16Round 0: {1} {2} {3} {4} {5} {6} {7} {8} {9} {10} {11} {12}{13} {14} {15} {16}Round 1: {1, 2} {3, 4} {5, 6} {7, 8} {9, 10} {11, 12} {13, 14} {15,16}Round 2: {1, 2, 3, 4} {5, 6, 7, 8} {9, 10, 11, 12} {13, 14, 15, 16}Round 3: {1, 2, 3, 4, 5, 6, 7, 8} {9, 10, 11, 12, 13, 14, 15, 16}Round 4: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}

30 / 114

Page 81: 17 Disjoint Set Representation

For this, notice the following worst sequence

Example n = 16Round 0: {1} {2} {3} {4} {5} {6} {7} {8} {9} {10} {11} {12}{13} {14} {15} {16}Round 1: {1, 2} {3, 4} {5, 6} {7, 8} {9, 10} {11, 12} {13, 14} {15,16}Round 2: {1, 2, 3, 4} {5, 6, 7, 8} {9, 10, 11, 12} {13, 14, 15, 16}Round 3: {1, 2, 3, 4, 5, 6, 7, 8} {9, 10, 11, 12, 13, 14, 15, 16}Round 4: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}

30 / 114

Page 82: 17 Disjoint Set Representation

Now

Given the previous worst caseWhat is the complexity of this implementation?

31 / 114

Page 83: 17 Disjoint Set Representation

Now, the Amortized Costs of this implementation

Claim 1: Amortized time per operation is O(log n)For this, we have the following theorem!!!

Theorem 1Using the linked-list representation of disjoint sets and the weighted-Unionheuristic, a sequence of m MakeSet, Union, and FindSet operations, n ofwhich are MakeSet operations, takes O (m + n log n) time.

32 / 114

Page 84: 17 Disjoint Set Representation

Now, the Amortized Costs of this implementation

Claim 1: Amortized time per operation is O(log n)For this, we have the following theorem!!!

Theorem 1Using the linked-list representation of disjoint sets and the weighted-Unionheuristic, a sequence of m MakeSet, Union, and FindSet operations, n ofwhich are MakeSet operations, takes O (m + n log n) time.

32 / 114

Page 85: 17 Disjoint Set Representation

Proof

Because each Union operation unites two disjoint setsWe perform at most n − 1 Union operations over all.

We now bound the total time taken by these Union operationsWe start by determining, for each object, an upper bound on thenumber of times the object’s pointer back to its set object isupdated.Consider a particular object x.

I We know that each time x’s pointer was updated, x must have startedin the smaller set.

The first time x’s pointer was updated, therefore, the resulting setmust have had at least 2 members.

I Similarly, the next time x’s pointer was updated, the resulting set musthave had at least 4 members.

33 / 114

Page 86: 17 Disjoint Set Representation

Proof

Because each Union operation unites two disjoint setsWe perform at most n − 1 Union operations over all.

We now bound the total time taken by these Union operationsWe start by determining, for each object, an upper bound on thenumber of times the object’s pointer back to its set object isupdated.Consider a particular object x.

I We know that each time x’s pointer was updated, x must have startedin the smaller set.

The first time x’s pointer was updated, therefore, the resulting setmust have had at least 2 members.

I Similarly, the next time x’s pointer was updated, the resulting set musthave had at least 4 members.

33 / 114

Page 87: 17 Disjoint Set Representation

Proof

Because each Union operation unites two disjoint setsWe perform at most n − 1 Union operations over all.

We now bound the total time taken by these Union operationsWe start by determining, for each object, an upper bound on thenumber of times the object’s pointer back to its set object isupdated.Consider a particular object x.

I We know that each time x’s pointer was updated, x must have startedin the smaller set.

The first time x’s pointer was updated, therefore, the resulting setmust have had at least 2 members.

I Similarly, the next time x’s pointer was updated, the resulting set musthave had at least 4 members.

33 / 114

Page 88: 17 Disjoint Set Representation

Proof

Because each Union operation unites two disjoint setsWe perform at most n − 1 Union operations over all.

We now bound the total time taken by these Union operationsWe start by determining, for each object, an upper bound on thenumber of times the object’s pointer back to its set object isupdated.Consider a particular object x.

I We know that each time x’s pointer was updated, x must have startedin the smaller set.

The first time x’s pointer was updated, therefore, the resulting setmust have had at least 2 members.

I Similarly, the next time x’s pointer was updated, the resulting set musthave had at least 4 members.

33 / 114

Page 89: 17 Disjoint Set Representation

Proof

Because each Union operation unites two disjoint setsWe perform at most n − 1 Union operations over all.

We now bound the total time taken by these Union operationsWe start by determining, for each object, an upper bound on thenumber of times the object’s pointer back to its set object isupdated.Consider a particular object x.

I We know that each time x’s pointer was updated, x must have startedin the smaller set.

The first time x’s pointer was updated, therefore, the resulting setmust have had at least 2 members.

I Similarly, the next time x’s pointer was updated, the resulting set musthave had at least 4 members.

33 / 114

Page 90: 17 Disjoint Set Representation

Proof

Continuing onWe observe that for any k ≤ n, after x’s pointer has been updated dlog netimes!!!

The resulting set must have at least k members.

ThusSince the largest set has at most n members, each object’s pointer isupdated at most dlog ne times over all the Union operations.

ThenThe total time spent updating object pointers over all Union operations isO (n log n).

34 / 114

Page 91: 17 Disjoint Set Representation

Proof

Continuing onWe observe that for any k ≤ n, after x’s pointer has been updated dlog netimes!!!

The resulting set must have at least k members.

ThusSince the largest set has at most n members, each object’s pointer isupdated at most dlog ne times over all the Union operations.

ThenThe total time spent updating object pointers over all Union operations isO (n log n).

34 / 114

Page 92: 17 Disjoint Set Representation

Proof

Continuing onWe observe that for any k ≤ n, after x’s pointer has been updated dlog netimes!!!

The resulting set must have at least k members.

ThusSince the largest set has at most n members, each object’s pointer isupdated at most dlog ne times over all the Union operations.

ThenThe total time spent updating object pointers over all Union operations isO (n log n).

34 / 114

Page 93: 17 Disjoint Set Representation

Proof

Continuing onWe observe that for any k ≤ n, after x’s pointer has been updated dlog netimes!!!

The resulting set must have at least k members.

ThusSince the largest set has at most n members, each object’s pointer isupdated at most dlog ne times over all the Union operations.

ThenThe total time spent updating object pointers over all Union operations isO (n log n).

34 / 114

Page 94: 17 Disjoint Set Representation

Proof

We must also account for updating the tail pointers and the listlengthsIt takes only O (1) time per Union operation

ThereforeThe total time spent in all Union operations is thus O (n log n).

The time for the entire sequence of m operations follows easilyEach MakeSet and FindSet operation takes O (1) time, and there areO (m) of them.

35 / 114

Page 95: 17 Disjoint Set Representation

Proof

We must also account for updating the tail pointers and the listlengthsIt takes only O (1) time per Union operation

ThereforeThe total time spent in all Union operations is thus O (n log n).

The time for the entire sequence of m operations follows easilyEach MakeSet and FindSet operation takes O (1) time, and there areO (m) of them.

35 / 114

Page 96: 17 Disjoint Set Representation

Proof

We must also account for updating the tail pointers and the listlengthsIt takes only O (1) time per Union operation

ThereforeThe total time spent in all Union operations is thus O (n log n).

The time for the entire sequence of m operations follows easilyEach MakeSet and FindSet operation takes O (1) time, and there areO (m) of them.

35 / 114

Page 97: 17 Disjoint Set Representation

Proof

ThereforeThe total time for the entire sequence is thus O (m + n log n).

36 / 114

Page 98: 17 Disjoint Set Representation

Amortized Cost: Aggregate Analysis

Aggregate cost O(m + n log n). Amortized cost per operationO(log n).

O(m + n log n)m = O (1 + log n) = O (log n) (2)

37 / 114

Page 99: 17 Disjoint Set Representation

There are other ways of analyzing the amortized cost

It is possible to use1 Accounting Method.2 Potential Method.

38 / 114

Page 100: 17 Disjoint Set Representation

Amortized Costs: Accounting Method

Accounting methodMakeSet(x): Charge (1 + log n). 1 to do the operation, log n storedas credit with item x.Find(x): Charge 1, and use it to do the operation.Union(A,B): Charge 0 and use 1 stored credit from each item in thesmaller set to move it.

39 / 114

Page 101: 17 Disjoint Set Representation

Amortized Costs: Accounting Method

Accounting methodMakeSet(x): Charge (1 + log n). 1 to do the operation, log n storedas credit with item x.Find(x): Charge 1, and use it to do the operation.Union(A,B): Charge 0 and use 1 stored credit from each item in thesmaller set to move it.

39 / 114

Page 102: 17 Disjoint Set Representation

Amortized Costs: Accounting Method

Accounting methodMakeSet(x): Charge (1 + log n). 1 to do the operation, log n storedas credit with item x.Find(x): Charge 1, and use it to do the operation.Union(A,B): Charge 0 and use 1 stored credit from each item in thesmaller set to move it.

39 / 114

Page 103: 17 Disjoint Set Representation

Amortized Costs: Accounting Method

Credit invariantTotal stored credit is

∑S|S | log

(n|S |

), where the summation is taken over

the collection S of all disjoint sets of the current partition.

40 / 114

Page 104: 17 Disjoint Set Representation

Amortized Costs: Potential Method

Potential function methodExercise:

Define a regular potential function and use it to do the amortizedanalysis.Can you make the Union amortized cost O(log n), MakeSet and Findcosts O(1)?

41 / 114

Page 105: 17 Disjoint Set Representation

Amortized Costs: Potential Method

Potential function methodExercise:

Define a regular potential function and use it to do the amortizedanalysis.Can you make the Union amortized cost O(log n), MakeSet and Findcosts O(1)?

41 / 114

Page 106: 17 Disjoint Set Representation

Amortized Costs: Potential Method

Potential function methodExercise:

Define a regular potential function and use it to do the amortizedanalysis.Can you make the Union amortized cost O(log n), MakeSet and Findcosts O(1)?

41 / 114

Page 107: 17 Disjoint Set Representation

Outline1 Disjoint Set Representation

Definition of the ProblemOperations

2 Union-Find ProblemThe Main ProblemApplications

3 ImplementationsFirst Attempt: Circular ListOperations and CostStill we have a Problem

Weighted-Union HeuristicOperationsStill a Problem

Heuristic Union by Rank

4 Balanced UnionPath compressionTime ComplexityAckermann’s FunctionBoundsThe Rank ObservationProof of ComplexityTheorem for Union by Rank and Path Compression)

42 / 114

Page 108: 17 Disjoint Set Representation

Improving over the heuristic using union by rank

Union by RankInstead of using the number of nodes in each tree to make a decision, wemaintain a rank, a upper bound on the height of the tree.

We have the following data structure to support this:We maintain a parent array p[1..n].

A is a set if and only if A = p[A] (a tree root).x ∈ A if and only if x is in the tree rooted at A.

43 / 114

Page 109: 17 Disjoint Set Representation

Improving over the heuristic using union by rank

Union by RankInstead of using the number of nodes in each tree to make a decision, wemaintain a rank, a upper bound on the height of the tree.

We have the following data structure to support this:We maintain a parent array p[1..n].

A is a set if and only if A = p[A] (a tree root).x ∈ A if and only if x is in the tree rooted at A.

43 / 114

Page 110: 17 Disjoint Set Representation

Improving over the heuristic using union by rank

Union by RankInstead of using the number of nodes in each tree to make a decision, wemaintain a rank, a upper bound on the height of the tree.

We have the following data structure to support this:We maintain a parent array p[1..n].

A is a set if and only if A = p[A] (a tree root).x ∈ A if and only if x is in the tree rooted at A.

43 / 114

Page 111: 17 Disjoint Set Representation

Improving over the heuristic using union by rank

Union by RankInstead of using the number of nodes in each tree to make a decision, wemaintain a rank, a upper bound on the height of the tree.

We have the following data structure to support this:We maintain a parent array p[1..n].

A is a set if and only if A = p[A] (a tree root).x ∈ A if and only if x is in the tree rooted at A.

43 / 114

Page 112: 17 Disjoint Set Representation

Improving over the heuristic using union by rankUnion by RankInstead of using the number of nodes in each tree to make a decision, wemaintain a rank, a upper bound on the height of the tree.

We have the following data structure to support this:We maintain a parent array p[1..n].

A is a set if and only if A = p[A] (a tree root).x ∈ A if and only if x is in the tree rooted at A.

1

13 5 8

20 14 10

19

4

2 18 6

15 9 11

17

7

3

12

16

43 / 114

Page 113: 17 Disjoint Set Representation

Forest of Up-Trees: Operations without union by rank orweight

MakeSet(x)1 p[x] = x

ComplexityO (1) time

Union(A,B)1 p[B] = A

Note: We are assuming that p[A] == A 6=p[B] == B. This is thereason we need a find operation!!!

44 / 114

Page 114: 17 Disjoint Set Representation

Forest of Up-Trees: Operations without union by rank orweight

MakeSet(x)1 p[x] = x

ComplexityO (1) time

Union(A,B)1 p[B] = A

Note: We are assuming that p[A] == A 6=p[B] == B. This is thereason we need a find operation!!!

44 / 114

Page 115: 17 Disjoint Set Representation

Forest of Up-Trees: Operations without union by rank orweight

MakeSet(x)1 p[x] = x

ComplexityO (1) time

Union(A,B)1 p[B] = A

Note: We are assuming that p[A] == A 6=p[B] == B. This is thereason we need a find operation!!!

44 / 114

Page 116: 17 Disjoint Set Representation

Example

Remember we are doing the joins without caring about getting theworst case

B

A

45 / 114

Page 117: 17 Disjoint Set Representation

Forest of Up-Trees: Operations without union by rank orweight

Find(x)1 if x ==p[x]2 return x3 return Find(p [x])

Example

46 / 114

Page 118: 17 Disjoint Set Representation

Forest of Up-Trees: Operations without union by rank orweight

Find(x)1 if x ==p[x]2 return x3 return Find(p [x])

Example

x

46 / 114

Page 119: 17 Disjoint Set Representation

Forest of Up-Trees: Operations without union by rank orweight

Still I can give you a horrible caseSequence of operations

1 for x = 1 to n2 MakeSet(x)3 for x = 1 to n − 14 Union(x)5 for x = 1 to n − 16 Find(1)

47 / 114

Page 120: 17 Disjoint Set Representation

Forest of Up-Trees: Operations without union by rank orweight

Still I can give you a horrible caseSequence of operations

1 for x = 1 to n2 MakeSet(x)3 for x = 1 to n − 14 Union(x)5 for x = 1 to n − 16 Find(1)

47 / 114

Page 121: 17 Disjoint Set Representation

Forest of Up-Trees: Operations without union by rank orweight

We finish with this data structure

1

2

n-1

n

Thus the last part of the sequence give us a total time ofAggregate Time Θ

(n2)

Amortized Analysis per operation Θ (n)

48 / 114

Page 122: 17 Disjoint Set Representation

Forest of Up-Trees: Operations without union by rank orweight

We finish with this data structure

1

2

n-1

n

Thus the last part of the sequence give us a total time ofAggregate Time Θ

(n2)

Amortized Analysis per operation Θ (n)

48 / 114

Page 123: 17 Disjoint Set Representation

Self-Adjusting forest of Up-Trees

How, we avoid this problemUse together the following heuristics!!!

1 Balanced Union.I By tree weight (i.e., size)I By tree rank (i.e., height)

2 Find with path compression

ObservationsEach single improvement (1 or 2) by itself will result in logarithmicamortized cost per operation.The two improvements combined will result in amortized cost peroperation approaching very close to O(1).

49 / 114

Page 124: 17 Disjoint Set Representation

Self-Adjusting forest of Up-Trees

How, we avoid this problemUse together the following heuristics!!!

1 Balanced Union.I By tree weight (i.e., size)I By tree rank (i.e., height)

2 Find with path compression

ObservationsEach single improvement (1 or 2) by itself will result in logarithmicamortized cost per operation.The two improvements combined will result in amortized cost peroperation approaching very close to O(1).

49 / 114

Page 125: 17 Disjoint Set Representation

Self-Adjusting forest of Up-Trees

How, we avoid this problemUse together the following heuristics!!!

1 Balanced Union.I By tree weight (i.e., size)I By tree rank (i.e., height)

2 Find with path compression

ObservationsEach single improvement (1 or 2) by itself will result in logarithmicamortized cost per operation.The two improvements combined will result in amortized cost peroperation approaching very close to O(1).

49 / 114

Page 126: 17 Disjoint Set Representation

Self-Adjusting forest of Up-Trees

How, we avoid this problemUse together the following heuristics!!!

1 Balanced Union.I By tree weight (i.e., size)I By tree rank (i.e., height)

2 Find with path compression

ObservationsEach single improvement (1 or 2) by itself will result in logarithmicamortized cost per operation.The two improvements combined will result in amortized cost peroperation approaching very close to O(1).

49 / 114

Page 127: 17 Disjoint Set Representation

Self-Adjusting forest of Up-Trees

How, we avoid this problemUse together the following heuristics!!!

1 Balanced Union.I By tree weight (i.e., size)I By tree rank (i.e., height)

2 Find with path compression

ObservationsEach single improvement (1 or 2) by itself will result in logarithmicamortized cost per operation.The two improvements combined will result in amortized cost peroperation approaching very close to O(1).

49 / 114

Page 128: 17 Disjoint Set Representation

Self-Adjusting forest of Up-Trees

How, we avoid this problemUse together the following heuristics!!!

1 Balanced Union.I By tree weight (i.e., size)I By tree rank (i.e., height)

2 Find with path compression

ObservationsEach single improvement (1 or 2) by itself will result in logarithmicamortized cost per operation.The two improvements combined will result in amortized cost peroperation approaching very close to O(1).

49 / 114

Page 129: 17 Disjoint Set Representation

Self-Adjusting forest of Up-Trees

How, we avoid this problemUse together the following heuristics!!!

1 Balanced Union.I By tree weight (i.e., size)I By tree rank (i.e., height)

2 Find with path compression

ObservationsEach single improvement (1 or 2) by itself will result in logarithmicamortized cost per operation.The two improvements combined will result in amortized cost peroperation approaching very close to O(1).

49 / 114

Page 130: 17 Disjoint Set Representation

Balanced Union by Size

Using size for Balanced UnionWe can use the size of each set to obtain what we want

50 / 114

Page 131: 17 Disjoint Set Representation

We have thenMakeSet(x)

1 p[x] = x2 size[x] = 1

Note: Complexity O (1) time

Union(A,B)Input: assume that p[A]=A 6=p[B]=B

1 if size[A] >size[B]2 size[A] =size[A]+size[B]3 p[B] = A4 else5 size[B] =size[A]+size[B]6 p[A] = B

Note: Complexity O (1) time51 / 114

Page 132: 17 Disjoint Set Representation

We have thenMakeSet(x)

1 p[x] = x2 size[x] = 1

Note: Complexity O (1) time

Union(A,B)Input: assume that p[A]=A 6=p[B]=B

1 if size[A] >size[B]2 size[A] =size[A]+size[B]3 p[B] = A4 else5 size[B] =size[A]+size[B]6 p[A] = B

Note: Complexity O (1) time51 / 114

Page 133: 17 Disjoint Set Representation

We have thenMakeSet(x)

1 p[x] = x2 size[x] = 1

Note: Complexity O (1) time

Union(A,B)Input: assume that p[A]=A 6=p[B]=B

1 if size[A] >size[B]2 size[A] =size[A]+size[B]3 p[B] = A4 else5 size[B] =size[A]+size[B]6 p[A] = B

Note: Complexity O (1) time51 / 114

Page 134: 17 Disjoint Set Representation

Example

Now, we use the size for the union

size[A]>size[B]

B

A

52 / 114

Page 135: 17 Disjoint Set Representation

Nevertheless

Union by size can make the analysis too complexPeople would rather use the rank

RankIt is defined as the height of the tree

BecauseThe use of the rank simplify the amortized analysis for the data structure!!!

53 / 114

Page 136: 17 Disjoint Set Representation

Nevertheless

Union by size can make the analysis too complexPeople would rather use the rank

RankIt is defined as the height of the tree

BecauseThe use of the rank simplify the amortized analysis for the data structure!!!

53 / 114

Page 137: 17 Disjoint Set Representation

Nevertheless

Union by size can make the analysis too complexPeople would rather use the rank

RankIt is defined as the height of the tree

BecauseThe use of the rank simplify the amortized analysis for the data structure!!!

53 / 114

Page 138: 17 Disjoint Set Representation

Thus, we use the balanced union by rank

MakeSet(x)1 p[x] = x2 rank[x] = 0

Note: Complexity O (1) time

Union(A,B)Input: assume that p[A]=A 6=p[B]=B

1 if rank[A] >rank[B]2 p[B] = A3 else4 p[A] = B5 if rank[A] ==rank[B]6 rank[B]=rank[B]+1

Note: Complexity O (1) time

54 / 114

Page 139: 17 Disjoint Set Representation

Thus, we use the balanced union by rank

MakeSet(x)1 p[x] = x2 rank[x] = 0

Note: Complexity O (1) time

Union(A,B)Input: assume that p[A]=A 6=p[B]=B

1 if rank[A] >rank[B]2 p[B] = A3 else4 p[A] = B5 if rank[A] ==rank[B]6 rank[B]=rank[B]+1

Note: Complexity O (1) time

54 / 114

Page 140: 17 Disjoint Set Representation

Thus, we use the balanced union by rank

MakeSet(x)1 p[x] = x2 rank[x] = 0

Note: Complexity O (1) time

Union(A,B)Input: assume that p[A]=A 6=p[B]=B

1 if rank[A] >rank[B]2 p[B] = A3 else4 p[A] = B5 if rank[A] ==rank[B]6 rank[B]=rank[B]+1

Note: Complexity O (1) time

54 / 114

Page 141: 17 Disjoint Set Representation

Example

NowWe use the rank for the union

Case IThe rank of A is larger than B

55 / 114

Page 142: 17 Disjoint Set Representation

Example

NowWe use the rank for the union

Case IThe rank of A is larger than B

rank[A]>rank[B]

B

A

55 / 114

Page 143: 17 Disjoint Set Representation

Example

Case IIThe rank of B is larger than A

56 / 114

Page 144: 17 Disjoint Set Representation

Example

Case IIThe rank of B is larger than A

rank[B]>rank[A]

BA

56 / 114

Page 145: 17 Disjoint Set Representation

Outline1 Disjoint Set Representation

Definition of the ProblemOperations

2 Union-Find ProblemThe Main ProblemApplications

3 ImplementationsFirst Attempt: Circular ListOperations and CostStill we have a Problem

Weighted-Union HeuristicOperationsStill a Problem

Heuristic Union by Rank

4 Balanced UnionPath compressionTime ComplexityAckermann’s FunctionBoundsThe Rank ObservationProof of ComplexityTheorem for Union by Rank and Path Compression)

57 / 114

Page 146: 17 Disjoint Set Representation

Here is the new heuristic to improve overall performance:

Path CompressionFind(x)

1 if x 6=p[x ]2 p[x ]=Find(p [x ])3 return p[x ]

ComplexityO (depth (x)) time

58 / 114

Page 147: 17 Disjoint Set Representation

Here is the new heuristic to improve overall performance:

Path CompressionFind(x)

1 if x 6=p[x ]2 p[x ]=Find(p [x ])3 return p[x ]

ComplexityO (depth (x)) time

58 / 114

Page 148: 17 Disjoint Set Representation

Example

We have the following structure

59 / 114

Page 149: 17 Disjoint Set Representation

Example

The recursive Find(p [x ])

60 / 114

Page 150: 17 Disjoint Set Representation

Example

The recursive Find(p [x ])

61 / 114

Page 151: 17 Disjoint Set Representation

Path compression

Find(x) should traverse the path from x up to its root.This might as well create shortcuts along the way to improve the efficiencyof the future operations.

Find(2)3

13 15 10 13 15 10

3

12 14 12 1411

11

2

2

1

1

7 8

7 84

6 5

9

4

6 5

9

62 / 114

Page 152: 17 Disjoint Set Representation

Outline1 Disjoint Set Representation

Definition of the ProblemOperations

2 Union-Find ProblemThe Main ProblemApplications

3 ImplementationsFirst Attempt: Circular ListOperations and CostStill we have a Problem

Weighted-Union HeuristicOperationsStill a Problem

Heuristic Union by Rank

4 Balanced UnionPath compressionTime ComplexityAckermann’s FunctionBoundsThe Rank ObservationProof of ComplexityTheorem for Union by Rank and Path Compression)

63 / 114

Page 153: 17 Disjoint Set Representation

Time complexity

Tight upper bound on time complexityAn amortized time of O(mα(m,n)) for m operations.Where α(m,n) is the inverse of the Ackermann’s function (almost aconstant).This bound, for a slightly different definition of α than that givenhere is shown in Cormen’s book.

64 / 114

Page 154: 17 Disjoint Set Representation

Time complexity

Tight upper bound on time complexityAn amortized time of O(mα(m,n)) for m operations.Where α(m,n) is the inverse of the Ackermann’s function (almost aconstant).This bound, for a slightly different definition of α than that givenhere is shown in Cormen’s book.

64 / 114

Page 155: 17 Disjoint Set Representation

Time complexity

Tight upper bound on time complexityAn amortized time of O(mα(m,n)) for m operations.Where α(m,n) is the inverse of the Ackermann’s function (almost aconstant).This bound, for a slightly different definition of α than that givenhere is shown in Cormen’s book.

64 / 114

Page 156: 17 Disjoint Set Representation

Ackermann’s Function

DefinitionA(1, j) = 2j where j ≥ 1A(i, 1) = A(i − 1, 2) where i ≥ 2A(i, j) = A(i − 1,A(i, j − 1)) where i, j ≥ 2

Note:This is one of several in-equivalent but similar definitionsof Ackermann’s function found in the literature.Cormen’s book authors give a different definition,although they never really call theirs Ackermann’sfunction.

PropertyAckermann’s function grows very fast, thus it’s inverse grows very slow.

65 / 114

Page 157: 17 Disjoint Set Representation

Ackermann’s Function

DefinitionA(1, j) = 2j where j ≥ 1A(i, 1) = A(i − 1, 2) where i ≥ 2A(i, j) = A(i − 1,A(i, j − 1)) where i, j ≥ 2

Note:This is one of several in-equivalent but similar definitionsof Ackermann’s function found in the literature.Cormen’s book authors give a different definition,although they never really call theirs Ackermann’sfunction.

PropertyAckermann’s function grows very fast, thus it’s inverse grows very slow.

65 / 114

Page 158: 17 Disjoint Set Representation

Ackermann’s Function

DefinitionA(1, j) = 2j where j ≥ 1A(i, 1) = A(i − 1, 2) where i ≥ 2A(i, j) = A(i − 1,A(i, j − 1)) where i, j ≥ 2

Note:This is one of several in-equivalent but similar definitionsof Ackermann’s function found in the literature.Cormen’s book authors give a different definition,although they never really call theirs Ackermann’sfunction.

PropertyAckermann’s function grows very fast, thus it’s inverse grows very slow.

65 / 114

Page 159: 17 Disjoint Set Representation

Ackermann’s Function

DefinitionA(1, j) = 2j where j ≥ 1A(i, 1) = A(i − 1, 2) where i ≥ 2A(i, j) = A(i − 1,A(i, j − 1)) where i, j ≥ 2

Note:This is one of several in-equivalent but similar definitionsof Ackermann’s function found in the literature.Cormen’s book authors give a different definition,although they never really call theirs Ackermann’sfunction.

PropertyAckermann’s function grows very fast, thus it’s inverse grows very slow.

65 / 114

Page 160: 17 Disjoint Set Representation

Ackermann’s Function

DefinitionA(1, j) = 2j where j ≥ 1A(i, 1) = A(i − 1, 2) where i ≥ 2A(i, j) = A(i − 1,A(i, j − 1)) where i, j ≥ 2

Note:This is one of several in-equivalent but similar definitionsof Ackermann’s function found in the literature.Cormen’s book authors give a different definition,although they never really call theirs Ackermann’sfunction.

PropertyAckermann’s function grows very fast, thus it’s inverse grows very slow.

65 / 114

Page 161: 17 Disjoint Set Representation

Ackermann’s Function

DefinitionA(1, j) = 2j where j ≥ 1A(i, 1) = A(i − 1, 2) where i ≥ 2A(i, j) = A(i − 1,A(i, j − 1)) where i, j ≥ 2

Note:This is one of several in-equivalent but similar definitionsof Ackermann’s function found in the literature.Cormen’s book authors give a different definition,although they never really call theirs Ackermann’sfunction.

PropertyAckermann’s function grows very fast, thus it’s inverse grows very slow.

65 / 114

Page 162: 17 Disjoint Set Representation

Ackermann’s Function

Example A(3, 4)

A (3, 4) = 22. . .

2}2

2. . .2}2

2. . .2}16

Notation: 22. . .

2}10

means 22222222222

66 / 114

Page 163: 17 Disjoint Set Representation

Inverse of Ackermann’s function

Definition

α(m,n) = min{

i ≥ 1|A(

i,⌊m

n

⌋)> log n

}(3)

Note: This is not a true mathematical inverse.Intuition: Grows about as slowly as Ackermann’s function does fast.

How slowly?Let bm

n c = k, then m ≥ n → k ≥ 1

67 / 114

Page 164: 17 Disjoint Set Representation

Inverse of Ackermann’s function

Definition

α(m,n) = min{

i ≥ 1|A(

i,⌊m

n

⌋)> log n

}(3)

Note: This is not a true mathematical inverse.Intuition: Grows about as slowly as Ackermann’s function does fast.

How slowly?Let bm

n c = k, then m ≥ n → k ≥ 1

67 / 114

Page 165: 17 Disjoint Set Representation

Inverse of Ackermann’s function

Definition

α(m,n) = min{

i ≥ 1|A(

i,⌊m

n

⌋)> log n

}(3)

Note: This is not a true mathematical inverse.Intuition: Grows about as slowly as Ackermann’s function does fast.

How slowly?Let bm

n c = k, then m ≥ n → k ≥ 1

67 / 114

Page 166: 17 Disjoint Set Representation

Inverse of Ackermann’s function

Definition

α(m,n) = min{

i ≥ 1|A(

i,⌊m

n

⌋)> log n

}(3)

Note: This is not a true mathematical inverse.Intuition: Grows about as slowly as Ackermann’s function does fast.

How slowly?Let bm

n c = k, then m ≥ n → k ≥ 1

67 / 114

Page 167: 17 Disjoint Set Representation

Thus

FirstWe can show that A(i, k) ≥ A(i, 1) for all i ≥ 1.

This is left to you...

For Example

Consider i = 4, then A(i, k) ≥ A(4, 1) = 22..

.2}

10≈ 1080.

Finallyif log n < 1080. i.e., if n < 21080 =⇒ α(m,n) ≤ 4

68 / 114

Page 168: 17 Disjoint Set Representation

Thus

FirstWe can show that A(i, k) ≥ A(i, 1) for all i ≥ 1.

This is left to you...

For Example

Consider i = 4, then A(i, k) ≥ A(4, 1) = 22..

.2}

10≈ 1080.

Finallyif log n < 1080. i.e., if n < 21080 =⇒ α(m,n) ≤ 4

68 / 114

Page 169: 17 Disjoint Set Representation

Thus

FirstWe can show that A(i, k) ≥ A(i, 1) for all i ≥ 1.

This is left to you...

For Example

Consider i = 4, then A(i, k) ≥ A(4, 1) = 22..

.2}

10≈ 1080.

Finallyif log n < 1080. i.e., if n < 21080 =⇒ α(m,n) ≤ 4

68 / 114

Page 170: 17 Disjoint Set Representation

Thus

FirstWe can show that A(i, k) ≥ A(i, 1) for all i ≥ 1.

This is left to you...

For Example

Consider i = 4, then A(i, k) ≥ A(4, 1) = 22..

.2}

10≈ 1080.

Finallyif log n < 1080. i.e., if n < 21080 =⇒ α(m,n) ≤ 4

68 / 114

Page 171: 17 Disjoint Set Representation

Thus

FirstWe can show that A(i, k) ≥ A(i, 1) for all i ≥ 1.

This is left to you...

For Example

Consider i = 4, then A(i, k) ≥ A(4, 1) = 22..

.2}

10≈ 1080.

Finallyif log n < 1080. i.e., if n < 21080 =⇒ α(m,n) ≤ 4

68 / 114

Page 172: 17 Disjoint Set Representation

Instead of Using the Ackermann Inverse

We define the following function

log∗ n = min{

i ≥ 0| log(i) n ≤ 1}

(4)

The i means log · · · log n i times

ThenWe will establish O (m log∗ n) as upper bound.

69 / 114

Page 173: 17 Disjoint Set Representation

Instead of Using the Ackermann Inverse

We define the following function

log∗ n = min{

i ≥ 0| log(i) n ≤ 1}

(4)

The i means log · · · log n i times

ThenWe will establish O (m log∗ n) as upper bound.

69 / 114

Page 174: 17 Disjoint Set Representation

In particular

Something Notable

In particular, we have that log∗ 22. . .

2}k

= k + 1

For Example

log∗ 265536 = 22222}

4= 5 (5)

ThereforeWe have that log∗ n ≤ 5 for all practical purposes.

70 / 114

Page 175: 17 Disjoint Set Representation

In particular

Something Notable

In particular, we have that log∗ 22. . .

2}k

= k + 1

For Example

log∗ 265536 = 22222}

4= 5 (5)

ThereforeWe have that log∗ n ≤ 5 for all practical purposes.

70 / 114

Page 176: 17 Disjoint Set Representation

In particular

Something Notable

In particular, we have that log∗ 22. . .

2}k

= k + 1

For Example

log∗ 265536 = 22222}

4= 5 (5)

ThereforeWe have that log∗ n ≤ 5 for all practical purposes.

70 / 114

Page 177: 17 Disjoint Set Representation

The Rank Observation

Something NotableIt is that once somebody becomes a child of another node their rank doesnot change given any posterior operation.

71 / 114

Page 178: 17 Disjoint Set Representation

For Example

The number in the right is the heightMakeSet(1),MakeSet(2),MakeSet(3), ..., MakeSet(10)

1/0 2/0 3/0 4/0 5/0

6/0 7/0 8/0 9/0 10/0

72 / 114

Page 179: 17 Disjoint Set Representation

Example

Now, we doUnion(6, 1),Union(7, 2), ..., Union(10, 1)

1/1 2/1 3/1 4/1 5/1

6/0 7/0 8/0 9/0 10/0

73 / 114

Page 180: 17 Disjoint Set Representation

Example

Next - Assuming that you are using a FindSet to get the name setUnion(1, 2)

3/1 4/1 5/1

8/0 9/0 10/0

2/2

7/01/1

6/0

74 / 114

Page 181: 17 Disjoint Set Representation

Example

NextUnion(3, 4)

2/2

7/01/1

6/0

3/1

4/2

8/0

9/0

5/1

10/0

75 / 114

Page 182: 17 Disjoint Set Representation

Example

NextUnion(2, 4)

2/2

1/1

6/0

7/0

3/1

4/3

8/0

9/0

5/1

10/0

76 / 114

Page 183: 17 Disjoint Set Representation

Example

Now you give a FindSet(8)

2/2

1/1

6/0

7/0

4/3

9/0

5/1

10/03/1 8/0

77 / 114

Page 184: 17 Disjoint Set Representation

Example

Now you give a Union(4, 5)

2/2

1/1

6/0

7/0

4/3

9/03/1 8/0 5/1

10/0

78 / 114

Page 185: 17 Disjoint Set Representation

Properties of ranks

Lemma 1 (About the Rank Properties)1 ∀x, rank[x] ≤ rank[p[x]].2 ∀x and x 6= p[x], then rank[x] < rank[p[x]].3 rank[x] is initially 0.4 rank[x] does not decrease.5 Once x 6= p[x] holds rank[x] does not change.6 rank[p[x]] is a monotonically increasing function of time.

ProofBy induction on the number of operations...

79 / 114

Page 186: 17 Disjoint Set Representation

Properties of ranks

Lemma 1 (About the Rank Properties)1 ∀x, rank[x] ≤ rank[p[x]].2 ∀x and x 6= p[x], then rank[x] < rank[p[x]].3 rank[x] is initially 0.4 rank[x] does not decrease.5 Once x 6= p[x] holds rank[x] does not change.6 rank[p[x]] is a monotonically increasing function of time.

ProofBy induction on the number of operations...

79 / 114

Page 187: 17 Disjoint Set Representation

Properties of ranks

Lemma 1 (About the Rank Properties)1 ∀x, rank[x] ≤ rank[p[x]].2 ∀x and x 6= p[x], then rank[x] < rank[p[x]].3 rank[x] is initially 0.4 rank[x] does not decrease.5 Once x 6= p[x] holds rank[x] does not change.6 rank[p[x]] is a monotonically increasing function of time.

ProofBy induction on the number of operations...

79 / 114

Page 188: 17 Disjoint Set Representation

Properties of ranks

Lemma 1 (About the Rank Properties)1 ∀x, rank[x] ≤ rank[p[x]].2 ∀x and x 6= p[x], then rank[x] < rank[p[x]].3 rank[x] is initially 0.4 rank[x] does not decrease.5 Once x 6= p[x] holds rank[x] does not change.6 rank[p[x]] is a monotonically increasing function of time.

ProofBy induction on the number of operations...

79 / 114

Page 189: 17 Disjoint Set Representation

Properties of ranks

Lemma 1 (About the Rank Properties)1 ∀x, rank[x] ≤ rank[p[x]].2 ∀x and x 6= p[x], then rank[x] < rank[p[x]].3 rank[x] is initially 0.4 rank[x] does not decrease.5 Once x 6= p[x] holds rank[x] does not change.6 rank[p[x]] is a monotonically increasing function of time.

ProofBy induction on the number of operations...

79 / 114

Page 190: 17 Disjoint Set Representation

Properties of ranks

Lemma 1 (About the Rank Properties)1 ∀x, rank[x] ≤ rank[p[x]].2 ∀x and x 6= p[x], then rank[x] < rank[p[x]].3 rank[x] is initially 0.4 rank[x] does not decrease.5 Once x 6= p[x] holds rank[x] does not change.6 rank[p[x]] is a monotonically increasing function of time.

ProofBy induction on the number of operations...

79 / 114

Page 191: 17 Disjoint Set Representation

Properties of ranks

Lemma 1 (About the Rank Properties)1 ∀x, rank[x] ≤ rank[p[x]].2 ∀x and x 6= p[x], then rank[x] < rank[p[x]].3 rank[x] is initially 0.4 rank[x] does not decrease.5 Once x 6= p[x] holds rank[x] does not change.6 rank[p[x]] is a monotonically increasing function of time.

ProofBy induction on the number of operations...

79 / 114

Page 192: 17 Disjoint Set Representation

For Example

Imagine a MakeSet(x)Then, rank [x] ≤ rank [p [x]]Thus, it is true after n operations.The we get the n + 1 operations that can be:

I Case I - FindSet.I Case II - Union.

The rest are for you to proveIt is a good mental exercise!!!

80 / 114

Page 193: 17 Disjoint Set Representation

For Example

Imagine a MakeSet(x)Then, rank [x] ≤ rank [p [x]]Thus, it is true after n operations.The we get the n + 1 operations that can be:

I Case I - FindSet.I Case II - Union.

The rest are for you to proveIt is a good mental exercise!!!

80 / 114

Page 194: 17 Disjoint Set Representation

For Example

Imagine a MakeSet(x)Then, rank [x] ≤ rank [p [x]]Thus, it is true after n operations.The we get the n + 1 operations that can be:

I Case I - FindSet.I Case II - Union.

The rest are for you to proveIt is a good mental exercise!!!

80 / 114

Page 195: 17 Disjoint Set Representation

For Example

Imagine a MakeSet(x)Then, rank [x] ≤ rank [p [x]]Thus, it is true after n operations.The we get the n + 1 operations that can be:

I Case I - FindSet.I Case II - Union.

The rest are for you to proveIt is a good mental exercise!!!

80 / 114

Page 196: 17 Disjoint Set Representation

For Example

Imagine a MakeSet(x)Then, rank [x] ≤ rank [p [x]]Thus, it is true after n operations.The we get the n + 1 operations that can be:

I Case I - FindSet.I Case II - Union.

The rest are for you to proveIt is a good mental exercise!!!

80 / 114

Page 197: 17 Disjoint Set Representation

The Number of Nodes in a Tree

Lemma 2For all tree roots x, size(x) ≥ 2rank[x]

Note size (x)= Number of nodes in tree rooted at x

ProofBy induction on the number of link operations:

Basis StepI Before first link, all ranks are 0 and each tree contains one node.

Inductive StepI Consider linking x and y (Link (x, y))I Assume lemma holds before this operation; we show that it will holds

after.

81 / 114

Page 198: 17 Disjoint Set Representation

The Number of Nodes in a Tree

Lemma 2For all tree roots x, size(x) ≥ 2rank[x]

Note size (x)= Number of nodes in tree rooted at x

ProofBy induction on the number of link operations:

Basis StepI Before first link, all ranks are 0 and each tree contains one node.

Inductive StepI Consider linking x and y (Link (x, y))I Assume lemma holds before this operation; we show that it will holds

after.

81 / 114

Page 199: 17 Disjoint Set Representation

The Number of Nodes in a Tree

Lemma 2For all tree roots x, size(x) ≥ 2rank[x]

Note size (x)= Number of nodes in tree rooted at x

ProofBy induction on the number of link operations:

Basis StepI Before first link, all ranks are 0 and each tree contains one node.

Inductive StepI Consider linking x and y (Link (x, y))I Assume lemma holds before this operation; we show that it will holds

after.

81 / 114

Page 200: 17 Disjoint Set Representation

The Number of Nodes in a Tree

Lemma 2For all tree roots x, size(x) ≥ 2rank[x]

Note size (x)= Number of nodes in tree rooted at x

ProofBy induction on the number of link operations:

Basis StepI Before first link, all ranks are 0 and each tree contains one node.

Inductive StepI Consider linking x and y (Link (x, y))I Assume lemma holds before this operation; we show that it will holds

after.

81 / 114

Page 201: 17 Disjoint Set Representation

The Number of Nodes in a Tree

Lemma 2For all tree roots x, size(x) ≥ 2rank[x]

Note size (x)= Number of nodes in tree rooted at x

ProofBy induction on the number of link operations:

Basis StepI Before first link, all ranks are 0 and each tree contains one node.

Inductive StepI Consider linking x and y (Link (x, y))I Assume lemma holds before this operation; we show that it will holds

after.

81 / 114

Page 202: 17 Disjoint Set Representation

The Number of Nodes in a Tree

Lemma 2For all tree roots x, size(x) ≥ 2rank[x]

Note size (x)= Number of nodes in tree rooted at x

ProofBy induction on the number of link operations:

Basis StepI Before first link, all ranks are 0 and each tree contains one node.

Inductive StepI Consider linking x and y (Link (x, y))I Assume lemma holds before this operation; we show that it will holds

after.

81 / 114

Page 203: 17 Disjoint Set Representation

The Number of Nodes in a Tree

Lemma 2For all tree roots x, size(x) ≥ 2rank[x]

Note size (x)= Number of nodes in tree rooted at x

ProofBy induction on the number of link operations:

Basis StepI Before first link, all ranks are 0 and each tree contains one node.

Inductive StepI Consider linking x and y (Link (x, y))I Assume lemma holds before this operation; we show that it will holds

after.

81 / 114

Page 204: 17 Disjoint Set Representation

Case 1: rank[x ] 6= rank[y]

Assume rank [x ] < rank [y]

Note:rank ′ [x] == rank [x] and rank ′ [y] == rank [y]

82 / 114

Page 205: 17 Disjoint Set Representation

Therefore

We have that

size′ (y) = size (x) + size (y)≥ 2rank[x] + 2rank[y]

≥ 2rank[y]

= 2rank′[y]

83 / 114

Page 206: 17 Disjoint Set Representation

Therefore

We have that

size′ (y) = size (x) + size (y)≥ 2rank[x] + 2rank[y]

≥ 2rank[y]

= 2rank′[y]

83 / 114

Page 207: 17 Disjoint Set Representation

Therefore

We have that

size′ (y) = size (x) + size (y)≥ 2rank[x] + 2rank[y]

≥ 2rank[y]

= 2rank′[y]

83 / 114

Page 208: 17 Disjoint Set Representation

Therefore

We have that

size′ (y) = size (x) + size (y)≥ 2rank[x] + 2rank[y]

≥ 2rank[y]

= 2rank′[y]

83 / 114

Page 209: 17 Disjoint Set Representation

Case 2: rank[x ] == rank[y]

Assume rank [x ] == rank [y]

Note:rank ′ [x] == rank [x] and rank ′ [y] == rank [y] + 1

84 / 114

Page 210: 17 Disjoint Set Representation

Therefore

We have that

size′ (y) = size (x) + size (y)≥ 2rank[x] + 2rank[y]

≥ 2rank[y]+1

= 2rank′[y]

Note: In the worst case rank [x] == rank [y] == 0

85 / 114

Page 211: 17 Disjoint Set Representation

Therefore

We have that

size′ (y) = size (x) + size (y)≥ 2rank[x] + 2rank[y]

≥ 2rank[y]+1

= 2rank′[y]

Note: In the worst case rank [x] == rank [y] == 0

85 / 114

Page 212: 17 Disjoint Set Representation

Therefore

We have that

size′ (y) = size (x) + size (y)≥ 2rank[x] + 2rank[y]

≥ 2rank[y]+1

= 2rank′[y]

Note: In the worst case rank [x] == rank [y] == 0

85 / 114

Page 213: 17 Disjoint Set Representation

Therefore

We have that

size′ (y) = size (x) + size (y)≥ 2rank[x] + 2rank[y]

≥ 2rank[y]+1

= 2rank′[y]

Note: In the worst case rank [x] == rank [y] == 0

85 / 114

Page 214: 17 Disjoint Set Representation

Therefore

We have that

size′ (y) = size (x) + size (y)≥ 2rank[x] + 2rank[y]

≥ 2rank[y]+1

= 2rank′[y]

Note: In the worst case rank [x] == rank [y] == 0

85 / 114

Page 215: 17 Disjoint Set Representation

The number of nodes at certain rank

Lemma 3For any integer r ≥ 0, there are an most n

2r nodes of rank r .

ProofFirst fix r .When rank r is assigned to some node x, then imagine that you labeleach node in the tree rooted at x by “x.”By lemma 21.3, 2r or more nodes are labeled each time whenexecuting a union.By lemma 21.2, each node is labeled at most once, when its root isfirst assigned rank r .If there were more than n

2r nodes of rank r .Then, we will have that more than 2r ·

( n2r)

= n nodes would belabeled by a node of rank r , a contradiction.

86 / 114

Page 216: 17 Disjoint Set Representation

The number of nodes at certain rank

Lemma 3For any integer r ≥ 0, there are an most n

2r nodes of rank r .

ProofFirst fix r .When rank r is assigned to some node x, then imagine that you labeleach node in the tree rooted at x by “x.”By lemma 21.3, 2r or more nodes are labeled each time whenexecuting a union.By lemma 21.2, each node is labeled at most once, when its root isfirst assigned rank r .If there were more than n

2r nodes of rank r .Then, we will have that more than 2r ·

( n2r)

= n nodes would belabeled by a node of rank r , a contradiction.

86 / 114

Page 217: 17 Disjoint Set Representation

The number of nodes at certain rank

Lemma 3For any integer r ≥ 0, there are an most n

2r nodes of rank r .

ProofFirst fix r .When rank r is assigned to some node x, then imagine that you labeleach node in the tree rooted at x by “x.”By lemma 21.3, 2r or more nodes are labeled each time whenexecuting a union.By lemma 21.2, each node is labeled at most once, when its root isfirst assigned rank r .If there were more than n

2r nodes of rank r .Then, we will have that more than 2r ·

( n2r)

= n nodes would belabeled by a node of rank r , a contradiction.

86 / 114

Page 218: 17 Disjoint Set Representation

The number of nodes at certain rank

Lemma 3For any integer r ≥ 0, there are an most n

2r nodes of rank r .

ProofFirst fix r .When rank r is assigned to some node x, then imagine that you labeleach node in the tree rooted at x by “x.”By lemma 21.3, 2r or more nodes are labeled each time whenexecuting a union.By lemma 21.2, each node is labeled at most once, when its root isfirst assigned rank r .If there were more than n

2r nodes of rank r .Then, we will have that more than 2r ·

( n2r)

= n nodes would belabeled by a node of rank r , a contradiction.

86 / 114

Page 219: 17 Disjoint Set Representation

The number of nodes at certain rank

Lemma 3For any integer r ≥ 0, there are an most n

2r nodes of rank r .

ProofFirst fix r .When rank r is assigned to some node x, then imagine that you labeleach node in the tree rooted at x by “x.”By lemma 21.3, 2r or more nodes are labeled each time whenexecuting a union.By lemma 21.2, each node is labeled at most once, when its root isfirst assigned rank r .If there were more than n

2r nodes of rank r .Then, we will have that more than 2r ·

( n2r)

= n nodes would belabeled by a node of rank r , a contradiction.

86 / 114

Page 220: 17 Disjoint Set Representation

The number of nodes at certain rank

Lemma 3For any integer r ≥ 0, there are an most n

2r nodes of rank r .

ProofFirst fix r .When rank r is assigned to some node x, then imagine that you labeleach node in the tree rooted at x by “x.”By lemma 21.3, 2r or more nodes are labeled each time whenexecuting a union.By lemma 21.2, each node is labeled at most once, when its root isfirst assigned rank r .If there were more than n

2r nodes of rank r .Then, we will have that more than 2r ·

( n2r)

= n nodes would belabeled by a node of rank r , a contradiction.

86 / 114

Page 221: 17 Disjoint Set Representation

The number of nodes at certain rank

Lemma 3For any integer r ≥ 0, there are an most n

2r nodes of rank r .

ProofFirst fix r .When rank r is assigned to some node x, then imagine that you labeleach node in the tree rooted at x by “x.”By lemma 21.3, 2r or more nodes are labeled each time whenexecuting a union.By lemma 21.2, each node is labeled at most once, when its root isfirst assigned rank r .If there were more than n

2r nodes of rank r .Then, we will have that more than 2r ·

( n2r)

= n nodes would belabeled by a node of rank r , a contradiction.

86 / 114

Page 222: 17 Disjoint Set Representation

Corollary 1

Corollary 1Every node has rank at most blog nc.

Proofif there is a rank r such that r > log n → n

2r < 1 nodes of rank r acontradiction.

87 / 114

Page 223: 17 Disjoint Set Representation

Corollary 1

Corollary 1Every node has rank at most blog nc.

Proofif there is a rank r such that r > log n → n

2r < 1 nodes of rank r acontradiction.

87 / 114

Page 224: 17 Disjoint Set Representation

Providing the time bound

Lemma 4 (Lemma 21.7)Suppose we convert a sequence S ′ of m′ MakeSet, Union and FindSetoperations into a sequence S of m MakeSet, Link, and FindSet operationsby turning each Union into two FindSet operations followed by a Link.Then, if sequence S runs in O(m log∗ n) time, sequence S ′ runs inO(m′ log∗ n) time.

88 / 114

Page 225: 17 Disjoint Set Representation

Proof:

The proof is quite easy1 Since each UNION operation in sequence S ′ is converted into three

operations in S .

m′ ≤ m ≤ 3m′ (6)

2 We have that m = O (m′)3 Then, if the new sequence S runs in O (m log∗ n) this implies that

the old sequence S ′ runs in O (m′ log∗ n)

89 / 114

Page 226: 17 Disjoint Set Representation

Proof:

The proof is quite easy1 Since each UNION operation in sequence S ′ is converted into three

operations in S .

m′ ≤ m ≤ 3m′ (6)

2 We have that m = O (m′)3 Then, if the new sequence S runs in O (m log∗ n) this implies that

the old sequence S ′ runs in O (m′ log∗ n)

89 / 114

Page 227: 17 Disjoint Set Representation

Proof:

The proof is quite easy1 Since each UNION operation in sequence S ′ is converted into three

operations in S .

m′ ≤ m ≤ 3m′ (6)

2 We have that m = O (m′)3 Then, if the new sequence S runs in O (m log∗ n) this implies that

the old sequence S ′ runs in O (m′ log∗ n)

89 / 114

Page 228: 17 Disjoint Set Representation

Theorem for Union by Rank and Path Compression

TheoremAny sequence of m MakeSet, Link, and FindSet operations, n of which areMakeSet operations, is performed in worst-case time O(m log∗ n).

ProofFirst, MakeSet and Link take O(1) time.The Key of the Analysis is to Accurately Charging FindSet.

90 / 114

Page 229: 17 Disjoint Set Representation

Theorem for Union by Rank and Path Compression

TheoremAny sequence of m MakeSet, Link, and FindSet operations, n of which areMakeSet operations, is performed in worst-case time O(m log∗ n).

ProofFirst, MakeSet and Link take O(1) time.The Key of the Analysis is to Accurately Charging FindSet.

90 / 114

Page 230: 17 Disjoint Set Representation

Theorem for Union by Rank and Path Compression

TheoremAny sequence of m MakeSet, Link, and FindSet operations, n of which areMakeSet operations, is performed in worst-case time O(m log∗ n).

ProofFirst, MakeSet and Link take O(1) time.The Key of the Analysis is to Accurately Charging FindSet.

90 / 114

Page 231: 17 Disjoint Set Representation

For this, we have the following

We can do the followingPartition ranks into blocks.Put each rank j into block log∗ r for r = 0, 1, ..., blog nc (Corollary 1).Highest-numbered block is log∗(log n) = (log∗ n)− 1.

In addition, the cost of FindSet pays for the foollowing situations1 The FindSet pays for the cost of the root and its child.2 A bill is given to every node whose rank parent changes in the path

compression!!!

91 / 114

Page 232: 17 Disjoint Set Representation

For this, we have the following

We can do the followingPartition ranks into blocks.Put each rank j into block log∗ r for r = 0, 1, ..., blog nc (Corollary 1).Highest-numbered block is log∗(log n) = (log∗ n)− 1.

In addition, the cost of FindSet pays for the foollowing situations1 The FindSet pays for the cost of the root and its child.2 A bill is given to every node whose rank parent changes in the path

compression!!!

91 / 114

Page 233: 17 Disjoint Set Representation

For this, we have the following

We can do the followingPartition ranks into blocks.Put each rank j into block log∗ r for r = 0, 1, ..., blog nc (Corollary 1).Highest-numbered block is log∗(log n) = (log∗ n)− 1.

In addition, the cost of FindSet pays for the foollowing situations1 The FindSet pays for the cost of the root and its child.2 A bill is given to every node whose rank parent changes in the path

compression!!!

91 / 114

Page 234: 17 Disjoint Set Representation

For this, we have the following

We can do the followingPartition ranks into blocks.Put each rank j into block log∗ r for r = 0, 1, ..., blog nc (Corollary 1).Highest-numbered block is log∗(log n) = (log∗ n)− 1.

In addition, the cost of FindSet pays for the foollowing situations1 The FindSet pays for the cost of the root and its child.2 A bill is given to every node whose rank parent changes in the path

compression!!!

91 / 114

Page 235: 17 Disjoint Set Representation

For this, we have the following

We can do the followingPartition ranks into blocks.Put each rank j into block log∗ r for r = 0, 1, ..., blog nc (Corollary 1).Highest-numbered block is log∗(log n) = (log∗ n)− 1.

In addition, the cost of FindSet pays for the foollowing situations1 The FindSet pays for the cost of the root and its child.2 A bill is given to every node whose rank parent changes in the path

compression!!!

91 / 114

Page 236: 17 Disjoint Set Representation

Now, define the Block function

Define the following Upper Bound Function

B(j) ≡

−1 if j = −11 if j = 02 if j = 1

22. . .

2}j−1

if j ≥ 2

92 / 114

Page 237: 17 Disjoint Set Representation

First

Something NotableThese are going to be the upper bounds for blocks in the ranks

WhereFor j = 0, 1, ..., log∗ n − 1, block j consist of the set of ranks:

B(j − 1) + 1,B(j − 1) + 2, ...,B(j)︸ ︷︷ ︸Elements in Block j

(7)

93 / 114

Page 238: 17 Disjoint Set Representation

First

Something NotableThese are going to be the upper bounds for blocks in the ranks

WhereFor j = 0, 1, ..., log∗ n − 1, block j consist of the set of ranks:

B(j − 1) + 1,B(j − 1) + 2, ...,B(j)︸ ︷︷ ︸Elements in Block j

(7)

93 / 114

Page 239: 17 Disjoint Set Representation

For Example

We have thatB(−1) = 1

B(0) = 0B(1) = 2B(2) = 22 = 4

B(3) = 222 = 24 = 16

B(4) = 2222= 216 = 65536

94 / 114

Page 240: 17 Disjoint Set Representation

For Example

We have thatB(−1) = 1

B(0) = 0B(1) = 2B(2) = 22 = 4

B(3) = 222 = 24 = 16

B(4) = 2222= 216 = 65536

94 / 114

Page 241: 17 Disjoint Set Representation

For Example

We have thatB(−1) = 1

B(0) = 0B(1) = 2B(2) = 22 = 4

B(3) = 222 = 24 = 16

B(4) = 2222= 216 = 65536

94 / 114

Page 242: 17 Disjoint Set Representation

For Example

We have thatB(−1) = 1

B(0) = 0B(1) = 2B(2) = 22 = 4

B(3) = 222 = 24 = 16

B(4) = 2222= 216 = 65536

94 / 114

Page 243: 17 Disjoint Set Representation

For Example

We have thatB(−1) = 1

B(0) = 0B(1) = 2B(2) = 22 = 4

B(3) = 222 = 24 = 16

B(4) = 2222= 216 = 65536

94 / 114

Page 244: 17 Disjoint Set Representation

For Example

We have thatB(−1) = 1

B(0) = 0B(1) = 2B(2) = 22 = 4

B(3) = 222 = 24 = 16

B(4) = 2222= 216 = 65536

94 / 114

Page 245: 17 Disjoint Set Representation

For Example

Thus, we haveBlock j Set of Ranks

0 0,11 22 3,43 5,...,164 17,...,65536...

...

Note B(j) = 2B(j−1) for j > 0.

95 / 114

Page 246: 17 Disjoint Set Representation

For Example

Thus, we haveBlock j Set of Ranks

0 0,11 22 3,43 5,...,164 17,...,65536...

...

Note B(j) = 2B(j−1) for j > 0.

95 / 114

Page 247: 17 Disjoint Set Representation

Example

Now you give a Union(4, 5)

2/2

1/1

6/0

7/0

4/3

9/03/1 8/0 5/1

10/0

Bock 0

Bock 1

Bock 2

96 / 114

Page 248: 17 Disjoint Set Representation

Finally

Given our Bound in the RanksThus, all the blocks from B (0) to B (log∗ n − 1) will be used for storingthe ranking elements

97 / 114

Page 249: 17 Disjoint Set Representation

Charging for FindSets

Two types of charges for FindSet(x0)Block charges and Path charges.

Charge eachnode as either:1) Block Charge2) Path Charge

98 / 114

Page 250: 17 Disjoint Set Representation

Charging for FindSets

Thus, for find setsThe find operation pays for the work done for the root and itsimmediate child.It also pays for all the nodes which are not in the same block astheir parents.

99 / 114

Page 251: 17 Disjoint Set Representation

Charging for FindSets

Thus, for find setsThe find operation pays for the work done for the root and itsimmediate child.It also pays for all the nodes which are not in the same block astheir parents.

99 / 114

Page 252: 17 Disjoint Set Representation

Then

First1 All these nodes are children of some other nodes, so their ranks will

not change and they are bound to stay in the same block until theend of the computation.

2 If a node is in the same block as its parent, it will be charged for thework done in the FindSet Operation!!!

100 / 114

Page 253: 17 Disjoint Set Representation

Then

First1 All these nodes are children of some other nodes, so their ranks will

not change and they are bound to stay in the same block until theend of the computation.

2 If a node is in the same block as its parent, it will be charged for thework done in the FindSet Operation!!!

100 / 114

Page 254: 17 Disjoint Set Representation

Thus

We have the following chargesBlock Charge :

I For j = 0, 1, ..., log∗ n − 1, give one block charge to the last node withrank in block j on the path x0, x1, ..., xl .

I Also give one block charge to the child of the root, i.e., xl−1, and theroot itself, i.e., xl−1.

Path Charge :I Give nodes in x0, ..., xl a path charge until they are moved to point to a

name element with a rank different from the child’s block

101 / 114

Page 255: 17 Disjoint Set Representation

Thus

We have the following chargesBlock Charge :

I For j = 0, 1, ..., log∗ n − 1, give one block charge to the last node withrank in block j on the path x0, x1, ..., xl .

I Also give one block charge to the child of the root, i.e., xl−1, and theroot itself, i.e., xl−1.

Path Charge :I Give nodes in x0, ..., xl a path charge until they are moved to point to a

name element with a rank different from the child’s block

101 / 114

Page 256: 17 Disjoint Set Representation

Thus

We have the following chargesBlock Charge :

I For j = 0, 1, ..., log∗ n − 1, give one block charge to the last node withrank in block j on the path x0, x1, ..., xl .

I Also give one block charge to the child of the root, i.e., xl−1, and theroot itself, i.e., xl−1.

Path Charge :I Give nodes in x0, ..., xl a path charge until they are moved to point to a

name element with a rank different from the child’s block

101 / 114

Page 257: 17 Disjoint Set Representation

Thus

We have the following chargesBlock Charge :

I For j = 0, 1, ..., log∗ n − 1, give one block charge to the last node withrank in block j on the path x0, x1, ..., xl .

I Also give one block charge to the child of the root, i.e., xl−1, and theroot itself, i.e., xl−1.

Path Charge :I Give nodes in x0, ..., xl a path charge until they are moved to point to a

name element with a rank different from the child’s block

101 / 114

Page 258: 17 Disjoint Set Representation

Thus

We have the following chargesBlock Charge :

I For j = 0, 1, ..., log∗ n − 1, give one block charge to the last node withrank in block j on the path x0, x1, ..., xl .

I Also give one block charge to the child of the root, i.e., xl−1, and theroot itself, i.e., xl−1.

Path Charge :I Give nodes in x0, ..., xl a path charge until they are moved to point to a

name element with a rank different from the child’s block

101 / 114

Page 259: 17 Disjoint Set Representation

Charging for FindSets

Two types of charges for FindSet(x0)Block charges and Path charges.

Charge eachnode as either:1) Block Charge2) Path Charge

102 / 114

Page 260: 17 Disjoint Set Representation

Charging for FindSets

Two types of charges for FindSet(x0)Block charges and Path charges.

Charge eachnode as either:1) Block Charge2) Path Charge

102 / 114

Page 261: 17 Disjoint Set Representation

Next

Something NotableNumber of nodes whose parents are in different blocks is limited by(log∗ n)− 1.

I Making it an upper bound for the charges when changing the lastnode with rank in block j.

2 charges for the root and its child.

ThusThe cost of the Block Charges for the FindSet operation is upper boundedby:

log∗ n − 1 + 2 = log∗ n + 1. (8)

103 / 114

Page 262: 17 Disjoint Set Representation

Next

Something NotableNumber of nodes whose parents are in different blocks is limited by(log∗ n)− 1.

I Making it an upper bound for the charges when changing the lastnode with rank in block j.

2 charges for the root and its child.

ThusThe cost of the Block Charges for the FindSet operation is upper boundedby:

log∗ n − 1 + 2 = log∗ n + 1. (8)

103 / 114

Page 263: 17 Disjoint Set Representation

Next

Something NotableNumber of nodes whose parents are in different blocks is limited by(log∗ n)− 1.

I Making it an upper bound for the charges when changing the lastnode with rank in block j.

2 charges for the root and its child.

ThusThe cost of the Block Charges for the FindSet operation is upper boundedby:

log∗ n − 1 + 2 = log∗ n + 1. (8)

103 / 114

Page 264: 17 Disjoint Set Representation

Next

Something NotableNumber of nodes whose parents are in different blocks is limited by(log∗ n)− 1.

I Making it an upper bound for the charges when changing the lastnode with rank in block j.

2 charges for the root and its child.

ThusThe cost of the Block Charges for the FindSet operation is upper boundedby:

log∗ n − 1 + 2 = log∗ n + 1. (8)

103 / 114

Page 265: 17 Disjoint Set Representation

Next

Something NotableNumber of nodes whose parents are in different blocks is limited by(log∗ n)− 1.

I Making it an upper bound for the charges when changing the lastnode with rank in block j.

2 charges for the root and its child.

ThusThe cost of the Block Charges for the FindSet operation is upper boundedby:

log∗ n − 1 + 2 = log∗ n + 1. (8)

103 / 114

Page 266: 17 Disjoint Set Representation

Claim

ClaimOnce a node other than a root or its child is given a Block Charge (B.C.),it will never be given a Path Charge (P.C.)

104 / 114

Page 267: 17 Disjoint Set Representation

Proof

ProofGiven a node x, we know that:

rank [p [x]]− rank [x] is monotonically increasing ⇒log∗ rank [p [x]]− log∗ rank [x] is monotonically increasing.Thus, Once x and p[x] are in different blocks, they will always be indifferent blocks because:

I The rank of the parent can only increases.I And the child’s rank stays the same

Thus, the node x will be billed in the first FindSet operation a patchcharge and block charge if necessary.Thus, the node x will never be charged again a path charge becauseis already pointing to the member set name.

105 / 114

Page 268: 17 Disjoint Set Representation

Proof

ProofGiven a node x, we know that:

rank [p [x]]− rank [x] is monotonically increasing ⇒log∗ rank [p [x]]− log∗ rank [x] is monotonically increasing.Thus, Once x and p[x] are in different blocks, they will always be indifferent blocks because:

I The rank of the parent can only increases.I And the child’s rank stays the same

Thus, the node x will be billed in the first FindSet operation a patchcharge and block charge if necessary.Thus, the node x will never be charged again a path charge becauseis already pointing to the member set name.

105 / 114

Page 269: 17 Disjoint Set Representation

Proof

ProofGiven a node x, we know that:

rank [p [x]]− rank [x] is monotonically increasing ⇒log∗ rank [p [x]]− log∗ rank [x] is monotonically increasing.Thus, Once x and p[x] are in different blocks, they will always be indifferent blocks because:

I The rank of the parent can only increases.I And the child’s rank stays the same

Thus, the node x will be billed in the first FindSet operation a patchcharge and block charge if necessary.Thus, the node x will never be charged again a path charge becauseis already pointing to the member set name.

105 / 114

Page 270: 17 Disjoint Set Representation

Proof

ProofGiven a node x, we know that:

rank [p [x]]− rank [x] is monotonically increasing ⇒log∗ rank [p [x]]− log∗ rank [x] is monotonically increasing.Thus, Once x and p[x] are in different blocks, they will always be indifferent blocks because:

I The rank of the parent can only increases.I And the child’s rank stays the same

Thus, the node x will be billed in the first FindSet operation a patchcharge and block charge if necessary.Thus, the node x will never be charged again a path charge becauseis already pointing to the member set name.

105 / 114

Page 271: 17 Disjoint Set Representation

Proof

ProofGiven a node x, we know that:

rank [p [x]]− rank [x] is monotonically increasing ⇒log∗ rank [p [x]]− log∗ rank [x] is monotonically increasing.Thus, Once x and p[x] are in different blocks, they will always be indifferent blocks because:

I The rank of the parent can only increases.I And the child’s rank stays the same

Thus, the node x will be billed in the first FindSet operation a patchcharge and block charge if necessary.Thus, the node x will never be charged again a path charge becauseis already pointing to the member set name.

105 / 114

Page 272: 17 Disjoint Set Representation

Proof

ProofGiven a node x, we know that:

rank [p [x]]− rank [x] is monotonically increasing ⇒log∗ rank [p [x]]− log∗ rank [x] is monotonically increasing.Thus, Once x and p[x] are in different blocks, they will always be indifferent blocks because:

I The rank of the parent can only increases.I And the child’s rank stays the same

Thus, the node x will be billed in the first FindSet operation a patchcharge and block charge if necessary.Thus, the node x will never be charged again a path charge becauseis already pointing to the member set name.

105 / 114

Page 273: 17 Disjoint Set Representation

Proof

ProofGiven a node x, we know that:

rank [p [x]]− rank [x] is monotonically increasing ⇒log∗ rank [p [x]]− log∗ rank [x] is monotonically increasing.Thus, Once x and p[x] are in different blocks, they will always be indifferent blocks because:

I The rank of the parent can only increases.I And the child’s rank stays the same

Thus, the node x will be billed in the first FindSet operation a patchcharge and block charge if necessary.Thus, the node x will never be charged again a path charge becauseis already pointing to the member set name.

105 / 114

Page 274: 17 Disjoint Set Representation

Remaining Goal

The Total cost of the FindSet’s OperationsTotal cost of FindSet’s = Total Block Charges + Total Path Charges.

We want to showTotal Block Charges + Total Path Charges= O(m log∗ n)

106 / 114

Page 275: 17 Disjoint Set Representation

Remaining Goal

The Total cost of the FindSet’s OperationsTotal cost of FindSet’s = Total Block Charges + Total Path Charges.

We want to showTotal Block Charges + Total Path Charges= O(m log∗ n)

106 / 114

Page 276: 17 Disjoint Set Representation

Bounding Block Charges

This part is easyBlock numbers range over 0, ..., log∗ n − 1.

The number of Block Charges per FindSet is ≤ log∗ n + 1 .The total number of FindSet’s is ≤ mThe total number of Block Charges is ≤ m(log∗ n + 1) .

107 / 114

Page 277: 17 Disjoint Set Representation

Bounding Block Charges

This part is easyBlock numbers range over 0, ..., log∗ n − 1.

The number of Block Charges per FindSet is ≤ log∗ n + 1 .The total number of FindSet’s is ≤ mThe total number of Block Charges is ≤ m(log∗ n + 1) .

107 / 114

Page 278: 17 Disjoint Set Representation

Bounding Block Charges

This part is easyBlock numbers range over 0, ..., log∗ n − 1.

The number of Block Charges per FindSet is ≤ log∗ n + 1 .The total number of FindSet’s is ≤ mThe total number of Block Charges is ≤ m(log∗ n + 1) .

107 / 114

Page 279: 17 Disjoint Set Representation

Bounding Block Charges

This part is easyBlock numbers range over 0, ..., log∗ n − 1.

The number of Block Charges per FindSet is ≤ log∗ n + 1 .The total number of FindSet’s is ≤ mThe total number of Block Charges is ≤ m(log∗ n + 1) .

107 / 114

Page 280: 17 Disjoint Set Representation

Bounding Path ChargesClaimLet N (j) be the number of nodes whose ranks are in block j. Then, for allj ≥ 0, N (j) ≤ 3n

2B(j)

Proof

By Lemma 3, N (j) ≤B(j)∑

r=B(j−1)+1

n2r summing over all possible ranks

For j = 0:

N (0) ≤ n20 + n

2= 3n

2= 3n

2B(0)108 / 114

Page 281: 17 Disjoint Set Representation

Bounding Path ChargesClaimLet N (j) be the number of nodes whose ranks are in block j. Then, for allj ≥ 0, N (j) ≤ 3n

2B(j)

Proof

By Lemma 3, N (j) ≤B(j)∑

r=B(j−1)+1

n2r summing over all possible ranks

For j = 0:

N (0) ≤ n20 + n

2= 3n

2= 3n

2B(0)108 / 114

Page 282: 17 Disjoint Set Representation

Bounding Path ChargesClaimLet N (j) be the number of nodes whose ranks are in block j. Then, for allj ≥ 0, N (j) ≤ 3n

2B(j)

Proof

By Lemma 3, N (j) ≤B(j)∑

r=B(j−1)+1

n2r summing over all possible ranks

For j = 0:

N (0) ≤ n20 + n

2= 3n

2= 3n

2B(0)108 / 114

Page 283: 17 Disjoint Set Representation

Bounding Path ChargesClaimLet N (j) be the number of nodes whose ranks are in block j. Then, for allj ≥ 0, N (j) ≤ 3n

2B(j)

Proof

By Lemma 3, N (j) ≤B(j)∑

r=B(j−1)+1

n2r summing over all possible ranks

For j = 0:

N (0) ≤ n20 + n

2= 3n

2= 3n

2B(0)108 / 114

Page 284: 17 Disjoint Set Representation

Bounding Path ChargesClaimLet N (j) be the number of nodes whose ranks are in block j. Then, for allj ≥ 0, N (j) ≤ 3n

2B(j)

Proof

By Lemma 3, N (j) ≤B(j)∑

r=B(j−1)+1

n2r summing over all possible ranks

For j = 0:

N (0) ≤ n20 + n

2= 3n

2= 3n

2B(0)108 / 114

Page 285: 17 Disjoint Set Representation

Proof of claim

For j ≥ 1

N (j) ≤ n2B(j−1)+1

B(j)−(B(j−1)+1)∑r=0

12r

<n

2B(j−1)+1

∞∑r=0

12r

= n2B(j−1) This is where the fact that B (j) = 2B(j−1)is used.

= nB(j)

<3n

2B(j)

109 / 114

Page 286: 17 Disjoint Set Representation

Proof of claim

For j ≥ 1

N (j) ≤ n2B(j−1)+1

B(j)−(B(j−1)+1)∑r=0

12r

<n

2B(j−1)+1

∞∑r=0

12r

= n2B(j−1) This is where the fact that B (j) = 2B(j−1)is used.

= nB(j)

<3n

2B(j)

109 / 114

Page 287: 17 Disjoint Set Representation

Proof of claim

For j ≥ 1

N (j) ≤ n2B(j−1)+1

B(j)−(B(j−1)+1)∑r=0

12r

<n

2B(j−1)+1

∞∑r=0

12r

= n2B(j−1) This is where the fact that B (j) = 2B(j−1)is used.

= nB(j)

<3n

2B(j)

109 / 114

Page 288: 17 Disjoint Set Representation

Proof of claim

For j ≥ 1

N (j) ≤ n2B(j−1)+1

B(j)−(B(j−1)+1)∑r=0

12r

<n

2B(j−1)+1

∞∑r=0

12r

= n2B(j−1) This is where the fact that B (j) = 2B(j−1)is used.

= nB(j)

<3n

2B(j)

109 / 114

Page 289: 17 Disjoint Set Representation

Proof of claim

For j ≥ 1

N (j) ≤ n2B(j−1)+1

B(j)−(B(j−1)+1)∑r=0

12r

<n

2B(j−1)+1

∞∑r=0

12r

= n2B(j−1) This is where the fact that B (j) = 2B(j−1)is used.

= nB(j)

<3n

2B(j)

109 / 114

Page 290: 17 Disjoint Set Representation

Bounding Path Charges

We have the followingLet P(n) denote the overall number of path charges. Then:

P(n) ≤log∗ n−1∑

j=0αj · βj (9)

I αj is the max number of nodes with ranks in Block jI βj is the max number of path charges per node of Block j.

110 / 114

Page 291: 17 Disjoint Set Representation

Bounding Path Charges

We have the followingLet P(n) denote the overall number of path charges. Then:

P(n) ≤log∗ n−1∑

j=0αj · βj (9)

I αj is the max number of nodes with ranks in Block jI βj is the max number of path charges per node of Block j.

110 / 114

Page 292: 17 Disjoint Set Representation

Bounding Path Charges

We have the followingLet P(n) denote the overall number of path charges. Then:

P(n) ≤log∗ n−1∑

j=0αj · βj (9)

I αj is the max number of nodes with ranks in Block jI βj is the max number of path charges per node of Block j.

110 / 114

Page 293: 17 Disjoint Set Representation

Then, we have the following

Upper BoundsBy claim, αj upper-bounded by 3n

2B(j) ,In addition, we need to bound βj that represents the maximumnumber of path charges for nodes x at block j.

Note: Any node in Block j that is given a P.C. will be in Block jafter all m operations.

111 / 114

Page 294: 17 Disjoint Set Representation

Then, we have the following

Upper BoundsBy claim, αj upper-bounded by 3n

2B(j) ,In addition, we need to bound βj that represents the maximumnumber of path charges for nodes x at block j.

Note: Any node in Block j that is given a P.C. will be in Block jafter all m operations.

111 / 114

Page 295: 17 Disjoint Set Representation

Then, we have the following

Upper BoundsBy claim, αj upper-bounded by 3n

2B(j) ,In addition, we need to bound βj that represents the maximumnumber of path charges for nodes x at block j.

Note: Any node in Block j that is given a P.C. will be in Block jafter all m operations.

111 / 114

Page 296: 17 Disjoint Set Representation

Then, we have the following

Upper BoundsBy claim, αj upper-bounded by 3n

2B(j) ,In addition, we need to bound βj that represents the maximumnumber of path charges for nodes x at block j.

Note: Any node in Block j that is given a P.C. will be in Block jafter all m operations.

Path Compressionis issued

111 / 114

Page 297: 17 Disjoint Set Representation

Now, we bound βj

So, every time x is assessed a Path Charges, it gets a new parent withincreased rank.

Note: x’s rank is not changed by path compression

Suppose x has a rank in Block jRepeated Path Charges to x will ultimately result in x’s parent havinga rank in a Block higher than j.From that point onward, x is given Block Charges, not Path Charges.

Therefore, the Worst Casex has the lowest rank in Block j, i.e., B (j − 1) + 1, and x’s parentsranks successively take on the values.

B(j − 1) + 2,B(j − 1) + 3, ...,B(j)

112 / 114

Page 298: 17 Disjoint Set Representation

Now, we bound βj

So, every time x is assessed a Path Charges, it gets a new parent withincreased rank.

Note: x’s rank is not changed by path compression

Suppose x has a rank in Block jRepeated Path Charges to x will ultimately result in x’s parent havinga rank in a Block higher than j.From that point onward, x is given Block Charges, not Path Charges.

Therefore, the Worst Casex has the lowest rank in Block j, i.e., B (j − 1) + 1, and x’s parentsranks successively take on the values.

B(j − 1) + 2,B(j − 1) + 3, ...,B(j)

112 / 114

Page 299: 17 Disjoint Set Representation

Now, we bound βj

So, every time x is assessed a Path Charges, it gets a new parent withincreased rank.

Note: x’s rank is not changed by path compression

Suppose x has a rank in Block jRepeated Path Charges to x will ultimately result in x’s parent havinga rank in a Block higher than j.From that point onward, x is given Block Charges, not Path Charges.

Therefore, the Worst Casex has the lowest rank in Block j, i.e., B (j − 1) + 1, and x’s parentsranks successively take on the values.

B(j − 1) + 2,B(j − 1) + 3, ...,B(j)

112 / 114

Page 300: 17 Disjoint Set Representation

Now, we bound βj

So, every time x is assessed a Path Charges, it gets a new parent withincreased rank.

Note: x’s rank is not changed by path compression

Suppose x has a rank in Block jRepeated Path Charges to x will ultimately result in x’s parent havinga rank in a Block higher than j.From that point onward, x is given Block Charges, not Path Charges.

Therefore, the Worst Casex has the lowest rank in Block j, i.e., B (j − 1) + 1, and x’s parentsranks successively take on the values.

B(j − 1) + 2,B(j − 1) + 3, ...,B(j)

112 / 114

Page 301: 17 Disjoint Set Representation

Now, we bound βj

So, every time x is assessed a Path Charges, it gets a new parent withincreased rank.

Note: x’s rank is not changed by path compression

Suppose x has a rank in Block jRepeated Path Charges to x will ultimately result in x’s parent havinga rank in a Block higher than j.From that point onward, x is given Block Charges, not Path Charges.

Therefore, the Worst Casex has the lowest rank in Block j, i.e., B (j − 1) + 1, and x’s parentsranks successively take on the values.

B(j − 1) + 2,B(j − 1) + 3, ...,B(j)

112 / 114

Page 302: 17 Disjoint Set Representation

Finally

Hence, x can be given at most B(j)− B(j − 1)− 1 Path Charges.Therefore:

P(n) ≤log∗ n−1∑

j=03n

2B(j)(B(j)− B(j − 1)− 1)

P(n) ≤log∗ n−1∑

j=03n

2B(j)B(j)

P(n) = 32n log∗ n

113 / 114

Page 303: 17 Disjoint Set Representation

Finally

Hence, x can be given at most B(j)− B(j − 1)− 1 Path Charges.Therefore:

P(n) ≤log∗ n−1∑

j=03n

2B(j)(B(j)− B(j − 1)− 1)

P(n) ≤log∗ n−1∑

j=03n

2B(j)B(j)

P(n) = 32n log∗ n

113 / 114

Page 304: 17 Disjoint Set Representation

Finally

Hence, x can be given at most B(j)− B(j − 1)− 1 Path Charges.Therefore:

P(n) ≤log∗ n−1∑

j=03n

2B(j)(B(j)− B(j − 1)− 1)

P(n) ≤log∗ n−1∑

j=03n

2B(j)B(j)

P(n) = 32n log∗ n

113 / 114

Page 305: 17 Disjoint Set Representation

Finally

Hence, x can be given at most B(j)− B(j − 1)− 1 Path Charges.Therefore:

P(n) ≤log∗ n−1∑

j=03n

2B(j)(B(j)− B(j − 1)− 1)

P(n) ≤log∗ n−1∑

j=03n

2B(j)B(j)

P(n) = 32n log∗ n

113 / 114

Page 306: 17 Disjoint Set Representation

Thus

FindSet operations contribute

O(m(log∗ n + 1) + n log∗ n) = O(m log∗ n) (10)

MakeSet and Link contribute O(n)Entire sequence takes O (m log∗ n).

114 / 114

Page 307: 17 Disjoint Set Representation

Thus

FindSet operations contribute

O(m(log∗ n + 1) + n log∗ n) = O(m log∗ n) (10)

MakeSet and Link contribute O(n)Entire sequence takes O (m log∗ n).

114 / 114