168
Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat: Take two balls from the bag. If they are the same color, discard them both and add a blue ball. If they are different colors, discard the blue ball and put the red ball back. What do you know about the color of the final ball?

Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Algorithms at Scale

(Week 2)

Puzzle of the Day:

A bag contains a collection of blue and red balls. Repeat:

• Take two balls from the bag.

• If they are the same color, discard them both and add a blue ball.

• If they are different colors, discard the blue ball and put the red ball back.

What do you know about the color of the final ball?

Page 2: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Summary

Today:

Number of connected components in a graph.

• Additive approximation algorithm.

Weight of MST

• Multiplicative approximation algorithm.

Last Week:

Toy example 1: array all 0’s?

• Gap-style question: All 0’s or far from all 0’s?

Toy example 2: Faction of 1’s?

• Additive ± 𝜀 approximation

• Hoeffding Bound

Is the graph connected?

• Gap-style question.

• O(1) time algorithm.

• Correct with probability 2/3. 9 dots4 lines

Page 3: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Announcements / Reminders

Problem sets:

Problem Set 1 was due today.

Problem Set 2 will be released tonight.

Page 4: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Announcements / Reminders

Next Week: Guest Lecture

Arnab Bhattacharyya

Arnab’s research:“My research area is theoretical computer science, in a broad sense. More specifically, I am interested in algorithms for big data, computational complexity, analysis and extremal combinatorics on finite fields, and algorithmic models for natural systems.”

Page 5: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Connected Components

Assumptions:

Graph G = (V,E)

• Undirected

• n nodes

• m edges

• maximum degree d

Error term: 𝜀

Output: Number of connected components.

Example: output 3

A

B

c

Page 6: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Connected Components

Approximation:

Output C such that:

Alternate form:

Correct output: w.p. > 2/3

Example: 𝜀 = 1/10Output ∊ {2,3,4}

A

B

c

Page 7: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Connected Components

When is this useful?

What are trivial values of 𝜀?

What are hard values of 𝜀?

What sort of applications is this useful for?

Page 8: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

When is this useful?

What are interesting values of 𝜀?

• What happens when 𝜀 = 1?

• What happens when 𝜀 = 1/(2n)?

What sort of applications is this useful for?

• Large graphs?

• Large social networks?

• The internet?

• Networks with many connected components?

• Number of components follows a heavy tail distribution?

Page 9: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Define: per-node cost

Let n(u) = number of nodes in the connected component containing node u.

wx

y

z

Key Idea 1:

n(w) = 6n(x) = 6

n(y) = 3

n(z) = 1

A

B

c

Page 10: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Define: per-node cost

Let n(u) = number of nodes in the connected component containing node u.

Let cost(u) = 1/n(u).

Key Idea 1:

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 11: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Why is this useful?

Key Idea 1:

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 12: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Why is this useful?

Approximate Connected Components

Key Idea 1:

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 13: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Why is this useful?

Approximate Connected Components

Key Idea 1:

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 14: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Why is this useful?

Approximate Connected Components

Key Idea 1:

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 15: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for each u in V:

sum = sum + cost(u)return sum

Approximate Connected Components

Algorithm 1

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 16: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for each u in V:

sum = sum + cost(u)return sum

Comments:• Need a way to efficiently

compute cost(u).

• Runs in O(n) time.

Approximate Connected Components

Algorithm 1

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 17: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Sample

• Choose a small random subset S of V.

• For each node u in S, compute cost(u).

• Use the sample to estimate the average cost of all the nodes.

Approximate Connected Components

Key Idea 2: Sampling

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 18: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Worries?

Approximate Connected Components

Key Idea 2: Sampling

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 19: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Worries?

• Big components are sampled more often than small components?

• Small components may never be sampled?

• Bad examples?1 component of size 90, 10 components of size 1

Approximate Connected Components

Key Idea 2: Sampling

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 20: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

Comments:• (sum/s) is average cost of sample.

• Efficiently compute cost(u)?

• Runs in O(s) time.

Approximate Connected Components

Algorithm 2

y

z

cost(y) = 1/3

cost(z) = 1

wx

cost(w) = 1/6cost(x) = 1/6

A

B

c

Page 21: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

Define random variables: Y1, Y2, …, Ys

Approximate Connected Components

Algorithm 2 Analysis

Page 22: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

Approximate Connected Components

Algorithm 2 Analysis

Page 23: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

Approximate Connected Components

Algorithm 2 Analysis

Page 24: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

Approximate Connected Components

Algorithm 2 Analysis

Page 25: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

Approximate Connected Components

Algorithm 2 Analysis

Page 26: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

Approximate Connected Components

Algorithm 2 Analysis

Page 27: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

Notice:

Output of algorithm is:

Approximate Connected Components

Algorithm 2 Analysis

Page 28: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

Notice:

Expected output of algorithm is:

Approximate Connected Components

Algorithm 2 Analysis

Page 29: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

Important step:

Expected out is number of connected components!

(Algorithm is an unbiased estimator.)

Approximate Connected Components

Algorithm 2 Analysis

Page 30: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

Notice:

Goal:

Approximate Connected Components

Algorithm 2 Analysis

Page 31: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

Notice:

Goal:

Approximate Connected Components

Algorithm 2 Analysis

Page 32: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Given: independent random variables Y1, Y2, …, Ys

Assume: each Yj ∊ [0,1]

Then:

Approximate Connected Components

Reminder: Hoeffding Bound

Page 33: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Given: independent random variables Y1, Y2, …, Ys

Assume: each Yj ∊ [0,1]

Then:

Approximate Connected Components

Reminder: Hoeffding Bound

Goal:

Page 34: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Derivation:

Approximate Connected Components

Algorithm 2 Analysis

Page 35: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Derivation:

Approximate Connected Components

Algorithm 2 Analysis

Page 36: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Derivation:

Approximate Connected Components

Algorithm 2 Analysis

Page 37: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Derivation:

Approximate Connected Components

Algorithm 2 Analysis

Page 38: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Derivation:

Approximate Connected Components

Algorithm 2 Analysis

Page 39: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Algorithm 2 Analysis

Derivation:

Page 40: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Algorithm 2 Analysis

Derivation:

Page 41: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Algorithm 2 Analysis

Derivation:

Page 42: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Algorithm 2 Analysis

Derivation:

Page 43: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Algorithm 2 Analysis

Derivation:

Page 44: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

We have shown:W.p. > 2/3, output is equal to:

CC(G) ± 𝜀n/2

Approximate Connected Components

Algorithm 2

y

z

cost(y) = 1/3

cost(z) = 1

wx

cost(w) = 1/6cost(x) = 1/6

A

B

c

Page 45: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

We have shown:Time: O(1/𝜀2)

Approximate Connected Components

Algorithm 2

y

z

cost(y) = 1/3

cost(z) = 1

wx

cost(w) = 1/6cost(x) = 1/6

A

B

c

Page 46: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Key problem:

How to efficiently compute cost(u).

Approximate Connected Components

Key Idea 2: Sampling

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 47: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Key problem:

How to efficiently compute cost(u).

Key idea 3:

Approximate cost(u).

Approximate Connected Components

Key Idea 2: Sampling

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 48: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate low cost components:

If cost(u) is small, round up.

Approximate Connected Components

Key Idea 3: Approximate Cost

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

cHow small is small enough?

Page 49: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate low cost components:

If cost(u) < 𝜀/2, round up.

Approximate Connected Components

Key Idea 3: Approximate Cost

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 50: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Ignore low cost components:

If cost(u) < 𝜀/2, round up.

Total added cost ≤ 𝜀n/2.

Approximate Connected Components

Key Idea 3: Approximate Cost

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 51: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Define: per-node cost

Let n(u) = number of nodes in the connected component containing node u.

Let ñ(u) = min(n(u), 2/𝜀).

Let cost(u) = max(1/n(u), 𝜀/2).

= 1/ñ(u).

Key Idea 3: Approximate Cost

wx

y

z

cost(w) = 1/6cost(x) = 1/6

cost(y) = 1/3

cost(z) = 1

A

B

c

Page 52: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Define: per-node cost

Let n(u) = number of nodes in the connected component containing node u.

Let ñ(u) = min(n(u), 2/𝜀).

Let cost(u) = max(1/n(u), 𝜀/2).

= 1/ñ(u).

Key Idea 3: Approximate Cost

Define:

Note:

Page 53: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Define: per-node cost

Let n(u) = number of nodes in the connected component containing node u.

Let ñ(u) = min(n(u), 2/𝜀).

Let cost(u) = max(1/n(u), 𝜀/2).

= 1/ñ(u).

Key Idea 3: Approximate Cost

Define:

Note:

Page 54: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Close enough approximation:

Intuition:By rounding cost(u) up to𝜀/2, we increase the error at most 𝜀n/2.

Page 55: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Close enough approximation:

Intuition:By rounding cost(u) up to𝜀/2, we increase the error at most 𝜀n/2.

Page 56: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Close enough approximation:

Intuition:By rounding cost(u) up to𝜀/2, we increase the error at most 𝜀n/2.

Page 57: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Close enough approximation:

Intuition:By rounding cost(u) up to𝜀/2, we increase the error at most 𝜀n/2.

Page 58: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Close enough approximation:

Intuition:By rounding cost(u) up to𝜀/2, we increase the error at most 𝜀n/2.

Page 59: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Close enough approximation:

Intuition:By rounding cost(u) up to𝜀/2, we increase the error at most 𝜀n/2.

Page 60: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.sum = sum + cost(u)

return n∙(sum/s)

We have shown:Sufficient to approximate

cost(u) by rounding up.

Approximate Connected Components

Algorithm 3

y

z

cost(y) = 1/3

cost(z) = 1

wx

cost(w) = 1/6cost(x) = 1/6

A

B

c

Page 61: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Define: per-node cost

Let n(u) = number of nodes in the connected component containing node u.

Let ñ(u) = min(n(u), 2/𝜀).

Let cost(u) = max(1/n(u), 𝜀/2).

= 1/ñ(u).

Algorithm 3

How to efficiently compute cost(u)?

Page 62: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Define: per-node cost

Let n(u) = number of nodes in the connected component containing node u.

Let ñ(u) = min(n(u), 2/𝜀).

Let cost(u) = max(1/n(u), 𝜀/2).

= 1/ñ(u).

Algorithm 3

How to efficiently compute cost(u)?

Page 63: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.Perform a BFS from u; stop after seeing 2/𝜀 nodes.if BFS found > 2/𝜀 nodes:

sum = sum + 𝜀/2else if BFS found n(u) nodes:

sum = sum + 1/n(u)return n∙(sum/s)

Approximate Connected Components

Algorithm 3

Page 64: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Goal:

Approximate Connected Components

Analysis

Page 65: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Goal:

Implies:

Approximate Connected Components

Analysis

Page 66: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Define random variables: Y1, Y2, …, Ys

Approximate Connected Components

Algorithm 3 Analysis

Rounded up cost

Page 67: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Define random variables: Y1, Y2, …, Ys

Approximate Connected Components

Algorithm 3 Analysis

Page 68: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Unbiased estimator:

Approximate Connected Components

Algorithm 3 Analysis

Page 69: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Notice:

Expected output of algorithm is:

Approximate Connected Components

Algorithm 3 Analysis

Page 70: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Goal:

Approximate Connected Components

Algorithm 3 Analysis

Page 71: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Derivation:

Approximate Connected Components

Algorithm 3 Analysis

Page 72: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Connected Components

Algorithm 3 Analysis

Derivation:

Page 73: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Goal:

Implies:

Approximate Connected Components

Analysis

Page 74: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.Perform a BFS from u; stop after seeing 2/𝜀 nodes.if BFS found > 2/𝜀 nodes:

sum = sum + 𝜀/2else if BFS found n(u) nodes:

sum = sum + 1/n(u)return n∙(sum/s)

Approximate Connected Components

Algorithm 3

Page 75: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

We have shown:

With probability > 2/3,

output is equal to:

CC(G) ± 𝜀n

Approximate Connected Components

Algorithm 3

y

z

cost(y) = 1/3

cost(z) = 1

wx

cost(w) = 1/6cost(x) = 1/6

A

B

c

Page 76: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.Perform a BFS from u; stop after seeing 2/𝜀 nodes.if BFS found > 2/𝜀 nodes:

sum = sum + 𝜀/2else if BFS found n(u) nodes:

sum = sum + 1/n(u)return n∙(sum/s)

Approximate Connected Components

Algorithm 3

Cost of BFS: O((2 / 𝜀)∙d)

Page 77: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose u uniformly at random.Perform a BFS from u; stop after seeing 2/𝜀 nodes.if BFS found > 2/𝜀 nodes:

sum = sum + 𝜀/2else if BFS found n(u) nodes:

sum = sum + 1/n(u)return n∙(sum/s)

Approximate Connected Components

Algorithm 3

Cost of BFS: O((2 / 𝜀)∙d)

Total cost:

O(s(2/𝜀)∙d) =

O((1/𝜀2)(2/𝜀)d) =

O(d/𝜀3)

Page 78: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

We have shown:

With probability > 2/3,

output is equal to:

CC(G) ± 𝜀n

Running time:

Approximate Connected Components

Algorithm 3

y

z

cost(y) = 1/3

cost(z) = 1

wx

cost(w) = 1/6cost(x) = 1/6

A

B

c

Page 79: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

We have shown:

With probability > 1 - 1/δ,

output is equal to:

CC(G) ± 𝜀n

Running time:

Approximate Connected Components

Algorithm 3

y

z

cost(y) = 1/3

cost(z) = 1

wx

cost(w) = 1/6cost(x) = 1/6

A

B

c

Page 80: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Summary

Today:

Number of connected components in a graph.

• Approximation algorithm.

Weight of MST

• Approximation algorithm.

Last Week:

Toy example 1: array all 0’s?

• Gap-style question: All 0’s or far from all 0’s?

Toy example 2: Faction of 1’s?

• Additive ± 𝜀 approximation

• Hoeffding Bound

Is the graph connected?

• Gap-style question.

• O(1) time algorithm.

• Correct with probability 2/3. 9 dots4 lines

Page 81: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Minimum Spanning Tree

Assumptions:

Graph G = (V,E)

• Undirected

• Weighted, max weight W

• Connected

• n nodes

• m edges

• maximum degree d

Error term: 𝜀 < 1/2

Output: Weight of MST. Example: output 16

1

1

1

2

22

2

2

3

3

33

Page 82: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Minimum Spanning Tree

Approximation:

Output M such that:

Alternate form:

Correct output: w.p. > 2/31

1

1

2

22

2

2

3

3

33

Example: 𝜀 = 1/4Output ∊ [12,20]

Page 83: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Minimum Spanning Tree

When is this useful?

What are trivial values of 𝜀?

What are hard values of 𝜀?

What sort of applications is this useful for?

Why multiplicative approximation for MST and additive approximation for connected components?

Page 84: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Simple Minimum Spanning Tree

Which edges must be in MST?

How many weight-2 edges inMST?

Best (exact) algorithm?

Assume all weights 1 or 2

1

1

1

2

22

2

2

2

2

22

1

Page 85: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Simple Minimum Spanning Tree

Let G1 = graph containing only edges of weight 1.

Assume all weights 1 or 2

1

1

1

2

22

2

2

2

2

22

1

Page 86: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Simple Minimum Spanning Tree

Let G1 = graph containing only edges of weight 1.

Let C1 = number of connected components in G1.

Assume all weights 1 or 2

1

1

1

2

22

2

2

2

2

22

1

Ex: C1 = 6

Page 87: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Simple Minimum Spanning Tree

Let G1 = graph containing only edges of weight 1.

Let C1 = number of connected components in G1.

Claim: MST contains example C1-1 edges of weight 2.

Assume all weights 1 or 2

1

1

1

2

22

2

2

2

2

22

1

Ex: C1 = 6

Page 88: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Simple Minimum Spanning Tree

Claim: MST contains example C1-1 edges of weight 2.

Basic MST Property:For any cut, minimum weight edge across cut is in MST.

Assume all weights 1 or 2

1

1

1

2

22

2

2

2

2

22

1

Ex: C1 = 6

Page 89: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Simple Minimum Spanning Tree

Claim: MST contains example C1-1 edges of weight 2.

Algorithm:For any connected component, add minimum weight outgoing edge.

Here all the edges have weight 2, so add C1-1 edges of weight 2.

Assume all weights 1 or 2

1

1

1

2

22

2

2

2

2

22

1

Ex: C1 = 6

Page 90: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Simple Minimum Spanning Tree

Claim: MST contains example C1-1 edges of weight 2.

Weight of MST?

Assume all weights 1 or 2

1

1

1

2

22

2

2

2

2

22

1

Ex: C1 = 6

Page 91: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Simple Minimum Spanning Tree

Claim: MST contains example C1-1 edges of weight 2.

Weight of MST?

Assume all weights 1 or 2

1

1

1

2

22

2

2

2

2

22

1

Ex: C1 = 6Ex: 10 + 6 – 2 = 14

Page 92: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Simple Minimum Spanning Tree

Weight of MST: n + C1 - 2

Algorithm idea?

Assume all weights 1 or 2

1

1

1

2

22

2

2

2

2

22

1

Ex: C1 = 6

Page 93: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Simple Minimum Spanning Tree

Weight of MST: n + C1 - 2

Algorithm idea:Approximate connected components of G1.

Assume all weights 1 or 2

1

1

1

2

22

2

2

2

2

22

1

Ex: C1 = 6

Page 94: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Let G1 = graph containing only edges of weight 1.

Let G2 = graph containing only edges of weight {1, 2}.

…Let Gj = graph containing only

edges of weights {1, 2, …, j}.

Weights {1, 2, …, W}

1

1

1

2

22

2

2

3

3

33

Ex: G2

Page 95: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Let C1 = number CC in G1.

Let C2 = number CC in G2.

…Let Cj = number CC in Gj.

Weights {1, 2, …, W}

1

1

1

2

22

2

2

3

3

33

Ex: G2

Page 96: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Claim:MST(G) contains Cj – 1 edges of weight > j.

Weights {1, 2, …, W}

1

1

1

2

22

2

2

3

3

33

Ex: G2

Page 97: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Claim:MST(G) contains Cj – 1 edges of weight > j.

Why?There are Cj connected components in Gj. There much be Cj – 1 edges connecting them, and each must have weight > j.

Weights {1, 2, …, W}

1

1

1

2

22

2

2

3

3

33

Ex: G2

Page 98: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Lemma:

Weights {1, 2, …, W}

1

1

1

2

22

2

2

3

3

33

Ex: G2

Page 99: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Edges of weight 1:

n – 1 edges total in MSTC1 – 1 edges of weight > 1

(n – 1) – (C1 – 1) edges of weight 1.

(n – C1) edges of weight 1.

Weights {1, 2, …, W}

1

1

1

2

22

2

2

3

3

33

Ex: G2

Page 100: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Edges of weight j+1:

Cj – 1 edges of weight > jCj+1 – 1 edges of weight > j+1

(Cj – 1) – (Cj+1 – 1) edges of weight j+1.

(Cj – Cj+1) edges of weight j+1.

Weights {1, 2, …, W}

1

1

1

2

22

2

2

3

3

33

Ex: G2Note: Cj ≥ Cj+1

Page 101: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Sum the weights:

Weights {1, 2, …, W}

number ofedges ofweight 1

number ofedges ofweight j+1

weight of edgeof weight j+1

Note: sum is from j = 1 to W-1.

Page 102: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Sum the weights:

Weights {1, 2, …, W}

Page 103: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Sum the weights:

Weights {1, 2, …, W}

Page 104: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Sum the weights:

Weights {1, 2, …, W}

Page 105: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Lemma:

Weights {1, 2, …, W}

1

1

1

2

22

2

2

3

3

33

Ex: G2

Page 106: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

sum = n – W

for j = 1 to W – 1:Xj = AproxCC(Gj, d, 𝜀’, δ)sum = sum + Xj

return sum

Algorithm ApproxMST

1

1

1

2

22

2

2

3

3

33

Ex: G2

Page 107: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

sum = n – W

for j = 1 to W – 1:Xj = AproxCC(Gj, d, 𝜀’, δ)sum = sum + Xj

return sum

Error Calculation

Set: 𝜀’ = 𝜀/W

Sum of errors: ≤ W(𝜀n/W) ≤ 𝜀n

Page 108: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

sum = n – W

for j = 1 to W – 1:Xj = AproxCC(Gj, d, 𝜀’, δ)sum = sum + Xj

return sum

Error Calculation

Guarantee for each AproxCC:

Page 109: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

sum = n – W

for j = 1 to W – 1:Xj = AproxCC(Gj, d, 𝜀’, δ)sum = sum + Xj

return sum

Error Calculation

Guarantee for each AproxCC:

Not good enough: Pr{all correct} ≅ (2/3)W

Page 110: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

sum = n – W

for j = 1 to W – 1:Xj = AproxCC(Gj, d, 𝜀’, δ)sum = sum + Xj

return sum

Error Calculation

Set 𝜀’ = 𝜀/W, δ = 1/(3W)

Error probability:

Page 111: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

sum = n – W

for j = 1 to W – 1:Xj = AproxCC(Gj, d, 𝜀’, δ)sum = sum + Xj

return sum

Error Calculation

Set 𝜀’ = 𝜀/W, δ = 1/(3W)

Guarantee for each AproxCC:

Page 112: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

sum = n – W

for j = 1 to W – 1:Xj = AproxCC(Gj, d, 𝜀’, δ)sum = sum + Xj

return sum

Error Calculation

Set: 𝜀’ = 𝜀/W, δ = 1/(3W)

Sum of errors: ≤ W(𝜀n/W) ≤ 𝜀n

Page 113: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Error Calculation

Page 114: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Error Calculation

Page 115: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Error Calculation

Page 116: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Error Calculation

Page 117: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

Error Calculation

Page 118: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

sum = n – W

for j = 1 to W – 1:Xj = AproxCC(Gj, d, 𝜀’, δ)sum = sum + Xj

return sum

Running time

Set 𝜀’ = 𝜀/W, δ = 1/(3W)

Running time:

Page 119: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Approximate Minimum Spanning Tree

sum = n – W

for j = 1 to W – 1:Xj = AproxCC(Gj, d, 𝜀’, δ)sum = sum + Xj

return sum

Running Time

Set 𝜀’ = 𝜀/W, δ = 1/(3W)

Running time:

Page 120: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

We have shown:

With probability > 2/3, output is equal to:

MST(G)(1 ± 𝜀n)

Running time:

Approximate MST

Summary

Page 121: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Note:

Impossible to do better than:

Best known:

Approximate MST

Summary

See: Chazelle, Rubinfeld, Trevisan

Page 122: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Summary

Today:

Number of connected components in a graph.

• Approximation algorithm.

Weight of MST

• Approximation algorithm.

Last Week:

Toy example 1: array all 0’s?

• Gap-style question: All 0’s or far from all 0’s?

Toy example 2: Faction of 1’s?

• Additive ± 𝜀 approximation

• Hoeffding Bound

Is the graph connected?

• Gap-style question.

• O(1) time algorithm.

• Correct with probability 2/3. 9 dots4 lines

Page 123: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximum Matching

Matching:

Output set of edges M such that no two edges in M are adjacent.

Size of Maximum Matching:

Output the largest value v where there is a matching M of size v.

Example: Size of matching: 5

Page 124: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Maximal Matching:

Output set of edges M such that no two edges in M are adjacent, and no more edges can be added to M.

Size of Maximal Matching:

Output the largest value v where there is a maximal matching M of size v.

Example: Size of matching: 5

Page 125: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Size of Maximal Matching:

Output the largest value v where there is a maximal matching M of size v.

Note:

The maximum matching is at most twice as big as the maximalmatching.

Maximal is a 2-approximation of maximum.

Example: Size of matching: 5

Page 126: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Algorithm for maximal matching:

1) Assign each edge a random number. (Equivalent: choose a random permutation of the edges.)

1

2

3

4

5

67

8

9

10

11

12

Page 127: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Algorithm for maximal matching:

1) Assign each edge a random number. (Equivalent: choose a random permutation of the edges.)

2) Greedily, in order, try to add each edge to the matching.

1

2

3

4

5

67

8

9

10

11

12

Page 128: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Algorithm for maximal matching:

1) Assign each edge a random number. (Equivalent: choose a random permutation of the edges.)

2) Greedily, in order, try to add each edge to the matching.

1

2

3

4

5

67

8

9

10

11

12

Page 129: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Algorithm for maximal matching:

1) Assign each edge a random number. (Equivalent: choose a random permutation of the edges.)

2) Greedily, in order, try to add each edge to the matching.

Each random permutation defines a unique maximal matching.

1

2

3

4

5

67

8

9

10

11

12

Page 130: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To solve via sampling:

1) Choose a random permutation for the edges (e.g., a hash function).

2) Choose s edges at random.

3) Decide if they are in the matching for the chosen permutation.

1

2

3

4

5

67

8

9

10

11

12

Page 131: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if query(e’) = true

return falsereturn true

Page 132: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if query(e’) = true

return falsereturn true

Oops… That doesn’t exactly work!

Page 133: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 134: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 135: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 136: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 137: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 138: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 139: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 140: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 141: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 142: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 143: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 144: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 145: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 146: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 147: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 148: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

Page 149: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To decide if an edge is in the matching:

1

2

3

4

5

67

8

9

10

11

12query(e):

for all neighbors e’ of e:if hash(e’) < hash(e)

if query(e’) = truereturn false

return true

hash(e) returns the number chosen for edge e.Only query smaller edges. Larger edges do not matter.

FALSE

Page 150: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Key question:How expensive is a query?

1

2

3

4

5

67

8

9

10

11

12

query(e):for all neighbors e’ of e:

if hash(e’) < hash(e)if query(e’) = true

return falsereturn true

Page 151: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Some simple analysis:

If graph has maximum degree d, then there are at most 2dk paths of length k starting from the query edge.

1

2

3

4

5

67

8

9

10

11

12

Page 152: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Some simple analysis:

If graph has maximum degree d, then there are at most 2dk paths of length k starting from the query edge.

Each path of length k defines a random permutation of hash values.

1

2

3

4

5

67

8

9

10

11

12

Permutation: [6,1,11,10,3]

Page 153: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Some simple analysis:

If graph has maximum degree d, then there are at most 2dk paths of length kstarting from the query edge.

Each path of length k defines a random permutation of hash values.

There are k! possible permutations.

1

2

3

4

5

67

8

9

10

11

12

Permutation: [6,1,11,10,3]

Page 154: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Some simple analysis:

If graph has maximum degree d, then there are at most 2dk paths of length kstarting from the query edge.

Each path of length k defines a random permutation of hash values.

There are k! possible permutations.

Pr[path is all decreasing] = 1/k!

1

2

3

4

5

67

8

9

10

11

12

Permutation: [6,1,11,10,3]

Page 155: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Conclusion:

The expected number of paths

traversed of length k is at most: 𝑑𝑘

𝑘!

1

2

3

4

5

67

8

9

10

11

12

Permutation: [6,1,11,10,3]

Page 156: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Conclusion:

The expected number of paths

traversed of length k is at most: 𝑑𝑘

𝑘!

The expected total cost of a query is:

𝑘=1

∞𝑑𝑘

𝑘!= 𝑂 𝑒𝑑

1

2

3

4

5

67

8

9

10

11

12

Permutation: [6,1,11,10,3]

Page 157: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Key question:How expensive is a query?

E[cost] = O(ed)

1

2

3

4

5

67

8

9

10

11

12

query(e):for all neighbors e’ of e:

if hash(e’) < hash(e)if query(e’) = true

return falsereturn true

Page 158: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

To solve via sampling:

1) Choose a random permutation for the edges (e.g., a hash function).

2) Choose s edges at random.

3) Decide if they are in the matching for the chosen permutation via query operation.

1

2

3

4

5

67

8

9

10

11

12

Page 159: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose edge e uniformly at random.if (query(e) = true) then

sum = sum + 1return m∙(sum/s)

Approximate Maximal Matching

MaxMatch-Sampling

Page 160: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose edge e uniformly at random.if (query(e) = true) then

sum = sum + 1return m∙(sum/s)

Approximate Maximal Matching

MaxMatch-Sampling

Claim: returns size of maximal matching ± εm

Page 161: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

sum = 0for j = 1 to s:

Choose edge e uniformly at random.if (query(e) = true) then

sum = sum + 1return m∙(sum/s)

Approximate Maximal Matching

MaxMatch-Sampling

Claim: returns size of maximal matching ± εm

Claim: Runs in time O(ed / ε2)

Page 162: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Two improvements:

1) Reduce error from ± εm to ± εn. 1

2

3

4

5

67

8

9

10

11

12

Page 163: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Two improvements:

1) Reduce error from ± εm to ± εn.(Hint: each node is either matched or unmatched, and you can compute the size of the matching from the number of matched nodes.)

1

2

3

4

5

67

8

9

10

11

12

Page 164: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Two improvements:

1) Reduce error from ± εm to ± εn.(Hint: each node is either matched or unmatched, and you can compute the size of the matching from the number of matched nodes.)

2) Reduce the running time from exponential to O(d4/ ε2).

1

2

3

4

5

67

8

9

10

11

12

Page 165: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Today’s Problem: Maximal Matching

Two improvements:

1) Reduce error from ± εm to ± εn.(Hint: each node is either matched or unmatched, and you can compute the size of the matching from the number of matched nodes.)

2) Reduce the running time from exponential to O(d4/ ε2).

(Hint: In query, explore neighboring edges in order of smallest weight first. Analysis is not simple!)

1

2

3

4

5

67

8

9

10

11

12

Page 166: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Questions to think about:

1) Show that the sampling algorithm works as claims (if the query operation is correct).

2) Reduce error from ± εm to ± εn.(Hint: each node is either matched or unmatched, and you can compute the size of the matching from the number of matched nodes.)

3) Can you find a multiplicative (instead of additive) approximation? Why not? (Hint: Think about a graph where the maximal matching is very small.)

1

2

3

4

5

67

8

9

10

11

12

Page 167: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Two more questions:

1) Give an algorithm for deciding if the black pixels are connected or ε-far from connected in an n by nsquare of pixels.

2) Give an algorithm for deciding if the black pixels are a rectangle or ε-far from a rectangle in an n by nsquare of pixels.

connected

rectangleHint: imagine querying a grid of pixels distance εn apart.

Page 168: Algorithms at Scale - NUS Computinggilbert/CS5234/2019/... · 2019. 8. 22. · Algorithms at Scale (Week 2) Puzzle of the Day: A bag contains a collection of blue and red balls. Repeat:

Summary

Today:

Number of connected components in a graph.

• Approximation algorithm.

Weight of MST

• Approximation algorithm.

Size of maximal matching

• Approximation algorithm.

Last Week:

Toy example 1: array all 0’s?

• Gap-style question: All 0’s or far from all 0’s?

Toy example 2: Faction of 1’s?

• Additive ± 𝜀 approximation

• Hoeffding Bound

Is the graph connected?

• Gap-style question.

• O(1) time algorithm.

• Correct with probability 2/3.