Divide and Conquer (continued) - cs.uwaterloo.cabinma/cs341/04-05-divide-and-conquer-1.pdf

Divide and Conquer (continued)

Example: Integer Multiplication (Karatsuba algorithm)

The problem: Let 𝑥 and 𝑦 be two 𝑛-digit numbers. Compute 𝑥 ⋅ 𝑦.

A straightforward algorithm (like what a human does by hand) would require 𝑂(𝑛²) time to multiply each digit of 𝑥 with each digit of 𝑦. Let’s try to do divide-and-conquer.

Let 𝑚 = 𝑛/2. Write 𝑥 = 2^𝑚 ⋅ 𝑥1 + 𝑥2 and 𝑦 = 2^𝑚 ⋅ 𝑦1 + 𝑦2. Then

𝑥 ⋅ 𝑦 = 2^(2𝑚) ⋅ 𝑥1𝑦1 + 2^𝑚 ⋅ (𝑥1𝑦2 + 𝑥2𝑦1) + 𝑥2𝑦2

So, we have 𝑇(𝑛) = 4𝑇(𝑛/2) + 𝑂(𝑛). By the Master theorem with 𝑎 = 4, 𝑏 = 2, and 𝑐 = 1, we get 𝑇(𝑛) = 𝑂(𝑛²).

Disappointing…

The trick is to reduce 𝑎. Notice that (𝑥1 + 𝑥2)(𝑦1 + 𝑦2) = 𝑥1𝑦1 + (𝑥1𝑦2 + 𝑥2𝑦1) + 𝑥2𝑦2. Therefore,

after computing 𝑥1𝑦1 and 𝑥2𝑦2, the value 𝑥1𝑦2 + 𝑥2𝑦1 can be computed in the following way,

𝑥1𝑦2 + 𝑥2𝑦1 = (𝑥1 + 𝑥2)(𝑦1 + 𝑦2) − 𝑥1𝑦1 − 𝑥2𝑦2

Thus, only three multiplications are needed. Therefore,

𝑇(𝑛) = 3𝑇(𝑛/2) + 𝑂(𝑛)

By the Master theorem with 𝑎 = 3, 𝑏 = 2, and 𝑐 = 1, we get 𝑇(𝑛) = 𝑂(𝑛^(log2 3)) < 𝑂(𝑛^1.59).

Practical concerns:

1. What if 𝑛 is odd?

2. What base to use?

3. For small 𝑛 the overhead is too high, and the 𝑂(𝑛²) method is better. For 𝑛 > 1000 or so, this method is better.

4. There is an 𝑂(𝑛 log 𝑛 ⋅ 2^(Θ(log* 𝑛))) method that is asymptotically better but requires very large 𝑛 to be beneficial. Here log* 𝑛 is the number of times the logarithm function must be iteratively applied before the result is less than or equal to 1.
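The three-multiplication recursion can be sketched in Python. This is a minimal illustration, not a tuned implementation: it splits on bit length (so the 2^𝑚 split is exact in binary), and the base-case cutoff of 2^64 is an arbitrary tuning choice, reflecting concern 3 above.

```python
def karatsuba(x: int, y: int) -> int:
    """Multiply non-negative integers with three recursive multiplications."""
    # Base case: below the (arbitrary) cutoff, the builtin multiply is faster.
    if x < 2**64 or y < 2**64:
        return x * y
    m = max(x.bit_length(), y.bit_length()) // 2
    x1, x2 = x >> m, x & ((1 << m) - 1)      # x = 2^m * x1 + x2
    y1, y2 = y >> m, y & ((1 << m) - 1)      # y = 2^m * y1 + y2
    a = karatsuba(x1, y1)                    # x1 * y1
    b = karatsuba(x2, y2)                    # x2 * y2
    c = karatsuba(x1 + x2, y1 + y2) - a - b  # x1*y2 + x2*y1, one multiplication
    return (a << (2 * m)) + (c << m) + b
```

CPython's own big-integer multiplication uses the same idea internally for large operands, so this sketch is purely illustrative.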

Example: Matrix Multiplication (Strassen Algorithm)

Let 𝐴 and 𝐵 be two matrices and 𝐶 = 𝐴 × 𝐵; then 𝑐_{𝑖,𝑗} = ∑_{𝑘=1}^{𝑛} 𝑎_{𝑖,𝑘} 𝑏_{𝑘,𝑗}. For brevity we assume each matrix is an 𝑛 × 𝑛 matrix. A straightforward algorithm would take 𝑂(𝑛³) time.

Remark: Note that in this case the input size is 𝑛², not 𝑛. The time complexity is always expressed as a function of the input size.

We can divide each matrix into four blocks as follows:

𝐴 = [𝐴11 𝐴12; 𝐴21 𝐴22], 𝐵 = [𝐵11 𝐵12; 𝐵21 𝐵22], 𝐶 = [𝐶11 𝐶12; 𝐶21 𝐶22]

Each block is an 𝑛/2 × 𝑛/2 matrix. Then

𝐶11 = 𝐴11𝐵11 + 𝐴12𝐵21, 𝐶12 = 𝐴11𝐵12 + 𝐴12𝐵22,
𝐶21 = 𝐴21𝐵11 + 𝐴22𝐵21, 𝐶22 = 𝐴21𝐵12 + 𝐴22𝐵22

But using these straightforwardly will not make things run faster: eight multiplications of 𝑛/2 × 𝑛/2 blocks still give 𝑂(𝑛³). Instead, Strassen computed seven other matrices:

𝑀1 = (𝐴11 + 𝐴22)(𝐵11 + 𝐵22)
𝑀2 = (𝐴21 + 𝐴22)𝐵11
𝑀3 = 𝐴11(𝐵12 − 𝐵22)
𝑀4 = 𝐴22(𝐵21 − 𝐵11)
𝑀5 = (𝐴11 + 𝐴12)𝐵22
𝑀6 = (𝐴21 − 𝐴11)(𝐵11 + 𝐵12)
𝑀7 = (𝐴12 − 𝐴22)(𝐵21 + 𝐵22)

Then it can be verified that

𝐶11 = 𝑀1 + 𝑀4 − 𝑀5 + 𝑀7, 𝐶12 = 𝑀3 + 𝑀5,
𝐶21 = 𝑀2 + 𝑀4, 𝐶22 = 𝑀1 − 𝑀2 + 𝑀3 + 𝑀6

Matrix additions take only 𝑂(𝑛²) time. So, 𝑇(𝑛) = 7𝑇(𝑛/2) + 𝑂(𝑛²). By the Master theorem, the time complexity is 𝑇(𝑛) = 𝑂(𝑛^(log2 7)) ≈ 𝑂(𝑛^2.807).

Nobody knows how Strassen figured out the seven intermediate matrices 𝑀1 to 𝑀7. Knowing the Karatsuba algorithm for integer multiplication, a clever person might suspect that the same trick can be applied to matrices. What remains is a strong belief that it can be done, and the persistence to try out different combinations. Be prepared to fail many times before you find the right answer.
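To make the recursion concrete, here is a small Python sketch built on Strassen's seven products 𝑀1, …, 𝑀7. It assumes 𝑛 is a power of 2; the helper names (naive, add, sub), plain nested lists, and the cutoff of 2 are illustrative choices, not tuned for speed.

```python
def add(X, Y):
    """Entrywise sum of two equal-size matrices (nested lists)."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def sub(X, Y):
    """Entrywise difference of two equal-size matrices."""
    return [[a - b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def naive(A, B):
    """Straightforward O(n^3) product, used below the cutoff."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def strassen(A, B):
    """Strassen's product for n x n matrices, n a power of 2."""
    n = len(A)
    if n <= 2:  # cutoff: recursion overhead dominates on tiny blocks
        return naive(A, B)
    h = n // 2
    A11 = [r[:h] for r in A[:h]]; A12 = [r[h:] for r in A[:h]]
    A21 = [r[:h] for r in A[h:]]; A22 = [r[h:] for r in A[h:]]
    B11 = [r[:h] for r in B[:h]]; B12 = [r[h:] for r in B[:h]]
    B21 = [r[:h] for r in B[h:]]; B22 = [r[h:] for r in B[h:]]
    # Seven recursive multiplications instead of eight.
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    # Reassemble the four blocks of C.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bot = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bot
```

Checking the result against naive on small matrices is a quick way to convince yourself that the seven products really do reconstruct 𝐶.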

Example: Polynomial Multiplication

Given two polynomials 𝑝(𝑥) = 𝑎0 + 𝑎1𝑥 + 𝑎2𝑥² + ⋯ + 𝑎𝑛𝑥^𝑛 and 𝑞(𝑥) = 𝑏0 + 𝑏1𝑥 + 𝑏2𝑥² + ⋯ + 𝑏𝑛𝑥^𝑛, compute 𝑝(𝑥) ⋅ 𝑞(𝑥). The divide-and-conquer idea is very similar to the integer multiplication. Work it out as an exercise.

Example: Counting inversions

If you rank 𝑛 movies in the order 1, 2, …, 𝑛, and your friend ranks the same movies in the order 𝑖1, …, 𝑖𝑛, the number of pairs of movies that you and your friend rank in different relative orders is called the number of inversions in the permutation 𝑖1, …, 𝑖𝑛. This can be used as a measure of how similar (or dissimilar) you and your friend are. More formally, the number of inversions is |{(𝑖𝑗, 𝑖𝑘): 𝑗 < 𝑘 and 𝑖𝑗 > 𝑖𝑘}|.

Algorithm 1: For every pair 𝑗 < 𝑘, check whether 𝑖𝑗 > 𝑖𝑘 and count. This takes 𝑂(𝑛²) time.

Algorithm 2: Divide and Conquer.

Divide the array 𝐴[1..𝑛] into two halves. The number of inversions equals the number of inversions in the first half, plus the number in the second half, plus the number of inversions 𝐴[𝑖] > 𝐴[𝑗] with 𝑖 in the first half and 𝑗 in the second half. One can check that a straightforward counting of the inversions across the two halves takes 𝑛²/4 comparisons. By the Master theorem, this divide-and-conquer ends up with an 𝑂(𝑛²) algorithm. To improve, we have to reduce the time needed to count the inversions across the two halves.

An often-used technique when dealing with arrays is to check whether things become easier if the arrays are sorted. Consider the situation when 𝐴[1..𝑛/2] and 𝐴[𝑛/2+1..𝑛] are each sorted. If 𝐴[𝑖] > 𝐴[𝑗] for some 1 ≤ 𝑖 ≤ 𝑛/2 and 𝑛/2 < 𝑗 ≤ 𝑛, then we know that 𝐴[𝑖′] > 𝐴[𝑗] for every 𝑖′ ≥ 𝑖 in the first half. If we count in a proper way, all these inversions can be counted together in one operation. This saves time.

Let 𝐴 and 𝐵 be two sorted arrays with sizes 𝑚 and 𝑛, respectively. We put 𝐴 before 𝐵 and count the inversions between them. The idea is to loop through every 𝐵[𝑗] in the second array and find the first 𝑖 such that 𝐴[𝑖] > 𝐵[𝑗]. Then we know that 𝐵[𝑗] is involved in precisely 𝑚 − 𝑖 + 1 inversions, namely with 𝐴[𝑖], …, 𝐴[𝑚]. We sum up these counts over every 𝐵[𝑗].

CountPairSorted

Input: Two sorted arrays 𝐴 and 𝐵 with lengths 𝑚 and 𝑛, respectively
Output: the number of inversions when 𝐴 is put before 𝐵

1. 𝑖 ← 1; 𝑘 ← 0
2. For 𝑗 from 1 to 𝑛
3.     while 𝑖 ≤ 𝑚 and 𝐴[𝑖] ≤ 𝐵[𝑗]
4.         𝑖++
5.     𝑘 ← 𝑘 + 𝑚 − 𝑖 + 1
6. Return 𝑘

The correctness follows from the discussion above the pseudocode.

Time complexity: The condition in line 3 can be true at most 𝑚 times during the whole execution, because each time it is true 𝑖 increases, and 𝑖 never decreases. Also, for each 𝑗, the condition in line 3 can be false only once. Therefore, lines 3 and 4 take 𝑂(𝑚 + 𝑛) time in total. Line 5 takes 𝑂(1) time for each 𝑗. Therefore, the total time complexity is 𝑂(𝑚 + 𝑛).
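A direct Python transcription of CountPairSorted, with the one adjustment that Python lists are 0-indexed, so line 5's 𝑚 − 𝑖 + 1 becomes m − i:

```python
def count_pair_sorted(A, B):
    """Inversions between sorted A (placed first) and sorted B; O(m + n)."""
    m = len(A)
    i = 0                              # 0-based counterpart of i = 1
    k = 0
    for bj in B:                       # line 2
        while i < m and A[i] <= bj:    # line 3
            i += 1                     # line 4
        k += m - i                     # line 5: A[i:] are all > bj
    return k                           # line 6
```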

Now we can use CountPairSorted as a subroutine for the divide and conquer algorithm.

SortAndCount:

Input: A is a list of 𝑛 elements.

Output: (S, k), where S is sorted, and k is the number of inversions.

1. If 𝑛 ≤ 3 sort and count with a trivial algorithm and return.

2. (𝑆1, 𝑘1) ← SortAndCount (A[1..n/2])

3. (𝑆2, 𝑘2) ← SortAndCount (A[n/2+1..n])

4. 𝑘3 ← CountPairSorted(𝑆1,𝑆2).

5. 𝑆 ← Merge(𝑆1, 𝑆2)

6. Return (𝑆, 𝑘1 + 𝑘2 + 𝑘3).

Proof of correctness: By induction. The base case is straightforward and omitted. Inversions in 𝐴 consist of inversions within the first half, within the second half, and across the two halves; the across-halves count is exactly what line 4 of SortAndCount computes.

Time complexity: 𝑇(𝑛) = 2𝑇(𝑛/2) + 𝑂(𝑛). Therefore, 𝑇(𝑛) = 𝑂(𝑛 log 𝑛).

Remark: It is possible to combine lines 4 and 5 together into a single MergeAndCount subroutine. This is

perhaps what you will do when you write the program. But since it does not affect the big O complexity,

separating them is easier for our proofs.
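A Python sketch of this combined approach: merge_and_count fuses lines 4 and 5 into one pass, as the remark suggests, and a base case of 𝑛 ≤ 1 replaces the trivial 𝑛 ≤ 3 sorter for brevity.

```python
def merge_and_count(S1, S2):
    """Merge two sorted lists; count inversions across them (lines 4-5 fused)."""
    merged, k = [], 0
    i = j = 0
    while i < len(S1) and j < len(S2):
        if S1[i] <= S2[j]:
            merged.append(S1[i]); i += 1
        else:
            merged.append(S2[j]); j += 1
            k += len(S1) - i          # every remaining S1 element beats S2[j]
    merged.extend(S1[i:])
    merged.extend(S2[j:])
    return merged, k

def sort_and_count(A):
    """Return (sorted copy of A, number of inversions in A), O(n log n)."""
    if len(A) <= 1:
        return list(A), 0
    mid = len(A) // 2
    S1, k1 = sort_and_count(A[:mid])
    S2, k2 = sort_and_count(A[mid:])
    S, k3 = merge_and_count(S1, S2)
    return S, k1 + k2 + k3
```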
