Notes on the analysis of multiplication algorithms. Dr. M. Sakalli, Marmara University
3-2M, Sakalli, CS246 Design & Analysis of Algorithms, Lecture Notes
Integer Multiplication (MIT notes and Wikipedia)
Example: classic high-school math. Let $g = A|B$ and $h = C|D$, where $A$, $B$, $C$ and $D$ are $n/2$-bit integers.
Simple method: $gh = (2^{n/2}A + B)(2^{n/2}C + D) = 2^{n}AC + 2^{n/2}(AD + BC) + BD$, plus carries $c$. This needs 4 half-size multiplications.
Long multiplication: $r_i = c + \sum_{j+k=i} g_j h_k$
Running time recurrence: $T(n) \le 4T(n/2) + 100n$, where the $100n$ term bounds the linear-time additions and shifts. In-place?
$T(n) = \Theta(n^2)$
Provided that neither $c$ nor the total sum exceeds log space: indeed, a simple inductive argument shows that the carry $c$ and the total sum for $r_i$ can never exceed $n$ and $2n$ respectively, so each fits in about $\lg 2n$ bits. Space efficiency: $S(n) = \Theta(\log\log N)$, where $N = gh$.
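The simple method above can be sketched in Python as a recursion with four half-size products. This is only an illustration: the function name `split_multiply` and the assumption that $n$ is a power of two are not from the notes.

```python
def split_multiply(x, y, n):
    """Multiply two non-negative n-bit integers using four half-size products.

    Assumption for this sketch: n is a power of two and x, y < 2**n.
    """
    if n == 1:
        return x & y  # product of two single bits
    half = n // 2
    mask = (1 << half) - 1
    a, b = x >> half, x & mask  # x = 2^(n/2) * a + b
    c, d = y >> half, y & mask  # y = 2^(n/2) * c + d
    ac = split_multiply(a, c, half)
    ad = split_multiply(a, d, half)
    bc = split_multiply(b, c, half)
    bd = split_multiply(b, d, half)
    # 2^n * AC + 2^(n/2) * (AD + BC) + BD
    return (ac << n) + ((ad + bc) << half) + bd
```

Each call makes four recursive calls on inputs of half the size and does a linear amount of shifting and adding, which is where the recurrence $T(n) \le 4T(n/2) + 100n$ comes from.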
Pseudo code: log-space multiplication algorithm
multiply(g[0..n-1], h[0..n-1])  // arrays holding the binary representations, least significant bit first
    x ← 0
    for i = 0 : 2n-1
        for j = max(0, i-n+1) : min(i, n-1)  // keeps both j and k = i-j inside 0..n-1
            k ← i - j
            x ← x + (g[j] × h[k])
        end
        r[i] ← x mod 2
        x ← floor(x/2)  // carry into the next column
    end
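A minimal Python sketch of the column-by-column pseudo code follows. The names `multiply_digits` and `from_digits`, the `base` parameter, and the requirement that both inputs have the same length are assumptions made for this illustration.

```python
def multiply_digits(g, h, base=2):
    """Column-wise long multiplication.

    g and h are equal-length digit lists, least significant digit first.
    Returns the 2n digits of the product, least significant digit first.
    """
    n = len(g)
    r = [0] * (2 * n)
    x = 0  # running column sum, including the carry from earlier columns
    for i in range(2 * n - 1):
        # only pairs (j, k) with j + k == i and both indices in 0..n-1
        for j in range(max(0, i - n + 1), min(i, n - 1) + 1):
            x += g[j] * h[i - j]
        r[i] = x % base
        x //= base  # carry into the next column
    r[2 * n - 1] = x % base  # the final carry is the top digit
    return r


def from_digits(digits, base=2):
    """Convert a least-significant-first digit list back to an integer."""
    return sum(d * base ** i for i, d in enumerate(digits))
```

For example, `multiply_digits([1, 1, 0, 1], [1, 1, 0, 0])` multiplies 11 by 3 in binary, and `from_digits` of the result gives 33.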
Lattice method, Muhammad ibn Musa al-Khwarizmi. Gauss's complex multiplication algorithm.
Karatsuba's algorithm: polynomial extensions.
$g = g_1 10^{n/2} + g_2$
$h = h_1 10^{n/2} + h_2$
$gh = g_1 h_1 10^{n} + (g_1 h_2 + g_2 h_1) 10^{n/2} + g_2 h_2$
$(g_1 h_2 + g_2 h_1) = (g_1 + g_2)(h_1 + h_2) - (g_1 h_1 + g_2 h_2)$
$f(n)$ = 4 sums + 1 more final sum = $5n$ for $n > 2$; suppose it is bounded by a constant multiple of $n$ such as $100n$, plus some carries.
In binary, with $A$, $B$, $C$, $D$ the half-size parts as before: $XY = (2^{n} + 2^{n/2})AC - 2^{n/2}(A-B)(C-D) + (2^{n/2}+1)BD$
$A(n) = 3A(n/2) + 5n$
$A(n) = O(n^{\lg 3}) \approx O(n^{1.6})$
Base value 7, when $n < 2$.
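As a worked check of the bound, the recurrence can be unrolled. This sketch assumes $n$ is a power of two and reads the base value as $A(1) = 7$.

\[
\begin{aligned}
A(n) &= 3A(n/2) + 5n \\
     &= 3^{k} A(n/2^{k}) + 5n \sum_{i=0}^{k-1} (3/2)^{i} \\
     &= 7 \cdot 3^{\lg n} + 10n\left((3/2)^{\lg n} - 1\right) \qquad (k = \lg n) \\
     &= 7 n^{\lg 3} + 10 n^{\lg 3} - 10n \\
     &= 17 n^{\lg 3} - 10n .
\end{aligned}
\]

Sanity check: $A(1) = 17 - 10 = 7$ and $A(2) = 3 \cdot 7 + 10 = 31 = 17 \cdot 3 - 20$, so $A(n) = O(n^{\lg 3})$ with $\lg 3 \approx 1.585$.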
Karatsuba (g, h : n-digit integer; n : integer)  // returns a (2n)-digit integer
is
    g1, g2, h1, h2 : (n/2)-digit integer;
    U, V, W : n-digit integer;
begin
    if n == 1 then
        return g(0)*h(0);
    else
        g1 ← g(n-1) ... g(n/2);
        g2 ← g(n/2-1) ... g(0);
        h1 ← h(n-1) ... h(n/2);
        h2 ← h(n/2-1) ... h(0);
        U ← Karatsuba ( g1, h1, n/2 );
        V ← Karatsuba ( g2, h2, n/2 );
        W ← Karatsuba ( g1+g2, h1+h2, n/2 );  // the sums may carry into one extra digit
        return U*10^n + (W-U-V)*10^(n/2) + V;
    end if;
end Karatsuba;
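The pseudo code can be sketched in Python on ordinary integers. This is an illustrative version, not the notes' own routine: it recomputes the digit count at each level instead of passing $n$, which sidesteps the extra carry digit in `g1 + g2` and `h1 + h2`, and it splits in base 10 to match the formulas.

```python
def karatsuba(g, h):
    """Multiply non-negative integers g and h with three recursive products."""
    if g < 10 or h < 10:
        return g * h  # base case: at least one operand is a single digit
    n = max(len(str(g)), len(str(h)))
    half = n // 2
    p = 10 ** half
    g1, g2 = divmod(g, p)  # g = g1 * 10^half + g2
    h1, h2 = divmod(h, p)  # h = h1 * 10^half + h2
    u = karatsuba(g1, h1)            # high parts
    v = karatsuba(g2, h2)            # low parts
    w = karatsuba(g1 + g2, h1 + h2)  # (g1 + g2)(h1 + h2)
    # w - u - v equals the cross term g1*h2 + g2*h1
    return u * p * p + (w - u - v) * p + v
```

For example, `karatsuba(1234, 5678)` returns 7006652, the same as `1234 * 5678`.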
FFT and Fast Matrix multiplication.
Quarter square multiplier (1980, Everett L. Johnson):
$gh = \{(g+h)^2 - (g-h)^2\}/4 = \{(g^2 + 2gh + h^2) - (g^2 - 2gh + h^2)\}/4$
Think of a hardware implementation with a lookup table (converter). The difficulty is that the sum of two 8-bit numbers requires at least 9 bits, and its square is 18 bits wide. If the table instead stores the quarter square $\lfloor x^2/4 \rfloor$ (discarding the remainder when $x$ is odd), each entry fits in 16 bits. Nothing is lost, because $g+h$ and $g-h$ have the same parity, so the discarded remainders cancel.
Table lookup: for single decimal digits the index runs from 0 to 9+9 and the entries from 0 to 81. $O(3n)$ time, working space $S(n) = \Theta(n)$.
E.g. 7 by 3: observe that the sum and difference are 10 and 4 respectively. Looking both of those values up in the table yields 25 and 4, the difference of which is 21.
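The table lookup can be sketched in Python as follows. The function names and the `max_operand` parameter are assumptions for this illustration; the table holds the quarter squares $\lfloor x^2/4 \rfloor$.

```python
def make_quarter_square_table(max_operand):
    """Table of floor(x*x / 4) for every possible sum 0 .. 2*max_operand."""
    return [(x * x) // 4 for x in range(2 * max_operand + 1)]


def quarter_square_multiply(g, h, table):
    """Multiply via two lookups and one subtraction.

    g + h and |g - h| have the same parity, so the remainders discarded
    by the floor cancel and the result is exact.
    """
    return table[g + h] - table[abs(g - h)]


# table for single decimal digits: indices 0..18, entries 0..81
digit_table = make_quarter_square_table(9)
```

With this table, `quarter_square_multiply(7, 3, digit_table)` looks up entries 10 and 4 (25 and 4) and returns 21, as in the example above.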
Russian (Egyptian) Peasant’s binary multiplication
Shift and add. An in-place algorithm; may be implemented in 2n space. Try complex examples.
Example, 11 × 3: halve the left column, double the right column, and add the right-column value whenever the left column is odd.

    decimal      binary            added
    11    3      1011      11         11
     5    6       101     110        110
     2   12        10    1100          -
     1   24         1   11000      11000
                           sum =  100001  (= 33)

$T(n) = \Theta(n) + O(n^2)$: think about this. Why?
$S(n) = \Theta(\log\log n)$, which is the carry.
Is it invertible? Division is a potential question.
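The shift-and-add procedure can be sketched in Python as below; the function name `peasant_multiply` is chosen for this illustration.

```python
def peasant_multiply(g, h):
    """Russian peasant multiplication of non-negative integers."""
    total = 0
    while g > 0:
        if g & 1:      # left column is odd: add the right column
            total += h
        g >>= 1        # halve the left column, dropping the remainder
        h <<= 1        # double the right column
    return total
```

For 11 × 3 the loop adds 3, 6 and 24 and skips 12, giving 33, which matches the table above. The loop runs once per bit of `g`, and each addition is linear in the operand length, which is one way to read the $\Theta(n) + O(n^2)$ estimate.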
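Matrix multiplication: 8 multiplications, $O(n^3)$
\[
\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}
\begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}
=
\begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}
\]
$C_{11} = A_{11}B_{11} + A_{12}B_{21}$
$C_{12} = A_{11}B_{12} + A_{12}B_{22}$
$C_{21} = A_{21}B_{11} + A_{22}B_{21}$
$C_{22} = A_{21}B_{12} + A_{22}B_{22}$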
Pseudo code for MM.
MM(A, B)
    for i ← 1 : N
        for j ← 1 : N
            C(i, j) ← 0;
            for k ← 1 : N
                C(i, j) ← C(i, j) + A(i, k) * B(k, j)
            end
        end
    end
Time complexity of this algorithm is $n^3$ multiplications and additions.
Can we do better using divide and conquer? Subdivide each matrix into four sub-matrices.
$T(n) = b$ for $n \le 2$; $T(n) = 8T(n/2) + cn^2$ for $n > 2$, which has $T(n) = O(n^{\lg 8}) = O(n^3)$, so this split alone gives no improvement.
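The divide-and-conquer split with 8 block products can be sketched in Python on lists of lists. The helper names (`mat_add`, `split_blocks`, `join_blocks`, `block_multiply`) and the restriction to square matrices whose size is a power of two are assumptions of this sketch.

```python
def mat_add(X, Y):
    """Entry-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split_blocks(M):
    """Return the four quadrants M11, M12, M21, M22."""
    n = len(M) // 2
    return ([row[:n] for row in M[:n]], [row[n:] for row in M[:n]],
            [row[:n] for row in M[n:]], [row[n:] for row in M[n:]])


def join_blocks(C11, C12, C21, C22):
    """Reassemble four quadrants into one matrix."""
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom


def block_multiply(A, B):
    """Recursive product using 8 half-size products per level."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split_blocks(A)
    B11, B12, B21, B22 = split_blocks(B)
    C11 = mat_add(block_multiply(A11, B11), block_multiply(A12, B21))
    C12 = mat_add(block_multiply(A11, B12), block_multiply(A12, B22))
    C21 = mat_add(block_multiply(A21, B11), block_multiply(A22, B21))
    C22 = mat_add(block_multiply(A21, B12), block_multiply(A22, B22))
    return join_blocks(C11, C12, C21, C22)
```

The eight recursive calls and four block additions per level correspond to the $8T(n/2) + cn^2$ recurrence.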
Strassen’s Algorithm
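$P_1 = (A_{11} + A_{22})(B_{11} + B_{22})$
$P_2 = (A_{21} + A_{22}) B_{11}$
$P_3 = A_{11} (B_{12} - B_{22})$
$P_4 = A_{22} (B_{21} - B_{11})$
$P_5 = (A_{11} + A_{12}) B_{22}$
$P_6 = (A_{21} - A_{11})(B_{11} + B_{12})$
$P_7 = (A_{12} - A_{22})(B_{21} + B_{22})$
$C_{11} = P_1 + P_4 - P_5 + P_7$
$C_{12} = P_3 + P_5$
$C_{21} = P_2 + P_4$
$C_{22} = P_1 + P_3 - P_2 + P_6$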
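The seven products and four combinations can be sketched in Python on lists of lists. As with any small illustration, this is a sketch under assumptions: the helper names are hypothetical, the matrices are square with a power-of-two size, and the recursion goes all the way down to 1×1 blocks rather than switching to the classical method at some cutoff.

```python
def mat_add(X, Y):
    """Entry-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_sub(X, Y):
    """Entry-wise difference of two equally sized matrices."""
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split_blocks(M):
    """Return the four quadrants M11, M12, M21, M22."""
    n = len(M) // 2
    return ([row[:n] for row in M[:n]], [row[n:] for row in M[:n]],
            [row[:n] for row in M[n:]], [row[n:] for row in M[n:]])


def join_blocks(C11, C12, C21, C22):
    """Reassemble four quadrants into one matrix."""
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom


def strassen(A, B):
    """Strassen product: 7 recursive multiplications per level."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split_blocks(A)
    B11, B12, B21, B22 = split_blocks(B)
    P1 = strassen(mat_add(A11, A22), mat_add(B11, B22))
    P2 = strassen(mat_add(A21, A22), B11)
    P3 = strassen(A11, mat_sub(B12, B22))
    P4 = strassen(A22, mat_sub(B21, B11))
    P5 = strassen(mat_add(A11, A12), B22)
    P6 = strassen(mat_sub(A21, A11), mat_add(B11, B12))
    P7 = strassen(mat_sub(A12, A22), mat_add(B21, B22))
    C11 = mat_add(mat_sub(mat_add(P1, P4), P5), P7)  # P1 + P4 - P5 + P7
    C12 = mat_add(P3, P5)
    C21 = mat_add(P2, P4)
    C22 = mat_add(mat_add(mat_sub(P1, P2), P3), P6)  # P1 - P2 + P3 + P6
    return join_blocks(C11, C12, C21, C22)
```

On the 2×2 case `strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]])` the seven products are 65, 35, -2, 8, 24, 22 and -30, and the result is `[[19, 22], [43, 50]]`.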
Strassen: 7 multiplies, 18 additions.
$T(n) = b$ for $n \le 2$; $T(n) = 7T(n/2) + cn^2$ for $n > 2$, where the $cn^2$ term covers the 18 additions of $(n/2) \times (n/2)$ blocks, which has $T(n) = O(n^{\lg 7}) \approx O(n^{2.81})$.
$7n^2(1/4 + 1/16 + \dots)$
Strassen-Winograd: 7 multiplies, 15 additions.
Coppersmith-Winograd: $O(n^{2.376})$ (not easily implementable).
In practice Strassen is faster than the classical method even for relatively small $n$ (around 64), since its hidden constants are not large. It is generally stable, but it has been demonstrated that for some matrices Strassen and Strassen-Winograd are too unstable.