Efficient and Accurate Evaluation of Polynomials of Matrices

Massimiliano Fasi
Numerical Linear Algebra Group Meeting
Manchester, 11 December 2018
Outline

• Definition and motivation
• Evaluation schemes
• Open questions
• Summary

Joint work with Steven Elsworth & Gian Maria Negri Porzio.
[email protected] Evaluating polynomials of matrices 1/18
Polynomials of matrices

We want to evaluate (accurately and efficiently)

    p(A) = Σ_{i=0}^{k} c_i A^i = c_0 I + c_1 A + c_2 A^2 + · · · + c_k A^k,

where
• k ∈ N,
• c_i ∈ C, mostly nonzero,
• A ∈ C^{n×n}.
Motivation

• Evaluation of series expansions:
  • computation of matrix functions (Taylor series),
  • solution of matrix equations.
• Evaluation of matrix rational functions q(A)^{-1} p(A).
Beginner's algorithm

Algorithm: Explicit powers.
Input: A ∈ C^{n×n}, c_0, . . . , c_k ∈ C
Output: S = p(A)
    P ← A
    S ← c_0 I + c_1 A
    for i ← 2 to k do
        P ← P A
        S ← S + c_i P
    return S

• 1 diagonal sum
• k scalings, k − 1 sums
• k − 1 matrix products ≈ 2(k − 1)n^3 flops
• 2n^2 additional storage
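The explicit-powers scheme above can be sketched in a few lines of NumPy. This is a minimal illustration, not the speaker's code; the name `poly_explicit` is chosen here, `c[i]` holds the coefficient of A^i, and k = len(c) − 1 ≥ 1 is assumed:

```python
import numpy as np

def poly_explicit(A, c):
    """Evaluate p(A) = c[0] I + c[1] A + ... + c[k] A^k by explicit powers.

    One matrix product per loop iteration: k - 1 products in total
    (about 2(k - 1)n^3 flops) and one extra n-by-n workspace matrix P.
    """
    n = A.shape[0]
    S = c[0] * np.eye(n) + c[1] * A   # degrees 0 and 1
    P = A
    for ci in c[2:]:
        P = P @ A                     # P becomes the next power of A
        S = S + ci * P                # accumulate c_i A^i
    return S
```

Each coefficient triggers a full scaling of an n-by-n matrix, which is exactly the overhead Horner's method removes on the next slide's scheme.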
Avoiding scalings

Algorithm: Horner's method.
Input: A ∈ C^{n×n}, c_0, . . . , c_k ∈ C
Output: S = p(A)
    S ← c_k A + c_{k−1} I
    for i ← k − 2 down to 0 do
        S ← S A + c_i I
    return S

• k diagonal sums
• 1 scaling, 0 sums
• k − 1 matrix products ≈ 2(k − 1)n^3 flops
• n^2 additional storage
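Horner's method can be sketched the same way (again a sketch with names chosen here, assuming k = len(c) − 1 ≥ 1): the full scalings disappear, replaced by cheap additions of c_i I to the diagonal.

```python
import numpy as np

def poly_horner(A, c):
    """Horner's method: S = (...((c_k A + c_{k-1} I) A + c_{k-2} I)...) A + c_0 I.

    Same k - 1 matrix products as explicit powers, but only one scaling
    (c_k A), k diagonal sums, and no extra matrix of workspace.
    """
    n = A.shape[0]
    I = np.eye(n)
    S = c[-1] * A + c[-2] * I
    for ci in reversed(c[:-2]):
        S = S @ A + ci * I   # one matrix product and one diagonal sum per step
    return S
```

The flop count is still dominated by the k − 1 matrix products, so Horner only trims lower-order costs; reducing the number of products is the point of Paterson–Stockmeyer below.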
Are faster algorithms possible?

Theorem [Paterson & Stockmeyer, 1973]
For any k ∈ N there are polynomials of degree k whose evaluation requires at least √k nonscalar multiplications.

Therefore:
• √k matrix multiplications are required in general.
• Can we do better than k − 1?
https://epubs.siam.org/doi/10.1137/0202007
Complicating an easy task

For a positive integer s, we can rewrite

    p(A) = Σ_{i=0}^{r} (A^s)^i B_i,    r = ⌊k/s⌋,

where

    B_i = Σ_{j=0}^{s−1} c_{si+j} A^j,        i = 0, . . . , r − 1,
    B_r = Σ_{j=0}^{k mod s} c_{sr+j} A^j.

Remark: p(A) is a polynomial in A^s with coefficients B_i.
The Paterson–Stockmeyer method

Input: A ∈ C^{n×n}, c_0, . . . , c_k ∈ C
Output: S = p(A)
    A_0 ← I, A_1 ← A
    for i ← 2 to s do
        A_i ← A A_{i−1}
    if s divides k then S ← c_k I
    else S ← Σ_{j=0}^{k mod s} c_{sr+j} A_j
    for i ← r − 1 down to 0 do
        S ← S A_s + Σ_{j=0}^{s−1} c_{si+j} A_j
    return S
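The pseudocode translates directly into NumPy. A sketch (names chosen here, not the speaker's implementation), with s fixed to the near-optimal choice round(√k) discussed on the next slide:

```python
import numpy as np

def poly_ps(A, c):
    """Paterson-Stockmeyer: evaluate p(A) as a degree-r polynomial in A^s
    with matrix coefficients B_i, using about s + r - 1 matrix products."""
    k = len(c) - 1
    s = max(1, int(round(np.sqrt(k))))   # near-optimal parameter
    r = k // s
    n = A.shape[0]
    # Precompute I, A, A^2, ..., A^s (s - 1 matrix products).
    P = [np.eye(n), A]
    for _ in range(2, s + 1):
        P.append(P[-1] @ A)
    # B_i = sum_{j} c_{s i + j} A^j, built from the precomputed powers.
    def block(i, m):
        return sum(c[s * i + j] * P[j] for j in range(m))
    if k % s == 0:
        S = c[k] * np.eye(n)             # B_r reduces to c_k I
    else:
        S = block(r, k % s + 1)          # shorter trailing block
    # Horner recurrence in A^s (r more matrix products).
    for i in range(r - 1, -1, -1):
        S = S @ P[s] + block(i, s)
    return S
```

Storing the powers A^2, . . . , A^s is where the (s − 1)n^2 extra storage of the cost slide comes from; the s − 1 products to build them plus the r products in the final loop give the s + r − 1 total.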
Cost of the Paterson–Stockmeyer algorithm

The algorithm requires (recall that r = ⌊k/s⌋):
• r + 1 diagonal scalings, r + 1 diagonal sums,
• k − r − 1 scalings, k − r − 1 sums,
• about s + r − 1 matrix products,
• (s − 1)n^2 additional elements of storage.

Optimal choice of the parameter
For s ∈ R the number of matrix multiplications is minimized by s = √k.
For s ∈ N one can choose either s = ⌊√k⌋ or s = ⌈√k⌉.
Computational cost

[Figure: matrix multiplications needed to evaluate p(A) of degree k, for s = ⌊√k⌋ and s = ⌈√k⌉.]
The Paterson–Stockmeyer method is optimal

Equivalence of the methods [Hargreaves, 2005]
Both choices s = ⌊√k⌋ and s = ⌈√k⌉ yield the same number of matrix multiplications.

Optimality [Working note, 2018]
The choice s = ⌊√k⌋ (or s = ⌈√k⌉) minimizes the number of matrix multiplications required to evaluate p(A) over all choices of s.

Feel of the proofs
Proofs by exhaustion, with plenty of cases.

http://www.maths.manchester.ac.uk/~higham/links/theses/hargreaves05.pdf
http://www.maths.manchester.ac.uk/~mbbxqmf2/preprints/fasi18.pdf
Slow growth of the computational cost

[Figure: matrix multiplications needed to evaluate p(A) of degree k, for s = ⌊√k⌋ and s = ⌈√k⌉; the count grows like 2√k rather than linearly in k.]
Effect of rounding errors

Bound on forward error [Higham, 2002]
For a polynomial p of degree k, let p̂(A) denote the computed polynomial. Then:
• |p(A) − p̂(A)| ≤ γ_{kn} p̃(|A|),
• ‖p(A) − p̂(A)‖_1 ≤ γ_{kn} p̃(‖A‖_1),
• ‖p(A) − p̂(A)‖_∞ ≤ γ_{kn} p̃(‖A‖_∞),
where γ_n = cnu/(1 − cnu) and p̃(A) = Σ_{i=0}^{k} |c_i| A^i.

In the presence of significant cancellation, the error in computing p(A) can be large.
https://epubs.siam.org/doi/pdf/10.1137/1.9780898718027.ch5
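The right-hand side of the 1-norm bound is cheap to evaluate, so it can serve as a rough a priori error estimate. A sketch, with assumptions made here: the helper name `forward_error_bound` is invented, the unspecified constant c inside γ is taken as 1, and u is set to the IEEE double-precision unit roundoff.

```python
import numpy as np

def forward_error_bound(A, c, u=2.0**-53):
    """Sketch of the RHS of the 1-norm bound: gamma_{kn} * p_tilde(||A||_1),
    where p_tilde(x) = sum_i |c_i| x^i and gamma_m = m u / (1 - m u)
    (the constant c of gamma is taken as 1 here)."""
    k = len(c) - 1
    n = A.shape[0]
    x = np.linalg.norm(A, 1)                        # matrix 1-norm (max column sum)
    p_tilde = sum(abs(ci) * x**i for i, ci in enumerate(c))
    t = k * n * u
    return t / (1.0 - t) * p_tilde
```

For coefficients of alternating sign, p̃(‖A‖_1) can vastly exceed ‖p(A)‖_1, which is exactly the cancellation scenario in which the computed result may carry few or no correct digits.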
Deriving new coefficients

• Evaluation in factored form, 2(k − 1)n^3 flops.
• Recent algorithms [Sastre, 2018], [Sastre, Ibáñez & Defez, 2018].
• Modified Paterson–Stockmeyer [Paterson & Stockmeyer, 1973]:
  • only √(2k) + log_2 k nonscalar multiplications,
  • only rational preprocessing,
  • never seen discussed or used.

Open question
Is any of these methods worth considering at all?
Is Paterson–Stockmeyer optimal?

Summary
• Theoretical lower bound of √k MxM.
• Paterson–Stockmeyer requires 2√k MxM with the original coefficients.
• An algorithm exists that requires √(2k) MxM with modified coefficients.

Open questions
1. Is 2√k a lower bound without modifying the coefficients?
2. Is there an algorithm that requires only √k MxM?
Most accurate evaluation

Paterson–Stockmeyer requires computing s consecutive powers of A.
Since A commutes with its powers, we have several algorithms:
• AA, AA^2, AA^3, AA^4, . . . (A on the left)
• AA, A^2A, A^3A, A^4A, . . . (A on the right)
• AA, A^2A, AA^3, A^4A, . . . (alternating)
• AA, AA^2, A^2A^2, A^2A^3, . . . (balanced)

Open questions
1. Can we select the most accurate algorithm by looking at A and s only?
2. If A and B commute, when is fl(AB) more accurate than fl(BA)?
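The phenomenon behind question 2 is easy to observe numerically: A commutes with A^2 exactly, yet the rounded products fl(A · A^2) and fl(A^2 · A) generally differ in their last bits. A quick illustration (the matrix size and seed are arbitrary choices made here):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
A2 = A @ A

left = A @ A2    # A on the left:  fl(A * A^2)
right = A2 @ A   # A on the right: fl(A^2 * A)

# Mathematically left == right == A^3, but the two orderings round
# differently; which one is closer to the exact A^3 is the open question.
gap = np.max(np.abs(left - right))
```

The two results agree to working precision but are usually not bitwise equal, and nothing visible in A alone says which ordering loses less accuracy.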
Evaluation of sparse polynomials

Sparse polynomial
A polynomial with enough zero coefficients to make exploiting the structure worthwhile.

Open questions
1. What is the best way of evaluating sparse polynomials?
2. When is the sparsity worth exploiting?
Take-home messages

1. Evaluating polynomials of matrices is not entirely trivial.
2. It is not clear how fast reasonably accurate algorithms are.
3. It is not clear how accurate faster algorithms are.
4. There are still plenty of open questions waiting to be solved.