Research Matters
February 25 2009
Nick HighamDirector of Research
School of Mathematics
1 6
Challenges in ExploitingHigher Precision Floating Point Arithmetic
Nick HighamSchool of Mathematics
The University of Manchester
highammamanacukhttpwwwmamanacuk~higham
nhigham
Chebfun and Beyond Conference OxfordSeptember 17 - 19 2012
Floating Point Number System
Floating point number system F sub R
y = plusmnm times βeminust 0 le m le βt minus 1
Base βprecision t exponent range emin le e le emax
Floating point numbers are not equally spaced
If β = 2 t = 3 emin = minus1 and emax = 3
0 05 10 20 30 40 50 60 70
University of Manchester Nick Higham Higher Precision Arithmetic 2 33
Floating Point Number System
Floating point number system F sub R
y = plusmnm times βeminust 0 le m le βt minus 1
Base βprecision t exponent range emin le e le emax
Floating point numbers are not equally spaced
If β = 2 t = 3 emin = minus1 and emax = 3
0 05 10 20 30 40 50 60 70
University of Manchester Nick Higham Higher Precision Arithmetic 2 33
Precision versus Accuracy
Unit roundoff u = 12β
1minust
fl(abc) = ab(1 + δ1) middot c(1 + δ2) |δi | le u= abc(1 + δ1)(1 + δ2)
asymp abc(1 + δ1 + δ2)
Precision = uAccuracy asymp 2u
Accuracy is not limited by precision
University of Manchester Nick Higham Higher Precision Arithmetic 3 33
Precision versus Accuracy
Unit roundoff u = 12β
1minust
fl(abc) = ab(1 + δ1) middot c(1 + δ2) |δi | le u= abc(1 + δ1)(1 + δ2)
asymp abc(1 + δ1 + δ2)
Precision = uAccuracy asymp 2u
Accuracy is not limited by precision
University of Manchester Nick Higham Higher Precision Arithmetic 3 33
Walkersrsquo Trouser Review
bull pARAMO Torres Trousers pound105 Waterproof insulated trousers to be worn as
an outer layer or next to the skin I understand the concept but the execution is unusual The waist is
bull wide relative to thighs and the cut is narrow around backside and crotch (despi te a gusset) There is only
it The wa ist connects like a nappy adjust the stati c integral webbing belt via a large quick-release buckle (which can clash with a rucksack or camera bag hipbelt) then
fold a fabric fla p up to the belt and secure with two Ve lcro tabs This leaves an air gap on each side between
waist and the top of the leg zip Knees articulate but when the leg is raised above step height the fabric is restrictive across the thigh and backside Full length side zips (legs cannot be easily shortened) allow rapid zipping on or off - useful if they are worn as a supershywarm overtrouser Paramo say Torres represent a practical su rviva l aid but as a next-to-skin trouser they fee l compromised by the cut and an insulated overt rouser has a limited market
THE LOWDOWN
Fabric Nikwax Analogy Insulator (polyester microfibre outer 100g polyester fill) Sizes XS-XL (unisex) Inside leg 79cm only Waist integral belt front flap with Velcro tabs
IHtIrnl Paramo 01892 786444 wwwparamo_couk
Cit THE NORTH FACE Insulated Trekker Pant Heading to the Antarctic Pack a pair of
these Inside the stretch nylon exterior is a quilted taffeta lining its li ke being wrapped in a duvet The on ly hitch is that outside co ld dry cond it ions theyre often too warm There are no leg vents and no water repellency although the insulation stops moisture (mist not rain) seeping through Breathability is good for such a warm garment and they are super-comfortable in chilly weather A choice of leg lengths is available and the plain hem is easily adjusted The waist is generous with belt loops and a static drawcord to keep them in place although the drawcord isnt very effective against the weight and bulk of the trousers Pockets are odd the zipped side pockets are on ly just big enough for my (small) hands and Im sti ll searching for a use for the zipped pocket behind the left thigh I found these pants too warm for British hill walking but appreciated them in cold dry weather in the French Alps
THE LOWDOWN Fabric 90 nylon 10 elastane polyester insulation Sizes Men 30-38 Women 8-16
IHtImU The North Face 01539822155 wwwthenorthfacecomeu
Inside leg Men Regular 80-83cm Long 85-88cm Women Short 71-75cm Regular 76-80cm Long 81-85cm_ All size graduated Waist zip 2 press studs belt loops static drawcord Pockets 2 zipped front 1 zipped rear 1 zipped back thigh
pound65
January 2011 Outdoor Photography 75
University of Manchester Nick Higham Higher Precision Arithmetic 5 33
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksXYZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksX
YZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
IEEE Standard 754-1985
Binary β = 2
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
Arithmetic ops (+minus lowast radic) performed as if firstcalculated to infinite precision then roundedDefault round to nearest round to even in case of tie
University of Manchester Nick Higham Higher Precision Arithmetic 7 33
Kahan on Higher Precision
For now the 10-byte Extended format is atolerable compromise between
the value of extra-precise arithmetic and the price ofimplementing it to run fast
very soon two more bytes of precision will become tolerableand ultimately a 16-byte format
That kind of gradual evolution towards wider precision
was already in view whenIEEE Standard 754 for Floating-Point Arithmetic was framed
mdash Computer Benchmarks Versus Accuracy (1994)
University of Manchester Nick Higham Higher Precision Arithmetic 8 33
IEEE Standard 754-2008
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
quadruple 128 bits 10plusmn4932 2minus113 asymp 96times 10minus35
University of Manchester Nick Higham Higher Precision Arithmetic 9 33
Extended and Mixed Precision BLAS (2002)
Part of new standard developed by BLAS TechnicalForumExtra precision used internally input and outputarguments remain single or doubleExtra precision is anything at least 15 times asaccurate as double precision and wider than 80 bitsCounterparts of selected level 1 2 and 3 BLASroutines Extra input argument specifies precision ofinternal computationsReference implementation uses double-double formatgiving extra precision of about 106 bits
University of Manchester Nick Higham Higher Precision Arithmetic 10 33
Need for Higher Precision
Bailey Simon Barton amp Fouts Floating PointArithmetic in Future Supercomputers Internat JSupercomputer Appl 3 86ndash90 1989Bailey Barrio amp Borwein High-Precision ComputationMathematical Physics and Dynamics Appl MathComput 218 10106ndash10121 2012
Long-time simulationsResolving small-scale phenomenaLarge-scale simulations
University of Manchester Nick Higham Higher Precision Arithmetic 11 33
Going to Higher Precision
If we have quadruple or higher precision what do we needto do to modify existing algorithms
To what extent are existing algs precision-independent
University of Manchester Nick Higham Higher Precision Arithmetic 13 33
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Floating Point Number System
Floating point number system F sub R
y = plusmnm times βeminust 0 le m le βt minus 1
Base βprecision t exponent range emin le e le emax
Floating point numbers are not equally spaced
If β = 2 t = 3 emin = minus1 and emax = 3
0 05 10 20 30 40 50 60 70
University of Manchester Nick Higham Higher Precision Arithmetic 2 33
Floating Point Number System
Floating point number system F sub R
y = plusmnm times βeminust 0 le m le βt minus 1
Base βprecision t exponent range emin le e le emax
Floating point numbers are not equally spaced
If β = 2 t = 3 emin = minus1 and emax = 3
0 05 10 20 30 40 50 60 70
University of Manchester Nick Higham Higher Precision Arithmetic 2 33
Precision versus Accuracy
Unit roundoff u = 12β
1minust
fl(abc) = ab(1 + δ1) middot c(1 + δ2) |δi | le u= abc(1 + δ1)(1 + δ2)
asymp abc(1 + δ1 + δ2)
Precision = uAccuracy asymp 2u
Accuracy is not limited by precision
University of Manchester Nick Higham Higher Precision Arithmetic 3 33
Precision versus Accuracy
Unit roundoff u = 12β
1minust
fl(abc) = ab(1 + δ1) middot c(1 + δ2) |δi | le u= abc(1 + δ1)(1 + δ2)
asymp abc(1 + δ1 + δ2)
Precision = uAccuracy asymp 2u
Accuracy is not limited by precision
University of Manchester Nick Higham Higher Precision Arithmetic 3 33
Walkersrsquo Trouser Review
bull pARAMO Torres Trousers pound105 Waterproof insulated trousers to be worn as
an outer layer or next to the skin I understand the concept but the execution is unusual The waist is
bull wide relative to thighs and the cut is narrow around backside and crotch (despi te a gusset) There is only
it The wa ist connects like a nappy adjust the stati c integral webbing belt via a large quick-release buckle (which can clash with a rucksack or camera bag hipbelt) then
fold a fabric fla p up to the belt and secure with two Ve lcro tabs This leaves an air gap on each side between
waist and the top of the leg zip Knees articulate but when the leg is raised above step height the fabric is restrictive across the thigh and backside Full length side zips (legs cannot be easily shortened) allow rapid zipping on or off - useful if they are worn as a supershywarm overtrouser Paramo say Torres represent a practical su rviva l aid but as a next-to-skin trouser they fee l compromised by the cut and an insulated overt rouser has a limited market
THE LOWDOWN
Fabric Nikwax Analogy Insulator (polyester microfibre outer 100g polyester fill) Sizes XS-XL (unisex) Inside leg 79cm only Waist integral belt front flap with Velcro tabs
IHtIrnl Paramo 01892 786444 wwwparamo_couk
Cit THE NORTH FACE Insulated Trekker Pant Heading to the Antarctic Pack a pair of
these Inside the stretch nylon exterior is a quilted taffeta lining its li ke being wrapped in a duvet The on ly hitch is that outside co ld dry cond it ions theyre often too warm There are no leg vents and no water repellency although the insulation stops moisture (mist not rain) seeping through Breathability is good for such a warm garment and they are super-comfortable in chilly weather A choice of leg lengths is available and the plain hem is easily adjusted The waist is generous with belt loops and a static drawcord to keep them in place although the drawcord isnt very effective against the weight and bulk of the trousers Pockets are odd the zipped side pockets are on ly just big enough for my (small) hands and Im sti ll searching for a use for the zipped pocket behind the left thigh I found these pants too warm for British hill walking but appreciated them in cold dry weather in the French Alps
THE LOWDOWN Fabric 90 nylon 10 elastane polyester insulation Sizes Men 30-38 Women 8-16
IHtImU The North Face 01539822155 wwwthenorthfacecomeu
Inside leg Men Regular 80-83cm Long 85-88cm Women Short 71-75cm Regular 76-80cm Long 81-85cm_ All size graduated Waist zip 2 press studs belt loops static drawcord Pockets 2 zipped front 1 zipped rear 1 zipped back thigh
pound65
January 2011 Outdoor Photography 75
University of Manchester Nick Higham Higher Precision Arithmetic 5 33
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksXYZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksX
YZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
IEEE Standard 754-1985
Binary β = 2
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
Arithmetic ops (+minus lowast radic) performed as if firstcalculated to infinite precision then roundedDefault round to nearest round to even in case of tie
University of Manchester Nick Higham Higher Precision Arithmetic 7 33
Kahan on Higher Precision
For now the 10-byte Extended format is atolerable compromise between
the value of extra-precise arithmetic and the price ofimplementing it to run fast
very soon two more bytes of precision will become tolerableand ultimately a 16-byte format
That kind of gradual evolution towards wider precision
was already in view whenIEEE Standard 754 for Floating-Point Arithmetic was framed
mdash Computer Benchmarks Versus Accuracy (1994)
University of Manchester Nick Higham Higher Precision Arithmetic 8 33
IEEE Standard 754-2008
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
quadruple 128 bits 10plusmn4932 2minus113 asymp 96times 10minus35
University of Manchester Nick Higham Higher Precision Arithmetic 9 33
Extended and Mixed Precision BLAS (2002)
Part of new standard developed by BLAS TechnicalForumExtra precision used internally input and outputarguments remain single or doubleExtra precision is anything at least 15 times asaccurate as double precision and wider than 80 bitsCounterparts of selected level 1 2 and 3 BLASroutines Extra input argument specifies precision ofinternal computationsReference implementation uses double-double formatgiving extra precision of about 106 bits
University of Manchester Nick Higham Higher Precision Arithmetic 10 33
Need for Higher Precision
Bailey Simon Barton amp Fouts Floating PointArithmetic in Future Supercomputers Internat JSupercomputer Appl 3 86ndash90 1989Bailey Barrio amp Borwein High-Precision ComputationMathematical Physics and Dynamics Appl MathComput 218 10106ndash10121 2012
Long-time simulationsResolving small-scale phenomenaLarge-scale simulations
University of Manchester Nick Higham Higher Precision Arithmetic 11 33
Going to Higher Precision
If we have quadruple or higher precision what do we needto do to modify existing algorithms
To what extent are existing algs precision-independent
University of Manchester Nick Higham Higher Precision Arithmetic 13 33
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Floating Point Number System
Floating point number system F sub R
y = plusmnm times βeminust 0 le m le βt minus 1
Base βprecision t exponent range emin le e le emax
Floating point numbers are not equally spaced
If β = 2 t = 3 emin = minus1 and emax = 3
0 05 10 20 30 40 50 60 70
University of Manchester Nick Higham Higher Precision Arithmetic 2 33
Precision versus Accuracy
Unit roundoff u = 12β
1minust
fl(abc) = ab(1 + δ1) middot c(1 + δ2) |δi | le u= abc(1 + δ1)(1 + δ2)
asymp abc(1 + δ1 + δ2)
Precision = uAccuracy asymp 2u
Accuracy is not limited by precision
University of Manchester Nick Higham Higher Precision Arithmetic 3 33
Precision versus Accuracy
Unit roundoff u = 12β
1minust
fl(abc) = ab(1 + δ1) middot c(1 + δ2) |δi | le u= abc(1 + δ1)(1 + δ2)
asymp abc(1 + δ1 + δ2)
Precision = uAccuracy asymp 2u
Accuracy is not limited by precision
University of Manchester Nick Higham Higher Precision Arithmetic 3 33
Walkersrsquo Trouser Review
bull pARAMO Torres Trousers pound105 Waterproof insulated trousers to be worn as
an outer layer or next to the skin I understand the concept but the execution is unusual The waist is
bull wide relative to thighs and the cut is narrow around backside and crotch (despi te a gusset) There is only
it The wa ist connects like a nappy adjust the stati c integral webbing belt via a large quick-release buckle (which can clash with a rucksack or camera bag hipbelt) then
fold a fabric fla p up to the belt and secure with two Ve lcro tabs This leaves an air gap on each side between
waist and the top of the leg zip Knees articulate but when the leg is raised above step height the fabric is restrictive across the thigh and backside Full length side zips (legs cannot be easily shortened) allow rapid zipping on or off - useful if they are worn as a supershywarm overtrouser Paramo say Torres represent a practical su rviva l aid but as a next-to-skin trouser they fee l compromised by the cut and an insulated overt rouser has a limited market
THE LOWDOWN
Fabric Nikwax Analogy Insulator (polyester microfibre outer 100g polyester fill) Sizes XS-XL (unisex) Inside leg 79cm only Waist integral belt front flap with Velcro tabs
IHtIrnl Paramo 01892 786444 wwwparamo_couk
Cit THE NORTH FACE Insulated Trekker Pant Heading to the Antarctic Pack a pair of
these Inside the stretch nylon exterior is a quilted taffeta lining its li ke being wrapped in a duvet The on ly hitch is that outside co ld dry cond it ions theyre often too warm There are no leg vents and no water repellency although the insulation stops moisture (mist not rain) seeping through Breathability is good for such a warm garment and they are super-comfortable in chilly weather A choice of leg lengths is available and the plain hem is easily adjusted The waist is generous with belt loops and a static drawcord to keep them in place although the drawcord isnt very effective against the weight and bulk of the trousers Pockets are odd the zipped side pockets are on ly just big enough for my (small) hands and Im sti ll searching for a use for the zipped pocket behind the left thigh I found these pants too warm for British hill walking but appreciated them in cold dry weather in the French Alps
THE LOWDOWN Fabric 90 nylon 10 elastane polyester insulation Sizes Men 30-38 Women 8-16
IHtImU The North Face 01539822155 wwwthenorthfacecomeu
Inside leg Men Regular 80-83cm Long 85-88cm Women Short 71-75cm Regular 76-80cm Long 81-85cm_ All size graduated Waist zip 2 press studs belt loops static drawcord Pockets 2 zipped front 1 zipped rear 1 zipped back thigh
pound65
January 2011 Outdoor Photography 75
University of Manchester Nick Higham Higher Precision Arithmetic 5 33
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksXYZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksX
YZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
IEEE Standard 754-1985
Binary β = 2
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
Arithmetic ops (+minus lowast radic) performed as if firstcalculated to infinite precision then roundedDefault round to nearest round to even in case of tie
University of Manchester Nick Higham Higher Precision Arithmetic 7 33
Kahan on Higher Precision
For now the 10-byte Extended format is atolerable compromise between
the value of extra-precise arithmetic and the price ofimplementing it to run fast
very soon two more bytes of precision will become tolerableand ultimately a 16-byte format
That kind of gradual evolution towards wider precision
was already in view whenIEEE Standard 754 for Floating-Point Arithmetic was framed
mdash Computer Benchmarks Versus Accuracy (1994)
University of Manchester Nick Higham Higher Precision Arithmetic 8 33
IEEE Standard 754-2008
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
quadruple 128 bits 10plusmn4932 2minus113 asymp 96times 10minus35
University of Manchester Nick Higham Higher Precision Arithmetic 9 33
Extended and Mixed Precision BLAS (2002)
Part of new standard developed by BLAS TechnicalForumExtra precision used internally input and outputarguments remain single or doubleExtra precision is anything at least 15 times asaccurate as double precision and wider than 80 bitsCounterparts of selected level 1 2 and 3 BLASroutines Extra input argument specifies precision ofinternal computationsReference implementation uses double-double formatgiving extra precision of about 106 bits
University of Manchester Nick Higham Higher Precision Arithmetic 10 33
Need for Higher Precision
Bailey Simon Barton amp Fouts Floating PointArithmetic in Future Supercomputers Internat JSupercomputer Appl 3 86ndash90 1989Bailey Barrio amp Borwein High-Precision ComputationMathematical Physics and Dynamics Appl MathComput 218 10106ndash10121 2012
Long-time simulationsResolving small-scale phenomenaLarge-scale simulations
University of Manchester Nick Higham Higher Precision Arithmetic 11 33
Going to Higher Precision
If we have quadruple or higher precision what do we needto do to modify existing algorithms
To what extent are existing algs precision-independent
University of Manchester Nick Higham Higher Precision Arithmetic 13 33
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Precision versus Accuracy
Unit roundoff u = 12β
1minust
fl(abc) = ab(1 + δ1) middot c(1 + δ2) |δi | le u= abc(1 + δ1)(1 + δ2)
asymp abc(1 + δ1 + δ2)
Precision = uAccuracy asymp 2u
Accuracy is not limited by precision
University of Manchester Nick Higham Higher Precision Arithmetic 3 33
Precision versus Accuracy
Unit roundoff u = 12β
1minust
fl(abc) = ab(1 + δ1) middot c(1 + δ2) |δi | le u= abc(1 + δ1)(1 + δ2)
asymp abc(1 + δ1 + δ2)
Precision = uAccuracy asymp 2u
Accuracy is not limited by precision
University of Manchester Nick Higham Higher Precision Arithmetic 3 33
Walkersrsquo Trouser Review
bull pARAMO Torres Trousers pound105 Waterproof insulated trousers to be worn as
an outer layer or next to the skin I understand the concept but the execution is unusual The waist is
bull wide relative to thighs and the cut is narrow around backside and crotch (despi te a gusset) There is only
it The wa ist connects like a nappy adjust the stati c integral webbing belt via a large quick-release buckle (which can clash with a rucksack or camera bag hipbelt) then
fold a fabric fla p up to the belt and secure with two Ve lcro tabs This leaves an air gap on each side between
waist and the top of the leg zip Knees articulate but when the leg is raised above step height the fabric is restrictive across the thigh and backside Full length side zips (legs cannot be easily shortened) allow rapid zipping on or off - useful if they are worn as a supershywarm overtrouser Paramo say Torres represent a practical su rviva l aid but as a next-to-skin trouser they fee l compromised by the cut and an insulated overt rouser has a limited market
THE LOWDOWN
Fabric Nikwax Analogy Insulator (polyester microfibre outer 100g polyester fill) Sizes XS-XL (unisex) Inside leg 79cm only Waist integral belt front flap with Velcro tabs
IHtIrnl Paramo 01892 786444 wwwparamo_couk
Cit THE NORTH FACE Insulated Trekker Pant Heading to the Antarctic Pack a pair of
these Inside the stretch nylon exterior is a quilted taffeta lining its li ke being wrapped in a duvet The on ly hitch is that outside co ld dry cond it ions theyre often too warm There are no leg vents and no water repellency although the insulation stops moisture (mist not rain) seeping through Breathability is good for such a warm garment and they are super-comfortable in chilly weather A choice of leg lengths is available and the plain hem is easily adjusted The waist is generous with belt loops and a static drawcord to keep them in place although the drawcord isnt very effective against the weight and bulk of the trousers Pockets are odd the zipped side pockets are on ly just big enough for my (small) hands and Im sti ll searching for a use for the zipped pocket behind the left thigh I found these pants too warm for British hill walking but appreciated them in cold dry weather in the French Alps
THE LOWDOWN Fabric 90 nylon 10 elastane polyester insulation Sizes Men 30-38 Women 8-16
IHtImU The North Face 01539822155 wwwthenorthfacecomeu
Inside leg Men Regular 80-83cm Long 85-88cm Women Short 71-75cm Regular 76-80cm Long 81-85cm_ All size graduated Waist zip 2 press studs belt loops static drawcord Pockets 2 zipped front 1 zipped rear 1 zipped back thigh
pound65
January 2011 Outdoor Photography 75
University of Manchester Nick Higham Higher Precision Arithmetic 5 33
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksXYZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksX
YZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
IEEE Standard 754-1985
Binary β = 2
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
Arithmetic ops (+minus lowast radic) performed as if firstcalculated to infinite precision then roundedDefault round to nearest round to even in case of tie
University of Manchester Nick Higham Higher Precision Arithmetic 7 33
Kahan on Higher Precision
For now the 10-byte Extended format is atolerable compromise between
the value of extra-precise arithmetic and the price ofimplementing it to run fast
very soon two more bytes of precision will become tolerableand ultimately a 16-byte format
That kind of gradual evolution towards wider precision
was already in view whenIEEE Standard 754 for Floating-Point Arithmetic was framed
mdash Computer Benchmarks Versus Accuracy (1994)
University of Manchester Nick Higham Higher Precision Arithmetic 8 33
IEEE Standard 754-2008
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
quadruple 128 bits 10plusmn4932 2minus113 asymp 96times 10minus35
University of Manchester Nick Higham Higher Precision Arithmetic 9 33
Extended and Mixed Precision BLAS (2002)
Part of new standard developed by BLAS TechnicalForumExtra precision used internally input and outputarguments remain single or doubleExtra precision is anything at least 15 times asaccurate as double precision and wider than 80 bitsCounterparts of selected level 1 2 and 3 BLASroutines Extra input argument specifies precision ofinternal computationsReference implementation uses double-double formatgiving extra precision of about 106 bits
University of Manchester Nick Higham Higher Precision Arithmetic 10 33
Need for Higher Precision
Bailey Simon Barton amp Fouts Floating PointArithmetic in Future Supercomputers Internat JSupercomputer Appl 3 86ndash90 1989Bailey Barrio amp Borwein High-Precision ComputationMathematical Physics and Dynamics Appl MathComput 218 10106ndash10121 2012
Long-time simulationsResolving small-scale phenomenaLarge-scale simulations
University of Manchester Nick Higham Higher Precision Arithmetic 11 33
Going to Higher Precision
If we have quadruple or higher precision what do we needto do to modify existing algorithms
To what extent are existing algs precision-independent
University of Manchester Nick Higham Higher Precision Arithmetic 13 33
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Precision versus Accuracy
Unit roundoff u = 12β
1minust
fl(abc) = ab(1 + δ1) middot c(1 + δ2) |δi | le u= abc(1 + δ1)(1 + δ2)
asymp abc(1 + δ1 + δ2)
Precision = uAccuracy asymp 2u
Accuracy is not limited by precision
University of Manchester Nick Higham Higher Precision Arithmetic 3 33
Walkersrsquo Trouser Review
bull pARAMO Torres Trousers pound105 Waterproof insulated trousers to be worn as
an outer layer or next to the skin I understand the concept but the execution is unusual The waist is
bull wide relative to thighs and the cut is narrow around backside and crotch (despi te a gusset) There is only
it The wa ist connects like a nappy adjust the stati c integral webbing belt via a large quick-release buckle (which can clash with a rucksack or camera bag hipbelt) then
fold a fabric fla p up to the belt and secure with two Ve lcro tabs This leaves an air gap on each side between
waist and the top of the leg zip Knees articulate but when the leg is raised above step height the fabric is restrictive across the thigh and backside Full length side zips (legs cannot be easily shortened) allow rapid zipping on or off - useful if they are worn as a supershywarm overtrouser Paramo say Torres represent a practical su rviva l aid but as a next-to-skin trouser they fee l compromised by the cut and an insulated overt rouser has a limited market
THE LOWDOWN
Fabric Nikwax Analogy Insulator (polyester microfibre outer 100g polyester fill) Sizes XS-XL (unisex) Inside leg 79cm only Waist integral belt front flap with Velcro tabs
IHtIrnl Paramo 01892 786444 wwwparamo_couk
Cit THE NORTH FACE Insulated Trekker Pant Heading to the Antarctic Pack a pair of
these Inside the stretch nylon exterior is a quilted taffeta lining its li ke being wrapped in a duvet The on ly hitch is that outside co ld dry cond it ions theyre often too warm There are no leg vents and no water repellency although the insulation stops moisture (mist not rain) seeping through Breathability is good for such a warm garment and they are super-comfortable in chilly weather A choice of leg lengths is available and the plain hem is easily adjusted The waist is generous with belt loops and a static drawcord to keep them in place although the drawcord isnt very effective against the weight and bulk of the trousers Pockets are odd the zipped side pockets are on ly just big enough for my (small) hands and Im sti ll searching for a use for the zipped pocket behind the left thigh I found these pants too warm for British hill walking but appreciated them in cold dry weather in the French Alps
THE LOWDOWN Fabric 90 nylon 10 elastane polyester insulation Sizes Men 30-38 Women 8-16
IHtImU The North Face 01539822155 wwwthenorthfacecomeu
Inside leg Men Regular 80-83cm Long 85-88cm Women Short 71-75cm Regular 76-80cm Long 81-85cm_ All size graduated Waist zip 2 press studs belt loops static drawcord Pockets 2 zipped front 1 zipped rear 1 zipped back thigh
pound65
January 2011 Outdoor Photography 75
University of Manchester Nick Higham Higher Precision Arithmetic 5 33
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksXYZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksX
YZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
IEEE Standard 754-1985
Binary β = 2
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
Arithmetic ops (+minus lowast radic) performed as if firstcalculated to infinite precision then roundedDefault round to nearest round to even in case of tie
University of Manchester Nick Higham Higher Precision Arithmetic 7 33
Kahan on Higher Precision
For now the 10-byte Extended format is atolerable compromise between
the value of extra-precise arithmetic and the price ofimplementing it to run fast
very soon two more bytes of precision will become tolerableand ultimately a 16-byte format
That kind of gradual evolution towards wider precision
was already in view whenIEEE Standard 754 for Floating-Point Arithmetic was framed
mdash Computer Benchmarks Versus Accuracy (1994)
University of Manchester Nick Higham Higher Precision Arithmetic 8 33
IEEE Standard 754-2008
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
quadruple 128 bits 10plusmn4932 2minus113 asymp 96times 10minus35
University of Manchester Nick Higham Higher Precision Arithmetic 9 33
Extended and Mixed Precision BLAS (2002)
Part of new standard developed by BLAS TechnicalForumExtra precision used internally input and outputarguments remain single or doubleExtra precision is anything at least 15 times asaccurate as double precision and wider than 80 bitsCounterparts of selected level 1 2 and 3 BLASroutines Extra input argument specifies precision ofinternal computationsReference implementation uses double-double formatgiving extra precision of about 106 bits
University of Manchester Nick Higham Higher Precision Arithmetic 10 33
Need for Higher Precision
Bailey Simon Barton amp Fouts Floating PointArithmetic in Future Supercomputers Internat JSupercomputer Appl 3 86ndash90 1989Bailey Barrio amp Borwein High-Precision ComputationMathematical Physics and Dynamics Appl MathComput 218 10106ndash10121 2012
Long-time simulationsResolving small-scale phenomenaLarge-scale simulations
University of Manchester Nick Higham Higher Precision Arithmetic 11 33
Going to Higher Precision
If we have quadruple or higher precision what do we needto do to modify existing algorithms
To what extent are existing algs precision-independent
University of Manchester Nick Higham Higher Precision Arithmetic 13 33
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Walkersrsquo Trouser Review
bull pARAMO Torres Trousers pound105 Waterproof insulated trousers to be worn as
an outer layer or next to the skin I understand the concept but the execution is unusual The waist is
bull wide relative to thighs and the cut is narrow around backside and crotch (despi te a gusset) There is only
it The wa ist connects like a nappy adjust the stati c integral webbing belt via a large quick-release buckle (which can clash with a rucksack or camera bag hipbelt) then
fold a fabric fla p up to the belt and secure with two Ve lcro tabs This leaves an air gap on each side between
waist and the top of the leg zip Knees articulate but when the leg is raised above step height the fabric is restrictive across the thigh and backside Full length side zips (legs cannot be easily shortened) allow rapid zipping on or off - useful if they are worn as a supershywarm overtrouser Paramo say Torres represent a practical su rviva l aid but as a next-to-skin trouser they fee l compromised by the cut and an insulated overt rouser has a limited market
THE LOWDOWN
Fabric Nikwax Analogy Insulator (polyester microfibre outer 100g polyester fill) Sizes XS-XL (unisex) Inside leg 79cm only Waist integral belt front flap with Velcro tabs
IHtIrnl Paramo 01892 786444 wwwparamo_couk
Cit THE NORTH FACE Insulated Trekker Pant Heading to the Antarctic Pack a pair of
these Inside the stretch nylon exterior is a quilted taffeta lining its li ke being wrapped in a duvet The on ly hitch is that outside co ld dry cond it ions theyre often too warm There are no leg vents and no water repellency although the insulation stops moisture (mist not rain) seeping through Breathability is good for such a warm garment and they are super-comfortable in chilly weather A choice of leg lengths is available and the plain hem is easily adjusted The waist is generous with belt loops and a static drawcord to keep them in place although the drawcord isnt very effective against the weight and bulk of the trousers Pockets are odd the zipped side pockets are on ly just big enough for my (small) hands and Im sti ll searching for a use for the zipped pocket behind the left thigh I found these pants too warm for British hill walking but appreciated them in cold dry weather in the French Alps
THE LOWDOWN Fabric 90 nylon 10 elastane polyester insulation Sizes Men 30-38 Women 8-16
IHtImU The North Face 01539822155 wwwthenorthfacecomeu
Inside leg Men Regular 80-83cm Long 85-88cm Women Short 71-75cm Regular 76-80cm Long 81-85cm_ All size graduated Waist zip 2 press studs belt loops static drawcord Pockets 2 zipped front 1 zipped rear 1 zipped back thigh
pound65
January 2011 Outdoor Photography 75
University of Manchester Nick Higham Higher Precision Arithmetic 5 33
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksXYZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksX
YZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
IEEE Standard 754-1985
Binary β = 2
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
Arithmetic ops (+minus lowast radic) performed as if firstcalculated to infinite precision then roundedDefault round to nearest round to even in case of tie
University of Manchester Nick Higham Higher Precision Arithmetic 7 33
Kahan on Higher Precision
For now the 10-byte Extended format is atolerable compromise between
the value of extra-precise arithmetic and the price ofimplementing it to run fast
very soon two more bytes of precision will become tolerableand ultimately a 16-byte format
That kind of gradual evolution towards wider precision
was already in view whenIEEE Standard 754 for Floating-Point Arithmetic was framed
mdash Computer Benchmarks Versus Accuracy (1994)
University of Manchester Nick Higham Higher Precision Arithmetic 8 33
IEEE Standard 754-2008
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
quadruple 128 bits 10plusmn4932 2minus113 asymp 96times 10minus35
University of Manchester Nick Higham Higher Precision Arithmetic 9 33
Extended and Mixed Precision BLAS (2002)
Part of new standard developed by BLAS TechnicalForumExtra precision used internally input and outputarguments remain single or doubleExtra precision is anything at least 15 times asaccurate as double precision and wider than 80 bitsCounterparts of selected level 1 2 and 3 BLASroutines Extra input argument specifies precision ofinternal computationsReference implementation uses double-double formatgiving extra precision of about 106 bits
University of Manchester Nick Higham Higher Precision Arithmetic 10 33
Need for Higher Precision
Bailey Simon Barton amp Fouts Floating PointArithmetic in Future Supercomputers Internat JSupercomputer Appl 3 86ndash90 1989Bailey Barrio amp Borwein High-Precision ComputationMathematical Physics and Dynamics Appl MathComput 218 10106ndash10121 2012
Long-time simulationsResolving small-scale phenomenaLarge-scale simulations
University of Manchester Nick Higham Higher Precision Arithmetic 11 33
Going to Higher Precision
If we have quadruple or higher precision what do we needto do to modify existing algorithms
To what extent are existing algs precision-independent
University of Manchester Nick Higham Higher Precision Arithmetic 13 33
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksXYZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksX
YZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
IEEE Standard 754-1985
Binary β = 2
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
Arithmetic ops (+minus lowast radic) performed as if firstcalculated to infinite precision then roundedDefault round to nearest round to even in case of tie
University of Manchester Nick Higham Higher Precision Arithmetic 7 33
Kahan on Higher Precision
For now the 10-byte Extended format is atolerable compromise between
the value of extra-precise arithmetic and the price ofimplementing it to run fast
very soon two more bytes of precision will become tolerableand ultimately a 16-byte format
That kind of gradual evolution towards wider precision
was already in view whenIEEE Standard 754 for Floating-Point Arithmetic was framed
mdash Computer Benchmarks Versus Accuracy (1994)
University of Manchester Nick Higham Higher Precision Arithmetic 8 33
IEEE Standard 754-2008
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
quadruple 128 bits 10plusmn4932 2minus113 asymp 96times 10minus35
University of Manchester Nick Higham Higher Precision Arithmetic 9 33
Extended and Mixed Precision BLAS (2002)
Part of new standard developed by BLAS TechnicalForumExtra precision used internally input and outputarguments remain single or doubleExtra precision is anything at least 15 times asaccurate as double precision and wider than 80 bitsCounterparts of selected level 1 2 and 3 BLASroutines Extra input argument specifies precision ofinternal computationsReference implementation uses double-double formatgiving extra precision of about 106 bits
University of Manchester Nick Higham Higher Precision Arithmetic 10 33
Need for Higher Precision
Bailey Simon Barton amp Fouts Floating PointArithmetic in Future Supercomputers Internat JSupercomputer Appl 3 86ndash90 1989Bailey Barrio amp Borwein High-Precision ComputationMathematical Physics and Dynamics Appl MathComput 218 10106ndash10121 2012
Long-time simulationsResolving small-scale phenomenaLarge-scale simulations
University of Manchester Nick Higham Higher Precision Arithmetic 11 33
Going to Higher Precision
If we have quadruple or higher precision what do we needto do to modify existing algorithms
To what extent are existing algs precision-independent
University of Manchester Nick Higham Higher Precision Arithmetic 13 33
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
RGB to XYZ
From CIE Standard (1931)XYZ
=
049 031 020017697 081240 001063
0 001 099
RGB
But in many booksX
YZ
=
049000 031000 020000017697 081240 001063
0 001000 099000
RGB
University of Manchester Nick Higham Higher Precision Arithmetic 6 33
IEEE Standard 754-1985
Binary β = 2
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
Arithmetic ops (+minus lowast radic) performed as if firstcalculated to infinite precision then roundedDefault round to nearest round to even in case of tie
University of Manchester Nick Higham Higher Precision Arithmetic 7 33
Kahan on Higher Precision
For now the 10-byte Extended format is atolerable compromise between
the value of extra-precise arithmetic and the price ofimplementing it to run fast
very soon two more bytes of precision will become tolerableand ultimately a 16-byte format
That kind of gradual evolution towards wider precision
was already in view whenIEEE Standard 754 for Floating-Point Arithmetic was framed
mdash Computer Benchmarks Versus Accuracy (1994)
University of Manchester Nick Higham Higher Precision Arithmetic 8 33
IEEE Standard 754-2008
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
quadruple 128 bits 10plusmn4932 2minus113 asymp 96times 10minus35
University of Manchester Nick Higham Higher Precision Arithmetic 9 33
Extended and Mixed Precision BLAS (2002)
Part of new standard developed by BLAS TechnicalForumExtra precision used internally input and outputarguments remain single or doubleExtra precision is anything at least 15 times asaccurate as double precision and wider than 80 bitsCounterparts of selected level 1 2 and 3 BLASroutines Extra input argument specifies precision ofinternal computationsReference implementation uses double-double formatgiving extra precision of about 106 bits
University of Manchester Nick Higham Higher Precision Arithmetic 10 33
Need for Higher Precision
Bailey Simon Barton amp Fouts Floating PointArithmetic in Future Supercomputers Internat JSupercomputer Appl 3 86ndash90 1989Bailey Barrio amp Borwein High-Precision ComputationMathematical Physics and Dynamics Appl MathComput 218 10106ndash10121 2012
Long-time simulationsResolving small-scale phenomenaLarge-scale simulations
University of Manchester Nick Higham Higher Precision Arithmetic 11 33
Going to Higher Precision
If we have quadruple or higher precision what do we needto do to modify existing algorithms
To what extent are existing algs precision-independent
University of Manchester Nick Higham Higher Precision Arithmetic 13 33
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
IEEE Standard 754-1985
Binary β = 2
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
Arithmetic ops (+minus lowast radic) performed as if firstcalculated to infinite precision then roundedDefault round to nearest round to even in case of tie
University of Manchester Nick Higham Higher Precision Arithmetic 7 33
Kahan on Higher Precision
For now the 10-byte Extended format is atolerable compromise between
the value of extra-precise arithmetic and the price ofimplementing it to run fast
very soon two more bytes of precision will become tolerableand ultimately a 16-byte format
That kind of gradual evolution towards wider precision
was already in view whenIEEE Standard 754 for Floating-Point Arithmetic was framed
mdash Computer Benchmarks Versus Accuracy (1994)
University of Manchester Nick Higham Higher Precision Arithmetic 8 33
IEEE Standard 754-2008
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
quadruple 128 bits 10plusmn4932 2minus113 asymp 96times 10minus35
University of Manchester Nick Higham Higher Precision Arithmetic 9 33
Extended and Mixed Precision BLAS (2002)
Part of new standard developed by BLAS TechnicalForumExtra precision used internally input and outputarguments remain single or doubleExtra precision is anything at least 15 times asaccurate as double precision and wider than 80 bitsCounterparts of selected level 1 2 and 3 BLASroutines Extra input argument specifies precision ofinternal computationsReference implementation uses double-double formatgiving extra precision of about 106 bits
University of Manchester Nick Higham Higher Precision Arithmetic 10 33
Need for Higher Precision
Bailey Simon Barton amp Fouts Floating PointArithmetic in Future Supercomputers Internat JSupercomputer Appl 3 86ndash90 1989Bailey Barrio amp Borwein High-Precision ComputationMathematical Physics and Dynamics Appl MathComput 218 10106ndash10121 2012
Long-time simulationsResolving small-scale phenomenaLarge-scale simulations
University of Manchester Nick Higham Higher Precision Arithmetic 11 33
Going to Higher Precision
If we have quadruple or higher precision what do we needto do to modify existing algorithms
To what extent are existing algs precision-independent
University of Manchester Nick Higham Higher Precision Arithmetic 13 33
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Kahan on Higher Precision
For now the 10-byte Extended format is atolerable compromise between
the value of extra-precise arithmetic and the price ofimplementing it to run fast
very soon two more bytes of precision will become tolerableand ultimately a 16-byte format
That kind of gradual evolution towards wider precision
was already in view whenIEEE Standard 754 for Floating-Point Arithmetic was framed
mdash Computer Benchmarks Versus Accuracy (1994)
University of Manchester Nick Higham Higher Precision Arithmetic 8 33
IEEE Standard 754-2008
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
quadruple 128 bits 10plusmn4932 2minus113 asymp 96times 10minus35
University of Manchester Nick Higham Higher Precision Arithmetic 9 33
Extended and Mixed Precision BLAS (2002)
Part of new standard developed by BLAS TechnicalForumExtra precision used internally input and outputarguments remain single or doubleExtra precision is anything at least 15 times asaccurate as double precision and wider than 80 bitsCounterparts of selected level 1 2 and 3 BLASroutines Extra input argument specifies precision ofinternal computationsReference implementation uses double-double formatgiving extra precision of about 106 bits
University of Manchester Nick Higham Higher Precision Arithmetic 10 33
Need for Higher Precision
Bailey Simon Barton amp Fouts Floating PointArithmetic in Future Supercomputers Internat JSupercomputer Appl 3 86ndash90 1989Bailey Barrio amp Borwein High-Precision ComputationMathematical Physics and Dynamics Appl MathComput 218 10106ndash10121 2012
Long-time simulationsResolving small-scale phenomenaLarge-scale simulations
University of Manchester Nick Higham Higher Precision Arithmetic 11 33
Going to Higher Precision
If we have quadruple or higher precision what do we needto do to modify existing algorithms
To what extent are existing algs precision-independent
University of Manchester Nick Higham Higher Precision Arithmetic 13 33
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
IEEE Standard 754-2008
Type Size Range u = 2minust
single 32 bits 10plusmn38 2minus24 asymp 60times 10minus8
double 64 bits 10plusmn308 2minus53 asymp 11times 10minus16
quadruple 128 bits 10plusmn4932 2minus113 asymp 96times 10minus35
University of Manchester Nick Higham Higher Precision Arithmetic 9 33
Extended and Mixed Precision BLAS (2002)
Part of new standard developed by BLAS TechnicalForumExtra precision used internally input and outputarguments remain single or doubleExtra precision is anything at least 15 times asaccurate as double precision and wider than 80 bitsCounterparts of selected level 1 2 and 3 BLASroutines Extra input argument specifies precision ofinternal computationsReference implementation uses double-double formatgiving extra precision of about 106 bits
University of Manchester Nick Higham Higher Precision Arithmetic 10 33
Need for Higher Precision
Bailey Simon Barton amp Fouts Floating PointArithmetic in Future Supercomputers Internat JSupercomputer Appl 3 86ndash90 1989Bailey Barrio amp Borwein High-Precision ComputationMathematical Physics and Dynamics Appl MathComput 218 10106ndash10121 2012
Long-time simulationsResolving small-scale phenomenaLarge-scale simulations
University of Manchester Nick Higham Higher Precision Arithmetic 11 33
Going to Higher Precision
If we have quadruple or higher precision what do we needto do to modify existing algorithms
To what extent are existing algs precision-independent
University of Manchester Nick Higham Higher Precision Arithmetic 13 33
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Extended and Mixed Precision BLAS (2002)
Part of new standard developed by BLAS TechnicalForumExtra precision used internally input and outputarguments remain single or doubleExtra precision is anything at least 15 times asaccurate as double precision and wider than 80 bitsCounterparts of selected level 1 2 and 3 BLASroutines Extra input argument specifies precision ofinternal computationsReference implementation uses double-double formatgiving extra precision of about 106 bits
University of Manchester Nick Higham Higher Precision Arithmetic 10 33
Need for Higher Precision
Bailey Simon Barton amp Fouts Floating PointArithmetic in Future Supercomputers Internat JSupercomputer Appl 3 86ndash90 1989Bailey Barrio amp Borwein High-Precision ComputationMathematical Physics and Dynamics Appl MathComput 218 10106ndash10121 2012
Long-time simulationsResolving small-scale phenomenaLarge-scale simulations
University of Manchester Nick Higham Higher Precision Arithmetic 11 33
Going to Higher Precision
If we have quadruple or higher precision what do we needto do to modify existing algorithms
To what extent are existing algs precision-independent
University of Manchester Nick Higham Higher Precision Arithmetic 13 33
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Need for Higher Precision
Bailey Simon Barton amp Fouts Floating PointArithmetic in Future Supercomputers Internat JSupercomputer Appl 3 86ndash90 1989Bailey Barrio amp Borwein High-Precision ComputationMathematical Physics and Dynamics Appl MathComput 218 10106ndash10121 2012
Long-time simulationsResolving small-scale phenomenaLarge-scale simulations
University of Manchester Nick Higham Higher Precision Arithmetic 11 33
Going to Higher Precision
If we have quadruple or higher precision what do we needto do to modify existing algorithms
To what extent are existing algs precision-independent
University of Manchester Nick Higham Higher Precision Arithmetic 13 33
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Going to Higher Precision
If we have quadruple or higher precision what do we needto do to modify existing algorithms
To what extent are existing algs precision-independent
University of Manchester Nick Higham Higher Precision Arithmetic 13 33
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Direct Matrix Factorizations
Threshold pivoting for sparse matrices may depend onprecisionCondition estimation to detect nearly singular matrices
gtgt A = gallery(rsquolotkinrsquo14) x = Arandn(n1)Warning Matrix is close to singular or badly scaledResults may be inaccurate RCOND = 2761711e-18
Iterative refinementEquilibration
But all ldquorely only on LAPACK DLAMCHrdquo
University of Manchester Nick Higham Higher Precision Arithmetic 14 33
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Matrix Functions
(Inverse) scaling and squaring-type algorithms for eAlog(A) cos(A) At use Padeacute approximants
Padeacute degree chosen to achieve accuracy uPadeacute coeffs and algorithm parameters need rederivingfor a different u Logic may changeexpm logm need changing for smaller u
Methods based on best Linfin approximations to eA forHermitian A also need higher order approximationsderiving
Scalar elementary functions
University of Manchester Nick Higham Higher Precision Arithmetic 15 33
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Iterative Algorithms
QR-based Dynamically Weighted Halley IterationNakatsukasa Bai amp Gygi (2010)X0 = Aα isin Cmtimesn (m ge n)[radic
ck Xk
I
]=
[Q1
Q2
]R
Xk+1 =bk
ckXk +
1radicck
(ak minus
bk
ck
)Q1Qlowast2
Cubically convergent with limkrarrinfin Xk = U whereA = UH is a polar decompositionForms basis of new spectral divide amp conquer algs forsymm eirsquoproblem and SVD (H amp Nakatsukasa 2012)
University of Manchester Nick Higham Higher Precision Arithmetic 16 33
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Maximum Number of Iterations
If κ2(A) le uminus1
u 10minus16 10minus32 10minus64 10minus128
Scaled Newton 9 11 13 15QDWH 6 7 8 9
Increasing the precision makes little difference to thecost in flops
University of Manchester Nick Higham Higher Precision Arithmetic 17 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Derivative Approximations
Freacutechet derivative of f Cntimesn rarr Cntimesn
Lf (AE) asymp f (A + hE)minus f (A)h
forward difference
with error O(h) and hopt =(
uf (A)E2
)12
If f Rntimesn rarr Rntimesn and AE isin Rntimesn (Al-Mohy amp H 2010)
Lf (AE) asymp Imf (A + ihE)
hcomplex step
has error O(h2) and can take h arbitrarily small eg 10minus100
University of Manchester Nick Higham Higher Precision Arithmetic 18 33
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Mixed Precision
Even if we have higher precision we might not need to useit everywhere
Iterative refinement for Ax = bIterative refinement for ersquovalue problemsPolynomial root finding (Boyd this workshop)Petschow Quintana-Ortiacute amp Bientinesi (2012) usingquad precision to improve orthogonality of doubleprecision MRRR
University of Manchester Nick Higham Higher Precision Arithmetic 19 33
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Chebfun
How would Chebfun need to change with higher precisionarithmetic
gtgt help chebfunprefchebfunpref Settings for Chebfun
eps - Relative tolerance used in construction andsubsequent operationsFactory value is 2^-52 (Matlabrsquos factoryvalue of machine epsilon)
University of Manchester Nick Higham Higher Precision Arithmetic 20 33
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Chebfun
function pass = complexrotation This code makes sure a few things are ok if you make them complex eg integration norms inner products and singular values LNT 20 May 2008
f = chebfun((x) exp(x))fi = chebfun((x) 1iexp(x))g = chebfun(rsquo1(2-x)rsquo)gi = chebfun(rsquo1i(2-x)rsquo)A = [f g]
pass(1) = (sum(fi)==1isum(f))pass(2) = norm(fi)==norm(f)pass(3) = abs((frsquog)-((1if)rsquo(1ig))) lt 1e-15pass(4) = (norm(giinf)-norm(ginf)) lt 1e-15pass(5) = (norm(fi1)-norm(f1)) lt 1e-15pass(6) = norm(svd(A) - svd(1iA)) lt 1e-15
University of Manchester Nick Higham Higher Precision Arithmetic 21 33
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Unstable Algorithms in Higher Precision
For unstable alg where error bounds available increase theprecision accordingly
University of Manchester Nick Higham Higher Precision Arithmetic 22 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
RandomizationA isin Cntimesn with A = 1Davies (2007) ldquoapproximate diagonalizationrdquo
tol = 1e-8[VD] = eig(A + tolrandn(n))F = Vdiag(feval(fundiag(D)))V
Manchester usage in VPA
digits(d)tol = 10^(-d2)[VD] = eig(vpa(A) + tolvpa( rand(n) ))F = Vdiag(feval(fundiag(D)))V
Daviesrsquo analysis suggests rel err asymp 10minusd2
H amp Relton (in progress)
University of Manchester Nick Higham Higher Precision Arithmetic 23 33
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Tiny Relative Errors
Normwise relative errors
θinfin(x y) =x minus yinfinxinfin
from a numerical experiment
132e-22 339e-22 339e-21 867e-20139e-18 436e-18 530e-18 583e-18145e-17 376e-17 376e-17 427e-17
What precision was used
University of Manchester Nick Higham Higher Precision Arithmetic 24 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Base β = 2 u = 2minust Dingle amp H (2011)
Theorem
If x 6= 0 and y are distinct normalized flpt numbers then|x minus y ||x | ge u and this lower bound is attainable
Theorem
Let 0 6= x y isin Rn be vectors of normalized flpt numbers Ifθinfin(x y) lt u then xk = yk for all k st |xk | = xinfin
x =
[1
10minus22
] y =
[1
2times 10minus22
] θinfin(x y) = 10minus22
Tiny relative errors can corrupt performance profilesSee cure in Dingle amp H (2011)
University of Manchester Nick Higham Higher Precision Arithmetic 25 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Increasing the Precision
y = eπradic
163 evaluated at t digit precision
t y20 2625374126407687440025 262537412640768744000000030 262537412640768743999999999999
Is the last digit before the decimal point 4
t y35 2625374126407687439999999999992500740 2625374126407687439999999999992500725972
So no itrsquos 3
University of Manchester Nick Higham Higher Precision Arithmetic 26 33
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Another Example
Consider the evaluation in precision u = 2minust of
y = x + a sin(bx) x = 17 a = 10minus8 b = 224
10 15 20 25 30 35 4010
minus14
10minus13
10minus12
10minus11
10minus10
10minus9
10minus8
10minus7
10minus6
10minus5
10minus4
t
error
University of Manchester Nick Higham Higher Precision Arithmetic 27 33
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
MythIncreasing the precision at which a computation isperformed increases the accuracy of the answer
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Rounding Errors in Digital Imaging
An 8-bit RGB image is an m times n times 3 array of integersaijk isin 01 255Every editing op (levels curves colour balance )aijk larr round(fijk(aijk)) incurs a rounding error
Should we edit in 16-bit
University of Manchester Nick Higham Higher Precision Arithmetic 29 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
16-bit vs 8-bit editing
SLR cameras generate image at 12ndash14 bits internally
8-bit 0 1 25516-bit 0 39times 10minus3 255
Controversial Margulis says using 16 bits makes nopractical difference in quality
Relevant metric is not normwise relative error
University of Manchester Nick Higham Higher Precision Arithmetic 30 33
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
High Precision in MATLAB
VPA arithmetic in Symbolic Math ToolboxAdvanpix Multiprecision Computing Toolbox forMATLAB
University of Manchester Nick Higham Higher Precision Arithmetic 31 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
gtgt which -all eigbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleeig) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleeig) double methodCMATLAB2012btoolboxsymbolicsymbolicsymeigm sym methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedeigm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayeigm gpuArray methoddmatlabchebfunlinopeigm linop methoddmatlabchebfunchebopeigm chebop method
gtgt which -all qrbuilt-in (CMATLAB2012btoolboxmatlabmatfunsingleqr) single methodbuilt-in (CMATLAB2012btoolboxmatlabmatfundoubleqr) double methoddmatlabMultiprecision_Computing_Toolboxmpm mp methodCMATLAB2012btoolboxdistcompparallelcodistributedqrm codistributed methodCMATLAB2012btoolboxdistcompgpugpuArrayqrm gpuArray methoddmatlabchebfunchebfunqrm chebfun method
gtgt which -all toeplitzCMATLAB2012btoolboxmatlabelmattoeplitzm
University of Manchester Nick Higham Higher Precision Arithmetic 32 33
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
Conclusions amp Future Directions
Need to prepare for quad precision in hardwareArbitrary precision arithmetic in software increasinglyuseful Are we fully exploiting it
High precision veryuseful for computingldquoexact answersrdquo fortesting algs
NLA group
University of Manchester Nick Higham Higher Precision Arithmetic 33 33
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
References I
Multiprecision Computing Toolbox for MATLABAdvanpix LLC Tokyo|httpwwwadvanpixcom|
A H Al-Mohy and N J HighamThe complex step approximation to the Freacutechetderivative of a matrix functionNumer Algorithms 53(1)133ndash148 2010
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IInt J High Performance Computing Applications 16(1)1ndash111 2002
University of Manchester Nick Higham Higher Precision Arithmetic 1 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
References II
Basic Linear Algebra Subprograms Technical (BLAST)Forum Standard IIInt J High Performance Computing Applications 16(2)115ndash199 2002
E B DaviesApproximate diagonalizationSIAM J Matrix Anal Appl 29(4)1051ndash1064 2007
University of Manchester Nick Higham Higher Precision Arithmetic 2 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
References III
N J Dingle and N J HighamReducing the influence of tiny normwise relative errorson performance profilesMIMS EPrint 201190 Manchester Institute forMathematical Sciences The University of ManchesterUK Nov 201111 pp
W KahanComputer benchmarks versus accuracyDraft manuscript June 1994
University of Manchester Nick Higham Higher Precision Arithmetic 3 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4
References IV
X S Li J W Demmel D H Bailey G Henry Y HidaJ Iskandar W Kahan S Y Kang A Kapur M CMartin B J Thompson T Tung and D J YooDesign implementation and testing of extended andmixed precision BLASACM Trans Math Software 28(2)152ndash205 June 2002
D MargulisPhotoshop LAB Color The Canyon Conundrum andOther Adventures in the Most Powerful ColorspacePeachpit Press Berkeley CA USA 2006ISBN 0-321-35678-0xviii+366 pp
University of Manchester Nick Higham Higher Precision Arithmetic 4 4