Arénaire Major results 1998-2002 and future prospects Common project CNRS / ENS Lyon / INRIA LIP Laboratory (UMR CNRS-ENSL-INRIA N° 5668) Research area:

Arénaire

Major results 1998-2002 and future prospects

Common project CNRS / ENS Lyon / INRIA LIP Laboratory (UMR CNRS-ENSL-INRIA N° 5668)

Research area: Computer arithmetic

Resp. J.M. Muller

Computer arithmetic

• Arithmetic algorithms (+, −, ×, ÷, √, sin/cos, exp, log, etc.);
• Number systems;
• Software and hardware implementations;
• Accuracy;
• Reliability, validation.

[Diagram: Computer Arithmetic]

Building basic operators and libraries
• Hardware operators: +, −, ×, ÷; elementary functions; dedicated operators
• Software libraries: elementary functions; multiple precision; exact computations; processor-specific libraries

Cunningly using basic elements
• Table Maker’s Dilemma: worst cases for double precision (DP); towards larger precision
• Elementary function algorithms: table-based methods; shift-and-add algorithms; polynomial approximations
• Proofs & validation: prove arithmetic operators; prove algorithms and properties
• Floating-point expertise: rounding errors; interval arithmetic; exception handling


Related fields (around the Computer Arithmetic diagram):
• Computer architecture
• Microelectronics
• CAD tools
• Computer algebra
• Global optimization
• Numerical algorithms
• Number theory
• Formal proofs
Current project members

« Permanent » researchers
• Marc Daumas, CR CNRS
• Florent de Dinechin, MdC ENSL
• Jean-Michel Muller, DR CNRS
• Arnaud Tisserand, CR INRIA
• Claude-Pierre Jeannerod, CR INRIA
• Gilles Villard, CR CNRS

[Timeline axis: 1998, 1999, 2000, 2001, 2002]

CR, DR = pure research positions
MdC = Maître de Conférences (≈ associate professor)

Temporary positions
• Jean-Luc Beuchat, postdoc
• Nathalie Revol, MdC Lille 1, on « délégation » as CR INRIA

[Timeline marks: 2000, 2001]

PhD students
• David Defour
• Sylvie Boldo
• Nicolas Boullis
• Pascal Giorgi

[Timeline marks: 2000, 2001]

Arénaire in Oct. 1998

                        Various  INRIA  CNRS  ENS Lyon  Total
DR / Professors            -       -      -      -        0
CR / Ass. Professors       -       -      2      1        3
PhD students               -       -      -      2        2
Total                      0       0      2      3        5

Guests (> 1 month): 0

Arénaire now

                        Various  INRIA               CNRS  ENS Lyon  Total
DR / Professors            -     -                     1      -        1
CR / Ass. Professors       -     2 (+1 délégation)     2      1        6
PhD students               -     -                     -      4        4
Total                      0     3                     3      5       11

PostDocs: 1 (total 1)
Guests (> 1 month): 1 (total 1)

[Two bar charts over 1999, 2000, 2001, 2002: « Permanent researchers », annotated « 2 (+1 ?) », and « All members », annotated « 2.4 »]

Past members
• Anne Mignotte: full professor position at INSA Lyon;
• Vincent Lefèvre: CR INRIA in the SPACES project;
• Claire Finot-Moreau: R&D department of Michelin;
• Philippe Langlois: associate professor at Perpignan University (got his HDR and has just been « qualified » for a full professor position).

Publications

                          1998  1999  2000  2001
PhD dissertations           0     0     1     1
H.D.R. (*)                  0     1     0     2
Articles in journals        4     5     7     9
Articles in conf. proc.     9     6    14    14
Research reports            3    10     9    14

Grants reports: 1, 1

(*) Gilles Villard and Nathalie Revol will defend their HDR in 2002.

The number of PhD dissertations reflects our supervision abilities in 1998.

Presentation of some results

• The Table Makers’ dilemma
• Table-based methods
• Formal proofs for computer arithmetic
• Multiple-precision interval arithmetic





Rounding modes (IEEE 754)

The result of an arithmetic operation must, in general, be rounded.

« Correct rounding »: deterministic choice between 4 modes
• To the nearest (even)
• Towards +∞
• Towards −∞
• Towards zero

The system must behave as if the result was first computed with « infinite » precision, then rounded.
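The four modes can be illustrated with a small sketch. It is only an illustration under stated assumptions: it uses Python's `decimal` module, so the radix is 10 rather than the radix 2 of IEEE 754 binary formats, and the 3-digit precision and the names `MODES` and `rounded_quotients` are hypothetical choices made for this example.

```python
from decimal import (Context, Decimal, ROUND_HALF_EVEN,
                     ROUND_CEILING, ROUND_FLOOR, ROUND_DOWN)

# Illustrative mapping from the four modes to decimal rounding constants.
MODES = {
    "nearest (even)": ROUND_HALF_EVEN,
    "towards +inf": ROUND_CEILING,
    "towards -inf": ROUND_FLOOR,
    "towards zero": ROUND_DOWN,
}

def rounded_quotients(a: int, b: int, digits: int = 3) -> dict:
    """Correctly rounded a / b in each mode, with `digits` significant digits."""
    return {
        name: Context(prec=digits, rounding=mode).divide(Decimal(a), Decimal(b))
        for name, mode in MODES.items()
    }

for name, value in rounded_quotients(-2, 3).items():
    print(f"{name:15s} {value}")
```

Each result is the exact quotient rounded once, which is the « infinite precision, then rounded » behaviour described above.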

Properties of correct rounding
• Predictability: build algorithms and proofs;
• Improves portability of numerical software;
• Interval arithmetic easier to implement.

Elementary functions: seems difficult. No elementary function specification in IEEE 754.

Shame on them

System                               sin(10^22)
exact                                -0.852200849767…
HP 48 GX                             -0.852200849767
HP 700                               0.0
IBM 3090/600S-VF AIX 370             0.0
Matlab V.4.2c.1 for SPARC            -0.8522
Matlab V.4.2c.1 Macintosh            0.8740
Silicon Graphics Indy                0.87402806
Sharp EL5806                         -0.090748172
DEC Station 3100                     NaN
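The reference value in the table can be reproduced with a short sketch. It is not the method used by any of the systems listed: it assumes Python's `decimal` module with 80 working digits, computes π from Machin's formula, reduces the argument modulo 2π at that precision, and sums the Taylor series of sin. The helper names are hypothetical.

```python
from decimal import Decimal, getcontext

getcontext().prec = 80  # working precision, in decimal digits

def arctan_inverse(n: int) -> Decimal:
    """arctan(1/n) for an integer n > 1, summed from its Taylor series."""
    x = Decimal(1) / n
    x_squared = x * x
    power, total, k = x, x, 1
    tiny = Decimal(10) ** -getcontext().prec
    while abs(power) > tiny:
        power = -power * x_squared      # next odd power, with alternating sign
        k += 2
        total += power / k
    return total

def pi_decimal() -> Decimal:
    """pi from Machin's formula: pi/4 = 4 arctan(1/5) - arctan(1/239)."""
    return 4 * (4 * arctan_inverse(5) - arctan_inverse(239))

def sin_decimal(x: Decimal) -> Decimal:
    """sin(x) via reduction modulo 2*pi, then the Taylor series of sin."""
    r = x % (2 * pi_decimal())          # the reduction costs about 22 digits here
    term, total, k = r, r, 1
    tiny = Decimal(10) ** -(getcontext().prec - 5)
    while abs(term) > tiny:
        term = -term * r * r / ((k + 1) * (k + 2))
        k += 2
        total += term
    return total

s = sin_decimal(Decimal(10) ** 22)
print(s)  # leading digits agree with the « exact » row of the table
```

The point of the example is the argument reduction: 10^22 is about 1.6 × 10^21 periods away from the origin, so π must be known to far more than double precision before the reduced argument carries any correct digit.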

Table Makers’ Dilemma

Radix 2, FP format with n-bit mantissas.

[Figure: the correct value lies somewhere in the red area. Two cases are shown, directed roundings and rounding to nearest. In each, an n-bit mantissa is followed by a run of m bits, as in x.xxxxxxxxxxx 0000000000000 1zzzzzz]

Largest value of m?

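Nesterenko-Waldschmidt, 95

For p/q with q > 0 and gcd(p, q) = 1, let H(p/q) = max(|p|, q).

Let $\alpha, \beta \in \mathbb{Q}$, and let A, B and E be real numbers such that

$$A \ge \max(H(\alpha), e), \qquad B \ge H(\beta), \qquad E \ge e.$$

We have:

$$\left| e^{\beta} - \alpha \right| \ge \exp\Big( -211 \,\big(\log B + \log\log A + 2\log(E \max(1, |\beta|)) + 10\big)\,\big(\log A + 2E|\beta| + 6\log E\big)\,\big(3.3\log 2 + \log E\big)\,(\log E)^{-2} \Big)$$

Gives $m < 10^6$ for double-precision $e^x$ and $\log_e x$.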

Towards exhaustive testing
• Filtering quickly eliminates most cases ($2^{53}$ for each function and each input exponent);
• Splitting into very small domains → linear approximations;
• Is the distance between a (bounded) grid and a segment of straight line < ε?
• Variant of Euclid’s GCD algorithm;
• Massive parallelism.

A typical « worst case »

log_e(1.0110001010101000100001100001001101100010100110110110 × 2^678)
= 111010110.01000111100111101011101001111100100101110001 000…000 11…
  (the run 000…000 contains 65 zeros)

This is the worst case for logarithms of double-precision numbers.
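The bit pattern above can be checked numerically. The following is a minimal sketch, assuming Python's `decimal` module and 300 working digits (more than enough to hold the input exactly); it is unrelated to the exhaustive search that found the worst case. The mantissa and exponent are copied from the slide; the constant names are hypothetical.

```python
from decimal import Decimal, getcontext

# 300 significant digits: x needs about 205 digits to be held exactly,
# and ln(x) is then known to several hundred bits.
getcontext().prec = 300

# Input mantissa (after the leading 1.) and exponent, copied from the slide.
FRACTION_BITS = "0110001010101000100001100001001101100010100110110110"
EXPONENT = 678

mantissa = int("1" + FRACTION_BITS, 2)            # 53-bit integer
x = Decimal(mantissa << (EXPONENT - 52))          # exact conversion from int
y = x.ln()                                        # ln(x) to 300 digits

# ln(x) is about 470, so it has 9 integer bits; keeping 44 fraction bits
# gives its 53-bit mantissa. EXTRA more bits are kept to inspect the tail.
EXTRA = 67
bits = bin(int(y * (1 << (44 + EXTRA))))[2:]

result_mantissa = bits[:53]
tail = bits[53:]
zero_run = len(tail) - len(tail.lstrip("0"))

print(result_mantissa)
print("zeros after the 53rd bit:", zero_run)
```

A run of zeros this long right after the mantissa means that an approximation of ln(x) must be accurate to well over 100 bits before a directed rounding can be decided.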

Consequence

Let y = ln(x), with x a double-precision number. Let y* be an approximation to y such that the mantissa distance between y and y* is $< 2^{-117}$. Then, for any of the 4 rounding modes, rounding y* is equivalent to rounding y.

Similar results for $e^x$, $\log_2(x)$, $2^x$, sin, cos, tan, arctan, sinh, cosh… (on the full domain or on a bounded domain, depending on the function).

What do we do with these results?
• Most of them published: http://www.ens-lyon.fr/~jmmuller/TMD.html
• We are writing an elementary function library (exp and $2^x$ already running « experimentally »)
• Inria « ODL » (help for developing software) + trainees + …





Table-based methods

Implementing a function f using tables. Assume n-bit input and output precision (fixed-point), with n rather small.

Naive method
• Table: all possible values f(x), with an n-bit input x and an n-bit output f(x)
• Size of table: $n \cdot 2^n$
• OK until n ≈ 10 (depends on technology & trade-off size vs. access time)

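Increase the precision
• Interpolation (requires multipliers)
• Basic « bipartite » method, n = 3k

$$x = x_1 + x_2 2^{-k} + x_3 2^{-2k}$$

where $x_1$, $x_2$ and $x_3$ are k-bit words (n = 3k bits in total).

$$f(x) = f(x_1 + x_2 2^{-k}) + x_3 2^{-2k} f'(x_1 + x_2 2^{-k}) + \dots$$

The first term is tabulated as $A(x_1, x_2)$; the second is approximated by a table $B(x_1, x_3)$.

Total table size: $n \cdot 2^n \rightarrow 2n \cdot 2^{2n/3}$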

Some « tuning » depending on function

10 bits 13 bits

12 bits 16 bits
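The bipartite decomposition can be sketched in a few lines. This is only an illustration under assumptions that are not on the slide: the function is sin on [0, 1), k = 4 (so n = 12), the slope in table B is taken at $x_1$, and table entries are kept as unrounded floats instead of n-bit words. The names `build_tables`, `evaluate` and `max_error` are hypothetical.

```python
import math

def build_tables(f, df, k):
    """Tables A(x1, x2) and B(x1, x3) for n = 3k input bits on [0, 1)."""
    size = 1 << k
    # A holds f at the point given by the 2k leading bits of x.
    table_a = [[f(i1 / size + i2 / size**2) for i2 in range(size)]
               for i1 in range(size)]
    # B holds the first-order correction, with the slope taken at x1 only.
    table_b = [[(i3 / size**3) * df(i1 / size) for i3 in range(size)]
               for i1 in range(size)]
    return table_a, table_b

def evaluate(table_a, table_b, k, index):
    """Approximate f(index / 2^(3k)) with two lookups and one addition."""
    mask = (1 << k) - 1
    i1 = index >> (2 * k)
    i2 = (index >> k) & mask
    i3 = index & mask
    return table_a[i1][i2] + table_b[i1][i3]

def max_error(f, df, k):
    """Largest absolute error over all 2^(3k) inputs."""
    table_a, table_b = build_tables(f, df, k)
    n = 3 * k
    return max(abs(evaluate(table_a, table_b, k, i) - f(i / 2**n))
               for i in range(2**n))

K = 4  # n = 12 input bits
print(max_error(math.sin, math.cos, K), 2.0 ** -(3 * K))
```

Each table has $2^{2k}$ entries instead of the $2^{3k}$ entries of the naive table, and no multiplier is needed at evaluation time. In this setting the error stays below roughly $2^{-3k}$ because the slope error and the neglected second-order term are both bounded by the second derivative of f.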

Improvements?
• Cut into more than 3 parts
• Be more function specific
• « Multipartite » methods
• Algorithm for getting best trade-offs
• Fast FPGA implementations
• Values up to n ≈ 24 achievable with current technology

Results on 16-bit operands

f (domain)     m  α   β  γ_i      β_i      tables                                         size   size [3]
sin [0, π/4]   1  10  6  5        6        17·2^10 + 7·2^10                               24576  32768
               2  8   8  7,4      3,5      19·2^8 + 10·2^9 + 8·2^8                        12032  20480
               3  8   8  7,6,4    2,3,3    18·2^8 + 9·2^8 + 7·2^8 + 4·2^6                 8960   17920
               4  8   8  7,6,4,4  2,2,2,2  19·2^8 + 10·2^8 + 8·2^7 + 6·2^5 + 4·2^5        8768   na
2^x [0, 1]     1  10  6  5        6        16·2^10 + 6·2^10                               22528  24576
               2  8   8  7,4      3,5      17·2^8 + 9·2^9 + 6·2^8                         10496  14592
               3  8   8  7,6,4    2,2,4    17·2^8 + 9·2^8 + 7·2^7 + 5·2^7                 8192   13468
               4  8   8  7,6,5,4  2,2,2,2  18·2^8 + 10·2^8 + 8·2^7 + 6·2^6 + 4·2^5        8704   na
1/x [1, 2]     2  9   6  7,6      3,3      18·2^9 + 9·2^9 + 6·2^8                         15360  16896
               3  9   6  8,7,5    2,2,2    17·2^9 + 8·2^9 + 6·2^8 + 4·2^6                 14592  15872

Other results on table-based methods
• Division, sqrt and elementary function evaluation using small rectangular multipliers;
• Acceleration of Goldschmidt division/square-root iterations on a pipelined multiplier.





Formal proofs and computer arithmetic
• Prove arithmetic operations (e.g., divide and square-root);
• Considering the operations do satisfy the specifications (of the IEEE 754 FP Standard, for instance), build and prove algorithms that use these specifications;
• Collaboration with INRIA Lemme and Spaces projects: ARC « Arithmétique des Ordinateurs Certifiée » (AOC).

Is it useful?
• Maple 6, all systems: enter 21474836480, you get 0;
• Some Cray computers: multiplying by 1 may lead to overflow;
• Pentium bug: 3 significant digits only for some divisions. 8391667/12582905 returned 0.666869455… (the exact quotient is 0.666910…);
• USS Yorktown (1998): a member of the crew erroneously entered a 0 as input data. Zero divide → sequence of errors that ultimately stopped the propulsion system (Scientific American, Nov. 1998).

Critical applications:
• You cannot assume that x ⊕ y is x + y;
• Assuming it is (x + y)(1 + ε) does not suffice.

Examples (actually used)
• Sterbenz theorem: if x/2 ≤ y ≤ 2x then x − y is computed exactly in IEEE 754 arithmetic.
• Fast2sum(x, y) (with |y| ≤ |x|, and base 2):
    s ← x + y
    v ← s − x
    t ← y − v
  where each right-hand side is the computed (rounded) value, gives s + t = x + y, with |t| ≤ lastbit(s).

Hence the need for formal proofs that use knowledge of floating-point arithmetic: ARC « Arithmétique des Ordinateurs Certifiée ».
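Both properties can be observed experimentally. The sketch below assumes that Python floats are IEEE 754 binary64 numbers with rounding to nearest, which holds on mainstream platforms but is not guaranteed by the language; exact rational arithmetic from `fractions` serves as the reference. The function names are hypothetical, and a test run is of course no substitute for the formal proofs discussed here.

```python
from fractions import Fraction

def fast2sum(x: float, y: float):
    """Return (s, t) with s = fl(x + y) and t the rounding error of that sum.

    Requires |y| <= |x| (radix 2, no overflow).
    """
    if abs(y) > abs(x):
        raise ValueError("Fast2Sum requires |y| <= |x|")
    s = x + y      # rounded sum
    v = s - x      # the part of y that actually went into s
    t = y - v      # the part of y that was lost
    return s, t

def fast2sum_is_exact(x: float, y: float) -> bool:
    """Check s + t == x + y in exact rational arithmetic."""
    s, t = fast2sum(x, y)
    return Fraction(s) + Fraction(t) == Fraction(x) + Fraction(y)

def sterbenz_is_exact(x: float, y: float) -> bool:
    """Check that the floating-point x - y equals the exact difference."""
    return Fraction(x - y) == Fraction(x) - Fraction(y)

print(fast2sum(1.0, 1e-16))          # the sum rounds to 1.0, t recovers 1e-16
print(fast2sum_is_exact(1.0, 1e-16))
print(sterbenz_is_exact(1.5, 1.0))   # 1.5/2 <= 1.0 <= 2 * 1.5
```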

Are we the only ones?
• Formal proofs of the AMD K5 square-root and K7 multiplication, division & square-root algorithms (David Russinoff);
• Formal proof of Newton-Raphson based division and square-root algorithms suggested for the Intel/HP IA-64 instruction set (Cornea-Hasegan, Golliver, Markstein);
• Formal proofs of some elementary function algorithms: e.g., $2^x - 1$ for IA-64 (John Harrison, Intel).

Main result: formal specification of FP representation
• Generalizes IEEE 754;
• Set of proofs (using the Coq tool, with the Lemme Inria project);
• Validated algorithms: multiple-precision library (« expansions »), polynomial evaluation;
• Some surprises with Sterbenz and Fast2sum (they work with much weaker conditions).

Some results
• Accurate polynomial evaluation: under some conditions, Horner’s rule returns one of the FP numbers surrounding the exact result.





Arbitrary precision interval arithmetic

Principle: every number is replaced by an interval containing it.
• Validated computations: the result belongs to the computed interval;
• Global computations: enclose f(I)
  - Brouwer’s theorem: proof of the existence of a fixed point
  - Hansen’s algorithm for global optimization.

Unfortunately: overestimation of the results.
Solution: use arbitrary precision.
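The principle and the overestimation it causes can be seen on a toy example. This is a minimal sketch, not MPFI: it assumes double-precision endpoints and obtains outward rounding by moving each computed bound one floating-point number outwards with `math.nextafter` (Python 3.9 or later). The `Interval` class and its methods are hypothetical names.

```python
import math
from dataclasses import dataclass

def _down(v: float) -> float:
    """Next float below v: a safe lower bound for a rounded result."""
    return math.nextafter(v, -math.inf)

def _up(v: float) -> float:
    """Next float above v: a safe upper bound for a rounded result."""
    return math.nextafter(v, math.inf)

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(_down(self.lo + other.lo), _up(self.hi + other.hi))

    def __sub__(self, other):
        return Interval(_down(self.lo - other.hi), _up(self.hi - other.lo))

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(_down(min(products)), _up(max(products)))

    def width(self) -> float:
        return self.hi - self.lo

X = Interval(0.0, 1.0)
ONE = Interval(1.0, 1.0)

print(X - X)          # encloses [-1, 1], although x - x is always 0
print(X * (ONE - X))  # encloses [0, 1], although x(1 - x) stays in [0, 0.25]
```

Both results are valid enclosures, but they are much wider than the true ranges because each occurrence of X is treated as an independent variable.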

Goal: global optimization

Newton iteration: cornerstone for global optimization.

Interval Newton algorithm: needs to be adapted to arbitrary precision.

Interval Newton algorithm using arbitrary precision

Moving from double precision to multiple precision requires:
• a new stopping criterion;
• automatic adaptation of the precision;
• a termination proof for the new algorithm.

Avoids restarting the whole computation.
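For readers unfamiliar with the interval Newton iteration, here is a minimal sketch of the standard step X ← X ∩ (m − f(m)/f'(X)), with m the midpoint of X. It is not the project's algorithm: it is specialised to f(x) = x² − 2 on [1, 2], uses exact rationals from `fractions` as a stand-in for arbitrary precision, and has a fixed number of steps instead of the stopping criterion and precision adaptation mentioned above. The function names are hypothetical.

```python
from fractions import Fraction

def newton_step(lo: Fraction, hi: Fraction):
    """One interval Newton step for f(x) = x^2 - 2 on [lo, hi], with 0 < lo <= hi."""
    m = (lo + hi) / 2
    fm = m * m - 2
    # f'(X) = 2X = [2 lo, 2 hi]; it excludes 0 because lo > 0,
    # so f(m) / f'(X) is the interval between these two quotients.
    q1, q2 = fm / (2 * lo), fm / (2 * hi)
    n_lo, n_hi = m - max(q1, q2), m - min(q1, q2)
    # Intersect the Newton image with the current interval.
    return max(lo, n_lo), min(hi, n_hi)

def enclose_sqrt2(steps: int = 6):
    """Enclosure of sqrt(2) after a fixed number of interval Newton steps."""
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(steps):
        lo, hi = newton_step(lo, hi)
    return lo, hi

lo, hi = enclose_sqrt2()
print(float(lo), float(hi), float(hi - lo))
```

The size of the numerators and denominators roughly doubles at each step, which gives a feel for why the working precision has to grow along with the iteration.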

Software & hardware realizations

Software
• MPFI: multiple-precision interval arithmetic (with Spaces)
• Elementary functions for Bailey’s double-double library
• FPexpansions: Floating-Point Expansions (programs and proofs)
• Division by a constant: code generator for the ST100 DSP chip (included in the June 2001 release of the compiler)
• Linbox: generic C++ library for linear algebra (delivery: Aug 2002)

Hardware
• Generator of hardware operators based on multipartite methods (finds good splittings, fills the tables and generates VHDL code)
• Asynchronous multiplier/adder (in cooperation with ST Microelectronics)
• Dedicated arithmetic operators for image processors (with CSEM)

Collaborations

Industrial collaborations
• ST Microelectronics:
  - Euclidean division by a constant using a 16 × 16 + 40 → 40 MAC. Our algorithm runs on the ST100 (included in the June 2001 compiler release)
  - Asynchronous multiplier (700+ MHz)
• POSIC-SA: arithmetic of a position sensor and its FPGA implementation
• CSEM, Switzerland: arithmetic operators for vision circuits (FPGA and ASIC implementation)
• Aérospatiale: studies of problems related to switching from fixed-point precision to floating-point precision for on-board computers
• HP/Intel: donation of an Itanium-based machine
• Xilinx: donation of 2 Virtex FPGA chips (validation of multipliers by a constant & elementary function generators)

Academic international collaborations
• PICS grant: M.D. Ercegovac’s Digital Arithmetic and Reconfigurable Architecture Laboratory, UC Los Angeles; T. Lang, UC Irvine; D. Matula, Southern Methodist University. Publications, conference organisation.
• France-Berkeley fund: J. Shewchuk, UCB. 2 libraries: « pseudo-expansions » and double-double precision.
• Odense Univ., Denmark (P. Kornerup). Publications, conference & special issue organisation.
• NSF-CNRS Linbox project: Kaltofen (NCSU), Saunders (UDel), NSERC, Canada. Collaborative research, joint development of a library.
• ORCCA London & Waterloo (M. Giesbrecht, A. Storjohann & G. Labahn). Publications.
• INPT, Rabat, Morocco. « Action intégrée », conference organization, edition of a special issue of JCAM.

Collaborations within INRIA
• Spaces (was Polka): ARC Fiable & AOC, software (MPFI), workshop organization, joint proposal (elementary function specification)
• Lemme: ARC AOC, common publications, proof development, workshop organization
• Aladin, Prisme: ARC Fiable, visits
• Apache: PhD thesis, software

Academic French collaborations outside INRIA
• LIP6 Lab. (Paris 6 Univ.), LIRMM (Montpellier), LIM (Marseille): GDR ARP, conference organization, publications
• ANO (Lille): publications
• Some people from GDR ALP (C. Frougny): conference organization, publication of special issues

International & national recognition

Marc Daumas, Florent de Dinechin and Arnaud Tisserand
• Guest editors of a special issue « arithmétique des ordinateurs » of RSR-CP

Jean-Michel Muller
• Associate editor of IEEE Trans. on Computers (1996-2000)
• General Chair of the 14th IEEE Symposium on Computer Arithmetic (Adelaide, Australia, 1999)
• Guest co-editor of 2 special issues of Theoretical Computer Science (Jan. 99, and to appear)

Nathalie Revol
• Co-organizer of the ALA conference (Rabat, Morocco, May 2001)
• Guest co-editor of a special issue of Journal of Computational and Applied Mathematics

Gilles Villard
• Program Chair of ISSAC’2001 (London, Ontario)
• Guest editor of a special issue of Journal of Symbolic Computation
• Invited speaker at the SIAM Conf. on Applied Linear Algebra (Virginia) in 2003

Administration, responsibilities

Jean-Michel Muller
• Head of the LIP: joint CNRS/ENSL/INRIA Lab (72 people), since Sept 2001 (vice-head before)
• Member of the INRIA Evaluation Committee (from Jan 99 to, hopefully, Jun 2002)
• Member of the managing committee of « GDR ARP »

Gilles Villard
• In charge of an « action spécifique » of the STIC department of CNRS (computer algebra)
• Vice chair of the computer science DEA (≈ MSc) of Ecole Normale Supérieure de Lyon

Future prospects

The near future of Arénaire

• Domains of interest
• Need to manage the growth
• Many Arénaire researchers are becoming seniors (HDRs)
• J.M. Muller needs a rest (head of LIP)

[Diagram: Computer Arithmetic, with the four prospect axes placed around it]

Future prospects
• Arithmetic & hardware
• Arithmetic & elementary function libraries
• Arithmetic & proofs
• Arithmetic & algorithms for scientific computing

Arithmetic & Hardware future prospects

• Targets: FPGA, VLSI, asynchronous circuits
• Number representation: integer, FP, RNS, GF(q); low power
• Algorithms: +, −, ×, ÷, elementary functions; composite and specific operators
• Operators:
  - FP operators for FPGA (+, −, ×, ÷, EF)
  - Basic operators for VLSI
  - Elementary function operators
  - Cryptographic operators
  - Asynchronous operators (+, −, ×, ÷, EF)
• CAD tools: automatic generation; fit technology; fit application’s arithmetic context; operators test

Arithmetic & Hardware: examples of prospects

• Operators for elementary function evaluation: more complex decomposition; pipelining; higher-order approximation (small multipliers)
• Arithmetic-aware CAD tools
• High-level validation
• Low-power considerations
• Floating-point operators
• Table-and-add methods improvement
• Comparison of table-based vs polynomial methods
• Advanced number systems support
• Number and digit coding (sign, values dynamic…)
• Operators and algorithms tuning
• Multiple technology support
• Automatic generation of composite operators (1/(x^2+y^2))


Arithmetic & Elementary function libraries future prospects

Based on:
• worst cases for the TMD
• expertise in elementary function algorithms

Targets for elementary function libraries:
• « general library » with correct rounding for microprocessors (assuming IEEE 754 only)
• libraries optimized for specific microprocessors (MACs, pipelined multipliers, ...)
• libraries for DSPs
• participation in a standard

INRIA « ODL » + industrial partners (ST ?)

Arithmetic & Proofs future prospects

Number representations: FP; fixed point; redundant
Targets: software; algorithms; hardware; implementation
• Explore and validate properties
• Isolate useful behaviors
• Integrate into libraries
• Partial reduction of redundancy
• Prepare top-down approach
• Allow validation of parts of the architecture

Arithmetic & Proofs: examples of prospects

Validated behavior of floating-point algorithms
• Faithful computation of AXPY
• Faithful rounding of Horner’s scheme
• Propose a general framework

Validation and transfer of floating-point algorithms
• Inner loops of numerical applications
• Filters (correct even after a long running time)
• Polynomial evaluation
• FFT
• To fixed-point arithmetic (strong industrial incentive)
• Conservative and efficient word length
• To low precision (multimedia extensions)

Arithmetic & Scientific Computing Algorithms

Impact of arithmetics on solving methods; algorithmic design for arithmetic enhancement
• Interval & exact arithmetics
• Table & floating-point methods
• Normal forms & diophantine systems
• Interface issues
• Level 1, 2 BLAS operators
• Automatic differentiation

Software libraries
• MPFI (C++ library): multi-precision floating-point intervals; jointly with SPACES
• LinBox (C++ library): generic black-box operators, finite fields & integers; collaborative NSF/CNRS research

Linear algebra & global optimization
• Algorithmic design
• Dynamic precision & data-size
• Non-linear systems
• Constrained optimization
• Control & system theory
• Bit-complexity
• Matrix polynomials
• Perturbations

Management

Quite likely, new head, or « responsable permanent » soon (within 2 years)

That’s all!

Consider

$$f(x) = \arccos\left(\frac{x}{\sqrt{x^2 + y^2}}\right)$$

in a floating-point system with rounding to nearest and n mantissa digits.
• Radix 10, 3 digits: fatal error for x = 450 and y = 20
• Radix 2: always OK (the input argument to the arccos is always < 1)
