
Comparison of Accurate Dot Product Algorithms

Jürgen Wolff v. Gudenberg

To cite this version:

Jürgen Wolff v. Gudenberg. Comparison of Accurate Dot Product Algorithms. [Research Report] RR-2413, INRIA. 1994. <inria-00074262>

HAL Id: inria-00074262

https://hal.inria.fr/inria-00074262

Submitted on 24 May 2006

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.



ISSN 0249-6399

Research report

INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE

Comparison of Accurate Dot Product Algorithms

Jürgen Wolff v. Gudenberg

Institut für Informatik

Universität Würzburg

N° 2413, October 1994

PROGRAMME 6

Comparison of Accurate Dot Product Algorithms

Jürgen Wolff v. Gudenberg
Institut für Informatik, Universität Würzburg

Programme 6: Scientific computing, modelling and numerical software
Project Aladin

Research report no. 2413, October 1994

Abstract: A recently proposed algorithm for the accurate computation of the dot product [Kob] has been implemented and modified in order to obtain the best possible algorithm for the Aquarels toolbox. It is compared with the other well-known algorithms with respect to run time and intermediate storage space. A considerable improvement in performance is obtained by combining this algorithm with an older one.

Key-words: computer arithmetic, dot product computation

This work was prepared during a stay at IRISA Rennes, supported by [...]

Unité de recherche INRIA Rennes, IRISA, Campus universitaire de Beaulieu, 35042 RENNES Cedex (France)

Telephone: (33) 99 84 71 00, Fax: (33) 99 84 71 71

Summary (French version): We study here a new algorithm for computing the accurate dot product and compare it with the known algorithms. Some modifications of this algorithm are implemented in the Aquarels toolbox and thereby improve its performance.

Key-words (French version): dot product, accurate computer arithmetic

1 Introduction

The accurate computation of a dot product (or, synonymously, scalar product) is essential for many numerical programs. In the next section we sketch several algorithms for the exact computation of dot products

$$ s := \Box\Bigl(\sum_{i=1}^{n} a_i \cdot b_i\Bigr) $$

of floating-point numbers $a_i$, $b_i$ with a single rounding $\Box$, i.e. all intermediate results have to be computed exactly. It is obvious that the result will in general differ from the usually computed approximation

$$ s := a_1 \boxdot b_1 \boxplus a_2 \boxdot b_2 \boxplus \cdots \boxplus a_n \boxdot b_n $$

where $\boxplus$, $\boxdot$ denote floating-point operations.

First of all, the computation of double-length products has to be performed without rounding error. If no double-length floating-point format is available, this can be done by splitting the factors and computing partial products of factors in which at least half of all digits are zero. Thus no rounding error occurs. This means that we have to add 4 to 6 numbers (6 if the mantissa length is odd) instead of one product, but the problem is then reduced to the computation of a sum. We show that it suffices to compute 5 partial products for a binary system.

The algorithms which are introduced in the next chapter have been developed in the framework of the Karlsruhe accurate arithmetic approach [Kul 76, Kul 81]; see also [Boh 90] for an overview.

We then compare these algorithms and a recently published method [Kob] with respect to time and space complexity. We also investigate the requirements for computing accumulated dot products, i.e. the addition of a value to an already computed dot product should be performable in such a way that the updated value is again of optimal accuracy.

A detailed rounding error analysis of some newly proposed modifications follows in a later section. A description of the FORTRAN 77 implementation and some results of execution time measurements conclude this report.
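The difference between the two definitions of $s$ can be made concrete with a small sketch. It is not taken from the report: the function names are hypothetical, IEEE binary64 arithmetic is assumed, and Python's exact rational type stands in for "infinitely precise" intermediate results.

```python
from fractions import Fraction

def rounded_dot(a, b):
    """Usual approximation: every multiplication and addition is rounded."""
    s = 0.0
    for x, y in zip(a, b):
        s += x * y
    return s

def exact_dot_rounded_once(a, b):
    """Reference value: exact rational arithmetic, then one final rounding."""
    exact = sum(Fraction(x) * Fraction(y) for x, y in zip(a, b))
    return float(exact)

a = [1e16, 1.0, -1e16]
b = [1.0, 1.0, 1.0]
print(rounded_dot(a, b))             # 0.0: the summand 1.0 is lost to rounding
print(exact_dot_rounded_once(a, b))  # 1.0
```

The rational reference is far too slow for production use; it only serves to show what a single rounding of the exact result means.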


2 Known Algorithms

2.1 Long Accumulator

A very simple but powerful algorithm for the computation of dot products makes use of a so-called dot precision variable, which is basically a long fixed-point accumulator permitting the addition of any product of floating-point numbers without rounding error ([Rump 80, Boh]): let $a$ and $b$ be two floating-point numbers with base $\beta$, $l$ digits in the mantissa and exponent range $emin, \ldots, emax$. The product $x := a \cdot b$ has $2l$ digits in the mantissa. $x$ is converted into a fixed-point number by shifting its mantissa left or right according to its exponent, which is in the range $2 \cdot emin, \ldots, 2 \cdot emax$. Finally the shifted value is added to the dot precision variable without rounding error.

$$ 2l + 2\,|emax| + 2\,|emin| + g $$

digits are required for the representation of the dot precision variable, where $g$ guard digits are added in order to prevent overflow. Therefore, even the square of the largest floating-point number, $0.(\beta-1)\cdots(\beta-1)\cdot\beta^{emax}$, can be added $\beta^{g}$ times without overflow.

Complexity: The long accumulator may be implemented as a vector of integers. After $n$ additions the infinitely precise result is obtained, so the algorithm clearly is linear. Since we only add products instead of long accumulators, the constant factor before the $n$ is small. Accumulated scalar products are obtained just by addition to the accumulator.

So this algorithm is a very efficient solution of the given problem. A portable version of it, using integer arithmetic to a large extent, has been implemented in the AQUARELS project [Pao]. We nevertheless introduce the other, in fact older, dot product algorithms in the following paragraphs.
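A minimal sketch of the long-accumulator idea follows. It assumes IEEE binary64 inputs and uses one unbounded Python integer in place of the vector of integers mentioned above. The names `to_fixed` and `long_accumulator_dot` are hypothetical and do not come from the report or from the AQUARELS implementation.

```python
from fractions import Fraction

SHIFT = 2 * 1074  # assumption: binary64, so every product is a multiple of 2**-2148

def to_fixed(x):
    """Exact integer n with x == n * 2**-1074, for a finite binary64 x."""
    num, den = x.as_integer_ratio()   # den is a power of two, at most 2**1074
    return num * ((1 << 1074) // den)

def long_accumulator_dot(a, b):
    acc = 0                                   # plays the role of the dot precision variable
    for x, y in zip(a, b):
        acc += to_fixed(x) * to_fixed(y)      # exact product, exact addition
    return float(Fraction(acc, 1 << SHIFT))   # the single rounding happens here

print(long_accumulator_dot([1e16, 1.0, -1e16], [1.0, 1.0, 1.0]))
```

Adding one more product later only requires one further addition to `acc`, which mirrors the remark that accumulated scalar products come for free with this method.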

2.2 Addition with Remainder

The first algorithm which could guarantee nearly full precision for the computation of a floating-point sum $s := \sum_{i=1}^{n} x_i$ was found by Pichat (see [Pic 72]). It is based on the observation that, under certain assumptions about the rounding $\Box$, the remainder of the rounded sum $a \boxplus b$, which is defined by $r := a + b - (a \boxplus b)$, can be represented as a floating-point number (provided that no overflow or underflow occurs). Starting with the values $x_i$, a sequence of values $x'_i$ is computed as follows:

    for i := n - 1 downto 1 do
        s := x_i ⊞ x_{i+1}
        r := x_i + x_{i+1} - s
        x'_i := s
        x'_{i+1} := r
    end for

This new sequence has the properties that $\sum x'_i = \sum x_i$ and that $x'_1$ is an approximation of the sum $\sum x_i$, while $x'_2, \ldots, x'_n$ are remainders. Repeating the same process leads to values $x_i^{(k)}$. Under certain conditions $x_1^{(k)}$ converges to a value $\approx s$ with an error of less than two units of the last place of the mantissa.

This algorithm was modified and applied to compute dot products in [Boh 77, Boh 78]. Using an appropriate estimation of the sum of remainders $\sum_{i=2}^{n} x_i^{(k)}$, the iteration can be terminated as soon as the required accuracy is reached. Secondly, it could be proved that (if one disregards zeros) the operands $x_1^{(k)}, \ldots, x_{k+1}^{(k)}$ are ordered according to their exponents, and if two operands overlap according to their exponents, the mantissa of the larger value contains only zero digits in the overlapping region. Therefore, the iteration may be stopped after at most $k = n - 1$ steps, when all operands have been ordered. It can be proved that

$$ \Box \sum_{i=1}^{n} a_i \cdot b_i = \Box \sum_{i=1}^{n} x_i = \Box\, x_1^{(k)} $$

for all relevant rounding modes $\Box$ and for doubled precision operands $x_i$ and $x_1^{(k)}$.

Complexity: Worst case time complexity is $O(n^2)$, although we may assume that in general only two iterations are necessary. But even then the factor is relatively high, since an addition with remainder costs at least three floating-point operations.

If the input vectors must be kept, a temporary vector of $n$ components is necessary. This vector can be used as intermediate storage for the infinitely precise result, where after the summation usually many remainders are exactly 0 and may be dropped. When a kind of normalization is applied, so that the resulting vectors have exponent differences not less than the mantissa length $l$, $(2(emax - emin) + 1)/l$ components suffice. The addition of one value to that exact representation means an iteration over this number of components.
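One pass of the loop above can be sketched as follows. This is an illustration under stated assumptions, not the report's code: the remainder is obtained with Knuth's branch-free two-sum, which is valid for IEEE round-to-nearest arithmetic, and the names `two_sum` and `distill_pass` are hypothetical.

```python
from fractions import Fraction

def two_sum(a, b):
    """Return (s, r) with s = fl(a + b) and s + r == a + b exactly."""
    s = a + b
    bb = s - a
    r = (a - (s - bb)) + (b - bb)
    return s, r

def distill_pass(x):
    """One pass i = n-1 downto 1 (zero-based here), working on a copy of x."""
    x = list(x)
    for i in range(len(x) - 2, -1, -1):
        # the rounded sum goes to position i, the remainder to position i+1
        x[i], x[i + 1] = two_sum(x[i], x[i + 1])
    return x

values = [1e16, 1.0, -1e16, 3.5]
once = distill_pass(values)
twice = distill_pass(once)
# The exact sum is unchanged by a pass, and x[0] approaches it.
print(sum(map(Fraction, once)) == sum(map(Fraction, values)), twice[0])
```

The sketch uses six floating-point operations per addition with remainder, which is consistent with the remark that the constant factor of this method is relatively high.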

2.3 Order by Exponents

There exists another algorithm [Kul 76] which orders the summands according to their exponents and then starts to add from left to right, i.e. starting with the highest exponent, until the difference between the exponents of the sum and of the next summand exceeds a fixed threshold. This addition has to be performed without rounding error. Then the sum can be computed by addition from right to left.


Complexity: Because of the sorting, the complexity of this algorithm cannot be better than $O(n \cdot \log(n))$. During the sorting phase, values with the same exponent are added, so a vector of at most $2(emax - emin) + 1$ values has to be stored. Again, the addition of one number means an insertion into that vector and an addition of all its components.

For a detailed discussion of these dot product algorithms and related problems see e.g. [Boh 90].

2.4 Store Preprocessed Summands

Very recently, Kobbelt has published a new dot product algorithm [Kob] which uses standard IEEE floating-point operations to a large extent and intermediately stores the summands in a table with $2(emax - emin + 1)$ components. This algorithm, which has proven to be very fast, is described in detail in the next section.

3 Kobbelt's Dot Product Algorithm

The algorithm, which is formulated for binary arithmetic, mimics the procedure of the [...] algorithm and consists of 3 phases.

3.1 Phase 1

The summands, which are partial products represented in the usual floating-point format, are inserted in a table, so that the sum of all table entries is the exact dot product. To limit the size of the table to a constant, we distinguish between "odd" and "even" floating-point numbers according to their last bit. This property of a number is called its genus. We then use the following propositions about the arithmetic.

  • Addition of two numbers with the same exponent and the same genus is exact.

  • Addition of two numbers with the same exponent and opposite sign is exact.

The summands are inserted in an array with $2(emax - emin + 1)$ rows, one for each exponent, and 2 columns, one for each genus. If the location where the number is to be inserted is occupied by a nonzero value, or if the corresponding location for the other genus contains a value of opposite sign, an exact addition is performed and the resulting sum is inserted in the table.


At the end of this phase the sum of all table entries is exactly the wanted sum, and the table can be processed in the order of the exponents; for each exponent there are at most 2 entries.

Complexity: Each addition in this phase reduces the number of table entries. Thus, besides the insertion into the table, no overhead is produced with respect to the rounded computation of the sum.

If dynamic memory management is available, the size of the table may be reduced either by organizing it as a hash table, or by allocating storage only for the actually used range of values, depending on the minimal and maximal exponents. Since this increases runtime and is also applicable to the long accumulator algorithm, we do not exploit this option further.

Before we come to the other phases, we want to describe the computation of double-length products in more detail.
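The insertion step of phase 1 can be sketched as follows, assuming IEEE binary64 numbers and ignoring overflow. A Python dictionary keyed by (exponent, genus) stands in for the two-column array; this layout and the names `exponent_and_genus` and `insert` are illustrative assumptions, not the report's FORTRAN 77 data structure.

```python
import math
from fractions import Fraction

def exponent_and_genus(x):
    """Exponent and last mantissa bit (the genus) of a nonzero binary64 x."""
    m, e = math.frexp(x)                 # x == m * 2**e with 0.5 <= |m| < 1
    genus = int(abs(m) * 2.0**53) & 1    # 0 = even, 1 = odd
    return e, genus

def insert(table, x):
    """Insert x so that the exact sum of all table entries grows by exactly x."""
    while x != 0.0:
        e, g = exponent_and_genus(x)
        if (e, g) in table:
            x += table.pop((e, g))       # same exponent and genus: exact
        elif (e, 1 - g) in table and (table[(e, 1 - g)] < 0) != (x < 0):
            x += table.pop((e, 1 - g))   # same exponent, opposite sign: exact
        else:
            table[(e, g)] = x
            return

table = {}
summands = [0.1, 0.2, 0.3, -0.1, 1e16, 1.0, -1e16, 0.7, 0.1, 0.1]
for v in summands:
    insert(table, v)
print(sum(map(Fraction, table.values())) == sum(map(Fraction, summands)))
```

Every addition removes one entry before possibly re-inserting the sum, so the loop terminates and the table never holds more than two entries per exponent.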

3.2 Double-length multiplication

To compute a double-length product in the usual floating-point format, $x$ and $y$ are split in halves; if the mantissa length is odd, the second half of $y$ is split again. This would result in inserting 6 partial products into the table.

 1. split $x = x_1 + x_2$, $y = y_1 + y_2$, $y_2 = y_{21} + y_{22}$
 2. insert $x_1 y_1$, $x_1 y_{21}$, $x_1 y_{22}$, $x_2 y_1$, $x_2 y_{21}$, $x_2 y_{22}$

Taking the genus of the numbers into account, for the base $\beta = 2$ the number of partial products can be reduced to 4 if both factors are even, and to 5 if one or both are odd.

 1. if $x$ and $y$ are odd, split $x$ into $msb(x)$ and $x' = x - msb(x)$, where $msb(x) = 0.10\ldots0 \cdot 2^{e}$ and $e$ is the exponent of $x$; insert $msb(x) \cdot y$
 2. else $x' = x$
 3. split $x' = x'_1 + x'_2$, $y = y_1 + y_2$
 4. insert $x'_1 y_1$, $x'_1 y_2$, $x'_2 y_1$, $x'_2 y_2$

The subtraction and multiplication in step 1 are exact, since both entries have the same exponent and one is a power of the base. If both $x$ and $y$ are odd, $x'$ is even, since one bit is cancelled; hence in step 3 at least one number is even and will therefore be split into 2 parts, each of which has $l' = \lfloor l/2 \rfloor$ digits. Note that in this case $2l' = l - 1$. The products in step 4 therefore have at most $l$ digits and are computed exactly.

Another alternative could be to collect the partial products in two floating-point numbers representing the approximate result and the remainder, and to insert these two.

 1. split $x = x_1 + x_2$, $y = y_1 + y_2$, $y_2 = y_{21} + y_{22}$
 2. $s := x \boxdot y$
 3. $r := ((((x_1 y_1 - s) \boxplus \ldots) \boxplus x_2 y_1) \boxplus x_2 y_{21}) \boxplus x_2 y_{22}$
 4. insert $s$ and $r$

This would reduce the number of table insertions, but since an insertion is a cheap operation, we did not implement this variant.

The following splitting procedure, which only needs floating-point arithmetic, produces a $k$-digit number $x_1$ and an $(l-k)$-digit number $x_2$, where $1 \le k \le l - 1$:

    split(x, x_1, x_2):
        p   := (β^(l-k) + 1) ⊡ x
        x_1 := p ⊟ (p ⊟ x)
        x_2 := x ⊟ x_1

This algorithm is correct for an arbitrary monotone rounding, if the arithmetical operations are optimally defined. It can also be used to determine the genus of $x$, by setting $k = l - 1$ and testing $x_2$ for zero.

Proofs may be found in [Lin 81].

If the even part of a number is needed, which indeed is the case in the original algorithm [Kob], $x_1$ has to be calculated as a truncation, which in turn fixes the rounding mode: $p$ and $x_1$ are computed by the same two formulas as above, with each operation performed either truncated or rounded away from zero, as prescribed in [Kob].

If the layout of a floating-point number is fixed, as for the IEEE format, the splitting of a number may be implemented more efficiently by exploiting bit field operations.
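As an illustration of splitting with floating-point operations only, the following sketch uses the standard Veltkamp/Dekker split for IEEE binary64 in round-to-nearest mode. This is a swapped-in, well-known variant, not the genus-based scheme of the report: with the constant $2^{27}+1$ both halves fit into 26 bits, so four partial products suffice. The names `SPLITTER`, `split` and `partial_products` are hypothetical.

```python
from fractions import Fraction

SPLITTER = 2.0**27 + 1.0   # assumption: binary64 with a 53-bit mantissa

def split(x):
    """Return (hi, lo) with hi + lo == x exactly, each half at most 26 bits long."""
    p = SPLITTER * x
    hi = p - (p - x)
    lo = x - hi
    return hi, lo

def partial_products(x, y):
    """Four floating-point numbers whose exact sum is the product x * y."""
    x1, x2 = split(x)
    y1, y2 = split(y)
    return [x1 * y1, x1 * y2, x2 * y1, x2 * y2]

parts = partial_products(0.1, 0.3)
print(sum(map(Fraction, parts)) == Fraction(0.1) * Fraction(0.3))
```

Each of the four values could be inserted into the table of phase 1. The sketch ignores overflow and underflow, which would break the exactness of the products.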

3.3 Phase 2

In phase 2 the entries of the table are transformed in such a way that all have the same sign. This can be achieved without changing the sum of the table entries by exploiting one of the following formulas

$$ x + y := a \cdot 2^{e} - b \cdot 2^{f} = \sum_{i=f}^{e-1} a \cdot 2^{i} + (a - b) \cdot 2^{f} $$

$$ x + y := a \cdot 2^{e} - b \cdot 2^{f} = \sum_{i=f+1}^{e-1} a \cdot 2^{i} + (2a - b) \cdot 2^{f} $$

where $x$ and $y$ are two adjacent table entries with different sign. Since either $a - b$ or $2a - b$ has the same sign as $a$, we can modify the table by replacing $x$ and $y$ by the summands of the right hand side.

After the iterated application of this operation to all pairs of adjacent table entries with different sign, from high to low exponents, all values in the table have the same sign. Since the multiplications by a power of the base 2, as well as the additions which are actually performed, are free of rounding errors, the value of the dot product is still the exact sum of all table entries, which now all have the same sign.

So the sign of a dot product may easily be read from the table. This operation is helpful for the computation of bounds.

Note, however, that the number of summands may be considerably increased. For the subtraction of the square of the smallest number from the square of the largest number, for example, half of the table, i.e. one entry for each exponent, is filled in. Therefore we investigated some alternatives, which are presented in the following chapter.
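The two phase 2 identities can be checked with exact integer arithmetic. The sketch below is only a verification aid under simplifying assumptions: mantissas $a$, $b$ are positive integers, exponents satisfy $e > f \ge 0$, and the function name `same_sign_terms` is hypothetical.

```python
def same_sign_terms(a, e, b, f, use_second_formula=False):
    """Terms whose exact sum equals a*2**e - b*2**f (integers a, b > 0, e > f >= 0)."""
    if use_second_formula:
        terms = [a * 2**i for i in range(f + 1, e)]
        terms.append((2 * a - b) * 2**f)
    else:
        terms = [a * 2**i for i in range(f, e)]
        terms.append((a - b) * 2**f)
    return terms

# a >= b: the first formula already yields terms of one sign.
print(same_sign_terms(5, 7, 3, 3))
# a < b: a - b is negative, so the second formula is the one to use.
print(same_sign_terms(5, 7, 9, 3, use_second_formula=True))
```

The example also shows the fill-in mentioned above: one pair of entries is replaced by one term per exponent between $f$ and $e$.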

3.4 Phase 3

In phase 3 the table entries are added from low to high exponents. The following algorithm guarantees a rounding error bounded by a small number of ulps [Kob].

    for exp := low to high
        if expo(sum) < exp then
            aux := make_even(table[exp, odd])
            sum := sum ⊞ (aux ⊞ table[exp, even])
        else
            sum := (sum ⊞ table[exp, odd]) ⊞ table[exp, even]
        end if
    end for

3.5 Discussion

The runtime complexity of the whole algorithm is $O(n)$. All operations are standard floating-point operations, so the fast floating-point hardware may be used. Despite the fill-in in phase 2, the factor for the summation of the table is usually small. Due to the double-length calculation of products, the execution time is a constant multiple of $n$, from $8n$ floating-point operations (multiplications and additions) upwards, plus a constant overhead. If, however, the algorithm is to be compared with the usual rounded algorithm, the index computations for the table insertion may not be neglected on modern computers, as we will see from our measurements presented in chapter 8.

The amount of storage needed to keep the infinitely precise result is relatively high, $2(emax - emin + 1)$ numbers, although in many applications a smaller vector will suffice.

The accumulation of an additional value means its insertion into the table (phase 1), the adjustment of signs, if necessary (phase 2), and finally the new computation of the sum of all table entries (phase 3).

4 Modifications of Kobbelt's Dot Product Algorithm

In this section we describe some modifications of Kobbelt's dot product algorithm which have been implemented in order to

  • increase accuracy

  • reduce runtime

  • reduce storage requirements

  • facilitate accumulation


4.1 Addition with Remainder in Phase 3

The maximal rounding error committed in phase 3 is [...] ulps. If we replace the addition of the table entries by the following algorithm, this error is computed in rem and an update of the result sum is possible to obtain [...] ulp accuracy.

for exp := low to high
    if expo(sum) [...] exp then
        aux := make even (table[exp, odd])
        (aux, r1) := aux + table[exp, even]
        (sum, r2) := sum + aux
    else
        (aux, r1) := sum + table[exp, odd]
        (sum, r2) := aux + table[exp, even]
    end if
    rem := r1 ⊕ r2
end for

After the for loop sum and rem can be added and subtracted to receive a one ulp approximation or a sharp inclusion of the result.

Each addition with remainder, however, costs [...] simple additions and therefore this variant is debatable. It is too slow to beat the long accumulator implementation.
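The idea of carrying the rounding error of the final summation in a separate variable can be sketched in a few lines. The following Python sketch is not the table-based procedure above: it assumes IEEE double precision with rounding to nearest, uses the branch-free addition with remainder commonly known as TwoSum, and sums a plain list instead of the odd and even table entries. The names `two_sum` and `sum_with_remainder` are illustrative and do not appear in the report.

```python
def two_sum(a, b):
    """Return (s, r) with s = fl(a + b) and s + r == a + b exactly
    (assuming round-to-nearest and no overflow)."""
    s = a + b
    b_virtual = s - a
    r = (a - (s - b_virtual)) + (b - b_virtual)
    return s, r


def sum_with_remainder(values):
    """Sum values in the given order and collect the rounding errors in rem."""
    total = 0.0
    rem = 0.0
    for v in values:
        total, r = two_sum(total, v)
        rem += r  # the remainders themselves are added with ordinary rounding
    return total, rem


total, rem = sum_with_remainder([1e16, 1.0, -1e16])
print(total, rem, total + rem)
```

In this small example the plain rounded sum loses the summand 1.0 completely, while the collected remainder allows the result to be corrected afterwards. Each call of `two_sum` costs six floating-point additions, which illustrates why the variant is expensive.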

4.2 Replace Phase 2

In phase 2 the number of table entries is increased in general. It may be replaced by a cheaper procedure which computes a rough approximation of the sum of the table entries, so that total cancellation is avoided in phase 3. This procedure is the addition of the table entries from high to low exponents until the exponent difference between the sum and the next entry is greater than [...]. This corresponds to the addition from left to right in the algorithm which sorts the summands. In [Kul 76] it is shown that the remaining summands may not totally cancel the result, if the exponent difference is greater than [...]. Since we may have two entries for each exponent we have to replace this by [...]. The addition from left to right, however, has to be performed without rounding error; it determines the order of magnitude of the sum and gives a rough approximation. Therefore we apply addition with remainder and insert the remainders into the table. Finally the sum is inserted in the table, which still represents the exact value of the dotproduct and may now be added from low to high exponents to obtain a good approximation.

If no cancellation occurs the number of additions with remainder is at most [...] and the number of table entries remains constant. If cancellation of more than two digits occurs in a single addition, which can only be the case if the exponent difference is [...], the exact result can be represented as a floating-point number and thus the remainders are 0. Hence the number of table entries is decreased. Cancellation of one digit, however, may start an addition procedure which runs through the whole table.

In the cases of large fill-in, which means large exponent differences, one or zero additions from left suffice and this procedure clearly is superior to the adjustment of signs.

Replacement of phase 2 by addition from left to right reduces the number of additions in the average. Its rounding error behavior is analyzed in section 6.

4.3 Rough Approximation

The idea of initial addition from high to low exponents can be extended, so that the relative error of the remaining terms diminishes. If we compute until the difference of exponents between sum and first table entry is [...], an approximation of the sum with relative error [...] is obtained, even if we discard all remaining entries. This follows from the fact that the table entries are ordered by exponents with at most two values for each exponent. See section 6 for details. Because all the remainders have to be inserted in the table and some of them have to be reconsidered for higher accuracy, this procedure saves time only for very low accuracy requirements.
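The stopping rule of the rough approximation can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the entries are given as a plain list of doubles rather than as the exponent-indexed table, they are sorted by exponent on the fly, ordinary rounded addition is used instead of addition with remainder, and the threshold `max_exp_gap` is a free parameter of the sketch, not the bound derived in the report. The function name `rough_sum` is hypothetical.

```python
import math


def rough_sum(values, max_exp_gap):
    """Add entries from high to low exponents and stop as soon as the next
    entry lies more than max_exp_gap binary exponents below the current sum."""
    ordered = sorted(
        (v for v in values if v != 0.0),
        key=lambda v: math.frexp(v)[1],  # binary exponent of the entry
        reverse=True,
    )
    total = 0.0
    for v in ordered:
        if total != 0.0:
            gap = math.frexp(total)[1] - math.frexp(v)[1]
            if gap > max_exp_gap:
                break  # all remaining entries are discarded
        total += v
    return total


entries = [2.0**-40, 1.0, 2.0**-10]
print(rough_sum(entries, 20))
print(rough_sum(entries, 60))
```

With a gap of 20 the smallest entry is dropped, with a gap of 60 all three entries contribute. A larger threshold therefore buys accuracy at the price of more additions, which is the trade-off described above.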

4.4 Combination with Long Accumulator

We perform phase 1 and then add the table entries using a long accumulator. Hence optimal accuracy is obtained, the storage for the infinitely precise result is minimized, and accumulation of scalar products consisting of several parts is facilitated.

� �2� � ����~��' ��� ���2�r�� z� ���/�¬��1�� �-,�./����0��12��

In the following table we depict for each algorithm its maximal and minimal execution time measured in floating-point additions [...] and comparisons C, where the average case will be close to the minimal time very often.


We assume that double length products are computed by the splitting method described in section [...], except for the long accumulator where integer arithmetic is used. Hence in all other algorithms a minimum effort of [...] and a maximum of [...] multiplications have to be added to the table entries.

Storage requirements are listed in the last two rows: first the maximally needed space during the computation, disregarding integers and stack space, then the memory used to represent the infinitely precise result. The values are given in bits, expressed in terms of the number of bits used for a floating-point number and $e = e_{\max} - e_{\min} + 1$.

It should not be concealed that on modern computers, in particular RISC architectures, the integer time should actually not be neglected. We further do not consider scaling efforts in this comparison, which are necessary for all the algorithms which store the result as a vector of floating-point numbers, i.e. all without the long accumulator. The algorithms are denoted by the numbers of the preceding paragraphs in which they were described.

�Z²B� � ²©� �Z² �f�en± tI` N � ��1v�^�;t P 7S�;t45S� oJt�mBc^d�1St45�k N �£tZ� �f�sBt tI` N � �;�^tZ� oJt)m�cWd�1St45k N �^tZ� ���0 �;g �;t �;t � ´ �;g �£g �;g

� �G²B� �I² � �I² �f�en± 1¸�;t N �^g 5®� N �^g�� 1J���;t N �;g 5S� 1v�;t N �;g 5S� 1v�;t�7Ù�^g 5S� N �^gd

N h;t�k N h;t�k N h£t�k N h;t�kf�sBt �^tZ� N �;t�k �WtZ� N �;t�k �^tZ� N �;t�k �^tZ� N �;t�k N d��0 �Wg �^g �^g �^g � ´ �Wg �^g �^g �;g0�e;�9mBg��3Gk�c^f��Ie;avs]qvc^t��9��cW�Ig\a�e;u�s�cWt¤oJc^pZt9u/e;tI|�qvuOc^a�e;d^g�a0g�yWp9sBa0g\f2g\t9u�q�mxm�u�e;�9mBg�g\t�u0a�s�g�q#e;a0gc^�ZuJensBtZg�|)����oJcWpZt�uvsBtZd�u0�Zg�c^�Ig\aJe;uvsBc^tIq�e^q0qvpZf s�tZd�u0�Zg#¡Ùc;mxm�c;µ�sBtZd

�Za0cW�Ig\a0uvsBg�q\²TZc^a�u0�ZgjmBc^tZd­e^o\oJpZf�p9m¹e£u0c^a 1¸�Z²¹� 5�` |9g\tZc^u0g�q�e£t�e^|Z|WsBuvsBc^tVc;¡e)�Za0cZ|9pIoJu�e;tI| d u0�Ie;uc;¡/e º[e[m�pZg^{b� u0�Zg)gJ± uOa�e^oJuvsBc^tVc£¡�uO�Zg2wItIenm�a0g�qvp9mBu�²b��c^pZdW�9m�� g�qvuvsBf�e;u0g�| u0�Zg2u�s�f�g2¡Ùc^a�`g�yWpIenm]q�u0�Ie£u�¡¢c^a���� N � ²©�;� {�u0�Ie;u�¡Ùc^a d�9»�^² �;��{�e;tI|­uO�Ie;u¡Ùc^a(� 9+g�� +��{��ZpZu(uO�Zg�qvgwId^pZaOg�q�9sBd^�9mB�§|9g\�Ig\tI| c^t�uO�Zg(�Ie;a�|�7�e£tI|�qvc;¡Ùu¸µe;aOg/c;¡�u0�Zg�qv�Zqvu0g\f ²

�°|Z|WsBuvsBc^t­µ�sBu0��a0g\f)ensBtI|9g\a/pIqvg�| sBt 1¸�Z² ��5re;tI| 1S�I²B� 5roJc�qvuJq�2e^|Z|WsBuvsBc^tIq\²0��Zg�f)en±�sBf�e[ml�Ig\a�¡¢c^aOf�e;tIoJg�c;¡�1¸�Z² ��5�s]q2y^pIeW|9a�e;uvs]o ¡¢c^a��;t=qvpZf2f�e;tI|Zqjµ�Zg\aOg�e^qju0�Zgf�sBt9sBf�enm�s]qcW�Zu�ensBtZg�|��9� ��s�uOg\a�e;uvsBc^tIq�c£¡r�^t§q�pZf2f�e;tI|Zq\²


The time for the order by exponent algorithm ([...]) is dominated by the sorting time. [...] means the specific addition with longer mantissa.

The maximal fill-in for Kobbelt's algorithm ([...]) is obtained in pathologically construed examples only; the same holds for ([...]).

Addition with remainder can be combined with sign adjustment or addition from left to right; in ([...]) the latter is listed.

The minimal execution time for all algorithms using the large table is obtained if all entries are added during phase 1.

In ([...]), if [...], the maximum time is obtained if the table is filled in phase 1, hence a maximal number of additions to the long accumulator has to be performed.

A table insertion costs between [...] and 6 comparisons with 0.

6 Rounding Error Analysis

ZVg�a0g�o\e[m m�u0�Ie;u��-1LC S D S E�GJI)K S E�G B M 5�|9g\tZc^u0g�q�u0�Zg!FIc�e;u�s�tZd�7Ù�Ic;sBt9ulqv�Zqvu0g\f µ�sBu0���Ie^q�g\CV{ Df�e£t�uvs]q0qOe|WsBd;sBu�q\{;e;tI|�gJ± �Gc^tZg\t9uÜa�e£tZd^g EHGJILK Q�Q EHG B M ²�0��ZgbtZc^aOf�enmxs�V�e;uvsBc^t�s]q�qvpIoO��u0�Ie;ub� ��C �G� � ¡Ùc^a�g�e^o0� f�e;t9uvs]q0q0e_G�²G�°q�sBt­u0�Zg��Za0g\º^sBc^pIq�q�g�oJuvsBc^t O |9g\tZc^uOg�q/e^|Z|WsBuvsBc^t µ�sBu0�a0cWpZtI|Ws�tZd u0c§u0�Zg�tZg�e;a0g�qvu(FIc�e;uvsBtZd7Ù�Gc;sBt�u2t9pZf��Ig\a�² �8 s]qju0�Zg oJcWf2�ZpZu0g�|¬e;�Z�Za0cn±�sBf�e£uvsBc^tu0c 8 ²

b½ ­���I�-g�g\u B? 90G ? C X� �� �-1LC S D S E�GJI)K S E�G B M 5 S IK9 � 1�� 5 K�µ�s�uO�

E A�� E P N �E?� E

?O A�� E

?��E

?O P S I � �g�g\u�¡¢pZa0uO�Zg\a 8 90s =

?@ P B

?e;tI| �8 �Gg�sBu�q�e;�Z�Za0cn±�sBf�e£u0g/º;enmBpZg(oJc^f2�ZpZu0g�|)�9�2u0�Zg#¡Ùc;mxm�c;µ�sBtZd

enmBd^cWavsBu0�Zf 3�8 3N9 B =¡Ùc^a�Ic9 KJU=�(|9c;µt9u0c2�

�8 3N9��8 O B?

g\tI| ¡¢c^a

0��Zg\t§u0�Zg�aOgJm¹e£uvsBº^g(g\a0aOc^a��"c£¡Üu0�Zg�q�pZf c;¡�enmxm B?s]q�Gc^pZtI|9g�| ���

� ���9� N 1�� N �9� 5 APC A �

µ�sBu0���� � ��h C � �

C � A UV� C � � �� C �


�(�nÁ�Á��

��9�����B A N 8 U 1 B A O �8 5B A N 8 ����

������B A N 8 U 1 B A N �8 5

B A N 8 ����N ����

B A N �8 U 1 B A O �8 5B A N �8 ��������B A N �8B A N 8 ����

� ����

8 U �8B A N 8 ����

N �� C A � ���� B

A N �8B A N 8 ����ZVg2wIaJqvu�|9g\avsBº^g§e·mBc;µ�g\a(�IcWpZtI| ¡¢c^aju0�Zg�|9g\tZc^f�sBtIe;u0c^a B A N 8 {�u0�Zg\tªoJcWf2�ZpZu0g)u0�Zg

e;�Iq�c;mBpZu0g�g\a0a0c^a�c;¡�u0�Zg2qvpZf§²G´Ù¡�µ�g�uO�Zg\t¤�Ienº^g�e2�Gc^pZtI| �9��¡Ùc^a/uO�Zg(wIaJqvu/g\aOa0c^a�u0g\a0f u0�Zg¡Ùa�e^oJuvsBc^t)sBt u0�Zg�qvg�oJcWtI|)u0g\a0f�s]q#�Ic^pZtI|9g�|·����� N �9�

P

B A N 8 P 9 ����� B AN => ? @ P B

?�����

�P

B APU =>? @ P

P

B?�P�P

B APU =>? @ P

1��SU2C � 5LC X� �P

B APU => ? @ P

C X� 9 P

B APU C X�� => ? @ P

C X� � X��

�P

B APU2C X��

D�� = � P �� P>?@�� C �

?�P

B APU C X�

D���> ? @ � C

�?

9 P

B APU C X� �

� U2C � A �P

B APU­��C X�

� C X� � A U­��C X� � C X� � A U­� C X� � �ZVg�f)e��Ve^qOqvpZf2g2µ(² m ² cI² dI² 8 � �8 {&µ�Zg\t=µ�g2tZc;µ"|9g\avsBº^g§e·�Ic^pZtI| ¡Ùc^a(uO�Zg e;�Iq�c;mBpZu0gg\a0aOc^ac;¡bu0�Zg(qvpZf§²

8 U �8 9 =>? @ P B?U 1 QRQ�Q 1 B = O B = � A 5 O Q�Q�Q O B�P 5J9 => ? @ P B

?U =>? @ P B

?N��

? 9 => ? @ P�?

µ�Zg\aOg �?|9g\tZcWu0g�q�u0�Zg�e;�Iqvc;mBpZu0gjg\a0a0c^a�sBt­u0�Zg@I^U�u0�¤qvpZf�f�e;uvsBc^t§qvuOg\�§µ�9s]oO��s]q�Gc^pZtI|9g�|

�9��?�P 8 ? P A P

C � �#g\a0g(µ�g(e;�Z�9mB��a0cWpZtI|Ws�tZd)u0c�u0�ZgjtZg�e;a0g�qvucWu0�Zg\a0µ�s]qvg�u0�Zg�¡Se^oJu0cWabA

P�Ie^q�u0c2�Gg�|9a0cW�Z�Ig�|�²

c;µVµ�g��Ie�ºWgbu0c�IcWpZtI| 8 ? ²�0��9s]q�e;d�ensBt�s]q�|9c^tZg��9�/gJ±Z�9mBc;sBuvsBtZd�u0�Zgbc^a�|9g\a&c;¡ gJ±Z�IcWtZg\t�u�qÓ²


8 ? � =>� @? PB �P�=��C X�

0��Zg\a0gJ¡Ùc^a0g8 U �8 9 => ? @ P

�?�=� �� C � =>? @ P

C X� � �� C � ��hC X P

e;tI|��� � ��h C � �

C � A UV� C � � �� C � �g�g\f2f�e¥�­f�en�=tZc;µ �Ig�e;�Z�9mxsBg�| ¡¢c^a�|WsNM�g\a0g\t�u��IeWqvg�q_C e;tI| µ�9s]o0�¥s]q)u0�Zg­f2cWa0g

sBt9u0g\a0g�qvuvsBtZd�o\eWqvg�uOc |9g\uOg\a0f�sBtZg�u0�Zg)gJ± �Gc^tZg\t�u2|WsNM�g\a0g\tIoJg � tZg�oJg�q0q0e;aO�Vu0c­d^pIe;a�e£t�u0g\g)eqv�Gg�oOsxwIg�|Ve^o\oJpZa�e^oJ�)c;¡bu0�Zg(e;�Z�Za0cn±9sBf�e;uvsBc^t�²

c^uOgu0�Ie;u�|9pZg/u0cju0�Zg#wItIenm³e^|Z|WsBuvsBc^t�uO�Zg/f�e[±�sBf�enm�a0c^pZtI|9c M§g\aOa0c^alo\e£t�tZc^u~�Ig��Ic^pZt�7|9g�|·��� A

P? D�� 9 A

PC A � ­µ�sBu0�ZcWpZu�¡¢pZaOu0�Zg\a/oJc^tIq®s]|9g\a�e£uvsBc^t§c;¡bu0�Zg(qv�Gg�oOsxwzo2q®sBu0pIe;uvsBc^t�²

´St§uO�Zg�¡¢c;mxmBc;µ�sBtZd��Ie£a0uµ�g(e^q0qvpZf�g D � ��²�(Áz�nÁ � �¹�I��¾����-g�g\uu0�Zg�enmBd^c^avsBu0�Zf-u0c2|9g\u0g\a0f�sBtZg(uO�Zg(qvpZf �Igjd;sBº^g\t¤eWq~sBtng�g\f2f�e �^²T c^aSC 9¥� e;tI| � 9¥��µ�g�c^�Zu�ensBt

� �?� A � µ�9s]oO�§f2g�e£tIq/eja0c^pZtI|WsBtZd�g\a0aOc^a�c;¡rcWtZg@? D�� ²

�(�nÁ�Á��

�9������h � ���� � A U � D � ��� �� � � 9 �

� �� � ��� N 1�� N �9� 5¸� � �?� D � �

�TZc^a �Y9 � µ�g�c^�Zu�ensBt��9� � � A � µ�9s]o0�­sBf2�9mxsBg�q � � � Q �D� � {��ZpZu�s ¡µ�g2o0�Ie£tZd^g

c^pZa�e^|Z|WsBuvsBc^t�e[m�dWc^avsBu0�Zf ���§qvpZf�f�sBtZd�pZ�§u0�Zg�uvµ�c2g\t�uOavsBg�qµ�sBu0�­g�yWpIenmbgJ± �Gc^tZg\t�u��GgJ¡¢cWa0ge^|Z|WsBtZd)u0�Zg\f u0c�u0�Zg�wItIenmrqvpZf {Zµ�g(o\e;t)sBf2�Za0c;º^g(cWpZag�qvuvsBf�e;u�s�cWt�c;¡ 8 ? ²

�(Áz�nÁ � �¹�I��¾�&�-g�g\u�u0�Zg(qvpZf2f�e£uvsBc^t§enmBd^c^avsBu0�Zf-�Ig(d£s�ºWg\t¤e^q~¡Ùc;mxmBc;µ/q�8 3N9 aIJ3N9�Kµ�9sxmBgVI �?� |9c

s ¡0E? 90E ? � A u0�Zg\t�8 3N9��8 O 1 B

?O B

?� A 5I 3N90I-UV�


else

�8 3N9��8 O B?

I 3N90I-U=�g\tI|)sx¡

g\tI|·µ�9sxm�g

TZc^aSC�9¥� e;tI| �b9?� µ�g�u0�Zg\t�c^�ZuJensBt

� � � Q hD� �

�(�nÁ�Á��0��Zgµ�c^a�qvu�o\eWqvg�¡Ùc^ab�Ic^pZtI|WsBtZd2uO�Zg/e;�Iqvc;mBpZu0g#g\a0a0a0c^a~c;¡zuO�Zg/qvpZf"e£d�ensBt2cZo\oJpZa�q\{;sx¡ÜuO�Zg\a0ge;aOg�u¸µ�c�g\t9u0avsBg�q�¡Ùc^a�g�e^oO��gJ± �Gc^tZg\t9u�{IcWu0�Zg\a0µ�s]qvg(ej¡Se^oJu0cWalc£¡*�js]q�g�eWq®sxm�� d�ensBtZg�|�² _~pZu�µ�sBu0�u0�Zg#tZg\µ qvpZf2f)e;uvsBc^t�enmBd^cWavsBu0�Zf �Ienmx¡�c;¡�u0�Zg~m�cZo\enm�g\a0a0c^a�q �

?e£a0g��Ic^pZtI|9g�|����2�C X� a�e;uO�Zg\a

u0�Ie£t���C�X� ¥² 0��9s]q�m�g�eW|Zqu0c�u0�Zgjg�qvuvsBf�e;u�s�cWt8 U �8 9 => ? @ P

�?� �� 1S� => ? @ P

C X� N � =>? @ PC X� 5 �� C � � �

� C � ���C X P´StIqvg\a0u�s�tZd � 9 �»e;tI|JC�9¥��u0�9s]q�WsBgJm]|Zq

��� � �� ��

µ�9s]oO��sBf2�9mxsBg�q�u0�Zg(eWq0qvg\a0uvsBc^t�²�ZVg°o\e£t |9c�g\º^g\t��Gg\u0u0g\a�²Z¨�c^�Z�GgJm�u�qv�Zc;µ/q*s�tV¦ ¨�c^�§�;�;«�u0�Ie;ub¡Ùc^a�u0�Zg�qvpZf2f�e£uvsBc^t e[m�dWc7

avsBu0�Zf d;sBº^g\t sBt¤qvg�oJu�s�cWt � ² ��u0�Zg�e;�Iq�c;mBpZu0g(g\a0aOc^ac;¡ 8 U �8 s]q��IcWpZtI|9g�|§�9�)��C X� C � ²40��Zge^qOqvpZf2�ZuvsBc^t=c;¡�g�y^pIenmxmB�=q®sBd^tZg�|?qvpZf�f�e;tI|Zq wIu�qjµ�gJm m�sBt=c^pZa2oJc^tIq®s]|9g\a�e£uvsBc^tIq\{�q®sBtIoJg�u0�9s]qs]qju0�Zg�µ�c^a�qvu(o\e^q�g�sBtVc^a�|9g\aju0c�c^�Zu�ensBt­m]e;aOd^g�o\e;tIoJgJmxm]e;uvsBc^t¬c;¡luO�Zg2�Za0g\ºWs�cWpIq®mB��o\enm]oJp9m]e;u0g�|e;�Z�ZaOc�±9sBf�e;uvsBc^t B A ²�(Áz�nÁ � �¹�I��¾��Ü�$g�g\uuO�Zg°enmBd^cWavsBu0�Zf�u0c2|9g\u0g\a0f�sBtZg�u0�Zg�qvpZf �Gg/d;sBº^g\t§eWqbsBt§qvg�oJu�s�cWt§�Z² �TZc^aSC�9¥� e;tI| �b9?� u0�Zg\t��Zc£m¹|Zq

� � � Q �D� �

�(�nÁ�Á��e/q®sBtZd�¨�c^�Z�IgJmBu � q�Gc^pZtI|�µ�g�c^�Zu�e[s�t

�9���?� �


and the result is obvious.

�T&s�tIe[m mB� µlg�µe;t9u�u0c2oJc^f2�ZpZu0gju0�Zg�a0c^pZtI|WsBtZd)g\a0a0c^a~¡Ùc^au0�Zg(e£�Z�Za0cn±�sBf�e;u�s�cWt B A�(Áz�nÁ � �¹�I��¾�����´St u0�Zg�tZc^u�e;u�s�cWt�c;¡ g³g\f2f�e���mBg\u�u0�Zg�gJ±Z�Ic^tZg\t9u°|WsNM�g\a0g\tIoJg�b90E A U E P � �e;tI| C 9?�0��Zg\t B A e;�Z�Za0cn±9s�f)e;u0g�q~u0�Zg�q�pZf c;¡�enmxm B

?µ�sBu0�­e�f�en±9sBf�enm�a0gJm]e;uvsBº^gjg\a0a0c^a

� a � � P � �� � A U¬� P � ��?� � � �

�(�nÁ�Á��

� ab9P 8 PP

B A N 8 P �=� C � �C � A U ��C � � 9 � P � �� � A UV� P � �

�?� � � �

� / ���),��#�������³���Ü��� �� "�� ���r�����������

Kobbelt's dotproduct algorithm as well as the before mentioned variants and modifications have been implemented in FORTRAN 77 and compared with the long accumulator implementation of Fortran Aquarels [...]. The algorithms are callable as subroutines with interfaces identical or at least similar to those for the Aquarels implementation. Indeed, the program dottest features an interactive test suite for all dotproduct algorithms. The following remarks characterize the implementation:

• The programs are written in standard FORTRAN 77.

• The low level operations like splitting a floating-point number, extracting the exponent or genus use the IEEE representation of DOUBLE PRECISION numbers.

• Scaling is not performed, hence the products have to have exponents in the double precision range.

• Error handling is not implemented, the parameter [...] should not be used. It is kept to have the same interfaces as the long accumulator routines.


• The large table is organized as a one-dimensional array where the odd and even values for the same exponent are adjacent. This table is created in the top level subroutines and passed as parameter to the others. Its occupied range are all entries between the minimum and maximum index, which in turn are also passed to and accordingly updated by the called subroutines.

The software is organized in the following files:

filename    contents
dottest     interactive test program
kobsub      dotproduct subroutines
phase1      subroutines for table insertion
phase2      adjustment of signs and computation of leading term
phase3      sum of table entries
port[...]   primitives for splitting etc.

Table [...]: Organisation of source files

7.1 Dotproduct Subroutines

We list the interfaces of the dotproduct subroutines together with a short description of their purpose. For more detailed documentation we refer to the source texts.

� e£tI|��Ve;aOg2º^g�oJu0c^aJq�µ�sBu0�������¬oJc^f2�IcWtZg\t�u�q µ�9s]o0�¬e;a0g�e^o\oJg�qOqvg�|­pIq®sBtZd­u0�Zg�qvuOavs]|9g�q� � � � e;tI| � � ���z{ZaOg�qv�Ig�oJuvsBº^gJmB�^² ���s¹q�e;t�pZt9pIqvg�|)sBt9u0g\d^g\ag\aOa0c^a Fze;dG²

��� ����������� �������������� ����� �����"!#��� $�%'& �)( � � � �)( � ( � � ��� ( ���*� ( ����,+

¨�c^�Z�IgJmBu � qc^a�s�d£s�tIe[mrenmBd^c^avsBu0�Zf ²��� ����������� �������������� ����� �����"!#���#��%'& �)( � � � �)( � ( � � ��� ( ���*� ( ����,+

0��Zg(e^|n�vpIqvu0f2g\t9uc;¡�q®sBd^tIq~s]q#a0g\�9m]e^oJg�| ����e^|Z|WsBuvsBc^t·¡¢a0cWf+mBgJ¡Ùu/uOc2avsBd^�9u��� ����������� �������������� ����� �����"!#���#���-& �)( � � � �)( � ( � � ��� ( ���*� ( ����,+

The adjustment of signs is replaced by addition from left to right. Addition from right to left updates the error term by adding up the remainders.

��� ����������� �������������� ����� �����"!#���#�����.& �)( � � � �/( � ( � � ��� ( ����� ( � ( ��,+

oJc^f2�ZpZu0g�aOc^pZd^�§e;�Z�Za0cn±9sBf�e;uvsBc^t�²ZaOgJm¹e£uvsBº^g(g\a0aOc^a� � =��� ����������� �������������� ����� �����"!#���#��� �0& �/( � � � �/( � ( � � ��� ( ����� ( ���1�� ( � ��2�3�� ( ����,+

compute inclusion, collect the error term by addition with remainder, add with directed rounding. This routine uses directed rounding via Sun's interface to IEEE arithmetic.

��� ����������� �������������� ����� �����"!#��������� & �)( � � � �/( � ( � � ��� ( ����� ( �� +

Insert products into table, and then add table entries with long accumulator. This routine delivers dotproducts with optimal accuracy which are also suitable to be updated by accumulation and can be used as [...] variables in Fortran Aquarels programs.

7.2 Table Insertion

The insertion of products into the large table is the essential new idea of Kobbelt's algorithm. It is accomplished by a call to

����#� �� � ����� � ��� � �#��� &���� (���/( � (� ��� ��� ( ��� ��� +

which splits the operands and inserts the [...] or [...] partial products by calling

����#� �� � ���������������#� &�� ( � (� ������ ( ��� ��� +

In both subroutines [...] denotes the table, the sum of whose entries represents the exact dotproduct. The occupied range in this table is between the indices [...] and [...]. A formal proof that the invariant is kept may easily be deduced from the source code.

7.3 Normalizing the Table

The table has to be normalized before its entries can be summed up from right to left. The normalization is either an adjustment of the signs or a computation of the leading term of the sum by addition from left to right. It is important that the sum of the table entries does not change.

����#� �� � ����� ����� � � $ & � (� ��� ��� (� ��� ��� +

g�y^pIe[m s�V\g�q�u0�Zg(q¸s�dWtIq\²����#� �� � ����������� & � (� ������ (� ��� ��� +


adds leading terms until difference of exponents is larger than [...]. The remainders and the final sum are inserted in the table, hence its sum is kept invariant.

��� ����������� �������������� ����� ����� �#���#� & � ( � (� ������ (� ��� ��� +

adds leading terms until difference of exponents is larger than n + [...].

7.4 Sum of the Table

In the file phase3 the different routines to sum up the table entries are comprised. In particular there are

��� ����������� �������������� ����� ����� �#��� & � ( ������ (� ������ +

Kobbelt's original procedure exploiting as much of the table structure as possible.

��� ����������� �������������� ����� �����"!����#� &�� (� ��� ��� ( ��� ��� +

Simple sum from right to left.

��� ����������� �������������� ����� �����"�����#� &�� (� ��� ��� ( ��� ��� +

Add entries with equal exponent before adding to the sum. (not yet implemented)

����#� �� � ����������� $ & �� )( ��� )( � ( ������ (� ������ +

Sum of the table with remainder to obtain highest accuracy.

7.5 Primitive (or Elementary) Operations

These routines all exploit the specific lay-out of an IEEE double precision floating point number.

����#� �� � ����������� &�� ( � ( � (�� +

computes addition with remainder, i.e. the rounded sum together with its remainder. It is valid for binary arithmetic with rounding to the nearest.
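As an illustration of what such a primitive delivers, here is a minimal Python sketch of addition with remainder for IEEE doubles with rounding to nearest. It uses the well-known branch-free formulation (often called TwoSum) and is not taken from the Fortran sources; the name `add_with_remainder` is hypothetical. The check with exact rational arithmetic confirms that sum and remainder together represent the exact result.

```python
from fractions import Fraction


def add_with_remainder(a, b):
    """Return (s, r): s is the rounded sum of a and b, r its rounding error.

    Assumes binary arithmetic with rounding to nearest and no overflow.
    """
    s = a + b
    b_virtual = s - a          # the part of b that actually arrived in s
    a_virtual = s - b_virtual  # the part of a that actually arrived in s
    r = (a - a_virtual) + (b - b_virtual)
    return s, r


for a, b in [(0.1, 0.2), (1e16, 1.0), (3.0, -2.0**-70)]:
    s, r = add_with_remainder(a, b)
    # s + r equals a + b when evaluated without any rounding
    assert Fraction(s) + Fraction(r) == Fraction(a) + Fraction(b)
```

The six floating-point operations in the function body are the price paid for every exact addition.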

�����#� � �#���� ����� ����������� �� & � +

determines the sign and is written in standard Fortran.


�^� ����$�)������G��' �������$#���� �h���)����� ����*����������$�������� � �������� �K���%��h �,�������������������)�����#� � �#���� ����� ����� � �#� �� & � +

determines the genus of a number by testing the last bit of the representation using [...] bit integer arithmetic. A call of an explicit bit test function may improve the performance. A similar comment holds for the following procedures. Another anomaly with this procedure is that currently an "odd" number with the last bit set has the genus 0. This may be changed in a future version but requires rewriting of the calling routines.

��� ����������� �������������� ����� ����� �����#����� & � +

generates the "even" version of a number by clearing the last bit.
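The two bit-level primitives just described, testing and clearing the last mantissa bit, can be sketched in Python as follows. The sketch assumes the 64-bit IEEE double format and uses the standard `struct` module to reinterpret a double as an unsigned integer; the function names are illustrative and the mapping of the last bit to a genus value is deliberately left out, since the report notes that its own numbering is somewhat unusual.

```python
import struct


def to_bits(x):
    """Reinterpret an IEEE double as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def from_bits(u):
    """Reinterpret a 64-bit unsigned integer as an IEEE double."""
    return struct.unpack("<d", struct.pack("<Q", u))[0]


def last_bit(x):
    """Return the least significant mantissa bit of x (0 or 1)."""
    return to_bits(x) & 1


def make_even(x):
    """Clear the least significant mantissa bit of x."""
    return from_bits(to_bits(x) & 0xFFFFFFFFFFFFFFFE)


x = 1.0 + 2.0**-52  # successor of 1.0, last mantissa bit set
print(last_bit(1.0), last_bit(x), make_even(x))
```

Because only the lowest bit of the representation is touched, sign and exponent are preserved and the value changes by at most one unit in the last place.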

�����#� � �#���� ����� ����� ���#������� & � +

|9g\u0g\a0f�sBtZg�q u0�Zg2u�e;�9mBg�sBtI|9gJ±�c£¡�e)t9pZf��Gg\a�e^o\oJcWa�|WsBtZd§u0c)sBu�q�gJ±Z�Ic^tZg\t9u�²&�°o�7u0pIenmxmB�Vu0�Zg)u�e;�9mBg)�Ic�q®sBuvsBc^t¬c;¡u0�Zg)º[enmBpZg�s]q � ���� � � & � +�� 2� �� � & � +�µ�Zg\a0g� ���� � � & � +�� $'& � ����� ��� & � +���� $ � +

��� ����������� �������������� ����� ����� � � & � +

oJc^f2�ZpZu0g�q*u0�Zg!FIc�e;uvsBtZd7Ù�Gc;sBt�u�t9pZf��Ig\al|9g\aOu0g\a0f�sBtZg�|��9��u0�Zgf2c�q�urq®sBd^t9sxwzo\e;tI|f�e;uvs]q0qOe��9sBu�{9sS² g^²�� � � & � +�����XZ]�� � = X =�� � ] � �

��� ����������� �������������� ����� ����� ����� � � & � +

computes the floating-point number determined by the first [...] mantissa bits; the remaining bits are set to 0.

� � �2�����\��� ��� ���2���� �� �

Runtime measurements have been performed on a Sun SPARC station using the [...] command of Sun's Fortran. Randomly generated vectors with varying dimension and varying exponent difference have been used. The dot product algorithms are repeated in a loop which is obviously not optimized by the compiler. Before we list some sample times, we summarize our observations.

For short vectors (dimension [...]) the long accumulator algorithm was sometimes faster than the new one; sometimes the times were about equal. The overhead due to the initialization of the table is too large; it sometimes took more time than the computation itself. This overhead may be decreased by using a common block for the large table. We could also observe the expected superiority of the addition from left to right in contrast to the adjustment of signs. But for longer vectors these differences nearly vanished: all variants of the new algorithm perform equally and considerably faster than the long accumulator. The longer a vector, the more work is done in phase [...], and this speeds up the algorithm. We further observed that rounded computation is faster by a factor of about [...]. This contradicts our theoretical treatment in section [...] and supports the statement that integer arithmetic may not be neglected totally.
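The measurement setup (random vectors of a given dimension and exponent spread, each dot product repeated in a loop) might be reproduced along the following lines. This is only a sketch in Python, not the Fortran harness used for the report; `random_vector`, `time_dot` and `plain_dot` are hypothetical names, and the way the exponent spread is generated is an assumption.

```python
import random
import time

def random_vector(dim, ediff, rng):
    """Return dim random doubles whose binary exponents spread over [0, ediff]."""
    return [
        rng.choice((-1.0, 1.0)) * rng.uniform(1.0, 2.0) * 2.0 ** rng.randint(0, ediff)
        for _ in range(dim)
    ]

def plain_dot(x, y):
    """Ordinary rounded dot product, used here as the baseline."""
    return sum(a * b for a, b in zip(x, y))

def time_dot(dot, x, y, rep):
    """Return the wall-clock seconds needed for rep evaluations of dot(x, y)."""
    start = time.perf_counter()
    for _ in range(rep):
        dot(x, y)
    return time.perf_counter() - start
```

Dividing the time of an accurate algorithm by the time of `plain_dot` for the same vectors and repetition count gives a relative cost that is less sensitive to the machine than raw seconds.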

In table [...] some sample times are listed, where ediff denotes the difference of exponents of the input vectors and rep the repetition factor. The times are given in seconds, in such a way that the execution of a rounded scalar product is [...].

[Table: Execution times. Rows: dim, ediff, rep, and one row of timings per algorithm.]

Conclusion

Several new variants of scalar product algorithms have been implemented which in general are faster than those known before. Their accuracy is very high and guaranteed, so each might be used for verified accurate computation. But only the algorithm [...], which uses the large table for preprocessing and then adds the values into the long accumulator, provides optimal accuracy for all relevant roundings. It also facilitates accumulated calculation of dot products and supports the data type [...]. Its runtime for large vectors is about a factor of [...] faster than the long accumulator algorithm and nearly as fast as the less accurate variants. We therefore suggest using this algorithm for vectors with many components (more than [...]) and prefer the long accumulator algorithm for fewer components.
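To illustrate what optimal accuracy means here, the following sketch contrasts an ordinary rounded dot product with one that accumulates exactly and rounds only once at the end. It does not implement the table-based algorithm or a hardware-style long accumulator; Python's exact rationals merely stand in for the exact accumulation, and the names `naive_dot` and `exact_dot` are hypothetical.

```python
from fractions import Fraction

def naive_dot(x, y):
    """Dot product with a rounding after every multiplication and addition."""
    s = 0.0
    for a, b in zip(x, y):
        s += a * b
    return s

def exact_dot(x, y):
    """Dot product accumulated without error and rounded once at the end."""
    acc = Fraction(0)
    for a, b in zip(x, y):
        # Every double is a rational number, so these products and sums are exact.
        acc += Fraction(a) * Fraction(b)
    # Converting the exact sum to a double performs the single final rounding.
    return float(acc)
```

With `x = [1e16, 1.0, -1e16]` and `y = [1.0, 1.0, 1.0]`, `naive_dot` returns `0.0` because the middle term is lost to cancellation, whereas `exact_dot` returns the correct value `1.0`.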


References

[Boh 77] Bohlender, G.: Floating-point computation of functions with maximum accuracy. IEEE Transactions on Computers, vol. C-26, no. 7, July 1977.

[Boh 78] Bohlender, G.: Genaue Berechnung mehrfacher Summen, Produkte und Wurzeln von Gleitkommazahlen und allgemeine Arithmetik in höheren Programmiersprachen. Ph.D. thesis, Universität Karlsruhe, 1978.

[Boh 83] Bohlender, G., Grüner, K.: Realization of an optimal computer arithmetic. In [Kul 83].

[Boh 90] Bohlender, G.: What do we need beyond IEEE arithmetic? In: Ullrich (ed.): Computer Arithmetic and Self-Validating Numerical Methods. Academic Press, 1990.

[Dek 71] Dekker, T.J.: A floating-point technique for extending the available precision. Num. Math. 18, 224-242, 1971.

[Kob 94] Kobbelt, L.: A fast dot-product algorithm with minimal rounding errors. Computing 52, 355-369, 1994.

[Knu 69] Knuth, D.: The Art of Computer Programming, Vol. 2: Seminumerical Algorithms. Addison-Wesley, Reading, 1969.

[Kul 76] Kulisch, U.: Grundlagen des numerischen Rechnens. BI, Mannheim, 1976.

[KuBo 76] Kulisch, U., Bohlender, G.: Formalization and implementation of floating-point matrix operations. Computing 16, 239-261, 1976.

[Kul 81] Kulisch, U., Miranker, W.L.: Computer Arithmetic in Theory and Practice. Academic Press, Orlando, 1981.

[Kul 83] Kulisch, U., Miranker, W.L. (eds.): A New Approach to Scientific Computation. Academic Press, Orlando, 1983.

[Lin 81] Linnainmaa, S.: Software for doubled-precision floating-point computations. ACM TOMS, Vol. 7, No. 3, 272-283, 1981.

[Pao [...]] Paoletti, [...]: [...]. Rapport Technique [...], SIMULOG, [...].

[Pich 72] Pichat, M.: Correction d'une somme en arithmétique à virgule flottante. Num. Math. 19, 400-406, 1972.


[Rump 80] Rump, S.M.: Kleine Fehlerschranken bei Matrixproblemen. Ph.D. thesis, Universität Karlsruhe, 1980.

[WvG [...]] Wolff v. Gudenberg, J.: [...]. Decision Support Systems 7, [...].


Unité de recherche INRIA Lorraine, Technopôle de Nancy-Brabois, Campus scientifique, 615 rue du Jardin Botanique, BP 101, 54600 VILLERS LES NANCY

Unité de recherche INRIA Rennes, Irisa, Campus universitaire de Beaulieu, 35042 RENNES Cedex

Unité de recherche INRIA Rhône-Alpes, 46 avenue Félix Viallet, 38031 GRENOBLE Cedex 1

Unité de recherche INRIA Rocquencourt, Domaine de Voluceau, Rocquencourt, BP 105, 78153 LE CHESNAY Cedex

Unité de recherche INRIA Sophia-Antipolis, 2004 route des Lucioles, BP 93, 06902 SOPHIA-ANTIPOLIS Cedex

Éditeur: INRIA, Domaine de Voluceau, Rocquencourt, BP 105, 78153 LE CHESNAY Cedex (France)

ISSN 0249-6399