Future Research in Computer Arithmetic. September 28, 2007. Eric Schwarz, IBM. Topics: Binary Multiplication (Proofs of Overlapped Scanning: foundations; 7:3 Counter Design: future), Division (Direct Division; Remainder Avoidance: active), Decimal Floating-Point (extremely active).
The Power Architecture and Power.org word marks and the Power and Power.org logos and related marks are trademarks and service marks licensed by Power.org.
Stamatis Vassiliadis Symposium
Future Research in Computer Arithmetic
September 28, 2007
Eric Schwarz, IBM
Topics
» Binary Multiplication
• Proofs of Overlapped Scanning (foundations)
• 7:3 Counter Design (future)
» Division
• Direct Division
• Remainder Avoidance (active)
» Decimal Floating-Point (extremely active)
• Pipelining Add
• Multiply
• Multiply-Add
Binary Multiplication
Basics
» Recode the multiplier and separate it into digits
» Create multiples of the multiplicand
» Multiplex the multiples
» Sum all partial products in a counter tree
» Reduce the final 2 partial products in a CLA (carry-lookahead adder)
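The steps above can be sketched in software. This is a minimal illustration that assumes radix-4 Booth recoding and a signed, even-width multiplier; the function names are hypothetical, and Python's `sum` stands in for the counter tree and the final CLA rather than modeling them.

```python
def booth_radix4_digits(multiplier, width):
    """Recode a signed `width`-bit multiplier into radix-4 digits in {-2..2}.

    Digits are returned least significant first. `width` must be even and
    the multiplier must fit in `width` bits of two's complement.
    """
    assert width % 2 == 0
    assert -(1 << (width - 1)) <= multiplier < (1 << (width - 1))
    bits = multiplier & ((1 << width) - 1)  # two's complement bit pattern
    digits = []
    prev = 0  # implicit 0 to the right of the least significant bit
    for i in range(0, width, 2):
        b0 = (bits >> i) & 1
        b1 = (bits >> (i + 1)) & 1
        # Overlapped scan of three bits: b1, b0 and the previous group's top bit.
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def multiply(multiplicand, multiplier, width=16):
    """Multiply by recoding, selecting multiples, and summing partial products."""
    digits = booth_radix4_digits(multiplier, width)
    # Create the multiples of the multiplicand once.
    multiples = {d: d * multiplicand for d in (-2, -1, 0, 1, 2)}
    # Multiplex one multiple per digit and shift it to its radix-4 weight.
    partial_products = [multiples[d] << (2 * i) for i, d in enumerate(digits)]
    # In hardware: counter tree down to two rows, then a CLA. Here: plain sum.
    return sum(partial_products)
```

With 16-bit recoding, `multiply(123, -45)` selects eight multiples from the set {-2X, -1X, 0, 1X, 2X} instead of adding sixteen single-bit partial products.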
History of Overlapped Scanning
» A.D. Booth in 1951 showed overlapped scanning
» L. Rubinfield in 1975 proved radix-4 Booth
» S. Vassiliadis in 1989 proved Booth for any radix
Fixed Point Multiplication in 1973
Fixed Point Multiplication in 1991
High Level Counters 7:3
US Patent 5,187,679 in 1993
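As a functional illustration of what a 7:3 counter computes, the sketch below counts seven equal-weight bits into a 3-bit result. It is built from four 3:2 counters (full adders) purely for clarity; it is an assumption-laden model of the function and not the circuit described in the patent. The function names are hypothetical.

```python
def full_adder(a, b, c):
    """3:2 counter: return (sum, carry) for three input bits."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def counter_7_3(bits):
    """Count seven equal-weight bits; return (s2, s1, s0) with weights 4, 2, 1."""
    assert len(bits) == 7
    x0, x1, x2, x3, x4, x5, x6 = bits
    s_a, c_a = full_adder(x0, x1, x2)
    s_b, c_b = full_adder(x3, x4, x5)
    s0, c_c = full_adder(s_a, s_b, x6)   # weight-1 column
    s1, s2 = full_adder(c_a, c_b, c_c)   # weight-2 column; its carry has weight 4
    return s2, s1, s0
```

A single 7:3 counter replaces this four-adder arrangement in one level of a partial-product tree, which is the motivation for studying higher-order counters.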
Power6 and Z6* Decimal Floating Point: Hardware
Power6 and Z6* processors have DFU
[Figure: chip floorplan showing two cores, L2 control, L2 data, L3 control, and memory control blocks]
754R Decimal Floating Point Format
» IEEE 754R defines 2 formats:
• Integer coefficient & DPD (Densely Packed Decimal)
» Formats for 32, 64, and 128 bits.
• C encodes 2 exponent bits and 1 decimal digit in 5 bits.

                            decimal32            decimal64             decimal128
Coefficient precision (p)   7                    16                    34
Bits of exponent            8                    10                    14
Exponent range              -101 to 90           -398 to 369           -6176 to 6111
Nmax                        (10^7 - 1) x 10^90   (10^16 - 1) x 10^369  (10^34 - 1) x 10^6111
Nmin                        1 x 10^-95           1 x 10^-383           1 x 10^-6143

Field widths in bits: S 1 | C 5 | Exp 6, 8, 12 | Coef 20, 50, 110
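The 5-bit C field packs the two most significant exponent bits together with the leading coefficient digit. The sketch below decodes it under the usual understanding of the DPD encoding: three exponent prefixes times ten digits give 30 finite patterns, and the remaining two patterns mark infinity and NaN. The function name and return shape are hypothetical.

```python
def decode_combination(c):
    """Decode a 5-bit combination field (most significant bit first).

    Returns (kind, exponent_msbs, leading_digit); the last two are None
    for the special values.
    """
    assert 0 <= c < 32
    if c == 0b11110:
        return ("infinity", None, None)
    if c == 0b11111:
        return ("nan", None, None)
    if (c >> 3) != 0b11:
        # Small digit 0..7: top two bits are the exponent bits.
        return ("finite", c >> 3, c & 0b111)
    # Large digit 8 or 9: exponent bits sit in the middle, digit is 100x.
    return ("finite", (c >> 1) & 0b11, 8 + (c & 1))
```

For example, `decode_combination(0b01101)` yields exponent bits `01` and leading digit 5, while `decode_combination(0b11011)` yields exponent bits `01` and leading digit 9.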
Power6 Decimal Floating Point Unit
» Cycle time: approx. 5 GHz, 13 FO4 design
» Hardware executes 64-bit and 128-bit formats.
• 144-bit dataflow can be split into two 72-bit pipes.
• DPD coefficients are decoded into BCD for execution.
» All cases are handled in hardware (no special-case software traps).
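To make the DPD-to-BCD expansion concrete, here is a sketch that decodes one 10-bit DPD declet into three decimal digits, following the standard DPD decoding table as commonly published. It models the function only and says nothing about how the Power6 expansion logic is built; the function name is hypothetical.

```python
def dpd_declet_to_digits(declet):
    """Decode a 10-bit DPD declet into three decimal digits (d2, d1, d0)."""
    assert 0 <= declet < 1024
    # Name the bits b9..b0 as p q r s t u v w x y.
    p, q, r, s, t, u, v, w, x, y = ((declet >> i) & 1 for i in range(9, -1, -1))
    if v == 0:                       # three small digits (0..7)
        return (4*p + 2*q + r, 4*s + 2*t + u, 4*w + 2*x + y)
    if (w, x) == (0, 0):             # d0 is 8 or 9
        return (4*p + 2*q + r, 4*s + 2*t + u, 8 + y)
    if (w, x) == (0, 1):             # d1 is 8 or 9
        return (4*p + 2*q + r, 8 + u, 4*s + 2*t + y)
    if (w, x) == (1, 0):             # d2 is 8 or 9
        return (8 + r, 4*s + 2*t + u, 4*p + 2*q + y)
    # w = x = 1: at least two large digits, selected by s and t.
    if (s, t) == (0, 0):
        return (8 + r, 8 + u, 4*p + 2*q + y)
    if (s, t) == (0, 1):
        return (8 + r, 4*p + 2*q + u, 8 + y)
    if (s, t) == (1, 0):
        return (4*p + 2*q + r, 8 + u, 8 + y)
    return (8 + r, 8 + u, 8 + y)     # all three digits are 8 or 9
```

Each returned digit maps directly to a 4-bit BCD nibble, so a 50-bit DPD coefficient continuation expands to 15 BCD digits (60 bits) before execution.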
Coefficient Dataflow
[Figure: coefficient dataflow, 36 digits wide (144 bits). Labeled blocks: Operand A Hi/Lo and Operand B Hi/Lo registers; Expand DPD to BCD (two instances); Compress BCD to DPD; Working Register Hi/Lo; Result Register Hi/Lo; Multiple Generator with Doubler & Quintupler and 1X, 3X, 4X registers; A mux and B mux; pipelined 2-cycle Rotator; adder segmented into 4-digit (4D) and 2-digit (2D) groups with +1 paths, feeding two Adder Registers; Prescale Table; Q Correction and Multiplexers; DEC to BIN & BIN to DEC Converters]
» The 36-digit dataflow splits into two 18-digit dataflows
• Magnitude calculations: A-B on one half, B-A on the other
• Multiplication: partial product accumulate on one half, partial product generate on the other
Power6 DFP Multiplication
Partial products are formed from two multiples to reduce area.

P = \sum_{i=0}^{p} (M \cdot N_i) \cdot 10^{i}
P = \sum_{i=0}^{p/2} [(M \cdot N_{2i+1}) \cdot 10 + (M \cdot N_{2i})] \cdot 100^{i}

Multiple   A      B
1X         0      1X
2X         2X     0
3X         2X     1X
4X         2X     2X
5X         5X     0
6X         5X     1X
7X         5X     2X
8X         10X    -2X
9X         10X    -1X

» 34-digit multiplication on the 36-digit dataflow: 1 digit every 2 cycles
» 16-digit multiplication on dual 18-digit dataflows: 1 digit every cycle
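The two-multiple scheme can be checked with a short digit-serial model. This sketch assumes M is the multiplicand and each multiplier digit selects an (A, B) pair from the table; the entry for digit 0 is an assumption, since the slide lists only 1X through 9X. Python integers stand in for BCD registers, and the names are hypothetical.

```python
# (A, B) factors of the multiplicand for each multiplier digit, per the table.
TWO_MULTIPLE_TABLE = {
    0: (0, 0),    # assumption: not listed on the slide
    1: (0, 1), 2: (2, 0), 3: (2, 1), 4: (2, 2), 5: (5, 0),
    6: (5, 1), 7: (5, 2), 8: (10, -2), 9: (10, -1),
}


def decimal_multiply(m, n):
    """Multiply non-negative integers one multiplier digit at a time."""
    assert m >= 0 and n >= 0
    # Only 1X, 2X, 5X and 10X need to be available (plus negation of 1X, 2X).
    stored = {k: k * m for k in (0, 1, 2, 5, 10)}
    product = 0
    weight = 1
    while n > 0:
        a, b = TWO_MULTIPLE_TABLE[n % 10]
        part_b = stored[b] if b >= 0 else -stored[-b]
        partial_product = stored[a] + part_b      # partial product generate
        product += partial_product * weight       # partial product accumulate
        weight *= 10
        n //= 10
    return product
```

The point of the table is that 2X, 5X and 10X are cheap to derive from a BCD operand (doubling, quintupling, and a one-digit shift), so no 3X, 7X or 9X multiple has to be precomputed and stored for the multiplier.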
Performance of Arithmetic Operations
Cycles Required for Execution

                    Doubleword Operands   Quadword Operands
Case 1 Add/Sub      9 to 13               11 to 15
Case 2 Add/Sub      11 to 15              13 to 17
Case 3 Add/Sub      13 to 17              15 to 19
Multiplication *    19 + N                21 + 2N
Division            82                    154

* N is the number of digits in the first operand, excluding leading zeros.
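As a worked example of the multiplication row, the helper below evaluates the two latency formulas from the table. The function name and its argument checks are hypothetical; the digit limits of 16 and 34 come from the decimal64 and decimal128 precisions.

```python
def multiply_cycles(first_operand_digits, quadword=False):
    """Cycles for a DFP multiply, per the table: 19 + N or 21 + 2N."""
    n = first_operand_digits
    limit = 34 if quadword else 16
    assert 0 <= n <= limit
    return 21 + 2 * n if quadword else 19 + n
```

A full-precision doubleword multiply (N = 16) takes 35 cycles and a full-precision quadword multiply (N = 34) takes 89 cycles, which matches the rates of one digit per cycle and one digit every two cycles.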
Future
» Pipelined Adder with Rounding Injection
• Liang-Kai Wang
» Decimal Multiplication
• Mark Erle and Michael Schulte - 3:2 Counter
• Tomas Lang and A. Nannarelli - 4:2 Counter
• Alvaro Vazquez et al. - 4221
• Luigi Dadda - counters
» Decimal Multiply-Add Pipeline with Rounding Injection
» Divide
• Tomas Lang and A. Nannarelli - Base 2 and Base 5
» Intel Format
Future of Computer Arithmetic
» Is based on clear proofs and expositions of the fundamental concepts.
• The easier to understand, the easier to build on
» Arithmetic is very active
• IEEE 754R Standard currently in ballot
• Decimal Floating-Point pipelined designs
• New adder designs, new multiplier designs
• Vector processing, image processing, video games
thanks Stamatis!