Factorization of DSP Transforms using Taylor Expansion Diagram
Jeremie Guillot, E. Boutillon
M. Ciesielski*, D. Gomez-Prado*, Q. Ren*, S. Askar*
LESTER Lab, Université de Bretagne SUD
*VLSI CAD Lab, University of Massachusetts, Amherst
Outline
• Taylor Expansion Diagram
• TED-based Factorization
• DSP example
• Results
• Conclusions
Taylor Expansion Diagram
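• Graph-based representation of arithmetic expressions.
• Based on the Taylor series expansion:
$$f(x) = \sum_{k=0}^{\infty} \frac{1}{k!}\, x^k f^{(k)}(0) = f(0) + x f'(0) + \frac{1}{2} x^2 f''(0) + \dots$$
[Figure: node f with variable x; its outgoing edges labelled 1, x and x^2 lead to f(0), f'(0) and f''(0)/2.]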
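As a worked illustration of the expansion, consider a quadratic chosen only for this example (it does not appear in the slides). For a polynomial, the Taylor terms at 0 are simply the coefficients of each power of x, which is why the series terminates:

```latex
f(x) = 2x^2 + 3x + 1
\quad\Rightarrow\quad
f(0) = 1, \qquad f'(0) = 3, \qquad \tfrac{1}{2} f''(0) = 2,
\qquad\text{so}\qquad
f(x) = \underbrace{1}_{f(0)} + x \cdot \underbrace{3}_{f'(0)} + x^2 \cdot \underbrace{2}_{f''(0)/2}.
```

In the diagram, the node for x would then carry three edges, for the powers 0, 1 and 2, leading to the constants 1, 3 and 2.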
Your First TED
• Example:
f(x,y)=5x+3y+5xy-3
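• Taylor decomposition with respect to x:
  f(x,y) = (3y-3) + x*(5y+5)
  • f(0) = g(y) = -3 + y*(3)
  • fx'(0) = h(y) = 5 + y*(5)
• Representation used by the tool: (^0 -3) means an (additive) edge with power 0 and weight -3.
[Figure: TED of f(x,y). The root node x has an edge (^0 1) to the y node for g(y) and an edge (^1 1) to the y node for h(y). The g(y) node has edges (^0 -3) and (^1 3) to the terminal node "one"; the h(y) node has edges (^0 5) and (^1 5) to "one".]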
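The decomposition above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's implementation: the dictionary-based polynomial representation and the helper name `taylor_decompose` are assumptions made for this example.

```python
from collections import defaultdict

# A polynomial is a dict mapping a monomial to its integer coefficient.
# A monomial is a tuple of (variable, power) pairs (hypothetical encoding).
# f(x,y) = 5x + 3y + 5xy - 3
f = {
    (("x", 1),): 5,
    (("y", 1),): 3,
    (("x", 1), ("y", 1)): 5,
    (): -3,
}

def taylor_decompose(poly, var):
    """Split poly into {k: coefficient polynomial of var**k}.

    For a polynomial, the coefficient of var**k equals the k-th
    Taylor term f^(k)(0)/k! taken with respect to var.
    """
    terms = defaultdict(dict)
    for monomial, coeff in poly.items():
        power = dict(monomial).get(var, 0)
        rest = tuple(p for p in monomial if p[0] != var)
        terms[power][rest] = terms[power].get(rest, 0) + coeff
    return dict(terms)

parts = taylor_decompose(f, "x")
g = parts[0]  # f(0)   = g(y) = -3 + 3y
h = parts[1]  # fx'(0) = h(y) =  5 + 5y
print(g)
print(h)
```

Applying the same function to `g` and `h` with respect to y yields the constant edge weights (-3, 3) and (5, 5) shown in the diagram.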
Your First TED, cont’d
• After normalization:
[Figure: normalized TED of f(x,y), with root node x, two y nodes and the terminal node ONE; edge labels shown: ^0 -3, ^1 5, ^0 5, ^1 3, ^0 1, ^1 1.]
• And more…
Properties:
– Acyclic, directed graph.
– Compact representation of linear expressions.
– When the graph is reduced, ordered and normalized, it is canonical.
– For a given functionality, there exists only one representation (useful for verification, equivalence checking…).
– Handles word-level and bit-level computations.
TED-based Factorization, Example
Discrete Cosine Transform (DCT), one of the main blocks in JPEG/MPEG compression.
• The DCT can be expressed as follows:
$$y(j) = \sum_{n=0}^{N-1} x(n) \cos\left[\frac{\pi}{N}\left(n + \frac{1}{2}\right) j\right]$$
• A direct implementation (N=4):
  for j in 0 to N-1 loop
    temp := 0;
    for n in 0 to N-1 loop
      temp := temp + x(n)*cosine(n,j);
    end loop;
    y(j) <= temp;
  end loop;
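For readers who want to run the direct form, here is a minimal Python sketch of the same double loop with operation counters. The function names and the counters are assumptions added for illustration. Unlike the pseudocode, the accumulator starts from the first product rather than from 0, so that the addition count is N*(N-1).

```python
import math

def cosine(n, j, N):
    """Coefficient cos(pi/N * (n + 1/2) * j) of the unnormalized DCT."""
    return math.cos(math.pi / N * (n + 0.5) * j)

def dct_direct(x):
    """Direct DCT: returns the outputs and the operation counts."""
    N = len(x)
    y = []
    mults = adds = 0
    for j in range(N):
        # Start from the first product so only N-1 additions are needed.
        temp = x[0] * cosine(0, j, N)
        mults += 1
        for n in range(1, N):
            temp += x[n] * cosine(n, j, N)
            mults += 1
            adds += 1
        y.append(temp)
    return y, mults, adds

y, mults, adds = dct_direct([1.0, 2.0, 3.0, 4.0])
print(mults, adds)
```

For N=4 the counters report 16 multiplications and 12 additions, the baseline that the factorization is compared against.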
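TED-based Factorization, Example
• DCT - Direct implementation: Y = M*X

$$M = \begin{bmatrix}
1 & 1 & 1 & 1 \\
\cos\frac{\pi}{8} & \cos\frac{3\pi}{8} & \cos\frac{5\pi}{8} & \cos\frac{7\pi}{8} \\
\cos\frac{\pi}{4} & \cos\frac{3\pi}{4} & \cos\frac{5\pi}{4} & \cos\frac{7\pi}{4} \\
\cos\frac{3\pi}{8} & \cos\frac{7\pi}{8} & \cos\frac{\pi}{8} & \cos\frac{5\pi}{8}
\end{bmatrix}
= \begin{bmatrix}
D & D & D & D \\
A & B & -B & -A \\
C & -C & -C & C \\
B & -A & A & -B
\end{bmatrix}$$

  Y0 = D*X0 + D*X1 + D*X2 + D*X3
  Y1 = A*X0 + B*X1 - B*X2 - A*X3
  Y2 = C*X0 - C*X1 - C*X2 + C*X3
  Y3 = B*X0 - A*X1 + A*X2 - B*X3

12 Additions, 16 Multiplications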
TED-based Factorization
• TED for the DCT-II of size 4, with $A = \cos(\pi/8)$, $B = \cos(3\pi/8)$, $C = \cos(\pi/4)$.
[Figure: TED of the size-4 DCT-II. The nodes for x0-x3 and x1-x2 and their associated sub-graphs are shared by Y1 and Y3.]
TED-based Factorization
• Changing the variable order helps identify candidates for common sub-expression elimination (CSE).
• Reuse sub-expressions by creating new variables:
  S0 = x0-x3
  S1 = x0+x3
TED-based Factorization
• Continue with the next substitutions:
  S2 = x1-x2
  S3 = x1+x2
TED-based Factorization
• No more candidates can be found for common sub-expression elimination.
• Each sub-expression Sn in this graph is represented by an adder.
• The expressions can be rewritten as:
  S0 = x0-x3; S1 = x0+x3; S2 = x1-x2; S3 = x1+x2;
  Y0 = S3+S1;
  Y1 = A*S0+B*S2;
  Y2 = C*(S1-S3);
  Y3 = -A*S2+B*S0;
8 Additions, 5 Multiplications
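The rewritten expressions can be checked numerically against the direct form. The sketch below is an illustration only: the function names and the test vector are assumptions, and A, B, C take the cosine values given for the size-4 DCT-II.

```python
import math

A = math.cos(math.pi / 8)
B = math.cos(3 * math.pi / 8)
C = math.cos(math.pi / 4)

def dct4_direct(x):
    """Reference: y(j) = sum over n of x(n) * cos(pi/4 * (n + 1/2) * j)."""
    return [
        sum(x[n] * math.cos(math.pi / 4 * (n + 0.5) * j) for n in range(4))
        for j in range(4)
    ]

def dct4_factored(x):
    """Factored form: 8 additions/subtractions and 5 multiplications."""
    x0, x1, x2, x3 = x
    s0 = x0 - x3
    s1 = x0 + x3
    s2 = x1 - x2
    s3 = x1 + x2
    y0 = s3 + s1
    y1 = A * s0 + B * s2
    y2 = C * (s1 - s3)
    y3 = -A * s2 + B * s0
    return [y0, y1, y2, y3]

x = [1.0, -2.0, 3.5, 0.25]  # arbitrary test vector
direct = dct4_direct(x)
factored = dct4_factored(x)
max_error = max(abs(d - f) for d, f in zip(direct, factored))
print(max_error)
```

The two results agree up to floating-point rounding, which is consistent with the factored form computing the same transform with fewer operations.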
TED-based Factorization Algorithm
Results
[Figure: bar chart "Optimization Rate", showing the optimization rate (0% to 100%) for additions (ADD) and multiplications (MPY) for WHT 4x4 to 64x64, DCT 4x4 to 128x128 and DHT 4x4 to 256x256, plus the mean.]
Conclusions
• TED makes the CSE process straightforward.
• It extracts the functionality from the specification and reduces computation.
• Other factorization schemes are currently under development (Radix Decomposition, etc.).
• Applications:
  • High-level synthesis.
  • Compilation.
  • Mathematical software…
Software: TEDify
• TEDify: a tool to optimize mathematical expressions using TEDs
• Available at: http://tango.ecs.umass.edu/TED/Doc/html/index.html
Thanks
• Any questions?
Results
Transform      Original # ADD   Original # MPY   # ADD after TED   # MPY after TED   Time
WHT 4x4        12               16               8                 0                 0.08
WHT 8x8        56               64               24                0                 0.09
WHT 16x16      240              256              64                0                 0.211
WHT 32x32      992              1024             160               0                 1.768
WHT 64x64      4032             4096             384               0                 27.158
DCT 4x4        12               16               8                 5                 0.084
DCT 8x8        56               64               34                21                0.097
DCT 16x16      240              256              126               85                0.182
DCT 32x32      992              1024             454               341               1.210
DCT 64x64      4032             4096             1654              1365              16.035
DCT 128x128    16256            16384            6166              5461              468
DHT 4x4        12               16               8                 0                 0.092
DHT 8x8        56               64               32                4                 0.094
DHT 16x16      240              256              112               28                0.195
DHT 32x32      992              1024             360               140               1.386
DHT 64x64      4032             4096             1200              620               17.98
DHT 128x128    16256            16384            4016              2604              340
DHT 256x256    65280            65536            14000             10668             10756
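The table can be turned into reduction percentages with a short script. The slides do not define how the optimization rate is computed, so the formula used here, 1 - (count after TED) / (original count), is an assumption, as are the names `results` and `reduction`. Only a few rows are copied in to keep the sketch short.

```python
# (original ADD, original MPY, ADD after TED, MPY after TED), from the table.
results = {
    "WHT 4x4": (12, 16, 8, 0),
    "DCT 4x4": (12, 16, 8, 5),
    "DCT 8x8": (56, 64, 34, 21),
    "DHT 8x8": (56, 64, 32, 4),
}

def reduction(original, optimized):
    """Fraction of operations removed (assumed definition of the rate)."""
    return 1.0 - optimized / original

rates = {
    name: (reduction(a0, a1), reduction(m0, m1))
    for name, (a0, m0, a1, m1) in results.items()
}

for name, (add_rate, mpy_rate) in rates.items():
    print(f"{name}: ADD {add_rate:.1%}, MPY {mpy_rate:.1%}")
```

Under this assumed definition, the DCT 4x4 row corresponds to removing one third of the additions and 68.75% of the multiplications.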