Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
An Optimized S-Box Circuit Architecture for Low Power
AES Design
IBM Japan Ltd.Tokyo Research Laboratory
Sumio Morioka and Akashi Satoh
� Background
� Power Analysis of Conventional S-Boxes
� Multi-Stage PPRM S-Box for Low-Power H/W
� Conclusion
Contents
Background
Back Ground
� In 2001, NIST selected Rijndael as the new symmetric key standard cipher AES
� AES H/W will be integrated in various applications � Low-power feature is important not only in low-end
applications but also in high-end servers� A 10-Gbps AES H/W chip consumes several Watts
in 0.13-µm CMOS � Need for power analysis and development of low-
power architectures for AES H/W
Power Analysis Method
PowerCalculation
� Power analysis is based on timing simulation� Gate switching including dynamic hazard is evaluated� Quite accurate estimation compared with static
analysis
TimingSimulation
BUFFER 0101010101MUX21I 0011001101XOR2 0100101101NAND2N1 0100101010XNOR2 1111000111XOR3 0101010100XOR2 1110001111NAND2N1 0111000111XOR3 0011100011
Wave Form
Test Vector
DQ
Synthesis
Net Listlibrary ieee;use ieee.std_logic;
entity XNOR is port (A : in std_logic; B : in std_logic; Y : out std_logic;end XNOR;
architecture RTL of XNOR isbegin signal T : std_logic; T <= A xor B; Y <= not T;end RTL
VHDL Code
� 128-bit bus 11-round Loop Architecture� Table-lookup-based S-Box using SOP logic
Power Analysis in AES H/W
128
AddRoundKey
Data Input
Data Output
Data Register
SubBytes
ShiftRows
MixColumns
3:1
2:11 %
10 %
1 %
75 %
15%OthersOthers
For Low-Power AES H/W
� Reducing S-Box power is most effective� There are various S-Box H/W architectures� Which architecture is suitable for low power ?
� Investigate performance of all conventional S-Boxes
� Develop a low-power S-Box architecture
Objectives
Power Analysis ofConventional S-Boxes
S-Box Definition
� Nonlinear byte substitution function� Multiplicative Inversion on GF(28) + affine
transformation.
1 0 0 0 1 1 1 11 1 0 0 0 1 1 1 1 1 1 0 0 0 1 11 1 1 1 0 0 0 11 1 1 1 1 0 0 00 1 1 1 1 1 0 00 0 1 1 1 1 1 00 0 0 1 1 1 1 1
1 1000110
+-1a=b
Affine trans.8x8 XOR matrix
InversionComplicated math. logic
S-Box Architectures
� Generate black box circuit from a truth table� High-speed� Rather large� S-Box and S-Box-1 cannot be merged
� Various mathematical techniques can be applied� Rather slow� Compact implementation� S-Box and S-Box-1 can be merged
GF inverter + affine
Direct mapping
GF Inverter + Affine� Apply hieratical structure of composite field GF(((22)2)2)
� Map elements on GF(28) onto GF(((22)2)2) by isomorphism� Each component is constructed using AND-XOR logic
isomorphism
-1
affine trans.
isomorphism
inversion
����
����
GF((2
GF(2)
)2 )2GF(((2 )2 )2 )2
)2GF(2
GF(28)
GF(((22)2 2) )
GF(28)φφφφ2
2
2x-1x
2
2
4
4
2x-1x
4
4
λλλλ
isomorphism
-1
affine trans.
isomorphism
inversion
����
����
GF((2
GF(2)
)2 )2GF(((2 )2 )2 )2
)2GF(2
GF(28)
GF(((22)2 2) )
GF(28)merged
Direct Mapping
ANDmatrix
8-bit input
8-bit output
OR / XORmatrix
Severalhundred
wires
� 2-Level Logic� SOP (Sum of Products) : (NOT)-AND-OR� POS (Product of Sums) : (NOT)-OR-AND� PPRM (Positive Polarity Reed-Muller) : AND-XOR
Direct Mapping
in0 in1 in2 in3 in4 in5 in6 in7
out0
out7
out1
2:1MUX
0
1
0
1
in0 in1 in2 in3 in4 in5 in6 in7
out0
out7
out1
� Selector-Based Logic� BDD (Binary Decision Diagram)� Twisted BDD will appear at ICCD 2002 in Sep.
BDDTwisted
BDD
Power vs. Gate Count
� Smaller circuits do not always consume less power
0
50
100
150
200
250
300
350
400
0 0.5 1 1.5 2 2.5 3
Gate Count (Kgate)
Pow
er (u
W)
Composite Field SOP Table
Lookup
TwistedBDD
BDDPPRM
0.13-µm 1.5-V CMOS @ 10 MHzNominal conditions
POS
Analysis
� Power consumption of S-Box is greatly influenced by the number of dynamic hazards
� Differences of signal arrival times at each gate� Propagation probability of signal transitions
Caused by
Analysis
� Composite field S-Box consumes a lot of power in spite of having the smallest size
� It has many crossing and branched signal paths
2x-1x
λλλλ
Switch many times
Multiple signalpaths
Differences of signal arrival times at each gate
Analysis
� An XOR gate transfers signal transitions from input to output with probability 100%
� For AND, OR gates, the probability is 50%� So power for SOP is lower than PPRM
100% Probability
AND
OR
0
1
50% Probability
1
0 1
0
1XOR0
Propagation probability of signal transitions
Multi-Stage PPRM S-Box for Low-Power H/W
Approach for Low Power S-Box
0.5
ANDarray
XORarray
1.0Transition Probability
0.5 1.0 0.5 1.0 0.5 1.0
� Use composite field S-Box� Reduce gate counts
� Divide combination logic into multiple stages� Reduce probability of signal transitions
� Adjust the signal timing between each stage using 2-level logic� Reduce the number of dynamic hazards
Multi-Stage PPRM� Composite field S-Box is divided into several blocks� Each block is designed using PPRM logic suitable for
GF operations� Adjust signal timing by using 2-level (AND-XOR) logic
ANDarray
XORarray
28 ANDarray
XORarray
12 ANDarray
XORarray
324 4
4
4
4
4
Delay chain
88
4
4
2x-1x
4
4
-1����
����44
4
4
88 +affine
λλλλ
3-stage PPRM
Multi-Stage PPRM� PPRM S-Boxes can be divided many ways (1~6-stages)
4
4
2x-1x
4
4
λλλλ -1����
����44
4
4
88 +affine
XOR AND-XOR AND-XOR AND-XOR
AND-XOR AND-XOR AND-XOR
φφφφ2
2
2x-1x
2
2
Power vs. Gate Count� 3-stage PPRM is the most effective architecture� Multi-stage SOP is not suitable for GF operation
0
100
200
300
400
500
600
700
800
0 1 2 3 4 5 6 7Gate Count (Kgate)
Pow
er (u
W)
PPRMSOP
1-stage(normal PPRM)
2-
3-4-6-
2-
1-stage(normal SOP)
3-0.13-µm 1.5-V CMOS @ 10 MHzNominal conditions
0
50
100
150
200
250
300
350
400
0 0.5 1 1.5 2 2.5 3
Gate Count (Kgate)
Pow
er (u
W)
Composite Field SOP Table
Lookup
TwistedBDD
BDDPPRM
3-stage PPRM
0.13-µm 1.5-V CMOS @ 10 MHzNominal conditions
POS
Power vs. Gate Count� 3-stage PPRM is the most effective architecture� Multi-stage SOP is not suitable for GF operation
1/3 power with 1/2 gates
Conclusion
� The AES S-Box (SOP logic) consumes 75% of the power
� Dynamic hazards boost power needs of S-Boxes� A multi-stage PPRM architecture based on a
composite field S-Box was developed� 3-stage PPRM / SOP : Power = 1/3, Size = 1/2� This architecture can be applied to other S-Boxes
defined over Galois fields