
[IEEE 2008 3rd International Design and Test Workshop (IDT) - Monastir, Tunisia (2008.12.20-2008.12.22)] 2008 3rd International Design and Test Workshop - SRAM-FPGA implementation


SRAM-FPGA Implementation of Masked S-Box Based DPA Countermeasure for AES

Najeh Kamoun (1), Lilian Bossuet (2), Adel Ghazel (1)

(1) CIRTA'COM Lab, SUP'COM, Cite technologique des Communications El Ghazala, Ariana, Tunisia
(2) IMS Lab, ENSEIRB, Bordeaux, France

e-mail: [email protected]@[email protected]

Abstract- This paper presents the FPGA implementation and overhead evaluation of an algorithmic DPA countermeasure for the Advanced Encryption Standard (AES). To reduce the implementation overhead, the masked compact S-Box proposed by Canright was chosen to implement a DPA countermeasure on an SRAM FPGA. The results show that the secured AES IP increases the number of slices by 60.1% and decreases the clock frequency by 4%.

Keywords- AES, DPA, Masked S-Box, SRAM FPGA.

I. INTRODUCTION

The expanding use of digital communications, electronic financial transactions and digital signature applications has raised demanding security issues in order to fulfill the requirements for secrecy, integrity and non-repudiation of exchanged confidential information. In this context, cryptographic algorithms and devices are fundamental building blocks of every secure communication system. Symmetric ciphers are widely used because they are simple and efficient. Since 2001, the Advanced Encryption Standard (AES) [1] has been the symmetric block cipher standard of the National Institute of Standards and Technology (NIST). It is a secure encryption algorithm, used not only for U.S. government documents, but also in electronic commerce.

Besides executing a cipher algorithm, a cryptographic device is also required to physically protect the secret data manipulated during execution. Traditionally, cryptanalysis has been directed at algorithms but, in the last few years, the hardware implementation has come to be considered a fundamental part of the security evaluation of a cryptographic design. New challenges in this field are the so-called side-channel attacks [2], which exploit information leakage from the cryptographic device due to physical phenomena such as power consumption, electromagnetic radiation and execution timing. These attacks are based on monitoring a physical quantity and applying statistical analysis to extract confidential information from extremely noisy signals.

Many research works have focused on Differential Power Analysis (DPA) [2-6] and have proposed multiple countermeasure techniques at different levels: algorithmic, system and logic [7-8].

In this paper we focus on an algorithmic DPA countermeasure for AES. We first choose a low-complexity masking technique, then design its FPGA implementation, and finally analyze the hardware complexity and timing performance of the proposed solution.

The present paper is organized as follows. Section II describes the AES algorithm and DPA principles. The masked S-Box based DPA countermeasure is chosen in Section III. Section IV presents FPGA implementation details and results.

II. AES AND DPA PRINCIPLES

A. AES Algorithm processing

The Advanced Encryption Standard AES [1] is a symmetric cipher which encrypts and decrypts 128-bit data blocks with a key size of 128, 192 or 256 bits. In this paper, we are interested in the implementation of the AES encryption scheme with a 128-bit key. The 16-byte input plain text D_i (i ∈ [1 .. 16]) is arranged in a four-by-four byte matrix called the state, as indicated in Table I. All the transformations in AES operate on the state.

Table I. Data arrangement in AES state

D1   D5   D9    D13
D2   D6   D10   D14
D3   D7   D11   D15
D4   D8   D12   D16
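The column-by-column arrangement of Table I can be illustrated with a minimal Python sketch. The helper names `bytes_to_state` and `state_to_bytes` are hypothetical and chosen only for this illustration; the state is represented as a list of four rows.

```python
def bytes_to_state(block):
    """Arrange 16 bytes D1..D16 column by column into a 4x4 state (list of rows)."""
    if len(block) != 16:
        raise ValueError("AES operates on 16-byte blocks")
    # state[r][c] holds D_(r + 4c + 1), i.e. block[r + 4*c] with 0-based indexing
    return [[block[r + 4 * c] for c in range(4)] for r in range(4)]


def state_to_bytes(state):
    """Inverse mapping: read the state column by column back into 16 bytes."""
    return bytes(state[r][c] for c in range(4) for r in range(4))


# Using the byte values 1..16 as stand-ins for D1..D16 reproduces Table I.
state = bytes_to_state(bytes(range(1, 17)))
for row in state:
    print(row)
```

With the byte values 1 to 16 standing in for D1 to D16, each printed row matches the corresponding row of Table I.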

The main AES cipher transformations are:

AddRoundKey: A round key is added to the state matrix using the XOR operation. The round keys are derived from the cipher key using the Key Expansion algorithm.

ShiftRows: The second row of the state matrix is cyclically shifted by one byte to the left, the third row by two bytes and the fourth row by three bytes. The first row remains unchanged. The ShiftRows transformation increases the "diffusion" properties of AES.

SubBytes: Each byte of the state matrix is substituted using a bijective Substitution Box, S-Box for short. The S-Box is based on the non-linear inversion in the finite field GF(2^8) and a bitwise affine transformation. The S-Box step increases the "confusion" properties of AES.

MixColumns: This step is a linear transformation, which increases the diffusion properties of AES. Equation (1) describes the relation between the output column bytes c_i and the input column bytes b_i.

$$
\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{pmatrix} =
\begin{pmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{pmatrix}
\begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix}
\qquad (1)
$$

The matrix elements {03}, {02} and {01} correspond to the polynomials x + 1, x and 1. Byte multiplication is performed in GF(2^8), i.e. modulo the irreducible polynomial x^8 + x^4 + x^3 + x + 1, while the column, seen as a polynomial with coefficients in GF(2^8), is multiplied modulo the polynomial x^4 + 1.

Key Expansion: The key expansion derives the round keys from the cipher key.

Figure 1 describes all the transformations in an AES encryption. The number of rounds is 10 for a 128-bit key.

Figure 1. Basic transformation of an AES encryption
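As an illustration of the two diffusion steps listed above, the following Python sketch implements ShiftRows and the matrix product of equation (1) on a single column. The function names are hypothetical; the state is assumed to be a list of four rows, and byte multiplication by {02} uses the standard AES reduction polynomial x^8 + x^4 + x^3 + x + 1.

```python
def xtime(a):
    """Multiply a byte by {02} in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return (a ^ 0x11B) if a & 0x100 else a


def shift_rows(state):
    """Rotate row r of the 4x4 state left by r bytes (row 0 is unchanged)."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def mix_column(col):
    """Apply the matrix of equation (1) to one column [b0, b1, b2, b3]."""
    out = []
    for i in range(4):
        # Row i of the circulant matrix is ({02}, {03}, {01}, {01}) rotated
        # right i times, so rotate the inputs instead of the coefficients.
        b0, b1, b2, b3 = (col[(i + k) % 4] for k in range(4))
        out.append(xtime(b0) ^ (xtime(b1) ^ b1) ^ b2 ^ b3)  # {03}b = {02}b ^ b
    return out


rows = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
shifted = shift_rows(rows)
mixed = mix_column([0xDB, 0x13, 0x53, 0x45])
print(shifted)
print([hex(b) for b in mixed])
```

A hardware implementation would typically compute the four output bytes in parallel with XOR networks; the loop here only mirrors the matrix row by row.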

B. DPA principle

Power analysis attacks [2] generally require a hypothetical model of the device under attack to predict its power consumption. For example, FPGAs are usually made of CMOS gates, whose switching activity is responsible for the main component of the power consumption. For a single CMOS gate, the dynamic power consumption P_D, as given by equation (2), is the product of the gate load capacitance C_L, the square of the supply voltage V_DD, the probability P_tr of an output transition from 0 to 1 and the clock frequency F [4].

$$
P_D = C_L \, V_{DD}^{2} \, P_{tr} \, F \qquad (2)
$$

Equation (2) shows that the dynamic power consumption P_D is data dependent. Consequently, an attacker may estimate the device power consumption at time t from the Hamming weight of the corresponding output inside the device. Based on this simple observation, power analysis attacks have been applied to numerous algorithms and devices such as FPGAs [3, 6]. We focus on attacks targeting the AES algorithm. The attacker proceeds as follows. First, he targets the most significant byte of the key, K_MS. Then, for N different plain texts, he predicts the Hamming weight at the S-Box output for every possible value of K_MS. The result of this prediction is an N×256 selected prediction matrix M_P.

In the second part of the attack, the adversary lets the circuit encrypt the same N plain texts with a fixed key and measures the power consumption of the device while the chip performs the targeted operation. This results in an N×1 measurement vector V_M.

Finally, the attacker computes the correlation between the measurement vector and all the columns of the selected prediction matrix M_P. If the attack is successful, it is expected that only one value, corresponding to the correct key bits, leads to a high correlation. An efficient way to compute the correlation is to use the Pearson coefficient, which can be expressed as indicated in equation (3).

$$
C(V_M, M_P(c_i)) = \frac{E(V_M \cdot M_P(c_i)) - E(V_M) \cdot E(M_P(c_i))}{\sqrt{\mathrm{var}(V_M) \cdot \mathrm{var}(M_P(c_i))}} \qquad (3)
$$

In this expression, E(V_M) denotes the mean of the measurement set V_M, var(V_M) its variance, and M_P(c_i) the i-th column of the matrix M_P. More explanations of the power analysis attack principles can be found in previous publications [5].

III. ALGORITHMIC DPA COUNTERMEASURE CHOICE

Power analysis attacks are feasible because the power consumption of a cryptographic device depends on the intermediate values of the executed cryptographic algorithm. Hence, the idea of countermeasures is to make the power consumption independent of those intermediate values in order to withstand such attacks. Security is only as strong as the weakest link; therefore cryptographic designs should be protected at all levels of abstraction. Those levels can be: algorithm, system and logic. Each abstraction layer raises specific modeling, design and implementation issues that must be covered for secure system operation [6]. Figure 2 explains the possible actions at each level.

System level countermeasures cannot be integrated inside the design, so they are easily localized by the adversary. Their effect can be removed by smoothing the power consumption. Moreover, they consume a significant part of the chip power.

Figure 2. Classification of security levels.

Algorithmic countermeasures can be restricted to masking schemes. Their basic idea, explained in references [7, 8], is to randomize the intermediate results that are produced during the computation of a cryptographic algorithm.
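The effect of such randomization on the correlation attack of Section II.B can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the measurement flow used in this paper: the "traces" are simulated as the Hamming weight of one S-Box output byte plus Gaussian noise, masking is modeled as a fresh random byte XORed onto that output, and all names (`simulate_traces`, `attack`, `SECRET`) are hypothetical. Only the Python standard library is used.

```python
import random


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def sbox(x):
    """AES S-Box: inversion in GF(2^8) (0 maps to 0) followed by the affine map."""
    inv = 1
    for _ in range(254):  # x^254 = x^-1 for x != 0, and 0^254 = 0
        inv = gf_mul(inv, x)
    rot = lambda v, n: ((v << n) | (v >> (8 - n))) & 0xFF
    return inv ^ rot(inv, 1) ^ rot(inv, 2) ^ rot(inv, 3) ^ rot(inv, 4) ^ 0x63


SBOX = [sbox(x) for x in range(256)]
HW = [bin(v).count("1") for v in range(256)]  # Hamming weight table


def pearson(xs, ys):
    """Pearson coefficient written as in equation (3)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mx * my
    vx = sum(x * x for x in xs) / n - mx * mx
    vy = sum(y * y for y in ys) / n - my * my
    return cov / (vx * vy) ** 0.5


def simulate_traces(plaintexts, key, rng, masked):
    """Assumed leakage model: Hamming weight of the S-Box output plus noise."""
    traces = []
    for p in plaintexts:
        v = SBOX[p ^ key]
        if masked:
            v ^= rng.randrange(256)  # fresh random mask hides the true value
        traces.append(HW[v] + rng.gauss(0.0, 1.0))
    return traces


def attack(plaintexts, traces):
    """Correlate the traces with each of the 256 prediction columns of M_P."""
    scores = [pearson(traces, [HW[SBOX[p ^ k]] for p in plaintexts])
              for k in range(256)]
    best = max(range(256), key=lambda k: scores[k])
    return best, scores[best]


rng = random.Random(2008)
plaintexts = [rng.randrange(256) for _ in range(400)]
SECRET = 0x3C  # hypothetical key byte
plain_guess, plain_corr = attack(
    plaintexts, simulate_traces(plaintexts, SECRET, rng, masked=False))
masked_guess, masked_corr = attack(
    plaintexts, simulate_traces(plaintexts, SECRET, rng, masked=True))
print(hex(plain_guess), round(plain_corr, 2))
print(hex(masked_guess), round(masked_corr, 2))
```

In this model the unmasked traces single out the key byte with a clearly dominant correlation, whereas the masked traces give only small, noise-level correlations for every guess. A real masked implementation also has to process the mask itself, so this sketch says nothing about higher-order attacks.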

We take the case of an AES cipher. All the steps in a round of AES are affine, except for the Galois field inversion sub-step of the SubBytes step. For the other steps, calculation of the mask correction is linear, so an additive mask is most convenient. Some previous research works have suggested switching to a multiplicative mask for the Galois inverse step [8], but one inescapable weakness is that a zero data byte is left unmasked by multiplication. Oswald et al. [9] suggested the use of additive and multiplicative masks for the SubBytes function with the tower field representation. Applying this representation, inversion in GF(2^8) involves several multiplications and one inversion in the subfield GF(2^4), which in turn involves multiplications and an inversion in GF(2^2). In the sub-subfield GF(2^2), inversion is identical to squaring, and so is linear (over GF(2)). They showed how to compute the mask correction for the tower field approach. Many of the correction terms involve multiplication in subfields, and they mention how some of these multiplications can be eliminated through clever re-use of parts of the input mask for the output.
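The three algebraic facts used in this paragraph can be checked numerically with a short Python sketch. The generic multiplier `gf_mul` and the variable names are hypothetical; GF(2^8) is assumed to use the AES polynomial x^8 + x^4 + x^3 + x + 1 and GF(2^2) the polynomial x^2 + x + 1, which need not match the basis chosen in [9] or [10].

```python
def gf_mul(a, b, poly=0x11B, width=8):
    """Multiply in GF(2^width) modulo the given irreducible polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a >> width:  # degree reached `width`: reduce
            a ^= poly
        b >>= 1
    return r


# 1) A linear step lets an additive (XOR) mask be corrected separately:
#    L(x ^ m) = L(x) ^ L(m), shown here for multiplication by {02}.
additive_ok = all(
    gf_mul(2, x ^ m) == gf_mul(2, x) ^ gf_mul(2, m)
    for x in range(256) for m in range(256)
)

# 2) A multiplicative mask r != 0 hides non-zero data, but never a zero byte.
zero_unmasked = all(gf_mul(0, r) == 0 for r in range(1, 256))
nonzero_masked = len({gf_mul(0x57, r) for r in range(1, 256)}) == 255

# 3) In GF(2^2), a^3 = 1 for a != 0, so a^2 is the inverse of a,
#    and squaring is linear over GF(2).
GF4 = dict(poly=0b111, width=2)
gf4_inverse_is_square = all(
    gf_mul(a, gf_mul(a, a, **GF4), **GF4) == 1 for a in range(1, 4)
)
gf4_square_linear = all(
    gf_mul(a ^ b, a ^ b, **GF4) == gf_mul(a, a, **GF4) ^ gf_mul(b, b, **GF4)
    for a in range(4) for b in range(4)
)

print(additive_ok, zero_unmasked, nonzero_masked,
      gf4_inverse_is_square, gf4_square_linear)
```

The second check is the reason purely multiplicative masking is avoided: whatever the mask, a zero input always produces a zero product.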

Canright in [10] proposed to incorporate this masking approach into the compact S-Box introduced in [11]. Applying the same optimizations as for the unmasked S-Box to the mask correction terms results in a more compact masked S-Box. The S-Box layer occupies 60% of the area in an efficient hardware implementation of the AES [1]. The area overhead is then above 120% if we use 16 S-Boxes. Mentens in [12] reduces the area overhead of a secure AES with masking schemes to 20% by reducing the number of S-Box functions to four, without taking into consideration the area of the True Random Noise Generator TRNG [13]. We mention that Mentens has used the Satoh S-Box [12].

Algorithmic countermeasures are global solutions and can be adapted to various technologies. They require a true random noise generator for the mask function. This generator takes a significant area, and the robustness of the countermeasure depends on the quality of its randomness [9].

IV. SRAM FPGA IMPLEMENTATION AND RESULTS

Figure 3 presents the experimental set-up used to implement and evaluate the DPA and its countermeasure for a secure AES implemented on an FPGA board.

An Agilent 54622D digital storage oscilloscope with a bandwidth of 100 MHz and a maximum sampling rate of 200 MSample/s is used. To obtain enough sample points per cycle, we lowered our design speed to 95 Hz. The communication between the scope and the PC is done via the General Purpose Interface Bus GPIB (IEEE-488). A 0.2 Ω resistor is inserted between the power supply and the FPGA board. The voltage difference between CH1 and CH2 is measured with a 20 MHz low-pass probe to reject DC signal.
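The conversion from the two measured voltages to a power trace can be sketched as follows. This is only an illustration under assumptions that the text does not state: CH1 is taken to probe the supply side of the 0.2 Ω resistor and CH2 the board side, and the function name `power_trace` and the sample values are hypothetical.

```python
SHUNT_OHMS = 0.2  # series resistor value given in the text


def power_trace(ch1_volts, ch2_volts, r_shunt=SHUNT_OHMS):
    """Convert two oscilloscope channels into instantaneous power samples (watts)."""
    if len(ch1_volts) != len(ch2_volts):
        raise ValueError("both channels must have the same number of samples")
    trace = []
    for v_supply, v_board in zip(ch1_volts, ch2_volts):
        current = (v_supply - v_board) / r_shunt  # Ohm's law across the shunt
        trace.append(v_board * current)           # power drawn by the board
    return trace


# Hypothetical samples, in volts, for three time points.
print(power_trace([3.30, 3.30, 3.30], [3.28, 3.26, 3.29]))
```

Because the supply voltage is nearly constant, the differential voltage alone is already proportional to the current, which is why it can be used directly as the measurement vector in a correlation attack.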

A. Implementation of secure S-Box

The S-Box module, composed of the AddRoundKey and SubBytes functions as indicated in Figure 4, was considered for a hardware implementation on a Xilinx Virtex 4 FPGA (LX25-FF676). The input data D_in is combined with a secret key byte K with an exclusive-or. Then, the S-Box is applied to produce the output S. The design of the compact S-Box proposed by Canright [11] was used for this implementation.

Figure 3. Experimental set-up for DPA and countermeasure implementation

Figure 4. Data flow chart of the unsecured AES S-Box module

Table 2. FPGA implementation results for AES S-Box module.

Performance           Unsecured    Secure with masking
Area (slices)         36           100
Area overhead         0%           44%
Frequency (MHz)       88           60
Frequency decrease    0%           31%

Table 2 shows that the secure AES S-Box with masking needs a larger area: the number of used slices increased by 44%. In addition, the frequency decreased from 88 MHz to 60 MHz. This is due to the increase in the number of used slices.

B. Implementation of secure AES design

The results of Table 3 show that the secure AES with the masking scheme needs a larger area than the unsecured one. As in the previous implementation, the number of used slices rose, here by 60.1%, and the frequency decreased by 4%. This is due to the increase in the number of used slices.
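The overhead figures quoted for the complete AES design follow from the slice counts and clock frequencies reported in Table 3 (1424 and 2281 slices, 143 and 137 MHz). The arithmetic can be reproduced with a short Python sketch; the helper name `percent_change` is hypothetical.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Values taken from Table 3 (unsecured AES, then AES secured with masking).
area_overhead = percent_change(1424, 2281)   # slices
freq_decrease = -percent_change(143, 137)    # MHz, sign flipped to report a decrease

print(f"area overhead: {area_overhead:.2f}%")
print(f"frequency decrease: {freq_decrease:.2f}%")
```

Applied to the Table 2 frequencies (88 and 60 MHz), the same helper gives a decrease of roughly 32%, in line with the 31% reported for the S-Box module.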

Table 3. FPGA implementation results for AES

Performance           Unsecured    Secure with masking
Area (slices)         1424         2281
Area overhead         0%           +60.1%
Frequency (MHz)       143          137
Frequency decrease    0%           4%

V. CONCLUSION

In this paper, an FPGA implementation of a first-order algorithmic DPA countermeasure for the Advanced Encryption Standard (AES) is proposed. After presenting AES processing and the DPA principle, the authors focused on choosing a low-complexity algorithmic countermeasure technique. The study of the techniques proposed in recent literature showed that the best solution can be obtained with the Canright masked compact AES S-Box. The originality of the results presented in this work comes from the first exploration of the SRAM FPGA implementation overhead of this DPA countermeasure technique. The results showed that the secured AES IP increases the number of slices by 60.1% and decreases the frequency by 4%.

ACKNOWLEDGMENT

The authors would like to acknowledge the University of 7th of November at Carthage for financing this work and the IMS Lab at the University of Bordeaux I for giving access to their security lab facilities.

REFERENCES

[1] National Institute of Standards and Technology (NIST), FIPS-197: Advanced Encryption Standard, November 2001. Available online at http://www.itl.nist.gov/fipspubs/.

[2] P. Kocher, J. Jaffe, B. Jun, Differential Power Analysis, in the proceedings of Crypto 1999, Lecture Notes in Computer Science, vol. 1666, pp. 398-412, Santa Barbara, USA, August 1999, Springer-Verlag.

[3] S. B. Ors, E. Oswald and B. Preneel, Power-Analysis Attacks on an FPGA - First Experimental Results, Proceedings of Cryptographic Hardware and Embedded Systems - CHES 2003, 5th International Workshop, Cologne, Germany, September 8-10, 2003, pp. 35-50, LNCS 2779.

[4] J. M. Rabaey, Digital Integrated Circuits, Prentice Hall International, 1996.

[5] F.-X. Standaert, E. Peeters, F. Mace, J.-J. Quisquater, Updates on the Security of FPGAs Against Power Analysis Attacks, in the proceedings of ARC 2006, Lecture Notes in Computer Science, vol. 3985, pp. 335-346, Delft, The Netherlands, March 2006, Springer-Verlag.

[6] L. Batina, N. Mentens, and I. Verbauwhede, Side-channel Issues for Designing Secure Hardware Implementations, in IEEE International On-Line Testing Symposium, IOLTS 2005, special session on side-channel and fault attacks, IEEE, 4 pages, July 6-8, 2005, Saint Raphael, France. 36(4):68-74, April 2003.

[7] T. Popp, S. Mangard, E. Oswald, Power Analysis Attacks and Countermeasures, IEEE Design & Test of Computers - Design and Test of ICs for Secure Embedded Computing, IEEE, 2007.

[8] M.-L. Akkar and C. Giraud, An Implementation of DES and AES, Secure against Some Attacks, in CHES 2001, Lecture Notes in Computer Science, vol. 2162, pp. 309-18, 2001.

[9] E. Oswald and K. Schramm, An Efficient Masking Scheme for AES Software Implementations, Proceedings of WISA 2005, LNCS 3786, Springer, pp. 292-305, 2006.

[10] D. Canright and L. Batina, A Very Compact "Perfectly Masked" S-Box for AES, Applied Cryptography and Network Security, ACNS 2008, June 3-6, New York.

[11] D. Canright, A Very Compact S-Box for AES, Workshop on Cryptographic Hardware and Embedded Systems (CHES 2005), Lecture Notes in Computer Science 3659, pp. 441-455, Springer-Verlag, 2005.

[12] N. Mentens, L. Batina, B. Preneel, and I. Verbauwhede, An FPGA Implementation of Rijndael: Trade-offs for Side-channel Security, in IFAC Workshop - PDS 2004, Programmable Devices and Systems, Elsevier, pp. 493-498, 2004.

[13] V. Fischer, M. Drutarovsky, M. Simka, N. Brochard, High Performance True Random Number Generator in Altera Stratix FPLDs, in Field Programmable Logic and Application (FPL), September 2004.

[14] T. Messerges, E. Dabbish, and R. Sloan, Investigations of Power Analysis Attacks on Smartcards, IEEE Trans. Computers, vol. 51, no. 5, May 2002.