
2008 IEEE International Symposium on Circuits and Systems (ISCAS 2008), Seattle, WA, USA, 2008.05.18-2008.05.21


A General Model for Differential Power Analysis Attacks to Static Logic Circuits

Massimo Alioto, Massimo Poli, Santina Rocchi
DII - University of Siena
Siena - Italy
Email: {malioto,poli,rocchi}@dii.unisi.it

Abstract—In this paper, a general model of multi-bit Differential Power Analysis (DPA) attacks to static logic circuits is proposed, with emphasis on symmetric-key cryptographic algorithms. The main parameters that are of interest in practical DPA attacks are analytically derived by introducing suitable approximations. Several interesting properties of DPA attacks are derived, allowing a deep understanding of the vulnerability of algorithms and circuits. The proposed model was validated by means of experimental measurements on an FPGA implementation of the Advanced Encryption Standard (AES) algorithm. The model accuracy is shown to be adequate, as the resulting error is always lower than 11%.

I. INTRODUCTION

In the last decade, a strong demand for portable devices that are able to store confidential data (e.g., smart cards) has led to the widespread adoption of cryptographic algorithms and circuits [1]. However, it was proved that all cryptographic devices leak information due to their interaction with the external environment [2], [3]. Indeed, several "side-channel" attacks able to recover confidential information were proposed that are based on the monitoring of some physical parameter related to the circuit operation [2]–[4].

One of the most powerful side-channel attacks is Differential Power Analysis (DPA), which is able to recover secret information by exploiting the dependence of the chip power consumption on the processed data [2]–[4]. DPA attacks are a major threat to data security, since they are non-invasive and require little knowledge of the specific implementation of the cryptographic algorithm [2]–[4]. Accordingly, several countermeasures have been proposed that aim to reduce the dependence of the device power consumption on the processed data [4], [5].

Among the existing logic styles, static logic is by far the preferred one for implementing digital Integrated Circuits (ICs), due to its suitability for VLSI design and its robustness. Nevertheless, its power consumption is strongly dependent on the processed data [4]. Until now, only a few power consumption models for understanding DPA attacks have been proposed in the literature. Unfortunately, these models are rather simplistic, hence many aspects of DPA attacks are not yet understood.

In this paper, a model of DPA attacks to static logic circuits for generic symmetric-key cryptographic algorithms is developed and then applied to the Advanced Encryption Standard (AES) algorithm [4]. The closed-form model equations are simple enough to allow an intuitive understanding of the vulnerability to DPA attacks of current cryptographic devices. The proposed model is also used to better understand the tradeoffs involved in a practical attack, as well as to derive simple guidelines for future-generation cryptographic algorithms that are robust against DPA attacks. The model generalizes the approach adopted by the same authors in the particular case of precharged circuits, allowing static logic and precharged circuits to be compared in terms of vulnerability to DPA attacks [6].

To validate the theoretical results, DPA attacks were carried out on an FPGA implementation of the AES algorithm [4].

II. REVIEW OF DPA ATTACKS TO STATIC LOGIC CIRCUITS

DPA attacks to cryptographic devices aim at recovering a portion k of the secret key by monitoring power consumption during cryptographic operations.

A DPA attack starts from the collection of power consumption waveforms acquired during the encryption (decryption) of N random plaintexts (ciphertexts) I_i (with i = 1 ... N). The N power traces PC_i(j), each consisting of M samples (i.e., j = 1 ... M), are then classified into one of two sets S_0 and S_1 using the value of an intermediate signal D physically evaluated within the circuit during the encryption (decryption) at a given but unknown time j = j*. Therefore, D is derived from the knowledge of the algorithm under attack and it is a function of k and I_i, i.e. D = f(I_i, k) [2]–[4]. The function f is usually referred to as the "partition function" and it is selected to obtain the maximum possible correlation between D and the actual power dissipation at j = j* [2]–[4].

978-1-4244-1684-4/08/$25.00 ©2008 IEEE

Let us consider a DPA attack targeting an m-bit signal X(j) physically generated within the static logic circuit under attack at time j. In static logic, the power dissipated by the attacked m bits at time j = j* is proportional to the number t of 0 → 1 transitions between the current state X(j*) and the previous one X(j* − 1) [4]. Hence, f should be chosen as a function of t. Nevertheless, even if the algorithm under attack is well known, an adversary has no information on the previous state X(j* − 1) [4]. Therefore, in practical attacks the collected power traces are classified using the weight w of the current state, i.e. the number of 1's in X(j*) [4]. Observe that the attack based on w is still successful, since w correctly predicts t in half of the cases [4]. This is easily justified by considering that, when the current bit is 1, the probability of having the previous bit equal to 0 is 0.5 (assuming X(j) to be sufficiently random). To better understand this point, let us consider the case where X(j) is a 1-bit signal. There are then 4 possible values for the pair [X(j* − 1), X(j*)], as reported in Tab. I. From this table, it is clear that w incorrectly predicts t only for the pair 1 → 1 in the last row of Tab. I.
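The 1-bit case can be checked by direct enumeration. The following is only an illustrative sketch: the variable names are not from the paper, and t is computed under the static-logic assumption stated above (only 0 → 1 transitions dissipate).

```python
# Enumerate the four (previous, current) pairs of a 1-bit signal and
# compare the weight w of the current state with the number t of
# 0 -> 1 transitions (illustrative names, not from the paper).
rows = []
for prev in (0, 1):
    for cur in (0, 1):
        w = cur                                    # weight of the current state
        t = 1 if (prev == 0 and cur == 1) else 0   # 0 -> 1 transition count
        rows.append((prev, cur, w, t))

# Pairs for which the weight does not match the transition count.
mismatches = [(prev, cur) for prev, cur, w, t in rows if w != t]
print(mismatches)  # [(1, 1)]
```

Only the pair 1 → 1 is mispredicted, in agreement with the last row of Tab. I.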

According to the above considerations, the following partition function is usually adopted [2]–[4]

$$D = \begin{cases} 0 & \text{if } 0 \le w < m/2 \\ 1 & \text{if } m/2 < w \le m \end{cases} \tag{1}$$

Therefore, power traces with weight lower than m/2 (greater than m/2) are assigned to the set S_0 (S_1), whereas the other traces are discarded, as outlined in Fig. 1 (where m is even).
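The partition function in (1) can be sketched in a few lines. The function names and the use of `None` to mark discarded traces are assumptions made for illustration only.

```python
from typing import Optional


def hamming_weight(x: int) -> int:
    """Number of 1 bits in a non-negative integer."""
    return bin(x).count("1")


def partition(w: int, m: int) -> Optional[int]:
    """Return D per eq. (1): 0 if w < m/2, 1 if w > m/2.

    None marks a trace with w == m/2, which is discarded
    (a convention chosen here for illustration).
    """
    if not 0 <= w <= m:
        raise ValueError("weight must lie in 0..m")
    if 2 * w < m:
        return 0
    if 2 * w > m:
        return 1
    return None


# For m = 8, weights 0..3 go to S_0, weight 4 is discarded, 5..8 go to S_1.
print([partition(w, 8) for w in range(9)])
# [0, 0, 0, 0, None, 1, 1, 1, 1]
```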

After the classification of the power traces, the average power consumption waveform A_0 (A_1) in the set S_0 (S_1) is evaluated as

$$A_D(j) = \frac{1}{N_D} \sum_{i \in S_D} PC_i(j) \quad \text{with } D = 0, 1 \tag{2}$$

where N_0 (N_1) is the number of power traces such that D is equal to 0 (1) at j = j*. The difference |A_0(j) − A_1(j)| is usually referred to as the differential power trace Δ(j).
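As a minimal sketch of (2) and of the differential power trace, the averages can be computed as follows. The data layout (a list of traces, each a list of M samples, and one label per trace with `None` for discarded traces) is an assumption made here, not something specified in the paper.

```python
def differential_trace(traces, labels):
    """Return |A_0(j) - A_1(j)| for j = 0..M-1.

    traces: list of N power traces, each a list of M samples.
    labels: D value per trace (0, 1, or None for a discarded trace).
    """
    n_samples = len(traces[0])
    sums = {0: [0.0] * n_samples, 1: [0.0] * n_samples}
    counts = {0: 0, 1: 0}
    for trace, d in zip(traces, labels):
        if d is None:          # trace with w == m/2, discarded
            continue
        counts[d] += 1
        for j, sample in enumerate(trace):
            sums[d][j] += sample
    if counts[0] == 0 or counts[1] == 0:
        raise ValueError("both sets S_0 and S_1 must be non-empty")
    a0 = [s / counts[0] for s in sums[0]]   # A_0(j), eq. (2)
    a1 = [s / counts[1] for s in sums[1]]   # A_1(j), eq. (2)
    return [abs(x - y) for x, y in zip(a0, a1)]


# Toy data: the two sets differ only at the middle sample.
toy_traces = [[1.0, 2.0, 1.0], [1.0, 4.0, 1.0], [1.0, 3.0, 1.0], [1.0, 5.0, 1.0]]
toy_labels = [0, 1, 0, 1]
delta = differential_trace(toy_traces, toy_labels)
print(delta)  # [0.0, 2.0, 0.0]
```

In the toy data the differential trace is non-zero only at the sample where the two sets differ, which is the behaviour expected at j = j*.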

Let us assume that an infinite number of statistically independent power traces is collected (i.e., N → ∞). In this case, the averages A_0 and A_1 in (2) tend to the power consumption mean values E[PC_i(j) | D = 0] and E[PC_i(j) | D = 1] [7]. Since D is a function of the secret key k, it is evaluated by guessing the value of k. Under a correct guess, D is equal to the physically evaluated D, and the differential power trace Δ_∞(j) at j = j* is equal to the non-zero value ε, whereas Δ_∞(j) = 0 at j ≠ j*, since the averages A_1 and A_0 are affected by D only at j = j*. In other words, the differential power trace Δ_∞ exhibits at j = j* a spike ε given by

$$\varepsilon = |A_1(j^*) - A_0(j^*)|. \tag{3}$$

When k is incorrectly guessed, a lower spike at j = j* is observed, since the power traces are erroneously partitioned into S_0 and S_1 [2]–[4]. Hence, the correct guess of k can be distinguished only if its associated spike is sufficiently greater than the maximum spike under a wrong guess, i.e. if the Inter-Signal Signal-to-Noise Ratio SNR_INTER in (4) is greater than unity

$$\mathrm{SNR}_{INTER} = \frac{\varepsilon(D_{correct})}{\max[\varepsilon(D_{wrong})]}. \tag{4}$$

In realistic cases, the differential power trace Δ_N is evaluated with a finite number N of power traces. From basic statistical theory, Δ_N − Δ_∞ is a zero-mean random process with a standard deviation equal to σ_N = σ/√N, where σ is the standard deviation of Δ_N − Δ_∞ in a single power trace [2]–[4], [7]. Accordingly, Δ_N can be considered as the sum of the signal Δ_∞ and an additive noise signal Δ_N − Δ_∞ with standard deviation σ_N. Therefore, the spike ε can be detected only if it is significantly greater than the typical noise range σ_N. Accordingly, an Intra-Signal Signal-to-Noise Ratio SNR_INTRA is usually defined as the ratio of the spike amplitude ε to the noise standard deviation σ_N [2]–[4]

$$\mathrm{SNR}_{INTRA} = \varepsilon / \sigma_N = \varepsilon \sqrt{N} / \sigma. \tag{5}$$

Fig. 1. Classification of power traces according to the weight w. [Diagram: weights 0, 1, ..., (m/2−1) form set S_0 (D=0); weight m/2 is discarded; weights (m/2+1), ..., m form set S_1 (D=1).]

TABLE I
SUCCESSIVE STATES FOR A 1-BIT STATIC LOGIC

[X(j*−1), X(j*)]   w   classification based on w   t
0 → 0              0   S_0 (D = 0)                 0
0 → 1              1   S_1 (D = 1)                 1
1 → 0              0   S_0 (D = 0)                 0
1 → 1              1   S_1 (D = 1)                 0

In practical DPA attacks, the spike detection is possible when SNR_INTRA is greater than 10 (i.e., when a sufficiently high number N of power traces is collected) [2]–[4].
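As a worked consequence of (5), the detection threshold quoted above can be rewritten as a lower bound on the number of traces. This is only a rearrangement of (5); σ depends on the implementation and the measurement setup, and no numerical value for it is given here.

$$\mathrm{SNR}_{INTRA} = \frac{\varepsilon \sqrt{N}}{\sigma} > 10 \iff N > \left(\frac{10\,\sigma}{\varepsilon}\right)^{2} = \frac{100\,\sigma^{2}}{\varepsilon^{2}}$$

Hence, for a given noise level, halving the spike amplitude ε requires four times as many traces to reach the same SNR_INTRA.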

III. MODELING OF THE DPA SPIKE AMPLITUDE

A closed-form expression of (3) can be derived by assuming that all values of X(j*) are equally likely. This assumption is reasonable in cryptographic circuits, since they exhibit the features of confusion and diffusion in every operation involving the key [4]. As stated in Section II, since the number t of 0 → 1 transitions between X(j* − 1) and X(j*) is not known, the power traces must be classified according to (1). Assuming that a unit energy is dissipated by a single-bit 0 → 1 transition, the average A_1 (A_0) is given by

$$A_D = \frac{1}{N_D} \sum_{i \in S_D} i \cdot n_{t=i} \quad \text{with } D = 0, 1 \tag{6}$$

where n_{t=i} is the number of pairs [X(j* − 1), X(j*)] such that t = i (i.e., those contributing to the power consumption), and N_D is the overall number of pairs with X(j*) having weight i with i ∈ S_D (i.e., those classified in the set S_D). It can be easily shown that n_{t=i} is equal to $\frac{1}{2}\binom{m}{i}$, and N_D is equal to $\sum_{i \in S_D} n_{w=i}$, where n_{w=i} is the number of m-bit symbols having weight i. Considering that n_{w=i} is equal to $\binom{m}{i}$ and substituting A_1 and A_0 into (3), ε is given by

$$\varepsilon = \frac{m}{2} \binom{m}{m/2} \Big/ \left[ 2^m - \binom{m}{m/2} \right] \approx 0.78\, m^{0.34}. \tag{7}$$

For the sake of clarity, (7) is plotted as a function of m in Fig. 2 along with the proposed approximation (the error is always lower than 5%). According to this figure, the spike ε has a rather weak dependence on m. It is worth noting that the spike amplitude in (7) is half that obtained in the case of precharged circuits [6]. Hence, SNR_INTRA of static logic circuits is lower than that of precharged logic. This means that, from the SNR_INTRA point of view, static logic is less vulnerable to DPA attacks than precharged logic.


Finally, SNR_INTRA in (5) is easily obtained by substituting (7).
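The exact expression in (7) can be cross-checked numerically for small even m by enumerating all (previous, current) state pairs, assuming as above that all states are equally likely and that each 0 → 1 transition dissipates a unit energy. The function names below are illustrative, and the enumeration is a sanity check under these assumptions rather than the authors' derivation.

```python
from math import comb


def popcount(x: int) -> int:
    """Number of 1 bits in a non-negative integer."""
    return bin(x).count("1")


def spike_closed_form(m: int) -> float:
    """Exact expression in eq. (7), for even m."""
    c = comb(m, m // 2)
    return (m / 2) * c / (2**m - c)


def spike_brute_force(m: int) -> float:
    """|A_1 - A_0| over all equally likely (previous, current) pairs."""
    sums = [0, 0]      # total number of 0 -> 1 transitions in S_0, S_1
    counts = [0, 0]    # number of pairs classified in S_0, S_1
    for cur in range(2**m):
        w = popcount(cur)
        if 2 * w == m:             # weight m/2: discarded, eq. (1)
            continue
        d = 1 if 2 * w > m else 0
        for prev in range(2**m):
            sums[d] += popcount(~prev & cur)   # bits going 0 -> 1
            counts[d] += 1
    return abs(sums[1] / counts[1] - sums[0] / counts[0])


for m in (2, 4, 6, 8):
    # Columns: m, closed form (7), enumeration, approximation 0.78 * m^0.34.
    print(m, spike_closed_form(m), spike_brute_force(m), 0.78 * m**0.34)
```

For m = 8 the closed form evaluates to about 1.505, the predicted value later reported in Tab. III.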

IV. SIMPLE MODEL FOR SNR_INTER

The inter-signal SNR defined in (4) is given by the ratio between the spike ε in (7) associated with the correct guess and the maximum spike amplitude among all possible wrong guesses. To evaluate the latter, let us assume that the static logic circuit combines the plaintext (ciphertext) I_i with the key through an XOR (⊕) operator, as always occurs in symmetric-key cryptographic algorithms [4]. Accordingly, the weight w of X(j*) under a wrong key k′ differs from that under k by the number of wrong bits in k′ (i.e. the Hamming distance Hd(k, k′) between k and k′). Furthermore, the spike amplitude associated with a wrong guess k′ is reduced when increasing the number of wrong bits [2]–[4]. Therefore, among all the wrong key guesses k′, the highest spike amplitude ε′ is achieved when Hd(k, k′) = 1.

To evaluate ε′, consider that (3) and (6) still hold, but the wrong classification leads to different averages A_1 and A_0 due to the different value of n_{t=i}. In particular, (3) must be rewritten as

$$\varepsilon' = \left| \frac{1}{N_1} \sum_{i \in S_1} i \cdot n_{t=i} - \frac{1}{N_0} \sum_{i \in S_0} i \cdot n_{t=i} \right| \tag{8}$$

where n_{t=i} is the number of pairs [X(j* − 1), X(j*)] with an actual number t of 0 → 1 transitions equal to i. According to the considerations in Section III, it is simple to show that $n_{t=i} = \frac{1}{2} n_{w=i}$, where n_{w=i} is the number of m-bit symbols with an actual weight i.

Fig. 2. Spike amplitude vs. m (eq. (7)). [Plot: spike amplitude (axis from 0.9 to 2.7) versus m (from 2 to 32); curves: exact, approximated, trend line.]

TABLE II
WRONG CLASSIFICATION EXAMPLE (m = 2, k = 11, k′ = 10)

I_i   k ⊕ I_i   correct weight   k′ ⊕ I_i   wrong weight
11    00        0                01         1
10    01        1                00         0
01    10        1                11         2
00    11        2                10         1

Parameter n_{w=i} can be evaluated by noting that, due to the presence of one wrong bit in k′ (i.e., Hd(k, k′) = 1), the predicted weight can differ only by ±1 from the actual one. For the sake of clarity, this argument is applied to the case with m = 2 in Tab. II, where it was assumed with no loss of generality that the wrong bit of the guessed key is in the rightmost position and is equal to 0. It is apparent that half of the values predicted to have weight i actually have weight (i + 1) if their rightmost bit is 0, whereas they have actual weight (i − 1) if their rightmost bit is 1. Accordingly, we have

$$n_{w=i} = \frac{1}{2}\binom{m}{i+1} + \frac{1}{2}\binom{m}{i-1} \quad \text{when } i = 1 \ldots m/2-2,$$

whereas n_{w=i} is different at the boundary values of i, i.e. for i = 0 and i = m/2 − 1. In particular, n_{w=0} = 1 since only the predicted value (00...01) has an actual weight 0, its rightmost bit being actually 0. Analogously, $n_{w=m/2-1} = \frac{1}{2}\binom{m}{m/2-2}$ due to the contribution of only half the predicted values with actual weight (m/2 − 2) (the values with predicted weight m/2 are discarded, according to Fig. 1). Finally, observe that there are also $n_{w=m/2} = \frac{1}{2}\binom{m}{m/2-1}$ predicted values with weight m/2 − 1 (i.e., those having a 0 in the rightmost position) that actually have weight m/2, since the rightmost bit is actually 1. Accordingly, by repeating the same procedure adopted to evaluate (7), the maximum spike amplitude among all the wrong guesses is given by

$$\varepsilon' = \frac{m-2}{2} \binom{m}{m/2} \Big/ \left[ 2^m - \binom{m}{m/2} \right] \tag{9}$$

which, substituted into (4), leads to the following expression

$$\mathrm{SNR}_{INTER} = m / (m - 2) \tag{10}$$

which is plotted versus m in Fig. 3.

As is apparent from Fig. 2, increasing the number m of attacked bits leads to a higher spike amplitude and thus to a higher value of SNR_INTRA in (5). However, from Fig. 3, greater values of m cause a lower SNR_INTER. This means that, by increasing m, an adversary can more easily distinguish the spike from the noise due to the higher value of SNR_INTRA, but can hardly distinguish the correct key from the wrong keys due to the lower value of SNR_INTER.
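Equations (9) and (10) can also be cross-checked by enumeration for small even m. In the sketch below, `wrong_mask` stands for k ⊕ k′, so that the predicted state is the actual state XORed with the mask; this name, like the others, is an assumption made for illustration. The same equally-likely-states and unit-energy assumptions as in Section III apply.

```python
def popcount(x: int) -> int:
    """Number of 1 bits in a non-negative integer."""
    return bin(x).count("1")


def spike(m: int, wrong_mask: int = 0) -> float:
    """|A_1 - A_0| when traces are classified with the weight of the
    predicted state (actual state XOR wrong_mask), for even m.

    wrong_mask = 0 models the correct key guess; a mask with a single
    bit set models a guess at Hamming distance 1 from the correct key.
    """
    sums = [0, 0]
    counts = [0, 0]
    for cur in range(2**m):                      # actual current state
        w_pred = popcount(cur ^ wrong_mask)      # weight seen by the attacker
        if 2 * w_pred == m:                      # discarded, eq. (1)
            continue
        d = 1 if 2 * w_pred > m else 0
        for prev in range(2**m):                 # actual previous state
            sums[d] += popcount(~prev & cur)     # actual 0 -> 1 transitions
            counts[d] += 1
    return abs(sums[1] / counts[1] - sums[0] / counts[0])


for m in (4, 6, 8):
    eps, eps_wrong = spike(m), spike(m, wrong_mask=1)
    # Columns: m, spike for correct key, spike for 1-bit wrong key,
    # their ratio, and m / (m - 2) from eq. (10).
    print(m, eps, eps_wrong, eps / eps_wrong, m / (m - 2))
```

For m = 8 the wrong-guess spike is about 1.129 and the ratio is 8/6 ≈ 1.333, matching the predicted values later reported in Tab. III.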

It is worth noting that current algorithms, which have moderate values of m, exhibit approximately the same vulnerability to DPA attacks due to the weak dependence of SNR_INTRA and SNR_INTER on m. As a result, the only advantage of algorithms having a greater m is simply due to the increase in the number of guesses 2^m that an adversary must consider when exhaustively analyzing the possible m-bit sub-keys, thereby making the attack less feasible.

Fig. 3. SNR_INTER vs. m (eq. (10)). [Plot: SNR_INTER (axis from 1 to 2) versus m (from 4 to 32).]

Finally, note that the SNR_INTER expression in (10) is the same as that of precharged logic, which was evaluated in [6]. Accordingly, static logic is as vulnerable as precharged logic in terms of SNR_INTER, despite the partially incorrect classification based on the weight that is usually adopted.

V. VALIDATION

The model of the DPA attack derived in the previous sections was validated by means of DPA attacks to an FPGA implementation of AES-128. The AES is a symmetric block cipher that processes 128-bit data blocks and has a key size of 128, 192 or 256 bits [4]. The cipher algorithm starts with the key expansion, in which eleven 128-bit round-keys are generated. The first round-key is equal to the least significant 128 bits of the secret key. Then an AddRoundKey operation is performed, in which the 128-bit input is XORed with the first round-key, i.e. with the least significant 128 bits of the secret key. Subsequently, four operations are repeated for ten rounds: SubBytes (i.e., a non-linear byte substitution using an S-BOX), ShiftRows (i.e., a circular shift), MixColumns (i.e., a linear transformation), and an AddRoundKey [4]. The S-BOX operation in the SubBytes step is usually implemented with a logic block that processes each group of 8 bits of its input according to a non-linear transformation. Since in the first round the S-BOX is applied after the AddRoundKey operation, the S-BOX logic block processes each 8-bit subset of the bitwise XOR between the input and the round-key [4]. For this reason, all DPA attacks to AES reported in the literature target 8 bits of the round-key during the first round, i.e. m is set to 8.

The DPA attacks were performed targeting an FPGA implementation of the AES-128 algorithm. The predicted and measured values of the spike amplitude ε for a correct key guess, the value ε′ for an incorrect guess with a unity Hamming distance (as discussed in Section IV), and the resulting SNR_INTER are reported in Tab. III (all results are normalized to the energy of a single switching bit). The model error shown in the same table is always within 11%, thereby confirming the adequate model accuracy and the validity of the assumptions introduced in the analysis. As an example, a differential power trace is shown in Fig. 4 for SNR_INTRA = 10.
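The error figures in Tab. III can be reproduced from the predicted and measured values. The sketch below assumes that the error is defined as (measured − predicted) / predicted; this definition is inferred from the signs and magnitudes in the table and is not stated explicitly in the text.

```python
# Predicted and measured values copied from Tab. III.
table_iii = {
    "eps in (7)": (1.505, 1.342),
    "eps' in (8)": (1.129, 1.012),
    "SNR_INTER in (10)": (1.333, 1.415),
}


def relative_error_percent(predicted: float, measured: float) -> float:
    """Error of the measurement relative to the prediction, in percent
    (assumed definition, see the note above)."""
    return 100.0 * (measured - predicted) / predicted


errors = {
    name: round(relative_error_percent(pred, meas), 2)
    for name, (pred, meas) in table_iii.items()
}
print(errors)
# {'eps in (7)': -10.83, "eps' in (8)": -10.36, 'SNR_INTER in (10)': 6.15}
```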

TABLE III
SPIKE AMPLITUDE FOR A CORRECT AND A WRONG KEY (HAMMING DISTANCE = 1) FOR AES ALGORITHM

                    predicted   measured   error (%)
ε in (7)            1.505       1.342      -10.83
ε′ in (8)           1.129       1.012      -10.36
SNR_INTER in (10)   1.333       1.415        6.15

Fig. 4. Differential power trace for SNR_INTRA = 10. [Plot: normalized differential power trace (axis from -1.5 to 1.5) versus time (0 to 325 ns); curves: correct key (spike 1.342), 1-bit wrong key (spike 1.012).]

VI. CONCLUSIONS

In this paper, an analytical model of DPA attacks to static logic circuits was proposed for generic symmetric-key cryptographic algorithms. The model represents a useful tool to better understand the effectiveness of DPA attacks, since it is general and evaluates all the parameters that are of interest in a DPA attack (ε, SNR_INTRA, SNR_INTER). In particular, it was shown that current symmetric-key algorithms with moderate values of m do not exhibit significant differences in terms of SNR_INTRA and SNR_INTER due to their weak dependence on m. This means that a small increase in m cannot greatly enhance the robustness against DPA, although it has a beneficial effect due to the exponential increase in the number of possible key values (2^m) that must be exhaustively analyzed in a DPA attack. Nevertheless, a strong increase in m should be pursued in future-generation algorithms. Indeed, this allows for severely degrading SNR_INTER, thereby dramatically increasing the number of keys that lead to almost the same maximum spike amplitude and thus making the detection of the correct key very hard.

It was also shown that, from the point of view of the parameter SNR_INTER, static logic circuits are as vulnerable to DPA attacks as precharged circuits, whereas the latter logic style is more vulnerable to DPA attacks when considering the parameter SNR_INTRA.

The model was validated on an FPGA implementation of the AES algorithm. Results showed that the model is sufficiently accurate for practical purposes.

REFERENCES

[1] W. Rankl and W. Effing, Smart Card Handbook. John Wiley and Sons, Inc., 1999.

[2] T. S. Messerges, E. A. Dabbish, and R. H. Sloan, "Examining smart-card security under the threat of power analysis attacks," IEEE Trans. on Computers, vol. 51, no. 5, pp. 541–552, 2002.

[3] P. C. Kocher, J. Jaffe, and B. Jun, "Differential power analysis," Proc. of CRYPTO'99, pp. 388–397, 1999.

[4] S. Mangard, E. Oswald, and T. Popp, Power analysis attacks: Revealing the secrets of smart cards. Springer-Verlag, 2007.

[5] M. Alioto, M. Poli, S. Rocchi, and V. Vignoli, "Techniques to enhance the resistance of precharged busses to differential power analysis," Proc. of PATMOS'06, pp. 624–533, 2006.

[6] ——, "A general model of DPA attacks to precharged busses in symmetric-key cryptographic algorithms," Proc. of ECCTD'07, pp. 368–371, 2007.

[7] A. Papoulis, Probability, Random Variables, and Stochastic Processes. McGraw-Hill, 1984.
