[IEEE 2009 International Conference on Microelectronics - ICM - Marrakech, Morocco (2009.12.19-2009.12.22)] 2009 International Conference on Microelectronics - ICM - Low-overhead countermeasures

Abstract—In this paper, novel circuit techniques are proposed to enhance the resistance of precharged busses against Power Analysis attacks. Specifically, a low-power, low-area bus coding scheme is used to make power consumption nominally constant. In addition, a simple scrambling technique is developed to make the bus robust against attacks even in the presence of process variations or load unbalance. The proposed techniques are shown to be more efficient than traditional dual-rail precharged busses, in terms of both area and power consumption. Measurements confirm that the proposed technique increases the robustness of precharged busses.

Index Terms— Correlation Power Analysis, countermeasures, security, Smart Cards, Cryptography, precharged bus, VLSI.

I. INTRODUCTION

In the last decade, interest in information security has been increasing as a consequence of the widespread diffusion of low-cost portable devices that store confidential data or copyrighted material (e.g., smartphones, PDAs, smartcards) [1]. Although these electronic devices are protected with cryptographic algorithms that ensure high software security, they are still vulnerable to the so-called "side-channel" attacks [2], [3]. These attacks exploit the physical interaction of an electronic device with the external environment in order to recover the secret information stored in it [2], [3]. Among these attacks, the Correlation Power Analysis (CPA) exploits the relation between the power consumption and the data processed in the cryptographic chip under attack. CPA has been demonstrated to be a major threat to information security, since it is not invasive and requires inexpensive off-the-shelf measurement setups [4]. CPA is also widely agreed to be more effective than other Power Analysis attacks, such as the Differential Power Analysis (DPA) [4].

Among the building blocks of a cryptographic chip, busses are known to be critical in terms of immunity to CPA attacks, since they are one of the main contributions to the overall chip power consumption and all data are transferred on them [2], [3]. In this paper, circuit techniques to strengthen the resistance of precharged busses against CPA attacks are proposed as an alternative to the traditional dual-rail precharged scheme [4]. To this aim, two different techniques are combined to destroy the correlation between the secret key and the power consumption despite single-rail operation. In practical circuit implementations, however, there is a residual information leakage due to the imperfect matching between different lines (e.g., due to process variations, load unbalance). This further information leakage is suppressed by applying a simple scrambling technique that randomly assigns a bus line to each bit that is transmitted on the bus, so that it is not possible to physically identify the line through which each bit is transferred.

Interestingly, the proposed approach has a negligible power overhead compared to single-rail precharged busses. In regard to area, a 1.5X overhead is found with the proposed technique, which is lower than the 2X area increase observed in dual-rail busses. Experimental CPA attacks on an FPGA implementation of the AES algorithm are used to validate the technique.

II. BRIEF REVIEW OF CORRELATION POWER ANALYSIS ATTACKS AND CONSIDERATIONS

CPA attacks aim at recovering the secret key that is used in a cryptographic circuit to encrypt (decrypt) a plaintext (ciphertext). A CPA attack consists of three phases. In the first phase, an adversary sequentially injects $N$ random (but known) inputs $I_i$ (with $i = 1, \ldots, N$) and acquires $M$ samples of the corresponding waveform $P_i(t_j)$ (with $j = 1, \ldots, M$) of the power dissipated during the encryption (decryption) of $I_i$.

In the second phase, the adversary chooses an internal signal $S$ that is physically evaluated within the chip under attack. The choice of the signal $S$ and its dependence on the input $I_i$ and the secret key $k$ is obtained from the knowledge of the algorithm [4]

$S_i(t_0) = f(I_i, k)$ (1)

$t_0$ being the point of time in which $S$ is evaluated and $f$ the operation that the algorithm performs in $t_0$. Then, the adversary chooses a function $g$ to estimate the power $\hat{P}_i$ dissipated by the cryptographic circuit to evaluate $S_i(t_0)$ [4]

$\hat{P}_i = g(S_i(t_0)) = g(f(I_i, k))$. (2)

Low-Overhead Countermeasures to protect Pre-charged Busses against Power Analysis Attacks

Massimo Alioto, Massimo Poli, Santina Rocchi
DII – University of Siena, Siena – Italy
Email: {malioto, poli, rocchi}@dii.unisi.it

978-1-4244-5816-5/09/$26.00 ©2009 IEEE

In the third CPA phase, the adversary evaluates the correlation coefficient $\rho(t_j)$ between the measured power $P_i(t_j)$ and the estimated power $\hat{P}_i$. Since $k$ is secret, $\hat{P}_i$ is unknown from (1)-(2), hence $\hat{P}_i$ (i.e., $\rho(t_j)$) must be evaluated for every possible choice of $k$. Since $S$ is evaluated by the circuit at $t_0$, $P_i(t_j)$ and $\hat{P}_i$ are maximally correlated at time $t_0$. In other points of time, the measured and estimated powers are not correlated and hence $\rho(t_j) \approx 0$, i.e. a spike appears at $t_0$ in $\rho(t_j)$. Moreover, the measured power and the estimated power are maximally correlated if the correct key is guessed, whereas a lower spike is observed in $\rho(t_j)$ for wrong key guesses. Hence, the secret key is identified as the key guess leading to the highest spike of $\rho(t_j)$.
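The three phases can be illustrated with a small simulation. The following is only a sketch under simplifying assumptions that are not taken from the measurements: the attacked signal is the XOR of an 8-bit input and the key transferred on an ideal precharged bus, the power is sampled at the single instant in which the signal is evaluated, one unit of energy is dissipated per transferred zero, and the measurement noise is Gaussian. All function names and parameter values are hypothetical.

```python
import random

N_BITS = 8  # bus width (assumed, matching the 8-bit bus of Section V)

def hamming_weight(x):
    """Number of 1's in the binary representation of x."""
    return bin(x).count("1")

def pearson(xs, ys):
    """Sample correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate_traces(key, n_traces, noise_std, rng):
    """Phase 1: random known inputs and one noisy power sample per input.

    Toy model: the bus carries (input XOR key) and a precharged bus
    dissipates one unit of energy per transferred zero.
    """
    inputs = [rng.randrange(2 ** N_BITS) for _ in range(n_traces)]
    power = [N_BITS - hamming_weight(i ^ key) + rng.gauss(0.0, noise_std)
             for i in inputs]
    return inputs, power

def cpa_attack(inputs, power):
    """Phases 2-3: estimate the power for every key guess, keep the best."""
    best_guess, best_rho = None, -2.0
    for guess in range(2 ** N_BITS):
        estimate = [N_BITS - hamming_weight(i ^ guess) for i in inputs]
        rho = pearson(estimate, power)
        if rho > best_rho:
            best_guess, best_rho = guess, rho
    return best_guess, best_rho

rng = random.Random(1)  # fixed seed so the sketch is repeatable
inputs, power = simulate_traces(key=0x3C, n_traces=500, noise_std=1.0, rng=rng)
guess, rho = cpa_attack(inputs, power)
print(hex(guess), round(rho, 3))
```

In this toy model the signed correlation is maximized rather than its absolute value, because the bit-wise complement of the correct key yields an estimate that is perfectly anti-correlated with the correct one.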

In practical attacks, the correlation coefficient $\rho$ is estimated by the sample correlation coefficient $\hat{\rho}$, since a finite number $N$ of power traces is actually available [5]. From basic statistics theory, assuming power traces to be statistically independent, the sample correlation coefficient is a random process whose mean value tends to $\rho$ for $N \to \infty$, and its standard deviation decreases as $a/\sqrt{N}$ ($a$ being a proper constant) [4], [5]. Hence, $\hat{\rho}$ can be thought of as the sum of the signal $\rho$ and an additive zero-mean noise with standard deviation $a/\sqrt{N}$, which masks the spike at $t_0$.

Additionally, there is a further noise contribution that accounts for all power contributions that are independent of the key, i.e. the power contributions of blocks that either do not process the key or do not contribute to the signal $S$. Under the hypothesis discussed before, the standard deviation of this noise contribution is proportional to $1/\sqrt{N}$, hence it can be written as $b/\sqrt{N}$, where $b$ is a suitable constant [3].

Since the above noise contributions are superimposed on the spike to be detected, an Intra-Signal Signal-to-Noise Ratio $SNR_{INTRA}$ is usually defined as the ratio of the spike amplitude and the overall noise standard deviation [3]

$SNR_{INTRA} = \frac{\rho(t_0)}{\sqrt{a^2/N + b^2/N}} = \frac{\rho(t_0)\sqrt{N}}{\sqrt{a^2 + b^2}}$ (3)

where the two above noise sources were assumed independent. In practical attacks, a sufficiently high value of $SNR_{INTRA}$ (the value $SNR_{INTRA} = 10$ is adopted in Section V) is sufficient to recognize the correct spike [2]-[4], and it is achieved with a sufficiently high number of power traces $N$.
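To make the dependence on the number of traces explicit, the definition of the Intra-Signal Signal-to-Noise Ratio can be solved for $N$. The short derivation below is a sketch in which the symbols are assumed names: $\rho(t_0)$ is the spike amplitude, and $a/\sqrt{N}$ and $b/\sqrt{N}$ are the standard deviations of the two independent noise contributions.

```latex
SNR_{INTRA}
  = \frac{\rho(t_0)}{\sqrt{\dfrac{a^2}{N} + \dfrac{b^2}{N}}}
  = \frac{\rho(t_0)\,\sqrt{N}}{\sqrt{a^2 + b^2}}
\quad\Longrightarrow\quad
N = \frac{\left(a^2 + b^2\right) SNR_{INTRA}^{2}}{\rho(t_0)^{2}}
```

Under this model, the number of traces needed to reach a given $SNR_{INTRA}$ grows with the inverse square of the spike amplitude: for instance, halving the spike amplitude quadruples the required $N$, provided that the noise constants are unchanged.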

III. DESIGN OF NOMINALLY CONSTANT-POWER PRECHARGED BUSSES THROUGH BUS CODING

In an $n$-bit precharged bus, the clock cycle is divided into two different phases: the precharge phase, in which each bus line is set to logical one, and the subsequent evaluation phase, in which each bus line is set to the actual data. Therefore, in a clock cycle, the average power consumption of a precharged bus is proportional to the number of transferred zeroes. Equivalently, the bus power consumption is proportional to $n - w$, $w$ being the bus weight, i.e. the number of transferred 1's [3], [4].

From the above considerations, the power consumption mean value $\mu_P$ and variance $\sigma_P^2$ are [3], [4]

$\mu_P = \frac{n}{2}, \quad \sigma_P^2 = \frac{n}{4}$ (4)

which were evaluated assuming, with no loss of generality, that a unit energy is dissipated in the transition of a single bus line, and under the usual hypothesis of independent uniformly distributed data.
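The mean value and variance of the power consumption of an unprotected precharged bus can be checked by exhaustive enumeration. The sketch below assumes, as in the text, a unit energy per transferred zero and uniformly distributed data; the function name is hypothetical.

```python
def precharged_bus_stats(n):
    """Mean and variance of the energy of an n-bit precharged bus.

    Every n-bit word is taken as equally likely, and each transferred
    zero dissipates one unit of energy.
    """
    energies = [n - bin(data).count("1") for data in range(2 ** n)]
    mean = sum(energies) / len(energies)
    variance = sum((e - mean) ** 2 for e in energies) / len(energies)
    return mean, variance

print(precharged_bus_stats(8))
```

For an 8-bit bus the enumeration returns a mean of n/2 = 4 and a variance of n/4 = 2, as expected for a binomially distributed number of zeroes.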

The robustness of busses against CPA attacks can be increased by reducing the correlation between the processed data and the bus power consumption. Indeed, in the limit case where perfectly constant bus power consumption is achieved (i.e., $\sigma_P^2 = 0$), the bus is perfectly immune to CPA attacks [3]. Accordingly, circuit techniques like dual-rail precharged busses were devised to achieve constant bus power consumption [3], [4], [6], [7]. Dual-rail precharged busses exhibit a nominally constant power, thereby ideally nullifying any information leakage. Compared to a single-ended bus, this is achieved at the expense of a 2X area and power overhead, because of the doubled number of lines [3]. Unfortunately, information leakage is destroyed only if the bus lines are perfectly matched, which of course does not hold in real circuit implementations due to process variations and load unbalance.

The dual-rail approach is nothing but a bus coding that doubles the existing bus lines. From this perspective, careful analysis can lead to better codings. For example, the Bus-Invert coding technique, which is usually adopted to reduce the average power consumption and its standard deviation, was recently considered [8]-[10]. According to this technique, if $w < n/2$ the bit-wise complement of the data is actually transferred to the bus. On the other hand, if $w \geq n/2$ the data is transferred as it is. These two conditions are signaled by an Invert bus line, which is set to 0 if $w < n/2$, and to 1 otherwise [8], [9]. As a result, the maximum number of 0's transmitted on the bus is limited to $n/2$, thereby reducing the standard deviation $\sigma_P$ of the bus power consumption. The overhead of this technique is negligible, which justifies its widespread adoption.
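The effect of Bus-Invert coding on a precharged bus can be sketched as follows. The polarity of the Invert line (0 when the data is complemented, 1 otherwise) and an even bus width are assumptions of this sketch, chosen so that the Invert line itself is counted among the transferred zeroes; the function names are hypothetical.

```python
def bus_invert(data, n):
    """Return (coded word, Invert bit) for an n-bit word, n even.

    The word is complemented when it holds fewer than n/2 ones, so that
    the transferred word never holds more than n/2 zeroes.
    """
    if bin(data).count("1") < n // 2:
        return ~data & ((1 << n) - 1), 0
    return data, 1

def zeros_on_bus(data, n):
    """Zeroes transferred on the n data lines plus the Invert line."""
    coded, invert = bus_invert(data, n)
    return (n - bin(coded).count("1")) + (1 - invert)

def zero_count_stats(n):
    """Maximum and variance of the zero count over all n-bit words."""
    counts = [zeros_on_bus(d, n) for d in range(2 ** n)]
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / len(counts)
    return max(counts), variance

print(zero_count_stats(8))
```

For an 8-bit bus the maximum number of zeroes is 4, and the variance of the zero count is well below the value of 2 found for the unprotected bus.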

Another coding technique that reduces $\sigma_P$ consists in the addition of a proper number $h$ of Balance lines [10]. These balance lines are encoded so that the overall number of zeroes in the bus is kept as close to $n$ as possible. Indeed, $h$ balance lines can compensate the presence of up to $h$ ones in the bus. Obviously, greater values of $h$ ensure a better compensation of ones, and in the limit case with $h = n$ the bus scheme is exactly the Dual-Rail precharged technique.

In general, the control logic that sets the bits transferred through the Balance lines must generate the number of 0's that is needed to reach the target number of zeroes in the overall bus. Hence, the control logic must simply count the number of 0's in the bus lines, and transmit the remaining 0's through the additional Balance lines. Evidently, the scheme is very simple, and the overhead associated with the control logic is expected to be very small.

In this paper, the two above techniques are jointly adopted to achieve constant bus power consumption with a number of additional lines that is significantly lower than that required in dual-rail busses. Indeed, the Bus-Invert coding limits the maximum number of 0's to $n/2$, hence the bus power consumption can be made perfectly constant by adding only $n/2$ balance lines. This coding ideally achieves constant power by adding one Invert line and $n/2$ balance lines.

The proposed joint techniques significantly lower the area/power overhead that is paid to achieve full bus protection. In particular, the bus area is increased by a factor $(n + 1 + n/2)/n = 1.5 + 1/n$. Hence, for practical values of $n$, the area overhead of the joint techniques (1.5X) is lower than that required in dual-rail precharged busses.

In regard to the power consumption, the joint techniques keep the bus power consumption constantly equal to $n/2$. Interestingly, this power consumption is exactly equal to the mean power dissipated by precharged busses without any protection technique in (4), as opposed to dual-rail busses, which double the bus power consumption.
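The joint Bus-Invert/Balance Lines coding can be sketched as an encoder/decoder pair. This is a minimal illustration rather than the circuit implementation: it assumes an even bus width, an Invert line that is 0 when the data is complemented, and Balance lines whose zeroes are simply placed first; the function names are hypothetical.

```python
def encode(data, n):
    """Encode an n-bit word on n data lines, 1 Invert line, n/2 Balance lines."""
    mask = (1 << n) - 1
    if bin(data).count("1") < n // 2:
        coded, invert = ~data & mask, 0   # transfer the complement
    else:
        coded, invert = data, 1           # transfer the data as it is
    lines = [(coded >> i) & 1 for i in range(n)] + [invert]
    # Balance lines carry the zeroes missing to reach n/2 overall.
    missing = n // 2 - lines.count(0)
    balance = [0] * missing + [1] * (n // 2 - missing)
    return lines + balance

def decode(lines, n):
    """Recover the original word from the data lines and the Invert line."""
    coded = sum(bit << i for i, bit in enumerate(lines[:n]))
    return coded if lines[n] == 1 else ~coded & ((1 << n) - 1)

n = 8
zero_counts = {encode(d, n).count(0) for d in range(2 ** n)}
print(zero_counts, len(encode(0, n)))
```

For n = 8, every word is transferred with exactly four zeroes on 13 lines, which corresponds to the 1.5 + 1/n area factor, whereas a dual-rail bus would use 16 lines and always transfer eight zeroes.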

IV. A SCRAMBLING TECHNIQUE TO COUNTERACT PROCESS VARIATIONS AND LOAD UNBALANCE

The above considerations on the perfectly constant bus power consumption are valid under the assumption of perfectly matched bus lines. However, this assumption is not satisfied in real circuits, due to process variations as well as to the unbalance in the load of each bus line. Due to the small differences in the energy associated with bus lines, the energy consumption of the bus is not perfectly constant in real circuits, although the circuit was nominally designed to make lines and loads exactly equal. For this reason, process variations and load unbalance introduce undesirable correlation between the transferred data and the bus power consumption, thus reducing the benefits brought by any protection technique that aims at achieving perfectly constant bus power consumption, including the traditional dual-rail precharged technique and the above discussed joint technique.

To cancel the effect of small differences between bus lines, a simple and effective idea is to randomly swap the bus lines through which the signals encoded with Bus-Invert/Balance Lines are transmitted. In this way, the energy consumption associated with a given bit is still affected by small differences between bus lines, but in an unpredictable way.

The above idea can be implemented by randomly mapping each of the bits that are encoded with Bus-Invert/Balance Lines onto one of the bits transmitted through the bus. This function is a simple scrambler as in Fig. 1, which unambiguously maps its input bits to the output bits, with a function that is controlled by a signal sel. In principle, the scrambler can be implemented in many different ways. In the simplest case, the scrambler can be a circular shifter, which sets the output bits as the shifted version of the input bits, where the number of shifts is set by the integer value of sel.

To make this process random, the signal sel is generated by a pseudo-random number generator. A Linear Feedback Shift Register (LFSR) is a good choice because it can generate pseudo-random bit sequences with low circuit complexity and an extremely long period [1]. The initial seed of the LFSR is generated by a true-random bit generator, which is usually available in cryptographic chips.

To ensure consistency between the data transmitted to and received from the bus, a descrambler performing the inverse mapping of the scrambler is introduced in the proposed scheme, and it is controlled by the same signal sel as the scrambler.
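The scrambler/descrambler pair in its simplest form, a circular shifter driven by an LFSR, can be sketched as follows. The 16-bit LFSR with taps 16, 14, 13, 11 (a commonly cited maximal-length configuration), the reduction of its state modulo the number of lines to obtain sel, and the seed value are all assumptions of this sketch and are not taken from the implemented circuit.

```python
class Lfsr16:
    """16-bit Fibonacci LFSR with taps 16, 14, 13, 11 (an assumed choice)."""

    def __init__(self, seed):
        self.state = seed & 0xFFFF
        if self.state == 0:
            raise ValueError("the all-zero state locks up an LFSR")

    def step(self):
        """Advance by one clock cycle and return the new 16-bit state."""
        s = self.state
        feedback = (s ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1
        self.state = (s >> 1) | (feedback << 15)
        return self.state

def scramble(lines, sel):
    """Circularly shift the bus lines by sel positions."""
    m = len(lines)
    return [lines[(i - sel) % m] for i in range(m)]

def descramble(lines, sel):
    """Inverse circular shift, restoring the original line order."""
    m = len(lines)
    return [lines[(i + sel) % m] for i in range(m)]

def roundtrip_ok(word, seed, n_cycles):
    """Check that a word survives scrambling with two LFSRs in lockstep."""
    tx, rx = Lfsr16(seed), Lfsr16(seed)  # same seed on both sides of the bus
    for _ in range(n_cycles):
        on_bus = scramble(word, tx.step() % len(word))
        if on_bus.count(0) != word.count(0):
            return False  # scrambling must not change the number of zeroes
        if descramble(on_bus, rx.step() % len(word)) != word:
            return False
    return True

word = [1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1]  # 13 lines, four zeroes
print(roundtrip_ok(word, seed=0xACE1, n_cycles=1000))
```

The two LFSRs stay aligned only because they start from the same seed and advance once per transfer, which mirrors the requirement that the descrambler be controlled by the same signal as the scrambler.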

Under the above assumptions, the mechanism introduced by the scrambler/descrambler is completely transparent to the transmitting and receiving blocks. Moreover, observe that the scrambler does not modify the number of transferred zeroes, but only the bus lines that are used to transfer them. If the period of the LFSR is sufficiently long, the scrambling is unpredictable. Accordingly, the scrambler cancels the residual correlation between the processed data and the bus power consumption that is due to process variations or load unbalance. As a consequence, the proposed technique is expected to be immune to CPA attacks and, in general, to other Power Analysis attacks, which are generally able to break circuits with load unbalance or process variations [11].
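The reason why scrambling removes the residual data dependence can be illustrated numerically. The sketch below is an idealization: the per-line energies are hypothetical values with a few percent of mismatch, the coding is the joint Bus-Invert/Balance Lines scheme with an assumed Invert polarity, and the shift amount is taken as uniformly distributed over all line positions and independent of the data.

```python
N = 8                # data lines
M = N + 1 + N // 2   # data + Invert + Balance lines

# Hypothetical per-line energies: nominally 1, with a few percent mismatch.
LINE_ENERGY = [1.00, 1.03, 0.97, 1.05, 0.96, 1.02, 0.99,
               1.04, 0.95, 1.01, 0.98, 1.06, 0.94]

def encode(data):
    """Joint Bus-Invert/Balance Lines coding (Invert = 0 when complemented)."""
    if bin(data).count("1") >= N // 2:
        coded, invert = data, 1
    else:
        coded, invert = ~data & (2 ** N - 1), 0
    lines = [(coded >> i) & 1 for i in range(N)] + [invert]
    missing = N // 2 - lines.count(0)
    return lines + [0] * missing + [1] * (N // 2 - missing)

def energy(lines):
    """Energy of one transfer: sum of the energies of the lines carrying 0."""
    return sum(e for bit, e in zip(lines, LINE_ENERGY) if bit == 0)

def rotate(lines, sel):
    """Circular-shift scrambler."""
    return [lines[(i - sel) % M] for i in range(M)]

# Energy per data word without scrambling.
fixed = [energy(encode(d)) for d in range(2 ** N)]
# Expected energy per data word, averaged over all equally likely shifts.
averaged = [sum(energy(rotate(encode(d), s)) for s in range(M)) / M
            for d in range(2 ** N)]

spread_fixed = max(fixed) - min(fixed)
spread_scrambled = max(averaged) - min(averaged)
print(spread_fixed > 0.01, spread_scrambled < 1e-9)
```

Without scrambling, the energy depends on which lines carry the zeroes and hence on the data. With scrambling, every zero visits every line equally often, so the expected energy is the same for all data words; the mismatch then only contributes fluctuations that are unrelated to the data.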

V. VALIDATION OF THE IMPROVED TECHNIQUE

The proposed approach was validated by performing CPA attacks on an FPGA implementation of the AES algorithm [1]-[4]. As usual, the signal under attack was chosen to be the input of the S-BOX in the first round, which in turn is equal to the XOR of an 8-bit portion of the input and the secret key [2], [4]. The AES algorithm was implemented in an Altera Cyclone™ FPGA. In this FPGA, an 8-bit precharged bus was also implemented as a built-in function available in the Altera library, and was used to transfer the signal under attack.

The robustness against CPA attacks was measured by the number $N$ of power traces that are needed to identify the spike, i.e. to achieve $SNR_{INTRA}$ = 10. Indeed, the computational and measurement effort involved in the attack is proportional to $N$. The results obtained are reported in Table I for the bus with no protection, the case of Bus-Invert/Balance Lines protection in Section III, and finally the latter approach combined with the scrambler/descrambler in Fig. 1. To quantitatively express the improvement factor in the immunity to CPA attacks, the value of $N$ normalized to the case of unprotected bus is also reported in Table I.

From Table I, the unprotected bus is easily attacked, as the required number of power traces $N$ is only 1,000. Moreover, the Bus-Invert/Balance Lines technique in Section III improves the robustness against CPA attacks by a factor of only 3. This is due to process variations and load unbalance, which are particularly evident if matching issues are not explicitly taken into account (as usually occurs in FPGAs).

When the scheme in Fig. 1 is adopted, from the last row of Table I, the number of power traces required to successfully attack the bus is extremely high [4]. Indeed, the number of power traces was intentionally limited to 1,000,000 to avoid performing very long attacks [4]. Even at the very high value $N$ = 1,000,000, the attack was unsuccessful, thereby confirming that the proposed technique considerably enhances the bus robustness. In other words, the proposed scheme in Fig. 1 improves the robustness against CPA attacks by a factor greater than 1,000. This confirms that this technique correctly cancels the effect of small differences between bus lines.

Fig. 1. Block diagram of the mixed technique with a scrambler/descrambler scheme.

The experimental correlation coefficient obtained by attacking the precharged bus is plotted in Fig. 2 and Fig. 3 for the Bus-Invert/Balance Lines technique without and with the scheme in Fig. 1, respectively. From Fig. 2, the spike is clearly visible, and was found to be even more evident in the case without any protection, whose plot was omitted for the sake of brevity. Instead, under the proposed scheme in Fig. 1, no spike can be detected from the CPA plot even for $N$ = 1,000,000, as shown in Fig. 3. This confirms the very high level of security of the bus scheme in Fig. 1.

In regard to the area overhead, the area associated with the encoder/decoder and the scrambler/descrambler was found to be only 3% of the overall bus area. Hence, the area increase with respect to the unprotected bus is essentially due to the additional Bus-Invert line and the Balance lines.

As regards the power overhead, measurements showed that the average bus power consumption of the bus scheme in Fig. 1 is greater than that of the unprotected bus by only 1.8%. This confirms that the proposed approach is more suitable for low-power applications, since its power consumption is about half that of dual-rail busses.

VI. CONCLUSIONS

In this paper, techniques to improve the robustness of precharged busses against Power Analysis attacks were proposed. Proper bus coding based on the mixed Bus-Invert/Balance Lines techniques was described to achieve a nominally constant power consumption that cancels the correlation of the power with the secret key. In practice, the latter condition is never exactly met due to process variations and load unbalance, hence a residual correlation exists that makes the circuit vulnerable. This observation holds for any technique that targets nominally constant power, including the dual-rail approach. To cancel the effect of the residual correlation, a scrambling/descrambling technique was used. The overall scheme is modular and simple, and the area/power overhead associated with the additional logic is very low.

The results obtained from experimental CPA attacks on an FPGA implementation of the AES algorithm showed that the proposed approach provides a remarkable improvement in the immunity to CPA attacks. In addition, the power consumption of the overall scheme was confirmed to be almost the same as that of the unprotected bus. Finally, in regard to the area, a 1.5X overhead was observed which is lower than that paid by dual-rail busses.

Summarizing, the proposed technique offers a very high level of information security, at a significantly lower area and power overhead compared to the traditional dual-rail bus approach.

REFERENCES

[1] W. Rankl, W. Effing, Smart Card Handbook, J. Wiley & Sons, 1999.

[2] T. S. Messerges, E. A. Dabbish, R. H. Sloan, "Examining Smart-Card Security under the Threat of Power Analysis Attacks," IEEE Trans. on Computers, vol. 51, no. 5, pp. 541-552, May 2002.

[3] M. Alioto, M. Poli, S. Rocchi, "Differential Power Analysis Attacks to Precharged Busses: a General Analysis for Symmetric-Key Cryptographic Algorithms," in print on IEEE Trans. on Dependable and Secure Computing.

[4] S. Mangard, E. Oswald, T. Popp, Power Analysis Attacks: Revealing the Secrets of Smart Cards, Springer, 2007.

[5] R. E. Walpole, R. H. Myers, S. L. Myers, K. Ye, Probability & Statistics for Engineers & Scientists, Prentice Hall, 2006.

[6] K. Tiri, M. Akmal, I. Verbauwhede, "A dynamic and differential CMOS logic with signal independent power consumption to withstand Differential Power Analysis on Smart Cards," Proc. of ESSCIRC'02, pp. 403-406, 2002.

[7] K. Tiri, I. Verbauwhede, "A Logic Level Design Methodology for a Secure DPA Resistant ASIC or FPGA Implementation," Proc. of DATE 2004, pp. 246-251, 2004.

[8] M. R. Stan, W. P. Burleson, "Bus-invert coding for low-power I/O," IEEE Trans. on VLSI Systems, vol. 3, no. 1, pp. 49-58, Mar. 1995.

[9] R. B. Lin, C. M. Tsai, "Theoretical analysis of bus-invert coding," IEEE Trans. on VLSI Systems, vol. 10, no. 6, pp. 929-935, Dec. 2002.

[10] M. Alioto, M. Poli, S. Rocchi, V. Vignoli, "Techniques to Enhance the Resistance of Precharged Busses to Differential Power Analysis," in Proc. of PATMOS2006, pp. 624-633, Montpellier (France), Sept. 2006.

[11] P. Schaumont, K. Tiri, "Masking and Dual-rail Logic Don't Add Up," in Proc. of CHES 2007, pp. 95-106, 2007.

Fig. 2. Correlation coefficient for the Bus-Invert/Balance Lines technique.

Fig. 3. Correlation coefficient for the scheme in Fig. 1.

TABLE I
EXPERIMENTAL RESULTS

Bus scheme                              N to achieve SNR_INTRA = 10    N normalized to case with no protection
no protection                           1,000                          1
Bus-Invert (BI) + Balance Lines (BL)    3,000                          3
BI + BL + scrambler/descrambler         >1,000,000                     >1,000
