A simple and low cost platform to
perform Power Analysis Attacks
Manuel Bejarano Carmona
June, 2012
Abstract
Power Analysis Attacks exploit the fact that the power consumption of
modern microprocessors and cryptographic devices depends on the
instructions they execute and therefore varies with time. This leakage
is mainly used to deduce cryptographic keys, as well as algorithms, by
direct observation of power traces. Power Analysis is a recent field
of study that has developed over the last decade. In that time, the
techniques used have evolved into more complex forms that sometimes
require a variety of skills, which makes the subject difficult to get
started with. Nowadays it is challenging to tackle the problem without
expensive equipment; what is more, off-the-shelf solutions for Power
Analysis Attacks are rare and expensive. This thesis aims to provide a
low cost and open platform as an entry point to Power Analysis for a
price under 10 USD. It is designed to perform Simple Power Analysis
and Differential Power Analysis attacks on an 8-bit microcontroller,
and includes the software needed to automate the process of taking the
measurements. Finally, the platform can be extended to cover a wide
range of microcontrollers, microprocessors and cryptographic devices
by simple insertion in a breadboard, which makes it a suitable device
for newcomers to the field.
Contents
1 Introduction
  1.1 Power Analysis Attacks
  1.2 Countermeasures Against Power Analysis Attacks
  1.3 Hardware and Software Design
2 Power Consumption
  2.1 Static and Dynamic Power Consumption
  2.2 Power Models
  2.3 Quality Criteria
3 Design of the Circuit
  3.1 Microcontroller
  3.2 PCB Layer Stack Up
  3.3 Input Filtering and Decoupling
  3.4 Clock Signal Selection
  3.5 Isolated Communications Interface
  3.6 Communication Protocol
  3.7 Sensing and Triggering
  3.8 In Circuit Serial Programming
  3.9 Production Cost
4 Simple Power Analysis
  4.1 General Description
  4.2 Measurements Set Up
  4.3 Template Attacks
  4.4 Template Building
  4.5 Template Matching
  4.6 Examples
5 Conclusion
A Software Implementation
B Bill Of Materials (BOM)
C Gerber Files
D Circuit Schematic
List of Figures
1 Assembled Rev. A Circuit Board
2 3D View of the PCB
3 Oscilloscope timing and trigger settings
4 Pickit 3 Connection Schematic
5 Measurements Set Up
6 Capture of a power trace
7 Load accumulator power trace (1 vs 127 vs 255)
8 Clear accumulator power trace (1 vs 127)
9 No Operation (NOP) power trace
10 AES-128 Power Trace
11 Bottom Copper Layer
12 Top Copper Layer
13 Bottom Solder Mask Layer
14 Top Solder Mask Layer
15 Bottom Silk Screen Layer
16 Top Silk Screen Layer
17 Drill File
List of Tables
1 Basic specifications of the target device
2 PCB and Layer Stack Up Configuration
3 Characteristics of the filter capacitors
4 Serial connection configuration
5 UART Configuration Parameters
6 Handshake command
7 Reset command
8 Resend command
9 Execute command
10 Cost of producing the first prototype batch
11 Bill Of Materials
1 Introduction
Power analysis attacks are cryptanalytic attacks that allow the
extraction of secret information from cryptographic devices and, more
generally, from any integrated circuit. They differ from other
cryptanalytic attacks in that they exploit the power consumption
characteristics of electronic devices rather than the mathematical
properties of algorithms and ciphers. They are also non-invasive
attacks that can be performed with off-the-shelf equipment but, as the
attacks become more complex, the equipment required gets more
expensive. In any case, this kind of attack poses a serious threat to
the security of the cryptographic devices that are nowadays embedded
in many consumer electronic products.
Power analysis attacks were first used to retrieve keys from
cryptographic devices. Since then, a wide range of techniques has
arisen that builds on these foundations and expands the uses of Power
Analysis attacks. In this section, we introduce the problem of
designing a low cost circuit to perform this kind of attack on devices
that, although not conceived purely as cryptographic devices, are
powerful enough to perform all kinds of computations, including
cryptographic functions.
Acquiring the skills needed to attack a device like the one chosen in
this thesis using Power Analysis, and arriving at a working set up,
can take some time. We therefore wanted to shorten this time and let
the attacker concentrate on the analysis itself, leaving the details
of the circuit design to a known and working platform.
This thesis intends to give a broader view of the very first steps
that need to be taken before sitting down in the laboratory and
starting to analyse the target device, which is, in the end, the
important part of any research of this kind. Those steps comprise two
main areas, the design of the circuit and the design of the software
to perform the measurements, which until now required extra
engineering effort in the first phases of a research project on Power
Analysis.
The topics are presented in an order intended for a reader who has
never performed this kind of attack. A general introduction to the
field is given in Section 1.1, with further details in Section 2,
where we analyse how power consumption is modelled and simulated and
which criteria are used to distinguish and classify power traces.
Section 1.2 gives a glimpse of countermeasures against Power Analysis
Attacks, but the reader should know that there is an entire industry
behind these processes and, of course, intellectual property rights
make it difficult to give a better understanding of how these
countermeasures are implemented in silicon in the devices that we use
today.
Sections 3 and 4, on the other hand, explain the details behind the
design and the basics of Simple Power Analysis (SPA), respectively.
The design steps are covered at a finer level of detail, since the
decisions involved have to be understood by the person using the
circuit for research. Section 4 is written as the next logical step to
follow after setting up the right laboratory equipment to analyse the
target; a couple of examples are given at the end as a proof of
concept of SPA. Since this platform is not only hardware based,
Section 3.6 explains how the software was written taking into account
the different Power Analysis techniques that can be used.
The reader should be aware that the topic is not covered in as much
depth as in other papers and publications in the field, so it is
recommended to consult some of the main references, such as [MOP07]
and [KJJ99], for a better overview.
1.1 Power Analysis Attacks
Nowadays, most of the electronic devices that we use contain a
microprocessor, a microcontroller or some sort of cryptography-capable
device. They have multiple uses, but the main applications are related
to security. Protecting electronic money transactions, or
authenticating with a bank or a government institution using a mobile
phone application, requires some sort of cryptography. All of this
happens behind the scenes, without the user being aware of it, and so
trust plays an important role in many things that are meant to be
commonly used and widespread. Those microprocessors sometimes leak
information through the simple fact of executing one of the operations
required to, for example, electronically sign a document. This leakage
has serious security implications and puts that trust at risk.
Power Analysis Attacks can target devices that are built to perform
simple cryptographic operations but also, as will be explained later,
any electronic device that does something we are interested in, even
if it has nothing to do with cryptography, for example, any sort of
calculation such as checksums or scrambling routines. Those attacks
can be classified into two types, passive or active: in a passive
attack the device is operated according to its specifications, whereas
in an active attack its inputs and environment are modified or
tampered with to make it behave abnormally. Attacks of either type can
further be classified as invasive, semi-invasive or non-invasive,
depending on what is done to the device in order to obtain the secret
key, algorithm, etc., that we are looking for.
Invasive attacks require expensive equipment and involve depackaging
the device and probing the signals directly on the silicon, see
[KK99], [And08] and [Sko05]. Semi-invasive attacks also depackage the
device but, in contrast, there is no direct contact with the silicon,
since the purpose is to read out or erase memory cells, or to flip
specific bits that can disable the memory protection measures, see
[SSAQ02] and [SA03]; like invasive attacks, they require expensive
equipment. Finally, non-invasive attacks are the cheapest to perform
and are conducted by interfacing with the device through its
physically available pins. These attacks are also called side-channel
attacks, the three most important types being timing, power analysis
and electromagnetic attacks, see [Koc96], [KJJ99] and [GMO01].
As a definition, quoting [MOP07, ch. 1.3], "Power Analysis Attacks
exploit the fact that the instantaneous power consumption of a
cryptographic device depends on the data it processes and on the
operation it performs". This principle drives the design of the
circuit in this thesis.
The examples presented in Section 4.6 are based on an 8-bit
microcontroller executing pieces of code that have previously been
flashed into the device memory and that cover some basic algorithms
and calculations known to the attacker. The execution flow is
controlled from a computer that sends commands to and communicates
with the device. In order to measure the power consumption of the
device, we placed a resistor of between 1Ω and 68Ω in the VDD line of
the microcontroller. The voltage drop across this resistor is
proportional to the current drawn by the microcontroller and
therefore, for a constant supply voltage, to its power consumption, so
we can plot and analyse this voltage, measured with an oscilloscope,
against time. Looking for interesting points and trying to find
relationships between the data processed and the instructions executed
are the basics of Power Analysis Attacks.
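The conversion from the measured voltage drop to current and power can be sketched as follows. This is only an illustration under assumed values: the function name, the 10Ω resistor (inside the 1Ω to 68Ω range mentioned above), the 5 V supply and the sample voltages are hypothetical and are not taken from the software of this thesis.

```python
def shunt_to_power(v_shunt, r_shunt, v_supply):
    """Convert shunt voltage samples (V) into current (A) and power (W)."""
    if r_shunt <= 0:
        raise ValueError("r_shunt must be positive")
    # Ohm's law: the current through the shunt is the current drawn by the device.
    currents = [v / r_shunt for v in v_shunt]
    # The device only sees the supply voltage minus the drop across the shunt.
    powers = [(v_supply - v) * i for v, i in zip(v_shunt, currents)]
    return currents, powers


# Hypothetical oscilloscope samples, in volts, across an assumed 10 ohm shunt.
samples = [0.010, 0.012, 0.025, 0.011]
currents, powers = shunt_to_power(samples, r_shunt=10.0, v_supply=5.0)
```

A larger resistor gives a larger and easier to measure voltage drop, at the price of lowering the voltage that actually reaches the microcontroller.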
1.2 Countermeasures Against Power Analysis Attacks
The fact that the power consumption of a device depends on the data it
processes makes it difficult to implement a cryptographic algorithm in
hardware or software without taking precautions to prevent an attacker
from characterizing the device by recording its power traces.
Countermeasures try to mitigate this dependency using two techniques:
Hiding consists of removing the apparent dependency between the data
and the power consumption, which is a difficult task since there is
always a relationship between these two variables. To achieve this,
the execution of the algorithm is randomized or the power
consumption of the device is altered in order to make it difficult
for an attacker to find this dependency in the power traces.
Masking is slightly different: the power consumption may still depend
on the data being processed, but not on the true intermediate values
of a particular algorithm; those values can be, for example, the
result of one of the steps of an AES encryption. Masking those values
means randomizing them, thus making the key extraction a tedious or
almost impossible process.
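The idea of masking can be illustrated with a minimal sketch, assuming first-order Boolean (XOR) masking of a single byte. The function names are hypothetical, and this is not the scheme of any particular device or library.

```python
import secrets


def mask_byte(value):
    """Return (masked, mask) such that masked ^ mask == value."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    mask = secrets.randbits(8)  # fresh random mask for every execution
    return value ^ mask, mask


def unmask_byte(masked, mask):
    """Recover the original byte from its two shares."""
    return masked ^ mask


def masked_values(value):
    """All values the masked share can take, one per possible mask."""
    return sorted(value ^ mask for mask in range(256))
```

For any byte, `masked_values` returns every value from 0 to 255 exactly once, so with a uniformly random mask the value that the device actually handles is uniformly distributed regardless of the secret intermediate value.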
Apart from these two techniques, there is another principle, well
known in the security field, which says that security can be
increased by avoiding the reuse of session keys and by not relying for
the entire security of the system on the paradigm of "security through
obscurity". If a device can be characterized from its power traces, it
cannot be considered secure any more, since any code executed by the
microcontroller can be compared against power templates to find out
what it does and how it is done.
1.3 Hardware and Software Design
As mentioned at the beginning of this section, the skills required to
analyse a device based on its power consumption range from
mathematics and statistics to analogue design and software
development, plus familiarity with instrumentation such as Digital
Sampling Oscilloscopes and the typical bench instruments found in any
electronics laboratory.
Our intention was to provide a combination of ready-to-use hardware
and software that lets the user concentrate on the analysis of the
device under attack. This required some experience in electronic
design, since we wanted to produce a Printed Circuit Board, an aspect
that most of the literature neglects and does not explain at all. The
set up of this kind of measurement environment is omitted from most
papers and, as a result, there is a lack of platforms to investigate
with that do not require spending thousands of dollars on solutions
like the SASEBO boards, see [KSS+08], [KSS+09] and [SAS07].
Another point to take into account is that the design does not contain
electronic components like FPGAs or DSPs to perform real time
analysis, but only the minimum needed to grasp the basics and become
familiar with the related concepts; hence the intention of an
entry-level design.
2 Power Consumption
In this section we discuss the basics of power consumption in CMOS
digital circuits. It is obvious that a digital integrated circuit like
a microcontroller needs power to execute the operations that it is
programmed to perform. The kind of power supply required determines,
among other things, whether a device can be attacked or not.
Some decisions in digital circuit design have direct implications for
the power consumption of the device. Those decisions are made at all
levels, including the system level, architecture level, cell level and
transistor level, and extend to the kind of MOS technology used to
create the building blocks of the final device.
In order to calculate the power consumption of a device, we assume
that the supply voltage is constant at all times, while the current
drawn by the circuit depends on the state of the logic blocks that are
active in a particular slice of time. The average power consumption of
the circuit over a period T, Pcir, is then given by Equation 2.1, where
pcir(t) is the instantaneous power consumption, VDD is the supply
voltage and iDD(t) is the instantaneous current.
\[
P_{cir} = \frac{1}{T} \int_{0}^{T} p_{cir}(t)\, dt = \frac{V_{DD}}{T} \int_{0}^{T} i_{DD}(t)\, dt \tag{2.1}
\]
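With sampled data, Equation 2.1 can be approximated numerically. The following is a minimal sketch that assumes evenly spaced current samples and uses the trapezoidal rule; the function name and the example values are hypothetical.

```python
def average_power(i_dd, dt, v_dd):
    """Approximate Equation 2.1 for current samples spaced dt seconds apart.

    i_dd is a list of instantaneous current samples in amperes and v_dd is
    the (constant) supply voltage in volts. The result is in watts.
    """
    if len(i_dd) < 2:
        raise ValueError("need at least two samples")
    if dt <= 0:
        raise ValueError("dt must be positive")
    # Trapezoidal approximation of the integral of i_DD(t) over [0, T].
    integral = sum((a + b) / 2.0 * dt for a, b in zip(i_dd, i_dd[1:]))
    total_time = (len(i_dd) - 1) * dt  # this is T in Equation 2.1
    return v_dd * integral / total_time


# Hypothetical example: a current ramp from 0 to 2 mA at an assumed 5 V supply.
p_avg = average_power([0.0, 0.001, 0.002], dt=1e-6, v_dd=5.0)
```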
In addition to this, we differentiate between the power consumed
while the cells that form the logic gates hold a stable ON or OFF
state and the power consumed while they switch between states, that
is, between static and dynamic power consumption. The total power
consumed by a device is the sum of both, Pstat and Pdyn.
Pcir = Pstat + Pdyn (2.2)
2.1 Static and Dynamic Power Consumption
The static power consumption is orders of magnitude smaller than the
dynamic power consumption due to the nature of CMOS technology. Logic
blocks built using CMOS consist of a Pull-Up Network (PUN) and a
Pull-Down Network (PDN) that, in a stable state, never conduct at the
same time; thus there is no direct current path from VDD to ground
and, ideally, no power is consumed by the logic block.
There are always leakage currents, but they are of the order of
picoamperes and so are not important when compared with the dominant
dynamic power consumption, which represents, almost on its own, the
total power consumption of the device. The dynamic power consumption
is, in turn, directly related to the data being processed, as
mentioned in Section 1.1.
2.2 Power Models
Models play an important role in Power Analysis Attacks, since the
attacker needs to map power traces to the data involved in the
computations. This is referred to as a kind of power simulation in
[MOP07, ch. 3.3]. The attacker needs to build a model of the device
under attack or, in other words, to characterize the power
consumption of the device using a power model. The two most relevant
power models are the Hamming-Distance Model and the Hamming-Weight
Model. There are also variations of these models that we do not cover
here but that can be found in [MOP07, ch. 3.3.3].
Hamming-Distance Model : This model is used to describe the power
consumption of buses and registers. It is based on the fact that,
since the building blocks of digital circuits are similar among
devices, an attacker can make an educated guess about how some parts
of the microcontroller are connected and interact with each other. A
change in a data bus from one state to another produces a variation
in the power consumption that is proportional to the number of bits
that change, that is, to the Hamming weight of the exclusive OR of the
two consecutive values on the bus, see Equation 2.3. Knowing that, and
if for example the algorithm being executed is known, the attacker
can predict the data on the bus by simple observation of some power
traces.
HD(v0, v1) = HW (v0 ⊕ v1) (2.3)
Hamming-Weight Model : In order to use this model, the attacker
assumes that, given a data value to be processed, the power
consumption is proportional to the number of bits set in this value.
This assumption by itself does not reveal much about the value of the
data being processed or about previous data on a bus. It does,
however, serve as a useful tool to characterize the power consumption
of a device on which we can perform a controlled non-invasive attack.
For example, we can identify the device that is used in a circuit and
build a separate test circuit with the same or a similar device; then
we can write software for it and take as many measurements as we want
using the techniques described in Sections 4.3 and 4.4. Note that the
power consumption can be directly or inversely proportional to the
number of bits set in the value under study. Once the device is fully
characterized, it is a matter of matching the power traces with the
templates built for the attack.
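The two models above reduce to two small functions. This is a minimal sketch with hypothetical function names; it only computes the model values and says nothing about how well a real device follows them.

```python
def hamming_weight(value):
    """Number of bits set in a non-negative integer (HW)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return bin(value).count("1")


def hamming_distance(v0, v1):
    """Number of bit positions in which v0 and v1 differ (Equation 2.3)."""
    return hamming_weight(v0 ^ v1)


# Model values for a bus that changes from 127 to 255: the new value has
# eight bits set, but only one bit actually toggles.
hw_new = hamming_weight(255)
hd_change = hamming_distance(127, 255)
```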
2.3 Quality Criteria
The measurement of power signals is affected by two main sources of
noise, electronic noise and switching noise. The first is produced by
thermal noise, reflections, crosstalk and filtering. The second is
inherent to CMOS technology: the switching time of a logic cell is of
the order of nanoseconds, which produces frequency components in the
gigahertz range even when the circuit is clocked at a couple of
megahertz.
These and other considerations were taken into account in the layout
of the Printed Circuit Board and are explained in detail in the next
section. Filtering the power supply, isolating the communication
interface, using short traces for the clock signal and properly
decoupling all the integrated circuits on the board are some of the
constraints that have to be taken into account when designing a
circuit for Power Analysis Attacks.
3 Design of the Circuit
Having established the core concepts behind Simple Power Analysis, the
next task was to design a circuit based on two premises:
Automated Measures : Together with the software, the system should
be able to perform automatic measurements in conjunction with an
oscilloscope and a computer. Due to the nature of the power analysis
method, there is a clear need to perform thousands of measurements to
cover the microcontroller's instruction set, and not only for that
reason; removing the noise further increases the number of
measurements required since, as described in Section 2.3, increasing
the number of measurements reduces the noise and the variance.
Figure 1: Assembled Rev. A Circuit Board
Low Cost : There is an intrinsic relationship between cost and features
when it comes to designing a circuit to perform power analysis but,
as it turned out, a balance can be found that fulfils our
expectations for a relatively low budget of around 150 USD. This is
described in more detail in Section 3.9.
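The effect of repeating a measurement many times can be sketched with simulated data. The example assumes independent Gaussian noise added to a made-up noise-free trace; all names and values are hypothetical and no real oscilloscope capture is involved.

```python
import random
import statistics


def average_traces(traces):
    """Point-wise mean of equally long traces (lists of samples)."""
    length = len(traces[0])
    if any(len(t) != length for t in traces):
        raise ValueError("all traces must have the same length")
    return [sum(samples) / len(traces) for samples in zip(*traces)]


def noisy_trace(signal, sigma, rng):
    """Simulated capture: the signal plus independent Gaussian noise."""
    return [s + rng.gauss(0.0, sigma) for s in signal]


def residual_std(trace, signal):
    """Standard deviation of the noise left in a trace."""
    return statistics.pstdev([t - s for t, s in zip(trace, signal)])


rng = random.Random(1)                    # fixed seed for repeatability
signal = [0.0, 1.0, 0.5, 0.25] * 50       # hypothetical noise-free trace
single = noisy_trace(signal, 0.2, rng)
averaged = average_traces([noisy_trace(signal, 0.2, rng) for _ in range(100)])
```

Under these assumptions the noise left in the averaged trace falls roughly with the square root of the number of traces, which is why an automated set up that can take hundreds or thousands of captures is valuable.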
In the following sections, we cover the most important decisions that
were made in the design of the circuit. Although the proposed circuit
may seem simplistic, it is as capable as was intended from the
beginning and adequate for the task. The design was inspired by the
one described in [MOP07, ch. 3.4.4] for analysing a microcontroller.
However, it was improved in some aspects and complemented in others,
making it easier to expand or adapt in order to analyse a wider
variety of devices.
3.1 Microcontroller
The device under analysis was selected according to two basic
requirements:
Clock Frequency : Since we were limited by the accuracy of the equipment
in the laboratory, we had to select a microcontroller that was able
to run at a relatively low frequency. That is not usually a problem,
since these devices have been around for a long time and there is a
huge range that satisfies our needs. The main factors to consider
were the input bandwidth, the sampling rate and the resolution of the
oscilloscope. A microcontroller running at 4 MHz falls within the
specifications of our measurement equipment, since we were using a
200 MHz bandwidth oscilloscope. The Nyquist-Shannon sampling theorem
is satisfied as long as the sampling rate is at least twice the
highest frequency component of the input signal, in which case no
information is lost, see [OS09]. We selected a sampling rate five
times higher than the most dominant frequency in the power
consumption signal, the 4 MHz clock, that is, 20 MHz.
Serial Interface : Although this is not a hard requirement, the device
had to have at least one serial interface to communicate with a
computer, in order to avoid reprogramming it every time we want to
measure something.
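The sampling rate reasoning above amounts to a short calculation. The sketch below uses hypothetical helper names and, like the text, treats the 4 MHz clock as the dominant frequency of interest; the faster switching components mentioned in Section 2.3 are outside this simple check.

```python
def oversampling_factor(sample_rate_hz, signal_freq_hz):
    """Ratio between the sampling rate and the signal frequency."""
    if signal_freq_hz <= 0:
        raise ValueError("signal_freq_hz must be positive")
    return sample_rate_hz / signal_freq_hz


def satisfies_nyquist(sample_rate_hz, highest_freq_hz):
    """True if the sampling rate is at least twice the highest frequency."""
    return sample_rate_hz >= 2 * highest_freq_hz


CLOCK_HZ = 4_000_000          # microcontroller clock given in the text
SAMPLE_RATE_HZ = 20_000_000   # sampling rate chosen in the text

factor = oversampling_factor(SAMPLE_RATE_HZ, CLOCK_HZ)
ok = satisfies_nyquist(SAMPLE_RATE_HZ, CLOCK_HZ)
```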
Apart from these two requirements, the final device, the Microchip
PIC16F628A (see [Mic09]), was chosen on the basis of my previous
experience and the availability of development tools that would
shorten the learning and development processes. Table 1 shows the
basic specifications of the target device.
One last, but no less important, consideration was the fact that the
selected microcontroller has low power features (120 µA at 1 MHz,
2.0 V, typical) that influence the design of the sensing circuit, as
explained in Section 3.7.
Processor Type            High Performance RISC CPU
Number of Instructions    35 single-word instructions
Bus Width                 8 bits
Program Memory            2KB Flash
Data Memory (SRAM)        224 Bytes
Data Memory (EEPROM)      128 Bytes
Operating Voltage         5V
Clock Frequency           Precision internal 4 MHz oscillator
Table 1: Basic specifications of the PIC16F628A
Number of Layers          2
Copper Thickness          0.035 mm
Isolation Core Thickness  1.5 mm
Width                     40 mm
Length                    50 mm
Table 2: PCB and Layer Stack Up Configuration
3.2 PCB Layer Stack Up
One of the most important considerations before starting to lay out
the Printed Circuit Board was to decide on the best layer stack up to
reduce the noise to a minimum. Due to the budget, the number of layers
was limited to two. That was not a big issue when designing the
board, but the ideal configuration would be a four-layer board in
which two layers contain signals and the other two are a power plane
and a ground plane respectively.
Separate ground and power planes offer a lower impedance path between
power and ground at high frequencies due to the distributed
inter-plane capacitance, see [Wil12, ch. 2.2.4], and also improve the
electrical signal integrity of the grounding system and the EMC. As
stated before, only two layers were available, so we used two ground
planes on each layer in order to create an isolation barrier between
the computer and the microcontroller, as explained in Section 3.5.
Table 2 shows the layer stack up configuration and the PCB
dimensions.
Although the current and voltage levels of the circuit were not high
enough to call for further design constraints, multiple-point
grounding concepts and ground stitching (placing multiple vias between
ground areas on different layers), see [O’H01], were used in early
designs, but only the latter was kept in the PCB revision that was
used to test the circuit. For more information about grounding
techniques, see [Coo07, ch. 15]. Figure 2 shows a 3D view of the first
revision of the PCB.
Figure 2: 3D View of the PCB
3.3 Input Filtering and Decoupling
A low noise design starts with a noise free and stable input power
supply. The initial idea was to use several capacitors to filter the
noise at different target frequencies and to have a common power
supply for all the integrated circuits on the board. This idea was
rejected because the RS-232 transceiver used in the design comes with
a DC-DC buck converter, which introduces a considerable amount of
noise in the power line. We decided to separate the power lines in
order to have one line to power the microcontroller and a second one
to power the rest of the ICs in the design.
Reference   Value     Type
C15         470 uF    Tantalum
C16         47 uF     Ceramic
C17         0.1 uF    Ceramic
C18         100 pF    Ceramic
Table 3: Characteristics of the filter capacitors
The filter configuration adopted to power all the ICs except the micro-
controller was a CLC filter in a π configuration. A ferrite bead was used but
it can be bypassed using a zero Ohm resistor if necessary depending on the
noise requirements, see [Pau92] and [Ber11]. Table 3 shows the characteristics
of the capacitors that were selected.
This set of capacitors covers a wide range of frequencies to be
filtered out of the power line, which should be enough on its own.
Nevertheless, we opted to add a linear voltage regulator with an
ultra low-noise output and a very low dropout voltage, which adds
extra stability and allows a wider range of input voltages. This LDO
is able to provide 500 mA at the output, which is more than enough to
power the circuit. Another interesting characteristic of this LDO is
that it can be enabled on demand using a PWM signal. However, since
the circuit is powered at all times and there are no multiple voltage
requirements to satisfy, there was no clear need to drive the LDO
from the microcontroller, so the enable pin (EN) is connected to the
input pin (IN), see [Mic06, p. 8]. Finally, there are two extra
capacitors, one for decoupling (C19) and one to add stability to the
circuit and reduce the noise (C20), see [Mic06, p. 9].
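To give a feeling for how capacitors of different values behave over frequency, the following sketch computes the impedance magnitude of each capacitor in Table 3. It assumes ideal capacitors and ignores the equivalent series resistance and inductance of real parts, which in practice limit how well large capacitors work at high frequencies. The function name and the chosen frequencies are hypothetical.

```python
import math

# Nominal values from Table 3, in farads.
FILTER_CAPS = {"C15": 470e-6, "C16": 47e-6, "C17": 0.1e-6, "C18": 100e-12}


def capacitor_impedance(capacitance_f, frequency_hz):
    """Impedance magnitude of an ideal capacitor, in ohms: 1 / (2*pi*f*C)."""
    if capacitance_f <= 0 or frequency_hz <= 0:
        raise ValueError("capacitance and frequency must be positive")
    return 1.0 / (2.0 * math.pi * frequency_hz * capacitance_f)


# Print one row per frequency with the impedance of every capacitor.
for freq in (1e3, 1e6, 100e6):
    row = ", ".join(
        f"{name}={capacitor_impedance(cap, freq):.3g}"
        for name, cap in FILTER_CAPS.items()
    )
    print(f"{freq:.0e} Hz: {row} ohm")
```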
The final design ended up with two separate power lines, VCC FILTERED
and VCC UNFILTERED, which were used to power the ICs and the
microcontroller respectively. The main reasons for using
VCC UNFILTERED to power the microcontroller were to use a different
filter topology and to maintain two different voltage nodes to balance
the loads.
The topology chosen to filter the power line of the microcontroller is
an LC filter with a ferrite bead and three capacitors. The capacitor
values are the same as in the previous filter configuration, except
for the 100 pF capacitor, which was not used this time, see Table 3.
Both power lines were measured with the oscilloscope, with
satisfactory results regarding noise and stability.
3.4 Clock Signal Selection
The clock signal plays an important role in the design of the circuit,
since it is one of the key pieces in the power analysis. The choice of
clock frequency affects the measurements and can become a limitation
if the proper equipment is not available. In this section we cover the
different options that are available in the circuit, together with
their advantages and drawbacks.
As stated in Section 3.1, the selected clock frequency was 4 MHz. The
microcontroller has an internal precision oscillator that runs at a
nominal 4 MHz, see [Mic09, ch. 14.2.4], but the idea was to have
multiple sources to generate the desired clock frequency. The reason
is simply to provide different power consumption scenarios, so that
one can select the most appropriate oscillator configuration to
perform the power analysis.
Four configurations are available and can be selected using jumpers,
as explained below:
Internal Precision Oscillator : This is the most common configuration
if the desired clock frequency is 4 MHz, since there is no need for
additional components on the board. The microcontroller is able to run
at up to 20 MHz but that would be a disadvantage for the power
analysis, for the reasons explained in Section 3.1, and additional
components are required to run at that speed. However, the most
important consideration is the power consumption, which we tried to
keep to a minimum by disabling all peripherals and using external
circuitry when available.
Crystal Oscillator / Ceramic Resonator : There are three different modes
when using this configuration: XT (crystal/resonator), LP (Low Power
crystal) and HS (High Speed crystal/resonator). We decided to use a
quartz crystal, so the oscillator was configured in XT mode. In this
mode a crystal or ceramic resonator is connected to the OSC1 and OSC2
pins to establish oscillation (see [Mic09, fig. 14-1]). The
PIC16F628A oscillator design requires the use of a parallel cut
crystal; a series cut crystal may give a frequency outside the crystal
manufacturer's specifications. This mode also requires two capacitors
(C6 and C7) which, according to [Cor04], have to be 18 pF each. To
select this configuration, jumpers JP5 (CLKIN) and JP6 (CLKOUT) have
to be set.
Clock Oscillator : Similarly to the previous mode, there are cases in
which a packaged ceramic oscillator is preferred, since it avoids the
need for extra capacitors. We selected a 4 MHz HCMOS/TTL clock
oscillator that can be activated by setting the jumpers JP1 (power)
and JP2 (CLKIN), see [Cor08].
External Clock : Since the measurements were taken in a laboratory
environment where a waveform generator was available, we decided to
use it; according to [Mic09, ch. 14.2.5], an external clock signal
can feed the microcontroller. We added a standard BNC connector
plus a single logic inverter as shown in [Mic09, fig. 14-4]. The inverter
used, see [STM04], required two additional components: a 0.1 uF
decoupling capacitor and a 22 Ω resistor at the output to control the
rise time and preserve the signal integrity. This configuration is
activated by setting the jumpers JP3 (power) and JP4 (CLKIN).
All these configurations were measured in the laboratory with the
oscilloscope and all matched the specifications and produced a clean and
smooth 4 MHz signal.
3.5 Isolated Communications Interface
As stated in Section 3, one of the main goals of designing this circuit was
to be able to perform the measurements in an automated fashion. For this
task, and inspired by [MOP07, ch. 3.4.4], we decided to use the Universal
Synchronous Asynchronous Receiver Transmitter (USART) peripheral of the
microcontroller to communicate with a PC using a custom protocol over
an RS-232 channel. The details of this protocol are explained in the next
section.
Surprisingly, none of the set ups that I have seen so far used an isolated
communication interface. In my opinion that is a drawback, since a lot of
noise can be introduced into the circuit through these lines. The idea,
then, was to find a way to isolate the measurement part of the circuit from
the communications part.
The first idea was to use one optocoupler with two channels, one for
each of the RX and TX lines. The problem with this solution, which was
otherwise simple and elegant, was the need for an external and isolated
power supply for the optocoupler; that would add extra components and
increase the cost, so we looked for other alternatives.
The final design uses an Isolated Single-Channel RS-232 Line
Driver/Receiver, see [Dev10], that comes with an integrated DC-DC buck
converter that is able to power the isolated part of the circuit by itself,
avoiding the need to feed the isolated part through external circuitry.
The DC-DC buck converter has to be enabled by setting the VCC pin between
4.5 V and 5.5 V; for this purpose we used the VCC FILTERED power line to
feed it. According to [Dev10, p. 11], the VISO pin should not be used to
power external circuitry so, in our case, only three pins are connected to
the serial interface: TX USB, RX USB and GNDISO; the last one is used as
the reference to produce the RS-232 signal. We also need six 0.1 uF
capacitors for decoupling and bypassing, as described in [Dev10, p. 11].
The connections between the microcontroller, the transceiver and the
computer were done as shown in Table 4.
Microcontroller   Transceiver       PC (DB9)
TX (8)            ROUT (8)
RX (7)            TIN (9)
VSS (5)           GND (4-7, 10)
                  TOUT (16)         RXD (2)
                  RIN (15)          TXD (3)
                  GNDISO (11)       GND (5)
Table 4: Serial connection configuration [NAME (pin)]

Finally, and no less important, the layout of the transceiver and
decoupling capacitors follows the directives given by the manufacturer in
[Dev10, p. 13] where possible. Some recommendations were left out due to
cost or physical space constraints; for instance, the manufacturer proposes
a six layer PCB to reduce the EMI emissions that, without doubt, can affect
the quality of the measurements. However, as stated at the beginning of
this section, the currents, voltages and frequencies handled in the circuit
are far from high speed, high voltage or high current territory. In the
end, good isolation was achieved by a proper layer stack up plus separated
ground planes and ground stitching.
3.6 Communication Protocol
The process of automating the measurements has a serious constraint:
adjusting the timing between the oscilloscope, the computer and the
instructions executed in the microcontroller. This process is described in
more detail in Section 4.2. The challenge was to design a simple protocol
that made the measurements easier to synchronize.
The protocol is based on four simple commands: handshake, reset, resend
and, finally, an execution command. An important decision was based on the
design of the microcontroller UART: since it has a FIFO buffer that is
capable of storing two bytes at a time, see [Mic09, ch. 12.2.2], we decided
that every command should have an even number of bytes in order to
increase the performance. Table 5 shows the UART configuration.
Mode            Asynchronous
Bandwidth       9600 bps
Data            8 bits
Parity          None
Stop            1 bit
Flow Control    None
Table 5: UART Configuration Parameters

Now, we present the commands in detail:
Handshake : This command is sent at the beginning of every transaction
and is a simple echo of its two bytes. It is also used as a keep alive
command to find out whether the microcontroller is running or in a low
power state. If everything is working and the UART has been initialized,
the microcontroller resends the bytes received, see Table 6.
Reset : This command serves as a hard reset of the UART of the
microcontroller. The microcontroller executes the initial code and sets up
the UART as it does when it is powered on. After this command is sent,
the microcontroller waits for the handshake, see Table 7.
Resend : This command is used when the checksum of an execute command
is wrong; it forces the PC to send the command again, see Table 8.
Execute : This command contains the information for executing a single
instruction or a set of them, with or without repetitions, and the
operands. With this command it is also possible to test entire algorithms
if desired, just by establishing a common operand for them, see Table 9.
Length      2 Bytes
Command     0x10 0x20
Response    0x10 0x20
Use         At the beginning of every transaction.
Table 6: Handshake command

Length      2 Bytes
Command     0x10 0x30
Response    0x10 0x20
Use         Force UART to hard reset, then handshake.
Table 7: Reset command

Length      2 Bytes
Command     0x20 0x20
Response    0x20 0x20
Use         Ask PC to resend the last execute command.
Table 8: Resend command

The XX parameter in the execute command, corresponding to the instruction
id, is a number from 0 to 34 that represents the instruction to execute
according to the instruction set as presented in [Mic09, tab. 15-2, p. 118].
For example, to execute the instruction movlw 0x05 once, the command
is 0x10 0x40 0x1C 0x05 0x01 0x18, where the first two bytes identify the
execute command (Table 9), 0x1C (28 decimal) is the instruction id, 0x05 is
the literal, 0x01 is the number of repetitions (the instruction is executed
exactly once) and 0x18 is the checksum, calculated as
0x1C xor 0x05 xor 0x01 = 0x18.
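The construction of an execute frame can be sketched on the PC side as follows. This is a minimal illustration in Python, assuming the frame layout of Table 9 (0x10 0x40 XX YY ZZ KK); the names HEADER, CMD_EXECUTE and build_execute are hypothetical and are not part of the software developed for this thesis.

```python
from functools import reduce

HEADER = 0x10       # first byte of the execute command in Table 9 (assumed constant name)
CMD_EXECUTE = 0x40  # second byte of the execute command in Table 9


def build_execute(instruction_id: int, literal: int, repetitions: int) -> bytes:
    """Return the six-byte frame 0x10 0x40 XX YY ZZ KK."""
    payload = [instruction_id, literal, repetitions]
    if not 0 <= instruction_id <= 34:
        raise ValueError("instruction id must be in 0..34")
    if any(not 0 <= b <= 0xFF for b in payload):
        raise ValueError("every field must fit in one byte")
    # KK is the xor of the three payload bytes
    checksum = reduce(lambda a, b: a ^ b, payload)
    return bytes([HEADER, CMD_EXECUTE, *payload, checksum])


# movlw 0x05 (instruction id 28 = 0x1C), executed once
frame = build_execute(0x1C, 0x05, 0x01)
print(frame.hex(" "))
```

The frame has an even number of bytes, which matches the two-byte FIFO consideration discussed above.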
3.7 Sensing and Triggering
Perhaps the most relevant part of the circuit is the sensing stage, because
the method used to measure the power consumption relies on the fact that
the power traces are correlated with the instructions executed in the
microcontroller.
The idea is simple and is implemented by putting a resistor in series in
the VDD line of the microcontroller right after the filter. Attached to it, we
put test pads that were connected to the oscilloscope to monitor the voltage
across the resistor while the microcontroller was executing a controlled set of
commands that were transmitted using the protocol described in the previous
section.
The only design consideration was to choose the appropriate value of
the resistor or, in fact, to use a couple of them in a parallel
configuration to reduce the effect of the tolerance on the value; this is
possible because the values of the resistors are normally distributed. But,
as in the previous section, the physical space on the PCB is a constraint
that did not let us include a couple of resistors, so there is only one
1 Ω (0.1%) resistor in the final design.

Length       6 Bytes
Command      0x10 0x40 XX YY ZZ KK
Response     0x10 0x20
Description  XX: instruction id
             YY: literal
             ZZ: repetitions/mask/operand
             KK: checksum (XX xor YY xor ZZ)
Use          Execute single or multiple instructions or algorithms.
Table 9: Execute command
The selection of this value turned out to be a mistake when we started
to take measurements: the power consumption was so low that the voltage
across the resistor, around 1 mV, was buried in noise. The solution was to
increase the value of the resistor so that a slight change in current
consumption produced a distinguishable change in the voltage across the
resistor. The results presented in Section 4.6 were taken using three 22 Ω
resistors in series, adding up to a total resistance of 66 Ω.
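The effect of the resistor value can be estimated with Ohm's law. The sketch below is only a rough estimate: it assumes that the current drawn by the microcontroller stays at the value implied by the 1 mV reading across 1 Ω (about 1 mA), and it ignores the fact that a larger shunt also lowers the supply voltage seen by the device. The function name shunt_voltage_mv is hypothetical.

```python
def shunt_voltage_mv(current_ma: float, resistance_ohm: float) -> float:
    """Ohm's law with convenient units: mA times ohm gives mV."""
    return current_ma * resistance_ohm


# about 1 mV measured across 1 ohm implies a current of about 1 mA
current_ma = 1.0 / 1.0

for resistance in (1.0, 22.0, 66.0):
    drop = shunt_voltage_mv(current_ma, resistance)
    print(f"{resistance:5.1f} ohm -> {drop:5.1f} mV")
```

Under these assumptions the 66 Ω shunt raises the signal from about 1 mV to some tens of millivolts, which is the order of magnitude of the traces shown in Section 4.6.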
Measuring the instructions of interest is a complicated task without
some kind of synchronization between the target device and
the device used to measure the variations in power consumption. For this
reason, we set up one of the General Purpose Input Output pins of the
microcontroller to act as a rising/falling edge trigger signal. This pin
goes from low to high when the execution of the commands specified by the
computer starts and back to low when it finishes. An example of how to set
up the timing scale and the trigger is shown in Figure 3.
3.8 In Circuit Serial Programming
Software written in a low level programming language like assembler can
be difficult to debug if tools like In Circuit Debuggers are
not used. The same applies when the microcontroller comes in a small
package that is appropriate for surface mount applications but, in practice,
cannot be programmed using old fashioned programmers. The idea was to
adapt the circuit in order to be able to use these tools.

Figure 3: Oscilloscope timing and trigger settings
Microchip has a cheap and powerful solution called Pickit 3, see [Mic10],
a low cost In Circuit Debugger/Serial Programmer that costs less than
40 USD and can program and debug the whole range of Microchip's devices.
To interface the Pickit 3 with our design we simply had to follow the
connection guidelines given in [Mic10, ch. 2.3, 2.4] with a slight
modification: the addition of a Schottky diode (D1) that prevents current
from flowing when the Pickit 3 is in use and the board is powered by an
external power supply. The connection configuration is shown in Figure 4.
3.9 Production Cost
Finding the balance between cost and features was one of the key points that
drove this design from the early stages. Here we present the production cost
of 10 units of the circuit including all the necessary components, the
printed circuit boards plus shipping and taxes. All the components are
surface mount components, and most of the footprints used in the layout of
the PCB were custom made; that effort is seen as engineering work and is
not taken into account. The costs do not include the assembly of the boards
either, since it was done manually. Table 10 shows the detailed cost of
producing the prototype.

Figure 4: Pickit 3 Connection Schematic

Components           115.68 USD
PCBs (10 boards)      13.90 USD
Taxes (customs)       44.30 USD
Total                173.88 USD
Table 10: Cost of producing the first prototype batch
If we calculate the price per board, 17 USD could seem overpriced, but
this could be reduced to 13 USD if the components had been available from
an EU supplier at the time of production. What is more, using a double
channel optocoupler instead of the transceiver, which costs 7 USD per chip,
can decrease the cost even further, to 7 USD per board, which is so far the
cheapest solution to perform Power Analysis Attacks.
4 Simple Power Analysis
In this section we cover the basic formal aspects of this technique. We also
go through some examples that were obtained using the design proposed in
this thesis. We do not carry out a detailed study of any particular
algorithm; this is left for future research and investigation.
4.1 General Description
Simple Power Analysis is explained in the first paper on the subject as a
technique that is used to retrieve information from cryptographic devices by
interpreting power consumption measurements, see [KJJ99]. Although the
ultimate goal is to retrieve the key of a cryptographic device, whether or
not the algorithm or the plain text is known, SPA attacks can have other
uses, for example deriving the instructions executed in a microprocessor or
microcontroller where the code is protected or not known. According to
[MOP07, ch. 5.1], SPA attacks are divided into single-shot SPA attacks and
multiple-shot SPA attacks. The difference between them is the number of
traces that can be recorded. As explained in the next section, this design
is capable of both types of attack, but the latter was not implemented due
to the lack of an oscilloscope that could interface with a computer via
RS-232, USB or GPIB [GPI04].
This approach was taken into account in this design as a proof of concept
or an introduction to SPA attacks. However, the device under attack is
capable of performing cryptographic functions written for 8 bit
microcontrollers; this is an achievement given the few resources that are
available in terms of memory and computing power. In the last part of this
chapter, Section 4.6, there are some examples of this kind of algorithm
that are available under an open source license, see [Per08a], [Per08b] and
[Smi03], and were measured to demonstrate the capabilities of the circuit.
4.2 Measurements Set Up
Simple Power Analysis can be performed without a whole set of high end
equipment, as explained in Section 3. In this section we cover how to
set up this equipment and the steps that need to be followed to perform the
measurements.

Figure 5: Measurements Set Up

The set up includes seven elements: a power supply, a signal generator,
a digital sampling oscilloscope, the measurement circuit and the probe, the
microcontroller, the In Circuit Serial Programming interface and a personal
computer. The process has the following steps, as shown in Figure 5:
1. Write the software, compile it and flash the microcontroller using the
ICSP interface. The device can be programmed with or without the
power supply connected since the Pickit 3 is able to supply the needed
voltage for programming. Once the software has been programmed in
the device, disconnect the Pickit 3 to avoid adding noise through the
ICSP lines.
2. Connect the clock generator to the BNC connector using a coaxial cable
and set the appropriate jumpers to configure the microcontroller to use
an external clock signal. Set the frequency in the clock generator to 4
MHz.
3. Connect the probe to the sensing terminals in the circuit and turn
on the digital oscilloscope. Configure the second channel as a single
shot trigger on rising or falling edge. The sensing terminals should be
connected to the first channel with a x1 probe.
4. Connect the power supply to the circuit and open a serial connection
in the computer. Check that the UART has been properly initialised;
if so, UART [OK] appears on the screen and the microcontroller is
waiting for commands.
5. Send the desired command(s) and observe the results on the
oscilloscope, see Figure 6.
6. Save the data, making sure to include the data from both channels. If
the second channel is omitted, there will be no reference for the
time when the operations started and finished.
7. Import the data and analyse.
The number of traces or measurements needed to do SPA is relatively low. The
microcontroller can be characterized by using one or two traces of the same
instruction. There is no need to communicate with the oscilloscope from
the computer, since the set up can be done manually in a matter of minutes.
If desired, the process can be further automated by using scripts to
communicate with both the oscilloscope and the microcontroller. This is a
basic feature needed to perform DPA or multiple-shot SPA attacks.
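Steps 6 and 7 rely on the second channel as a time reference. A minimal sketch of how the imported data could be cropped is given below. It assumes that both channels were exported as lists with the same number of samples, that the trigger pin is high while the instruction under test executes, and that 2.5 V is a reasonable threshold for 5 V logic; the function names and the sample values are hypothetical.

```python
def trigger_window(trigger, threshold):
    """Return (start, stop) sample indices of the first interval where
    the trigger channel is at or above the threshold."""
    start = None
    for i, level in enumerate(trigger):
        if start is None and level >= threshold:
            start = i                      # rising edge found
        elif start is not None and level < threshold:
            return start, i                # falling edge found
    if start is None:
        raise ValueError("trigger never crossed the threshold")
    return start, len(trigger)             # capture ended while trigger was high


def crop(trace, trigger, threshold=2.5):
    """Keep only the power samples recorded while the trigger was high."""
    start, stop = trigger_window(trigger, threshold)
    return trace[start:stop]


# made-up samples: channel 2 (trigger, volts) and channel 1 (power, mV)
trigger = [0.0, 0.1, 4.9, 5.0, 5.0, 0.2, 0.0]
trace = [60, 61, 120, 135, 128, 62, 60]
print(crop(trace, trigger))
```

Cropping every capture to the same window makes traces of different instructions directly comparable, which is what the characterization stage needs.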
4.3 Template Attacks
Template attacks rely on the dependence that exists between the power
consumption and the data being processed. When using this kind of attack
we expect some kind of relationship that depends, for example, on the
literals, operands and registers that are involved in a particular
instruction executed by the microcontroller. This kind of attack adds
statistics and numerical analysis to the list of tools needed to retrieve
useful information from the traces; as such, the traces are characterized
by a multivariate normal distribution, see [MOP07, ch. 5.3].
The basic idea behind template attacks is to build pairs of a mean vector
and a covariance matrix (m, C) using pairs of data d_i and keys k_j. As an
example, Listing 1 shows three operations that are quite similar. Using power
analysis we can derive a template for the last two instructions, which
compute an eXclusive OR between the numbers 5 and 10, by changing both the
data (5) and the key (10). After that, we calculate a mean vector m from
all the traces that use the same data-key pair (d_i, k_j) and, by changing
one of them, we calculate the covariance matrix C. The same procedure is
repeated for every combination (d_i, k_j) to build its template
h_{d_i,k_j}, see Equation 4.1. If we have a similar device, we can then
match its power consumption traces with the ones that we have previously
characterized. These two phases are known as characterization and attack,
respectively. Note that there is a parallelism between data-key and
operation-literal pairs. The first case applies to cryptographic attacks
known as Plain Text Attacks and the second case is used for characterizing
the power consumption per instruction executed or, in other words, in the
template building phase.

\[ (d_i, k_j) : \quad h_{d_i,k_j} = (m, C)_{d_i,k_j} \tag{4.1} \]

Figure 6: Capture of a power trace
Listing 1: Dependence between data and power consumption example.
    list p=16f628A
    #include <p16F628a.inc>

spa:
    ; add two numbers: W = 5 + 10
    movlw d'5'
    addlw d'10'
    ; add two numbers: W = 100 + 139
    movlw d'100'
    addlw d'139'
    ; W = 0000 0101 XOR 0000 1010
    movlw b'00000101'
    xorlw b'00001010'
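The characterization phase can be sketched numerically as follows. This is a minimal illustration in plain Python that assumes each trace is a list with the same number of points and that at least two traces are available for the data-key pair; the function names and the sample values are hypothetical, and the unbiased estimator (division by n - 1) is a choice made for this sketch.

```python
def mean_vector(traces):
    """Point-by-point mean of a list of equally long traces."""
    n = len(traces)
    return [sum(points) / n for points in zip(*traces)]


def covariance_matrix(traces):
    """Sample covariance between every pair of points of the traces."""
    n = len(traces)
    m = mean_vector(traces)
    size = len(m)
    c = [[0.0] * size for _ in range(size)]
    for trace in traces:
        dev = [x - mu for x, mu in zip(trace, m)]
        for a in range(size):
            for b in range(size):
                c[a][b] += dev[a] * dev[b]
    return [[value / (n - 1) for value in row] for row in c]


# three made-up traces of two points each for one (d_i, k_j) pair
traces = [[1.0, 2.0], [3.0, 6.0], [5.0, 4.0]]
template = (mean_vector(traces), covariance_matrix(traces))
print(template)
```

In practice only a few interesting points per trace would be kept, since the size of the covariance matrix grows with the square of the number of points.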
Finally, once we have all the data-key pairs, we evaluate the probability
density function of the power trace t of the device under attack for every
template that we have characterized, using Equation 4.2, where T is the
number of points in the trace. The highest probability points to the
correct template, see Equation 4.3.

\[ p(t; (m, C)_{d_i,k_j}) = \frac{\exp\left(-\frac{1}{2}(t - m)' C^{-1} (t - m)\right)}{\sqrt{(2\pi)^T \det(C)}} \tag{4.2} \]

\[ p(t; h_{d_i,k_j}) > p(t; h_{d_i,k_l}) \quad \forall l \neq j \tag{4.3} \]
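Equations 4.2 and 4.3 can be evaluated with a few lines of code. The sketch below is written in plain Python and works with the logarithm of Equation 4.2, which has its maximum at the same template as the probability itself. It assumes a symmetric, positive definite covariance matrix and uses a Cholesky factorization instead of an explicit inverse; the function names and the two made-up templates are hypothetical.

```python
import math


def cholesky(c):
    """Lower triangular L such that L times its transpose equals c."""
    n = len(c)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                d = c[i][i] - s
                if d <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                l[i][j] = math.sqrt(d)
            else:
                l[i][j] = (c[i][j] - s) / l[j][j]
    return l


def log_probability(t, m, c):
    """Natural logarithm of Equation 4.2 for trace t and template (m, c)."""
    l = cholesky(c)
    dev = [x - mu for x, mu in zip(t, m)]
    y = []                                  # forward substitution: L y = t - m
    for i in range(len(dev)):
        s = sum(l[i][k] * y[k] for k in range(i))
        y.append((dev[i] - s) / l[i][i])
    quad = sum(v * v for v in y)            # (t - m)' C^-1 (t - m)
    log_det = 2.0 * sum(math.log(l[i][i]) for i in range(len(dev)))
    return -0.5 * (len(dev) * math.log(2.0 * math.pi) + log_det + quad)


def best_template(t, templates):
    """Key of the template with the highest probability (Equation 4.3)."""
    return max(templates, key=lambda pair: log_probability(t, *templates[pair]))


# two made-up templates indexed by (data, key)
templates = {
    (5, 10): ([100.0, 120.0], [[4.0, 1.0], [1.0, 4.0]]),
    (5, 11): ([110.0, 90.0], [[4.0, 1.0], [1.0, 4.0]]),
}
print(best_template([101.0, 118.0], templates))
```

Working with logarithms avoids the numerical underflow that the exponential produces when the number of points T is large.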
4.4 Template Building
This step is open to different template building strategies. Following
the directions given in the previous section, and depending on whether our
intention is to characterize every instruction individually, a group of them
or a complete algorithm, we need a way to decide which are the interesting
points, that is, the ones that contain the most information about what we
are looking for. There are several strategies, but we will discuss two of
them.
Pairs of Data and Key The interesting points of a trace are the ones that
correlate with the data-key pair, as described in the previous section.
This technique can be used to derive, for example, a checksum calculation
that is done using logical instructions (and, xor, or, xnor, etc.) by
feeding different plain texts and segmenting the possible stages of the
algorithm. This can be seen as building templates for intermediate values
of an algorithm; since checksums do not usually perform complicated
calculations and have to be fast, this kind of template strategy is
suitable for the task.
Power Models Sometimes power traces do not leak the exact dependence
between data and keys but only a coarser relationship, for example one
based on the Hamming weight of the data. Then, the most common approach is
to reduce the number of templates by sharing one template among the values
that have the same Hamming weight. An important consideration is that
if a device does not leak any information about the Hamming weight,
it is not possible to retrieve information about the key using only one
power trace. This can be applied to the previous example as well if
only one power trace is available.
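The reduction obtained with a Hamming weight power model can be illustrated with a short sketch. It assumes 8 bit data, as in the microcontroller used in this thesis; the names hamming_weight and groups are hypothetical.

```python
from collections import defaultdict


def hamming_weight(value: int) -> int:
    """Number of bits set to one in an 8 bit value."""
    return bin(value & 0xFF).count("1")


# group the 256 possible byte values by their Hamming weight
groups = defaultdict(list)
for value in range(256):
    groups[hamming_weight(value)].append(value)

print(len(groups))        # number of templates needed under this model
print(groups[1])          # values that share the template for weight 1
```

Under this model one template per weight is enough, so the number of templates drops from 256 to 9, at the price of not being able to tell apart values with the same weight.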
4.5 Template Matching
Template matching is the last step in the process of performing SPA and
it requires numerical methods to calculate the matching template. Some
issues arise when calculating the probabilities using Equation 4.2; in
order to avoid the exponentiation, we take the logarithm of the probability
and select the template with the smallest absolute value, as in Equation
4.4.

\[ |\ln p(t; h_{d_i,k_j})| < |\ln p(t; h_{d_i,k_l})| \quad \forall l \neq j \tag{4.4} \]

Sometimes the covariance matrix turns out to be singular, which makes the
calculation impossible. For this reason, we can discard the covariances
between points, that is, replace C by the identity matrix, and avoid this
problem. This is called a reduced template and it consists of only a mean
vector. In a similar way as before, the smallest absolute value of the
logarithm indicates the correct guess. This method is called least-square
estimator (LSQ), see [MA02, ch. 11.2, 12.1-2].
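With the identity matrix in place of C, the logarithm of Equation 4.2 differs between templates only in the term (t - m)'(t - m), so matching reduces to finding the mean vector closest to the trace. A minimal sketch, with hypothetical function names and made-up mean vectors, could be:

```python
def squared_distance(t, m):
    """(t - m)'(t - m): sum of squared differences between trace and mean."""
    return sum((x - mu) ** 2 for x, mu in zip(t, m))


def match_reduced(t, means):
    """Key of the reduced template (mean vector only) closest to trace t."""
    return min(means, key=lambda name: squared_distance(t, means[name]))


# made-up mean vectors for two characterized instructions
means = {
    "movlw .1": [70.0, 80.0, 75.0],
    "movlw .255": [70.0, 140.0, 95.0],
}
print(match_reduced([71.0, 135.0, 96.0], means))
```

No matrix inversion is needed, which is why reduced templates remain usable when only a few traces are available per template.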
4.6 Examples
In this section we present some measurements that were taken during the
testing phase of the circuit. These are fairly basic captures whose only
purpose is to confirm that the system was working, without going further in
their analysis. We want to encourage the use of this platform for
further research with the device selected or with any other that could be
considered. In the next pages the reader can observe the different power
traces that were obtained either in the characterization stage of the device
or, in a later stage, as a test of more advanced cryptographic algorithms
implemented for 8 bit microcontrollers.
There are examples of single operations using different literals where we
can observe the differences in power consumption depending on the literals
used. This clearly shows the relationship that exists between power
consumption and the data being processed. We added a power trace of a NOP
operation to show that the power consumption did not vary when the
microcontroller was not executing anything.
[Plot: "SPA on movlw instruction"; x axis: Time [ns/div]; y axis: Voltage [mV/div]; traces: movlw .1, movlw .127, movlw .255]
Figure 7: Load accumulator power trace (1 vs 127 vs 255)
[Plot: "SPA on clrw instruction"; x axis: Time [ns/div]; y axis: Voltage [mV/div]; traces: clrw (W=1), clrw (W=127)]
Figure 8: Clear accumulator power trace (1 vs 127)
[Plot: "SPA on nop instruction"; x axis: Time [ns/div]; y axis: Voltage [mV/div]]
Figure 9: No Operation (NOP) power trace
[Plots: four panels; x axis: Time; y axis: Voltage [mV]]
Figure 10: AES-128 Power Trace at (from top left to bottom right) 200 ns, 200 us, 500 us and 20 us per division
5 Conclusion
This discussion has argued that it is possible to build a low cost circuit
suitable as an entry level design to do Power Analysis Attacks. This thesis
was undertaken to design an under 10 USD solution and evaluate it through
a set of experiments using Simple Power Analysis as the driving example. It
has been found that, generally, there is no need for expensive equipment
to perform this kind of research, and the evidence from this study suggests
that it is possible to build powerful attacks as a combination of software
and hardware for less than 400 USD. The methods used for this design may be
applied to other microprocessors or cryptographic devices that are widely
used nowadays without the need for an invasive or semi-invasive attack.
A number of caveats need to be noted regarding the present study. Due to
its low cost nature, some design constraints were applied that can lead
to poor performance in terms of noise. However, following some
recommendations when taking measurements will minimize the risks. The
physical size of the PCB is the most important constraint: a proper layout
using a slightly bigger board plus the addition of up to six layers can
drastically reduce the noise and so improve the stability of the circuit.
It is recommended that further designs based on the one proposed in this
study be undertaken in the following areas:
• Improve the software run by the microcontroller to make it
easier to "plug and play" algorithms or routines under study.
• Improve the layout of the PCB as described before to keep the noise
under control.
• Improve the communication protocol by adding the possibility of
complex commands that execute complex pieces of code and not only serve
to characterize the instruction set of the device.
The findings of this study have a number of important implications for
the future practice of low cost Power Analysis Attacks.
A Software Implementation
Listing 2: Main routine of the SPA software
;
; Bachelor Thesis
; SIMPLE POWER ANALYSIS
;
;
; Filename: main.asm
; Date: 20/04/2012
; File Version: 0.1a
;
; Author: Manuel Bejarano
; Contact: [email protected]
;
;
; Files Required: P16F628A.INC
;
;
; Notes:
;

    list p=16f628A
    #include <P16F628A.INC>
    include macros.inc

    extern init_uart
    extern send
    extern receive

    errorlevel -302

    __CONFIG _CP_OFF & DATA_CP_OFF & _LVP_ON & _BOREN_OFF & _MCLRE_ON & _WDTE_OFF & _PWRTE_ON & _EXTCLK_OSC

; variables
context udata_shr
w_temp          res 1   ; context saving
status_temp     res 1
b0_received     res 1   ; bytes received by RS232
b1_received     res 1
ex_instruction  res 1   ; opcode to execute
ex_literal      res 1
ex_repetitions  res 1
ex_checksum     res 1
spa_temp_reg    res 1

    global b0_received, b1_received

; constants
    constant cmd1 = 0x20    ; ack
    constant cmd2 = 0x30    ; reset
    constant cmd3 = 0x40    ; execute opcode

RESET CODE 0x0000
    goto init

ISR CODE 0x0004
    movwf w_temp            ; save context
    movf STATUS, W
    movwf status_temp

    ; ISR subroutines

    btfsc PIR1, RCIF        ; byte received on the UART?
    call receive

    movf status_temp, W     ; restore context
    movwf STATUS
    swapf w_temp, F
    swapf w_temp, W
    retfie

MAIN CODE
init:
    ; set unused ports as outputs
    clrf PORTA
    clrf PORTB
    banksel TRISA
    clrf TRISA
    movlw (1 << RB1) | (1 << RB2)   ; RB1(RX) and RB2(TX) as inputs
    movwf TRISB

    call init_uart          ; init UART

    bsf INTCON, PEIE        ; peripheral interrupts (needed for the UART receive interrupt)
    bsf INTCON, GIE         ; enable interrupts

main_loop:
    movf b0_received, F                         ; wait until the first byte is non-zero
    brz main_loop
    cflbie b1_received, cmd1, ack
    cflbie b1_received, cmd2, 0x0000            ; reset device
    cflbie b1_received, cmd3, process_command   ; execute opcode (cmd3, not cmd2)

process_command:
    call receive                        ; read 3 bytes [ AA BB CC ]
    movff b0_received, ex_instruction   ; AA: instruction to execute
    movff b1_received, ex_literal       ; BB: literal (00h if NA)
    call receive
    movff b0_received, ex_repetitions   ; CC: repetitions
    movf ex_instruction, W              ; last byte has to be AA xor BB xor CC
    xorwf ex_literal, W
    xorwf ex_repetitions, W
    movwf ex_checksum
    cflbie b1_received, ex_checksum, spa
    goto tx_error                       ; ask for the command to be resent

ack:                        ; ACK [ 00h 20h ]
    clrw
    call send
    movlw cmd1
    call send
    goto main_loop

tx_error:                   ; resend command [ 20h 20h ]
    movlw cmd1              ; cmd1 = 20h, as in Table 8
    call send
    call send
    goto main_loop

spa:
    movf ex_instruction, W  ; offset into the jump table (movlw would load the address)
    addwf PCL, F
    goto test_addwf         ; addwf
    goto test_andwf         ; andwf
    goto test_clrf          ; clrf
    goto test_clrw          ; clrw
    goto test_comf          ; comf
    goto test_decf          ; decf
    goto test_decfsz        ; decfsz
    goto test_incf          ; incf
    goto test_incfsz        ; incfsz
    goto test_iorwf         ; iorwf
    goto test_movf          ; movf
    goto test_movwf         ; movwf
    goto test_nop           ; nop
    goto test_rlf           ; rlf
    goto test_rrf           ; rrf
    goto test_subwf         ; subwf
    goto test_swapf         ; swapf
    goto test_xorwf         ; xorwf
    goto test_bsf           ; bsf
    goto test_bcf           ; bcf
    goto test_btfsc         ; btfsc
    goto test_btfss         ; btfss
    goto test_addlw         ; addlw
    goto test_andlw         ; andlw
    goto test_call          ; call
    goto test_clrwt         ; clrwdt
    goto test_goto          ; goto
    goto test_iorlw         ; iorlw
    goto test_movlw         ; movlw
    goto test_retfie        ; retfie
    goto test_retlw         ; retlw
    goto test_return        ; return
    goto test_sleep         ; sleep
    goto test_sublw         ; sublw
    goto test_xorlw         ; xorlw
    goto $

test_addwf:
    clrf spa_temp_reg
    movf ex_literal, W      ; load the operand value (movlw would load its address)
    settrigger
    addwf spa_temp_reg, F
    clrtrigger
    decfsz ex_literal, F
    goto test_addwf
    goto ack
test_andwf:
    clrf spa_temp_reg
    movf ex_literal, W
    movwf spa_temp_reg
    movf ex_repetitions, W  ; third byte used as mask
    settrigger
    andwf spa_temp_reg, F
    clrtrigger
    goto ack
test_clrf:
    settrigger
    clrf ex_literal
    clrtrigger
    goto ack
test_clrw:
    movf ex_literal, W
    settrigger
    clrw
    clrtrigger
    goto ack
test_comf:
    settrigger
    comf ex_literal, F
    clrtrigger
    goto ack
test_decf:
    settrigger
    decf ex_literal, F
    clrtrigger
    decfsz ex_repetitions, F
    goto test_decf
    goto ack
test_decfsz:
    settrigger
    decfsz ex_literal, F
    clrtrigger              ; skipped when the result is zero
    clrtrigger
    decfsz ex_repetitions, F
    goto test_decfsz
    goto ack
test_incf:
    settrigger
    incf ex_literal, F
    clrtrigger
    decfsz ex_repetitions, F
    goto test_incf
    goto ack
test_incfsz:
    settrigger
    incfsz ex_literal, F
    clrtrigger              ; skipped when the result is zero
    clrtrigger
    decfsz ex_repetitions, F
    goto test_incfsz
    goto ack
test_iorwf:
    clrf spa_temp_reg
    movf ex_literal, W      ; load the operand value (movlw would load its address)
    movwf spa_temp_reg
    movf ex_repetitions, W  ; third byte used as mask
    settrigger
    iorwf spa_temp_reg, F
    clrtrigger
    goto ack
test_movf:
    settrigger
    movf ex_literal, F
    clrtrigger
    goto ack
test_movwf:
    clrf spa_temp_reg
    movf ex_literal, W
    settrigger
    movwf spa_temp_reg
    clrtrigger
    goto ack
test_nop:
    settrigger
    nop
    clrtrigger
    goto ack
test_rlf:
    settrigger
    rlf ex_literal, F
    clrtrigger
    goto ack
test_rrf:
    settrigger
    rrf ex_literal, F
    clrtrigger
    goto ack
test_subwf:
    clrf spa_temp_reg
    movf ex_literal, W      ; load the operand value (movlw would load its address)
    settrigger
    subwf spa_temp_reg, F
    clrtrigger
    decfsz ex_literal, F
    goto test_subwf
    goto ack
test_swapf:
    settrigger
    swapf ex_literal, F
    clrtrigger
    goto ack
test_xorwf:
    clrf spa_temp_reg
    movf ex_literal, W
    movwf spa_temp_reg
    movf ex_repetitions, W  ; third byte used as mask
    settrigger
    xorwf spa_temp_reg, F
    clrtrigger
    goto ack
test_bsf:
    clrf spa_temp_reg
    movf ex_literal, W
    movwf spa_temp_reg
    settrigger
    bsf spa_temp_reg, 0
    clrtrigger
    goto ack
test_bcf:
    clrf spa_temp_reg
    movf ex_literal, W
    movwf spa_temp_reg
    settrigger
    bcf spa_temp_reg, 0
    clrtrigger
    goto ack
test_btfsc:
    clrf spa_temp_reg
    movf ex_literal, W
    movwf spa_temp_reg
    settrigger
    btfsc spa_temp_reg, 0
    clrtrigger              ; skipped when bit 0 is clear
    clrtrigger
    goto ack
test_btfss:
    clrf spa_temp_reg
    movf ex_literal, W
    movwf spa_temp_reg
    settrigger
    btfss spa_temp_reg, 0
    clrtrigger              ; skipped when bit 0 is set
    clrtrigger
    goto ack
test_addlw:
    clrf spa_temp_reg
    movf ex_literal, W      ; load the operand value (movlw would load its address)
    settrigger
    addlw ex_repetitions    ; literal fixed at assembly time (address of ex_repetitions)
    clrtrigger
    goto ack
test_andlw:
    clrf spa_temp_reg
    movf ex_literal, W
    settrigger
    andlw ex_repetitions    ; literal fixed at assembly time
    clrtrigger
    goto ack
test_call:
    settrigger
    call test_call2
    goto ack
test_call2:
    clrtrigger
    return
test_goto:
    settrigger
    goto $+1
    clrtrigger
    goto ack
test_iorlw:
    clrf spa_temp_reg
    movf ex_literal, W
    settrigger
    iorlw ex_repetitions    ; literal fixed at assembly time
    clrtrigger
    goto ack
test_movlw:
    settrigger
    movlw ex_literal        ; literal fixed at assembly time (address of ex_literal)
    clrtrigger
    goto ack
test_return:
    call test_return2
    clrtrigger
    goto ack
test_return2:
    settrigger
    return
test_sublw:
    clrf spa_temp_reg
    movf ex_literal, W
    settrigger
    sublw ex_repetitions    ; literal fixed at assembly time
    clrtrigger
    goto ack
test_xorlw:
    clrf spa_temp_reg
    movf ex_literal, W
    settrigger
    xorlw ex_repetitions    ; literal fixed at assembly time
    clrtrigger
    goto ack

; TODO: Special uC functions
test_clrwt:
test_retfie:
test_retlw:
test_sleep:
    goto ack

;EEPROM CODE 0x2100
;   de 0x00, 0x01, 0x02, 0x03

    END
Listing 3: UART Routines
#include <P16F628A.INC>
    include macros.inc

    global  init_uart
    global  send
    global  receive

    extern  b0_received
    extern  b1_received

    errorlevel -302

    CODE
init_uart:
    banksel SPBRG
    movlw   .25                         ; 9600 bps
    movwf   SPBRG
    movlw   (1 << TXEN) | (1 << BRGH)
    movwf   TXSTA                       ; enable asynchronous transmission
    movlw   (1 << RCIE)
    movwf   PIE1                        ; enable interrupt when receiving
    banksel RCSTA
    movlw   (1 << CREN) | (1 << SPEN)
    movwf   RCSTA
    movf    RCREG, W                    ; clear FIFO buffer
    movf    RCREG, W
    movf    RCREG, W
    return

send:
    movwf   TXREG
    banksel TXSTA
    btfss   TXSTA, TRMT
    goto    $-1
    bcf     STATUS, RP0                 ; back to bank 0 so the next movwf reaches TXREG (as in Listing 4)
    return

receive:
    movff   RCREG, b0_received
    movff   RCREG, b1_received
    banksel RCSTA
    brset   RCSTA, FERR, init_uart      ; check for frame error
    brset   RCSTA, OERR, init_uart      ; check for FIFO overrun error
    return

    END
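
The init_uart routine loads SPBRG with .25 and sets BRGH, with the comment "9600 bps". As a cross-check, the sketch below applies the high-speed asynchronous baud rate formula of the PIC16F628A, baud = Fosc / (16 * (SPBRG + 1)). The 4 MHz clock is an assumption taken from the 4 MHz clock sources in the BOM, and the function names are hypothetical helpers that are not part of the thesis software.

```python
def spbrg_for(baud, fosc_hz=4_000_000, brgh=True):
    """Return the SPBRG value that comes closest to the requested baud rate."""
    divisor = 16 if brgh else 64          # BRGH = 1 selects the /16 divider
    return round(fosc_hz / (divisor * baud)) - 1


def actual_baud(spbrg, fosc_hz=4_000_000, brgh=True):
    """Return the baud rate actually generated for a given SPBRG value."""
    divisor = 16 if brgh else 64
    return fosc_hz / (divisor * (spbrg + 1))


if __name__ == "__main__":
    spbrg = spbrg_for(9600)
    rate = actual_baud(spbrg)
    error_pct = 100 * (rate - 9600) / 9600
    print(spbrg, round(rate, 1), round(error_pct, 2))   # 25 9615.4 0.16
```

Under that assumption the value .25 gives about 9615 bps, roughly 0.16 % above the nominal 9600 bps, which is well within what an asynchronous receiver tolerates.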
Listing 4: Simple measurement template
    list    p=16f628A
#include <p16F628a.inc>
    include macros.inc          ; provides settrigger, clrtrigger, cflcie and brset used below

    errorlevel -302

    __CONFIG _CP_OFF & DATA_CP_OFF & _LVP_ON & _BOREN_OFF & _WDTE_OFF & _PWRTE_ON & _INTOSC_OSC_NOCLKOUT

; variables
context udata_shr
dataL       res 1
b0_received res 1
d1          res 1
d2          res 1
d3          res 1

RESET CODE 0x0000
    goto    init

MAIN CODE
init:
    ; set unused ports as outputs
    clrf    PORTA
    clrf    PORTB
    banksel TRISA
    clrf    TRISA
    movlw   (1 << TRISB1) | (1 << TRISB2)   ; RB1 (RX) and RB2 (TX) as inputs
    movwf   TRISB

init_uart:
    banksel SPBRG
    movlw   .25                         ; 9600 bps
    movwf   SPBRG
    movlw   (1 << BRGH) | (1 << TXEN)
    movwf   TXSTA                       ; enable asynchronous transmission
    banksel RCSTA
    movlw   (1 << CREN) | (1 << SPEN)
    movwf   RCSTA

    movf    RCREG, W                    ; flush buffer
    movf    RCREG, W
    movf    RCREG, W

;   clrf    dataL
;settle:
;   decfsz  dataL, F
;   goto    settle

    call    Delay
    call    message

main_loop:
    call    receive
    movwf   b0_received
    cflcie  b0_received, 'm', spa       ; perform SPA
;   call    send
;   call    Delay
;   call    message

    goto    main_loop

spa:
    settrigger

    ; ********************
    ;
    ; PLACE YOUR CODE HERE
    ;
    ; ********************

    call    Delay
    clrtrigger
    call    ack
    return

message:
    movlw   'U'
    call    send
    movlw   'A'
    call    send
    movlw   'R'
    call    send
    movlw   'T'
    call    send
    movlw   ' '
    call    send
    movlw   '['
    call    send
    movlw   'O'
    call    send
    movlw   'K'
    call    send
    movlw   ']'
    call    send
    movlw   0x0D                        ; CR
    call    send
    movlw   0x0A                        ; LF
    call    send
    return

ack:
    movlw   'R'
    call    send
    movlw   'X'
    call    send
    movlw   0x0D                        ; CR
    call    send
    movlw   0x0A                        ; LF
    call    send
    return

send:
    movwf   TXREG
    banksel TXSTA
    btfss   TXSTA, TRMT
    goto    $-1
    bcf     STATUS, RP0
    return

receive:
    btfss   PIR1, RCIF
    goto    $-1
    brset   RCSTA, FERR, frame_error    ; check for frame error
    brset   RCSTA, OERR, overrun_error  ; check for FIFO overrun error
    movf    RCREG, W
    goto    end_receive

overrun_error:
    bcf     RCSTA, CREN                 ; reset the receiver logic
    movf    RCREG, W                    ; flush buffer
    movf    RCREG, W
    movf    RCREG, W
    bsf     RCSTA, CREN                 ; enable reception again
    goto    end_receive

frame_error:
    movf    RCREG, W                    ; discard received data that has error

end_receive:
    return

Delay:
    ; 499994 cycles
    movlw   0x03
    movwf   d1
    movlw   0x18
    movwf   d2
    movlw   0x02
    movwf   d3
Delay_0:
    decfsz  d1, f
    goto    $+2
    decfsz  d2, f
    goto    $+2
    decfsz  d3, f
    goto    Delay_0

    ; 2 cycles
    goto    $+1

    ; 4 cycles (including call)
    return

    END
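
The Delay routine is annotated with three cycle counts (499994, 2 and 4). The following Python sketch, which is not part of the thesis software, replays the loop to check those figures. It assumes the usual mid-range PIC timing: one instruction cycle per instruction, two cycles for goto, call and return, and two cycles for a decfsz that skips. The function names are hypothetical.

```python
def delay_loop_cycles(d1=0x03, d2=0x18, d3=0x02):
    """Cycles from the first movlw of Delay until the Delay_0 loop exits."""
    cycles = 6                      # three movlw/movwf pairs, one cycle each
    while True:
        d1 = (d1 - 1) & 0xFF
        if d1:
            cycles += 1 + 2 + 2 + 2  # decfsz d1, goto $+2, goto $+2, goto Delay_0
            continue
        cycles += 2                 # decfsz d1 reached zero and skipped
        d2 = (d2 - 1) & 0xFF
        if d2:
            cycles += 1 + 2 + 2     # decfsz d2, goto $+2, goto Delay_0
            continue
        cycles += 2                 # decfsz d2 reached zero and skipped
        d3 = (d3 - 1) & 0xFF
        if d3:
            cycles += 1 + 2         # decfsz d3, goto Delay_0
            continue
        cycles += 2                 # final skip leaves the loop
        return cycles


def delay_total_cycles():
    """Add the trailing goto $+1 (2 cycles) and the call/return pair (4 cycles)."""
    return delay_loop_cycles() + 2 + 4


if __name__ == "__main__":
    print(delay_loop_cycles())      # 499994
    print(delay_total_cycles())     # 500000
```

Every pass through the loop costs 7 cycles except the last, which costs 6, so the delay depends only on the three initial counter values. Assuming the nominal 4 MHz clock (1 µs per instruction cycle), 500000 cycles correspond to 0.5 s.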
Listing 5: Useful macros
#define skifz       btfss STATUS, Z     ; Skip if zero
#define skifnz      btfsc STATUS, Z     ; Skip if not zero
#define settrigger  bsf PORTB, RB5      ; set measurement trigger
#define clrtrigger  bcf PORTB, RB5      ; clear trigger

; *** Branch if Zero
brz     macro   WHERE
    skifnz
    goto    WHERE
    endm

; *** If FILE1 = LIT1 then Branch
cflbie  macro   FILE1, LIT1, WHERE
    movf    FILE1, W
    sublw   LIT1
    btfsc   STATUS, Z
    goto    WHERE
    endm

; *** If FILE1 = LIT1 then Call
cflcie  macro   FILE1, LIT1, WHERE
    movf    FILE1, W
    sublw   LIT1
    btfsc   STATUS, Z
    call    WHERE
    endm

; *** movff: copy register HERE to register THERE (overwrites W)
movff   macro   HERE, THERE
    movf    HERE, W
    movwf   THERE
    endm

; *** brset: branch to WHERE if bit BIT of WHO is set
brset   macro   WHO, BIT, WHERE
    btfsc   WHO, BIT
    goto    WHERE
    endm
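
The compare macros rely on the Z flag produced by sublw. As a minimal sketch of that logic, the Python model below (hypothetical helper names, not part of the thesis software) reproduces SUBLW as W = literal - W on 8 bits and shows when the goto or call inside cflbie and cflcie is reached.

```python
def sublw(literal, w):
    """Model SUBLW k: W <- k - W on 8 bits; return (result, Z, C)."""
    result = (literal - w) & 0xFF
    zero = result == 0
    carry = literal >= w            # on the PIC, C = 1 means no borrow occurred
    return result, zero, carry


def compare_macro_fires(file_value, literal):
    """True when the goto/call inside cflbie/cflcie is executed."""
    _, zero, _ = sublw(literal, file_value)
    # btfsc STATUS, Z skips the next instruction only when Z is clear,
    # so the goto/call runs exactly when Z is set, i.e. on equality.
    return zero


if __name__ == "__main__":
    print(compare_macro_fires(ord("m"), ord("m")))   # True
    print(compare_macro_fires(ord("n"), ord("m")))   # False
```

In the measurement template, cflcie b0_received, 'm', spa therefore calls spa only when the received byte equals 'm'. Both macros overwrite W with the subtraction result.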
B Bill Of Materials (BOM)
Value / Part      Qty   Designators
0Ω                2     R3, R7
100nF             11    C1, C2, C5, C8..C13, C17, C19
100pF             1     C18
18pF              2     C6, C7
1Ω                1     R1
2.2uF             1     C14
22Ω               3     R4, R5, R6
4.7kΩ             1     R2
4.7uF             1     C4
470pF             1     C20
470uF             1     C15
47uF              2     C3, C16
742792112         2     L1, L2
74LX1G04          1     U3
9C-4.000MAAJ-T    1     X1
ADM3251E          1     U4
BNC               1     P3
CB3-3C-4M0000     1     U1
CDBA120-G         1     D1
ICSP              1     P4
JUMPER            6     JP1, JP2, JP3, JP4, JP5, JP6
MIC5205           1     U5
PIC16F628A        1     U2
POWER             1     P6
PROBE             1     P1
TRIGGER           1     P2
UART              1     P5
Table 11: Bill Of Materials
C Gerber Files
Figure 11: Bottom Copper Layer
Figure 12: Top Copper Layer
Figure 13: Bottom Solder Mask Layer
Figure 14: Top Solder Mask Layer
Figure 15: Bottom Silk Screen Layer
Figure 16: Top Silk Screen Layer
Figure 17: Drill File
D Circuit Schematic
63
References
[And08] Ross J. Anderson. Security Engineering: A Guide to Building
Dependable Distributed Systems. Wiley Publishing, 2 edition, 2008.
[Ber11] Bertho. Decoupling by example - bypassing currents in overdrive.
http://www.vagrearg.org/?p=decoupling, November 2011.
[Coo07] Clyde F. Coombs. Printed Circuits Handbook. McGraw-Hill Pro-
fessional, 2007.
[Cor04] TXC Corporation. 4 MHz Quartz Crystal, SMD HC-49S, 9C SE-
RIES Datasheet. http://www.txccrystal.com/images/pdf/9c.
pdf, 2004.
[Cor08] CTS Corporation. Model CB3/CB3LV 7.0x5.0mm Low
Cost HCMOS/TTL Clock Oscillator Datasheet. http://www.
ctscorp.com/components/Datasheets/008-0256-0_F.pdf, Oc-
tober 2008.
[Dev10] Analog Devices. ADM3251E Isolated Single-Channel RS-
232 Line Driver/Receiver Datasheet (rev. E). http:
//www.analog.com/en/interface/digital-isolators/
adm3251e/products/product.html, May 2010.
[GMO01] Karine Gandolfi, Christophe Mourtel, and Francis Olivier. Elec-
tromagnetic analysis: Concrete results. In Proceedings of the Third
International Workshop on Cryptographic Hardware and Embed-
ded Systems, CHES ’01, pages 251–261, London, UK, UK, 2001.
Springer-Verlag.
[GPI04] Iec/ieee standard for higher performance protocol for the standard
digital interface for programmable instrumentation - part 1: Gen-
eral (adoption of ieee std 488.1-2003). IEC 60488-1First edition
2004-07; IEEE 488.1, pages 01 –158, 2004.
66
[KJJ99] Paul C. Kocher, Joshua Jaffe, and Benjamin Jun. Differential
power analysis. In Proceedings of the 19th Annual International
Cryptology Conference on Advances in Cryptology, CRYPTO ’99,
pages 388–397, London, UK, UK, 1999. Springer-Verlag.
[KK99] Oliver Kommerling and Markus G. Kuhn. Design principles
for tamper-resistant smartcard processors. In Proceedings of the
USENIX Workshop on Smartcard Technology on USENIX Work-
shop on Smartcard Technology, WOST’99, pages 2–2, Berkeley,
CA, USA, 1999. USENIX Association.
[Koc96] Paul C. Kocher. Timing attacks on implementations of diffie-
hellman, rsa, dss, and other systems. In Proceedings of the 16th
Annual International Cryptology Conference on Advances in Cryp-
tology, CRYPTO ’96, pages 104–113, London, UK, UK, 1996.
Springer-Verlag.
[KSS+08] Toshihiro Katashita, Akashi Satoh, Takeshi Sugawara, Naofumi
Homma, and Takafumi Aoki. Enhanced correlation power analysis
using key screening technique. In Proceedings of the 2008 Interna-
tional Conference on Reconfigurable Computing and FPGAs, RE-
CONFIG ’08, pages 403–408, Washington, DC, USA, 2008. IEEE
Computer Society.
[KSS+09] T. Katashita, A. Satoh, T. Sugawara, N. Homma, and T. Aoki.
Development of side-channel attack standard evaluation environ-
ment. In Circuit Theory and Design, 2009. ECCTD 2009. Euro-
pean Conference on, pages 403 –408, aug. 2009.
[MA02] J. Susan Milton and Jesse C. Arnold. Introduction to Probability
and Statistics: Principles and Applications for Engineering and
the Computing Sciences. McGraw-Hill, Inc., New York, NY, USA,
4th edition, 2002.
67
[Mic06] Micrel. Micrel 150mA Low-Noise LDO Regulator Datasheet.
http://www.micrel.com/page.do?page=/product-info/
products/mic5205.shtml, February 2006.
[Mic09] Microchip. PIC 16F628A Datasheet (18-pin Flash-Based,
8-Bit CMOS Microcontrollers with nanoWatt Technology).
http://www.microchip.com/wwwproducts/devices.aspx?
ddocname=en010210, October 2009.
[Mic10] Microchip. Microchip PICkitTM 3 Programmer/Debugger
User’s Guide). http://www.microchip.com/stellent/idcplg?
IdcService=SS_GET_PAGE&nodeId=1406&dDocName=en538340,
July 2010.
[MOP07] Stefan Mangard, Elisabeth Oswald, and Thomas Popp. Power
Analysis Attacks: Revealing the Secrets of Smart Cards (Advances
in Information Security). Springer-Verlag New York, Inc., Secau-
cus, NJ, USA, 2007.
[O’H01] Martin O’Hara. Grounding strategies for printed circuit
boards. http://www.ce-mag.com/ce-mag.com/archive/01/03/
0103CE_028.html, April 2001.
[OS09] Alan V. Oppenheim and Ronald W. Schafer. Discrete-Time Signal
Processing. Prentice Hall Press, Upper Saddle River, NJ, USA, 3rd
edition, 2009.
[Pau92] C.R. Paul. Effectiveness of multiple decoupling capacitors. Elec-
tromagnetic Compatibility, IEEE Transactions on, 34(2):130 –133,
May 1992.
[Per08a] Edi Permadi. cryptonutter - embedding cryptographic and hash
function. http://code.google.com/p/cryptonutter/, February
2008.
[Per08b] Edi Permadi. The hardware side of cryptography. http://
edipermadi.wordpress.com/, January 2008.
68
[SA03] Sergei P. Skorobogatov and Ross J. Anderson. Optical fault induc-
tion attacks. In Revised Papers from the 4th International Work-
shop on Cryptographic Hardware and Embedded Systems, CHES
’02, pages 2–12, London, UK, UK, 2003. Springer-Verlag.
[SAS07] SASEBO. Side-channel attack standard evaluation board (sasebo).
http://www.morita-tech.co.jp/SASEBO/en/index.html, 2007.
[Sko05] Sergei P. Skorobogatov. Semi-invasive attacks – A new approach to
hardware security analysis. Technical Report UCAM-CL-TR-630,
University of Cambridge, Computer Laboratory, April 2005.
[Smi03] Eric Smith. Eric’s crypto software. http://www.brouhaha.com/
~eric/crypto/, August 2003.
[SSAQ02] David Samyde, Sergei Skorobogatov, Ross Anderson, and Jean-
Jacques Quisquater. On a new way to read data from memory.
In Proceedings of the First International IEEE Security in Storage
Workshop, SISW ’02, pages 65–, Washington, DC, USA, 2002.
IEEE Computer Society.
[STM04] STMicroelectronics. Low Voltage CMOS Single Inverter With 5V
Tolerant Input Datasheet. http://www.st.com/internet/com/
TECHNICAL_RESOURCES/TECHNICAL_LITERATURE/DATASHEET/
CD00002704.pdf, April 2004.
[Wil12] Peter Wilson. The Circuit Designer’s Companion, Third Edition.
Newnes, Newton, MA, USA, 3rd edition, 2012.
69