
CHAPTER 12

Hardware and Security: Vulnerabilities and Solutions

Gedare Bloom, Eugen Leontie, Bhagirath Narahari, Rahul Simha

12.1. INTRODUCTION

This chapter introduces the role that computer hardware plays for attack and defense in cyber-physical systems. Hardware security – whether for attack or defense – differs from software, network, and data security because of the nature of hardware. Often, hardware design and manufacturing occur before or during software development, and as a result, we must consider hardware security early in product life cycles. Yet, hardware executes the software that controls a cyber-physical system, so hardware is the last line of defense before damage is done – if an attacker compromises hardware then software security mechanisms may be useless. Hardware also has a longer lifespan than most software because after we deploy hardware we usually cannot update it, short of wholesale replacement, whereas we can update software by uploading new code, often remotely. Even after hardware outlives its usefulness, we must dispose of it properly or risk attacks such as theft of the data or software still resident in the hardware. So, hardware security concerns the entire lifespan of a cyber-physical system, from before design until after retirement.

In this chapter, we consider two aspects of hardware security: security in the processor supply chain and hardware mechanisms that provide software with a secure execution environment. We start by exploring the security threats that arise during the major phases of the processor supply chain (Section 12.2). One such threat is the Trojan circuit, an insidious attack that involves planting a vulnerability in a processor sometime between design and fabrication that manifests as an exploit after the processor has been integrated, tested, and deployed as part of a system. We discuss ways to test for Trojan circuits (Section 12.2.1), how design automation tools can improve the trustworthiness of design and fabrication to reduce the likelihood of successful supply chain attacks (Section 12.2.2), defensive techniques that modify a computer processor's architecture to detect runtime deviations (Section 12.2.3), and how software might check the correctness of its execution by verifying the underlying hardware (Section 12.2.4).

We begin the second aspect of hardware security – how hardware can support software to provide secure execution throughout a cyber-physical system's lifetime – by introducing how hardware supports secure systems (Section 12.3). One contribution of hardware is that it can implement security mechanisms with smaller performance penalties than software implementations. Memory isolation techniques rely on hardware to enforce isolation primitives efficiently (Section 12.3.1). Cryptographic accelerators use hardware parallelism to speed up cryptographic primitives and protocols (Section 12.3.3). Another contribution of hardware is that it can provide a trusted base for computing: if the hardware is trustworthy or secure, then we can build a secure system from trusted hardware primitives. Granted, if the hardware is compromised by some of the attacks presented in Section 12.2, then the trusted base is not trustworthy and our system would be insecure. Secure coprocessors can be a trusted base for computing if they are manufactured properly, protected from physical compromise, and provide an isolated platform that implements security mechanisms – for example, cryptographic primitives – for use by the rest of the system (Section 12.3.4). Physical compromise of a computer introduces sophisticated attacks that target otherwise unobservable components like system bus traffic and memory contents; one countermeasure against physical attack is to place the processor in a tamper-proof enclosure and encrypt (decrypt) everything that exits (enters) the processor (Section 12.3.5). Even with the above security measures, if an application contains an exploitable security vulnerability, then malicious code can enter the system. Hardware techniques can mitigate the potential that software vulnerabilities are exploitable by protecting an application from software-based attacks (Section 12.3.2). We conclude this chapter with some areas for future work and exercises that demonstrate the concepts of hardware security.

12.2. HARDWARE SUPPLY CHAIN SECURITY

For years, Moore's Law has predicted the success of the semiconductor integrated circuit (IC or chip) manufacturing business: a new IC can be produced with twice as many transistors as a similar IC made less than two years prior. With each new generation of smaller transistor sizes, IC fabrication plants, called foundries or fabs, must update their tools to support the new technology, an expensive cost to incur every other year.

The increase in costs and demand for new chips, coupled with decreased production costs overseas, has led to the globalization of the semiconductor design and fabrication industry, which in turn raises concerns about the vulnerability of ICs to subversion. In particular, how can chip manufacturers, who still design and sell but no longer fabricate chips, verify that the chips they sell are authentic to the designs? Currently, this problem is solved by trust: end users trust chip manufacturers, who trust chip fabricators. But what if chip fabrication is not trustworthy?

Concerns about IC fabrication seem to have originated in the military sector, with DARPA issuing a Broad Agency Announcement (BAA) in 2006, and again in 2007, requesting proposals for the TRUST in Integrated Circuits (TIC) program [1, 2]. Adee [3] gives an overview of the TIC program and IC supply chain problems. Semiconductor Equipment and Materials International (SEMI) also has acknowledged the problems inherent in outsourced fabrication and industry stratification [4].

Possible attacks on the IC supply chain include maliciously modifying ICs, copying ICs to produce cloned devices, or stealing intellectual property (IP). A malicious modification to an IC is called a Trojan circuit or hardware Trojan, because the modification hides itself within the IC and possibly even provides correct, useful functionality before it attacks, much like its eponymous myth. Cloned or counterfeit ICs lead to lost revenue and may be less reliable than the original, potentially affecting the reputation of the original device manufacturer and the security of the end user. IP theft is similar to counterfeiting but may be used to create competing devices without incurring the research and development costs. All of these attacks may cause financial harm to the original designer or device manufacturer, while also introducing potential risks to end users due to loss of reliability and security.

FIGURE 12-1 IC Supply Chain. The chain runs from specification and design (drawing on HDL code, standard-cell libraries, IP cores, and design tools) through fabrication to test, with levels of control ranging from trusted through semi-trusted (mixed trust) to untrusted. End users trust chip designers, whose name appears on the final product, but design houses outsource much of their work to lesser-known companies, hence introducing semi-trusted and untrusted phases. These phases introduce vulnerabilities that attackers can exploit before the designer incorporates its IC in a device destined for end users.

Supply chain attacks can happen during any of the untrusted phases prior to deployment within an electronic device. The primary phases of the IC supply chain, shown in Figure 12-1, are design, fabrication, and test. Categorical levels of trust are assigned from the end user's point of view, with the assumption that chip designers are trusted.

The IC design phase consists of all the code and other inputs to the tools that will generate the specification for the foundry processes. Some of the design processes are observable and auditable, and therefore trusted. However, third-party IP is increasingly used in a protected manner such that designers and consumers cannot verify the contents of the IP. Thus, the design phase has mixed levels of trust, and IC design is vulnerable to supply chain attacks. Fabrication processes are similarly untrusted due to a lack of verifiability and trust at offshore foundries. Only the test phase is trusted in its entirety.

The test phase ensures each IC passes a set of test procedures that check for manufacturing faults based on the known design. Because the IC may be modified in unknown ways that may activate malicious activity under rare circumstances, the current test phase cannot detect Trojan circuits. Furthermore, IP theft and device cloning occur before the test phase, so it is not sufficient for detecting any of the above attacks, even if it is trusted.

As designing circuits has become more complicated, commercial pressures have driven technologies that improve time-to-market. One such technology is reconfigurable logic, for example, field programmable gate array (FPGA) devices. A reconfigurable device can be reprogrammed after it is deployed. Another technology that has gained momentum is soft IP: components of circuit design that are deployed as source code, for example, as a library of hardware description language (HDL) files. Both reconfigurable logic and soft IP make IP theft and cloning easier.

Reconfigurable logic is both a boon and a burden for designing secure circuits. On the positive side, reconfigurability decouples manufacturing from design, in contrast to the sequential supply chain demonstrated in Figure 12-1. Thus, a sensitive design does not need to be sent to the foundry. On the negative side, the design and deployed device can be more easily accessed by an attacker than a fixed-logic design. For bitstream reconfigurable devices, such as FPGAs, the security of the design depends on the security of the bitstream. Trimberger [5] details some of the security concerns and solutions for FPGA designs of trusted devices.


The following sections discuss how research is attempting to solve the problems of the modern IC supply chain.

12.2.1. Testing for Trojan Circuits

One approach to Trojan circuit detection is to look for additions and modifications to an IC in a laboratory setting prior to deploying the final product. Lab-based detection augments the test phase of the supply chain with silicon design authentication, which verifies that an IC contains only the designer's intended functionality and nothing else [6]. Silicon design authentication is difficult due to increasing IC complexity, shrinking feature sizes, rarity of conditions for activating malicious logic, and lack of knowledge about what functionality has been added to the IC. Testing cannot capture the behavior of added functionality, so an IC could contain all of the intended functionality while having extra malicious logic that might not be activated or detected by tests. Detection by physical inspection is insufficient due to the complexity and size of modern ICs. Destructive approaches are too costly and cannot be applied to every IC. Detection based on physical characteristics of the IC is imprecise because fabrication introduces physical variations that increase as ICs shrink. Two promising directions of active research in Trojan circuit detection at the silicon design authentication step are side-channel analysis and Trojan activation.

Side-channel analysis relies on variations in signals, usually in the analog domain, that an IC emits and a Trojan circuit would affect – changes to the behavior or physical characteristics of an IC may change non-behavioral traits when the IC is tested. For example, the presence of a Trojan circuit may cause deviations in a chip's power dissipation, current, temperature, or timing. Seminal work by Agrawal et al. [7] uses power analysis for detection of Trojan circuits; they also suggest some other side-channels to investigate. Other research expands on side-channel analysis by refining power-based approaches for Trojan detection [8–10], investigating other side-channel detection techniques [11, 12], and characterizing the gate-level behaviors of an IC [13, 14].
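
To make the flavor of power-based detection concrete, the following is a minimal sketch of the statistical core of such a test, assuming we already have averaged power traces from a known-good ("golden") chip and from the chip under test. The trace length, the Pearson-correlation test, and the 0.95 threshold are illustrative choices, not taken from the cited work.

    #include <math.h>

    #define TRACE_LEN 1024  /* samples per averaged power trace (illustrative) */

    /* Pearson correlation between two power traces. */
    static double correlate(const double *x, const double *y, int n) {
        double mx = 0, my = 0;
        for (int i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
        mx /= n; my /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - mx) * (y[i] - my);
            sxx += (x[i] - mx) * (x[i] - mx);
            syy += (y[i] - my) * (y[i] - my);
        }
        return sxy / sqrt(sxx * syy);
    }

    /* Flag the chip under test if its trace correlates poorly with the
     * golden trace; the threshold must absorb benign process variation. */
    int trojan_suspected(const double *golden, const double *measured) {
        return correlate(golden, measured, TRACE_LEN) < 0.95; /* illustrative */
    }

In practice, the threshold is the hard part: it must be loose enough to tolerate manufacturing variation yet tight enough to catch the small contribution of a dormant Trojan.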

Trojan activation techniques attempt to trigger a Trojan circuit during silicon design authentication to make the malicious behavior observable or to improve side-channel analysis techniques. A motivating assumption is that attackers are likely to target the least-activated circuitry in an IC, so researchers have explored methods for generating inputs that activate an IC where Trojan circuits are likely to be hidden [15–17]. Other techniques attempt to explore the state space of the circuit in new ways, for example, via randomization [18] or by partitioning the IC into regions [9, 19] in which areas of the IC are isolated and then tested for Trojan activity.
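
A sketch of the idea behind activation-driven test generation follows, assuming a netlist simulator hook (the simulate() stub below is a stand-in, not a real tool's API): random input vectors are kept only if they toggle nets that have rarely toggled so far, since rarely exercised logic is where a trigger is most likely hidden. The sizes and the rarity threshold are illustrative.

    #include <stdlib.h>

    #define NUM_NETS   4096  /* internal nets in the netlist model (illustrative) */
    #define NUM_INPUTS 64
    #define RARE       4     /* a net is "rare" if toggled fewer than 4 times so far */

    /* Stand-in for a netlist simulator: applies an input vector and marks
     * which internal nets toggled. A real flow would call a logic simulator. */
    static void simulate(const unsigned char in[NUM_INPUTS],
                         unsigned char toggled[NUM_NETS]) {
        (void)in;
        for (int n = 0; n < NUM_NETS; n++)
            toggled[n] = (unsigned char)(rand() & 1);  /* placeholder behavior */
    }

    /* Keep an input vector only if it exercises at least one rare net;
     * activity[] accumulates how often each net has toggled across all tests. */
    int is_useful_vector(const unsigned char in[NUM_INPUTS],
                         unsigned long activity[NUM_NETS]) {
        unsigned char toggled[NUM_NETS];
        int hits_rare_net = 0;
        simulate(in, toggled);
        for (int n = 0; n < NUM_NETS; n++) {
            if (toggled[n] && activity[n] < RARE)
                hits_rare_net = 1;
            activity[n] += toggled[n];
        }
        return hits_rare_net;
    }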

The silicon design authentication techniques are useful for detecting the presence of malicious circuitry prior to deploying devices. Physical measures, such as leakage current or path delay, characterize chips and can expose anomalies that may indicate malicious logic. Usually, the characteristics under measurement must be known for a "golden" chip that is fabricated securely or modeled based on the IC specifications and manufacturing processes. Physical variations in the manufacturing process make side-channel analysis probabilistic, and IC complexity hinders activation techniques. Approaches that combine the two techniques appear to offer good tradeoffs in terms of accuracy and cost. The following section reviews circuit design techniques that may assist in silicon design authentication by improving the initial design phase of the IC supply chain.

12.2.2. Design for Hardware Trust

Computer-aided design (CAD) and electronic design automation (EDA) tools are crucial to the efficiency and success of circuit design. These tools, however, neglect IC supply chain problems. This section describes ways to improve design tools and processes to defend against IC supply chain attacks.


Design tools focus mainly on design for test (DFT) or design for manufacturability (DFM). Researchers have proposed that the design phase be augmented to improve hardware trust, introducing the notion of design for hardware trust (DFHT). DFHT seeks to prevent Trojan circuits from being introduced during design or fabrication. An example of a DFHT technique is to modify the design tools such that low-probability transitions in a circuit are increased [20, 21], which presumably frustrates an attacker's job of finding an input that can trigger a Trojan circuit yet is unlikely to occur in testing.

Watermarking. Watermarking is an established technique for marking property such that counterfeit goods are detectable; the next chapter discusses digital watermarks for detecting software piracy at greater length. IC watermarking [22] helps to detect counterfeit devices and IP theft. Metadata are embedded within the IC design, so the manufactured product contains an identifiable watermark that does not affect the circuit's functionality. Effective watermarking requires that the watermark is not easy to copy. A downside to watermarking is that pirated devices must be re-acquired and verified by checking the watermark.

Lach et al. [23, 24] propose watermarks for FPGA design components with the intent of identifying a component's vendor and consumer. FPGA designs are challenging to watermark because of the flexibility of the architecture. Distributing the output of cryptographic hash functions across the design raises the barrier to attacks that attempt to reconfigure the FPGA to copy or remove watermarks.

Soft IP cores ease the design process, but they also reveal the hardware design of a component in its entirety. A challenge is to watermark soft IP without losing the benefit of IP blocks. Yuan et al. [25] describe this problem and evaluate practical solutions at the Verilog HDL level, demonstrating that watermarks can be embedded in soft IP such that they are recoverable in the synthesized design. Lin et al. [26] address the challenge of combining FPGA and soft IP designs. Castillo et al. [27] consider a signature-based approach that can protect against IP theft of both soft and hard IP cores.

Fingerprints, PUFs, and Metering. Building on the notion of watermarking, a device fingerprint is a unique feature of a design applied to a device. Fingerprinting improves on watermarking by enabling tracing of stolen property so that, after discovering a counterfeit device, punitive measures may be taken. Caldwell et al. [28] show how to generate versions of IP cores that synthesize with identical functionality but have some other measurably unique properties, a key to good fingerprinting.

Variations inherent in the IC manufacturing processes give rise to techniques for device fingerprinting. Physical device fingerprinting via manufacturing variability [29, 30] and physically unclonable functions (PUFs) [31, 32] can provide IC authentication. A PUF can act in a challenge-response protocol to provide IC authentication based on unclonable physical characteristics resulting from the variation inherent in manufacturing processes. Suh and Devadas [33] demonstrate that PUFs are also useful as a secure source of a random secret, which can be used to generate cryptographic keying material, in addition to IC authentication. Simpson and Schaumont [34] show how to use PUFs to authenticate FPGA-based systems without an online verifier.
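
As an illustration of the challenge-response use of a PUF, here is a minimal verifier-side sketch, assuming a database of challenge-response pairs (CRPs) enrolled in a trusted setting before deployment. The structure names, the single-use-CRP policy, and the Hamming-distance tolerance for measurement noise are illustrative, not taken from the cited papers.

    #include <stddef.h>
    #include <stdint.h>

    struct crp { uint64_t challenge; uint64_t response; };

    /* Number of bits differing between two 64-bit responses. */
    static int hamming(uint64_t a, uint64_t b) {
        int d = 0;
        for (uint64_t x = a ^ b; x; x &= x - 1)
            d++;
        return d;
    }

    /* Verifier side: take the next unused enrolled CRP, send its challenge
     * to the device (query_device models the round trip to the PUF), and
     * accept if the response is within a noise tolerance. Each CRP is used
     * once so an eavesdropper cannot replay a recorded response. */
    int authenticate(const struct crp *db, size_t n, size_t *next,
                     uint64_t (*query_device)(uint64_t challenge)) {
        if (*next >= n)
            return -1;                          /* enrollment exhausted */
        const struct crp *c = &db[(*next)++];
        uint64_t resp = query_device(c->challenge);
        return hamming(resp, c->response) <= 4; /* tolerance, illustrative */
    }

The tolerance reflects a physical reality: a PUF's response is slightly noisy across temperature and voltage, so exact equality would reject genuine chips.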

IC metering extends device fingerprinting to restrict production and distribution of ICs. The goal of metering is to prevent attackers from copying a design and creating cloned devices. Metering information is extracted from the IC via manufacturing variability or PUFs and registered with the design company. When an IC is recovered and suspected of being pirated, it can be checked against the registered ICs.

Metering can be either passive or active. Effective passive IC metering was proposed by Koushanfar et al. [35] and is sufficient to state whether or not a device is pirated. Active IC metering is proposed by Lee et al. [36], making use of wire delay characteristics to generate a secret key unique to the circuit.

Active IC metering techniques improve on passive metering by preventing pirated devices from functioning. Activation logic is embedded in the metered IC that requires keying material not included in the manufactured device. The intent of active IC metering is to prevent cloned devices from operating without receiving the keying material from the designer. For ASIC designs, the EPIC technique [37] is a comprehensive solution for active metering that combines chip locking and activation using public-key cryptography. For FPGAs, the work of Couture and Kent [38] makes a first step by supporting IP licensing with expirations. Baumgarten et al. [39] demonstrate how to prevent IC piracy by introducing barriers in the reconfigurable logic with small overhead. The main advantage of active metering combined with FPGA technology is that the IC design need not be shared with the foundry.

Verifying Design and Fabrication. Much of the research in security for the IC supply chain focuses on the foundry as untrusted and therefore the main source of IC supply chain attacks. Two directions push back on this assumption by supposing either (1) the design phase is also susceptible to attack or (2) the foundry is semi-trusted.

As an example of (1), Di and Smith [40, 41] consider attacks on the design side of the supply chain and introduce tools that work together to detect malicious circuitry that is added at design time. The goal of these tools is to protect from viruses in design software and from flawed third-party IP.

Bloom et al. [42] consider how the foundry can add forensic information to its processes, similar to IC metering techniques, but with additional information to assist in auditing occurrences of piracy. Forensic information is gathered during fabrication and embedded on the IC so that forensic recovery tools can extract information about counterfeit ICs that are recovered. This approach is not able to prevent piracy, much like watermarking and passive IC metering.

12.2.3. Architectural Techniques

In contrast to the design and verification techniques of the previous two sections, architectural support for silicon design authentication attempts to prevent or detect the malicious behavior of a Trojan circuit at run-time. One straightforward architectural solution is to measure the IC's physical characteristics, much like in side-channel analysis, and then use those measurements on-line by designing circuits that report the measurements [43]. Similarly, the well-studied fault-tolerant technique of replicating entire processing elements can help to detect some types of Trojan circuit attacks [44]. Researchers have also proposed some specific architectural techniques to detect Trojan circuits.

BlueChip [45] is a hybrid design-time and run-time system that introduces unused circuit identification to detect suspect circuitry and replaces suspect circuits with exception generation hardware. An exception handler emulates the elided circuit's functionality, thus avoiding the untested components of an IC in which Trojan circuits are likely to be hidden.

SHADE [46] is a hardware–software approach that uses multiple ICs as guards on a single board, with the assumption that at least two ICs come from different foundries, so malicious circuitry would not collude between two or more ICs. Hardware and software protocols use the two-guard architecture to protect applications with pseudorandom heartbeats and two layers of encryption. These techniques are meant to prevent data exfiltration and detect denial-of-service attacks.

A similar concern for sharing information is explored in the context of FPGA technology by Huffmire et al. [47]. They describe a simple isolation primitive (the moat) that can ensure separation of cores after the place-and-route stage of chip manufacturing. Moats can isolate functionally independent chip components and provide a framework for designing secure systems out of untrusted components. To enable communication between the disparate cores, the authors present a shared memory bus (the drawbridge).

Architectural techniques are complementary to design and verification techniques. Indeed, the use of multiple techniques may improve overall Trojan circuit detection. Side-channel analysis performs well against large modifications, but small Trojan circuits can hide in the noise of the side-channel. Thus, applying side-channel analysis can bound the size of undetected Trojan circuits and assist architectural techniques by imposing realistic assumptions on the possible functionality of the Trojan circuit.

12.2.4. Can Software Check Hardware?

Let us now consider how software might check the validity of the hardware on which it executes. A challenge in this direction is that, if the hardware is incorrect, software has difficulty verifying the results of such checks.

For detecting Trojan circuits, Bloom et al. [48] propose adding OS routines that cause off-chip accesses or checks. A secondary chip processes these checks to verify the primary CPU. Checks are an OS mechanism to challenge the CPU in collaboration with an on-board verifier.

Software can also check other aspects of the hardware. SoftWare-based attestation (SWATT) [49] is a technique for verifying the integrity of memory and establishing the absence of malicious changes to memory. SWATT relies on an external verifier in a challenge-response protocol. The verifier issues a challenge to an embedded device, which must compute and return the correct response. The correctness of the response depends on the device's clock speed, memory size and architecture, and the instruction set architecture (ISA). A limitation of SWATT's threat model is that physical attacks are not considered. In particular, an assumption is that an attacker is unable to modify the device's hardware. If the hardware itself is compromised or replaced wholesale, for example by a faster CPU or memory, then SWATT might verify the platform incorrectly. The problem is that SWATT relies on the timing of the hardware for its verification, but if the timing is different then the approach to verification may fail.
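
The following sketch captures the spirit of the device-side checksum loop in such a protocol, not SWATT's published algorithm: memory is traversed in a pseudorandom order derived from the verifier's nonce, so an attacker who redirects reads around modified memory must add instructions whose extra cycles show up in the verifier's timing measurement. The memory size, linear congruential generator constants, and mixing function are illustrative.

    #include <stdint.h>

    #define MEM_SIZE 4096u              /* device memory size, a power of two */
    static uint8_t mem[MEM_SIZE];       /* the memory image being attested */

    /* Device side: fold memory bytes into a checksum in an order the
     * attacker cannot predict before the nonce arrives. */
    uint32_t attest(uint32_t nonce, uint32_t iterations) {
        uint32_t state = nonce, sum = 0;
        for (uint32_t i = 0; i < iterations; i++) {
            state = state * 1664525u + 1013904223u;   /* LCG: next address seed */
            uint32_t addr = state & (MEM_SIZE - 1);   /* pseudorandom address */
            sum = ((sum << 1) | (sum >> 31)) ^ (mem[addr] + state);
        }
        return sum;  /* the verifier checks both the value and the response time */
    }

The verifier, which knows the expected memory image and the device's exact timing, recomputes the checksum and rejects a response that is wrong or late; this timing dependence is precisely the limitation noted above when the hardware itself is replaced.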

Deng et al. [50] propose a method for authenticating hardware by establishing tight performance bounds of real hardware and ensuring that the runtime characteristics of the hardware are at those bounds. Thus, a processor design is authenticated (and not emulated) if its performance matches the imposed constraints. This approach depends on the details of the hardware's microarchitecture and supposes that emulating the ISA with precise timing of the microarchitecture is difficult.

Software-based checks can measure hardware for correctness. A limiting assumption of all such approaches is that the correct measurements are known. Developing a predictive model for the measurement would be a positive step forward in this line of research. Integrating software-based measurements with trusted hardware designs can also improve platform security for high-assurance applications.

12.3. HARDWARE SUPPORT FOR SOFTWARE SECURITY

In Section 12.2, we saw that a determined and resourceful attacker can compromise hardware. Software is an easier target to attack. Compromised software has but one goal: to abuse computing resources. Such abuse can take multiple forms, including corruption of critical code or data, theft of sensitive data, and eavesdropping on or participating in system activities. Despite the presence of malicious attacks, cyber-physical systems should be resilient and continue to operate correctly. In the following sections, we investigate how computer hardware can help to prevent malicious attacks from compromising software. We start with software-based attacks and defenses, then we discuss how hardware improves the performance and security of cryptography, and we finish with countermeasures to physical attacks that rely on hardware access.

A simple attack is to access an application's memory from another application. Code and data vulnerabilities entice attackers to try gaining unfettered access to execute malicious software or to steal sensitive information. Traditional defenses against memory-based attacks isolate execution contexts – for example, processes and virtual memory address spaces – to prevent one context from accessing another's code and data without permission. In Section 12.3.1, we discuss established and new techniques for isolating memory with hardware support.

Even if we isolate every application in a system and follow good security practices (strong passwords, good cryptography, physically secured hardware), attacks might still compromise system security by exploiting vulnerabilities within the applications. An exploitable security vulnerability in a software application is program behavior that an attacker can trigger to circumvent the system's security policies. One such vulnerability, the buffer overflow, is widely regarded as the most commonly exploited vulnerability (on par with database injections and cross-site scripting). Good programming practices – bounds checking, input validation – can prevent exploitable vulnerabilities, but enforcing such practices is not always easy. Software solutions for buffer overflow detection – for example, StackGuard [51] – provide decent protection with little overhead. Buffer overflow prevention, however, incurs non-negligible overheads with software-only solutions – for example, Return Address Defender [52] and dynamic instruction stream editing [53] – because protection relies on page tables and virtual memory protection. Saving return addresses to read-only memory pages can prevent a buffer overflow from subverting the control flow, but the calls to change page permissions incur a high overhead. When protecting the return address, the permission changes twice for every function call: once to allow writing the return address, and then back to read-only for the duration of function execution. Hardware can improve the speed and reduce the overhead of buffer overflow prevention. In Section 12.3.2, we analyze hardware techniques to mitigate buffer overflow attacks; we also examine hardware-based information flow tracking, which prevents untrusted (low-integrity) inputs from propagating to trusted (high-integrity) outputs, thereby preventing unsanitized input from affecting program behavior.
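
As a concrete example of the detection approach, the following sketch makes a StackGuard-style canary check explicit in C. Compilers that implement this insert the canary and the check automatically and control the stack layout; this fragment only illustrates the check, with a fixed canary where a real implementation uses a per-process random value.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static const unsigned long CANARY = 0xdeadc0deUL; /* random per process in practice */

    void copy_input(const char *input) {
        unsigned long canary = CANARY; /* the compiler places this between
                                          the locals and the return address */
        char buf[16];
        strcpy(buf, input);            /* unsafe copy: a long input overflows buf */
        if (canary != CANARY) {        /* an overflow that reaches the return
                                          address must overwrite the canary first */
            fprintf(stderr, "stack smashing detected\n");
            abort();
        }
    }

Note that this detects the overflow only after it has happened, just before the function would return through a corrupted address; prevention schemes, as discussed above, must pay for page permission changes instead.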

After we secure our systems from software attacks, our next concern is the security of sensitive data in networked systems. Data communication networks have a wide attack surface that requires encryption for securing sensitive data. Transmission media do little to deter attacks: wireless communication is a broadcast medium, and cables can be tapped with little effort. With cryptography, we can create end-to-end secure channels across an unsecured physical environment. Unfortunately, software implementations of cryptographic primitives come at a high cost, especially in resource-constrained devices. Perhaps worse than the performance cost is the possibility of side-channel vulnerabilities – see Section 12.2.1 – that arise due to power and timing variations outside the control of software. A dedicated hardware accelerator for cryptographic primitives can outperform most software implementations and can reduce the likelihood of side-channel vulnerabilities. We describe the advantages and challenges – maximizing throughput and minimizing latency while keeping resource utilization (size, weight, and power) small – of cryptographic accelerators further in Section 12.3.3.

Although cryptographic accelerators improve the performance of cryptography and help to prevent side-channel attacks, the danger remains that the device might fall into the attacker's possession. Even with moderately sophisticated equipment, an attacker with physical access to hardware can steal data or subvert security measures. One solution for defending against physical attack is secure coprocessing. General requirements for a secure coprocessor are (1) to protect secret information such as cryptographic keys from physical attack, (2) to provide a host system with cryptographic operations that resist side-channel analysis, and (3) to provide a safe execution environment for applications. The first requirement is typically achieved with physical protection techniques: fuses, temperature sensors, and other tamper detection techniques that can trigger destruction of all secrets. Built-in cryptographic primitives, the second requirement, are needed so that secret keys need not leave the coprocessor, since it implements the cryptography using those keys. The third requirement, a secure and authenticated execution environment, refines a secure coprocessor's architecture as a computing device with private memory, processor, and software that can execute a wide range of applications. Section 12.3.4 further explores the algorithms and example implementations of secure coprocessors and briefly discusses less complex alternatives.

Related to secure coprocessing are encrypted execution and data (EED) platforms, which encrypt all storage (including memory) and only decrypt within the CPU boundary. EED platforms counter attacks that probe or control system buses, control memory content [54], or attempt to reconstruct an application's control flow. Section 12.3.5 describes attacks on EED platforms and their countermeasures.

12.3.1. Memory Protection

Security policies define how users of a computing system are allowed to use, modify, or share its resources. Traditionally, an operating system (OS) defines the security policies and relies on hardware to assist in enforcing them. This section reviews the role of hardware in supporting system security through isolation and control of resources.

Memory Protection in Commodity Systems. Consider an abstract computing device comprising two major components – the processor and memory. The processor fetches code and data from memory to execute programs. If both components are shared by all programs, then a program might corrupt or steal another's data in memory, accidentally or maliciously, or prevent any other program from accessing the processor (denial-of-service). To avoid such scenarios, a privileged entity must control the allocation (time and space) of compute resources (processor and memory). Traditionally, the OS kernel is responsible for deciding how to allocate resources, and hardware enforces the kernel's decisions by supporting different privilege rings or levels (Figure 12-2).

FIGURE 12-2 Privilege rings (levels), from Ring 0 innermost to Ring n outermost. The innermost ring is the highest privilege at which software can execute, normally used by the OS or hypervisor. The outermost ring is the lowest privilege, normally used by application software. The middle rings (if they exist) are architecture-specific and are often unused in practice.

By occupying the highest privilege ring, the kernel controls all compute resources. It can read and write from any memory location, execute any instruction supported by the processor, receive all hardware events, and operate all peripherals. User applications – residing in the lowest privilege level – have limited access to memory (nowadays enforced through virtual memory), cannot execute privileged instructions, and can only access peripherals by invoking OS services. A processor differentiates between kernel and user with bits in a special register, generically called the program status word and specifically called the Program Status Register in ARM architectures, the Hypervisor State Bit in PowerPC architectures, the Privileged Mode bit in the Processor State Register in SPARC architectures, or the Current Privilege Level in the Code Segment Register in Intel architectures.

A hierarchy of control from the most to least privilege, combined with memory access controls, prevents user programs from performing any action outside a carefully sandboxed environment without invoking services (code) in a more privileged level. Control transfer between different privilege rings usually is done with interrupts or specialized control instructions; on more recent Intel architectures, the sysenter and sysexit instructions allow fast switching during system calls. The OS controls the data structures and policies that govern high-level security concepts like users and files, and code running outside the highest privileged ring cannot manipulate them directly. A critical aspect of securing high-level code and data is memory protection.

In simple architectures, the privileged state also defines memory separation. An example of a simple policy could be that user code can only access memory in a specified range such as 0xF000 to 0xFFFF, whereas privileged code can access the full memory range. With one or more fixed memory partitions, privileged code can manage both the allocation and separation of memory among application tasks. Except for embedded and highly customized applications, static memory partitioning is impractical.
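
The fixed-range policy above reduces to a comparison that is cheap to implement in hardware. A minimal sketch of the check follows, using the illustrative user window from the example:

    #include <stdint.h>

    #define USER_BASE  0xF000u   /* example user partition from the text */
    #define USER_LIMIT 0xFFFFu

    /* Access check for the simple fixed-partition policy: privileged code
     * may touch any address, user code only the addresses in its window. */
    int access_allowed(uint32_t addr, int privileged) {
        if (privileged)
            return 1;
        return addr >= USER_BASE && addr <= USER_LIMIT;
    }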

A multiprocessing system comprising dynamic tasks, each with distinct (often statically unknown) memory and execution demands, requires dynamic memory management to limit memory fragmentation and balance resource utilization. Two practical solutions for dynamic memory management are to use fixed-size blocks (pages) or variable-length segments. With either solution, memory accesses must be translated to a physical address and verified for correct access rights. Modern architectures usually provide some support for translation and verification of memory accesses, whether for pages or segments.

Virtual memory with paging is the norm in dynamic memory management. Paging provides each application (process) with a linear, contiguous virtual address space. The physical location of data referenced by such an address is computed based on a translation table that the OS maintains. The memory management unit (MMU) is a special hardware unit that helps to translate from virtual to physical addresses – with acceleration in the form of a hardware lookup table called the translation lookaside buffer (TLB) – and assists in checking access permissions.

The OS maintains a page table for each virtual address space that contains the virtual-to-physical page mappings. Each page table entry contains a few protection bits, which are architecture-dependent: either one bit to distinguish between read and write permission, or two encoded bits (or three independent bits) for read/write/execute permissions. A process gains access to individual pages based on the permission bits. Because each process has a different page table, the OS can control how processes access memory by setting (clearing) permission bits or by not loading a mapping into the MMU. During a process context switch, however, the OS must flush (invalidate) the TLB. So, hardware supports paging with protection bits, which generate an exception on invalid accesses, and with the TLB for accelerating translations.
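
A sketch of the permission check the MMU applies on every access follows, using a hypothetical page table entry layout with a valid bit and three independent read/write/execute bits; real PTE formats are architecture-specific.

    #include <stdint.h>

    /* Hypothetical page table entry layout: permission bits in the low bits,
     * frame number above them (not used in this check). */
    #define PTE_VALID 0x1u
    #define PTE_READ  0x2u
    #define PTE_WRITE 0x4u
    #define PTE_EXEC  0x8u

    enum access { ACC_READ, ACC_WRITE, ACC_EXEC };

    /* MMU-style check: an access faults (returns 0) if the mapping is absent
     * or the page's permission bits do not cover the requested access. */
    int pte_permits(uint32_t pte, enum access a) {
        if (!(pte & PTE_VALID))
            return 0;                         /* page fault: no mapping loaded */
        switch (a) {
        case ACC_READ:  return (pte & PTE_READ)  != 0;
        case ACC_WRITE: return (pte & PTE_WRITE) != 0;
        case ACC_EXEC:  return (pte & PTE_EXEC)  != 0;
        }
        return 0;
    }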

Although page-based virtual memory systems allow for process isolation and controlled sharing, the granularity of permission is coarse – permissions can only be assigned to full pages, typically 4 KB. An application that needs to isolate small chunks of memory must place each chunk in its own page, leading to excessive internal fragmentation.

With segmentation, instead of one contiguous virtual address space per process, we can have multiple variable-sized virtual spaces, each mapped, managed, and shared independently. Segmentation-based addressing is a two-step process involving application code and processor hardware: code loads a segment selector into a register and issues memory accesses; then, for each memory access, the processor uses the selector as an index into a segment table, obtains a segment descriptor, and computes the access relative to the base address present in the segment descriptor. Access to the segment descriptor is restricted to high-privilege code.

Consider the Intel architecture as an example of segmentation. It uses the code segment (CS) register as a segment selector and stores the current privilege level (CPL) in its two lower bits. When the executing code tries to access a data segment, the descriptor privilege level (DPL) is checked. Depending on whether the loaded segment is data, code, or a system call, the check ensures the CPL allows loading the segment descriptor based on the DPL. For example, a data segment's DPL specifies the least privileged level (numerically highest CPL) that a task can have in order to be allowed access, so if the DPL is 2 then access is granted only to tasks with a CPL of 0, 1, or 2. A third privilege level, the requested privilege level (RPL), is used when invoking OS services through call gates and prevents less privileged applications from elevating privileges and gaining access to restricted system segments.
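
The data-segment rule reduces to a numeric comparison. The sketch below also folds in the RPL, as the Intel architecture does for data-segment loads; privilege levels are numeric, with 0 the most privileged.

    /* Data-segment privilege check: the DPL names the least privileged level
     * (numerically highest) allowed to load the segment. The effective
     * privilege is the weaker of CPL and RPL. */
    int data_segment_ok(int cpl, int rpl, int dpl) {
        int effective = cpl > rpl ? cpl : rpl; /* less privileged of the two */
        return effective <= dpl;               /* e.g., DPL 2 admits 0, 1, and 2 */
    }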

Most open-source and commercial OSs ignore or make limited use of segmentation. OS/2 [55] is a commercial OS that uses the full features of segmentation. Some virtualization techniques (such as the VMware ESX hypervisor) do reserve a segment for a resident memory area and rely on segment limit checks to catch illegal accesses.

Research Directions in Memory Protection. State-of-the-art research in memory protection generally takes one of two directions. The first builds security applications on top of the popular paging mechanisms. One way to build secure high-level applications with commodity hardware is to use inter-process communication (IPC) mechanisms that rely on process isolation primitives. The Chromium project [56] splits a monolithic user application – a web browser – into isolated processes that can only communicate using OS-moderated IPC. The main browser process handles most system interactions, such as drawing the GUI, storing cookies and history, and accessing the network. Web page rendering executes in a separate process so that any vulnerability or injection attack will not have direct access to the main browser memory space.

The second general research direction aims to address limitations of current architectural support, such as the coarse granularity of protection in paging or the poor performance of the two-step addressing in segmentation. Most work in this direction considers how to support efficient, fine-grained data and task security controls.

InfoShield [57] makes a significant step toward ensuring that only legitimate code can gain access to protected memory regions at a word-level granularity. A hardware reference monitor holds access permissions for a limited number of memory locations. New instructions manage the permissions and allow for creating, verifying, clearing, and securely moving them. InfoShield assumes that only a few security-critical memory locations exist during a process's execution. In typical use cases, InfoShield protects secret keys, passwords, and application-specified secret data from improper access.

A hardware-based table of permissions motivates work to improve how the table is created, stored, and manipulated in memory without degrading the performance of permission verification. A data-centric approach designates permissions for every address range representing each memory object. On a memory reference, the corresponding permission is fetched along with the content of the referenced memory, and hardware checks the access permissions. Permissions can be cached by using the existing hardware [58] or by using a separate caching structure [59, 60].

Mondrian Memory Protection [59] uses a hierarchical trie structure that provides word-level permission granularity. Hardware traverses the trie to check permissions. A TLB-like caching structure accelerates permission checks based on locality. Permissions are encoded compactly, thus minimizing the amount of memory reserved for permission storage. Unfortunately, compaction complicates permission changes, and frequent changes to permissions result in frequent trie node insertions and deletions that are hard to support in hardware and slow to run in software. Mondrian aims to protect large code modules whose interaction is mediated by an OS. As such, the OS is invoked every time a security context switch occurs.

Hardware containers [61, 62] are a memory protection mechanism for fine-grained separation of code that maintains word-level data permissions. Containers isolate code regions at the granularity of functions and offer a strict sandbox for faulty code that aids in a controlled recovery from failure. With the aid of compiler tools and runtime instrumentation, a security manifest that dictates permissions for each code region is enforced by a hardware reference monitor. The hardware detects improper use of system resources (unauthorized memory accesses or denial-of-service) and enables applications to undertake a hardware-supervised recovery procedure. Caching and architectural optimizations help to reduce the performance overhead of security enforcement.

Arora et al. [58] show a solution for enforcing application-specified data properties at runtime by tagging data addresses with an additional security tag. For example, a single-bit SECTAG can mark fine-grained read-only memory regions. The checker (reference monitor) can be set to interpret and enforce the value of the security tag as specified by a program-wide security policy. One example policy labels code zones with specific security attributes: all code zones with a given label share access to all data regions whose SECTAG value is equal or lower. Caching the security tags changes the cache hierarchy in two ways. First, the L2 cache automatically loads security tags when a cache line is stored. Second, the L1 cache controller expands the cache line with the security attributes for faster checking. The checker directly uses the L1 security tag information for enforcing the policy. Even though the SECTAG length can vary based on the security policy, the length is fixed by the processor architecture: any change in SECTAG length means changing the internal cache layout and size.

Reconfigurable hardware, like the FPGA, poses a similar challenge for protecting the memory space. While processor-based systems often have some separation mechanisms for protecting processes that share the memory space, current systems built on reconfigurable hardware have a flat memory addressing scheme. Since soft cores are synthesized from a wide range of IP providers, vulnerable or malicious cores can issue unauthorized memory requests, corrupting or revealing sensitive data. One solution is to compile an access policy directly to a synthesized circuit that can act as an execution monitor and enforce a memory access policy [63]. The specialized compiler transforms the access policy from a formal specification to a hardware description language representation of the hardware reference monitor.

An area of work related to memory protection is capability-based systems, out of which many modern notions of protection arise. Capabilities in their full form have had limited commercial success, however, so they are not a very active research area. We refer interested readers to Levy's book on the subject [64] for a thorough introduction and history of the topic.

12.3.2. Architectural Support for Control Flow Security

Vulnerabilities in software applications continue to be targets for memory-based attacks – for example, buffer overflows – that attempt to gain illegitimate access to code and data. In the following, we examine how hardware helps to prevent buffer overflows. Then, we explain information flow tracking, a general technique for detecting and preventing memory-based attacks.

Architectural Support for Buffer Overflow Defense. Hardware-assisted approaches to buffer overflow protection improve upon the accuracy and performance of software-only schemes for dynamic attack detection by using a variety of techniques. One common solution is to maintain a shadow of the return address in hardware (Figure 12-3) by creating a return address stack or monitoring the location of the return address for any unauthorized modifications [65–70]. Other hardware-supported solutions protect all control flow in general, including branches and jumps [71–75].

FIGURE 12-3 Stack memory protection. A guard moves (stores) the return address from the program stack – which holds buffers such as buf[0] through buf[99], the caller's frame pointer, and the caller and main return addresses between the stack pointer (SP) and frame pointer (FP) – to a protected return address (RA) stack when a function starts, and then restores (checks) the return address before the function returns.

Secure Return Address Stack (SRAS) [65] implements a shadow stack in hardware with processor modifications including ISA (instruction set architecture) changes, additional logic, and protected storage. Unlike the usual call stack, the shadow stack holds only return addresses. On a function call (CALL), the return address is pushed to the regular stack and the shadow stack. On a return (RET), SRAS pops and compares the return address from both stacks. To handle deep function call nesting, the OS is modified to handle secure overflow storage. The spill-over of the secure stack is stored in a special part of memory that is accessible only to the kernel. The kernel is responsible for managing a secure spill-over stack for each process.
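
A software model of the CALL/RET semantics shared by SRAS-like shadow stacks follows. In SRAS itself this logic is implemented in hardware, the shadow storage is protected from software, and overflow spills to kernel-managed memory rather than aborting; the depth and names here are illustrative.

    #include <stdint.h>
    #include <stdlib.h>

    #define SHADOW_DEPTH 64   /* on-chip entries before spill-over (illustrative) */

    static uintptr_t shadow[SHADOW_DEPTH];
    static int top = 0;

    /* CALL semantics: push the return address to the shadow stack too. */
    void on_call(uintptr_t ret_addr) {
        if (top == SHADOW_DEPTH)
            abort();  /* hardware would trap to the OS to spill entries
                         into the kernel-managed overflow area instead */
        shadow[top++] = ret_addr;
    }

    /* RET semantics: compare the program-stack copy against the shadow copy;
     * a mismatch means the return address was overwritten. */
    void on_ret(uintptr_t ret_addr_from_stack) {
        if (top == 0 || shadow[--top] != ret_addr_from_stack)
            abort();  /* corrupted control flow detected */
    }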

SmashGuard [66], like SRAS, modifies the processor and OS. The new semantics of the CALL and RET instructions store a return address inside a memory-mapped hardware stack that is checked on the function's return. The OS includes security functions for context switching and for handling spill-over stack growth.

Xu et al. [67] present a scheme that splits the stack into two pieces: the control stack to store the return addresses, and the data stack to store the rest. They propose a software solution involving compiler modification, and a hardware solution that modifies the processor and the semantics of CALL and RET. In the software-only solution, the compiler allocates and manages the additional control stack. During each function prologue, the compiler saves the return address to the control stack, which resides in memory that the OS securely manages. The compiler restores the return address from the control stack onto the system stack in the function epilogue. Simulation yielded significant performance overheads, which led the researchers to a hardware solution.

Secure Cache (SCache) [68] uses cache memory to protect the return address by providing replica cache lines that shadow the return address. A corrupted return address will have a different value in cache than its replica. A drawback to using cache space for storing the return address is that performance is sensitive to cache parameters and behavior.

Kao and Wu [69] present a scheme to detect return address modifications without explicitly storing the good return address for verification. Two registers are added to store the current return address and the previous frame pointer. A valid bit that acts as an exception flag is also added. If a memory store is issued to either a return address on the stack or to the frame pointer, then the flag is raised to signal a violation. When the function returns, if the return address points to the local stack or to the current return address, then execution is halted. The authors claim that only two levels of function calls need monitoring.

Using reconfigurable logic such as an FPGA to implement a hardware guard, Leontie et al. [76] provide a secure stack with little change to the processor microarchitecture. The return address is copied to a memory region in a hardware guard that is inaccessible through any direct memory calls. This memory region is called the return address (RA) stack, and it is automatically managed by a modified compiler. The compiler-inserted instructions that manage the RA stack synchronize the state of the RA stack with the program stack independent of code locality and caching policies. On each CALL, the return address is stored in the RA stack, and at each RET, the guard checks whether an overflow has occurred and ensures that the correct address is returned to the processor. Thus, the guard maintains a copy of the valid return address and ensures that the correct address is always returned regardless of any buffer overflows.

Heap overflows are as dangerous as – if less common than – stack buffer overflows. Dynamically allocated data is stored in memory interleaved with management control data structures and metadata. Heap allocation functions (malloc, free, and new) maintain these structures to track available and occupied memory regions. A heap overflow might just destroy these structures and crash the program, but some allocators are vulnerable to more powerful attacks [77]. When the allocator primitive tries to consolidate freed memory using corrupted heap metadata, an adversary might trick the allocator into writing to arbitrary locations in memory. Then, the attacker can override pointers used in indirect jumps to redirect a program's control flow. Solutions to the problems involving heap overflows include fine-grained memory protection, discussed in Section 12.3.1, software mechanisms discussed in the next chapter, hardware-assisted bounds checking [78], and hardware-supported programmatic debugging [79].

Information Flow Tracking. Vulnerabilities in software can be hard to find. Since attacks usually enter a system through input channels, researchers propose that instead of verifying code, systems should verify data. Information flow tracking techniques monitor data as it enters a system and prevent any potentially damaging activity that uses untrusted input. Based on the source of data, trust levels are associated with the data. Simply put, data comes from either "trusted" or "untrusted" sources. Depending on program semantics, the information channels are tracked in order to determine if untrusted data affects (taints) the trusted information, or the reverse: if trusted – sensitive or secure – data leaks into untrusted information that can divulge secrets to insecure channels. Preventing sensitive data from leaking means identifying the communication channels, whether explicit (direct variable assignments) or implicit (control flow, timing, cache content, power, etc.), which the data can influence.

Traditional approaches that track information flow through static and dynamic tracking [80–82] have the disadvantage that just a few untrusted sources can taint trusted data and cause almost all computations to become untrusted. Maintaining trusted execution in real systems that track taint with many input channels becomes extremely difficult.

Gate-level information flow tracking (GLIFT) [83] achieves full control over information leakage, although with limitations on the computation model and increased resource utilization. GLIFT shadows traditional gates with logic to track trust. AND, OR, and MUX gates are augmented to compute the trust level of their output. Higher-level logic is built using these basic gates. Much of the architectural changes and limitations of GLIFT deal with program counter (PC) manipulations. The resulting processor is a slower, larger, and limited core, but one that can guarantee complete information flow tracking.

Flexible (limited programmability) policies for the propagation of taint are possible through architectural modifications [84]. Taint information is associated with each memory word, not by widening the memory cells but by keeping a compact, flat data structure (an array of bits) in a separate memory segment; a sketch of such a bitmap appears below. Hardware support includes new pipeline stages and modifications to handle taint. Because the taint algorithm can degrade performance, common operations are cached using a custom unit.
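A minimal sketch of the flat taint map idea, with an illustrative one-taint-bit-per-32-bit-word granularity of our choosing; the hardware in [84] maintains an equivalent structure without widening memory cells.

```c
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

/* One taint bit per 32-bit word, kept in a flat bitmap in a separate
 * memory segment. Names and granularity are illustrative. */
static uint8_t *taint_bitmap;    /* covers the tracked address range */
static uintptr_t region_base;

int taint_of_word(uintptr_t addr) {
    uintptr_t word = (addr - region_base) / 4;        /* word index */
    return (taint_bitmap[word / 8] >> (word % 8)) & 1;
}

void set_taint_of_word(uintptr_t addr, int t) {
    uintptr_t word = (addr - region_base) / 4;
    if (t) taint_bitmap[word / 8] |=  (uint8_t)(1u << (word % 8));
    else   taint_bitmap[word / 8] &= (uint8_t)~(1u << (word % 8));
}

int main(void) {
    enum { WORDS = 1 << 20 };           /* track 4 MiB of memory      */
    region_base = 0x10000000;
    taint_bitmap = calloc(WORDS / 8, 1);
    set_taint_of_word(0x10000040, 1);   /* word written from input    */
    printf("taint: %d\n", taint_of_word(0x10000040));
    return 0;
}
```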

12.3.3. Cryptographic Accelerators

Cryptographic primitives are demanding interms of computation resources: public keycryptography requires expensive exponentia-tions; symmetric ciphers use multiple itera-tions of dictionary lookups and permutationsthat are sequentially ordered; secure hashesrepeat iterative rounds of shifts and permuta-tions. With more consumer applications requir-ing cryptographic operations in their algorithmsfor security and privacy, hardware manufacturerspropose hardware implementations of these pop-ular primitives. The advantages of using hard-ware are lower latency for operations, higherthroughput for high volume transactions, andlower overall power consumption. As with mosthardware implementations, the cost is highercomplexity and cost of the hardware, andless flexibility, as silicon space is reserved forthe fixed operations. Because hardware designissues are considered when cryptographic stan-dards are chosen, academic and commercial

CHAPTER 12 Hardware and Security: Vulnerabilities and Solutions 319

Crypto

units

(a)

(c)

(b)

(d)

Crypto

unit

Crypto

unitExternal bus

Crypto

units

CPU

core

Inst.

cache

FPGA

CPU

core

Data

cache

CPU

core

01100

10011

111010011

1110

0110001100

10011

1110

01100

10011

1110

01100

10011

111010011

1110

0110001100

10011

1110

01100

10011

1110

Crypto

unitsCPU

core

CPU

core

CPU

core

FIGURE 12-4 Configurations for connecting a cryptographic accelerator to a CPU: (a) tightly coupled to the pipeline,(b) attached over internal interconnect, (c) synthesized in specialized reconfigurable logic, and (d) attached as a coprocessor.

implementations of cryptographic protocols arecommon [85–87].

The challenge – and the reason for the multitude of approaches and solutions – lies in implementing units that offer maximum throughput with minimum size, latency, and power. Figure 12-4 shows several integration options for connecting a cryptographic accelerator to a CPU, from execution units tightly coupled to the processor pipeline to bus-connected coprocessors.

Some processors are optimized for cryptographic operations [88]. Their execution units have dedicated resources to handle key storage and specific arithmetic operations such as exponentiation or modular arithmetic. The instruction set supports the cryptographic operations directly. As these are not standalone implementations of cryptographic algorithms, they require compiler support to generate optimized code [89]. The main advantage of such an approach is that it is generic to a large class of algorithms, not only cryptography. The disadvantage is that cryptographic operations still exert pressure on the processor pipeline.
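As one concrete example of direct ISA support (x86 AES-NI, which is our choice of illustration, not necessarily one of the processors surveyed in [88]), a full AES-128 block encryption reduces to a short sequence of dedicated round instructions. The sketch assumes the eleven round keys have already been expanded by a key schedule, which a real program must also perform.

```c
#include <stdio.h>
#include <string.h>
#include <wmmintrin.h>   /* AES-NI intrinsics; compile with -maes */

/* One AES-128 block encryption using the dedicated AESENC and
 * AESENCLAST instructions. rk[] holds the 11 expanded round keys. */
static __m128i aes128_encrypt_block(__m128i block, const __m128i rk[11]) {
    block = _mm_xor_si128(block, rk[0]);          /* initial whitening */
    for (int i = 1; i < 10; i++)
        block = _mm_aesenc_si128(block, rk[i]);   /* one full round    */
    return _mm_aesenclast_si128(block, rk[10]);   /* final round       */
}

int main(void) {
    __m128i rk[11];
    memset(rk, 0, sizeof rk);                     /* dummy round keys  */
    __m128i ct = aes128_encrypt_block(_mm_set1_epi8(0x41), rk);
    unsigned char out[16];
    _mm_storeu_si128((__m128i *)out, ct);
    for (int i = 0; i < 16; i++) printf("%02x", out[i]);
    printf("\n");
    return 0;
}
```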

Other implementation options for cryptographic accelerators include small cores that implement a dedicated function, a generic cryptographic coprocessor that can handle different types of operations, or a general purpose core that is reserved for certain cryptographic algorithms.

One popular integration option is for the cryptographic engine to work as a coprocessor. The coprocessor typically connects to the rest of the system over standard interfaces, and the processor off-loads specific operations to it. Direct memory access (DMA) enables the coprocessor to fetch data from memory on its own. In contrast to the secure coprocessors described in Section 12.3.4, cryptographic accelerators do not offer guarantees on the physical security of the keying material used and may not even have manufacturer-installed keys.

As with most hardware-implemented algorithms, cryptographic accelerators are exposed to the following problems: bugs in the implementation can cause expensive recalls, changes in standards render old hardware obsolete, and the development cost and time to market are significant. FPGA technology is appealing for implementing cryptographic primitives for several reasons [90–94]. First, although generally slower than application-specific integrated circuit (ASIC) implementations, FPGAs outperform software implementations. Second, the design and deployment cycle is short and cost-effective at small scale. Third, the reconfigurability of FPGAs allows post-deployment patches for protocol modifications, optimizations, and bug fixes.

12.3.4. Secure Coprocessing

How can sensitive computations be executed on remote devices such that the results obtained are trustworthy? Programs can be altered, data can be modified, records and logs deleted, or secrets revealed. Simulators and debuggers offer mechanisms for observing memory content and execution patterns; memory can be frozen and read; operating systems and compilers may be altered; and even the hardware might harbor a Trojan circuit.

Designers of secure coprocessors argue that a system's root of trust has to be a computational device that can execute its tasks correctly, without disclosing secrets, despite any attack, physical or programmatic [95]. A fast, general purpose computer in a tamper-proof physical enclosure would suffice, but that is hard to achieve: high-end processors consume a lot of power and generate heat that is difficult to dissipate inside a tamper-proof enclosure. Thus, secure coprocessors are limited to low-power, limited-bandwidth devices incapable of processing at high throughput. Figure 12-5 shows a secure coprocessor as a peripheral to a host computer. The services that the coprocessor offers include safekeeping of cryptographic keys, cryptographic operations, a secure execution environment, and attestation of the host's execution state. A secure coprocessor thus ensures that a remote device produces trustworthy results.

The enclosure of a secure coprocessor must deter a sophisticated attacker from obtaining any of the internal memory content. Fuses and sensors help to determine whether the physical enclosure has been breached [96]. The usual response to an attack is to erase all memory content. Since extreme temperatures could potentially disable the erasure mechanisms and freeze memory content, designers add thermal sensors to detect such conditions. Other side channels may also reveal the secure coprocessor's internal state. Electromagnetic radiation is detected or blocked by the enclosure, but power and heat dissipation remain a concern. Although a limited internal power source exists to drive the erasure procedures, operational power is drawn from the host. To minimize the opportunity for power analysis attacks [97], the coprocessor is equipped with filters on its power supply. Well-designed shielding on the enclosure obfuscates the heat signature.

Having a secure physical enclosure is insufficient to guarantee a unique, unclonable device. At the least, the manufacturer initializes the device with a uniquely certified root key-pair. All other keying material can be derived from this root key and stored in the device memory for application use. Other cryptographic primitives in a secure coprocessor may include a secure random number generator [98], one-way counters, software or hardware implementations of hash functions, or symmetric and asymmetric ciphers [99, 100].
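A sketch of that derivation pattern follows. The labels are illustrative, and fnv1a is only a non-cryptographic stand-in so the example runs; a real device would use a proper key derivation function such as HKDF. The point is that application keys are computed from the root key inside the device, so the root key itself never leaves the coprocessor.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Stand-in hash (FNV-1a); a real device would use a proper KDF. */
static uint64_t fnv1a(const uint8_t *p, size_t n, uint64_t h) {
    while (n--) { h ^= *p++; h *= 0x100000001b3ULL; }
    return h;
}

/* Derive a per-purpose key from the root key; distinct labels keep
 * the derived keys independent of one another. */
static uint64_t derive_key(uint64_t root, const char *label) {
    uint64_t k = fnv1a((const uint8_t *)&root, sizeof root,
                       0xcbf29ce484222325ULL);
    return fnv1a((const uint8_t *)label, strlen(label), k);
}

int main(void) {
    uint64_t root = 0xfeedfacecafebeefULL;   /* installed at manufacture */
    printf("log key:  %016llx\n",
           (unsigned long long)derive_key(root, "secure-log"));
    printf("comm key: %016llx\n",
           (unsigned long long)derive_key(root, "host-channel"));
    return 0;
}
```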

The operational requirements of secure coprocessors are key management, communication, encryption services, internal resource management, and programmability. These requirements are difficult to implement with dedicated hardware alone, so most solutions come with a complex software stack composed of a secure kernel, hardware-assisted protection, resource partitioning, layers of operational services, and user applications. The IBM 4758 [101] is built using a general purpose (x86-compatible) processor, a secure boot mechanism, and a high-level OS. Because a complex software stack is hard to validate for correctness, the designers created an architecture intended to resist attacks from malicious software executed on the device: hardware locks create partitions of memory and

resources that untrusted software – including privileged OS services in the highest-privilege ring 0 – cannot break [102].

FIGURE 12-5 Secure coprocessing relies on a tamper-proof coprocessor to provide security services to a host system. An attacker is unable to examine the internals of the coprocessor, so secrets stored within are safe.

The power of secure coprocessors shows in their application use cases: host integrity validation, secure logs, audit trails, digital rights management (DRM), electronic currency, sandboxing, and attestation for remote agents. We briefly review some of these examples in the following.

Host integrity. Trust in a computing system requires both tamper-proof hardware and trustworthy privileged software. With viruses, trojans, and malware masquerading in a system, software cannot easily prove itself authentic. If trust in the secure coprocessor is established, then that trust enables authenticating the host. The secure coprocessor must control or verify the initial boot process of the host and take snapshots of the memory state. With this ability, the coprocessor can compute secure hashes of the boot and kernel images, as well as of memory states at different milestones in the host's execution. By comparing these secure hashes with known-good values, the secure coprocessor can detect memory corruption.
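A runnable sketch of such a measurement chain appears below. Each boot stage extends a running digest, so the final value commits to every image loaded along the way; the non-cryptographic FNV-1a hash here is only a stand-in for a real hash such as SHA-256.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* FNV-1a is only a stand-in so the sketch runs; a real monitor must
 * use a collision-resistant hash such as SHA-256. */
static uint64_t fnv1a(const uint8_t *p, size_t n, uint64_t h) {
    while (n--) { h ^= *p++; h *= 0x100000001b3ULL; }
    return h;
}

/* chain' = H(chain || H(image)): the final value commits to every
 * image measured along the boot path, in order. */
static uint64_t extend(uint64_t chain, const uint8_t *image, size_t n) {
    uint64_t d = fnv1a(image, n, 0xcbf29ce484222325ULL);
    uint64_t buf[2] = { chain, d };
    return fnv1a((const uint8_t *)buf, sizeof buf, 0xcbf29ce484222325ULL);
}

int main(void) {
    const uint8_t boot[] = "boot image bytes", kernel[] = "kernel image bytes";
    uint64_t chain = 0;
    chain = extend(chain, boot, sizeof boot);
    chain = extend(chain, kernel, sizeof kernel);
    /* The coprocessor compares this against a known-good value; any
     * corrupted boot or kernel image changes the digest. */
    printf("measurement: %016llx\n", (unsigned long long)chain);
    return 0;
}
```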

Secure logs. Logs are an important target for attackers: forging entries to eliminate intrusion traces is essential for a stealthy attack. With the aid of a trusted coprocessor, logs can be made tamper-evident. Cryptographic checksums are the main mechanism, but for sensitive information, such as financial logs, encryption primitives can be used for secrecy.
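One common construction for tamper-evident logs, sketched below under the assumption of an initial key shared with the verifier (e.g., the coprocessor), evolves the tagging key one way after every entry. An intruder who later captures the host holds only the current key and can neither forge nor re-tag earlier entries. The hash is again a non-cryptographic stand-in for a real MAC such as HMAC-SHA-256.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Stand-in keyed checksum; use a real MAC in practice. */
static uint64_t fnv1a(const uint8_t *p, size_t n, uint64_t h) {
    while (n--) { h ^= *p++; h *= 0x100000001b3ULL; }
    return h;
}

int main(void) {
    uint64_t key = 0x1122334455667788ULL;   /* K_0, shared with verifier */
    const char *entries[] = { "login root", "read /etc/shadow" };
    for (int i = 0; i < 2; i++) {
        /* Tag entry i with the current key K_i. */
        uint64_t tag = fnv1a((const uint8_t *)entries[i],
                             strlen(entries[i]), key);
        printf("entry %d tag %016llx\n", i, (unsigned long long)tag);
        /* Evolve the key one way: K_{i+1} = H(K_i). Old keys are gone. */
        key = fnv1a((const uint8_t *)&key, sizeof key, 0xcbf29ce484222325ULL);
    }
    return 0;
}
```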

DRM. Two key properties make secure coprocessors appealing for copyright protection applications. The first is the ability to attest that a host software system is tamper-free, including the operating system and licensed applications. If an attacker (the user) cannot circumvent the license validations, then commercial software can reliably check for correct registration. The second is the set of trustworthy cryptographic primitives, including unique, unforgeable identifiers. With these primitives, content providers can deliver encrypted content to the host, limiting the ability of an attacker (including the end user) to obtain access to the original digital content.

Despite the usefulness of secure coprocessors, their cost and limitations prevent widespread deployment and motivate lighter mechanisms, such as the smartcard and the trusted platform module (TPM). The main role of a smartcard is the safekeeping of a private key. With only a small amount of memory and limited computational power, smartcards store one or more manufacturer-provided cryptographic key pairs and perform basic cryptographic operations. In this regard, they are similar to TPMs. Where they differ is the interface: smartcards connect to the host via standard plug-and-play protocols (USB, infrared, and short-distance radio) and offer their services on demand for authentication, signatures, and other protocols. A typical TPM, in contrast, is closely integrated with its host, plugged into the system buses, and may be able to pause the host's execution and take memory snapshots. Smartcards are intended to travel with a person and help authenticate the owner, whereas the TPM remains with the host device and helps to verify its authenticity. Both smartcards and TPMs are cheap because they lack tamper-proof mechanisms, but by the same token they are not secure against an adversary that can mount a physical attack.

12.3.5. Encrypted Execution and Data (EED) Platforms

A physical attack assumes that the attacker has direct physical access to the hardware and is sophisticated and resourceful enough to examine and manipulate instruction and data buses. A simple countermeasure is to encrypt data while it is in memory or storage and decrypt it only as the processor fetches it. A platform that encrypts all data entering and exiting the CPU boundary is called an encrypted execution and data (EED) platform.

Figure 12-6 shows a typical layout for an EED platform, with an encryption/decryption unit interposed on the off-chip interfaces. Since the communication protocol between the processor and external memory does not necessarily align with the typical use of standard cryptographic primitives, adaptations are typically needed for encrypted execution. We motivate the additional data integrity, control flow validation, and authorization mechanisms in the following.

We begin by describing an unauthenticated EED platform that simply encrypts cache blocks and show why such a mechanism is insufficient against a determined attacker. Since memory chips are external to the CPU, the attacker can supply the processor with arbitrary blocks of data. The most effective form of attack supplies the processor with an unexpected block; the attacker can then observe the outcome and use it advantageously. For example, an attacker might notice that skipping a certain block leads to skipping a license check. A detailed description of known attacks against this unauthenticated EED platform follows, after which we introduce scholarly work that mitigates such attacks and describe architectural techniques to alleviate the performance overheads caused by encryption and authentication.

Code Injection/Execution Disruptions. An attacker may try to modify or replace a portion of an encrypted block of instructions. If the key has not been broken, this attack merely places random bits into a cache block. Nonetheless, these random bits will be decrypted into possibly valid instructions, whose outcome can be observed by a sophisticated attacker. If the instruction set uses n bits for each opcode, there are a total of 2^n possible instructions. If, among these, v is the number of valid instructions, and if the encryption block contains k instructions, then the probability that decryption will result in at least one invalid instruction in the block is 1 − (v/2^n)^k. Since a good processor architecture avoids wasting opcode space with unused instructions, it is highly probable that the attacker can supply a random block that will be decrypted and executed without detection. For example, for an encryption block of 16 bytes holding k = 16 one-byte instructions, with 90% of the opcode space used for valid instructions, the probability of an undetected disrupted execution is 0.9^16 ≈ 19%. We term this type of attack execution disruption through instruction/code injection. The attacker will not be able to execute arbitrary code, but by observing the program's behavior, the attacker can deduce information about the program's execution. Such code injection attacks suggest the need for run-time code integrity checking in EED platforms, for example by the use of signatures embedded into the executable.
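As a quick check of the arithmetic above, the short program below evaluates both sides of the formula under the stated assumptions (one-byte opcodes, so a 16-byte block holds k = 16 instructions).

```c
#include <stdio.h>
#include <math.h>   /* link with -lm */

/* Probability that a random 16-byte block decrypts to all-valid
 * instructions when 90% of the opcode space is valid: (v/2^n)^k. */
int main(void) {
    double valid_fraction = 0.9;   /* v / 2^n                            */
    int k = 16;                    /* instructions per encryption block  */
    double undetected = pow(valid_fraction, k);
    printf("undetected: %.1f%%  detected: %.1f%%\n",
           100.0 * undetected, 100.0 * (1.0 - undetected));
    return 0;   /* prints: undetected: 18.5%  detected: 81.5% */
}
```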

Instruction Replay. An attacker may re-issue a block of encrypted instructions from memory. This can be accomplished by freezing the bus and replacing the memory read value with an old one, overriding the address bus with a different memory location than the one the processor requested, or simply overwriting the memory at the targeted address. This is illustrated in Figure 12-7(a). In this example, the processor requests blocks A, B, and C from addresses 0x100, 0x200, and 0x300, respectively.

FIGURE 12-6 Encrypted execution and data (EED) platform. Everything that leaves the chip boundary is encrypted, and everything that enters it is decrypted. Control flow validation and data integrity checking, together with proper authorization policies, defend against physical attacks that compromise encryption-only EED platforms.

FIGURE 12-7 Example of (a) instruction replay and (b) control flow attacks.

However, when the processor requests block C from address 0x300, the attacker injects block B, which will be correctly decrypted and executed: the incorrect block decrypts into valid executable code. If the replayed block has an immediately observable result, the attacker can store the block and replay it at any time during program execution. Also, by observing the results of a multitude of replay attacks, the attacker can catalog information about the results of individual replays and use such attacks in tandem for greater disruption or to get a better understanding of

the application code. The vulnerability of EED platforms to such replay attacks suggests the need to validate that the contents are indeed from the memory location requested by the processor. We note that simply storing a signature inside the code block does not prevent such attacks, since the code block itself is unaltered.

Control Flow Attacks. An attacker can determine the control-flow structure of a program by watching memory accesses. This control-flow information can lead to both information leakage and attacks on the application. As described in Refs. [103, 104], obtaining the block-level control flow graph (CFG) can lead to information leakage, since CFGs can serve as unique fingerprints of the code and can reveal code reuse. By watching the interaction and calling sequences, the attacker learns more about the application code. Knowledge of the CFG can also compromise a secret key and leak sensitive data: because branches are decided by comparing values, the attacker can force which path the application takes. To disrupt the execution or steer it in a desired direction, the attacker can transfer control out of a loop, transfer control to a distant part of the executable, or force the executable to follow an existing execution path. For example, consider Figure 12-7(b). During normal execution, block A (at address 0x100) can transfer control to either block B (address 0x200) or block C (address 0x300), depending on the value of some variable x. An attacker who has observed this control flow property can substitute block C when B is requested and observe the outcome as a prelude to further attack, successfully bypassing any check condition present in block A. In another variant, suppose blocks A and B together form a loop: after observing one complete execution without interference and recording the blocks, the attacker can substitute blocks from an earlier iteration to cut the loop short. Protecting against control flow attacks requires a mechanism by which the correct control flow, as specified by the application, can be embedded and validated at run-time; a sketch of such a successor check follows.
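The sketch below models one way such embedded validation could look; the layout and names are illustrative, not a specific published design. The compiler records, with each block, the set of legal successor addresses, and the hardware checks every control transfer against that set before executing the fetched block. (Binding a fetched block to its requested address, which defeats substitution among legal successors, is handled by the authentication label discussed under data substitution below.)

```c
#include <stdio.h>
#include <stdint.h>

/* Per-block metadata embedded by the compiler: the block's address
 * and its legal successors (e.g., fall-through and branch target). */
struct block_meta {
    uint32_t addr;
    uint32_t legal_succ[2];
};

/* Hardware check performed on every control transfer. */
static int transfer_ok(const struct block_meta *from, uint32_t to) {
    return to == from->legal_succ[0] || to == from->legal_succ[1];
}

int main(void) {
    /* Block A at 0x100 may continue at B (0x200) or C (0x300). */
    struct block_meta A = { 0x100, { 0x200, 0x300 } };
    printf("A -> 0x200: %s\n", transfer_ok(&A, 0x200) ? "ok" : "violation");
    printf("A -> 0x400: %s\n", transfer_ok(&A, 0x400) ? "ok" : "violation");
    return 0;
}
```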

Data injection/modification. By examining the pattern of data write-backs to RAM, an attacker can guess the location of the runtime stack, as commonly used software stack structures are usually simple. Since the attacker has physical access to the device, stack data can be injected; even though the data will be decrypted into a random stream, its effect on program behavior can be observed. The attacker may still be able to disrupt the execution or learn about the executable.

Data substitution. The attacker can substitute a requested data block for another currently valid block and observe the program's behavior. Unlike instructions, which are limited to the valid opcodes in the instruction set, any data injected by the attacker will be correctly decrypted and passed to the processor. Thus, encryption in this case provides no protection other than privacy of the application's data. To address this issue, EED platforms include a location label (typically the memory address) as part of the validation information.

Stale data replay. An attacker who sniffs the bus can keep snapshots of data and replay old versions of data blocks when newer versions are requested. Data is even more sensitive to replays than code, because its content changes over time: every change to a block creates a new valid encryption of that block in memory. To address stale data replay, a time-stamp mechanism records data "freshness", and validation rejects any old data blocks that are no longer current.
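A sketch combining the last two defenses, in our own framing: the validation tag binds each block to both its address and an on-chip version counter, so a block substituted from another address and a stale block replayed with an old counter both fail the check. fnv1a stands in for a proper keyed MAC such as HMAC or GMAC.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Stand-in keyed checksum; a real platform would use a keyed MAC. */
static uint64_t fnv1a(const uint8_t *p, size_t n, uint64_t h) {
    while (n--) { h ^= *p++; h *= 0x100000001b3ULL; }
    return h;
}

/* Tag = MAC(key, address || version || block): the address defeats
 * substitution, the version counter defeats stale replay. */
static uint64_t tag_block(uint64_t key, uint64_t addr, uint64_t version,
                          const uint8_t block[16]) {
    uint64_t t = fnv1a((const uint8_t *)&addr, sizeof addr, key);
    t = fnv1a((const uint8_t *)&version, sizeof version, t);
    return fnv1a(block, 16, t);
}

int main(void) {
    uint64_t key = 0x0123456789abcdefULL;
    uint8_t block[16] = "payload.........";
    uint64_t tag = tag_block(key, 0x200, /*version=*/7, block);
    printf("fresh ok: %d\n", tag == tag_block(key, 0x200, 7, block));
    printf("stale ok: %d\n", tag == tag_block(key, 0x200, 6, block));
    return 0;
}
```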

As described above, these EED attacks show that encryption alone is not sufficient to guarantee security against physical attacks; without explicit countermeasures, such attacks can go undetected. Several projects address the design of EED platforms and provide both data and code integrity against physical attacks. Examples include the XOM architecture [105] and several other tamper-resistant processors [106, 107]; these techniques require re-design of the processor. Even so, under a physical threat model these systems remain vulnerable to attack.

Others minimize processor changes by using reconfigurable logic [108, 109]. The AEGIS architecture [110] greatly improves on its predecessor, XOM, and presents techniques for control-flow protection, privacy, and prevention of data tampering. AEGIS also includes an optional secure OS that is responsible for interfacing with the secure hardware. The code and memory protection schemes use cached hash trees, and a log-hash integrity verification mechanism verifies the integrity of memory accesses. The level of confidentiality and integrity protection improves further by merging encrypted execution with access control validation [60].

An attacker can extract patterns of access in an EED platform and match those patterns against a database of known patterns extracted from open-source software or from unencrypted executables run inside a debugger. Algorithms can be identified by observing memory access patterns, and this signature pattern can itself lead to information leakage as well as additional types of attacks. Address randomization can foil such attacks, and specific architectures for address randomization have been proposed to address this problem [103, 104].

Physical probing of devices is also investigated by a strong research community focused on differential fault and differential power analysis (DPA) [97]. Although these attacks typically target the keys of common encryption algorithms [111, 112], control flow and other algorithm design details can also be revealed by this approach. EED platforms are still vulnerable to power analysis and benefit from countermeasures such as attack-resistant implementations of DES and AES [113] or secure IC design [114]. Specific techniques to protect FPGA circuits also exist [115].

12.4. CONCLUSIONS AND FUTURE WORK

We have described two aspects of computer hardware security: the security of the hardware itself and how hardware enables secure software. Threats to the processor supply chain, such as the Trojan circuit, are emerging as a fundamental problem that faces security practitioners. How do we ensure that attacks do not succeed in the supply chain before our systems are even deployed? How might we create a trustworthy system comprising untrusted components? Traditionally, computer security has relied on hardware as a trusted base for security. We have reviewed how hardware can provide such security even in the face of determined attackers that capture the protected computing devices.

Directions for future work in hardware security abound: the problems are far from solved. Instead of trying to rank future work directions, we discuss possible improvements and paths forward in the order that we presented the topics in this chapter.

Some possible areas to improve security against supply chain attacks include developing Trojan circuit detectors that do not need a reference implementation, improving side-channel analysis so that it detects inclusions that try to hide in the noise, creating incentives for vendors to include DFHT in EDA tool chains, and changing the trust assumptions that software makes about the security of the underlying hardware.

Research in memory protection seems to take one of two directions: use the existing hardware (paging) to build secure systems or improve the existing hardware to support stronger security primitives; in spite of years of research and development, paging dominates commercial systems. Whether fine-grained memory protection can be made efficient, cost-effective, and commercially viable (for both hardware and software vendors) is an open problem. Although we did not discuss capabilities at length, they are well-matched to object-oriented concepts and may prove useful in developing systems-of-systems that integrate multiple object models – related to multiple physical entities – in large-scale deployments. Unfortunately, the history of capabilities indicates that research in them is a dead end short of fundamental breakthroughs.

Although buffer overflows are still prevalent in the wild, network-oriented attacks like cross-site scripting and database injection are the new attacks of choice. Hardware can efficiently defend against buffer overflows, so naturally we should consider hardware approaches to defend against these other attacks as well. Information flow tracking seems to be a good approach, but solving its inherent problems, namely the unbounded propagation of taint, drives ongoing research. Fine-grained permission checking may also help, but managing fine-grained permissions well is itself an open problem.

Room for improvement exists in better securing and integrating cryptographic accelerators, secure coprocessors, and EED platforms. Especially important is considering how multicore multiprocessing affects hardware-based security primitives; one of the exercises asks you to consider some of the ramifications of multiprocessing with EED cores. Performance of tamper-proof components is low compared with what unsecured hardware can achieve, mostly because tamper-proof enclosures dissipate heat slowly: to avoid overheating, the enclosed processor must execute at lower frequencies than if unenclosed.

Low-cost, highly secure hardware remains a lofty yet important goal that may involve multiple fields of science, engineering, and mathematics. Importantly, we should think about how software and hardware can together play a role in securing the systems of tomorrow.

EXERCISES

1. Explain the relationship between the terms trusted, untrusted, trustworthy, and untrustworthy in the context of secure computing. Give a hardware example for each term.

2. An attacker determines how to get a Trojan circuit past every detection technique and into a processor such that it will accept arbitrary commands remotely from the attacker. Suppose an armed military UAV has been compromised by this Trojan circuit in its main processor. As an attacker, what is a good strategy to gain an advantage from the Trojan circuit: would you wait for a critical moment to strike or perhaps try to slowly sabotage the UAV? As a defender, how might the military prevent the compromise of the processor from compromising the UAV? How might your defensive solution affect the performance of the UAV? Does the UAV still provide services in spite of an active cyber-attack?

3. Explain how a hardware Trojan circuit compares with software malware. What are the advantages and disadvantages that each has from the perspective of an attacker? What are techniques for protecting systems against each?

4. Some manufacturers ship integrated circuits with reconfigurable logic attached directly to a processor core. Explain how the reconfigurable logic might be used to improve the performance and effectiveness of system security.

5. Take an example for each of the ARM, Intel, PowerPC, and SPARC architectures and identify how many privilege rings (levels) it supports and how it encodes a privileged processor state in a control structure. In practice, only two privilege rings really matter: high and low. State some reasons why the rest are ignored.

6. Suppose you are developing an application for a limited-resource embedded system. The memory protection mechanism offered by the architecture is only capable of protecting four distinct statically defined memory ranges. How would you partition your application code and data among the four memory ranges? What permissions would you set for each range?

7. In critical systems where physical access cannot be easily restricted, some form of encrypted execution can protect data and code on buses and in memory. Consider such a system that has multiple processors, each with its own cache, and shared main memory. Identify some of the issues that can occur in such a system. Discuss how sharing some (but not all) data among some (but not all) of the processors complicates system design.

8. Smartcards and TPMs have a similar internal architecture, yet they are used in different application domains. Discuss the differences in how the two interface with another system and describe two specific applications, one that motivates the use of smartcards and one that motivates TPMs.

REFERENCES

[1] DARPA. BAA 06-40 – solicitations – Microsystems Technology Office. http://www.darpa.mil/mto/solicitations/baa06-40/, 2006.

[2] DARPA. BAA 07-24 – solicitations – Microsystems Technology Office. http://www.darpa.mil/MTO/solicitations/baa07-24/index.html, 2007.

[3] S. Adee, The hunt for the kill switch, IEEE Spectr. 45 (5) (2008) 34–39.

[4] Semiconductor Equipment and Materials Industry (SEMI). Innovation at risk: Intellectual property challenges and opportunities. http://www.semi.org/en/Issues/IntellectualProperty/index.htm, 2008.

[5] S. Trimberger, Trusted design in FPGAs, in: Design Automation Conference, DAC '07, 44th ACM/IEEE, ACM, New York, NY, USA, 2007, pp. 5–8.

[6] M. Tehranipoor, F. Koushanfar, A survey of hardware trojan taxonomy and detection, IEEE Des. Test Comput. 27 (1) (2010) 10–25.

[7] D. Agrawal, S. Baktir, D. Karakoyunlu, P. Rohatgi, B. Sunar, Trojan detection using IC fingerprinting, IEEE Symposium on Security and Privacy, IEEE, Berkeley, CA, 2007, pp. 296–310.

[8] R. Rad, J. Plusquellic, M. Tehranipoor, Sensitivity analysis to hardware trojans using power supply transient signals, in: IEEE International Workshop on Hardware-Oriented Security and Trust, HOST 2008, pp. 3–7.

[9] M. Banga, M. S. Hsiao, A region based approach for the identification of hardware trojans, in: Proceedings of the 2008 IEEE International Workshop on Hardware-Oriented Security and Trust, IEEE Computer Society, Anaheim, CA, 2008, pp. 40–47.

[10] S. Narasimhan, D. Du, R. S. Chakraborty, S. Paul, F. Wolff, C. Papachristou, K. Roy, et al., Multiple-parameter side-channel analysis: A non-invasive hardware trojan detection approach, in: Hardware-Oriented Security and Trust (HOST), 2010 IEEE International Symposium on, IEEE, Anaheim, CA, 2010, pp. 13–18.

[11] J. Li, J. Lach, At-speed delay characterization for IC authentication and trojan horse detection, in: IEEE International Workshop on Hardware-Oriented Security and Trust, HOST 2008, pp. 8–14.

[12] Y. Jin, Y. Makris, Hardware trojan detection using path delay fingerprint, in: Proceedings of the 2008 IEEE International Workshop on Hardware-Oriented Security and Trust, IEEE Computer Society, Anaheim, CA, 2008, pp. 51–57.

[13] M. Potkonjak, A. Nahapetian, M. Nelson, T. Massey, Hardware trojan horse detection using gate-level characterization, in: Proceedings of the 46th Annual Design Automation Conference, ACM, San Francisco, CA, 2009, pp. 688–693.

[14] S. Wei, S. Meguerdichian, M. Potkonjak, Gate-level characterization: foundations and hardware security applications, in: Proceedings of the 47th Design Automation Conference, ACM, Anaheim, CA, 2010, pp. 222–227.

[15] F. Wolff, C. Papachristou, S. Bhunia, R. S. Chakraborty, Towards trojan-free trusted ICs: problem analysis and detection scheme, in: Proceedings of the Conference on Design, Automation and Test in Europe, ACM, Munich, Germany, 2008, pp. 1362–1365.

[16] R. S. Chakraborty, S. Paul, S. Bhunia, On-demand transparency for improving hardware trojan detectability, in: IEEE International Workshop on Hardware-Oriented Security and Trust, IEEE, Anaheim, CA, HOST 2008, pp. 48–50.

[17] R. S. Chakraborty, F. Wolff, S. Paul, C. Papachristou, S. Bhunia, MERO: a statistical approach for hardware trojan detection, in: Proceedings of the 11th International Workshop on Cryptographic Hardware and Embedded Systems, Springer-Verlag, Lausanne, Switzerland, 2009, pp. 396–410.

[18] S. Jha, S. K. Jha, Randomization based probabilistic approach to detect trojan circuits, in: Proceedings of the 2008 11th IEEE High Assurance Systems Engineering Symposium, IEEE Computer Society, Nanjing, China, 2008, pp. 117–124.

[19] M. Banga, M. S. Hsiao, A novel sustained vector technique for the detection of hardware trojans, in: Proceedings of the 22nd International Conference on VLSI Design, IEEE Computer Society, Los Alamitos, CA, 2009, pp. 327–332.

[20] H. Salmani, M. Tehranipoor, J. Plusquellic, New design strategy for improving hardware trojan detection and reducing trojan activation time, in: Proceedings of the 2009 IEEE International Workshop on Hardware-Oriented Security and Trust, IEEE Computer Society, Los Alamitos, CA, 2009, pp. 66–73.

[21] M. Banga, M. S. Hsiao, VITAMIN: voltage inversion technique to ascertain malicious insertions in ICs, in: IEEE International Workshop on Hardware-Oriented Security and Trust, IEEE Computer Society, Los Alamitos, CA, 2009, pp. 104–107.

[22] E. Charbon, I. Torunoglu, Watermarking techniques for electronic circuit design, in: Digital Watermarking, 2003, pp. 347–374.

[23] J. Lach, W. H. Mangione-Smith, M. Potkonjak, FPGA fingerprinting techniques for protecting intellectual property, in: Proceedings of the IEEE Custom Integrated Circuits Conference, IEEE, Santa Clara, CA, 1998, pp. 299–302.

[24] J. Lach, W. H. Mangione-Smith, M. Potkonjak, Robust FPGA intellectual property protection through multiple small watermarks, in: Proceedings of the 36th Annual ACM/IEEE Design Automation Conference, ACM, New Orleans, Louisiana, United States, 1999, pp. 831–836.

[25] L. Yuan, P. Pari, G. Qu, Soft IP protection: Watermarking HDL codes, in: Information Hiding, 2005, pp. 224–238.

[26] M. Lin, G.-R. Tsai, C.-R. Wu, C.-H. Lin, Watermarking technique for HDL-based IP module protection, in: Third International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIHMSP 2007, vol. 2, IEEE Computer Society, Los Alamitos, CA, USA, 2007, pp. 393–396.

[27] E. Castillo, U. Meyer-Baese, A. Garcia, L. Parrilla, A. Lloris, IPP@HDL: efficient intellectual property protection scheme for IP cores, IEEE Transactions on Very Large Scale Integration (VLSI) Systems 15 (5) (2007) 578–591.

[28] A. E. Caldwell, H.-J. Choi, A. B. Kahng, S. Mantik, M. Potkonjak, G. Qu, et al., Effective iterative techniques for fingerprinting design IP, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 23 (2) (2004) 208–215.

[29] K. Lofstrom, W. R. Daasch, D. Taylor, IC identification circuit using device mismatch, IEEE International Solid-State Circuits Conference, 2000, Digest of Technical Papers, ISSCC 2000, pp. 372–373.

[30] S. Maeda, H. Kuriyama, T. Ipposhi, S. Maegawa, M. Inuishi, An artificial fingerprint device (AFD) module using poly-Si thin film transistors with logic LSI compatible process for built-in security, Electron Devices Meeting, 2001, IEDM Technical Digest, International, 2001, pp. 34.5.1–34.5.4.

[31] B. Gassend, D. Clarke, M. van Dijk, S. Devadas, Silicon physical random functions, in: Proceedings of the 9th ACM Conference on Computer and Communications Security, ACM, Washington, DC, 2002, pp. 148–160.

[32] B. Gassend, D. Lim, D. Clarke, M. van Dijk, S. Devadas, Identification and authentication of integrated circuits, Concurrency Computat. Pract. Exper. 16 (11) (2004) 1077–1098.

[33] G. E. Suh, S. Devadas, Physical unclonable functions for device authentication and secret key generation, in: Proceedings of the 44th Annual Design Automation Conference, DAC '07, ACM, New York, NY, 2007, pp. 9–14.

[34] E. Simpson, P. Schaumont, Offline hardware/software authentication for reconfigurable platforms, in: Cryptographic Hardware and Embedded Systems – CHES 2006, 2006, pp. 311–323.

[35] F. Koushanfar, G. Qu, Hardware metering, in: Proceedings Design Automation Conference, 2001, pp. 490–493.

[36] J. W. Lee, D. Lim, B. Gassend, G. E. Suh, M. van Dijk, S. Devadas, A technique to build a secret key in integrated circuits with identification and authentication applications, in: Proceedings of the IEEE VLSI Circuits Symposium, IEEE, 2004, pp. 176–179.

[37] J. A. Roy, F. Koushanfar, I. L. Markov, EPIC: ending piracy of integrated circuits, in: Design, Automation and Test in Europe, DATE '08, 2008, pp. 1069–1074.

[38] N. Couture, K. B. Kent, Periodic licensing of FPGA based intellectual property, in: IEEE International Conference on Field Programmable Technology, IEEE, FPT 2006, pp. 357–360.

[39] A. Baumgarten, A. Tyagi, J. Zambreno, Preventing IC piracy using reconfigurable logic barriers, IEEE Des. Test Comput. 27 (1) (2010) 66–75.

[40] J. Di, S. Smith, A hardware threat modeling concept for trustable integrated circuits, in: Region 5 Technical Conference, 2007 IEEE, 2007, pp. 354–357.

[41] S. C. Smith, J. Di, Detecting malicious logic through structural checking, in: IEEE Region 5 Technical Conference, IEEE, 2007, pp. 217–222.

[42] G. Bloom, B. Narahari, R. Simha, Fab forensics: Increasing trust in IC fabrication, in: IEEE International Conference on Technologies for Homeland Security (HST), IEEE, Waltham, MA, November 2010.

[43] X. Wang, M. Tehranipoor, J. Plusquellic, Detecting malicious inclusions in secure hardware: Challenges and solutions, in: IEEE International Workshop on Hardware-Oriented Security and Trust, IEEE, HOST 2008, pp. 15–19.

[44] S. T. King, J. Tucek, A. Cozzie, C. Grier, W. Jiang, Y. Zhou, Designing and implementing malicious hardware, in: Proceedings of the 1st Usenix Workshop on Large-Scale Exploits and Emergent Threats, USENIX Association, San Francisco, CA, 2008, pp. 1–8.

[45] M. Hicks, M. Finnicum, S. T. King, M. M. K. Martin, J. M. Smith, Overcoming an untrusted computing base: Detecting and removing malicious hardware automatically, in: IEEE International Conference on Security and Privacy (SP), IEEE, 2010, pp. 159–172.

[46] G. Bloom, B. Narahari, R. Simha, J. Zambreno, Providing secure execution environments with a last line of defense against trojan circuit attacks, Comput. Secur. 28 (7) (2009) 660–669.

[47] T. Huffmire, B. Brotherton, G. Wang, T. Sherwood, R. Kastner, T. Levin, et al., Moats and drawbridges: An isolation primitive for reconfigurable hardware based systems, in: IEEE Symposium on Security and Privacy, SP '07, IEEE, 2007, pp. 281–295.

[48] G. Bloom, B. Narahari, R. Simha, OS support for detecting trojan circuit attacks, in: IEEE International Workshop on Hardware-Oriented Security and Trust, IEEE Computer Society, Los Alamitos, CA, 2009, pp. 100–103.

[49] A. Seshadri, A. Perrig, L. van Doorn, P. Khosla, SWATT: SoftWare-based ATTestation for embedded devices, in: Proceedings of the IEEE Symposium on Security and Privacy, 2004.

[50] D. Y. Deng, A. H. Chan, G. E. Suh, Hardware authentication leveraging performance limits in detailed simulations and emulations, in: Proceedings of the 46th Annual Design Automation Conference, ACM, San Francisco, CA, 2009, pp. 682–687.

[51] C. Cowan, C. Pu, D. Maier, H. Hinton, P. Bakke, S. Beattie, et al., StackGuard: Automatic adaptive detection and prevention of buffer-overflow attacks, in: USENIX Security Symposium, USENIX Society, 1998.

[52] T. Chiueh, F. Hsu, RAD: A compile-time solution to buffer overflow attacks, in: 21st IEEE International Conference on Distributed Computing Systems (ICDCS '01), IEEE, 2001.

[53] M. L. Corliss, E. C. Lewis, A. Roth, Using DISE to protect return addresses from attack, in: Workshop on Architectural Support for Security and Anti-Virus, 2004.

[54] D. Samyde, S. Skorobogatov, R. Anderson, J.-J. Quisquater, On a new way to read data from memory, in: Security in Storage Workshop, Proceedings of the First International IEEE, Greenbelt, Maryland, December 2002, pp. 65–69.

[55] H. M. Deitel, M. S. Kogan, The Design of OS/2, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1992.

[56] C. Reis, A. Barth, C. Pizano, Browser security: lessons from Google Chrome, Commun. ACM 52 (2009) 45–49.

[57] W. Shi, J. B. Fryman, G. Gu, H. H. S. Lee, Y. Zhang, J. Yang, InfoShield: a security architecture for protecting information usage in memory, in: The Twelfth International Symposium on High-Performance Computer Architecture, 2006, pp. 222–231.

[58] D. Arora, A. Raghunathan, S. Ravi, N. K. Jha, Architectural support for run-time validation of program data properties, IEEE Trans. Very Large Scale Integration (VLSI) Systems 15 (5) (2007) 546–559.

[59] E. Witchel, J. Cates, K. Asanovic, Mondrian memory protection, in: Proceedings of ASPLOS-X, ACM, New York, NY, USA, October 2002.

[60] W. Shi, C. Lu, H.-H. S. Lee, Memory-centric security architecture, High Performance Embedded Architectures and Compilers, Barcelona, Spain, November 17–18, 2005.

[61] E. Leontie, G. Bloom, B. Narahari, R. Simha, J. Zambreno, Hardware containers for software components: A trusted platform for COTS-based systems, in: Proceedings of the 2009 IEEE International Symposium on Trusted Computing, TrustCom09, Vancouver, Canada, August 2009.

[62] E. Leontie, G. Bloom, B. Narahari, R. Simha, J. Zambreno, Hardware-enforced fine-grained isolation of untrusted code, in: Proceedings of the First ACM Workshop on Secure Execution of Untrusted Code, ACM, Chicago, IL, 2009, pp. 11–18.

[63] T. Huffmire, S. Prasad, T. Sherwood, R. Kastner, Policy-driven memory protection for reconfigurable hardware, Lecture Notes in Computer Science, 2006, pp. 461–478.

[64] H. M. Levy, Capability-Based Computer Systems, Butterworth-Heinemann, Newton, MA, USA, 1984.

[65] R. B. Lee, D. K. Karig, J. P. Mcgregor, Z. Shi, Enlisting hardware architecture to thwart malicious code injection, in: Proceedings of the 2003 International Conference on Security in Pervasive Computing, Springer Verlag, Boppard, Germany, 2003, pp. 237–252.

[66] H. Ozdoganoglu, T. N. Vijaykumar, C. E. Brodley, B. A. Kuperman, A. Jalote, SmashGuard: A hardware solution to prevent attacks on the function return address, IEEE Transactions on Computers, 2006.

[67] J. Xu, Z. Kalbarczyk, S. Patel, R. K. Iyer, Architecture support for defending against buffer overflow attacks, in: Proceedings of the 2nd Workshop on Evaluating and Architecting System Dependability (EASY-2002), 2002.

[68] K. Inoue, Energy-security tradeoff in a secure cache architecture against buffer overflow attacks, SIGARCH Comput. Archit. News 33 (1) (2005) 81–89.

[69] W. Kao, S. F. Wu, Lightweight hardware return address and stack frame tracking to prevent function return address attack, in: CSE '09: Proceedings of the 2009 International Conference on Computational Science and Engineering, IEEE Computer Society, Washington, DC, 2009, pp. 859–866.

[70] Z. Shao, J. Cao, K. C. C. Chan, C. Xue, E. H.-M. Sha, Hardware/software optimization for array & pointer boundary checking against buffer overflow attacks, J. Parall. Distrib. Comput. 66 (9) (2006) 1129–1136.

[71] J. R. Crandall, S. F. Wu, F. T. Chong, Minos: Architectural support for protecting control data, ACM Trans. Archit. Code Optim. 3 (4) (2006) 359–389.

[72] A. Smirnov, T. Chiueh, DIRA: Automatic detection, identification, and repair of control-hijacking attacks, in: NDSS, 2005.

[73] G. E. Suh, J. W. Lee, D. Zhang, S. Devadas, Secure program execution via dynamic information flow tracking, in: ASPLOS-XI: Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, ACM, New York, NY, 2004, pp. 85–96.

[74] J. Zambreno, A. Choudhary, R. Simha, B. Narahari, N. Memon, SAFE-OPS: An approach to embedded software security, ACM Trans. Embed. Comput. Syst. 4 (1) (2005) 189–210.

[75] M. Abadi, M. Budiu, U. Erlingsson, J. Ligatti, Control-flow integrity, in: Proceedings of the 12th ACM Conference on Computer and Communications Security, ACM, November 2005, pp. 340–353.

[76] E. Leontie, G. Bloom, O. Gelbart, B. Narahari, R. Simha, A compiler-hardware technique for protecting against buffer overflow attacks, J. Inf. Assur. Secur. 5 (1) (2010).

[77] M. Kharbutli, X. Jiang, Y. Solihin, G. Venkataramani, M. Prvulovic, Comprehensively and efficiently protecting the heap, ACM SIGOPS Operating Systems Review, Proceedings of the 2006 ASPLOS Conference, December 2006, pp. 207–218.

[78] D. Arora, A. Raghunathan, S. Ravi, N. K. Jha, Architectural support for safe software execution on embedded processors, in: Proceedings of the Fourth International Conference on Hardware/Software Codesign and System Synthesis, ACM, New York, NY, USA, 2006, pp. 106–111.

[79] G. Venkataramani, B. Roemer, Y. Solihin, M. Prvulovic, MemTracker: Efficient and programmable support for memory access monitoring and debugging, in: International Symposium on High-Performance Computer Architecture, 2007, pp. 273–284.

[80] G. E. Suh, J. Lee, S. Devadas, Secure program execution via dynamic information flow tracking, Architectural Support for Programming Languages and Operating Systems, 2004, pp. 85–96.

[81] M. Dalton, H. Kannan, C. Kozyrakis, Raksha: A flexible information flow architecture for software security, in: International Symposium on Computer Architecture, ACM, New York, NY, USA, June 2007, pp. 482–493.

[82] F. Qin, C. Wang, Z. Li, H. S. Kim, Y. Zhou, Y. Wu, LIFT: A low-overhead practical information flow tracking system for detecting security attacks, in: 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO '06), IEEE, 2006, pp. 135–148.

[83] M. Tiwari, H. Wassel, B. Mazloom, S. Mysore, F. Chong, T. Sherwood, Complete information flow tracking from the gates up, in: Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), ACM, New York, NY, USA, March 2009.

[84] G. Venkataramani, I. Doudalis, Y. Solihin, M. Prvulovic, FlexiTaint: A programmable accelerator for dynamic taint propagation, in: International Symposium on High Performance Computer Architecture, IEEE, 2008, pp. 173–184.

[85] T. Grembowski, R. Lien, K. Gaj, N. Nguyen, P. Bellows, J. Flidr, et al., Comparative analysis of the hardware implementations of hash functions SHA-1 and SHA-512, in: Proceedings of the International Conference on Information Security (ISC), Springer-Verlag, London, UK, 2002, pp. 75–89.

[86] K. U. Jarvinen, M. T. Tommiska, J. O. Skytta, A fully pipelined memoryless 17.8 Gbps AES-128 encryptor, in: Proceedings of the International Symposium on Field Programmable Gate Arrays (FPGA), 2003, pp. 207–215.

[87] J. Kaps, C. Paar, Fast DES implementation for FPGAs and its application to a universal key-search machine, in: Proceedings of the Annual Workshop on Selected Areas in Cryptography (SAC), Springer-Verlag, London, UK, 1998, pp. 234–247.

[88] O. Kocabas, E. Savas, J. Grosschadl, Enhancing an embedded processor core with a cryptographic unit for speed and security, in: International Conference on Reconfigurable Computing and FPGAs, ReConFig '08, IEEE, December 2008, pp. 409–414.

[89] J. Burke, J. McDonald, T. Austin, Architectural support for fast symmetric-key cryptography, SIGOPS Oper. Syst. Rev. 34 (2000) 178–189.

[90] J. Zambreno, D. Nguyen, A. Choudhary, Exploring area/delay tradeoffs in an AES FPGA implementation, in: Proceedings of the International Conference on Field Programmable Logic and Applications (FPL), Springer, 2004, pp. 575–585.

[91] S. Okada, N. Torii, K. Itoh, M. Takenaka, Implementation of elliptic curve cryptographic coprocessor over GF(2^m) on an FPGA, in: Proceedings of the International Workshop on Cryptographic Hardware and Embedded Systems (CHES), Springer-Verlag, London, UK, 2000, pp. 25–40.

[92] M. Ernst, M. Jung, F. Madlener, S. Huss, R. Blumel, A reconfigurable system on chip implementation for elliptic curve cryptography over GF(2^n), in: Proceedings of the International Workshop on Cryptographic Hardware and Embedded Systems (CHES), Springer-Verlag, London, UK, 2002, pp. 381–399.

[93] A. Hodjat, I. Verbauwhede, A 21.54 Gbits/s fully pipelined AES processor on FPGA, 12th Field-Programmable Custom Computing Machines (FCCM), 2004.

[94] A. Dandalis, V. K. Prasanna, J. D. P. Rolim, An adaptive cryptographic engine for IPsec architectures, in: Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), 2000, pp. 132–144.

[95] B. Yee, J. D. Tygar, Secure coprocessors in electronic commerce applications, in: Proceedings of the USENIX Workshop on Electronic Commerce, USENIX Association, Berkeley, CA, USA, July 1995, pp. 155–170.

[96] S. H. Weingart, Physical security for the µABYSS system, IEEE Symposium on Security and Privacy, vol. 52, 1987.

[97] P. Kocher, J. Jaffe, B. Jun, Differential power analysis, in: Proceedings of Advances in Cryptology (Crypto), Springer-Verlag, London, UK, 1999, pp. 388–397.

[98] S. R. White, ABYSS: A trusted architecture for software protection, IEEE Symposium on Security and Privacy, vol. 38, 1987.

[99] J. D. Tygar, B. Yee, Dyad: A system for using physically secure coprocessors, in: Proceedings of the Harvard-MIT Workshop on Protection of Intellectual Property, April 1993.

[100] E. Palmer, An introduction to Citadel – a secure crypto coprocessor for workstations, Research Report RC 18373, IBM T. J. Watson Research Center, 1992.

[101] J. G. Dyer, M. Lindemann, R. Perez, R. Sailer, L. van Doorn, S. W. Smith, Building the IBM 4758 secure coprocessor, Computer 34 (10) (2001) 57–66.

[102] S. W. Smith, Secure coprocessing applications and research issues, in: Los Alamos Unclassified Release LA-UR-96-2805, Los Alamos National Laboratory, Los Alamos, NM, 1996.

[103] X. Zhuang, T. Zhang, H.-H. S. Lee, S. Pande, Hardware assisted control flow obfuscation for embedded processors, in: Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES), ACM, New York, NY, USA, September 2004.

[104] X. Zhuang, T. Zhang, S. Pande, HIDE: An infrastructure for efficiently protecting information leakage on the address bus, Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2004.

[105] D. Lie, C. Thekkath, M. Mitchell, P. Lincoln, D. Boneh, J. Mitchell, et al., Architectural support for copy and tamper resistant software, Architectural Support for Programming Languages and Operating Systems (ASPLOS), ACM, New York, NY, USA, 2000.

[106] M. Milenkovic, A. Milenkovic, E. Jovanov, Hardware support for code integrity in embedded processors, CASES, 2005.

[107] D. Kirovski, M. Drinic, M. Potkonjak, Enabling trusted software integrity, ASPLOS, 30 (5) October (2002).

[108] O. Gelbart, P. Ott, B. Narahari, R. Simha, A. Choudhary, J. Zambreno, CODESEAL: Compiler/FPGA approach to secure applications, in: Proc. IEEE ISI, IEEE, 2005.

[109] O. Gelbart, E. Leontie, B. Narahari, R. Simha, Architectural support for securing application data in embedded systems, in: IEEE International Conference on Electro/Information Technology, IEEE, May 2008, pp. 19–24.

[110] G. E. Suh, D. Clarke, B. Gassend, M. van Dijk, S. Devadas, AEGIS: Architecture for tamper-evident and tamper-resistant processing, in: Proceedings of the 17th International Conference on Supercomputing (ICS), ACM, New York, NY, USA, 2003.

[111] F.-X. Standaert, S. B. Ors, J.-J. Quisquater, B. Preneel, Power analysis attacks against FPGA implementations of the DES, in: Proceedings of the International Conference on Field-Programmable Logic and its Applications (FPL), 2004, pp. 84–94.

[112] S. B. Ors, F. Gurkaynak, E. Oswald, B. Preneel, Power-analysis attack on an ASIC AES implementation, in: Proceedings of the International Conference on Information Technology (ITCC), IEEE Computer Society, Washington, DC, USA, 2004.

[113] M.-L. Akkar, C. Giraud, An implementation of DES and AES secure against some attacks, in: Proceedings of the International Workshop on Cryptographic Hardware and Embedded Systems (CHES), Springer-Verlag, London, UK, May 2001, pp. 309–318.

[114] K. Tiri, I. Verbauwhede, A digital design flow for secure integrated circuits, IEEE Trans. Comput.-Aided Design of Integrated Circuits and Systems, July 2006, pp. 1197–1208.

[115] P. Yu, P. Schaumont, Secure FPGA circuits using controlled placement and routing, International Conference on Hardware/Software Codesign and System Synthesis, October 2007.
