SCOTT MILLER, AMBROSE CHU, MIHAI SIMA, MICHAEL MCGUIRE

[email protected]

ReCoEng Lab

DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING

UNIVERSITY OF VICTORIA

VICTORIA, B.C., CANADA

VLSI Implementation of a Cryptography-Oriented Reconfigurable Array

DSD 2008 - Parma, Italy ReCoEng Lab, University of Victoria, Canada

Outline

Motivation and Problem Statement

Overview of Current FPGAs

• Limitations for Cryptography
• Carry Lookahead Addition

CryptoRA

Tile Implementation, Split LUT

Simulation Framework

Results

Conclusions


Motivation

Problem

• Cryptography on mobile, embedded systems
• ASICs are expensive
  • Recurring engineering, quick obsolescence
• Poor long-integer arithmetic support in current FPGAs

Design Constraints

• Low added complexity
• No (negligible) impact on reconfigurability
• “Cheap” solution


Overview of FPGAs

Grid of computing units

Mesh of configurable interconnection busses

Emulate any digital logic function

Global Interconnect slow

[Figure: grid of CLBs]


Overview of FPGAs

Xilinx Virtex-II

• 4-input LUT
• Support for ripple-carry and carry-lookahead adders

Carry-Lookahead Addition

Ripple-carry adders have serial delay

Carry-lookahead adders calculate carries in parallel

Can use hierarchies of CLA adders to speed up long-operand calculations

OPERANDS FOR CLA

1 + 1 = Generate
1 + 0 = Propagate
0 + 1 = Propagate
0 + 0 = Nothing
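The generate/propagate rules above can be sketched in a few lines of Python. This is only an illustrative software model, not the circuit from the talk: the function names (`all_set`, `signals`, `lookahead_carry`, `cla_add`) are hypothetical, and propagate is taken as XOR because the table lists 1 + 1 as generate only. Each carry is computed from the generate and propagate bits alone, which is what lets hardware evaluate all carries in parallel.

```python
def all_set(bits):
    """Return 1 if every bit in the list is 1 (an empty list counts as 1)."""
    return int(all(bits))


def signals(a, b, width):
    """Per-bit generate and propagate lists, least significant bit first."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]    # 1 + 1 = generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]  # 1 + 0 or 0 + 1 = propagate
    return g, p


def lookahead_carry(g, p, cin, i):
    """Carry into bit i, computed without waiting for lower carries."""
    # The carry-in reaches bit i only if every lower bit propagates it.
    carry = cin & all_set(p[:i])
    # A carry generated at bit j reaches bit i if bits j+1 .. i-1 all propagate.
    for j in range(i):
        carry |= g[j] & all_set(p[j + 1:i])
    return carry


def cla_add(a, b, width, cin=0):
    """Add two width-bit integers; return (sum modulo 2**width, carry_out)."""
    g, p = signals(a, b, width)
    carries = [lookahead_carry(g, p, cin, i) for i in range(width + 1)]
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i  # sum bit = propagate XOR carry-in
    return total, carries[width]


print(cla_add(0b1011, 0b0110, 4))  # 11 + 6 = 17, so (1, 1)
```

In software the list comprehension still runs sequentially; the point is that no carry expression depends on another carry, only on `g`, `p`, and `cin`.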

FPGAs: Limitations for Cryptography

Poor support for long-integer arithmetic

• Long ripple-carry chains (with global interconnects)
• Fast adders still require multiple stages of global interconnects

Same difficulties for comparison operations

Required in most common ECC and RSA algorithms
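To see why comparison inherits the adder's carry-chain problem, here is a minimal Python sketch. It assumes unsigned operands and the usual two's-complement trick of deciding `a < b` from the carry-out of `a + ~b + 1`; the helper names `less_than` and `mod_add` are hypothetical and not taken from the talk. The modular addition shows the add, compare, conditionally subtract pattern that long-integer field arithmetic relies on.

```python
def less_than(a, b, width):
    """Unsigned a < b, decided by the carry-out of a + (~b) + 1."""
    mask = (1 << width) - 1
    total = (a & mask) + (~b & mask) + 1  # a - b in two's complement
    carry_out = total >> width
    return carry_out == 0                 # no carry-out means a borrow, so a < b


def mod_add(a, b, m, width):
    """(a + b) mod m, assuming 0 <= a, b < m < 2**width."""
    s = a + b                             # may need width + 1 bits
    if not less_than(s, m, width + 1):    # the comparison is a full-length carry chain
        s -= m                            # and so is the correcting subtraction
    return s


print(less_than(3, 5, 4), less_than(5, 3, 4))  # True False
print(mod_add(200, 100, 251, 8))               # 49
```

The result of the comparison is only known once the carry has crossed the whole operand, so a long comparison is as slow as a long addition.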


FPGAs: Limitations for Cryptography


Proposed Solution: CryptoRA

Based on Xilinx architecture

Additional fast path provided for simultaneous Carry and Propagate signals

Extends the fast path across rows as well as columns

Splits LUT to handle subtraction, etc.


CryptoRA: Split LUT


VLSI Modeling


Simulation Framework

All designs simulated in 65 nm technology

• Simulated with Cadence Spectre simulator
• Average taken of 10 Monte Carlo runs with process variation and mismatch included

Simulated simplified CLB models

• Many components are outside the scope of this research
• Respective loads for omitted modules were included

Timing simulated at every point of interest in the LUT -> fast chain path to find all timing trade-offs


Results: Split LUT

Edge Delay (ps)

Scenario                  Selection Path    Generate Path    LUT Output          Cout
                          Good    Poor      Good    Poor     Good     Poor       Good    Poor
Incremental
  2-select/2-generate     30      71        46      56       94       129        104     169
  2-select/3-generate     31      72        58      115      98       128        117     171
  3-select/3-generate     38      88        60      112      102      137        123     182
Virtex-II / Spartan-3     106     157       50      111      62       147        156     260
Virtex-4                  59      120       50      124      59       120        155     209
Virtex-5                  141     276       121     204      121/141  204/276    299     353

Speed-up Versus

Design                    Virtex-II/Spartan-3    Virtex-4    Virtex-5
2-select/2-generate       1.54                   1.24        2.09
2-select/3-generate       1.52                   1.22        2.06
3-select/3-generate       1.42                   1.15        1.94
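The slides do not state how the speed-up figures were derived. As a hedged cross-check, the short Python sketch below assumes each speed-up is the ratio of the reference device's "Poor" Cout delay to the proposed design's "Poor" Cout delay. Under that assumption the delay table reproduces every listed speed-up to within 0.01 (the one visible difference is 3-select/3-generate versus Virtex-II/Spartan-3, which comes out as 1.43 against the listed 1.42). The variable and function names are illustrative only.

```python
# "Poor" Cout edge delays in ps, copied from the delay table.
cout_poor_ps = {
    "2-select/2-generate": 169,
    "2-select/3-generate": 171,
    "3-select/3-generate": 182,
    "Virtex-II/Spartan-3": 260,
    "Virtex-4": 209,
    "Virtex-5": 353,
}

designs = ["2-select/2-generate", "2-select/3-generate", "3-select/3-generate"]
references = ["Virtex-II/Spartan-3", "Virtex-4", "Virtex-5"]


def speedup(design, reference):
    """Assumed definition: reference delay divided by design delay."""
    return cout_poor_ps[reference] / cout_poor_ps[design]


for design in designs:
    row = ", ".join(f"{speedup(design, ref):.2f}" for ref in references)
    print(f"{design}: {row}")
```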


Results: Split LUT

[Figure: chart of the Selection, Generate, LUT_Out, and Cout paths (axis range 0 to 400) for 2-sel/2-gen, 2-sel/3-gen, 3-sel/3-gen, V-II/S-3, V-4, and V-5]


Results - Discussion

Performance boost of added carry-chain and additional fast-path cannot be directly quantified

• Dependence on physical FPGA itself, and operand word-length

Hierarchical carry-lookahead adders show promise with the new chains for increased performance

• Example calculations are given in the paper

Performance comes at 2.5% area increase over smallest reference structure
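For readers unfamiliar with hierarchical carry-lookahead, the following Python sketch shows the idea in software. It is not the CryptoRA design: the 4-bit block size and the function names (`block_signals`, `hierarchical_add`) are assumptions made for illustration. Each block reduces its per-bit signals to one group generate and one group propagate, and the carry into the next block is formed from those two group signals alone.

```python
def block_signals(g, p):
    """Group generate/propagate for one block of per-bit signals (LSB first)."""
    group_g, group_p = 0, 1
    for gi, pi in zip(g, p):
        group_g = gi | (pi & group_g)  # a carry emerges from the block with carry-in 0
        group_p &= pi                  # the block passes a carry only if every bit does
    return group_g, group_p


def hierarchical_add(a, b, width, block=4, cin=0):
    """Two-level carry-lookahead addition; returns (sum modulo 2**width, carry_out)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    total, carry = 0, cin
    for start in range(0, width, block):
        gb, pb = g[start:start + block], p[start:start + block]
        # First level: sum bits inside the block, starting from the block carry-in.
        c = carry
        for k, (gi, pi) in enumerate(zip(gb, pb)):
            total |= (pi ^ c) << (start + k)
            c = gi | (pi & c)
        # Second level: the next block's carry-in needs only the group signals.
        group_g, group_p = block_signals(gb, pb)
        carry = group_g | (group_p & carry)
    return total, carry


print(hierarchical_add(0xFFFF, 0x0001, 16))  # (0, 1)
```

The loop visits blocks one after another; in hardware the group signals of all blocks are available at the same time, so a second-level lookahead unit can produce every block carry-in in parallel.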


Conclusions

Split LUT structure enhances performance at minor (2.5%) area penalty

Increased carry-chain speed and avoidance of global interconnect improve long-integer operation performance

Line-loading overhead from extra fast-chains is very small

This device shows promise for performing cryptographic operations.


Thank You for Listening

Any Questions?

Scott [email protected]

http://www.ece.uvic.ca/~smiller
