Page 1: Kris Gaj Electrical and Computer Engineering George Mason University Towards secure cryptographic transformations efficient in both software and hardware:

Kris Gaj, Electrical and Computer Engineering, George Mason University

Towards secure cryptographic transformations efficient in both software and hardware:

A case for synergy among math, computing, and engineering

http://ece.gmu.edu/crypto-text.htm

Page 2:

Motivation

Page 3:

Criteria used to evaluate cryptographic transformations

Security

Software efficiency

Hardware efficiency

Flexibility

Page 4:

Flexibility

• Additional key sizes and block sizes

• Ability to function efficiently and securely in a wide variety of platforms and applications:
  • low-end smartcards, wireless: small memory requirements
  • IPSec, ATM: small key setup time in hardware
  • B-ISDN, satellite communication: high encryption speed

Page 5:

Advanced Encryption Standard (AES) Contest, 1997-2001

June 1998: 15 candidates from USA, Canada, Belgium, France, Germany, Norway, UK, Israel, Korea, Japan, Australia, Costa Rica

Round 1 (criteria: security, software efficiency, flexibility)

August 1999: 5 final candidates: Mars, RC6, Rijndael, Serpent, Twofish

Round 2 (criteria: security, hardware efficiency)

October 2000: 1 winner: Rijndael (Belgium)

Page 6:

Europe: NESSIE Project (New European Schemes for Signatures, Integrity, and Encryption), 2000-2002

Japan: CRYPTREC Project, 2000-2002

Page 7:

NESSIE, CRYPTREC

Multiple types of transformations:

• Symmetric-key block ciphers
• Stream ciphers
• Hash functions
• MACs
• Asymmetric encryption schemes
• Asymmetric digital signature schemes
• Asymmetric identification schemes

Development of a methodology for the fair evaluation and comparison of algorithms belonging to the same class, including software and hardware efficiency

Page 8:

Speed of the final AES candidates in hardware

[Figure: bar chart of speed [Mbit/s], axis 0 to 500, for Serpent, Rijndael, Twofish, RC6, and Mars. Source: K. Gaj, P. Chodowiec, AES3, April 2000]

Page 9:

Survey filled by 167 participants of the Third AES Conference, April 2000

[Figure: bar chart of # votes, axis 0 to 100, for Serpent, Rijndael, Twofish, RC6, and Mars]

Page 10:

Results of the NSA group

[Figure: bar chart of hardware speed [Mbit/s], axis 0 to 700, for Serpent, Rijndael, Twofish, RC6, and Mars, comparing NSA ASIC with GMU FPGA. Bar labels as extracted: 606, 414, 202, 105, 103, 57, 431, 177, 143, 61]

Page 11:

Efficiency in software: NIST-specified platform

[Figure: bar chart of speed [Mbit/s], axis 0 to 30, for Serpent, Rijndael, Twofish, RC6, and Mars with 128-bit, 192-bit, and 256-bit keys; 200 MHz Pentium Pro, Borland C++]

Page 12:

NIST Report: Security

[Figure: candidates placed by security margin (adequate to high) against complexity (simple to complex): Rijndael; MARS, Serpent, Twofish; RC6]

Page 13:

Security: Theoretical attacks better than exhaustive key search

[Figure: bar chart of # of rounds in the attack / total # of rounds, axis 0 to 35, for Twofish, Serpent, Rijndael, RC6, and Mars without 16 mixing rounds. Bar labels as extracted: 6 16, 329, 7 10, 15 20, 1611, 23, 10, 5, 3, 5]

Page 14:

Security: Theoretical attacks better than exhaustive key search

[Figure: bar chart of # of rounds in the attack / total # of rounds as a percentage, axis 0 to 100, for Twofish, Serpent, Rijndael, RC6, and Mars. Bar labels as extracted: 28% 72%, 38% 62%, 69% 31%, 70% 30%, 75% 25%]

Page 15:

Security and hardware speed for hash functions

[Figure: bar chart of speed in hardware [Mbit/s], axis 0 to 700, for SHA-1 and SHA-512; bar labels 359 and 610. GMU team, May 2002]

Complexity of the best attack: 2^80 for SHA-1 (the same as Skipjack), 2^256 for SHA-512 (the same as AES-256)

Page 16:

What’s more important: software or hardware?

Page 17:

Historical view

[Figure: timeline from 1970 to 2000 with one column for secret-key ciphers and one for hash functions]

Secret-key ciphers:

• DES: optimized for hardware
• Fast Software Encryption: ciphers optimized for software, e.g., RC5, Blowfish, RC4
• AES: optimized for software and hardware

Hash functions:

• MD4 family: optimized primarily for software
• DES-based hash functions: optimized for hardware

Page 18:

Software or hardware?

[Figure: comparison of SOFTWARE and HARDWARE implementations on the following properties]

• security of data during transmission
• flexibility (new cryptoalgorithms, protection against new attacks)
• speed
• random key generation
• access control to keys
• tamper resistance (viruses, internal attacks)
• low cost

Page 19:

Efficiency indicators

Page 20:

Primary efficiency indicators

Software: Speed, Memory

Hardware: Speed, Area

Memory

Power consumption

Page 21:

Efficiency parameters

Latency: time to encrypt/decrypt a single block of data (Mi → Ci)

Throughput = Speed: number of bits encrypted/decrypted in a unit of time (Mi, Mi+1, Mi+2 → Ci, Ci+1, Ci+2)

Throughput = Block_size · Number_of_blocks_processed_simultaneously / Latency
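The throughput formula can be sketched as a small C helper. The function name, the choice of nanoseconds for latency, and the Mbit/s output unit are assumptions made for this illustration only.

```c
/* Throughput in Mbit/s for a unit that processes `blocks_in_flight`
 * blocks simultaneously, each `block_size_bits` wide, with the given
 * latency in nanoseconds. One bit per nanosecond equals 1 Gbit/s,
 * hence the factor of 1000 to reach Mbit/s. */
double throughput_mbit_per_s(unsigned block_size_bits,
                             unsigned blocks_in_flight,
                             double latency_ns)
{
    return 1000.0 * (double)block_size_bits * (double)blocks_in_flight
           / latency_ns;
}
```

With hypothetical numbers, a 128-bit block and a latency of 320 ns give 400 Mbit/s for one block at a time, and 6400 Mbit/s when 16 blocks are processed simultaneously.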

Page 22:

What’s more important: Speed or area?

Page 23:

Non-Feedback Cipher Modes: ECB, counter

Page 24:

Comparison for non-feedback cipher modes, e.g., Counter Mode (CTR)

Ci = Mi ⊕ E(IV+i) for i=0..N

[Figure: CTR mode block diagram. The counter values IV, IV+1, IV+2, ..., IV+N-1, IV+N are each encrypted by E and XORed with the message blocks M0, M1, M2, ..., MN-1, MN to give the ciphertext blocks.]
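A minimal sketch of counter mode in C follows. The 32-bit block size and the function `toy_E` are hypothetical stand-ins for a real block cipher and are not secure; they only show that every block depends on IV+i alone, which is why CTR blocks can be processed in parallel or in a pipeline.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the block cipher E: a keyed 32-bit mixing
 * function, NOT a secure cipher. CTR never needs the inverse of E. */
static uint32_t toy_E(uint32_t key, uint32_t x)
{
    x ^= key;
    x *= 0x9E3779B1u;
    x ^= x >> 15;
    return x;
}

/* CTR mode: out[i] = in[i] XOR E(iv + i). The same routine encrypts
 * and decrypts, and no iteration depends on a previous one. */
void ctr_xcrypt(uint32_t key, uint32_t iv,
                const uint32_t *in, uint32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i] ^ toy_E(key, iv + (uint32_t)i);
    }
}
```

Calling `ctr_xcrypt` twice with the same key and IV returns the original data, and any single block can be processed on its own by passing `iv + i` as the starting counter.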

Page 25:

Increasing speed by parallel processing

[Figure: several encryption/decryption units operating in parallel]

Page 26:

Increasing speed using pipelining

[Figure: Cipher 1 and Cipher 2 unrolled into pipeline stages (round 1, round 2, ...), one with 10 rounds and one with 16 rounds; each stage fits the target clock period, e.g., 20 ns]

Speed = block size / target_clock_period
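As a worked instance of the speed formula, assuming the 128-bit block size shared by the AES candidates and the 20 ns example clock period from the slide:

```latex
\text{Speed} = \frac{\text{block size}}{\text{target clock period}}
             = \frac{128\ \text{bit}}{20\ \text{ns}}
             = 6.4\ \text{Gbit/s}
```

The result does not depend on the number of rounds, because a fully pipelined unit accepts a new block every clock cycle.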

Page 27:

Pipelined operation of the encryption unit

[Figure: timing diagram for clock cycles 1 to 16 showing which of the blocks B1, B2, ..., B16 occupies each pipeline stage in each cycle]

Page 28:

Encryption in non-feedback modes (ECB, counter), decryption in all modes

[Figure: speed [Mbit/s] (0 to 7000) against area [CLB slices] (0 to 60000) for Serpent, Twofish, RC6, Rijndael, and Mars; a level of 6.4 Gbit/s is marked]

Assuming clock frequency = 50 MHz

Page 29:

Our Results: Full mixed pipelining

[Figure: bar chart of throughput [Gbit/s], axis 0 to 18, on Virtex FPGA for Serpent, Rijndael, Twofish, and RC6. Bar labels as extracted: 16.8, 15.2, 13.1, 12.2]

Page 30:

Our Results: Full mixed pipelining

[Figure: bar chart of area [CLB slices], axis 0 to 50000, for Serpent, Rijndael, Twofish, and RC6. Bar labels as extracted: 19,700, 21,000, 46,900, 12,600, 80 RAMs (dedicated memory blocks, RAMs)]

Page 31:

NIST Report + GMU Report: Hardware Efficiency

Non-feedback cipher modes: ECB, CTR

[Figure: candidates placed by speed (low, medium, high) against area (small, medium, large): Rijndael, Serpent, Twofish; RC6, Mars]

Page 32:

Feedback cipher modes: CBC, CFB, OFB

Page 33:

Feedback cipher modes - CBC

C1 = E(M1 ⊕ IV)

Ci = E(Mi ⊕ Ci-1) for i=2..N

[Figure: CBC block diagram. Each message block M1, M2, M3, ..., MN-1, MN is XORed with the previous ciphertext block (IV for the first block) and encrypted by E to give C1, C2, C3, ..., CN-1, CN.]
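The CBC equations can be sketched in C as follows. The 32-bit block size and the pair `toy_encrypt_block` / `toy_decrypt_block` are hypothetical and not secure; they stand in for E and its inverse. The loop in `cbc_encrypt` shows the feedback that prevents pipelining: block i cannot start before block i-1 is finished.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical invertible toy cipher on 32-bit blocks (NOT secure). */
static uint32_t toy_encrypt_block(uint32_t key, uint32_t x)
{
    x ^= key;
    x = (x << 7) | (x >> 25);   /* fixed rotation left by 7 */
    return x + key;             /* addition mod 2^32 */
}

static uint32_t toy_decrypt_block(uint32_t key, uint32_t y)
{
    y -= key;
    y = (y >> 7) | (y << 25);   /* fixed rotation right by 7 */
    return y ^ key;
}

/* CBC encryption: c[0] = E(m[0] ^ iv), c[i] = E(m[i] ^ c[i-1]). */
void cbc_encrypt(uint32_t key, uint32_t iv,
                 const uint32_t *m, uint32_t *c, size_t n)
{
    uint32_t prev = iv;
    for (size_t i = 0; i < n; i++) {
        prev = toy_encrypt_block(key, m[i] ^ prev);  /* needs previous c */
        c[i] = prev;
    }
}

/* CBC decryption: m[i] = D(c[i]) ^ c[i-1], with c[-1] = iv. Each D(c[i])
 * depends only on known ciphertext, so the calls are independent. */
void cbc_decrypt(uint32_t key, uint32_t iv,
                 const uint32_t *c, uint32_t *m, size_t n)
{
    uint32_t prev = iv;
    for (size_t i = 0; i < n; i++) {
        m[i] = toy_decrypt_block(key, c[i]) ^ prev;
        prev = c[i];
    }
}
```

Because of the chaining, two equal plaintext blocks generally encrypt to different ciphertext blocks.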

Page 34:

Typical Flow Diagram of a Secret-Key Block Cipher

[Figure: flow diagram with the following elements]

• Initial transformation, with Round Key[0]; i:=1
• Cipher Round, with Round Key[i]; i:=i+1; loop test i<#rounds? (#rounds times)
• Final transformation, with Round Key[#rounds+1]

Page 35:

Basic iterative architecture

[Figure: multiplexer, register, and combinational logic implementing one round]

Page 36:

Increasing speed in cipher feedback modes

[Figure: speed against area for the basic architecture and for loop-unrolling with k=2, k=3, k=4, k=5]

Page 37:

GMU Results: Encryption in cipher feedback modes (CBC, CFB, OFB) - Virtex FPGA

[Figure: throughput [Mbit/s] (0 to 500) against area [CLB slices] (0 to 5000) for Rijndael, Serpent I8, Mars, RC6, Twofish, and Serpent I1]

Page 38:

NSA Results: Encryption in cipher feedback modes (CBC, CFB, OFB) - ASIC, 0.5 μm CMOS

[Figure: throughput [Mbit/s] (0 to 700) against area [CLB slices] (0 to 40) for Serpent I1, RC6, Twofish, Mars, and Rijndael]

Page 39:

Decreasing area by resource sharing

[Figure: before, two copies of unit F process D0 and D1 into D0’ and D1’; after, a single F is shared between D0 and D1 using a multiplexer and registers]

Page 40:

Resource sharing: Speed vs. Area

[Figure: throughput against area, marking the basic architecture and resource sharing]

Page 41:

NIST Report + GMU Report: Hardware Efficiency

Feedback cipher modes: CBC, CFB

[Figure: candidates placed by speed (low, medium, high) against area (small, medium, large): Rijndael, MARS, Serpent, Twofish, RC6]

Page 42:

Aren’t software and hardware optimizations equivalent?

Page 43:

Efficiency in software: NIST-specified platform

[Figure: bar chart of speed [Mbit/s], axis 0 to 30, for Serpent, Rijndael, Twofish, RC6, and Mars with 128-bit, 192-bit, and 256-bit keys; 200 MHz Pentium Pro, Borland C++]

Page 44:

Our Results: Basic architecture - Speed

[Figure: bar chart of throughput [Mbit/s], axis 0 to 500, for Serpent, Rijndael, Twofish, RC6, and Mars]

Page 45:

Basic atomic operations of secret-key ciphers and hash functions

Page 46:

Atomic operations used in 41 most popular secret-key ciphers (1)

B. Chetwynd, MS Thesis, WPI

Considered ciphers:

Blowfish, CAST, CAST-128, CAST-256, CRYPTON, CS-Cipher, DEAL, DES, DFC, E2, FEAL, FROG, GOST, Hasty Pudding, ICE, IDEA, Khafre, Khufu, LOKI91, LOKI97, Lucifer, MacGuffin, MAGENTA, MARS, MISTY1, MISTY2, MMB, RC2, RC5, RC6, Rijndael, SAFER K, SAFER+, Serpent, SQUARE, SHARK, Skipjack, TEA, Twofish, WAKE, WiderWake

Page 47:

Major atomic operations used in 41 most popular secret-key ciphers (2)

B. Chetwynd, MS Thesis, WPI

[Figure: bar chart, axis 0 to 40, for S-box, variable rotation, modular multiplication, GF(2^n) multiplication, and modular inversion. Bar labels as extracted: 30, 10, 7, 7, 1]

Page 48:

Auxiliary atomic operations used in 41 most popular secret-key ciphers (3)

B. Chetwynd, MS Thesis, WPI

[Figure: bar chart, axis 0 to 40, for Boolean (XOR, AND, OR, etc.), fixed rotation, modular addition & subtraction, and permutation. Bar labels as extracted: 40, 25, 20, ?]

Page 49:

Major cipher operations (1) - S-box

Software

C:   WORD S[1<<n]={ 0x23, 0x34, 0x56, ... }

ASM: S DW 23H, 34H, 56H, ...

Hardware

S-box n x m, with inputs x1, x2, ..., xn and outputs y1, y2, ..., ym, implemented either as a ROM (2^n words, n-bit address, m-bit output, 2^n · m bits) or as direct logic
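A software S-box is a table lookup. The sketch below applies one 4x4 S-box to each nibble of a 32-bit word; the table values are an arbitrary permutation of 0..15 chosen for illustration, and the names `SBOX4` and `sbox_layer32` are hypothetical.

```c
#include <stdint.h>

/* Example 4x4 S-box (4-bit input, 4-bit output): 2^4 entries. The values
 * are an arbitrary permutation of 0..15 used only for illustration. */
static const uint8_t SBOX4[16] = {
    3, 8, 15, 1, 10, 6, 5, 11, 14, 13, 4, 2, 7, 0, 9, 12
};

/* Apply the same S-box to each of the eight nibbles of a 32-bit word.
 * Software reuses one table for all positions; hardware that wants all
 * lookups in the same clock cycle needs one copy per position. */
uint32_t sbox_layer32(uint32_t x)
{
    uint32_t y = 0;
    for (int i = 0; i < 8; i++) {
        uint32_t nibble = (x >> (4 * i)) & 0xFu;
        y |= (uint32_t)SBOX4[nibble] << (4 * i);
    }
    return y;
}
```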

Page 50:

S-box: Memory in hardware

32 S-boxes 4x4 (32 x 4 = 128 bits): Memory = 32 · 2^4 · 4 bits = 2 kbit

16 S-boxes 8x8 (16 x 8 = 128 bits): Memory = 16 · 2^8 · 8 bits = 32 kbit = 16 · 2 kbit

Page 51:

S-box: Memory in software

32 S-boxes 4x4 (32 x 4 = 128 bits): Memory = 2^4 · 4 bits = 64 bits

16 S-boxes 8x8 (16 x 8 = 128 bits): Memory = 2^8 · 8 bits = 2 kbit = 32 · 64 bits

Page 52:

Major cipher operations (2) – Variable Rotation

A <<< B (variable rotation, ROL32)

Software

C:   C = (A << B) | (A >> (32-B));   (valid for 0 < B < 32)

ASM: ROL A, B

Hardware

Mux-based shifter: the 32-bit input A passes through multiplexer stages controlled by B[4], B[3], B[2], B[1], B[0]; the first stage selects between A<<<0 and A<<<16

High-speed clock: min (B, 32-B) cycles of a fast clock CLK’
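The one-line C rotation shifts by 32 when B is 0, which is undefined behavior for a 32-bit operand in C. A minimal sketch of a safe version is given below, together with a software model of the mux-based shifter; the function names are hypothetical, and the stage order in the model is one possible arrangement.

```c
#include <stdint.h>

/* Rotate left with the count reduced mod 32. Masking both shift counts
 * avoids the undefined 32-bit shift that (a >> (32 - b)) performs
 * when b == 0. */
uint32_t rol32(uint32_t a, unsigned b)
{
    b &= 31u;
    return (a << b) | (a >> ((32u - b) & 31u));
}

/* Software model of a mux-based shifter: five stages, where stage k
 * rotates by the fixed amount 2^k if bit k of b is set and passes the
 * value through otherwise. */
uint32_t rol32_mux(uint32_t a, unsigned b)
{
    for (unsigned k = 0; k < 5; k++) {
        unsigned amount = 1u << k;                 /* 1, 2, 4, 8, 16 */
        uint32_t rotated = (a << amount) | (a >> (32u - amount));
        a = ((b >> k) & 1u) ? rotated : a;         /* 2-to-1 multiplexer */
    }
    return a;
}
```

Each stage in the model is a fixed rotation, which in hardware is only wiring, followed by a 2-to-1 multiplexer, so the delay grows with the number of stages and not with the rotation amount.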

Page 53:

Major cipher operations (3) – Modular Multiplication

C = A·B mod 2^n, n = 32, 16

Software

C:   unsigned long A, B, C; C = A*B;

ASM: MUL

Hardware

Half-multiplier: two n-bit inputs A and B, n-bit output C

Page 54:

Major cipher operations (4) - Multiplication in the Galois Field GF(2^m)

Y = C · X in GF(2^8), C = const

Software

C:   <<, ^, |, &  or  alog[(log[X]+log[C])%255]

ASM: ROL, XOR, OR, AND  or  ALOG DW 3H, 5H, …  LOG DW 7H, 9H, …

Hardware

MUL GF(2^8) with 8-bit input X and 8-bit output Y: each output bit y0, ..., y7 is an XOR of selected input bits (x0, x3, x4, x7, ...)
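The shift-and-XOR software variant can be sketched as follows. The reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B, the one used by Rijndael) is an assumption, since the slide does not name one, and the function name is hypothetical.

```c
#include <stdint.h>

/* Multiply two elements of GF(2^8) by shift-and-XOR. The reduction
 * polynomial 0x11B is an assumption made for this sketch. */
uint8_t gf256_mul(uint8_t x, uint8_t c)
{
    uint8_t y = 0;
    while (c != 0) {
        if (c & 1u) {
            y ^= x;                  /* addition in GF(2^8) is XOR */
        }
        uint8_t carry = x & 0x80u;   /* bit that falls out on the shift */
        x = (uint8_t)(x << 1);       /* multiply x by the polynomial "x" */
        if (carry) {
            x ^= 0x1Bu;              /* reduce modulo 0x11B */
        }
        c >>= 1;
    }
    return y;
}
```

When C is a constant, as on the slide, the loop unrolls into a fixed set of XORs of input bits, which is exactly what the hardware version wires up.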

Page 55:

Auxiliary cipher operations (1) - Permutation

Software

C:   complex sequence of instructions <<, |, &

ASM: complex sequence of instructions ROL, OR, AND

Hardware

Permutation P on n bits: order of wires between inputs x1, x2, x3, ..., xn-1, xn and outputs y1, y2, y3, ..., yn-1, yn
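The "complex sequence of instructions" can be made concrete with a generic bit permutation in C. The function name and the table convention (output bit i takes input bit `perm[i]`) are assumptions for this sketch.

```c
#include <stdint.h>

/* Generic bit permutation: output bit i takes input bit perm[i].
 * In hardware this is only a reordering of wires; in software every
 * bit costs a shift, a mask, a second shift, and an OR. */
uint32_t permute32(uint32_t x, const uint8_t perm[32])
{
    uint32_t y = 0;
    for (unsigned i = 0; i < 32; i++) {
        y |= ((x >> perm[i]) & 1u) << i;
    }
    return y;
}
```

A table with `perm[i] = 31 - i`, for example, reverses the bit order of the word.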

Page 56:

Auxiliary cipher operations (2) - Fixed rotation

A <<< n (fixed rotation, ROL32)

Software

C:   C = (A << n) | (A >> (32-n));

ASM: ROL A, n

Hardware

Order of wires between inputs x1, x2, x3, ..., xn-1, xn and outputs y1, y2, y3, ..., yn-1, yn

Page 57:

Auxiliary cipher operations (3) - Boolean operations

Software

C:   A ^ B, A & B, A | B

ASM: XOR A, B; AND A, B; OR A, B

Hardware

XOR, AND, OR on two n-bit inputs A and B with n-bit output Y: one gate per bit position, from (a0, b0) → y0 to (an-1, bn-1) → yn-1

Page 58:

Auxiliary cipher operations (4) - Addition/subtraction

C = A+B mod 2^n, n = 32, 16

Software

C:   unsigned long A, B, C; C = A+B;

ASM: ADD

Hardware

Adder/subtractor: two n-bit inputs A and B, n-bit output C

Page 59:

Multiple designs for hardware adders

[Figure: adder designs placed by delay against area]

• Ripple carry adder (RC)
• Carry-Skip adder (CS)
• Carry-LookAhead adder (CLA)
• Carry-Select adder
• Parallel-Prefix Network adder (Kogge-Stone, Brent-Kung)

Page 60:

Delay and area in HARDWARE: Basic operations

[Figure: operations placed by delay against area: Boolean, permutation, fixed rotation, addition (RC), addition (CLA), variable rotation, GF(2^n) multiplication, modular multiplication, S-box 4x4, S-box 8x8, S-box 9x32, modular inverse]

Page 61:

Delay and area in SOFTWARE: Basic operations

[Figure: operations placed by delay against memory: Boolean, permutation, fixed rotation, addition, multiplication, variable rotation, GF(2^n) multiplication, S-box 4x4, S-box 8x8, S-box 9x32, modular inverse]

Page 62:

Major operations of AES finalists

[Table: columns Mars, Twofish, Serpent, RC6, Rijndael; rows S-boxes, Integer multiplication, Variable rotation, Multiplication in GF(2^m)]

Page 63:

Auxiliary operations of AES finalists

[Table: columns Mars, Twofish, Serpent, RC6, Rijndael; rows Boolean, Addition/subtraction, Permutation, Fixed rotation]

Page 64:

Delay and area in HARDWARE: MARS – IBM team

[Figure: operations placed by delay against area: Boolean, permutation, fixed rotation, addition (RC), addition (CLA), variable rotation, GF(2^n) multiplication, modular multiplication, S-box 4x4, S-box 8x8, S-box 9x32, modular inverse]

Page 65:

Delay and area in HARDWARE: Serpent – R. Anderson, E. Biham, L. Knudsen

[Figure: operations placed by delay against area: Boolean, permutation, fixed rotation, addition (RC), addition (CLA), variable rotation, GF(2^n) multiplication, modular multiplication, S-box 4x4, S-box 8x8, S-box 9x32, modular inverse]

Page 66:

Delay and area in HARDWARE: Rijndael – V. Rijmen, J. Daemen

[Figure: operations placed by delay against area: Boolean, permutation, fixed rotation, addition (RC), addition (CLA), variable rotation, GF(2^n) multiplication, modular multiplication, S-box 4x4, S-box 8x8, S-box 9x32, modular inverse]

Page 67:

Delay and area in SOFTWARE: MARS – IBM team

[Figure: operations placed by delay against memory: Boolean, permutation, fixed rotation, addition, multiplication, variable rotation, GF(2^n) multiplication, S-box 4x4, S-box 8x8, S-box 9x32, modular inverse]

Page 68:

Operations efficient in both software and hardware: Summary

[Figure: operations placed on a grid with a software axis and a hardware axis, each running from fast & compact to slow or big: permutation, addition, GF(2^n) multiply, multiplication, S-box, Boolean, fixed rotation, variable rotation, modular inverse]

Page 69:

Types of ciphers

Page 70:

AES: Types of candidate algorithms

Feistel Networks: Twofish, E2, DFC, Deal, LOKI97, Magenta

Modified Feistel Network: RC6, MARS, CAST-256

Substitution-Linear Transformation Networks: Rijndael, Serpent, Safer+, Crypton

Others: Frog, HPC

Page 71:

Feistel Network: Single Round of Twofish

[Figure: round diagram with inputs D[3], D[2], D[1], D[0], the F-function with round keys K2r+8 and K2r+9, rotations <<< 1 and >>> 1, and outputs D’[3], D’[2], D’[1], D’[0]; a legend marks units shared between encryption and decryption]

Page 72:

Modified Feistel Network: Single Round of MARS

[Figure: round diagram with inputs D[3], D[2], D[1], D[0], the E-function (input in, outputs out1, out2, out3, keys k and k’), rotation <<<13, and outputs D’[3], D’[2], D’[1], D’[0]; a legend marks units shared between encryption and decryption]

k = K[4+2i], k’ = K[5+2i], i - round no.

Page 73:

Substitution-Linear Transformation Network: Single Round of Serpent

[Figure: 128-bit data path with round key K[i], S-boxes, and Linear Transformation; a legend marks units shared between encryption and decryption]

Page 74:

Substitution-Linear Transformation Network: Serpent in Hardware

[Figure: 128-bit data path with initial permutation, encryption block (keys K0, ... , K7, K32), decryption block (keys K32, ... , K7, K0), and final permutation]

Page 75:

Substitution-Linear Transformation Network: Rijndael in Hardware

[Figure: encryption path (affine transformation, ShiftRow, MixColumn, subkey) and decryption path (inverse affine transformation, InvShiftRow, subkey, InvMixColumn), both using Inversion in GF(2^8); a legend marks units shared between encryption and decryption]

Page 76:

Number and complexity of rounds

Page 77:

Number vs. complexity of a round

[Figure: ciphers placed by number of rounds (axis 10 to 50) against complexity of a round: Triple DES, DES, Serpent, Rijndael, Mars, RC6, Twofish]

Page 78:

Complexity of the cipher round in hardware

[Figure: horizontal bars of time in hardware [ns], axis 0 to 100, for a regular round of Serpent, Rijndael, Twofish, RC6, and Mars. Component labels as extracted: S-box 4x4, XOR7; S-box 8x8, XOR6, XOR5, XOR4; 6 S-boxes 4x4, 2 ADD32, XOR5, XOR4, 9 XOR2; SQR32, 2 ADD32, ROT32; MUL32, ADD32, ROT32, ADD32, 2 XOR2; multiplexers: 4 MUX2, 4 MUX2, 2 MUX2, MUX2, 2 MUX2. Source: K. Gaj, P. Chodowiec, April 2000]

Page 79:

Security margin: Theoretical attacks better than exhaustive key search

[Figure: bar chart of # of rounds in the attack / total # of rounds, axis 0 to 35, for Twofish, Serpent, Rijndael, RC6, and Mars without 16 mixing rounds. Bar labels as extracted: 6 16, 329, 7 10, 15 20, 1611, 23, 10, 5, 3, 5]

Page 80:

Making all rounds identical

Page 81:

Serpent: Hardware Architecture I8

[Figure: 128-bit register followed by eight unrolled rounds, from round 0 (key K0, 32 x S-box 0, linear transformation) to round 7 (key K7, 32 x S-box 7, linear transformation), with key K32 and a 128-bit output]

One implementation round of Serpent = 8 regular cipher rounds

Page 82:

Serpent – Hardware Architecture I1

[Figure: 128-bit register followed by one regular Serpent round: key Ki, S-box layers 32 x S-box 0, 32 x S-box 1, ..., 32 x S-box 7 feeding an 8-to-1 128-bit multiplexer, and the linear transformation, with key K32 and a 128-bit output]

Page 83:

GMU Results: Encryption in cipher feedback modes (CBC, CFB, OFB) - Virtex FPGA

[Figure: throughput [Mbit/s] (0 to 500) against area [CLB slices] (0 to 5000) for Rijndael, Serpent I8, Mars, RC6, Twofish, Serpent I1, and Serpent with all S-boxes identical]

Page 84:

Parallelism

Page 85:

Parallelism in SHA-1

[Figure: two consecutive SHA-1 steps drawn one below the other. In each step the 32-bit words A, B, C, D, E are updated using ROTL5, ft, ROTL30, and four additions that include Kt and Wt.]

Operations from two different steps that can be performed in parallel
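A single SHA-1 step can be sketched in C to show where the parallelism comes from. The step equations follow the published SHA-1 definition; the grouping into independent operations is only an illustration, and the names `sha1_step`, `f_ch`, and `rotl32` are hypothetical.

```c
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned n)   /* valid for 0 < n < 32 */
{
    return (x << n) | (x >> (32u - n));
}

/* Round function f_t of SHA-1 for steps 0..19 ("choose"); later steps
 * use parity and majority functions instead. */
static uint32_t f_ch(uint32_t b, uint32_t c, uint32_t d)
{
    return (b & c) | (~b & d);
}

/* One SHA-1 step on the state s = {A, B, C, D, E}. */
void sha1_step(uint32_t s[5], uint32_t kt, uint32_t wt)
{
    /* Neither result below depends on the new A, so both can be
     * computed in parallel with the rotation and f_t. */
    uint32_t new_c = rotl32(s[1], 30);
    uint32_t partial = s[4] + kt + wt;

    uint32_t new_a = rotl32(s[0], 5) + f_ch(s[1], s[2], s[3]) + partial;

    s[4] = s[3];
    s[3] = s[2];
    s[2] = new_c;
    s[1] = s[0];
    s[0] = new_a;
}
```

Only the new A carries a long dependency chain into the next step; B, C, D, and E are copies or a fixed rotation of old values, so part of the next step can start before the current one finishes.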

Page 86:

Executing SHA-1 on a 7-way superscalar processor

A. Bosselaers, R. Govaerts, J. Vandewalle, 1997

[Figure: schedule of the rotations ROL1, ROL5, and ROL30 belonging to steps n, n+1, n+2, n+3, n+4]

Page 87:

Number of operations that can be executed in parallel for various hash functions

[Figure: bar chart, axis 0 to 8, for SHA-1, RIPEMD160, RIPEMD128, RIPEMD, MD5, and MD4]

A. Bosselaers, R. Govaerts, J. Vandewalle, 1997

Page 88:

Optimization tricks

Page 89:

Rijndael round: Table-lookup implementation

[Figure: 4 x 4 state matrix with bytes a0,0 ... a3,3, lookup tables T0, T1, T2, T3, and output columns b0, b1, b2, b3]

b2 = x0,2 ⊕ x1,2 ⊕ x2,2 ⊕ x3,2 ⊕ k2

Speed-up in software: ~ 100 times

Speed-up in hardware: ~ 20%

Page 90:

Serpent: Bit-slice implementation

[Figure: 32 copies of the S-box S, numbered k = 0, 1, 2, 3, ..., 31; copy k has inputs x1(k), x2(k), x3(k), x4(k) and output y1(k)]

e.g. y1(k) = f (x1(k), x2(k), x3(k), x4(k)) = (x1(k) AND x2(k)) XOR (x3(k) OR x4(k))

With each word holding bit positions (31), (30), ..., (1), (0):

u1 = x1 AND x2

v1 = x3 OR x4

y1 = u1 XOR v1

32 x 4 = 128 bits
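The bit-slice idea can be sketched in C for the example function above. The function names are hypothetical; the point is that three word-wide instructions evaluate the same Boolean function for 32 S-boxes at once.

```c
#include <stdint.h>

/* Bit-slice evaluation of y1 = (x1 AND x2) XOR (x3 OR x4) for 32
 * S-boxes at once. Bit k of each word belongs to S-box number k. */
uint32_t bitslice_y1(uint32_t x1, uint32_t x2, uint32_t x3, uint32_t x4)
{
    uint32_t u1 = x1 & x2;   /* 32 AND gates in one instruction */
    uint32_t v1 = x3 | x4;   /* 32 OR gates in one instruction */
    return u1 ^ v1;          /* 32 XOR gates in one instruction */
}

/* Reference: the same function for one S-box, one bit per argument. */
unsigned single_y1(unsigned x1, unsigned x2, unsigned x3, unsigned x4)
{
    return ((x1 & x2) ^ (x3 | x4)) & 1u;
}
```

No table lookup is involved, so the S-box costs a handful of logic instructions per output bit instead of 32 separate memory accesses.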

Page 91:

The proposed approach

Page 92:

Cipher design methodology (1)

1. Choose one or at most two major operations efficient in both software and hardware

best choice: S-box 4x4, GF(2^n) multiplication

2. Choose one or at most two auxiliary operations efficient in both software and hardware

best choice: Boolean, fixed rotation

3. Choose a cipher type that enables maximum sharing between encryption and decryption

best choice: Feistel network, modified Feistel network

Page 93:

Cipher design methodology (2)

4. Design a round taking into account a trade-off between
• round complexity
• number of rounds necessary to guarantee a sufficient security margin

5. Make each round [possibly] identical

negative examples: Serpent, Mars

6. Look for parallelism within a round and among consecutive rounds

positive example: SHA-1

7. Look for optimization tricks

positive examples: table look-up in Rijndael, bit-slice implementation in Serpent

Page 94:

[Figure: Mathematicians, Computer scientists, Computer engineers; Security, Software efficiency, Hardware efficiency, Flexibility]

Page 95:

$A100 Challenges

For mathematicians:

Prove or disprove that Serpent with
• all S-boxes identical
• 16 rounds
is at least as secure as Rijndael

For computer scientists:

Is there a way of using instruction-level parallelism to speed up a software implementation of [modified] Serpent to make it as fast as Rijndael?

Page 96:

$A50 Challenge

For computer scientists:

What is the level of parallelism present in SHA-256, SHA-384, SHA-512?

For mathematicians:

Is there a way of changing Serpent into a modified Feistel network cipher without losing its security properties?