23
Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio Yunjie Yi, Guang Gong, Kalikinkar Mandal {yunjie.yi,ggong,kmandal}@uwaterloo.ca Department of Electrical and Computer Engineering University of Waterloo Waterloo, ON, N2L 3G1, CANADA November 4 - 6, 2019 Third Lightweight Cryptography Workshop 2019 at NIST

Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

  • Upload
    others

  • View
    10

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

Implementation of Three LWC Schemes in the WiFi4-Way Handshake with Software Defned Radio

Yunjie Yi Guang Gong Kalikinkar Mandal yunjieyiggongkmandaluwaterlooca

Department of Electrical and Computer Engineering University of Waterloo

Waterloo ON N2L 3G1 CANADA

November 4 - 6 2019

Third Lightweight Cryptography Workshop 2019 at NIST

Outline

1 Background - SPIX ACE and WAGE algorithms - IEEE 8021a PHY 4-way handshake and data protection - Software defned radio

2 Implementation of Three LWC Schemes - KDF and MIC generation - Implementation on microcontrollers

3 Performance Results - KDF and MIC for 4-way handshake - SPIX ACE and WAGE for data protection

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 121

SPIX An authenticated encryption algorithm

Design1

Adopts the monkey duplex mode of operation Underlying permutation sLiSCP-light-2562

128-bit security with straight out parameters key size = tag size = nonce size = security level

Data limit 263 bytes before re-keying is done

P18

64

P18 P18 P9

M0 MlM minus2 MlM minus1

AD0 ADlAD minus 1 K0 K1 C0

P9 P9

K0 K1 ClM minus2 ClM minus1

tagextract(S)

128 init(N K)

192

0x00 0x00 0x01 0x01 0x02 0x02 0x00

Processing associated data Encryption Initialization Finalization

P9 P9 P18 P18

0x02 0x00

SPIX AE algorithm

1 Altawy R Gong G He M Mandal K and Rohit R SPIX An authenticated cipher NIST LWC round 2

candidate (2019) 2

Altawy R Rohit R He M Mandal K Yang G and Gong G sLiSCP-light Towards hardware optimized sponge-specifc cryptographic permutations ACM Trans Embedded Computing Systems (2018)

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 221

SPIX (Cont) sLiSCP-light permutation round function

State Xi = (Xi Xi Xi Xi 3210 ) with size 256 bits

Simeck box SB-2n where n = 32

Xi0 Xi

1 Xi2 Xi

3SB-2n SB-2n

rci0 rci1

sci0 sci1

2n 2n

2n 2n

sLiSCP-light permutation step function

x1 x0

f(501) (L5(x) x) oplus L1(x)

nn

rc = (cuminus1 middot middot middot c1 c0)

n n

Simeck (SB-2n)

321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

ACE An AE and hash algorithm Design goals

Both AE and Hash functionalities with a single hardware footprint 128-bit security level with good security margins Balance among hardware cost and software eyumlciency

ACE ACE ACE ACE ACE ACE ACE ACE ACE ACEload-AE(NK)

256

64

tagextract(S)

128

0x00 0x00 0x01 0x01 0x02 0x02 0x02 0x00 0x00

K0 K1 AD0 ADlADminus 1 K0 K1

M0 MlMminus2 MlMminus1

C0 ClMminus2 ClMminus1

Initialization Processing associated data Encryption Finalization

(a) ACE Authenticated encryption algorithm

ACE ACE ACE ACE ACE ACE ACEload-H(IV )

256

64

M0 M1 MlMminus1 0 0 0 0H0 H1 H2 H3

Initialization Absorbing Squeezing

(b) ACE Hash algorithm

421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

ACE Permutation Ai Bi Ci Di Ei

64 64 64 64 64

SB-64 SB-64 SB-64rci0 rci1 rci2

156||sci0 156||sci1 156||sci2

Ai+1 Bi+1 Ci+1 Di+1 Ei+1

ACE permutation step function

Parameters1

State size 320 bits Nonlinear layer 3 SB-64 (Simeck) boxes Linear layer (˙) (32041) rounds(u)steps(s) 816 Rate positions 4 bytes of A[7 middot middot middot 4] and C[7 middot middot middot 4] 8-bit round and step constants rci

j and scij

1 Aagaard M AlTawy R Gong G Mandal K and Rohit R ACE An authenticated encryption and hash algorithm

NIST LWC round 2 candidate (2019)

521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

WAGE An authenticated encryption algorithm

Design1

Underlying permutation WAGE (over extension feld F27) of width 259 bits Round function of WAGE is constructed by tweaking the initialization

phase of the WG stream cipher Mode of operation Similar to ACE rounds 111 128-bit security low hardware cost Data limit 264 bytes

Si36 Si

35 Si34 Si

33 Si32 Si

31 Si30 Si

29 Si28 Si

27 Si26 Si

25 Si24 Si

23 Si22 Si

21 Si20 Si

19

WGP SB SB

Si17Si

18 Si16 Si

15 Si14 Si

13 Si12 Si

11 Si10 Si

9 Si8 Si

7 Si6 Si

5 Si4 Si

3 S2i Si1 Si

0

WGP SB SB

oplusω

rci1

rci0

1 Aagaard M AlTawy R Gong G Mandal K Rohit R and Zidaric N WAGE An authenticated cipher NIST LWC

round 2 candidate (2019)

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 621

IEEE 8021X 4-way handshake and data protection

4-way handshake and data protection 4-way handshake Conducts mutual entity authentication and generation

of session keys Data protection After 4-way handshake it executes a data protection

protocol either CCMP or GCMP CCMP AES in counter mode + CBC MAC GCMP AES in counter mode + polynomial hash for MAC

Supplicant Authenticator

ANonce Generate ANonce

(SNonce MICA)Generate SNonceDerive PTK

(ANonce MICS )Derive PTK

MICallInstall TKInstall TK

Figure IEEE 8021X 4-way handshake

Software Defined Radio

721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

IEEE 8021X (Cont)

Computing keys in the 4-way handshake protocol Pairwise transit key (PTK) is generated as

PTK = KDF (PMK ANonce||SNonce||AP MAC adr||STA MAC adr)

PTK = KCK||KEK||TK

- KCK is used to generate a MIC - KEK is used for encrypting the group key - TK is used for protecting trayumlc data

Message integrity codes (MICs) are generated as

MICA = MIC(KCK ANonce r) MICS = MIC(KCK SNonce r) MICall = MIC(KCK D r + 1)

where r is a replay counter number and D carries the cipher suite

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 821

Software defned radio (SDR)

SDR setup SDR has two main parts Universal Software Radio Peripheral (USRP) and

GNU radio

USRP is a hardware consisting of ADC DAC low pass fler and mixer

GNU radio is a digital signal processing (DSP) software on a Linux OS

921 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM sender in SDR

OFDM sender The digital processing part before the USRP sender are done on PC in real

time

The OFDM sender includes package header generator modulation carrier allocator cyclic prefx and so on

1021 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM receiver in SDR

OFDM receiver USRP receiver received sampled data in complex domain

The receiver contains synchronization module header and payload demux OFDM demodulation and so on

USRP-

Source

Schmidl amp Cox

OFDM synch

Frequency Mod

Sensitivity -3125m

OFDM Frame Equalizer

FFT length = 64

CP-length = 16

Delay Multiply

HeaderPayload Demux

File Sink

Pathoutputdat

Type = Byte

FFT

Size =64

Type = Complex

Vector = 64

FFT

Size =64

Type = Complex

Vector = 64

OFDM Channel

Estimation

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

Decoder

Package Header

Parser

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

DecoderRepack Bits

1121 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Our Contribution

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1221

Implementation of IEEE80211a PHY in SDR

One USRP is confgured as a supplicant and another is an authenticator

Two USRPs communicate wirelessly

1321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 2: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

Outline

1 Background - SPIX ACE and WAGE algorithms - IEEE 8021a PHY 4-way handshake and data protection - Software defned radio

2 Implementation of Three LWC Schemes - KDF and MIC generation - Implementation on microcontrollers

3 Performance Results - KDF and MIC for 4-way handshake - SPIX ACE and WAGE for data protection

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 121

SPIX An authenticated encryption algorithm

Design1

Adopts the monkey duplex mode of operation Underlying permutation sLiSCP-light-2562

128-bit security with straight out parameters key size = tag size = nonce size = security level

Data limit 263 bytes before re-keying is done

P18

64

P18 P18 P9

M0 MlM minus2 MlM minus1

AD0 ADlAD minus 1 K0 K1 C0

P9 P9

K0 K1 ClM minus2 ClM minus1

tagextract(S)

128 init(N K)

192

0x00 0x00 0x01 0x01 0x02 0x02 0x00

Processing associated data Encryption Initialization Finalization

P9 P9 P18 P18

0x02 0x00

SPIX AE algorithm

1 Altawy R Gong G He M Mandal K and Rohit R SPIX An authenticated cipher NIST LWC round 2

candidate (2019) 2

Altawy R Rohit R He M Mandal K Yang G and Gong G sLiSCP-light Towards hardware optimized sponge-specifc cryptographic permutations ACM Trans Embedded Computing Systems (2018)

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 221

SPIX (Cont) sLiSCP-light permutation round function

State Xi = (Xi Xi Xi Xi 3210 ) with size 256 bits

Simeck box SB-2n where n = 32

Xi0 Xi

1 Xi2 Xi

3SB-2n SB-2n

rci0 rci1

sci0 sci1

2n 2n

2n 2n

sLiSCP-light permutation step function

x1 x0

f(501) (L5(x) x) oplus L1(x)

nn

rc = (cuminus1 middot middot middot c1 c0)

n n

Simeck (SB-2n)

321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

ACE An AE and hash algorithm Design goals

Both AE and Hash functionalities with a single hardware footprint 128-bit security level with good security margins Balance among hardware cost and software eyumlciency

ACE ACE ACE ACE ACE ACE ACE ACE ACE ACEload-AE(NK)

256

64

tagextract(S)

128

0x00 0x00 0x01 0x01 0x02 0x02 0x02 0x00 0x00

K0 K1 AD0 ADlADminus 1 K0 K1

M0 MlMminus2 MlMminus1

C0 ClMminus2 ClMminus1

Initialization Processing associated data Encryption Finalization

(a) ACE Authenticated encryption algorithm

ACE ACE ACE ACE ACE ACE ACEload-H(IV )

256

64

M0 M1 MlMminus1 0 0 0 0H0 H1 H2 H3

Initialization Absorbing Squeezing

(b) ACE Hash algorithm

421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

ACE Permutation Ai Bi Ci Di Ei

64 64 64 64 64

SB-64 SB-64 SB-64rci0 rci1 rci2

156||sci0 156||sci1 156||sci2

Ai+1 Bi+1 Ci+1 Di+1 Ei+1

ACE permutation step function

Parameters1

State size 320 bits Nonlinear layer 3 SB-64 (Simeck) boxes Linear layer (˙) (32041) rounds(u)steps(s) 816 Rate positions 4 bytes of A[7 middot middot middot 4] and C[7 middot middot middot 4] 8-bit round and step constants rci

j and scij

1 Aagaard M AlTawy R Gong G Mandal K and Rohit R ACE An authenticated encryption and hash algorithm

NIST LWC round 2 candidate (2019)

521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

WAGE An authenticated encryption algorithm

Design1

Underlying permutation WAGE (over extension feld F27) of width 259 bits Round function of WAGE is constructed by tweaking the initialization

phase of the WG stream cipher Mode of operation Similar to ACE rounds 111 128-bit security low hardware cost Data limit 264 bytes

Si36 Si

35 Si34 Si

33 Si32 Si

31 Si30 Si

29 Si28 Si

27 Si26 Si

25 Si24 Si

23 Si22 Si

21 Si20 Si

19

WGP SB SB

Si17Si

18 Si16 Si

15 Si14 Si

13 Si12 Si

11 Si10 Si

9 Si8 Si

7 Si6 Si

5 Si4 Si

3 S2i Si1 Si

0

WGP SB SB

oplusω

rci1

rci0

1 Aagaard M AlTawy R Gong G Mandal K Rohit R and Zidaric N WAGE An authenticated cipher NIST LWC

round 2 candidate (2019)

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 621

IEEE 8021X 4-way handshake and data protection

4-way handshake and data protection 4-way handshake Conducts mutual entity authentication and generation

of session keys Data protection After 4-way handshake it executes a data protection

protocol either CCMP or GCMP CCMP AES in counter mode + CBC MAC GCMP AES in counter mode + polynomial hash for MAC

Supplicant Authenticator

ANonce Generate ANonce

(SNonce MICA)Generate SNonceDerive PTK

(ANonce MICS )Derive PTK

MICallInstall TKInstall TK

Figure IEEE 8021X 4-way handshake

Software Defined Radio

721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

IEEE 8021X (Cont)

Computing keys in the 4-way handshake protocol Pairwise transit key (PTK) is generated as

PTK = KDF (PMK ANonce||SNonce||AP MAC adr||STA MAC adr)

PTK = KCK||KEK||TK

- KCK is used to generate a MIC - KEK is used for encrypting the group key - TK is used for protecting trayumlc data

Message integrity codes (MICs) are generated as

MICA = MIC(KCK ANonce r) MICS = MIC(KCK SNonce r) MICall = MIC(KCK D r + 1)

where r is a replay counter number and D carries the cipher suite

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 821

Software defned radio (SDR)

SDR setup SDR has two main parts Universal Software Radio Peripheral (USRP) and

GNU radio

USRP is a hardware consisting of ADC DAC low pass fler and mixer

GNU radio is a digital signal processing (DSP) software on a Linux OS

921 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM sender in SDR

OFDM sender The digital processing part before the USRP sender are done on PC in real

time

The OFDM sender includes package header generator modulation carrier allocator cyclic prefx and so on

1021 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM receiver in SDR

OFDM receiver USRP receiver received sampled data in complex domain

The receiver contains synchronization module header and payload demux OFDM demodulation and so on

USRP-

Source

Schmidl amp Cox

OFDM synch

Frequency Mod

Sensitivity -3125m

OFDM Frame Equalizer

FFT length = 64

CP-length = 16

Delay Multiply

HeaderPayload Demux

File Sink

Pathoutputdat

Type = Byte

FFT

Size =64

Type = Complex

Vector = 64

FFT

Size =64

Type = Complex

Vector = 64

OFDM Channel

Estimation

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

Decoder

Package Header

Parser

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

DecoderRepack Bits

1121 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Our Contribution

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1221

Implementation of IEEE80211a PHY in SDR

One USRP is confgured as a supplicant and another is an authenticator

Two USRPs communicate wirelessly

1321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 3: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

SPIX An authenticated encryption algorithm

Design1

Adopts the monkey duplex mode of operation Underlying permutation sLiSCP-light-2562

128-bit security with straight out parameters key size = tag size = nonce size = security level

Data limit 263 bytes before re-keying is done

P18

64

P18 P18 P9

M0 MlM minus2 MlM minus1

AD0 ADlAD minus 1 K0 K1 C0

P9 P9

K0 K1 ClM minus2 ClM minus1

tagextract(S)

128 init(N K)

192

0x00 0x00 0x01 0x01 0x02 0x02 0x00

Processing associated data Encryption Initialization Finalization

P9 P9 P18 P18

0x02 0x00

SPIX AE algorithm

1 Altawy R Gong G He M Mandal K and Rohit R SPIX An authenticated cipher NIST LWC round 2

candidate (2019) 2

Altawy R Rohit R He M Mandal K Yang G and Gong G sLiSCP-light Towards hardware optimized sponge-specifc cryptographic permutations ACM Trans Embedded Computing Systems (2018)

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 221

SPIX (Cont) sLiSCP-light permutation round function

State Xi = (Xi Xi Xi Xi 3210 ) with size 256 bits

Simeck box SB-2n where n = 32

Xi0 Xi

1 Xi2 Xi

3SB-2n SB-2n

rci0 rci1

sci0 sci1

2n 2n

2n 2n

sLiSCP-light permutation step function

x1 x0

f(501) (L5(x) x) oplus L1(x)

nn

rc = (cuminus1 middot middot middot c1 c0)

n n

Simeck (SB-2n)

321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

ACE An AE and hash algorithm Design goals

Both AE and Hash functionalities with a single hardware footprint 128-bit security level with good security margins Balance among hardware cost and software eyumlciency

ACE ACE ACE ACE ACE ACE ACE ACE ACE ACEload-AE(NK)

256

64

tagextract(S)

128

0x00 0x00 0x01 0x01 0x02 0x02 0x02 0x00 0x00

K0 K1 AD0 ADlADminus 1 K0 K1

M0 MlMminus2 MlMminus1

C0 ClMminus2 ClMminus1

Initialization Processing associated data Encryption Finalization

(a) ACE Authenticated encryption algorithm

ACE ACE ACE ACE ACE ACE ACEload-H(IV )

256

64

M0 M1 MlMminus1 0 0 0 0H0 H1 H2 H3

Initialization Absorbing Squeezing

(b) ACE Hash algorithm

421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

ACE Permutation Ai Bi Ci Di Ei

64 64 64 64 64

SB-64 SB-64 SB-64rci0 rci1 rci2

156||sci0 156||sci1 156||sci2

Ai+1 Bi+1 Ci+1 Di+1 Ei+1

ACE permutation step function

Parameters1

State size 320 bits Nonlinear layer 3 SB-64 (Simeck) boxes Linear layer (˙) (32041) rounds(u)steps(s) 816 Rate positions 4 bytes of A[7 middot middot middot 4] and C[7 middot middot middot 4] 8-bit round and step constants rci

j and scij

1 Aagaard M AlTawy R Gong G Mandal K and Rohit R ACE An authenticated encryption and hash algorithm

NIST LWC round 2 candidate (2019)

521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

WAGE An authenticated encryption algorithm

Design1

Underlying permutation WAGE (over extension feld F27) of width 259 bits Round function of WAGE is constructed by tweaking the initialization

phase of the WG stream cipher Mode of operation Similar to ACE rounds 111 128-bit security low hardware cost Data limit 264 bytes

Si36 Si

35 Si34 Si

33 Si32 Si

31 Si30 Si

29 Si28 Si

27 Si26 Si

25 Si24 Si

23 Si22 Si

21 Si20 Si

19

WGP SB SB

Si17Si

18 Si16 Si

15 Si14 Si

13 Si12 Si

11 Si10 Si

9 Si8 Si

7 Si6 Si

5 Si4 Si

3 S2i Si1 Si

0

WGP SB SB

oplusω

rci1

rci0

1 Aagaard M AlTawy R Gong G Mandal K Rohit R and Zidaric N WAGE An authenticated cipher NIST LWC

round 2 candidate (2019)

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 621

IEEE 8021X 4-way handshake and data protection

4-way handshake and data protection 4-way handshake Conducts mutual entity authentication and generation

of session keys Data protection After 4-way handshake it executes a data protection

protocol either CCMP or GCMP CCMP AES in counter mode + CBC MAC GCMP AES in counter mode + polynomial hash for MAC

Supplicant Authenticator

ANonce Generate ANonce

(SNonce MICA)Generate SNonceDerive PTK

(ANonce MICS )Derive PTK

MICallInstall TKInstall TK

Figure IEEE 8021X 4-way handshake

Software Defined Radio

721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

IEEE 8021X (Cont)

Computing keys in the 4-way handshake protocol Pairwise transit key (PTK) is generated as

PTK = KDF (PMK ANonce||SNonce||AP MAC adr||STA MAC adr)

PTK = KCK||KEK||TK

- KCK is used to generate a MIC - KEK is used for encrypting the group key - TK is used for protecting trayumlc data

Message integrity codes (MICs) are generated as

MICA = MIC(KCK ANonce r) MICS = MIC(KCK SNonce r) MICall = MIC(KCK D r + 1)

where r is a replay counter number and D carries the cipher suite

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 821

Software defned radio (SDR)

SDR setup SDR has two main parts Universal Software Radio Peripheral (USRP) and

GNU radio

USRP is a hardware consisting of ADC DAC low pass fler and mixer

GNU radio is a digital signal processing (DSP) software on a Linux OS

921 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM sender in SDR

OFDM sender The digital processing part before the USRP sender are done on PC in real

time

The OFDM sender includes package header generator modulation carrier allocator cyclic prefx and so on

1021 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM receiver in SDR

OFDM receiver USRP receiver received sampled data in complex domain

The receiver contains synchronization module header and payload demux OFDM demodulation and so on

USRP-

Source

Schmidl amp Cox

OFDM synch

Frequency Mod

Sensitivity -3125m

OFDM Frame Equalizer

FFT length = 64

CP-length = 16

Delay Multiply

HeaderPayload Demux

File Sink

Pathoutputdat

Type = Byte

FFT

Size =64

Type = Complex

Vector = 64

FFT

Size =64

Type = Complex

Vector = 64

OFDM Channel

Estimation

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

Decoder

Package Header

Parser

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

DecoderRepack Bits

1121 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Our Contribution

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1221

Implementation of IEEE80211a PHY in SDR

One USRP is confgured as a supplicant and another is an authenticator

Two USRPs communicate wirelessly

1321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 4: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

SPIX (Cont) sLiSCP-light permutation round function

State Xi = (Xi Xi Xi Xi 3210 ) with size 256 bits

Simeck box SB-2n where n = 32

Xi0 Xi

1 Xi2 Xi

3SB-2n SB-2n

rci0 rci1

sci0 sci1

2n 2n

2n 2n

sLiSCP-light permutation step function

x1 x0

f(501) (L5(x) x) oplus L1(x)

nn

rc = (cuminus1 middot middot middot c1 c0)

n n

Simeck (SB-2n)

321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

ACE An AE and hash algorithm Design goals

Both AE and Hash functionalities with a single hardware footprint 128-bit security level with good security margins Balance among hardware cost and software eyumlciency

ACE ACE ACE ACE ACE ACE ACE ACE ACE ACEload-AE(NK)

256

64

tagextract(S)

128

0x00 0x00 0x01 0x01 0x02 0x02 0x02 0x00 0x00

K0 K1 AD0 ADlADminus 1 K0 K1

M0 MlMminus2 MlMminus1

C0 ClMminus2 ClMminus1

Initialization Processing associated data Encryption Finalization

(a) ACE Authenticated encryption algorithm

ACE ACE ACE ACE ACE ACE ACEload-H(IV )

256

64

M0 M1 MlMminus1 0 0 0 0H0 H1 H2 H3

Initialization Absorbing Squeezing

(b) ACE Hash algorithm

421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

ACE Permutation Ai Bi Ci Di Ei

64 64 64 64 64

SB-64 SB-64 SB-64rci0 rci1 rci2

156||sci0 156||sci1 156||sci2

Ai+1 Bi+1 Ci+1 Di+1 Ei+1

ACE permutation step function

Parameters1

State size 320 bits Nonlinear layer 3 SB-64 (Simeck) boxes Linear layer (˙) (32041) rounds(u)steps(s) 816 Rate positions 4 bytes of A[7 middot middot middot 4] and C[7 middot middot middot 4] 8-bit round and step constants rci

j and scij

1 Aagaard M AlTawy R Gong G Mandal K and Rohit R ACE An authenticated encryption and hash algorithm

NIST LWC round 2 candidate (2019)

521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

WAGE An authenticated encryption algorithm

Design1

Underlying permutation WAGE (over extension feld F27) of width 259 bits Round function of WAGE is constructed by tweaking the initialization

phase of the WG stream cipher Mode of operation Similar to ACE rounds 111 128-bit security low hardware cost Data limit 264 bytes

Si36 Si

35 Si34 Si

33 Si32 Si

31 Si30 Si

29 Si28 Si

27 Si26 Si

25 Si24 Si

23 Si22 Si

21 Si20 Si

19

WGP SB SB

Si17Si

18 Si16 Si

15 Si14 Si

13 Si12 Si

11 Si10 Si

9 Si8 Si

7 Si6 Si

5 Si4 Si

3 S2i Si1 Si

0

WGP SB SB

oplusω

rci1

rci0

1 Aagaard M AlTawy R Gong G Mandal K Rohit R and Zidaric N WAGE An authenticated cipher NIST LWC

round 2 candidate (2019)

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 621

IEEE 8021X 4-way handshake and data protection

4-way handshake and data protection 4-way handshake Conducts mutual entity authentication and generation

of session keys Data protection After 4-way handshake it executes a data protection

protocol either CCMP or GCMP CCMP AES in counter mode + CBC MAC GCMP AES in counter mode + polynomial hash for MAC

Supplicant Authenticator

ANonce Generate ANonce

(SNonce MICA)Generate SNonceDerive PTK

(ANonce MICS )Derive PTK

MICallInstall TKInstall TK

Figure IEEE 8021X 4-way handshake

Software Defined Radio

721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

IEEE 8021X (Cont)

Computing keys in the 4-way handshake protocol Pairwise transit key (PTK) is generated as

PTK = KDF (PMK ANonce||SNonce||AP MAC adr||STA MAC adr)

PTK = KCK||KEK||TK

- KCK is used to generate a MIC - KEK is used for encrypting the group key - TK is used for protecting trayumlc data

Message integrity codes (MICs) are generated as

MICA = MIC(KCK ANonce r) MICS = MIC(KCK SNonce r) MICall = MIC(KCK D r + 1)

where r is a replay counter number and D carries the cipher suite

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 821

Software defned radio (SDR)

SDR setup SDR has two main parts Universal Software Radio Peripheral (USRP) and

GNU radio

USRP is a hardware consisting of ADC DAC low pass fler and mixer

GNU radio is a digital signal processing (DSP) software on a Linux OS

921 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM sender in SDR

OFDM sender The digital processing part before the USRP sender are done on PC in real

time

The OFDM sender includes package header generator modulation carrier allocator cyclic prefx and so on

1021 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM receiver in SDR

OFDM receiver USRP receiver received sampled data in complex domain

The receiver contains synchronization module header and payload demux OFDM demodulation and so on

USRP-

Source

Schmidl amp Cox

OFDM synch

Frequency Mod

Sensitivity -3125m

OFDM Frame Equalizer

FFT length = 64

CP-length = 16

Delay Multiply

HeaderPayload Demux

File Sink

Pathoutputdat

Type = Byte

FFT

Size =64

Type = Complex

Vector = 64

FFT

Size =64

Type = Complex

Vector = 64

OFDM Channel

Estimation

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

Decoder

Package Header

Parser

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

DecoderRepack Bits

1121 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Our Contribution

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1221

Implementation of IEEE80211a PHY in SDR

One USRP is confgured as a supplicant and another is an authenticator

Two USRPs communicate wirelessly

1321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 5: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

ACE An AE and hash algorithm Design goals

Both AE and Hash functionalities with a single hardware footprint 128-bit security level with good security margins Balance among hardware cost and software eyumlciency

ACE ACE ACE ACE ACE ACE ACE ACE ACE ACEload-AE(NK)

256

64

tagextract(S)

128

0x00 0x00 0x01 0x01 0x02 0x02 0x02 0x00 0x00

K0 K1 AD0 ADlADminus 1 K0 K1

M0 MlMminus2 MlMminus1

C0 ClMminus2 ClMminus1

Initialization Processing associated data Encryption Finalization

(a) ACE Authenticated encryption algorithm

ACE ACE ACE ACE ACE ACE ACEload-H(IV )

256

64

M0 M1 MlMminus1 0 0 0 0H0 H1 H2 H3

Initialization Absorbing Squeezing

(b) ACE Hash algorithm

421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

ACE Permutation Ai Bi Ci Di Ei

64 64 64 64 64

SB-64 SB-64 SB-64rci0 rci1 rci2

156||sci0 156||sci1 156||sci2

Ai+1 Bi+1 Ci+1 Di+1 Ei+1

ACE permutation step function

Parameters1

State size 320 bits Nonlinear layer 3 SB-64 (Simeck) boxes Linear layer (˙) (32041) rounds(u)steps(s) 816 Rate positions 4 bytes of A[7 middot middot middot 4] and C[7 middot middot middot 4] 8-bit round and step constants rci

j and scij

1 Aagaard M AlTawy R Gong G Mandal K and Rohit R ACE An authenticated encryption and hash algorithm

NIST LWC round 2 candidate (2019)

521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

WAGE An authenticated encryption algorithm

Design1

Underlying permutation WAGE (over extension feld F27) of width 259 bits Round function of WAGE is constructed by tweaking the initialization

phase of the WG stream cipher Mode of operation Similar to ACE rounds 111 128-bit security low hardware cost Data limit 264 bytes

Si36 Si

35 Si34 Si

33 Si32 Si

31 Si30 Si

29 Si28 Si

27 Si26 Si

25 Si24 Si

23 Si22 Si

21 Si20 Si

19

WGP SB SB

Si17Si

18 Si16 Si

15 Si14 Si

13 Si12 Si

11 Si10 Si

9 Si8 Si

7 Si6 Si

5 Si4 Si

3 S2i Si1 Si

0

WGP SB SB

oplusω

rci1

rci0

1 Aagaard M AlTawy R Gong G Mandal K Rohit R and Zidaric N WAGE An authenticated cipher NIST LWC

round 2 candidate (2019)

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 621

IEEE 8021X 4-way handshake and data protection

4-way handshake and data protection 4-way handshake Conducts mutual entity authentication and generation

of session keys Data protection After 4-way handshake it executes a data protection

protocol either CCMP or GCMP CCMP AES in counter mode + CBC MAC GCMP AES in counter mode + polynomial hash for MAC

Supplicant Authenticator

ANonce Generate ANonce

(SNonce MICA)Generate SNonceDerive PTK

(ANonce MICS )Derive PTK

MICallInstall TKInstall TK

Figure IEEE 8021X 4-way handshake

Software Defined Radio

721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

IEEE 8021X (Cont)

Computing keys in the 4-way handshake protocol Pairwise transit key (PTK) is generated as

PTK = KDF (PMK ANonce||SNonce||AP MAC adr||STA MAC adr)

PTK = KCK||KEK||TK

- KCK is used to generate a MIC - KEK is used for encrypting the group key - TK is used for protecting trayumlc data

Message integrity codes (MICs) are generated as

MICA = MIC(KCK ANonce r) MICS = MIC(KCK SNonce r) MICall = MIC(KCK D r + 1)

where r is a replay counter number and D carries the cipher suite

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 821

Software defned radio (SDR)

SDR setup SDR has two main parts Universal Software Radio Peripheral (USRP) and

GNU radio

USRP is a hardware consisting of ADC DAC low pass fler and mixer

GNU radio is a digital signal processing (DSP) software on a Linux OS

921 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM sender in SDR

OFDM sender The digital processing part before the USRP sender are done on PC in real

time

The OFDM sender includes package header generator modulation carrier allocator cyclic prefx and so on

1021 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM receiver in SDR

OFDM receiver USRP receiver received sampled data in complex domain

The receiver contains synchronization module header and payload demux OFDM demodulation and so on

USRP-

Source

Schmidl amp Cox

OFDM synch

Frequency Mod

Sensitivity -3125m

OFDM Frame Equalizer

FFT length = 64

CP-length = 16

Delay Multiply

HeaderPayload Demux

File Sink

Pathoutputdat

Type = Byte

FFT

Size =64

Type = Complex

Vector = 64

FFT

Size =64

Type = Complex

Vector = 64

OFDM Channel

Estimation

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

Decoder

Package Header

Parser

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

DecoderRepack Bits

1121 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Our Contribution

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1221

Implementation of IEEE80211a PHY in SDR

One USRP is confgured as a supplicant and another is an authenticator

Two USRPs communicate wirelessly

1321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 6: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

ACE Permutation Ai Bi Ci Di Ei

64 64 64 64 64

SB-64 SB-64 SB-64rci0 rci1 rci2

156||sci0 156||sci1 156||sci2

Ai+1 Bi+1 Ci+1 Di+1 Ei+1

ACE permutation step function

Parameters1

State size 320 bits Nonlinear layer 3 SB-64 (Simeck) boxes Linear layer (˙) (32041) rounds(u)steps(s) 816 Rate positions 4 bytes of A[7 middot middot middot 4] and C[7 middot middot middot 4] 8-bit round and step constants rci

j and scij

1 Aagaard M AlTawy R Gong G Mandal K and Rohit R ACE An authenticated encryption and hash algorithm

NIST LWC round 2 candidate (2019)

521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

WAGE An authenticated encryption algorithm

Design1

Underlying permutation WAGE (over extension feld F27) of width 259 bits Round function of WAGE is constructed by tweaking the initialization

phase of the WG stream cipher Mode of operation Similar to ACE rounds 111 128-bit security low hardware cost Data limit 264 bytes

Si36 Si

35 Si34 Si

33 Si32 Si

31 Si30 Si

29 Si28 Si

27 Si26 Si

25 Si24 Si

23 Si22 Si

21 Si20 Si

19

WGP SB SB

Si17Si

18 Si16 Si

15 Si14 Si

13 Si12 Si

11 Si10 Si

9 Si8 Si

7 Si6 Si

5 Si4 Si

3 S2i Si1 Si

0

WGP SB SB

oplusω

rci1

rci0

1 Aagaard M AlTawy R Gong G Mandal K Rohit R and Zidaric N WAGE An authenticated cipher NIST LWC

round 2 candidate (2019)

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 621

IEEE 8021X 4-way handshake and data protection

4-way handshake and data protection 4-way handshake Conducts mutual entity authentication and generation

of session keys Data protection After 4-way handshake it executes a data protection

protocol either CCMP or GCMP CCMP AES in counter mode + CBC MAC GCMP AES in counter mode + polynomial hash for MAC

Supplicant Authenticator

ANonce Generate ANonce

(SNonce MICA)Generate SNonceDerive PTK

(ANonce MICS )Derive PTK

MICallInstall TKInstall TK

Figure IEEE 8021X 4-way handshake

Software Defined Radio

721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

IEEE 8021X (Cont)

Computing keys in the 4-way handshake protocol Pairwise transit key (PTK) is generated as

PTK = KDF (PMK ANonce||SNonce||AP MAC adr||STA MAC adr)

PTK = KCK||KEK||TK

- KCK is used to generate a MIC - KEK is used for encrypting the group key - TK is used for protecting trayumlc data

Message integrity codes (MICs) are generated as

MICA = MIC(KCK ANonce r) MICS = MIC(KCK SNonce r) MICall = MIC(KCK D r + 1)

where r is a replay counter number and D carries the cipher suite

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 821

Software defned radio (SDR)

SDR setup SDR has two main parts Universal Software Radio Peripheral (USRP) and

GNU radio

USRP is a hardware consisting of ADC DAC low pass fler and mixer

GNU radio is a digital signal processing (DSP) software on a Linux OS

921 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM sender in SDR

OFDM sender The digital processing part before the USRP sender are done on PC in real

time

The OFDM sender includes package header generator modulation carrier allocator cyclic prefx and so on

1021 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM receiver in SDR

OFDM receiver USRP receiver received sampled data in complex domain

The receiver contains synchronization module header and payload demux OFDM demodulation and so on

USRP-

Source

Schmidl amp Cox

OFDM synch

Frequency Mod

Sensitivity -3125m

OFDM Frame Equalizer

FFT length = 64

CP-length = 16

Delay Multiply

HeaderPayload Demux

File Sink

Pathoutputdat

Type = Byte

FFT

Size =64

Type = Complex

Vector = 64

FFT

Size =64

Type = Complex

Vector = 64

OFDM Channel

Estimation

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

Decoder

Package Header

Parser

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

DecoderRepack Bits

1121 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Our Contribution

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1221

Implementation of IEEE80211a PHY in SDR

One USRP is confgured as a supplicant and another is an authenticator

Two USRPs communicate wirelessly

1321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 7: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

WAGE An authenticated encryption algorithm

Design1

Underlying permutation WAGE (over extension feld F27) of width 259 bits Round function of WAGE is constructed by tweaking the initialization

phase of the WG stream cipher Mode of operation Similar to ACE rounds 111 128-bit security low hardware cost Data limit 264 bytes

Si36 Si

35 Si34 Si

33 Si32 Si

31 Si30 Si

29 Si28 Si

27 Si26 Si

25 Si24 Si

23 Si22 Si

21 Si20 Si

19

WGP SB SB

Si17Si

18 Si16 Si

15 Si14 Si

13 Si12 Si

11 Si10 Si

9 Si8 Si

7 Si6 Si

5 Si4 Si

3 S2i Si1 Si

0

WGP SB SB

oplusω

rci1

rci0

1 Aagaard M AlTawy R Gong G Mandal K Rohit R and Zidaric N WAGE An authenticated cipher NIST LWC

round 2 candidate (2019)

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 621

IEEE 8021X 4-way handshake and data protection

4-way handshake and data protection 4-way handshake Conducts mutual entity authentication and generation

of session keys Data protection After 4-way handshake it executes a data protection

protocol either CCMP or GCMP CCMP AES in counter mode + CBC MAC GCMP AES in counter mode + polynomial hash for MAC

Supplicant Authenticator

ANonce Generate ANonce

(SNonce MICA)Generate SNonceDerive PTK

(ANonce MICS )Derive PTK

MICallInstall TKInstall TK

Figure IEEE 8021X 4-way handshake

Software Defined Radio

721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

IEEE 8021X (Cont)

Computing keys in the 4-way handshake protocol Pairwise transit key (PTK) is generated as

PTK = KDF (PMK ANonce||SNonce||AP MAC adr||STA MAC adr)

PTK = KCK||KEK||TK

- KCK is used to generate a MIC - KEK is used for encrypting the group key - TK is used for protecting trayumlc data

Message integrity codes (MICs) are generated as

MICA = MIC(KCK ANonce r) MICS = MIC(KCK SNonce r) MICall = MIC(KCK D r + 1)

where r is a replay counter number and D carries the cipher suite

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 821

Software defned radio (SDR)

SDR setup SDR has two main parts Universal Software Radio Peripheral (USRP) and

GNU radio

USRP is a hardware consisting of ADC DAC low pass fler and mixer

GNU radio is a digital signal processing (DSP) software on a Linux OS

921 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM sender in SDR

OFDM sender The digital processing part before the USRP sender are done on PC in real

time

The OFDM sender includes package header generator modulation carrier allocator cyclic prefx and so on

1021 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM receiver in SDR

OFDM receiver USRP receiver received sampled data in complex domain

The receiver contains synchronization module header and payload demux OFDM demodulation and so on

USRP-

Source

Schmidl amp Cox

OFDM synch

Frequency Mod

Sensitivity -3125m

OFDM Frame Equalizer

FFT length = 64

CP-length = 16

Delay Multiply

HeaderPayload Demux

File Sink

Pathoutputdat

Type = Byte

FFT

Size =64

Type = Complex

Vector = 64

FFT

Size =64

Type = Complex

Vector = 64

OFDM Channel

Estimation

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

Decoder

Package Header

Parser

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

DecoderRepack Bits

1121 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Our Contribution

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1221

Implementation of IEEE80211a PHY in SDR

One USRP is confgured as a supplicant and another is an authenticator

Two USRPs communicate wirelessly

1321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 8: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

IEEE 8021X 4-way handshake and data protection

4-way handshake and data protection 4-way handshake Conducts mutual entity authentication and generation

of session keys Data protection After 4-way handshake it executes a data protection

protocol either CCMP or GCMP CCMP AES in counter mode + CBC MAC GCMP AES in counter mode + polynomial hash for MAC

Supplicant Authenticator

ANonce Generate ANonce

(SNonce MICA)Generate SNonceDerive PTK

(ANonce MICS )Derive PTK

MICallInstall TKInstall TK

Figure IEEE 8021X 4-way handshake

Software Defined Radio

721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

IEEE 8021X (Cont)

Computing keys in the 4-way handshake protocol Pairwise transit key (PTK) is generated as

PTK = KDF (PMK ANonce||SNonce||AP MAC adr||STA MAC adr)

PTK = KCK||KEK||TK

- KCK is used to generate a MIC - KEK is used for encrypting the group key - TK is used for protecting trayumlc data

Message integrity codes (MICs) are generated as

MICA = MIC(KCK ANonce r) MICS = MIC(KCK SNonce r) MICall = MIC(KCK D r + 1)

where r is a replay counter number and D carries the cipher suite

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 821

Software defned radio (SDR)

SDR setup SDR has two main parts Universal Software Radio Peripheral (USRP) and

GNU radio

USRP is a hardware consisting of ADC DAC low pass fler and mixer

GNU radio is a digital signal processing (DSP) software on a Linux OS

921 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM sender in SDR

OFDM sender The digital processing part before the USRP sender are done on PC in real

time

The OFDM sender includes package header generator modulation carrier allocator cyclic prefx and so on

1021 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM receiver in SDR

OFDM receiver USRP receiver received sampled data in complex domain

The receiver contains synchronization module header and payload demux OFDM demodulation and so on

USRP-

Source

Schmidl amp Cox

OFDM synch

Frequency Mod

Sensitivity -3125m

OFDM Frame Equalizer

FFT length = 64

CP-length = 16

Delay Multiply

HeaderPayload Demux

File Sink

Pathoutputdat

Type = Byte

FFT

Size =64

Type = Complex

Vector = 64

FFT

Size =64

Type = Complex

Vector = 64

OFDM Channel

Estimation

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

Decoder

Package Header

Parser

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

DecoderRepack Bits

1121 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Our Contribution

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1221

Implementation of IEEE80211a PHY in SDR

One USRP is confgured as a supplicant and another is an authenticator

Two USRPs communicate wirelessly

1321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 9: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

IEEE 8021X (Cont)

Computing keys in the 4-way handshake protocol Pairwise transit key (PTK) is generated as

PTK = KDF (PMK ANonce||SNonce||AP MAC adr||STA MAC adr)

PTK = KCK||KEK||TK

- KCK is used to generate a MIC - KEK is used for encrypting the group key - TK is used for protecting trayumlc data

Message integrity codes (MICs) are generated as

MICA = MIC(KCK ANonce r) MICS = MIC(KCK SNonce r) MICall = MIC(KCK D r + 1)

where r is a replay counter number and D carries the cipher suite

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 821

Software defned radio (SDR)

SDR setup SDR has two main parts Universal Software Radio Peripheral (USRP) and

GNU radio

USRP is a hardware consisting of ADC DAC low pass fler and mixer

GNU radio is a digital signal processing (DSP) software on a Linux OS

921 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM sender in SDR

OFDM sender The digital processing part before the USRP sender are done on PC in real

time

The OFDM sender includes package header generator modulation carrier allocator cyclic prefx and so on

1021 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM receiver in SDR

OFDM receiver USRP receiver received sampled data in complex domain

The receiver contains synchronization module header and payload demux OFDM demodulation and so on

USRP-

Source

Schmidl amp Cox

OFDM synch

Frequency Mod

Sensitivity -3125m

OFDM Frame Equalizer

FFT length = 64

CP-length = 16

Delay Multiply

HeaderPayload Demux

File Sink

Pathoutputdat

Type = Byte

FFT

Size =64

Type = Complex

Vector = 64

FFT

Size =64

Type = Complex

Vector = 64

OFDM Channel

Estimation

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

Decoder

Package Header

Parser

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

DecoderRepack Bits

1121 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Our Contribution

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1221

Implementation of IEEE80211a PHY in SDR

One USRP is confgured as a supplicant and another is an authenticator

Two USRPs communicate wirelessly

1321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 10: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

Software defned radio (SDR)

SDR setup SDR has two main parts Universal Software Radio Peripheral (USRP) and

GNU radio

USRP is a hardware consisting of ADC DAC low pass fler and mixer

GNU radio is a digital signal processing (DSP) software on a Linux OS

921 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM sender in SDR

OFDM sender The digital processing part before the USRP sender are done on PC in real

time

The OFDM sender includes package header generator modulation carrier allocator cyclic prefx and so on

1021 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM receiver in SDR

OFDM receiver USRP receiver received sampled data in complex domain

The receiver contains synchronization module header and payload demux OFDM demodulation and so on

USRP-

Source

Schmidl amp Cox

OFDM synch

Frequency Mod

Sensitivity -3125m

OFDM Frame Equalizer

FFT length = 64

CP-length = 16

Delay Multiply

HeaderPayload Demux

File Sink

Pathoutputdat

Type = Byte

FFT

Size =64

Type = Complex

Vector = 64

FFT

Size =64

Type = Complex

Vector = 64

OFDM Channel

Estimation

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

Decoder

Package Header

Parser

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

DecoderRepack Bits

1121 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Our Contribution

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1221

Implementation of IEEE80211a PHY in SDR

One USRP is confgured as a supplicant and another is an authenticator

Two USRPs communicate wirelessly

1321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 11: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

OFDM sender in SDR

OFDM sender The digital processing part before the USRP sender are done on PC in real

time

The OFDM sender includes package header generator modulation carrier allocator cyclic prefx and so on

1021 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

OFDM receiver in SDR

OFDM receiver USRP receiver received sampled data in complex domain

The receiver contains synchronization module header and payload demux OFDM demodulation and so on

USRP-

Source

Schmidl amp Cox

OFDM synch

Frequency Mod

Sensitivity -3125m

OFDM Frame Equalizer

FFT length = 64

CP-length = 16

Delay Multiply

HeaderPayload Demux

File Sink

Pathoutputdat

Type = Byte

FFT

Size =64

Type = Complex

Vector = 64

FFT

Size =64

Type = Complex

Vector = 64

OFDM Channel

Estimation

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

Decoder

Package Header

Parser

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

DecoderRepack Bits

1121 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Our Contribution

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1221

Implementation of IEEE80211a PHY in SDR

One USRP is confgured as a supplicant and another is an authenticator

Two USRPs communicate wirelessly

1321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 12: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

OFDM receiver in SDR

OFDM receiver USRP receiver received sampled data in complex domain

The receiver contains synchronization module header and payload demux OFDM demodulation and so on

USRP-

Source

Schmidl amp Cox

OFDM synch

Frequency Mod

Sensitivity -3125m

OFDM Frame Equalizer

FFT length = 64

CP-length = 16

Delay Multiply

HeaderPayload Demux

File Sink

Pathoutputdat

Type = Byte

FFT

Size =64

Type = Complex

Vector = 64

FFT

Size =64

Type = Complex

Vector = 64

OFDM Channel

Estimation

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

Decoder

Package Header

Parser

OFDM Frame

Equalizer

OFDM Serializer

FFT length = 64

Constellation

DecoderRepack Bits

1121 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Our Contribution

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1221

Implementation of IEEE80211a PHY in SDR

One USRP is confgured as a supplicant and another is an authenticator

Two USRPs communicate wirelessly

1321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 13: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

Our Contribution

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1221

Implementation of IEEE80211a PHY in SDR

One USRP is confgured as a supplicant and another is an authenticator

Two USRPs communicate wirelessly

1321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 14: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

Implementation of IEEE80211a PHY in SDR

One USRP is confgured as a supplicant and another is an authenticator

Two USRPs communicate wirelessly

1321 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 15: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

Implementation of KDF and MIC

Implementation - KDF and MIC

Sponge structure is used for KDF and generation of MIC

SPIX ACE and WAGE follow the same framework

Key derivation function construction

F F F F F F F Fload-AE(AN [64 127]SN [64 127] PMK)

c

r

0x00 0x00 0x01 0x01 0x01 0x01 0x01

PMK[0 63] PMK[64 127] AN [64 127] SN [64 127] AP [0 63] AP [64 127] STA[0 63] STA[64 127]

KCK[0 63] KCK[64 127] KEK[0 63] KEK[64 127] TK[0 63] TK[64 127]

Initialization Processing Associated Data and Extracting Session Keys

Message integrity code generation

F F F F F F F F Fload-AE(NKCK)

c

r

tagextract(S)

t

0x00 0x00 0x01 0x01 0x01 0x01 0x00 0x00

KCK[0 63] KCK[64 127] Nonce[0 63] Nonce[64 127] r[0 63] r[64 127] KCK[0 63] KCK[64 127]

Initialization Processing associated data Finalization

1421 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 16: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

Microcontroller implementations SPIX ACE and WAGE

Microcontroller platform specifcations3

SPIX ACE and WAGE are implemented in assembly language

SPIX 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

ACE 16-bit and 32-bit microcontrollers MSP430 and Cortex-M3

WAGE 8-bit 16-bit and 32-bit microcontrollers Atmega128 MSP430 and Cortex-M3

Clock frequency 16 MHz for all platforms

Microcontrollers Flash memory size [kB] RAM [kB] Number of general-purpose register ATmega128(SPIX WAGE) 128 4448 32(R0 - R31) MSP430F2013(ACE SPIX) 2304 0128 12 (R4 - R15)

MSP430F2370(WAGE) 33024 2048 12 (R4 - R15) LM3S9D96(ACE SPIX WAGE) 524288 131072 13 (R0 - R12)

3Cortex-M3 Cortex-m3lm3s9d96 MSP430 MSP430f2013 1521 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 17: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

Implementation Details

For SPIX the state is stored in the registers

For ACE and WAGE the state is stored in memory

Instead of loading everything into registers the WAGE state is continuously stored in random access memory (RAM)

- Initial memory location and the current round is recorded - After the permutation evaluation copy the fnal state to the

initial state location in RAM - Continue the next WAGE permutation evaluation

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1621

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 18: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

Performance of 4-way handshake authentication Authentication time

Auth-Time includes the 4-way transmission time + computation times of KDF and MIC functions

Tauth = T4minuswayminustx + 2 TKDF + 3 TMIC

KDF computation is equivalent to AE(lAD = 0 lM = 6) no tag MIC computation is equivalent to AE(lAD = 4 lM = 0) with tag

Cryptographic F Platform Function

Memory usage [Bytes] Setup

[Cycles] Throughput

[Kbps] Gen-time

[ms] 4-way-Tx-time

[ms] Auth-Time

[ms] SRAM Flash

SPIX

8-bits ATmega128

KDF 175 1586 705314 2323 4408 700 95640 MIC 175 1634 897225 1826 5608 16-bits

MSP430F2013 KDF 50 1562 286679 5715 1792 690 79409 MIC 50 1580 363991 4501 2275

32-bits LM3S9D96

KDF 408 1230 59140 27704 370 700 72150 MIC 408 1326 75132 21807 470

ACE

16-bits MSP430F2013

KDF 330 1720 550752 2975 3442 710 89503 MIC 330 1738 619701 2644 3873 32-bits

LM3S9D96 KDF 599 1826 102762 15944 642 730 76450 MIC 599 1790 115561 14178 722

WAGE

8-bits ATmega128

KDF 808 4448 139478 11747 872 710 75678 MIC 808 4516 156491 10470 978 16-bits

MSP430F2013 KDF 46 4518 166993 9811 1044 720 77601 MIC 46 4536 187340 8746 1171

32-bits LM3S9D96

KDF 3084 6278 107071 15302 669 690 72591 MIC 3084 6382 120190 13632 751

SPIX implementation is faster due to storing state into registers

1721 Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 19: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

SPIX for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the highest throughput and lowest memory usage

Total time for no AD and 128 bytes message 1058 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

SPIX Perm-18

8-bits ATmega128 161 1262 128377 3191 802

NA 16-bits MSP430F2013 24 1409 52294 7833 327

32-bits LM3S9D96 352 946 10900 37578 068

SPIX-AE (lAD = 0 lM = 16)

8-bits ATmega128 175 1550 1667042 983 10419 1060

16-bits MSP430F2013 50 1845 677818 2417 4236 1080

32-bits LM3S9D96 408 1210 139569 11739 872 1050

SPIX-AE (lAD = 2 lM = 16)

8-bits ATmega128 175 1644 1795322 913 11221 1080

16-bits MSP430F2013 50 1891 730340 2243 4565 1050

32-bits LM3S9D96 424 1326 150313 10900 939 1070

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1821

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 20: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

ACE for IEEE 80211i data protection protocol

32-bit Cortex-M3 gives the best result Total time for no AD and 128 bytes message 1087 ms Total time for 16 byte AD and 128 bytes message 1098 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

ACE Perm 16-bits

MSP430F2013 304 1456 69440 7373 434 NA 32-bits

LM3S9D96 523 1598 13003 39376 081

ACE-AE (lAD = 0 lM = 16)

16-bits MSP430F2013 330 1740 1445059 1134 9032 1060

32-bits LM3S9D96 559 1790 269341 6083 1683 107

ACE-AE (lAD = 2 lM = 16)

16-bits MSP430F2013 330 1786 1582892 1035 9893 1080

32-bits LM3S9D96 559 1858 294988 5554 1844 1080

ACE-Hash (lM = 2 j =4)

16-bits MSP430F2013 330 1682 413056 496 2582

NA 32-bits LM3S9D96 559 1822 77114 2656 482

ACE-Hash (lM = 16 j =4)

16-bits MSP430F2013 330 1684 1375672 1191 8598

32-bits LM3S9D96 559 1822 256524 6387 1603

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 1921

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 21: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

WAGE for IEEE 80211i data protection protocol Cortex-M3 gives the highest throughput with highest memory

usage MSP430 gives the slightly lower throughput but much lower

memory usage Total time for no AD and 128 bytes message 1077 ms Total time for 16 byte AD and 128 bytes message 1079 ms

Cryptographic Platform Memory usage

[Bytes] Setup [Cycles]

Throughput [Kbps]

Gen-time [ms]

Tx-time [ms] SRAM Flash

WAGE Perm

8-bits ATmega128 802 4132 19011 21798 119

NA 16-bits MSP430F2370 4 5031 23524 17616 147

32-bits LM3S9D96 3076 5902 14450 28678 09

WAGE-AE (lAD = 0 lM = 16)

8-bits ATmega128 808 4416 362888 4515 2268 1080

16-bits MSP430F2370 46 5289 433105 3783 2707 1090

32-bits LM3S9D96 3084 6230 278848 5876 1743 1060

WAGE-AE (lAD = 2 lM = 16)

8-bits ATmega128 808 4502 397260 4124 2483 1050

16-bits MSP430F2370 46 5339 474067 3456 2963 1060

32-bits LM3S9D96 3084 6354 305284 5367 1908 1060

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2021

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 22: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

Conclusions

Implemented the IEEE 80211a physical layer OFDM transmission systems by software defned radio to simulate the 4-way handshake modulation and communication

Implemented the IEEE 8021x 4-way handshake mutual authentication and key establishment protocol using SPIX ACE and WAGE on three dierent microcontrollers

Throughput of WAGE is higher than that of AES-128 written in C1 on the same 8-bit platform

Execution time for the cryptographic operations is the dominating factor in the 4-way handshake

- USRP tx rate 1682 Kbps - WiFi systemrsquos tx rate 50 - 320 Mbps

07times1682 - Scaling tx to WiFi = 0235 ms 50000

1 G Meiser T Eisenbarth K Lemke-Rust and C Paar Eyumlcient implementation of estream ciphers on 8-bit avr

microcontrollers In 2008 International Symposium on Industrial Embedded Systems pages 58 - 66 June 2008

Yunjie Yi et al | Implementation of Three LWC Schemes in the WiFi 4-Way Handshake 2121

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion
Page 23: Implementation of Three LWC Schemes in the WiFi 4-Way ......Implementation of Three LWC Schemes in the WiFi 4-Way Handshake with Software Defned Radio . Yunjie Yi, Guang Gong, Kalikinkar

Thank you

  • Background and Preliminary
    • SPIX ACE and WAGE in NIST LWC round 2
    • IEEE 8021X 4-way handshake and data protection
    • SDR Overall
      • Implementation of Three LWC Schemes
        • KDF and MIC generation
        • Microcontroller Implementation
        • RAM moving technology
          • Performance Results
            • KDF and MIC for 4-way handshake
            • SPIX AE mode for data protection
            • ACE AE amp Hash modes for data protection
            • WAGE AE mode for data protection
              • Conclusion