Analysis and Design of Block Ciphers · Acknowledgments This work would never have been possible without the guidance, encouragement, and suggestions of my supervisor, Dr. Stafford

Analysis and Design of Block Ciphers

by

Amr Mohamed Youssef

A thesis submitted to the

Department of Electrical and Cornputer Engineering

in conformity with the requirements for the degree of

Doctor of Philosophy

Queen's University

Kingston, Ontario, Canada

December 1997

Copyright @ 1997 Amr Mobamed Youssef

National Library Bibliothèque nationale du Canada

Acquisitions and Acquisitions et Bibliographie Services services bibliographiques

395 Wellington Street 395. rue Wellington OttawaON K1AON4 Ottawa ON K1A ON4 Canada Canada

The author has granted a non- exclusive licence dowing the National Library of Canada to reproduce, loan, distribute or sel1 copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts fiom it may be printed or otherwise reproduced without the author's permission.

L'auteur a accordé une licence non exclusive permettant à la Bibliothèque nationale du Canada de reproduire, prêter, distribuer ou vendre des copies de cette thèse sous la forme de microfiche/fih, de reproduction sur papier ou sur format électronique.

L'auteur conserve la propriété du droit d'auteur qui protège cette thèse. Ni la thèse ni des extraits substantiels de celle-ci ne doivent être imprimés ou autrement reproduits sans son autorisation.

In this thesis we study various cryptographic properties of boolean mappings from n bits

to m bits. In particular, we derive expressions for the expected size of the maximum XOR

table entry and the maximum Linear Approximation Table enûy for some combinatorid

structures of interest such as regular (balanced) mappings, and injective mappings. We

derive similar expressions for the expected value of different foms of information leakage

and relate different forms of information leakage to the spectral properties of the function.

We also extend the definitions of many cryptographic criteria to multi-ouput boolean

functions and study the relationship between the Walsh-Hadamard transform and various

types of information leakage.

A new construction method for highly nonlinea. injective s-boxes is presented. It

is shown that the resistance of CAST-like encryption algorithms (based on randornly

selected substitution boxes) to the basic linear cryptanalysis was underestimated in

previous work.

We introduce a new class of Substitution Permutation Networks (SPNs) with the

advantage that the sarne network cm be used to perform both the encryption and the

decryption operations. Different cryptographic properties of this class such as resistance to

both linear and differential cryptanalysis are exarnined. We also presenr two constmction

methods for involution linear transformations for SPNs based on Maximum Distance

Separable codes.

An analytical mode1 for the avalanche characteristics of SPNs with difierent linear

transformation layers is developed. We dso prove a conjecture by Cusick regarding the

number of functions satisfying the Strict Avalanche Criterion.

Dedicated to my parents

For their etemal love, encouragement, and support, 1 am grateful.

iii

Acknowledgments

This work would never have been possible without the guidance, encouragement, and

suggestions of my supervisor, Dr. Stafford Tavares. It is my pleasure to thank him

for al1 of his help.

1 wish to thank Dr. Moustafa Fahrny who introduced me to Dr. Tavares.

1 gratefully acknowledge the financial support of the Ontario Ministry of Educarion,

Telecommunication Research Lnstitute of Ontario (TRIO), and Queen's University.

Special thanks and appreciation go to my wife, Ayda, for her many years of patient

support and encouragement.

List of Notation

LAT*

XO R*

NLf

In(n. m )

The set of binary numben

The set of binary n-niples

XOR operation

A variable in Z i or GF(2" )

A scalar variable

A random vaiable in 2; or GF(2")

The cardinality of the enclosed set

A mapping from 2: -, Z F

The Hamming weight of a

The dot product of a and x over Z2

The multiplication of a and x over GF(Zn)

The Walsh transform of (- 1 ) ce/ (XI

Linear Approximation Table entry

XOR distribution table entry

rnax 1 L AT(a. b) 1 a#O.b

N A ~ A ~ Ax#O,Ay

Nonlinearity of f

Number of injective fùnctions f : ZI: + Z y

Number of balanced functions f : 21" -. Zr:

The entropy hinction

The binary entropy function

The mutuai information

Static information leakage

Self static information leakage

Dynamic information leakage

Trace function

Table of Contents

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Abstract ii

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv

List of Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v

List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi

List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . Chapter 1 Introduction 1

. . . . . . . . . . . . . . . . . . . 1.1 Motivation for this Research 2

. . . . . . . . . . . . . . . . . 1.2 Outline of Remainder of Thesis 3

. . . . . . . . . . . . . . . . . . . . . . . . Chapter 2 Previous Research 4

. . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1 Architectures 4

. . . . . . . . . . . . . . . 2.1.1 Substitution-Permutation Networks 5

. . . . . . . . . . . . . 2.1.2 DES-like Structure (Feistel Networks) 7

. . . . . . . . . . . . . . . . . . . . . . . . . . 2.1.3 Other Structures 9

2.2 S-box versus Non S-box Approaches . . . . . . . . . . . . . 9

2.3 Cryptanalysis . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2.3.1 Exhaustive Search . . . . . . . . . . . . . . . . . . . . . . . . . 11

2.3.2 Differential Cryptanalysis . . . . . . . . . . . . . . . . . . . . . 12

2.3.3 Linear Cryptanalysis . . . . . . . . . . . . . . . . . . . . . . . - 1 4

2.3.4 Extensions to Linear and Differential Cryptanalysis . . . . 15

2.3.5 Other Attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.3.6 lmplementation Dependent Attacks . . . . . . . . . . . . . . 17

2.3.6.1 Timing Attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

. . . . . . . . . . . . . . . . . 2.3.6.2 Differential Fault Cryptanalysis 17

Chapter 3

3.4

3.5

Chapter 4

Chapter 5

. . . Advanced Encryption Standard (AES) Requirements 19

. . . . . . . . . Mathematical Background and Definitions -21

. . . . . . . . Cryptographic Criteria for Boolean Functions 21

The Inclusion-Exclusion Principle . . . . . . . . . . . . . . . . 25

The Walsh Transform of Boolean Functions . . . . . . . . . 26

Linear Approximation Table and XOR Distribution Table of

Balanced S-boxes . . . . . . . . . . . . . . . . . . . . . . . . 29

. . . . . Linear Approximation Table of Balanced S-boxes 30

XOR Distribution Table of Balanced S-boxes . . . . . . . . 32

Counting the Number of Nonlinear Balanced S-boxes . . . 38

Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . - 4 1

. . . . . . . . . . Linear Approximation of Injective S-boxes 43

Definitions and Notation . . . . . . . . . . . . . . . . . . . . . 43

. . . . . Linear Approximation Table of Injective Mappings 44

. . . . Construction of Highly Nonlinear Injective S-boxes 47

Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . -47

. . . . . . . . . . . . . . . . . . . . . . Construction Method I - 4 9

Construction Method II . . . . . . . . . . . . . . . . . . . . . -50

Comments on the Security of the CAST Encryption

Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

Information Leakage and Spectral Properties of Boolean

Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . -55

Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

viii

5.6.1.1

5.7

Chapter 6

6.1

6.2

6.3

6.3.1

6.3.2

6.4

6.5

6.6

6.7

6.8

Chapter 7

Information Leakage of a Randornly Selected Boolean

Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

Information Leakage of a Randorniy Selected Balanced

. . . . . . . . . . . . . . . . . . . . . . . . Boolean Function -64

Information Leakage of a Randomly Selected Injective

Boolean Function . . . . . . . . . . . . . . . . . . . . . . . . . 67

Relation Between the Walsh Transforrn and Different

Foms of Information Leakage . . . . . . . . . . . . . . . . 69

. . . . . . . . . . . . . . . . . . . . . . . Extended Definitions 73

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . Conclusion -75

. . . . A New Class of Substitution-Permutation Networks 77

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . -77

S-boxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

S-box lnterconnection Layer . . . . . . . . . . . . . . . . . 87

Proposed Linear Transformation . . . . . . . . . . . . . . . . 88

Involution Linear Transformations based on MDS codes . 89

Resistance to Differential and Linear Cryptanalysis . . . . 93

Cyclic Properties of the Proposed SPN . . . . . . . . . . . 96

. . . . . . . . . . . . . . . . . . . . Key Scheduling Algorithm 97

Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

Modelling Avalanche Characteristics of Substitution-

Permutation Networks Using Markov Chains . . . . . . . 102

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

Convergence in Markov Chains . . . . . . . . . . . . . . . 105

7.3

7.4

7.4.1

7.4.2

7.4.3

7.4.4

7.5

Chapter 8

8.1

8.2

Appendix A

. . . . . . . . . . . . . . . Modelling Avalanche in S-boxes 106

. . . . . . . . . Modelling the Linear Transformation Layer 108

. . . . . . . Model A Fixed Permutation Layer 108

. . . . . . . . . . Model 8 Linear Transformation Type 1 108

. . . . Model C Linear Transformation Type 2 111

. . . . . . . . . . Model 0 Linear Transformation Type 3 113

Resufts and Discussion . . . . . . . . . . . . . . . . . . . . . 114

Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

Contributions of the Thesis . . . . . . . . . . . . . . . . . . 120

Future Worù . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

Cryptanalysis of the 'Nonlinear-Panty Circuitsn Proposed

at Crypto '90 . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

Cryptanalysis of 'key agreement scheme based on

generalized inverses of matricesn . . . . . . . . . . . . . . 127

Bounds on the Number of Functions Satisfying the Strict

Avalanche Criterion . . . . . . . . . . . . . . . . . , . . . . . 131

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

Vitae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

Appendix B

Appendix C

List of Figures

Figure 2.1

Figure 2.2

Figure 2.3

Figure 4.1

Figure 5.1

Figure 5.2

Figure 5.3

Figure 5.4

Figure 6.1

Figure 6.2

Figure 6.3

Figure 6.4

Figure 6.5

Figure 6.6

Figure 7.1

Figure 7.2

Figure 7.3

. . . . General Classification of Block Cipher Architectures 5

. . . . . . . . . . . . . . SPN with N = 16. R = 4 and R = 3 6

. . . . . . . . . . . . . . . . . . . Round i of DES-like Cipher 8

. . . . . . . . . . . . . . . . . . . . . . CAST Round Function 53

. . . . . . . . Lower Bound for the Binary Entropy Function 61

Expected Value of SS L(Y) for a Randomly Selected

. . . . . . . . . . . . . . . . . . . . . . . . . Boolean Function 63

Expected Value of S S L ( Y ) and Expected Value of

S L ( Y 1 X+ ) for a Randomly Selected Boolean

Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

Expected Value of DL(AY1AX) for an n x n Random

Mapping and an n x n Random Bijective Mapping . . . . . 66

SPN with N = 16. n = 4. and R = 3 . . . . . . . . . . . . . . 77

Average Nonlinearity and Average Maximum XOR Table

Entry Versus the Number of Fixed Points (n = 8) . . . . . 85

Involution Linear Transformation Based on MDS Codes

(hl = n = 8 . lrreducible Polynomial = 11 d) . . . . . . . . . 93

Normalized Distribution of Cycle Length for al1 216 Starting

Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

A Typical Expanded Segment of the Cycle Length

Distribution of SPNs with Random Involution S-boxes . . 98

. . . . . . . . . . . . . . . . . . . . Key Scheduling Algorithm 99

SPN with N=16. n = 4 . and R = 3 . . . . . . . . . . . . . 102

Theoretical and Experimental Avalanche for SPN with

N=64 . n = M = 8 . . . . . . . . . . . . . . . . . . . . . . . . 115

The Transition Matrices for Model A and B (M = n = 8) . 11 7

Figure 7.4 The Transition Matrices for Model C and D (hi = n = 8) . 11 8

Figure 7.5 The Limiting Distribution and the Eigenvalues for Model A,

6, Candi3 ( M = n = S ) . . . . . . . . . . . . . . . . . . . . 119

xii

List of Tables

Table 3.1

Table 3.2

Table 4.1

Table 4.2

Table 4.3

Table 4.4

Table 6.1

Table 6.2

Table 7.1

Table A.1

Table C.1

Experimental Results versus Theoretical Bounds for 104

Randomly Chosen &bl Bijective Mappings . . . . . . . . . 37

Distribution of the Maximum XOR Table Entry and the

Maximum LAT Entry (Experimental Results for 10'

Randomly Chosen &bit Bijective Mappings) . . . . . . . . 38

Experimental Results for 8 x rn Injective S-boxes . . . . .47

;brL, -iOR* for 8 x 16 S-boxes Obtained Using

Construction Method II . . . . . . . . . . . . . . . . . . . . . . 52 NL, XOR* for 8 x 24 S-boxes Obtained Using

Construction Method II . . . . . . . . . . . . . . . . . . . . . . 52

Best S-box Nonlinearity Obtained By DifFerent

Construction Methods . . . . . . . . . . . . . . . . . . . . . . . 52

Minimum Distance for 105 Randomly Selected Involution

Linear Transformations (n = iM = 8) . . . . . . . . . . . . . 91

Relative Speed of the Proposed SPN and Q-CAST . . . 100

The Second Largest Eigenvalues for SPNs with -i = 64,

and hf = n = 8. . . . . . . . . . . . . . . . . . . . . . . . . . 115

A parity circuit C(n. d) with n = 10, and d = 3 . . . . . . . 124

Exact Number of Functions Satisfying SAC versus the

Derived Lower Bounds. . . . . . . . . , . . . . . . . . . . . 136

xiii

Chapter 1 Introduction

Until recently, cryptography has k e n of interest primarily to the military and diplornatic

communities. Today, however, several factors have combined to stimulate great interest

in commercial applications. It is becoming increasingly apparent that a primary focus

of society is on the generation, transmission, processing, and storage of information.

Consequently, the focus on data vulnerability and data crime in recent years has

emphasized the need for techniques to protect information, especially when it is in

electronic fom. Among these techniques is cryptography: transformations of data

intended to make the data useless to one's opponents [40]. Such transformations provide

solutions to two basic problems of data secunty: the priva- problem, preventing opponent

from extracting information from a communication channel, and the authentication

problem, preventing an opponent from injecting false data into the channel or altering

messages so that their meaning is changed. nie field of cryprology encompasses both

cryptography (cipher making) and cqptanalysis (cipher breaking).

cryptographic systems c m be classified into two groups: pnvate-key cryptosystems.

and public-key cryptosystems. In a pnvate-key cryptosystem. the enciphering and

deciphenng keys are the same (or easily determined from each other), and the key

is kept secret, while in public-key cryptosystems, the two keys differ in such a way

that one key is computationally infeasible to detemine from the other. While pubiic-

key algorithms have been suggested as a solution to the key distribution problem for

private-key cryptosystems, private-key cryptosystems are still the only practical solution

for cryptographic applications that require high data rates and low power consumption.

Many cryptographic systems implement a public-key algorithm for key distribution, and

a private-key algorithm for the data encryption.

Pnvate-key cryptosystems can be classified into block and Stream ciphers. A block cipher

divides the plaintext message into successive blocks and encipher each block with the

sarne key. A Stream cipher breaks the plaintexi message into successive characters or

bits and enciphen each character or bit with one element of the key Stream.

1.1 Motivation for this Research

Private-key cryptosystems, with their attractive features such as high data rates and low

power consumption, provide a practical solution for a variety of applications. While

the most recognizable private-key cryptosystem is the Data Encryption Standard (DES)

[102], DES controversy was bom at the same time as DES. Since DES was proposed

as a standard there was considerable criticism. The small key size and unpublished

s-box design criteria are the main issues in this criticism. Today, the cryptographic

community is quite aware of the fact that breaking DES is getting a lot more feasible

than it was previously thought [139], [134]. The use of triple-DES, while having good

security, is excessively inefficient especially for software implementations. While the

need for a serious alternative to DES increases, the ciramatic failure of some other

proposed alternatives, such as FEAL [124]. and the "Nonlinear Parity Circuits proposed

at Crypto ' 9 0 [153] (see appendix A) illustrates the difficulty of achieving this'.

Boolean hinctions, because they are the basic components of cryptographic

transformations, attract great attention in the cryptographic field. Despite the fact

that almost al1 cryptographic functions are multi-output boolean functions, the area

of multi-output boolean functions is still almost untouched. Moreover. most of the

successful cryptanalytic techniques on cryptographic hinctions ded with several or al1

output components simultaneously instead of dealing with single components separately.

This work explores different cryptographic properties of multi-output boolean functions.

While there are currently many block cipher proposais, aimost al1 of them are software

oriented ciphers and most of them are optimized around the 32 or 64-bit processors. The

purpose of this work is not only to design a hardware efficient block cipher that is both

Appendix B shows an example of a dramaric failure of a key agreement scheme

2

secure and also software efficient on most platforms. but also to provide some design

principles and analytical tools that cm help in the design process.

1.2 Outline of Remainder of Thesis

Chapter two gives a review of the previous research related to Our work and some of the

mathematical background and the necessary definitions required throughout the thesis.

In chapter three. we study the Linear Approximation Table (LAT) and the XOR

distribution table of balanced s-boxes. We also calculate the probability that any nonzero

linear combination of the output coordinates of a regular s-box is an affine function.

Chapter four contains similar results for injective s-boxes. It also contains two algebraic

construction methods for highly nonlinear s-boxes.

Ln chapter five we study different forms of the information leakage through a randornly

selected boolean function. We also present the relation between the Walsh transforrn and

different forms of information leakage.

In chapter six we introduce a new class of Substitution Permutation Networks (SPNs)

and study its resistance to both linear and differential cryptanaiysis.

In chapter seven we study the avalanche characteristics of four different classes SPNs.

Finally in chapter eight we give a surnmary of Our results and directions for future work.

We note that some of the material in chapter three appeared in [141], [145]. Part of

chapter four appeared in [154], [144]. Part of chapter five appeared in [146],[149]. Part

of chapter six appeared in [153]. Some of the material of chapter seven appeared in

[i50], [152].

Chapter 2 Previous Research

A good introduction to cryptography cm be found in the following books

[2 1,38,67,85,86,120, 1291 and survey articles [4O,4 1,44,8O, 1 14,115,127] . In this chapter

we focus on topics relevant to our work.

2.1 Architectures

in his landmark paper, Shannon [123] presented the principles of confusion and diffision.

Because these principles are so successful in capturing the essence of the desired attributes

of a block cipher, they have become the comerstone of block cipher design.

Confusion is described as "the use of enciphering transformations that cornplicate the

determination of how the statistics of the ciphertext depend on the statistics of the

plaintext" [128] or. more briefly, to rnake the relation between the key and the ciphertext

as complex as possible. thereby hiding the statistical features of the plaintext.

On the other hand. diffusion spreads the influence of individual plaintext characters

over as much of the ciphertext as possible, thereby hiding the statistical features of the

plaintext. Methods of achieving good diffision and confusion lie at the hem of block

cipher design. A general classification of block cipher design architectures is shown

in Figure 2.1.

Block Cipher Architectures

SPN-like structure DES-like Structure

S-box based Non S-box based

Fixd S-box

Figure 2.1 General Classification of Block Cipher Architectures

2.1.1 Substitution-Permutation Networks

Feistel [44] and Feistel, Non, and Smith [45] were the first to suggest that a Substitution

Permutation Network (SPN) architecture (Figure 2.2) consisting of iterative rounds

of nonlinear substitutions on smdl sub-blocks referred to as substitutions (s-boxes)

connected by bit permutations was a simple, effective implementation of Shannon's

principle of mixing transformation based on the concepts of "confusion" and "difision".

Keying the network cm be accomplished by XORing the key bits with the data bits

before each round of substitution and after the last round or by choosing a different set

of s-boxes for each key.

Karn and Davida [66] introduced the concept of completeness. A technique for designing

a cornplete SPN was given. Completeness guarantees that every ciphertext bit of the SPN

depends on al1 input plaintext bits, and this should hold for every possible key value.

Chosen plaintext and known plaintext attacks against Tree-structured SPNs introduced

by Karn and Davida are given in [57], [59].

Plaintext P. - - -

C l ... cl6 Ciphertext

Figure 2 2 SPN wiih N = 16. n = 4, and R = 3.

In [9] Ayoub presented a variant of SPNs which incorporates random permutations and

retains, with a high probability. the completeness criterion after a small number of rounds.

Ayoub suggested that the s-boxes cm be randornly chosen as proof of freedom from

an inten tional trapdoor.

Heys 1551 and Heys and Tavares [60] introduced a new type of SPNs in which they

replaced permutations between rounds by diffusive h e a r transformations to improve the

avalanche characteristics of the cipher and to increase resistance to differential and linear

cryptanalysis. The s-box design cnteria used in their proposed SPN are diffusion and

nonlinearity [60]. The regular structure used in the Heys and Tavares SPN allows both

careful design and careful analysis of the network which can be considered as a good

step towards designing a provably secure block cipherl. The main contribution of Heys

and Tavares is that they showed that the s-box interconnection layer (despite being a

Throughout this thesis. the rem "provably secure" meam provably secure against a prespecified set of crypmdytic attacks.

6

linear layer) plays an important role in improving the cipher's resistance to both linear

and differential cryptanalysis.

SAFER [79], SHARK [log], and SQUARE [34] are some exarnples of block ciphers

based on the SPN architecture.

2-12 DES-like Structure (Feistel Networks)

The National Bureau of Standards (May 1973) published a solicitation for cryptosystems

in the Federal Register. This lead to the development of the Data Encryption Standard.

or DES. which is believed to be the most widely used cryptosystem in the world (for

unclassified information). DES was developed at IBM as a modification of an earlier

systern known as LUCIFER [131]. On March 17, 1975, DES was first pubiished in the

Federal Register. DES was adopted as a standard [IO21 for unclassified data on January

15, 1977. DES has been reviewed by the National Bureau of Standards every five yean.

Its most recent renewal was in January 1994. when it was renewed until 1998. It is

anticipated that it will not remain a standard past 1998. On January 1997. the National

Institute of Standards and Technology (NIST~) announced the development of a Federal

Information Processing Standard for Advanced Encryption Standard (AES). A sumrnary

of the proposed minimum requirements for the AES is listed in section 4.

The idea behind DES is first described by Feistel, Notz, and Smith [45] who described a

block cipher construction which became the network structure for DES. As in an SPN,

the DES-like network also uses s-boxes and permutations to achieve Shannon's mixing

transformation but performs these operations on only half the block at a time. For

round 2 . the mixing ( a substitution layer followed by a permutation layer) is executed

by the round function f : z:" + 2:'' on the N / 2 bits of the right haif block, r,.

The parameters of the function are determined by the key bits associated with round i,

denoted by ki. Each bit of the output of f is then XORed with the corresponding bit

NIST is a new ti~ne for the National Bureau of Standards

7

of the left half block, i i , then the left and right half blocks are swapped. Equivalently,

the cipher can be viewed as follows:

' i+, r i+ 1

Figure 23 Round i of DES-like Cipher

Schneier and Kelsey exarnined a generalization of the concept of Feistel (DES-

like) networks 11211, which they cailed Unbdanced Feistel Networks (UFNs). Like

conventional Feistel networks, UFNs consist of a series of rounds in which one part of

the block operates on the rest of the block. However, in a UFN the two parts need

not be of equal s ix . Removing this limitation has many interesting implications for

designing ciphen which are secure against linear and differential cryptanalysis. The

authors also presented some UFN constnictions and made some initial observations

about their security .

Matsui [83] introduced a methodology for designing block ciphers, with provable security

against differential and linear cryptanalysis. This rnethodology is based on three new

principles: change of the location of the round functions, round functions with recursive

structure, and s-boxes of different sizes. Changing the location of the round function

realizes parallel computation of the round functions without losing provable security. The

recursive structure reduces the size of the s-boxes. The use of s-boxes of different sizes

is expected to make algebraic attacks difficult. Matsui also gave specific examples of

practicai block ciphers that are provably secure under an independent subkey assurnption

and are reasonably fast in hardware as well as in software.

Some other examples for block ciphers based on DES-like structures are CAST [2][3],

FEAL [124], Blowfish [119], LOKI [22][23], TEA [138], and RC5 [112].

2.1.3 Other Structures

There are many other block cipher architectures available. Many of these architectures

are based on some interesting theoretical foundation. DEA is based on the principle of

mixing operations from different algebraic groups.

BEAR and LION [8] are 3-round unbalanced Feistel networks, rnotivated by the earlier

construction of Luby and Rackoff [77] which provides a provably secure (under suitable

assumptions) block cipher from pseudorandom functions using a 3-round Feistel structure.

A class of cryptosystems based on the use of interconnection networks was introduced in

[lO4]. Another class of cryptosystems based on nonlinear finite automata was proposed

in [52].

2.2 S-box versus Non S-box Approaches

Until recently, most of the block cipher proposais used a set of s-boxes in the construction

of the round function to achieve the confusion effect. Some of these s-boxes are fixed,

such as DES s-boxes. Others are generated dynamically as a function of the key and

kept fixed throughout the encryption process. An example of this latter approach is

Blowfish [119]. A more complex approach is the dynarnic substitution scheme. In this

case, after each substirution the s-box is re-ordered. In most of the cases, the just-used

substitution value is exchanged with some other entry in the s-box selected at random.

This dynarnic substitution scheme, while currently applied in Stream cipher design (e.g.,

RC4 [ I l 1],[119] and other commercial ciphers by Ritter Software Engineering 111 O])

has not yet been widely applied to block cipher design.

According to Nyberg [93], the most common methods for constructing s-boxes are based

on: random generation, testing against a set of design criteria, algebraic construction

having good properties, or a combination of these.

The advantage of the s-box based approach is that the theory of s-box design is mature

enough and many techniques are available to constnict cryptographically strong s-boxes.

The disadvantage of the s-box approach is that it may require a large amount of memory

to store the s-boxes. This disadvantage becomes very clear for large s-boxes with non-

simple algebraic description (i.e., when they can only be implemented using look-up

tables).

To overcome this problem one proposal is to use key dependent operations for

the s-boxes. For exarnple, D E A 1731 achieves the required confusion effect by

mixing operations from different algebraic groups (XOR addition modulo 216, and

multiplications modulo 216 + 1). The disadvantage of this approach is that multiplication,

which is the most nonlinear operation arnong the three operations above, usually requires

a large number of logic gates in hardware and is relatively slow in software. Moreover,

the scaling process. to wider block length, is not straightforward (for example D E A can

not be scaled to 128-bit block size because 231 + 1 is not a prime [120]).

Two elegant non s-box based proposals with minimum memory requirements are

TEA[138] and the RC5 farnily[ll2]. The TEA round function is consuucted with

simple fixed logical shift operations. In RC5 the round function is constructed with data

dependent rotations. The only disadvantage of these two proposals (assuming the most

common block sizes of 64 bits or 128 bits) is that rotating or shifting large blocks are

not very efficient operations on 8-bit platforms.

2.3 Cryptanalysis

In this section we give a brief review of some of the cryptanalysis techniques that have

been applied to block ciphers. Cryptanalytic attacks can be categorized into the following

types: (i) ciphertext only, (ii) known plaintext, (iii) chosen plaintext, and (iv) chosen

ciphertext. A ciphertext attack assumes that the cryptanalyst has access to the ciphertext

only. A known plaintext attack uses knowledge of the plaintext values corresponding to

the available set of ciphertexts; a chosen plaintext attack assumes that the cryptanalyst

can select specific values of the plaintexts and acquire the corresponding ciphertexts

while a chosen ciphertext attack assumes that the cryptanalyst can select specific values

of the ciphertexts and acquire the corresponding plaintexts.

Among the attacks that are applicable to different block cipher structures are: Hellman's

tirne-memory trade-off attack [54], meet-in-the-rniddle attack [64], key degeneracy attack

[27,39,107], maximum likelihood estimation [7], method of formal coding (MFC) [118].

and related-keys attack [ 1 33.

Here we focus on the following attacks: the Wiener exhaustive key search machine

[ 1391, differential cryptanalysis [I 21, linear cryptanalysis [8 1 ] and some other recently

proposed implementation dependent attacks.

2.3.1 Exhaustive Search

Despite recent improvements in analytic techniques for attacking block ciphers,

exhaustive key search remains the most practical and efficient attack for ciphers with

realtively shon keys (i.e., 5 .56 bits). In [139] Wiener described in detail. an exhaustive

DES key search machine that costs about US$l million and can find a key in 3.5 hours

on average. The basic machine design can be adapted to attack the standard DES modes

of operation for a small penalty in running time.

Wiener designed his DES key search machine by himself in about five months, which

raises the question of what a well funded govemment agency could do. This machine,

even though it was not built. is a statement that breaking DES is getting a lot more

feasible than it was previously thought.

More recently, responding to a challenge. including a pnze of $10,000, offered by RSA

Data Security, Inc., the DESCHALL group [134] successfdly found a DES key by

exhaustive search over the intemet. The search takes less than 3 months. On April 8,

1997 the group announced finding the key after searching almost 25% of the total key

space. Full details about the project statistics cm be found in [134].

23.2 Differential Cryptanalysis

Differential cryptanalysis [15] is a statistical attack that was developed and popularized

by Biham and Shamir and has been applied to a wide range of cryptosystems including

LUCIFER. DES, FEAL, REDOC, and Khafre [16]. Differential cryptanaiysis is a chosen

plaintext attack which examines changes in the output of the cipher in response to

controlled changes in the input. In particular. the attack exploits the existence of a highly

probable ( R - 1)-round characteristic to determine a portion of the key bits, where an

r-round characteristic is defined to be a sequence of r pairs of input and output XOR

differences corresponding to each round. The existence of a highly probable characteristic

depends on: the XOR distribution table and the bit changes within the network [IS].

The main results of applying this attack to DES reported by Biharn and Shamir in

[15] are : DES with 6 rounds was broken in less than 0.3 seconds on a PC using 240

ciphertexts, DES with 8 rounds was broken in less than 2 minutes by anaiyzing ISOOO

ciphertexts chosen from a pool of 50000 candidate ciphertexts, and DES with 15 rounds

can be broken in about 2'' steps. More importantly, they showed that modifying the key

scheduling algorithm can not make DES much stronger.

One interesting result reported is that FEAL-4 can be broken with just eight ciphertexts

and one of their plaintexts.

At Crypto '92, Biham, and Shamir [17] presented a slightly modified version of their

original differentiai cryptanaiysis attack on the full 16-round DES. This attack requires

-*' chosen plaintexts and at that time, this attack was the first known attack capable of

"breaking" the full 16-round DES in less than the zS5 complexity of exhaustive search3.

In [29] Coppersmith claimed that differentiai cryptanalysis was known to the DES

designers at IBM (within IBM, the attack was formally known as the 'T attack").

Furthemore, he showed how the design cntena for the s-boxes and permutations were

developed to thwart such attacks.

A closer look at differential attacks shows that for an r-round characteristic, only the

plaintext difference and the ciphertext difference need to be fixed. That is, the intermediate

differences can be arbitrarily selected. The notion of differential was introduced by Lai

and Massey [73] to account for this observation. For DES-like ciphers, sequences of

differences c m be modeled as a Markov chah when the subkeys are assumed to be

independent. In order to make a successfd attack on an iterated cipher by differential

cryptanalysis, the existence of good characteristics is sufficient. On the other hand, to

prove security against differential attacks for such ciphers, one has to ensure that there

is no differential with a probability high enough to enable successful attacks.

Knudsen and Nyberg [94] showed that for DES-like ciphers, with independent round

keys, the probability of an r-round differentiai, r 2 4. is less than or equal to 2p$,,

where p,., is the highest probability for a non trivial one round characteristic. They

also gave some (far from being practical) examples for block ciphers with provable

security against differential cryptanalysis. Some of these examples were later broken

using higher order differential cryptanalysis [72], [68].

O'Connor [97] showed that, for a randornly selected m-bit bijective mapping, the expected

size of the largest entry in the XOR distribution table is bounded by 2m while the fraction

of the XOR table that is zero approaches e-0-5 = 0.60653. O'Conner [96] also showed

that the above two quantities are tending to optimal values in composite permutations.

' This 1 bit reduction in exhaustive s m h is due IO the syrnmetric propeny of DES: DES(rn. k) = D E S ( m , k )

13

233 Linear Cryptanalysis

In 1986, Rueppel [ 1 171 presented the idea of the "closest linear approximation" to a

boolean function f . The Walsh transform coefficient of largest magnitude indicates a

linear boolean function of the input variables whose output agrees with f more often than

any other linear function (this is the closest linear approximation to f). Matsui extended

this idea by finding the ciosest linear approximation to a nonzero linear combination

of the s-box outputs. In [82] Matsui introduced linear cryptanalysis, which is a known

plaintext attack, on DES . The purpose of this method is to obtain a linear expression

where p is the plaintext, c is the ciphertext, k is the key, and (a, b, d) are the attack

parameters. This expression holds with probability PL # & over al1 keys, such that

1 PL - 1 is maximal for a given cipher algorithm. To achieve this, a statistical linear

path between input and output of each s-box is constructed. Then the path is extended

to the entire DES algonthm and finally a linear approximation expression without any

intermediate vaiues is reached. The main results of this attack are4: 8-round DES is

breakable with 2'' known plaintexts in 40 seconds, 12-round DES is breakable with 233

known plaintexts in 50 hours. and 16-round DES is breakable with 24i known plaintexts

(faster than an exhaustive search).

In [82] Matsui reported the first experimental attack of the full 16-round DES with

243 known plaintexts. using 12 workstations (HP9735PA-RISC 99MHZ). and using a

prograrn, written in C, and assembly language, implementing the linear cryptanalysis

aigorithm. The program, consisting of a total of 1000 lines, takes 50 days to find the key

of which 40 days were spent for generating phintexts and the corresponding ciphertexts

and only 10 days were spent for the actual key search.

One should note that while linear cryptanalysis can be considered as the most successful

theoretical attack on DES, this attack is by no means more practical than exhaustive --- -

' The experiments were implernented with the C-language on a workstation PA-RISU66MHZ

14

search: one cm not neglect the time needed to obtain the information about the plaintext.

Also, the cryptanalyst is free to invest as much, in technology, as he c m afford to make

the search more efficient when doing exhaustive key search. In a known plaintext attack

the cryptanalyst is restricted to the technology of the legitimate owner of the key. and to

the frequency with which the key is used. l n aimost every practical application, a single

DES key will be applied to much less than 243 bloch, even in its entire life time.

Immunity to linear cryptanalysis can be proved using similar concepts to that used to

prove security against differential cryptanalysis [99]. In 1921 Nyberg showed how to

achieve provable security against linear cryptanalysis.

2.3.4 Extensions to Linear and Düferential Cryptanalysis

In [74] Langford and Hellman presented a new chosen text attack on iterated

cryptosystems. This attack, which is a mixed version of both the differential and

linear cryptanalysis attacks, is very efficient for 8-round DES, recovering 10 bits of key

with 95% probability of success using only 768 chosen plaintexts. More keys c m be

recovered with less probability of success. This Cround attack, while comparable in

speed to existing attacks, represents an order of magnitude irnprovement in the arnount

of required text. The authors reported that they are currently working to extend this

attack to the full 16-round DES.

In [65] Kaliski and Robshaw presented a technique which aids linear cryptanalysis of

block ciphers and achieves a reduction of data required for successful attack. The basic

idea is to deduce several linear approximations using the same set of plaintextlciphertext

pairs. Finally, the authors noted that the use of larger s-boxes, which is sometimes

recommended as a way of increasing the secunty of DES-like block ciphers, might

in certain circumstances facilitate the use of linear cryptanalysis with multiple linear

approximations.

In [72] higher order derivatives of discrete functions were considered and the concept of

higher order differentiais was introduced. The cryptographie relevance of higher order

differentiais was discussed, but no application was given.

The concept of tmncated differentials is introduced in [68] by Knudsen. Tntncated

differentials are differentials where only a part of the difference in the ciphertext. after a

number of rounds, cm be predicted. Knudsen also gave exarnples of Feistel block ciphers

secure against differential cryptanalysis using first order differentials, but vulnerable to a

differential attack using tmncated differentials and higher order differentials.

In [53], linear cryptanalysis is extended to an attack called partition cryptanalysis which

considers a partition of the plaintext space and a partition of the last round input space.

Partitioning cryptanalysis exploits a potential weakness of the cipher, narnely. that the

last round inputs are non-uniformly distributed over the blocks of the second partition

when the plaintexts are taken from a particular block of the first partition. An example

of a cipher for which partition cryptanalysis perfoms better than linear and differential

cryptanalysis was contrived. The success probability of partition cryptanalysis was given

and a procedure for finding a pair of partitions that yields a successful attack was analyzed.

2.3.5 Other Attacks

Rijmen and Preneel [IO91 proposed a new attack on Feistel ciphers with a non-su jective

round function. CAST and LOKI are examples of such ciphers. They also extended

the attack to ciphers that use a non-uniformly distributed round function and applied

the attack to a six round weak version of CAST which uses round keys with 16 bit

entropy per round.

In [63] Jakobsen and Knudsen introduced a new attack on block ciphers, the interpolation

attack. This attack is useful for breaking ciphers using simple algebraic functions (in

particular quadratic functions) as s-boxes.

23.6 Irnplementation Dependent Attacks

2.3.6.1 Timing A ttacks

This attack was introduced by Kocher in [70]. The basic idea is that cryptosystems

often take slightly different arnount of time to process different inputs. Reasons include

performance optimization to bypass unnecessary operations. branching and conditional

statements, M M cache hits. processor instructions (such as multiplication and division)

that run in non-fixed time, and many other causes. Performance charactenstics typically

depend on both the encryption key and the input data.

By carehilly measuring the amount of time required to perform pnvate key operations.

attackers may be able to find fixed Diffe-Hellman exponents or factor RSA keys.

Timing attacks can potentially be used against other cryptosystems. including symrnetnc

hnctions. For example, in software the 28-bit C and D values in DES key schedule are

often rotated using a condition which tests whether a one-bit must be wrapped around.

The additional time required to move nonzero bits could slightiy degrade the cipher's

throughput or key setup time. The cipher's performance cm thus reveal the Hamming

( == 1 weight of the key, which provides an average of .+log? i= 3.93 bits of key n=O (9

information. DEA uses multiplication modulo (216 + 1) operation. which will usually

run in non-constant time. RC5 is at risk on platforms where rotates run in non-constant

tirne. RAM cache hits can produce timing characteristics in many ciphers if tables in

memory are not used identicaily in every encryption.

Additional research is needed to determine whether specific implementations are at risk

and, if so, the degree of their vulnerability. So far, only a few specific systerns have

been studied in detail and the attacks against them are theoretical.

2.3.6.2 DüXerential Fault CryptanaIysis

This attacks, developed by Biharn and Shamir. follows the Bellcore fundamental

E. B i h and A. Shamir. Resewch announcement: A new crypmalytic atack on DES (posted to [email protected]. 18

Ocrober 1996)

assumption:

" By exposing a sealed tamperproof device such as a srnart card to certain physical effects

(e.g., ionizing or microwave radiation). one can induce with reasonable probability a fault

at a random bit location in one of the registers at some random intermediate stage in the

cryptographic computation. Both the bit location and the round number are unknown

to the attacker"

Another assumption is that the attacker is in physical possession of the tamperproof

device, so that he can repeat the experiment with the sarne cleartext and key but without

applying the extemai physical effects. As a result he obtains two ciphertexts denved from

the same (unknown) cleartext and key, where one of the ciphertexts is correct and the

other is the result of a computation conupted by a single bit error dunng the computation.

The theoretical part of the attack can be applied to DES, triple DES (with 168 bit keys),

and DES with independent subkeys (with 768 bit keys). This attack still works even

with more general assumptions on the fault locations, such as faults inside the round

funciion, or even faults in the key scheduling aigorithm. Differential Fault Analysis

can break many additional secret key cryptosystems, including IDEA. RC5 and FEAl.

Some ciphers, such as Khufu, Khafre and Blowfish compute their s-boxes from the key

material. In such ciphers, it may be even possible to extract the s-boxes themselves, and

the keys, using the techniques of Differential Fault Anaiysis.

In a later research announcement6 by the sarne authors, a modified fault model (with

some more plausible assumptions) was introduced. This model makes it possible to

find the secret key stored in a tarnperproof cryptographic device even when nothing is

known about the structure and operation of the cryptosystem. A prime example of such

a scenario is the Skipjack cryptosystern (11.

E. B i h and A. Shamir. Reseveh annauncement: The next suge of differentid fault crypmalysis: How CO break cornpletely

unknown cryptosystems. 25 Seprember 1996

The main assumption behind the new fault model is that the cryptographie key is stored

in an asymmetric type of mernory, in which induced faults are much more likely to

change a 1 bit into a O than to change a O bit into a 1 (or the other way around). The

attack is guaranteed to succeed if the fault model is satisfied. A detailed description of

the differential fault analysis c m be found in [18].

Further Improved Differential FauIt Analysis was introduced by Anderson and ~uhn ' .

2.4 Advanced Encryption Standard (AES) Requirements

On January 2. 1997, the National Institute of Standards and Technology (NIST) announced

the development of a Federal Information Processing Standard for Advanced Encryption

Standard (AES).

The draft minimum acceptability requirements and evaluation cnteria are:

(1 ) AES shall be publicly defined.

(2) AES shall be a symmetric block cipher.

( 3 ) AES shall be designed so that the key length may be increased as nez-

(4) AES shall be implementable in both hardware and software.

(5) AES shall either be freely available or available under tems consistent with the

American National Standards Institute (ANSI) patent policy.

(6) Algorithms which meet the above requirements will be judged based on the following

factors:

( I ) Secunty (i.e., the effort required to cryptanalyze),

(2) Computational efficiency,

(3) Memory requirements,

(4) Hardware and software suitability,

(5) Simplicity,

' A serious weakness of DES (poaed to [email protected]. 2 Novembcr 1996)

19

(6) Flexibility, and

(7) Licensing requirements.

While there are many published block ciphers, none of them meets al1 the above

requirements. Most of the previously published block ciphers have a 64-bit block length.

However, the AES is most likely to be a 128-bit block cipher. Also, most of the

previously published block ciphers have a fixed key length with no obvious way to

increase the key length as needed. Designing an efficient block cipher that meets al1 the

above requirement is a challenging project for the world cryptographie researchers. An

issue that was overlooked by the standard requirements is the practicai need to satisfy

users who require a %bit block cipher for compatibility reasons. For more details about

the AES developrnent effort the reader is referred to the NIST home page [ 10 11.

2.5 Mathematical Background and Definitions

In this section we introduce sorne of the mathematical toois and definitions used

throughout this thesis.

2.5.1 Cryptographie Criteria for Boolean Funetions

Nonlinearity

A function f : -+ Z2 is defined to be a linear f'unction if f ( x ) = a - x for some

constant a E 2:.

A function f is defined to be an affine function if f (x) = a . x 9 b for some constants

a E Zt , b E 22.

The nonlinearity of the function f = (fi f2 fm) : 22 + Z l , fi : 2; + Z?. i =

1.. . - . m. is defined as the minimum Hamming distance between the je: oi aifine functions

and every nonzero linear combination of the output coordinates of f , i.e.,

Arcf = min #{x E Zîlc f ( x ) # w x 3 b ) , (2.3) b.c,w

where w E 2;. c E Zy\{O}, 6 E 22, w x denotes the dot product between w and

x over Z2, and

Linear Structures

A linear structure of a boolean function f : 2; + Z2 can be identified as a vector

a E Z;/{O) such that f(x $ a) $ f (x) takes the sarne value ( O or 1 ) for al1 x E 2:

Wl .

Output Bit Independence Criterion (BK)

A function f : 2; + Zy satisfies output Bit Independence Criterion if whenever one

bit of input is complemented, the correlation coefficient between every two output bit

changes is zero [37].

Balance

A function f : 2; + Z2 satisfies the 0-1 balance criterion if it has equd numbers of 0's

and 1's in its output sequence when the inputs are taken over al1 possible values.

In general. a function / : 2; -t Zy. n 2 n, is said to be balanced or regular if each

ourput symbol y = f(x) E Zy appears an equal number of times (Le., for times)

as x varies over al1 its possible 2" values.

Completeness

A function f : 2; + 27 satisfies the completeness criterion if every Y -

function depends on al1 input arguments [66].

3; : of the

Strict Avalanche Criterion (SAC)

A function f : 2; + 22 satisfies SAC if whenever a single input bit is complemented,

the output bit changes with a probability of one half [37].

Higher Order SAC (Ford)

A function f : ZZ -t Z2 satisfies SAC of order k if any function obtained from f by

keeping k of its input bits constant satisfies SAC [47].

Higher Order SAC (Adams and Tavares)

A function f : 22 -, Z2 satisfies high order SAC of degree k if f(x) changes with a

probability of one half whenever i (1 5 i 5 k ) bits of x are complemented [4].

Propagation Criterion B C ) ~

A hinction f : 22 - Z2 satisfies Propagation Criterion of degree k (denoted PC-k)

if f ( x ) changes with a probability of one half whenever i (1 $ i 5 k ) bits of x are

complemented.

A hinction f : 2; 4 & satisfies the extended Propagation Cntenon of degree k and

order t (PC-k order t ) if any function obtained from f by keeping t bits constant satisfies

PC-k [105].

Correlation Immunity

A function f : 22 4 2 2 is order correlation immune if every k-tuple obtained by

choosing k components from x is statistically independent of f(x)[126].

Binary Bent Functions

A function f : 22 -t Zz is bent [116] if and only if

F(w) =&1, W E 22.

where F (w ) is the Walsh transform [6] of the fbnction ( - 1 (See section 2.5.3).

Binary bent functions exist only for even n.

If f : Z; + 2- is bent, then the distance between f and any affine function is

qn-' - f Y/'-'. A bent function attains the maximum achievable nonlinearity (its

nonlinearity is Y-' - Y/2-') and it has the maximum distance to functions with

linear structures (2"-'). If f : Zq + Z2 is bent, then we have

This is identical io Aâams and Tavves higher order SAC and was defined independently by Preneel ci el [IOSI

in [go] Nyberg extended the concept of bent functions to mu1 ti-ouput boolean functions.

Nyberg presented what she called "perfect nonlinear s-boxesg". A function f : Zq - Zy is perfect nonlinear if for every fixed Ax E 2; the difference f (x G Ax) f f ( x ) obtains

each possible value y E 27 for Zn-m values of x. Every nonzero linear combination

of the output coordinates of perfect non linear functions is a bent function. The main

result is that for perfect nonlinear s-boxes the number of input variables is at least twice

the number of output variables.

Linear Approximation Table

For a given s-box constructed from a mapping f : 2; -, ZT, the linear approximation

table entry LAT(a. b) is defined as [82]:

where a E 22. b E {22}/O, and a - x denotes the inner product of the vectors a and

x evaiuated over Z2.

The linear approximation entry with the maximum absolute values is denoted by LAT*.

XOR Distribution Table

For a given s-box constructed from a mapping f : 22" + 21, The XOR table entry

.VaZay is defined as [ E l :

Naxay = #{x E 2; I f (x 6 Ax) 4 f (x) = Ay} (2.9)

where Ax E 22, ny E 21. The entry Noo = 2", is noi taken into consideration as it

does oot have any cryptographic significance.

For Ax # O, the maximum entry in the XOR table is denoted by XOR*.

Y Nyberg 's definition is more grnerd as it assumes lhat f : 2: - Z?

2.5.2 The Inclusion-Exclusion Principle

A counting tool that is used very frequently throughout this thesis is the Inclusion-

Exclusion Principle [113]. Different applications of the the Inclusioc-Exclusion Principle

to cryptography can be found in [98].

Definition

Consider a set of objects, each of which may or may not possess each property from a

totai set of properties. Label the properties as p i , 1 < i 5 4, and define the number

of objects possesing the P properties

as N ( p ; , . pi,, . . . . pi, ) .

If N(p1, pz. ..., pk) = !V(p,, , p i 2 , . . . ,p i,) for al1 values of k, 1 5 k 5 4, and al1 selections

of il, i2, .--. ik, then the properties are said to be symmetric.

Inclusion-Exclusion Principle:

Consider a set of objects. each of which may or may not possess each property from a

totai of 4 properties. If the properties are symmetric, the number of objects having one

or more properties. @. is given by [113]

where W ( i ) is the number of objects having i particular properties.

Generalization of Inclusion-Exclusion P ~ c i p l e :

Consider a set of objects. each of which may or rnay not possess each property from

a total of m properties. If the properties are symrneuic, the number of objecü which

possess exactly t . 1 5 t 5 6. properties, r ( t ) , is - given by [113]

where T * ( i ) represents the number of objects which have i particular properties.

2.53 The Walsh ïhmform of Boolean Functions

In this section, we examine some properties of the Walsh transform [6,42] of boolean

functions. For a boolean hinction f : 2; -r Zz, the Walsh transform of the function

( - I ) ~ ( ~ ) is given bylo

The function ( - l ) f ( X ) is given by the inverse Walsh transform as follows

Autocorrelation Function

The autocorrelation function ll of the function f is defined as

The autocorrelation

coefficients by

function as defined above can be determined from the Walsh

'O Fmm now on. we will refer ro F ( w ) as the Walsh uansform of f

26

Summation Property

The Walsh transform of a boolean function f satisfies

Parseval's Theorem

Dyadic Shift

Let Fl (w) = F(w 9 a) then fi(x) is given by

Similarly, if f2 (x) = f (x @ a) then

Dyadic Convolution

From the above result it can be shown that the Harnming distance

can be expressed as

The Walsh transform is a powemil analytical tool for studying the properties of boolean

functions for different reasons:

Some cryptographic properties, such as bentness, were originally defined in terms of

the Walsh spectrum of the boolean hinctions.

Al1 the cryptographic properties of a boolean fimction can be extracted from its Walsh

spectrum. For example :

The largest spectral component gives the nonlinearity of the function f : ZI: + Z2

Sirnilarly, for a multi-output boolean function f : Z!: - 2 1 we have

where F C ( w ) is the Walsh transform of the function ( - 1 )'-/.

order correlation immune functions must have

SAC fulfilling functions must have

C F'(w)(-1)"' = O, i E {I, 2, ... n}, ~ € 2 ;

where u;i is the i-th bit in W.

Functions satisfying PC-K must have

The field of digital signai processing is well established and it cm provide u s with a good

theoreticai foundation for studying different cryptographic characteristics of multi-output

boolean functions.

Possible extension of the Walsh transfonn to the case of multi-output boolean functions:

multi-output boolean functions can be descnbed by the Walsh transform of every nonzero

linear combination of its output coordinates.

This motivates us to express different forms of information leakage in terms of the Walsh

transform coefficients (see section 5.6).

Chapter 3 Linear Approximation Table and XOR Distribution Table of Balanced S-boxes

Differential cryptanalysis [15], and linear cryptanalysis [82] are currently the most

powerful cryptanalytic attacks on private-key block ciphers. The complexity of

differential cryptanalysis depends on the size of the largest entry in the XOR table,

the total number of zeroes in the XOR table, and the number of nonzero entries in the

fint column in that table [15], [122]. The complexity of linear cryptanalysis depends on

the size of the largest entry in the linear approximation table (LAT) [81], [82].

Thus one way to reduce the risk of differential and linear cryptanalysis is to choose the

s-boxes with small maximum LAT entries and smalI maximum XOR tabie entries.

O'Connor [97] studied the distribution of the XOR table of bijective mappings. In

particular, O'Connor showed that for a randornly selected n-bit bijective :: : :-:na, the

expected size of the largest entry in the XOR distribution table is bounded by 2m

while the fraction of the XOR table that is zero approaches e-0-5 = 0.60653. O'Connor

also showed that the above two quantities are tending to optimal values in composite

permutations [96].

In some block cipher designs, it is required to have balanced s-boxes (also known as a

regular s-box). In this chapter, we derive an upper bound on the fraction of balanced

functions (s-boxes) having a specified lower bound on the maximum entry in the XOR

distribution table or the LAT. For reasonably smdl values of these maximum entries we

show that this fraction decreases dramatically with the number of input variables.

We also calculate the probability that any nonzero linear combination of the output

coordinates of a regular s-box is affine.

From [88], the total number of balanced s-boxes with n input bits, and m output bits

is given by

3.2 Linear Approximation Table of Balanced S-boxes

R e d 1 that for a given s-box constructed from a mapping f : 27 -, Zy. the linear

approximation table entry LkT(a. b ) is defined as [82]:

where a E Z?. b E {ZT}/O, and a x denotes the inner product of the vectors a and

x evaluated over Z2.

It is straightforward to notice that LAT(a, b) = Y-' - d ( a . x, b f), where

Let w t ( a ) be the Hamming weight of the binary vector a; which is the number of 1's

in the vector a. For wt(a) = O, we have LAT(a. b) = O as b f is always a baianced

function for a baianced s-box.

For a # O. b # O we have

Lemma 3.1

2

Proof For wt(b) = k > O, we have ways of arranging the bits of f (x)

corresponding to nonzero bits of b such that d(a x, b . f) = O. This result follows

directly by noting that the number of arrangements of r different objects, for each of

which there are c copies, is ( rc ) ! / (c! ) ' . In Our case, we have k-bit symbols with

the corresponding XOR equd 0, and 2"' k-bit symbols with the corresponding XOR

equal 1. Each of these symbols occurs times. Thus we have ( ) possible

arrangements for the symbols corresponding to the 0's of a. x and a similar number of

possible arrangements for the symbols corresponding to the 1's of a x.

Since we can partition the input x into zk distinct sets, al1 x' s in a given set are assigned

the same common value of the k bits of f ( x ) corresponding to the nonzero bits of b.

We still need to assign the remaining rn - k output bits for each x. Each x within

a given set must be assigned a distinct m - k tuple of the remaining output bits for y-&! times, each set can be assigned in 2 n ,,+ ways, so the remaining bits can be

2&

assigned in (( 2"-VI! - ! zrn- ) wayr.

Thus we have ( - - ) ( ) 2k - - (?n-1!)2

2"-k! 2"-m! zm-' distinct balanced functions

with d(a - x. b f ) = O. Using the sarne argument, and by noting that we have ) ways to generate balanced functions that have d(a x. b f ) = 21 from a . x. we have

Zn-I

( ) i2n-m!)2m ways of generating distinct balanced s-boxes with d h . Y. b f ) = 21.

By dividing by the total number of baianced s-boxes, B(n, m), we get the lemma abovea

For any integer vdue f k f ~ ~ ~ , 0 5 1 2 1 ~ ~ ~ 5 Y-', let iVtAT denote the number of LATs

with any entry having absolute vaiue > ~ M L ~ T , then fViAT is upper bounded by

Proof: For 1 # O we have

Using lemrna 3.1, and by noting that'

we get the theorem above. O Numerical substitution in the formula above shows that the fraction of balanced s-boxes

with undesirable LATs decreases drarnatically as the number of inputs increases. To give

a numerical example, consider the case where M L , ~ = 2n-5, for n = 12. m = 6 we

have ,$&$ < 3.86 * 10-'O.

3.3 XOR Distribution Table of Balanced S-boxes

Recall that for a given s-box constructed from a mapping f : Z i -+ 27, the XOR table

entry NAxay is defined as [15]:

i V ~ f i ~ = #{x E ZZ 1 f (x 6 Ax) @ j (x) = Ay}

where Ax E 2,": Ay E 22. Lemma 3.2

For A x # 0: Ay # 0, the number of balanced functions with NAxay 2 2k is upper

bounded by Q, , , (k ) , where

Equaiion (3.7) stmply mwns that the nurnber of LATr with entries having absolute value > 2MLAT is upper bounded by the

nurnber o f these envies

Proof: For any balanced function f : Z!: - 21. n 2 m, we have '2m distinct output

symbols. ~ 0 . ~ 1 . - - . y ? m - l , and each of them is repeated times. For a given

Ay # O' AX # O, let S be the set of Y"" O U ~ P U ~ symbols pairs (Yi, yj) with Yi = yjeAy.

Thus S has distinct unordered pairs, each of them is repeated times. Divide

S into 2m-1 subsets such that each of these subsets contains only identical unordered

pairs. Let Si. S?, . S2m-i denote these subsets.

Now we count the number of ways by which we cm constmct a balanced function

f : ZI: -+ Z!: for which .Lay 2 2k. Select k, pairs from each subset 3,. 2-1

i = 1.2. . lm-' such that I , = k. There is only one way to choose k, pairs i = l

from the subset Si (as the pairs within Si are indistinguishable). These k pairs can be

ordered in C (k; ki , k2 , . . . , k p - I ) ways. Note that we have 2 possible orders for each

pair, giving 2' total possible orders.

Now we select k different input values xi, xi,, . xi,-, for which xl, 5 xi, # Ax

for O 5 i, j < k. There are (pi ) possible choices for xia, xil. . . xik- - For each

such choice we assign ro each pair (f (xi, ), f (xi, 5 Ax)) an element iLi; :- p i n

chosen above.

The remaining - 3k output symbols of f can be assigned in

C'('Zn - 2k: I I , I I , l?, l?, .... l p - i . 1 p - i ) ways, where li = 2n-m - ki.

The construction approach described above does not guarantee the construction of distinct

balanced functions, and so Q, , , (k ) is an upper bound. O

Lemma 3.3

For Ax # 0.Ay = O, the number of balanced functions with ..K&, 2 2k is upper

bounded by an-m(k), where

Proof: For any balanced function f : 2; + Zy. n 2 m, we have 2* distinct output

9n-m symbols, ya, yl, . : yp-1, and each of them is repeated , times. For a given

Ax # O. Ay = O, let S be the set of Y-' output symbols pairs (yi, yj ) with y; = yj .

Thus S has Zrn distinct pairs, each of them is repeated times. Divide S into '3"

subseü such that each of these subsets contains only identical pain. Let Si: S2.S . . . S2-

denote these subsets.

Now we count the number of ways by which we can construct a balanced function

f : 2: -, Zy for which > 2k. Select k; pairs from each subset Si, i = 1.2, . . Zrn ~ r n

such that ki = k. There is only one way to C ~ O O S ~ ki pairs from Si (as the syrnbols i= 1

within Si are indistinguishable). These k pairs can be ordered in C ( k ; kl k?. ..., h m )

ways. In this case. we do not have the 2k factor (as in Lemma 3.2) because the

two elements within each pair are identical. Now we select k different input values

Xia - Xil, - - 7 Xibl for which xi, 3 xl, # n x for O 5 i, j < k. There are

possible choices for xi,. xi,. , xlk-l. >

For each such choice we assign to each pair

(/(xi,), f (xr , @ Ax)) an element from the k y pairs chosen above. The remaining

2" - 2k output syrnbols of f can be assigned in C(Zn - 2k; Ill 12, ...: 1 2 4 ways where

1; = 2"-* , 2ki .

A similar statement about the upper bounding applies to @,,,(k). 0

The exact number of balanced functions with i\'AxAy = 2 k . Ax # O. A y # O (denoted

by . \ l ,m.ay(k) ) is given by

Proof: Let pi, (cf. equation (2.10)) denote the number of balanced functions with the

property that niaxay 3 2 k The Lemma follows by direct application of the Inclusion-

Exclusion Principle.

Similarly, if we denote the exact number of balanced functions with Nhay = 26. Ax f

0.Ay = O by L'I,.,,~(~) then we have

Using the above results, and by noting that2

we get the following theorem

EquYnn (3.15) simply means Chu chc number of the XOR distribution tables with entries having vdue > ? M X O R is upper

bounded by ihe number of these entries

The fraction of baianced functions with maximum XOR table entry 3 2ik1.yoR,

O 5 M Y O R 5 Y- ' , is upper bounded by

Numencal substitution in the formula above shows that the fraction of baianced s-boxes

with undesirable XOR distribution tables decreases dramaticaily as the number of inputs

increases.

One specid case of interest is for bijective mappings. Le., when n = m. For this case

(see [97] for more details on this special case) we have

It is also straightforward to see that

and

and hence for O < . y o R 5 Y-', we have

To give a numerical example, consider the case where k1.t-QR = n. For n = 1- we

Table 3.1 shows the results of Our simulation3 for 10' randomly chosen 8-bits bijective

mappings. T-Bound refen to our theoretical bound (equations (3.17) and (3.20) ) for

the fraction of bijective mappings with .YOR* = 3 i M a x + y o ~ , or LAT* = 2 M a x r A T .

SBound refers to the bounds obtained from the simulation results. Only the nontrivial

bounds are shown in the table.

Table 3.2 shows the number of bijective mappings with XOR* = 3 M a 1 - ~ - ~ ~ , and

LAT* = 2 M a x L A T obtained from our simulations.

Table 3.1 Experimental Results versus Theoretical Bounds for 10' Randornly Chosen &bit Bijective Mappings

Simulation results in this thesis mrike use of the UNIX C library funaion "rand"

37

Table 3 2 Distribution of the Maximum XOR TabIe Enuy and the Maximum LAT

Entry (Experimental Results for 10' Randomly Chosen Mit Bijective Mappings)

3.4 Counting the Number of Nonlinear Balanced S-boxes

Gordon and Retkin [SOI calculated the probability that any of the output coordinates of a

random, reversible substitution box (Le., a bijective mapping) is an affine function. After

linear cryptanalysis was introduced, it was realized that the cryptographie strength of a

multi-output function depends not only on the strength of its individuai output coordinates

but also on the strength of every nonzero linear combination of these coordinates [go].

In this section. we calculate the probability that any nonzero linear combination of the

output coordinates of a regular s-box is an affine function.

Lemma 3.5

The number of regular n x rn s-boxes (described by the multi-output boolean functions

f : 22 + ZT, n > m) for which the first P output functions are affine is given by

Proof: By noting that every nonzero linear combination of the output coordinates of

regular s-boxes is a balanced function, the first function can be chosen in (Y+' - 2)

ways, which is the total number of balanced affine functions. The second function cm

be chosen from the set of baianced affine functions not including the first one or its

cornpiement, i.e., in (ZR+' - 4) ways. The third one can be chosen from the set of

baianced affine functions not including any linear combination of the first two functions. k-1

Proceeding as above. the first k output functions cm be chosen in 2" (2" - 2') ways. t=O

Since we cm partition the input x into 2' distinct sets, al1 x's in a given set are assigned

the sarne cornmon value of these k bits. We still need to assign the remaining m - P

output bits for each x. Each x within a given set must be assigned a distinct rn - k tuple

of the remaining output bits for Z n - m times,

ways, so the remaining bits can be assigned in

.-ln-'=! each set cm be assigned in -

( ?ri-m t 12m-k

Consider the hnction 0 : 2; + zirn-' constnicted from every nonzero linear

combination of the output coordinates of f . Counting the number of regular s-boxes with

,4/*L # O corresponds to counting the nurnber of functions 8 with no affine coordinates.

The number of ways one can choose 1 coordinates of @ such that k of them are linearly

independent is equivalent to the number of 1 x m binary matrices (without t&ng ihe order

of rows into account, Le., two matrices with the same rows but in different orders are

counted once) with nonzero distinct rows which have rank k. This is given by [132], [49]

w here

Lemma 3.6

The number of ways to construct certain k linearly independent coordinates of @ from

affine functions is also given by R(n, rn, k).

Proof: It is clear that for every k Iinearly independent coordinates of d ,

denoted by oi,. ai,, . - . O,,, one can find ( m - k ) coordinates of @ , denoted by

OZ,+, 0 t l E + 7 - - O:m v such that

where A is an m x m invertible binary mauix. (fi f2 .. .. Jm ) denotes the output

coordinates of f . This means that as f varies over al1 the set of distinct regular s-boxes,

( Q * ~ 4 .. .. i;, ) scans the whole set but in a different order. O

From the lemma above it follows that the number of ways to constmct 1 coordinates of

Q from affine hnctions is given by

Using the inclusion-exclusion principle, the number of regular s-boxes with the property

that one or more of the nonzero linear combinations of its output coordinai-- . . .iffine,

RL(n. m), is given by

RL(n. m ) = (-1)'-1 L I ( m , l . k ) R(n. m. P ) . (3.26)

To express the above count as a fraction of the total number of regular s-boxes , denoted

by F RL(n. m), we divide by the total number of n x m regular s-boxes.

To give a numericai example, for n = 6. m = 4, which is the size of DES s-boxes,

FRL(6 ,4) = 2.46 * 10-16. One can easily get an upper bound for FRL(n, m) by noting

that we have

and hence we have

Since we have n 2 m, then

By noting that

then we have

where O(- ) is the asymptotic upper bound order notation4.

3.5 Conclusion

In this chapter we denved an upper bound on the fraction of balanced functions (s-boxes)

having a specified lower bound on the maximum entry in the XOR distribution table or

the LAT. For reasonably small values of these maximum entries we showed that this

fraction decreases dramaticdly with the number of input variables.

We also cdculated the probability that any nonzero linear combination of the output

coordinates of a regular s-box is affine. Our results shows that the number of balanced

functions with nonzero nonlinearity goes down exponentially (as .??"-Sn/') with the

number of input variables n.

While these results suggest that large cryptographically strong balanced s-boxes may be

obtained by selecting them at random, these statistical results might have some practical

h(n) = O(g(n)) if the= exisrs a positive constani c and a positive integer no such ihu O ' h ( n ) 5 cg(n) for ail n > no.

41

limitations. Describing a large randomly chosen s-box requires a large amount of memory

which might be impractical in some applications. This suggests that other methods such

as algebraic constnictions, which trade-off memory requirements and computational speed

(see [89] for an example of such methods) might offer alternative approaches.

Chapter 4 Linear Approximation of Injective S-boxes

One way to reduce the size of the largest entry in the XOR table. and hence reduce the

risk of differential cryptanalysis, is to use injective substitution boxes (s-boxes) such that

the number of output bits of the s-box is sufficiently larger than the number of input

bits. In this way, it is very likely that the entries in the XOR distribution table of a

randomiy chosen injective s-box will have only small values, making the block cipher

resistant to differential cryptanalysis. Some proposed block ciphers, such as CAST [5]

and Blowfïsh [119], take advantage of this property.

On the other hand. Biham [14] proved that if for an n x m s-box described by f : 2: + Z!:

we have m 2 2" - n, then at least one linear combination of the output bits rnust be

an affine combination of the input bits and the block cipher can be trivially broken by

linear cryptanalysis.

In this chapter, we estimate the size of the largest entry in the LAT rmdomly

selected injective s-box. We also present two methods for constnicting highly nonlinear

s-boxes. Finally we show how the resistance of a CAST-like encryption aigorithm to the

basic linear cryptandysis was underestimated in [75].

4.1 Definitions and Notation

The function g : -y -t kW is injective (or one-to-one) if for x . i in S the equality

g ( x ) = g(i) implies z = 2 . That is. distinct elements of X can not have the same

image in Y [62].

From the definitions of nonlinearity and LAT, it is clear that the nonlinearity of the

function f : 2; -, Zy is given by

iVLl = 2"-' - mâx (LAT(a, b)(, a, b

Throughout the rest of this chapter, the b = O case is not taken into consideration as it

does not have any cryptographic significance.

It is easy to see that the number of injective n x rn s-boxes is given by

In(n. rn) = n (Zm - i). Throughout this chapter, let

f : Z!: -, Zy describe a random injective mapping,

w t ( a ) denote the Hamrning weight of the binary vector a,

Pw(k) = P { w t ( b -1) = k},

P;"~~)(L) = P { d ( a - xl b f) = l } , b # O . Pdlu,( l ,k) = P { d ( a - x . b . f) = 1 1 w t ( b - f) = k}.

4.2 Linear Approximation Table of injective Mappings

We first determine the probability distribution of the weight of b . f:

Proof: For b # 0, wt (b f ) follows the hypergeometnc distribution (201. This follows

by noting that the weight of b f has the sarne distribution as that of the function

constmcted by randomly choosing 2" bits from the function b - R where K : Zy + Z F

is an arbitrary bijective mapping. O

Given the weight of b . f. the distribution of the distance between a . x and b f cm

be detennined :

Lemma 4.2

For a # O, we have

Proop Let M i j = #{XE 2; l a - x = 2.b. f ( x ) = j } , i ? j E Z2. Then we have

d(a x. b - f ) = Mlo + Mol . Since wt(b - f ) = k, we have Mol + 1%' = k. We also

have Mie + 11lI1 = 2"-' as a x is a balanced function. Using these equations,

we get the lemma. O Combining the two results above. the following theorem gives the probability distribution

of the distance between a x and b f for fixed a and b. This can be used to find the

probability that a given entry in the LAT has a particular value, - 1.

Theorem 4.1

proof: The a = O case follows because the distance between any function f and the

zero function is the weight of f . For a # 0, the result holds because

For any integer value L&,~T. O c - \ . i~~= 5 and by denoting the number of

LATs with any entry having absolute value 2 M L A ~ by iviATl we have the following

upper bound:

Corollary 4.1

Proof.. We have

LAT(a, b) = Y-' - d ( a S x . b - f).

From Theorem 4.1 , and by noting that

N i ~ T = P{ (max I L A T ( ~ . ~ ) ~ ) 2 l k f r r ~ In(n. m ) a,b#o

we get the Corollary above.

Take the minimum value of :CfLAT for which 1z~:;'7) 5 0.5 as an estimate for the

expected value of the maximum LAT entry, LAT*. Table 4.1 shows the simulation

results for LAT*, the average value of the maximum entry in the LAT of a randomly

selected 8 x m injective s-box (m = 10,12, ..., 32) together with Our theoretical estimate,

denoted by LAT". Table 4.1 also shows the maximum and the minimum values for

LAT* found throughout the experiments. For ail the results given in table 4.1, the

sample variance is upper bounded by 2. The Fast Waish transform [6] and other speed-

up techniques were used in the calculation of the maximum LAT entries throughout Our

simulation. It is worth noting that for m = 32, four of the s-boxes out of the ten tested

achieved a nonlineaxity of 73, while the nonlinearity of the remaining six s-boxes was 77.

- -

Table 4.1 Experïrnental Results for 8 x m Injective S-boxes

4.3 Construction of Highly Nonlinear Injective S-boxes

In this section we present two methods for constmcting highly nonlinear injective s-

boxes. Both of these methods, which are motivated by bounds on exponential sums,

outperform the previously proposed methods. In particular, we are able to obtain injective

S x 32 s-boxes with nonlinearity equal to S0 and maximum XOR -- :..: entry of 2. We

also re-evaluate the resistance of the CAST encryption aigorithm to the basic linear

cryptanalysis.

4.3.1 Definitions

Trace: The sum

is called the trace of a E GF(2m).

Basis und duul basis: A set {a,, a,' . , am} of rn elements over GF(2*) which are

linearly independent over GF(2) is called a basis for GF(2") over G F ( 2 ) .

The corresponding duai bais is defined to be the unique set of elements

{ T ~ , 72, . . - . ym} C G F ( P ) such that -

Lemma 4.3 (Cadi& and Uchiyama bound [25l)

If F ( x ) is a polynornial over GF(2") of degree r such that F ( x ) # G(x)? + G(x) + b

for al1 polynomiais G(x) over GF(Zm ) and constants b E GF(2m ), then

Lemma 4.4 (Kloostermun sum [137], [25])

Note that a function, F, over GF(2") c m also be expressed as a function over C F ( ? ) " ,

i.e., as n functions over GF(2). Let

be a function over GF(2)" , let B = {cr,,a,,-.an) be any basis of G F ( Z n ) over

G F ( 2 ) , then

n where x = riai E GF(2") . This means that there is a one-to-one correspondence

i= 1 between the functions of G F ( 2 " ) and those o f GF(2)" under a chosen basis of G F ( Y )

over GF(2) . If we let {a;, . . . , a:} be the dual bais of B. then each component of

f ( x l , - . . x n ) can be expressed as

where x = C z i a f . t= 1

The nonlinearity of the function f(ri .-s,) = (fi(xl.---rn).---. fm(xl.---I,)) is

defined as the minimum Hamming distance between the set of affine functions and every

nonzero Iinear combination of the output coordinates of f .

From the above, one can easily prove that the nonlinearity of the function f that

corresponds to the function F is given by

:kpLf = min ~ ( T ~ ( C F ( ~ ) ) . T r ( w S ) ~ ; b ) cf0.b.w

where c.w E G F ( Y ) , b E GF(2) .

4.3.2 Construction Method 1'

This construction method is based on the observation that highly nonlinear injective

s-boxes may be obtained by adding the coordinate functions of highly nonlinear bijective

s-boxes.

Given the distinct bijective functions F* over GF(2") , 1 5 i 5 iM, an injective function

G over GF (2"") can be obtained by setting

In this method we use the inversion mapping proposed by Nyberg [91]

where x E GF(2") .

Using lemma 4.4, the nonlinearity of the function F, is lower bounded by -

Expenrnental results show that injective 8 x 16 s-boxes constnicted by this method always

have nonlinearity of 96. For 8 x 24 and 100 random choices of ai pairs, we found 57.40

' The experimental results in ttiis part were obtained by the second auchor in (14 I l

49

and 3 s-boxes with JV-L = 86.84 and Y0 respectively. The only S x 32 s-box tested to

date has nonlinearity of 76.

Conjecîure

The nonlinearity of the function g that corresponds to the function

obtained using construction method 1, is bounded by

We venfied this conjecture experimentally for n 5 10 where we found that th:. 5ound

is tight for even n.

Remrk: Wc may prove that NC, 22"-' - (2"12+' + 1) by provmg that for o # 0

we have

but this seems to be a hard problem.

4.33 Construction Method II

This method is aiso based on the observation that highly nonlinear injective s-boxes

may be obtained by adding the coordinate hnctions of highly nonlinear s-boxes (not

necessary bijective).

If the concatenated hnctions are distinct polynornials over G F ( 2 " ) such that

for dl polynornials U(x) over GF(2") and constants ai E GF(2")\{0}. b? w E GF(2")

then the Carlitz and Uchiyama bound can be used to provide a lower bound for the

nonlinearity of the resulting s-box as follows.

If r = max (degree (F , ) ) then the nonlinearity of the resulung function is lower bounded t

Using the Carlitz and Uchiyama bound one cm check that the nonlinearity of the function

F(x) = x3. x E GF(2') is lower bounded by 112. Also the nonlinearity of the function

G(x) = (x311x5). x E GF ( z ~ ) is lower bounded by 96.

Our basic rezult is based on the experimental observation that the Carlitz and Uchiyama

bound is not tight for higher values of r. In this section we consider the following

five constructions

Gi = (x511x'11x1' 1 1 ~ ' ~ ) ~

G2 = (x3llx'llx" 11x13),

GB = (x3 1 1 ~ ~ ( 1 ~ ~ ~ llx13)

G4 =

G5 = (x3 [lx5 [ (x i l(xl').

Using Carlitz and Uchiyama bound we have

Experimental results shows that

Let Fi = x3. x5. x i : XI'. x ' ~ for i = 1 .2 .3 ,4 ,5 respectively. By noting that Fi is

bijective for i = 3 , 4 5 then any construction that includes any of Fi, i = 3 . 4 or 5 will

be injective. Ln fact, it is not hard to see that al1 the s-boxes below are injective.

Table 4.2 below shows the nonlinearity and the maximum XOR table entry, XOR*, for

the injective S x 16 s-boxes constructed by this method by concatenating Fi with F,.

TabIe 4.3 shows similar results for the 8 x 24 s-boxes.

Table 4 3 ,firC, SOR' for 8 x 24 S-boxes Obtained Using Consuuction Meihod I I

For the 8 x 32 s-boxes. SOR* = 2 for ail the constructions above except for Gl where

it is equal to 4 which may lirnit the usefulness of Gis

Table 4.4 shows the best s-box nonlinearity obtained by different methods.

*

Table 4 3 J ~ L . SOR' for 8 x 16 S-boxes Obtained Using Construction Method II

2.5

96

4

4.5 1

96

4

3.3

88

4

Table 4.4 Best S-box Nonlinearity Obtained By Different Consuucrion Meihods

2.3

96

4

3.5

96

4

k

Random

Misrer and Adam [9/

Method I

Merhod II

The construction methods proposed in this section c m be extended to other highly

nonlinear rnappings such as that proposed in [26],[91].

1.3

80

2

2.4

96

4

1.3

96

2

1.1 1

.%-f

.Y0 R'

In order to frustrate possible algebraic attacks, the four 8 x 32 s-boxes should be generated

1.5

80

2

1.2

96

2

8 x 16

87

-

96

96

using different imeducible polynomials. When using s-boxes obtained from construction

method II, we recommend that the bytes XORed together should correspond to different

8 x 24

80

- 86

96

degrees. For example, if G5 is used. then the 8 x 32 s-boxes may be constructed as follows S 1 = (x3 ~ ~ x 5 ~ ~ x 7 1lx1l),

S- = (x5 11xillxl1 1(x3),

53 = (xi llxl1 11x3 1 1 x 5 ) , sq = ( X ~ ~ I I X ~ I I X ' ~ J X ~ ) ,

such that the exponents forrn a Latin square [113].

8 x 32

73

74 1

7é I

60

43.4 Comments on the Security of the CAST Encryption Algorithm

& Figure 4.1 CAST Round Function

Figure 4.1 shows the CAST round function. In this paper we assume that operations

a, b, c and d are XOR addition of 32-bit quantities.

The resistance of CAST-like encryption algorithms [2] constructed using randomly

generated s-boxes against the basic linear cryptanalysis [82] was snidied in [75]. The

number of known plaintexts, X', in a basic linear attack (Algorithm 1 in [82]) required

to give a 97.7% confidence of getting the right key bit is approximately given by

where 7 is the number of s-boxes involved in the R-round linear approximation 2"-1 -Ncs

expression, I p , - $ 1 = 2" and n is the number of input bits to the s-box.

The bounds for Np can be improved by re-evaluating the expressions in [75] using

the new 8 x 32 s-boxes nonlinearity. However a better bound can be obtained by

considenng the nonlinearity of the resulting 32 x 32 s-boxes. The nonlinearity, JL-&

can be bounded using the nonlinearity, ADf ,, . 1 5 i < 4, of the four 8 x 3'2 s-boxes

used in its construction as follows 4

- . r = l

The exact nonlinearity can be efficiently calculated using the Wai

(4.33)

sh transforms [6] of

the four 8 x 32 s-boxes. Since an R-round linear approximation must involve at least as

many s-boxes as R / 2 iterations of the best --round approximation, the number of 32 x 32

s-boxes involved in an R-round linear approximation is at least R / 2 and hence we have

where N t s is the nonlinearity of the 32 x 32 s-box.

An important observation, which was overlooked in [87] is that the nonlinearity of the

32 x 32 s-box depends not only on the nonlinearity of 8 x 33 s-boxes used in the

construction, but it also depends on how the four 8 x 32 s-boxes interact topther. This

means that improving the nonlinearity of the individual 8 x 32 s-boxes dm, not always

guarantee improving the resistance of the cipher to the basic linear cryptanalysis. For

example, when combining the output of the four CAST-128 s-boxes [2] (each with

nonlinearity 74) by XOR, the resulting 32 x 32 s-box has nonlinearity 2.132.774

which is less than 2,133,721,088, the nonlinearity we found by combining

randomiy selected s-boxes with nonlinearity less than 74. Using such s-boxes

have Np zz 272 > 264 for R = 8 which is much higher than Np = 234 estimated in

912

four

we

751.

Finally, one should note that the prirnary motive for this work is obtaining highly

nonlinear injective s-boxes. We are not proposing the use of such s-boxes in CAST-like

ciphers before examining their other cryptographic properties. In fact, we believe that

randornly selected s-boxes are a good choice for CAST-like ciphen2.

O' Connor and Kiapper [IO01 showed that che expectcd degrte of the Algebraic N o d Form of a mdomly selected boolean

huiction j : 2; - Z2 is quai io n - ; + 0 (&). This suggots iha randomly selecicd 8 x 32 s-boxes are more misrnt to higher

order differential cryptyialysis rhui CAST-128 s-boxes.

Chapter Properti

5 Information Leakage and Spectral es of Boolean Functions

5.1 Introduction

Several cryptographic critena have been previously proposed as a measure of the strength

of cryptographic functions. Among these criteria are balance, correiation imrnunity [126],

resiliency [28], nonlinearity [84], Strict Avalanche Criterion (SAC) [136], higher order

SAC [47], Propagation Criterion (PC), higher order PC [105], Bit Independence Criterion

[ 1351, and Completeness 1661.

nie above set of cryptographic critena are not independent of each other and

a cryptographic function that satisfies al1 these criteria would be a golden one.

Unfortunately, it can be proven that no hinction can satisfy al1 the above set of criteria

simultaneously. This can be considered as the main motive for proposing a new set of

criteria based on information theory.

Several design criteria, based on information theory, have been proposed in [37],[48],

11301, and [ M l .

Information Leakage cm be classified into two classes: Static information leakage and

dynamic information leakage. It is argued in [155] that a boolean hinction is resistant

to statistical analysis (e.g., differential cryptanalysis [15], linear cryptanalysis [S2], and

Siegenthaler's correlation attack [125]) if there is no significant static and dynamic

information leakage between its inputs and outputs.

It is worth noting that Brynielsson [24] gives an approximate expression for the expected

value of the mutud information between the output and input subvectors for multi-

output boolean functions. Here we follow the definition of information leakage given

in [155] and give an exact expression for the expected value of different forms of

these information leakages for a randomly selected multi-output boolean function and for

some other cornbinatorial structures of interest such as regular mappings, and injective

mappings.

Gordon and Retkin [50] conjectured that good substitution boxes (s-boxes) may be built

by choosing a random reversible mapping of suficient size. Their argument is based

on the observation that the probability of accidental linearity occumng in such s-boxes

decreases dramatically as the size of the s-box increases. In this chapter, we provide

further evidence that bigger s-boxes (by bigger we mean s-boxes with a Iarger number

of inputs) are better by showing that the expected value of information leakage of a

randomly selected boolean function decreases rapidly with the number of input variabies.

We also give extended definitions of many cryptographic cnteria such as such as balance.

correlation irnrnunity, SAC, higher order SAC, and Propagation Criterion to multi-ouput

boolean functions.

Finally we study the relationship between the Walsh-Hadamard transform and various

types of information leakage. Conditions on the Walsh transform are given. which imply

that the function satisfies certain cryptographic properties of interest.

5.2 Definitions

Entropy and Mutual Information

Entropy

Let X be a discrete random variable with alphabet X and probability distribution function

p(x) = Pr {X = x}, z E X then the entropy of a random variable -4' is defined by

If .Y is a binary random variable with Pr(x = 0) = p, then

where h ( p ) is called the binary entropy function of probability p.

Conditional Enîropy

Let Y and X be discrete random variables with alphabet Y . X respectively, and with a

conditional distribution p( y Ir) then the conditional entropy H(Y1.Y) is defined by

where

Mutual Information

The mutual information of discrete random variables Y and X is defined by

For a good general reference for information theory, see [30].

Throughout this chapter. let f : 2; -+ Z y be a randomly selected boolean function

and let Y denote the output of f .

Sfatic In formation Leakage

The static information leakage of Y . given input subvector Xt E 2: (Le., given that we

know k bits of the n-bit input vector) , is defined by

S L ( Y I X k ) = m - H(YIXt),

where H(Y 1 Xk) is the conditional entropy of Y given Xk.

57

Remark: It is easy to show that

where I(Y: Xr ) is the munial information between Y and Xk. Note that if the mutual

information I ( Y ; Xk) is used to define the static information leakage. then the minimum

of I(Y: Xk) can be achieved while H ( Y ) = O which contradicts our objective.

The self static information leakage of Y is defined as:

By noting that H(YIXk) 5 n - k, then from (5.6) we have

One should note that the value of ail the forms of information leakage defined above is - *

greater than or equal to zero. Le., we always have S S L ( Y ) 2 0, and 2. .. - 0.

Dynamic In formation Leakage

The dynamic information leakage of AY, given the input change vector AX is defined

by :

DL(AY1AX) = m - H(AYIAX), (5. IO)

where AY = f (X) 63 f ( X @ AX).

We also note that the static information leakage %(Y 1 Xk) = O is achieved by Ph order

resilient functions (see [28], [126] for the definition and propenies of resilient functions),

while zero dynamic information leakage for al1 values of AX # O is achieved oniy by

perfect nonlinear functions (see [go], [Il61 for the definition and properties of both bent

and perfect nonlinear functions).

By noting diat we aiways have H(AYl0) = O. then from (5.10) we have

The equaiity in the above bound can be achieved only by perfect nonlinear functions

for which n 2 ?m.

Let

Assuming that al1 input vectors are equally probable, we have:

The problem of finding the expected values of the above forms of information leakage

is now reduced to finding the marginal probability distribution of the random variables

X,, ,V&, and iVnxay.

5.3 Information Leakage of a Randomly Selected Boolean Function

Lemma 5.1

Let Y be the output of a randomly selected boolean function f : Za + Z? then we "

have the following probabilities:

Proof: The proof of the above lemma ~OIIOWS by noting that NY, Ni,, and iVaxar/2

follow the multi-nomial distribution. O

Theorem 5.1

The expected values of the static and dynamic information leakage of a randomly selected

boolean function f : Z!: + 21" are given respectively by

Proofi Theorem 5.1 follows directly from the definition of the expected value, and (for

part 2) by noting that AY = O for AX = 0, and hence H(AY 1 0 ) = 0. O Figures 5.2 and 5.3 show the expected value of the self static information leakage and

the expected value of the static leakage given that half the input bits are known. From

these graphs , it is clear that the relative dimensions of the boolean functions (i.e., the

ratio between n. m ) greatly affect different foms of information leakage.

Based on the results above, one cm not conclude that s-boxes with n > m are better than

s-boxes with n < m because of the method we used in the normalization step (dividing by

the number of output bits to get information leakage per output bit). Moreover, s-boxes

with n < rn provide better diffusion characteristics, and may be used in SPNs with no

permutation layers [3] which leads to faster software implementation. The conclusion

that we c m make at this time is that ail foms of information leakage seem to decrease

with the number of input variables.

Using theorem 5.1, one can denve an upper bound for the information Ieakage of

a randornly selected single output boolean function. Single output functions are of

practical interest especially for the combining functions in Stream ciphers.

t

Figure 5.1 Lower Bound for the Binary Enuopy Function

CorolhTy 5.1

Let Y be the output of a randomly selected single output boolean function, then the

expected values of both the static leakage and dynamic leakage are bounded by

Proof: The above corollary follows by direct substitution into theorem 5.1 and by noting

that for the binary entropy function

h ( t ) = - t l o g 2 ( t ) - (1 - t)logl(l - t ) , (5.17)

we bave (see Figure 5.1 )

Figure 5 2 Expected Value of S S L ( Y ) for a Randomly Selectcd Boolean Function

1 Lower bound I

2 4 6 8 10 12 14 16 18 UI Number Of Inputs (nb

Figure 5.3 Expected Value of S S L ( Y ) and Expected Value

of S L ( Y 1 X,,,) for a Randornly Selected Boolean Function

5.4 Information Leakage of a Randomly Selected Balanced Boolean Function

In this section, we calculate the expected values of both the dynamic leakage and the

static leakage for regular (balanced) functions.

Let Y be a randomly selected balanced hnction f : 2; -, 2.7, n 2 m. then we have

Zn! where B(n. rn) = (2,-,!12m is the number of n x m balanced boolean functions.

Proof: For n x rn balanced boolean functions, we have N,, = 2"-? If we fix k input

variables, then there are (Y;*) (Y-?"-&) ?"-,,,-* ways of arranging the output such that Y = y

when Xk = x for i times. The remaining (2" - 2"-*)outputs , of which tL -- 2re only ( y - y - m ) !

- 1) distinct ones, c m be permuted in (2,-,!) i 2 m - i , ways. O

Corollary 5.2

Let Y be a randornly selected bijective mapping r : 2; -, 21 then the expected value of

the static information leakage of Y given the input subvector Xk. O 5 k 5 n, is given by

Proof: The proof follows directly by substiniting (5.19), with n = m, into the first part

of (5.15). A simpler proof (independent of Iernma 5.2 ) follows by noting that for any

arbitrary bijective function, r : 2; -t 2;. if we fix A- input bits, we will have 2"-'

different output symbols with H(Y 1 Xk) = n - k. O

From the proof of the lemma above, it follows that we have

for any arbitrary bijective fimction.

In chapter 3, we derived expressions for the number of balanced functions with

-YhAy = Zi. O 5 i 5 Using these results we have for Ax # O

where -4 ,,,, ay (i), A ,,,, (i) are given by equations (3.13) and (3.14).

Thus the expected value of the dynamic leakage of a randornly selected baianced function

is given by

Theorem 5.3

DL(AY 1 AX) = m

Corollary 5.2

Let Y be the output of a randornly selected bijective mapping r : 2: + 2:: then

the expected value

AX, is given by

of the dynamic information leakage given the input change vector

Proof: The corollary above is a special case (with n = m) of theorem (5.3). O

65

Remark: Note that = O as each output symbol occurs once.

For m = n. the minimum dynamic information leakage is achieved by hinctions with

differentially 2-uniform XOR table. Hence. using equation (5.13), we have

"

n - l

Figure 5.4 shows a cornparison between the expected value of dynamic information

leakage of a randomiy chosen n x n bijective mapping and that or' a randomly chosen

function of the sarne dimensions.

2 4 6 8 IO Number Of inputs (n)

Figure 5.4 Expecced Value of D L(AY 1 W ) for an n x n Random Mapping and an n x n Random Bijective Mapping

5.5 Information Leakage of a Randomly Selected Injective Boolean Function

In this section, we calculate the expected values of both the dynamic leakage and the

static leakage for injective functions.

Theorem 5.3

Let Y be the output of a randomly selected injective hinction f : 2.: + Zy. - n l m . t h e n

Proufi The theorem follows by noting that for any arbitrary injective function, f :

2; -, 2" if we fix k input bits, we have Y-' different

H(Y 1 Xk) = n - k.

From the proof of the lemma above, it follows that we have

3L(Y i Xk) = ( m - n ) + k for any arbitrary injective function.

Lemma 5.3

The number of injective hinctions

bounded by

output symbols with

O

with ibxay > 2k .Ax # O. Ay # O. is upper

w here

Proufi We will count the number of ways to construct an injective function such that

!YAxAy 2 %k, Ax # O, Ay # O. There are ('y1) possible choices for the x positions

of these k pairs and there are choices for such pairs. We also have 2 possible

orders for each pair, giving 2' total possible orders. The rest of the output syrnbols can

be picked up in any order from the remaining Zrn - 2k unused symbols, Le., they c m be

picked in In(?" - 2 k . Y - 2 k ) ways. 0 Using the Inclusion-Exclusion principle, we get

Lemma 5.4

The exact number of injective functions with iVkAy = 2k, Ax # O. Ay # O, is given by

Remark: Note that :Là, 0 = O for Ax # 0, and hence An,,,o = 0.

Theorem 5.4

D L ( A Y 1 A X ) =

where I 7 ~ ( 2 ~ , Zn), the number of n x rn injective boolean functions, is given by equation

(5.29).

Numericd substitution into the theorem (5.4) shows that the dynarnic information leakage

of a randornly selected injective function decreases with the number of input variables.

This rate of decrease is very similar to that of a randomly selected boolean function with

the same number of inputs and outputs, especiaily for n << m. This cm be explained

by noting that for n <<< rn, a randomly selected function is most likely to be injective.

5.6.1 Relation Between the Walsh Transform and Different Forms of Information

Leakage

Let y be the output of a boolean function f : Z," -. Zy. Let .\;, and -Yhay

be as defined in equation (5.12).

R e d that

Thus for w = O we have

Multiplying both sides by (-1 and surnrning over c we get

By noting that

we get

3m, f ( x ) =y.

C O! ot herwise,

and hence

If wi denotes the vector with 1 in position i and zeros othenvise, then

Y / ? F , ( w ~ ) = x ( N ~ ~ - ~ i ~ ) ( - l ) ~ * ~ , (5.38) Y

where Nby denotes the number of times the symbol y appean. and xi = 6. Thus we have

We also have

Adding and subtracting equations (5.39) and (5.40) we get

Sirnilarly we can prove that

where w0 denotes the n-dimensional vector obtained by completing the k-dimensional

subvector w with zeros. For exarnple if n = 6 . x = { x ~ . x ~ . z ~ } then w0 =

{W0. o. W?. o. o. w5}.

Recall that the autocorrelation functicn is related to the Walsh transform as follows:

c- Ay Multiplying both sides by (-1) and sumrning over c we get

The left hand side of the equation above can be expressed as

C n c . f ( w ( - i ) c-ny

C

By noting that

2". f (x) 9 f (x 3 Ax) = ny. = { 0. ofherwise,

we get

C

- - 2'-#{xl f(x) 8 f(x$ Ax) = Ay}

- .>m-n - - ~ V A ~ A ~ ,

and hence

Using the above results we get the following theorem:

Let y be the output of a boolean function j : ZT - Zy then the different foms of

information leakage of y cm be expressed as:

where -Vy. A&. Nha, are given by

Theorem (5.5) expresses the static and dynamic information leakage of a multi-output

boolean function in terrns of the Walsh transform of the nonzero linear combinations

of its output coordinates. It is worth noting that static leakage of order k. SL(y lxk )

depends on every Walsh coefficient FC(w) for c E Zm. wt!w) 5 k, and by noting that

SL(y lxk+, ) 2 SL(y lxk) this implies that the Walsh coefficients for w with smaller

Harnming weight have more impact on the static information leakage. It is also clear

that minirnizing the dynamic information leakage can be achieved by having the energy

spectrum F;(W) as flat as possible. Similar results for single output boolean functions

were reported by Forré [48].

5.6.1.1 Extended Defini tions

In this section we consider some extended definitions for sorne other cryptographic

criteria to the multi-output boolean function f : 2% + 27.

As an illustration for how one c m extend the definitions of cryptographic criteria, consider

the Strict Avalanche Criterion (SAC) defined for single-output boolean function as :

A function f : 2; -, Z2 satisfies the SAC if whenever a single input bit is complemented.

the output bit changes with a probability of one haif [136].

The natural extended definition of SAC for multi-output boolean functions would be:

A function f : 22 + Z a satisfies SAC if whenever a single input bit is complemented,

1 the resulting output change vector occurs with a probability p = F .

The sarne concept can be applied to other cryptographic critena such as balance,

Propagation Criterion [los], correlation immunity [126,140], higher order SAC, and

higher order Propagation Cnterion [ 1051.

The following is a summary of these extended definitions.

A balanced hnction is defined as a fûnction for which iVy =

A k resilient function is a function for which Nk, = 2n-k-m for all x E z:.

A SAC-fullfilling function is a function for which we have NAxay = I n - m for al1

AX with w t ( A x ) = 1.

A function satisfies propagation criterion of degree k (PC-k) if !VA,Ay = for

al1 Ax with wt(Ax) 5 k.

Using the above extended definitions, let criterion "C" be any of the following : balance,

correlation immunity , Strict Avalanche Cnterion (SAC) , higher order SAC , Propagation

Cntenon (PC), higher order PC , or perfect nonlinearity. Then we have

Theorem 5.6

If y is the output of a multi-output boolean function then y satisfies criterion C if and

only if every nonzero linear combination of its output coordinates satisfies criterion C.

Proof: For balanced functions we have Ny = . By noting that Fo(0) = 2n/2,

then we have

= p n - m + .-)n/?-m -

and hence

As the equation above should be satisfied for ail y E Z!: then we must have Fc(0) = 0

for al1 c # O which impiies that the function c f: c # O is a balanced function.

Similarly for k-resilient functions we have

?n-ls-rn - 3n/?-m-k .v*y = - - Y C F c ( W U ) ( - 1 ) w.x+c.y

c.w

By noting that

then the following equation must be satisfied for al1 y E Zy

This can be achieved if and only if

In other words, we must have F,(w) = O for k >_ wt(w) >_ 1.

For SAC fulfilling functions we have for wt(Ax) = 1

Substituting with the value of Fo(w) = 2"j26(0) we get

As the above equation should hold for al1 Ay E then we must have

This means that a multi-ouput function fulfills SAC if and only if every non-zero linear

combination of its output coordinates fulfills SAC.

The rest of the theorem foiIows using a similar argument.

5.7 Conclusion

Many of the previously known cryptographic criteria are related to information leakage.

Most of these criteria require zero information leakage in some domain. However, they

often constrain the function to such an extent that large information leakage of other

types become likely. These leakages provide usehl information for the cryptanalyst to

develop attacks on the cipher. This motivates the minimization of information leakage

as a general criterion for cryptographic functions.

We have derived expressions for the expected values of the static and dynamic information

leakage of randomly selected boolean functions and for randomly selected bdanced, and

injective boolean functions. Based on this we showed that the expected values of the

information leaicages decrease dramatically with the number of input variables. In sorne

cases, we showed that this decrease is exponential. With the same approach developed in

this chapter, one can show that the variance of different forms of information leakage also

decreases drarnatically with the number of input variables. In fact, one c m dso show that

the expected maximum value of different foms of information leakage decrease with the

number of input variables. This indicates that cryptographically strong boolean functions

may be obtained by choosing random mappings of sufficiently large dimensions.

The generalized result in theorem (5.6) confims that the cryptographie strength of a multi-

output boolean function depends on the strength of every nonzero linear combination

of its output coordinates.

Chapter 6 A New Class of Substitution- Permutation Networks

6.1 Introduction

Feistel [44] was the first to suggest that a basic substitution-permutation network (SPN)

consisting of iterative rounds of nonlinear substitutions (s-boxes) connected by bit

permutations was a simple, effective implementation of a private-key block cipher.

The SPN structure is directly based on Shannon's pnnciple of a rnixing transformation

using the concepts of "confusion" and "difision" [123]. Lening N represent the block

size of a basic SPN consisting of R rounds of ;2 x n s-boxes, a simple example of an

SPN with N = 16. n = 4, and R = 3 is illustrated in Figure 6.1 . Keying the network

can be accomplished by XORing the key bits with the data bits before each round of

substitution and after the last round. The key bits associated with each round are denved

from the master key according to the key scheduling algorithm.

Figure 6.1 SPN with IV = 16, n = 4, and R = 3.

One advantage of the basic SPN mode1 is that it is a simple, yet elegant, structure for

which it is generaily possible to prove security properties. Indeed, it has been shown

that a basic SPN c m be constmcted to possess good cryptographic properties such as

completeness or nondegeneracy [66], adherence to the avalanche criterion 1611, and

resistance to differential and linear cryptandysis [60].

The basic SPN architecture differs from a DES-like architecture in which the substitutions

and permutations, used as a mixing transformation, operate on only half of the block at

a time. Since SPNs do not have this iast property, in general, SPNs need two different

modules for the encryption and the decryption operations. In an SPN, decryption is

performed by running the data backwards through the inverse network (i.e., applying

the key scheduling algorithm in reverse and using the inverse s-boxes and the inverse

permutation layer). In a DES-like cipher, the inverse s-boxes and inverse permutation

are not required. Hence, a practical disadvantage of the basic SPN architecture compared

with the DES-like architecture is that both the s-boxes and their inverses must be located

in the sarne encryption hardware or software. The resulting extra rnemory or power

consumption requirements may render this solution less attractive in some situations

especially for hardware implementations.

One proposal to overcome this problem is to use a single s-box and its inverse for both the

encryption and the decryption. This idea was employed in SAFER [79]. Unfominately, in

SAFER, the encryption and the decryption are different and one still needs two different

hardware modules.

In this chapter, we introduce a special class of substitution-permutation networks. This

class has the advantage that the sarne network c m be used to perform both the encryption

and the decryption operations. The basic idea is to use involution substitution layers and

involution permutation layers or linear transformations. We investigate the resistance of

these networks to both differentiai and linear cryptanalysis: it is shown that using an

appropnate linear transformation between rounds is effective in improving the security

of the SPNs in relation to these two attacks. Further results suggest that the cyclic

properties of the overall network are not negatively influenced by the cyclic propenies

of the involution s-boxes. As well, a key scheduling algorithm is proposed that has

the advantages of preventing weak keys and ensunng that, given that key bits in a

particular round are cornpromised, it is hard to get any information about the key bits

of other rounds.

2.1 Semi-Involution Functions

It is possible to construct SPNs which do not require inverse s-boxes if the s-boxes in the

network belong to the class of functions that we refer to as serni-involution functions.

Such hinctions have the property that their inverses can be easily obtained by a simple

XOR operation on the function input and output. Hence, differences between the s-

boxes in the encryption network and the decryption network can be accommodated by

incorporating the XOR into the application of the round key bits.

DefinitTon: A bijective function r : Z! -, Zr: is called a semi-involution function if

for some constants a. b E Z;.

Involution functions are the sub-cIass of serni-involution functions for which a = b = 0.

Lemma 6.1

A semi-involution function as defined above has a 3 b as a linear structure.

Proof: Let y = a-'(x) and, from (6.1), we have y $ b = n ( x @ a). Therefore,

x = T-'(~ @ b) @ a. Hence, ~ ( y ) = T-'(~ @ b) @ a. NOW replacing y 5 b with x

gives ~ ( x $ b) = 1;-'(x) $ a. From (6.l), r ( x $ a) $ b = ~ ( x $ b) $ a. Replacing

x with x $ b gives

n ( x Bi a @ b) = ~ ( x ) @ a $ b,

which is the definition of a linear structure [43], [84].

79

Thus a semi-involution function has A h a y = 'Ln where is the XOR difference

distribution table enuy[lS] for input Ax = a @ b and Ay = a @ b. For a 2 b # O this

renders the SPN tnvially broken by differential cryptandysis. This means that. if we want

to use the same SPN for both the encryption and decryption, then only semi-involution

s-boxes with a = b can be used.

The following Iernma shows how the usefui ciass of semi-involution functions can be

obtained from involution functions.

Lemma 6.2

Let 6 : 21" + Z? be an involution function, then the function r ( x ) = d(x) 6 a is a

semi-involution function such that a = b, i.e., r-'(x) = T(X $ a) 5 a.

Proof: From the definition of involution functions, b2(x) = x. Hence, ~ ( ; r ( x ) a)+a =

x. Replacing x with x @ a gives a(x $ a) $ a = T - ' ( x ) . 0

Lemma (6.2) is important, not only because it provides an easy way to generate the

useful class of semi-involution functions frorn involution functions, but also because it

implies that the functions d(x) and n ( x ) belong to the same cryptographic class and

hence they have the sarne linear approximation table [82], and the sarne XOR difference

distribution table II 51.

The only cryptographic difference between involution s-boxes and semi-involution s-

boxes with a = b, a # O, is their cyclic properties. Al1 cycles of involution functions

have length one or two. In SPNs where the key bits are XORed with the data bits at

the s-box input, if we assume that d l the key bits are equi-probable, then both the SPNs

built using semi-involution s-boxes with a = b # O and the SPNs built using involution

s-boxes will have the same cryptographic properties. In the rest of the chapter we will

focus on the ciass of SPNs that use involution s-boxes.

An interesting class of involution mappings is the inversion mapping in G F ( 9 " ) defined

as 1911:

Different cryptographic properties of this mapping were studied in [91]. This inversion

mapping is differentially 2-unifom if n is odd and it is differentially 4-uniform if n is

even. The nonlinearity of this mapping is given by AfL(*) 2 Y-' - sn/' - .

The above class of s-boxes can be generated using different irreducible polynomials. The

number of rnonic polynornials of degree n which are irreducible over GF(q) , where q

is any prime power, is given by [Il], 1781:

where ~ ( d ) is the Mobius function given by

,cl = 1 r ( d ) { i-1). , ci ir a prociuct of i distinct primes

, otherwise.

For n = 8, we have 30 irreducible polynornials of degree 8 and hence we can generate

30 such s-boxes. Al1 these 30 s-boxes have nonlinearity equal to 112 and maximum

XOR table entry equal to 4. In order to fmstrate possible algebraic attacks, the SPN

should use s-boxes generated using different irreducible polynomials. Another approach

is to use randomly generated s-boxes so that the overail cipher would not have any easy

algebraic description. In section 2.3 we study some of the cryptographic properties of

such randomly generated involution s-boxes.

The number of involution fmctions s : 2; 4 21 is given by

Proof: An involution function c m only have an even number of fixed points1. There

are (:), O 5 i 5 Y-' ways to speciS any of these 2i fixed points. Note also that

an involution function with 2i fixed points rnust have Y-' - z cycles2 of length 2. An

involution function is completely defined by specifying its fixed points and a single point

on each of its 2"-' - i cycles. Now, we will count the number of ways of assigning these

2" - 22 points dong the - i cycles. To choose the first point. pick any arbitrary

point xo E 21 such that xo is not equal to any of the assigned fixed points. Choose a

random value ro E 2; for r(xo). ro should not be equal to any of the fixed points. It aiso

should not be equal to xo. Thus there are (2" - 32 - 1) ways to choose ro. To choose a

second point, pick another arbitrary point xl such that r ( x l ) has not been assigned yet

(this also ensures that it belongs to a new distinct cycle) and pick a random rl E Z? for

a(xl). Again, ri should satisfy the following conditions: ri # xl and it should not be

equal to any of the previously assigned values for T. Proceeding as above, we have

' Tire nurnkr of fixcd points of n : ZT - Zî is quai to #{x E Zîlr(x) = x).

A cycle of n : 22 - 2," is a sapence of elements XI, x,, - m . X L such Lh;u xl+, = a ( x 1 ) for 1 _< I 5 (L - 1).

and x, = x ( x L ) , but x, # n(xl) for 2 1 (L - 1). The length of the cycIe is L. k t d(x) be the length of

che cycle to which x befongs. and let CI, C2. -. . C-, be the dinina cycles of a. Thcn the average cycle length is given by

ways of assigning these points. Hence, the number of involution functions is given by3

whic h proves the lemma.

2.2 Equivalence Classes

Two s-boxes T I , a2 are said to belong to the same cryptographic class [1 301 if

for arbitrary constants a. b E 2;.

The use of s-boxes within the sarne cryptographic classes was suggested as a means

to design SPNs that are resistant to differential cryptanalysis [130]. Unfortunately,

involution s-boxes can not be used in such SPNs because, as shown in the following

lemma, if two involution s-boxes belong to the same cryptographic class then they

possess a linear structure.

If and ~2 are both involution mappings and

then xl, Q have a @ b as a linear structure.

' The fint lem in the equation above stands for the unity bijection rnapping with 2" fixed points

8 3

Proofi By noting that ~ ( x ) = al(x 3 a) b then we have ( x ) =

al(al(x 3 a) 2 a 3' b) 5 b. But we also have d ( x ) = x and. hence.

i i l ( ~ ~ ( ~ 5 a) 6 a 3 b ) 5 b = x. Thus. we have xi(* 6 a) 5 a b = r ; ' ( x 5 b).

Replacing x 3 b by x and noting that *rl(x) = ri(x) gives

which is the definition of a linear structure. By a similar argument, one cm show that

a2 aiso has a 8 b as a Iinear structure. 0

2.3 Number of Fixed Points

Involution s-boxes have the characteristic that al1 cycles are of length one or two and,

as will be shown, have a larger expected number of fixed points than a randornly chosen

s-box. Although there is no known effective cryptandytic attack directly based on the

existence of fixed points in the s-boxes, it is of interest to determine if a large number

of fixed points affects other cryptographic properties. such as the nonlinearity and the

maximum XOR table entry, that lead to other cryptographic attacks .

Figure 6.2 shows the expenmental results for the average nonlinearity and the average

maximum XOR table entry as a function of the number of fixed points for 8-bit random

bijections and 8-bit random involutions. One thousand random bijective s-boxes and

one thousand randorn involution s-boxes were tested for each point. The graphs were

denved by incrementing the nurnber of fixed points by 2. The graphs clearly indicate a

strong correlation between the cryptographic properties and the number of fixed points

and suggest that the s-boxes should be chosen to contain few fixed points.

64 te 192 256 Number of Frxed Points

Figure 6 2 Average Nonlinearity and Average Maximum XOR Table Entry Versus the Number of Fixed Points (n = 8)

We now calculate the expected number of fixed points for a random bijection and for

a random involution.

Lemma 6.5

The expected value of the number of fixed points for a random bijective mapping is 1 .

Proof: The number of bijective mappings with exactly t fixed points is given by (this

result follows by using the inclusion-exclusion principle )

I - 2 ( - q i + t ( ; ) (?)(Y - i)! . i=i

The probability of having exactly t fixed points is given by the above formula divided

by Y!.

Hence, the expected number of fixed points is given by

The last step in the equation above follows by noting that

which completes the proof of the lemma.

Similady. one can show that the variance of the number of fixed point s is als

The expected number of fixed points for a random involution rnapping is given by

where

@(n. i) = -

(F1 - i)! pz)! -

Proof From the proof of lernrna (6.3), the number of involution functions with 2i fixed

points is given by

The probability of randornly selecting an involution function with 22 fixed points is

obtained by dividing (6.17) by the total number of involution functions. Thus, the

expected number of fixed points for a random involution function is given by

which completes the proof of the lernrna. O Numencal substitution in the formula above shows that the expected number of fixed

points of a random involution exceeds that of a random injective mapping by a large

factor. For example, an 8-bit involution mapping is expected to have about 16 fixed

points. Fortunately, the construction proof of lemma (6.3) can be used to generate

involution functions with a predetermined number of fixed points. A special case of

interest is involution Functions with zero fixed points since this seems to optimize their

cryptographic properties (see Figure. 6.2 ). The number of such functions follows from

the proof of lemma (6.3) and can be approximated using Stirling's formula as follows

6.3 S-box lnterconnection Layer

In order to use the sarne SPN to perform both the encryption and the decryption

operations, the s-box inter-connection layer should also be an involution mapping. One

permutation layer, applicable to networks for which N = n2, with nice cryptographic

properties [60] and which satisfies the involution requirement is described by: output bit

i of s-box j at round r is connected to input bit j of s-box i at round r + 1.

In 1601 it was shown that with such a permutation iayer we can develop upper bounds

on the differential characteristic probability [15] and on the probability of a linear

approximation [82] as a function of the number of rounds of substitution. Unfortunately.

to achieve good bounds, with a relatively small number of rounds, it is suggested to have

s-boxes with a large diffusion order [60]. Letting 4x and A y denote the input change

vector and the output change vector, respectively, an s-box satisfies diffusion order of

A. X 2 O, if for wt(Ax) > 0.

X w t ( A x ) < X + 1, O otherwise.

where zut(-) denotes the Hamming weight of the enclosed argument.

Using the depth-first search algorithm proposed in [60] we could not find any 8 x 8

involution s-boxes with diffusion order greater than 1 (without the involution constraint,

some 8 x 8 s-boxes with X = 2 were found in [60]). As an alternative to this, the authors

in [60] proposed the use of an invertible linear transformation between rounds. The SPN

resistance to linear and differential cryptanalysis was very encouraging. Unfortunately,

their proposed linear transformation is not very attractive in practice as it requires a bit

XORing operation of al1 the output bits of the round.

6.3.1 Proposed Linear Transformation

We propose a more efficient linear transformation that runs much faster. Moreover it

has irnproved bounds for the linear approximation and the differential characteristic. The

linear transformation between rounds of s-boxes is descnbed by

l < z _ < M (6.2 1 )

of the transformation, w; is the ih input where z; represents the i* n-bit output word (

word, M = 5 denotes the number of s-boxes, and $ denotes a bit-wise XOR operation.

It is assumed that M is even. For 8 x 8 s-boxes this is a byte oriented operation. One

can easily check that this linear transformation operation is an involution.

The linear transformation described above may be efficiently implemented by noting

that each z; could be simply determined by XORing Wi with the XOR sum of al1

2,. 1 5 j 5 M. Le.,

where

Equation (6.23) above requires ( M - 1) word-onented XORs (which c m be done in

parallel in log2 M steps) and equation (6.22) requires M word-oriented XORs (which

c m be done in one step). Hence for a 64-bit SPN using 8 x 8 s-boxes, the above linear

transformation requires 7 + 8 = 15 byte-oriented XORs compared to 63 + 64 = 127

bit-oriented XORs required for the linear transformation of [60].

6.3.2 Involution Linear lkansformations based on MDS codes

An interesting class of linear transformations is the one based on Maximum Distance

Separable (MDS) codes [78]. The use of such linear transformations was first proposed

in [133] and then utilized in the cipher SHAW [108] and later in the cipher SQUARE

[34]. This class of Iinear transformations has the advantage that the number of s-boxes

involved in any 2 rounds of a linear approximation or in any 2 rounds of a differential

characteristic is equal to M + 1, which is the maximum theoretically possible number.

In this section we provide two construction methods for involution linear transformations

based on Maximum Distance Separable Codes.

Rijmen et al [108] noted that the framework of linear codes over GF(2*) provides an

elegant way to constmct the linear transformation layer. More details about the theory

of error correcting codes can be found in 1781.

Let C be a ( 2 M , M, d) code over GF(2"). Let G = [II A] be the generator matrix in

echelon form where A is a nonsingular M x M matrix and 1 is the il. x M identity

matrix. Then C defines an invertible linear mapping

* lli G F ( ~ " ) ' " GF(2 ) : -Y -+ Y = A S . (6-24)

If the matrix A is used in the implementation of the linear transformation of the SPN,

then it is easy to see that the number of s-boxes involved in any 2 rounds of a differentiai

characteristic or a linear approximation expression is lower bounded by d. the minimum

distance of the code (1081. The minimum distance of the code is equai to the minimum

number of linearly dependent colurnns in its nul1 matrix (dso known as the parity-check

matrix).

Lmma 6.7 [78]:

An (n. k. d) code with generator matrix G = [ I J A ] . where A is a k x ( n - k) ma&, is

MDS if and only if every square submatnx (fomed from any i rows and any i columns.

for any i = 1' 2 , . . - . min{k . n - k}) of A is nonsingular.

Random Construction

One way to obtain an involution matrix A which satisfies the above constraint is to pick

a random involution matrix and test it for the condition in lemma (6.7).

Let

be an il1 x LM random matrix where A l i , Ai?, AZ1 and A?? are y x matrices. An 2

involution matrix is one which satisfies A? = I , and thus A is an involution iff

If we let A?? = Ali then equation (6.26) is satisfied iff Ali and Al? commute with each

other. To achieve this we let Al? = A;. For these choices of Al? and .4n, equations

(6.27). (6.28) and (6.29) are linearly dependent with the solution = A:, .Al1.

Thus the hl x 12ï matrix

.4 =

where Ali is a random nonsingular

Al1 A,' [A:, Q Al1 Ai l 1 -

x matrix, is an involution over GF(Zn ). 2 2

For n = 8, random search for a matrix A, with the structure in equation (6.30). that

satisfies the condition in lemma (6.7), terminates within a few seconds for even values

of M, M 5 6. For M = 8 we were unable to obtain any matrix that satisfies the

conditions in lemma (6.7) by random search. Table 6.1 [143] shows the experimental

results for the minimum distance distribution for 10' randomiy selected invertible linear

transformations and 10' randorniy selected invertible involution linear transformation in

the form of equation (6.30).

r ri mental (Random) 46 15539 84415

Experimental (Involution) 118 7714 93168

Table 6.1 Minimum Distance for 105 Randomly Selected Involution Linear Transformations (n = M = 8)

Algebraic Construction

In this section we show how to obtain an involution matrix satisfying lemma (6.7) by

a simple algebraic construction.

Lemma 6.8 [78J-

Given xo,--- ,~~-~.yo,--.~y,-, the matrix .il = [a,].0 5 i. j 5 n - 1 where

aij = - l is called a Cauchy matrix. It is known that xt +Y,

Hence, provided the xi are distinct, the y; are distinct, and xi + yj # O for dl i, j, it

follows that any square submatrix of a Cauchy rnatrix is nonsingular over any field.

Let

w here

and the least significant r [og2(M)l bits of r # O are teros.

For .A' = H = [h,] we have

where i,j and k are evaluated as in equation (6.33). Thus the matrix A will satisfy n

A' = c2 1, c = $ a:i over GF(2"). Dividing (division over GF(2")) each elernent i=l

we obtain an involution matrix for which every square submatrix is nonsingular over

GF(2") . Figure 6.3 shows an example for M = n = 8, using the irreducible polynomial

Figure 63 Involution Linear Transformation Bascd on MDS Codes ( M = n = 8, Irreducible Polynomial = 1 id)

The resistance of SPNs using this MDS Iinear transformation is studied in [ log] and

[143]. Throughout the rest of this chapter we will concentrate on the class of linear

transformations described by equation (6.2 1).

6.4 Resistance to Differential and Linear Cryptanalysis

Using an approach similar to the anaiysis in [60], it is possible to establish upper bounds

on the most Iikely differential characteristic and linear approximation expression using

the linear transformation of (6.21). The resulü of this section are obtained by assuming

that al1 the round keys are independent.

4.1 DSferential Cryptanalysis

The following lernrna gives a lower bound on the number of s-boxes involved in any

2 rounds of a difierenfiai characteristic.

'' All nurnbers arc in hcxadecimal format

L e m m 6.9

Consider an SPN with M s-boxes, hl 2 4. If the SPN employs the linear transformation

described in (6.21), then the number of s-boxes involved in any 2 rounds of a differential

characteristic is greater than or equal to 4.

Proof: From the linear transformation expression one can check that if only one s-box

is involved in round r this implies that iM - 1 s-boxes are involved in round r + 1. If 2

s-boxes are involved in round r , (6.21) ensures that at least 2 s-boxes will be involved

in round r + 1. The rest of the proof follows by noting that the minimum number of

s-boxes involved per round is 1. 0 The number of chosen plaintextkiphertext pairs required for differential cryptanalysis of

an R round SPN (based on the best characteristic and not the best differential [94],[73])

may be approximated by [15], [60]

where Pa,-, is the probability of the best R - 1 round characteristic. This probability

c m be bounded by

where the maximum s-box XOR pair probability is given by Pb = - with XOR* 3"

denoting the maximum entry in the XOR distribution tables of the s-boxes used in the

SPN and a is the total number of s-boxes involved in the characteristic. For even R, from

lemrna 6.9 and assuming that only one s-box will be involved in round R- 1 then we have

and, hence,

Using 8 x 8 involution s-boxes with maximum XOR table entry of 10 (easily found by

randomly selecting involution s-boxes), an 8 round 64-bit SPN that utilizes the proposed

linear transformation will have iVo 2 260-8 chosen plaintext/ciphertext pairs required

for differential cryptanalysis. If we use the inversion s-boxes given by (6.3), then we

will have ND 2 278.

4.2 Lmear Cryptanalysis

The following lemma gives a lower bound on the number of s-boxes involved in any

2 round linear approximation and is based on the assumption of independence between

linear approximation of different rounds.

Lemma 6.10

Consider an SPN with M s-boxes, M > 4. If the SPN employs the linear transformation

described in (6.21) then the number of s-boxes involved in any 2 rounds of a linear

approximation is greater than or equal to 4.

Proofi If the number of s-boxes involved in round r + 1, 1, is odd, then the number of

s-boxes involved in round r is M - 1. If I is even, then the number of s-boxes involved

in round r is 1. The lernma above follows by considering different values for 1. O

For an SPN based on n x n s-boxes, the number of known plaintexts required for the

basic linear cryptandysis (algorithm 1 in [82]) may be approximated by [60]

where

and

with NL denoting the minimum nonlinearity [go] of the s-boxes used in the SPN and

cr is the total number of s-boxes involved in the linear approximation. From the above

argument we have

and, hence,

Using 8 x 8 involution s-boxes with nonlinearity of 98 (easily found by randomly

selecting involution s-boxes), an 8 round 64-bit SPN that utilizes the proposed linear

transformation will have IVL 2 known plaintextkiphertext pairs required for the

basic linear attack. Since this number is greater than the size of the plaintext set, we

interpret this to mean that the basic linear attack is not effective against this class of

SPNs, even if we use al1 possible plaintexts. If we use the inversion s-boxes given by

(6.3), then we will have XL 2 zg8.

6.5 Cyclic Properties of the Proposed SPN

A significant difference between an involution s-box and a non-involution s-box is

likely to be their cyclic properties. For a randomly chosen n-bit bijective mapping, the

expected value and the variance of the number of cycles are both approximately equal to

log, (2") = 0.69n [46]. The expected value of the cycle length is equal to 2"-' + 112 [35].

For an involution mapping with fifp fixed points, the expected cycle length is given by

and the number of cycles is given by Y-' + N f p / 2 .

1 96

SPNs with mndom involution *boxes

Cycle Length

SPNs with mndom b ï j w 8-boxes

I ~ O D O O . 0 0 0 0 . 0 0 0 D r a D m - - 7 0 0 0 0

Cycle Length

Figure 6.4 Normalized Distribution of Cycle Ltngth for al1 2'" Staning Points

h order to investigate whether the cyclic properties of involution S-boxes affect the cyclic

properties of the SPN, we measured the cycle distribution for 100,000 16-bit SPNs with

4-rounds. Each SPN uses four 4 x 4 random involution s-boxes with zero fixed points,

nonlinearity greater than or equal to 4 and maximum XOR table entry equal to 4. The

cycle length distribution is shown in Figure 6.4 (a) (the dark line shows the average

distribution over 100 adjacent points). The normalized frequency of occurrence is equal

to the acnial number divided by 100,000. In this case, the average cycle length over

al1 SPNs is equal to 32779. Figure 6.5 shows a typical expanded segment of the cycle

length distribution in Figure 6.4 (a).

We performed the same experirnent on 100,000 SPNs using random bijective mappings

with the sarne constraints on the nonlinearity and the XOR table. The simulation results

are shown in Figure 6.4 (b). The average cycle length over al1 SPNs is equal to 32766.

It is clear that the two distributions are almost indistinguishable. This suggests that the

involution s-boxes do not have a negative impact on the cyclic properties of the SPN.

6.6 Key Scheduling Algorithm

In Our discussion, we assume that the SPN is keyed by a simple key mixing operations

of the key bits before each substitution and after the last substitution (details will be

0.6 1 O O O 2 m o 2 - Q ~ t O O Z l O Q

Cycle Length

Figure 6.5 A Spica1 Expanded Segment of the Cycle Length Distribution of SPNs with Random Involution S-boxes

explained below). A weak key, kW, is any key for which Ek,(Ek,(p)) = p for every

piaintext vector p where Ek,(.) denotes the encryption operation using the key kW. In

this section we propose a simple key scheduling algorithm for the SPN. Three design

principles were employed:

(i) Prevent weak keys.

(ii) Given that some or al1 of the key bits at round r are compromised, it is hard to get

any information about the other round keys.

Although the above key scheduling cm be controlled to be relatively slow in order to

make brute force attack harder [106], it is far easier and more effective to use a larger

key. Using a larger key has the advantage that it does not penalize implementations

which must change the key often.

In the following algorithm key denotes the user supplied key which is assumed to be

of the sarne length as the block length of the SPN, Ei(p) denotes the output of the

SPN when it has p as an input, and the round keys are ail set to k. Consider the key

scheduling algorithm shown in Figure 6.6.

One c m assign any other arbitrary value to m. Op(- ) denotes any simple operation that

guarantees that al1 xi's are different. By noting that E< is a bijective mapping for any

fixed key then al1 ki's will be different which guarantees that we do not have any weak

keys. An example of operation O p ( - ) is the cornplementing of different bits in xo for

each i. Note that we control the key scheduling speed by controlling the nurnber of

rounds used in the encryption operation Ek. Also, this scheme is similar to the scheme

proposed in [69].

Figure 6.6 Key Scheduling Algorithm

It is also worth noting that the above keying scheme does not have the complementation

property; this property makes DES susceptible to exhaustive key search of 255 rather

than 2j6. This scheme also ensures that there are no simply related keys which leads

to Biham's related keys attack [13].

The key scheduling described above can be extended to accommodate the case where

the user supplied key size is a multiple of the SPN block length (Keys which are not

multiples can be padded to be so) .

6.7 Performance

While the usehlness of a cryptographic algorithm is based on assumptions about its

security, the complexity of the cryptographic function is another feature that should not

be overlooked. Table 6.2 shows the relative speed (the larger the number. the faster the

cipher) of Q-CAST~ and SPNs on three platforms: an &bit microcontroller (Motorola

68 1 l ) , a SUN SPARC workstation and a SUN ULTRA workstation. All aigorithm

operate on a 64-bit blocks and implemented 16 rounds. The speed of the SPN in Table

6.2 is normalized independently for each processor.

In considering these numbers, one should take into account that the proposed SPN

is a hardware oriented cipher (while Q-CAST is a software oriented cipher), and the

round function of the proposed SPN provides a better degree of security than the round

function of Q-CAST.

Table 6.2 Relative Speed of the Proposed SPN and Q-CAST

Motorola 68 1 1

SUN SPARC-20

SUN ULTRA- 1

6.8 Conclusion

We have presented a specid class of SPNs that have the advantage that the same network

can be used to perform both the encryption and the decryption operation. The s-boxes

used are involution mappings and the permutation layer is replaced by an efficient

involution linear transformation layer. In a few seconds on a SPARC-20 workstation, we

were able to obtain tens of 8 x 8 involution s-boxes with nonlinearity of 98 and maximum

XOR table entry of 10. Using these s-boxes, an 8 round 64-bit SPN that utilizes the

SPN

1

1

1

Q u m ' s University version of the CAST en-ption aigorithm

100

Q-CAST

0.46

7.5

1.56

proposed linear transformation will be resistant to both the basic linear cryptanalysis

and to the differential cryptandysis based on the best ( R - 1 )-round characteristic. We

also confinned that the cyclic properties of this special class of SPNs reveai no apparent

weakness. A key scheduling algorithm which satisfies certain design ptinciples was also

proposed.

Chapter 7 Modelling Avalanche Characteristics of Substitution-Permutation Networks Using Markov Chains

7.1 Introduction

An SPN is considered to display good avalanche characteristics if a one bit change

in the plaintext input is expected to cause close to haif the ciphertext bits to change.

Good avalanche charactenstics are important to ensure that a cipher is not susceptible

to statistical attacks such as clustenng attacks [58]. More fonnally, the avalanche is

defined as follows :

A cipher is said to satisQ the avalanche criterion if, for each key, on average

half the ciphertext bits change when one plaintext bit is changed. That is,

E ( w t ( A C ) 1 wt(AP) = 1 ) = N / 2 , where w t ( - ) denotes the Hamrning weight of

the enclosed argument, AC and AP denote the ciphertext and plaintext change vectors,

respectively.

Plaintext P l r

C l ... Ciphertext

Figure 7.1 SPN with N = 16, n = 4, and R = 3.

In [61], Heys and Tavares analyzed the avalanche characteristics of SPNs based on two

general network models. distinguished by the nature of their permutation layer. In the

first model. they considered a network where the permutation between two rounds is

modelled as a random variable whose values are equally likely. In the second model,

they considered a network which has a specified fixed permutation between rounds. In

[56], Heys extended this work and developed a mode1 of the avalanche characteristics

of DES-like ciphers.

In this chapter we develop analytical models for the avalanche characteristics of other

classes of SPNs. In particular. we consider the following four general network models.

distinguished by their linear transformation layers (see Figure 7.1 ):

Model A' - In this mode1 we consider SPNs, with M = n, in which the linear

transformation layer is a permutation layer x E R. where 0 is defined to be the set

of permutations for which no two outputs of an s-box are connected to one s-box in

the next round.

Model B - In this model we consider SPNs in which the linear transformation layer

is given by the iwertible bitwise linear transformation defined by v = r ( C ( u ) ) where

v = [ v p ? - . v ~ ] is the vector of input bits to a round of s-boxes. u = [uiu? . u N ]

is the vector of bits from the previous round output. C(u) = [ I l (u) - - I N (u)], T E R.

where is the set of permutations defined in model A, and

iV is assumed to be even so that the linear transformation is invertible.

Note that i i (u) could be simply determined by noting that

' ïhis model was dso considered by Heys and Tavares b a we inch& it hem for compleceness

w here

)=l

The resistance of SPNs with this class of linear transformation against linear and

differential cryptanal ysis is snidied in [60].

Model C- In this model we consider SPNs in which the linear transformation layer is

given by the invertible wordwise linear transformation defined by

where Zi represents the ilh n-bit output word of the transformation, W i is the i" input

word, and iV1 = $ denotes the number of s-boxes. It is assumed that M is even so that

the linear transformation is invertible. For 8 x 8 s-boxes this is a byte oriented operation.

The linear transformation descnbed above may be efficiently implemented by noting

that each zi could be simply determined by XORing wi with the XOR sum of al1

2,. 1 5 j 5 M. i-e..

where

The resistance of SPNs with this class of linear transformation against Iinear and

differential cryptanalysis is studied in [153].

Model D- In this model we consider SPNs in which the linear transformation layer

is given by the invertible linear transformation

G F ( ~ " ) ~ + G F ( ~ " ) ~ : X 4 Y = AX,

where A is a nonsingular M x M matrix for which every square submatrix (formed

from any i rowr and any i columns, for any i = 1,2! - - . , M) is nonsingular. Note that

the matrix G = [IIA], where 1 is the M x M identity matrix, is the generator matrix

of an (3M. M. M + 1) MDS code. Throughout the rest of this thesis. we will refer to

this type of linear transformations as the MDS linear transformation. The resistance of

SPNs with this class of linear transformation against linear and differential cryptanalysis

is studied in [I08].

7.2 Convergence in Markov Chains

A Markov chain is ergodic if it is finite, aperiodic and irreducible. A sufficient condition

for a Markov chain with n States and a state transition matrix2

CO be apenodic is that pi* > O for some i,1 5 i 5 n, and it is irreducible if for al1 2. j

there exists r such that 4:) > 0,1 5 i, j 5 n, where Pr = [g)]. If P is ergodic, then there exists a unique distribution II = (ri, m, . . r,) such that

The distribution Il is said to be the limiting distribution of P. The classical method to

determine the rate of convergence towards this limiting distribution is to consider the

eigenvalues of P [46].

Suppose that we have a matrix P with distinct and nonzero eigenvalues X i A?, - - . A,. T

Let the column vector di) = (zji), IF), . . , zt)) satisfy

p,, is the eiement in mw j and column i in the rnauix P.

for i = 1.3.--- .n. Then

It follows that a matrix P with distinct eigenvalues has a limiting distribution if and

only if the largest eigenvalue is 1, and the remaining eigenvalues are less than one in

modulus. We order the eigenvalues with 1 = Ai > ]A2 1 2 1 X3 1 2 - - 2 1 An 1.

Powers of a transition matrix can be written as

where P* = [II, II, - , II is the lirniting distribution of P. Thus al1 enuies of Pr

converge to their lirnit exponentially fast as a function of A?.

7.3 Modelling Avalanche in S-boxes

We will use the sarne s-box model proposed in [6 11. Let the s-boxes in the network be

defined by a random bijective rnapping S : X + Y. Assume that any set of one or

more input bit changes to an s-box results in a number of output bit changes represented

by the random variable D, i.e.. D = w t ( A Y ) where AY is the output change vector of

the s-box. We assume that the likelihood of a particular nonzero value for D is given by

assuming that dl possible values of AY belonging to the set of 2" - 1 nonzero changes

are equally likely. Hence the probability distribution of D is given by

for 1 5 d 5 n. Note that the above s-box model essentially represents an average

over al1 randornly selected s-boxes and is not intended to characterize the behavior of

an actual physically realizable s-box. However, as expenmental evidence [61] suggests.

modelling the number of output changes of each s-box as a random variable is a suitable

approximation when considenng an SPN constructed using randornly selected bijective

s-boxes.

Let W, represent the random variable corresponding to the number of bit changes after

round r given a one bit plaintext change, Le.,

where AY,, denotes the output change vector of the sth s-box in round r . Hence, the

expected value of W, is given by

where PAr(A, = i) denotes the probability of having i active s-boxes in round r. Le..

i s-boxes with nonzero input change vectors.

Thus we have

with the initial condition

where 6(a = b) = 1 if a = 6 and b(a = b) = O if a # 6.

It is clear that PA, ( A , = il A,-1 = j ) does not depend on r and hence the number of

active s-boxes c m be modelled by a Markov chain. Now Our problem is reduced to

calculating the state transition matrix P = hi], where pji = PA, ( A , = ilA,-i = j ) is

the probability of having i active s-box in round r given that we have j active s-boxes

in round r - 1.

7.4 Modelling the Linear Transformation Layer

Throughout this section, we are interested in the case A4 = n.

7.4.1 Mode1 A - Fixed Permutation Layer

Lemm 7.1

Assume an SPN with a fixed permutation layer n E fl, then we have

1 n

Pji = (p - 1 ) ~ C ( - p + i

l=n-i

Proufi The number of arrangements of the output bit changes so that the first 1 s-boxes

in round r are not active given that we have j active s-boxes in round r - 1 is obtained

by sening the input bit changes to these non-active s-boxes to zero and varying the other

bit changes over al1 its possible nonzero changes. Thus the number of such arrangements

is (Y-' - 1)'.

Using the inclusion-exclusion principle, the number of arrangements such that we have

exacdy i non-active s-box in round r given that we have j active s-box in round r - 1

is given by

NA(n- i , j ) The lemma follows by noting that p,i = (2n- l lJ .

7.4.2 Mode1 B - Linear Transformation Type 1

The following lemma will be used throughout this section.

Lemma 7.2

Let S = $ bi, bi E 22 with P(bi = 1) = p, O < p < 1, then the probability that i= 1

S = O is given by

For p = O or p = 1, we have

1, Ieven. a,,. 1, = { O, 1 odd.

Lemma 7.3

Assume an SPN with a linear transformation type 1, then for AQ = O (see equation

(7.3) ), the number of arrangements for the output bit changes such that we have i active

s-boxes in round r given that we have j active s-boxes in round r - 1 is given by

where @(i, j ) is given by equations (7.22) and (7.23).

Proof: For every active s-box in round r - 1, al1 the input bit changes to the i non-active

s-boxes should be set to zero. The other output bit changes can be assigned arbitrarily

such that AQ = O. Note that, in this case, QQ can be seen as the XOR of j binary y - ' - 1 1 /3

i.i.d. random variables, bi E Z?, 1 5 1 5 j , with P(bi = 1) = - = - ,-2,-,.Thusthe

rest of the output change bits cm be assigned in (2"-' - 1)'@ (j, -) . U s i ~ ththe

inclusion-exclusion principle, the number of ways to have exactly i non-active s-boxes

in round r given that we have j active s-boxes in round r - 1 is given by

The iernrna follows by noting that Ao(& j ) = NAo(M - 2,j).

Assume an SPN with a linear transformation type 1. then for AQ = 1 (see equation

(7.3) ), the number of arrangements for the output bit changes such that we have i active

s-box in round r given that we have j active s-box in round r - 1 is given by

N A i ( ( M - i), M j = A4, A l ( i , j ) = {(2n-1)j(1-@(i,-)) j+M,i= M. (7.26)

O. otherwise.

where

and (y-')

1 # o. 1 / 2 (2" - llM (1 - (M. -)), 1 = 0'

and @(i. j ) is given by equations (7.22) and (7.23).

Proof: For AQ = 1, we have j = M for every i # M. For i = M. for al1 the 1bf

s-boxes in round r - 1, al1 the output bit changes connected to the i non-active s-boxes

in round r should be set to one. The other output bit changes c m be assigned arbitrarily

such that AQ = 1. Note that, in this case, AQ c m be seen as the XOR of j binary i.i.d.

random variables, bl E 22, 1 5 1 5 j , with P(bl = 1) = 1. Thus the rest of the output (2" -' )'

change bits can be assigned in (Y-')'@ (j, i) = - 2 ways. The total number of bit

changes arrangements with AQ = 1 is given by (2" - l)Y (1 - <P (M, -)) . Using

the inclusion-exclusion principle, the number of ways to have exactly i non-active s-box

in round r given that we have M active s-box in round r - 1 is given by

where

Thus the number of ways to have exactly i active s-box in round r given that we have

j active s-box in round r - 1 is given by

j = M. *) ) j # M. i = M.

otherwise.

which proves the lemrna.

Combining the results above, we have

Pjt = A o ( 4 j ) + &(id

(2" - 1)'

7.43 Mode1 C - Linear lhnsformation Type 2

Lemma 7.5

Let Q(n. k ) be the nurnber of choices of k nonzero elements of 2; which sum to zero, then

k-l

B(n, k) = ( - 1 1 ~ + ( - l ) r ( f ) 2 n ( k - r - l ) .

Proof Let T(n , k, r ) denote the number of ways to choose k elements of 21 which sum

to zero such that r or more of them are zero. Thus

Using the inclusion-exclusion principle, we have

which proves the lemma.

Lemma 7.6

Assuming SPN with a linear transformation type 2, then we have

where

and

proofi3 Since ail choices for which j s-boxes in round r - 1 are active are equally

likely, we may assume the first j s-boxes are active. Thus the last (iM - j ) s-boxes in

round r will have the same input change vector. (6(i = j ) - b( i = M))Q(n . j ) counts

the number of ways this could give rise to the last ( M - j ) s-boxes in round r being

non-active and makes sure that they are not counted towards the case where ail the

s-boxes in round r are active. If the last ( M - j) s-boxes in round r are active then our

problem is basically reduced to considering the first j s-boxes in round r. and we need

to cornpute p(i + j - hl 1 j ) for M = j . In this case we have (,$-i) ways to choose

which (hl - i) of the remaining j s-boxes in round r should be non-active. The input

change vectors to these s-boxes must be the same and we have - 1) ways to choose

which value they have. Cal1 this value W. Since the solution does not depend on the

specific value of w # O, we assume that w = (100.. - O). The remaining i + j - i\rl input changes can not be O (by definition they are nonzero changes) or w (because

this will contradict with the assumption that we have i active s-box). So these input

changes must have the form (ab), a E Z2, b E 2;-' \ {O). Moreover these nonzero

b's must add to O and the a's must add to 1 over Z2. If z + j # M then we have

Two different proofs for this Lernm were provided by 11321 and [SI].

q ( n - 1. i + j - M ) choices for the b's and 2'fj-M-1 choices for the a's. If i + j = .M

(this is only correct if j is odd) then the sum of the a's wil1 always equal to 1. Thus we

have 2 i + j - M - 1 + )(-1)'-'6(i + j = hi) total choices for the a's. Sirnilarly. one can

show that the number of arrangements such that al1 the s-boxes in round r are active

is given by 2j-l (sn - 1) Q(n - 1, j ) + @(n, j ) 6 ( j = -M). The conditional probability is

obtained by dividing by the total number of nonzero input vector changes, (2" - 1)'.

Lemma 7.7 [78]=

The number of code words of weight w in an [n, k, d = n - k + l] MDS code over

G F ( q ) is given by

Lemma 7.8

Assuming an SPN with a linear transformation type 3, then we have

w here

and @(n, d, w) is given by equation (7.39).

Proofi If the number of non-active s-boxes in round r - 1 is greater than or equd to

i, then Our problem corresponds to [2M - i, M - i, M + 11 MDS code. Let A, denote

the number of active s-boxes in round r , then the number of arrangements for which we

have A,-l + A, = w and the number of non-active s-boxes in round r - 1 is greater

than or equai to i is given by

Using the Inclusion-Exclusion Principle, the number of choices for which Ar-i + -4, = u.

and the number of non-active s-boxes in round r - 1 is exactly equal to i is given by

Thus we have

P j i = Q ( M - j , i + j ) / ( Y - l ) j ( l ; ) ?

which proves the lernrna.

7.5 Results and Discussion

For N = 64 and M = n = 8 (a practical size of SPN) the transition matrices for SPNs

based on the four models are shown in Figure 7.3 and Figure 7.4. It is easy to check

that these matrices are ergodic. The limiting distribution and the eigenvalues of these

matrices is shown in Figure 7.5.

Let

i.e., e denotes the deviation of the limiting avalanche characteristics frorn the ideai

characteristics. By numerical substitution with n = hi = 8? and the limiting distributions

in Figure 7.5, we get

s-box then. using equation (7.15). we have

é =

for the four models above. If the SPN is constmcted using one randomly selected 64 x 64

N AM n/2 -- 2 l - Sen 1 in(;)

i=l < r4' (7.46)

Table 7.1 shows the value of the second largest eigenvalue for the four rnodels.

Table 7.1 The Second Largest Eigenvalucs for SPNs with N = 64, and AM = n = 8.

From Table 7.1. it is clear that the transition matrices for SPNs with a permutation layer

have the slowest convergence (largest second eigenvalue), and the transition matrices for

SPNs with an MDS iinear transformation layer have the fastest convergence (smallest

second eigenvalue). It is also clear that the linear transformation is effective in improving

the avalanche characteristics of SPNs. This is also illustrated by noting that the transition

matrices of SPNs with linear transformations tend to have more zeroes in the places

corresponding to transitions from small number of active s-boxes to small number of

active s-boxes. This implies that for small number of rounds, on average, SPNs with

linear transformations tend to have more active s-boxes than SPNs with permutation

layers. Figure 7.2 shows how the experimental results agree with Our theoretical mode1

for SPNs with IV = 64, n = M = 8.

I m m I a

1

f

Model D (Experirnent) I C- Model O (hory) M Model C (Experiment) M Model C (lheory)

i Model B (Experirnent)

1

Model B meory) M Model A (Experiment) -- Mdel A (Theory)

I I 1 I

k

P? 1

O t

1 2 3 4 5 6 7 8 Nurnber of Rounds

b

Modei A

3.185 x IO-'

Mode1 B

4.139 x IO-^

Figure 72 Theoretical and Experimental Avalanche for SPN with N = 64. n = 1C.i = 8.

Mode1 C:

3.922 x IO-^

J

Mode1 D

2.07 x

One should note that the limiting distribution is the same for al1 the models above. This

is a nice illustration of the product cipher design philosophy: a strong block cipher cm

be obtained by iterating a weaker one for a sufficient number of rounds.

While the s-box mode1 used throughout this chapter is suitable for studying the avalanche

characteristics of SPNs, it can not be used directly to determine the resistance of the

network to differential crypianalysis, where the analysis should be performed on a

specified set of s-boxes. However. from the transition matrix, we can calculate the

minimum number of s-boxes involved in any 2 rounds of a differential characteristic.

This number will be greater than or equd to min ( i + j ) . i,l,p,I=o

This mde l aiso helps visudizing the average dynamic performance of SPNs with

randomly selected s-boxes and different linear transformation layers.

In sumrnary, we have presented analytical models for the avalanche charactenstics of

four general classes of substitution-permutation encryption networks. The results show

that using an appropnate diffusive linear transformation between rounds can improve

the avalanche characteristics of the network. This facilitates the construction of efficient

ciphers with fewer rounds.

X X X X X X X X

I I I I I I I I 0 0 0 0 6 0 0 0 r i ( r i ( M h - - + i (

X X X X X X X

X X X X X X X X

I I I I I I I I 0 0 0 0 0 0 0 0

X X X X X X X X X X X X X X X

X X X X X X X X

X X X X X X X X

X X X X X X X X X X X X X u ' ~ - - e ~ - Y v J

= - 9 9 9 9 9 3 O & ~ r n M e c ! r s , r n

X X X X X X X X X - . P - t - b q o 5 - - -

0 0 4 4 F ; t ;

X X X X X X X X X X

X X X X

X X X

Chapter 8 Conclusion

Although some researchers feel that the design of secure and efficient block ciphers

has a solid theoretical foundation. we feel that the work is not yet complete. In this

thesis, we have provided many results that enhance our understanding of how to achieve

this objective.

8.1 Contributions of the Thesis

Our work has contributed to cryptography and cryptanalysis in several ways. In particular

we have

derived expressions for the expected size of the maximum XOR table entry and the

maximum Linear Approximation Table entry for some combinatonal structures of interest

such as regular mappings, and injective rnappings.

developed an expression to estimate the size of the largest entry in the LAT of a

randomly selected injective s-box. In particular we showed that the nonlinearity of a

randomly selected 8 x 32 s-box (the size of CAST s-boxes) is about 72.

presented a new construction method for highly nonlinear injective s-boxes and showed

how the resistance of CAST-like encryption algori thms (based on randoml y selected

substitution boxes) to the basic linear cryptanalysis was underestimated.

extended the definitions of many cryptographic criteria to multi-ouput boolean functions.

showed the relationship between the Walsh-Hadamard transform and various types of

information Ieakage.

introduced a new class of Substitution Permutation Networks (SPNs) with the advantage

that the sarne network can be used to perform both the encryption and the decryption

operations, and examined different cryptographic properties of this class such as resistance

to both linear and differentid cryptanalysis.

showed how to construct MDS linear transformations with the involution property.

developed an analytical mode1 of the avalanche cnterion for SPNs with different linear

transformation layers.

showed that the private key cryptosystem proposed at Crypto' 90 by Koyarna and

Terada is affine over GF(2) and hence it is completely insecure (see appendix A).

proved a conjecture by Cusick regarding the number of functions satisfying the Strict

Avalanche Criterion (see appendix C).

Most of the contributions above were published in [154], [145], [144], [146], [149],

[153], [150], [142], [147], [148], [141], [152], [151] and [143].

8.2 Future Work

As we rnentioned before, we feel that design and analysis of efficient and secure block

ciphen is far from being a complete task. In this section we present several directions

for future research

Investigation of the non s-box based approach in the design of block ciphers.

This approach (e.g.. TEA and RC5) is attractive since it has the minimum memory

requirements which makes it suitable for applications with limited memory such as srnart

cards and wireless applications. Although DEA can be considered as belonging to this

class of block ciphers, we feel that the data dependent rotation used in RC5, because of

its sirnplicity compared to cornplex operations such as modular multiplication, is a more

elegant operation. The security of this simple operation should be explored further and

more operations might be investigated.

Finding s-boxes with high difision order. The first step is to find a theoretical bound

on the diffusion order. Algebraic constmctions or clever search techniques, such as

Genetic Algorithm or simulated annealing, can be used.

Extending the results in this thesis to Higher order differentials and tmncated

differentials.

Exarnining the pseudo-randomness properties of SPNs is still an untouched area. A

complexity theoretic approach similar to Luby-Rackoff results using DES-like ciphen

can be investigated.

Examining possible modifications of the CAST s-boxes such that the resulting round

function is bijective. This will make the CAST algorithm immune to the attack described

by Rijmen et el. Although we are aware of rnany techniques to achieve this, al1 of

them result in some degradation of the nonlinearity and the differential uniformity of

the round function.

Investigation of efficient algonthms to calculate the nonlinearity of the resulting CAST

32 x 32 s-box when the injective s-boxes are cornbined by modular addition or modular

subtraction.

Deriving upper bounds for the differential probabilities and linear correlations for the

different SPNs studied in this thesis.

Appendix A Cryptanalysis of the "Nonlinear- Parity Circuits" Proposed at Crypto '90

At Crypto '90, K. Koyama and R Terada from the Japanese NTT Research Laboratory

proposed a family of cryptographic functions for application to symmetric block ciphers

[71]. Here, we show that this family of circuits is a n e over G F ( 2 ) . More explicitly.

for any specific key (key), the ciphertext y is related to the plaintext x by the simple

affine relation y = iklkeyx 63 dke, where fbfkey is an n x n non singular binary matrix

and dkey is an n x 1 b i n w vector where n is the block length of the cipher. This

renders this family of ciphers completely insecure as it can be broken with only n + 1

linearl y inde pendent plaintext blocks and their corresponding ciphertext bloc ks.

Definitions and Cipher Description:

Koyama and Terada described their proposed cipher as follows:

Definition 2: A parity circuit iayer of length n, or simply an L(n) circuit layer, is a

boolean device with an n-bit input and an n-bit output, characterized by a key that is

a sequence of TI symbols from {O, 1, -. +}.

Definition 2: Function b = f (k.a), as computed by an L ( n ) circuit layer with

key k = kl k p k. E {O, 1, -, +}" is the relation from an n-bit input sequence

a = alaz - - - a, E {O. l}" to an n-bit output sequence b = blb2 . 6, E {O. 1}" defined

below. An L ( n ) circuit layer first cornputes a variable T E {O, 1) such that:

where

Output b = b i b - b, of the circuit layer is then

k, = -

ot herwise.

and T = 1

and T = O

The proposed cipher is obtained by composing the parity circuit layers as follows:

Definson 3: A parity circuit of width n and depth d, or simply a C(n, d) circuit. is a

matrix of d L ( n ) circuit layers with keys denoted by key = kl 11 k2 11 . 11 kd for which

the n output bits of the ( i - 1)-th circuit layer are the n input bits for the i-th circuit

layer for 2 5 i 9 d. The key for the C(n, d) circuit is a d x n matrix whose d lines

contain the circuit layer keys.

An example of C(nt d ) with n = 10, and d = 3 is shown in Table A. 1.

Table A.1 A parity circuit C(n, d ) with n = 10, and d = 3

Main Result:

The lemma below follows from the fact that the set of affine functions is closed under

the XOR operation.

Any combinational boolean function that can be realized using only XOR gates is an

aftine function over GF(2) .

Koyama and Terada presented many cryptographie properties of these C ( n , d) circuits.

They also claimeci, without proof, that the order of the boolean canonical form of C(n. d),

defined to be the maximum of the order of its product terms. increases exponentially as

n or d increases. and hence it is practically infeasible to cryptanalyze C ( n o d) if n 2 64

and d 2 8. The following theorem shows that this is not m e .

The output ciphertext, y. of the parity check circuit C(n. d) is related to the plaintext

input, x, by the following affine relation:

where Mkey is an n x n non singular binary rnatrix and dke, is an n x 1 binary vector.

Proof: Equations (A.2) and (A.3) c m be expressed as

Hence for any fixed key. key = k, 11 k, 11 - (1 kd equations (1). (2), and (3), and

consequently the parity circuit, C(n. d). can be realized using only XOR gates. Hence

every output component of the parity circuit is an affine hinction of the inpu: variables

over GF(2) . Since C(n.d) is a bijective mapping, then the mauix :CIke, is non

singular. O

Example: For the cipher described by Table A.1, the ciphenext, y, in relation to the

plaintext input, x. is given by the following equation:

By noting that affine functions can not satiso the Strict Avalanche Criterion (SAC) then

the C(n. d) circuits c m never satisfy the SAC as was faisely clairned.

Conclusion

The private key cryptosystem proposed at Crypto' 90 by Koyama and Terada is affine

over GF(2) and hence it is completely insecure.

Appendix B Cryptanalysis of "key agreement scheme based on generalized inverses of matrices"

Dawson and Wu [36] proposed a new key agreement scheme based on generalized

inverses of matrices [IO]. They suggested that for a key length of kn, using binary

matrices of size k x m. k < m. and rn x n, the security parameter is at least 2 ( m - k ) n .

Here, we show that this scheme is insecure. In particular, we show that breaking this

scheme is equivalent to solving a set of m x n consistent linear equations with rn2

binary variables.

Definition: Any matrix A has a generalized inverse A- if and only if

AA'A = A.

A- is given by

where r is the rank of A. Ir is the identity matnx of order r. R and C are such that

L( V and I.I/ are arbitrary matrices of the proper order. A generalized inverse which

satisfies AA-A = A and A-AA- = A- is cailed a reflexive generalized inverse of

A. In this case, we rnust have U' = VU. Note that a generalized inverse defined by

equation (B.l) always exists. Our discussion will be on the binary field F2 = {O, 1).

The key agreement scheme proposed in [36] is described by the authors as follows:

Assume that a user, Alice, wants to establish a common secret key with Bob via a public

channel. They cm follow the steps below:

(i) Alice randomly chooses a binary k x m matrix A and an arbitrary generalized inverse

-4- of A.

(ii) Alice sends Bob K A . Alice keeps A and A- secret.

(iii) Bob randomly chooses a binary rn x n matrix B and an arbitrary generaiized inverse

B- of B.

(iv) Bob sends Alice the matrices A-AB and A- AB B-. Bob keeps B and B- secret.

(v) Alice sends Bob the matrix A A - A B E = ABB-.

(vi) Alice and Bob are then able to formulate the key K = AB which is a k x n matrix

by computing AA-AB = A B and ABB-B = AB. respectively.

Pinch [IO31 proposed two attacks on the above scheme. The first attack reduces the

security parameter to 2') and the second attack, which applies to about 28% of the

cases, reduces the security parameter to 2(m-k%23. Here we show that breaking this

scheme (Le. obtaining A B from the public information) is equivdent to soiving a set of

consistent rn x n linear equations with m2 binary variables.

The following Lemma will be used in the attack.

L e m m B. 2

There exisü at least one matrix )r that satisfies the equation

-YBB' Y X B = X B .

Proof: Take Y = B ( X B ) - then

XBB- Y X B = X(BB-B) (XB) -SB

The security argument given in [36] was based on the difficulty of solving for A and B

given the public information A-A, A-AB, A-ABB- and ABB-. Here we show that

we c m determine the product AB without solving separately for A or B.

The Proposed Attack:

Let -Y = A-A then solve the linear equation

( X B B - ) Y ( X B ) = ( X B ) (B .6)

for the m x m matrix 1'. From the above lemma , this equation has at least one valid

solution. Note that we need to solve this equation ( i.e., we can not directly use the

soiution given in the lemma above because B is not known.)

Multiplying both sides of equation (B.6) by A (and by noting that .4X = AA-A = A)

we get

(ABB- )Y (xB) = AB. (B -7)

Both ABB- and .YB = A-AB are sent over the public channel. Hence we can

determine the secret key.

For (k, m, n ) = (7,12,15), the secunty parameter suggested in [36] is 2''. The attack

descnbed in this lener gets the key by solving a set of consistent 180 linear equations

with 144 binary variables.

Example:

Take ( k , m , n ) = (3.4.5). Let

and

Then we have

(B. IO)

Solving equation (B.6) for 1' we get 1096 distinct valid solutions. Here we give just

two of them

It c m be verified that

(B. 11)

(B. 12)

Conclusion

The key agreement scheme proposed by Dawson and Wu is insecure.

Appendix C Bounds on the Number of Functions Satisfying the Strict Avalanche Criterion

The Strict Avalanche Criterion (SAC) was introduced by Webster and Tavares [136]

in a study of design criteria for certain cryptographic functions. As in the case with

any criterion of cryptographic significance, it is of interest to count the functions which

satisw the critenon. Many recent papea (for exarnple [76], [32]) have been concemed

with counting functions that satisfy the SAC of various orden. It is easier to count the

functions satisfying the SAC of the largest order, because relatively few functions exist

which satisQ these stringent criteria.

O'Connor [95] gave an upper bound for the number of functions satisfying the SAC.

In (311 Cusick gave a lower bound for the number of functions satisfying the SAC. He

also gave a conjecture that provided an improvement of the lower bound. In this section,

we give a corxtnictive proof for this conjecture. Moreover, we provide an improved

lower bound.

Notation

Throughout this section, let

f, : 2; -, Z2 describes a boolean function with n input variables.

G' = {v; 1 O 5 i 5 3" - l}: denotes the set of vectors in Z!: in lexicographical order.

A boolean function f,(x) is specified by f.(x) = [bo, bi, ..., b p - l ] , where bi = f&).

e: denotes any element of 2; with Harnming weight 1. Let è, Yi E 2;-' denote the

n - 1 l e s t significant bits of e and vi respectively.

a: denotes any element of 2;-' with odd Hamrning weight.

gn-i : 22-1 -) Zz: denotes the boolean function 1 . x @ 6, b E Z2 where 1 denotes

the al1 ones vector in z;-', denotes the dot product operation over Z2 and ;+. denotes

the XOR operation. It is easy to see that gn-1 satisfies

MSB( . ) denotes the most signifiant bit of the enclosed argument-

SACn denotes the number of functions with n input bits that satisfj the SAC.

Cusick 's Conjecture

The following conjecture is given in [31]. This conjecture implies that there are at least .,y"'' - boolean functions of n variables which satisfy the SAC.

Conjecture [3 11: Given any choice of the values f n ( v i ) , O 5 i 5 Y-' - 1, there

exists a choice of fn(vi), 2 < - i - < 2" - 1. such that the resulting function f n ( x )

satisfies the SAC.

We prove this conjecture below. Afier cornpleting Our proof, we learned that Cusick and

Stanica [33] independently proved the conjecture. Also, D. Biss [19] has proved a much

stronger result by a much more complicated argument. If we let Ln = log2Sdcn . L = 3" I lim Ln, then Biss proved that L = 1. The conjecture, of course, only says that Ln > ..

n-oc "

For n = 1, it is trivial to show that if f l(l) = fl (0) $ 1 then the resulting function

satisfies the SAC. In the following lemma we prove that, for n 2 2, there exist at least two

choices for fn(vi), < - i < - <In - 1, such that the resulting function satisfies the SAC.

Lemmu C.1

Let f, = [h,-l [h,- 1 $ gn-l]] where hn-1 is an arbitrary boolean function with n - 1

input variables, n 2 2, and gn-1 is constructed as above to satisfy equation (C.l), then

f, satisfies the SAC.

h o f i Case 1: ICISB(e) = 0:

Case 2: iCiSB(e) = 1:

which proves the lemma. O From lernma C. 1 above, and by noting that we have two choices for g,, we conclude

qn-l + 1 that, for n 2 2. the number of function satisQing the SAC is lower bounded by 2-

Using the following lemma, one can provide some improvement to the above bound.

Let fn = [hn-l [Zn-l @ gn-l]] where h,-l is an arbitras, boolean function with n - 1

input variables, In-l(x) = hn-l(x $ a), n 2 2, and gn-1 is constnicted as above to

satisfy equation (C.1). then f, satisfies the SAC.

Pro08 Case 1: 1\1SB(e) = 0: 2" -1

which proves the lemma.

Note that if the function fn-l does not have any linear structures, then al1 the functions

generated by In-l @ gn-1 will be unique for ail the choices of a. From lemrna

C.1 and lemma C.2 we have 2"-' + 2 distinct choices for f n - 1 ( v i ) , 2"-' - < i < - 2" - 1.

Thus we have the following corollary:

Corollary C.1

The number of functions satisSing the SAC is lower bounded by

where LSn-' is the number of functions with n - 1 input bits having any linear structure.

A complicated formula for LS" is given in [100]. It can aIso be shown [lûû] that LSn 9"-1+1

is asymptotic to ( Z n - 1)2-

One should note that while this bound provides some improvement over the proved bound

in [31], exhaustive search (see Table C.1) shows that the quality of this bound degrades

as n increases. One can improve this bound slightly by identibing specid classes of

functions f, (vi) , O 5 i Y-' - 1 for which there is a large number of choices for fn (vi),

< i < 2" - 1 such that the resulting function, f., satisfies the SAC. For example, - - -

if the function h,-l satisfies the SAC, then the function f,, = [h,-' [h,- 1 c.x 3 b]] ,

b E Zz aiso satisfies the SAC. Thus our bound is slightly improved to

We now give a lower bound on the number of balanced functions that satisfy the SAC.

Lentma C.3

Let f, = [hn-1 [ Z n - , 9 gn-111 where hne1 is an arbitrary boolean function with n - 1

input variables that satisfies hn-l (vi) = F - ~ , (x) = h(x @ a), n 2 2, and w t ( v , ) odd

gn-1 is constructed as above to satisb equation (C. I) , then f, is a balanced function

that satisfies the SAC.

Proof: From lemma C.3 , it follows that fn satisfies the SAC. Here we will prove that

f, is a balanced function.

w t ( 3 , ) odd

which proves the lemrna.

Table C.l Exact Number of Functions Satisfying SAC versus the Denved Lower Bounds.

n

LSn-'

Old Bound [3 11

New Bound (exp. (C.6) )

New Bound (exp. ((2.7))

Exact Number

Sirnilarly, one can also show that the function f n = [hn-l [hn-l a gn-l]] where h,-l

is an arbitrary boolean function that satisfies hn-1(v;) = F~ is a balanced w t ( v , ) even

function that satisfies the SAC. From the lemma above, it follows that the number of

baianced SAC

2

4

2

8

8

8

functions

3

8

4

64

64

64

4

138

16

1536

1920

4 128

is lower bounded by

1

5

4992

256

1099776 1

1 157568

27522560 .

References

[ l ] FIPS PUB 185. Escrowed encryption standard. Depr. of Commerce, Feb. 1994.

[2] C. M. Adams. Constructing symmetric ciphers using the CAST design procedure.

Designs, Codes, and Cryptography, Vol. 12. No. 3, pp. 283-316, 1997.

[3] C.M. Adams. A F o m l and Practicd Design Procedure for Substitution-Permutation

work Cryptosystems. PhD thesis, Queen's University, Kingston, Ontario, Canada,

September, 1990.

[4] C.M. Adams and S.E. Tavares. The use of bent sequences to achieve higher-order

Strict Avalanche Criterion in s-box design. Technical report, TR 90-013, Dept. of

Electrical Engineering, Queen's University, Kingston. Ontario. 1990.

[SI C.M. Adams and S.E. Tavares. Designing s-boxes for ciphers resistant to differential

cryptanalysis. Proceedings of 3rd Symposium on the State and Progress of Research

in Cryptography, pp. 18 1-1 90, Rome, Italy, 1993.

[6] N. Ahmed and KR. Rao. Orthogonal Transfom for Digital Signal Processing.

S pringer-Verlag, New York, 1975.

[7] D. Andelman and J. Reeds. On the cryptanalysis of rotor and substitution-permutation

networks. IEEE Tranî. Inform. Theory, 28(44), pp. 578-584. 1982.

[8] R. Anderson and E. Biham. Two practical and provably secure block ciphers: BEAR

and LION. Proc. of Fast Sofovare Encryption (3). LNCS 1039, Springer-Verlag, pp.

II3-l2O1 1996.

[9] F. Ayoub. Probabilistic completeness of Substitution-Permutation encryption

networks. ZEE Proc. Part E, 129, pp. 195-1 99, 1982.

[10]A. Ben-Israel and T.N.E. Greville. Generalized inverses: Theory and applications.

John Wiley and Sons, New York, 1974.

[11]E. R. Berlekarnp. Algebraic coding theory. McGraw-Hill Book Inc., New York, 1968.

[ 1 21E. B iharn. Dinential Ceptanalysis of Iterated Cryptosysterns. PhD thesis, The

Weizmann Institute of Science, Rehovot, Israel, 1992.

[13]E. Biham. New type of cryptanalytic attacks using related keys. Advances in

cryptology: Proc. of EUROCRYPT '93. Springer-Verlug. Berlin. pp. 398-409, 1994.

[14)E. Biharn. On Matsui's linear cryptanalysis. Advances in Ctyptology: Proc. of

EUROCRYPT '94, Springer- Verlag, pp. 341-355, 1995.

[15]E. Biharn and A. Shamir. Differential cryptanalysis of DES-like cryptosystems.

Journal of Cryptology, Vol. 4, no. 1. pp. 3-72, 1991.

[16]E. Biham and A. Shamir. Differential cryptanalysis of snefm, Khafre, REDOC-II,

LOKI and Lucifer. Advances in Cryptology: Proc. of CRYPTO '91, Springer-Verlag,

pp. 156171, 1992.

[17]E. Biham and A. Shamir. Diffrential cryptanalysis of the full 16-round DES.

Advances in Cryptology: Proc. of CRYPTO '92, Springer-Verlag. pp. 487-496, 1 993.

[18]E. Biharn and A. Shamir. Differential fault anaiysis of secret key cryptosystems.

Advances in Cryptology : Proc. of CRYPTO '97, Springer- Verlag. pp. 5 13-525, 1 997.

[19]D. K. Biss. A lower bound on the number of functions satisfying the Strict Avalanche

Criterion. Subrnitted.

[20]LF. Blake. An Introduction to Applied Probability. John Wiley and Sons, Inc., 1979.

[2 1 ]G. Brassard. Modern Cryptology. Springe-Verlag, New York, 1988.

[22]L. Brown, M. Kwan. J. Pieprzyk, and J. Sebeny. Improving resistance to

differential cryptanalysis and the redesign of LOU. Advances in Cryptology: Proc.

of ASIACRYPT '91, Springer- Verlag, pp. 3650, 1 993.

[23]L. Brown, J. Pieprzyk, and J. Seberry. LOU: A cryptographic primitive for

authentication and secrecy applications. Advances in Cryptology: Proc. of AUSCRYPT

'90, Springer- firlag. pp. 229-236, 1990.

[24]L. Brynielsson. The information leakage through a randomly generated function.

Advances in C~~ptology: Pmc. of EUROCRYPT '9 1, Springer- Verlag. Berlin, pp.

552-553, 199 1.

[25]L. Carlitz and S. Uchiyama. Bounds for exponential sums. Duke Math. Journal. Vol.

24, pp. 3 7 4 1 . 1957.

[26]X. Chang, 2. Dai. and G. Gong. Some cryptographie properties of exponential

hinctions. Advances in Crytology: Proc. of ASIACRYPT '94. Springer- Verlag, pp.

41.5418, 1995.

[27]D. Chaum and J.H. Evertse. Cryptanalysis of DES with a reduced number of rounds.

Advances in Cryptology: Proc. of CRYPTO '8.5, Springer-Verlag, Berlin. pp. 192-21 1,

1986.

[28]B. Chor. O. Goldreich, I. Hastad, J. Friedman, S. Rudich, and R. Smolensky. The

bit extraction problem or t-resilient functions. Proc. of the 26th IEEE Symposium on

Foundation of Computer Science, pp. 3 9 U 0 7 , 1985.

[29]D. Coppersmith. The Data Encryption Standard @ES) and its strength against attacks.

IBM J. Res. Develop. Vol. 38 No. 3, pp. 243-250, May, 1994.

[30]T.M. Cover and J.A. Thomas. Elements of Infonnation Theory. John Wiely & Sons

Inc, 1991.

[31]T. W. Cusick. Bounds on the nurnber of functions satisQing the Strict Avalanche

Criterion. Infonnation Processing Letters. 57, pp. 261-263, 1996.

[32]T.W. Cusick. Boolean functions satisfying a higher order Strict Avalanche Criterion.

Advances in Cryptology: Proc. of EUROCRYPT '93. Springer-Vedag, pp. 102-1 17,

1994.

[33]T.W. Cusick and P. Stanicil. Bounds on the number of functions satisfying the Strict

Avalanche Criterion. Information Processing kt ters 60, pp. 2 15-2 19, 1996.

[34]J. Daemen. L. Knudsen. and V. Rijmen. The block cipher SQUARE. Proc. of Fast

Somare Encryption (4). W C S . Springer- Verlng. 1 997.

[35]D.W. Davis and G.I.P. Parkin. The average cycle size of the key strearn in output

feedback enciphement. Lecture Notes in Cornputer Science : Proc. of the Workshop

on Cryptography. Springer-Verlag' Berlin, pp.263-279. 1982.

[36]E. Dawson and Chuan-Kun Wu. Key agreement scheme based on generaiized inverses

of matrices. IEE Electronics Letiers, Vol. 33, No. 14. pp. 12 1&1211, 1997.

[37]M.H. Dawson and S.E. Tavares. An expanded set of S-box design critena based on

information theory and its relation to differential attacks. Advances in Cryptology:

Pruc. of EUROCRYPT '91. Springer-Verlag. pp. 352-365, 1992.

[38]D.E.R. Denning. Cryptography and Data Security. Addison-Wesley, 1982.

[39]Y. Desmedt. I.J. Quisquater, and M. Davio. Dependence of the output on the

input in DES: Small avalanche charachteristics. Advances in Cvptology: Proc. of

EUROCRYPT '84. Springer- Verlg, Berlin, pp. 359-376, 1985.

[40]W. Diffie and M.E. Helman. Privacy and authentication: An introduction to

cryptography . Proc. of IEEE, 67, pp.397-427, 1 979.

[41]W. Diffie and M.E. Helman. The first ten years of public-key cryptography. Proc.

of IEEE, 76, pp.560-577, 1988.

[421D. Ellion and KR. Rao. Fast a lgorithms: Algorithm. Ana lysis, Applications.

Academic press, 1982.

[43]J.H. Evertse. Linear structures in block ciphers. Advances in Cryptology: Proc. of

EUROCRYPT '87. Springer-Verlag, Berlin. pp.249-266. 1988.

[44]H. Feistel. Cryptography and computer pnvacy. Scientific Arnerican, 228, pp. 15-23.

1973.

[45]H. Feistel, W. Notz. and J.L. Smith. Some cryptographic techniques for machine-to-

machine data communications. Proc. of lEEE, 63, pp. 1545-1554, 1975.

[46]W. Feller. An Introduction to Probubiliq Theory with AppZicationî. Vol. I, 3rd edition.

John Wiley and Sons, New York, 1968.

[47]R. Forrk. The Strict Avalanche Criterion: Spectral properties of boolean functions

and an extended definition. Advances in Cryptology: Proc. of CRYPTO '88. Springer-

Verlag, pp. 45M68, 1989.

[48]R. Forré. Methods and instruments for designing S-boxes. Joumd of Crvptoiogu.

Vol -2, N0.3 pp. 115-130, 1990.

[49]J. Goldman and Gian-Car10 Rota. On the foundation of combinatorial theory

TV. Finite vector spaces and Eulerian generating functions. Studies in Applied

Mathematics. Vol. XWX No.3, pp. 239-258, 1970.

[50]J. Gordon and H. Retkin. Are big S-boxes best ? Proc. of the Worhhop on

Cryptography, W C S , Springer-krlag, Berlin. pp.257-262, 1982.

[5 1JD. Gregory. Pnvate communication.

[52]M. Gysin. A one-key cryptosystem based on a finite nonlinear automaton. Proc.

of International Conference of Cryptography: Policy and Algorithrns, Brisbane,

Queensland, Australia. July 1995, LNCS 1029, Springer- Verlag, pp. 165-1 73, 1 996.

[53]C. Harpes, G.G. Kramer. and J.L. Massey . A generalization of linear cryptanal ysis

and the applicability of Matsui's piling up Lemma. Advances in Cryptology: Proc.

of EUROCRYPT '95, Springer-Vèrlag, pp. 24-38, 1995.

[54]M. E. Hellman. A cryptanalytic tirne-memory trade-off. IEEE Proc. , Vol. IT-26.

No.4, pp.401-406, July, 1980.

[55] H.M. Heys. The Design of Substitution Permutation Network Ciphers Resistan t to

Cryptanalysis. PhD thesis, Queen's University, Kingston, Ontario, Canada, 1994.

[56]H.M. Heys. Modelling avalanche characteristics in DES-like ciphers. Workshop on

Selected Areas in Cryptography, SAC '96, Workshop Record, pp.77-94, 1996.

[57]H.M. Heys and S.E. Tavares. Cryptanalysis of tree-structured substitution-permutation

networks. IEE Elecrronics Letters, Vol. 29, no. 1, pp. 40-41, 1993.

[58]H.M. Heys and S.E. Tavares. Key clustering in substitution-permutation network

cryptosysterns. Workshop on Selected Areas in Cryptography, SAC '94. Worhhop

Record, pp. 1.34- 145, 1994.

[59]H.M. Heys and S.E. Tavares. Known plaintext attack of tree-structured block ciphers.

IEE Electronics k a e r s , Vol. 31 , no.10. pp. 784-785, 1995.

[60]H.M. Heys and S.E. Tavares. The design of product ciphers resistant to differential

and linear cryptanaiysis. J o u d of Cryptology, Vol. 9, no. I, pp. 1-19, 1996.

[6 11H.M. Heys and S.E. Tavares. Avalanche characteristics of substitution-permutation

encryption networks. IEEE Tram. Comp., Vol. 44, pp. 1131-1 139, Sept. 1995.

[62]J.F. Humphreys and M.Y. Prest. Nwnbers, groups and Codes. Cambridge University

Press, Cambridge, 1989.

[63]T. Iakobsen and L.R. Knudsen. The interpolation attack on block ciphers. Proc. of

Fasr Sofiare Encryption Workshop (41, 1997.

[64]B.S. Kaliski, R.L. Rivest, and A.T. Sherman. 1s the Data Encryption Standard a

group ? Journal of Cryptology, pp. 1-36, 1988.

[65]B.S. Kaliski and M.J.B. Robshaw. Linear cryptanalysis using multiple

approximations. Advances in Cryptology: Proc. of CRYPTO '94, Springer-krlag,

Berlin, pp. 2639 , 1994.

L661J.B. Karn and G.I. Davida. Stmctured design of substitution-permutation encryption

networks. IEEE Tram. Comp. C-28, pp.747-753. 1979.

[67]D. Khan. nte Codebreakers, The Story of Secret Writing. Macmillan Co. New York,

1967.

168lL.R. Knudsen. Tmncated and higher order differentials. Proc. of Fast Sofrware

Encryption (2). LNCS 1008, Springer- Verlag, pp. 196-2 1 1, 1995.

[69]L.R. Knudsen. Block Ciphers- Analysis, Design and Applications. PhD thesis, Aarhus

University, Denmark, July 1994.

[70]P. C. Kocher. Timing attacks on implernentations of Diffe-Hellman, RSA. DSS.

and other systems. Advances in Cryptology: Proc. of CRYPTO '96, Springer- Verlag,

Berlin. pp. 1 W 1 1 3 . 1996.

[71]K. Koyama and R. Terada. Nonlinear parity circuits and their cryptographie

applications. Advances in Cryptology: Proc. of CRYPTO '90. Springer-Verlag, Berlin.

pp. 582-599, 1991.

[72]X. Lai. Higher order derivatives and differential cryptanalysis. Symposium on

Communication, Coding. and Cryptography. Monte-Verita, Ascona, Switrerlund,

Feb.10-13, 1994.

[73]X. Lai, J.L. Massey, and S. Murphy. Markov ciphers and differential cryptanalysis.

Advances in Crytology: Pmc. of EUROCRYPT '9 1. Springer- Verlag, pp. 17-38. 1992.

[74]S.K. Langford and M.E. Hellman. Differential-linear cryptanalysis. Advances in

Cryptology: Proc. of CRYPTO '94, Springer-Verlag, Berlin, pp. 17-25. 1994.

[75]J. Lee, H.M. Heys. and S.E. Tavares. On the resistance of a CAST-like encryption

algorithm to linear and differential cryptanalysis. Designs. Codes and Cryptography,

Vol. 12, No. 3, pp. 267-282, 1997.

[76]S. Lloyd. Counting functions satisfying a higher order Strict Avalanche Criterion.

Advances in Cryptology: Proc. of EUROCRYPT '89, Springer- Verlag. pp. 63-74, 1 990.

[77]M. Luby and C. Rackoff. How to constnict pseudorandom permutation from

pseudorandom functions. SIAM Journal on Computing. Vol. 17. pp.3 73-386, 1988.

[78]F.J. MacWilliarns and N.J.A. Sloane. rite theory of error correcting codes. North-

Holland Publishing Company, 1977.

[79]J. Massey. SAFER K-64: a byte-oriented block-ciphering algonthm. Fast Sojhvare

Encryption, LNCS 809. Springer- Verlag. pp. 1- 17, 1994.

[80]J. L. Massey. An introduction to contemporary cryptology. Proc. of IEEE. 76,

pp.533-549, 1 988.

[8 1]M. Matsui. The first experirnental cryptanalysis of the Data Encryption Standard.

Advances in Cryptology: Proc. of CR YPTO '94, Springer- Verlag, Berlin. pp. I-I I ,

1994.

[82]M. Matsui. Linear cryptanalysis method for DES cipher. Advances in Cryptology:

Proc. of EUROCRYPT '93, Springer-krlag, Berlin, pp. 386397, 1994.

[83]M. Matsui. New structure of block cipher with provable security against differential

and Iinear cryptanalysis. Proc. cf Fasr Sofrware Encryption (3). LNCS 1039. Springer-

Verlag, pp. 205-2 18, 1996.

[84]W. Meier and 0. Staffelbach. Nonlinearity criteria for cryptographic functions.

Advances in Cryptology: Proc. of EUROCRYPT' 89, Springer-Vèrlag. pp. 549-562,

1990.

[85]A. J. Menezes, P. Van Oonchot, and S. A. Vanstone. Handbook of Applied

Cryptography. CRC press, New York, 1997.

[86]C.H. Meyer and S.M. Matyas. Cryptography: A New Dimension Ni Cornputer Data

Securiv. Wiley, New York, 1982.

[87]S. Mister and C. Adams. Practical s-box design. Workshop on Selected Areas in

Cryptography, SA C '96. Workshop Record. pp. 61-76, 1 996.

[88]C. Mitchell. Enumerating boolean functions of cryptographic significance. Journal

of Cryptology, pp. 155-170, 1990.

[89]K. Nyberg. On the construction of highly nonlinear permutations. Advances in

Cryptology: Proc. of EUROCRYPT '92. Springer- Rrlag, Berlin. pp.92-98, 1992.

[90]K. Nyberg. Perfect nonlinear S-boxes. Advances in Cryprology: Proc. of EUROCRYPT

'91 , Springer- Verlag, pp. 37û-386, 1992.

[91]K. Nyberg. Differentidly uniform mappings for cryptography. Advances in

Cryptology: Proc. of EUROCRYPT '93, Springer- Verlag, Berlin, pp.5544. 1994.

[92]K. Nyberg. Linear approximation of block ciphers. Advances in Cpptology: Proc. of

EUROCRYPT '94. Springer-Verlag. Berlin, pp. 439444, 1995.

[93]K. Nyberg. S-boxes and round functions with controllable linearity and differential

uniformity. Pmc. of Fast Software Encryption (2). LNCS 1008. Springer-Verlag, pp.

111-130, 1995.

[94]K. Nyberg and L.R. Knudsen. Provable security against differential cryptanalysis.

Advances in Crytology: Proc. of CRYPTO '92, Sprhger-Verlag. pp. 566-574, 1993.

[95]L.J. O'Connor. An upper bound on the number of functions satisfying the Strict

Avalanche Critenon. hformation Processing Leners, 52, pp.325-327, 1994.

r96lL.J. O'Connor. On the distribution of charachteristics in composite permutation.

Advances in Cryptology: Proc. of CRYPTO '93. Springer-Verlag, Berlin, pp. 403-

412, 1994.

[97]L.J. O'Connor. On the distribution of characteristics in bijective mappings. Advances

in Cryptology: Proc. of EUROCRYPT '93, Springer-Verlag, Berlin, pp. 36&370,

1994.

[98]L.J. O'Connor. The inclusion-exclusion pnnciple and its applications to cryptography.

Cryptologia, Vol. XVII. no. 1, pp. 63-79, January, 1993.

[99]L.J. O'Connor and I. Dj. Golic. A unified Markov approach to differential and lineaar

cryptanalysis. Advonces in Cryptology: Proc. of ASIACRYPT '94, Springer-Verlag,

Berlin, pp. 387-397, 1995.

[lûû]L.J. O'Connor and A. Klapper. Algebraic nonlinearity and its application to

cryptography. Journal of Cryptology, Vol. 7, pp. 2 13-227, 1994.

[IO1 ]National Institute of Standard and Technology. Advanced encryption standard (AES)

development effort. http://csrr.nist.gov/encryption/aed'es~home. hml.

[102]National Bureau of Standards (US). Data Encryption Standard@ES). Federal

Infornation Processing Standards Publicarion 46, 1 977.

[103]R.G.E. Pinch. Comments on "key agreement scheme based on generalized inverses

of matrices". Preprint, 1997.

[104]M. Portz. On the use of interconnection networks in cryptography. Advances in

Cryptology: Proc. of ECIROCRYPT '9 1. Springer- Verlag, pp. 302-3 15, 1992.

[105]B. Preneel. W.V. Leekwijk, L.V. Linden, R-Govaerts, and J. Vandewalle. Propagation

charachtenstic of boolean functions. Advances in Cryptology: Proc. of ELIROCRYPT

'90, Sprfnger-Verlag, pp. 161-1 73, 199 1.

[106]J. Quisquater, Y. Desmedt, and M. Davio. The importance of "good" key scheduling

SC hemes . Advances in Cqptology : Proc. of CRYPTO '85, Springer- Ve rlag. Berlin.

pp. 537-542, 1986.

[107]J.A. Reeds and J.L. Manferdelli. DES has no per round linear factor. Advances in

Cryptology: Proc. of CRYPTO '84, Springer- Velag. Berlin, pp. 377-389, 1985.

[108)V. Rijmen, J. Daemen, B. Preneel, A. Bosselaen, and E. De Win. The cipher

SHARK. Fast SofMtare Encryption, LNCS 1039. D. Gollmann. Ed., Springer-Verlag,

pp. 99-112, 1996.

[109]V. Rijmen, B. Preneel, and E. De Win. On weakness of non-surjective functions.

Designs. Codes. and Cryprography. Vol. 12. No. 3. pp. 253-266, 1997.

[ 1 10]T. Ritter. Dynarnic Substitution, U.S. patent 4979832. HTML article availuble from

" www. io.codm/ritter " .

[ 1 1 1]R. Rivest. The RC4 encryption algorithm. Intemal Report, RSA Data Security, Inc.,

1992.

[ 1 1 2lR.L. Rivest . The RC5 encry ption algorithm. Fast sofiare encryption. W C S 1008,

Springer- Verlag, pp. 86-96, 1 995.

[ 1 1 3lF.S. Roberts. Applied Combinatorics. Englewood CIi ffs, N.J. : Prentice-Hall, 1 984.

[114]M. J.B. Robshaw. Block ciphen. Technical report, TRaOl version 2, RSA

Laboratories, August, 1995.

[115]M. J.B. Robshaw. Stream ciphen. Technicai report, TR-701 version 2, RSA

Laboratones, July 1995.

[116]O.S Rothaus. On bent hinctions. Journal of Combinatorial 7heor-y. Vol. 20(A):30&

305, 1976.

[117]R.A. Rueppel. Analysis and Design of Stream Ciphers. Springer-Verlag, Heidelberg

and New York, 1986.

[118]I. Schaumüller-Bichl. Cryptanalysis of the Data Encryption Standard by method

of formal coding. Proc. of the workshop on Cryptogrophy, Springer- Mrlag. Berlin.

pp.235-255, 1982.

[119]B. Schneier. Description of a new variable-length key, M i t block cipher (Blowfish).

Proc. of Fast So&are Encryption Workrhop. UVCS 809, Springer- Verlag, Berlin, pp.

191-204, 1994.

[120]B. Schneier. Applied Cryptography, Second edition. John Wiley and Sons, Inc, 1996.

[12 1JB. Schneier and J. Kelsey. Unbalanced Feistel networks and block cipher design.

Proc. of Fast Sofrware Encryption (31, LNCS 1039, Springer-Verlag, pp. 121-144,

1996.

[122]J. Seberry, X. Zhang, and Y. Zheng. Systematic generation of cryptographically

robust s-boxes. 1st A CM Conference on Computer and Communications Security,

Fai@iàx, Krginia, pp. 172-182, November, 1993.

[123]C.E. Shannon. Communication theory of secrecy systems. Bell Systems Technical

Journal, Vo1.28, pp. 6.56715. 1949.

[124]A. Shimizu and S. Miyaguchi. Fast data encryption algonthm FEAL. Advances in

Cryptology: Proc. of EUROCRYPT '87. Springer- Verlag, Berlin, pp. 267-2 78, 1988.

[125]T. Siegenthaler. Decrypting a class of Stream ciphers using ciphertext only. IEEE

Trans. Cornput.. Vo1.C-34, No. 1. pp. 81r85, 1985.

[ 126]T. Siegenthaler. Correlation-irnrnunity of nonlinear combining functions for

cryptographic applications. IEEE Tram. on hfonn. Zheory. VoLIT-30. No.5. pp.

776:780, Sept. 1984.

[ 127JG.J. Simmons. A survey of information authentication. Proc. of lEEE. 76. pp.603-

620, 1988.

[128]G.J. Simmons. Contemporas, cryptology, the science of information integrity. IEEE

Press, New York, 1992.

[129]A. Sinkov. Elementary Cryptanalysis. Mathematical Association of America, 1966.

[130]M. Sivabalan, S.E. Tavares, and L.E. Peppard. On the design of SP networks from

an information theoretic point of view. Advances in Cryptology: Proc. of CRYPTO

'92, Springer- Verlag, Berlin. pp. 26&2 79, 1993.

[13 1]A. Sorkin. Lucifer: A cryptographic algorithm. Cryptologia, 8, pp. 22-35, 1984.

( 132JR. Stong. Pnvate communication.

[133 ]S. Vaudenay. On the need for rnultipermutations: Cryptanalysis of MD4 and SAFER.

Proc. of Fast Sofrware Encryption (21, LNCS 1008, Springer- Verlag, pp. 286297,

1995.

[134]R. Verser. DES challenge: We cracked the code. Avilable through

"www. frii.co~rcv/deschaZZ. htm ", April 8, 1 997.

[ I35lA.F. Webster. Plaintext I ciphertext bit dependencies in cryptographic systems.

Master's thesis, Queen's University, Kingston. Ontario, Canada, December, 1985.

[136]A.F. Webster and S.E. Tavares. On the design of S-boxes. Advances in Cryptology:

Pmc. of CRYPTO '85 , Springer-Verlag, pp. 523-534, 1986.

[137]A. Weil. On some exponential sums. Proceedings of the National Academy of

Sciences, Vol. 34, p p . 2 W 2 0 7 , 1 948.

[138]D. 1. Wheeler and R.M. Needham. TEA, a Tiny Encryption Algorithm. Proc. of

F m Sofnvare Encryption Workshop (2). LNCS 1008. Springer-Verlag, Berlin. pp.

363-366, 1995.

[139]M.J. Wiener. Efficient DES key search. Technicai report, TR-244. School of Computer

Science, Carleton University, Ottawa, Canada, May 1994. (also presented at the

Rump Session of CRYPTO '93).

[140]G.Z. Xiao and J.L Massey. A spectral charachtenzation of correlation-immune

combining functions. IEEE Trans. Infonn. Theory. Vol iT-3#:569:571, 1988.

[141]A.M. Youssef, 2. Chen, and S.E. Tavares. Construction of highly nonlinear injective

s-boxes with application to CAST-like encryption algorithm. Proceedings of the

Candian Conference on Electrical and Computer Engineering (CCECE' 97), pp.

330-333, 1997.

[142]A.M. Youssef, T.W. Cusick, P. Stanica, and S.E. Tavares. New bounds on the number

of functions satisfying the Strict Avalanche Cntenon. Workîhop on Selected Arem in

Cryptography. SAC '96, Workshop Record, pp.49-60, 1 996.

L143lA.M. Youssef, S. Mister, and S.E. Tavares. On the design of linear transformations

for substitution permutation encryption networks. Workshop on Selected Arear in

Cryptography. SAC ' 97, Workshop Record, pp. 40-18, 1 997.

[144]A.M. Youssef and S.E. Tavares. Number of nonlinear regular s-boxes. IEE Electronics

Leiîers, Vo1.31, No. 19, pp. 1643-1644, 1995.

[145]A.M. Youssef and S.E. Tavares. Resistance of balanced s-boxes to linear and

differential cryptanalysis. Infomtion Processing Leners, 56(199.S), pp. 249-252,

1995.

[146]A.M. Youssef and S.E. T'avares. Spectral properties and information leakage of

multi-output boolean functions. Proceedings of the IEEE International Symposium

On Information Theory. Whistler, B. C., Canada, Sept. 1 7-22, 1 995.

[147]A.M. Youssef and S.E. Tavares. Comment on "bounds on the number of functions

satisfying the Strict .4valanche Critenon. Infomtion Pmcessing Letters 60. pp.

271-275, 2996.

[148]A.M. Youssef and S.E. Tavares. A cornparison of two families of block ciphers.

Proceedings of the 18th Biennial Symposium on Commwiicationr. Kingston. Ontario,

pp. 200-203. 1996.

[ 149lA.M. Youssef and S.E. Tavares. Information leakage of randornly selected boolean

functions. Information Theory and Applications II, 4th Canadiun Workshop On

Information Theory. May 1995, LNCS 1133, pp. 41 -52, Springer-Verlag, 1996.

[lSO]A.M. Youssef and S.E. Tavares. On the avalanche charactenstics of substitution-

permutation networks. Proceedings of Pragocrypt '96, pan 1, pp. 18-29, 1996.

[15 1lA.M. Youssef and S.E. Tavares. Cryptanalysis of the nonlinear parity circuits proposed

at crypto' 90. IEE Electronics Letters, Vo1.33, No. 7, pp. 585-586, 1997.

[152]A.M. Youssef and S.E. Tavares. Modelling avalanche characteristics of substitution-

permutation networks using Markov chahs. Proceedings of the 5th Canaàian

Workshop on Information Theory, pp. 45-48, Toronto, June 3-6. 1997.

[153]A.M. Youssef, S.E. Tavares, and H.M. Heys. A new class of substitution-permutation

networks. WorkFhop on Selected Areas in Cryptography, SAC '96, Workrhop Record,

pp. 132-147, 1996.

[154]A.M. Youssef. S.E. Tavares. S. Mister, and C.M. Adams. Linear approximation of

injective s-boxes. IEE Electronics Letters. VoLJI, No. 25, pp. 2 168-2 169, 1 995.

[155]M. Zhang, S.E. Tavares, and L.L. Campbell. Information leakage of boolean

functions and its relationship to other cryptographic critena. Proceedings of 2nd

ACM Conference on Cornputer and Communications Securiv, Fai$àx, Hrginia, pp.

156-165., 1994.

IMAGE EVALUATION TEST TARGET (QA-3)

APPLIED 1 IMAGE. lnc S 1653 East Main Street - -. - Rochester. NY 14609 USA -- -- - - Phone: 71W2-0300 -- -- - - Fax: 7161288-5989

O 1993, Applied Image. tnc.. All R ~ t i t t Resewed

Documents

Analysis and Design of Block Ciphers · Acknowledgments This work would never have been possible without the guidance, encouragement, and suggestions of my supervisor, Dr. Stafford