Analysis and Design of Block Ciphers
by
Amr Mohamed Youssef
A thesis submitted to the
Department of Electrical and Computer Engineering
in conformity with the requirements for the degree of
Doctor of Philosophy
Queen's University
Kingston, Ontario, Canada
December 1997
Copyright © 1997 Amr Mohamed Youssef
National Library of Canada / Bibliothèque nationale du Canada
Acquisitions and Bibliographic Services / Acquisitions et services bibliographiques
395 Wellington Street / 395, rue Wellington, Ottawa ON K1A 0N4, Canada
The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.
The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
Abstract
In this thesis we study various cryptographic properties of boolean mappings from n bits
to m bits. In particular, we derive expressions for the expected size of the maximum XOR
table entry and the maximum Linear Approximation Table entry for some combinatorial
structures of interest such as regular (balanced) mappings and injective mappings. We
derive similar expressions for the expected value of different forms of information leakage
and relate different forms of information leakage to the spectral properties of the function.
We also extend the definitions of many cryptographic criteria to multi-output boolean
functions and study the relationship between the Walsh-Hadamard transform and various
types of information leakage.
A new construction method for highly nonlinear injective s-boxes is presented. It
is shown that the resistance of CAST-like encryption algorithms (based on randomly
selected substitution boxes) to the basic linear cryptanalysis was underestimated in
previous work.
We introduce a new class of Substitution Permutation Networks (SPNs) with the
advantage that the same network can be used to perform both the encryption and the
decryption operations. Different cryptographic properties of this class, such as resistance to
both linear and differential cryptanalysis, are examined. We also present two construction
methods for involution linear transformations for SPNs based on Maximum Distance
Separable codes.
An analytical model for the avalanche characteristics of SPNs with different linear
transformation layers is developed. We also prove a conjecture by Cusick regarding the
number of functions satisfying the Strict Avalanche Criterion.
Dedicated to my parents
For their eternal love, encouragement, and support, I am grateful.
Acknowledgments
This work would never have been possible without the guidance, encouragement, and
suggestions of my supervisor, Dr. Stafford Tavares. It is my pleasure to thank him
for all of his help.
I wish to thank Dr. Moustafa Fahmy who introduced me to Dr. Tavares.
I gratefully acknowledge the financial support of the Ontario Ministry of Education,
Telecommunication Research Institute of Ontario (TRIO), and Queen's University.
Special thanks and appreciation go to my wife, Ayda, for her many years of patient
support and encouragement.
List of Notation
The set of binary numbers
The set of binary n-tuples
XOR operation
A variable in $Z_2^n$ or $GF(2^n)$
A scalar variable
A random variable in $Z_2^n$ or $GF(2^n)$
The cardinality of the enclosed set
A mapping from $Z_2^n \to Z_2^m$
The Hamming weight of a
The dot product of a and x over $Z_2$
The multiplication of a and x over $GF(2^n)$
The Walsh transform of $(-1)^{c \cdot f(x)}$
Linear Approximation Table entry
XOR distribution table entry
LAT*: $\max_{a \neq 0,\, b} |LAT(a, b)|$
XOR*: $\max_{\Delta x \neq 0,\, \Delta y} XOR(\Delta x, \Delta y)$
$NL_f$: Nonlinearity of f
In(n, m): Number of injective functions $f : Z_2^n \to Z_2^m$
Number of balanced functions $f : Z_2^n \to Z_2^m$
The entropy function
The binary entropy function
The mutual information
Static information leakage
Self static information leakage
Dynamic information leakage
Trace function
Table of Contents
Abstract
Acknowledgments
List of Notation
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation for this Research
1.2 Outline of Remainder of Thesis
Chapter 2 Previous Research
2.1 Architectures
2.1.1 Substitution-Permutation Networks
2.1.2 DES-like Structure (Feistel Networks)
2.1.3 Other Structures
2.2 S-box versus Non S-box Approaches
2.3 Cryptanalysis
2.3.1 Exhaustive Search
2.3.2 Differential Cryptanalysis
2.3.3 Linear Cryptanalysis
2.3.4 Extensions to Linear and Differential Cryptanalysis
2.3.5 Other Attacks
2.3.6 Implementation Dependent Attacks
2.3.6.1 Timing Attacks
2.3.6.2 Differential Fault Cryptanalysis
Advanced Encryption Standard (AES) Requirements
Mathematical Background and Definitions
Cryptographic Criteria for Boolean Functions
The Inclusion-Exclusion Principle
The Walsh Transform of Boolean Functions
Chapter 3 Linear Approximation Table and XOR Distribution Table of Balanced S-boxes
Linear Approximation Table of Balanced S-boxes
XOR Distribution Table of Balanced S-boxes
Counting the Number of Nonlinear Balanced S-boxes
Conclusion
Chapter 4 Linear Approximation of Injective S-boxes
Definitions and Notation
Linear Approximation Table of Injective Mappings
Construction of Highly Nonlinear Injective S-boxes
Definitions
Construction Method I
Construction Method II
Comments on the Security of the CAST Encryption Algorithm
Chapter 5 Information Leakage and Spectral Properties of Boolean Functions
Introduction
Definitions
Information Leakage of a Randomly Selected Boolean Function
Information Leakage of a Randomly Selected Balanced Boolean Function
Information Leakage of a Randomly Selected Injective Boolean Function
Relation Between the Walsh Transform and Different Forms of Information Leakage
Extended Definitions
Conclusion
Chapter 6 A New Class of Substitution-Permutation Networks
6.1 Introduction
6.2 S-boxes
6.3 S-box Interconnection Layer
6.3.1 Proposed Linear Transformation
6.3.2 Involution Linear Transformations based on MDS codes
6.4 Resistance to Differential and Linear Cryptanalysis
6.5 Cyclic Properties of the Proposed SPN
6.6 Key Scheduling Algorithm
6.7 Performance
6.8 Conclusion
Chapter 7 Modelling Avalanche Characteristics of Substitution-Permutation Networks Using Markov Chains
7.1 Introduction
7.2 Convergence in Markov Chains
7.3 Modelling Avalanche in S-boxes
7.4 Modelling the Linear Transformation Layer
7.4.1 Model A: Fixed Permutation Layer
7.4.2 Model B: Linear Transformation Type 1
7.4.3 Model C: Linear Transformation Type 2
7.4.4 Model D: Linear Transformation Type 3
7.5 Results and Discussion
Chapter 8 Conclusion
8.1 Contributions of the Thesis
8.2 Future Work
Appendix A Cryptanalysis of the "Nonlinear-Parity Circuits" Proposed at Crypto '90
Appendix B Cryptanalysis of "key agreement scheme based on generalized inverses of matrices"
Appendix C Bounds on the Number of Functions Satisfying the Strict Avalanche Criterion
References
Vitae
List of Figures
Figure 2.1 General Classification of Block Cipher Architectures
Figure 2.2 SPN with N = 16, n = 4, and R = 3
Figure 2.3 Round i of DES-like Cipher
Figure 4.1 CAST Round Function
Figure 5.1 Lower Bound for the Binary Entropy Function
Figure 5.2 Expected Value of SSL(Y) for a Randomly Selected Boolean Function
Figure 5.3 Expected Value of SSL(Y) and Expected Value of SL(Y | X+) for a Randomly Selected Boolean Function
Figure 5.4 Expected Value of DL(ΔY | ΔX) for an n x n Random Mapping and an n x n Random Bijective Mapping
Figure 6.1 SPN with N = 16, n = 4, and R = 3
Figure 6.2 Average Nonlinearity and Average Maximum XOR Table Entry Versus the Number of Fixed Points (n = 8)
Figure 6.3 Involution Linear Transformation Based on MDS Codes (M = n = 8, Irreducible Polynomial = 11d)
Figure 6.4 Normalized Distribution of Cycle Length for all 2^16 Starting Points
Figure 6.5 A Typical Expanded Segment of the Cycle Length Distribution of SPNs with Random Involution S-boxes
Figure 6.6 Key Scheduling Algorithm
Figure 7.1 SPN with N = 16, n = 4, and R = 3
Figure 7.2 Theoretical and Experimental Avalanche for SPN with N = 64, n = M = 8
Figure 7.3 The Transition Matrices for Model A and B (M = n = 8)
Figure 7.4 The Transition Matrices for Model C and D (M = n = 8)
Figure 7.5 The Limiting Distribution and the Eigenvalues for Model A, B, C and D (M = n = 8)
List of Tables
Table 3.1 Experimental Results versus Theoretical Bounds for 10^4 Randomly Chosen 8-bit Bijective Mappings
Table 3.2 Distribution of the Maximum XOR Table Entry and the Maximum LAT Entry (Experimental Results for 10' Randomly Chosen 8-bit Bijective Mappings)
Table 4.1 Experimental Results for 8 x m Injective S-boxes
Table 4.2 NL, XOR* for 8 x 16 S-boxes Obtained Using Construction Method II
Table 4.3 NL, XOR* for 8 x 24 S-boxes Obtained Using Construction Method II
Table 4.4 Best S-box Nonlinearity Obtained By Different Construction Methods
Table 6.1 Minimum Distance for 10^5 Randomly Selected Involution Linear Transformations (n = M = 8)
Table 6.2 Relative Speed of the Proposed SPN and Q-CAST
Table 7.1 The Second Largest Eigenvalues for SPNs with N = 64, and M = n = 8
Table A.1 A parity circuit C(n, d) with n = 10, and d = 3
Table C.1 Exact Number of Functions Satisfying SAC versus the Derived Lower Bounds
Chapter 1 Introduction
Until recently, cryptography has been of interest primarily to the military and diplomatic
communities. Today, however, several factors have combined to stimulate great interest
in commercial applications. It is becoming increasingly apparent that a primary focus
of society is on the generation, transmission, processing, and storage of information.
Consequently, the focus on data vulnerability and data crime in recent years has
emphasized the need for techniques to protect information, especially when it is in
electronic form. Among these techniques is cryptography: transformations of data
intended to make the data useless to one's opponents [40]. Such transformations provide
solutions to two basic problems of data security: the privacy problem, preventing an opponent
from extracting information from a communication channel, and the authentication
problem, preventing an opponent from injecting false data into the channel or altering
messages so that their meaning is changed. The field of cryptology encompasses both
cryptography (cipher making) and cryptanalysis (cipher breaking).
Cryptographic systems can be classified into two groups: private-key cryptosystems
and public-key cryptosystems. In a private-key cryptosystem, the enciphering and
deciphering keys are the same (or easily determined from each other), and the key
is kept secret, while in public-key cryptosystems, the two keys differ in such a way
that one key is computationally infeasible to determine from the other. While public-key
algorithms have been suggested as a solution to the key distribution problem for
private-key cryptosystems, private-key cryptosystems are still the only practical solution
for cryptographic applications that require high data rates and low power consumption.
Many cryptographic systems implement a public-key algorithm for key distribution, and
a private-key algorithm for the data encryption.
Private-key cryptosystems can be classified into block and stream ciphers. A block cipher
divides the plaintext message into successive blocks and enciphers each block with the
same key. A stream cipher breaks the plaintext message into successive characters or
bits and enciphers each character or bit with one element of the key stream.
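The distinction between the two classes can be illustrated with a minimal Python sketch. Both transformations below are placeholders chosen for brevity and offer no security; the names `block_encipher`, `stream_encipher`, and `toy_key_stream` are hypothetical and do not come from the thesis.

```python
def block_encipher(message: bytes, key: bytes) -> bytes:
    """Toy block cipher: split the message into len(key)-byte blocks and
    transform every block with the SAME key."""
    size = len(key)
    if len(message) % size:
        raise ValueError("message length must be a multiple of the block size")
    out = bytearray()
    for start in range(0, len(message), size):
        block = message[start:start + size]
        # Placeholder keyed transformation: XOR with the key, then reverse the block.
        out += bytes(b ^ k for b, k in zip(block, key))[::-1]
    return bytes(out)


def toy_key_stream(seed: int):
    """Hypothetical key-stream generator (a linear congruential generator, not secure)."""
    state = seed
    while True:
        state = (state * 1103515245 + 12345) % 2**31
        yield (state >> 16) & 0xFF


def stream_encipher(message: bytes, key_stream) -> bytes:
    """Toy stream cipher: combine each byte with the next key-stream element."""
    return bytes(b ^ next(key_stream) for b in message)
```

With the block sketch, two identical plaintext blocks always produce identical ciphertext blocks because the same key is applied to each, whereas the stream sketch consumes a fresh key-stream element for every byte. Since XOR is its own inverse, calling `stream_encipher` again with a generator started from the same seed recovers the message.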
1.1 Motivation for this Research
Private-key cryptosystems, with their attractive features such as high data rates and low
power consumption, provide a practical solution for a variety of applications. While
the most recognizable private-key cryptosystem is the Data Encryption Standard (DES)
[102], the DES controversy was born at the same time as DES. Ever since DES was proposed
as a standard, it has drawn considerable criticism. The small key size and the unpublished
s-box design criteria are the main issues in this criticism. Today, the cryptographic
community is well aware that breaking DES is far more feasible
than was previously thought [139], [134]. Triple-DES, while offering good
security, is excessively inefficient, especially for software implementations. While the
need for a serious alternative to DES increases, the dramatic failure of some other
proposed alternatives, such as FEAL [124] and the "Nonlinear Parity Circuits" proposed
at Crypto '90 [153] (see appendix A), illustrates the difficulty of achieving this¹.
Boolean functions, because they are the basic components of cryptographic
transformations, attract great attention in the cryptographic field. Despite the fact
that almost all cryptographic functions are multi-output boolean functions, the area
of multi-output boolean functions is still almost untouched. Moreover, most of the
successful cryptanalytic techniques on cryptographic functions deal with several or all
output components simultaneously instead of dealing with single components separately.
This work explores different cryptographic properties of multi-output boolean functions.
While there are currently many block cipher proposals, almost all of them are software
oriented ciphers and most of them are optimized around 32 or 64-bit processors. The
purpose of this work is not only to design a hardware efficient block cipher that is both
secure and also software efficient on most platforms, but also to provide some design
principles and analytical tools that can help in the design process.
¹ Appendix B shows an example of a dramatic failure of a key agreement scheme.
1.2 Outline of Remainder of Thesis
Chapter two gives a review of the previous research related to our work and some of the
mathematical background and the necessary definitions required throughout the thesis.
In chapter three, we study the Linear Approximation Table (LAT) and the XOR
distribution table of balanced s-boxes. We also calculate the probability that any nonzero
linear combination of the output coordinates of a regular s-box is an affine function.
Chapter four contains similar results for injective s-boxes. It also contains two algebraic
construction methods for highly nonlinear s-boxes.
In chapter five we study different forms of the information leakage through a randomly
selected boolean function. We also present the relation between the Walsh transform and
different forms of information leakage.
In chapter six we introduce a new class of Substitution Permutation Networks (SPNs)
and study its resistance to both linear and differential cryptanalysis.
In chapter seven we study the avalanche characteristics of four different classes of SPNs.
Finally, in chapter eight we give a summary of our results and directions for future work.
We note that some of the material in chapter three appeared in [141], [145]. Part of
chapter four appeared in [154], [144]. Part of chapter five appeared in [146], [149]. Part
of chapter six appeared in [153]. Some of the material of chapter seven appeared in
[150], [152].
Chapter 2 Previous Research
A good introduction to cryptography can be found in the following books
[21,38,67,85,86,120,129] and survey articles [40,41,44,80,114,115,127]. In this chapter
we focus on topics relevant to our work.
2.1 Architectures
In his landmark paper, Shannon [123] presented the principles of confusion and diffusion.
Because these principles are so successful in capturing the essence of the desired attributes
of a block cipher, they have become the cornerstone of block cipher design.
Confusion is described as "the use of enciphering transformations that complicate the
determination of how the statistics of the ciphertext depend on the statistics of the
plaintext" [128] or, more briefly, to make the relation between the key and the ciphertext
as complex as possible, thereby hiding the statistical features of the plaintext.
On the other hand, diffusion spreads the influence of individual plaintext characters
over as much of the ciphertext as possible, thereby hiding the statistical features of the
plaintext. Methods of achieving good diffusion and confusion lie at the heart of block
cipher design. A general classification of block cipher design architectures is shown
in Figure 2.1.
Figure 2.1 General Classification of Block Cipher Architectures (diagram labels: Block Cipher Architectures; SPN-like Structure; DES-like Structure; S-box based; Non S-box based; Fixed S-box)
2.1.1 Substitution-Permutation Networks
Feistel [44] and Feistel, Notz, and Smith [45] were the first to suggest that a Substitution
Permutation Network (SPN) architecture (Figure 2.2) consisting of iterative rounds
of nonlinear substitutions on small sub-blocks, referred to as substitution boxes (s-boxes),
connected by bit permutations was a simple, effective implementation of Shannon's
principle of mixing transformation based on the concepts of "confusion" and "diffusion".
Keying the network can be accomplished by XORing the key bits with the data bits
before each round of substitution and after the last round or by choosing a different set
of s-boxes for each key.
Kam and Davida [66] introduced the concept of completeness. A technique for designing
a complete SPN was given. Completeness guarantees that every ciphertext bit of the SPN
depends on all input plaintext bits, and this should hold for every possible key value.
Chosen plaintext and known plaintext attacks against tree-structured SPNs introduced
by Kam and Davida are given in [57], [59].
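The SPN structure and the completeness criterion described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the 4-bit s-box, the transposition-style bit permutation, the example subkeys, and the function names are all chosen here and are not the networks studied in the cited papers. Keying follows the XOR option mentioned in the text (key bits XORed before each substitution round and after the last one).

```python
# Assumed example 4-bit s-box (any bijective table would do for this sketch).
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]
N, n = 16, 4  # block size and s-box size in bits


def substitute(block: int) -> int:
    """Apply the s-box to each of the N/n sub-blocks."""
    out = 0
    for j in range(N // n):
        out |= SBOX[(block >> (n * j)) & 0xF] << (n * j)
    return out


def permute(block: int) -> int:
    """Assumed bit permutation: bit i of sub-block j moves to bit j of sub-block i."""
    out = 0
    for j in range(N // n):
        for i in range(n):
            out |= ((block >> (n * j + i)) & 1) << (n * i + j)
    return out


def spn_encrypt(plaintext: int, subkeys: list) -> int:
    """R = len(subkeys) - 1 rounds; key XOR before each substitution and after the last."""
    rounds = len(subkeys) - 1
    state = plaintext
    for r in range(rounds):
        state = substitute(state ^ subkeys[r])
        if r < rounds - 1:  # no permutation after the final substitution layer
            state = permute(state)
    return state ^ subkeys[rounds]


def dependency_masks(subkeys: list, plaintexts) -> list:
    """masks[i] has bit j set if flipping plaintext bit i changed ciphertext bit j
    for at least one of the supplied plaintexts."""
    masks = [0] * N
    for p in plaintexts:
        c = spn_encrypt(p, subkeys)
        for i in range(N):
            masks[i] |= c ^ spn_encrypt(p ^ (1 << i), subkeys)
    return masks


def looks_complete(masks: list) -> bool:
    """True if every ciphertext bit was observed to depend on every plaintext bit."""
    return all(m == (1 << N) - 1 for m in masks)
```

For a single round (two subkeys) the masks can never be full, because a flipped plaintext bit only reaches the s-box that contains it. Passing four subkeys gives the R = 3 network, and `looks_complete(dependency_masks(keys, range(1 << N)))` then reports whether completeness was observed for that particular key. Completeness as defined above must hold for every key, so such a check is only evidence for the keys actually tried.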
Figure 2.2 SPN with N = 16, n = 4, and R = 3 (plaintext bits P1 ... P16 enter at the top; ciphertext bits C1 ... C16 leave at the bottom)
In [9] Ayoub presented a variant of SPNs which incorporates random permutations and
retains, with a high probability, the completeness criterion after a small number of rounds.
Ayoub suggested that the s-boxes can be randomly chosen as proof of freedom from
an intentional trapdoor.
Heys [55] and Heys and Tavares [60] introduced a new type of SPN in which they
replaced permutations between rounds by diffusive linear transformations to improve the
avalanche characteristics of the cipher and to increase resistance to differential and linear
cryptanalysis. The s-box design criteria used in their proposed SPN are diffusion and
nonlinearity [60]. The regular structure used in the Heys and Tavares SPN allows both
careful design and careful analysis of the network, which can be considered a good
step towards designing a provably secure block cipher¹. The main contribution of Heys
and Tavares is that they showed that the s-box interconnection layer (despite being a
linear layer) plays an important role in improving the cipher's resistance to both linear
and differential cryptanalysis.
¹ Throughout this thesis, the term "provably secure" means provably secure against a prespecified set of cryptanalytic attacks.
SAFER [79], SHARK [109], and SQUARE [34] are some examples of block ciphers
based on the SPN architecture.
2.1.2 DES-like Structure (Feistel Networks)
The National Bureau of Standards (May 1973) published a solicitation for cryptosystems
in the Federal Register. This led to the development of the Data Encryption Standard,
or DES, which is believed to be the most widely used cryptosystem in the world (for
unclassified information). DES was developed at IBM as a modification of an earlier
system known as LUCIFER [131]. On March 17, 1975, DES was first published in the
Federal Register. DES was adopted as a standard [102] for unclassified data on January
15, 1977. DES has been reviewed by the National Bureau of Standards every five years.
Its most recent renewal was in January 1994, when it was renewed until 1998. It is
anticipated that it will not remain a standard past 1998. In January 1997, the National
Institute of Standards and Technology (NIST²) announced the development of a Federal
Information Processing Standard for Advanced Encryption Standard (AES). A summary
of the proposed minimum requirements for the AES is listed in section 4.
The idea behind DES was first described by Feistel, Notz, and Smith [45], who described a
block cipher construction which became the network structure for DES. As in an SPN,
the DES-like network also uses s-boxes and permutations to achieve Shannon's mixing
transformation but performs these operations on only half the block at a time. For
round i, the mixing (a substitution layer followed by a permutation layer) is executed
by the round function $f : Z_2^{N/2} \to Z_2^{N/2}$ on the N/2 bits of the right half block, $r_i$.
The parameters of the function are determined by the key bits associated with round i,
denoted by $k_i$. Each bit of the output of f is then XORed with the corresponding bit
of the left half block, $l_i$, then the left and right half blocks are swapped. Equivalently,
the cipher can be viewed as follows:
$$l_{i+1} = r_i, \qquad r_{i+1} = l_i \oplus f(r_i, k_i)$$
Figure 2.3 Round i of DES-like Cipher
² NIST is the new name for the National Bureau of Standards.
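A DES-like round as described above (XOR the output of f into the left half, then swap the halves) can be sketched in a few lines of Python. The round function `toy_f`, the 16-bit half-block width, and the subkeys are hypothetical placeholders; the point of the sketch is only that the structure is invertible even when f itself is not.

```python
def feistel_round(left: int, right: int, subkey: int, f) -> tuple:
    """One round: the new left half is the old right half, and the new right
    half is the old left half XORed with f(right, subkey)."""
    return right, left ^ f(right, subkey)


def feistel_encrypt(left: int, right: int, subkeys: list, f) -> tuple:
    for k in subkeys:
        left, right = feistel_round(left, right, k, f)
    return left, right


def feistel_decrypt(left: int, right: int, subkeys: list, f) -> tuple:
    """Undo the rounds in reverse key order; f is only ever evaluated forwards."""
    for k in reversed(subkeys):
        # Before the round, the right half was the current left half.
        left, right = right ^ f(left, k), left
    return left, right


def toy_f(half: int, subkey: int) -> int:
    """Hypothetical non-invertible round function on 16-bit halves."""
    return ((half * half + subkey) >> 3) & 0xFFFF
```

Decryption reuses the same round function with the subkeys in reverse order, which is why a Feistel structure places no invertibility requirement on f.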
Schneier and Kelsey examined a generalization of the concept of Feistel (DES-like)
networks [121], which they called Unbalanced Feistel Networks (UFNs). Like
conventional Feistel networks, UFNs consist of a series of rounds in which one part of
the block operates on the rest of the block. However, in a UFN the two parts need
not be of equal size. Removing this limitation has many interesting implications for
designing ciphers which are secure against linear and differential cryptanalysis. The
authors also presented some UFN constructions and made some initial observations
about their security.
Matsui [83] introduced a methodology for designing block ciphers with provable security
against differential and linear cryptanalysis. This methodology is based on three new
principles: change of the location of the round functions, round functions with recursive
structure, and s-boxes of different sizes. Changing the location of the round function
realizes parallel computation of the round functions without losing provable security. The
recursive structure reduces the size of the s-boxes. The use of s-boxes of different sizes
is expected to make algebraic attacks difficult. Matsui also gave specific examples of
practical block ciphers that are provably secure under an independent subkey assumption
and are reasonably fast in hardware as well as in software.
Some other examples for block ciphers based on DES-like structures are CAST [2][3],
FEAL [124], Blowfish [119], LOKI [22][23], TEA [138], and RC5 [112].
2.1.3 Other Structures
There are many other block cipher architectures available. Many of these architectures
are based on some interesting theoretical foundation. IDEA is based on the principle of
mixing operations from different algebraic groups.
BEAR and LION [8] are 3-round unbalanced Feistel networks, motivated by the earlier
construction of Luby and Rackoff [77] which provides a provably secure (under suitable
assumptions) block cipher from pseudorandom functions using a 3-round Feistel structure.
A class of cryptosystems based on the use of interconnection networks was introduced in
[104]. Another class of cryptosystems based on nonlinear finite automata was proposed
in [52].
2.2 S-box versus Non S-box Approaches
Until recently, most of the block cipher proposals used a set of s-boxes in the construction
of the round function to achieve the confusion effect. Some of these s-boxes are fixed,
such as DES s-boxes. Others are generated dynamically as a function of the key and
kept fixed throughout the encryption process. An example of this latter approach is
Blowfish [119]. A more complex approach is the dynamic substitution scheme. In this
case, after each substitution the s-box is re-ordered. In most of the cases, the just-used
substitution value is exchanged with some other entry in the s-box selected at random.
This dynamic substitution scheme, while currently applied in stream cipher design (e.g.,
RC4 [111], [119] and other commercial ciphers by Ritter Software Engineering [110]),
has not yet been widely applied to block cipher design.
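The dynamic substitution idea (exchange the just-used entry with another entry selected at random after each substitution) can be sketched as follows. This is a generic illustration, not the RC4 algorithm or any Ritter design: the class name is hypothetical, and Python's `random.Random` merely stands in for a keyed generator and is not cryptographically secure.

```python
import random


class DynamicSubstitution:
    """Byte-wide substitution table that is re-ordered after every use."""

    def __init__(self, seed: int):
        self.rng = random.Random(seed)  # stand-in for a keyed generator (assumption)
        self.table = list(range(256))
        self.rng.shuffle(self.table)
        self.inverse = [0] * 256
        for x, y in enumerate(self.table):
            self.inverse[y] = x

    def _swap_after_use(self, x: int) -> None:
        """Exchange the just-used entry with a randomly selected entry."""
        other = self.rng.randrange(256)
        tx, to = self.table[x], self.table[other]
        self.table[x], self.table[other] = to, tx
        self.inverse[to], self.inverse[tx] = x, other

    def encipher(self, x: int) -> int:
        y = self.table[x]
        self._swap_after_use(x)
        return y

    def decipher(self, y: int) -> int:
        x = self.inverse[y]
        self._swap_after_use(x)
        return x
```

A receiver that starts from the same seed performs the same sequence of swaps, so its inverse table stays synchronized with the sender's table. Because the table changes after each use, repeated plaintext symbols need not map to repeated ciphertext symbols.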
According to Nyberg [93], the most common methods for constructing s-boxes are based
on: random generation, testing against a set of design criteria, algebraic construction
having good properties, or a combination of these.
The advantage of the s-box based approach is that the theory of s-box design is mature
enough and many techniques are available to construct cryptographically strong s-boxes.
The disadvantage of the s-box approach is that it may require a large amount of memory
to store the s-boxes. This disadvantage becomes very clear for large s-boxes with non-
simple algebraic description (i.e., when they can only be implemented using look-up
tables).
To overcome this problem one proposal is to use key dependent operations for
the s-boxes. For example, IDEA [73] achieves the required confusion effect by
mixing operations from different algebraic groups (XOR, addition modulo $2^{16}$, and
multiplication modulo $2^{16} + 1$). The disadvantage of this approach is that multiplication,
which is the most nonlinear operation among the three operations above, usually requires
a large number of logic gates in hardware and is relatively slow in software. Moreover,
the scaling process to wider block lengths is not straightforward (for example, IDEA
cannot be scaled to a 128-bit block size because $2^{32} + 1$ is not a prime [120]).
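The role of the prime modulus can be checked with a short Python sketch. The helper names are hypothetical. The multiplication follows the usual convention for 16-bit words in which the all-zero word stands for $2^{16}$; this convention is stated here as an assumption rather than taken from the text above.

```python
def mul_mod_65537(a: int, b: int) -> int:
    """Multiplication modulo 2**16 + 1 on 16-bit words, where the word 0
    is interpreted as 2**16 (assumed convention)."""
    a = a or 0x10000
    b = b or 0x10000
    # The product modulo 65537 lies in 1..65536; 65536 is stored as the word 0.
    return (a * b) % 0x10001 % 0x10000


def add_mod_65536(a: int, b: int) -> int:
    """Addition modulo 2**16."""
    return (a + b) & 0xFFFF


def smallest_factor(n: int) -> int:
    """Smallest factor of n greater than 1, by trial division (n itself if n is prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n
```

Because $2^{16} + 1 = 65537$ is prime, multiplication by any fixed word is a permutation of the 16-bit words, which is what makes the operation invertible. Trial division shows that $2^{32} + 1 = 641 \times 6700417$, so the same construction does not carry over directly to 32-bit words.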
Two elegant non s-box based proposals with minimum memory requirements are
TEA [138] and the RC5 family [112]. The TEA round function is constructed with
simple fixed logical shift operations. In RC5 the round function is constructed with data
dependent rotations. The only disadvantage of these two proposals (assuming the most
common block sizes of 64 bits or 128 bits) is that rotating or shifting large blocks are
not very efficient operations on 8-bit platforms.
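A data dependent rotation can be sketched in Python as follows. The round below is modelled on the published RC5 round structure with 32-bit words, but it is only an illustration: the subkeys are arbitrary placeholders, the key expansion is not shown, and the function names are chosen here.

```python
W = 32                 # word size in bits (assumed)
MASK = (1 << W) - 1


def rotl(x: int, r: int) -> int:
    """Rotate the W-bit word x left by r mod W positions."""
    r %= W
    return ((x << r) | (x >> (W - r))) & MASK


def rotr(x: int, r: int) -> int:
    """Rotate the W-bit word x right by r mod W positions."""
    r %= W
    return ((x >> r) | (x << (W - r))) & MASK


def rc5_style_round(a: int, b: int, s0: int, s1: int) -> tuple:
    """The rotation amount for each half is taken from the other half (the data)."""
    a = (rotl(a ^ b, b) + s0) & MASK
    b = (rotl(b ^ a, a) + s1) & MASK
    return a, b


def rc5_style_round_inverse(a: int, b: int, s0: int, s1: int) -> tuple:
    """Undo the two steps in reverse order."""
    b = rotr((b - s1) & MASK, a) ^ a
    a = rotr((a - s0) & MASK, b) ^ b
    return a, b
```

No table is stored, which reflects the minimum memory requirement mentioned above. On a processor without a variable-distance rotate instruction, each `rotl` has to be emulated with several narrower shifts, which is the inefficiency noted for 8-bit platforms.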
2.3 Cryptanalysis
In this section we give a brief review of some of the cryptanalysis techniques that have
been applied to block ciphers. Cryptanalytic attacks can be categorized into the following
types: (i) ciphertext only, (ii) known plaintext, (iii) chosen plaintext, and (iv) chosen
ciphertext. A ciphertext only attack assumes that the cryptanalyst has access to the ciphertext
only. A known plaintext attack uses knowledge of the plaintext values corresponding to
the available set of ciphertexts; a chosen plaintext attack assumes that the cryptanalyst
can select specific values of the plaintexts and acquire the corresponding ciphertexts
while a chosen ciphertext attack assumes that the cryptanalyst can select specific values
of the ciphertexts and acquire the corresponding plaintexts.
Among the attacks that are applicable to different block cipher structures are: Hellman's
time-memory trade-off attack [54], meet-in-the-middle attack [64], key degeneracy attack
[27,39,107], maximum likelihood estimation [7], method of formal coding (MFC) [118],
and related-keys attack [13].
Here we focus on the following attacks: the Wiener exhaustive key search machine
[139], differential cryptanalysis [12], linear cryptanalysis [81] and some other recently
proposed implementation dependent attacks.
2.3.1 Exhaustive Search
Despite recent improvements in analytic techniques for attacking block ciphers,
exhaustive key search remains the most practical and efficient attack for ciphers with
relatively short keys (i.e., ≤ 56 bits). In [139] Wiener described in detail an exhaustive
DES key search machine that costs about US$1 million and can find a key in 3.5 hours
on average. The basic machine design can be adapted to attack the standard DES modes
of operation for a small penalty in running time.
Wiener designed his DES key search machine by himself in about five months, which
raises the question of what a well funded government agency could do. This machine,
even though it was not built, is a statement that breaking DES is much more
feasible than was previously thought.
More recently, responding to a challenge, including a prize of $10,000, offered by RSA
Data Security, Inc., the DESCHALL group [134] successfully found a DES key by
exhaustive search over the internet. The search took less than 3 months. On April 8,
1997 the group announced finding the key after searching almost 25% of the total key
space. Full details about the project statistics can be found in [134].
2.3.2 Differential Cryptanalysis
Differential cryptanalysis [15] is a statistical attack that was developed and popularized
by Biham and Shamir and has been applied to a wide range of cryptosystems including
LUCIFER, DES, FEAL, REDOC, and Khafre [16]. Differential cryptanalysis is a chosen
plaintext attack which examines changes in the output of the cipher in response to
controlled changes in the input. In particular, for an $R$-round cipher the attack exploits
the existence of a highly probable $(R - 1)$-round characteristic to determine a portion of
the key bits, where an $r$-round characteristic is defined to be a sequence of $r$ pairs of
input and output XOR differences corresponding to each round. The existence of a highly
probable characteristic depends on the XOR distribution table and the bit changes within
the network [15].
The main results of applying this attack to DES reported by Biham and Shamir in
[15] are: DES with 6 rounds was broken in less than 0.3 seconds on a PC using 240
ciphertexts, DES with 8 rounds was broken in less than 2 minutes by analyzing 15000
ciphertexts chosen from a pool of 50000 candidate ciphertexts, and DES with 15 rounds
can be broken in about $2^{52}$ steps. More importantly, they showed that modifying the key
scheduling algorithm cannot make DES much stronger.
One interesting result reported is that FEAL-4 can be broken with just eight ciphertexts
and one of their plaintexts.
At Crypto '92, Biham and Shamir [17] presented a slightly modified version of their
original differential cryptanalysis attack on the full 16-round DES. This attack requires
$2^{47}$ chosen plaintexts and, at that time, was the first known attack capable of
"breaking" the full 16-round DES in less than the $2^{55}$ complexity of exhaustive search³.
In [29] Coppersmith claimed that differential cryptanalysis was known to the DES
designers at IBM (within IBM, the attack was formally known as the "T attack").
Furthermore, he showed how the design criteria for the s-boxes and permutations were
developed to thwart such attacks.
A closer look at differential attacks shows that for an $r$-round characteristic, only the
plaintext difference and the ciphertext difference need to be fixed. That is, the intermediate
differences can be arbitrarily selected. The notion of a differential was introduced by Lai
and Massey [73] to account for this observation. For DES-like ciphers, sequences of
differences can be modeled as a Markov chain when the subkeys are assumed to be
independent. In order to make a successful attack on an iterated cipher by differential
cryptanalysis, the existence of good characteristics is sufficient. On the other hand, to
prove security against differential attacks for such ciphers, one has to ensure that there
is no differential with a probability high enough to enable successful attacks.
Knudsen and Nyberg [94] showed that for DES-like ciphers with independent round
keys, the probability of an $r$-round differential, $r \geq 4$, is less than or equal to $2p_{max}^2$,
where $p_{max}$ is the highest probability for a nontrivial one-round characteristic. They
also gave some (far from being practical) examples of block ciphers with provable
security against differential cryptanalysis. Some of these examples were later broken
using higher order differential cryptanalysis [72], [68].
O'Connor [97] showed that, for a randomly selected $m$-bit bijective mapping, the expected
size of the largest entry in the XOR distribution table is bounded by $2m$ while the fraction
of the XOR table that is zero approaches $e^{-0.5} = 0.60653$. O'Connor [96] also showed
that the above two quantities tend to optimal values in composite permutations.
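These two quantities can be observed empirically. The sketch below is illustrative only: it draws a single random 8-bit bijective mapping with Python's `random` module (the seed, the size $m = 8$ and the function names are our assumptions, not taken from [97]) and reports the fraction of zero entries and the largest entry over the rows with nonzero input difference.

```python
import random


def xor_table(sbox, n):
    """XOR distribution table: table[dx][dy] = #{x : S(x ^ dx) ^ S(x) == dy}."""
    size = 1 << n
    table = [[0] * size for _ in range(size)]
    for dx in range(size):
        for x in range(size):
            table[dx][sbox[x ^ dx] ^ sbox[x]] += 1
    return table


def zero_fraction_and_max(sbox, n):
    """Fraction of zero entries and largest entry over the rows with dx != 0."""
    entries = [e for row in xor_table(sbox, n)[1:] for e in row]
    return entries.count(0) / len(entries), max(entries)


rng = random.Random(1997)      # fixed seed so the run is repeatable
n = 8
sbox = list(range(1 << n))
rng.shuffle(sbox)              # a random n-bit bijective mapping
fraction, largest = zero_fraction_and_max(sbox, n)
print(f"zero fraction = {fraction:.4f}, largest entry = {largest}, 2n = {2 * n}")
```

A single sample only suggests the trend; the statements in [97] concern expectations over all bijective mappings.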
³ This 1 bit reduction in exhaustive search is due to the symmetric property of DES: $DES(\bar{m}, \bar{k}) = \overline{DES(m, k)}$
2.3.3 Linear Cryptanalysis
In 1986, Rueppel [117] presented the idea of the "closest linear approximation" to a
boolean function $f$. The Walsh transform coefficient of largest magnitude indicates a
linear boolean function of the input variables whose output agrees with $f$ more often than
any other linear function (this is the closest linear approximation to $f$). Matsui extended
this idea by finding the closest linear approximation to a nonzero linear combination
of the s-box outputs. In [82] Matsui introduced linear cryptanalysis, which is a known
plaintext attack, on DES. The purpose of this method is to obtain a linear expression
$$a \cdot p \oplus b \cdot c = d \cdot k$$
where $p$ is the plaintext, $c$ is the ciphertext, $k$ is the key, and $(a, b, d)$ are the attack
parameters. This expression holds with probability $P_L \neq \frac{1}{2}$ over all keys, such that
$|P_L - \frac{1}{2}|$ is maximal for a given cipher algorithm. To achieve this, a statistical linear
path between input and output of each s-box is constructed. Then the path is extended
to the entire DES algorithm and finally a linear approximation expression without any
intermediate values is reached. The main results of this attack are⁴: 8-round DES is
breakable with $2^{21}$ known plaintexts in 40 seconds, 12-round DES is breakable with $2^{33}$
known plaintexts in 50 hours, and 16-round DES is breakable with $2^{47}$ known plaintexts
(faster than an exhaustive search).
In [82] Matsui reported the first experimental attack on the full 16-round DES with
$2^{43}$ known plaintexts, using 12 workstations (HP9735/PA-RISC 99MHz) and a
program, written in C and assembly language, implementing the linear cryptanalysis
algorithm. The program, consisting of a total of 1000 lines, takes 50 days to find the key,
of which 40 days were spent generating plaintexts and the corresponding ciphertexts
and only 10 days were spent on the actual key search.
⁴ The experiments were implemented in the C language on a PA-RISC/66MHz workstation
One should note that while linear cryptanalysis can be considered the most successful
theoretical attack on DES, this attack is by no means more practical than exhaustive
search: one cannot neglect the time needed to obtain the information about the plaintext.
Also, the cryptanalyst is free to invest as much in technology as he can afford to make
the search more efficient when doing exhaustive key search. In a known plaintext attack
the cryptanalyst is restricted to the technology of the legitimate owner of the key, and to
the frequency with which the key is used. In almost every practical application, a single
DES key will be applied to much less than $2^{43}$ blocks, even in its entire lifetime.
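To put the data requirement in perspective, the following short calculation (a sketch; the function name `known_plaintext_volume` is ours) converts $2^{43}$ DES blocks of 64 bits each into a storage volume.

```python
def known_plaintext_volume(log2_blocks: int, block_bits: int = 64) -> int:
    """Bytes of plaintext needed for 2**log2_blocks cipher blocks."""
    return (2**log2_blocks) * (block_bits // 8)


volume = known_plaintext_volume(43)          # DES has 64-bit blocks
print(f"{volume} bytes = {volume // 2**40} TiB under a single key")
```

All of this data would have to be encrypted under one key and made available to the cryptanalyst together with the matching ciphertexts.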
Immunity to linear cryptanalysis can be proved using concepts similar to those used to
prove security against differential cryptanalysis [99]. In [92] Nyberg showed how to
achieve provable security against linear cryptanalysis.
2.3.4 Extensions to Linear and Differential Cryptanalysis
In [74] Langford and Hellman presented a new chosen text attack on iterated
cryptosystems. This attack, which is a mixed version of both the differential and
linear cryptanalysis attacks, is very efficient for 8-round DES, recovering 10 bits of key
with 95% probability of success using only 768 chosen plaintexts. More key bits can be
recovered with a lower probability of success. This 8-round attack, while comparable in
speed to existing attacks, represents an order of magnitude improvement in the amount
of required text. The authors reported that they are currently working to extend this
attack to the full 16-round DES.
In [65] Kaliski and Robshaw presented a technique which aids linear cryptanalysis of
block ciphers and achieves a reduction of the data required for a successful attack. The basic
idea is to deduce several linear approximations using the same set of plaintext/ciphertext
pairs. Finally, the authors noted that the use of larger s-boxes, which is sometimes
recommended as a way of increasing the security of DES-like block ciphers, might
in certain circumstances facilitate the use of linear cryptanalysis with multiple linear
approximations.
In [72] higher order derivatives of discrete functions were considered and the concept of
higher order differentials was introduced. The cryptographic relevance of higher order
differentials was discussed, but no application was given.
The concept of truncated differentials is introduced in [68] by Knudsen. Truncated
differentials are differentials where only a part of the difference in the ciphertext, after a
number of rounds, can be predicted. Knudsen also gave examples of Feistel block ciphers
secure against differential cryptanalysis using first order differentials, but vulnerable to a
differential attack using truncated differentials and higher order differentials.
In [53], linear cryptanalysis is extended to an attack called partitioning cryptanalysis which
considers a partition of the plaintext space and a partition of the last round input space.
Partitioning cryptanalysis exploits a potential weakness of the cipher, namely, that the
last round inputs are non-uniformly distributed over the blocks of the second partition
when the plaintexts are taken from a particular block of the first partition. An example
of a cipher for which partitioning cryptanalysis performs better than linear and differential
cryptanalysis was contrived. The success probability of partitioning cryptanalysis was given
and a procedure for finding a pair of partitions that yields a successful attack was analyzed.
2.3.5 Other Attacks
Rijmen and Preneel [109] proposed a new attack on Feistel ciphers with a non-surjective
round function. CAST and LOKI are examples of such ciphers. They also extended
the attack to ciphers that use a non-uniformly distributed round function and applied
the attack to a six round weak version of CAST which uses round keys with 16 bit
entropy per round.
In [63] Jakobsen and Knudsen introduced a new attack on block ciphers, the interpolation
attack. This attack is useful for breaking ciphers using simple algebraic functions (in
particular quadratic functions) as s-boxes.
2.3.6 Implementation Dependent Attacks
2.3.6.1 Timing Attacks
This attack was introduced by Kocher in [70]. The basic idea is that cryptosystems
often take slightly different amounts of time to process different inputs. Reasons include
performance optimizations to bypass unnecessary operations, branching and conditional
statements, RAM cache hits, processor instructions (such as multiplication and division)
that run in non-fixed time, and many other causes. Performance characteristics typically
depend on both the encryption key and the input data.
By carefully measuring the amount of time required to perform private key operations,
attackers may be able to find fixed Diffie-Hellman exponents or factor RSA keys.
Timing attacks can potentially be used against other cryptosystems, including symmetric
functions. For example, in software the 28-bit C and D values in the DES key schedule are
often rotated using a condition which tests whether a one-bit must be wrapped around.
The additional time required to move nonzero bits could slightly degrade the cipher's
throughput or key setup time. The cipher's performance can thus reveal the Hamming
weight of the key, which provides an average of
$$\sum_{n=0}^{56} \frac{\binom{56}{n}}{2^{56}} \log_2 \left( \frac{2^{56}}{\binom{56}{n}} \right) \approx 3.95$$
bits of key information. IDEA uses a multiplication modulo $(2^{16} + 1)$ operation, which will usually
run in non-constant time. RC5 is at risk on platforms where rotates run in non-constant
time. RAM cache hits can produce timing characteristics in many ciphers if tables in
memory are not used identically in every encryption.
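The information revealed by the Hamming weight of a uniformly random key is the entropy of the weight distribution, which can be evaluated directly. The sketch below is illustrative (the function name `hamming_weight_information` is ours); it assumes all 56 key bits are independent and uniform.

```python
import math


def hamming_weight_information(key_bits: int) -> float:
    """Average information (in bits) revealed by the Hamming weight of a
    uniformly random key of the given length."""
    total = 2**key_bits
    info = 0.0
    for n in range(key_bits + 1):
        count = math.comb(key_bits, n)      # keys of weight n
        p = count / total                   # probability of weight n
        info += p * math.log2(total / count)
    return info


print(f"{hamming_weight_information(56):.2f} bits for a 56-bit DES key")
```

The result grows only logarithmically with the key length, so the weight alone narrows the key space by a few bits rather than breaking the cipher.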
Additional research is needed to determine whether specific implementations are at risk
and, if so, the degree of their vulnerability. So far, only a few specific systems have
been studied in detail and the attacks against them are theoretical.
2.3.6.2 Differential Fault Cryptanalysis
This attack, developed by Biham and Shamir⁵, follows the Bellcore fundamental
assumption:
"By exposing a sealed tamperproof device such as a smart card to certain physical effects
(e.g., ionizing or microwave radiation), one can induce with reasonable probability a fault
at a random bit location in one of the registers at some random intermediate stage in the
cryptographic computation. Both the bit location and the round number are unknown
to the attacker."
⁵ E. Biham and A. Shamir. Research announcement: A new cryptanalytic attack on DES (posted to cypherpunks@toad.com, 18
October 1996)
Another assumption is that the attacker is in physical possession of the tamperproof
device, so that he can repeat the experiment with the same cleartext and key but without
applying the external physical effects. As a result he obtains two ciphertexts derived from
the same (unknown) cleartext and key, where one of the ciphertexts is correct and the
other is the result of a computation corrupted by a single bit error during the computation.
The theoretical part of the attack can be applied to DES, triple DES (with 168 bit keys),
and DES with independent subkeys (with 768 bit keys). This attack still works even
with more general assumptions on the fault locations, such as faults inside the round
function, or even faults in the key scheduling algorithm. Differential Fault Analysis
can break many additional secret key cryptosystems, including IDEA, RC5 and FEAL.
Some ciphers, such as Khufu, Khafre and Blowfish, compute their s-boxes from the key
material. In such ciphers, it may even be possible to extract the s-boxes themselves, and
the keys, using the techniques of Differential Fault Analysis.
In a later research announcement⁶ by the same authors, a modified fault model (with
some more plausible assumptions) was introduced. This model makes it possible to
find the secret key stored in a tamperproof cryptographic device even when nothing is
known about the structure and operation of the cryptosystem. A prime example of such
a scenario is the Skipjack cryptosystem [1].
⁶ E. Biham and A. Shamir. Research announcement: The next stage of differential fault cryptanalysis: How to break completely
unknown cryptosystems. 25 September 1996
The main assumption behind the new fault model is that the cryptographic key is stored
in an asymmetric type of memory, in which induced faults are much more likely to
change a 1 bit into a 0 than to change a 0 bit into a 1 (or the other way around). The
attack is guaranteed to succeed if the fault model is satisfied. A detailed description of
differential fault analysis can be found in [18].
Further improved differential fault analysis was introduced by Anderson and Kuhn⁷.
2.4 Advanced Encryption Standard (AES) Requirements
On January 2, 1997, the National Institute of Standards and Technology (NIST) announced
the development of a Federal Information Processing Standard for Advanced Encryption
Standard (AES).
The draft minimum acceptability requirements and evaluation criteria are:
(1) AES shall be publicly defined.
(2) AES shall be a symmetric block cipher.
(3) AES shall be designed so that the key length may be increased as needed.
(4) AES shall be implementable in both hardware and software.
(5) AES shall either be freely available or available under terms consistent with the
American National Standards Institute (ANSI) patent policy.
(6) Algorithms which meet the above requirements will be judged based on the following
factors:
(1) Security (i.e., the effort required to cryptanalyze),
(2) Computational efficiency,
(3) Memory requirements,
(4) Hardware and software suitability,
(5) Simplicity,
(6) Flexibility, and
(7) Licensing requirements.
⁷ A serious weakness of DES (posted to cypherpunks@toad.com, 2 November 1996)
While there are many published block ciphers, none of them meets all the above
requirements. Most of the previously published block ciphers have a 64-bit block length.
However, the AES is most likely to be a 128-bit block cipher. Also, most of the
previously published block ciphers have a fixed key length with no obvious way to
increase the key length as needed. Designing an efficient block cipher that meets all the
above requirements is a challenging project for the world's cryptographic researchers. An
issue that was overlooked by the standard requirements is the practical need to satisfy
users who require a 64-bit block cipher for compatibility reasons. For more details about
the AES development effort the reader is referred to the NIST home page [101].
2.5 Mathematical Background and Definitions
In this section we introduce some of the mathematical tools and definitions used
throughout this thesis.
2.5.1 Cryptographic Criteria for Boolean Functions
Nonlinearity
A function $f: Z_2^n \to Z_2$ is defined to be a linear function if $f(x) = a \cdot x$ for some
constant $a \in Z_2^n$.
A function $f$ is defined to be an affine function if $f(x) = a \cdot x \oplus b$ for some constants
$a \in Z_2^n$, $b \in Z_2$.
The nonlinearity of the function $f = (f_1, f_2, \ldots, f_m): Z_2^n \to Z_2^m$, $f_i: Z_2^n \to Z_2$,
$i = 1, \ldots, m$, is defined as the minimum Hamming distance between the set of affine functions
and every nonzero linear combination of the output coordinates of $f$, i.e.,
$$\mathcal{NL}_f = \min_{b, c, w} \#\{x \in Z_2^n \mid c \cdot f(x) \neq w \cdot x \oplus b\}, \quad (2.3)$$
where $w \in Z_2^n$, $c \in Z_2^m \setminus \{0\}$, $b \in Z_2$, $w \cdot x$ denotes the dot product between $w$ and
$x$ over $Z_2$, and
$$c \cdot f(x) = \bigoplus_{i=1}^{m} c_i f_i(x).$$
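The definition can be evaluated by exhaustive search for small s-boxes. The following is a minimal sketch, not an efficient method: bit vectors are packed into Python integers, and the names `dot`, `nonlinearity` and the three example tables are our own illustrative choices.

```python
def dot(u: int, v: int) -> int:
    """Inner product over Z_2 of two bit vectors packed into integers."""
    return bin(u & v).count("1") & 1


def nonlinearity(sbox, n, m):
    """Minimum distance between any nonzero c.f and any affine function w.x + b."""
    size = 1 << n
    best = size
    for c in range(1, 1 << m):
        for w in range(size):
            d = sum(dot(c, sbox[x]) != dot(w, x) for x in range(size))
            # b = 0 gives distance d, b = 1 gives the complement size - d
            best = min(best, d, size - d)
    return best


and_gate = [0, 0, 0, 1]                       # f(x) = x1 x2, n = 2, m = 1
identity = list(range(8))                     # linear 3-bit s-box
quadratic = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
             for x in range(16)]              # x1 x2 + x3 x4, n = 4, m = 1
print(nonlinearity(and_gate, 2, 1), nonlinearity(identity, 3, 3),
      nonlinearity(quadratic, 4, 1))
```

The cost is of order $2^{2n+m}$ evaluations, so for larger s-boxes one would compute the same quantity from the Walsh spectrum instead.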
Linear Structures
A linear structure of a boolean function $f: Z_2^n \to Z_2$ can be identified as a vector
$a \in Z_2^n \setminus \{0\}$ such that $f(x \oplus a) \oplus f(x)$ takes the same value (0 or 1) for all $x \in Z_2^n$
Wl .
Output Bit Independence Criterion (BIC)
A function $f: Z_2^n \to Z_2^m$ satisfies the output Bit Independence Criterion if, whenever one
bit of input is complemented, the correlation coefficient between every two output bit
changes is zero [37].
Balance
A function $f: Z_2^n \to Z_2$ satisfies the 0-1 balance criterion if it has equal numbers of 0's
and 1's in its output sequence when the inputs are taken over all possible values.
In general, a function $f: Z_2^n \to Z_2^m$, $n \geq m$, is said to be balanced or regular if each
output symbol $y = f(x) \in Z_2^m$ appears an equal number of times (i.e., $2^{n-m}$ times)
as $x$ varies over all its possible $2^n$ values.
Completeness
A function $f: Z_2^n \to Z_2^m$ satisfies the completeness criterion if every output coordinate
function $f_i$, $i = 1, \ldots, m$, depends on all input arguments [66].
Strict Avalanche Criterion (SAC)
A function $f: Z_2^n \to Z_2$ satisfies SAC if whenever a single input bit is complemented,
the output bit changes with a probability of one half [37].
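For a function given as a truth table, the criterion can be tested directly by counting, for each input bit, how many inputs change the output when that bit is flipped. The sketch below is illustrative; the function name `satisfies_sac` and the two example functions are our own.

```python
def satisfies_sac(truth_table, n):
    """True if flipping any single input bit changes the output for
    exactly half of the 2^n inputs."""
    size = 1 << n
    for i in range(n):
        changes = sum(truth_table[x] != truth_table[x ^ (1 << i)]
                      for x in range(size))
        if 2 * changes != size:
            return False
    return True


quadratic = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
             for x in range(16)]              # x1 x2 + x3 x4
parity = [0, 1, 1, 0]                         # x1 + x2 (linear)
print(satisfies_sac(quadratic, 4), satisfies_sac(parity, 2))
```

The linear example fails because flipping an input bit of a linear function changes the output either always or never.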
Higher Order SAC (Forré)
A function $f: Z_2^n \to Z_2$ satisfies SAC of order $k$ if any function obtained from $f$ by
keeping $k$ of its input bits constant satisfies SAC [47].
Higher Order SAC (Adams and Tavares)
A function $f: Z_2^n \to Z_2$ satisfies high order SAC of degree $k$ if $f(x)$ changes with a
probability of one half whenever $i$ ($1 \leq i \leq k$) bits of $x$ are complemented [4].
Propagation Criterion (PC)⁸
A function $f: Z_2^n \to Z_2$ satisfies the Propagation Criterion of degree $k$ (denoted PC-$k$)
if $f(x)$ changes with a probability of one half whenever $i$ ($1 \leq i \leq k$) bits of $x$ are
complemented.
A function $f: Z_2^n \to Z_2$ satisfies the extended Propagation Criterion of degree $k$ and
order $t$ (PC-$k$ order $t$) if any function obtained from $f$ by keeping $t$ bits constant satisfies
PC-$k$ [105].
Correlation Immunity
A function $f: Z_2^n \to Z_2$ is $k$-th order correlation immune if every $k$-tuple obtained by
choosing $k$ components from $x$ is statistically independent of $f(x)$ [126].
Binary Bent Functions
A function $f: Z_2^n \to Z_2$ is bent [116] if and only if
$$F(w) = \pm 1, \quad w \in Z_2^n,$$
where $F(w)$ is the Walsh transform [6] of the function $(-1)^{f(x)}$ (see section 2.5.3).
Binary bent functions exist only for even $n$.
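A consequence of bentness that is easy to verify without fixing a normalization for the Walsh transform is that a bent function lies at one of only two distances, $2^{n-1} \pm 2^{n/2-1}$, from every affine function. The sketch below checks this for the quadratic function $x_1 x_2 \oplus x_3 x_4$ with $n = 4$; the function name `affine_distances` and the choice of example are ours.

```python
def affine_distances(truth_table, n):
    """Set of Hamming distances from f to all 2^(n+1) affine functions."""
    size = 1 << n
    distances = set()
    for w in range(size):
        d = sum(truth_table[x] != (bin(w & x).count("1") & 1)
                for x in range(size))
        distances.add(d)           # distance to w.x
        distances.add(size - d)    # distance to w.x + 1
    return distances


n = 4
bent = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
        for x in range(1 << n)]
print(sorted(affine_distances(bent, n)))
```

For comparison, a linear function such as $x_1 \oplus x_2$ yields the distance 0 to itself, which is why linear functions have nonlinearity zero.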
If $f: Z_2^n \to Z_2$ is bent, then the distance between $f$ and any affine function is
$2^{n-1} \pm 2^{n/2-1}$. A bent function attains the maximum achievable nonlinearity (its
nonlinearity is $2^{n-1} - 2^{n/2-1}$) and it has the maximum distance to functions with
linear structures ($2^{n-2}$). If $f: Z_2^n \to Z_2$ is bent, then we have
⁸ This is identical to Adams and Tavares' higher order SAC and was defined independently by Preneel et al. [105]
In [90] Nyberg extended the concept of bent functions to multi-output boolean functions.
Nyberg presented what she called "perfect nonlinear s-boxes"⁹. A function $f: Z_2^n \to Z_2^m$
is perfect nonlinear if for every fixed nonzero $\Delta x \in Z_2^n$ the difference $f(x \oplus \Delta x) \oplus f(x)$ obtains
each possible value $y \in Z_2^m$ for $2^{n-m}$ values of $x$. Every nonzero linear combination
of the output coordinates of a perfect nonlinear function is a bent function. The main
result is that for perfect nonlinear s-boxes the number of input variables is at least twice
the number of output variables.
Linear Approximation Table
For a given s-box constructed from a mapping $f: Z_2^n \to Z_2^m$, the linear approximation
table entry $LAT(a, b)$ is defined as [82]:
$$LAT(a, b) = \#\{x \in Z_2^n \mid a \cdot x = b \cdot f(x)\} - 2^{n-1}$$
where $a \in Z_2^n$, $b \in Z_2^m \setminus \{0\}$, and $a \cdot x$ denotes the inner product of the vectors $a$ and
$x$ evaluated over $Z_2$.
The linear approximation entry with the maximum absolute value is denoted by $LAT^*$.
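The table can be computed by direct counting. The sketch below is illustrative: it uses the convention $LAT(a, b) = \#\{x \mid a \cdot x = b \cdot f(x)\} - 2^{n-1}$, and the 4-bit table `SBOX` is an arbitrary example chosen by us (it is the s-box of the later PRESENT cipher, not one studied in this thesis).

```python
def parity(v: int) -> int:
    """Parity of the bits of v, i.e. an inner product over Z_2 after masking."""
    return bin(v).count("1") & 1


def linear_approximation_table(sbox, n, m):
    """lat[a][b] = #{x : a.x == b.S(x)} - 2^(n-1)."""
    size, half = 1 << n, 1 << (n - 1)
    return [[sum(parity(a & x) == parity(b & sbox[x]) for x in range(size)) - half
             for b in range(1 << m)]
            for a in range(size)]


SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
lat = linear_approximation_table(SBOX, 4, 4)
# LAT*: largest absolute entry over all a and all nonzero b
lat_star = max(abs(lat[a][b]) for a in range(16) for b in range(1, 16))
print("LAT* =", lat_star)
```

The row $a = 0$ is zero for every nonzero $b$ because each nonzero combination of the outputs of a bijective (hence balanced) s-box is a balanced function.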
XOR Distribution Table
For a given s-box constructed from a mapping $f: Z_2^n \to Z_2^m$, the XOR table entry
$N_{\Delta x \Delta y}$ is defined as [15]:
$$N_{\Delta x \Delta y} = \#\{x \in Z_2^n \mid f(x \oplus \Delta x) \oplus f(x) = \Delta y\} \quad (2.9)$$
where $\Delta x \in Z_2^n$, $\Delta y \in Z_2^m$. The entry $N_{00} = 2^n$ is not taken into consideration as it
does not have any cryptographic significance.
For $\Delta x \neq 0$, the maximum entry in the XOR table is denoted by $XOR^*$.
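The XOR table is obtained by the same kind of exhaustive counting. In the sketch below, `SBOX` is again an arbitrary 4-bit bijective example chosen by us (the s-box of the later PRESENT cipher), and the function name is illustrative.

```python
def xor_distribution_table(sbox, n, m):
    """table[dx][dy] = #{x : S(x ^ dx) ^ S(x) == dy}."""
    table = [[0] * (1 << m) for _ in range(1 << n)]
    for dx in range(1 << n):
        for x in range(1 << n):
            table[dx][sbox[x ^ dx] ^ sbox[x]] += 1
    return table


SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
table = xor_distribution_table(SBOX, 4, 4)
# XOR*: largest entry over the rows with nonzero input difference
xor_star = max(max(row) for row in table[1:])
print("XOR* =", xor_star)
```

Every row sums to $2^n$, all entries are even, and for a bijective mapping the column $\Delta y = 0$ is zero in every row with $\Delta x \neq 0$.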
⁹ Nyberg's definition is more general as it assumes that $f: Z_q^n \to Z_q^m$
2.5.2 The Inclusion-Exclusion Principle
A counting tool that is used very frequently throughout this thesis is the Inclusion-Exclusion
Principle [113]. Different applications of the Inclusion-Exclusion Principle
to cryptography can be found in [98].
Definition
Consider a set of objects, each of which may or may not possess each property from a
total set of properties. Label the properties as $p_i$, $1 \leq i \leq \phi$, and define the number
of objects possessing the $k$ properties $p_{i_1}, p_{i_2}, \ldots, p_{i_k}$
as $N(p_{i_1}, p_{i_2}, \ldots, p_{i_k})$.
If $N(p_1, p_2, \ldots, p_k) = N(p_{i_1}, p_{i_2}, \ldots, p_{i_k})$ for all values of $k$, $1 \leq k \leq \phi$, and all selections
of $i_1, i_2, \ldots, i_k$, then the properties are said to be symmetric.
Inclusion-Exclusion Principle:
Consider a set of objects, each of which may or may not possess each property from a
total of $\phi$ properties. If the properties are symmetric, the number of objects having one
or more properties, $\Phi$, is given by [113]
$$\Phi = \sum_{i=1}^{\phi} (-1)^{i+1} \binom{\phi}{i} W(i)$$
where $W(i)$ is the number of objects having $i$ particular properties.
Generalization of Inclusion-Exclusion Principle:
Consider a set of objects, each of which may or may not possess each property from
a total of $\phi$ properties. If the properties are symmetric, the number of objects which
possess exactly $t$, $1 \leq t \leq \phi$, properties, $\Gamma(t)$, is given by [113]
$$\Gamma(t) = \sum_{i=t}^{\phi} (-1)^{i-t} \binom{i}{t} \binom{\phi}{i} W(i)$$
where $W(i)$ represents the number of objects which have $i$ particular properties.
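As a small worked instance of both forms of the principle, consider the permutations of five points, with property $p_i$ meaning "point $i$ is fixed". The properties are symmetric, and the number of permutations fixing $i$ particular points is $W(i) = (5 - i)!$. The sketch below (function names `at_least_one` and `exactly` are ours) evaluates the standard symmetric inclusion-exclusion sums and compares them with a brute-force count.

```python
from itertools import permutations
from math import comb, factorial


def at_least_one(phi, W):
    """Objects having one or more of phi symmetric properties."""
    return sum((-1) ** (i + 1) * comb(phi, i) * W(i) for i in range(1, phi + 1))


def exactly(t, phi, W):
    """Objects having exactly t of phi symmetric properties."""
    return sum((-1) ** (i - t) * comb(i, t) * comb(phi, i) * W(i)
               for i in range(t, phi + 1))


phi = 5
W = lambda i: factorial(phi - i)   # permutations fixing i particular points

# Brute force: number of fixed points of every permutation of 5 points
fixed_counts = [sum(p[j] == j for j in range(phi))
                for p in permutations(range(phi))]
print(at_least_one(phi, W), [exactly(t, phi, W) for t in range(1, phi + 1)])
```

The same pattern, with a different $W(i)$, is what a counting argument over symmetric properties of s-boxes reduces to.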
2.5.3 The Walsh Transform of Boolean Functions
In this section, we examine some properties of the Walsh transform [6,42] of boolean
functions. For a boolean function $f: Z_2^n \to Z_2$, the Walsh transform of the function
$(-1)^{f(x)}$ is given by¹⁰
The function $(-1)^{f(x)}$ is given by the inverse Walsh transform as follows
Autocorrelation Function
The autocorrelation function¹¹ of the function $f$ is defined as
The autocorrelation function as defined above can be determined from the Walsh
coefficients by
¹⁰ From now on, we will refer to $F(w)$ as the Walsh transform of $f$
Summation Property
The Walsh transform of a boolean function f satisfies
Parseval's Theorem
Dyadic Shift
Let $F_1(w) = F(w \oplus a)$; then $f_1(x)$ is given by
$$f_1(x) = f(x) \oplus a \cdot x.$$
Similarly, if $f_2(x) = f(x \oplus a)$ then
$$F_2(w) = (-1)^{a \cdot w} F(w).$$
Dyadic Convolution
From the above result it can be shown that the Hamming distance
can be expressed as
The Walsh transform is a powerful analytical tool for studying the properties of boolean
functions for different reasons:
Some cryptographic properties, such as bentness, were originally defined in terms of
the Walsh spectrum of the boolean functions.
All the cryptographic properties of a boolean function can be extracted from its Walsh
spectrum. For example:
The largest spectral component gives the nonlinearity of the function $f: Z_2^n \to Z_2$
Similarly, for a multi-output boolean function $f: Z_2^n \to Z_2^m$ we have
where $F_c(w)$ is the Walsh transform of the function $(-1)^{c \cdot f}$.
$k$-th order correlation immune functions must have
$$F(w) = 0, \quad 1 \leq wt(w) \leq k.$$
SAC fulfilling functions must have
$$\sum_{w \in Z_2^n} F^2(w)(-1)^{w_i} = 0, \quad i \in \{1, 2, \ldots, n\},$$
where $w_i$ is the $i$-th bit in $w$.
Functions satisfying PC-$k$ must have
$$\sum_{w \in Z_2^n} F^2(w)(-1)^{a \cdot w} = 0, \quad 1 \leq wt(a) \leq k.$$
The field of digital signal processing is well established and it can provide us with a good
theoretical foundation for studying different cryptographic characteristics of multi-output
boolean functions.
Possible extension of the Walsh transform to the case of multi-output boolean functions:
multi-output boolean functions can be described by the Walsh transform of every nonzero
linear combination of their output coordinates.
This motivates us to express different forms of information leakage in terms of the Walsh
transform coefficients (see section 5.6).
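The spectral properties discussed in this section can be checked numerically. The following is a minimal sketch under a stated assumption: it uses the unnormalized transform $F(w) = \sum_x (-1)^{f(x) \oplus w \cdot x}$, computed with the standard fast Walsh-Hadamard butterfly. The scaling used in this thesis may differ, in which case the constants below change by a power of two. The function name `walsh_spectrum` and the example function are ours.

```python
def walsh_spectrum(truth_table):
    """Unnormalized Walsh transform of (-1)^f: F[w] = sum_x (-1)^(f(x) + w.x)."""
    spec = [1 - 2 * bit for bit in truth_table]
    h = 1
    while h < len(spec):
        for start in range(0, len(spec), 2 * h):
            for j in range(start, start + h):
                u, v = spec[j], spec[j + h]
                spec[j], spec[j + h] = u + v, u - v
        h *= 2
    return spec


n = 4
f = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
     for x in range(1 << n)]                     # x1 x2 + x3 x4
F = walsh_spectrum(f)

parseval = sum(v * v for v in F)                 # equals 2^(2n) in this scaling
nonlin = (1 << (n - 1)) - max(abs(v) for v in F) // 2
sac_sums = [sum(F[w] ** 2 * (-1) ** ((w >> i) & 1) for w in range(1 << n))
            for i in range(n)]                   # all zero iff f satisfies SAC
print(F, parseval, nonlin, sac_sums)
```

In this scaling a bent function has a flat spectrum of magnitude $2^{n/2}$, which is what the example exhibits.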
Chapter 3 Linear Approximation Table and XOR Distribution Table of Balanced S-boxes
Differential cryptanalysis [15] and linear cryptanalysis [82] are currently the most
powerful cryptanalytic attacks on private-key block ciphers. The complexity of
differential cryptanalysis depends on the size of the largest entry in the XOR table,
the total number of zeroes in the XOR table, and the number of nonzero entries in the
first column in that table [15], [122]. The complexity of linear cryptanalysis depends on
the size of the largest entry in the linear approximation table (LAT) [81], [82].
Thus one way to reduce the risk of differential and linear cryptanalysis is to choose
s-boxes with small maximum LAT entries and small maximum XOR table entries.
O'Connor [97] studied the distribution of the XOR table of bijective mappings. In
particular, O'Connor showed that for a randomly selected $n$-bit bijective mapping, the
expected size of the largest entry in the XOR distribution table is bounded by $2n$
while the fraction of the XOR table that is zero approaches $e^{-0.5} = 0.60653$. O'Connor
also showed that the above two quantities tend to optimal values in composite
permutations [96].
In some block cipher designs, it is required to have balanced s-boxes (also known as
regular s-boxes). In this chapter, we derive an upper bound on the fraction of balanced
functions (s-boxes) having a specified lower bound on the maximum entry in the XOR
distribution table or the LAT. For reasonably small values of these maximum entries we
show that this fraction decreases dramatically with the number of input variables.
We also calculate the probability that any nonzero linear combination of the output
coordinates of a regular s-box is affine.
From [88], the total number of balanced s-boxes with $n$ input bits and $m$ output bits
is given by
$$B(n, m) = \frac{(2^n)!}{\left((2^{n-m})!\right)^{2^m}}.$$
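The number of balanced s-boxes is the multinomial coefficient $(2^n)! / ((2^{n-m})!)^{2^m}$, since each of the $2^m$ output symbols must be placed at exactly $2^{n-m}$ of the $2^n$ input positions. The sketch below (function names are ours) compares this closed form with an exhaustive count for a few small sizes.

```python
from collections import Counter
from itertools import product
from math import factorial


def balanced_count_formula(n, m):
    """Multinomial count of mappings in which every output appears 2^(n-m) times."""
    return factorial(2**n) // factorial(2**(n - m)) ** (2**m)


def balanced_count_bruteforce(n, m):
    """Enumerate all mappings Z_2^n -> Z_2^m and count the balanced ones."""
    target = 2**(n - m)
    count = 0
    for table in product(range(2**m), repeat=2**n):
        occurrences = Counter(table)
        if len(occurrences) == 2**m and all(v == target for v in occurrences.values()):
            count += 1
    return count


for n, m in [(2, 1), (2, 2), (3, 1), (3, 2)]:
    print(n, m, balanced_count_formula(n, m), balanced_count_bruteforce(n, m))
```

The brute-force enumeration grows as $2^{m 2^n}$ and is only feasible for the tiny cases shown.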
3.2 Linear Approximation Table of Balanced S-boxes
Recall that for a given s-box constructed from a mapping $f: Z_2^n \to Z_2^m$, the linear
approximation table entry $LAT(a, b)$ is defined as [82]:
$$LAT(a, b) = \#\{x \in Z_2^n \mid a \cdot x = b \cdot f(x)\} - 2^{n-1}$$
where $a \in Z_2^n$, $b \in Z_2^m \setminus \{0\}$, and $a \cdot x$ denotes the inner product of the vectors $a$ and
$x$ evaluated over $Z_2$.
It is straightforward to notice that $LAT(a, b) = 2^{n-1} - d(a \cdot x, b \cdot f)$, where
$$d(a \cdot x, b \cdot f) = \#\{x \in Z_2^n \mid a \cdot x \neq b \cdot f(x)\}.$$
Let $wt(a)$ be the Hamming weight of the binary vector $a$, which is the number of 1's
in the vector $a$. For $wt(a) = 0$, we have $LAT(a, b) = 0$ as $b \cdot f$ is always a balanced
function for a balanced s-box.
For $a \neq 0$, $b \neq 0$ we have
Lemma 3.1
$$P(d(a \cdot x, b \cdot f) = 2l) = \frac{\binom{2^{n-1}}{l}^2}{\binom{2^n}{2^{n-1}}}, \quad 0 \leq l \leq 2^{n-1}.$$
Proof: For $wt(b) = k > 0$, we have $\left( \frac{(2^{n-1})!}{((2^{n-k})!)^{2^{k-1}}} \right)^2$ ways of arranging the bits of $f(x)$
corresponding to nonzero bits of $b$ such that $d(a \cdot x, b \cdot f) = 0$. This result follows
directly by noting that the number of arrangements of $r$ different objects, for each of
which there are $c$ copies, is $(rc)!/(c!)^r$. In our case, we have $2^{k-1}$ $k$-bit symbols with
the corresponding XOR equal to 0, and $2^{k-1}$ $k$-bit symbols with the corresponding XOR
equal to 1. Each of these symbols occurs $2^{n-k}$ times. Thus we have $\frac{(2^{n-1})!}{((2^{n-k})!)^{2^{k-1}}}$ possible
arrangements for the symbols corresponding to the 0's of $a \cdot x$ and a similar number of
possible arrangements for the symbols corresponding to the 1's of $a \cdot x$.
Since we can partition the input $x$ into $2^k$ distinct sets, all $x$'s in a given set are assigned
the same common value of the $k$ bits of $f(x)$ corresponding to the nonzero bits of $b$.
We still need to assign the remaining $m - k$ output bits for each $x$. Within
a given set, each $(m - k)$-tuple of the remaining output bits must be assigned
$2^{n-m}$ times, so each set can be assigned in $\frac{(2^{n-k})!}{((2^{n-m})!)^{2^{m-k}}}$ ways, and the remaining bits can be
assigned in $\left( \frac{(2^{n-k})!}{((2^{n-m})!)^{2^{m-k}}} \right)^{2^k}$ ways.
Thus we have
$$\left( \frac{(2^{n-1})!}{((2^{n-k})!)^{2^{k-1}}} \right)^2 \left( \frac{(2^{n-k})!}{((2^{n-m})!)^{2^{m-k}}} \right)^{2^k} = \frac{((2^{n-1})!)^2}{((2^{n-m})!)^{2^m}}$$
distinct balanced functions
with $d(a \cdot x, b \cdot f) = 0$. Using the same argument, and by noting that we have $\binom{2^{n-1}}{l}^2$
ways to generate balanced functions that have $d(a \cdot x, b \cdot f) = 2l$ from $a \cdot x$, we have
$$\binom{2^{n-1}}{l}^2 \frac{((2^{n-1})!)^2}{((2^{n-m})!)^{2^m}}$$
ways of generating distinct balanced s-boxes with $d(a \cdot x, b \cdot f) = 2l$.
By dividing by the total number of balanced s-boxes, $B(n, m)$, we get the lemma above. □
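The counting argument in the proof implies that, over all balanced s-boxes, the distance $d(a \cdot x, b \cdot f)$ takes the value $2l$ with probability $\binom{2^{n-1}}{l}^2 / \binom{2^n}{2^{n-1}}$. The sketch below checks this by exhaustive enumeration for $n = 3$, $m = 2$. The function names and the particular masks $a$, $b$ are our assumptions for illustration, and the formula is the one obtained by carrying out the division described in the proof.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb


def parity(v: int) -> int:
    return bin(v).count("1") & 1


def distance_distribution(n, m, a, b):
    """Exact distribution of d(a.x, b.f) over all balanced n x m s-boxes."""
    size, symbols, repeats = 2**n, 2**m, 2**(n - m)
    histogram = Counter()
    for table in product(range(symbols), repeat=size):
        if any(table.count(y) != repeats for y in range(symbols)):
            continue                                   # not balanced
        d = sum(parity(a & x) != parity(b & table[x]) for x in range(size))
        histogram[d] += 1
    total = sum(histogram.values())
    return {d: Fraction(count, total) for d, count in histogram.items()}


def predicted(n, l):
    """Probability of d = 2l implied by the counting argument."""
    return Fraction(comb(2**(n - 1), l) ** 2, comb(2**n, 2**(n - 1)))


n, m, a, b = 3, 2, 0b101, 0b11
observed = distance_distribution(n, m, a, b)
for l in range(2**(n - 1) + 1):
    print(2 * l, observed.get(2 * l, 0), predicted(n, l))
```

Only even distances occur because both $a \cdot x$ and $b \cdot f$ are balanced functions.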
For any integer value $\pm 2M_{LAT}$, $0 \leq |2M_{LAT}| \leq 2^{n-1}$, let $N_{LAT}$ denote the number of LATs
with any entry having absolute value $\geq 2M_{LAT}$; then $N_{LAT}$ is upper bounded by
Proof: For $l \neq 0$ we have
Using Lemma 3.1, and by noting that¹
we get the theorem above. □
Numerical substitution in the formula above shows that the fraction of balanced s-boxes
with undesirable LATs decreases dramatically as the number of inputs increases. To give
a numerical example, consider the case where $M_{LAT} = 2^{n-5}$; for $n = 12$, $m = 6$ we
have $N_{LAT}/B(n, m) < 3.86 \times 10^{-10}$.
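The order of magnitude of this example can be reproduced with exact integer arithmetic. The sketch below is an assumption about how such a bound is assembled, not a transcription of the theorem: it takes the single-entry tail probability of $|LAT(a, b)| \geq 2M_{LAT}$ implied by the counting argument for balanced s-boxes and applies a union bound over the $(2^n - 1)(2^m - 1)$ entries with $a \neq 0$, $b \neq 0$. The exact expression in the theorem may count entries differently, so the value printed need not match the quoted figure to all digits.

```python
from math import comb


def undesirable_lat_fraction_bound(n, m, m_lat):
    """Union bound on the fraction of balanced s-boxes having some LAT entry
    with absolute value >= 2 * m_lat (valid for m_lat >= 1)."""
    half, quarter = 2**(n - 1), 2**(n - 2)
    total = comb(2**n, half)
    # P(LAT(a, b) = 2l) is proportional to C(2^(n-1), 2^(n-2) + l)^2
    one_sided = sum(comb(half, quarter + l) ** 2
                    for l in range(m_lat, quarter + 1))
    entries = (2**n - 1) * (2**m - 1)
    return entries * 2 * one_sided / total    # factor 2: both signs of l


bound = undesirable_lat_fraction_bound(12, 6, 2**(12 - 5))
print(f"{bound:.3e}")
```

Python integers are unbounded, so the binomial coefficients are evaluated exactly and only the final ratio is rounded to a float.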
3.3 XOR Distribution Table of Balanced S-boxes
Recall that for a given s-box constructed from a mapping $f: Z_2^n \to Z_2^m$, the XOR table
entry $N_{\Delta x \Delta y}$ is defined as [15]:
$$N_{\Delta x \Delta y} = \#\{x \in Z_2^n \mid f(x \oplus \Delta x) \oplus f(x) = \Delta y\}$$
where $\Delta x \in Z_2^n$, $\Delta y \in Z_2^m$.
Lemma 3.2
For $\Delta x \neq 0$, $\Delta y \neq 0$, the number of balanced functions with $N_{\Delta x \Delta y} \geq 2k$ is upper
bounded by $\Omega_{n,m}(k)$, where
¹ Equation (3.7) simply means that the number of LATs with entries having absolute value $\geq 2M_{LAT}$ is upper bounded by the
number of these entries
Proof: For any balanced function f : Z!: - 21. n 2 m, we have '2m distinct output
symbols. ~ 0 . ~ 1 . - - . y ? m - l , and each of them is repeated times. For a given
Ay # O' AX # O, let S be the set of Y"" O U ~ P U ~ symbols pairs (Yi, yj) with Yi = yjeAy.
Thus S has distinct unordered pairs, each of them is repeated times. Divide
S into 2m-1 subsets such that each of these subsets contains only identical unordered
pairs. Let Si. S?, . S2m-i denote these subsets.
Now we count the number of ways by which we cm constmct a balanced function
f : ZI: -+ Z!: for which .Lay 2 2k. Select k, pairs from each subset 3,. 2-1
i = 1.2. . lm-' such that I , = k. There is only one way to choose k, pairs i = l
from the subset Si (as the pairs within Si are indistinguishable). These k pairs can be
ordered in C (k; ki , k2 , . . . , k p - I ) ways. Note that we have 2 possible orders for each
pair, giving 2' total possible orders.
Now we select $k$ different input values $x_{i_0}, x_{i_1}, \ldots, x_{i_{k-1}}$ for which $x_{i_l} \oplus x_{i_j} \ne \Delta x$
for $0 \le l, j < k$. There are $\binom{2^{n-1}}{k}$ possible choices for $x_{i_0}, x_{i_1}, \ldots, x_{i_{k-1}}$. For each
such choice we assign to each pair $(f(x_{i_j}), f(x_{i_j} \oplus \Delta x))$ an element from the $k$ pairs
chosen above.
The remaining $2^n - 2k$ output symbols of $f$ can be assigned in
$C(2^n - 2k;\ l_1, l_1, l_2, l_2, \ldots, l_{2^{m-1}}, l_{2^{m-1}})$ ways, where $l_i = 2^{n-m} - k_i$.
The construction approach described above does not guarantee the construction of distinct
balanced functions, and so Q, , , (k ) is an upper bound. O
Lemma 3.3
For $\Delta x \ne 0$, $\Delta y = 0$, the number of balanced functions with $N_{\Delta x \Delta y} \ge 2k$ is upper
bounded by an-m(k), where
Proof: For any balanced function $f : Z_2^n \to Z_2^m$, $n \ge m$, we have $2^m$ distinct output
symbols, $y_0, y_1, \ldots, y_{2^m-1}$, and each of them is repeated $2^{n-m}$ times. For a given
$\Delta x \ne 0$, $\Delta y = 0$, let $S$ be the set of $2^{n-1}$ output symbol pairs $(y_i, y_j)$ with $y_i = y_j$.
Thus $S$ has $2^m$ distinct pairs, each of them repeated $2^{n-m-1}$ times. Divide $S$ into $2^m$
subsets such that each of these subsets contains only identical pairs. Let $S_1, S_2, \ldots, S_{2^m}$
denote these subsets.
Now we count the number of ways by which we can construct a balanced function
$f : Z_2^n \to Z_2^m$ for which $N_{\Delta x \Delta y} \ge 2k$. Select $k_i$ pairs from each subset $S_i$, $i = 1, 2, \ldots, 2^m$,
such that $\sum_{i=1}^{2^m} k_i = k$. There is only one way to choose $k_i$ pairs from $S_i$ (as the symbols
within $S_i$ are indistinguishable). These $k$ pairs can be ordered in $C(k; k_1, k_2, \ldots, k_{2^m})$
ways. In this case, we do not have the $2^k$ factor (as in Lemma 3.2) because the
two elements within each pair are identical. Now we select $k$ different input values
$x_{i_0}, x_{i_1}, \ldots, x_{i_{k-1}}$ for which $x_{i_l} \oplus x_{i_j} \ne \Delta x$ for $0 \le l, j < k$. There are
$\binom{2^{n-1}}{k}$ possible choices for $x_{i_0}, x_{i_1}, \ldots, x_{i_{k-1}}$.
For each such choice we assign to each pair
$(f(x_{i_j}), f(x_{i_j} \oplus \Delta x))$ an element from the $k$ pairs chosen above. The remaining
$2^n - 2k$ output symbols of $f$ can be assigned in $C(2^n - 2k;\ l_1, l_2, \ldots, l_{2^m})$ ways where
$l_i = 2^{n-m} - 2k_i$.
A similar statement about the upper bounding applies to @,,,(k). 0
The exact number of balanced functions with $N_{\Delta x \Delta y} = 2k$, $\Delta x \ne 0$, $\Delta y \ne 0$ (denoted
by $A_{n,m,\Delta y}(k)$) is given by
Proof: Let $p_i$ (cf. equation (2.10)) denote the number of balanced functions with the
property that $N_{\Delta x \Delta y} \ge 2k$. The Lemma follows by direct application of the Inclusion-
Exclusion Principle.
Similarly, if we denote the exact number of balanced functions with $N_{\Delta x \Delta y} = 2k$, $\Delta x \ne 0$,
$\Delta y = 0$ by $A_{n,m,0}(k)$ then we have
Using the above results, and by noting that²
we get the following theorem.
² Equation (3.15) simply means that the number of XOR distribution tables with entries having value $\ge 2M_{XOR}$ is upper
bounded by the number of these entries.
The fraction of balanced functions with maximum XOR table entry $\ge 2M_{XOR}$,
$0 \le M_{XOR} \le 2^{n-1}$, is upper bounded by
Numerical substitution in the formula above shows that the fraction of balanced s-boxes
with undesirable XOR distribution tables decreases dramatically as the number of inputs
increases.
One special case of interest is for bijective mappings, i.e., when $n = m$. For this case
(see [97] for more details on this special case) we have
It is also straightforward to see that
and
and hence for $0 < M_{XOR} \le 2^{n-1}$, we have
To give a numerical example, consider the case where k1.t-QR = n. For n = 1- we
Table 3.1 shows the results of our simulation³ for 10' randomly chosen 8-bit bijective
mappings. T-Bound refers to our theoretical bound (equations (3.17) and (3.20)) for
the fraction of bijective mappings with .YOR* = 3 i M a x + y o ~ , or LAT* = 2 M a x r A T .
S-Bound refers to the bounds obtained from the simulation results. Only the nontrivial
bounds are shown in the table.
Table 3.2 shows the number of bijective mappings with XOR* = 3 M a 1 - ~ - ~ ~ , and
LAT* = 2 M a x L A T obtained from our simulations.
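As a rough illustration of the kind of simulation described above, the following Python sketch draws random bijective mappings and records the largest XOR table entry of each. It is not the program used for Tables 3.1 and 3.2: the function names, the seed and the sample sizes are assumptions made for this example, and only the XOR table (not the LAT) is covered.

```python
import random


def xor_table_max(sbox, n):
    """Largest XOR table entry over all nonzero input differences dx."""
    size = 1 << n
    best = 0
    for dx in range(1, size):
        counts = {}
        for x in range(size):
            dy = sbox[x] ^ sbox[x ^ dx]
            counts[dy] = counts.get(dy, 0) + 1
        best = max(best, max(counts.values()))
    return best


def random_bijection(n, rng):
    """A uniformly random permutation of {0, ..., 2^n - 1}."""
    perm = list(range(1 << n))
    rng.shuffle(perm)
    return perm


def max_entry_histogram(n, trials, seed=0):
    """Histogram {max XOR entry: number of sampled bijections}."""
    rng = random.Random(seed)
    hist = {}
    for _ in range(trials):
        entry = xor_table_max(random_bijection(n, rng), n)
        hist[entry] = hist.get(entry, 0) + 1
    return hist


if __name__ == "__main__":
    # Small illustrative run; the sample size here is arbitrary.
    print(sorted(max_entry_histogram(8, 20).items()))
```

Every entry counted this way is even, because $x$ and $x \oplus \Delta x$ always contribute to the same entry.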
Table 3.1 Experimental Results versus Theoretical Bounds for 10' Randomly Chosen 8-bit Bijective Mappings
³ Simulation results in this thesis make use of the UNIX C library function "rand".
Table 3.2 Distribution of the Maximum XOR Table Entry and the Maximum LAT
Entry (Experimental Results for 10' Randomly Chosen 8-bit Bijective Mappings)
3.4 Counting the Number of Nonlinear Balanced S-boxes
Gordon and Retkin [50] calculated the probability that any of the output coordinates of a
random, reversible substitution box (i.e., a bijective mapping) is an affine function. After
linear cryptanalysis was introduced, it was realized that the cryptographic strength of a
multi-output function depends not only on the strength of its individual output coordinates
but also on the strength of every nonzero linear combination of these coordinates [90].
In this section, we calculate the probability that any nonzero linear combination of the
output coordinates of a regular s-box is an affine function.
Lemma 3.5
The number of regular $n \times m$ s-boxes (described by the multi-output boolean functions
$f : Z_2^n \to Z_2^m$, $n \ge m$) for which the first $k$ output functions are affine is given by
Proof: By noting that every nonzero linear combination of the output coordinates of
regular s-boxes is a balanced function, the first function can be chosen in $(2^{n+1} - 2)$
ways, which is the total number of balanced affine functions. The second function can
be chosen from the set of balanced affine functions not including the first one or its
complement, i.e., in $(2^{n+1} - 4)$ ways. The third one can be chosen from the set of
balanced affine functions not including any linear combination of the first two functions.
Proceeding as above, the first $k$ output functions can be chosen in $2^k \prod_{i=0}^{k-1} (2^n - 2^i)$ ways.
Since we can partition the input $x$ into $2^k$ distinct sets, all $x$'s in a given set are assigned
the same common value of these $k$ bits. We still need to assign the remaining $m - k$
output bits for each $x$. Within a given set, each distinct $(m - k)$-tuple
of the remaining output bits must be assigned $2^{n-m}$ times, so
each set can be assigned in $\frac{2^{n-k}!}{(2^{n-m}!)^{2^{m-k}}}$
ways, and the remaining bits can be assigned in
$\left(\frac{2^{n-k}!}{(2^{n-m}!)^{2^{m-k}}}\right)^{2^k}$ ways. □
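The counting argument in the proof can be checked numerically on very small cases. The sketch below assumes that the count is the product of the two factors derived in the proof, $2^k \prod_{i=0}^{k-1}(2^n - 2^i)$ and $\big(2^{n-k}!/(2^{n-m}!)^{2^{m-k}}\big)^{2^k}$; this product is a reconstruction from the proof text, not a formula quoted from the thesis, and all function names are illustrative. The brute-force check treats the $k$ lowest output bits as "the first $k$ output functions".

```python
from itertools import permutations
from math import factorial


def affine_first_k_count(n, m, k):
    """Reconstructed count of regular n x m s-boxes whose first k outputs are affine."""
    choices = 2 ** k
    for i in range(k):
        choices *= 2 ** n - 2 ** i
    per_set = factorial(2 ** (n - k)) // factorial(2 ** (n - m)) ** (2 ** (m - k))
    return choices * per_set ** (2 ** k)


def is_affine(bits, n):
    """True if the truth table `bits` (length 2^n) is an affine boolean function."""
    constant = bits[0]
    coeffs = [bits[1 << j] ^ constant for j in range(n)]
    for x in range(1 << n):
        value = constant
        for j in range(n):
            if (x >> j) & 1:
                value ^= coeffs[j]
        if value != bits[x]:
            return False
    return True


def brute_force_bijective(n, k):
    """Count n x n bijections whose k lowest output bits are all affine."""
    total = 0
    for perm in permutations(range(1 << n)):
        if all(is_affine([(y >> j) & 1 for y in perm], n) for j in range(k)):
            total += 1
    return total
```

For $n = m = 3$ the exhaustive search runs over only $8! = 40320$ bijections, so it finishes quickly; larger sizes are out of reach for this kind of check.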
Consider the function $\Theta : Z_2^n \to Z_2^{2^m-1}$ constructed from every nonzero linear
combination of the output coordinates of $f$. Counting the number of regular s-boxes with
$\mathcal{NL}_f \ne 0$ corresponds to counting the number of functions $\Theta$ with no affine coordinates.
The number of ways one can choose $l$ coordinates of $\Theta$ such that $k$ of them are linearly
independent is equivalent to the number of $l \times m$ binary matrices (without taking the order
of rows into account, i.e., two matrices with the same rows but in different orders are
counted once) with nonzero distinct rows which have rank $k$. This is given by [132], [49]
where
Lemma 3.6
The number of ways to construct certain $k$ linearly independent coordinates of $\Theta$ from
affine functions is also given by $R(n, m, k)$.
Proof: It is clear that for every $k$ linearly independent coordinates of $\Theta$,
denoted by $\theta_{i_1}, \theta_{i_2}, \ldots, \theta_{i_k}$, one can find $(m - k)$ coordinates of $\Theta$, denoted by
$\theta_{i_{k+1}}, \theta_{i_{k+2}}, \ldots, \theta_{i_m}$, such that
where $A$ is an $m \times m$ invertible binary matrix and $(f_1, f_2, \ldots, f_m)$ denotes the output
coordinates of $f$. This means that as $f$ varies over the whole set of distinct regular s-boxes,
$(\theta_{i_1}, \theta_{i_2}, \ldots, \theta_{i_m})$ scans the whole set but in a different order. □
From the lemma above it follows that the number of ways to construct $l$ coordinates of
$\Theta$ from affine functions is given by
Using the inclusion-exclusion principle, the number of regular s-boxes with the property
that one or more of the nonzero linear combinations of its output coordinates is affine,
$RL(n, m)$, is given by
RL(n. m ) = (-1)'-1 L I ( m , l . k ) R(n. m. P ) . (3.26)
To express the above count as a fraction of the total number of regular s-boxes, denoted
by $FRL(n, m)$, we divide by the total number of $n \times m$ regular s-boxes.
To give a numerical example, for $n = 6$, $m = 4$, which is the size of DES s-boxes,
$FRL(6, 4) = 2.46 \times 10^{-16}$. One can easily get an upper bound for $FRL(n, m)$ by noting
that we have
and hence we have
Since we have $n \ge m$, then
By noting that
then we have
where $O(\cdot)$ is the asymptotic upper bound order notation⁴.
3.5 Conclusion
In this chapter we derived an upper bound on the fraction of balanced functions (s-boxes)
having a specified lower bound on the maximum entry in the XOR distribution table or
the LAT. For reasonably small values of these maximum entries we showed that this
fraction decreases dramatically with the number of input variables.
We also calculated the probability that any nonzero linear combination of the output
coordinates of a regular s-box is affine. Our results show that the number of balanced
functions with nonzero nonlinearity goes down exponentially (as .??"-Sn/') with the
number of input variables n.
While these results suggest that large cryptographically strong balanced s-boxes may be
obtained by selecting them at random, these statistical results might have some practical
limitations. Describing a large randomly chosen s-box requires a large amount of memory
which might be impractical in some applications. This suggests that other methods such
as algebraic constructions, which trade off memory requirements and computational speed
(see [89] for an example of such methods), might offer alternative approaches.
⁴ $h(n) = O(g(n))$ if there exists a positive constant $c$ and a positive integer $n_0$ such that $0 \le h(n) \le c\,g(n)$ for all $n \ge n_0$.
Chapter 4 Linear Approximation of Injective S-boxes
One way to reduce the size of the largest entry in the XOR table, and hence reduce the
risk of differential cryptanalysis, is to use injective substitution boxes (s-boxes) such that
the number of output bits of the s-box is sufficiently larger than the number of input
bits. In this way, it is very likely that the entries in the XOR distribution table of a
randomly chosen injective s-box will have only small values, making the block cipher
resistant to differential cryptanalysis. Some proposed block ciphers, such as CAST [5]
and Blowfish [119], take advantage of this property.
On the other hand, Biham [14] proved that if for an $n \times m$ s-box described by $f : Z_2^n \to Z_2^m$
we have $m \ge 2^n - n$, then at least one linear combination of the output bits must be
an affine combination of the input bits and the block cipher can be trivially broken by
linear cryptanalysis.
In this chapter, we estimate the size of the largest entry in the LAT of a randomly
selected injective s-box. We also present two methods for constructing highly nonlinear
s-boxes. Finally we show how the resistance of a CAST-like encryption algorithm to the
basic linear cryptanalysis was underestimated in [75].
4.1 Definitions and Notation
The function $g : X \to Y$ is injective (or one-to-one) if for $x, \hat{x}$ in $X$ the equality
$g(x) = g(\hat{x})$ implies $x = \hat{x}$. That is, distinct elements of $X$ cannot have the same
image in $Y$ [62].
From the definitions of nonlinearity and LAT, it is clear that the nonlinearity of the
function $f : Z_2^n \to Z_2^m$ is given by
$$\mathcal{NL}_f = 2^{n-1} - \max_{a,b} |LAT(a, b)|.$$
Throughout the rest of this chapter, the $b = 0$ case is not taken into consideration as it
does not have any cryptographic significance.
It is easy to see that the number of injective $n \times m$ s-boxes is given by
$In(n, m) = \prod_{i=0}^{2^n - 1} (2^m - i)$. Throughout this chapter, let
$f : Z_2^n \to Z_2^m$ describe a random injective mapping,
$wt(a)$ denote the Hamming weight of the binary vector $a$,
$P_w(k) = P\{wt(b \cdot f) = k\}$,
$P_d^{(a,b)}(l) = P\{d(a \cdot x, b \cdot f) = l\}$, $b \ne 0$,
$P_{d|w}(l, k) = P\{d(a \cdot x, b \cdot f) = l \mid wt(b \cdot f) = k\}$.
4.2 Linear Approximation Table of Injective Mappings
We first determine the probability distribution of the weight of $b \cdot f$:
Proof: For $b \ne 0$, $wt(b \cdot f)$ follows the hypergeometric distribution [20]. This follows
by noting that the weight of $b \cdot f$ has the same distribution as that of the function
constructed by randomly choosing $2^n$ bits from the function $b \cdot \pi$ where $\pi : Z_2^m \to Z_2^m$
is an arbitrary bijective mapping. □
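The hypergeometric argument in the proof can be made concrete with a short Python sketch. It assumes the standard hypergeometric form $P_w(k) = \binom{2^{m-1}}{k}\binom{2^{m-1}}{2^n - k} / \binom{2^m}{2^n}$, obtained by drawing the $2^n$ distinct outputs of $f$ without replacement from the $2^m$ values of $Z_2^m$, exactly half of which satisfy $b \cdot y = 1$ when $b \ne 0$. This expression is inferred from the proof, and the function name is hypothetical.

```python
from math import comb


def weight_pmf(n, m, k):
    """P{wt(b.f) = k} for a random injective n x m s-box and any fixed b != 0."""
    half = 1 << (m - 1)   # number of outputs y with b.y = 1 (and with b.y = 0)
    draws = 1 << n        # number of distinct outputs used by f
    if k < 0 or k > draws:
        return 0.0
    return comb(half, k) * comb(half, draws - k) / comb(1 << m, draws)


if __name__ == "__main__":
    n, m = 3, 5
    pmf = [weight_pmf(n, m, k) for k in range((1 << n) + 1)]
    print(sum(pmf))                               # total probability
    print(sum(k * p for k, p in enumerate(pmf)))  # mean weight
```

In the bijective case $n = m$ the distribution collapses to the single value $k = 2^{n-1}$, which matches the fact that every nonzero linear combination of the outputs of a bijection is balanced.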
Given the weight of $b \cdot f$, the distribution of the distance between $a \cdot x$ and $b \cdot f$ can
be determined:
Lemma 4.2
For $a \ne 0$, we have
Proof: Let $M_{ij} = \#\{x \in Z_2^n \mid a \cdot x = i,\ b \cdot f(x) = j\}$, $i, j \in Z_2$. Then we have
$d(a \cdot x, b \cdot f) = M_{10} + M_{01}$. Since $wt(b \cdot f) = k$, we have $M_{01} + M_{11} = k$. We also
have $M_{10} + M_{11} = 2^{n-1}$ as $a \cdot x$ is a balanced function. Using these equations,
we get the lemma. □
Combining the two results above, the following theorem gives the probability distribution
of the distance between $a \cdot x$ and $b \cdot f$ for fixed $a$ and $b$. This can be used to find the
probability that a given entry in the LAT has a particular value, $2^{n-1} - l$.
Theorem 4.1
Proof: The $a = 0$ case follows because the distance between any function $f$ and the
zero function is the weight of $f$. For $a \ne 0$, the result holds because
For any integer value $M_{LAT}$, $0 < M_{LAT} \le 2^{n-1}$, and by denoting the number of
LATs with any entry having absolute value $\ge M_{LAT}$ by $N_{LAT}$, we have the following
upper bound:
Corollary 4.1
Proof: We have
$$LAT(a, b) = 2^{n-1} - d(a \cdot x, b \cdot f).$$
From Theorem 4.1, and by noting that
$$N_{LAT} = P\Big\{\max_{a,\, b \ne 0} |LAT(a, b)| \ge M_{LAT}\Big\}\, In(n, m)$$
we get the Corollary above.
Take the minimum value of $M_{LAT}$ for which $\frac{N_{LAT}}{In(n,m)} \le 0.5$ as an estimate for the
expected value of the maximum LAT entry, LAT*. Table 4.1 shows the simulation
results for LAT*, the average value of the maximum entry in the LAT of a randomly
selected $8 \times m$ injective s-box ($m = 10, 12, \ldots, 32$) together with our theoretical estimate,
denoted by LAT". Table 4.1 also shows the maximum and the minimum values for
LAT* found throughout the experiments. For all the results given in Table 4.1, the
sample variance is upper bounded by 2. The Fast Walsh transform [6] and other speed-up
techniques were used in the calculation of the maximum LAT entries throughout our
simulation. It is worth noting that for $m = 32$, four of the s-boxes out of the ten tested
achieved a nonlinearity of 73, while the nonlinearity of the remaining six s-boxes was 77.
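To illustrate how a fast Walsh transform yields the maximum LAT entry, here is a minimal Python sketch. It is not the simulation code behind Table 4.1: the function names are hypothetical, the random s-box is drawn with Python's `random.sample`, and the loop over all nonzero output masks $b$ is only practical for small $m$ (the "other speed-up techniques" mentioned above are not reproduced). The sketch uses $LAT(a, b) = 2^{n-1} - d(a \cdot x, b \cdot f)$, which equals half of the Walsh sum $\sum_x (-1)^{a \cdot x \oplus b \cdot f(x)}$.

```python
import random


def fwht(values):
    """In-place fast Walsh-Hadamard transform; len(values) must be a power of two."""
    h = 1
    while h < len(values):
        for start in range(0, len(values), 2 * h):
            for i in range(start, start + h):
                a, b = values[i], values[i + h]
                values[i], values[i + h] = a + b, a - b
        h *= 2
    return values


def parity(v):
    """Parity of the number of set bits of v."""
    return bin(v).count("1") & 1


def max_lat_entry(sbox, n, m):
    """max |LAT(a, b)| over all a and all b != 0 for an n x m s-box."""
    best = 0
    for b in range(1, 1 << m):
        signs = [1 - 2 * parity(b & y) for y in sbox]   # (-1)^(b.f(x))
        spectrum = fwht(signs)                          # Walsh sums for every a
        best = max(best, max(abs(w) for w in spectrum) // 2)
    return best


def random_injective_sbox(n, m, rng):
    """2^n distinct m-bit outputs, drawn uniformly at random."""
    return rng.sample(range(1 << m), 1 << n)


if __name__ == "__main__":
    rng = random.Random(1)
    n, m = 6, 10                        # small illustrative sizes
    sbox = random_injective_sbox(n, m, rng)
    lat_max = max_lat_entry(sbox, n, m)
    print(lat_max, (1 << (n - 1)) - lat_max)   # LAT* and the nonlinearity
```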
Table 4.1 Experimental Results for $8 \times m$ Injective S-boxes
4.3 Construction of Highly Nonlinear Injective S-boxes
In this section we present two methods for constructing highly nonlinear injective
s-boxes. Both of these methods, which are motivated by bounds on exponential sums,
outperform the previously proposed methods. In particular, we are able to obtain injective
$8 \times 32$ s-boxes with nonlinearity equal to 80 and maximum XOR table entry of 2. We
also re-evaluate the resistance of the CAST encryption algorithm to the basic linear
cryptanalysis.
4.3.1 Definitions
Trace: The sum
$$Tr(\alpha) = \sum_{i=0}^{m-1} \alpha^{2^i}$$
is called the trace of $\alpha \in GF(2^m)$.
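The trace can be computed by repeated squaring. The sketch below works in $GF(2^8)$ with elements stored as integers and reduced modulo $x^8 + x^4 + x^3 + x + 1$; this particular irreducible polynomial and the function names are assumptions made for the example, since the definition above does not fix a representation of the field.

```python
IRREDUCIBLE = 0x11B  # x^8 + x^4 + x^3 + x + 1 (assumed choice of field polynomial)


def gf_mul(a, b, m=8, poly=IRREDUCIBLE):
    """Multiply two elements of GF(2^m) given as integers below 2^m."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a >> m:          # degree reached m: reduce modulo the field polynomial
            a ^= poly
        b >>= 1
    return result


def trace(a, m=8, poly=IRREDUCIBLE):
    """Tr(a) = a + a^2 + a^4 + ... + a^(2^(m-1)), computed in GF(2^m)."""
    total, term = 0, a
    for _ in range(m):
        total ^= term
        term = gf_mul(term, term, m, poly)
    return total


if __name__ == "__main__":
    values = [trace(a) for a in range(256)]
    print(set(values), sum(values))
```

The trace always lands in $GF(2)$, is linear over $GF(2)$, and takes the value 1 on exactly half of the field elements.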
Basis and dual basis: A set $\{\alpha_1, \alpha_2, \ldots, \alpha_m\}$ of $m$ elements of $GF(2^m)$ which are
linearly independent over $GF(2)$ is called a basis for $GF(2^m)$ over $GF(2)$.
The corresponding dual basis is defined to be the unique set of elements
$\{\gamma_1, \gamma_2, \ldots, \gamma_m\} \subset GF(2^m)$ such that
$$Tr(\alpha_i \gamma_j) = \begin{cases} 1, & i = j, \\ 0, & i \ne j. \end{cases}$$
Lemma 4.3 (Carlitz and Uchiyama bound [25])
If $F(x)$ is a polynomial over $GF(2^m)$ of degree $r$ such that $F(x) \ne G(x)^2 + G(x) + b$
for all polynomials $G(x)$ over $GF(2^m)$ and constants $b \in GF(2^m)$, then
$$\Big|\sum_{x \in GF(2^m)} (-1)^{Tr(F(x))}\Big| \le (r - 1)\, 2^{m/2}.$$
Lemma 4.4 (Kloosterman sum [137], [25])
Note that a function, $F$, over $GF(2^n)$ can also be expressed as a function over $GF(2)^n$,
i.e., as $n$ functions over $GF(2)$. Let
be a function over $GF(2)^n$, and let $B = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ be any basis of $GF(2^n)$ over
$GF(2)$; then
where $x = \sum_{i=1}^{n} x_i \alpha_i \in GF(2^n)$. This means that there is a one-to-one correspondence
between the functions of $GF(2^n)$ and those of $GF(2)^n$ under a chosen basis of $GF(2^n)$
over $GF(2)$. If we let $\{\alpha_1^*, \ldots, \alpha_n^*\}$ be the dual basis of $B$, then each component of
$f(x_1, \ldots, x_n)$ can be expressed as
where $x = \sum_{i=1}^{n} x_i \alpha_i$.
The nonlinearity of the function $f(x_1, \ldots, x_n) = (f_1(x_1, \ldots, x_n), \ldots, f_m(x_1, \ldots, x_n))$ is
defined as the minimum Hamming distance between the set of affine functions and every
nonzero linear combination of the output coordinates of $f$.
From the above, one can easily prove that the nonlinearity of the function $f$ that
corresponds to the function $F$ is given by
$$\mathcal{NL}_f = \min_{c \ne 0,\, b,\, w} d\big(Tr(cF(x)),\ Tr(wx) \oplus b\big)$$
where $c, w \in GF(2^n)$, $b \in GF(2)$.
4.3.2 Construction Method I¹
This construction method is based on the observation that highly nonlinear injective
s-boxes may be obtained by adding the coordinate functions of highly nonlinear bijective
s-boxes.
Given the distinct bijective functions $F_i$ over $GF(2^n)$, $1 \le i \le M$, an injective function
$G$ over $GF(2^{nM})$ can be obtained by setting
In this method we use the inversion mapping proposed by Nyberg [91]
where $x \in GF(2^n)$.
Using Lemma 4.4, the nonlinearity of the function $F_i$ is lower bounded by
Experimental results show that injective $8 \times 16$ s-boxes constructed by this method always
have nonlinearity of 96. For $8 \times 24$ and 100 random choices of $a_i$ pairs, we found 57, 40
and 3 s-boxes with $\mathcal{NL} = 86$, 84 and 80 respectively. The only $8 \times 32$ s-box tested to
date has nonlinearity of 76.
¹ The experimental results in this part were obtained by the second author in [141].
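As a small sanity check on the building block of this method, the following Python sketch computes the nonlinearity of the plain inversion mapping $x \mapsto x^{-1}$ over $GF(2^8)$ (with $0 \mapsto 0$). It does not reproduce the concatenated $8 \times 16$, $8 \times 24$ or $8 \times 32$ constructions or the experiments reported above; the field polynomial `0x11B` and all function names are assumptions made for this example.

```python
IRREDUCIBLE = 0x11B  # assumed field polynomial x^8 + x^4 + x^3 + x + 1


def gf_mul(a, b):
    """Multiply two elements of GF(2^8)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= IRREDUCIBLE
        b >>= 1
    return result


def gf_inverse(x):
    """x^254, which equals x^(-1) for x != 0 and maps 0 to 0."""
    result, base, e = 1, x, 254
    while e:
        if e & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        e >>= 1
    return result


def walsh_max(bits):
    """Largest |sum_x (-1)^(g(x) + a.x)| over all a, via a fast Walsh transform."""
    v = [1 - 2 * bit for bit in bits]
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for i in range(start, start + h):
                v[i], v[i + h] = v[i] + v[i + h], v[i] - v[i + h]
        h *= 2
    return max(abs(w) for w in v)


def nonlinearity(sbox, n, m):
    """Minimum distance from any nonzero output combination to the affine functions."""
    worst = 0
    for b in range(1, 1 << m):
        bits = [bin(b & y).count("1") & 1 for y in sbox]
        worst = max(worst, walsh_max(bits))
    return (1 << (n - 1)) - worst // 2


INVERSION = [gf_inverse(x) for x in range(256)]

if __name__ == "__main__":
    print(nonlinearity(INVERSION, 8, 8))
```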
Conjecture
The nonlinearity of the function $g$ that corresponds to the function
obtained using construction method I, is bounded by
We verified this conjecture experimentally for $n \le 10$ where we found that this bound
is tight for even $n$.
Remark: We may prove that $\mathcal{NL}_g \ge 2^{n-1} - (2^{n/2+1} + 1)$ by proving that for $a \ne 0$
we have
but this seems to be a hard problem.
4.3.3 Construction Method II
This method is also based on the observation that highly nonlinear injective s-boxes
may be obtained by adding the coordinate functions of highly nonlinear s-boxes (not
necessarily bijective).
If the concatenated functions are distinct polynomials over $GF(2^n)$ such that
for all polynomials $U(x)$ over $GF(2^n)$ and constants $a_i \in GF(2^n) \setminus \{0\}$, $b, w \in GF(2^n)$,
then the Carlitz and Uchiyama bound can be used to provide a lower bound for the
nonlinearity of the resulting s-box as follows.
If $r = \max_i(\mathrm{degree}(F_i))$ then the nonlinearity of the resulting function is lower bounded by
Using the Carlitz and Uchiyama bound one can check that the nonlinearity of the function
$F(x) = x^3$, $x \in GF(2^8)$ is lower bounded by 112. Also the nonlinearity of the function
$G(x) = (x^3 \| x^5)$, $x \in GF(2^8)$ is lower bounded by 96.
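The two numerical values above can be reproduced with a one-line bound. The sketch assumes that the lower bound takes the form $2^{n-1} - (r-1)\,2^{n/2-1}$ for even $n$, i.e. half of the Carlitz and Uchiyama limit $(r-1)2^{n/2}$ subtracted from $2^{n-1}$; this form is inferred from the quoted values 112 and 96 rather than copied from the thesis, and the function name is hypothetical.

```python
def carlitz_uchiyama_nl_bound(n, r):
    """Assumed lower bound 2^(n-1) - (r-1) * 2^(n/2 - 1) on the nonlinearity (even n)."""
    if n % 2:
        raise ValueError("this sketch only handles even n")
    return 2 ** (n - 1) - (r - 1) * 2 ** (n // 2 - 1)


if __name__ == "__main__":
    for r in (3, 5, 7, 11, 13):
        print(r, carlitz_uchiyama_nl_bound(8, r))
```

For $n = 8$ this gives 112 for $r = 3$ and 96 for $r = 5$, in agreement with the text; the bound weakens quickly as the maximum degree $r$ grows.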
Our basic result is based on the experimental observation that the Carlitz and Uchiyama
bound is not tight for higher values of $r$. In this section we consider the following
five constructions
$G_1 = (x^5 \| x^7 \| x^{11} \| x^{13})$,
$G_2 = (x^3 \| x^7 \| x^{11} \| x^{13})$,
$G_3 = (x^3 \| x^5 \| x^{11} \| x^{13})$,
$G_4 = (x^3 \| x^5 \| x^7 \| x^{13})$,
$G_5 = (x^3 \| x^5 \| x^7 \| x^{11})$.
Using the Carlitz and Uchiyama bound we have
Experimental results show that
Let $F_i = x^3, x^5, x^7, x^{11}, x^{13}$ for $i = 1, 2, 3, 4, 5$ respectively. By noting that $F_i$ is
bijective for $i = 3, 4, 5$, any construction that includes any of $F_i$, $i = 3$, 4 or 5 will
be injective. In fact, it is not hard to see that all the s-boxes below are injective.
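The bijectivity and injectivity claims can be checked with elementary number theory. A power map $x \mapsto x^e$ on $GF(2^n)$ is a bijection exactly when $\gcd(e, 2^n - 1) = 1$, and a concatenation $(x^{e_1} \| \cdots \| x^{e_t})$ is injective exactly when $\gcd(e_1, \ldots, e_t, 2^n - 1) = 1$. The Python sketch below applies these criteria; the function names are hypothetical and the code is an illustration, not part of the original construction.

```python
from functools import reduce
from math import gcd


def power_map_is_bijective(e, n):
    """x -> x^e permutes GF(2^n) iff gcd(e, 2^n - 1) = 1."""
    return gcd(e, 2 ** n - 1) == 1


def concatenation_is_injective(exponents, n):
    """(x^e1 || ... || x^et) is injective iff gcd(e1, ..., et, 2^n - 1) = 1."""
    return reduce(gcd, exponents, 2 ** n - 1) == 1


if __name__ == "__main__":
    n = 8
    print([e for e in (3, 5, 7, 11, 13) if power_map_is_bijective(e, n)])
    print(concatenation_is_injective((3, 5), n))
```

For $n = 8$ only $x^7$, $x^{11}$ and $x^{13}$ are bijective, yet even $(x^3 \| x^5)$ is injective because $\gcd(3, 5, 255) = 1$.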
Table 4.2 below shows the nonlinearity and the maximum XOR table entry, XOR*, for
the injective $8 \times 16$ s-boxes constructed by this method by concatenating $F_i$ with $F_j$.
Table 4.3 shows similar results for the $8 \times 24$ s-boxes.

Table 4.2 $\mathcal{NL}_f$, XOR* for $8 \times 16$ S-boxes Obtained Using Construction Method II
i, j    NL_f    XOR*
2,5     96      4
4,5     96      4
3,3     88      4
2,3     96      4
3,5     96      4
1,3     80      2
2,4     96      4
1,3     96      2
1,5     80      2
1,2     96      2

Table 4.3 $\mathcal{NL}_f$, XOR* for $8 \times 24$ S-boxes Obtained Using Construction Method II

For the $8 \times 32$ s-boxes, XOR* = 2 for all the constructions above except for $G_1$ where
it is equal to 4, which may limit the usefulness of $G_1$.
Table 4.4 shows the best s-box nonlinearity obtained by different methods.

Table 4.4 Best S-box Nonlinearity Obtained By Different Construction Methods
Method                   8 x 16    8 x 24    8 x 32
Random                   87        80        73
Mister and Adams [9/     -         -         74
Method I                 96        86        76
Method II                96        96        80

The construction methods proposed in this section can be extended to other highly
nonlinear mappings such as that proposed in [26], [91].
In order to frustrate possible algebraic attacks, the four $8 \times 32$ s-boxes should be generated
using different irreducible polynomials. When using s-boxes obtained from construction
method II, we recommend that the bytes XORed together should correspond to different
degrees. For example, if $G_5$ is used, then the $8 \times 32$ s-boxes may be constructed as follows
$S_1 = (x^3 \| x^5 \| x^7 \| x^{11})$,
$S_2 = (x^5 \| x^7 \| x^{11} \| x^3)$,
$S_3 = (x^7 \| x^{11} \| x^3 \| x^5)$,
$S_4 = (x^{11} \| x^3 \| x^5 \| x^7)$,
such that the exponents form a Latin square [113].
4.3.4 Comments on the Security of the CAST Encryption Algorithm
Figure 4.1 CAST Round Function
Figure 4.1 shows the CAST round function. In this paper we assume that operations
a, b, c and d are XOR addition of 32-bit quantities.
The resistance of CAST-like encryption algorithms [2] constructed using randomly
generated s-boxes against the basic linear cryptanalysis [82] was studied in [75]. The
number of known plaintexts, $N_P$, in a basic linear attack (Algorithm 1 in [82]) required
to give a 97.7% confidence of getting the right key bit is approximately given by
where $\eta$ is the number of s-boxes involved in the R-round linear approximation
expression, $|p_i - \frac{1}{2}| = \frac{2^{n-1} - \mathcal{NL}_S}{2^n}$ and $n$ is the number of input bits to the s-box.
The bounds for $N_P$ can be improved by re-evaluating the expressions in [75] using
the new $8 \times 32$ s-boxes nonlinearity. However a better bound can be obtained by
considering the nonlinearity of the resulting $32 \times 32$ s-boxes. The nonlinearity, $\mathcal{NL}_S$,
can be bounded using the nonlinearity, $\mathcal{NL}_{S_i}$, $1 \le i \le 4$, of the four $8 \times 32$ s-boxes
used in its construction as follows
(4.33)
The exact nonlinearity can be efficiently calculated using the Walsh transforms [6] of
the four $8 \times 32$ s-boxes. Since an R-round linear approximation must involve at least as
many s-boxes as $R/2$ iterations of the best 2-round approximation, the number of $32 \times 32$
s-boxes involved in an R-round linear approximation is at least $R/2$ and hence we have
where $\mathcal{NL}_S$ is the nonlinearity of the $32 \times 32$ s-box.
An important observation, which was overlooked in [87], is that the nonlinearity of the
$32 \times 32$ s-box depends not only on the nonlinearity of the $8 \times 32$ s-boxes used in the
construction, but also on how the four $8 \times 32$ s-boxes interact together. This
means that improving the nonlinearity of the individual $8 \times 32$ s-boxes does not always
guarantee improving the resistance of the cipher to the basic linear cryptanalysis. For
example, when combining the output of the four CAST-128 s-boxes [2] (each with
nonlinearity 74) by XOR, the resulting $32 \times 32$ s-box has nonlinearity 2,132,774,912
which is less than 2,133,721,088, the nonlinearity we found by combining four
randomly selected s-boxes with nonlinearity less than 74. Using such s-boxes we
have $N_P \approx 2^{72} > 2^{64}$ for $R = 8$, which is much higher than $N_P = 2^{34}$ estimated in
[75].
Finally, one should note that the primary motive for this work is obtaining highly
nonlinear injective s-boxes. We are not proposing the use of such s-boxes in CAST-like
ciphers before examining their other cryptographic properties. In fact, we believe that
randomly selected s-boxes are a good choice for CAST-like ciphers².
² O'Connor and Klapper [100] showed that the expected degree of the Algebraic Normal Form of a randomly selected boolean
function $f : Z_2^n \to Z_2$ is equal to n - ; + 0 (&). This suggests that randomly selected $8 \times 32$ s-boxes are more resistant to higher
order differential cryptanalysis than CAST-128 s-boxes.
Chapter 5 Information Leakage and Spectral Properties of Boolean Functions
5.1 Introduction
Several cryptographic criteria have been previously proposed as a measure of the strength
of cryptographic functions. Among these criteria are balance, correlation immunity [126],
resiliency [28], nonlinearity [84], Strict Avalanche Criterion (SAC) [136], higher order
SAC [47], Propagation Criterion (PC), higher order PC [105], Bit Independence Criterion
[135], and Completeness [66].
The above set of cryptographic criteria are not independent of each other and
a cryptographic function that satisfies all these criteria would be a golden one.
Unfortunately, it can be proven that no function can satisfy all the above set of criteria
simultaneously. This can be considered as the main motive for proposing a new set of
criteria based on information theory.
Several design criteria, based on information theory, have been proposed in [37],[48],
11301, and [ M l .
Information leakage can be classified into two classes: static information leakage and
dynamic information leakage. It is argued in [155] that a boolean function is resistant
to statistical analysis (e.g., differential cryptanalysis [15], linear cryptanalysis [82], and
Siegenthaler's correlation attack [125]) if there is no significant static and dynamic
information leakage between its inputs and outputs.
It is worth noting that Brynielsson [24] gives an approximate expression for the expected
value of the mutual information between the output and input subvectors for multi-output
boolean functions. Here we follow the definition of information leakage given
in [155] and give an exact expression for the expected value of different forms of
these information leakages for a randomly selected multi-output boolean function and for
some other combinatorial structures of interest such as regular mappings and injective
mappings.
Gordon and Retkin [50] conjectured that good substitution boxes (s-boxes) may be built
by choosing a random reversible mapping of sufficient size. Their argument is based
on the observation that the probability of accidental linearity occurring in such s-boxes
decreases dramatically as the size of the s-box increases. In this chapter, we provide
further evidence that bigger s-boxes (by bigger we mean s-boxes with a larger number
of inputs) are better by showing that the expected value of information leakage of a
randomly selected boolean function decreases rapidly with the number of input variables.
We also give extended definitions of many cryptographic criteria, such as balance,
correlation immunity, SAC, higher order SAC, and Propagation Criterion, to multi-output
boolean functions.
Finally we study the relationship between the Walsh-Hadamard transform and various
types of information leakage. Conditions on the Walsh transform are given which imply
that the function satisfies certain cryptographic properties of interest.
5.2 Definitions
Entropy and Mutual Information
Entropy
Let $X$ be a discrete random variable with alphabet $\mathcal{X}$ and probability distribution function
$p(x) = \Pr\{X = x\}$, $x \in \mathcal{X}$; then the entropy of the random variable $X$ is defined by
$$H(X) = -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x).$$
If $X$ is a binary random variable with $\Pr(X = 0) = p$, then
$$H(X) = h(p) = -p \log_2 p - (1 - p) \log_2 (1 - p),$$
where $h(p)$ is called the binary entropy function of probability $p$.
Conditional Entropy
Let $Y$ and $X$ be discrete random variables with alphabets $\mathcal{Y}$, $\mathcal{X}$ respectively, and with a
conditional distribution $p(y|x)$; then the conditional entropy $H(Y|X)$ is defined by
$$H(Y|X) = \sum_{x \in \mathcal{X}} p(x)\, H(Y|X = x),$$
where
$$H(Y|X = x) = -\sum_{y \in \mathcal{Y}} p(y|x) \log_2 p(y|x).$$
Mutual Information
The mutual information of discrete random variables $Y$ and $X$ is defined by
$$I(Y; X) = H(Y) - H(Y|X).$$
For a good general reference for information theory, see [30].
Throughout this chapter, let $f : Z_2^n \to Z_2^m$ be a randomly selected boolean function
and let $Y$ denote the output of $f$.
Static Information Leakage
The static information leakage of $Y$, given input subvector $X_k \in Z_2^k$ (i.e., given that we
know $k$ bits of the $n$-bit input vector), is defined by
$$SL(Y|X_k) = m - H(Y|X_k),$$
where $H(Y|X_k)$ is the conditional entropy of $Y$ given $X_k$.
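The definition can be evaluated directly for a small function given as a lookup table. The following Python sketch assumes uniformly distributed inputs and lets the caller choose which input bit positions form $X_k$; the function names and the bit-position convention are illustrative choices, not taken from the thesis.

```python
from collections import Counter
from math import log2


def conditional_entropy(table, n, known_bits):
    """H(Y | X_k) in bits, for uniform inputs; X_k = input bits at positions known_bits."""
    groups = {}
    for x in range(1 << n):
        key = tuple((x >> j) & 1 for j in known_bits)
        groups.setdefault(key, Counter())[table[x]] += 1
    total = 1 << n
    entropy = 0.0
    for counts in groups.values():
        size = sum(counts.values())
        for c in counts.values():
            # p(x_k, y) = c / total and p(y | x_k) = c / size
            entropy -= (c / total) * log2(c / size)
    return entropy


def static_leakage(table, n, m, known_bits):
    """SL(Y | X_k) = m - H(Y | X_k)."""
    return m - conditional_entropy(table, n, known_bits)


if __name__ == "__main__":
    identity = list(range(8))                     # a 3 x 3 bijection
    print(static_leakage(identity, 3, 3, (0,)))   # one input bit known
    print(static_leakage(identity, 3, 3, ()))     # no input bit known
```

For any bijection on $n$ bits, fixing $k$ input bits leaves $2^{n-k}$ equally likely outputs, so the sketch returns a leakage of $k$ bits.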
Remark: It is easy to show that
$$SL(Y|X_k) = m - H(Y) + I(Y; X_k),$$
where $I(Y; X_k)$ is the mutual information between $Y$ and $X_k$. Note that if the mutual
information $I(Y; X_k)$ is used to define the static information leakage, then the minimum
of $I(Y; X_k)$ can be achieved while $H(Y) = 0$, which contradicts our objective.
The self static information leakage of $Y$ is defined as:
$$SSL(Y) = m - H(Y).$$
By noting that $H(Y|X_k) \le n - k$, then from (5.6) we have
$$SL(Y|X_k) \ge m - n + k.$$
One should note that the value of all the forms of information leakage defined above is
greater than or equal to zero, i.e., we always have $SSL(Y) \ge 0$ and $SL(Y|X_k) \ge 0$.
Dynamic Information Leakage
The dynamic information leakage of $\Delta Y$, given the input change vector $\Delta X$, is defined
by:
$$DL(\Delta Y|\Delta X) = m - H(\Delta Y|\Delta X), \qquad (5.10)$$
where $\Delta Y = f(X) \oplus f(X \oplus \Delta X)$.
We also note that the static information leakage $SL(Y|X_k) = 0$ is achieved by $k$th order
resilient functions (see [28], [126] for the definition and properties of resilient functions),
while zero dynamic information leakage for all values of $\Delta X \ne 0$ is achieved only by
perfect nonlinear functions (see [90], [116] for the definition and properties of both bent
and perfect nonlinear functions).
By noting that we always have $H(\Delta Y|0) = 0$, then from (5.10) we have
The equality in the above bound can be achieved only by perfect nonlinear functions
for which $n \ge 2m$.
Let
Assuming that all input vectors are equally probable, we have:
The problem of finding the expected values of the above forms of information leakage
is now reduced to finding the marginal probability distribution of the random variables
$N_y$, $N_{xy}$, and $N_{\Delta x \Delta y}$.
5.3 Information Leakage of a Randomly Selected Boolean Function
Lemma 5.1
Let $Y$ be the output of a randomly selected boolean function $f : Z_2^n \to Z_2^m$; then we
have the following probabilities:
Proof: The proof of the above lemma follows by noting that $N_y$, $N_{xy}$, and $N_{\Delta x \Delta y}/2$
follow the multinomial distribution. □
Theorem 5.1
The expected values of the static and dynamic information leakage of a randomly selected
boolean function $f : Z_2^n \to Z_2^m$ are given respectively by
Proof: Theorem 5.1 follows directly from the definition of the expected value, and (for
part 2) by noting that $\Delta Y = 0$ for $\Delta X = 0$, and hence $H(\Delta Y|0) = 0$. □
Figures 5.2 and 5.3 show the expected value of the self static information leakage and
the expected value of the static leakage given that half the input bits are known. From
these graphs, it is clear that the relative dimensions of the boolean functions (i.e., the
ratio between $n$ and $m$) greatly affect different forms of information leakage.
Based on the results above, one cannot conclude that s-boxes with $n > m$ are better than
s-boxes with $n < m$ because of the method we used in the normalization step (dividing by
the number of output bits to get information leakage per output bit). Moreover, s-boxes
with $n < m$ provide better diffusion characteristics, and may be used in SPNs with no
permutation layers [3], which leads to faster software implementation. The conclusion
that we can make at this time is that all forms of information leakage seem to decrease
with the number of input variables.
Using Theorem 5.1, one can derive an upper bound for the information leakage of
a randomly selected single output boolean function. Single output functions are of
practical interest, especially for the combining functions in stream ciphers.
Figure 5.1 Lower Bound for the Binary Entropy Function
Corollary 5.1
Let $Y$ be the output of a randomly selected single output boolean function; then the
expected values of both the static leakage and dynamic leakage are bounded by
Proof: The above corollary follows by direct substitution into Theorem 5.1 and by noting
that for the binary entropy function
$$h(t) = -t \log_2(t) - (1 - t) \log_2(1 - t), \qquad (5.17)$$
we have (see Figure 5.1)
Figure 5.2 Expected Value of $SSL(Y)$ for a Randomly Selected Boolean Function
Figure 5.3 Expected Value of $SSL(Y)$ and Expected Value
of $SL(Y|X_{n/2})$ for a Randomly Selected Boolean Function (horizontal axis: Number of Inputs ($n$))
5.4 Information Leakage of a Randomly Selected Balanced Boolean Function
In this section, we calculate the expected values of both the dynamic leakage and the
static leakage for regular (balanced) functions.
Lemma 5.2
Let Y be a randomly selected balanced function $f : Z_2^n \to Z_2^m$, $n \ge m$, then we have
where $B(n, m) = \frac{2^n!}{(2^{n-m}!)^{2^m}}$ is the number of n x m balanced boolean functions.
Proof: For n x m balanced boolean functions, we have $N_y = 2^{n-m}$. If we fix k input
variables, then there are $\binom{2^{n-k}}{i}\binom{2^n - 2^{n-k}}{2^{n-m} - i}$ ways of arranging the output such that Y = y
when $X_k = x$ for i times. The remaining $(2^n - 2^{n-m})$ outputs, of which there are only
$(2^m - 1)$ distinct ones, can be permuted in $\frac{(2^n - 2^{n-m})!}{(2^{n-m}!)^{2^m - 1}}$ ways. □
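The counting argument in this proof can be checked numerically for small dimensions. The following Python sketch is an illustration only and is not part of the original derivation; the function names are hypothetical, and the binomial and factorial expressions are a reading of the proof text. It evaluates B(n, m), compares it with a brute-force count, and checks that the per-i counts add up to B(n, m).

```python
from itertools import product
from math import comb, factorial


def balanced_count(n, m):
    """B(n, m) = (2^n)! / ((2^(n-m))!)^(2^m)."""
    return factorial(2 ** n) // factorial(2 ** (n - m)) ** (2 ** m)


def brute_force_balanced(n, m):
    """Count n x m functions in which every output symbol occurs 2^(n-m) times."""
    target = 2 ** (n - m)
    count = 0
    for table in product(range(2 ** m), repeat=2 ** n):
        if all(table.count(y) == target for y in range(2 ** m)):
            count += 1
    return count


def split_count(n, m, k, i):
    """Balanced functions with Y = y exactly i times among the inputs with X_k = x."""
    inside = comb(2 ** (n - k), i)
    outside = comb(2 ** n - 2 ** (n - k), 2 ** (n - m) - i)
    # Arrangements of the remaining outputs over the other 2^m - 1 symbols.
    rest = factorial(2 ** n - 2 ** (n - m)) // factorial(2 ** (n - m)) ** (2 ** m - 1)
    return inside * outside * rest
```

Summing `split_count(n, m, k, i)` over all admissible i gives back `balanced_count(n, m)`, which is what makes the counts usable as a probability distribution over i.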
Corollary 5.2
Let Y be a randomly selected bijective mapping $\pi : Z_2^n \to Z_2^n$, then the expected value of
the static information leakage of Y given the input subvector $X_k$, $0 \le k \le n$, is given by
Proof: The proof follows directly by substituting (5.19), with n = m, into the first part
of (5.15). A simpler proof (independent of lemma 5.2) follows by noting that for any
arbitrary bijective function $\pi : Z_2^n \to Z_2^n$, if we fix k input bits, we will have $2^{n-k}$
different output symbols with $H(Y \mid X_k) = n - k$. □
From the proof of the lemma above, it follows that we have
for any arbitrary bijective fimction.
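The entropy claim used in this proof can be illustrated with a short Python sketch. It is a minimal check under stated assumptions, not the author's procedure: inputs are taken to be uniformly distributed, a function is stored as a lookup table, and the helper name `conditional_entropy` is hypothetical.

```python
import random
from collections import Counter
from math import log2


def conditional_entropy(table, n, fixed_bits):
    """H(Y | X_k) in bits for uniform inputs; X_k is the subvector of input
    bits at the positions listed in fixed_bits."""
    groups = {}
    for x in range(2 ** n):
        key = tuple((x >> b) & 1 for b in fixed_bits)
        groups.setdefault(key, []).append(table[x])
    total = 0.0
    for outputs in groups.values():
        size = len(outputs)
        h = -sum((c / size) * log2(c / size) for c in Counter(outputs).values())
        total += (size / 2 ** n) * h
    return total


def random_bijection(n, seed):
    """A random permutation of the n-bit values, as a lookup table."""
    rng = random.Random(seed)
    table = list(range(2 ** n))
    rng.shuffle(table)
    return table
```

For any bijection on n bits, fixing k input bits leaves 2^(n-k) equally likely and distinct outputs, so the function returns n - k regardless of which bijection or which k positions are chosen.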
In chapter 3, we derived expressions for the number of balanced functions with
$N_{\Delta x \Delta y} = 2i$, $0 \le i \le 2^{n-1}$. Using these results we have, for $\Delta x \ne 0$,
where -4 ,,,, ay (i), A ,,,, (i) are given by equations (3.13) and (3.14).
Thus the expected value of the dynamic leakage of a randomly selected balanced function
is given by
Theorem 5.3
DL(AY 1 AX) = m
Corollary 5.2
Let Y be the output of a randomly selected bijective mapping $\pi : Z_2^n \to Z_2^n$, then
the expected value of the dynamic information leakage given the input change vector
$\Delta X$ is given by
Proof: The corollary above is a special case (with n = m) of theorem (5.3). □
Remark: Note that $N_{\Delta x\,0} = 0$ as each output symbol occurs once.
For m = n, the minimum dynamic information leakage is achieved by functions with
differentially 2-uniform XOR table. Hence, using equation (5.13), we have
"
n - l
Figure 5.4 shows a comparison between the expected value of dynamic information
leakage of a randomly chosen n x n bijective mapping and that of a randomly chosen
function of the same dimensions.
Figure 5.4 Expected Value of $DL(\Delta Y \mid \Delta X)$ for an n x n Random Mapping and an n x n Random Bijective Mapping (horizontal axis: number of inputs n)
5.5 Information Leakage of a Randomly Selected Injective Boolean Function
In this section, we calculate the expected values of both the dynamic leakage and the
static leakage for injective functions.
Theorem 5.3
Let Y be the output of a randomly selected injective function $f : Z_2^n \to Z_2^m$, $n \le m$, then
Proof: The theorem follows by noting that for any arbitrary injective function
$f : Z_2^n \to Z_2^m$, if we fix k input bits, we have $2^{n-k}$ different output symbols with
$H(Y \mid X_k) = n - k$. □
From the proof of the theorem above, it follows that we have
$SL(Y \mid X_k) = (m - n) + k$ for any arbitrary injective function.
Lemma 5.3
The number of injective functions with $N_{\Delta x \Delta y} \ge 2k$, $\Delta x \ne 0$, $\Delta y \ne 0$, is upper
bounded by
where
Proof: We will count the number of ways to construct an injective function such that
$N_{\Delta x \Delta y} \ge 2k$, $\Delta x \ne 0$, $\Delta y \ne 0$. There are $\binom{2^{n-1}}{k}$ possible choices for the x positions
of these k pairs and there are choices for such pairs. We also have 2 possible
orders for each pair, giving $2^k$ total possible orders. The rest of the output symbols can
be picked up in any order from the remaining $2^m - 2k$ unused symbols, i.e., they can be
picked in $In(2^m - 2k, 2^n - 2k)$ ways. □
Using the Inclusion-Exclusion principle, we get
Lemma 5.4
The exact number of injective functions with $N_{\Delta x \Delta y} = 2k$, $\Delta x \ne 0$, $\Delta y \ne 0$, is given by
Remark: Note that $N_{\Delta x\,0} = 0$ for $\Delta x \ne 0$, and hence $A_{n,m,0} = 0$.
Theorem 5.4
$DL(\Delta Y \mid \Delta X) =$
where $In(2^m, 2^n)$, the number of n x m injective boolean functions, is given by equation
(5.29).
Numerical substitution into theorem (5.4) shows that the dynamic information leakage
of a randomly selected injective function decreases with the number of input variables.
This rate of decrease is very similar to that of a randomly selected boolean function with
the same number of inputs and outputs, especially for n << m. This can be explained
by noting that for n << m, a randomly selected function is most likely to be injective.
5.6.1 Relation Between the Walsh Transform and Different Forms of Information
Leakage
Let y be the output of a boolean function $f : Z_2^n \to Z_2^m$. Let $N_y$, $N_{xy}$ and $N_{\Delta x \Delta y}$
be as defined in equation (5.12).
Recall that
Thus for w = 0 we have
Multiplying both sides by $(-1)^{c \cdot y}$ and summing over c we get
By noting that
$\sum_c (-1)^{c \cdot (f(x) \oplus y)} = \begin{cases} 2^m, & f(x) = y, \\ 0, & \text{otherwise,} \end{cases}$
we get
and hence
If $w_i$ denotes the vector with 1 in position i and zeros otherwise, then
$2^{n/2} F_c(w_i) = \sum_y (N_{0y} - N_{1y})(-1)^{c \cdot y}$, (5.38)
where $N_{by}$ denotes the number of times the symbol y appears and $x_i = b$. Thus we have
We also have
Adding and subtracting equations (5.39) and (5.40) we get
Similarly we can prove that
where $w^0$ denotes the n-dimensional vector obtained by completing the k-dimensional
subvector w with zeros. For example if n = 6, $x = \{x_0, x_2, x_5\}$ then $w^0 =
\{w_0, 0, w_2, 0, 0, w_5\}$.
Recall that the autocorrelation function is related to the Walsh transform as follows:
Multiplying both sides by $(-1)^{c \cdot \Delta y}$ and summing over c we get
The left hand side of the equation above can be expressed as
$\sum_c r_{c \cdot f}(\Delta x)(-1)^{c \cdot \Delta y}$
By noting that
$\sum_c (-1)^{c \cdot (f(x) \oplus f(x \oplus \Delta x) \oplus \Delta y)} = \begin{cases} 2^m, & f(x) \oplus f(x \oplus \Delta x) = \Delta y, \\ 0, & \text{otherwise,} \end{cases}$
we get
$= 2^{m-n}\,\#\{x \mid f(x) \oplus f(x \oplus \Delta x) = \Delta y\} = 2^{m-n} N_{\Delta x \Delta y}$,
and hence
Using the above results we get the following theorem:
Theorem 5.5
Let y be the output of a boolean function $f : Z_2^n \to Z_2^m$, then the different forms of
information leakage of y can be expressed as:
where $N_y$, $N_{xy}$, $N_{\Delta x \Delta y}$ are given by
Theorem (5.5) expresses the static and dynamic information leakage of a multi-output
boolean function in terms of the Walsh transform of the nonzero linear combinations
of its output coordinates. It is worth noting that static leakage of order k, $SL(y \mid x_k)$,
depends on every Walsh coefficient $F_c(w)$ for $c \in Z_2^m$, $wt(w) \le k$, and by noting that
$SL(y \mid x_{k+1}) \ge SL(y \mid x_k)$ this implies that the Walsh coefficients for w with smaller
Hamming weight have more impact on the static information leakage. It is also clear
that minimizing the dynamic information leakage can be achieved by having the energy
spectrum $F_c^2(w)$ as flat as possible. Similar results for single output boolean functions
were reported by Forré [48].
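The link between the counts $N_y$, $N_{\Delta x \Delta y}$ and the Walsh coefficients of the linear combinations c.f can be checked numerically. The sketch below is an illustration under assumptions that are not taken from the thesis: it uses the unnormalised transform $W_c(w) = \sum_x (-1)^{c \cdot f(x) \oplus w \cdot x}$ so that all arithmetic stays in integers (the text uses a transform scaled by $2^{-n/2}$), and all function names are hypothetical.

```python
import random


def dot(a, b):
    """Inner product over GF(2) of two bit vectors packed into integers."""
    return bin(a & b).count("1") & 1


def walsh(table, n, c, w):
    """Unnormalised Walsh coefficient of the linear combination c.f at w."""
    return sum((-1) ** (dot(c, table[x]) ^ dot(w, x)) for x in range(2 ** n))


def count_y_from_walsh(table, n, m, y):
    """N_y recovered from the coefficients at w = 0 of all combinations c.f."""
    total = sum(walsh(table, n, c, 0) * (-1) ** dot(c, y) for c in range(2 ** m))
    return total // 2 ** m


def xor_entry_from_walsh(table, n, m, dx, dy):
    """XOR table entry recovered from the energy spectra of all combinations c.f."""
    total = 0
    for c in range(2 ** m):
        for w in range(2 ** n):
            sign = (-1) ** (dot(w, dx) ^ dot(c, dy))
            total += sign * walsh(table, n, c, w) ** 2
    return total // 2 ** (n + m)


def xor_entry_direct(table, n, dx, dy):
    """XOR table entry counted directly from the definition."""
    return sum(1 for x in range(2 ** n) if table[x] ^ table[x ^ dx] == dy)


def random_function(n, m, seed):
    """A random n x m function as a lookup table."""
    rng = random.Random(seed)
    return [rng.randrange(2 ** m) for _ in range(2 ** n)]
```

Because the XOR table entries depend on the coefficients only through their squares, a flat energy spectrum for every nonzero c forces the entries toward the uniform value, which is the observation made in the paragraph above.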
5.6.1.1 Extended Definitions
In this section we consider some extended definitions for some other cryptographic
criteria to the multi-output boolean function $f : Z_2^n \to Z_2^m$.
As an illustration for how one can extend the definitions of cryptographic criteria, consider
the Strict Avalanche Criterion (SAC) defined for single-output boolean functions as:
A function $f : Z_2^n \to Z_2$ satisfies the SAC if whenever a single input bit is complemented,
the output bit changes with a probability of one half [136].
The natural extended definition of SAC for multi-output boolean functions would be:
A function $f : Z_2^n \to Z_2^m$ satisfies SAC if whenever a single input bit is complemented,
the resulting output change vector occurs with a probability $p = \frac{1}{2^m}$.
The same concept can be applied to other cryptographic criteria such as balance,
Propagation Criterion [105], correlation immunity [126,140], higher order SAC, and
higher order Propagation Criterion [105].
The following is a summary of these extended definitions.
A balanced function is defined as a function for which $N_y = 2^{n-m}$.
A k resilient function is a function for which $N_{xy} = 2^{n-k-m}$ for all $x \in Z_2^k$.
A SAC-fulfilling function is a function for which we have $N_{\Delta x \Delta y} = 2^{n-m}$ for all
$\Delta x$ with $wt(\Delta x) = 1$.
A function satisfies propagation criterion of degree k (PC-k) if $N_{\Delta x \Delta y} = 2^{n-m}$ for
all $\Delta x$ with $1 \le wt(\Delta x) \le k$.
Using the above extended definitions, let criterion "C" be any of the following: balance,
correlation immunity, Strict Avalanche Criterion (SAC), higher order SAC, Propagation
Criterion (PC), higher order PC, or perfect nonlinearity. Then we have
Theorem 5.6
If y is the output of a multi-output boolean function then y satisfies criterion C if and
only if every nonzero linear combination of its output coordinates satisfies criterion C.
Proof: For balanced functions we have $N_y = 2^{n-m}$. By noting that $F_0(0) = 2^{n/2}$,
then we have
$2^{n-m} = N_y = 2^{n-m} + 2^{n/2-m} \sum_{c \ne 0} F_c(0)(-1)^{c \cdot y}$
and hence
As the equation above should be satisfied for all $y \in Z_2^m$ then we must have $F_c(0) = 0$
for all $c \ne 0$, which implies that the function $c \cdot f$, $c \ne 0$, is a balanced function.
Similarly for k-resilient functions we have
$2^{n-k-m} = N_{xy} = 2^{n/2-m-k} \sum_{c,w} F_c(w^0)(-1)^{w \cdot x + c \cdot y}$
By noting that
then the following equation must be satisfied for all $y \in Z_2^m$
This can be achieved if and only if
In other words, we must have $F_c(w) = 0$ for $k \ge wt(w) \ge 1$.
For SAC fulfilling functions we have for $wt(\Delta x) = 1$
Substituting with the value of $F_0(w) = 2^{n/2}\delta(w)$ we get
As the above equation should hold for all $\Delta y \in Z_2^m$ then we must have
This means that a multi-output function fulfills SAC if and only if every non-zero linear
combination of its output coordinates fulfills SAC.
The rest of the theorem follows using a similar argument.
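Theorem 5.6 can be exercised on small examples. The following Python sketch is an illustration only; the helper names are hypothetical, and the test function `GF4_PRODUCT` (multiplication in GF(4) viewed as a 4 x 2 function) is an example chosen here, not one taken from the thesis. It checks balance and SAC on a multi-output function directly from the extended definitions and on each nonzero linear combination of its output coordinates.

```python
import random
from collections import Counter


def dot(a, b):
    """Inner product over GF(2) of two bit vectors packed into integers."""
    return bin(a & b).count("1") & 1


def is_balanced(table, n, m):
    """Every output symbol occurs exactly 2^(n-m) times."""
    counts = Counter(table)
    return all(counts.get(y, 0) == 2 ** (n - m) for y in range(2 ** m))


def satisfies_sac(table, n, m):
    """For each single-bit input change, every output change occurs 2^(n-m) times."""
    for bit in range(n):
        dx = 1 << bit
        diffs = [table[x] ^ table[x ^ dx] for x in range(2 ** n)]
        if not is_balanced(diffs, n, m):
            return False
    return True


def component(table, c):
    """Single-output function c.f for the coefficient vector c."""
    return [dot(c, y) for y in table]


def all_components(check, table, n, m):
    """Apply a single-output criterion to every nonzero linear combination."""
    return all(check(component(table, c), n, 1) for c in range(1, 2 ** m))


def gf4_mul(a, b):
    """Multiplication in GF(4) modulo x^2 + x + 1."""
    p = 0
    for i in range(2):
        if (b >> i) & 1:
            p ^= a << i
    if p & 4:
        p ^= 7
    return p


def random_table(n, m, seed):
    rng = random.Random(seed)
    return [rng.randrange(2 ** m) for _ in range(2 ** n)]


# 4 x 2 example: the two input halves are multiplied in GF(4).
GF4_PRODUCT = [gf4_mul(z & 3, z >> 2) for z in range(16)]
```

For any table, `satisfies_sac(table, n, m)` and `all_components(satisfies_sac, table, n, m)` should agree, and likewise for `is_balanced`, which is the statement of the theorem for these two criteria.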
5.7 Conclusion
Many of the previously known cryptographic criteria are related to information leakage.
Most of these criteria require zero information leakage in some domain. However, they
often constrain the function to such an extent that large information leakage of other
types becomes likely. These leakages provide useful information for the cryptanalyst to
develop attacks on the cipher. This motivates the minimization of information leakage
as a general criterion for cryptographic functions.
We have derived expressions for the expected values of the static and dynamic information
leakage of randomly selected boolean functions and for randomly selected balanced and
injective boolean functions. Based on this we showed that the expected values of the
information leakages decrease dramatically with the number of input variables. In some
cases, we showed that this decrease is exponential. With the same approach developed in
this chapter, one can show that the variance of different forms of information leakage also
decreases dramatically with the number of input variables. In fact, one can also show that
the expected maximum value of different forms of information leakage decreases with the
number of input variables. This indicates that cryptographically strong boolean functions
may be obtained by choosing random mappings of sufficiently large dimensions.
The generalized result in theorem (5.6) confirms that the cryptographic strength of a
multi-output boolean function depends on the strength of every nonzero linear combination
of its output coordinates.
Chapter 6 A New Class of Substitution-Permutation Networks
6.1 Introduction
Feistel [44] was the first to suggest that a basic substitution-permutation network (SPN)
consisting of iterative rounds of nonlinear substitutions (s-boxes) connected by bit
permutations was a simple, effective implementation of a private-key block cipher.
The SPN structure is directly based on Shannon's principle of a mixing transformation
using the concepts of "confusion" and "diffusion" [123]. Letting N represent the block
size of a basic SPN consisting of R rounds of n x n s-boxes, a simple example of an
SPN with N = 16, n = 4, and R = 3 is illustrated in Figure 6.1. Keying the network
can be accomplished by XORing the key bits with the data bits before each round of
substitution and after the last round. The key bits associated with each round are derived
from the master key according to the key scheduling algorithm.
Figure 6.1 SPN with N = 16, n = 4, and R = 3.
One advantage of the basic SPN model is that it is a simple, yet elegant, structure for
which it is generally possible to prove security properties. Indeed, it has been shown
that a basic SPN can be constructed to possess good cryptographic properties such as
completeness or nondegeneracy [66], adherence to the avalanche criterion [61], and
resistance to differential and linear cryptanalysis [60].
The basic SPN architecture differs from a DES-like architecture in which the substitutions
and permutations, used as a mixing transformation, operate on only half of the block at
a time. Since SPNs do not have this last property, in general, SPNs need two different
modules for the encryption and the decryption operations. In an SPN, decryption is
performed by running the data backwards through the inverse network (i.e., applying
the key scheduling algorithm in reverse and using the inverse s-boxes and the inverse
permutation layer). In a DES-like cipher, the inverse s-boxes and inverse permutation
are not required. Hence, a practical disadvantage of the basic SPN architecture compared
with the DES-like architecture is that both the s-boxes and their inverses must be located
in the same encryption hardware or software. The resulting extra memory or power
consumption requirements may render this solution less attractive in some situations,
especially for hardware implementations.
One proposal to overcome this problem is to use a single s-box and its inverse for both the
encryption and the decryption. This idea was employed in SAFER [79]. Unfortunately, in
SAFER, the encryption and the decryption are different and one still needs two different
hardware modules.
In this chapter, we introduce a special class of substitution-permutation networks. This
class has the advantage that the same network can be used to perform both the encryption
and the decryption operations. The basic idea is to use involution substitution layers and
involution permutation layers or linear transformations. We investigate the resistance of
these networks to both differential and linear cryptanalysis: it is shown that using an
appropriate linear transformation between rounds is effective in improving the security
of the SPNs in relation to these two attacks. Further results suggest that the cyclic
properties of the overall network are not negatively influenced by the cyclic properties
of the involution s-boxes. As well, a key scheduling algorithm is proposed that has
the advantages of preventing weak keys and ensuring that, given that key bits in a
particular round are compromised, it is hard to get any information about the key bits
of other rounds.
6.2.1 Semi-Involution Functions
It is possible to construct SPNs which do not require inverse s-boxes if the s-boxes in the
network belong to the class of functions that we refer to as semi-involution functions.
Such functions have the property that their inverses can be easily obtained by a simple
XOR operation on the function input and output. Hence, differences between the s-boxes
in the encryption network and the decryption network can be accommodated by
incorporating the XOR into the application of the round key bits.
Definition: A bijective function $\pi : Z_2^n \to Z_2^n$ is called a semi-involution function if
$\pi^{-1}(x) = \pi(x \oplus a) \oplus b$ (6.1)
for some constants $a, b \in Z_2^n$.
Involution functions are the sub-class of semi-involution functions for which a = b = 0.
Lemma 6.1
A semi-involution function as defined above has $a \oplus b$ as a linear structure.
Proof: Let $y = \pi^{-1}(x)$ and, from (6.1), we have $y \oplus b = \pi(x \oplus a)$. Therefore,
$x = \pi^{-1}(y \oplus b) \oplus a$. Hence, $\pi(y) = \pi^{-1}(y \oplus b) \oplus a$. Now replacing $y \oplus b$ with x
gives $\pi(x \oplus b) = \pi^{-1}(x) \oplus a$. From (6.1), $\pi(x \oplus a) \oplus b = \pi(x \oplus b) \oplus a$. Replacing
x with $x \oplus b$ gives
$\pi(x \oplus a \oplus b) = \pi(x) \oplus a \oplus b$,
which is the definition of a linear structure [43], [84].
Thus a semi-involution function has $N_{\Delta x \Delta y} = 2^n$, where $N_{\Delta x \Delta y}$ is the XOR difference
distribution table entry [15] for input $\Delta x = a \oplus b$ and $\Delta y = a \oplus b$. For $a \oplus b \ne 0$ this
renders the SPN trivially broken by differential cryptanalysis. This means that, if we want
to use the same SPN for both the encryption and decryption, then only semi-involution
s-boxes with a = b can be used.
The following lemma shows how the useful class of semi-involution functions can be
obtained from involution functions.
Lemma 6.2
Let $\delta : Z_2^n \to Z_2^n$ be an involution function, then the function $\pi(x) = \delta(x) \oplus a$ is a
semi-involution function such that a = b, i.e., $\pi^{-1}(x) = \pi(x \oplus a) \oplus a$.
Proof: From the definition of involution functions, $\delta^2(x) = x$. Hence, $\pi(\pi(x) \oplus a) \oplus a = x$.
Replacing x with $x \oplus a$ gives $\pi(x \oplus a) \oplus a = \pi^{-1}(x)$. □
Lemma (6.2) is important, not only because it provides an easy way to generate the
useful class of semi-involution functions from involution functions, but also because it
implies that the functions $\delta(x)$ and $\pi(x)$ belong to the same cryptographic class and
hence they have the same linear approximation table [82], and the same XOR difference
distribution table [15].
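Lemma 6.2 is easy to check experimentally. The sketch below is a minimal illustration with hypothetical helper names; the way the random involution is drawn (pairing up shuffled points, which gives no fixed points) is an assumption made here for brevity and is not prescribed by the text.

```python
import random


def random_involution(n, rng):
    """Random involution on n-bit values built from disjoint 2-cycles only."""
    points = list(range(2 ** n))
    rng.shuffle(points)
    table = [0] * 2 ** n
    for i in range(0, len(points), 2):
        u, v = points[i], points[i + 1]
        table[u], table[v] = v, u
    return table


def add_constant(table, a):
    """XOR the constant a onto the involution output."""
    return [y ^ a for y in table]


def inverse(table):
    """Inverse lookup table of a bijection."""
    inv = [0] * len(table)
    for x, y in enumerate(table):
        inv[y] = x
    return inv


def xor_table_max(table):
    """Largest XOR difference distribution table entry over nonzero input changes."""
    size = len(table)
    best = 0
    for dx in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[table[x] ^ table[x ^ dx]] += 1
        best = max(best, max(counts))
    return best
```

With `pi = add_constant(delta, a)`, the inverse of `pi` can be evaluated as `pi[x ^ a] ^ a`, so a decryption network can reuse the encryption s-box and absorb the constant into the round key XOR.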
The only cryptographic difference between involution s-boxes and semi-involution
s-boxes with a = b, $a \ne 0$, is their cyclic properties. All cycles of involution functions
have length one or two. In SPNs where the key bits are XORed with the data bits at
the s-box input, if we assume that all the key bits are equi-probable, then both the SPNs
built using semi-involution s-boxes with $a = b \ne 0$ and the SPNs built using involution
s-boxes will have the same cryptographic properties. In the rest of the chapter we will
focus on the class of SPNs that use involution s-boxes.
An interesting class of involution mappings is the inversion mapping in $GF(2^n)$ defined
as [91]:
$\pi(x) = \begin{cases} x^{-1}, & x \ne 0, \\ 0, & x = 0. \end{cases}$ (6.3)
Different cryptographic properties of this mapping were studied in [91]. This inversion
mapping is differentially 2-uniform if n is odd and it is differentially 4-uniform if n is
even. The nonlinearity of this mapping is given by $NL(\pi) \ge 2^{n-1} - 2^{n/2}$.
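The stated properties of the inversion mapping can be reproduced with a short Python sketch. Everything named here is an assumption for illustration: the reduction polynomial 0x11B ($x^8 + x^4 + x^3 + x + 1$) is one of the irreducible polynomials of degree 8 and is not claimed to be the one used in the thesis, zero is mapped to zero, and the function names are hypothetical.

```python
def gf_mul(a, b, n=8, poly=0x11B):
    """Multiply two elements of GF(2^n) modulo the given irreducible polynomial."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a >> n:
            a ^= poly
    return p


def gf_inv(a, n=8, poly=0x11B):
    """Inverse in GF(2^n) computed as a^(2^n - 2); zero is mapped to zero."""
    if a == 0:
        return 0
    result, base, e = 1, a, 2 ** n - 2
    while e:
        if e & 1:
            result = gf_mul(result, base, n, poly)
        base = gf_mul(base, base, n, poly)
        e >>= 1
    return result


def inversion_sbox(n=8, poly=0x11B):
    return [gf_inv(x, n, poly) for x in range(2 ** n)]


def max_xor_entry(table):
    """Largest XOR table entry over nonzero input changes."""
    size = len(table)
    best = 0
    for dx in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[table[x] ^ table[x ^ dx]] += 1
        best = max(best, max(counts))
    return best


def walsh_spectrum(signs):
    """Fast Walsh-Hadamard transform of a +1/-1 sequence."""
    a = list(signs)
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                u, v = a[j], a[j + h]
                a[j], a[j + h] = u + v, u - v
        h *= 2
    return a


def nonlinearity(table, n):
    """Minimum distance of any nonzero output combination to the affine functions."""
    worst = 0
    for c in range(1, 2 ** n):
        signs = [1 - 2 * (bin(c & y).count("1") & 1) for y in table]
        worst = max(worst, max(abs(v) for v in walsh_spectrum(signs)))
    return (2 ** n - worst) // 2
```

For n = 8 the lower bound $2^{n-1} - 2^{n/2}$ evaluates to 112, and the s-box built above is an involution, which is why this family fits the networks studied in this chapter.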
The above class of s-boxes can be generated using different irreducible polynomials. The
number of monic polynomials of degree n which are irreducible over GF(q), where q
is any prime power, is given by [11], [78]:
$N_q(n) = \frac{1}{n} \sum_{d \mid n} \mu(d)\, q^{n/d}$,
where $\mu(d)$ is the Mobius function given by
$\mu(d) = \begin{cases} 1, & d = 1, \\ (-1)^i, & d \text{ is a product of } i \text{ distinct primes,} \\ 0, & \text{otherwise.} \end{cases}$
For n = 8, we have 30 irreducible polynomials of degree 8 and hence we can generate
30 such s-boxes. All these 30 s-boxes have nonlinearity equal to 112 and maximum
XOR table entry equal to 4. In order to frustrate possible algebraic attacks, the SPN
should use s-boxes generated using different irreducible polynomials. Another approach
is to use randomly generated s-boxes so that the overall cipher would not have any easy
algebraic description. In section 6.2.3 we study some of the cryptographic properties of
such randomly generated involution s-boxes.
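The count of 30 irreducible polynomials of degree 8 can be reproduced in a few lines. This Python sketch is illustrative; it uses the standard Mobius-function counting formula for monic irreducible polynomials over GF(q) and cross-checks it against a brute-force search over GF(2), with hypothetical function names.

```python
def mobius(d):
    """Mobius function: 0 if d has a squared prime factor, else (-1)^(number of primes)."""
    result = 1
    p = 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:
                return 0
            result = -result
        p += 1
    if d > 1:
        result = -result
    return result


def count_irreducible(n, q=2):
    """Number of monic irreducible polynomials of degree n over GF(q)."""
    total = sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n


def poly_mod(a, b):
    """Remainder of a modulo b for polynomials over GF(2) packed into integers."""
    while a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a


def is_irreducible_gf2(p):
    """Trial division by every polynomial of degree 1 up to deg(p) // 2."""
    degree = p.bit_length() - 1
    for d in range(2, 1 << (degree // 2 + 1)):
        if poly_mod(p, d) == 0:
            return False
    return True


def brute_force_count_gf2(n):
    """Count irreducible polynomials of degree n over GF(2) by exhaustive search."""
    return sum(1 for p in range(1 << n, 1 << (n + 1)) if is_irreducible_gf2(p))
```

Each of the degree-8 polynomials found by `is_irreducible_gf2` could be used as the reduction polynomial of a different inversion s-box.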
Lemma 6.3
The number of involution functions $\pi : Z_2^n \to Z_2^n$ is given by
Proof: An involution function can only have an even number of fixed points¹. There
are $\binom{2^n}{2i}$, $0 \le i \le 2^{n-1}$, ways to specify any of these 2i fixed points. Note also that
an involution function with 2i fixed points must have $2^{n-1} - i$ cycles² of length 2. An
involution function is completely defined by specifying its fixed points and a single point
on each of its $2^{n-1} - i$ cycles. Now, we will count the number of ways of assigning these
$2^n - 2i$ points along the $2^{n-1} - i$ cycles. To choose the first point, pick any arbitrary
point $x_0 \in Z_2^n$ such that $x_0$ is not equal to any of the assigned fixed points. Choose a
random value $r_0 \in Z_2^n$ for $\pi(x_0)$. $r_0$ should not be equal to any of the fixed points. It also
should not be equal to $x_0$. Thus there are $(2^n - 2i - 1)$ ways to choose $r_0$. To choose a
second point, pick another arbitrary point $x_1$ such that $\pi(x_1)$ has not been assigned yet
(this also ensures that it belongs to a new distinct cycle) and pick a random $r_1 \in Z_2^n$ for
$\pi(x_1)$. Again, $r_1$ should satisfy the following conditions: $r_1 \ne x_1$ and it should not be
equal to any of the previously assigned values for $\pi$. Proceeding as above, we have
ways of assigning these points. Hence, the number of involution functions is given by³
which proves the lemma.
¹ The number of fixed points of $\pi : Z_2^n \to Z_2^n$ is equal to $\#\{x \in Z_2^n \mid \pi(x) = x\}$.
² A cycle of $\pi : Z_2^n \to Z_2^n$ is a sequence of elements $x_1, x_2, \ldots, x_L$ such that $x_{l+1} = \pi(x_l)$ for $1 \le l \le (L - 1)$,
and $x_1 = \pi(x_L)$, but $x_1 \ne \pi(x_l)$ for $1 \le l \le (L - 1)$. The length of the cycle is L. Let d(x) be the length of
the cycle to which x belongs, and let $C_1, C_2, \ldots$ be the distinct cycles of $\pi$. Then the average cycle length is given by
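The counting argument in this proof can be verified for small n. The closed form used below is a reading of the proof, not a formula copied from the text: choose the 2i fixed points, then pair up the remaining $2^n - 2i$ points, which can be done in $(2^n - 2i - 1)(2^n - 2i - 3) \cdots 1$ ways. The function names are hypothetical.

```python
from itertools import permutations
from math import comb


def pairings(points):
    """Number of ways to split an even number of points into 2-cycles."""
    result = 1
    for odd in range(points - 1, 0, -2):
        result *= odd
    return result


def involutions_with_fixed_points(n, i):
    """Involutions on n-bit values with exactly 2i fixed points."""
    size = 2 ** n
    return comb(size, 2 * i) * pairings(size - 2 * i)


def count_involutions(n):
    """Total number of involutions on n-bit values."""
    return sum(involutions_with_fixed_points(n, i) for i in range(2 ** (n - 1) + 1))


def brute_force_involutions(n):
    """Exhaustive count over all permutations (only practical for n <= 3)."""
    size = 2 ** n
    return sum(
        1
        for p in permutations(range(size))
        if all(p[p[x]] == x for x in range(size))
    )
```

The term with $i = 2^{n-1}$ contributes exactly one function, the identity mapping with $2^n$ fixed points.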
³ The first term in the equation above stands for the identity bijection mapping with $2^n$ fixed points.
6.2.2 Equivalence Classes
Two s-boxes $\pi_1$, $\pi_2$ are said to belong to the same cryptographic class [130] if
$\pi_2(x) = \pi_1(x \oplus a) \oplus b$
for arbitrary constants $a, b \in Z_2^n$.
The use of s-boxes within the same cryptographic classes was suggested as a means
to design SPNs that are resistant to differential cryptanalysis [130]. Unfortunately,
involution s-boxes cannot be used in such SPNs because, as shown in the following
lemma, if two involution s-boxes belong to the same cryptographic class then they
possess a linear structure.
Lemma 6.4
If $\pi_1$ and $\pi_2$ are both involution mappings and
$\pi_2(x) = \pi_1(x \oplus a) \oplus b$,
then $\pi_1$, $\pi_2$ have $a \oplus b$ as a linear structure.
Proof: By noting that $\pi_2(x) = \pi_1(x \oplus a) \oplus b$ then we have $\pi_2^2(x) =
\pi_1(\pi_1(x \oplus a) \oplus a \oplus b) \oplus b$. But we also have $\pi_2^2(x) = x$ and, hence,
$\pi_1(\pi_1(x \oplus a) \oplus a \oplus b) \oplus b = x$. Thus, we have $\pi_1(x \oplus a) \oplus a \oplus b = \pi_1^{-1}(x \oplus b)$.
Replacing $x \oplus b$ by x and noting that $\pi_1^{-1}(x) = \pi_1(x)$ gives
$\pi_1(x \oplus a \oplus b) \oplus a \oplus b = \pi_1(x)$,
which is the definition of a linear structure. By a similar argument, one can show that
$\pi_2$ also has $a \oplus b$ as a linear structure. □
6.2.3 Number of Fixed Points
Involution s-boxes have the characteristic that all cycles are of length one or two and,
as will be shown, have a larger expected number of fixed points than a randomly chosen
s-box. Although there is no known effective cryptanalytic attack directly based on the
existence of fixed points in the s-boxes, it is of interest to determine if a large number
of fixed points affects other cryptographic properties, such as the nonlinearity and the
maximum XOR table entry, that lead to other cryptographic attacks.
Figure 6.2 shows the experimental results for the average nonlinearity and the average
maximum XOR table entry as a function of the number of fixed points for 8-bit random
bijections and 8-bit random involutions. One thousand random bijective s-boxes and
one thousand random involution s-boxes were tested for each point. The graphs were
derived by incrementing the number of fixed points by 2. The graphs clearly indicate a
strong correlation between the cryptographic properties and the number of fixed points
and suggest that the s-boxes should be chosen to contain few fixed points.
Figure 6.2 Average Nonlinearity and Average Maximum XOR Table Entry Versus the Number of Fixed Points (n = 8)
We now calculate the expected number of fixed points for a random bijection and for
a random involution.
Lemma 6.5
The expected value of the number of fixed points for a random bijective mapping is 1.
Proof: The number of bijective mappings with exactly t fixed points is given by (this
result follows by using the inclusion-exclusion principle)
$\sum_{i=t}^{2^n} (-1)^{i+t} \binom{i}{t} \binom{2^n}{i} (2^n - i)!$.
The probability of having exactly t fixed points is given by the above formula divided
by $2^n!$.
Hence, the expected number of fixed points is given by
The last step in the equation above follows by noting that
which completes the proof of the lemma.
Similarly, one can show that the variance of the number of fixed points is also 1.
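The fixed-point statistics of random bijections can be checked with exact arithmetic. The sketch below is illustrative and uses hypothetical names; the inclusion-exclusion sum is the standard count of permutations with exactly t fixed points, written for a domain of `size` points (so `size` plays the role of $2^n$).

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial


def bijections_with_fixed_points(size, t):
    """Number of permutations of `size` points with exactly t fixed points."""
    return sum(
        (-1) ** (i + t) * comb(i, t) * comb(size, i) * factorial(size - i)
        for i in range(t, size + 1)
    )


def fixed_point_moments(size):
    """Exact mean and variance of the number of fixed points of a random permutation."""
    total = factorial(size)
    counts = [bijections_with_fixed_points(size, t) for t in range(size + 1)]
    mean = Fraction(sum(t * c for t, c in enumerate(counts)), total)
    second = Fraction(sum(t * t * c for t, c in enumerate(counts)), total)
    return mean, second - mean ** 2


def brute_force_histogram(size):
    """Histogram of fixed-point counts over all permutations (small sizes only)."""
    hist = [0] * (size + 1)
    for p in permutations(range(size)):
        hist[sum(1 for x in range(size) if p[x] == x)] += 1
    return hist
```

The moments are returned as exact fractions, so the result for a given size can be compared with the value stated in the lemma without rounding concerns.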
Lemma 6.6
The expected number of fixed points for a random involution mapping is given by
where
@(n. i) = -
(F1 - i)! pz)! -
Proof: From the proof of lemma (6.3), the number of involution functions with 2i fixed
points is given by
The probability of randomly selecting an involution function with 2i fixed points is
obtained by dividing (6.17) by the total number of involution functions. Thus, the
expected number of fixed points for a random involution function is given by
which completes the proof of the lemma. □
Numerical substitution in the formula above shows that the expected number of fixed
points of a random involution exceeds that of a random bijective mapping by a large
factor. For example, an 8-bit involution mapping is expected to have about 16 fixed
points. Fortunately, the construction proof of lemma (6.3) can be used to generate
involution functions with a predetermined number of fixed points. A special case of
interest is involution functions with zero fixed points since this seems to optimize their
cryptographic properties (see Figure 6.2). The number of such functions follows from
the proof of lemma (6.3) and can be approximated using Stirling's formula as follows
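Both the expected number of fixed points and the generation of involutions with a chosen number of fixed points can be sketched in Python. This is an illustration under assumptions: the count per number of fixed points is a reading of the counting argument (choose the fixed points, pair up the rest), the sampling method is one simple way to realise it, and all names are hypothetical.

```python
import random
from fractions import Fraction
from math import comb


def pairings(points):
    """Number of ways to split an even number of points into 2-cycles."""
    result = 1
    for odd in range(points - 1, 0, -2):
        result *= odd
    return result


def expected_fixed_points(n):
    """Exact expected number of fixed points of a uniformly random involution."""
    size = 2 ** n
    counts = {f: comb(size, f) * pairings(size - f) for f in range(0, size + 1, 2)}
    return Fraction(sum(f * c for f, c in counts.items()), sum(counts.values()))


def random_involution(n, fixed, rng):
    """Random involution on n-bit values with exactly `fixed` fixed points."""
    size = 2 ** n
    if fixed % 2 or not 0 <= fixed <= size:
        raise ValueError("the number of fixed points must be even and at most 2^n")
    points = list(range(size))
    rng.shuffle(points)
    table = list(range(size))  # the first `fixed` shuffled points stay fixed
    rest = points[fixed:]
    for u, v in zip(rest[0::2], rest[1::2]):
        table[u], table[v] = v, u
    return table
```

Calling `random_involution(8, 0, random.Random())` yields an 8-bit involution s-box with no fixed points, the special case singled out above.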
6.3 S-box Interconnection Layer
In order to use the same SPN to perform both the encryption and the decryption
operations, the s-box interconnection layer should also be an involution mapping. One
permutation layer, applicable to networks for which $N = n^2$, with nice cryptographic
properties [60] and which satisfies the involution requirement is described by: output bit
i of s-box j at round r is connected to input bit j of s-box i at round r + 1.
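The involution property of this permutation is easy to confirm. In the sketch below, which is illustrative only, the bit numbering (bit i of s-box j sits at block position j*n + i) is an assumed convention and the function names are hypothetical.

```python
def transpose_permutation(n):
    """perm[src] = dst for an SPN with N = n*n bits: output bit i of s-box j
    (position j*n + i) feeds input bit j of s-box i (position i*n + j)."""
    return [i * n + j for j in range(n) for i in range(n)]


def permute_bits(block, perm):
    """Apply a bit permutation to an integer block."""
    out = 0
    for src, dst in enumerate(perm):
        out |= ((block >> src) & 1) << dst
    return out
```

Applying `permute_bits` twice with the same table returns the original block, so the same wiring serves encryption and decryption.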
In [60] it was shown that with such a permutation layer we can develop upper bounds
on the differential characteristic probability [15] and on the probability of a linear
approximation [82] as a function of the number of rounds of substitution. Unfortunately,
to achieve good bounds, with a relatively small number of rounds, it is suggested to have
s-boxes with a large diffusion order [60]. Letting $\Delta x$ and $\Delta y$ denote the input change
vector and the output change vector, respectively, an s-box satisfies diffusion order of
$\lambda$, $\lambda \ge 0$, if for $wt(\Delta x) > 0$,
$wt(\Delta y) > \begin{cases} \lambda + 1 - wt(\Delta x), & wt(\Delta x) < \lambda + 1, \\ 0, & \text{otherwise,} \end{cases}$
where $wt(\cdot)$ denotes the Hamming weight of the enclosed argument.
Using the depth-first search algorithm proposed in [60] we could not find any 8 x 8
involution s-boxes with diffusion order greater than 1 (without the involution constraint,
some 8 x 8 s-boxes with $\lambda = 2$ were found in [60]). As an alternative to this, the authors
in [60] proposed the use of an invertible linear transformation between rounds. The SPN
resistance to linear and differential cryptanalysis was very encouraging. Unfortunately,
their proposed linear transformation is not very attractive in practice as it requires a
bit-wise XOR of all the output bits of the round.
6.3.1 Proposed Linear Transformation
We propose a more efficient linear transformation that runs much faster. Moreover, it
has improved bounds for the linear approximation and the differential characteristic. The
linear transformation between rounds of s-boxes is described by
$z_i = \bigoplus_{j=1,\, j \ne i}^{M} w_j$, $1 \le i \le M$, (6.21)
where $z_i$ represents the i-th n-bit output word of the transformation, $w_i$ is the i-th input
word, $M = N/n$ denotes the number of s-boxes, and $\oplus$ denotes a bit-wise XOR operation.
It is assumed that M is even. For 8 x 8 s-boxes this is a byte oriented operation. One
can easily check that this linear transformation operation is an involution.
The linear transformation described above may be efficiently implemented by noting
that each $z_i$ could be simply determined by XORing $w_i$ with the XOR sum of all
$w_j$, $1 \le j \le M$, i.e.,
$z_i = w_i \oplus Q$, (6.22)
where
$Q = \bigoplus_{j=1}^{M} w_j$. (6.23)
Equation (6.23) above requires (M - 1) word-oriented XORs (which can be done in
parallel in $\log_2 M$ steps) and equation (6.22) requires M word-oriented XORs (which
can be done in one step). Hence for a 64-bit SPN using 8 x 8 s-boxes, the above linear
transformation requires 7 + 8 = 15 byte-oriented XORs compared to 63 + 64 = 127
bit-oriented XORs required for the linear transformation of [60].
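The word-oriented transformation and its efficient form can be sketched as follows. This is an illustration based on the verbal description (each output word is the XOR of all other input words, computed as the input word XORed with the XOR sum of all words); the function names are hypothetical.

```python
from functools import reduce
from operator import xor


def linear_transform(words):
    """Efficient form: XOR each word with the XOR sum of all words."""
    q = reduce(xor, words, 0)
    return [w ^ q for w in words]


def linear_transform_direct(words):
    """Direct form: each output word is the XOR of all the other input words."""
    return [
        reduce(xor, (w for j, w in enumerate(words) if j != i), 0)
        for i in range(len(words))
    ]


def active_words(words):
    """Number of nonzero words, i.e. s-boxes touched by a difference."""
    return sum(1 for w in words if w)
```

For an even number of words the XOR sum is unchanged by the transformation, so applying `linear_transform` twice restores the input; for an odd number of words this fails, which is why M is assumed even. A difference confined to one word spreads to M - 1 words, and a difference in two words stays in at least two.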
6.3.2 Involution Linear Transformations based on MDS codes
An interesting class of linear transformations is the one based on Maximum Distance
Separable (MDS) codes [78]. The use of such linear transformations was first proposed
in [133] and then utilized in the cipher SHARK [108] and later in the cipher SQUARE
[34]. This class of linear transformations has the advantage that the number of s-boxes
involved in any 2 rounds of a linear approximation or in any 2 rounds of a differential
characteristic is equal to M + 1, which is the maximum theoretically possible number.
In this section we provide two construction methods for involution linear transformations
based on Maximum Distance Separable Codes.
Rijmen et al [108] noted that the framework of linear codes over $GF(2^n)$ provides an
elegant way to construct the linear transformation layer. More details about the theory
of error correcting codes can be found in [78].
Let C be a (2M, M, d) code over $GF(2^n)$. Let G = [I | A] be the generator matrix in
echelon form where A is a nonsingular M x M matrix and I is the M x M identity
matrix. Then C defines an invertible linear mapping
$GF(2^n)^M \to GF(2^n)^M : X \to Y = AX$. (6.24)
If the matrix A is used in the implementation of the linear transformation of the SPN,
then it is easy to see that the number of s-boxes involved in any 2 rounds of a differential
characteristic or a linear approximation expression is lower bounded by d, the minimum
distance of the code [108]. The minimum distance of the code is equal to the minimum
number of linearly dependent columns in its null matrix (also known as the parity-check
matrix).
Lemma 6.7 [78]:
An (n, k, d) code with generator matrix G = [I | A], where A is a k x (n - k) matrix, is
MDS if and only if every square submatrix (formed from any i rows and any i columns,
for any i = 1, 2, ..., min{k, n - k}) of A is nonsingular.
Random Construction
One way to obtain an involution matrix A which satisfies the above constraint is to pick
a random involution matrix and test it for the condition in lemma (6.7).
Let
$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$ (6.25)
be an M x M random matrix where $A_{11}$, $A_{12}$, $A_{21}$ and $A_{22}$ are $\frac{M}{2} \times \frac{M}{2}$ matrices. An
involution matrix is one which satisfies $A^2 = I$, and thus A is an involution iff
If we let $A_{22} = A_{11}$ then equation (6.26) is satisfied iff $A_{11}$ and $A_{12}$ commute with each
other. To achieve this we let $A_{12} = A_{11}^{-1}$. For these choices of $A_{12}$ and $A_{22}$, equations
(6.27), (6.28) and (6.29) are linearly dependent with the solution $A_{21} = A_{11}^3 \oplus A_{11}$.
Thus the M x M matrix
$A = \begin{bmatrix} A_{11} & A_{11}^{-1} \\ A_{11}^3 \oplus A_{11} & A_{11} \end{bmatrix}$, (6.30)
where $A_{11}$ is a random nonsingular $\frac{M}{2} \times \frac{M}{2}$ matrix, is an involution over $GF(2^n)$.
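The block construction can be tried out over GF(2^8). The following Python sketch is a limited illustration with hypothetical names: it fixes M = 4 so that the blocks are 2 x 2 and can be inverted with the adjugate formula, it assumes the reduction polynomial 0x11B for the field arithmetic, and it only checks the involution property, not the MDS condition of lemma (6.7).

```python
import random

POLY = 0x11B  # assumed irreducible polynomial for GF(2^8)


def gf_mul(a, b):
    """Multiply two elements of GF(2^8)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= POLY
    return p


def gf_inv(a):
    """Inverse of a nonzero element, computed as a^254."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def mat_mul(x, y):
    size = len(x)
    out = [[0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            acc = 0
            for k in range(size):
                acc ^= gf_mul(x[i][k], y[k][j])
            out[i][j] = acc
    return out


def mat_add(x, y):
    return [[a ^ b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def det2(a):
    return gf_mul(a[0][0], a[1][1]) ^ gf_mul(a[0][1], a[1][0])


def inv2x2(a):
    """Inverse of a 2 x 2 matrix; in characteristic 2 the adjugate needs no signs."""
    d = gf_inv(det2(a))
    return [[gf_mul(a[1][1], d), gf_mul(a[0][1], d)],
            [gf_mul(a[1][0], d), gf_mul(a[0][0], d)]]


def random_nonsingular_2x2(rng):
    while True:
        a = [[rng.randrange(256) for _ in range(2)] for _ in range(2)]
        if det2(a):
            return a


def build_involution(a11):
    """Assemble [[A11, A11^-1], [A11^3 + A11, A11]] as a 4 x 4 matrix."""
    a12 = inv2x2(a11)
    a21 = mat_add(mat_mul(mat_mul(a11, a11), a11), a11)
    top = [a11[r] + a12[r] for r in range(2)]
    bottom = [a21[r] + a11[r] for r in range(2)]
    return top + bottom


IDENTITY4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
```

A random search as described in the text would repeatedly draw `A11`, build the matrix, and keep it only if every square submatrix is nonsingular; that test is omitted here.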
For n = 8, random search for a matrix A, with the structure in equation (6.30), that
satisfies the condition in lemma (6.7), terminates within a few seconds for even values
of M, $M \le 6$. For M = 8 we were unable to obtain any matrix that satisfies the
conditions in lemma (6.7) by random search. Table 6.1 [143] shows the experimental
results for the minimum distance distribution for $10^5$ randomly selected invertible linear
transformations and $10^5$ randomly selected invertible involution linear transformations in
the form of equation (6.30).
Experimental (Random) 46 15539 84415
Experimental (Involution) 118 7714 93168
Table 6.1 Minimum Distance for $10^5$ Randomly Selected Involution Linear Transformations (n = M = 8)
Algebraic Construction
In this section we show how to obtain an involution matrix satisfying lemma (6.7) by
a simple algebraic construction.
Lemma 6.8 [78]:
Given $x_0, \ldots, x_{n-1}, y_0, \ldots, y_{n-1}$, the matrix $A = [a_{ij}]$, $0 \le i, j \le n - 1$, where
$a_{ij} = \frac{1}{x_i + y_j}$, is called a Cauchy matrix. It is known that
$\det(A) = \frac{\prod_{0 \le i < j \le n-1} (x_j - x_i)(y_j - y_i)}{\prod_{0 \le i, j \le n-1} (x_i + y_j)}$.
Hence, provided the $x_i$ are distinct, the $y_j$ are distinct, and $x_i + y_j \ne 0$ for all i, j, it
follows that any square submatrix of a Cauchy matrix is nonsingular over any field.
Let
where
and the least significant $\lceil \log_2(M) \rceil$ bits of $r \ne 0$ are zeros.
For $A^2 = H = [h_{ij}]$ we have
where i, j and k are evaluated as in equation (6.33). Thus the matrix A will satisfy
$A^2 = c^2 I$, $c = \bigoplus_{i=1}^{n} a_{1i}$ over $GF(2^n)$. Dividing (division over $GF(2^n)$) each element of A by c
we obtain an involution matrix for which every square submatrix is nonsingular over
$GF(2^n)$. Figure 6.3 shows an example for M = n = 8, using the irreducible polynomial
Figure 6.3 Involution Linear Transformation Based on MDS Codes (M = n = 8, Irreducible Polynomial = 11d)
The resistance of SPNs using this MDS linear transformation is studied in [108] and
[143]. Throughout the rest of this chapter we will concentrate on the class of linear
transformations described by equation (6.21).
6.4 Resistance to Differential and Linear Cryptanalysis
Using an approach similar to the analysis in [60], it is possible to establish upper bounds on the most likely differential characteristic and linear approximation expression using the linear transformation of (6.21). The results of this section are obtained by assuming that all the round keys are independent.
6.4.1 Differential Cryptanalysis
The following lemma gives a lower bound on the number of s-boxes involved in any 2 rounds of a differential characteristic.
Footnote (Figure 6.3): All numbers are in hexadecimal format.
Lemma 6.9
Consider an SPN with M s-boxes, $M \geq 4$. If the SPN employs the linear transformation described in (6.21), then the number of s-boxes involved in any 2 rounds of a differential characteristic is greater than or equal to 4.
Proof: From the linear transformation expression one can check that if only one s-box is involved in round $r$, this implies that $M - 1$ s-boxes are involved in round $r + 1$. If 2 s-boxes are involved in round $r$, (6.21) ensures that at least 2 s-boxes will be involved in round $r + 1$. The rest of the proof follows by noting that the minimum number of s-boxes involved per round is 1. □
The number of chosen plaintext/ciphertext pairs required for differential cryptanalysis of an R round SPN (based on the best characteristic and not the best differential [94], [73]) may be approximated by [15], [60]
$$N_D \approx \frac{1}{P_{\Omega_{R-1}}}$$
where $P_{\Omega_{R-1}}$ is the probability of the best $R - 1$ round characteristic. This probability can be bounded by
$$P_{\Omega_{R-1}} \leq (P_\delta)^{\alpha}$$
where the maximum s-box XOR pair probability is given by $P_\delta = \frac{XOR^*}{2^n}$ with $XOR^*$ denoting the maximum entry in the XOR distribution tables of the s-boxes used in the SPN and $\alpha$ is the total number of s-boxes involved in the characteristic. For even R, from lemma 6.9 and assuming that only one s-box will be involved in round $R - 1$, we have
$$\alpha \geq 4 \cdot \frac{R - 2}{2} + 1 = 2R - 3$$
and, hence,
$$N_D \geq \left( \frac{1}{P_\delta} \right)^{2R - 3}.$$
Using 8 x 8 involution s-boxes with maximum XOR table entry of 10 (easily found by randomly selecting involution s-boxes), an 8 round 64-bit SPN that utilizes the proposed linear transformation will have $N_D \geq 2^{60.8}$ chosen plaintext/ciphertext pairs required for differential cryptanalysis. If we use the inversion s-boxes given by (6.3), then we will have $N_D \geq 2^{78}$.
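The two figures quoted above can be reproduced with a few lines of arithmetic. This is a sketch under the assumption that the bound has the form $N_D \geq (2^n / XOR^*)^{2R-3}$ for even R, and that the inversion s-boxes have a maximum XOR table entry of 4 (an assumption consistent with the value $2^{78}$); the function name is illustrative only.

```python
import math


def log2_differential_bound(n, max_xor_entry, rounds):
    """log2 of the lower bound (1 / P_delta)^(2R - 3), stated for even R."""
    if rounds % 2 != 0:
        raise ValueError("the bound is stated for an even number of rounds")
    p_delta = max_xor_entry / 2 ** n
    return (2 * rounds - 3) * -math.log2(p_delta)


# Random involution s-boxes with maximum XOR table entry 10.
print(round(log2_differential_bound(8, 10, 8), 1))
# Inversion s-boxes, assuming a maximum XOR table entry of 4.
print(round(log2_differential_bound(8, 4, 8), 1))
```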
6.4.2 Linear Cryptanalysis
The following lemma gives a lower bound on the number of s-boxes involved in any
2 round linear approximation and is based on the assumption of independence between
linear approximation of different rounds.
Lemma 6.10
Consider an SPN with M s-boxes, M > 4. If the SPN employs the linear transformation
described in (6.21) then the number of s-boxes involved in any 2 rounds of a linear
approximation is greater than or equal to 4.
Proof: If the number of s-boxes involved in round $r + 1$, $l$, is odd, then the number of s-boxes involved in round $r$ is $M - 1$. If $l$ is even, then the number of s-boxes involved in round $r$ is $l$. The lemma above follows by considering different values for $l$. □
For an SPN based on n x n s-boxes, the number of known plaintexts required for the basic linear cryptanalysis (algorithm 1 in [82]) may be approximated by [60]
$$N_L \approx \frac{1}{|p_L - 1/2|^2}$$
where
$$|p_L - 1/2| \leq 2^{\alpha - 1} |p_\epsilon - 1/2|^{\alpha}$$
and
$$|p_\epsilon - 1/2| = \frac{2^{n-1} - NL}{2^n}$$
with $NL$ denoting the minimum nonlinearity [90] of the s-boxes used in the SPN and $\alpha$ is the total number of s-boxes involved in the linear approximation. From the above argument we have
$$\alpha \geq 2R$$
and, hence,
$$N_L \geq \frac{1}{2^{4R - 2} \, |p_\epsilon - 1/2|^{4R}}.$$
Using 8 x 8 involution s-boxes with nonlinearity of 98 (easily found by randomly selecting involution s-boxes), an 8 round 64-bit SPN that utilizes the proposed linear transformation will have $N_L \geq 2^{69}$ known plaintext/ciphertext pairs required for the basic linear attack. Since this number is greater than the size of the plaintext set, we interpret this to mean that the basic linear attack is not effective against this class of SPNs, even if we use all possible plaintexts. If we use the inversion s-boxes given by (6.3), then we will have $N_L \geq 2^{98}$.
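As a numerical check, the sketch below evaluates the bound under the assumption that it takes the form $N_L \geq 1 / (2^{\alpha - 1} |p_\epsilon - 1/2|^{\alpha})^2$ with $\alpha = 2R$ and $|p_\epsilon - 1/2| = (2^{n-1} - NL)/2^n$. The nonlinearity of 112 used for the inversion s-boxes is an assumption, chosen because it reproduces the value $2^{98}$ quoted above.

```python
import math


def log2_linear_bound(n, nonlinearity, rounds):
    """log2 of 1 / (2^(alpha - 1) * eps^alpha)^2 with alpha = 2R (even R)."""
    alpha = 2 * rounds
    eps = (2 ** (n - 1) - nonlinearity) / 2 ** n  # |p_eps - 1/2|
    log2_bias = (alpha - 1) + alpha * math.log2(eps)
    return -2 * log2_bias


# Random involution s-boxes with nonlinearity 98.
print(log2_linear_bound(8, 98, 8))
# Inversion s-boxes, assuming nonlinearity 112.
print(log2_linear_bound(8, 112, 8))
```

In both cases the exponent exceeds 64, the number of plaintext bits of the SPN considered.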
6.5 Cyclic Properties of the Proposed SPN
A significant difference between an involution s-box and a non-involution s-box is likely to be their cyclic properties. For a randomly chosen n-bit bijective mapping, the expected value and the variance of the number of cycles are both approximately equal to $\ln(2^n) = 0.69n$ [46]. The expected value of the cycle length is equal to $2^{n-1} + 1/2$ [35]. For an involution mapping with $N_{fp}$ fixed points, the expected cycle length is given by $2 - N_{fp}/2^n$ and the number of cycles is given by $2^{n-1} + N_{fp}/2$.
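The cycle structure of an involution is easy to confirm by direct enumeration. The sketch below builds a random involution with a chosen number of fixed points and counts its cycles; the helper names are illustrative, and "expected cycle length" is taken here to mean the average length of the cycle containing a uniformly chosen starting point.

```python
import random


def random_involution(size, fixed_points, rng):
    """Random involution on range(size) with exactly `fixed_points` fixed points."""
    if (size - fixed_points) % 2 != 0:
        raise ValueError("size - fixed_points must be even")
    points = list(range(size))
    rng.shuffle(points)
    mapping = list(range(size))      # start from the identity
    movable = points[fixed_points:]  # these points are paired into 2-cycles
    for a, b in zip(movable[0::2], movable[1::2]):
        mapping[a], mapping[b] = b, a
    return mapping


def cycle_lengths(mapping):
    """Lengths of all cycles of a permutation given as a list."""
    seen = [False] * len(mapping)
    lengths = []
    for start in range(len(mapping)):
        if seen[start]:
            continue
        length, x = 0, start
        while not seen[x]:
            seen[x] = True
            x = mapping[x]
            length += 1
        lengths.append(length)
    return lengths


n, n_fp = 8, 16
s_box = random_involution(2 ** n, n_fp, random.Random(1))
lengths = cycle_lengths(s_box)
# Number of cycles versus 2^(n-1) + N_fp / 2.
print(len(lengths), 2 ** (n - 1) + n_fp // 2)
# Average length of the cycle containing a random starting point.
print(sum(length * length for length in lengths) / 2 ** n, 2 - n_fp / 2 ** n)
```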
Figure 6.4 Normalized Distribution of Cycle Length for all $2^{16}$ Starting Points: (a) SPNs with random involution s-boxes, (b) SPNs with random bijective s-boxes (horizontal axis: cycle length)
In order to investigate whether the cyclic properties of involution s-boxes affect the cyclic properties of the SPN, we measured the cycle distribution for 100,000 16-bit SPNs with 4 rounds. Each SPN uses four 4 x 4 random involution s-boxes with zero fixed points, nonlinearity greater than or equal to 4 and maximum XOR table entry equal to 4. The cycle length distribution is shown in Figure 6.4 (a) (the dark line shows the average distribution over 100 adjacent points). The normalized frequency of occurrence is equal to the actual number divided by 100,000. In this case, the average cycle length over all SPNs is equal to 32779. Figure 6.5 shows a typical expanded segment of the cycle length distribution in Figure 6.4 (a).
We performed the same experiment on 100,000 SPNs using random bijective mappings with the same constraints on the nonlinearity and the XOR table. The simulation results are shown in Figure 6.4 (b). The average cycle length over all SPNs is equal to 32766. It is clear that the two distributions are almost indistinguishable. This suggests that the involution s-boxes do not have a negative impact on the cyclic properties of the SPN.
6.6 Key Scheduling Algorithm
Figure 6.5 A Typical Expanded Segment of the Cycle Length Distribution of SPNs with Random Involution S-boxes
In our discussion, we assume that the SPN is keyed by a simple key mixing operation of the key bits before each substitution and after the last substitution (details will be explained below). A weak key, $k_w$, is any key for which $E_{k_w}(E_{k_w}(p)) = p$ for every plaintext vector $p$, where $E_{k_w}(\cdot)$ denotes the encryption operation using the key $k_w$. In this section we propose a simple key scheduling algorithm for the SPN. Three design principles were employed:
(i) Prevent weak keys.
(ii) Given that some or all of the key bits at round r are compromised, it is hard to get any information about the other round keys.
Although the above key scheduling can be controlled to be relatively slow in order to make brute force attack harder [106], it is far easier and more effective to use a larger key. Using a larger key has the advantage that it does not penalize implementations which must change the key often.
In the following algorithm key denotes the user supplied key which is assumed to be of the same length as the block length of the SPN, $E_k(p)$ denotes the output of the SPN when it has $p$ as an input, and the round keys are all set to $k$. Consider the key scheduling algorithm shown in Figure 6.6.
One can assign any other arbitrary value to $x_0$. $Op(\cdot)$ denotes any simple operation that guarantees that all $x_i$'s are different. By noting that $E_k$ is a bijective mapping for any fixed key, all $k_i$'s will be different, which guarantees that we do not have any weak keys. An example of operation $Op(\cdot)$ is the complementing of different bits in $x_0$ for each $i$. Note that we control the key scheduling speed by controlling the number of rounds used in the encryption operation $E_k$. Also, this scheme is similar to the scheme proposed in [69].
Figure 6.6 Key Scheduling Algorithm
It is also worth noting that the above keying scheme does not have the complementation property; this property makes DES susceptible to exhaustive key search of $2^{55}$ rather than $2^{56}$. This scheme also ensures that there are no simply related keys, which could lead to Biham's related keys attack [13].
The key scheduling described above can be extended to accommodate the case where the user supplied key size is a multiple of the SPN block length (keys which are not multiples can be padded to be so).
6.7 Performance
While the usefulness of a cryptographic algorithm is based on assumptions about its security, the complexity of the cryptographic function is another feature that should not be overlooked. Table 6.2 shows the relative speed (the larger the number, the faster the cipher) of Q-CAST and SPNs on three platforms: an 8-bit microcontroller (Motorola 6811), a SUN SPARC workstation and a SUN ULTRA workstation. All algorithms operate on 64-bit blocks and implement 16 rounds. The speed of the SPN in Table 6.2 is normalized independently for each processor.
In considering these numbers, one should take into account that the proposed SPN is a hardware oriented cipher (while Q-CAST is a software oriented cipher), and the round function of the proposed SPN provides a better degree of security than the round function of Q-CAST.
Table 6.2 Relative Speed of the Proposed SPN and Q-CAST
Platform           SPN    Q-CAST
Motorola 6811      1      0.46
SUN SPARC-20       1      7.5
SUN ULTRA-1        1      1.56
Q-CAST: Queen's University version of the CAST encryption algorithm.
6.8 Conclusion
We have presented a special class of SPNs that have the advantage that the same network can be used to perform both the encryption and the decryption operation. The s-boxes used are involution mappings and the permutation layer is replaced by an efficient involution linear transformation layer. In a few seconds on a SPARC-20 workstation, we were able to obtain tens of 8 x 8 involution s-boxes with nonlinearity of 98 and maximum XOR table entry of 10. Using these s-boxes, an 8 round 64-bit SPN that utilizes the proposed linear transformation will be resistant to both the basic linear cryptanalysis and to the differential cryptanalysis based on the best (R - 1)-round characteristic. We also confirmed that the cyclic properties of this special class of SPNs reveal no apparent weakness. A key scheduling algorithm which satisfies certain design principles was also proposed.
Chapter 7 Modelling Avalanche Characteristics of Substitution-Permutation Networks Using Markov Chains
7.1 Introduction
An SPN is considered to display good avalanche characteristics if a one bit change in the plaintext input is expected to cause close to half the ciphertext bits to change. Good avalanche characteristics are important to ensure that a cipher is not susceptible to statistical attacks such as clustering attacks [58]. More formally, the avalanche is defined as follows:
A cipher is said to satisfy the avalanche criterion if, for each key, on average half the ciphertext bits change when one plaintext bit is changed. That is, $E(wt(\Delta C) \mid wt(\Delta P) = 1) = N/2$, where $wt(\cdot)$ denotes the Hamming weight of the enclosed argument, and $\Delta C$ and $\Delta P$ denote the ciphertext and plaintext change vectors, respectively.
Figure 7.1 SPN with N = 16, n = 4, and R = 3.
In [61], Heys and Tavares analyzed the avalanche characteristics of SPNs based on two general network models, distinguished by the nature of their permutation layer. In the first model, they considered a network where the permutation between two rounds is modelled as a random variable whose values are equally likely. In the second model, they considered a network which has a specified fixed permutation between rounds. In [56], Heys extended this work and developed a model of the avalanche characteristics of DES-like ciphers.
In this chapter we develop analytical models for the avalanche characteristics of other classes of SPNs. In particular, we consider the following four general network models, distinguished by their linear transformation layers (see Figure 7.1):
Model A¹ - In this model we consider SPNs, with M = n, in which the linear transformation layer is a permutation layer $\pi \in \Pi$, where $\Pi$ is defined to be the set of permutations for which no two outputs of an s-box are connected to one s-box in the next round.
¹ This model was also considered by Heys and Tavares, but we include it here for completeness.
Model B - In this model we consider SPNs in which the linear transformation layer is given by the invertible bitwise linear transformation defined by $v = \pi(L(u))$ where $v = [v_1 v_2 \ldots v_N]$ is the vector of input bits to a round of s-boxes, $u = [u_1 u_2 \ldots u_N]$ is the vector of bits from the previous round output, $L(u) = [l_1(u) \ldots l_N(u)]$, $\pi \in \Pi$, where $\Pi$ is the set of permutations defined in model A, and
$$l_i(u) = \bigoplus_{j=1,\, j \neq i}^{N} u_j.$$
$N$ is assumed to be even so that the linear transformation is invertible.
Note that $l_i(u)$ could be simply determined by noting that
$$l_i(u) = u_i \oplus Q$$
where
$$Q = \bigoplus_{j=1}^{N} u_j. \qquad (7.3)$$
The resistance of SPNs with this class of linear transformation against linear and differential cryptanalysis is studied in [60].
Model C - In this model we consider SPNs in which the linear transformation layer is given by the invertible wordwise linear transformation defined by
$$Z_i = \bigoplus_{j=1,\, j \neq i}^{M} W_j, \quad 1 \leq i \leq M,$$
where $Z_i$ represents the $i$th n-bit output word of the transformation, $W_i$ is the $i$th input word, and $M = N/n$ denotes the number of s-boxes. It is assumed that M is even so that the linear transformation is invertible. For 8 x 8 s-boxes this is a byte oriented operation.
The linear transformation described above may be efficiently implemented by noting that each $Z_i$ could be simply determined by XORing $W_i$ with the XOR sum of all $W_j$, $1 \leq j \leq M$, i.e.,
$$Z_i = W_i \oplus Q$$
where
$$Q = \bigoplus_{j=1}^{M} W_j.$$
The resistance of SPNs with this class of linear transformation against linear and differential cryptanalysis is studied in [153].
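A minimal sketch of the wordwise transformation of Model C follows, assuming each output word is the input word XORed with the XOR sum of all input words. The function name and the sample block are illustrative only. For even M, applying the transformation twice returns the original block, and a single nonzero input word produces M - 1 nonzero output words.

```python
from functools import reduce
from operator import xor


def wordwise_transform(words):
    """Z_i = W_i XOR Q, where Q is the XOR of all input words."""
    if len(words) % 2 != 0:
        raise ValueError("M must be even for the transformation to be invertible")
    q = reduce(xor, words, 0)
    return [w ^ q for w in words]


block = [0x01, 0x23, 0x45, 0x67, 0x89, 0xAB, 0xCD, 0xEF]
once = wordwise_transform(block)
print(wordwise_transform(once) == block)

# One active input word spreads to M - 1 output words.
single = [0x5A, 0, 0, 0, 0, 0, 0, 0]
print(sum(1 for z in wordwise_transform(single) if z != 0))
```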
Model D - In this model we consider SPNs in which the linear transformation layer is given by the invertible linear transformation
$$GF(2^n)^M \rightarrow GF(2^n)^M : X \rightarrow Y = AX,$$
where $A$ is a nonsingular M x M matrix for which every square submatrix (formed from any $i$ rows and any $i$ columns, for any $i = 1, 2, \ldots, M$) is nonsingular. Note that the matrix $G = [I | A]$, where $I$ is the M x M identity matrix, is the generator matrix of a $(2M, M, M + 1)$ MDS code. Throughout the rest of this thesis, we will refer to this type of linear transformations as the MDS linear transformation. The resistance of SPNs with this class of linear transformation against linear and differential cryptanalysis is studied in [108].
7.2 Convergence in Markov Chains
A Markov chain is ergodic if it is finite, aperiodic and irreducible. A sufficient condition for a Markov chain with $n$ states and a state transition matrix² $P$ to be aperiodic is that $p_{ii} > 0$ for some $i$, $1 \leq i \leq n$, and it is irreducible if for all $i, j$ there exists $r$ such that $p_{ij}^{(r)} > 0$, $1 \leq i, j \leq n$, where $P^r = [p_{ij}^{(r)}]$. If $P$ is ergodic, then there exists a unique distribution $\Pi = (\pi_1, \pi_2, \ldots, \pi_n)$ such that
The distribution $\Pi$ is said to be the limiting distribution of $P$. The classical method to determine the rate of convergence towards this limiting distribution is to consider the eigenvalues of $P$ [46].
Suppose that we have a matrix $P$ with distinct and nonzero eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Let the column vector $z^{(i)} = (z_1^{(i)}, z_2^{(i)}, \ldots, z_n^{(i)})^T$ satisfy
for $i = 1, 2, \ldots, n$. Then
It follows that a matrix $P$ with distinct eigenvalues has a limiting distribution if and only if the largest eigenvalue is 1, and the remaining eigenvalues are less than one in modulus. We order the eigenvalues with $1 = \lambda_1 > |\lambda_2| \geq |\lambda_3| \geq \cdots \geq |\lambda_n|$.
Powers of a transition matrix can be written as
where $P^{\infty} = [\Pi, \Pi, \ldots, \Pi]$ is the limiting distribution of $P$. Thus all entries of $P^r$ converge to their limit exponentially fast as a function of $\lambda_2$.
² $p_{ji}$ is the element in row $j$ and column $i$ in the matrix $P$.
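The role of the second largest eigenvalue can be illustrated on a two-state chain, where everything is available in closed form. The sketch below is a toy example, not one of the SPN transition matrices: the probabilities `a` and `b` are hypothetical, the matrix is taken to be row-stochastic, and the eigenvalues are 1 and $1 - a - b$.

```python
def mat_mul(x, y):
    """Product of two square matrices given as lists of rows."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mat_pow(p, r):
    """r-th power of a square matrix by repeated multiplication."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(r):
        result = mat_mul(result, p)
    return result


a, b = 0.3, 0.1                     # hypothetical transition probabilities
P = [[1 - a, a], [b, 1 - b]]        # P[j][i] = Pr(next state i | current state j)
limit = [b / (a + b), a / (a + b)]  # limiting distribution
lambda_2 = 1 - a - b                # second largest eigenvalue

for r in (1, 5, 10, 20):
    p_r = mat_pow(P, r)
    worst = max(abs(p_r[j][i] - limit[i]) for j in range(2) for i in range(2))
    # The largest deviation from the limit shrinks like |lambda_2|^r.
    print(r, worst, abs(lambda_2) ** r)
```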
7.3 Modelling Avalanche in S-boxes
We will use the same s-box model proposed in [61]. Let the s-boxes in the network be defined by a random bijective mapping $S : X \rightarrow Y$. Assume that any set of one or more input bit changes to an s-box results in a number of output bit changes represented by the random variable $D$, i.e., $D = wt(\Delta Y)$ where $\Delta Y$ is the output change vector of the s-box. We assume that the likelihood of a particular nonzero value for $D$ is given by assuming that all possible values of $\Delta Y$ belonging to the set of $2^n - 1$ nonzero changes are equally likely. Hence the probability distribution of $D$ is given by
$$P_D(D = d) = \frac{\binom{n}{d}}{2^n - 1}$$
for $1 \leq d \leq n$. Note that the above s-box model essentially represents an average over all randomly selected s-boxes and is not intended to characterize the behavior of an actual physically realizable s-box. However, as experimental evidence [61] suggests, modelling the number of output changes of each s-box as a random variable is a suitable approximation when considering an SPN constructed using randomly selected bijective s-boxes.
Let $W_r$ represent the random variable corresponding to the number of bit changes after round $r$ given a one bit plaintext change, i.e.,
$$W_r = \sum_{s=1}^{M} wt(\Delta Y_{rs})$$
where $\Delta Y_{rs}$ denotes the output change vector of the $s$th s-box in round $r$. Hence, the expected value of $W_r$ is given by
$$E(W_r) = \frac{n 2^{n-1}}{2^n - 1} \sum_{i=1}^{M} i \, P_{A_r}(A_r = i)$$
where $P_{A_r}(A_r = i)$ denotes the probability of having $i$ active s-boxes in round $r$, i.e., $i$ s-boxes with nonzero input change vectors.
Thus we have
$$P_{A_r}(A_r = i) = \sum_{j=1}^{M} P_{A_r}(A_r = i \mid A_{r-1} = j) \, P_{A_{r-1}}(A_{r-1} = j)$$
with the initial condition
$$P_{A_1}(A_1 = i) = \delta(i = 1)$$
where $\delta(a = b) = 1$ if $a = b$ and $\delta(a = b) = 0$ if $a \neq b$.
It is clear that $P_{A_r}(A_r = i \mid A_{r-1} = j)$ does not depend on $r$ and hence the number of active s-boxes can be modelled by a Markov chain. Now our problem is reduced to calculating the state transition matrix $P = [p_{ji}]$, where $p_{ji} = P_{A_r}(A_r = i \mid A_{r-1} = j)$ is the probability of having $i$ active s-boxes in round $r$ given that we have $j$ active s-boxes in round $r - 1$.
7.4 Modelling the Linear Transformation Layer
Throughout this section, we are interested in the case M = n.
7.4.1 Model A - Fixed Permutation Layer
Lemma 7.1
Assume an SPN with a fixed permutation layer $\pi \in \Pi$, then we have
$$p_{ji} = \frac{1}{(2^n - 1)^j} \sum_{l=n-i}^{n} (-1)^{l-n+i} \binom{l}{n-i} \binom{n}{l} (2^{n-l} - 1)^j.$$
Proof: The number of arrangements of the output bit changes so that the first $l$ s-boxes in round $r$ are not active given that we have $j$ active s-boxes in round $r - 1$ is obtained by setting the input bit changes to these non-active s-boxes to zero and varying the other bit changes over all their possible nonzero changes. Thus the number of such arrangements is $(2^{n-l} - 1)^j$.
Using the inclusion-exclusion principle, the number of arrangements such that we have exactly $i$ non-active s-boxes in round $r$ given that we have $j$ active s-boxes in round $r - 1$ is given by
$$N_A(i, j) = \sum_{l=i}^{n} (-1)^{l-i} \binom{l}{i} \binom{n}{l} (2^{n-l} - 1)^j.$$
The lemma follows by noting that $p_{ji} = \frac{N_A(n - i, j)}{(2^n - 1)^j}$.
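The transition matrix of Model A can be tabulated directly from the inclusion-exclusion count. The sketch below assumes the count of arrangements with exactly $k$ non-active s-boxes is $\sum_{l=k}^{n} (-1)^{l-k} \binom{l}{k} \binom{n}{l} (2^{n-l} - 1)^j$, and that the expected number of bit changes is the expected number of active s-boxes times the mean weight $n 2^{n-1}/(2^n - 1)$ of a nonzero output change. All function names are illustrative.

```python
from fractions import Fraction
from math import comb


def n_exactly_inactive(n, k, j):
    """Arrangements with exactly k non-active s-boxes in round r (M = n)."""
    return sum((-1) ** (l - k) * comb(l, k) * comb(n, l) * (2 ** (n - l) - 1) ** j
               for l in range(k, n + 1))


def transition_matrix(n):
    """p[j][i] = Pr(i active in round r | j active in round r - 1), 1 <= i, j <= n."""
    return {j: {i: Fraction(n_exactly_inactive(n, n - i, j), (2 ** n - 1) ** j)
                for i in range(1, n + 1)}
            for j in range(1, n + 1)}


def expected_avalanche(n, rounds):
    """E(W_r) for r = 1..rounds, starting from one active s-box in round 1."""
    p = transition_matrix(n)
    mean_weight = Fraction(n * 2 ** (n - 1), 2 ** n - 1)
    dist = {i: Fraction(int(i == 1)) for i in range(1, n + 1)}
    out = []
    for _ in range(rounds):
        out.append(float(mean_weight * sum(i * dist[i] for i in dist)))
        # One step of the Markov chain on the number of active s-boxes.
        dist = {i: sum(dist[j] * p[j][i] for j in dist) for i in dist}
    return out


print(expected_avalanche(8, 8))
```

Exact rational arithmetic is used for the matrix so that the row sums can be checked to be exactly 1.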
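7.4.2 Model B - Linear Transformation Type 1
The following lemma will be used throughout this section.
Lemma 7.2
Let $S = \bigoplus_{i=1}^{l} b_i$, $b_i \in Z_2$ with $P(b_i = 1) = p$, $0 < p < 1$, then the probability that $S = 0$ is given by
$$\Phi(l, p) = \frac{1 + (1 - 2p)^l}{2}. \qquad (7.22)$$
For $p = 0$ or $p = 1$, we have
$$\Phi(l, 0) = 1, \qquad \Phi(l, 1) = \begin{cases} 1, & l \text{ even}, \\ 0, & l \text{ odd}. \end{cases} \qquad (7.23)$$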
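The parity probability used in this lemma is the standard result $\Pr(b_1 \oplus \cdots \oplus b_l = 0) = (1 + (1 - 2p)^l)/2$ for independent bits with $\Pr(b_i = 1) = p$. The sketch below compares this closed form with a direct enumeration over all $2^l$ bit patterns; the function names are illustrative.

```python
from itertools import product


def prob_xor_zero(l, p):
    """Closed form for Pr(b_1 XOR ... XOR b_l = 0), i.i.d. bits with Pr(b = 1) = p."""
    return (1 + (1 - 2 * p) ** l) / 2


def prob_xor_zero_enumerated(l, p):
    """Same probability, obtained by summing over all 2^l bit patterns."""
    total = 0.0
    for bits in product((0, 1), repeat=l):
        if sum(bits) % 2 == 0:
            weight = 1.0
            for b in bits:
                weight *= p if b else 1 - p
            total += weight
    return total


for l in (1, 2, 5):
    for p in (0.1, 0.5, 128 / 255):
        print(l, p, prob_xor_zero(l, p), prob_xor_zero_enumerated(l, p))
```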
Lemma 7.3
Assume an SPN with a linear transformation type 1, then for $\Delta Q = 0$ (see equation (7.3)), the number of arrangements for the output bit changes such that we have $i$ active s-boxes in round $r$ given that we have $j$ active s-boxes in round $r - 1$ is given by
where $\Phi(\cdot, \cdot)$ is given by equations (7.22) and (7.23).
Proof: For every active s-box in round $r - 1$, all the input bit changes to the $l$ non-active s-boxes should be set to zero. The other output bit changes can be assigned arbitrarily such that $\Delta Q = 0$. Note that, in this case, $\Delta Q$ can be seen as the XOR of $j$ binary i.i.d. random variables, $b_k \in Z_2$, $1 \leq k \leq j$, with $P(b_k = 1) = \frac{2^{n-l-1}}{2^{n-l} - 1}$. Thus the rest of the output change bits can be assigned in $(2^{n-l} - 1)^j \, \Phi\left(j, \frac{2^{n-l-1}}{2^{n-l} - 1}\right)$ ways. Using the inclusion-exclusion principle, the number of ways to have exactly $i$ non-active s-boxes in round $r$ given that we have $j$ active s-boxes in round $r - 1$ is given by
The lemma follows by noting that $A_0(i, j) = N_{A_0}(M - i, j)$.
Lemma 7.4
Assume an SPN with a linear transformation type 1, then for $\Delta Q = 1$ (see equation (7.3)), the number of arrangements for the output bit changes such that we have $i$ active s-boxes in round $r$ given that we have $j$ active s-boxes in round $r - 1$ is given by
$$A_1(i, j) = \begin{cases} N_{A_1}(M - i, M), & j = M, \\ (2^n - 1)^j \left(1 - \Phi\left(j, \frac{2^{n-1}}{2^n - 1}\right)\right), & j \neq M,\ i = M, \\ 0, & \text{otherwise}, \end{cases} \qquad (7.26)$$
where
and
and $\Phi(\cdot, \cdot)$ is given by equations (7.22) and (7.23).
Proof: For $\Delta Q = 1$, we have $j = M$ for every $i \neq M$. For $j = M$, for all the $M$ s-boxes in round $r - 1$, all the output bit changes connected to the $l$ non-active s-boxes in round $r$ should be set to one. The other output bit changes can be assigned arbitrarily such that $\Delta Q = 1$. Note that, in this case, $\Delta Q$ can be seen as the XOR of $j$ binary i.i.d. random variables, $b_k \in Z_2$, $1 \leq k \leq j$, with $P(b_k = 1) = \frac{1}{2}$. Thus the rest of the output change bits can be assigned in $(2^{n-l})^j \, \Phi\left(j, \frac{1}{2}\right) = \frac{(2^{n-l})^j}{2}$ ways. The total number of bit change arrangements with $\Delta Q = 1$ is given by $(2^n - 1)^M \left(1 - \Phi\left(M, \frac{2^{n-1}}{2^n - 1}\right)\right)$. Using the inclusion-exclusion principle, the number of ways to have exactly $i$ non-active s-boxes in round $r$ given that we have $M$ active s-boxes in round $r - 1$ is given by
where
Thus the number of ways to have exactly $i$ active s-boxes in round $r$ given that we have $j$ active s-boxes in round $r - 1$ is given by the expression in (7.26), which proves the lemma.
Combining the results above, we have
$$p_{ji} = \frac{A_0(i, j) + A_1(i, j)}{(2^n - 1)^j}.$$
7.4.3 Model C - Linear Transformation Type 2
Lemma 7.5
Let $\Psi(n, k)$ be the number of choices of $k$ nonzero elements of $Z_2^n$ which sum to zero, then
$$\Psi(n, k) = (-1)^k + \sum_{r=0}^{k-1} (-1)^r \binom{k}{r} 2^{n(k-r-1)}.$$
Proof: Let $T(n, k, r)$ denote the number of ways to choose $k$ elements of $Z_2^n$ which sum to zero such that $r$ or more of them are zero. Thus
$$T(n, k, r) = \begin{cases} \binom{k}{r} 2^{n(k-r-1)}, & 0 \leq r \leq k - 1, \\ 1, & r = k. \end{cases}$$
Using the inclusion-exclusion principle, we have
$$\Psi(n, k) = \sum_{r=0}^{k} (-1)^r T(n, k, r)$$
which proves the lemma.
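The counting formula of Lemma 7.5 can be checked by exhaustive search for small parameters. The sketch below assumes the closed form $(-1)^k + \sum_{r=0}^{k-1} (-1)^r \binom{k}{r} 2^{n(k-r-1)}$ and interprets "choices" as ordered $k$-tuples of nonzero $n$-bit vectors; the function names are illustrative.

```python
from itertools import product
from math import comb


def psi(n, k):
    """Closed form: ordered k-tuples of nonzero n-bit vectors that XOR to zero."""
    return (-1) ** k + sum((-1) ** r * comb(k, r) * 2 ** (n * (k - r - 1))
                           for r in range(k))


def psi_brute_force(n, k):
    """Same count by enumerating every k-tuple of nonzero n-bit vectors."""
    count = 0
    for choice in product(range(1, 2 ** n), repeat=k):
        acc = 0
        for v in choice:
            acc ^= v
        count += acc == 0
    return count


for n in (1, 2, 3):
    for k in (1, 2, 3, 4):
        print(n, k, psi(n, k), psi_brute_force(n, k))
```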
Lemma 7.6
Assuming an SPN with a linear transformation type 2, then we have
where
and
Proof³: Since all choices for which $j$ s-boxes in round $r - 1$ are active are equally likely, we may assume the first $j$ s-boxes are active. Thus the last $(M - j)$ s-boxes in round $r$ will have the same input change vector. $(\delta(i = j) - \delta(i = M))\Psi(n, j)$ counts the number of ways this could give rise to the last $(M - j)$ s-boxes in round $r$ being non-active and makes sure that they are not counted towards the case where all the s-boxes in round $r$ are active. If the last $(M - j)$ s-boxes in round $r$ are active then our problem is basically reduced to considering the first $j$ s-boxes in round $r$, and we need to compute $p(i + j - M \mid j)$ for $M = j$. In this case we have $\binom{j}{M - i}$ ways to choose which $(M - i)$ of the remaining $j$ s-boxes in round $r$ should be non-active. The input change vectors to these s-boxes must be the same and we have $(2^n - 1)$ ways to choose which value they have. Call this value $w$. Since the solution does not depend on the specific value of $w \neq 0$, we assume that $w = (100 \ldots 0)$. The remaining $i + j - M$ input changes can not be 0 (by definition they are nonzero changes) or $w$ (because this will contradict the assumption that we have $i$ active s-boxes). So these input changes must have the form $(a\,b)$, $a \in Z_2$, $b \in Z_2^{n-1} \setminus \{0\}$. Moreover these nonzero $b$'s must add to 0 and the $a$'s must add to 1 over $Z_2$. If $i + j \neq M$ then we have
³ Two different proofs for this Lemma were provided by [132] and [51].
$\Psi(n - 1, i + j - M)$ choices for the $b$'s and $2^{i+j-M-1}$ choices for the $a$'s. If $i + j = M$ (this is only correct if $j$ is odd) then the sum of the $a$'s will always equal 1. Thus we have $2^{i+j-M-1} + \frac{1}{2}(-1)^{j-1}\delta(i + j = M)$ total choices for the $a$'s. Similarly, one can show that the number of arrangements such that all the s-boxes in round $r$ are active is given by $2^{j-1}(2^n - 1)\Psi(n - 1, j) + \Psi(n, j)\delta(j = M)$. The conditional probability is obtained by dividing by the total number of nonzero input vector changes, $(2^n - 1)^j$.
Lemma 7.7 [78]
The number of code words of weight $w$ in an $[n, k, d = n - k + 1]$ MDS code over $GF(q)$ is given by
Lemma 7.8
Assuming an SPN with a linear transformation type 3, then we have
where
and the weight distribution of the $[n, k, d]$ MDS code, as a function of $(n, d, w)$, is given by equation (7.39).
Proof: If the number of non-active s-boxes in round $r - 1$ is greater than or equal to $i$, then our problem corresponds to a $[2M - i, M - i, M + 1]$ MDS code. Let $A_r$ denote the number of active s-boxes in round $r$, then the number of arrangements for which we have $A_{r-1} + A_r = w$ and the number of non-active s-boxes in round $r - 1$ is greater than or equal to $i$ is given by
Using the inclusion-exclusion principle, the number of choices for which $A_{r-1} + A_r = w$ and the number of non-active s-boxes in round $r - 1$ is exactly equal to $i$ is given by
Thus we have
$$p_{ji} = \frac{Q(M - j, i + j)}{(2^n - 1)^j \binom{M}{j}}$$
which proves the lemma.
7.5 Results and Discussion
For N = 64 and M = n = 8 (a practical size of SPN) the transition matrices for SPNs based on the four models are shown in Figure 7.3 and Figure 7.4. It is easy to check that these matrices are ergodic. The limiting distribution and the eigenvalues of these matrices are shown in Figure 7.5.
Let
i.e., $\epsilon$ denotes the deviation of the limiting avalanche characteristics from the ideal characteristics. By numerical substitution with n = M = 8, and the limiting distributions in Figure 7.5, we get
$\epsilon =$
for the four models above. If the SPN is constructed using one randomly selected 64 x 64 s-box then, using equation (7.15), we have
N AM n/2 -- 2 l - Sen 1 in(;)
i=l < r4' (7.46)
Table 7.1 shows the value of the second largest eigenvalue for the four models.
Table 7.1 The Second Largest Eigenvalues for SPNs with N = 64, and M = n = 8.
From Table 7.1, it is clear that the transition matrices for SPNs with a permutation layer have the slowest convergence (largest second eigenvalue), and the transition matrices for SPNs with an MDS linear transformation layer have the fastest convergence (smallest second eigenvalue). It is also clear that the linear transformation is effective in improving the avalanche characteristics of SPNs. This is also illustrated by noting that the transition matrices of SPNs with linear transformations tend to have more zeroes in the places corresponding to transitions from a small number of active s-boxes to a small number of active s-boxes. This implies that for a small number of rounds, on average, SPNs with linear transformations tend to have more active s-boxes than SPNs with permutation layers. Figure 7.2 shows how the experimental results agree with our theoretical model for SPNs with N = 64, n = M = 8.
Figure 7.2 Theoretical and Experimental Avalanche for SPN with N = 64, n = M = 8 (curves: theory and experiment for Models A, B, C and D; horizontal axis: Number of Rounds, 1 to 8)
Second largest eigenvalue (Table 7.1):
Model A    3.185 x 10^{-?}
Model B    4.139 x 10^{-?}
Model C    3.922 x 10^{-?}
Model D    2.07 x 10^{?}
One should note that the limiting distribution is the same for all the models above. This is a nice illustration of the product cipher design philosophy: a strong block cipher can be obtained by iterating a weaker one for a sufficient number of rounds.
While the s-box model used throughout this chapter is suitable for studying the avalanche characteristics of SPNs, it can not be used directly to determine the resistance of the network to differential cryptanalysis, where the analysis should be performed on a specified set of s-boxes. However, from the transition matrix, we can calculate the minimum number of s-boxes involved in any 2 rounds of a differential characteristic. This number will be greater than or equal to $\min_{i, j :\, p_{ji} \neq 0} (i + j)$.
This model also helps in visualizing the average dynamic performance of SPNs with randomly selected s-boxes and different linear transformation layers.
In summary, we have presented analytical models for the avalanche characteristics of four general classes of substitution-permutation encryption networks. The results show that using an appropriate diffusive linear transformation between rounds can improve the avalanche characteristics of the network. This facilitates the construction of efficient ciphers with fewer rounds.
Chapter 8 Conclusion
Although some researchers feel that the design of secure and efficient block ciphers has a solid theoretical foundation, we feel that the work is not yet complete. In this thesis, we have provided many results that enhance our understanding of how to achieve this objective.
8.1 Contributions of the Thesis
Our work has contributed to cryptography and cryptanalysis in several ways. In particular we have
derived expressions for the expected size of the maximum XOR table entry and the maximum Linear Approximation Table entry for some combinatorial structures of interest such as regular mappings, and injective mappings.
developed an expression to estimate the size of the largest entry in the LAT of a randomly selected injective s-box. In particular we showed that the nonlinearity of a randomly selected 8 x 32 s-box (the size of CAST s-boxes) is about 72.
presented a new construction method for highly nonlinear injective s-boxes and showed how the resistance of CAST-like encryption algorithms (based on randomly selected substitution boxes) to the basic linear cryptanalysis was underestimated.
extended the definitions of many cryptographic criteria to multi-output boolean functions.
showed the relationship between the Walsh-Hadamard transform and various types of information leakage.
introduced a new class of Substitution Permutation Networks (SPNs) with the advantage that the same network can be used to perform both the encryption and the decryption operations, and examined different cryptographic properties of this class such as resistance to both linear and differential cryptanalysis.
showed how to construct MDS linear transformations with the involution property.
developed an analytical model of the avalanche criterion for SPNs with different linear transformation layers.
showed that the private key cryptosystem proposed at Crypto '90 by Koyama and Terada is affine over GF(2) and hence it is completely insecure (see appendix A).
proved a conjecture by Cusick regarding the number of functions satisfying the Strict Avalanche Criterion (see appendix C).
Most of the contributions above were published in [154], [145], [144], [146], [149], [153], [150], [142], [147], [148], [141], [152], [151] and [143].
8.2 Future Work
As we mentioned before, we feel that design and analysis of efficient and secure block ciphers is far from being a complete task. In this section we present several directions for future research.
Investigation of the non s-box based approach in the design of block ciphers. This approach (e.g., TEA and RC5) is attractive since it has the minimum memory requirements, which makes it suitable for applications with limited memory such as smart cards and wireless applications. Although IDEA can be considered as belonging to this class of block ciphers, we feel that the data dependent rotation used in RC5, because of its simplicity compared to complex operations such as modular multiplication, is a more elegant operation. The security of this simple operation should be explored further and more operations might be investigated.
Finding s-boxes with high diffusion order. The first step is to find a theoretical bound on the diffusion order. Algebraic constructions or clever search techniques, such as Genetic Algorithm or simulated annealing, can be used.
Extending the results in this thesis to higher order differentials and truncated differentials.
Examining the pseudo-randomness properties of SPNs is still an untouched area. A complexity theoretic approach similar to Luby-Rackoff results using DES-like ciphers can be investigated.
Examining possible modifications of the CAST s-boxes such that the resulting round function is bijective. This will make the CAST algorithm immune to the attack described by Rijmen et al. Although we are aware of many techniques to achieve this, all of them result in some degradation of the nonlinearity and the differential uniformity of the round function.
Investigation of efficient algorithms to calculate the nonlinearity of the resulting CAST 32 x 32 s-box when the injective s-boxes are combined by modular addition or modular subtraction.
Deriving upper bounds for the differential probabilities and linear correlations for the different SPNs studied in this thesis.
Appendix A Cryptanalysis of the "Nonlinear Parity Circuits" Proposed at Crypto '90
At Crypto '90, K. Koyama and R. Terada from the Japanese NTT Research Laboratory
proposed a family of cryptographic functions for application to symmetric block ciphers
[71]. Here, we show that this family of circuits is affine over GF(2). More explicitly,
for any specific key (key), the ciphertext y is related to the plaintext x by the simple
affine relation $y = M_{key} x \oplus d_{key}$, where $M_{key}$ is an n x n nonsingular binary matrix
and $d_{key}$ is an n x 1 binary vector, where n is the block length of the cipher. This
renders this family of ciphers completely insecure, as it can be broken with only n + 1
linearly independent plaintext blocks and their corresponding ciphertext blocks.
Definitions and Cipher Description:
Koyama and Terada described their proposed cipher as follows:
Definition 1: A parity circuit layer of length n, or simply an L(n) circuit layer, is a
boolean device with an n-bit input and an n-bit output, characterized by a key that is
a sequence of n symbols from $\{0, 1, -, +\}$.
Definition 2: Function $b = f(k, a)$, as computed by an L(n) circuit layer with
key $k = k_1 k_2 \cdots k_n \in \{0, 1, -, +\}^n$, is the relation from an n-bit input sequence
$a = a_1 a_2 \cdots a_n \in \{0, 1\}^n$ to an n-bit output sequence $b = b_1 b_2 \cdots b_n \in \{0, 1\}^n$ defined
below. An L(n) circuit layer first computes a variable $T \in \{0, 1\}$ such that:
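[equation not recoverable from the source]
where
[equation not recoverable from the source]
Output $b = b_1 b_2 \cdots b_n$ of the circuit layer is then
[equation not recoverable from the source; the surviving fragments distinguish the cases $k_i = -$, $T = 1$, $T = 0$, and "otherwise"]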
The proposed cipher is obtained by composing the parity circuit layers as follows:
Definition 3: A parity circuit of width n and depth d, or simply a C(n, d) circuit, is a
matrix of d L(n) circuit layers with keys denoted by $key = k_1 \| k_2 \| \cdots \| k_d$ for which
the n output bits of the (i - 1)-th circuit layer are the n input bits for the i-th circuit
layer for $2 \le i \le d$. The key for the C(n, d) circuit is a d x n matrix whose d lines
contain the circuit layer keys.
An example of C(n, d) with n = 10 and d = 3 is shown in Table A.1.
Table A.1 A parity circuit C(n, d) with n = 10 and d = 3
Main Result:
The lemma below follows from the fact that the set of affine functions is closed under
the XOR operation.
Lemma: Any combinational boolean function that can be realized using only XOR gates is an
affine function over GF(2).
Koyama and Terada presented many cryptographic properties of these C(n, d) circuits.
They also claimed, without proof, that the order of the boolean canonical form of C(n, d),
defined to be the maximum of the order of its product terms, increases exponentially as
n or d increases, and hence it is practically infeasible to cryptanalyze C(n, d) if $n \ge 64$
and $d \ge 8$. The following theorem shows that this is not true.
Theorem: The output ciphertext, y, of the parity circuit C(n, d) is related to the plaintext
input, x, by the following affine relation:
$$y = M_{key} x \oplus d_{key},$$
where $M_{key}$ is an n x n nonsingular binary matrix and $d_{key}$ is an n x 1 binary vector.
Proof: Equations (A.2) and (A.3) can be expressed as
[equations not recoverable from the source]
Hence, for any fixed key, $key = k_1 \| k_2 \| \cdots \| k_d$, equations (A.1), (A.2), and (A.3), and
consequently the parity circuit C(n, d), can be realized using only XOR gates. Hence
every output component of the parity circuit is an affine function of the input variables
over GF(2). Since C(n, d) is a bijective mapping, the matrix $M_{key}$ is
nonsingular. □
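The practical consequence of the theorem can be sketched in a few lines of Python. The sketch below does not implement the Koyama-Terada layers; `random_affine_cipher` is a hypothetical stand-in that behaves like any map of the form $y = M x \oplus d$ with M nonsingular, which is all the theorem asserts about C(n, d). For simplicity it uses the chosen-plaintext special case of the n + 1 plaintexts (the all-zero block and the n unit vectors); all function names are illustrative assumptions.

```python
import random

def rank_gf2(vectors):
    """Rank over GF(2) of a list of bit-vectors stored as Python ints."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clears b's leading bit from v if it is set
        if v:
            basis.append(v)
    return len(basis)

def random_affine_cipher(n, rng):
    """Hypothetical stand-in for a keyed C(n, d): x -> M x XOR d, M nonsingular."""
    while True:
        cols = [rng.getrandbits(n) for _ in range(n)]  # column j of M as an int
        if rank_gf2(cols) == n:
            break
    d = rng.getrandbits(n)

    def encrypt(x):
        y = d
        for j in range(n):
            if (x >> j) & 1:
                y ^= cols[j]
        return y

    return encrypt

def recover_affine_map(encrypt, n):
    """Recover (columns of M, d) from n + 1 queries: 0 and the n unit vectors."""
    d = encrypt(0)
    cols = [encrypt(1 << j) ^ d for j in range(n)]
    return cols, d

def predict(cols, d, x):
    """Evaluate the recovered affine map on plaintext x."""
    y = d
    for j, c in enumerate(cols):
        if (x >> j) & 1:
            y ^= c
    return y

if __name__ == "__main__":
    rng = random.Random(1997)
    n = 64
    enc = random_affine_cipher(n, rng)
    cols, d = recover_affine_map(enc, n)
    trials = [rng.getrandbits(n) for _ in range(1000)]
    print(all(predict(cols, d, x) == enc(x) for x in trials))
```

With known rather than chosen plaintexts, the same recovery would amount to solving a linear system over GF(2) instead of reading the columns off directly.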
Example: For the cipher described by Table A.1, the ciphertext, y, in relation to the
plaintext input, x, is given by the following equation:
[equation not recoverable from the source]
Noting that affine functions cannot satisfy the Strict Avalanche Criterion (SAC),
the C(n, d) circuits can never satisfy the SAC as was falsely claimed.
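The step from "affine" to "cannot satisfy the SAC" can be made explicit with a short computation. The following is a sketch under the assumption that the SAC for an n-input boolean function f is read as: complementing any single input bit changes the output for exactly half of the inputs. The symbols c and b are introduced here only to name the coefficients of a generic affine output component.

```latex
% Generic affine output component: f(x) = c \cdot x \oplus b, with c \in Z_2^n, b \in Z_2.
% For any e of Hamming weight 1:
\begin{align*}
f(x) \oplus f(x \oplus e) &= (c \cdot x \oplus b) \oplus (c \cdot (x \oplus e) \oplus b) = c \cdot e, \\
\sum_{x \in Z_2^n} \bigl( f(x) \oplus f(x \oplus e) \bigr) &= 2^n \, (c \cdot e) \in \{0,\; 2^n\} \neq 2^{n-1}.
\end{align*}
```

The output therefore either never changes or always changes when a single input bit is flipped, which is the opposite of the half-and-half behaviour the SAC requires.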
Conclusion
The private key cryptosystem proposed at Crypto '90 by Koyama and Terada is affine
over GF(2) and hence it is completely insecure.
Appendix B Cryptanalysis of "key agreement scheme based on generalized inverses of matrices"
Dawson and Wu [36] proposed a new key agreement scheme based on generalized
inverses of matrices [10]. They suggested that for a key length of kn, using binary
matrices of size k x m, k < m, and m x n, the security parameter is at least $2^{(m-k)n}$.
Here, we show that this scheme is insecure. In particular, we show that breaking this
scheme is equivalent to solving a set of m x n consistent linear equations with $m^2$
binary variables.
Definition: A matrix $A^-$ is a generalized inverse of a matrix A if and only if
$$A A^- A = A.$$
$A^-$ is given by
$$A^- = C \begin{pmatrix} I_r & U \\ V & W \end{pmatrix} R,$$
where r is the rank of A, $I_r$ is the identity matrix of order r, R and C are such that
$$R A C = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix},$$
and U, V and W are arbitrary matrices of the proper order. A generalized inverse which
satisfies $A A^- A = A$ and $A^- A A^- = A^-$ is called a reflexive generalized inverse of
A. In this case, we must have W = VU. Note that a generalized inverse defined by
equation (B.1) always exists. Our discussion will be on the binary field $F_2 = \{0, 1\}$.
The key agreement scheme proposed in [36] is described by the authors as follows:
Assume that a user, Alice, wants to establish a common secret key with Bob via a public
channel. They can follow the steps below:
(i) Alice randomly chooses a binary k x m matrix A and an arbitrary generalized inverse
$A^-$ of A.
(ii) Alice sends Bob $A^- A$. Alice keeps A and $A^-$ secret.
(iii) Bob randomly chooses a binary m x n matrix B and an arbitrary generalized inverse
$B^-$ of B.
(iv) Bob sends Alice the matrices $A^- A B$ and $A^- A B B^-$. Bob keeps B and $B^-$ secret.
(v) Alice sends Bob the matrix $A A^- A B B^- = A B B^-$.
(vi) Alice and Bob are then able to formulate the key K = AB, which is a k x n matrix,
by computing $A A^- A B = AB$ and $A B B^- B = AB$, respectively.
Pinch [103] proposed two attacks on the above scheme. The first attack reduces the
security parameter to 2') and the second attack, which applies to about 28% of the
cases, reduces the security parameter to 2(m-k%23. Here we show that breaking this
scheme (i.e., obtaining AB from the public information) is equivalent to solving a set of
consistent m x n linear equations with $m^2$ binary variables.
The following lemma will be used in the attack.
Lemma B.2
There exists at least one matrix Y that satisfies the equation
$$X B B^- Y X B = X B.$$
Proof: Take $Y = B (XB)^-$; then
$$X B B^- Y X B = X (B B^- B)(XB)^- X B = X B (XB)^- X B = X B.$$
The security argument given in [36] was based on the difficulty of solving for A and B
given the public information $A^- A$, $A^- A B$, $A^- A B B^-$ and $A B B^-$. Here we show that
we can determine the product AB without solving separately for A or B.
The Proposed Attack:
Let $X = A^- A$, then solve the linear equation
$$(X B B^-)\, Y\, (X B) = (X B) \qquad (B.6)$$
for the m x m matrix Y. From the above lemma, this equation has at least one valid
solution. Note that we need to solve this equation (i.e., we cannot directly use the
solution given in the lemma above, because B is not known).
Multiplying both sides of equation (B.6) by A (and noting that $A X = A A^- A = A$),
we get
$$(A B B^-)\, Y\, (X B) = A B. \qquad (B.7)$$
Both $A B B^-$ and $X B = A^- A B$ are sent over the public channel. Hence we can
determine the secret key.
For (k, m, n) = (7, 12, 15), the security parameter suggested in [36] is $2^{75}$. The attack
described in this letter gets the key by solving a set of 180 consistent linear equations
with 144 binary variables.
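The protocol steps and the attack of equations (B.6) and (B.7) can be exercised end to end with a small GF(2) linear solver. The following Python sketch is an illustration under stated assumptions, not the authors' implementation: matrices are lists of 0/1 rows, `solve_sandwich` is a hypothetical helper that returns one solution of P Y Q = T (free variables set to zero), and the generalized inverses are obtained as one particular solution of A G A = A rather than drawn at random from all of them.

```python
import random

def matmul(A, B):
    """Product of two 0/1 matrices (lists of rows) over GF(2)."""
    return [[sum(a & b for a, b in zip(row, col)) & 1 for col in zip(*B)]
            for row in A]

def solve_sandwich(P, Q, T):
    """Return one matrix Y over GF(2) with P*Y*Q == T, or None if none exists."""
    rows_y, cols_y = len(P[0]), len(Q)
    nvars = rows_y * cols_y
    basis = []  # fully reduced rows: (pivot column, bitmask incl. right-hand side)
    for i in range(len(P)):
        for j in range(len(Q[0])):
            # One linear equation per entry (i, j) of P*Y*Q = T.
            eq = T[i][j] << nvars
            for a in range(rows_y):
                for b in range(cols_y):
                    if P[i][a] & Q[b][j]:
                        eq |= 1 << (a * cols_y + b)
            for col, row in basis:
                if (eq >> col) & 1:
                    eq ^= row
            lhs = eq & ((1 << nvars) - 1)
            if lhs == 0:
                if eq:  # the equation reduced to 0 = 1
                    return None
                continue
            pivot = (lhs & -lhs).bit_length() - 1
            basis = [(c, r ^ eq if (r >> pivot) & 1 else r) for c, r in basis]
            basis.append((pivot, eq))
    y = [0] * nvars  # free variables are set to 0
    for col, row in basis:
        y[col] = row >> nvars
    return [y[a * cols_y:(a + 1) * cols_y] for a in range(rows_y)]

def generalized_inverse(A):
    """One generalized inverse G of A over GF(2): a solution of A*G*A == A."""
    return solve_sandwich(A, A, A)

def run_protocol_and_attack(k, m, n, seed=0):
    """Return (key recovered by the eavesdropper, true key AB)."""
    rng = random.Random(seed)
    A = [[rng.randint(0, 1) for _ in range(m)] for _ in range(k)]
    B = [[rng.randint(0, 1) for _ in range(n)] for _ in range(m)]
    A_g, B_g = generalized_inverse(A), generalized_inverse(B)
    # Public transcript of steps (ii), (iv) and (v).
    X = matmul(A_g, A)
    XB = matmul(X, B)
    XBBg = matmul(XB, B_g)
    ABBg = matmul(A, XBBg)
    # Eavesdropper: solve (B.6) for Y, then apply (B.7).
    Y = solve_sandwich(XBBg, XB, XB)
    recovered = matmul(matmul(ABBg, Y), XB)
    return recovered, matmul(A, B)

if __name__ == "__main__":
    recovered, key = run_protocol_and_attack(7, 12, 15, seed=1)
    print(recovered == key)
```

For (k, m, n) = (7, 12, 15) the call to `solve_sandwich` inside the attack handles exactly the 180 equations in 144 unknowns mentioned above; any solution Y works, so the solver does not need to enumerate them.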
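Example:
Take (k, m, n) = (3, 4, 5). Let
[matrices not recoverable from the source]
Then we have
[equation (B.10) not recoverable from the source]
Solving equation (B.6) for Y we get 1096 distinct valid solutions. Here we give just
two of them
[matrices not recoverable from the source]
It can be verified that
[equations (B.11) and (B.12) not recoverable from the source]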
Conclusion
The key agreement scheme proposed by Dawson and Wu is insecure.
Appendix C Bounds on the Number of Functions Satisfying the Strict Avalanche Criterion
The Strict Avalanche Criterion (SAC) was introduced by Webster and Tavares [136]
in a study of design criteria for certain cryptographic functions. As is the case with
any criterion of cryptographic significance, it is of interest to count the functions which
satisfy the criterion. Many recent papers (for example [76], [32]) have been concerned
with counting functions that satisfy the SAC of various orders. It is easier to count the
functions satisfying the SAC of the largest order, because relatively few functions exist
which satisfy these stringent criteria.
O'Connor [95] gave an upper bound for the number of functions satisfying the SAC.
In [31] Cusick gave a lower bound for the number of functions satisfying the SAC. He
also gave a conjecture that provided an improvement of the lower bound. In this section,
we give a constructive proof for this conjecture. Moreover, we provide an improved
lower bound.
Notation
Throughout this section, let
$f_n : Z_2^n \to Z_2$: denotes a boolean function with n input variables.
$\{v_i \mid 0 \le i \le 2^n - 1\}$: denotes the set of vectors in $Z_2^n$ in lexicographical order.
A boolean function $f_n(x)$ is specified by $f_n(x) = [b_0, b_1, \ldots, b_{2^n-1}]$, where $b_i = f_n(v_i)$.
e: denotes any element of $Z_2^n$ with Hamming weight 1. Let $\bar{e}, \bar{v}_i \in Z_2^{n-1}$ denote the
n - 1 least significant bits of e and $v_i$ respectively.
a: denotes any element of $Z_2^{n-1}$ with odd Hamming weight.
$g_{n-1} : Z_2^{n-1} \to Z_2$: denotes the boolean function $\mathbf{1} \cdot x \oplus b$, $b \in Z_2$, where $\mathbf{1}$ denotes
the all-ones vector in $Z_2^{n-1}$, $\cdot$ denotes the dot product operation over $Z_2$ and $\oplus$ denotes
the XOR operation. It is easy to see that $g_{n-1}$ satisfies
$$g_{n-1}(x) \oplus g_{n-1}(x \oplus a) = 1. \qquad (C.1)$$
MSB(.): denotes the most significant bit of the enclosed argument.
$SAC_n$: denotes the number of functions with n input bits that satisfy the SAC.
Cusick's Conjecture
The following conjecture is given in [31]. This conjecture implies that there are at least $2^{2^{n-1}}$ boolean functions of n variables which satisfy the SAC.
Conjecture [31]: Given any choice of the values $f_n(v_i)$, $0 \le i \le 2^{n-1} - 1$, there
exists a choice of $f_n(v_i)$, $2^{n-1} \le i \le 2^n - 1$, such that the resulting function $f_n(x)$
satisfies the SAC.
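For small n the conjecture can be checked by exhaustive search. The Python sketch below assumes the usual reading of the SAC (flipping any single input bit changes the output for exactly half of the inputs) and indexes truth tables so that the first half corresponds to the most significant input bit being 0; the function names are illustrative.

```python
from itertools import product

def satisfies_sac(table, n):
    """True if the truth table (sequence of 2**n bits) satisfies the SAC."""
    size = 1 << n
    for bit in range(n):
        e = 1 << bit
        flips = sum(table[x] ^ table[x ^ e] for x in range(size))
        if flips != size // 2:
            return False
    return True

def check_conjecture(n):
    """Return (conjecture holds for n, number of SAC functions of n variables)."""
    half = 1 << (n - 1)
    total, holds = 0, True
    for first in product((0, 1), repeat=half):
        # Count the second halves that complete this first half to a SAC function.
        completions = sum(
            satisfies_sac(first + second, n)
            for second in product((0, 1), repeat=half)
        )
        total += completions
        holds = holds and completions > 0
    return holds, total

if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, check_conjecture(n))
```

The second component of the result is the exact number of SAC functions, which can be compared with the "Exact Number" column of Table C.1 for the same n.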
We prove this conjecture below. After completing our proof, we learned that Cusick and
Stanica [33] independently proved the conjecture. Also, D. Biss [19] has proved a much
stronger result by a much more complicated argument. If we let $L_n = \frac{\log_2 SAC_n}{2^n}$ and
$L = \lim_{n \to \infty} L_n$, then Biss proved that L = 1. The conjecture, of course, only says that $L_n \ge \frac{1}{2}$.
For n = 1, it is trivial to show that if $f_1(1) = f_1(0) \oplus 1$ then the resulting function
satisfies the SAC. In the following lemma we prove that, for $n \ge 2$, there exist at least two
choices for $f_n(v_i)$, $2^{n-1} \le i \le 2^n - 1$, such that the resulting function satisfies the SAC.
Lemma C.1
Let $f_n = [h_{n-1} \; [h_{n-1} \oplus g_{n-1}]]$ where $h_{n-1}$ is an arbitrary boolean function with n - 1
input variables, $n \ge 2$, and $g_{n-1}$ is constructed as above to satisfy equation (C.1), then
$f_n$ satisfies the SAC.
Proof: Case 1: MSB(e) = 0:
[equations not recoverable from the source]
Case 2: MSB(e) = 1:
[equations not recoverable from the source]
which proves the lemma. □
From lemma C.1 above, and by noting that we have two choices for $g_{n-1}$, we conclude
that, for $n \ge 2$, the number of functions satisfying the SAC is lower bounded by $2^{2^{n-1}+1}$.
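The two cases in the proof of lemma C.1 can be sketched as follows. This is a reconstruction from the lemma's own definitions rather than the original derivation, and it assumes the SAC is expressed as $\sum_{i=0}^{2^n-1} f_n(v_i) \oplus f_n(v_i \oplus e) = 2^{n-1}$ for every e of Hamming weight 1. It uses only the fact that $g_{n-1}(x) = \mathbf{1} \cdot x \oplus b$ changes value whenever an odd number of input bits are flipped and is balanced for $n \ge 2$; the symbol S is introduced here for the first partial sum.

```latex
% Case 1: MSB(e) = 0, so \bar{e} has Hamming weight 1 and
%         g_{n-1}(x) \oplus g_{n-1}(x \oplus \bar{e}) = 1.
\begin{align*}
\sum_{i=0}^{2^n-1} f_n(v_i) \oplus f_n(v_i \oplus e)
  &= \underbrace{\sum_{x \in Z_2^{n-1}} h_{n-1}(x) \oplus h_{n-1}(x \oplus \bar{e})}_{S}
   + \sum_{x \in Z_2^{n-1}} h_{n-1}(x) \oplus h_{n-1}(x \oplus \bar{e}) \oplus 1 \\
  &= S + (2^{n-1} - S) = 2^{n-1}.
\end{align*}
% Case 2: MSB(e) = 1, so v_i and v_i \oplus e lie in opposite halves of the truth table.
\begin{align*}
\sum_{i=0}^{2^n-1} f_n(v_i) \oplus f_n(v_i \oplus e)
  &= 2 \sum_{x \in Z_2^{n-1}} h_{n-1}(x) \oplus h_{n-1}(x) \oplus g_{n-1}(x)
   = 2 \sum_{x \in Z_2^{n-1}} g_{n-1}(x) = 2 \cdot 2^{n-2} = 2^{n-1}.
\end{align*}
```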
Using the following lemma, one can provide some improvement to the above bound.
Lemma C.2
Let $f_n = [h_{n-1} \; [l_{n-1} \oplus g_{n-1}]]$ where $h_{n-1}$ is an arbitrary boolean function with n - 1
input variables, $l_{n-1}(x) = h_{n-1}(x \oplus a)$, $n \ge 2$, and $g_{n-1}$ is constructed as above to
satisfy equation (C.1), then $f_n$ satisfies the SAC.
Proof: Case 1: MSB(e) = 0:
[equations not recoverable from the source]
which proves the lemma.
Note that if the function $h_{n-1}$ does not have any linear structures, then all the functions
generated by $l_{n-1} \oplus g_{n-1}$ will be unique for all the choices of a. From lemma
C.1 and lemma C.2 we have $2^{n-1} + 2$ distinct choices for $f_n(v_i)$, $2^{n-1} \le i \le 2^n - 1$.
Thus we have the following corollary:
Corollary C.1
The number of functions satisfying the SAC is lower bounded by
$$\left(2^{2^{n-1}} - LS^{n-1}\right)\left(2^{n-1} + 2\right) + 2\, LS^{n-1}, \qquad (C.6)$$
where $LS^{n-1}$ is the number of functions with n - 1 input bits having any linear structure.
A complicated formula for $LS^n$ is given in [100]. It can also be shown [100] that $LS^n$
is asymptotic to $(2^n - 1)\, 2^{2^{n-1}+1}$.
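The counting argument behind the corollary can be checked numerically for small n. The Python sketch below makes two assumptions that should be read as such: a function is taken to have a linear structure when $f(x) \oplus f(x \oplus a)$ is constant for some nonzero a, and the bound is evaluated as $(2^{2^{n-1}} - LS^{n-1})(2^{n-1} + 2) + 2\,LS^{n-1}$, i.e. $2^{n-1} + 2$ second halves for each first half without linear structures and two otherwise. All function names are illustrative.

```python
from itertools import product

def satisfies_sac(table, n):
    """True if flipping any single input bit changes the output for half the inputs."""
    size = 1 << n
    return all(
        sum(table[x] ^ table[x ^ (1 << bit)] for x in range(size)) == size // 2
        for bit in range(n)
    )

def has_linear_structure(table, m):
    """True if f(x) XOR f(x XOR a) is constant for some nonzero a (m input bits)."""
    size = 1 << m
    for a in range(1, size):
        first = table[0] ^ table[a]
        if all(table[x] ^ table[x ^ a] == first for x in range(size)):
            return True
    return False

def count_ls(m):
    """Number of m-variable boolean functions having any linear structure."""
    return sum(has_linear_structure(t, m) for t in product((0, 1), repeat=1 << m))

def second_halves(h, n):
    """Distinct second halves given by lemmas C.1 and C.2 for the first half h."""
    size = 1 << (n - 1)
    parity = [bin(x).count("1") & 1 for x in range(size)]
    shifts = [0] + [a for a in range(size) if parity[a]]  # a = 0 is lemma C.1
    return {tuple(h[x ^ a] ^ parity[x] ^ b for x in range(size))
            for a in shifts for b in (0, 1)}

def bound_c6(n):
    """Lower bound on the number of SAC functions, under the reading stated above."""
    ls = count_ls(n - 1)
    return (2 ** (2 ** (n - 1)) - ls) * (2 ** (n - 1) + 2) + 2 * ls

if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, count_ls(n - 1), bound_c6(n))
```

Running `second_halves` over all first halves also confirms, for these sizes, that every generated completion satisfies the SAC and that the $2^{n-1} + 2$ completions are distinct whenever the first half has no linear structure.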
One should note that while this bound provides some improvement over the proved bound
in [31], exhaustive search (see Table C.1) shows that the quality of this bound degrades
as n increases. One can improve this bound slightly by identifying special classes of
functions $f_n(v_i)$, $0 \le i \le 2^{n-1} - 1$, for which there is a large number of choices for $f_n(v_i)$,
$2^{n-1} \le i \le 2^n - 1$, such that the resulting function, $f_n$, satisfies the SAC. For example,
if the function $h_{n-1}$ satisfies the SAC, then the function $f_n = [h_{n-1} \; [h_{n-1} \oplus c \cdot x \oplus b]]$,
$b \in Z_2$, also satisfies the SAC. Thus our bound is slightly improved to
[expression (C.7) not recoverable from the source]
We now give a lower bound on the number of balanced functions that satisfy the SAC.
Lemma C.3
Let $f_n = [h_{n-1} \; [l_{n-1} \oplus g_{n-1}]]$ where $h_{n-1}$ is an arbitrary boolean function with n - 1
input variables that satisfies $\sum_{wt(v_i)\ \mathrm{odd}} h_{n-1}(v_i) = 2^{n-3}$, $l_{n-1}(x) = h_{n-1}(x \oplus a)$, $n \ge 2$, and
$g_{n-1}$ is constructed as above to satisfy equation (C.1), then $f_n$ is a balanced function
that satisfies the SAC.
Proof: From lemma C.2, it follows that $f_n$ satisfies the SAC. Here we will prove that
$f_n$ is a balanced function.
[equations not recoverable from the source; the surviving fragment is a sum over the $v_i$ with $wt(v_i)$ odd]
which proves the lemma.
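The balancedness claim of lemma C.3 can be tested exhaustively for small n. The following Python sketch rests on two assumptions that are not confirmed by the surviving text: the condition on $h_{n-1}$ is read as "exactly $2^{n-3}$ ones among the inputs of odd Hamming weight", and $g_{n-1}$ is taken with b = 0, so that $g_{n-1}(x)$ is the parity of x. Under that reading the weight of $f_n$ works out to twice the number of ones of $h_{n-1}$ on odd-weight inputs plus $2^{n-2}$, which equals $2^{n-1}$ exactly when the condition holds.

```python
from itertools import product

def satisfies_sac(table, n):
    """True if flipping any single input bit changes the output for half the inputs."""
    size = 1 << n
    return all(
        sum(table[x] ^ table[x ^ (1 << bit)] for x in range(size)) == size // 2
        for bit in range(n)
    )

def balanced_sac_construction(h, a, n):
    """f_n = [h | l XOR g] with l(x) = h(x XOR a) and g(x) = parity of x (b = 0 assumed)."""
    size = 1 << (n - 1)
    return list(h) + [h[x ^ a] ^ (bin(x).count("1") & 1) for x in range(size)]

def check_lemma_c3(n):
    """Exhaustively test the construction for all admissible h and all odd-weight a."""
    size = 1 << (n - 1)
    odd = [x for x in range(size) if bin(x).count("1") & 1]
    for h in product((0, 1), repeat=size):
        # Assumed condition: 2**(n-3) ones of h on the odd-weight inputs.
        if sum(h[x] for x in odd) != size // 4:
            continue
        for a in odd:
            f = balanced_sac_construction(h, a, n)
            if sum(f) != size or not satisfies_sac(f, n):
                return False
    return True

if __name__ == "__main__":
    print(check_lemma_c3(3), check_lemma_c3(4))
```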
Table C.1 Exact Number of Functions Satisfying SAC versus the Derived Lower Bounds.
n   LS^{n-1}   Old Bound [31]   New Bound (exp. (C.6))   New Bound (exp. (C.7))   Exact Number
2   4          2                8                        8                        8
3   8          4                64                       64                       64
4   128        16               1536                     1920                     4128
5   4992       256              1099776                  1157568                  27522560
Similarly, one can also show that the function $f_n = [h_{n-1} \; [h_{n-1} \oplus g_{n-1}]]$, where $h_{n-1}$
is an arbitrary boolean function that satisfies $\sum_{wt(v_i)\ \mathrm{even}} h_{n-1}(v_i) = 2^{n-3}$, is a balanced
function that satisfies the SAC. From the lemma above, it follows that the number of
balanced SAC functions is lower bounded by
[expression not recoverable from the source]
References
[1] FIPS PUB 185. Escrowed encryption standard. Dept. of Commerce, Feb. 1994.
[2] C.M. Adams. Constructing symmetric ciphers using the CAST design procedure.
Designs, Codes, and Cryptography, Vol. 12, No. 3, pp. 283-316, 1997.
[3] C.M. Adams. A Formal and Practical Design Procedure for Substitution-Permutation
Network Cryptosystems. PhD thesis, Queen's University, Kingston, Ontario, Canada,
September, 1990.
[4] C.M. Adams and S.E. Tavares. The use of bent sequences to achieve higher-order
Strict Avalanche Criterion in s-box design. Technical report, TR 90-013, Dept. of
Electrical Engineering, Queen's University, Kingston, Ontario, 1990.
[5] C.M. Adams and S.E. Tavares. Designing s-boxes for ciphers resistant to differential
cryptanalysis. Proceedings of 3rd Symposium on the State and Progress of Research
in Cryptography, pp. 181-190, Rome, Italy, 1993.
[6] N. Ahmed and K.R. Rao. Orthogonal Transforms for Digital Signal Processing.
Springer-Verlag, New York, 1975.
[7] D. Andelman and J. Reeds. On the cryptanalysis of rotor and substitution-permutation
networks. IEEE Trans. Inform. Theory, 28(4), pp. 578-584, 1982.
[8] R. Anderson and E. Biham. Two practical and provably secure block ciphers: BEAR
and LION. Proc. of Fast Software Encryption (3), LNCS 1039, Springer-Verlag, pp.
113-120, 1996.
[9] F. Ayoub. Probabilistic completeness of Substitution-Permutation encryption
networks. IEE Proc. Part E, 129, pp. 195-199, 1982.
[10] A. Ben-Israel and T.N.E. Greville. Generalized inverses: Theory and applications.
John Wiley and Sons, New York, 1974.
[11] E.R. Berlekamp. Algebraic coding theory. McGraw-Hill Book Inc., New York, 1968.
[12] E. Biham. Differential Cryptanalysis of Iterated Cryptosystems. PhD thesis, The
Weizmann Institute of Science, Rehovot, Israel, 1992.
[13] E. Biham. New type of cryptanalytic attacks using related keys. Advances in
Cryptology: Proc. of EUROCRYPT '93, Springer-Verlag, Berlin, pp. 398-409, 1994.
[14] E. Biham. On Matsui's linear cryptanalysis. Advances in Cryptology: Proc. of
EUROCRYPT '94, Springer-Verlag, pp. 341-355, 1995.
[15] E. Biham and A. Shamir. Differential cryptanalysis of DES-like cryptosystems.
Journal of Cryptology, Vol. 4, no. 1, pp. 3-72, 1991.
[16] E. Biham and A. Shamir. Differential cryptanalysis of Snefru, Khafre, REDOC-II,
LOKI and Lucifer. Advances in Cryptology: Proc. of CRYPTO '91, Springer-Verlag,
pp. 156-171, 1992.
[17] E. Biham and A. Shamir. Differential cryptanalysis of the full 16-round DES.
Advances in Cryptology: Proc. of CRYPTO '92, Springer-Verlag, pp. 487-496, 1993.
[18] E. Biham and A. Shamir. Differential fault analysis of secret key cryptosystems.
Advances in Cryptology: Proc. of CRYPTO '97, Springer-Verlag, pp. 513-525, 1997.
[19] D.K. Biss. A lower bound on the number of functions satisfying the Strict Avalanche
Criterion. Submitted.
[20] I.F. Blake. An Introduction to Applied Probability. John Wiley and Sons, Inc., 1979.
[21] G. Brassard. Modern Cryptology. Springer-Verlag, New York, 1988.
[22] L. Brown, M. Kwan, J. Pieprzyk, and J. Seberry. Improving resistance to
differential cryptanalysis and the redesign of LOKI. Advances in Cryptology: Proc.
of ASIACRYPT '91, Springer-Verlag, pp. 36-50, 1993.
[23] L. Brown, J. Pieprzyk, and J. Seberry. LOKI: A cryptographic primitive for
authentication and secrecy applications. Advances in Cryptology: Proc. of AUSCRYPT
'90, Springer-Verlag, pp. 229-236, 1990.
[24] L. Brynielsson. The information leakage through a randomly generated function.
Advances in Cryptology: Proc. of EUROCRYPT '91, Springer-Verlag, Berlin, pp.
552-553, 1991.
[25] L. Carlitz and S. Uchiyama. Bounds for exponential sums. Duke Math. Journal, Vol.
24, pp. 37-41, 1957.
[26] X. Chang, Z. Dai, and G. Gong. Some cryptographic properties of exponential
functions. Advances in Cryptology: Proc. of ASIACRYPT '94, Springer-Verlag, pp.
415-418, 1995.
[27] D. Chaum and J.H. Evertse. Cryptanalysis of DES with a reduced number of rounds.
Advances in Cryptology: Proc. of CRYPTO '85, Springer-Verlag, Berlin, pp. 192-211,
1986.
[28] B. Chor, O. Goldreich, J. Hastad, J. Friedman, S. Rudich, and R. Smolensky. The
bit extraction problem or t-resilient functions. Proc. of the 26th IEEE Symposium on
Foundation of Computer Science, pp. 396-407, 1985.
[29] D. Coppersmith. The Data Encryption Standard (DES) and its strength against attacks.
IBM J. Res. Develop., Vol. 38, No. 3, pp. 243-250, May, 1994.
[30] T.M. Cover and J.A. Thomas. Elements of Information Theory. John Wiley & Sons
Inc, 1991.
[31] T.W. Cusick. Bounds on the number of functions satisfying the Strict Avalanche
Criterion. Information Processing Letters, 57, pp. 261-263, 1996.
[32] T.W. Cusick. Boolean functions satisfying a higher order Strict Avalanche Criterion.
Advances in Cryptology: Proc. of EUROCRYPT '93, Springer-Verlag, pp. 102-117,
1994.
[33] T.W. Cusick and P. Stanica. Bounds on the number of functions satisfying the Strict
Avalanche Criterion. Information Processing Letters, 60, pp. 215-219, 1996.
[34] J. Daemen, L. Knudsen, and V. Rijmen. The block cipher SQUARE. Proc. of Fast
Software Encryption (4), LNCS, Springer-Verlag, 1997.
[35] D.W. Davies and G.I.P. Parkin. The average cycle size of the key stream in output
feedback encipherment. Lecture Notes in Computer Science: Proc. of the Workshop
on Cryptography, Springer-Verlag, Berlin, pp. 263-279, 1982.
[36] E. Dawson and Chuan-Kun Wu. Key agreement scheme based on generalized inverses
of matrices. IEE Electronics Letters, Vol. 33, No. 14, pp. 1210-1211, 1997.
[37] M.H. Dawson and S.E. Tavares. An expanded set of S-box design criteria based on
information theory and its relation to differential attacks. Advances in Cryptology:
Proc. of EUROCRYPT '91, Springer-Verlag, pp. 352-365, 1992.
[38] D.E.R. Denning. Cryptography and Data Security. Addison-Wesley, 1982.
[39] Y. Desmedt, J.J. Quisquater, and M. Davio. Dependence of the output on the
input in DES: Small avalanche characteristics. Advances in Cryptology: Proc. of
EUROCRYPT '84, Springer-Verlag, Berlin, pp. 359-376, 1985.
[40] W. Diffie and M.E. Hellman. Privacy and authentication: An introduction to
cryptography. Proc. of IEEE, 67, pp. 397-427, 1979.
[41] W. Diffie and M.E. Hellman. The first ten years of public-key cryptography. Proc.
of IEEE, 76, pp. 560-577, 1988.
[42] D. Elliott and K.R. Rao. Fast algorithms: Algorithm, Analysis, Applications.
Academic Press, 1982.
[43] J.H. Evertse. Linear structures in block ciphers. Advances in Cryptology: Proc. of
EUROCRYPT '87, Springer-Verlag, Berlin, pp. 249-266, 1988.
[44] H. Feistel. Cryptography and computer privacy. Scientific American, 228, pp. 15-23,
1973.
[45] H. Feistel, W. Notz, and J.L. Smith. Some cryptographic techniques for machine-to-
machine data communications. Proc. of IEEE, 63, pp. 1545-1554, 1975.
[46] W. Feller. An Introduction to Probability Theory with Applications. Vol. I, 3rd edition.
John Wiley and Sons, New York, 1968.
[47] R. Forré. The Strict Avalanche Criterion: Spectral properties of boolean functions
and an extended definition. Advances in Cryptology: Proc. of CRYPTO '88,
Springer-Verlag, pp. 450-468, 1989.
[48] R. Forré. Methods and instruments for designing S-boxes. Journal of Cryptology,
Vol. 2, No. 3, pp. 115-130, 1990.
[49] J. Goldman and Gian-Carlo Rota. On the foundation of combinatorial theory
IV. Finite vector spaces and Eulerian generating functions. Studies in Applied
Mathematics, Vol. XLIX, No. 3, pp. 239-258, 1970.
[50] J. Gordon and H. Retkin. Are big S-boxes best? Proc. of the Workshop on
Cryptography, LNCS, Springer-Verlag, Berlin, pp. 257-262, 1982.
[51] D. Gregory. Private communication.
[52] M. Gysin. A one-key cryptosystem based on a finite nonlinear automaton. Proc.
of International Conference of Cryptography: Policy and Algorithms, Brisbane,
Queensland, Australia, July 1995, LNCS 1029, Springer-Verlag, pp. 165-173, 1996.
[53] C. Harpes, G.G. Kramer, and J.L. Massey. A generalization of linear cryptanalysis
and the applicability of Matsui's piling up lemma. Advances in Cryptology: Proc.
of EUROCRYPT '95, Springer-Verlag, pp. 24-38, 1995.
[54] M.E. Hellman. A cryptanalytic time-memory trade-off. IEEE Proc., Vol. IT-26,
No. 4, pp. 401-406, July, 1980.
[55] H.M. Heys. The Design of Substitution Permutation Network Ciphers Resistant to
Cryptanalysis. PhD thesis, Queen's University, Kingston, Ontario, Canada, 1994.
[56] H.M. Heys. Modelling avalanche characteristics in DES-like ciphers. Workshop on
Selected Areas in Cryptography, SAC '96, Workshop Record, pp. 77-94, 1996.
[57] H.M. Heys and S.E. Tavares. Cryptanalysis of tree-structured substitution-permutation
networks. IEE Electronics Letters, Vol. 29, no. 1, pp. 40-41, 1993.
[58] H.M. Heys and S.E. Tavares. Key clustering in substitution-permutation network
cryptosystems. Workshop on Selected Areas in Cryptography, SAC '94, Workshop
Record, pp. 134-145, 1994.
[59] H.M. Heys and S.E. Tavares. Known plaintext attack of tree-structured block ciphers.
IEE Electronics Letters, Vol. 31, no. 10, pp. 784-785, 1995.
[60] H.M. Heys and S.E. Tavares. The design of product ciphers resistant to differential
and linear cryptanalysis. Journal of Cryptology, Vol. 9, no. 1, pp. 1-19, 1996.
[61] H.M. Heys and S.E. Tavares. Avalanche characteristics of substitution-permutation
encryption networks. IEEE Trans. Comp., Vol. 44, pp. 1131-1139, Sept. 1995.
[62] J.F. Humphreys and M.Y. Prest. Numbers, Groups and Codes. Cambridge University
Press, Cambridge, 1989.
[63] T. Jakobsen and L.R. Knudsen. The interpolation attack on block ciphers. Proc. of
Fast Software Encryption Workshop (4), 1997.
[64] B.S. Kaliski, R.L. Rivest, and A.T. Sherman. Is the Data Encryption Standard a
group? Journal of Cryptology, pp. 1-36, 1988.
[65] B.S. Kaliski and M.J.B. Robshaw. Linear cryptanalysis using multiple
approximations. Advances in Cryptology: Proc. of CRYPTO '94, Springer-Verlag,
Berlin, pp. 26-39, 1994.
[66] J.B. Kam and G.I. Davida. Structured design of substitution-permutation encryption
networks. IEEE Trans. Comp., C-28, pp. 747-753, 1979.
[67] D. Kahn. The Codebreakers, The Story of Secret Writing. Macmillan Co., New York,
1967.
[68] L.R. Knudsen. Truncated and higher order differentials. Proc. of Fast Software
Encryption (2), LNCS 1008, Springer-Verlag, pp. 196-211, 1995.
[69] L.R. Knudsen. Block Ciphers: Analysis, Design and Applications. PhD thesis, Aarhus
University, Denmark, July 1994.
[70] P.C. Kocher. Timing attacks on implementations of Diffie-Hellman, RSA, DSS,
and other systems. Advances in Cryptology: Proc. of CRYPTO '96, Springer-Verlag,
Berlin, pp. 104-113, 1996.
[71] K. Koyama and R. Terada. Nonlinear parity circuits and their cryptographic
applications. Advances in Cryptology: Proc. of CRYPTO '90, Springer-Verlag, Berlin,
pp. 582-599, 1991.
[72] X. Lai. Higher order derivatives and differential cryptanalysis. Symposium on
Communication, Coding, and Cryptography, Monte-Verita, Ascona, Switzerland,
Feb. 10-13, 1994.
[73] X. Lai, J.L. Massey, and S. Murphy. Markov ciphers and differential cryptanalysis.
Advances in Cryptology: Proc. of EUROCRYPT '91, Springer-Verlag, pp. 17-38, 1992.
[74] S.K. Langford and M.E. Hellman. Differential-linear cryptanalysis. Advances in
Cryptology: Proc. of CRYPTO '94, Springer-Verlag, Berlin, pp. 17-25, 1994.
[75] J. Lee, H.M. Heys, and S.E. Tavares. On the resistance of a CAST-like encryption
algorithm to linear and differential cryptanalysis. Designs, Codes and Cryptography,
Vol. 12, No. 3, pp. 267-282, 1997.
[76] S. Lloyd. Counting functions satisfying a higher order Strict Avalanche Criterion.
Advances in Cryptology: Proc. of EUROCRYPT '89, Springer-Verlag, pp. 63-74, 1990.
[77] M. Luby and C. Rackoff. How to construct pseudorandom permutation from
pseudorandom functions. SIAM Journal on Computing, Vol. 17, pp. 373-386, 1988.
[78] F.J. MacWilliams and N.J.A. Sloane. The theory of error correcting codes.
North-Holland Publishing Company, 1977.
[79] J. Massey. SAFER K-64: a byte-oriented block-ciphering algorithm. Fast Software
Encryption, LNCS 809, Springer-Verlag, pp. 1-17, 1994.
[80] J.L. Massey. An introduction to contemporary cryptology. Proc. of IEEE, 76,
pp. 533-549, 1988.
[81] M. Matsui. The first experimental cryptanalysis of the Data Encryption Standard.
Advances in Cryptology: Proc. of CRYPTO '94, Springer-Verlag, Berlin, pp. 1-11,
1994.
[82] M. Matsui. Linear cryptanalysis method for DES cipher. Advances in Cryptology:
Proc. of EUROCRYPT '93, Springer-Verlag, Berlin, pp. 386-397, 1994.
[83] M. Matsui. New structure of block cipher with provable security against differential
and linear cryptanalysis. Proc. of Fast Software Encryption (3), LNCS 1039,
Springer-Verlag, pp. 205-218, 1996.
[84] W. Meier and O. Staffelbach. Nonlinearity criteria for cryptographic functions.
Advances in Cryptology: Proc. of EUROCRYPT '89, Springer-Verlag, pp. 549-562,
1990.
[85] A.J. Menezes, P. Van Oorschot, and S.A. Vanstone. Handbook of Applied
Cryptography. CRC Press, New York, 1997.
[86] C.H. Meyer and S.M. Matyas. Cryptography: A New Dimension in Computer Data
Security. Wiley, New York, 1982.
[87] S. Mister and C. Adams. Practical s-box design. Workshop on Selected Areas in
Cryptography, SAC '96, Workshop Record, pp. 61-76, 1996.
[88] C. Mitchell. Enumerating boolean functions of cryptographic significance. Journal
of Cryptology, pp. 155-170, 1990.
[89] K. Nyberg. On the construction of highly nonlinear permutations. Advances in
Cryptology: Proc. of EUROCRYPT '92, Springer-Verlag, Berlin, pp. 92-98, 1992.
[90] K. Nyberg. Perfect nonlinear S-boxes. Advances in Cryptology: Proc. of EUROCRYPT
'91, Springer-Verlag, pp. 378-386, 1992.
[91] K. Nyberg. Differentially uniform mappings for cryptography. Advances in
Cryptology: Proc. of EUROCRYPT '93, Springer-Verlag, Berlin, pp. 55-64, 1994.
[92] K. Nyberg. Linear approximation of block ciphers. Advances in Cryptology: Proc. of
EUROCRYPT '94, Springer-Verlag, Berlin, pp. 439-444, 1995.
[93] K. Nyberg. S-boxes and round functions with controllable linearity and differential
uniformity. Proc. of Fast Software Encryption (2), LNCS 1008, Springer-Verlag, pp.
111-130, 1995.
[94] K. Nyberg and L.R. Knudsen. Provable security against differential cryptanalysis.
Advances in Cryptology: Proc. of CRYPTO '92, Springer-Verlag, pp. 566-574, 1993.
[95] L.J. O'Connor. An upper bound on the number of functions satisfying the Strict
Avalanche Criterion. Information Processing Letters, 52, pp. 325-327, 1994.
[96] L.J. O'Connor. On the distribution of characteristics in composite permutation.
Advances in Cryptology: Proc. of CRYPTO '93, Springer-Verlag, Berlin,
pp. 403-412, 1994.
[97] L.J. O'Connor. On the distribution of characteristics in bijective mappings. Advances
in Cryptology: Proc. of EUROCRYPT '93, Springer-Verlag, Berlin, pp. 360-370,
1994.
[98] L.J. O'Connor. The inclusion-exclusion principle and its applications to cryptography.
Cryptologia, Vol. XVII, no. 1, pp. 63-79, January, 1993.
[99] L.J. O'Connor and J. Dj. Golic. A unified Markov approach to differential and linear
cryptanalysis. Advances in Cryptology: Proc. of ASIACRYPT '94, Springer-Verlag,
Berlin, pp. 387-397, 1995.
[100] L.J. O'Connor and A. Klapper. Algebraic nonlinearity and its application to
cryptography. Journal of Cryptology, Vol. 7, pp. 213-227, 1994.
[101] National Institute of Standard and Technology. Advanced encryption standard (AES)
development effort. http://csrc.nist.gov/encryption/aes/aes_home.htm.
[102] National Bureau of Standards (US). Data Encryption Standard (DES). Federal
Information Processing Standards Publication 46, 1977.
[103] R.G.E. Pinch. Comments on "key agreement scheme based on generalized inverses
of matrices". Preprint, 1997.
[104] M. Portz. On the use of interconnection networks in cryptography. Advances in
Cryptology: Proc. of EUROCRYPT '91, Springer-Verlag, pp. 302-315, 1992.
[105] B. Preneel, W.V. Leekwijk, L.V. Linden, R. Govaerts, and J. Vandewalle. Propagation
characteristics of boolean functions. Advances in Cryptology: Proc. of EUROCRYPT
'90, Springer-Verlag, pp. 161-173, 1991.
[106] J. Quisquater, Y. Desmedt, and M. Davio. The importance of "good" key scheduling
schemes. Advances in Cryptology: Proc. of CRYPTO '85, Springer-Verlag, Berlin,
pp. 537-542, 1986.
[107] J.A. Reeds and J.L. Manferdelli. DES has no per round linear factor. Advances in
Cryptology: Proc. of CRYPTO '84, Springer-Verlag, Berlin, pp. 377-389, 1985.
[108] V. Rijmen, J. Daemen, B. Preneel, A. Bosselaers, and E. De Win. The cipher
SHARK. Fast Software Encryption, LNCS 1039, D. Gollmann, Ed., Springer-Verlag,
pp. 99-112, 1996.
[109] V. Rijmen, B. Preneel, and E. De Win. On weakness of non-surjective functions.
Designs, Codes, and Cryptography, Vol. 12, No. 3, pp. 253-266, 1997.
[110] T. Ritter. Dynamic Substitution, U.S. patent 4979832. HTML article available from
"www.io.com/~ritter".
[111] R. Rivest. The RC4 encryption algorithm. Internal Report, RSA Data Security, Inc.,
1992.
[112] R.L. Rivest. The RC5 encryption algorithm. Fast Software Encryption, LNCS 1008,
Springer-Verlag, pp. 86-96, 1995.
[113] F.S. Roberts. Applied Combinatorics. Englewood Cliffs, N.J.: Prentice-Hall, 1984.
[114]M. J.B. Robshaw. Block ciphen. Technical report, TRaOl version 2, RSA
Laboratories, August, 1995.
[115]M. J.B. Robshaw. Stream ciphen. Technicai report, TR-701 version 2, RSA
Laboratones, July 1995.
[116]O.S Rothaus. On bent hinctions. Journal of Combinatorial 7heor-y. Vol. 20(A):30&
305, 1976.
[117]R.A. Rueppel. Analysis and Design of Stream Ciphers. Springer-Verlag, Heidelberg
and New York, 1986.
[118]I. Schaumüller-Bichl. Cryptanalysis of the Data Encryption Standard by method
of formal coding. Proc. of the workshop on Cryptogrophy, Springer- Mrlag. Berlin.
pp.235-255, 1982.
[119]B. Schneier. Description of a new variable-length key, M i t block cipher (Blowfish).
Proc. of Fast So&are Encryption Workrhop. UVCS 809, Springer- Verlag, Berlin, pp.
191-204, 1994.
[120]B. Schneier. Applied Cryptography, Second edition. John Wiley and Sons, Inc, 1996.
[12 1JB. Schneier and J. Kelsey. Unbalanced Feistel networks and block cipher design.
Proc. of Fast Sofrware Encryption (31, LNCS 1039, Springer-Verlag, pp. 121-144,
1996.
[122] J. Seberry, X. Zhang, and Y. Zheng. Systematic generation of cryptographically
robust s-boxes. 1st ACM Conference on Computer and Communications Security,
Fairfax, Virginia, pp. 172-182, November, 1993.
[123] C.E. Shannon. Communication theory of secrecy systems. Bell System Technical
Journal, Vol. 28, pp. 656-715, 1949.
[124] A. Shimizu and S. Miyaguchi. Fast data encryption algorithm FEAL. Advances in
Cryptology: Proc. of EUROCRYPT '87, Springer-Verlag, Berlin, pp. 267-278, 1988.
[125] T. Siegenthaler. Decrypting a class of stream ciphers using ciphertext only. IEEE
Trans. Comput., Vol. C-34, No. 1, pp. 81-85, 1985.
[126] T. Siegenthaler. Correlation-immunity of nonlinear combining functions for
cryptographic applications. IEEE Trans. on Inform. Theory, Vol. IT-30, No. 5, pp.
776-780, Sept. 1984.
[127] G.J. Simmons. A survey of information authentication. Proc. of IEEE, 76,
pp. 603-620, 1988.
[128] G.J. Simmons. Contemporary cryptology, the science of information integrity. IEEE
Press, New York, 1992.
[129] A. Sinkov. Elementary Cryptanalysis. Mathematical Association of America, 1966.
[130] M. Sivabalan, S.E. Tavares, and L.E. Peppard. On the design of SP networks from
an information theoretic point of view. Advances in Cryptology: Proc. of CRYPTO
'92, Springer-Verlag, Berlin, pp. 260-279, 1993.
[131] A. Sorkin. Lucifer: A cryptographic algorithm. Cryptologia, 8, pp. 22-35, 1984.
[132] R. Stong. Private communication.
[133] S. Vaudenay. On the need for multipermutations: Cryptanalysis of MD4 and SAFER.
Proc. of Fast Software Encryption (2), LNCS 1008, Springer-Verlag, pp. 286-297,
1995.
[134] R. Verser. DES challenge: We cracked the code. Available through
"www.frii.com/~rcv/deschall.htm", April 8, 1997.
[135] A.F. Webster. Plaintext/ciphertext bit dependencies in cryptographic systems.
Master's thesis, Queen's University, Kingston, Ontario, Canada, December, 1985.
[136] A.F. Webster and S.E. Tavares. On the design of S-boxes. Advances in Cryptology:
Proc. of CRYPTO '85, Springer-Verlag, pp. 523-534, 1986.
[137] A. Weil. On some exponential sums. Proceedings of the National Academy of
Sciences, Vol. 34, pp. 204-207, 1948.
[138] D.J. Wheeler and R.M. Needham. TEA, a Tiny Encryption Algorithm. Proc. of
Fast Software Encryption Workshop (2), LNCS 1008, Springer-Verlag, Berlin, pp.
363-366, 1995.
[139] M.J. Wiener. Efficient DES key search. Technical report, TR-244, School of Computer
Science, Carleton University, Ottawa, Canada, May 1994. (also presented at the
Rump Session of CRYPTO '93).
[140] G.Z. Xiao and J.L. Massey. A spectral characterization of correlation-immune
combining functions. IEEE Trans. Inform. Theory, Vol. IT-34:569-571, 1988.
[141] A.M. Youssef, Z. Chen, and S.E. Tavares. Construction of highly nonlinear injective
s-boxes with application to CAST-like encryption algorithm. Proceedings of the
Canadian Conference on Electrical and Computer Engineering (CCECE '97), pp.
330-333, 1997.
[142] A.M. Youssef, T.W. Cusick, P. Stanica, and S.E. Tavares. New bounds on the number
of functions satisfying the Strict Avalanche Criterion. Workshop on Selected Areas in
Cryptography, SAC '96, Workshop Record, pp. 49-60, 1996.
[143] A.M. Youssef, S. Mister, and S.E. Tavares. On the design of linear transformations
for substitution permutation encryption networks. Workshop on Selected Areas in
Cryptography, SAC '97, Workshop Record, pp. 40-48, 1997.
[144] A.M. Youssef and S.E. Tavares. Number of nonlinear regular s-boxes. IEE Electronics
Letters, Vol. 31, No. 19, pp. 1643-1644, 1995.
[145] A.M. Youssef and S.E. Tavares. Resistance of balanced s-boxes to linear and
differential cryptanalysis. Information Processing Letters, 56(1995), pp. 249-252,
1995.
[146] A.M. Youssef and S.E. Tavares. Spectral properties and information leakage of
multi-output boolean functions. Proceedings of the IEEE International Symposium
on Information Theory, Whistler, B.C., Canada, Sept. 17-22, 1995.
[147] A.M. Youssef and S.E. Tavares. Comment on "Bounds on the number of functions
satisfying the Strict Avalanche Criterion". Information Processing Letters 60, pp.
271-275, 1996.
[148] A.M. Youssef and S.E. Tavares. A comparison of two families of block ciphers.
Proceedings of the 18th Biennial Symposium on Communications, Kingston, Ontario,
pp. 200-203, 1996.
[149] A.M. Youssef and S.E. Tavares. Information leakage of randomly selected boolean
functions. Information Theory and Applications II, 4th Canadian Workshop on
Information Theory, May 1995, LNCS 1133, pp. 41-52, Springer-Verlag, 1996.
[150] A.M. Youssef and S.E. Tavares. On the avalanche characteristics of
substitution-permutation networks. Proceedings of Pragocrypt '96, part 1, pp. 18-29, 1996.
[151] A.M. Youssef and S.E. Tavares. Cryptanalysis of the nonlinear parity circuits proposed
at Crypto '90. IEE Electronics Letters, Vol. 33, No. 7, pp. 585-586, 1997.
[152] A.M. Youssef and S.E. Tavares. Modelling avalanche characteristics of
substitution-permutation networks using Markov chains. Proceedings of the 5th Canadian
Workshop on Information Theory, pp. 45-48, Toronto, June 3-6, 1997.
[153] A.M. Youssef, S.E. Tavares, and H.M. Heys. A new class of substitution-permutation
networks. Workshop on Selected Areas in Cryptography, SAC '96, Workshop Record,
pp. 132-147, 1996.
[154] A.M. Youssef, S.E. Tavares, S. Mister, and C.M. Adams. Linear approximation of
injective s-boxes. IEE Electronics Letters, Vol. 31, No. 25, pp. 2168-2169, 1995.
[155] M. Zhang, S.E. Tavares, and L.L. Campbell. Information leakage of boolean
functions and its relationship to other cryptographic criteria. Proceedings of 2nd
ACM Conference on Computer and Communications Security, Fairfax, Virginia, pp.
156-165, 1994.