A general quantitative cryptanalysis of permutation-only multimedia ciphers against plaintext attacks

ARTICLE IN PRESS

0923-5965/$ - se

doi:10.1016/j.im

�Correspondfax: +492331 9

E-mail addr

URL: http:/

Signal Processing: Image Communication 23 (2008) 212–223

www.elsevier.com/locate/image

A general quantitative cryptanalysis of permutation-onlymultimedia ciphers against plaintext attacks

Shujun Lia,�, Chengqing Lib, Guanrong Chenb,Nikolaos G. Bourbakisc, Kwok-Tung Lod

aFernUniversitat in Hagen, Lehrgebiet Informationstechnik, UniversitatsstraX e 27, 58084 Hagen, GermanybDepartment of Electronic Engineering, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong, China

cInformation Technology Research Institute, College of Engineering and Computer Science, Wright State University, 3640 Glenn Hwy,

Dayton, OH 45435, USAdDepartment of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon,

Hong Kong SAR, China

Received 1 August 2006; received in revised form 21 January 2008; accepted 24 January 2008

Abstract

In recent years secret permutations have been widely used for protecting different types of multimedia data, including

speech files, digital images and videos. Based on a general model of permutation-only multimedia ciphers, this paper

performs a quantitative cryptanalysis on the performance of these kind of ciphers against plaintext attacks. When the

plaintext is of size M �N and with L different levels of values, the following quantitative cryptanalytic findings have been

concluded under the assumption of a uniform distribution of each element in the plaintext: (1) all permutation-only

multimedia ciphers are practically insecure against known/chosen-plaintext attacks in the sense that only OðlogLðMNÞÞ

known/chosen plaintexts are sufficient to recover not less than (in an average sense) half elements of the plaintext; (2) the

computational complexity of the known/chosen-plaintext attack is only Oðn � ðMNÞ2Þ, where n is the number of known/

chosen plaintexts used. When the plaintext has a non-uniform distribution, the number of required plaintexts and the

computational complexity is also discussed. Experiments are given to demonstrate the real performance of the known-

plaintext attack for a typical permutation-only image cipher.

r 2008 Elsevier B.V. All rights reserved.

Keywords: Permutation-only multimedia encryption; Image; Video; Speech; Cryptanalysis; Known-plaintext attack; Chosen-plaintext

attack

1. Introduction

With the rapid progress of computer and com-munication network technologies, a great deal of

e front matter r 2008 Elsevier B.V. All rights reserved

age.2008.01.003

ing author. Tel.: +49 2331 9871160;

87375.

ess: [email protected] (S. Li).

/www.hooklee.com (S. Li).

concerns have been raised about the security ofmultimedia data transmitted over open networks.Also, secure storage of digital multimedia datais demanded in many real applications, such asconfidential teleconferencing, pay-TV, medical andmilitary imaging, and privacy-related multimediaservices. Due to the prevalence of multimediaservices in consumer electronic devices, users ofhandheld devices have started to require content

.

www.elsevier.com/locate/image

dx.doi.org/10.1016/j.image.2008.01.003

mailto:[email protected]

http://www.hooklee.com

ARTICLE IN PRESSS. Li et al. / Signal Processing: Image Communication 23 (2008) 212–223 213

protection of multimedia data including recordedspeech segments, personal photos and private movieclips.

To meet all these needs in practice, encryptionalgorithms are required to offer a sufficient levelof security for different multimedia applications.Apparently, the simplest way to encrypt multimediadata is to treat them as 1-D bit-streams, and thento encrypt them with any available cipher [38,31].In some multimedia applications, such a simple idea of naive encryption may be enough. However, inmany other applications, especially when digitalimages and videos are involved, encryption schemesconsidering special features of the multimediadata, such as bulky sizes and large redundancy inuncompressed images/videos, are still requiredto achieve a better overall performance and tomake the integration of the encryption schemeinto the whole process easier. In the past severaldecades, different algorithms have been proposed toprovide specific solutions to the encryption ofimages, videos and speech data. Meanwhile, manycryptanalytic results have been reported, leading tothe conclusion that a number of multimediaencryption schemes are insecure from the crypto-graphical point of view. For recent surveys on imageand video encryption algorithms, see [15,16,23,45,51], and for surveys on speech encryption, see[2,15,22,32].

The use of secret permutations is very popular inanalog pay-TV services as a main approach toprotecting video signals in broadcast-TV [6,19,47].Due to the specific structure of analog video signals,there are only three major ways to perform secretpermutations on lines: time reversal (transmittingrandomly selected lines in reverse order), line cut androtation (cut each line at a random point and swapthe two halves of the line), and line shuffling. Thecondition becomes much better, however, when secretpermutations are used to protect digital multimediadata. According to the format of the multimedia datato be encrypted, a lot of elements can be permuted ina secure way: pixels (samples), bitplanes, lines, rows,blocks, macroblocks, slices, transform coefficients,VLC (variable-length codewords) syntax elements,tree nodes, and so on [1,4,5,8,9,12,14,17,24–30,33,37,39–42,44,48–50,52,54]. While some multimediaencryption schemes combine secret permutationswith other encryption techniques, there are manymultimedia encryption algorithms that are entirelybased on secret permutations [1,4,6,8,9,12,17,19,30,33,39–42,44,47–50,52,54]. These are called permuta-

tion-only multimedia ciphers in this paper. Note thatsome ciphers can also be classified as permutation-only ones, even though other encryption techniquesare used together with secret permutations. Forinstance, the video ciphers proposed in [24–26]become permutation-only ciphers, if the sign bits ofall encrypted data elements are neglected. The mainadvantages of using only secret permutations in acipher include: (i) they can be easily implemented; (ii)when used properly, perceptual information aboutthe plaintext can be efficiently concealed.

The security of permutation-only multimediaciphers has been extensively studied. Almost allpermutation-only analog pay-TV encryptionschemes and some permutation-only ciphers hadalready been found insecure against ciphertext-only attacks, due to the high information redun-dancy in multimedia data and/or some specificweaknesses in the encryption algorithms [3,13,20,21,34]. In addition, it has been widely knownthat permutation-only multimedia ciphers areinsecure against the known/chosen-plaintext attack[7,10,11,13,17,18,34–36,41,43,44,53], which is quiteunderstandable since the secret permutationscan be recovered by comparing the plaintextsand the permuted ciphertexts. Though secretpermutations suffer from the above security pro-blems, many researchers still hope that it will beuseful to design multimedia encryption schemesbased on this technique, due to the followingreasons:

(1)
the insecurity against ciphertext-only attacks isnot a problem for most digital permutation-onlyciphers because of the use of more complicatedpermutations;
(2)
the insecurityagainst plaintext attacks can besolved in practice, by using dynamically updatedand/or plaintext-dependent secret permutations;
(3)
it is one of the simplest encryption techniques tomaintain format compliance and size preserva-tion simultaneously;
(4)
by combining it with very simple substitutionoperations, multimedia encryption of high con-fidentiality can be achieved.
To the best of our knowledge, all previouscryptanalytic results were performed for specificpermutation-only image/video ciphers, and a gen-eral quantitative study about plaintext attacks hasnot been reported to clarify the number of requiredplaintexts and the computational complexity of

ARTICLE IN PRESSS. Li et al. / Signal Processing: Image Communication 23 (2008) 212–223214

such an attack.1 As a result, some questions stillremain to be answered, which include: (i) Can thesecurity of permutation-only multimedia encryptionalgorithms be effectively enhanced by designing newmethods to generate ‘‘better’’ secret permutations?(ii) How frequent should the secret permutations beupdated to provide an acceptable security againstplaintext attacks?

This paper reports a general cryptanalysis ofpermutation-only multimedia encryption algo-rithms against plaintext attacks, mainly focusingon the quantitative relation between the breakingperformance and the number of required known/chosen plaintexts, and provides an estimation of theattack complexity. The cryptanalysis is performedon a general model of permutation-only multimediaciphers by considering the plaintext (image, speech,frame of videos, etc.) as an M �N matrix in whicheach element has L possible distinct values. Underthe assumption that each element in the matrix hasan independent and uniform distribution, it will beshown that the number of plaintexts required toobtain an acceptable breaking performance inknown plaintext attack is dlogLð2ðMN � 1ÞÞe. Whenthe plaintext does not have a uniform distribution,this number will increase accordingly. This issue willalso be studied on a special non-uniform distribu-tion. For chosen-plaintext attack, it will be shownthat only dlogLðMNÞe plaintexts are enough to get agood breaking performance. In addition, an upperbound of the attack complexity will be obtained:Oðn � ðMNÞ2Þ, where n is the number of known/chosen plain-images.

The rest of this paper is organized as follows. InSection 2, a general model of permutation-onlymultimedia ciphers is described. Cryptanalysison this normalized model is studied in detail inSection 3. Some experimental results are shown inSection 4 to support the theoretical cryptanalysis.The last section concludes the paper.

2. A general model of permutation-only multimedia

ciphers

Though different kinds of multimedia datarequire different kinds of secret permutations, it is

1Though there were some simple discussions on the quantita-

tive aspects of known/chosen-plaintext attacks of bit-permuta-

tion ciphers in the cryptology community [46], this problem has

not been systematically and quantitatively studied in a general

way for any case, especially for permutation-only multimedia

ciphers.

possible to construct a general model by consideringthe plaintext as a 2-D M �N matrix. This isbecause the following reasons: (i) 1-D speech dataare just a special case of M ¼ 1; (ii) 3-D videos aregenerally encrypted frame by frame, and each frameis encrypted block by block, so permutation-onlyvideo encryption is actually a generalized case ofpermutation-only image encryption; (iii) the dimen-sion remains unchanged when a multimedia signal isconverted to transform domain. Thus, in thefollowing of this section, we describe the generalmodel based on a 2-D input plaintext. To facilitatethe discussion below and to avoid potential confu-sion, we use a special term ‘‘particles’’ to denoteelements in the 2-D plaintext that are permuted,such as pixels in a plain-image or transformcoefficients in a block of a video frame.

As its name suggests, a permutation-only multi-media cipher encrypts a 2-D plaintext by permutingthe positions of all particles in a secret way. Thesecret permutations have to be invertible to makethe decryption possible. This means that allpermutation-only ciphers belong to symmetry ci-phers. Although many different methods have beenproposed to realize secret key-dependent pixelpermutations, for a given plaintext of size M �N,a permutation-only cipher can be normalized withan invertible key-dependent permutation matrix of

size M �N, denoted by

W ¼ ½wði; jÞ ¼ ði0; j0Þ 2M�N�M�N , (1)

where M ¼ f0; . . . ;M � 1g and N ¼ f0; . . . ;N � 1g.With the permutation matrix W and its inverseW�1 ¼ ½w�1ði; jÞ�M�N , for a plaintext f ¼

½f ði; jÞ�M�N and its corresponding ciphertextf 0 ¼ ½f 0ði; jÞ�M�N , the encryption and decryptionprocedures of a permutation-only cipher can bedescribed as follows:

�
the encryption procedure: For 0pipM � 1 and0pjpN � 1, f 0ðwði; jÞÞ ¼ f ði; jÞ; � the decryption procedure: For 0pipM � 1 and
0pjpN � 1, f ðw�1ði; jÞÞ ¼ f 0ði; jÞ.

In a short form, we denote the encryption procedureby f 0ðWðIÞÞ ¼ f ðIÞ and the decryption procedure byf ðW�1ðIÞÞ ¼ f 0ðIÞ, where

I ¼

ð0; 0Þ � � � ð0;N � 1Þ

..

. . .. ..

.

ðM � 1; 0Þ � � � ðM � 1;N � 1Þ

26643775

M�N

.


To ensure the invertibility of the permutationmatrix, i.e., to make the decryption possible, thefollowing property should be satisfied: 8ði1; j1Þaði2; j2Þ, wði1; j1Þawði2; j2Þ. This means that Wdetermines a bijective (i.e., one-to-one) permutationmapping, FW :M�N!M�N.

From the above description, one can see that thedesign of a permutation-only cipher focuses on twopoints: (1) what the secret key K is; (2) how thepermutation matrix W and its inverse W�1 arederived from the secret key K. Generally speaking,each key defines a permutation matrix, and eachpermutation-only cipher defines a finite set contain-ing a number of permutation matrices selected fromall ðMNÞ! possible ones. In the relevant literature,many different methods have been proposed toderive the permutation matrix from a key, some ofwhich are listed as follows:

�
SCAN language based methods [1,4,8,5,12,27]:Define some different scan patterns of the 2-Dplaintext and combine these patterns to obtain apermutation matrix by scanning the wholeplaintext particle by particle; � quadtree based methods [8,12]: Divide the plain-
text into multi-level quadtree and shuffle theorder of four nodes in each level to realize apermutation matrix;
� 2-D chaotic maps based methods [14,28,37]:
Iterate a discretized 2-D chaotic map over theM �N plaintext for many times to realize apermutation matrix;
� Fractal curves based methods [30,54]: Use a
fractal(-like) curve to replace the normal scanorder to realize a permutation matrix;
� pseudo-random rotations based methods [48,49]:
Pseudo-randomly rotate particles along somestraight lines for many times to realize apermutation matrix;
� matrix transformation based methods [33]: Use
(integer) transformations of matrix, such asn-dimensional Arnold transformation and Fibo-nacci-Q transformation, to define permutationmatrices;
� composite methods [52]: Combine different meth-
ods to realize more complicated permutationmatrices.

Although different types of secret keys are used indifferent permutation-only multimedia ciphers togenerate the permutation matrix, it is reasonable toconsider the permutation matrix W itself as the

equivalent encryption key and W�1 as the equiva-lent decryption key. From such a point of view, allpermutation-only multimedia ciphers can be con-sidered the same. This is the base for the securityanalysis to be carried out below in next section.

3. General quantitative cryptanalysis

In this section, we discuss the general quantitativecryptanalysis of plaintext attacks based on theabove general model of permutation-only multi-media ciphers.

3.1. Known-plaintext attack

As discussed above, when a permutation-only

multimedia cipher is used to encrypt a plaintext,the particle at the position ði; jÞ will be secretlypermuted to another fixed position ði0; j0Þ while itsvalue remains unchanged. Therefore, by comparinga number of known plaintexts and the correspond-ing ciphertexts, it is possible for an attacker to(partially or even totally) reconstruct the secretpermutations of all particles, i.e., to derive theencryption/decryption keys—the permutation ma-trix W and its inverse W�1.

Given n known plaintexts f 1; . . . ; f n and theirciphertexts f 01; . . . ; f

0n, the deduction procedure of the

two key-matrices W and W�1 can be described by afunction Get_Permutation_Matrix. With theinput parameters (f 1; . . . ; f n; f

01; . . . ; f

0n;M ;N), this

function returns an estimation of the permutationmatrix W and its inverse W�1. Assuming the value ofeach particle ranges in f0; . . . ;L� 1g, the functionGet_Permutation_Matrix works as follows:

�
Step 1: Compare pixel values within the n
ciphertexts f 01; . . . ; f0n to get ðn � LÞ sets of posi-

tions:

L01ð0Þ; . . . ;L01ðL� 1Þ; . . . ;L0nð0Þ; . . . ;L

0nðL� 1Þ,

where L0mðlÞ �M�N denotes a set containingpositions of all particles in f 0m ðm ¼ 1; . . . ; nÞwhose values are equal to l 2 f0; . . . ;L� 1g, i.e.,8ði0; j0Þ 2 L0mðlÞ, f 0mði

0; j0Þ ¼ l. Note thatL0mð0Þ; . . . ;L

0mðL� 1Þ actually compose a parti-

tion of the set of all positions:[L�1l¼0

L0mðlÞ ¼M�N ¼ fð0; 0Þ; . . . ; ðM � 1;N � 1Þg,

and 8l1al2, L0mðl1Þ \ L0mðl2Þ ¼ ;;


�
Step 2: Get a multi-valued permutation matrix,cW ¼ ½bwði; jÞ�M�N , where
bwði; jÞ ¼ \nm¼1

L0mðf mði; jÞÞ.

Here, note that[0pipM�10pjpN�1

bwði; jÞ ¼M�N

and that bwði1; j1Þ ¼ bwði2; j2Þ may hold ifði1; j1Þaði2; j2Þ;
� Step 3: Determine a single-valued permutation
matrix, fW ¼ ½ewði; jÞ�M�N from cW , where ewði; jÞ 2bwði; jÞ and 8ði1; j1Þaði2; j2Þ, ewði1; j1Þaewði2; j2Þ;
� Step 4: Output fW and its inverse fW�1
¼

½ew�1ði; jÞ�M�N as the estimations of W and W�1.

2This is actually not true for most multimedia data, but we use

this strong assumption to carry out a qualitative estimation.

Apparently, if and only if #ðbwð0; 0ÞÞ ¼ � � � ¼#ðbwðM � 1;N � 1ÞÞ ¼ 1, i.e., each element of cWcontains only one position, it is true that fW ¼Wand the cipher is totally broken. However, because

some elements of cW contain more than one

position, generally fW is not an exact estimation of

W . Assume that there are ð bNpMNÞ distinct

elements in cW , and that the bN elements arebw1; . . . ; bwbN . Then, it can be easily verified that there

areQbN

k¼1#ðbwkÞ! possibilities of fW . To make the

estimation of fW as accurate as possible, somespecific optimization algorithms can be used tochoose a better position from bwði; jÞ as the value ofewði; jÞ, such as the genetic and simulated annealingalgorithms. Our experiments have shown that evena simple algorithm may be enough to achieve arather good estimation when nX3 for 256� 256gray-scale images (see the next section for moredetails). The simple algorithm is called ‘‘taking-the-first’’ algorithm, which sets ewði; jÞ to be the firstavailable element in bwði; jÞ, where the term ‘‘avail-able’’ refers to the constraint that 8ði1; j1Þaði2; j2Þ,ewði1; j1Þaewði2; j2Þ.

Next, we study the decryption performance of theestimated permutation matrix fW when fWaW .Generally speaking, due to the large informationredundancy existing in multimedia data, usuallypartially recovered plaintext is enough to revealmost visual information. Therefore, if there areenough correct elements in fW , the decryptionperformance may be acceptable from a practical

point of view. From the above discussions, one cansee that correctly recovered elements infW belong totwo different classes:

�
the absolutely correct elements: Derived from thesingle-valued elements of cW ; � the probabilistically correct elements: Derived
from the multi-valued elements of cW , and arecorrectly guessed by an optimization algorithmof selecting a proper position from each bwði; jÞ.

Assuming that the number of single-valued elementsof cW is nc and the probability of success of theoptimization algorithm is ps, the average number ofcorrect elements in fW will be nc þ ps � ðMN � ncÞ.Because ps is generally not fixed (tightly dependenton the employed optimization algorithm), only theabsolutely correct elements are considered here (i.e.,ps ¼ 0 is assumed) to perform a qualitative analysis.This means that we will get a lower bound of theperformance.

Now, the problem of counting correct elements infW is simplified to be another problem of countingsingle-valued elements in cW . From Get_Permu-tation_Matrix function, one can see that thecardinality of bwði; jÞ is uniquely determined byL01ðf 1ði; jÞÞ; . . . ;L

0nðf nði; jÞÞ. To further simplify the

analysis, assume that any two particle valuesare independent of each other2 and denote theoccurrence probability of a particle value l 2

f0; . . . ;L� 1g by Pl . Apparently, it is true thatPL�1l¼0 Pl ¼ 1 and Pl ¼

1Lfor the uniform distribution

of the particle value. Then, one can consider thefollowing two types of positions in bwði; jÞ:
� the only one real position wði; jÞ, which absolutely
occurs in bwði; jÞ;
� other fake positions, each of which occurs in each
L0mðf mði; jÞÞ with probability Pf mði;jÞ, i.e., each ofwhich occurs in all the n sets, L01ðf 1ði; jÞÞ;. . . ;L0nðf nði; jÞÞ, with probability

Qnm¼1Pf mði;jÞ.

Therefore, when the values of f 1ði; jÞ; . . . ; f nði; jÞare fixed, the expected cardinality of bwði; jÞ is1þ ðMN � 1Þ

Qnm¼1Pf mði;jÞ. Because it is very diffi-

cult to estimate a general result when the values ofP1; . . . ;PL�1 are unknown, we only discuss twospecial distributions to demonstrate how to estimate

ARTICLE IN PRESS

103

S. Li et al. / Signal Processing: Image Communication 23 (2008) 212–223 217

a lower bound of the number of the requiredplaintexts.

(1)

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1100

101

102

p

L (p

)

3

104

Uniform distribution: In this case, Pl ¼ 1=L,8l 2 f0; . . . ;L� 1g. One can qualitatively deducethat the average cardinality of bwði; jÞ for anygiven position ði; jÞ is 1þ ðMN � 1Þ=Ln, whichapproaches 1 exponentially as n increases.Generally speaking, when 1þ ðMN � 1Þ=Lnp1:5 or ðMN � 1Þ=Lnp0:5, i.e., more than halfelements in fW are correct, the decryptionperformance will be acceptable. Solving thisinequality, one has nXdlogLð2ðMN � 1ÞÞe. Asan example, for 256� 256 gray-scale images,M ¼ N ¼ L ¼ 256, one has nXdlogLð2ðMN �

1ÞÞe ¼ d2:125e ¼ 3. The average cardinality isabout 1.0039 when n ¼ 3, so it is expected thatthe decryption performance for nX3 will berather good, which is verified by the experimentsgiven in the next section.

10
(2)
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1100

101

102

p

n

Fig. 1. The relationships between LðpÞ, dlogLðpÞð2ðMN � 1ÞÞe (the

lower bound of n) and p ¼ 1=L; 2=L; . . . ; ðL� 1Þ=L, when

M ¼ N ¼ L ¼ 256.

Uniform distribution except for one particlevalue: Typical examples of this kind of distribu-tion are images with large smooth background.Without loss of generality, assume P0 ¼ p andPl ¼ q ¼ ð1� pÞ=ðL� 1Þ for l 2 f1; . . . ;L� 1g.Then, if there are k values of f 1ði; jÞ; . . . ; f nði; jÞequal to 0, which occurs with a probability of

nk

� �pkð1� pÞn�k, the expected cardinality of bwði; jÞ

is 1þ ðMN � 1Þpkqn�k. As a result, one can getthe average value of #ðbwði; jÞÞ � 1 as follows:

ð#ðbwði; jÞÞ � 1Þ

¼Xn

k¼0

n

k

!pkð1� pÞn�k

ðMN � 1Þpkqn�k

¼ ðMN � 1ÞXn

k¼0

n

k

!ðp2Þ

k ð1� pÞ2

L� 1

� �n�k

¼ ðMN � 1Þ p2 þð1� pÞ2

L� 1

� �n

.

Let ðMN � 1Þðp2 þ ð1� pÞ2=ðL� 1ÞÞnp0:5.Then, one can get nXdlogLðpÞð2ðMN � 1ÞÞe,where

LðpÞ ¼1

p2 þ ðð1� pÞ2=L� 1Þ.

When M ¼ N ¼ L ¼ 256, Fig. 1 shows how thevalue of LðpÞ and the lower bound of n changewith respect to the value of p. It can be seen thatthe non-uniformity can cause an increase of thenumber of required plaintexts.

Though the distribution of most multimedia data isnot uniform, our experiments on permutation-onlyimage ciphers have shown that the above quantita-tive results obtained from the uniform distributionis basically correct for natural images: aboutlogLð2ðMN � 1ÞÞ plain-images are sufficient to geta good breaking performance as will be shown inthe next section. In fact, the actual decryptionperformance is even better than the theoreticalexpectation because of the following two reasons:

�
human eyes have a powerful capability ofsuppressing image noises and extracting signifi-cant features: 10% noisy pixels cannot make


much influence on the visual quality of a digitalimage, and it only needs 50% of pixels to revealmost visual information of the original image;
� due to the short-distance and long-distance
relationships in natural images, two pixel valuesare close to each other with a non-negligibleprobability larger than the average probability;as a result, many wrongly decrypted pixels areclose to their true values with a probability largerthan the average probability.

The above two points imply that the decryptionperformance of natural images will be better thanthat of noise-like images. For experimental verifica-tion and more explanations, see Section 4, Figs. 4and 5. This perceptual phenomenon can also begeneralized to audio and speech data.

Finally, consider the time complexity of theabove-discussed known-plaintext attack, i.e., thetime complexity of the Get_Permutation_Ma-trix function. Note that the time complexitydepends on implementation details of this function.This paper only gives a conservative estimation, i.e.,an upper bound. The time complexity of each step isas follows:

�
Step 1: The L sets of each ciphertext f 0m areobtained by scanning f 0m once: for 0pipM � 1and 0pjpN � 1, add ði; jÞ into the setL0mðf
0mði; jÞÞ. Thus, the time complexity of this

step is OðnMNÞ.
� Step 2: The average cardinality of L0mðlÞ is PlMN
and an upper bound of the time complexity ofthis step is

ðMNÞnXði;jÞ

Yn

m¼1

Pf mði;jÞ

!.

When the plaintext has a uniform distribution,

Xði;jÞ

Yn

m¼1

Pf mði;jÞ

!¼

MN

Ln

and the upper bound becomes MN � ðMN=LÞn,which exponentially increases as n increases ifMN4L. However, in practice, the real complex-ity is much smaller due to the optimization of thecalculation process. Here, we consider the so-called halving algorithm, which calculates theintersection of n sets A1; . . . ;An by dividing theminto multi-level groups of ð2; 4; . . . ; 2i; . . .Þ sets.

For example, when n ¼ 11, the calculationprocess is described by

ððA1 \1

A2Þ \3ðA3 \

2A4ÞÞ \

7ððA5 \

4A6Þ \

6ðA7 \

5A8ÞÞ

\10ððA9 \

8A10Þ \

9A11Þ,

where \idenotes the ith intersection operation.

The goal of this halving algorithm is to minimizethe cardinalities of the two sets involved in eachintersection operation so as to reduce the globalcomplexity. To make the estimation of thecomplexity easier, let us consider the case ofn ¼ 2d , where d is an integer. In this case, theoverall complexity can be calculated as followsfor the uniform distribution:

Xd�1k¼0

2k �MN

Ld�k

� �2

¼MN

Ld

� �2

�Xd�1k¼0

ð2L2Þk

¼MN

Ld

� �2

�1� ð2L2Þ

d

1� 2L2

¼ 2d � ðMNÞ2 �ð2L2Þ

�d� 1

1� 2L2

on � ðMNÞ2

2L2 � 1. (2)

As two typical examples, when M ¼ N ¼ 256and L ¼ 2, the complexity is about ð229:2 � nÞ;when M ¼ N ¼ 256 and L ¼ 256, the complex-ity is only ð215 � nÞ. One can see that in both casesthe complexity is always much smaller than2MN � ðMN=LÞn. When n is not a power of 2, thecomplexity will be smaller than ð2dlog2ne=2L2

�1Þ � ðMNÞ2pð2n=ð2L2 � 1ÞÞ � ðMNÞ2.When the distribution is not uniform, thereduction of each intersection becomes not easyto calculate. To simplify the deduction, we usethe average size for all sets and assume that thereduction is proportional to a factor 1=L� (as ananalogue of 1=L). The value of 1=L� can becalculated as a weighted sum of all the possiblesizes divided by MN:

1

L�¼XL�1l¼0

Pl � ðPlMNÞ=MN ¼XL�1l¼0

P2l .

Then, by replacing L in Eq. (2) with

L� ¼1PL�1

l¼0 P2l

,


the overall complexity becomes nðMNÞ2=ð2ðL�Þ2 � 1Þ. Taking the special non-uniformdistribution studied before, P0 ¼ p and Pl ¼

ð1� pÞ=ðL� 1Þ for l 2 f1; . . . ;L� 1g, the valueof L� can be obtained as

LðpÞ ¼1

p2 þð1� pÞ2

L� 1

,

whose relation with p has been shown in Fig. 1(a).As can be seen from the formula and the figure,LðpÞ goes to 1 decreasingly with respect to thevalue of p, so the complexity will goes to nðMNÞ2

as p approaches 1. Fortunately, this does notchange the level of the complexity.
� Step 3: The time complexity of this step is
determined by the details of the involvedoptimization algorithm. For the ‘‘taking-the-first’’ algorithm, the complexity is MN � ð1þðMN � 1Þ=ðL�ÞnÞ MN þ ðMNÞ2=ðL�Þn.
� Step 4: The time complexity of this step is
OðMNÞ.

Combining the above discussions, the final timecomplexity of the function Get_Permutation_-Matrix is always of order n � ðMNÞ2, which ispractically small even for a PC.

From the above analysis, one can see that thetime complexity is mainly determined by Step 2.When the ‘‘taking-the-first’’ algorithm is adopted inthe function Get_Permutation_Matrix, Step 2can be skipped so that the total complexity will stillbe of order Oðn � ðMNÞ2Þ, even without using thehalving algorithm to calculate the intersections. Inthis case, Step 3 can be described as follows:

�
Step 30: For 0pipM � 1 and 0pjpN � 1, dothe following operations: Step 30a: Find the first element satisfying
f 1ði; jÞ ¼ f 01ði0; j0Þ; . . . ; f nði; jÞ ¼ f 0nði

0; j0Þ bysearching each element in L01ðf 1ði; jÞÞ andchecking whether it occurs in L02ðf 2ði; jÞÞ; . . . ;L0nðf nði; jÞÞ; Step 30b: Set ewði; jÞ ¼ ði0; j0Þ and then deleteði0; j0Þ from L01ðf 1ði; jÞÞ; . . . ;L

0nðf mði; jÞÞ.

It is obvious that the time complexity of Step 30a isalways less than n � ðMNÞ and averagely isOðn �MN=L�Þ, so the time complexity of Step 30

is always less than n � ðMNÞ2 and averagely isOðn � ðMNÞ2=L�Þ.

3.2. Chosen-plaintext attack

The chosen-plaintext attack works in the sameway as the known-plaintext attack, but the plaintextcan be deliberately chosen to optimize the estima-tion of fW (i.e., to maximize the decryptionperformance). The following two rules are usefulin the creation of the n chosen plaintexts f 1; . . . ; f n:

�
the histogram of each chosen plaintext should beas uniform as possible; � the i-dimensional (2pipn) histogram of any i
chosen plaintexts should be as uniform aspossible, which is a generalization of the aboverule.

The goal of the above two rules is to minimize theaverage cardinality of the elements in cW , and thento maximize the number of correct elements in theestimated permutation matrix fW .

As an example of the two rules, consider thecondition when M ¼ N ¼ L ¼ 256. In this case, thefollowing two chosen plaintexts are enough toensure a perfect estimation of the permutationmatrix W : f 1 ¼ ½f 1ði; jÞ ¼ i�256�256 and f 2 ¼ ½f 2ði; jÞ¼ j�256�256, i.e.,

f 1 ¼ f T2 ¼

0 � � � 0

..

. . .. ..

.

i � � � i

..

. . .. ..

.

255 � � � 255

266666664

377777775256�256

(3)

and

f 2 ¼ f T1 ¼

0 � � � j � � � 255

..

. . .. ..

. . .. ..

.

0 � � � j � � � 255

26643775256�256

. (4)

For the two chosen plaintexts, ðf 1ði1; j1Þ; f 1ði2; j2ÞÞaðf 2ði1; j1Þ; f 2ði2; j2ÞÞ, 8ði1; j1Þaði2; j2Þ. This ensuresthat #ðL01ðl1Þ \ L

02ðl2ÞÞ ¼ 1, 8l1; l2 2 f0; . . . ;L� 1g.

In general cases, it can be easily deduced that n ¼

dlogLðMNÞe orthogonal plaintexts have to becreated to carry out a successful chosen-plaintextattack. Apparently, it will never be larger thandlogLð2ðMN � 1ÞÞe—the number of required plain-texts in the known-plaintext attack with a goodbreaking performance. This means the chosen-plaintext attack is a little (but not so much) strongerthan the chosen-plaintext attack.

ARTICLE IN PRESS

Fig. 2. The six 256� 256 test images used in the experiments.

Fig. 3. The cipher-images of the six test images.

S. Li et al. / Signal Processing: Image Communication 23 (2008) 212–223220

4. Experiments

To verify the decryption performance of theabove-discussed known-plaintext attack,3 someexperiments have been performed on a typicalpermutation-only image cipher called CIE [48], inwhich the secret permutations are pseudo-randomlygenerated by iterations of a chaotic map. Fig. 2shows six 256� 256 test images used in theexperiments, both of which are in 256 gray scales.In the experiments, the ‘‘taking-the-first’’ algorithmwas used to generate fW from cW in the Get_Per-mutation_Matrix function. It turned out thatsuch a simple algorithm was enough to achieve aconsiderable performance in real attacks.

The cipher-images of the six test images areshown in Fig. 3. When the first nð¼ 1; . . . ; 5Þ testimage(s) and the corresponding cipher-image(s) areknown to the attacker, the breaking results ofCipher-Image #6 are demonstrated in Fig. 4. It canbe seen that one known plain-image is not enoughto reveal any visual information about the 6th testimage, but two are capable to recover a rough view,and three or more are quite enough to achieve avery good performance.

To verify the fact that the breaking performanceis better than the theoretical prediction based on thecorrectly recovered elements in eW , let us see thedecryption performance with n ¼ 2 as an example.For this case, the number of the absolutely correctelements in eW are only 10,600, and the number ofall correct elements in eW is 26,631. In comparison,the number of correctly recovered pixels are 27,210.Although only about 27210

65536 41:52% of the pixelsare recovered, most visual information in theplain-image #6 has been revealed successfully.Now, let us consider the correct pixels that are notrecovered from the correct elements in eW , i.e., theð27210� 26631 ¼ 579Þ more correct pixels. Thesepixels are correctly decrypted with a frequency579=ð65536� 26631Þ 0:0149, which is larger thanthe average probability L�1 0:0039. If we alsocount those pixels whose values close to the rightones, this frequency will be even larger. In fact,excluding the pixels correctly determined by the26,631 correct elements in eW , the histogram of theother ð65536� 26631 ¼ 38905Þ pixels of the differ-ence image between the recovered image and the

Fig. 4. The decrypted images of Cipher-Image #6 when the first n

test images are known to the attacker.

3The chosen-plaintext attack is omitted in this section, since

one can absolutely break the permutation matrix by choosing two

plaintexts f 1 and f 2 as shown in Eqs. (3) and (4).

ARTICLE IN PRESS

−250 −200 −150 −100 −50 0 50 100 150 200 2500

100

200

300

400

500

600

Image #6

A noise image

Fig. 5. The histogram of the difference image between the

recovered image and the original plain-image, when the plain-

image is Image #6 (the blue line) or a randomly generated noise

image (the red line).

S. Li et al. / Signal Processing: Image Communication 23 (2008) 212–223 221

original plain-image #6 is a Laplacian-like functionas shown in Fig. 5. In comparison, the histogram ofthe difference image corresponding to a randomlygenerated noise image of the same size 256� 256 isalso shown. It is clear that the Laplacian-likehistogram corresponding to Image #6 is caused bythe correlation information existing in naturalimages. Note that the triangular histogram ofthe noise image can be easily deduced under theassumption that the two involved images (i.e., thenoise image and the corresponding cipher-image)are independent of each other and have a uniformhistogram: 8i ¼ �255; . . . ; 255, the occurrenceprobability of the difference value i in the histogramis: ð256� jijÞ=65536 ¼ 1=256� jij=65536.

5. Conclusions

Based on a general model of permutation-onlymultimedia ciphers and from a general perspective,the present paper analyzes the security this type ofciphers against plaintext attacks. When the plaintextis of size M �N and distributed uniformly with L

possible values, it is found that only OðlogLðMNÞÞ

plaintexts are enough to achieve a good breakingperformance. It has also been found that the attackcomplexity is practically small—only Oðn � ðMNÞ2Þ,where n denotes the number of known/chosenplaintexts. Some experiments on a permutation-only image cipher have been shown to demonstratethe performance of the proposed known-plaintext

attack. From the results of this paper, we draw thefollowing conclusions: for permutation-only ci-phers, (1) no better secret permutations can beachieved to offer a higher security level againstplaintext attacks (compared with the general modeldiscussed in this paper); (2) the secret permutationsshould be updated in a frequency smaller thanlogLðMNÞ to offer an acceptable level of securityagainst plaintext attacks, or they have to becombined with other encryption techniques toachieve this goal.

Acknowledgments

Shujun Li was supported by the Alexander vonHumboldt Foundation and by The Hong KongPolytechnic University’s Postdoctoral FellowshipScheme (under Grant no. G-YX63). The work ofNikolaos G. Bourbakis was supported by the AIISInc., NY, USA. The work of K.-T. Lo wassupported by the Research Grants Council of theHong Kong SAR Government under Project No.523206 (PolyU 5232/06E).

References

[1] C. Alexopoulos, N.G. Bourbakis, N. Ioannou, Image

encryption method using a class of fractals, J. Electron.

Imaging 4 (3) (1995) 251–259.

[2] H.J. Beker, F.C. Piper, Secure Speech Communications,

Academic, New York, 1985.

[3] M. Bertilsson, E.F. Brickell, I. Ingemarson, Cryptanalysis of

video encryption based on space-filling curves, in: Advances

in Cryptology—EuroCrypt’88, Lecture Notes in Computer

Science, vol. 434, Springer, Berlin, 1989, pp. 403–411.

[4] N.G. Bourbakis, C. Alexopoulos, Picture data encryption

using SCAN patterns, Pattern Recognition 25 (6) (1992)

567–581.

[5] N.G. Bourbakis, A. Dollas, SCAN-based compression-

encryption-hiding for video on demand, IEEE Multimedia

10 (3) (2003) 79–87.

[6] L. Brown, Comparing the security of pay-TV systems for use

in Australia, Austral. Telecomm. Res. 24 (2) (1990) 1–8.

[7] C.-C. Chang, T.-X. Yu, Cryptanalysis of an encryption

scheme for binary images, Pattern Recognition Lett. 23 (14)

(2002) 1847–1852.

[8] H.K.-C. Chang, J.-L. Liu, A linear quadtree compression

scheme for image encryption, Signal Process.: Image

Commun. 10 (4) (1997) 279–290.

[9] H.-C. Chen, J.-I. Guo, L.-C. Huang, J.-C. Yen, Design and

realization of a new signal security system for multimedia

data transmission, EURASIP J. Appl. Signal Process. 2003

(13) (2003) 1291–1305.

[10] H. Cheng, X. Li, Partial encryption of compressed images

and videos, IEEE Trans. Signal Process. 48 (8) (2000)

2439–2451.


[11] H.C.H. Cheng, Partial encryption for image and video

communication, Master Thesis, Department of Computing

Science, University of Alberta, Edmonton, Alberta, Canada,

1998.

[12] K.-L. Chung, L.-C. Chang, Large encryption binary images

with higher security, Pattern Recognition Lett. 19 (5–6)

(1998) 461–468.

[13] J.H. Dolske, Secure MPEG video: techniques and pitfalls,

June 1997, available online at hhttp://www.dolske.net/old/

gradwork/cis788r08/i.

[14] J. Fridrich, Symmetric ciphers based on two-dimensional

chaotic maps, Internat. J. Bifurcation Chaos 8 (6) (1998)

1259–1284.

[15] B. Furht, E. Muharemagic, D. Socek, Multimedia Encryp-

tion and Watermarking, Springer, Berlin, 2005.

[16] B. Furht, D. Socek, A.M. Eskicioglu, Fundamentals of

multimedia encryption techniques, in: B. Furht, D. Kirovski

(Eds.), Multimedia Security Handbook, CRC Press, LLC,

Boca Raton, FL, 2004, pp. 93–131 (Chapter 3).

[17] B. Goldburg, S. Sridharan, E. Dawson, Design and

cryptanalysis of transform-based analog speech scramblers,

IEEE J. Selected Areas Commun. 11 (5) (1993) 735–744.

[18] J.-K. Jan, Y.-M. Tseng, On the security of image encryption

method, Inform. Process. Lett. 60 (5) (1996) 261–265.

[19] A. Kudelski, Method for scrambling and unscrambling a

video signal, U.S. Patent 5375168, 1994.

[20] M. Kuhn, AntiSky—an image processing attack on Video-

Crypt, 1994, online document, available at hhttp://www.cl.

cam.ac.uk/�mgk25/tv-crypt/image-processing/antisky.htmli.

[21] M. Kuhn, Analysis for the Nagravision video scrambling

method, 1998, online document, available at hhttp://

www.cl.cam.ac.uk/�mgk25/nagra.pdfi.

[22] I.J. Kumar, Cryptology of speech signal, in: Cryptology:

System Identification and Key-Clustering, Aegean Park

Press, 1997, pp. 309–380 (Chapter 6).

[23] S. Li, G. Chen, X. Zheng, Chaos-based encryption for

digital images and videos, in: B. Furht, D. Kirovski (Eds.),

Multimedia Security Handbook, CRC Press, LLC, Boca

Raton, FL, 2004, pp. 133–167 (Chapter 4), preprint

available at hhttp://www.hooklee.com/pub.htmli.

[24] S. Lian, J. Sun, Z. Wang, Perceptual cryptography on

SPIHT compressed images or videos, in: Proceedings of the

IEEE International Conference on Multimedia & Expo

(ICME’2004), 2004.

[25] S. Lian, J. Sun, Z. Wang, Perceptual cryptography on

JPEG2000 compressed images or videos, in: Proceedings of

the International Conference on Computer and Information

Technology (CIT’2004), IEEE Computer Society, Silver

Spring, MD, 2004, pp. 78–83.

[26] S. Lian, X. Wang, J. Sun, Z. Wang, Perceptual cryptography

on wavelet-transform encoded videos, in: Proceedings of the

IEEE International Symposium on Intelligent Multimedia,

Video and Speech Processing (ISIMP’2004), 2004, pp. 57–60.

[27] S.S. Maniccam, N.G. Bourbakis, Image and video encryp-

tion using SCAN patterns, Pattern Recognition 37 (4) (2004)

725–737.

[28] Y. Mao, G. Chen, S. Lian, A novel fast image encryption

scheme based on 3D chaotic Baker maps, Int. J. Bifurcation

Chaos 14 (10) (2004) 3613–3624.

[29] Y. Mao, M. Wu, A joint signal processing and cryptographic

approach to multimedia encryption, IEEE Trans. Image

Process. 15 (7) (2006) 2061–2075.

[30] Y. Matias, A. Shamir, A video scrambing technique based

on space filling curve (extended abstract), in: Advances in

Cryptology—Crypto’87, Lecture Notes in Computer

Science, vol. 293, Springer, Berlin, 1987, pp. 398–417.

[31] A. Menezes, P. van Oorschot, S. Vanstone, Handbook of

Applied Cryptography, CRC Press, Inc., Boca Raton, FL,

1996.

[32] R.K. Nichols, P.C. Lekkas, Speech cryptology, in: Wireless

Security: Models, Threats, and Solutions, McGraw-Hill,

New York, 2002, pp. 253–327 (Chapter 6).

[33] D. Qi, J. Zou, X. Han, A new class of scrambling

transformation and its application in the image information

covering, Sci. China—Ser. E (English Edition) 43 (3) (2000)

304–312.

[34] L. Qiao, Multimedia security and copyright protection,

Ph.D. Thesis, Department of Computer Science, University

of Illinois at Urbana-Champaign, Urbana, IL, USA,

1998.

[35] L. Qiao, K. Nahrstedt, Is MPEG encryption by using

random list instead of ZigZag order secure?, in: Proceedings

of the IEEE International Symposium on Consumer

Electronics (ISCE’97), 1997, pp. 226–229.

[36] L. Qiao, K. Nahrsted, Comparison of MPEG encryption

algorithms, Comput. Graphics 22 (4) (1998) 437–448.

[37] J. Scharinger, Fast encryption of image data using chaotic

Kolmogorov flows, J. Electron. Imaging 7 (2) (1998)

318–325.

[38] B. Schneier, Applied Cryptography—Protocols, Algorithms,

and Source Code in C, second ed., Wiley, New York, 1996.

[39] S.U. Shin, K.S. Sim, K.H. Rhee, A secrecy scheme for

MPEG video data using the joint of compression

and encryption, in: Information Security: Second Interna-

tional Workshop (ISW’99) Proceedings, Lecture Notes

in Computer Science, vol. 1729, Springer, Berlin, 1999,

pp. 191–201.

[40] S. Sridharan, E. Dawson, B. Goldburg, Speech encryption in

the transform domain, Electron. Lett. 26 (10) (1990)

655–657.

[41] S. Sridharan, E. Dawson, B. Goldburg, Fast Fourier

transform based speech encryption system, IEE Proc. I—

Commun. Speech Vision 138 (3) (1991) 215–223.

[42] L. Tang, Methods for encrypting and decrypting MPEG

video data efficiently, in: Proceedings of the Fourth

ACM International Conference on Multimedia, 1996,

pp. 219–229.

[43] T. Uehara, R. Safavi-Naini, Chosen DCT coefficients attack

on MPEG encryption schemes, in: Proceedings of the IEEE

Pacific-Rim Conference on Multimedia (IEEE-PCM’2000),

2000, pp. 316–319.

[44] T. Uehara, R. Safavi-Naini, P. Ogunbona, Securing wavelet

compression with random permutations, in: Proceedings of

the IEEE Pacific-Rim Conference on Multimedia (IEEE

PCM’2000), 2000, pp. 332–335.

[45] A. Uhl, A. Pommer, Image and Video Encryption: From

Digital Rights Management to Secured Personal Commu-

nication, Springer, Berlin, 2005.

[46] D. Wagner, G.G. Rose, T. Ritter, T. Jakobsen, N. Ferguson,

D.R. Stinson, Transposition ciphers, 2001, online discus-

sions in news group sci.crypt.research at google.com,

available online at hhttp://groups-beta.google.com/group/

sci.crypt.research/browse_thread/thread/3cd88407a3485cb1/

58ff17304187ce74#58ff17304187ce74i.

http://www.dolske.net/old/gradwork/cis788r08/

http://www.dolske.net/old/gradwork/cis788r08/

http://www.cl.cam.ac.uk/~mgk25/tv-crypt/image-processing/antisky.html



http://www.cl.cam.ac.uk/~mgk25/nagra.pdf



http://www.hooklee.com/pub.html

http://groups-beta.google.com/group/sci.crypt.research/browsethread/thread/3cd88407a3485cb1/58ff17304187ce74#58ff17304187ce74




[47] Wikipedia, Television encryption, 2007, online document.

URL hhttp://en.wikipedia.org/wiki/Television_encryptioni.

[48] J.-C. Yen, J.-I. Guo, A new chaotic image encryption

algorithm, in: Proceedings of (Taiwan) National Symposium

on Telecommunications, 1998, pp. 358–362.

[49] J.-C. Yen, J.-I. Guo, Efficient hierarchical chaotic

image encryption algorithm and its VLSI realisation,

IEE Proc.—Vision Image Signal Process. 147 (2) (2000)

167–175.

[50] W. Zeng, S. Lei, Efficient frequency domain selective

scrambling of digital video, IEEE Trans. Multimedia 5 (1)

(2003) 118–129.

[51] W. Zeng, H. Yu, C.-Y. Lin (Eds.), Multimedia Security

Technologies for Digital Rights Management, Academic

Press, New York, 2006.

[52] X.-Y. Zhao, G. Chen, Ergodic matrix in image encryption,

in: Proceedings of the Second International Conference on

Image and Graphics, Proceedings of SPIE, vol. 4875, 2002,

pp. 394–401.

[53] X.-Y. Zhao, G. Chen, D. Zhang, X.-H. Wang, G.-C. Dong,

Decryption of pure-position permutation algorithms,

J. Zhejiang Univ. Sci. 5 (7) (2004) 803–809.

[54] R. Zunino, Fractal circuit layout for spatial decorrelation of

images, Electron. Lett. 34 (20) (1998) 1929–1930.

http://en.wikipedia.org/wiki/Television_encryption

Documents

A general quantitative cryptanalysis of permutation-only multimedia ciphers against plaintext attacks