ARTICLE IN PRESS
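0923-5965/$ - see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.image.2008.01.003
*Corresponding author. Tel.: +49 2331 9871160; fax: +49 2331 987375.
E-mail address: [email protected] (S. Li).
URL: http://www.hooklee.com (S. Li).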
Signal Processing: Image Communication 23 (2008) 212–223
www.elsevier.com/locate/image
A general quantitative cryptanalysis of permutation-only multimedia ciphers against plaintext attacks
Shujun Li (a,*), Chengqing Li (b), Guanrong Chen (b), Nikolaos G. Bourbakis (c), Kwok-Tung Lo (d)
(a) FernUniversität in Hagen, Lehrgebiet Informationstechnik, Universitätsstraße 27, 58084 Hagen, Germany
(b) Department of Electronic Engineering, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong, China
(c) Information Technology Research Institute, College of Engineering and Computer Science, Wright State University, 3640 Glenn Hwy, Dayton, OH 45435, USA
(d) Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong SAR, China
Received 1 August 2006; received in revised form 21 January 2008; accepted 24 January 2008
Abstract
In recent years secret permutations have been widely used for protecting different types of multimedia data, including speech files, digital images and videos. Based on a general model of permutation-only multimedia ciphers, this paper performs a quantitative cryptanalysis of the performance of this kind of cipher against plaintext attacks. When the plaintext is of size $M \times N$ and has $L$ different levels of values, the following quantitative cryptanalytic findings are obtained under the assumption of a uniform distribution of each element in the plaintext: (1) all permutation-only multimedia ciphers are practically insecure against known/chosen-plaintext attacks, in the sense that only $O(\log_L(MN))$ known/chosen plaintexts are sufficient to recover, on average, at least half of the elements of the plaintext; (2) the computational complexity of the known/chosen-plaintext attack is only $O(n \cdot (MN)^2)$, where $n$ is the number of known/chosen plaintexts used. When the plaintext has a non-uniform distribution, the number of required plaintexts and the computational complexity are also discussed. Experiments are given to demonstrate the real performance of the known-plaintext attack on a typical permutation-only image cipher.
© 2008 Elsevier B.V. All rights reserved.
Keywords: Permutation-only multimedia encryption; Image; Video; Speech; Cryptanalysis; Known-plaintext attack; Chosen-plaintext
attack
1. Introduction
With the rapid progress of computer and communication network technologies, many concerns have been raised about the security of multimedia data transmitted over open networks. Also, secure storage of digital multimedia data is demanded in many real applications, such as confidential teleconferencing, pay-TV, medical and military imaging, and privacy-related multimedia services. Due to the prevalence of multimedia services in consumer electronic devices, users of handheld devices have started to require content protection of multimedia data including recorded speech segments, personal photos and private movie clips.
To meet all these needs in practice, encryption algorithms are required to offer a sufficient level of security for different multimedia applications. Apparently, the simplest way to encrypt multimedia data is to treat them as 1-D bit-streams, and then to encrypt them with any available cipher [31,38]. In some multimedia applications, such a simple idea of naive encryption may be enough. However, in many other applications, especially when digital images and videos are involved, encryption schemes considering special features of the multimedia data, such as bulky sizes and large redundancy in uncompressed images/videos, are still required to achieve a better overall performance and to make the integration of the encryption scheme into the whole process easier. In the past several decades, different algorithms have been proposed to provide specific solutions to the encryption of images, videos and speech data. Meanwhile, many cryptanalytic results have been reported, leading to the conclusion that a number of multimedia encryption schemes are insecure from the cryptographical point of view. For recent surveys on image and video encryption algorithms, see [15,16,23,45,51], and for surveys on speech encryption, see [2,15,22,32].
The use of secret permutations is very popular in analog pay-TV services as a main approach to protecting video signals in broadcast-TV [6,19,47]. Due to the specific structure of analog video signals, there are only three major ways to perform secret permutations on lines: time reversal (transmitting randomly selected lines in reverse order), line cut and rotation (cutting each line at a random point and swapping the two halves of the line), and line shuffling. The situation becomes much better, however, when secret permutations are used to protect digital multimedia data. According to the format of the multimedia data to be encrypted, a lot of elements can be permuted in a secure way: pixels (samples), bitplanes, lines, rows, blocks, macroblocks, slices, transform coefficients, VLC (variable-length codewords) syntax elements, tree nodes, and so on [1,4,5,8,9,12,14,17,24–30,33,37,39–42,44,48–50,52,54]. While some multimedia encryption schemes combine secret permutations with other encryption techniques, there are many multimedia encryption algorithms that are entirely based on secret permutations [1,4,6,8,9,12,17,19,30,33,39–42,44,47–50,52,54]. These are called permutation-only multimedia ciphers in this paper. Note that some ciphers can also be classified as permutation-only ones, even though other encryption techniques are used together with secret permutations. For instance, the video ciphers proposed in [24–26] become permutation-only ciphers if the sign bits of all encrypted data elements are neglected. The main advantages of using only secret permutations in a cipher include: (i) they can be easily implemented; (ii) when used properly, perceptual information about the plaintext can be efficiently concealed.
The security of permutation-only multimedia ciphers has been extensively studied. Almost all permutation-only analog pay-TV encryption schemes and some permutation-only ciphers had already been found insecure against ciphertext-only attacks, due to the high information redundancy in multimedia data and/or some specific weaknesses in the encryption algorithms [3,13,20,21,34]. In addition, it has been widely known that permutation-only multimedia ciphers are insecure against the known/chosen-plaintext attack [7,10,11,13,17,18,34–36,41,43,44,53], which is quite understandable since the secret permutations can be recovered by comparing the plaintexts and the permuted ciphertexts. Though secret permutations suffer from the above security problems, many researchers still hope that it will be useful to design multimedia encryption schemes based on this technique, for the following reasons:
(1) the insecurity against ciphertext-only attacks is not a problem for most digital permutation-only ciphers because of the use of more complicated permutations;
(2) the insecurity against plaintext attacks can be solved in practice, by using dynamically updated and/or plaintext-dependent secret permutations;
(3) it is one of the simplest encryption techniques to maintain format compliance and size preservation simultaneously;
(4) by combining it with very simple substitution operations, multimedia encryption of high confidentiality can be achieved.

To the best of our knowledge, all previous cryptanalytic results were performed for specific permutation-only image/video ciphers, and a general quantitative study about plaintext attacks has not been reported to clarify the number of required plaintexts and the computational complexity of such an attack (see footnote 1). As a result, some questions still remain to be answered, which include: (i) Can the security of permutation-only multimedia encryption algorithms be effectively enhanced by designing new methods to generate "better" secret permutations? (ii) How frequently should the secret permutations be updated to provide an acceptable security against plaintext attacks?
This paper reports a general cryptanalysis of permutation-only multimedia encryption algorithms against plaintext attacks, mainly focusing on the quantitative relation between the breaking performance and the number of required known/chosen plaintexts, and provides an estimation of the attack complexity. The cryptanalysis is performed on a general model of permutation-only multimedia ciphers by considering the plaintext (image, speech, frame of videos, etc.) as an $M \times N$ matrix in which each element has $L$ possible distinct values. Under the assumption that each element in the matrix has an independent and uniform distribution, it will be shown that the number of plaintexts required to obtain an acceptable breaking performance in the known-plaintext attack is $\lceil \log_L(2(MN-1)) \rceil$. When the plaintext does not have a uniform distribution, this number will increase accordingly. This issue will also be studied on a special non-uniform distribution. For the chosen-plaintext attack, it will be shown that only $\lceil \log_L(MN) \rceil$ plaintexts are enough to get a good breaking performance. In addition, an upper bound of the attack complexity will be obtained: $O(n \cdot (MN)^2)$, where $n$ is the number of known/chosen plain-images.
The rest of this paper is organized as follows. In Section 2, a general model of permutation-only multimedia ciphers is described. Cryptanalysis on this normalized model is studied in detail in Section 3. Some experimental results are shown in Section 4 to support the theoretical cryptanalysis. The last section concludes the paper.
2. A general model of permutation-only multimedia ciphers
Though different kinds of multimedia data require different kinds of secret permutations, it is possible to construct a general model by considering the plaintext as a 2-D $M \times N$ matrix. This is for the following reasons: (i) 1-D speech data are just a special case of $M = 1$; (ii) 3-D videos are generally encrypted frame by frame, and each frame is encrypted block by block, so permutation-only video encryption is actually a generalized case of permutation-only image encryption; (iii) the dimension remains unchanged when a multimedia signal is converted to the transform domain. Thus, in the remainder of this section, we describe the general model based on a 2-D input plaintext. To facilitate the discussion below and to avoid potential confusion, we use a special term "particles" to denote elements in the 2-D plaintext that are permuted, such as pixels in a plain-image or transform coefficients in a block of a video frame.

Footnote 1: Though there were some simple discussions on the quantitative aspects of known/chosen-plaintext attacks of bit-permutation ciphers in the cryptology community [46], this problem has not been systematically and quantitatively studied in a general way for any case, especially for permutation-only multimedia ciphers.
As its name suggests, a permutation-only multimedia cipher encrypts a 2-D plaintext by permuting the positions of all particles in a secret way. The secret permutations have to be invertible to make the decryption possible. This means that all permutation-only ciphers are symmetric ciphers. Although many different methods have been proposed to realize secret key-dependent pixel permutations, for a given plaintext of size $M \times N$, a permutation-only cipher can be normalized with an invertible key-dependent permutation matrix of size $M \times N$, denoted by
\[ W = [w(i,j) = (i',j') \in \mathbb{M} \times \mathbb{N}]_{M \times N}, \tag{1} \]
where $\mathbb{M} = \{0, \ldots, M-1\}$ and $\mathbb{N} = \{0, \ldots, N-1\}$. With the permutation matrix $W$ and its inverse $W^{-1} = [w^{-1}(i,j)]_{M \times N}$, for a plaintext $f = [f(i,j)]_{M \times N}$ and its corresponding ciphertext $f' = [f'(i,j)]_{M \times N}$, the encryption and decryption procedures of a permutation-only cipher can be described as follows:
• the encryption procedure: for $0 \le i \le M-1$ and $0 \le j \le N-1$, $f'(w(i,j)) = f(i,j)$;
• the decryption procedure: for $0 \le i \le M-1$ and $0 \le j \le N-1$, $f(w^{-1}(i,j)) = f'(i,j)$.
In short form, we denote the encryption procedure by $f'(W(I)) = f(I)$ and the decryption procedure by $f(W^{-1}(I)) = f'(I)$, where
\[ I = \begin{bmatrix} (0,0) & \cdots & (0,N-1) \\ \vdots & \ddots & \vdots \\ (M-1,0) & \cdots & (M-1,N-1) \end{bmatrix}_{M \times N}. \]
To ensure the invertibility of the permutation matrix, i.e., to make the decryption possible, the following property should be satisfied: $\forall (i_1,j_1) \neq (i_2,j_2)$, $w(i_1,j_1) \neq w(i_2,j_2)$. This means that $W$ determines a bijective (i.e., one-to-one) permutation mapping, $F_W : \mathbb{M} \times \mathbb{N} \to \mathbb{M} \times \mathbb{N}$.
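As an illustration of the normalized model, the following minimal Python sketch represents the permutation matrix as a dictionary from plaintext positions to ciphertext positions. The helper `make_permutation` and its seed-based shuffling are hypothetical stand-ins for a real key schedule and are not taken from any of the cited ciphers; only the two assignment rules $f'(w(i,j)) = f(i,j)$ and $f(w^{-1}(i,j)) = f'(i,j)$ come from the model above.

```python
import random

def make_permutation(M, N, seed):
    """Hypothetical key schedule: build a bijection W on all M*N positions."""
    positions = [(i, j) for i in range(M) for j in range(N)]
    shuffled = positions[:]
    random.Random(seed).shuffle(shuffled)
    # W maps a plaintext position (i, j) to a ciphertext position (i', j').
    return dict(zip(positions, shuffled))

def invert(W):
    """Return W^{-1}, mapping ciphertext positions back to plaintext positions."""
    return {target: source for source, target in W.items()}

def encrypt(f, W):
    """Encryption rule: f'(w(i, j)) = f(i, j)."""
    M, N = len(f), len(f[0])
    g = [[None] * N for _ in range(M)]
    for (i, j), (i2, j2) in W.items():
        g[i2][j2] = f[i][j]
    return g

def decrypt(g, W_inv):
    """Decryption rule: f(w^{-1}(i, j)) = f'(i, j)."""
    M, N = len(g), len(g[0])
    f = [[None] * N for _ in range(M)]
    for (i, j), (i0, j0) in W_inv.items():
        f[i0][j0] = g[i][j]
    return f
```

Because only positions change, the multiset of particle values is identical in the plaintext and the ciphertext, which is the property the plaintext attacks in Section 3 rely on.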
From the above description, one can see that the design of a permutation-only cipher focuses on two points: (1) what the secret key $K$ is; (2) how the permutation matrix $W$ and its inverse $W^{-1}$ are derived from the secret key $K$. Generally speaking, each key defines a permutation matrix, and each permutation-only cipher defines a finite set containing a number of permutation matrices selected from all $(MN)!$ possible ones. In the relevant literature, many different methods have been proposed to derive the permutation matrix from a key, some of which are listed as follows:
• SCAN language based methods [1,4,5,8,12,27]: Define some different scan patterns of the 2-D plaintext and combine these patterns to obtain a permutation matrix by scanning the whole plaintext particle by particle;
• quadtree based methods [8,12]: Divide the plaintext into a multi-level quadtree and shuffle the order of the four nodes in each level to realize a permutation matrix;
• 2-D chaotic maps based methods [14,28,37]: Iterate a discretized 2-D chaotic map over the $M \times N$ plaintext many times to realize a permutation matrix;
• fractal curves based methods [30,54]: Use a fractal(-like) curve to replace the normal scan order to realize a permutation matrix;
• pseudo-random rotations based methods [48,49]: Pseudo-randomly rotate particles along some straight lines many times to realize a permutation matrix;
• matrix transformation based methods [33]: Use (integer) transformations of matrices, such as the n-dimensional Arnold transformation and the Fibonacci-Q transformation, to define permutation matrices;
• composite methods [52]: Combine different methods to realize more complicated permutation matrices.
Although different types of secret keys are used in different permutation-only multimedia ciphers to generate the permutation matrix, it is reasonable to consider the permutation matrix $W$ itself as the equivalent encryption key and $W^{-1}$ as the equivalent decryption key. From such a point of view, all permutation-only multimedia ciphers can be considered the same. This is the basis for the security analysis carried out in the next section.
3. General quantitative cryptanalysis
In this section, we discuss the general quantitative cryptanalysis of plaintext attacks based on the above general model of permutation-only multimedia ciphers.
3.1. Known-plaintext attack
As discussed above, when a permutation-only multimedia cipher is used to encrypt a plaintext, the particle at the position $(i,j)$ will be secretly permuted to another fixed position $(i',j')$ while its value remains unchanged. Therefore, by comparing a number of known plaintexts and the corresponding ciphertexts, it is possible for an attacker to (partially or even totally) reconstruct the secret permutations of all particles, i.e., to derive the encryption/decryption keys: the permutation matrix $W$ and its inverse $W^{-1}$.
Given $n$ known plaintexts $f_1, \ldots, f_n$ and their ciphertexts $f'_1, \ldots, f'_n$, the deduction procedure of the two key-matrices $W$ and $W^{-1}$ can be described by a function Get_Permutation_Matrix. With the input parameters $(f_1, \ldots, f_n, f'_1, \ldots, f'_n, M, N)$, this function returns an estimation of the permutation matrix $W$ and its inverse $W^{-1}$. Assuming the value of each particle ranges in $\{0, \ldots, L-1\}$, the function Get_Permutation_Matrix works as follows:
• Step 1: Compare pixel values within the $n$ ciphertexts $f'_1, \ldots, f'_n$ to get $(n \cdot L)$ sets of positions:
\[ \Lambda'_1(0), \ldots, \Lambda'_1(L-1), \ldots, \Lambda'_n(0), \ldots, \Lambda'_n(L-1), \]
where $\Lambda'_m(l) \subseteq \mathbb{M} \times \mathbb{N}$ denotes a set containing the positions of all particles in $f'_m$ $(m = 1, \ldots, n)$ whose values are equal to $l \in \{0, \ldots, L-1\}$, i.e., $\forall (i',j') \in \Lambda'_m(l)$, $f'_m(i',j') = l$. Note that $\Lambda'_m(0), \ldots, \Lambda'_m(L-1)$ actually compose a partition of the set of all positions:
\[ \bigcup_{l=0}^{L-1} \Lambda'_m(l) = \mathbb{M} \times \mathbb{N} = \{(0,0), \ldots, (M-1,N-1)\}, \]
and $\forall l_1 \neq l_2$, $\Lambda'_m(l_1) \cap \Lambda'_m(l_2) = \emptyset$;
• Step 2: Get a multi-valued permutation matrix, $\widehat{W} = [\widehat{w}(i,j)]_{M \times N}$, where
\[ \widehat{w}(i,j) = \bigcap_{m=1}^{n} \Lambda'_m(f_m(i,j)). \]
Here, note that
\[ \bigcup_{\substack{0 \le i \le M-1 \\ 0 \le j \le N-1}} \widehat{w}(i,j) = \mathbb{M} \times \mathbb{N} \]
and that $\widehat{w}(i_1,j_1) = \widehat{w}(i_2,j_2)$ may hold for $(i_1,j_1) \neq (i_2,j_2)$;
• Step 3: Determine a single-valued permutation matrix, $\widetilde{W} = [\widetilde{w}(i,j)]_{M \times N}$, from $\widehat{W}$, where $\widetilde{w}(i,j) \in \widehat{w}(i,j)$ and $\forall (i_1,j_1) \neq (i_2,j_2)$, $\widetilde{w}(i_1,j_1) \neq \widetilde{w}(i_2,j_2)$;
• Step 4: Output $\widetilde{W}$ and its inverse $\widetilde{W}^{-1} = [\widetilde{w}^{-1}(i,j)]_{M \times N}$ as the estimations of $W$ and $W^{-1}$.
Footnote 2: This is actually not true for most multimedia data, but we use this strong assumption to carry out a qualitative estimation.
Apparently, if and only if $\#(\widehat{w}(0,0)) = \cdots = \#(\widehat{w}(M-1,N-1)) = 1$, i.e., each element of $\widehat{W}$ contains only one position, it is true that $\widetilde{W} = W$ and the cipher is totally broken. However, because some elements of $\widehat{W}$ contain more than one position, generally $\widetilde{W}$ is not an exact estimation of $W$. Assume that there are $\widehat{N}$ $(\le MN)$ distinct elements in $\widehat{W}$, and that the $\widehat{N}$ elements are $\widehat{w}_1, \ldots, \widehat{w}_{\widehat{N}}$. Then, it can be easily verified that there are $\prod_{k=1}^{\widehat{N}} \#(\widehat{w}_k)!$ possibilities of $\widetilde{W}$. To make the estimation of $\widetilde{W}$ as accurate as possible, some specific optimization algorithms can be used to choose a better position from $\widehat{w}(i,j)$ as the value of $\widetilde{w}(i,j)$, such as the genetic and simulated annealing algorithms. Our experiments have shown that even a simple algorithm may be enough to achieve a rather good estimation when $n \ge 3$ for $256 \times 256$ gray-scale images (see the next section for more details). The simple algorithm is called the "taking-the-first" algorithm, which sets $\widetilde{w}(i,j)$ to be the first available element in $\widehat{w}(i,j)$, where the term "available" refers to the constraint that $\forall (i_1,j_1) \neq (i_2,j_2)$, $\widetilde{w}(i_1,j_1) \neq \widetilde{w}(i_2,j_2)$.
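The four steps of Get_Permutation_Matrix with the "taking-the-first" rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: plaintexts and ciphertexts are assumed to be lists of lists of integers, the matrices are stored as dictionaries keyed by position, and "first" is taken to mean the smallest position in row-major order.

```python
from collections import defaultdict

def get_permutation_matrix(plains, ciphers, M, N):
    """Sketch of Steps 1-4; returns (W_tilde, W_tilde_inverse, W_hat)."""
    n = len(plains)
    # Step 1: for each ciphertext, group positions by particle value.
    value_sets = []
    for g in ciphers:
        sets = defaultdict(set)
        for i in range(M):
            for j in range(N):
                sets[g[i][j]].add((i, j))
        value_sets.append(sets)
    # Step 2: multi-valued matrix W_hat as the intersection of candidate sets.
    W_hat = {}
    for i in range(M):
        for j in range(N):
            candidates = set(value_sets[0][plains[0][i][j]])
            for m in range(1, n):
                candidates &= value_sets[m][plains[m][i][j]]
            W_hat[(i, j)] = candidates
    # Step 3: "taking-the-first": pick the first candidate not yet assigned.
    used = set()
    W_tilde = {}
    for pos in sorted(W_hat):
        for cand in sorted(W_hat[pos]):
            if cand not in used:
                W_tilde[pos] = cand
                used.add(cand)
                break
    # Step 4: output the estimate and its inverse.
    W_tilde_inv = {target: source for source, target in W_tilde.items()}
    return W_tilde, W_tilde_inv, W_hat
```

Positions whose candidate set in `W_hat` has exactly one element are recovered with certainty; the remaining ones are only guesses that happen to be consistent with all known plaintext/ciphertext pairs.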
Next, we study the decryption performance of the estimated permutation matrix $\widetilde{W}$ when $\widetilde{W} \neq W$. Generally speaking, due to the large information redundancy existing in multimedia data, a partially recovered plaintext is usually enough to reveal most visual information. Therefore, if there are enough correct elements in $\widetilde{W}$, the decryption performance may be acceptable from a practical point of view. From the above discussions, one can see that correctly recovered elements in $\widetilde{W}$ belong to two different classes:
• the absolutely correct elements: derived from the single-valued elements of $\widehat{W}$;
• the probabilistically correct elements: derived from the multi-valued elements of $\widehat{W}$, and correctly guessed by an optimization algorithm that selects a proper position from each $\widehat{w}(i,j)$.
Assuming that the number of single-valued elements of $\widehat{W}$ is $n_c$ and the probability of success of the optimization algorithm is $p_s$, the average number of correct elements in $\widetilde{W}$ will be $n_c + p_s \cdot (MN - n_c)$. Because $p_s$ is generally not fixed (it depends tightly on the employed optimization algorithm), only the absolutely correct elements are considered here (i.e., $p_s = 0$ is assumed) to perform a qualitative analysis. This means that we will get a lower bound of the performance.
Now, the problem of counting correct elements in $\widetilde{W}$ is simplified to another problem of counting single-valued elements in $\widehat{W}$. From the Get_Permutation_Matrix function, one can see that the cardinality of $\widehat{w}(i,j)$ is uniquely determined by $\Lambda'_1(f_1(i,j)), \ldots, \Lambda'_n(f_n(i,j))$. To further simplify the analysis, assume that any two particle values are independent of each other (see footnote 2) and denote the occurrence probability of a particle value $l \in \{0, \ldots, L-1\}$ by $P_l$. Apparently, it is true that $\sum_{l=0}^{L-1} P_l = 1$, and $P_l = 1/L$ for the uniform distribution of the particle value. Then, one can consider the following two types of positions in $\widehat{w}(i,j)$:
• the only one real position $w(i,j)$, which absolutely occurs in $\widehat{w}(i,j)$;
• other fake positions, each of which occurs in each $\Lambda'_m(f_m(i,j))$ with probability $P_{f_m(i,j)}$, i.e., each of which occurs in all the $n$ sets, $\Lambda'_1(f_1(i,j)), \ldots, \Lambda'_n(f_n(i,j))$, with probability $\prod_{m=1}^{n} P_{f_m(i,j)}$.
Therefore, when the values of $f_1(i,j), \ldots, f_n(i,j)$ are fixed, the expected cardinality of $\widehat{w}(i,j)$ is $1 + (MN-1) \prod_{m=1}^{n} P_{f_m(i,j)}$. Because it is very difficult to estimate a general result when the values of $P_0, \ldots, P_{L-1}$ are unknown, we only discuss two special distributions to demonstrate how to estimate a lower bound of the number of the required plaintexts.
(1) Uniform distribution: In this case, $P_l = 1/L$, $\forall l \in \{0, \ldots, L-1\}$. One can qualitatively deduce that the average cardinality of $\widehat{w}(i,j)$ for any given position $(i,j)$ is $1 + (MN-1)/L^n$, which approaches 1 exponentially as $n$ increases. Generally speaking, when $1 + (MN-1)/L^n \le 1.5$ or $(MN-1)/L^n \le 0.5$, i.e., more than half of the elements in $\widetilde{W}$ are correct, the decryption performance will be acceptable. Solving this inequality, one has $n \ge \lceil \log_L(2(MN-1)) \rceil$. As an example, for $256 \times 256$ gray-scale images, $M = N = L = 256$, one has $n \ge \lceil \log_L(2(MN-1)) \rceil = \lceil 2.125 \rceil = 3$. The average cardinality is about 1.0039 when $n = 3$, so it is expected that the decryption performance for $n \ge 3$ will be rather good, which is verified by the experiments given in the next section.

(2) Uniform distribution except for one particle value: Typical examples of this kind of distribution are images with a large smooth background. Without loss of generality, assume $P_0 = p$ and $P_l = q = (1-p)/(L-1)$ for $l \in \{1, \ldots, L-1\}$. Then, if there are $k$ values of $f_1(i,j), \ldots, f_n(i,j)$ equal to 0, which occurs with a probability of $\binom{n}{k} p^k (1-p)^{n-k}$, the expected cardinality of $\widehat{w}(i,j)$ is $1 + (MN-1) p^k q^{n-k}$. As a result, one can get the average value of $\#(\widehat{w}(i,j)) - 1$ as follows:
\[
\begin{aligned}
\overline{\#(\widehat{w}(i,j)) - 1}
&= \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} \, (MN-1)\, p^k q^{n-k} \\
&= (MN-1) \sum_{k=0}^{n} \binom{n}{k} (p^2)^k \left( \frac{(1-p)^2}{L-1} \right)^{n-k} \\
&= (MN-1) \left( p^2 + \frac{(1-p)^2}{L-1} \right)^{n}.
\end{aligned}
\]
Let $(MN-1)\left(p^2 + (1-p)^2/(L-1)\right)^n \le 0.5$. Then, one can get $n \ge \lceil \log_{L(p)}(2(MN-1)) \rceil$, where
\[ L(p) = \frac{1}{p^2 + \dfrac{(1-p)^2}{L-1}}. \]
When $M = N = L = 256$, Fig. 1 shows how the value of $L(p)$ and the lower bound of $n$ change with respect to the value of $p$. It can be seen that the non-uniformity can cause an increase of the number of required plaintexts.

Fig. 1. The relationships between $L(p)$, $\lceil \log_{L(p)}(2(MN-1)) \rceil$ (the lower bound of $n$) and $p = 1/L, 2/L, \ldots, (L-1)/L$, when $M = N = L = 256$. [Two panels with logarithmic vertical axes: (a) $L(p)$ versus $p$; (b) $n$ versus $p$.]
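The two bounds above are easy to evaluate numerically. The following Python sketch computes $L(p)$, the lower bound $\lceil \log_{\text{base}}(2(MN-1)) \rceil$ and the average cardinality $1 + (MN-1)/\text{base}^n$; the function names are illustrative choices and not part of the original analysis.

```python
import math

def effective_levels(p, L):
    """L(p) = 1 / (p^2 + (1-p)^2 / (L-1)) for P_0 = p and uniform other values."""
    return 1.0 / (p * p + (1.0 - p) ** 2 / (L - 1))

def required_plaintexts(M, N, base):
    """Lower bound n >= ceil(log_base(2(MN-1))); base is L or L(p)."""
    return math.ceil(math.log(2 * (M * N - 1)) / math.log(base))

def average_cardinality(M, N, base, n):
    """Expected size of one element of the multi-valued matrix: 1 + (MN-1)/base^n."""
    return 1 + (M * N - 1) / base ** n

if __name__ == "__main__":
    # Uniform case for 256 x 256 gray-scale images (M = N = L = 256).
    print(required_plaintexts(256, 256, 256))
    print(average_cardinality(256, 256, 256, 3))
    # Non-uniform case: half of all particles share one value (p = 0.5).
    print(required_plaintexts(256, 256, effective_levels(0.5, 256)))
```

Setting $p = 1/L$ in `effective_levels` reproduces the uniform case $L(p) = L$, while larger values of $p$ shrink the effective number of levels and therefore raise the bound on $n$.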
Though the distribution of most multimedia data is not uniform, our experiments on permutation-only image ciphers have shown that the above quantitative results obtained from the uniform distribution are basically correct for natural images: about $\log_L(2(MN-1))$ plain-images are sufficient to get a good breaking performance, as will be shown in the next section. In fact, the actual decryption performance is even better than the theoretical expectation for the following two reasons:
• human eyes have a powerful capability of suppressing image noise and extracting significant features: 10% noisy pixels do not have much influence on the visual quality of a digital image, and only 50% of the pixels are needed to reveal most visual information of the original image;
• due to the short-distance and long-distance relationships in natural images, two pixel values are close to each other with a non-negligible probability larger than the average probability; as a result, many wrongly decrypted pixels are close to their true values with a probability larger than the average probability.
The above two points imply that the decryption performance of natural images will be better than that of noise-like images. For experimental verification and more explanations, see Section 4, Figs. 4 and 5. This perceptual phenomenon can also be generalized to audio and speech data.
Finally, consider the time complexity of the above-discussed known-plaintext attack, i.e., the time complexity of the Get_Permutation_Matrix function. Note that the time complexity depends on implementation details of this function. This paper only gives a conservative estimation, i.e., an upper bound. The time complexity of each step is as follows:
• Step 1: The $L$ sets of each ciphertext $f'_m$ are obtained by scanning $f'_m$ once: for $0 \le i \le M-1$ and $0 \le j \le N-1$, add $(i,j)$ into the set $\Lambda'_m(f'_m(i,j))$. Thus, the time complexity of this step is $O(nMN)$.
• Step 2: The average cardinality of $\Lambda'_m(l)$ is $P_l MN$ and an upper bound of the time complexity of this step is
\[ (MN)^n \sum_{(i,j)} \left( \prod_{m=1}^{n} P_{f_m(i,j)} \right). \]
When the plaintext has a uniform distribution,
\[ \sum_{(i,j)} \left( \prod_{m=1}^{n} P_{f_m(i,j)} \right) = \frac{MN}{L^n} \]
and the upper bound becomes $MN \cdot (MN/L)^n$, which exponentially increases as $n$ increases if $MN > L$. However, in practice, the real complexity is much smaller due to the optimization of the calculation process. Here, we consider the so-called halving algorithm, which calculates the intersection of $n$ sets $A_1, \ldots, A_n$ by dividing them into multi-level groups of $(2, 4, \ldots, 2^i, \ldots)$ sets. For example, when $n = 11$, the calculation process is described by
\[ \bigl((A_1 \overset{1}{\cap} A_2) \overset{3}{\cap} (A_3 \overset{2}{\cap} A_4)\bigr) \overset{7}{\cap} \bigl((A_5 \overset{4}{\cap} A_6) \overset{6}{\cap} (A_7 \overset{5}{\cap} A_8)\bigr) \overset{10}{\cap} \bigl((A_9 \overset{8}{\cap} A_{10}) \overset{9}{\cap} A_{11}\bigr), \]
where $\overset{i}{\cap}$ denotes the $i$th intersection operation. The goal of this halving algorithm is to minimize the cardinalities of the two sets involved in each intersection operation so as to reduce the global complexity. To make the estimation of the complexity easier, let us consider the case of $n = 2^d$, where $d$ is an integer. In this case, the overall complexity can be calculated as follows for the uniform distribution:
\[
\begin{aligned}
\sum_{k=0}^{d-1} 2^k \cdot \left( \frac{MN}{L^{d-k}} \right)^2
&= \left( \frac{MN}{L^d} \right)^2 \cdot \sum_{k=0}^{d-1} (2L^2)^k
 = \left( \frac{MN}{L^d} \right)^2 \cdot \frac{1 - (2L^2)^d}{1 - 2L^2} \\
&= 2^d \cdot (MN)^2 \cdot \frac{(2L^2)^{-d} - 1}{1 - 2L^2}
 < \frac{n \cdot (MN)^2}{2L^2 - 1}.
\end{aligned}
\tag{2}
\]
As two typical examples, when $M = N = 256$ and $L = 2$, the complexity is about $2^{29.2} \cdot n$; when $M = N = 256$ and $L = 256$, the complexity is only $2^{15} \cdot n$. One can see that in both cases the complexity is always much smaller than $MN \cdot (MN/L)^n$. When $n$ is not a power of 2, the complexity will be smaller than $\bigl(2^{\lceil \log_2 n \rceil}/(2L^2-1)\bigr) \cdot (MN)^2 \le \bigl(2n/(2L^2-1)\bigr) \cdot (MN)^2$.
When the distribution is not uniform, the reduction of each intersection is not easy to calculate. To simplify the deduction, we use the average size for all sets and assume that the reduction is proportional to a factor $1/L^*$ (as an analogue of $1/L$). The value of $1/L^*$ can be calculated as a weighted sum of all the possible sizes divided by $MN$:
\[ \frac{1}{L^*} = \sum_{l=0}^{L-1} P_l \cdot (P_l MN)/MN = \sum_{l=0}^{L-1} P_l^2. \]
Then, by replacing $L$ in Eq. (2) with
\[ L^* = \frac{1}{\sum_{l=0}^{L-1} P_l^2}, \]
the overall complexity becomes $n(MN)^2/(2(L^*)^2 - 1)$. Taking the special non-uniform distribution studied before, $P_0 = p$ and $P_l = (1-p)/(L-1)$ for $l \in \{1, \ldots, L-1\}$, the value of $L^*$ can be obtained as
\[ L(p) = \frac{1}{p^2 + \dfrac{(1-p)^2}{L-1}}, \]
whose relation with $p$ has been shown in Fig. 1(a). As can be seen from the formula and the figure, $L(p)$ decreases toward 1 as $p$ approaches 1, so the complexity goes to $n(MN)^2$ as $p$ approaches 1. Fortunately, this does not change the level of the complexity.
• Step 3: The time complexity of this step is determined by the details of the involved optimization algorithm. For the "taking-the-first" algorithm, the complexity is $MN \cdot (1 + (MN-1)/(L^*)^n) \approx MN + (MN)^2/(L^*)^n$.
• Step 4: The time complexity of this step is $O(MN)$.
Combining the above discussions, the final time complexity of the function Get_Permutation_Matrix is always of order $n \cdot (MN)^2$, which is practically small even for a PC.
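The closed form in Eq. (2) and the two numerical examples can be checked with a few lines of Python. The sketch below uses exact rational arithmetic so that the strict inequality is not blurred by rounding; the function names are illustrative only.

```python
import math
from fractions import Fraction

def halving_cost(M, N, L, d):
    """Left-hand side of Eq. (2) for n = 2**d sets under the uniform assumption."""
    MN = M * N
    return sum(Fraction(2 ** k) * Fraction(MN, L ** (d - k)) ** 2 for k in range(d))

def halving_bound(M, N, L, n):
    """Right-hand side of Eq. (2): n * (MN)^2 / (2 L^2 - 1)."""
    return Fraction(n * (M * N) ** 2, 2 * L ** 2 - 1)

if __name__ == "__main__":
    for L in (2, 256):
        # Per-plaintext cost as a power of two, for M = N = 256.
        print(L, math.log2(float(halving_bound(256, 256, L, 1))))
```

For $L = 2$ the exponent comes out near 29.2 and for $L = 256$ near 15, matching the figures quoted in Step 2.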
From the above analysis, one can see that the time complexity is mainly determined by Step 2. When the "taking-the-first" algorithm is adopted in the function Get_Permutation_Matrix, Step 2 can be skipped so that the total complexity will still be of order $O(n \cdot (MN)^2)$, even without using the halving algorithm to calculate the intersections. In this case, Step 3 can be described as follows:
• Step 3': For $0 \le i \le M-1$ and $0 \le j \le N-1$, do the following operations:
  Step 3'a: Find the first element satisfying $f_1(i,j) = f'_1(i',j'), \ldots, f_n(i,j) = f'_n(i',j')$ by searching each element in $\Lambda'_1(f_1(i,j))$ and checking whether it occurs in $\Lambda'_2(f_2(i,j)), \ldots, \Lambda'_n(f_n(i,j))$;
  Step 3'b: Set $\widetilde{w}(i,j) = (i',j')$ and then delete $(i',j')$ from $\Lambda'_1(f_1(i,j)), \ldots, \Lambda'_n(f_n(i,j))$.
It is obvious that the time complexity of Step 3'a is always less than $n \cdot (MN)$ and on average is $O(n \cdot MN/L^*)$, so the time complexity of Step 3' is always less than $n \cdot (MN)^2$ and on average is $O(n \cdot (MN)^2/L^*)$.
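Step 3' can be sketched directly, without building the multi-valued matrix first. In the Python illustration below, membership of a candidate $(i',j')$ in $\Lambda'_m(f_m(i,j))$ is tested by comparing $f'_m(i',j')$ with $f_m(i,j)$, and the deletion in Step 3'b is emulated with a set of already assigned positions; both are implementation assumptions, not details given in the text.

```python
from collections import defaultdict

def take_first_attack(plains, ciphers, M, N):
    """Sketch of Step 3': estimate the permutation by direct search."""
    n = len(plains)
    # Only the sets of the first ciphertext are indexed explicitly.
    first_sets = defaultdict(list)
    for i in range(M):
        for j in range(N):
            first_sets[ciphers[0][i][j]].append((i, j))
    used = set()      # positions already assigned (stands in for deletion)
    W_tilde = {}
    for i in range(M):
        for j in range(N):
            # Step 3'a: search candidates sharing the value of f_1(i, j).
            for (a, b) in first_sets[plains[0][i][j]]:
                if (a, b) in used:
                    continue
                if all(ciphers[m][a][b] == plains[m][i][j] for m in range(1, n)):
                    # Step 3'b: fix the estimate and retire the position.
                    W_tilde[(i, j)] = (a, b)
                    used.add((a, b))
                    break
    return W_tilde
```

Each position scans at most the candidates that share its first value and performs up to $n-1$ comparisons per candidate, which is where the average cost of order $n \cdot MN/L^*$ per position comes from.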
3.2. Chosen-plaintext attack
The chosen-plaintext attack works in the same way as the known-plaintext attack, but the plaintexts can be deliberately chosen to optimize the estimation of $\widetilde{W}$ (i.e., to maximize the decryption performance). The following two rules are useful in the creation of the $n$ chosen plaintexts $f_1, \ldots, f_n$:
• the histogram of each chosen plaintext should be as uniform as possible;
• the $i$-dimensional $(2 \le i \le n)$ histogram of any $i$ chosen plaintexts should be as uniform as possible, which is a generalization of the above rule.
The goal of the above two rules is to minimize the average cardinality of the elements in $\widehat{W}$, and then to maximize the number of correct elements in the estimated permutation matrix $\widetilde{W}$.
As an example of the two rules, consider the condition when $M = N = L = 256$. In this case, the following two chosen plaintexts are enough to ensure a perfect estimation of the permutation matrix $W$: $f_1 = [f_1(i,j) = i]_{256 \times 256}$ and $f_2 = [f_2(i,j) = j]_{256 \times 256}$, i.e.,
\[ f_1 = f_2^T = \begin{bmatrix} 0 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ i & \cdots & i \\ \vdots & \ddots & \vdots \\ 255 & \cdots & 255 \end{bmatrix}_{256 \times 256} \tag{3} \]
and
\[ f_2 = f_1^T = \begin{bmatrix} 0 & \cdots & j & \cdots & 255 \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & \cdots & j & \cdots & 255 \end{bmatrix}_{256 \times 256}. \tag{4} \]
For the two chosen plaintexts, $(f_1(i_1,j_1), f_2(i_1,j_1)) \neq (f_1(i_2,j_2), f_2(i_2,j_2))$, $\forall (i_1,j_1) \neq (i_2,j_2)$. This ensures that $\#(\Lambda'_1(l_1) \cap \Lambda'_2(l_2)) = 1$, $\forall l_1, l_2 \in \{0, \ldots, L-1\}$.
In general cases, it can be easily deduced that $n = \lceil \log_L(MN) \rceil$ orthogonal plaintexts have to be created to carry out a successful chosen-plaintext attack. Apparently, it will never be larger than $\lceil \log_L(2(MN-1)) \rceil$, the number of required plaintexts in the known-plaintext attack with a good breaking performance. This means the chosen-plaintext attack is a little (but not so much) stronger than the known-plaintext attack.
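One way to build such orthogonal plaintexts in the general case is to write the row-major index of each position in base $L$ and use one digit per plaintext. This digit construction is an assumption made here for illustration (the text only states that $\lceil \log_L(MN) \rceil$ orthogonal plaintexts are needed); for $M = N = L = 256$ it reproduces the two images of Eqs. (3) and (4), in reverse order.

```python
def chosen_plaintexts(M, N, L):
    """Build n = ceil(log_L(MN)) plaintexts whose value tuples are all distinct."""
    MN = M * N
    n = 1
    while L ** n < MN:   # integer computation of ceil(log_L(MN)) for MN >= 2
        n += 1
    plains = []
    for k in range(n):
        # k-th base-L digit of the row-major index i*N + j
        plains.append([[((i * N + j) // L ** k) % L for j in range(N)]
                       for i in range(M)])
    return plains

def is_orthogonal(plains, M, N):
    """True if no two positions share the same tuple of values."""
    tuples = {tuple(p[i][j] for p in plains) for i in range(M) for j in range(N)}
    return len(tuples) == M * N
```

Since every position carries a unique tuple of values, each intersection in Step 2 of the attack contains exactly one position and the permutation matrix is recovered without any guessing.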
Fig. 2. The six 256 × 256 test images used in the experiments.
Fig. 3. The cipher-images of the six test images.
4. Experiments
To verify the decryption performance of the above-discussed known-plaintext attack (see footnote 3), some experiments have been performed on a typical permutation-only image cipher called CIE [48], in which the secret permutations are pseudo-randomly generated by iterations of a chaotic map. Fig. 2 shows six $256 \times 256$ test images used in the experiments, all of which are in 256 gray scales. In the experiments, the "taking-the-first" algorithm was used to generate $\widetilde{W}$ from $\widehat{W}$ in the Get_Permutation_Matrix function. It turned out that such a simple algorithm was enough to achieve a considerable performance in real attacks.
The cipher-images of the six test images are shown in Fig. 3. When the first $n$ $(= 1, \ldots, 5)$ test image(s) and the corresponding cipher-image(s) are known to the attacker, the breaking results of Cipher-Image #6 are demonstrated in Fig. 4. It can be seen that one known plain-image is not enough to reveal any visual information about the 6th test image, but two are capable of recovering a rough view, and three or more are quite enough to achieve a very good performance.
To verify the fact that the breaking performance is better than the theoretical prediction based on the correctly recovered elements in $\tilde{W}$, let us take the decryption performance with $n = 2$ as an example. In this case, the number of absolutely correct elements in $\tilde{W}$ is only 10,600, and the number of all correct elements in $\tilde{W}$ is 26,631. In comparison, the number of correctly recovered pixels is 27,210. Although only about $27210/65536 \approx 41.52\%$ of the pixels are recovered, most visual information in plain-image #6 has been revealed successfully. Now, let us consider the correct pixels that are not recovered from the correct elements in $\tilde{W}$, i.e., the $27210 - 26631 = 579$ additional correct pixels. These pixels are correctly decrypted with a frequency of $579/(65536 - 26631) \approx 0.0149$, which is larger than the average probability $L^{-1} \approx 0.0039$. If we also count those pixels whose values are close to the right ones, this frequency will be even larger. In fact, excluding the pixels correctly determined by the 26,631 correct elements in $\tilde{W}$, the histogram of the other $65536 - 26631 = 38905$ pixels of the difference image between the recovered image and the
Fig. 4. The decrypted images of Cipher-Image #6 when the first n test images are known to the attacker.
³ The chosen-plaintext attack is omitted in this section, since one can completely recover the permutation matrix by choosing the two plaintexts $f_1$ and $f_2$ shown in Eqs. (3) and (4).
Fig. 5. The histogram of the difference image between the recovered image and the original plain-image, when the plain-image is Image #6 (the blue line) or a randomly generated noise image (the red line). [Plot: horizontal axis from −250 to 250, vertical axis from 0 to 600; curves labeled "Image #6" and "A noise image".]
original plain-image #6 is a Laplacian-like function, as shown in Fig. 5. For comparison, the histogram of the difference image corresponding to a randomly generated noise image of the same size $256 \times 256$ is also shown. It is clear that the Laplacian-like histogram corresponding to Image #6 is caused by the correlation information existing in natural images. Note that the triangular histogram of the noise image can be easily deduced under the assumption that the two involved images (i.e., the noise image and the corresponding cipher-image) are independent of each other and have a uniform histogram: for all $i = -255, \ldots, 255$, the occurrence probability of the difference value $i$ in the histogram is $(256 - |i|)/65536 = 1/256 - |i|/65536$.
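The triangular distribution stated above can be checked by exact enumeration. The following is a small sketch, assuming two independent pixel values that are each uniform on {0, ..., 255}; the function name `difference_distribution` is introduced here for illustration only.

```python
from collections import Counter
from fractions import Fraction

def difference_distribution(levels):
    """Exact distribution of a - b for independent a, b uniform on {0..levels-1}."""
    counts = Counter(a - b for a in range(levels) for b in range(levels))
    total = levels * levels
    return {d: Fraction(c, total) for d, c in counts.items()}

dist = difference_distribution(256)

# Compare every difference value with the closed form (256 - |i|) / 65536.
matches = all(dist[i] == Fraction(256 - abs(i), 65536) for i in range(-255, 256))
print(matches, dist[0], dist[255])
```

Using `Fraction` keeps the comparison exact, so no floating-point tolerance is needed.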
5. Conclusions
Based on a general model of permutation-only multimedia ciphers, this paper analyzes the security of this type of cipher against plaintext attacks from a general perspective. When the plaintext is of size $M \times N$ and distributed uniformly over $L$ possible values, it is found that only $O(\log_L(MN))$ plaintexts are enough to achieve a good breaking performance. It has also been found that the attack complexity is practically small: only $O(n \cdot (MN)^2)$, where $n$ denotes the number of known/chosen plaintexts. Experiments on a permutation-only image cipher have been presented to demonstrate the performance of the proposed known-plaintext attack. From the results of this paper, we draw the following conclusions for permutation-only ciphers: (1) no better secret permutations can be achieved to offer a higher security level against plaintext attacks (compared with the general model discussed in this paper); (2) the secret permutations should be updated at intervals shorter than $\log_L(MN)$ plaintexts to offer an acceptable level of security against plaintext attacks, or they have to be combined with other encryption techniques to achieve this goal.
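To make conclusion (2) concrete, the two bounds from the analysis, $\lceil \log_L(MN) \rceil$ for chosen plaintexts and $\lceil \log_L(2(MN-1)) \rceil$ for known plaintexts, can be evaluated for a given image format. The sketch below is only an illustration; the names `ceil_log` and `plaintexts_needed` are hypothetical, and integer arithmetic is used so that exact powers such as 256² are not affected by floating-point rounding.

```python
def ceil_log(base, x):
    """Smallest integer n >= 0 with base**n >= x, computed exactly."""
    n, power = 0, 1
    while power < x:
        power *= base
        n += 1
    return n

def plaintexts_needed(M, N, L):
    """Return (chosen, known): the two plaintext counts for an M x N, L-level image."""
    chosen = ceil_log(L, M * N)
    known = ceil_log(L, 2 * (M * N - 1))
    return chosen, known

gray = plaintexts_needed(256, 256, 256)   # 256-level gray-scale image
binary = plaintexts_needed(256, 256, 2)   # binary image of the same size
print(gray, binary)
```

For the 256 × 256 gray-scale setting of the experiments this gives 2 and 3 plaintexts, so a permutation reused for even a handful of images already falls within reach of the attack.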
Acknowledgments
Shujun Li was supported by the Alexander von Humboldt Foundation and by The Hong Kong Polytechnic University’s Postdoctoral Fellowship Scheme (under Grant no. G-YX63). The work of Nikolaos G. Bourbakis was supported by the AIIS Inc., NY, USA. The work of K.-T. Lo was supported by the Research Grants Council of the Hong Kong SAR Government under Project No. 523206 (PolyU 5232/06E).
References
[1] C. Alexopoulos, N.G. Bourbakis, N. Ioannou, Image
encryption method using a class of fractals, J. Electron.
Imaging 4 (3) (1995) 251–259.
[2] H.J. Beker, F.C. Piper, Secure Speech Communications,
Academic, New York, 1985.
[3] M. Bertilsson, E.F. Brickell, I. Ingemarsson, Cryptanalysis of
video encryption based on space-filling curves, in: Advances
in Cryptology – EuroCrypt’88, Lecture Notes in Computer
Science, vol. 434, Springer, Berlin, 1989, pp. 403–411.
[4] N.G. Bourbakis, C. Alexopoulos, Picture data encryption
using SCAN patterns, Pattern Recognition 25 (6) (1992)
567–581.
[5] N.G. Bourbakis, A. Dollas, SCAN-based compression-
encryption-hiding for video on demand, IEEE Multimedia
10 (3) (2003) 79–87.
[6] L. Brown, Comparing the security of pay-TV systems for use
in Australia, Austral. Telecomm. Res. 24 (2) (1990) 1–8.
[7] C.-C. Chang, T.-X. Yu, Cryptanalysis of an encryption
scheme for binary images, Pattern Recognition Lett. 23 (14)
(2002) 1847–1852.
[8] H.K.-C. Chang, J.-L. Liu, A linear quadtree compression
scheme for image encryption, Signal Process.: Image
Commun. 10 (4) (1997) 279–290.
[9] H.-C. Chen, J.-I. Guo, L.-C. Huang, J.-C. Yen, Design and
realization of a new signal security system for multimedia
data transmission, EURASIP J. Appl. Signal Process. 2003
(13) (2003) 1291–1305.
[10] H. Cheng, X. Li, Partial encryption of compressed images
and videos, IEEE Trans. Signal Process. 48 (8) (2000)
2439–2451.
[11] H.C.H. Cheng, Partial encryption for image and video
communication, Master Thesis, Department of Computing
Science, University of Alberta, Edmonton, Alberta, Canada,
1998.
[12] K.-L. Chung, L.-C. Chang, Large encryption binary images
with higher security, Pattern Recognition Lett. 19 (5–6)
(1998) 461–468.
[13] J.H. Dolske, Secure MPEG video: techniques and pitfalls,
June 1997, available online at <http://www.dolske.net/old/gradwork/cis788r08/>.
[14] J. Fridrich, Symmetric ciphers based on two-dimensional
chaotic maps, Internat. J. Bifurcation Chaos 8 (6) (1998)
1259–1284.
[15] B. Furht, E. Muharemagic, D. Socek, Multimedia Encryption
and Watermarking, Springer, Berlin, 2005.
[16] B. Furht, D. Socek, A.M. Eskicioglu, Fundamentals of
multimedia encryption techniques, in: B. Furht, D. Kirovski
(Eds.), Multimedia Security Handbook, CRC Press, LLC,
Boca Raton, FL, 2004, pp. 93–131 (Chapter 3).
[17] B. Goldburg, S. Sridharan, E. Dawson, Design and
cryptanalysis of transform-based analog speech scramblers,
IEEE J. Selected Areas Commun. 11 (5) (1993) 735–744.
[18] J.-K. Jan, Y.-M. Tseng, On the security of image encryption
method, Inform. Process. Lett. 60 (5) (1996) 261–265.
[19] A. Kudelski, Method for scrambling and unscrambling a
video signal, U.S. Patent 5375168, 1994.
[20] M. Kuhn, AntiSky - an image processing attack on VideoCrypt,
1994, online document, available at
<http://www.cl.cam.ac.uk/~mgk25/tv-crypt/image-processing/antisky.html>.
[21] M. Kuhn, Analysis for the Nagravision video scrambling
method, 1998, online document, available at
<http://www.cl.cam.ac.uk/~mgk25/nagra.pdf>.
[22] I.J. Kumar, Cryptology of speech signal, in: Cryptology:
System Identification and Key-Clustering, Aegean Park
Press, 1997, pp. 309–380 (Chapter 6).
[23] S. Li, G. Chen, X. Zheng, Chaos-based encryption for
digital images and videos, in: B. Furht, D. Kirovski (Eds.),
Multimedia Security Handbook, CRC Press, LLC, Boca
Raton, FL, 2004, pp. 133–167 (Chapter 4), preprint
available at <http://www.hooklee.com/pub.html>.
[24] S. Lian, J. Sun, Z. Wang, Perceptual cryptography on
SPIHT compressed images or videos, in: Proceedings of the
IEEE International Conference on Multimedia & Expo
(ICME’2004), 2004.
[25] S. Lian, J. Sun, Z. Wang, Perceptual cryptography on
JPEG2000 compressed images or videos, in: Proceedings of
the International Conference on Computer and Information
Technology (CIT’2004), IEEE Computer Society, Silver
Spring, MD, 2004, pp. 78–83.
[26] S. Lian, X. Wang, J. Sun, Z. Wang, Perceptual cryptography
on wavelet-transform encoded videos, in: Proceedings of the
IEEE International Symposium on Intelligent Multimedia,
Video and Speech Processing (ISIMP’2004), 2004, pp. 57–60.
[27] S.S. Maniccam, N.G. Bourbakis, Image and video encryption
using SCAN patterns, Pattern Recognition 37 (4) (2004)
725–737.
[28] Y. Mao, G. Chen, S. Lian, A novel fast image encryption
scheme based on 3D chaotic Baker maps, Int. J. Bifurcation
Chaos 14 (10) (2004) 3613–3624.
[29] Y. Mao, M. Wu, A joint signal processing and cryptographic
approach to multimedia encryption, IEEE Trans. Image
Process. 15 (7) (2006) 2061–2075.
[30] Y. Matias, A. Shamir, A video scrambling technique based
on space filling curves (extended abstract), in: Advances in
Cryptology – Crypto’87, Lecture Notes in Computer
Science, vol. 293, Springer, Berlin, 1987, pp. 398–417.
[31] A. Menezes, P. van Oorschot, S. Vanstone, Handbook of
Applied Cryptography, CRC Press, Inc., Boca Raton, FL,
1996.
[32] R.K. Nichols, P.C. Lekkas, Speech cryptology, in: Wireless
Security: Models, Threats, and Solutions, McGraw-Hill,
New York, 2002, pp. 253–327 (Chapter 6).
[33] D. Qi, J. Zou, X. Han, A new class of scrambling
transformation and its application in the image information
covering, Sci. China—Ser. E (English Edition) 43 (3) (2000)
304–312.
[34] L. Qiao, Multimedia security and copyright protection,
Ph.D. Thesis, Department of Computer Science, University
of Illinois at Urbana-Champaign, Urbana, IL, USA,
1998.
[35] L. Qiao, K. Nahrstedt, Is MPEG encryption by using
random list instead of ZigZag order secure?, in: Proceedings
of the IEEE International Symposium on Consumer
Electronics (ISCE’97), 1997, pp. 226–229.
[36] L. Qiao, K. Nahrstedt, Comparison of MPEG encryption
algorithms, Comput. Graphics 22 (4) (1998) 437–448.
[37] J. Scharinger, Fast encryption of image data using chaotic
Kolmogorov flows, J. Electron. Imaging 7 (2) (1998)
318–325.
[38] B. Schneier, Applied Cryptography—Protocols, Algorithms,
and Source Code in C, second ed., Wiley, New York, 1996.
[39] S.U. Shin, K.S. Sim, K.H. Rhee, A secrecy scheme for
MPEG video data using the joint of compression
and encryption, in: Information Security: Second International
Workshop (ISW’99) Proceedings, Lecture Notes
in Computer Science, vol. 1729, Springer, Berlin, 1999,
pp. 191–201.
[40] S. Sridharan, E. Dawson, B. Goldburg, Speech encryption in
the transform domain, Electron. Lett. 26 (10) (1990)
655–657.
[41] S. Sridharan, E. Dawson, B. Goldburg, Fast Fourier
transform based speech encryption system, IEE Proc. I—
Commun. Speech Vision 138 (3) (1991) 215–223.
[42] L. Tang, Methods for encrypting and decrypting MPEG
video data efficiently, in: Proceedings of the Fourth
ACM International Conference on Multimedia, 1996,
pp. 219–229.
[43] T. Uehara, R. Safavi-Naini, Chosen DCT coefficients attack
on MPEG encryption schemes, in: Proceedings of the IEEE
Pacific-Rim Conference on Multimedia (IEEE-PCM’2000),
2000, pp. 316–319.
[44] T. Uehara, R. Safavi-Naini, P. Ogunbona, Securing wavelet
compression with random permutations, in: Proceedings of
the IEEE Pacific-Rim Conference on Multimedia (IEEE
PCM’2000), 2000, pp. 332–335.
[45] A. Uhl, A. Pommer, Image and Video Encryption: From
Digital Rights Management to Secured Personal Communication,
Springer, Berlin, 2005.
[46] D. Wagner, G.G. Rose, T. Ritter, T. Jakobsen, N. Ferguson,
D.R. Stinson, Transposition ciphers, 2001, online discussions
in news group sci.crypt.research at google.com, available online at
<http://groups-beta.google.com/group/sci.crypt.research/browse_thread/thread/3cd88407a3485cb1/58ff17304187ce74#58ff17304187ce74>.
[47] Wikipedia, Television encryption, 2007, online document.
URL <http://en.wikipedia.org/wiki/Television_encryption>.
[48] J.-C. Yen, J.-I. Guo, A new chaotic image encryption
algorithm, in: Proceedings of (Taiwan) National Symposium
on Telecommunications, 1998, pp. 358–362.
[49] J.-C. Yen, J.-I. Guo, Efficient hierarchical chaotic
image encryption algorithm and its VLSI realisation,
IEE Proc.—Vision Image Signal Process. 147 (2) (2000)
167–175.
[50] W. Zeng, S. Lei, Efficient frequency domain selective
scrambling of digital video, IEEE Trans. Multimedia 5 (1)
(2003) 118–129.
[51] W. Zeng, H. Yu, C.-Y. Lin (Eds.), Multimedia Security
Technologies for Digital Rights Management, Academic
Press, New York, 2006.
[52] X.-Y. Zhao, G. Chen, Ergodic matrix in image encryption,
in: Proceedings of the Second International Conference on
Image and Graphics, Proceedings of SPIE, vol. 4875, 2002,
pp. 394–401.
[53] X.-Y. Zhao, G. Chen, D. Zhang, X.-H. Wang, G.-C. Dong,
Decryption of pure-position permutation algorithms,
J. Zhejiang Univ. Sci. 5 (7) (2004) 803–809.
[54] R. Zunino, Fractal circuit layout for spatial decorrelation of
images, Electron. Lett. 34 (20) (1998) 1929–1930.