

The Journal of Systems and Software 81 (2008) 1130–1143

Cryptanalysis of the RCES/RSES image encryption scheme

Shujun Li a,*, Chengqing Li b, Guanrong Chen b, Kwok-Tung Lo c

a FernUniversität in Hagen, Lehrgebiet Informationstechnik, Universitätsstraße 27, 58084 Hagen, Germany

b Department of Electronic Engineering, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon Tong, Hong Kong SAR, China

c Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong SAR, China

Received 13 April 2007; received in revised form 9 July 2007; accepted 26 July 2007

Available online 14 September 2007

Abstract

Recently, a chaos-based image encryption scheme called RCES (also called RSES) was proposed. This paper analyses the security of RCES, and points out that it is insecure against known/chosen-plaintext attacks: only one or two known/chosen plain-images are required for a successful attack. In addition, the security of RCES against the brute-force attack was overestimated. Both theoretical and experimental analyses are given to show the performance of the suggested known/chosen-plaintext attacks. The insecurity of RCES is due to its special design, which makes it a typical example of insecure image encryption schemes. A number of lessons are drawn from the reported cryptanalysis of RCES, consequently suggesting some common principles for ensuring a high level of security of an image encryption scheme.

© 2007 Elsevier Inc. All rights reserved.

Keywords: Image encryption; Chaotic cryptography; RCES/RSES; Cryptanalysis; Known-plaintext attack; Chosen-plaintext attack; CKBA

1. Introduction

In the digital world today, the security of digital images becomes more and more important, since the communications of digital products over networks occur more and more frequently. Furthermore, special and reliable security in storage and transmission of digital images is needed in many applications, such as pay-TV, medical imaging systems, military image database and communications as well as confidential video conferencing, etc. In recent years, some consumer electronic devices, especially mobile phones and hand-held devices, have also started to provide the function of saving and exchanging digital images via the support of multimedia messaging services over wireless networks.

To meet the challenges arising from different applications, good encryption of digital images is necessary. The simplest way to encrypt an image is to consider the 2D image stream as a 1D data stream, and then encrypt this 1D stream with any available cipher (Dang and Chau, 2000). Although such a simple way is sufficient to protect digital images in some civil applications, encryption schemes considering special features of digital images, such as the bulky size and the large redundancy in uncompressed images, are still needed to provide better overall performance and make the adoption of the encryption scheme easier in the whole image processing system.

0164-1212/$ - see front matter © 2007 Elsevier Inc. All rights reserved.

doi:10.1016/j.jss.2007.07.037

* Corresponding author. URL: http://www.hooklee.com (S. Li).

Since the 1990s, many specific algorithms have been proposed, aiming to provide better solutions to image encryption (Bourbakis and Alexopoulos, 1992; Alexopoulos et al., 1995; Chung and Chang, 1998; Cheng and Li, 2000; Chang et al., 2001; Pommer, 2003; Maniccam and Bourbakis, 2004; Scharinger, 1998; Fridrich, 1998; Mao et al., 2004; Yano and Tanaka, 2002; Bhargava et al., 2004; Wu and Kuo, 2005; Mao and Wu, 2006; Yen and Guo, 1999, 2000a,b, 2003; Chen et al., 2002, 2003; Chen and Yen, 2003). At the same time, cryptanalytic work on proposed image encryption schemes has also been developed, and some existing schemes have been found to be insecure (Jan and Tseng, 1996; Qiao, 1998; Cheng, 1998; Chang and Yu, 2002; Li and Zheng, 2002a,b; Li et al., 2005, 2006, 2007; Canniere et al., 2005). Due to the tight relationship between chaos and cryptography (Li, 2003, Chapter 2), chaotic systems have been widely used in image encryption to realize diffusion and confusion in a good cipher (Scharinger, 1998; Fridrich, 1998; Yano and Tanaka, 2002; Mao et al., 2004; Yen and Guo, 1999, 2000a,b; Chen et al., 2002; Chen and Yen, 2003). For a more comprehensive survey of the state of the art in image encryption schemes, see Uhl and Pommer (2005), Furht et al. (2004) and Li et al. (2004).

The present paper focuses on a new chaos-based image encryption scheme proposed by Chen et al. (2002) and Chen and Yen (2003), which was originally called RSES (random seed encryption system) in Chen et al. (2002) and then renamed RCES (random control encryption system) in Chen and Yen (2003). RCES can be considered as an enhanced version of a previously-proposed image encryption scheme called CKBA (chaotic key-based algorithm) (Yen and Guo, 2000b), which has been cryptanalyzed by Li and Zheng (2002b). The present paper evaluates the security of RCES, and points out that RCES is as weak as CKBA, though it seems more complicated than CKBA. In a known/chosen-plaintext attack, only one or two known/chosen plain-images are enough to break this image encryption scheme. In addition, we also show that the security of RCES against brute-force attack was much overestimated in Chen et al. (2002) and Chen and Yen (2003).

Due to the special design of RCES, some of its essential security defects are very useful for revealing several general principles of designing secure image encryption schemes. This adds to the value of the cryptanalysis presented below, even though RCES is not a very sophisticated cipher from the cryptographic point of view.

This paper is organized as follows. Section 2 briefly introduces RCES and its parent version CKBA. A detailed cryptanalysis of RCES is presented in Section 3, where some experimental results are given to support the theoretical analysis. Section 4 discusses some design principles drawn from the essential security defects of RCES. The last section concludes the paper.

2. Introduction to RCES

2.1. CKBA (Yen and Guo, 2000b) – The Parent Version of RCES

Assume that the size of the plain-image for encryption is \(M \times N\).¹ CKBA can be described as follows.

2.1.1. The secret key

The secret key includes two bytes key1, key2, and the initial condition \(x(0) \in (0,1)\) of the following chaotic Logistic map:

\[ x(n+1) = \mu \cdot x(n) \cdot (1 - x(n)), \tag{1} \]

which is a well-studied chaotic system in chaos theory and behaves chaotically when \(\mu > 3.5699\ldots\) (Devaney, 1989).

1 In this paper, \(M \times N\) is in the form "width × height".

2.1.2. Initialization

Run the chaotic system to generate a chaotic sequence, \(\{x(i)\}_{i=0}^{\lceil MN/8 \rceil - 1}\), where \(\lceil a \rceil\) denotes the smallest integer that is not less than \(a\). From the 16-bit binary representation of \(x(i) = 0.b(16i+0)\,b(16i+1) \cdots b(16i+15)\), derive a pseudo-random binary sequence (PRBS), \(\{b(i)\}_{i=0}^{2MN-1}\).

2.1.3. Encryption

For the plain-pixel \(f(x,y)\) (\(0 \le x \le M-1\), \(0 \le y \le N-1\)), the corresponding cipher-pixel \(f'(x,y)\) is determined by the following rule:

\[
f'(x,y) =
\begin{cases}
f(x,y) \oplus \mathit{key1}, & B(x,y) = 3,\\
f(x,y) \odot \mathit{key1}, & B(x,y) = 2,\\
f(x,y) \oplus \mathit{key2}, & B(x,y) = 1,\\
f(x,y) \odot \mathit{key2}, & B(x,y) = 0,
\end{cases}
\tag{2}
\]

where \(B(x,y) = 2 \times b(x \times N + y) + b(x \times N + y + 1)\), and \(\oplus\) and \(\odot\) denote XOR and XNOR operations, respectively. Since \(a \odot b = \overline{a \oplus b} = a \oplus \bar{b}\), the above equation is equivalent to

\[
f'(x,y) =
\begin{cases}
f(x,y) \oplus \mathit{key1}, & B(x,y) = 3,\\
f(x,y) \oplus \overline{\mathit{key1}}, & B(x,y) = 2,\\
f(x,y) \oplus \mathit{key2}, & B(x,y) = 1,\\
f(x,y) \oplus \overline{\mathit{key2}}, & B(x,y) = 0.
\end{cases}
\tag{3}
\]

2.1.4. Decryption

The decryption procedure is the same as the encryption procedure, since \(\oplus\) is an involutive operation.²
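The CKBA steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the reference implementation: the function names `logistic_bits` and `ckba` are hypothetical, the image is treated as a flat list of 8-bit pixels indexed by a single position, floating-point arithmetic is used for the Logistic map, and taking the leading 16 binary digits by truncation is an assumption.

```python
def logistic_bits(mu, x0, n_states, bits_per_state=16):
    """Iterate x <- mu*x*(1-x) and emit the leading binary digits of each state."""
    x, out = x0, []
    for _ in range(n_states):
        v = int(x * (1 << bits_per_state))  # digits of 0.b0 b1 ... as an integer (assumed truncation)
        out.extend((v >> (bits_per_state - 1 - t)) & 1 for t in range(bits_per_state))
        x = mu * x * (1.0 - x)
    return out


def ckba(pixels, key1, key2, mu, x0):
    """Apply the masking of Eq. (3) to a flat list of 8-bit pixels."""
    n = len(pixels)
    b = logistic_bits(mu, x0, -(-n // 8))  # ceil(n/8) states give at least 2n bits
    # XNOR with a key equals XOR with the bitwise complement of the key.
    masks = {3: key1, 2: key1 ^ 0xFF, 1: key2, 0: key2 ^ 0xFF}
    return [p ^ masks[2 * b[l] + b[l + 1]] for l, p in enumerate(pixels)]


plain = list(range(256))
cipher = ckba(plain, 0x5A, 0xC3, 3.9, 0.3)       # key1 and key2 differ in four bits
restored = ckba(cipher, 0x5A, 0xC3, 3.9, 0.3)    # the same call decrypts
```

Because every pixel is XORed with one of four fixed bytes, applying the same function twice with the same key returns the original pixels.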

2.1.5. A constraint

Because not all values of key1 and key2 can produce well-disordered cipher-images, it is required that key1 and key2 differ in four bits (half of all bits). In fact, this constraint ensures that the encryption results of key1 and key2 are sufficiently far apart.

In Li and Zheng (2002b), CKBA was cryptanalyzed and the following facts were pointed out:

• the security of CKBA against the brute-force attack was over-estimated;

• CKBA is not secure against known/chosen-plaintext attacks, since only one known/chosen plain-image is enough to get an equivalent key, a mask image \(f_m\), by XORing the plain-image \(f\) and the cipher-image \(f'\), pixel by pixel: \(f_m = f \oplus f'\);

• it is easy to reconstruct the whole secret key \(\{\mathit{key1}, \mathit{key2}, x(0)\}\) from the mask image \(f_m\), for which the required complexity is rather small.

2 An involutive encryption operation satisfies \(f(f(x,k),k) = x\) for any \(x\) and \(k\).

Apparently, the insecurity of CKBA against known/chosen-plaintext attacks is determined by the fact that \(f(x,y) \oplus f'(x,y)\) is fixed to be one of the four values, \(\mathit{key1}\), \(\overline{\mathit{key1}}\), \(\mathit{key2}\), \(\overline{\mathit{key2}}\), at any given position \((x,y)\). In fact, for any two plain-images, \(f_1\) and \(f_2\), and their cipher-images, \(f'_1\), \(f'_2\), one has

\[ f_1(x,y) \oplus f'_1(x,y) = f_2(x,y) \oplus f'_2(x,y) \equiv f_m(x,y) \]

for any position \((x,y)\). As a result, given any cipher-image \(f'\), the plain-image can be decrypted as follows: \(f = f' \oplus f_m\).
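The mask-image attack on CKBA can be illustrated with a toy model. In the sketch below the position-dependent masks are drawn from Python's `random` module instead of the Logistic map (an assumption made only to keep the example short), and all variable names are hypothetical. The only property used is that the mask at each position does not depend on the plain-image.

```python
import random


def xor_images(a, b):
    """Pixel-by-pixel XOR of two equally sized flat images."""
    return [p ^ q for p, q in zip(a, b)]


rng = random.Random(1)
n = 64
keystream = [rng.randrange(256) for _ in range(n)]     # stand-in for the CKBA masks
known_plain = [rng.randrange(256) for _ in range(n)]
secret_plain = [rng.randrange(256) for _ in range(n)]

known_cipher = xor_images(known_plain, keystream)
secret_cipher = xor_images(secret_plain, keystream)

f_m = xor_images(known_plain, known_cipher)    # mask image from one known pair
recovered = xor_images(secret_cipher, f_m)     # decrypts any other cipher-image
```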

2.2. RCES (Chen and Yen, 2003) (or RSES (Chen et al., 2002))

RCES is an enhanced version of CKBA, obtained by making key1 and key2 time-variant, and by introducing a simple permutation operation, \(\mathit{Swap}_b(x_1, x_2)\), which exchanges the values of \(x_1\) and \(x_2\) if \(b = 1\) and does nothing if \(b = 0\).

RCES encrypts plain-images block by block, where each block contains 16 consecutive pixels. To simplify the following description, without loss of generality, assume that the sizes of plain-images are all \(M \times N\), and that \(MN\) is divisible by 16. Consider a plain-image \(\{f(x,y)\}_{x=0,\,y=0}^{x=M-1,\,y=N-1}\) as a 1D pixel-sequence \(\{f(l)\}_{l=0}^{MN-1}\) by scanning it line by line from top to bottom. The plain-image can be divided into \(MN/16\) blocks:

\[ \left\{ f^{(16)}(0), \ldots, f^{(16)}(k), \ldots, f^{(16)}(MN/16-1) \right\}, \]

where

\[ f^{(16)}(k) = \{ f(16k+0), \ldots, f(16k+i), \ldots, f(16k+15) \}. \]

For the \(k\)th pixel-block \(f^{(16)}(k)\), the working mechanism of RCES can be described as follows.

2.2.1. The secret key

The secret key includes the control parameter \(\mu\) and the initial condition \(x(0)\) of the Logistic map (1).

2.2.2. Initialization

Run the Logistic map to generate a chaotic sequence, \(\{x(i)\}_{i=0}^{MN/16-1}\), and then extract the 24-bit representation of \(x(i)\) to yield a PRBS \(\{b(i)\}_{i=0}^{3MN/2-1}\). Note that the Logistic map is realized in 24-bit fixed-point arithmetic.

2.2.3. Encryption

Two pseudo-random seeds,

\[ \mathit{Seed1}(k) = \sum_{i=0}^{7} b(24k+i) \cdot 2^{7-i}, \tag{4} \]

\[ \mathit{Seed2}(k) = \sum_{i=0}^{7} b(24k+8+i) \cdot 2^{7-i}, \tag{5} \]

are calculated to encrypt the current plain-block with the following two steps:

• Step 1 – Pseudo-randomly swapping adjacent pixels: for \(i\) = 0–7, do

\[ \mathit{Swap}_{b(24k+16+i)}\bigl(f(16k+2i),\, f(16k+2i+1)\bigr). \tag{6} \]

• Step 2 – Masking the current plain-block with the two pseudo-random seeds: for \(j\) = 0–15, do

\[ f'(16k+j) = f(16k+j) \oplus \mathit{Seed}(16k+j), \tag{7} \]

where \(f(16k+j)\) denotes the pixel value after Step 1,

\[
\mathit{Seed}(16k+j) =
\begin{cases}
\mathit{Seed1}(k), & B(k,j) = 3,\\
\overline{\mathit{Seed1}(k)}, & B(k,j) = 2,\\
\mathit{Seed2}(k), & B(k,j) = 1,\\
\overline{\mathit{Seed2}(k)}, & B(k,j) = 0,
\end{cases}
\tag{8}
\]

and \(B(k,j) = 2 \times b(24k+j) + b(24k+j+1)\).

2.2.4. Decryption

The decryption procedure is similar to the encryption procedure, but the masking operation is applied before the swapping for each pixel-block.
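The block-wise procedure of Eqs. (4)-(8) can be sketched as follows. This is a hedged illustration rather than the designers' implementation: all function names are hypothetical, the image is a flat list of 8-bit pixels whose length is a multiple of 16, and the way the 24-bit state is truncated after each Logistic-map iteration in `prbs24` is an assumption, since the exact fixed-point rounding is not specified here.

```python
def prbs24(mu, x0, n_blocks):
    """24 bits per chaotic state; the truncation rule below is an assumption."""
    scale = 1 << 24
    v, bits = int(x0 * scale), []
    for _ in range(n_blocks):
        bits.extend((v >> (23 - t)) & 1 for t in range(24))
        x = v / scale
        v = int(mu * x * (1.0 - x) * scale)
    return bits


def seeds(b, k):
    """Per-pixel masks Seed(16k+j), j = 0..15, following Eqs. (4), (5) and (8)."""
    s1 = sum(b[24 * k + i] << (7 - i) for i in range(8))
    s2 = sum(b[24 * k + 8 + i] << (7 - i) for i in range(8))
    table = {3: s1, 2: s1 ^ 0xFF, 1: s2, 0: s2 ^ 0xFF}
    return [table[2 * b[24 * k + j] + b[24 * k + j + 1]] for j in range(16)]


def swap_pairs(block, b, k):
    """Eq. (6): swap pixels 2i and 2i+1 of the block when b(24k+16+i) = 1."""
    block = list(block)
    for i in range(8):
        if b[24 * k + 16 + i]:
            block[2 * i], block[2 * i + 1] = block[2 * i + 1], block[2 * i]
    return block


def rces_encrypt(pixels, b):
    out = []
    for k in range(len(pixels) // 16):
        block = swap_pairs(pixels[16 * k:16 * k + 16], b, k)    # Step 1
        out.extend(p ^ s for p, s in zip(block, seeds(b, k)))   # Step 2
    return out


def rces_decrypt(cipher, b):
    out = []
    for k in range(len(cipher) // 16):
        block = [c ^ s for c, s in zip(cipher[16 * k:16 * k + 16], seeds(b, k))]
        out.extend(swap_pairs(block, b, k))  # unmask first, then undo the swaps
    return out


bits = prbs24(3.915264, 0.2526438, 4)
plain = list(range(64))
cipher = rces_encrypt(plain, bits)
```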

3. Cryptanalysis of RCES

Although RCES is more complicated than CKBA, as analyzed below, its security is not really enhanced by the introduced design complexity.

In this section, the following results are obtained on the security of RCES: (1) its security against brute-force attack was over-estimated; (2) it is not secure against known/chosen-plaintext attacks, and the number of required plain-images is only O(1) and, in fact, only one or two; (3) there are two available known/chosen-plaintext attacks, and they can be further combined to make a nearly-perfect attack on RCES; (4) the chosen-plaintext attacks can even achieve much better breaking performance than their known-plaintext versions.

3.1. Brute-force attack

In Chen et al. (2002) and Chen and Yen (2003) it was claimed that the complexity of RCES against brute-force attack is \(O(2^{3MN/2})\) since \(\{b(i)\}_{i=0}^{3MN/2-1}\) has \(3MN/2\) bits. However, this statement is not true for the following reason: all \(3MN/2\) bits are uniquely determined by the control parameter \(\mu\) and the initial condition \(x(0)\) of the Logistic map (1), which together have only 48 secret bits. This means that the key entropy of RCES is only 48 bits. Considering that not all values of \(\mu\) can produce chaoticity in the Logistic map, the key entropy should be even smaller than 48. To simplify the following analysis, assume that the key entropy is \(K_\mu < 48\), so the total number of all possible keys for brute-force search is only \(2^{K_\mu}\).

Fig. 1. One 256 × 256 known plain-image, \(f_{\mathrm{Lenna}}\), and its cipher-image \(f'_{\mathrm{Lenna}}\). (a) \(f_{\mathrm{Lenna}}\); (b) \(f'_{\mathrm{Lenna}}\).

Fig. 2. The mask image \(f_m\) derived from \(f_{\mathrm{Lenna}}\) and \(f'_{\mathrm{Lenna}}\).

Considering that the complexity of RCES is \(O(MN)\) (Chen and Yen, 2003, Section 2.4), the complexity of the brute-force attack will be \(O(2^{K_\mu} \cdot MN)\). Taking \(K_\mu = 48\) as an upper bound, for a typical image of size 256 × 256 the complexity is about \(O(2^{64})\), which is much smaller than \(O(2^{3MN/2}) = O(2^{98304})\), the complexity claimed in Chen et al. (2002) and Chen and Yen (2003). Apparently, the security of RCES against the brute-force attack was greatly over-estimated.
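The two exponents compared above can be checked with a short calculation. The variable names are illustrative, and the split of the 48 key bits into 24 bits each for the control parameter and the initial condition is inferred from the 24-bit fixed-point arithmetic mentioned in Section 2.2.2.

```python
import math

M, N = 256, 256
key_bits = 48  # upper bound on the key entropy: 24 bits each for mu and x(0)

claimed_exponent = 3 * M * N // 2                # exponent in O(2^(3MN/2))
actual_exponent = key_bits + math.log2(M * N)    # exponent in O(2^K * MN)

print(claimed_exponent, actual_exponent)
```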

Note that, if the attacker does not know any plain-image, a way is needed to automatically verify the guessed keys. This can be done by calculating the distribution of the differences of adjacent pixels. For a wrong key, the obtained image is generally noise-like, and the distribution of the pixel-value differences would be nearly uniform. For the correct key, a natural image will be output, which corresponds to a Laplacian distribution of the neighboring differences. Note that the complexity of such a verification process is also \(O(MN)\).
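One simple way to turn this verification idea into code is to measure how many adjacent-pixel differences are small. The sketch below is an assumption-laden illustration: the function names, the threshold of 16 and the acceptance ratio of 0.5 are hypothetical choices, row boundaries of the flattened image are ignored, and a sawtooth ramp stands in for a natural image.

```python
import random


def small_difference_ratio(pixels, threshold=16):
    """Fraction of adjacent pixel pairs whose absolute difference is below threshold."""
    diffs = [abs(a - b) for a, b in zip(pixels, pixels[1:])]
    return sum(d < threshold for d in diffs) / len(diffs)


def looks_like_natural_image(pixels, threshold=16, min_ratio=0.5):
    """Accept a candidate decryption if small differences dominate (hypothetical rule)."""
    return small_difference_ratio(pixels, threshold) >= min_ratio


rng = random.Random(0)
smooth = [(3 * l) % 256 for l in range(4096)]        # toy stand-in for a natural image
noise = [rng.randrange(256) for _ in range(4096)]    # typical output for a wrong key
```

The test is a single pass over the pixels, in line with the \(O(MN)\) verification cost stated above.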

3.2. Known-plaintext attack 1: breaking RCES with a mask image fm

Although different seeds are used for pixels at different positions and pseudo-random swapping operations are applied to the plain-image before masking, the known-plaintext attack breaking CKBA can be efficiently extended to break RCES. With only one known plain-image and its corresponding cipher-image, it is very easy to get a mask image \(f_m\), which can be used as an equivalent of the secret key \((\mu, x(0))\) to decrypt any cipher-image whose size is not larger than the size of \(f_m\). When two or more plain-images are known, a swapping matrix \(Q\) can be constructed to enhance the breaking performance of the mask image \(f_m\).

3.2.1. Get fm from one known plain-image

Assume that an \(M \times N\) plain-image \(f_K\) and its corresponding cipher-image \(f'_K\) are known to an attacker. Similar to the way to get the mask image in the known-plaintext attack on CKBA, the attacker here can get \(f_m\) by simply XORing the plain-image and the cipher-image pixel by pixel: \(f_m(l) = f_K(l) \oplus f'_K(l)\), where \(l = 0 \sim MN-1\).

With the mask image \(f_m\), the attacker tries to recover the plain-image by XORing the mask image and the cipher-image pixel by pixel: \(f(l) = f'(l) \oplus f_m(l)\). If a pixel \(f(l)\) is not swapped, \(f(l) = f'(l) \oplus f_m(l)\) holds; otherwise, it is generally not true. Assuming that the bit \(b(24k+16+i)\) in Eq. (6) is balanced over \(\{0,1\}\),³ it is expected that about half of all plain-pixels are not swapped and can be successfully decrypted with \(f_m \oplus f'\). Intuitively, half of the plain-pixels should be enough to reveal the main content and some details of the plain-image.

3 Strictly speaking, the Logistic map cannot guarantee the balance of each generated bit, since its invariant density function is not uniform (Kohda and Aihara, 1990). In this paper, without loss of generality, it is taken for granted so as to simplify the theoretical analyses.
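The expectation that roughly half of the pixels are recovered exactly can be checked on a simplified model. In this sketch the per-pixel masks and per-pair swap bits are drawn from Python's `random` module instead of the Logistic map, and the images are random bytes rather than natural images; both are assumptions made to keep the example self-contained, and all names are hypothetical.

```python
import random


def encrypt(pixels, masks, swap_bits):
    """RCES-like model: swap adjacent pairs where the bit is 1, then XOR-mask."""
    out = list(pixels)
    for i, s in enumerate(swap_bits):
        if s:
            out[2 * i], out[2 * i + 1] = out[2 * i + 1], out[2 * i]
    return [p ^ m for p, m in zip(out, masks)]


rng = random.Random(7)
n = 4096
masks = [rng.randrange(256) for _ in range(n)]
swap_bits = [rng.randrange(2) for _ in range(n // 2)]
known = [rng.randrange(256) for _ in range(n)]
target = [rng.randrange(256) for _ in range(n)]

# Mask image from the known pair, then its use on another cipher-image.
f_m = [p ^ c for p, c in zip(known, encrypt(known, masks, swap_bits))]
recovered = [c ^ m for c, m in zip(encrypt(target, masks, swap_bits), f_m)]

unswapped = [l for l in range(n) if not swap_bits[l // 2]]
exact = sum(r == t for r, t in zip(recovered, target)) / n
```

Every unswapped position is recovered exactly, so `exact` is close to one half. With random test images the errors at swapped positions are large; for natural images adjacent pixels are strongly correlated, which is what makes the visual result much better than this fraction suggests.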

With the secret key \((\mu, x(0)) = (3.915264, 0.2526438)\), which is randomly chosen with the standard rand() function, some experiments are made to show the real performance of the mask image \(f_m\) in this attack. One known plain-image \(f_{\mathrm{Lenna}}\) and its cipher-image \(f'_{\mathrm{Lenna}}\) are shown in Fig. 1. The mask image \(f_m = f_{\mathrm{Lenna}} \oplus f'_{\mathrm{Lenna}}\) is given in Fig. 2. For an unknown plain-image \(f_{\mathrm{Peppers}}\) (Fig. 3a), the mask image \(f_m\) is used to recover it from its cipher-image \(f'_{\mathrm{Peppers}}\) (Fig. 3b). The recovered plain-image \(f^*_{\mathrm{Peppers}} = f_m \oplus f'_{\mathrm{Peppers}}\) and the recovery error \(|f^*_{\mathrm{Peppers}} - f_{\mathrm{Peppers}}|\) are shown in Figs. 4a and 4b, respectively. Surprisingly, the decryption performance is much better than expected: most pixels (much more than 50%) appear to be successfully recovered, and almost all subtle details remain.

Fig. 3. A 256 × 256 plain-image unknown to the attacker, \(f_{\mathrm{Peppers}}\), and its cipher-image \(f'_{\mathrm{Peppers}}\). (a) \(f_{\mathrm{Peppers}}\); (b) \(f'_{\mathrm{Peppers}}\).

Fig. 4. The result of breaking the plain-image with \(f_m\) derived from \(f_{\mathrm{Lenna}}\): (a) the recovered plain-image \(f^*_{\mathrm{Peppers}}\); (b) the recovery error \(|f^*_{\mathrm{Peppers}} - f_{\mathrm{Peppers}}|\).

Although the recovery error \(|f^*_{\mathrm{Peppers}} - f_{\mathrm{Peppers}}|\) visually suggests that most plain-pixels are exactly recovered, statistical data reveal that 33,834 pixels in \(f^*_{\mathrm{Peppers}} - f_{\mathrm{Peppers}}\) are not zero, i.e., about 51.63% of pixels are not exactly recovered. To explain why \(f_m\) is so effective in recovering most pixels of the plain-image with only about half of the pixels exactly recovered, consider two pixels in the known plain-image, \(f(2i)\), \(f(2i+1)\), and their cipher-pixels, \(f'(2i)\), \(f'(2i+1)\), where \(i = 0 \sim MN/2-1\). Then, the corresponding elements of the two pixels in the mask image \(f_m\) will be \(f_m(2i) = f(2i) \oplus f'(2i)\) and \(f_m(2i+1) = f(2i+1) \oplus f'(2i+1)\). Since all recovery errors are introduced at the positions where the adjacent plain-pixels are swapped, one can theoretically study the recovery performance of the mask image \(f_m\) by considering the elements corresponding to the swapped pixels only. Assume that \(f(2i)\) and \(f(2i+1)\) are swapped in the encryption procedure, so that \(f'(2i) = f(2i+1) \oplus \mathit{Seed}(2i)\) and \(f'(2i+1) = f(2i) \oplus \mathit{Seed}(2i+1)\). Therefore

\[ f_m(2i) = f^{(\oplus)}(2i) \oplus \mathit{Seed}(2i), \tag{9} \]

\[ f_m(2i+1) = f^{(\oplus)}(2i) \oplus \mathit{Seed}(2i+1), \tag{10} \]

where \(f^{(\oplus)}(2i) = f(2i) \oplus f(2i+1)\).

Consider a cipher-image \(f'_1\) and its corresponding plain-image \(f_1\). Assuming that the plain-image recovered from \(f_m\) is \(f_1^*\), the recovered plain-pixels, \(f_1^*(2i)\) and \(f_1^*(2i+1)\), satisfy the following proposition and corollaries. Note that these results are only true for swapped pixels.

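Proposition 1. \(f_1^*(2i) \oplus f_1(2i) = f_1^*(2i+1) \oplus f_1(2i+1) = f^{(\oplus)}(2i) \oplus f_1^{(\oplus)}(2i)\).

Proof. From Eq. (9) and \(f'_1(2i) = f_1(2i+1) \oplus \mathit{Seed}(2i)\),

\[
\begin{aligned}
f_1^*(2i) &= f_m(2i) \oplus f'_1(2i) \\
&= \bigl(f^{(\oplus)}(2i) \oplus \mathit{Seed}(2i)\bigr) \oplus \bigl(f_1(2i+1) \oplus \mathit{Seed}(2i)\bigr) \\
&= f^{(\oplus)}(2i) \oplus f_1(2i+1).
\end{aligned}
\]

Then, one has

\[
\begin{aligned}
f_1^*(2i) \oplus f_1(2i) &= f^{(\oplus)}(2i) \oplus f_1(2i+1) \oplus f_1(2i) \\
&= f^{(\oplus)}(2i) \oplus f_1^{(\oplus)}(2i).
\end{aligned}
\]

In a similar way, one can get \(f_1^*(2i+1) \oplus f_1(2i+1) = f^{(\oplus)}(2i) \oplus f_1^{(\oplus)}(2i)\). Thus, the proof is completed. □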

Corollary 1. When \(f(2i) = f(2i+1)\), \(f_1^*(2i) = f_1(2i+1)\) and \(f_1^*(2i+1) = f_1(2i)\).

Proof. This is the special case of Proposition 1 with \(f^{(\oplus)}(2i) = 0\). □

Based on Proposition 1, one can get an upper bound of the recovery errors \(|f_1^*(2i) - f_1(2i)|\) and \(|f_1^*(2i+1) - f_1(2i+1)|\). Firstly, a lemma should be introduced.

Lemma 1. If \(a \oplus b = c\), then \(|a - b| \le c\).

Proof. Represent \(c\) in the following binary form:

\[ c = (0, \ldots, 0, c_{n-1} = 1, \ldots, c_i, \ldots, c_1, c_0)_2. \]

Similarly, represent \(a\) and \(b\) as follows:

\[ a = (a_{N-1}, \ldots, a_{n-1}, \ldots, a_i, \ldots, a_1, a_0)_2, \qquad b = (b_{N-1}, \ldots, b_{n-1}, \ldots, b_i, \ldots, b_1, b_0)_2. \]

From \(a \oplus b = c\), one has \(a_j = b_j\) for all \(j = n \sim N-1\). Therefore

\[
|a - b| = \left| \sum_{i=0}^{N-1} (a_i - b_i) \cdot 2^i \right|
= \left| \sum_{i=0}^{n-1} (a_i - b_i) \cdot 2^i \right|
\le \sum_{i=0}^{n-1} |a_i - b_i| \cdot 2^i.
\]

Since \(|a_i - b_i| = a_i \oplus b_i = c_i\), one has \(|a - b| \le \sum_{i=0}^{n-1} c_i \cdot 2^i = c\). The lemma is thus proved. □

Corollary 2. \(|f_1^*(2i) - f_1(2i)| \le f^{(\oplus)}(2i) \oplus f_1^{(\oplus)}(2i)\), and \(|f_1^*(2i+1) - f_1(2i+1)| \le f^{(\oplus)}(2i) \oplus f_1^{(\oplus)}(2i)\).

Proof. This corollary is an obvious result of Proposition 1 and Lemma 1. □
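The lemma and the bound can also be checked numerically. The sketch below verifies Lemma 1 exhaustively for 8-bit values and tests Proposition 1 and Corollary 2 on randomly drawn swapped pixel pairs; the variable names are illustrative, and `s0`, `s1` play the role of Seed(2i) and Seed(2i+1).

```python
import random

# Lemma 1, checked exhaustively for all pairs of 8-bit values.
lemma_holds = all(abs(a - b) <= (a ^ b) for a in range(256) for b in range(256))

# Proposition 1 and Corollary 2, checked on random swapped pairs.
rng = random.Random(3)
prop_holds = True
for _ in range(10000):
    f0, f1, g0, g1, s0, s1 = (rng.randrange(256) for _ in range(6))
    fm0, fm1 = f0 ^ (f1 ^ s0), f1 ^ (f0 ^ s1)    # mask from the known pair (f0, f1)
    r0, r1 = fm0 ^ (g1 ^ s0), fm1 ^ (g0 ^ s1)    # recovered values for the pair (g0, g1)
    d = (f0 ^ f1) ^ (g0 ^ g1)                    # XOR of the two XOR-differences
    prop_holds &= (r0 ^ g0) == d and (r1 ^ g1) == d
    prop_holds &= abs(r0 - g0) <= d and abs(r1 - g1) <= d
```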

Corollary 2 says that the recovery errors of both \(f_1^*(2i)\) and \(f_1^*(2i+1)\) will not be larger than \(f^{(\oplus)}(2i) \oplus f_1^{(\oplus)}(2i) = f(2i) \oplus f(2i+1) \oplus f_1(2i) \oplus f_1(2i+1)\). Due to the strong correlation between adjacent pixels of digital images, the difference between two adjacent pixels follows a Laplace distribution (a special case of the generalized Gaussian distribution). As a result, \(f^{(\oplus)}(2i)\) will also obey a (positive) one-sided Laplace distribution, which means that the recovery error of each plain-pixel recovered from \(f_m\) will also obey a Laplace distribution. The Laplace distribution of recovery errors implies that most recovered pixels are close to the real values of the original plain-pixels. Therefore, the surprising recovery performance of \(f_m\) shown in Fig. 4 can be naturally explained.


For the plain-image \(f_{\mathrm{Peppers}}\), the histograms of some differential images are plotted to verify the above-mentioned theoretical results. Define two \((M-1) \times N\) differential images \(f^{(\oplus)}\) and \(f^{(-)}\):

\[ f^{(\oplus)}(x,y) = f(x,y) \oplus f(x+1,y), \tag{11} \]

\[ f^{(-)}(x,y) = f(x,y) - f(x+1,y), \tag{12} \]

where \(x = 0 \sim M-2\), \(y = 0 \sim N-1\). The histograms of the above two differential images of \(f_{\mathrm{Peppers}}\) are shown in Fig. 5. When \(f = f_{\mathrm{Lenna}}\), \(f_1 = f_{\mathrm{Peppers}}\), the histograms of \(f^{(\oplus)} \oplus f_1^{(\oplus)}\) and \(|f^*_{\mathrm{Peppers}} - f_{\mathrm{Peppers}}|\) are shown in Fig. 6. Apparently, Fig. 6 agrees with Corollary 2 very well. Note that only the swapped pixels are counted in the histogram of \(|f^*_{\mathrm{Peppers}} - f_{\mathrm{Peppers}}|\), since the above theoretical analysis of the recovery errors focuses only on the swapped pixels.

Since all recovery errors are introduced by swapped pixels, the recovery performance will be better if some swapped pixels can be distinguished. In the following, it is shown that an attacker can manage to do so by manually detecting visible noises in cipher-images, and by intersecting multiple mask images generated from different known plain-images.

Fig. 5. The histograms of \(f^{(\oplus)}_{\mathrm{Peppers}}\) and \(f^{(-)}_{\mathrm{Peppers}}\) (horizontal axis: pixel-value difference, −255 to 255; vertical axis: count).

Fig. 6. The histograms of \(f^{(\oplus)}_{\mathrm{Lenna}} \oplus f^{(\oplus)}_{\mathrm{Peppers}}\) and \(|f^*_{\mathrm{Peppers}} - f_{\mathrm{Peppers}}|\) (horizontal axis: value, 0 to 255; vertical axis: count).

3.2.2. Amending fm with more known cipher-images

Assume that the plain-image corresponding to a cipher-image does not contain salt-and-pepper impulsive noise. Then, one can assert that all such noise in the recovered plain-image indicates the positions of swapped pixels. Observing the recovered plain-image \(f^*_{\mathrm{Peppers}}\) shown in Fig. 4a, one can find many distinguishable noisy pixels with the naked eye, which correspond to the strong edges of the plain-image \(f_{\mathrm{Peppers}}\) (see Fig. 4b). Following Proposition 1, strong edges mean large values of \(f^{(\oplus)}(x)\), and so generate salt-and-pepper noise.

Once some swapped pixels are distinguished, one can generate a swapping (0,1)-matrix \(Q = [q_{i,j}]_{M \times N}\), where \(q_{i,j} = 1\) for swapped pixels and \(q_{i,j} = 0\) for others. Similarly, \(Q\) can be represented in 1D form: \(Q = \{q(l)\}_{l=0}^{MN-1}\). With the swapping matrix, the mask image \(f_m\) is amended as follows: for \(i = 0 \sim MN/2-1\), if \(q(2i) = 1\) or \(q(2i+1) = 1\), the values of \(f_m(2i)\) and \(f_m(2i+1)\) are re-calculated as \(f_m(2i) = f(2i+1) \oplus f'(2i)\) and \(f_m(2i+1) = f(2i) \oplus f'(2i+1)\), which equal \(\mathit{Seed}(2i)\) and \(\mathit{Seed}(2i+1)\), respectively, because \(f'(2i) = f(2i+1) \oplus \mathit{Seed}(2i)\) and \(f'(2i+1) = f(2i) \oplus \mathit{Seed}(2i+1)\) hold for swapped pixels; otherwise, \(f_m(2i)\) and \(f_m(2i+1)\) are left untouched. With the amended \(f_m\) and the swapping matrix \(Q\), one can decrypt the cipher-images in the following two steps:

• use \(f_m\) to XOR the cipher-image to get an initial recovered plain-image \(f^*\);

• for all \(i = 0 \sim MN/2-1\), if \(q(2i) = 1\) or \(q(2i+1) = 1\), swap the two adjacent pixels \(f^*(2i)\) and \(f^*(2i+1)\).

If an attacker can get more cipher-images encrypted with the same key, he can distinguish more swapped pixels, and get better recovery performance with \(f_m\) and \(Q\). This implies that more and more knowledge on how to refine the attack can be learned from the cipher-images, which is a desirable feature from an attacker's point of view.
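The amendment and the two decryption steps can be sketched as follows. The sketch uses the same simplified model as before (random masks and swap bits in place of the Logistic-map keystream, which is an assumption), represents the swapping information as a set of pair indices `q_pairs` instead of a matrix, and uses hypothetical function names. The amended mask entries are written so that each one equals the seed at its own position, as required by the swapped-pixel relations \(f'(2i) = f(2i+1) \oplus \mathit{Seed}(2i)\) and \(f'(2i+1) = f(2i) \oplus \mathit{Seed}(2i+1)\).

```python
import random


def encrypt(pixels, masks, swap_bits):
    """RCES-like model: swap adjacent pairs where the bit is 1, then XOR-mask."""
    out = list(pixels)
    for i, s in enumerate(swap_bits):
        if s:
            out[2 * i], out[2 * i + 1] = out[2 * i + 1], out[2 * i]
    return [p ^ m for p, m in zip(out, masks)]


def amend_mask(f_m, known, known_cipher, q_pairs):
    """Recompute the mask at pairs flagged as swapped so it equals the seeds."""
    f_m = list(f_m)
    for i in q_pairs:
        f_m[2 * i] = known[2 * i + 1] ^ known_cipher[2 * i]
        f_m[2 * i + 1] = known[2 * i] ^ known_cipher[2 * i + 1]
    return f_m


def decrypt_with_mask(cipher, f_m, q_pairs):
    """XOR with the mask, then swap back the pairs flagged as swapped."""
    out = [c ^ m for c, m in zip(cipher, f_m)]
    for i in q_pairs:
        out[2 * i], out[2 * i + 1] = out[2 * i + 1], out[2 * i]
    return out


rng = random.Random(5)
n = 512
masks = [rng.randrange(256) for _ in range(n)]
swap_bits = [rng.randrange(2) for _ in range(n // 2)]
known = [rng.randrange(256) for _ in range(n)]
target = [rng.randrange(256) for _ in range(n)]

known_cipher = encrypt(known, masks, swap_bits)
f_m = [p ^ c for p, c in zip(known, known_cipher)]
q_pairs = {i for i, s in enumerate(swap_bits) if s}   # assume all swaps were identified
f_m_amended = amend_mask(f_m, known, known_cipher, q_pairs)
recovered = decrypt_with_mask(encrypt(target, masks, swap_bits), f_m_amended, q_pairs)
```

In this idealized case every swapped pair is known, so recovery is exact; when only some pairs are identified, the remaining swapped pairs keep the errors described in Section 3.2.1.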

3.2.3. Amending fm with more known plain-images

With two or more known plain-images and their cipher-images encrypted with the same secret key, it is possible to successfully distinguish most swapped pixels, achieving nearly-perfect recovery performance. Given \(n \ge 2\) known plain-images, \(f_1, \ldots, f_n\), and their cipher-images, \(f'_1, \ldots, f'_n\), one can get \(n\) mask images \(f_m^{(i)} = f_i \oplus f'_i\) (\(i\) = 1–\(n\)). Apparently, if the \(l\)th pixel is not swapped, \(f_m^{(i)}(l) = f_m^{(j)}(l)\) for all \(i \ne j\). That is, if \(f_m^{(i)}(l) \ne f_m^{(j)}(l)\), it can be asserted that the pixel at this position is swapped. Therefore, by comparing the elements of the \(n\) mask images, some positions corresponding to the swapped pixels can be distinguished. With the swapping information, a swapping matrix \(Q\) can be constructed in the same way as described above, and then \(f_m\) is amended with \(Q\) as mentioned above. Using the amended \(f_m\) and the swapping matrix \(Q\), the cipher-image is decrypted with XOR and swapping operations.

Fig. 8. The result of enhancing the recovered plain-image \(f^*_{\mathrm{Peppers}}\) with a 3 × 3 median filter: (a) the enhanced image \(f^{*,3\times3}_{\mathrm{Peppers}}\); (b) the recovery error \(|f^{*,3\times3}_{\mathrm{Peppers}} - f_{\mathrm{Peppers}}|\).

From Eqs. (9) and (10), the probability of \(f_m^{(i)}(l) \ne f_m^{(j)}(l)\) is the probability of \(f_i^{(\oplus)}(2t) \ne f_j^{(\oplus)}(2t)\), where \(l = 2t\) or \(2t+1\). Assume that the \(n\) mask images are independent of each other and the value of each element is uniformly distributed over \(\{0, \ldots, 255\}\). The probability of \(f_m^{(i)}(l) \ne f_m^{(j)}(l)\) will be \(1 - 256^{-1} \approx 0.996\). This means that two mask images are enough to distinguish almost all swapped pixels. However, since the mask images are generally not independent of each other and \(f_m(l)\) is not uniformly distributed, the real probability will be less than \(1 - 256^{-1}\). Fortunately, for most natural images, this probability is still sufficiently close to \(1 - 256^{-1}\), so that two known plain-images are still enough to distinguish most swapped pixels. Given two known plain-images, \(f_{\mathrm{Lenna}}\) (Fig. 1a) and \(f_{\mathrm{Barbara}}\) (Fig. 7a), the recovery performance of the attack on \(f_{\mathrm{Peppers}}\) is shown in Fig. 7b. It can be seen that the recovered plain-image is almost perfect, and only 952 pixels (about 1.45% of all) are not exactly recovered.
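The comparison of two mask images can be sketched in a few lines. As in the earlier sketches, the keystream and the images are random bytes standing in for the real RCES keystream and natural images (an assumption), and the function names are hypothetical. Under this uniform model the detection rate should be close to \(1 - 256^{-1}\).

```python
import random


def encrypt(pixels, masks, swap_bits):
    """RCES-like model: swap adjacent pairs where the bit is 1, then XOR-mask."""
    out = list(pixels)
    for i, s in enumerate(swap_bits):
        if s:
            out[2 * i], out[2 * i + 1] = out[2 * i + 1], out[2 * i]
    return [p ^ m for p, m in zip(out, masks)]


def mask_image(plain, cipher):
    return [p ^ c for p, c in zip(plain, cipher)]


def detect_swapped_pairs(mask_a, mask_b):
    """Pair indices at which the two mask images disagree on either pixel."""
    return {l // 2 for l, (a, b) in enumerate(zip(mask_a, mask_b)) if a != b}


rng = random.Random(11)
n = 4096
masks = [rng.randrange(256) for _ in range(n)]
swap_bits = [rng.randrange(2) for _ in range(n // 2)]
f1 = [rng.randrange(256) for _ in range(n)]
f2 = [rng.randrange(256) for _ in range(n)]

fm1 = mask_image(f1, encrypt(f1, masks, swap_bits))
fm2 = mask_image(f2, encrypt(f2, masks, swap_bits))
detected = detect_swapped_pairs(fm1, fm2)
truly_swapped = {i for i, s in enumerate(swap_bits) if s}
```

A disagreement can only occur at a swapped pair, so `detected` never contains false positives; the few swapped pairs it misses are those where the two plain-images happen to have equal XOR-differences.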

3.2.4. Enhancing the recovered plain-image with image processing techniques

To further improve the visual quality of the recovered plain-images, some noise reduction techniques can be used to further reduce the recovery errors. For the recovered plain-image \(f^*_{\mathrm{Peppers}}\) in Fig. 4a, the enhanced plain-image \(f^{*,3\times3}_{\mathrm{Peppers}}\) obtained with a 3 × 3 median filter and the corresponding recovery error \(|f^{*,3\times3}_{\mathrm{Peppers}} - f_{\mathrm{Peppers}}|\) are shown in Figs. 8a and 8b, respectively. It can be seen that the visual quality of \(f^*_{\mathrm{Peppers}}\) is enhanced significantly. Note that more complicated image processing techniques are still available to further polish the recovered plain-image, one of which will be introduced below in Section 3.5.
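A 3 × 3 median filter of the kind mentioned above can be sketched in pure Python. The function name `median3x3` is hypothetical, the image is assumed to be a list of rows of integers, and copying the border pixels unchanged is an assumption, since the border handling used in the experiments is not stated.

```python
from statistics import median_low


def median3x3(img):
    """3x3 median filter on a list of rows; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median_low(window)  # middle of the 9 sorted values
    return out


# A flat image with one impulsive (salt) pixel in the middle.
noisy = [[100] * 5 for _ in range(5)]
noisy[2][2] = 255
cleaned = median3x3(noisy)
```

Isolated impulsive errors, such as those produced at swapped pixels near strong edges, are removed, while flat regions are left unchanged.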

3.3. Known-plaintext attack 2: breaking the chaotic map

In the above-discussed attack based on mask images, assuming that the size of \(f_m\) is \(M \times N\), it is obvious that only the leading \(MN\) pixels in a larger cipher-image can be recovered with \(f_m\) (and perhaps \(Q\)). To decrypt more pixels, the secret control parameter \(\mu\) and a chaotic state \(x(k)\) occurring before \(x(MN/16-1)\) have to be known, so that one can calculate more chaotic states after \(x(MN/16-1)\). That is, the chaotic map itself has to be recovered. Actually, it is possible for an attacker to achieve this goal with a high probability and a sufficiently small complexity, even when only one plain-image is known. Similarly, the larger the number of known plain-images is, the closer the probability will be to 1, the smaller the value of \(k\) will be, and the lower the attack complexity will be.

Fig. 7. Another known plain-image \(f_{\mathrm{Barbara}}\) and the recovered plain-image \(f^*_{\mathrm{Peppers}}\) with two known plain-images: \(f_{\mathrm{Lenna}}\) and \(f_{\mathrm{Barbara}}\). (a) \(f_{\mathrm{Barbara}}\); (b) \(f^*_{\mathrm{Peppers}}\).

3.3.1. Guessing a chaotic state x(k) from fm

In the $k$th pixel-block, for any unswapped pixel $f(16k + j)$,

$$f_m(16k + j) = f(16k + j) \oplus f'(16k + j) = \mathrm{Seed}(16k + j),$$

which must be one value in the set

$$S_4 = \{\mathrm{Seed1}(k), \overline{\mathrm{Seed1}(k)}, \mathrm{Seed2}(k), \overline{\mathrm{Seed2}(k)}\}. \tag{13}$$

Therefore, if there are enough unswapped pixels, the right values of Seed1($k$) and Seed2($k$) can be guessed by enumerating all 2-value and 1-value$^4$ combinations of $f_m(16k + 0) \sim f_m(16k + 15)$. To eliminate most wrong values of Seed1($k$), Seed2($k$), the following requirements are useful:

• both $B(k, j)$ and (Seed1($k$), Seed2($k$)) are generated with $\{b(24k + j)\}_{j=0}^{15}$;
• Seed($16k + j$) is uniquely determined by $B(k, j)$ and Seed1($k$), Seed2($k$) following Eq. (8).

4 The 1-value combinations are included since Seed1($k$) = Seed2($k$) may occur with a small probability.

For each guessed value passing the above requirements, the corresponding chaotic state $x(k) = 0.b(24k + 0) \cdots b(24k + 23)$ is derived as follows:

• reconstruct $\{b(24k + i)\}_{i=0}^{15}$ from Seed1($k$), Seed2($k$);
• reconstruct $\{b(24k + 16 + i)\}_{i=0}^{7}$ with the following rule: if both $f_m(16k + 2i) \in S_4$ and $f_m(16k + 2i + 1) \in S_4$ hold, $b(24k + 16 + i) = 0$, else $b(24k + 16 + i) = 1$.
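The rule for the eight swapping bits can be illustrated with a short Python sketch. It assumes that the overlined seeds in Eq. (13) are 8-bit bitwise complements, and that a guess for Seed1($k$) and Seed2($k$) is already available; the function names are hypothetical.

```python
def seed_set(seed1, seed2):
    """S4: the two 8-bit seeds and their bitwise complements (assumed)."""
    return {seed1, seed1 ^ 0xFF, seed2, seed2 ^ 0xFF}


def guess_swap_bits(fm_block, seed1, seed2):
    """Guess b(24k+16+i), i = 0..7, from the 16 mask values of one block.

    A pair is taken as unswapped (bit 0) only if both of its mask values
    lie in S4; otherwise it is taken as swapped (bit 1).
    """
    if len(fm_block) != 16:
        raise ValueError("a pixel-block has exactly 16 mask values")
    s4 = seed_set(seed1, seed2)
    return [
        0 if fm_block[2 * i] in s4 and fm_block[2 * i + 1] in s4 else 1
        for i in range(8)
    ]
```

A swapped pair whose mask values happen to fall into $S_4$ is misclassified by this rule, which is the source of the errors analyzed next.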

Note that some extra errors will be introduced in the least significant 8 bits $\{b(24k + 16 + i)\}_{i=0}^{7}$, which makes the derived chaotic state $x(k)$ incorrect. Apparently, the errors are induced by the swapped pixels whose corresponding elements of $f_m$ belong to $S_4$. In the following, the probability of such errors, $p_{se} = \mathrm{Prob}[f_m(l) \in S_4]$, is studied. For any swapped pixel $f(l)$ in the $k$th pixel-block ($l = 16k + 0 \sim 16k + 15$), according to Eqs. (9) and (10), one has

$$p_{se} = \mathrm{Prob}[f^{(\oplus)}(l) \in S_4^{(\oplus)}], \tag{14}$$

where $f^{(\oplus)}(l) = f(2\lfloor l/2 \rfloor) \oplus f(2\lfloor l/2 \rfloor + 1)$ and

$$S_4^{(\oplus)} = \{\mathrm{Seed1}(k) \oplus \mathrm{Seed}(l), \overline{\mathrm{Seed1}(k)} \oplus \mathrm{Seed}(l), \mathrm{Seed2}(k) \oplus \mathrm{Seed}(l), \overline{\mathrm{Seed2}(k)} \oplus \mathrm{Seed}(l)\}.$$

Considering the Laplace distribution of $f^{(\oplus)}$ (see Fig. 5) and the fact that $0 \in S_4^{(\oplus)}$, $p_{se}$ is generally not negligible for natural images. Without loss of generality, assume that each bit in $\{b(i)\}$ yields a balanced distribution over $\{0, 1\}$ and any two bits are independent of each other. One can deduce

$$P_1 = \mathrm{Prob}[x(k) \text{ is correct}] = \sum_{i=0}^{8} p_b(8, i) \cdot p_c^{\,i}, \tag{15}$$

where $p_b(8, i) = \binom{8}{i} \cdot 2^{-8}$, which denotes the probability that there are $i$ pairs of swapped pixels, and $p_c = 1 - p_{se}$. The relation between $P_1$ and $p_c$ is given in Fig. 9.
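Eq. (15) is easy to evaluate numerically. The following Python sketch (the function name is hypothetical) computes the sum directly; by the binomial theorem the same value can also be written as $((1 + p_c)/2)^8$, which gives a quick consistency check.

```python
from math import comb


def prob_state_correct(p_c):
    """P1 of Eq. (15): sum over the number i of swapped pixel pairs."""
    return sum(comb(8, i) * 2 ** -8 * p_c ** i for i in range(9))
```

For instance, `prob_state_correct(0.7)` agrees with `0.85 ** 8`, i.e. roughly 0.27, so even with a noticeable error probability for swapped pixels a derived chaotic state is correct about once in four attempts.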

3.3.2. Deriving $\mu$ from two consecutive chaotic states

With two consecutive chaotic states, $x(k)$ and $x(k + 1)$, the estimated value of the secret control parameter $\mu$ will be $\tilde{\mu}_k = \frac{x(k+1)}{x(k) \cdot (1 - x(k))}$. Due to the negative influence of quantization errors, generally $\tilde{\mu}_k \neq \mu$. As is known, chaotic maps are sensitive to noise in the initial condition, so an approximate value of $\mu$ will generate completely different chaotic states after several iterations, which implies that $\tilde{\mu}_k$ cannot be directly used instead of $\mu$ as the secret key. Fortunately, if $|\tilde{\mu}_k - \mu|$ is small enough, one can exhaustively search in the neighborhood of $\tilde{\mu}_k$ to find the right value of $\mu$. To verify which guessed value of $\mu$ is the right one, one should iterate the Logistic map from $x(k + 1)$ until $x(MN/16 - 1)$, and then check whether or not the corresponding elements in $f_m$ match the calculated chaotic states. Once a mismatch occurs, the current guessed value is discarded, and the next guess is tried. To minimize the verification complexity, one can check only a number of chaotic states sufficiently far from $x(k + 1)$ to eliminate most (or even all) wrong values of $\tilde{\mu}_k$, and verify the few remaining ones by checking all chaotic states from $x(k + 2)$ to $x(MN/16 - 1)$.

Fig. 9. The relationship between $P_1 = \mathrm{Prob}[x(k) \text{ is correct}]$ and $p_c$.

Now, the concern is when $|\tilde{\mu}_k - \mu|$ will be small enough to make the exhaustive search practical. According to Proposition 2 (see the Appendix for a proof), when $x(k + 1) \geq 2^{-n}$, the quantization error of $\tilde{\mu}_k$ is less than $2^{n+3}/2^L$, which means that the size of the neighborhood of $\tilde{\mu}_k$ for exhaustive search is $2^{n+3}$. To make the search complexity practically small in real attacks, $x(k + 1) \geq 0.5$ (i.e., $n = 1$, giving a neighborhood of size $2^{1+3} = 16$) is suggested to derive $\mu$, which occurs with probability 0.5.

Proposition 2. Assume that the Logistic map $x(k + 1) = \mu \cdot x(k) \cdot (1 - x(k))$ is iterated with $L$-bit fixed-point arithmetic and that $x(k + 1) \geq 2^{-n}$, where $1 \leq n \leq L$. Then, the following inequality holds: $|\mu - \tilde{\mu}_k| \leq 2^{n+3}/2^L$, where $\tilde{\mu}_k = \frac{x(k+1)}{x(k) \cdot (1 - x(k))}$.
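The estimate-then-search procedure can be sketched in Python as below. Several details are assumptions made only for illustration: states and $\mu$ are stored as integers scaled by $2^L$ with $L = 24$, each multiplication is followed by floor quantization, and the known chaotic states are passed in directly (in the real attack they would be checked against $f_m$). The function names are hypothetical.

```python
L = 24            # assumed number of fractional bits per chaotic state
ONE = 1 << L      # fixed-point representation of 1.0


def logistic_step(x, mu):
    """One fixed-point Logistic iteration, flooring after each product."""
    t = (mu * x) >> L
    return (t * (ONE - x)) >> L


def recover_mu(states, n=1):
    """Search around the estimate of mu derived from states[0], states[1].

    Requires states[1] >= 2^-n (in fixed point), as in Proposition 2.
    Returns all candidates reproducing every given state.
    """
    x0, x1 = states[0], states[1]
    if x1 < (ONE >> n):
        raise ValueError("x(k+1) is too small for the chosen n")
    estimate = (x1 << (2 * L)) // (x0 * (ONE - x0))
    radius = (1 << (n + 3)) + 1   # bound of Proposition 2, plus rounding
    found = []
    for mu in range(estimate - radius, estimate + radius + 1):
        x = x0
        for target in states[1:]:
            x = logistic_step(x, mu)
            if x != target:
                break
        else:
            found.append(mu)
    return found
```

Supplying more states after $x(k+1)$ makes it less likely that a wrong candidate survives, which mirrors the verification step described above.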

Combining the above analyses, the final complexity of finding two correct consecutive chaotic states, $x(k)$, $x(k + 1)$, and the right value of $\mu$, is

$$O\left(\frac{2 \times \left(\binom{16}{2} + \binom{16}{1}\right)}{(0.5 \times P_1)^2} \times 2^{1+3}\right) = O\left(\frac{17408}{P_1^2}\right), \tag{16}$$

which is generally much smaller than the complexity of exhaustively searching all possible keys. As a reference value, when $p_c = 0.7$, the complexity is about $O(2^{17.8}) \ll O(2^{48})$.
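As a worked check of Eq. (16), the following Python sketch evaluates the complexity for a given $P_1$. The function name is hypothetical, and the closed form $P_1 = ((1 + p_c)/2)^8$ used in the example follows from Eq. (15) by the binomial theorem.

```python
from math import comb, log2


def attack_complexity(p1, n=1):
    """Evaluate the expression inside O(.) in Eq. (16)."""
    seed_guesses = 2 * (comb(16, 2) + comb(16, 1))   # 2 * (120 + 16) = 272
    return seed_guesses / (0.5 * p1) ** 2 * 2 ** (n + 3)


# Reference value from the text: p_c = 0.7, so P1 = 0.85 ** 8.
complexity_bits = log2(attack_complexity(0.85 ** 8))
```

With $P_1 = 1$ the expression reduces to 17408, and for $p_c = 0.7$ the value of `complexity_bits` is close to 17.8, in line with the figure quoted above.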

3.3.3. A quick algorithm to guess the two random seeds

Following the above-discussed search process, the found correct chaotic states $x(k)$ and $x(k + 1)$ will be close to $x(0)$. Considering the occurrence of two consecutive chaotic states larger than 0.5 as a Bernoulli experiment, the mathematical expectation of $k$ will be $\frac{1}{(0.5 \times P_1)^2} = \frac{4}{P_1^2}$ (Wikipedia, 2007). This means that only tens of known plain-pixels$^5$ are enough for an attacker to break the chaotic map, which is a very desirable feature for attackers. However, as an obvious disadvantage, the search complexity to guess the two random seeds is somewhat large. In fact, for each pixel-block, one can test only a few of the possible 2-value (and 1-value) combinations, not all. Fortunately, there is another idea to make the search easier: if a pixel-block does not look good for guessing the two random seeds, simply discard it and go to the next pixel-block. Following such an idea, a quicker algorithm can be designed to find the two random seeds. In this quick-search algorithm, the found correct chaotic states $x(k)$ and $x(k + 1)$ may be far from $x(0)$, so the size of the mask image has to be much larger than $\frac{4}{P_1^2}$.

5 For example, even a 10 × 10 "tiny" image is enough.

Fig. 10. Demonstration of the quick-search algorithm, where $f_{\mathrm{Lenna}}$ is the only known plain-image. (a) The recovered plain-image $f^{*}_{\mathrm{Peppers}}$ from the 7th pixel-block; (b) The recovered plain-image $f^{*}_{\mathrm{Peppers}}$ from the 689th pixel-block; (c) The recovered plain-image $f^{*}_{\mathrm{Peppers}}$ from the 1673rd pixel-block; (d) Recovering a larger plain-image $f^{*}_{\mathrm{Peppers2},768\times768}$ from the 1673rd pixel-block.

The quick-search algorithm is based on the following observation: the more unswapped pixels there are in the $k$th pixel-block, the more elements in $\{f_m(16k + j)\}_{j=0}^{15}$ belong to $S_4$. Then, define a new sequence $\{\tilde{f}_m(16k + j)\}_{j=0}^{15}$ accordingly as follows:

$$\tilde{f}_m(16k + j) = \min\left(f_m(16k + j), \overline{f_m(16k + j)}\right). \tag{17}$$

Then, the following is also true: the more unswapped pixels there are in the $k$th pixel-block, the more values of $S_2$ will appear in $\{\tilde{f}_m(16k + j)\}_{j=0}^{15}$, where

$$S_2 = \left\{\min\left(\mathrm{Seed1}(k), \overline{\mathrm{Seed1}(k)}\right), \min\left(\mathrm{Seed2}(k), \overline{\mathrm{Seed2}(k)}\right)\right\}.$$

Therefore, assuming that there are $n_k$ pairs of unswapped pixels in the $k$th pixel-block, the following fact is true: if $n_k$ is sufficiently large, the two most-occurring elements in $\{\tilde{f}_m(16k + j)\}_{j=0}^{15}$ are the two values in $S_2$, with a high probability. Then, the question becomes: when can one say that $n_k$ is sufficiently large? Among the eight pairs of elements in total, the average number of pairs in $S_2$ is $N(S_2) = n_k + (8 - n_k) \cdot p_{se}$, and the number of other pairs is $N(\overline{S_2}) = 8 - N(S_2) = (8 - n_k) \cdot (1 - p_{se})$. From a conservative point of view, let $N(\overline{S_2}) < \frac{N(S_2)}{2}$, which ensures that the occurring probability of each element of $S_2$ is larger than the probability of all other values, with a sufficiently high probability. Solving this inequality, one can get $n_k \geq 6$, yielding $N(\overline{S_2}) \leq 2 < 3 \leq \frac{N(S_2)}{2}$.

Based on the above analyses, the quick-search algorithm is described as follows:

• Step 1: for each pixel-block, generate a new sequence, $\{\tilde{f}_m(16k + j)\}_{j=0}^{15}$;
• Step 2: rank all values of $\{\tilde{f}_m(16k + j)\}_{j=0}^{15}$ to find the two most frequently occurring values, value1 and value2;
• Step 3: if the total number of occurrences of value1 and value2 is less than 12 (i.e., fewer than the $2 n_k = 12$ elements implied by $n_k \geq 6$), or if the number of occurrences of value2 is less than 3, skip the current pixel-block and go to Step 1;
• Step 4: in the set $\tilde{S}_4 = \{\mathrm{value1}, \overline{\mathrm{value1}}, \mathrm{value2}, \overline{\mathrm{value2}}\}$, exhaustively search Seed1($k$) and Seed2($k$).

If more than one value corresponds to the same position in the rank of $\{\tilde{f}_m(16k + j)\}_{j=0}^{15}$, all of them should be enumerated as value1 and value2 in Step 2 to Step 4. In a real attack, some extra constraints, such as the relation between the random seeds and $B(k, j)$ (which is due to the reuse of some chaotic bits), can be added to further optimize the above algorithm for different mask images. The attack complexity of this quick-search algorithm is hard to analyze theoretically, since the distribution of those values that are not in $S_4$ is generally unknown. Fortunately, experiments show that the complexity is much smaller than the one given above. In Fig. 10, the performance of the quick-search algorithm is shown for the recovered plain-image $f^{*}_{\mathrm{Peppers}}$, where different pixel-blocks are used to extract the chaotic states. Note that more than 40 pixel-blocks are eligible to be used to extract the correct chaotic states, and the three shown here are randomly chosen for demonstration.
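The block-selection part of the quick-search algorithm (Steps 1 to 3, plus the construction of the candidate set for Step 4) can be sketched in Python as follows. The sketch rests on several assumptions: the overline is read as the 8-bit bitwise complement, a block is kept only when the two most frequent folded values together occur at least 12 times and the second one at least 3 times (the reading consistent with $n_k \geq 6$), and ties in the ranking are not enumerated, although the text says they should be. The function name is hypothetical.

```python
from collections import Counter


def quick_search_candidates(fm_block):
    """Return the candidate set for Seed1/Seed2, or None to skip the block."""
    # Step 1: fold each mask value with its complement, as in Eq. (17).
    folded = [min(v, v ^ 0xFF) for v in fm_block]
    # Step 2: find the two most frequently occurring folded values.
    ranked = Counter(folded).most_common(2)
    if len(ranked) < 2:
        return None
    (value1, count1), (value2, count2) = ranked
    # Step 3: skip blocks that do not look good enough (assumed thresholds).
    if count1 + count2 < 12 or count2 < 3:
        return None
    # Candidate set to be searched exhaustively in Step 4.
    return {value1, value1 ^ 0xFF, value2, value2 ^ 0xFF}
```

Each surviving block yields only four candidate values, so the exhaustive search over seed pairs in Step 4 is far smaller than enumerating all 2-value combinations of the 16 mask values.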

In the following, it is studied theoretically how large $MN$ should be to guarantee the efficiency of the quick-search algorithm, which is determined by the occurrence probability that two consecutive pixel-blocks satisfy the requirements given in Steps 1 and 3. Assume that each bit in $\{b(i)\}$ yields a balanced distribution over $\{0, 1\}$ and any two bits are independent of each other. The probability that one pixel-block satisfies the requirements, which is denoted by $P_o$, yields Eq. (18). Then, for the occurrence probability that two consecutive pixel-blocks satisfy the requirements, which is denoted by $P_{o2}$, one can calculate that $P_{o2} = P_o^2 \geq (\mathrm{Prob}[S_4 = \tilde{S}_4])^2 = \left(\frac{4699}{2^{15}}\right)^2 \approx 0.02$. This means that there will be two consecutive pixel-blocks satisfying the requirements in $\frac{1}{P_{o2}} \approx 50$ pixel-blocks (about 800 pixels), from the probabilistic point of view. Therefore, the required size of the known plain-image should be larger than 800 pixels, which is even smaller than the size of a 30 × 30 image. Hence, the quick-search algorithm is very efficient.

Fig. 11. The recovery performance of the combined known-plaintext attack. (a) The recovered plain-image $f^{*}_{\mathrm{Peppers}}$; (b) The recovered larger plain-image $f^{*}_{\mathrm{Peppers2},768\times768}$.

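$$\begin{aligned}
P_o &\geq \mathrm{Prob}[S_4 = \tilde{S}_4] \\
&= \mathrm{Prob}\left[\text{both Seed1}(k) \text{ and Seed2}(k) \text{ occur} \geq 3 \text{ times in } \{\tilde{f}_m(16k + j)\}_{j=0}^{15}\right] \\
&\quad \cdot \mathrm{Prob}\left[\min\left(\mathrm{Seed1}(k), \overline{\mathrm{Seed1}(k)}\right) \neq \min\left(\mathrm{Seed2}(k), \overline{\mathrm{Seed2}(k)}\right)\right] \\
&= \sum_{n_k=6}^{8} \left( \binom{8}{n_k} \cdot 2^{-8} \cdot \left(1 - \sum_{m=0}^{2} \binom{2 n_k}{m} \cdot 2^{-2 n_k}\right) \cdot (1 - 128^{-1}) \right)
\end{aligned} \tag{18}$$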
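The arithmetic behind the "about 50 pixel-blocks" and "about 800 pixels" figures can be reproduced from the stated value $4699/2^{15}$ alone. The following Python sketch takes that value as given (it does not re-derive it from Eq. (18)); the variable names are hypothetical.

```python
from fractions import Fraction

# Lower bound on the per-block probability, taken as stated in the text.
p_block = Fraction(4699, 2 ** 15)

# Two consecutive blocks must both qualify.
p_two_blocks = p_block ** 2

# Expected number of pixel-blocks before such a pair occurs, and the
# corresponding number of pixels (16 pixels per block).
expected_blocks = 1 / p_two_blocks
expected_pixels = 16 * expected_blocks
```

Converted to floats, `p_two_blocks` is about 0.0206 and `expected_blocks` about 48.6, so roughly 780 pixels are needed, which is below the 900 pixels of a 30 × 30 image.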

3.3.4. Breaking the chaotic map with both fm and Q

All the above-mentioned algorithms are based on only one known plain-image. When more than one plain/cipher-image is known, the constructed swapping (0, 1)-matrix $Q$ will be very useful to increase the efficiency of the attack. As already known, the mask image $f_m$ can be amended using the swapping information stored in $Q$. Since all amended elements in $f_m$ are also values in $S_4$, it is obvious that the efficiency of the search algorithm for finding correct random seeds will be increased. In addition, the swapping matrix $Q$ can be used to uniquely determine some bits in $\{b(24k + 16 + i)\}_{i=0}^{7}$ without checking $f_m(16k + 2i) \in S_4$ and $f_m(16k + 2i + 1) \in S_4$. Thus, the total complexity in finding a correct chaotic state will be less, and the attack will succeed more quickly.

When two or more plain-images and/or cipher-images are known, most swapped pixels can be successfully distinguished. In this case, it is much easier to find a pixel-block of $f_m$ whose elements are all in $S_4$, which means that Seed1($k$), Seed2($k$) can be quickly guessed by enumerating all values in $S_4$, and all the 8 bits $\{b(24k + 16 + i)\}_{i=0}^{7}$ can be absolutely determined. This implies that the attack complexity is minimized to be the complexity of breaking RCES's weaker parent, CKBA (Li and Zheng, 2002b).

3.4. Combined known-plaintext attack

The above two known-plaintext attacks have their respective disadvantages: the first attack cannot decrypt cipher-images larger than $MN$ (the size of $f_m$), and the second one cannot decrypt all pixels before the position where the first correct chaotic state $x(k)$ is found. One can combine them, however, to make a better known-plaintext attack without these disadvantages: use the first attack to decrypt the pixels before $x(k)$ and then use the second attack to decrypt the others. Fig. 11 shows the performance of this combined attack with only one known plain-image, where the recovered chaotic state in the second attack is selected as $x(1673)$ (see also Fig. 10c), which can clearly show the boundary of the two parts decrypted by the two attacks.

3.5. Chosen-plaintext attack

Apparently, all the above three known-plaintext attacks can be extended to chosen-plaintext attacks.

For the first kind of known-plaintext attack, the chosen-plaintext version can achieve much better recovery performance with a nearly perfect mask image $f_m$, by choosing only one plain-image whose pixels are all fixed to be the same gray value. Given such a plain-image, from Corollary 1, any recovered plain-pixel will be the plain-pixel itself or its adjacent pixel. Thus, although the recovery error bounded by $a_1 = f_1(16k + 2i) \oplus f_1(16k + 2i + 1)$ may still be large, it is expected that the visual quality of the recovered plain-image will be much better. It is also expected that all salt-and-pepper impulsive noise will disappear and a dithering effect of edges will occur, which is demonstrated in Fig. 12c with the plain-image $f^{*}_{\mathrm{Peppers}}$ recovered from the chosen plain-image shown in Fig. 12a. As a natural result, the visual quality of the recovered plain-image $f^{*}_{\mathrm{Peppers}}$ becomes much better as compared with the one shown in Fig. 4a.

Similarly to the known-plaintext attack, with some image processing techniques, the recovered plain-image in the chosen-plaintext attack can also be enhanced to provide a better visual quality. Now, the question is: can one maximize the visual quality with an optimization algorithm? The answer is yes. In fact, with a subtly designed algorithm, almost all dithering edges can be perfectly polished and a matrix $Q$ containing partial swapping information can be constructed with only one chosen plain-image. In the following, this efficient algorithm and its real performance are studied in detail.

The proposed algorithm divides the image into $2n$-pixel-blocks for enhancement, where $2n$ can exactly divide $M$. The basic idea is to exhaustively search the optimal swapping states of all pixels to achieve the minimal differential errors. For the $m$th $2n$-pixel-block $f_B(m) = \{f(m \cdot 2n + i)\}_{i=0}^{2n-1}$, the algorithm works as follows:

(1) set $\{b_s(i) = 0\}_{i=0}^{n-1}$ and $\Delta_{\min} = 256(n - 1)$;
(2) for $(b_0, \ldots, b_{n-1}) = (0, \ldots, 0) \sim (1, \ldots, 1)$ (all $n$-bit combinations), do
    (a) assign $A = \{a_0, \ldots, a_{2n-1}\} = f_B(m)$;
    (b) for $i = 0 \sim n - 1$, do $\mathrm{Swap}_{b_i}(a_{2i}, a_{2i+1})$;
    (c) calculate $\Delta_A = |a_2 - a_1| + |a_4 - a_3| + \cdots + |a_{2i} - a_{2i-1}| + \cdots + |a_{2n-2} - a_{2n-3}|$;
    (d) if $\Delta_A < \Delta_{\min}$, then set $\Delta_{\min} = \Delta_A$ and $\{b_s(i) = b_i\}_{i=0}^{n-1}$.
(3) for $i = 0 \sim n - 1$, do $\mathrm{Swap}_{b_s(i)}(f(m \cdot 2n + 2i), f(m \cdot 2n + 2i + 1))$;
(4) set the corresponding elements of the swapping matrix $Q$ to be 1 for $b_s(i) = 1$.

Fig. 12. The recovery performance of the chosen-plaintext attack. (a) The chosen plain-image $f_{\mathrm{Gray}}$; (b) The mask image $f_{\mathrm{Gray},m}$; (c) The recovered plain-image $f^{*}_{\mathrm{Peppers}}$; (d) The recovery error $|f^{*}_{\mathrm{Peppers}} - f_{\mathrm{Peppers}}|$.

Fig. 13. The performance of the optimization algorithm discussed in Section 3.5, when $n = 8$. (a) Enhancing the plain-image $f^{*}_{\mathrm{Peppers}}$ shown in Fig. 12c; (b) The recovery error of (a); (c) Enhancing the plain-image $f^{*}_{\mathrm{Peppers}}$ shown in Fig. 4a; (d) The recovery error of (c).

6 For more discussions on how to design a good image encryption scheme, see the last section of Li et al. (2004).

The complexity of the above algorithm is $O(2^n \cdot MN)$. When $M = N = 256$ and $n = 8$, it is less than $2^{24}$, which is practical even on PCs.
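The per-block search described above can be sketched in Python as follows. The function name `enhance_block` and the list-based block representation are assumptions made for illustration; the sketch handles a single $2n$-pixel-block and returns the chosen swap bits, which would then be written into $Q$.

```python
from itertools import product


def enhance_block(block):
    """Exhaustively choose pair swaps minimizing the differential error.

    block: list of 2n recovered pixel values.
    Returns (enhanced_block, swap_bits), where swap_bits[i] == 1 means
    that the pair (2i, 2i+1) was swapped.
    """
    n = len(block) // 2
    best_bits = (0,) * n
    d_min = 256 * (n - 1)          # larger than any reachable error
    for bits in product((0, 1), repeat=n):
        a = list(block)
        for i, b in enumerate(bits):
            if b:
                a[2 * i], a[2 * i + 1] = a[2 * i + 1], a[2 * i]
        # Differences across the boundaries between adjacent pairs.
        d = sum(abs(a[2 * i] - a[2 * i - 1]) for i in range(1, n))
        if d < d_min:
            d_min, best_bits = d, bits
    enhanced = list(block)
    for i, b in enumerate(best_bits):
        if b:
            enhanced[2 * i], enhanced[2 * i + 1] = enhanced[2 * i + 1], enhanced[2 * i]
    return enhanced, list(best_bits)
```

On a smooth ramp in which one pair has been exchanged, the search restores the ramp, since that arrangement minimizes the accumulated differences across pair boundaries.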

For the recovered plain-image $f^{*}_{\mathrm{Peppers}}$ shown in Fig. 12c, the above algorithm has been tested with parameter $n = 8$, and the result is given in Figs. 13a and 13b. Although the enhanced plain-image has 14378 (about 21.94% of all) pixels different from the original plain-image, its visual quality is so good that no visual degradation can be distinguished. In fact, in a sense, the enhanced plain-image can be considered as a better version of the original one, since each $2n$-pixel-block of the former reaches the minimum of the accumulated differential error. From such a point of view, this optimization algorithm can also be used to enhance the visual quality of the plain-image recovered by a known-plaintext attack. For the recovered plain-image shown in Fig. 4a, the enhancing result is given in Figs. 13c and 13d. It can be seen that dithering edges existing in the plain-image shown in Fig. 4a have been polished.

In the above algorithm, most swapping operations can be distinguished by using the minimum-detecting rule on the accumulated differential error of $f_B(m)$, which means that most elements in $Q$ are correct for showing the real values of the swapping directive bits $\{b(24k + 16 + i)\}_{i=0}^{7}$. Once 32 consecutive correct elements (two 16-pixel-blocks) in $Q$ have been found, it is possible to derive $\mu$ and a chaotic state $x(k)$, as in the second known-plaintext attack.

4. Lessons learned from RCES/CKBA

From the above cryptanalysis of RCES, some principles can be suggested for the design of good image encryption schemes. Although the security of RCES and CKBA against the known/chosen-plaintext attack is very weak, they are still useful as typical carelessly designed examples to show what one should do and what one should not do.$^6$

4.1. Principle 1: Security against the known/chosen-plaintext attacks should be provided

As surveyed by Li et al. (2004), besides CKBA/RCES, many other image encryption schemes are also insecure against the known/chosen-plaintext attack. However, without resistance against the known/chosen-plaintext attacks, it will be insecure to repeatedly use the same secret key to encrypt multiple image files. When the cryptosystems are used to encrypt image streams transmitted over networks, this problem can be relaxed due to the use of time-variant session keys (Schneier, 1996). Considering that most image encryption systems are proposed to encrypt local image files, the security against the known/chosen-plaintext attacks is generally required.

4.2. Principle 2: Do not use key-invertible encryption function

Rewrite the encryption function of a symmetric cipher as $C = E(P, K)$. The function $E(\cdot, \cdot)$ is said to be key-invertible if $K$ can be derived from $C$ and $P$ with its inverse function (with respect to $K$) $E_K^{-1}(\cdot, \cdot)$, i.e., $K = E_K^{-1}(P, C)$. Most modern ciphers employ a mixture of operations defined in different groups to make the encryption function not key-invertible.

In RCES/CKBA, the encryption function is XOR, which is a key-invertible operation since $P \oplus K = C \Rightarrow K = P \oplus C$. It is the essential reason why the mask image $f_m$ can be used as an equivalent of the real key $(x(0), \mu)$. Similarly, the key-invertibility of the swapping operations is the reason for the success of the dithering-removal algorithm discussed in the chosen-plaintext attack.
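The key-invertibility of XOR can be demonstrated with a minimal Python sketch. It models a generic XOR masking step with an arbitrary keystream, not the full RCES cipher, and the function names are hypothetical.

```python
def xor_encrypt(plain, keystream):
    """C = P xor K, byte by byte (decryption is the same operation)."""
    return bytes(p ^ k for p, k in zip(plain, keystream))


def recover_keystream(plain, cipher):
    """K = P xor C: one known plaintext/ciphertext pair reveals K."""
    return bytes(p ^ c for p, c in zip(plain, cipher))
```

Once the keystream has been recovered from one known pair, any other ciphertext produced with the same keystream can be decrypted, which is the role played by the mask image $f_m$ in the attacks above.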

To enhance the security of RCES, the XOR operation can be replaced with some key-dependent functions. Another way is to replace the swapping operation with more complex long-distance permutation operations, such as the ones used by Maniccam and Bourbakis (2004), Scharinger (1998), Fridrich (1998) and Mao et al. (2004). If both operations are changed as above, the security will be further enhanced. Maniccam and Bourbakis (2004), Scharinger (1998), Fridrich (1998) and Mao et al. (2004) have suggested some typical image ciphers that use such an idea to ensure the security against the known/chosen-plaintext attacks.

4.3. Principle 3: The correlation information within the plain-image should be sufficiently reduced

As shown in the previous section, the high correlation between adjacent pixels is an important reason for the good performance of the known/chosen-plaintext attacks. In fact, there exists a large amount of correlation information within digital images, even between pixels whose distances are large, such as pixels in a smooth area. To provide sufficient security against attacks, the correlation information within the plain-image should be sufficiently concealed. A typical method to conceal the correlation information is to carry out complex long-distance permutation operations (Maniccam and Bourbakis, 2004; Scharinger, 1998; Fridrich, 1998; Mao et al., 2004). Note that the long-distance permutations are not necessary conditions, but sufficient ones, since any secure text cipher can also provide enough security for digital images.

4.4. Principle 4: Any non-uniformity existing in the cipher-images should be avoided

From a cryptographer's point of view, any non-uniformity is unwelcome due to the risk of enabling statistics-based attacks, such as the well-known differential attacks (Schneier, 1996). So, it should be carefully checked whether or not there exists any non-uniformity in the ciphertexts.

The essential reason for the insecurity of RCES/CKBA against the known/chosen-plaintext attacks can also be ascribed to the non-uniformity of the distribution of $f(l) \oplus f'(l)$ over $\{0, \ldots, 255\}$:

• for any unswapped pixel, $\mathrm{Prob}[f(l) \oplus f'(l) = \mathrm{Seed}(l)] = 1$, i.e., the distribution is as non-uniform as possible;
• for any swapped pixel, the distribution of $f(l) \oplus f'(l)$ has the same non-uniformity level as the one of $f(l) \oplus f(l + 1)$ (see the distribution of $f^{(\oplus)}_{\mathrm{Peppers}}$ shown in Fig. 5).

This also suggests that all pixels should be permuted. Actually, in the second known-plaintext attack, the feasibility of the quick-search algorithm in finding the two random seeds is due to the non-uniformity of the distribution of $\{\tilde{f}_m(16k + j)\}_{j=0}^{15}$ over the discrete set $\{0, \ldots, 127\}$. If each $\tilde{f}_m(16k + j)$ distributes uniformly over $\{0, \ldots, 127\}$, the exhaustive search algorithm will be practically impossible when the block size is changed to a sufficiently large value.

5. Conclusion

In this paper, it has been pointed out that the RCES/RSES image encryption method recently proposed in Chen et al. (2002) and Chen and Yen (2003) is not secure enough against the known/chosen-plaintext attacks, and that the security against brute-force attack was overestimated. Both theoretical and experimental analyses have been given to support the feasibility of the known/chosen-plaintext attacks. The insecurity of RCES is caused by an inappropriate design, and some principles on good design of secure image encryption schemes have been learned from the weaknesses of RCES.

Acknowledgements

Shujun Li was sponsored by the Alexander von Humboldt Foundation, Germany. The work of K.-T. Lo was supported by the Research Grants Council of the Hong Kong SAR Government under Project Number 523206 (PolyU 5232/06E).


Appendix

Here we give the proof of Proposition 2, which appeared on p. 423 of Li et al. (2004) (but with different notations).

Proposition 2. Assume that the Logistic map $x(k + 1) = \mu \cdot x(k) \cdot (1 - x(k))$ is iterated with $L$-bit fixed-point arithmetic and that $x(k + 1) \geq 2^{-n}$, where $1 \leq n \leq L$. Then, the following inequality holds: $|\mu - \tilde{\mu}_k| \leq 2^{n+3}/2^L$, where $\tilde{\mu}_k = \frac{x(k+1)}{x(k) \cdot (1 - x(k))}$.

Proof. In $L$-bit fixed-point arithmetic, $\mu$, $x(k)$, and $x(k + 1)$ all have $L$ binary decimal bits, and the quantization error of $x(k + 1)$ can be explained in the following equation:

$$\begin{aligned}
x(k + 1) &= \left(\mu \cdot x(k) + e'_{x(k+1)}\right) \cdot (1 - x(k)) + e''_{x(k+1)} \\
&= \mu \cdot x(k) \cdot (1 - x(k)) + e_{x(k+1)},
\end{aligned}$$

where $|e_{x(k+1)}| = |e'_{x(k+1)} \cdot (1 - x(k)) + e''_{x(k+1)}| \leq |e'_{x(k+1)}| + |e''_{x(k+1)}|$. Considering $|e'_{x(k+1)}|, |e''_{x(k+1)}| < 2^{-L}$ for floor/ceil quantization functions and $|e'_{x(k+1)}|, |e''_{x(k+1)}| \leq 2^{-(L+1)}$ for the round function, $|e_{x(k+1)}| < 2^{-(L-1)}$ is true in all cases. Then, the quantization error $|e_{\tilde{\mu}_k}| = |\mu - \tilde{\mu}_k|$ can be estimated as follows:

$$\begin{aligned}
|e_{\tilde{\mu}_k}| &= \left| \frac{x(k + 1) + e_{x(k+1)}}{x(k) \cdot (1 - x(k))} - \frac{x(k + 1)}{x(k) \cdot (1 - x(k))} \right| \\
&= \left| \frac{e_{x(k+1)}}{x(k + 1)} \cdot \frac{x(k + 1)}{x(k) \cdot (1 - x(k))} \right| = \frac{|e_{x(k+1)}|}{x(k + 1)} \cdot \mu \\
&< \frac{2^{-(L-1)} \cdot \mu}{x(k + 1)} \leq \frac{4}{2^{L-1} \cdot x(k + 1)} = \frac{1}{2^{L-3} \cdot x(k + 1)}.
\end{aligned}$$

When $x(k + 1) \geq 2^{-n}$ ($n = 1 \sim L$),

$$|e_{\tilde{\mu}_k}| < \frac{1}{2^{L-3} \cdot x(k + 1)} \leq \frac{2^n}{2^{L-3}} = 2^{n+3}/2^L. \tag{19}$$

This finishes the proof of the proposition. □

References

Alexopoulos, C., Bourbakis, N.G., Ioannou, N., 1995. Image encryption method using a class of fractals. J. Electron. Imaging 4 (3), 251–259.

Bhargava, B., Shi, C., Wang, S.-Y., 2004. MPEG video encryption algorithms. Multimedia Tools Applicat. 24 (1), 57–79.

Bourbakis, N.G., Alexopoulos, C., 1992. Picture data encryption using SCAN patterns. Pattern Recognit. 25 (6), 567–581.

Canniere, C.D., Lano, J., Preneel, B., 2005. Cryptanalysis of the two-dimensional circulation encryption algorithm. EURASIP J. Appl. Signal Process. 2005 (12), 1923–1927.

Chang, C.-C., Yu, T.-X., 2002. Cryptanalysis of an encryption scheme for binary images. Pattern Recognit. Lett. 23 (14), 1847–1852.

Chang, C.-C., Hwang, M.-S., Chen, T.-S., 2001. A new encryption algorithm for image cryptosystems. J. Syst. Software 58 (2), 83–91.

Chen, H.-C., Yen, J.-C., 2003. A new cryptography system and its VLSI realization. J. Syst. Architec. 49 (7–9), 355–367.

Chen, H.-C., Yen, J.-C., Guo, J.-I., 2002. Design of a new cryptography system. In: Proceedings of the PCM’2002. Lecture Notes in Computer Science, vol. 2532. Springer-Verlag, Berlin, pp. 1041–1048.

Chen, H.-C., Guo, J.-I., Huang, L.-C., Yen, J.-C., 2003. Design and realization of a new signal security system for multimedia data transmission. EURASIP J. Appl. Signal Process. 2003 (13), 1291–1305.

Cheng, H.C.H., 1998. Partial encryption for image and video communication. Master’s thesis, Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada (Fall).

Cheng, H., Li, X., 2000. Partial encryption of compressed images and videos. IEEE Trans. Signal Process. 48 (8), 2439–2451.

Chung, K.-L., Chang, L.-C., 1998. Large encryption binary images with higher security. Pattern Recognit. Lett. 19 (5–6), 461–468.

Dang, P.P., Chau, P.M., 2000. Image encryption for secure internet multimedia applications. IEEE Trans. Consumer Electron. 46 (3), 395–403.

Devaney, R.L., 1989. An Introduction to Chaotic Dynamical Systems. Addison-Wesley, Redwood City, California, USA.

Fridrich, J., 1998. Symmetric ciphers based on two-dimensional chaotic maps. Int. J. Bifurcat. Chaos 8 (6), 1259–1284.

Furht, B., Socek, D., Eskicioglu, A.M., 2004. Fundamentals of multimedia encryption techniques. In: Furht, B., Kirovski, D. (Eds.), Multimedia Security Handbook. CRC Press, LLC, Boca Raton, Florida, pp. 93–132 (Chapter 3).

Jan, J.-K., Tseng, Y.-M., 1996. On the security of image encryption method. Inf. Process. Lett. 60 (5), 261–265.

Kohda, T., Aihara, K., 1990. Chaos in discrete systems and diagnosis of experimental chaos. Trans. IEICE E73 (6), 772–783.

Li, S., 2003. Analyses and new designs of digital chaotic ciphers. Ph.D. thesis, School of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an, China, June. Available from: <http://www.hooklee.com/pub.html>.

Li, S., Zheng, X., 2002a. On the security of an image encryption method. In: Proceedings of the IEEE International Conference on Image Processing (ICIP’2002), vol. 2, pp. 925–928.

Li, S., Zheng, X., 2002b. Cryptanalysis of a chaotic image encryption method. In: Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS’2002), vol. II, pp. 708–711.

Li, S., Chen, G., Zheng, X., 2004. Chaos-based encryption for digital images and videos. In: Furht, B., Kirovski, D. (Eds.), Multimedia Security Handbook. CRC Press, LLC, pp. 133–167 (Chapter 4).

Li, C., Li, S., Zhang, D., Chen, G., 2004. Cryptanalysis of a chaotic neural network based multimedia encryption scheme. In: Advances in Multimedia Information Processing – PCM 2004 Proceedings, Part III. Lecture Notes in Computer Science, vol. 3333. Springer-Verlag, pp. 418–425.

Li, C., Li, S., Chen, G., Chen, G., Hu, L., 2005. Cryptanalysis of a new signal security system for multimedia data transmission. EURASIP J. Appl. Signal Process. 2005 (8), 1277–1288.

Li, C., Li, S., Lou, D.-C., Zhang, D., 2006. On the security of the Yen–Guo’s domino signal encryption algorithm (DSEA). J. Syst. Software 79 (2), 253–258.

Li, S., Li, C., Chen, G., Bourbakis, N.G., Lo, K.-T., 2007. A general cryptanalysis of permutation-only multimedia encryption algorithms, IACR’s Cryptology ePrint Archive: Report 2004/374. Available from: <http://eprint.iacr.org/2004/374>.

Maniccam, S.S., Bourbakis, N.G., 2004. Image and video encryption using SCAN patterns. Pattern Recognit. 37 (4), 725–737.

Mao, Y., Wu, M., 2006. A joint signal processing and cryptographic approach to multimedia encryption. IEEE Trans. Image Process. 15 (7), 2061–2075.

Mao, Y., Chen, G., Lian, S., 2004. A novel fast image encryption scheme based on 3D chaotic Baker maps. Int. J. Bifurcat. Chaos 14 (10), 3613–3624.

Pommer, A., 2003. Selective encryption of wavelet-compressed visual data. Ph.D. thesis, Department of Scientific Computing, University of Salzburg, Austria, June 2003.

Qiao, L., 1998. Multimedia security and copyright protection. Ph.D. thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.


Scharinger, J., 1998. Fast encryption of image data using chaotic Kolmogorov flows. J. Electron. Imaging 7 (2), 318–325.

Schneier, B., 1996. Applied Cryptography – Protocols, Algorithms, and Source Code in C, second ed. John Wiley & Sons Inc., New York.

Uhl, A., Pommer, A., 2005. Image and Video Encryption: From Digital Rights Management to Secured Personal Communication. Springer Science + Business Media Inc., Boston.

Wikipedia, 2007. Geometric distribution. Available from: <http://en.wikipedia.org/wiki/Geometric_Distribution>.

Wu, C.-P., Kuo, C.-C.J., 2005. Design of integrated multimedia compression and encryption systems. IEEE Trans. Multimedia 7 (5), 828–839.

Yano, K., Tanaka, K., 2002. Image encryption scheme based on a truncated Baker transformation. IEICE Trans. Fundam. E85-A (9), 2025–2035.

Yen, J.-C., Guo, J.-I., 1999. A new image encryption algorithm and its VLSI architecture. In: Proceedings of the IEEE Workshop Signal Processing Systems, pp. 430–437.

Yen, J.-C., Guo, J.-I., 2000a. Efficient hierarchical chaotic image encryption algorithm and its VLSI realisation. IEE Proc. – Vis. Image Signal Process. 147 (2), 167–175.

Yen, J.-C., Guo, J.-I., 2000b. A new chaotic key-based design for image encryption and decryption. In: Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS’2000), vol. 4, pp. 49–52.

Yen, J.-C., Guo, J.-I., 2003. The design and realization of a new domino signal security system. J. Chin. Inst. Electr. Eng. (Trans. Chin. Inst. Eng., Ser. E) 10 (1), 69–76.

Shujun Li received his B.S. degree in Information Science and Engineering,and his Ph.D. degree in Information and Communication Engineering,both from the Xi’an Jiaotong University, Xi’an, China, in 1997 and 2003,respectively. After getting his Ph.D. degree, he was doing postdoctoralresearch in the City University of Hong Kong from September 2003 toJanuary 2005, and in The Hong Kong Polytechnic University from June2005 to January 2007. Currently he is a Humboldt Research Fellow withthe FernUniversitat in Hagen, Germany. His current research interestsinclude multimedia security (mainly image and video encryption), chaoticcryptography and secure human-computer identification.

Chengqing Li was born in Xiangxiang, Hunan, China. He received his B.Sc. in Pure Mathematics from the Xiangtan University in June, 2002 and his M.Sc. in Applied Mathematics from the Zhejiang University in June, 2005. From November 2005 to February 2006, he worked as a Research Assistant at the Centre for Chaos Control and Complex Networks, City University of Hong Kong. Currently he is a Ph.D. student majoring in Electronic Engineering at the Department of Electronic Engineering, City University of Hong Kong. He has published more than 10 scientific papers (all in English and in the subject of cryptanalysis of chaotic multimedia encryption schemes).

Guanrong Chen received the M.Sc. degree in computer science from Zhongshan University, China and the Ph.D. degree in applied mathematics from Texas A&M University, College Station. Currently he is a Chair Professor and the founding director of the Centre for Chaos and Complex Networks at the City University of Hong Kong. He has (co)authored 15 research monographs and advanced textbooks, more than 400 journal papers, and about 200 refereed conference papers, published since 1981 in the fields of nonlinear system dynamics and controls. He is Honorary Professor of the Central Queensland University, Australia, and of more than ten universities in China. Prof. Chen is a Fellow of the IEEE for his fundamental contributions to the theory and applications of chaos control and bifurcation analysis. He has served and is serving as editor for eight international journals, including IEEE Transactions on Circuits and Systems, IEEE Transactions on Automatic Control, and International Journal of Bifurcation and Chaos, and received four best journal paper awards in the past.

Kwok-Tung Lo was born and raised in Hong Kong. He obtained his M.Phil. and Ph.D. degrees in Electronic Engineering from the Chinese University of Hong Kong in 1989 and 1992 respectively. Since 1992, he has been with the Hong Kong Polytechnic University, where he is now an Associate Professor at the Department of Electronic and Information Engineering. Dr. Lo is very active in research and has published over 130 papers in various international journals and conference proceedings. He is one of the authors of the book Fundamentals of Image Coding and Wavelet Compression: Principles, Algorithms and Standards published by the Tsinghua University Press. He is currently a member of the Editorial Board of Multimedia Tools and Applications and an Associate Editor of HKIE Transactions. His current research interests include multimedia signal processing, digital watermarking, multimedia communications and Internet applications.