Secure and traceable multimedia distribution for convergent Mobile TV services

Computer Communications 33 (2010) 1664–1673


Shiguo Lian a, Xi Chen b,*

a France Telecom R&D (Orange Labs) Beijing, Beijing 100080, PR China
b E-Commerce Department, Nanjing University, Nanjing 210093, PR China

Article history: Available online 16 March 2010

Keywords: Mobile TV; Secure multimedia distribution; Digital fingerprinting; Video encryption; Digital rights management

0140-3664/$ - see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.comcom.2010.03.015

* Corresponding author. E-mail addresses: [email protected] (S. Lian), [email protected] (X. Chen).

Few papers have focused on secure mobile multimedia distribution that protects both the confidentiality and the copyright of content in mobile multimedia services. In this paper, a secure multimedia distribution scheme is presented for a ubiquitous mobile content service, i.e., Mobile TV based on the converged DVB-H (Digital Video Broadcasting – Handheld) and GPRS/GSM networks. At the server side, the Joint Compression and Encryption method is proposed to encrypt video contents. At the mobile terminal side, the Joint Decryption and Fingerprinting method is proposed to decrypt video contents and simultaneously embed the mobile terminal's identification information. When the media content is illegally redistributed to public networks, such as the Internet or public TV, the proposed Fingerprint Detection and Traitor Tracing method is used to identify the illegal redistributor. To demonstrate the advantages of the proposed scheme, existing secure media distribution schemes are reviewed and comparative evaluations are carried out. The analysis and experimental results show that the proposed scheme is more suitable for secure mobile multimedia distribution. The work is expected to attract more researchers to this topic.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

With the development of multimedia and communication technologies, multimedia data (image, audio, video, text, etc.) are used more and more widely in daily life. This wide application makes multimedia content protection increasingly urgent and necessary. Several means of multimedia protection have been reported so far, among which multimedia encryption and digital watermarking are two typical ones. Multimedia encryption [1,2,25] transforms the original data into an unintelligible form, which protects confidentiality. Only an authorized customer who has the correct key can recover the data successfully. Generally, multimedia encryption algorithms should be secure not only against cryptographic attacks but also in terms of human perception. Digital watermarking [3,4] is the technique that embeds some information (e.g., copyright, ownership, integrity, etc.) into the original data by slightly modifying the data, which protects the data's copyright, ownership, integrity, etc. The embedded information can be extracted from the marked data and used to authenticate its originality. Generally, a good watermarking algorithm has properties such as imperceptibility and robustness. Imperceptibility means that the watermarked media is perceptually indistinguishable from the original media. Robustness denotes the ability to survive common signal processing such as recompression, noise addition, filtering, resizing, etc.

Existing Digital Rights Management (DRM) [37,38] and Conditional Access (CA) [39] systems aim to protect digital multimedia content and the related access rights, but ignore the traceability of contents or users. For example, during media transmission, the media data are often encrypted to resist attackers. However, the decrypted media content may be redistributed to unauthorized users (e.g., free sharing over the Internet), which often leads to great financial losses for content/service providers. For this reason, secure multimedia distribution [5,6,26,27], which transmits multimedia content from the sender to different receivers in a secure manner, is becoming more and more popular in practical applications. Generally, two properties of multimedia content need to be protected, i.e., confidentiality and traceability. Confidentiality can be protected by multimedia encryption, while traitor tracing is often realized by digital fingerprinting. Digital fingerprinting [7,8] is the technique that embeds unique customer information (e.g., a customer ID) into media content with watermarking technology. If an illegal media copy is found, the unique information can be extracted from the copy and used to trace the traitor who distributed it to unauthorized customers. A fingerprinting algorithm should be secure against the collusion attack [8], which combines several copies to produce a new copy without customer information.

Given the aims of secure multimedia distribution, techniques such as watermarking, encryption and fingerprinting need to be combined in the design. The properties associated with these different techniques must therefore all be considered; they are listed as follows.

• Secure against cryptographic attacks. The encryption algorithm should be secure against typical attacks, such as brute-force attacks, statistical attacks, differential attacks, etc. Additionally, the watermarking algorithm should resist attackers who aim to forge or remove the watermark in an unauthorized manner.
• Robust against common processing. The processing operations include typical signal processing (e.g., compression, noise addition, filtering, etc.) and intentional operations (e.g., camera capture, rotation, shifting, translation, etc.).
• Robust against collusion attacks. The embedded fingerprint code should survive operations that combine several media copies to produce a new copy. The combination operations [8] include pixel averaging, min–max pixel selection, linear combination, etc.
• Imperceptibility. The embedded fingerprint information should be invisible. That is, there is no perceptual difference between the original copy and the watermarked copy.
• Distribution efficiency. For the media sender, the load should not be so large that it cannot support a large number of concurrent users who request multimedia services. For the user, the media receiving operation should be efficient in order to guarantee real-time services.

There exist some secure multimedia distribution schemes [5,9–15]. However, they have some typical disadvantages and do not target mobile applications, e.g., mobile services such as Mobile TV and mobile image/video/music sharing. Generally, in Mobile TV, the content is encoded into low-bit-rate streams by a codec such as MPEG-4, and the service has the property of real-time interaction. The existing secure distribution schemes do not consider the Mobile TV compression process, and are not lightweight enough to reduce the delay and support real-time interaction. More details about their performance are presented in the next section.

There now exist some schemes for securing mobile video communication, e.g., Mobile TV services. Taking the ubiquitous Mobile TV [36] based on the convergence of DVB-H (Digital Video Broadcasting – Handheld) [28] and GPRS/GSM [29,30] as an example, the TV content is broadcast through the DVB-H channel, while the user interaction information is unicast to each user through the GPRS/GSM network's channel, as shown in Fig. 1. In [31], we proposed a scheme to provide secure Mobile TV services, which partially encrypts the TV program's video content during the video compression process, transmits the encrypted content through DVB-H broadcasting and distributes the encrypted access licenses to users through the GPRS/GSM channel. Additionally, the scheme was extended to convergent TV services (between Mobile TV and Home TV) by encrypting the video content in a scalable manner [32,33]. Although these schemes perform well for secure Mobile TV services, they do not consider the illegal redistribution issue.

Fig. 1. The Mobile TV service based on convergent DVB-H and GPRS/GSM.

This paper aims to propose a secure multimedia distribution scheme for converged Mobile TV services. The scheme adopts both multimedia encryption and digital fingerprinting techniques to trace illegal redistribution. The encryption operation is combined with the compression operation in order to achieve high efficiency, while the fingerprinting operation is combined with the decryption operation in order to achieve high security. To show the proposed scheme's superior properties, we analyze the performance of the existing secure distribution schemes and compare our scheme with the most similar existing one.

The rest of the paper is arranged as follows. In Section 2, the existing secure multimedia services and distribution schemes are reviewed, and a typical traceable scheme is introduced briefly. The architecture of the proposed secure distribution scheme is presented in Section 3. In Section 4, the performance of the proposed scheme is evaluated and compared with that of an existing scheme. Finally, some conclusions are drawn and future work is outlined.

2. Related work

2.1. Secure multimedia services

Digital Rights Management (DRM) and Conditional Access (CA) systems provide multimedia content protection and secure user interactions during multimedia services. Generally, different systems are designed for different application scenarios. For example, Open Mobile Alliance (OMA) DRM [37] provides the principles for managing digital rights in mobile environments. It defines the format and the protection mechanism for content and rights objects, as well as the security model for the management of encryption keys. Before the content is delivered over networks, it is securely packaged to protect it from unauthorized usage. The content issuer transmits the DRM content, and a rights issuer produces a rights object with the encryption keys. For the broadcasting environment, DVB Content Protection and Copy Management (DVB CPCM) [38] specifies the content protection and copy management of commercial digital content delivered to consumer products. CPCM is designed for protecting all types of content, including audio, video and associated applications and data. For Internet media streaming services, the Internet Streaming Media Alliance (ISMA) proposes the ISMACryp [39] standard to protect the MPEG-4 data stream at the application layer. It defines various means for data encryption and authentication, and is capable of achieving end-to-end security.

Fig. 2. Lemma's method for secure content distribution. (Sender: C = P + αM. Receiver: αW − αM is added to C, giving P' = P + αW.)

In convergent environments, the corresponding digital rights management or conditional access systems are also combined. For example, in Mobile TV services based on convergent DVB-H and GPRS, a new secure media transmission system [31] is constructed by combining DVB CPCM and OMA DRM. Additionally, in the convergent Mobile TV and Home TV environments, the DRM system is adapted from Mobile TV to Home TV [32,40].

These DRM or CA systems do not consider the traceability of the consumed media content. This paper addresses the traceability issue in the first case, i.e., Mobile TV over converged DVB-H and GPRS networks. Since other issues, such as key management, secure interaction protocols and business models, have been addressed in [31,32], this paper focuses only on media content encryption and tracing.

2.2. Secure multimedia distribution

Several secure multimedia distribution schemes have been proposed so far, which can be classified into three types. The first type [9,10] embeds customer information into the media data and then encrypts the fingerprinted media data at the sender side. In this scheme, the sender must produce a different copy for each receiver, which places a high load on the sender. Thus, it is more suitable for unicast networks than for broadcast or multicast networks. The second type [11] embeds customer information into the media data at intermediate nodes in the network. This scheme reduces the sender's load, but changes the network protocol. Thus, the scheme is not compliant with common networks. The third type [12] embeds the receiver information into the media content at the receiver side. This scheme moves the load from the sender to the receivers, which reduces the sender's load greatly. However, security is critical in this case. If the media content is first decrypted and then fingerprinted, the clear content may leak out through the gap between decryption and fingerprint embedding. For example, in video-on-demand services, if the media content is decrypted, fingerprinted and played back as independent steps in the media player, the plain media may be stolen from the display buffer.

To improve the security of the third type of distribution scheme, some means have been presented. In the Chameleon scheme [13], the image is encrypted with a codebook at the sender side, and decrypted with different new codebooks at the receiver side. Each new codebook is slightly different from the original codebook, i.e., only the least significant bits (LSBs) of the codewords in the original codebook are changed, and these changes carry the unique information. In this scheme, the encryption operation is secure, while the fingerprinting operation is not secure against collusion attacks. Additionally, the fingerprint is only embedded into LSBs, which is not robust against common signal processing. In Lian et al.'s scheme [14], the variable-length codes of MPEG-2 video are encrypted by codeword scrambling at the sender side, and they are decrypted into adjacent variable-length codes under the control of both the decryption key and the fingerprinting key. The original codes and the decrypted codes are adjacent, and their differences carry the unique information. In this scheme, the encryption operation is secure, while the fingerprint is not robust against common signal processing. In Kundur's scheme [5], the signs of the DCT coefficients in an image are encrypted at the sender side, and only part of the signs are decrypted at the receiver side. The positions of the undecrypted signs, together with the signs, determine the customer information. However, in this scheme, only the signs are encrypted, and thus the encrypted content is still intelligible. Additionally, the scheme is not secure against collusion attacks. In Lemma et al.'s scheme [15], the video content is encrypted with an additive operation under the control of a key sequence, and decrypted by a subtraction operation under the control of both the key sequence and the fingerprint sequence. In this scheme, the additive operation causes overflow in practical implementations, so the key sequence or fingerprint sequence often has a small amplitude in order to reduce data overflows. Additionally, media content encrypted by low-strength additive encryption may still be intelligible, and thus the scheme is not secure against cryptographic attacks. Furthermore, its security against collusion attacks has not been analyzed.

2.3. A typical traceable scheme

In Lemma et al.'s scheme [15], as shown in Fig. 2, the media data $P$ are encrypted into $C$ with an additive operation at the server side, and decrypted into $P'$ with an additive operation at the customer side. The encryption and decryption processes are defined as

\[
\begin{cases}
C = P + \alpha M \\
P' = C + (\alpha W - \alpha M)
\end{cases}
\tag{1}
\]

Here, $M$ is the sequence for encryption, $W$ is the fingerprint sequence, and $\alpha$ is the adjustment factor for fingerprint embedding. Generally, $M$'s amplitude is big enough to change the content of the plain media $P$, while $W$'s amplitude is small enough to keep the embedded fingerprint sequence imperceptible. The embedded fingerprint can be used to trace the media copy or the illegal distributor.
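Eq. (1) can be checked with a small sketch. This is only an illustration under stated assumptions, not the implementation of [15]: the function names `lemma_encrypt` and `lemma_decrypt`, the toy value ranges, the integer factor `alpha = 2` and the seeded `random.Random` generator (standing in for a keyed generator) are all hypothetical choices made here.

```python
import random

def lemma_encrypt(P, M, alpha):
    """Sender side of Eq. (1): C = P + alpha*M."""
    return [p + alpha * m for p, m in zip(P, M)]

def lemma_decrypt(C, M, W, alpha):
    """Receiver side of Eq. (1): P' = C + (alpha*W - alpha*M)."""
    return [c + (alpha * w - alpha * m) for c, m, w in zip(C, M, W)]

# Toy data; ranges and seed are assumptions made for this illustration only.
rng = random.Random(0)
n = 8
P = [rng.randrange(0, 256) for _ in range(n)]      # plain media samples
M = [rng.randrange(-128, 128) for _ in range(n)]   # large-amplitude encryption sequence
W = [rng.choice((-1, 1)) for _ in range(n)]        # small-amplitude fingerprint sequence
alpha = 2

C = lemma_encrypt(P, M, alpha)
P_marked = lemma_decrypt(C, M, W, alpha)

# The encryption sequence cancels; only the fingerprint term alpha*W remains.
residual = [pm - p for pm, p in zip(P_marked, P)]
```

In this sketch `residual` equals `alpha * W` element by element, which is the identity $P' = P + \alpha W$ shown in Fig. 2. Note that nothing here bounds `C`, which is where the overflow problem discussed in this section comes from.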

This scheme has some apparent disadvantages.

Firstly, the encryption process based only on the addition operation is not secure against cryptographic attacks [16]. A simple attack uses blind source separation [17]. Since $M$ is a random sequence, in contrast to the natural sequence $P$, and addition is a linear operation, it is easy to recover $M$ and $P$ from the additive result $C$. Thus, the scheme is vulnerable to ciphertext-only attacks.

Secondly, the encryption or decryption operation causes overflows in practical implementations. Generally, $M$ has an amplitude similar to that of $P$. Since the pixels in $P$ lie in a fixed range, the encryption operation makes the resulting pixels overflow. Furthermore, the decryption operation also causes overflows and can even degrade the decrypted media's quality substantially, which deprives the decrypted media of commercial value.

Thirdly, the fingerprint's robustness against the most important attack, the collusion attack, is neither considered nor evaluated, which limits practical application.
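To make the overflow problem in the second point concrete, consider a small numerical illustration. The values are hypothetical and chosen only for this example, and saturation (clipping) of out-of-range values is an assumption about the implementation, not something specified in [15]. With samples stored in $[0, L-1]$ and $L = 2048$:

\[
p_i = 2000,\quad \alpha m_i = 100 \;\Rightarrow\; c_i = p_i + \alpha m_i = 2100 > L - 1 = 2047 .
\]

If $c_i$ is clipped to 2047, the receiver obtains

\[
p'_i = 2047 + (\alpha w_i - \alpha m_i) = 1947 + \alpha w_i ,
\]

which differs from the intended value $2000 + \alpha w_i$ by 53, far more than a small fingerprint amplitude.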

In the following, we propose a secure distribution scheme with better performance, which in particular overcomes the disadvantages of Lemma et al.'s scheme.

According to the above analyses, the existing secure distribution schemes are not suitable for practical applications because of their disadvantages in security, imperceptibility, robustness and distribution efficiency. In this paper, a secure video distribution scheme is proposed, which aims to avoid the disadvantages of Lemma et al.'s scheme. Firstly, a preprocessing step is applied to the data sequence composed of the extracted parameters in order to avoid overflow. Secondly, more parameters are encrypted, and modular addition is used to encrypt the data sequence in order to improve the scheme's security. Thirdly, the fingerprint's robustness against collusion attacks is analyzed and improved.

3. The proposed secure distribution scheme for Mobile TV

Taking Mobile TV as an example, the proposed secure distribution scheme is shown in Fig. 3. At the server side, the TV program is compressed and encrypted by the Joint Compression and Encryption method, and the Content Key is encrypted before being sent to a mobile terminal. At the terminal side, the media content is decrypted by the Joint Decryption and Fingerprinting method after the Content Key is decrypted. If the mobile user redistributes the decrypted TV program to public networks, such as the Internet or public TV, the fingerprint information will be detected, and thus the illegal redistributor can be identified. In this scheme, the encryption/decryption of the Content Key is similar to the method presented in [31,32]. This paper's contributions focus on the Joint Compression and Encryption method, the Joint Decryption and Fingerprinting method, and the Fingerprint Detection and Traitor Tracing method, which are presented in detail below. The scheme can be extended to other mobile applications, such as image/video/music sharing.

3.1. Joint Compression and Encryption

Since multimedia data, e.g., images or videos, contain some redundancy, they are often compressed in order to reduce the cost of storage or transmission. To remain compliant with communication requirements, encryption schemes [18,19] that take compression into account are preferred; these encrypt some sensitive parameters during compression. MPEG-4 ASP (Advanced Simple Profile) [20] is a typical standard for video compression. In this standard, video frames are classified into three types, i.e., I-frames, P-frames and B-frames. An I-frame is partitioned into 16 × 16 macroblocks, each macroblock is transformed by the 8 × 8 DCT, and each DCT block's DC and AC coefficients are encoded by quantization and variable-length coding (VLC). In a P-frame or B-frame, some macroblocks, named intra macroblocks, are encoded with a method similar to that used for the macroblocks in an I-frame, while other macroblocks, named inter macroblocks, are encoded by referencing the adjacent I-frame or P-frame. For the inter macroblocks, the motion vector difference (MVD) is encoded with VLC, and the residue data are encoded by quantization and VLC.

Fig. 3. Architecture of the proposed secure distribution scheme.

In the proposed Joint Compression and Encryption method, as shown in Fig. 4, the parameters including the DC, the ACs' signs and the MVDs' signs are all encrypted. Additionally, the DC is first preprocessed in order to avoid overflow, and then encrypted by sequence encryption based on modular addition. This differs from Lemma et al.'s scheme, which encrypts only the DC of each block by an additive operation. In the proposed encryption process, the original video is pre-encoded with operations such as color space transformation, motion estimation and compensation, DCT and quantization; then the data are partitioned into three parts, i.e., ACs, MVDs and DC. The ACs and MVDs are encrypted by sign encryption, and the DC is preprocessed and encrypted by sequence encryption. After the encryption operations, the produced data are post-encoded by operations such as VLC. In the decryption process, the video data are first decoded by post-decoding operations such as VLC decoding. Then, the signs of the ACs and MVDs are decrypted by sign decryption, and the DC is decrypted by Joint Decryption and Fingerprinting. Finally, the parameters are pre-decoded by operations such as inverse quantization, IDCT and inverse color space transformation. The proposed encryption and decryption operations, including sign encryption, preprocessing, sequence encryption and sign decryption, are presented below, and the Joint Decryption and Fingerprinting method is presented in the next subsection.

Fig. 4. Architecture of the proposed encryption/decryption scheme.

3.1.1. Sign encryption/decryption

Sign encryption [18,19,34,35] is often used in multimedia encryption algorithms that combine encryption and encoding. Generally, changing the signs of DCT coefficients and MVDs degrades the media content's intelligibility greatly, while changing the length of the variable-length codes only slightly. In the proposed scheme, as shown in Fig. 5, the sign sequence is extracted from the ACs and MVDs, then encrypted by a stream cipher, and finally returned to the corresponding ACs and MVDs. In the extracted sign sequence, if the AC or MVD is positive, the sign bit is '0'; otherwise, the sign bit is '1'. When the sign sequence is returned, if the sign bit is '0', the AC or MVD is positive; otherwise, the AC or MVD is negative. The decryption process is symmetric to the encryption process.
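The sign-encryption steps above (extract the sign bits, encrypt them with a stream cipher, return them to the coefficients) can be sketched as follows. This is a minimal illustration under assumptions: the keystream built from SHA-256 in counter mode is only a stand-in for a real stream cipher and is not the cipher used in the paper, and the names `keystream_bits` and `sign_crypt` are hypothetical.

```python
import hashlib

def keystream_bits(key: bytes, n: int):
    """Toy keystream: SHA-256 in counter mode (an assumed stand-in for a stream cipher)."""
    bits = []
    counter = 0
    while len(bits) < n:
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        for byte in block:
            for k in range(8):
                bits.append((byte >> k) & 1)
        counter += 1
    return bits[:n]

def sign_crypt(coeffs, key: bytes):
    """Encrypt or decrypt coefficient signs; XOR with the keystream is its own inverse."""
    ks = keystream_bits(key, len(coeffs))
    out = []
    for c, k in zip(coeffs, ks):
        sign_bit = 0 if c > 0 else 1          # '0' = positive, '1' = negative
        new_bit = sign_bit ^ k                # stream-cipher encryption of the sign bit
        magnitude = abs(c)                    # magnitudes are never changed
        out.append(magnitude if new_bit == 0 else -magnitude)
    return out

coeffs = [12, -3, 0, 7, -1, 5]                # toy ACs/MVDs; zero has no sign and stays zero
enc = sign_crypt(coeffs, b"demo key")
dec = sign_crypt(enc, b"demo key")            # same call decrypts
```

Because only the sign bits are touched, the coefficient magnitudes are preserved, and applying the same function with the same key restores the original coefficients.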

3.1.2. Preprocessing

Let $P = p_0, p_1, \ldots, p_{n-1}$ ($0 \le p_i < L$; $i = 0, 1, \ldots, n-1$; $n > 0$) be the original $n$-length DC sequence and $P' = p'_0, p'_1, \ldots, p'_{n-1}$ ($0 \le p'_i < L$; $i = 0, 1, \ldots, n-1$; $n > 0$) the preprocessed $n$-length DC sequence. In preprocessing, the original DC sequence $P$ is processed into a new sequence $P'$ according to the maximal DC amplitude and the fingerprint amplitude, in order to avoid overflow in the subsequent encryption or decryption operation. The preprocessing is defined as

\[
p'_i =
\begin{cases}
Q, & p_i < Q \\
p_i, & Q \le p_i < L - Q \\
L - Q, & p_i \ge L - Q
\end{cases}
\tag{2}
\]

Here, $L$ and $Q$ are the maximal amplitudes of the DC sequence and the fingerprint sequence, respectively. As can be seen, the preprocessing operation changes the pixels from the range $[0, L-1]$ to $[Q, L-Q-1]$.
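A minimal sketch of the preprocessing step, following the three cases of Eq. (2) literally. The function name `preprocess` and the sample values are assumptions made for illustration; $L = 2048$ and $Q = 3$ are the values used later in the experiments.

```python
def preprocess(P, L, Q):
    """Eq. (2): clamp each DC value so that adding a fingerprint value cannot overflow."""
    out = []
    for p in P:
        if p < Q:
            out.append(Q)            # first case of Eq. (2)
        elif p < L - Q:
            out.append(p)            # second case: value left unchanged
        else:
            out.append(L - Q)        # third case of Eq. (2)
    return out

L, Q = 2048, 3
P = [0, 2, 3, 1000, 2044, 2045, 2047]   # toy DC values in [0, L-1]
P_prime = preprocess(P, L, Q)
```

With these inputs, every preprocessed value plus any fingerprint value $y$ with $-Q \le y < Q$ stays inside $[0, L-1]$, which is the property the later modular decryption relies on.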

3.1.3. Sequence encryption

Let $C = c_0, c_1, \ldots, c_{n-1}$ ($0 \le c_i < L$; $i = 0, 1, \ldots, n-1$; $n > 0$) be the encrypted $n$-length DC sequence and $R = r_0, r_1, \ldots, r_{n-1}$ ($0 \le r_i < L$; $i = 0, 1, \ldots, n-1$; $n > 0$) the random sequence generated from the encryption key. In sequence encryption, the preprocessed DC sequence $P'$ is encrypted by the random sequence $R$ according to the following operation:

\[
c_i = (p'_i + r_i) \bmod L
\tag{3}
\]

Fig. 5. The process of sign encryption. (Original ACs and MVDs → extract sign sequence → stream cipher → return sign sequence → encrypted ACs and MVDs.)

Here, the sequence is encrypted pixel by pixel, and $i = 0, 1, \ldots, n-1$.
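Eq. (3) amounts to a modular-addition stream cipher over the DC sequence. The sketch below is illustrative only: `sequence_encrypt` and `sequence_decrypt` are hypothetical names, and the seeded `random.Random` generator is a placeholder for the keyed cryptographic generator that the scheme requires; it is not secure.

```python
import random

def sequence_encrypt(P_prime, R, L):
    """Eq. (3): c_i = (p'_i + r_i) mod L."""
    return [(p + r) % L for p, r in zip(P_prime, R)]

def sequence_decrypt(C, R, L):
    """Plain inverse of Eq. (3), i.e., decryption without fingerprint embedding."""
    return [(c - r) % L for c, r in zip(C, R)]

L = 2048
rng = random.Random(2010)                  # placeholder for a keyed generator (assumption)
P_prime = [3, 3, 1000, 2044, 2045]         # preprocessed DC values
R = [rng.randrange(L) for _ in P_prime]    # random sequence with 0 <= r_i < L
C = sequence_encrypt(P_prime, R, L)
```

Unlike plain addition, the modular reduction keeps every ciphertext value inside $[0, L-1]$, so encryption itself never overflows.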

3.2. Joint Decryption and Fingerprinting

The decryption process, which joins decryption and fingerprint embedding, is composed of several steps.

Firstly, from the $f$th receiver's fingerprint key, the fingerprint sequence is generated by random number generation, $X^f = x^f_0, x^f_1, \ldots, x^f_{m-1}$ ($-Q \le x^f_i < Q$; $i = 0, 1, \ldots, n-1$; $n \ge m > 0$). Here, $Q$ ($Q > 0$) is the maximal amplitude of the fingerprint sequence.

Secondly, the fingerprint sequence $X^f$ is expanded from $m$-length to $n$-length, which produces the new fingerprint sequence $Y^f = y^f_0, y^f_1, \ldots, y^f_{n-1}$ ($-Q \le y^f_i < Q$; $i = 0, 1, \ldots, n-1$; $n > 0$). The expansion is based on the pixel set $S = \{s_0, s_1, \ldots, s_{m-1}\}$ ($0 \le s_i < n$; $i = 0, 1, \ldots, m-1$), which is composed of the corresponding positions of the $m$ pixels in $Y^f$. The expansion operation is defined as

\[
y^f_i =
\begin{cases}
x^f_i, & i \in S \\
0, & i \notin S
\end{cases}
\tag{4}
\]
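The expansion step can be sketched as below. One assumption is made loudly here: Eq. (4) is read as placing the $j$th fingerprint value at position $s_j$ (that is, $y^f_{s_j} = x^f_j$) and zero elsewhere, which is one plausible interpretation of the notation. The function name `expand_fingerprint` and the sample values are hypothetical.

```python
def expand_fingerprint(X, S, n):
    """Eq. (4), read as y[s_j] = x_j for each position s_j in S, and 0 elsewhere."""
    if len(X) != len(S):
        raise ValueError("X and S must both have length m")
    Y = [0] * n
    for x, s in zip(X, S):
        Y[s] = x                     # fingerprint value placed at its embedding position
    return Y

X = [2, -3, 1]                       # m = 3 fingerprint values in [-Q, Q) with Q = 3
S = [0, 4, 5]                        # embedding positions, each in [0, n)
Y = expand_fingerprint(X, S, n=8)
```

Positions outside `S` carry no fingerprint, so the corresponding DC values are recovered unchanged after decryption.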

Thirdly, the random sequence $R$ is combined with the $f$th receiver's fingerprint sequence $Y^f$ through a subtraction operation, which produces the combined sequence $R'' = r''_0, r''_1, \ldots, r''_{n-1}$ ($0 \le r''_i < L$; $i = 0, 1, \ldots, n-1$; $n > 0$):

\[
r''_i = r_i - \alpha_i y^f_i
\tag{5}
\]

Here, $U = \alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ ($0 \le \alpha_i \le 1$; $i = 0, 1, \ldots, n-1$) is the embedding strength of the fingerprint sequence. Generally, the combined sequence $R''$ can be generated at the sender side and transmitted from the sender to the receiver in a secure manner. Alternatively, in a trusted computing component such as a set-top box, the generation of $R''$ can be implemented at the receiver side.

Finally, the encrypted DC sequence $C$ is decrypted into $P^f$ by the combined sequence $R''$ according to

\[
p^f_i = (c_i - r''_i) \bmod L
\tag{6}
\]

Here, the DC sequence is decrypted pixel by pixel, and $i = 0, 1, \ldots, n-1$.

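According to Eqs. (3), (5) and (6), we have

\[
\begin{aligned}
p^f_i &= (c_i - r''_i) \bmod L = (c_i - r_i + \alpha_i y^f_i) \bmod L \\
&= [(p'_i + r_i) \bmod L - r_i + \alpha_i y^f_i] \bmod L \\
&= (p'_i + r_i - r_i + \alpha_i y^f_i) \bmod L = (p'_i + \alpha_i y^f_i) \bmod L
\end{aligned}
\tag{7}
\]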

Fig. 6, panels (a) and (b): (a) the DC sequence encrypted by Lemma et al.'s scheme; (b) the DC sequence recovered by blind source separation.


According to Eqs. (2) and (7), we have $Q \le p'_i \le L - Q - 1$, and thus

\[
p^f_i = (p'_i + \alpha_i y^f_i) \bmod L = p'_i + \alpha_i y^f_i
\tag{8}
\]

Thus, the decrypted DC sequence $P^f$ contains the $f$th customer's fingerprint sequence $Y^f$.
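The chain of Eqs. (3), (5), (6) and (8) can be checked numerically with a short sketch. It is an illustration under assumptions rather than the paper's implementation: `combine` and `joint_decrypt` are hypothetical names, the seeded generator stands in for a keyed one, and $\alpha_i = 1$ is used (as in the experiments) so that all values stay integers.

```python
import random

def combine(R, Y, alpha):
    """Eq. (5): r''_i = r_i - alpha_i * y_i."""
    return [r - a * y for r, y, a in zip(R, Y, alpha)]

def joint_decrypt(C, R2, L):
    """Eq. (6): p^f_i = (c_i - r''_i) mod L; decryption and embedding in one step."""
    return [(c - r2) % L for c, r2 in zip(C, R2)]

L, Q, n = 2048, 3, 6
rng = random.Random(7)                          # placeholder for a keyed generator (assumption)
P_prime = [3, 500, 1200, 2045, 3, 2045]         # DC values already preprocessed by Eq. (2)
R = [rng.randrange(L) for _ in range(n)]        # random sequence, 0 <= r_i < L
Y = [-3, 2, 0, 2, 0, -3]                        # expanded fingerprint, -Q <= y_i < Q
alpha = [1] * n                                 # embedding strength alpha_i = 1

C = [(p + r) % L for p, r in zip(P_prime, R)]   # Eq. (3) at the server side
P_f = joint_decrypt(C, combine(R, Y, alpha), L) # terminal side
```

The receiver only ever handles `C` and the combined sequence, so the unmarked sequence `P_prime` never appears at the terminal, while `P_f` equals `P_prime` plus the fingerprint, as Eq. (8) states.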

3.3. Fingerprint Detection and Traitor Tracing

In detection, the correlation value between the DC sequence and the fingerprint sequence is computed and compared with a decision threshold to decide whether the fingerprint exists. For example, to detect whether the fingerprint $Y^k$ exists in the $f$th media copy's DC sequence $P^f$, the correlation value $\langle P^f, Y^k \rangle$ is computed according to

\[
\langle P^f, Y^k \rangle = \left( \sum_{i=0}^{n-1} p^f_i y^k_i \right) \Bigg/ \left( \sum_{i=0}^{n-1} (y^k_i)^2 \right)
\tag{9}
\]

Then, the decision is

\[
\begin{cases}
Y^k \text{ exists in } P^f, & \langle P^f, Y^k \rangle \ge T \\
Y^k \text{ does not exist in } P^f, & \langle P^f, Y^k \rangle < T
\end{cases}
\tag{10}
\]
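A minimal sketch of the detector in Eqs. (9) and (10) follows. The function names are hypothetical, and the test data are deliberately contrived (a constant host sequence and two zero-sum, mutually orthogonal fingerprints) so that the host contributes nothing to the correlation; real DC sequences would add interference that the threshold $T$ has to absorb. $T = 0.1$ is the value used in Section 4.3.1.

```python
def correlation(P_f, Y_k):
    """Eq. (9): sum(p_i * y_i) / sum(y_i ** 2)."""
    energy = sum(y * y for y in Y_k)
    if energy == 0:
        raise ValueError("fingerprint sequence is all zeros")
    return sum(p * y for p, y in zip(P_f, Y_k)) / energy

def fingerprint_present(P_f, Y_k, T):
    """Eq. (10): the fingerprint is declared present when the correlation reaches T."""
    return correlation(P_f, Y_k) >= T

# Contrived data (assumption): constant host, orthogonal zero-sum fingerprints.
host = [1000] * 8
Y_a = [1, -1, 1, -1, 1, -1, 1, -1]      # fingerprint embedded in this copy
Y_b = [1, 1, -1, -1, 1, 1, -1, -1]      # fingerprint of a different customer
P_f = [h + y for h, y in zip(host, Y_a)]
T = 0.1

found_a = fingerprint_present(P_f, Y_a, T)
found_b = fingerprint_present(P_f, Y_b, T)
```

Tracing a leaked copy then amounts to evaluating the detector for each registered customer's fingerprint and reporting those whose correlation reaches the threshold.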

Fig. 6. Security against blind source separation attack. Panels (c) and (d): (c) the fingerprint sequence recovered by blind source separation; (d) the DC sequence encrypted by the proposed method.

4. Performance evaluation

In the following, several performance aspects of the proposed scheme are evaluated and compared with Lemma et al.'s scheme, including the security of the encryption algorithm, the imperceptibility of the fingerprint and the robustness of the fingerprint.

4.1. Security of the encryption algorithm

For multimedia encryption, the security depends on two aspects [18], i.e., cryptographic security and perceptual security.

4.1.1. Cryptographic security

Cryptographic security denotes the encryption algorithm's security against cryptographic attacks [16]. In the proposed algorithms, the ACs and MVDs are encrypted by an existing stream cipher, and the DCs are encrypted by a stream cipher based on modular addition. According to the properties of stream ciphers [21], e.g., AES CTR [22] or RC4 [23], the security depends on the randomness of the random sequences. Thus, existing random sequence generators [16,21,23] can be used to ensure the system's security. For DC encryption, the stream cipher based on modular addition is more secure than the additive operation in Lemma et al.'s scheme. Taking the DC sequence in the video clip Tempete (CIF) as an example, the blind source separation method [17] is used to recover the DC sequence and the fingerprint sequence under the condition of knowing only the ciphertext, as shown in Fig. 6. According to the properties of the additive operation, it is easy to obtain the approximate DC sequence (b) and fingerprint sequence (c) from the DC sequence (a) that is encrypted by Lemma et al.'s scheme. However, it is difficult to obtain them from the DC sequence (d) that is encrypted by the proposed scheme, since modular addition is a nonlinear operation that increases the randomness of the encrypted DC sequence. Thus, compared with Lemma et al.'s scheme, the proposed scheme is more secure against such ciphertext-only attacks.

It should be noted that, according to the security requirements of stream ciphers [21], the random sequence $R$ must be kept long enough to ensure security. Thus, in the proposed scheme, we recommend using a different encryption key for each video sequence.

4.1.2. Perceptual security

Perceptual security denotes the unintelligibility of the encrypted video content [18]. In the proposed scheme, ACs, MVDs and DCs are all encrypted. Compared with Lemma et al.'s scheme, which encrypts only DCs, the proposed scheme degrades the quality of the encrypted video much more. Fig. 7 shows the encryption results for two video clips, i.e., Salesman (QCIF) and Tempete (CIF). Here, $L = 2048$. As can be seen, the video clips encrypted by the proposed scheme are more confused than the ones encrypted by Lemma et al.'s scheme, and the peak signal-to-noise ratio (PSNR) provides the objective metric. Experiments on other clips give similar results. Thus, the proposed scheme achieves higher perceptual security.

4.2. Imperceptibility of the fingerprint

In the proposed scheme, the fingerprint is embedded into the DC sequence with additive embedding. This is similar to the method used in Lemma et al.'s scheme. The difference is that overflow is avoided in the proposed scheme. As analyzed in Section 2, the additive operation in Lemma et al.'s scheme causes overflow in DC pixels and thus degrades the decrypted video. In contrast, in the proposed scheme, preprocessing is introduced to avoid overflow. The maximal degradation of a DC pixel is $2Q$, which only happens when the pixel lies outside $[Q, L-Q-1]$. Compared with Lemma et al.'s scheme, the degradation is small enough to keep the decrypted video content of high quality. Fig. 8 shows the comparison results for two video clips, i.e., Foreman (30 frames/s, CIF size) and Tempete (30 frames/s, CIF size). Here, $\alpha_i = 1$ ($i = 0, 1, \ldots, n-1$), $L = 2048$, $Q = 3$, $m = n/2$, and the PSNRs of the unencrypted video, the video encrypted by Lemma et al.'s scheme and the one encrypted by the proposed scheme are computed. As can be seen, the proposed scheme causes a degradation of no more than 0.5 dB, while Lemma et al.'s scheme often causes a degradation of more than 0.5 dB. Additionally, by reducing $Q$, the degradation of the proposed scheme can be reduced, as shown in Fig. 9 (the compression bit rate is 2 Mbps). Fig. 10 gives some of the fingerprinted video frames. Generally, when $Q \le 3$, the degradation is so small that the fingerprint is imperceptible.

Fig. 7. Some results of video encryption.

[Fig. 8 plot: PSNR (dB) versus bit rate (kbps), 400 to 2000 kbps; curves: Foreman-Original, Foreman-Proposed, Foreman-Lemma, Tempete-Original, Tempete-Proposed, Tempete-Lemma.]

Fig. 8. Comparison of different schemes’ imperceptibility.

4.3. Robustness of the fingerprint

4.3.1. Common signal processing

The fingerprint’s robustness against some of the signal processing operations provided in StirMark [24] is tested, and the results are shown in Table 1. Here, T = 0.1, 20 video clips are used in the experiments, the parameters are the same as those used in Section 4.2, and the correct detection rate corresponding to different embedding strengths is tested. The correct detection rate is the ratio between the number of fingerprints that are correctly detected and the total number of video clips, 20. For MPEG-4 recompression, the video bit rate is converted from 1.5 Mbps to 768 kbps or 384 kbps. Generally, when Q is no smaller than 3, the fingerprint is robust against most of the tested signal processing operations, especially against MPEG-4 recompression. This property benefits from the additive embedding in the frequency domain. However, against such desynchronization attacks as rotation and

[Fig. 9 plot: PSNR (dB) versus embedding strength Q (0 to 7); curves: Foreman (CIF), Tempete (CIF).]

Fig. 9. Relation between imperceptibility and embedding strength Q.

Fig. 10. Results of some fingerprinted video frames.

Table 1
Robustness against some common signal processing in StirMark 4.0. (Correct detection rate is tested.)

Q    MPEG-4        MPEG-4        Additive noise  Additive noise  Rescaling        Rescaling
     (768 kbps)    (384 kbps)    (factor = 1)    (factor = 3)    (factor = 200%)  (factor = 50%)
1    90%           60%           100%            70%             85%              90%
2    95%           80%           100%            100%            100%             100%
3    100%          95%           100%            100%            100%             100%
4    100%          100%          100%            100%            100%             100%
5    100%          100%          100%            100%            100%             100%


temporal resampling (changed from 30 to 20 fps), this scheme is not robust, which is caused by the simple embedding in DC coefficients.
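The correct detection rate defined above can be illustrated with a toy simulation. This sketch is not the detector or the StirMark test used in the experiments: it assumes a non-blind detector that works on the difference between the suspect and the original DC sequence, a ±1 fingerprint scaled by Q, and uniform additive noise as a stand-in for an attack. Only T = 0.1, Q = 3 and the 20 clips come from the text; N, the noise model and all function names are assumptions.

```python
import random

T = 0.1       # detection threshold from the experiments
Q = 3         # embedding strength
N = 16384     # assumed number of fingerprinted DC coefficients per clip
CLIPS = 20    # number of test clips


def correlation(difference, fingerprint, q=Q):
    """Normalized correlation between (suspect - original) and a fingerprint."""
    return sum(d * f for d, f in zip(difference, fingerprint)) / (q * len(fingerprint))


def detection_rate(noise_amplitude, seed=0):
    """Fraction of clips whose embedded fingerprint is still detected."""
    rng = random.Random(seed)
    detected = 0
    for _ in range(CLIPS):
        fingerprint = [rng.choice((-1, 1)) for _ in range(N)]
        # Attacked difference signal: scaled fingerprint plus uniform noise.
        difference = [Q * f + rng.uniform(-noise_amplitude, noise_amplitude)
                      for f in fingerprint]
        if correlation(difference, fingerprint) >= T:
            detected += 1
    return detected / CLIPS
```

Because the noise in this model stays synchronized with the fingerprint, the correlation remains close to 1 and detection succeeds. The sketch therefore does not reproduce the failures against rotation and temporal resampling, which desynchronize the coefficients.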

4.3.2. Collusion resistance

In collusion attacks, different receivers combine their copies to produce a new copy without the embedded fingerprint sequences. Common combination operations include averaging, min-collusion, linear combination, etc. Averaging means averaging the different copies, min-collusion means forming each video frame from the minimal pixel values across the different copies, and linear combination means subtracting the sum of the remaining k copies from the sum of the first k + 1 copies. In the proposed scheme, Fingerprint Detection is composed of two steps, i.e., computing the correlation value and comparing it with a threshold. Generally, the threshold is fixed, while the correlation value changes with the number of colluders. Here, for Lemma et al.’s scheme and the proposed scheme, taking the video clip Foreman (CIF) as an example, the correlation values after averaging, min-collusion and linear combination are tested and shown in Fig. 11. Here, a_i = 1 (i = 0, 1, ..., n − 1), L = 2048, Q = 3 and m = n/2. As can be seen, after averaging or min-collusion, both schemes’ correlation values decrease as the number of colluders rises. Thus, the schemes’ robustness against averaging or min-collusion decreases with the number of colluders. After linear combination, both schemes’ correlation values remain nearly unchanged with the number of colluders. Thus, the schemes are robust against linear combination. These properties follow from two facts: the fingerprints for different customers are independent of each other, and additive embedding is adopted. Furthermore, the correlation values of Lemma et al.’s scheme are often smaller than those of the proposed scheme. The reason is that the overflow in Lemma et al.’s scheme alters the fingerprinted pixels to some extent, and these changes affect the correlation values.

Taking 20 video clips as an example, the correct detection rate against averaging collusion is tested and shown in Table 2. Here, a_i = 1 (i = 0, 1, ..., n − 1), L = 2048, Q = 3, m = n/2 and T = 0.1. As can be seen, the correct detection rate decreases as the number of colluders rises. Additionally, for the same number of colluders, the proposed scheme obtains a higher detection rate than Lemma et al.’s scheme does. Thus, the proposed scheme is more robust against collusion attacks than Lemma et al.’s scheme.
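The three collusion operations and the correlation step of Fingerprint Detection can be sketched on synthetic data. This is a simplified model and not the experimental setup: it assumes independent ±1 fingerprints added with strength Q to a DC sequence, a non-blind detector, and no overflow or compression. The helper names and N are hypothetical.

```python
import random

Q = 3        # embedding strength
N = 16384    # assumed number of fingerprinted DC coefficients
T = 0.1      # detection threshold


def make_copies(original, users, rng):
    """Return (fingerprints, copies) with copy_j = original + Q * fingerprint_j."""
    fingerprints = [[rng.choice((-1, 1)) for _ in original] for _ in range(users)]
    copies = [[x + Q * f for x, f in zip(original, fp)] for fp in fingerprints]
    return fingerprints, copies


def average(copies):
    """Averaging collusion: sample-wise mean of the colluders' copies."""
    return [sum(samples) / len(copies) for samples in zip(*copies)]


def min_collusion(copies):
    """Min-collusion: sample-wise minimum of the colluders' copies."""
    return [min(samples) for samples in zip(*copies)]


def linear_combination(copies):
    """Sum of the first k + 1 copies minus the sum of the remaining k copies."""
    k = len(copies) // 2
    if len(copies) != 2 * k + 1:
        raise ValueError("linear combination needs an odd number of copies")
    return [sum(s[:k + 1]) - sum(s[k + 1:]) for s in zip(*copies)]


def correlation(suspect, original, fingerprint):
    """Normalized correlation of (suspect - original) with one fingerprint."""
    diff = (s - x for s, x in zip(suspect, original))
    return sum(d * f for d, f in zip(diff, fingerprint)) / (Q * len(fingerprint))


rng = random.Random(1)
original = [rng.randrange(Q, 2048 - Q) for _ in range(N)]
fingerprints, copies = make_copies(original, 5, rng)
```

In this model, the correlation of a colluder’s fingerprint after averaging k copies is close to 1/k, so it falls below a fixed threshold T once enough colluders participate. After linear combination, it stays close to 1 for the copies in the added group, which matches the qualitative trends described above.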

5. Discussions and future work

In this paper, a secure multimedia distribution scheme is presented for the converged Mobile TV service. The TV content is encrypted during MPEG-4 Advanced Simple Profile encoding at the server side. At the mobile terminal side, the TV content is decrypted and fingerprinted simultaneously. In the encryption process, the signs of ACs and MVDs are encrypted with a stream cipher during encoding, while the DCs are firstly preprocessed with

Table 1 (continued). (Correct detection rate is tested.)

Q    Rescaling       3 × 3       3 × 3 median  Rotation  Temporal resampling
     (factor = 50%)  sharpening  filtering     (2°)      (30–15)
1    90%             70%         65%           0%        20%
2    100%            85%         90%           0%        30%
3    100%            100%        100%          10%       50%
4    100%            100%        100%          10%       50%
5    100%            100%        100%          20%       60%

[Fig. 11 plots: correlation value versus number of colluders (2 to 16) for the proposed scheme and Lemma et al.’s scheme; panels: (a) Averaging, (b) Min-collusion, (c) Linear combination.]

Fig. 11. Relation between the detected correlation value and the number of colluders.

Table 2
Robustness against averaging collusion. (Correct detection rate is tested.)

Method                  Number of colluders
                        1     3     5     7     11    13    15    17
Lemma et al.’s scheme   100%  100%  95%   95%   75%   60%   20%   0%
Proposed scheme         100%  100%  100%  100%  100%  100%  60%   30%


the proposed preprocessing, and then encrypted with the sequence encryption based on modular addition. In the decryption process, the ACs and MVDs are decrypted with a stream cipher during decoding, while the DCs are decrypted by the proposed Joint Decryption and Fingerprinting method under the control of the key and fingerprint. Compared with the existing scheme, i.e., Lemma et al.’s scheme, the proposed scheme avoids overflow in DC encryption and decryption and adopts an encryption operation based on modular addition, which is more secure than the plain additive operation. As shown by the experiments and analyses, compared with Lemma et al.’s scheme, the proposed scheme has better imperceptibility and is more robust against collusion attacks, and thus is more suitable for secure Mobile TV. Additionally, the proposed scheme can also be used for mobile image, video or music sharing. In future work, the scheme’s robustness against other common signal processing operations will be evaluated and improved, adaptive embedding in the encryption domain will be studied, and combining the scheme with other codecs, such as MPEG-4 Fine Granularity Scalability Profile and MPEG-4 AVC/H.264, will be considered.
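As a rough illustration of the DC path summarized above, the following sketch combines modular-addition encryption at the server with a single decrypt-and-fingerprint step at the terminal. It is one plausible reading and not the scheme’s specification. The clamping form of the preprocessing, the personalised key d_i = (k_i − Q·f_i) mod L, the ±1 fingerprint symbols and all function names are assumptions, and Python’s random module stands in for a real stream cipher.

```python
import random

L = 2048  # DC values lie in [0, L - 1]
Q = 3     # embedding strength


def preprocess(dc):
    """Clamp into [Q, L - Q - 1] so that adding +/-Q stays inside [0, L - 1]
    (assumed form of the preprocessing)."""
    return min(max(dc, Q), L - Q - 1)


def keystream(seed, n):
    """Stand-in key stream; random.Random is NOT cryptographically secure."""
    rng = random.Random(seed)
    return [rng.randrange(L) for _ in range(n)]


def encrypt(dcs, key):
    """Server side: c_i = (preprocess(x_i) + k_i) mod L."""
    return [(preprocess(x) + k) % L for x, k in zip(dcs, key)]


def personalised_key(key, fingerprint):
    """Combine key stream and fingerprint: d_i = (k_i - Q * f_i) mod L."""
    return [(k - Q * f) % L for k, f in zip(key, fingerprint)]


def joint_decrypt_and_fingerprint(cipher, dkey):
    """Terminal side: y_i = (c_i - d_i) mod L = preprocess(x_i) + Q * f_i."""
    return [(c - d) % L for c, d in zip(cipher, dkey)]


dcs = [0, 1, 5, 1024, 2046, 2047]          # toy DC values
key = keystream(seed=42, n=len(dcs))
fingerprint = [1, -1, 1, -1, 1, -1]        # toy fingerprint symbols
cipher = encrypt(dcs, key)
marked = joint_decrypt_and_fingerprint(cipher, personalised_key(key, fingerprint))
```

In this sketch the terminal holds only the personalised key, so decryption yields the fingerprinted DC values directly. The clamping keeps every result inside [0, L − 1], so the modular reduction never wraps a fingerprinted value around.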

Acknowledgements

This work was partially supported by the EU Project “MOBISERVE” through Grant code FP6-2005-IST-61-045410, the National Natural Science Foundation of China under Grant No. 70901039, the National Postdoctoral Science Foundation of China under Grant No. 20090450144, the France Telecom Project “Crypto” through Grant No. PEK08-ILAB-006, and the Jiangsu Postdoctoral Science Foundation under Grant No. 0901104C.

References

[1] E.I. Lin, A.M. Eskicioglu, R.L. Lagendijk, E.J. Delp, Advances in digital video content protection, Proceedings of the IEEE 93 (1) (2005) 171–183.

[2] W.-B. Lee, T.-H. Chen, C.-C. Lee, Security of new encryption algorithm for image cryptosystems, Imaging Science Journal 54 (3) (2006) 178–187.

[3] I.J. Cox, M.L. Miller, J.A. Bloom, Digital Watermarking, Morgan-Kaufmann, San Francisco, 2002.

[4] M.-C. Chang, D.-C. Lou, H.-K. Tso, Combined watermarking and fingerprinting technologies for digital image copyright protection, Image Science Journal 55 (1) (2007) 3–12.

[5] D. Kundur, K. Karthik, Video fingerprinting and encryption principles for digital rights management, Proceedings of the IEEE 92 (6) (2004) 918–932.

[6] S. Lian, Z. Liu, Z. Ren, H. Wang, Commutative encryption and watermarking in compressed video data, IEEE Circuits and Systems for Video Technology 17 (6) (2007) 774–778.

[7] D. Boneh, J. Shaw, Collusion-secure fingerprinting for digital data, IEEE Transactions on Information Theory 44 (5) (1998) 1897–1905.

[8] M. Wu, W. Trappe, Z.J. Wang, R. Liu, Collusion-resistant fingerprinting for multimedia, IEEE Signal Processing Magazine (2004) 15–27.

[9] H.V. Zhao, K.J. Ray Liu, Fingerprint multicast in secure video streaming, IEEE Transactions on Image Processing 15 (1) (2006) 12–29.

[10] D. Simitopoulos, N. Zissis, P. Georgiadis, V. Emmanouilidis, M.G. Strintzis, Encryption and watermarking for the secure distribution of copyrighted MPEG video on DVD, ACM Multimedia Systems Journal, Special Issue on Multimedia Security 9 (3) (2003) 217–227.

[11] I. Brown, C. Perkins, J. Crowcroft, Watercasting: distributed watermarking for multicast media, in: Proceedings of the First International Workshop on Networked Group Communication, Lecture Notes in Computer Science, vol. 1736, Springer-Verlag, 1999.

[12] J. Bloom, Security and rights management in digital cinema, in: Proceedings of the IEEE International Conference on Acoustic, Speech and Signal Processing, vol. 4, 2003, pp. 712–715.

[13] R. Anderson, C. Manifavas, Chameleon – a new kind of stream cipher, in: Lecture Notes in Computer Science, Fast Software Encryption, Springer-Verlag, 1997, pp. 107–113.


[14] S. Lian, Z. Liu, Z. Ren, H. Wang, Secure distribution scheme for compressed data streams, in: 2006 IEEE Conference on Image Processing (ICIP 2006), October 2006.

[15] A.N. Lemma, S. Katzenbeisser, M.U. Celik, M.V. Veen, Secure watermark embedding through partial encryption, in: Proceedings of International Workshop on Digital Watermarking (IWDW 2006), Lecture Notes in Computer Science, vol. 4283, Springer, 2006, pp. 433–445.

[16] R.A. Mollin, An Introduction to Cryptography, CRC Press, 2006.

[17] Qiu-Hua Lin, Fu-Liang Yin, Tie-Min Mei, Hua-Lou Liang, A blind source separation based method for speech encryption, IEEE Transactions on Circuits and Systems I 53 (6) (2006) 1320–1328.

[18] S. Lian, Z. Liu, Z. Ren, H. Wang, Secure advanced video coding based on selective encryption algorithms, IEEE Transactions on Consumer Electronics 52 (2) (2006) 621–629.

[19] S. Lian, J. Sun, J. Wang, Z. Wang, A chaos based stream cipher and its usage in video encryption, International Journal of Chaos, Solitons and Fractals 34 (3) (2007) 851–859.

[20] ISO/IEC 14496-2, Information technology – coding of audio–visual objects – Part 2, Visual, 1999.

[21] Security Requirements for Cryptographic Modules (Change Notice). Federal Information Processing Standards Publication (FIPS PUB) 140-1, 25 May 2001.

[22] R. Housley, Using Advanced Encryption Standard (AES) Counter Mode With IPsec Encapsulating Security Payload (ESP), RFC 3686, January 2004.

[23] B. Schneier, Section 17.1 RC4, Applied Cryptography, second ed., John Wiley & Sons, 1996.

[24] F.A.P. Petitcolas, R.J. Anderson, M.G. Kuhn, Attacks on copyright marking systems, in: David Aucsmith (Ed.), Information Hiding, Second International Workshop, IH’98, Lecture Notes in Computer Science, vol. 1525, Portland, USA, April 1998, pp. 219–239.

[25] S. Lian, Multimedia Content Encryption: Techniques and Applications. ISBN: 1420065270, Auerbach Publication, Taylor & Francis Group, September 2008.

[26] S. Lian, Y. Zhang (Eds.), Handbook of Research on Secure Multimedia Distribution, IGI Global (formerly Idea Group, Inc.), March 2009.

[27] S. Lian, D. Kanellopoulos, G. Ruffo, Recent advances in multimedia information system security, Informatica 33 (1) (2009) 3–24.

[28] DVB-H (Digital Video Broadcasting – Handheld). <http://www.dvb-h.org/> (accessed by November 2009).

[29] GPRS (General Packet Radio Service). <http://en.wikipedia.org/wiki/General_Packet_Radio_Service/> (accessed by November 2009).

[30] GSM (Global System for Mobile communications). <http://en.wikipedia.org/wiki/GSM/> (accessed by November 2009).

[31] S. Lian, Y. Zhang, Protecting mobile TV multimedia content in DVB/GPRS heterogeneous wireless networks, Journal of Universal Computer Science 15 (5) (2009) 1023–1041.

[32] S. Lian, Content and Service Protection for the Ubiquitous TV Based on Convergent Networks, Wireless Personal Communication, Springer. doi:10.1007/s11277-009-9783-3.

[33] S. Lian, Y. Dong, H. Wang, A secure solution for ubiquitous multimedia broadcasting, in: 2009 IEEE International Conference on Communications (ICC 2009), Dresden, Germany, 13–18 June 2009.

[34] S. Lian, J. Sun, Z. Wang, A novel image encryption scheme based-on JPEG encoding, in: Proceedings of the Eighth International Conference on Information Visualization (IV04), London, UK, July 2004, pp. 217–220.

[35] S. Lian, Z. Wang, Compare of several wavelet coefficient confusion methods applied in multimedia encryption, in: The 3rd International Conference on Computer Networks and Mobile Computing (ICCNMC2003), Shanghai, China, October 2003, pp. 372–376.

[36] MOBISERVE (New mobile services at big events using DVB-H broadcast and wireless networks), FP6-2005-IST-61-045410. <ftp://ftp.cordis.europa.eu/pub/ist/docs/ka4/au_fp6_mobiserve_en.pdf/> (accessed by November 2009).

[37] Open Mobile Alliance’s Digital Rights Management (OMA DRM). <http://www.openmobilealliance.org/Technical/DRM.aspx/> (accessed by November 2009).

[38] DVB Content Protection & Copy Management (DVB CPCM). <http://www.dvb.org/technology/dvb-cpcm/index.xml/> (accessed by October 2009).

[39] ISMACryp 1.1 (Internet Streaming Media Alliance). <http://www.isma.tv/technology/ISMACryp1.1.html/> (accessed by October 2009).

[40] S. Lian, Digital rights management for the home TV based on scalable video coding, IEEE Transactions on Consumer Electronics 54 (3) (2008) 1287–1293.
