Design and Implementation of Fast FPGA Based (1)


  • 7/24/2019 Design and Implementation of Fast FPGA Based (1)


    Design and Implementation of Fast FPGA Based Architecture for Reversible Watermarking

    Sudip Ghosh 1*, Bijoy Kundu 2, Debopam Datta 3, Santi P Maity 4, and Hafizur Rahaman 1,4

    1 School of VLSI Technology (Bengal Engineering and Science University at Shibpur, India)
    2 Dept. of Electronics and Telecommunication (Bengal Engineering and Science University at Shibpur, India)
    3 Dept. of Electrical and Computer Engineering (University of Illinois at Chicago, USA)
    4 Dept. of Information Technology (Bengal Engineering and Science University at Shibpur, India)

    *E-mail: [email protected]

    Abstract: Diverse hardware realizations for digital watermarking of multimedia have been proposed in the literature. This paper focuses on the design and implementation of a fast FPGA (Field Programmable Gate Array) based architecture using a reversible contrast mapping (RCM) based image watermarking algorithm. The distinguishing feature of this architecture is its clock-less encoder design and implementation, which makes the design faster. The encoder module response time is independent of clock frequency, so the embedding of the watermark is possible as soon as the input is fetched. The schematic based design and implementation of the VLSI architecture have been done with Xilinx 14.1 on the Spartan 3E FPGA family. The encoder requires 528 4-input LUTs and 303 slices, whereas the decoder requires 613 LUTs and 347 slices. The maximum clock frequency of the decoder is 45 MHz. The results show the viability of low cost, high speed real-time use of the proposed VLSI architecture.

    Keywords: VLSI Architecture, Reversible Watermarking, FPGA.

    I. INTRODUCTION

    Digital watermarking [1] is an efficient tool to prevent unauthorized use of data. Digital watermarks may be used to verify the authenticity or integrity of the original data. Nowadays, it is prominently used for tracing copyright infringements and for banknote authentication. Digital watermarking is broadly classified by the type of signal: audio watermarking, image watermarking, video watermarking, database watermarking, etc. The present work is focused on image watermarking.

    In image watermarking, digital information (such as a digital image, a digital signature or a random sequence of binary numbers) is embedded into an image. The embedded information may or may not be perceptible after watermarking, and the scheme therefore falls into the category of visible or invisible watermarking respectively. Depending on the robustness of the watermark, it can also be categorized as robust or fragile watermarking [2].

    One limitation of watermarking-based authentication schemes is the distortion inflicted on the host media by the embedding process. Although the distortion is often insignificant, it may not be acceptable for some applications, especially in the areas of medical imaging and military applications. Therefore, a watermarking scheme capable of removing the distortion and recovering the original media after passing the authentication is desirable. Schemes with this capability are often referred to as reversible watermarking schemes [3]. Various reversible watermarking techniques have been proposed with different types of algorithms [4]-[5]. Popular techniques of reversible watermarking are: i) Difference Expansion, ii) Histogram Bin Shifting, iii) Data Hiding using Integer Wavelet Transform, iv) Contrast Mapping, and v) Integer Discrete Cosine Transform. Usually, a reversible scheme performs some type of lossless compression operation on the host media in order to make space for hiding the compressed data and the Message Authentication Code (MAC) (e.g., hash, signature, or some other feature derived from the media) used as the watermark [6]. To authenticate the received media, the hidden information is extracted and the compressed data is decompressed to reveal the possible original media. A MAC is then derived from the possible original media. If the newly derived MAC matches the extracted one, the possible original media is deemed authentic/original.

    In this paper, a reversible watermarking technique is implemented using the specific transform reported by Coltuc et al. in [5]. This technique was chosen for its low computational complexity and robustness. The primary goal of the proposed design is to achieve a high speed, hardware efficient VLSI architecture. The RCM technique was first implemented in Matlab to verify the algorithm and analyze various design constraints. The desired architecture was then realized on an FPGA using Xilinx tools. The rest of the paper is organized as follows. Section II describes the related works. Section III reports the proposed VLSI architecture of reversible watermarking, followed by the analysis and experimental results in Section IV. Finally, the work is concluded in Section V, followed by the references.

    II. RELATED WORKS

    In the scheme proposed by Fridrich et al. [1], the Discrete Cosine Transform (DCT) technique has been implemented. A 128-bit hash of all the DCT coefficients is used as the watermark. The extracted compressed bit-stream is used for verification; however, the hash contains only the signature of the image, with no local information. Therefore, despite its simplicity and ability to detect inauthenticity, this technique is unable to locate the position where the tampering has been done. Van Leest et al. [3] proposed another reversible watermarking scheme based on a transformation function that introduces gaps in the image histogram of image blocks.

    2013 International Conference on Electrical Information and Communication Technology (EICT)

    978-1-4799-2299-4/13/$31.00 2013 IEEE


    One drawback of this scheme is its need for the overhead information and the protocol to be hidden in the image. Moreover, a potential security loophole in the scheme is that, since the computational cost of extracting the watermark is insignificant, an attacker can defeat the scheme by exhaustively trying all 256 possible gray levels, assuming each time that the gray level being tried is the gap. In [5], Coltuc et al. proposed a Reversible Contrast Mapping (RCM) based algorithm for reversible watermarking in the spatial domain. It provides a high data embedding bit-rate at a very low mathematical complexity. The scheme does not need any additional data compression but is able to recover the original image even after alterations in the encoded data.

    Over the last decade, a lot of research has been performed on reversible watermarking; however, VLSI implementation of the RCM based approach is still an area to be explored. In this paper, the advantages of the RCM based watermarking technique have been explored and implemented. The main focus is on developing a low cost, high speed VLSI architecture that can be used for real-time applications. Some significant hardware implementations of digital watermarking include the works [8], [9], [10], [11]. Mohanty et al. [9] concentrated on spatial-domain invisible-fragile watermarking and its architecture. But these designs are seriously constrained by their hardware complexity. In [12], a hardware architecture that can insert two visible watermarks in images in the spatial domain is introduced. The main objective of that architecture was to decrease the hardware complexity while keeping the performance intact. Employing the advantages of the RCM technique, a low cost, hardware efficient VLSI implementation of RCM based reversible watermarking (RW) is presented in this paper.

    III. PROPOSED VLSI ARCHITECTURE OF REVERSIBLE WATERMARKING

    The implementation of the watermarking algorithm is done using the ISE Design Suite of Xilinx for the Spartan 3E FPGA family. An FPGA is used for the hardware implementation because of advantages such as re-configurability, low cost and a simpler design process. The entire watermarking architecture design involved construction of two main blocks, the encoder and the decoder. Each of the blocks is further divided into three sub-blocks named module 1, module 2, and module 3. Each of these modules is designed individually and later interfaced with the others. The encoder and decoder were designed and simulated separately. Both the encoder and decoder designs are described in detail with their respective modules in the following subsections 1 and 2.

    1. ENCODER:

    In the proposed architecture, the encoder part is designed in three stages as given in Fig. 1.

    Image Acquisition and Pixel Transform:

    The proposed architecture is implemented and optimized for 8 bit gray images. The source to the encoder, which is basically a device providing image pixels as input, can be a storage device like a RAM or direct external input by the user in 8 bit digital form. In the proposed architecture, the original image data is stored in a 256 byte RAM (eight 32-word by 8-bit SRAMs). As discussed in [5], a specific transformation technique is performed on the image involving a pair of pixels. Among various ways of acquiring these pixels from the source, sequential column wise fetching from the memory is carried out in this design (each element of a particular row and column is an 8-bit pixel value represented by an 8-bit address). The 8-bit pixel value read from the memory is then converted to 10-bit data (adding zeros at the 9th and 10th bit positions) to provide the correct form of input for the pixel transformation. The transformation technique [5] is mathematically given by,

    $A_{transform} = 2A - B$    (1)
    $B_{transform} = 2B - A$    (2)

    where A, B are the pair of input pixels of the original image and $A_{transform}$, $B_{transform}$ are the pair of transformed pixels. This transformation technique allows error free transmission and detection of the image at the transmitter and receiver ends respectively [5].

    [Fig. 1. Data flow path in encoder. Blocks: MODULE 1, MODULE 2, MODULE 3; signals: A, B, Atransform (9:0), Btransform, X, Y, Z, OUTA, OUTB]

    [Fig. 2. Data flow path in image acquisition and pixel transform module. Blocks: 256 byte RAM, address decoder, four SUB blocks; signals: A, B, Atransform, Btransform]

    Fig. 2 shows the data flow path in the image acquisition and pixel transform module. The pixel transform is implemented using only two subtractor modules for each pixel. The 10 bit subtractors perform signed subtraction using two's complement logic and also prevent overflow.
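The multiplier-free transform can be sketched in software as follows. This is a minimal illustration, assuming the transform of [5] has the form Atransform = 2A - B and Btransform = 2B - A, and assuming the two subtractors per pixel are chained as A - (B - A) and B - (A - B). The helper names `sub10`, `to_signed10` and `rcm_forward` are hypothetical and do not come from the paper.

```python
MASK10 = 0x3FF  # 10-bit datapath, as described in the text


def sub10(a, b):
    """Model a 10-bit two's complement subtractor: (a - b) mod 2**10."""
    return (a - b) & MASK10


def to_signed10(v):
    """Interpret a 10-bit word as a signed integer (bit 9 is the sign)."""
    return v - 1024 if v & 0x200 else v


def rcm_forward(a, b):
    """Forward transform with two chained subtractions per output pixel.

    Assumed wiring (not confirmed by the source):
      Atransform = A - (B - A) = 2A - B
      Btransform = B - (A - B) = 2B - A
    The 8-bit inputs are zero-extended to 10 bits implicitly.
    """
    a_t = sub10(a, sub10(b, a))
    b_t = sub10(b, sub10(a, b))
    return a_t, b_t


a_t, b_t = rcm_forward(100, 90)
print(to_signed10(a_t), to_signed10(b_t))  # 110 80
```

For 8-bit inputs the results lie between -255 and 510, which fits in the signed 10-bit range of -512 to 511, so the 10-bit width is sufficient to avoid overflow.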

    Control Signal:

    As mentioned earlier, the watermarking algorithm (embedding watermark image data into the original image) is performed on image pixels constrained to a particular domain, Dc, of the transformed pairs [5]. The domain Dc of transformed pixels of the original image is defined such that the pair of transformed pixels, Atransform and Btransform, belong to [0, L], where L takes values from 0 to 254 leaving 1. The domain Dc prevents underflow and overflow and removes ambiguous pairs. It also ensures robust, error free transmission of the watermarked image. This module generates the control signals that are essential to carry out the data embedding process, which includes determining Dc. As mentioned by Coltuc et al. in [5], three distinct groups are formed, partially depending on Dc, and these are identified by the three control signals (X, Y, and Z) in Fig. 3.

    The generation of these control signals is summarized below.

    • A low logic level, 0, of X is generated when the pair Atransform and Btransform belongs to Dc and each of them is even.
    • A high logic level, 1, of Y is generated when the pair Atransform and Btransform belongs to Dc and is odd.
    • A high logic level, 1, of Z is generated when the pair does not belong to Dc.

    These control signals are generated entirely with logic gates, as shown in Fig. 3. The 10th bit determines the polarity (either positive or negative) of the transformed pair, while the 9th bit determines if the transformed pair is below 255. The LSB determines whether it is even or odd.
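A software model of this bit-level classification might look like the sketch below. It is only an illustration under stated assumptions: the inputs are 10-bit two's complement words, a value is treated as inside the domain when its 10th bit (sign) and 9th bit are both zero, i.e. it lies in 0..255, and, following the usual reading of [5], every in-domain pair that is not composed of two odd values is placed in the X = 0 class (the text above words this class more narrowly as "each of them is even"). The function names are hypothetical.

```python
def bit(v, n):
    """Return bit n of the integer v."""
    return (v >> n) & 1


def control_signals(a_t, b_t):
    """Derive (X, Y, Z) from two 10-bit two's complement words.

    Assumptions (not confirmed by the source):
      - in-domain means bits 9 and 8 are zero for both words (0..255)
      - X is active low and marks in-domain pairs that are not both odd
      - Y is active high and marks in-domain pairs that are both odd
      - Z is active high and marks pairs outside the domain
    """
    in_dc = not (bit(a_t, 9) | bit(a_t, 8) | bit(b_t, 9) | bit(b_t, 8))
    both_odd = bit(a_t, 0) & bit(b_t, 0)
    x = 0 if (in_dc and not both_odd) else 1
    y = 1 if (in_dc and both_odd) else 0
    z = 0 if in_dc else 1
    return x, y, z


print(control_signals(110, 80))   # (0, 0, 0)  in domain, not both odd
print(control_signals(111, 81))   # (1, 1, 0)  in domain, both odd
print(control_signals(510, 769))  # (1, 0, 1)  510 and -255: outside domain
```

Only bit tests and a handful of gates are needed, which is consistent with the very small LUT count reported for the control signal module.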

    Data Embedding:

    The circuit that embeds the watermark image in the original image is given in Fig. 4. This module takes the control signals as input and performs watermarking selectively depending on them. The control signals determine whether the pixels are to be transformed before embedding the watermark sequence into the original image. The LSB of the 2nd pixel of the pair is used to embed the watermark image, while the transformation information (whether the pixels are transformed or not) is embedded into the LSB of the 1st pixel. The watermarking algorithm in terms of the control signals is given in pseudo code as follows.

    Watermarking based on control signals:

    • When X=0, pass the pair Atransform and Btransform for watermarking. Set the LSB of Atransform to 1 and replace the LSB of Btransform by the watermark bit.
    • When Y=1, pass the pair A and B for watermarking. Set the LSB of A to 0 and replace the LSB of B by the watermark bit.
    • When Z=1, the watermarking step is skipped. Set the LSB of A to 0 and transmit the original image pixels.
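The three cases above can be written as a short behavioral sketch. It assumes the transform Atransform = 2A - B, Btransform = 2B - A from [5], a domain of 0..255, and the same grouping assumption as in [5] (in-domain pairs that are not both odd are transformed). In the Z = 1 case the original LSB of A is overwritten; this sketch does not model how that bit would be preserved. The function name and return convention are hypothetical.

```python
def rcm_embed_pair(a, b, w):
    """Embed one watermark bit w (0 or 1) into the 8-bit pixel pair (a, b).

    Returns (out_a, out_b, used), where `used` tells whether w was embedded.
    """
    a_t, b_t = 2 * a - b, 2 * b - a
    in_dc = 0 <= a_t <= 255 and 0 <= b_t <= 255
    both_odd = bool(a_t & 1) and bool(b_t & 1)

    if in_dc and not both_odd:
        # X = 0: send the transformed pair, flag LSB = 1, watermark in 2nd LSB
        return a_t | 1, (b_t & ~1) | w, True
    if in_dc and both_odd:
        # Y = 1: send the original pair, flag LSB = 0, watermark in 2nd LSB
        return a & ~1, (b & ~1) | w, True
    # Z = 1: no embedding, flag LSB = 0, second pixel unchanged
    return a & ~1, b, False


print(rcm_embed_pair(100, 90, 1))  # (111, 81, True)
print(rcm_embed_pair(101, 91, 0))  # (100, 90, True)
print(rcm_embed_pair(255, 0, 1))   # (254, 0, False)
```

In hardware the same selection reduces to multiplexers steered by X, Y and Z, with the LSB lines driven separately from the upper seven bits.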

    2. DECODER:

    The decoder block is structured similarly to the encoder block. It comprises three modules: the signal generation block, the inverse transform block, and the image and watermark extraction block. The entire decoder architecture is given in Fig. 5. The following subsections give a detailed hardware description of the individual modules of the decoder.

    Control Signal:

    Similar to the encoder, the control signals are generated from the 8-bit input data received from the transmitter, i.e. the encoder. The watermark image data as well as the transformation information have been embedded into the LSBs of the transmitted pairs. Therefore, the preliminary task of this module is to extract the LSBs of both received values. The LSB of the received signal WIA contains the transformation information and the LSB of WIB contains the watermark data. The primary task of this module is the generation of the signal labeled Dc, which checks whether the corresponding pair at the encoder input belonged to the domain Dc. The image transform module, previously used in the encoder, followed by the Dc check block and a few logical blocks generates this signal. The LSB of WIA determines whether the received pair was transformed. If this LSB, WIA(0), is equal to logical 1, then the pair was transformed and is consequently fed to the inverse transform module; otherwise it is passed on to the Dc check block. This routing is achieved by the 8-bit 2-to-1 MUX and DEMUX pair. If the generated signal Dc is satisfied, then the pair corresponds to one of the odd pairs transmitted by the encoder. It is then passed forward for further processing.

    Fig. 3. Circuit diagram of the control signal module of Encoder

    Inverse Transform:

    This module is the most important part of the decoder and consumes the major share of the processing time. The inverse transform is applied to the received transformed pairs to recover the original image pixels. As mentioned in [5], the inverse transform is given by the following expressions:

    $A = \lceil \frac{2}{3} A_{transform} + \frac{1}{3} B_{transform} \rceil$    (3)
    $B = \lceil \frac{1}{3} A_{transform} + \frac{2}{3} B_{transform} \rceil$    (4)

    where $\lceil x \rceil$ denotes the ceil function (the smallest integer greater than or equal to x). From (3) and (4), the inverse transform can be executed by addition and division without using a multiplier: addition is performed twice, followed by division by 3. All of these tasks are executed by the inverse transform block in Fig. 5. Keeping the cost constraint in mind, the division is performed by the repetitive subtraction method, which limits the overall decoding speed. However, the delay due to the other combinatorial blocks is low enough to permit a high frequency clock. The divider used in this module had to be 10-bit because the upper limit of Atransform and Btransform (inputs WIA and WIB at the decoder) is 255, so the sum fed to the divider can reach 2 x 255 + 255 = 765, which needs 10 bits. The ceil function is achieved by a check operation performed on the two LSBs (0th and 1st bit), followed by an adder block. In Fig. 5, the check is performed by a single OR gate, and its output is added to the counter output (the quotient) of the divider block.
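The divide-by-3 and ceil steps can be modeled as below. This is a sketch under assumptions: the inverse transform of [5] is taken as A = ceil((2·Atransform + Btransform)/3) and B = ceil((Atransform + 2·Btransform)/3), and the "two LSBs" checked by the OR gate are assumed to be the two bits of the remainder left after repeated subtraction. All function names are hypothetical.

```python
def div3_repeated_subtraction(n):
    """Divide a non-negative integer by 3 by repeated subtraction.

    Returns (quotient, remainder). The quotient plays the role of the
    counter output; the loop length is what limits the decoding speed.
    """
    q = 0
    while n >= 3:
        n -= 3
        q += 1
    return q, n


def ceil_div3(n):
    """ceil(n / 3): add 1 to the quotient when the remainder is non-zero."""
    q, r = div3_repeated_subtraction(n)
    nonzero = (r & 1) | ((r >> 1) & 1)  # OR of the two remainder bits
    return q + nonzero


def rcm_inverse(a_t, b_t):
    """Inverse transform using two additions and one division per pixel."""
    a = ceil_div3(a_t + a_t + b_t)
    b = ceil_div3(a_t + b_t + b_t)
    return a, b


print(rcm_inverse(110, 80))  # (100, 90)
print(rcm_inverse(108, 82))  # (100, 91)  LSB-cleared version of (109, 82)
```

The second call shows why the ceil is needed: the transformed pair of (100, 91) is (109, 82), and after both LSBs are cleared the rounded-up division still returns the original pixels. With inputs up to 255 the dividend is at most 765, so the quotient never exceeds 255 and fits an 8-bit counter.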

    Image and Watermark Extraction:

    As mentioned earlier, watermark extraction is performed entirely on the basis of the control signals. The watermark image, embedded into the LSB of WIB, is extracted using the signal labeled Watermark_seq_sig in Fig. 5. This signal identifies the WIB values that contain the embedded watermark sequence. The watermark sequence is stored in a 128 byte RAM (four 32-word by 8-bit SRAMs). The size of the storage device depends on the size of the watermark image. The signal labeled Watermark_seq_sig is applied to the Write Enable (WE) input of the RAM as shown in Fig. 5. The A and B output signals from the 8 bit MUX form the extracted image. The select line of this MUX is generated from the LSB of WIA and the carry out (Cout) of the inverse transform block as shown in Fig. 5.
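The decoder's decision flow for one received pair can be summarized in the following behavioral sketch. It rests on several assumptions that are not spelled out in the text: the transform pair of [5] (2A - B, 2B - A and its ceil-based inverse), a domain of 0..255, clearing both LSBs before the inverse transform, and restoring both LSBs to 1 for the Dc check of untransformed pairs. Pairs that were skipped by the encoder are returned unchanged, and possible ambiguity at the domain border is not handled. The name `decode_pair` is hypothetical.

```python
def in_dc(a_t, b_t, limit=255):
    """Domain check: both transformed values lie in 0..limit."""
    return 0 <= a_t <= limit and 0 <= b_t <= limit


def decode_pair(wia, wib):
    """Recover (A, B, watermark_bit) from a received pair (WIA, WIB).

    watermark_bit is None when the pair carries no watermark bit.
    """
    if wia & 1:
        # WIA(0) = 1: the pair was transformed, route to inverse transform
        w = wib & 1
        a_t, b_t = wia & ~1, wib & ~1         # clear both LSBs
        a = -((-(2 * a_t + b_t)) // 3)        # ceil((2*At + Bt) / 3)
        b = -((-(a_t + 2 * b_t)) // 3)        # ceil((At + 2*Bt) / 3)
        return a, b, w
    # WIA(0) = 0: route to the Dc check with both LSBs set to 1
    a, b = wia | 1, wib | 1
    if in_dc(2 * a - b, 2 * b - a):
        return a, b, wib & 1                  # odd pair carrying a bit
    return wia, wib, None                     # skipped pair


print(decode_pair(111, 81))  # (100, 90, 1)
print(decode_pair(100, 90))  # (101, 91, 0)
print(decode_pair(254, 0))   # (254, 0, None)
```

The returned watermark bit corresponds to what Watermark_seq_sig would gate into the RAM, and the choice between the two return paths mirrors the MUX select derived from WIA(0).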

    IV. ANALYSIS AND EXPERIMENTAL RESULTS

    The simulation and implementation of the entire architecture

    is carried out in ISE Design suite and other tools of Xilinx.

    The hardware is optimized in terms of hardware cost. A

    size binary watermark image is used to perform the

    watermark embedding and its extraction. The watermarking is

    performed on an 8 bit gray image of size . The experimental results and their analysis are summarized in the following part of this section.

    1. Encoder Results :

    As discussed earlier, the watermark encoding process is carried out by three main blocks: the transform block, the control signal block and the final watermark image embedding block. Considering the hardware complexity of these blocks, they were designed and implemented separately and then integrated to perform the desired operation. The multiplication required by the transform operation on the pair of pixels is realized using four subtractors in order to maintain hardware efficiency. The complex watermark embedding operation is performed very efficiently using the control signals generated from combinational logic gates. Finally, the watermark embedding is realized using customized multiplexers and logical blocks. The implementation of the entire encoder required 303 slices and 528 four input LUTs. The hardware utilization of the encoder along with its sub-blocks is given in Table I.

    Fig. 4. Circuit diagram of the watermark data embedding module of encoder

    TABLE I: DEVICE UTILIZATION SUMMARY (ESTIMATED VALUES)

    Logic Utilization            Different modules
                                 Pixel        Control    Watermark    Watermarking
                                 transform    signal     embedding    encoder
    Number of 4 input LUTs       432          17         19           528
    Number of occupied Slices    268          9          10           303
    Number of bonded IOBs        152          23         50           230

    The encoder module is practically devoid of any clock signals, as it is realized using only combinatorial and logical elements. As a result, the watermarking process is fast even though some intensive operations are being performed. The clock-less architecture can be extended by incorporating parallel processing and pipelining, enabling the system to be highly effective in real time applications such as digital cameras, printers, and medical and military applications. A concern with the implemented encoder is the combinational path delay, whose maximum value was found to be 31.642 ns. However, there is considerable scope for reducing this delay by employing a pipelined architecture.
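As a rough reading of this figure (an estimate under assumptions, not a result reported in the paper): if one pixel pair is processed per combinational evaluation and memory access time is ignored, the worst-case path delay of 31.642 ns bounds the encoder rate as follows.

```latex
\[
  r_{\max} \approx \frac{1}{t_{pd}}
           = \frac{1}{31.642\ \text{ns}}
           \approx 31.6 \times 10^{6}\ \text{pairs/s}
  \quad\Rightarrow\quad
  2\, r_{\max} \approx 63.2 \times 10^{6}\ \text{pixels/s}
\]
```

Here $t_{pd}$ denotes the maximum combinational path delay and $r_{\max}$ the resulting upper bound on pixel pairs per second; both symbols are introduced only for this estimate.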

    2. Decoder results:

    The decoder module has a higher complexity than the encoder, mainly because of the pixel inverse transform and the recovery of the original pixels. As a result, its hardware requirement is also higher than that of the encoder, as detailed in Table II. The divider (division by 3) used for the pixel inverse transformation is application specific and is of the subtraction followed by right shifting type. The decoder requires only 613 4-input LUTs, 347 slices and 56 slice flip-flops. The inverse transform module also involves 4 adders, 1 subtractor, an 8 bit counter and multiplexers.

    TABLE II: DEVICE UTILIZATION SUMMARY (ESTIMATED VALUES)

    Logic Utilization             Different modules
                                  Pixel inverse    Control    Watermark
                                  transform        signal     extraction (decoder)
    Number of Slice Flip Flops    48               8          56
    Number of 4 input LUTs        217              13         613
    Number of occupied Slices     125              8          347
    Number of bonded IOBs         42               20         37

    Because of its hardware complexity, the implemented architecture is prone to a slower response. Incorporating a pipelined architecture helped reduce the delay to a minimum of 22.663 ns. The maximum clock frequency of the watermark extraction module is 45 MHz.

    V. CONCLUSION

    This paper focuses on the design and implementation of a fast FPGA (Field Programmable Gate Array) based architecture using a reversible contrast mapping (RCM) based image watermarking algorithm. To the best of our knowledge, prior research on VLSI implementation of RCM based watermarking algorithms is very limited. This restricted the comparison of this hardware implementation with others, and hence only its own significance has been summarized. The encoder requires 528 4-input LUTs and 303 slices, whereas the decoder requires 613 LUTs and 347 slices. The encoder module is practically independent of the clock, so the embedding of the watermark is possible as soon as the input is fetched. This feature, along with the low hardware cost, opens the prospect of its use in real-time applications such as digital cameras and medical and military applications. The hardware complexity of the decoder module is higher than that of the encoder module because of the division followed by the ceil function in the inverse transform module. The maximum clock frequency of the decoder is 45 MHz. The design is fast, low cost and easily implementable for real time watermarking.

    Fig. 5. Circuit diagram of the decoder comprised of all necessary modules

    REFERENCES

    [1] Fridrich, J., Goljan, M., & Du, R. (2001). Invertible authentication watermark for JPEG images. Proceedings of the IEEE International Conference on Information Technology, 223-227.

    [2] Cox, I., Miller, M., & Bloom, J. (2002). Digital watermarking: Principles and practice. Morgan Kaufmann.

    [3] Van Leest, A., Van der Veen, M., & Bruekers, F. (2003). Reversible image watermarking. Proceedings of the IEEE International Conference on Image Processing, II, 731-734.

    [4] J. Tian. Wavelet-based reversible watermarking for authentication. In E. J. Delp III and P. W. Wong, editors, Security and Watermarking of Multimedia Contents, volume 4675 of Proc. of SPIE, pages 679-690, Jan. 2002.

    [5] Coltuc, D., Chassery, J.M.: Very Fast Watermarking by Reversible Contrast Mapping. IEEE Signal Processing Letters 14, 255-258 (2007).

    [6] Juergen Seitz. Digital Watermarking for digital media, University of Cooperative Education Heidenheim, Germany, 2005.

    [7] M. U. Celik, G. Sharma, A. M. Tekalp, and E. Saber. Reversible data hiding. In Proc. of International Conference on Image Processing, volume II, pages 157-160, Sept. 2002.

    [8] Mohanty SP, Ranganathan N, Namballa RK. VLSI implementation of invisible digital watermarking algorithms towards the development of a secure JPEG encoder. In: Proceedings of the IEEE workshop on signal processing systems; 2003. p. 183-8.

    [9] Mohanty SP, Kougianos E, Ranganathan N. VLSI architecture and chip for combined invisible robust and fragile watermarking. IET Comput Digital Tech (CDT) 2007;1(5):600-11.

    [10] Mohanty SP, Nayak S. FPGA based implementation of an invisible-robust image watermarking encoder. In: Lecture notes in computer science, vol. 3356; 2004. p. 344-53.

    [11] A. Garimella, M. V. V. Satyanarayan, R. S. Kumar, P. S. Murugesh, and U. C. Niranjan, "VLSI Implementation of Online Digital Watermarking Techniques with Difference Encoding for the 8-bit Gray Scale Images," in Proceedings of the International Conference on VLSI Design, 2003, pp. 283-288.

    [12] S. P. Mohanty, N. Ranganathan, and R. K. Namballa, "A VLSI Architecture for Visible Watermarking in a Secure Still Digital Camera (S2DC) Design," IEEE Transactions on Very Large Scale Integration Systems, vol. 13, no. 8, pp. 1002-1012, August 2005.