Set Partitioning

  • 7/30/2019 Set Partitioning

    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 26, 1011-1027 (2010)

    Hybrid Image Compression Based on Set-Partitioning Embedded Block Coder and Residual Vector Quantization

    SHENG-FUU LIN, HSI-CHIN HSIN* AND CHIEN-KUN SU+

    Department of Electrical and Control Engineering, National Chiao Tung University, Hsinchu, 300 Taiwan

    *Department of Computer Science and Information Engineering, National United University, Miaoli, 360 Taiwan

    +Department of Electrical Engineering, Chung Hua University, Hsinchu, 300 Taiwan

    A hybrid image coding scheme based on the set-partitioning embedded block coder (SPECK) and residual vector quantization (RVQ) is proposed for image compression. The scaling coefficients of an image are coded with the original SPECK algorithm, and the wavelet coefficients are coded with SPECK combined with RVQ. Combining SPECK with RVQ for the high-frequency wavelet coefficients takes account of the energy clustering property of the wavelet transform. Experimental results show that, for gray-level still images, the proposed hybrid RVQ-SPECK coder outperforms SPECK; e.g., the peak signal-to-noise ratio (PSNR) values are improved by 1.67 dB and 0.69 dB at a compression rate of 1 bit per pixel for the 256 × 256 gray-level Lena and Barbara images, respectively. An application to chroma subsampling images is also presented, in which the proposed method usually outperforms the color SPECK method: the PSNR values are improved by 1.11 dB for the Y plane, 0.99 dB for the U plane, and 2.31 dB for the V plane at a bit budget of 81,920 bits for the test image Goldhill. In addition to high coding efficiency, the proposed method preserves the features of embeddedness, low decoding complexity, and exact bit-rate control.

    Keywords: image compression, residual vector quantization (RVQ), set-partitioning embedded block coder (SPECK), chroma subsampling images, embeddedness

    Received July 8, 2008; revised November 20, 2008; accepted February 12, 2009.

    Communicated by Liang-Gee Chen.

    1. INTRODUCTION

    The need for high-quality images, fast transmission, and less storage space makes image compression increasingly important. Differential pulse code modulation, transform coding, subband coding, and many other image compression techniques have been developed [1-3]. State-of-the-art techniques can compress typical images by a factor ranging from 10 to 50 with acceptable quality [4]. The Joint Photographic Experts Group (JPEG) image standard [5], known as the most widely used transform-coding based algorithm, performs well at moderate compression ratios. Recently, the wavelet-based multiresolution representation has received a lot of attention in compression applications, as manifested in the JPEG2000 standard [6, 7]. Many wavelet-based image coding algorithms, such as the embedded zero-tree wavelets (EZW) [8], set partitioning in hierarchical trees (SPIHT) [9], morphological representation of wavelet data (MRWD) [10], group testing for wavelets (GTW) [11], and the set-partitioning embedded block coder (SPECK) [12, 13], have been proposed with great success. In the wavelet domain, the higher-detail components of an image are projected onto the shorter basis functions with higher spatial resolutions, and the lower-detail components are projected onto the larger basis functions with narrower bandwidths; this matches the characteristics of the human visual system [14].

    SPECK takes into account the well-defined hierarchical structure of the wavelet transform and the energy clustering within high-frequency subbands, so that the significant wavelet transform coefficients of an image can be coded as early as possible. SPECK has been incorporated into the verification model of JPEG 2000, where it is known as subband hierarchical block partitioning (SBHP) [15]. Another variant of SPECK, embedded zero block coding (EZBC) [16], is much more complicated: it combines SPECK with a context-based adaptive arithmetic coder to improve the compression performance.

    According to Shannon's theory [17, 18], vector quantization (VQ) can significantly reduce the number of coding bits of a signal compared with scalar quantization. Hence, VQ plays an important role in many applications, e.g. speech recognition, volume rendering, and image compression. Gupta et al. utilized VQ to compress multispectral satellite images [19]. Su et al. developed a hybrid coding system using SPIHT and VQ for image compression [20]. Abdel-Galil et al. applied VQ to power systems for classifying power quality disturbances [21]. When the code vector and codebook sizes become large enough, the distortion of the vector quantizer approaches the lower bound of the distortion-rate relation. However, both the computational complexity and the memory requirement of the vector quantizer then increase exponentially. Hence, an unconstrained full-search vector quantizer usually uses small vectors. To reduce the computational complexity and memory requirements of VQ, several variants of the original VQ have been proposed in the literature, such as residual vector quantization (RVQ) [22, 23, 25], hierarchical VQ [24], and tree-structured VQ (TSVQ) [18]. Each VQ variant makes a compromise between computational complexity and performance.
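To make the complexity trade-off concrete, the following sketch counts the codewords stored (and the distances evaluated per input vector) by a multistage quantizer, and compares them with an unconstrained full-search quantizer offering the same number of reproduction vectors. The stage sizes are the ones reported later in this paper for the 7-stage case (64 codewords in the first stage, 32 in each remaining stage). The function names are illustrative only, and the count assumes a stage-sequential search.

```python
from math import prod

def rvq_cost(stage_sizes):
    """Codewords stored, and distances evaluated per input vector,
    by a stage-sequential residual VQ: one small full search per stage."""
    return sum(stage_sizes)

def equivalent_full_search_cost(stage_sizes):
    """Size of the direct-sum codebook, i.e. how many codewords an
    unconstrained full-search VQ would need to offer the same number
    of reproduction vectors."""
    return prod(stage_sizes)

stage_sizes = [64] + [32] * 6                    # assumed 7-stage configuration
print(rvq_cost(stage_sizes))                     # 256
print(equivalent_full_search_cost(stage_sizes))  # 68719476736, i.e. 2**36
```

The multistage structure therefore grows additively with the number of stages, while the equivalent unconstrained codebook grows multiplicatively.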

    RVQ, or multistage VQ [25], is a VQ variant with lower computational complexity. Because the decoder of an RVQ is constrained by a direct-sum codebook structure and the encoder typically uses a suboptimal stage-sequential search procedure, RVQ suffers some performance degradation. To code high-frequency wavelet coefficients with energy clustering efficiently, while balancing the complexity and performance of the image coder, a hybrid coder using SPECK and RVQ is proposed for image compression. Specifically, the significant high-frequency wavelet coefficients of an image are coded as coefficient vectors, which can be efficiently located by the significance coding procedure of SPECK. Recently, Chao et al. proposed a vector SPECK algorithm for gray-level still image compression [27], a variation on SPECK that uses VQ to code the significant coefficients. They used a very sophisticated VQ method to improve compression efficiency at the cost of added complexity. The hybrid method proposed in this paper and the vector SPECK method were developed independently and differ in many implementation details, although both methods involve SPECK and VQ and perform well.

    The remainder of this paper proceeds as follows. Section 2 describes the proposed

    hybrid image coder which combines SPECK and RVQ. Experimental results are given in

    section 3, and conclusions are given in section 4.

    2. THE PROPOSED HYBRID IMAGE COMPRESSION METHOD

    The SPECK algorithm, proposed by Pearlman et al., is a simple, efficient image coder with coding scalability. By recursively partitioning a significant block of a transformed image, SPECK locates the significant coefficients in the block and performs scalar quantization on them to generate a coded bit-stream. Since vector quantization is more efficient than scalar quantization according to Shannon's rate-distortion theory, this motivates a hybrid image coder that combines SPECK with VQ. To reduce the computational complexity, RVQ was selected to be combined with SPECK in the proposed hybrid codec, and experimental results show that the proposed hybrid method is efficient in image compression. Subsection 2.1 discusses the application to still gray-level images, subsection 2.2 presents the application to chroma subsampling images, and subsection 2.3 discusses the computational complexity and memory requirement of the proposed hybrid method in gray-level image compression.

    2.1 Application for Gray-Level Still Images

    A hybrid image coding system combining SPECK with RVQ is therefore proposed to improve the compression performance. Fig. 1 shows its block diagram, which can be used directly for still gray-level image compression. In the first block, the input gray-level image is transformed by a 2D discrete wavelet transform (DWT). For example, Fig. 2 shows the result of a 4-decomposition-level 2D DWT. The coefficients of the transformed image are classified into two parts: the LL subband, which contains the scaling coefficients, and the high-frequency subbands, which contain all the transformed coefficients outside the LL subband. The scaling coefficients represent the lowest-frequency component of an image and can be coded efficiently by the original (scalar) SPECK algorithm. The wavelet coefficients in the high-frequency subbands, on the other hand, are coded by SPECK with RVQ. Finally, the two outputs are multiplexed into the coded bit-stream.

    [Fig. 1 block diagram: Input image -> DWT; LL subband -> Scalar SPECK; H, D, V subbands -> SPECK with RVQ; both outputs -> Mux -> Coded bit-stream.]

    Fig. 1. The proposed hybrid image coder.

    Fig. 2. The partition and assignment of a 4-decomposition-level transformed image.

    For the quantization of the coefficients in the LL subband, the original SPECK starts from the most significant bit plane $n_{\max}$, where

    $$n_{\max} = \left\lfloor \log_2 \left( \max_{c_{ij} \in LL} |c_{ij}| \right) \right\rfloor, \qquad (1)$$

    and $c_{ij}$ represents a coefficient in the LL subband. The scaling coefficients whose magnitudes are greater than or equal to $2^{n_{\max}}$ are located in the first pass. Then the coefficients whose magnitudes are in the interval $[2^{n_{\max}-1}, 2^{n_{\max}})$ are located in the second pass, and the procedure goes on until all the coefficients are located or the bit budget is exhausted. In the proposed hybrid method, all the coefficients in the LL subband are normalized by the absolute value of the coefficient with the largest magnitude before sorting. Hence, the normalized scaling coefficients with magnitudes in $[2^{-1}, 2^{0})$ are located in the first pass, those with magnitudes in $[2^{-2}, 2^{-1})$ are located in the second pass, and the procedure goes on until all the coefficients are located or the bit budget is exhausted.
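The bit-plane ordering of the unnormalized case can be illustrated with a short sketch. It computes $n_{\max}$ as in Eq. (1) and, for each coefficient, the pass in which its magnitude first reaches the current threshold $2^{n_{\max}}, 2^{n_{\max}-1}, \ldots$. The helper names and the tiny 2 × 2 "subband" are illustrative assumptions; the sketch ignores set partitioning and only shows the threshold schedule.

```python
import numpy as np

def n_max(coeffs):
    """Index of the most significant bit plane, as in Eq. (1)."""
    return int(np.floor(np.log2(np.max(np.abs(coeffs)))))

def first_significant_pass(coeffs):
    """Pass number (1 = first pass) in which each coefficient becomes
    significant; -1 marks zero coefficients, which never do."""
    mags = np.abs(np.asarray(coeffs, dtype=float))
    top = n_max(mags)
    passes = np.full(mags.shape, -1, dtype=int)
    nonzero = mags > 0
    # A coefficient with 2**n <= |c| < 2**(n+1) is found at threshold 2**n.
    passes[nonzero] = top - np.floor(np.log2(mags[nonzero])).astype(int) + 1
    return passes

ll = np.array([[100.0, -37.0], [5.0, 0.0]])      # hypothetical LL coefficients
print(n_max(ll))                                 # 6, since 64 <= 100 < 128
print(first_significant_pass(ll).tolist())       # [[1, 2], [5, -1]]
```

After normalization by the largest magnitude, the same schedule applies with the thresholds $2^{-1}, 2^{-2}, \ldots$ in place of $2^{n_{\max}}, 2^{n_{\max}-1}, \ldots$.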

    The coefficients in the high-frequency subbands are classified into three categories, H (horizontal), D (diagonal), and V (vertical), as shown in Fig. 2. All the coefficients in the high-frequency subbands are partitioned into 2 × 2 blocks, and each 2 × 2 block forms a corresponding 4D vector (Fig. 3). The three types of vectors in H, D, and V are normalized by the maximum L2 norms of the three categories, respectively, such that the L2 norm of each vector is less than or equal to one. If the L2 norm of a vector is greater than or equal to the threshold, then the vector is significant, and the block or subband containing this vector is also significant. Because the vectors are normalized, the thresholds for a 7-stage RVQ are $2^{-1}, 2^{-2}, 2^{-3}, 2^{-4}, 2^{-5}, 2^{-6}$, and 0. For a 4-decomposition-level transformed image, in the initialization step, the H, D, and V blocks in the 4th decomposition level (i.e. at the top-left corner in Fig. 2) form the S set (significant set), and the other H, D, and V subbands form the I set (insignificant set) in the RVQ-SPECK block of the proposed hybrid method (Fig. 1). The sorting pass of the significant vectors of the high-frequency subbands and the definitions of S and I in the proposed hybrid method are identical to those of the scalar SPECK, except that the significance path of SPECK with RVQ ends when the block is 2 × 2.

    [Fig. 3 illustration: the 2 × 2 coefficient block with rows ($c_{i,j}$, $c_{i,j+1}$) and ($c_{i+1,j}$, $c_{i+1,j+1}$) maps to the 4D vector [$c_{i,j}$, $c_{i,j+1}$, $c_{i+1,j}$, $c_{i+1,j+1}$].]

    Fig. 3. A 2 × 2 coefficient block and its corresponding 4D vector.

    Fig. 4. The signal flow diagram of a p-stage RVQ.

    Generally speaking, full-search VQ has better performance than RVQ, but RVQ was selected for the proposed hybrid method because of its low complexity and acceptable performance. The signal flow diagram of a p-stage RVQ is shown in Fig. 4, where $x_i$ ($1 \le i \le p$) is the input vector of the ith VQ stage in the p-stage RVQ, and $\hat{x}_i$ is the code vector that has the smallest distance to $x_i$. The residual $x_i - \hat{x}_i$ is $x_{i+1}$, the input vector for the (i + 1)th VQ stage in the RVQ system. Because the characteristics of the H, D, and V subbands are different, three RVQs are used for the H, D, and V subbands, respectively, in the RVQ-SPECK part of the proposed hybrid method.
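The stage-by-stage residual structure of Fig. 4 can be sketched as follows. This is a minimal illustration of a generic stage-sequential RVQ, not the authors' implementation: the function names and the two tiny hand-made codebooks are assumptions chosen so that the result is easy to follow, and the per-stage significance thresholds used inside RVQ-SPECK are not modeled.

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Stage-sequential search: at each stage pick the codeword nearest
    to the current residual, then pass the new residual to the next stage."""
    residual = np.asarray(x, dtype=float)
    indices = []
    for cb in codebooks:                        # cb has shape (m_i, 4)
        dists = np.sum((cb - residual) ** 2, axis=1)
        k = int(np.argmin(dists))
        indices.append(k)
        residual = residual - cb[k]             # x_{i+1} = x_i - x_hat_i
    return indices

def rvq_decode(indices, codebooks):
    """Direct-sum reconstruction: add up the selected codewords."""
    return sum(cb[k] for cb, k in zip(codebooks, indices))

# Two hypothetical stages with two codewords each.
codebooks = [
    np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]),
    np.array([[0.5, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]),
]
x = np.array([1.4, 0.1, 0.0, 0.0])
idx = rvq_encode(x, codebooks)
print(idx)                          # [0, 0]
print(rvq_decode(idx, codebooks))   # first component is 1.0 + 0.5
```

The decoder needs only table look-ups and additions, which is why the decoding side of an RVQ stays cheap even with many stages.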

    Since the information in the lowest-frequency subband LL of an image is usually more important than that in the high-frequency subbands, the bit-plane resolution in the scalar SPECK is set higher than that of the RVQ-SPECK. Thus, the transmission rate of the scalar SPECK is usually faster than that of the RVQ-SPECK. Based on the simulation results, the transmission rate of the scalar SPECK is empirically set to twice that of the RVQ-SPECK, i.e. one pass of the proposed hybrid method includes two SPECK passes and one RVQ-SPECK pass. Finally, the output coded bit-stream contains the overhead, the binary output of SPECK, and the binary output of RVQ-SPECK, arranged as shown in Fig. 5.
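The 2:1 interleaving of passes can be written down as a simple schedule. The sketch below only lists the order of passes; the function name is hypothetical, and placing the two SPECK passes before the RVQ-SPECK pass within each cycle is an assumption, since the text fixes the ratio but not the order.

```python
def hybrid_schedule(num_cycles, speck_per_cycle=2, rvq_per_cycle=1):
    """Order of coding passes when each cycle of the hybrid coder runs
    two scalar SPECK passes followed by one RVQ-SPECK pass."""
    order = []
    speck_pass = rvq_pass = 0
    for _ in range(num_cycles):
        for _ in range(speck_per_cycle):
            speck_pass += 1
            order.append(("SPECK", speck_pass))
        for _ in range(rvq_per_cycle):
            rvq_pass += 1
            order.append(("RVQ-SPECK", rvq_pass))
    return order

print(hybrid_schedule(2))
# [('SPECK', 1), ('SPECK', 2), ('RVQ-SPECK', 1),
#  ('SPECK', 3), ('SPECK', 4), ('RVQ-SPECK', 2)]
```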

    Fig. 5. The coded bit-stream of the proposed hybrid method.

    The decoder of the proposed method can be implemented by simply reversing the processing steps of the encoder. Apart from the overhead, the bits in the compressed file are ordered by importance, so the proposed method is embedded. The proposed encoder (decoder) can terminate the coding (decoding) process at any point, so it achieves exact bit-rate control, which is an important requirement of modern codecs.

    The compression performance can be improved by applying arithmetic coding after SPECK, at the cost of increased computational complexity. As an example, the PSNR value of the decoded 512 × 512 Lena image can be improved by 0.22 dB at a compression rate of 1 bpp by using SPECK with arithmetic coding [26]. For system simplicity, arithmetic coding is not performed in our experiments.

    2.2 Application for Chroma Subsampling Images

    The chroma subsampling format balances efficiency and quality in sampling, and similar methods are used in picture formats to save bandwidth (memory) while maintaining good quality. CIF (Common Intermediate Format) and QCIF (Quarter CIF) are two such formats in H.261. For CIF, the size of the luminance plane is 352 × 288, and the size of each of the two chrominance planes is 176 × 144. Each chrominance plane contains only one quarter of the data (pixels) of the luminance plane, since the human eye is less sensitive to chrominance information than to luminance information. The image sequence format of MPEG-4 is CIF or 4:2:0, and we will discuss how to use the proposed hybrid method for the compression of the popular YUV 4:2:0 images.
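The plane sizes quoted for CIF follow directly from halving each dimension of the luminance plane. The short sketch below reproduces them and the resulting saving in samples; the function name is illustrative, and even image dimensions are assumed.

```python
def plane_sizes_420(width, height):
    """Return ((luma_w, luma_h), (chroma_w, chroma_h)) for 4:2:0
    subsampling, where each chroma plane is halved in both directions."""
    return (width, height), (width // 2, height // 2)

luma, chroma = plane_sizes_420(352, 288)         # CIF
luma_samples = luma[0] * luma[1]
chroma_samples = chroma[0] * chroma[1]
print(luma, chroma)                              # (352, 288) (176, 144)
print(chroma_samples / luma_samples)             # 0.25
# Total samples relative to three full-resolution planes:
print((luma_samples + 2 * chroma_samples) / (3 * luma_samples))  # 0.5
```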

    Fig. 6 shows the block diagram of the application of the proposed hybrid method to chroma subsampling images. First, the Y, U, and V planes are each transformed by the 2D discrete wavelet transform. Then, each transformed plane is processed like the transformed image in the still gray-level image case. The transformed coefficients of each plane are partitioned into LL, H, D, and V subbands. The scaling coefficients of each LL subband, which will be processed by color SPECK (CSPECK) [12, 13], are normalized by the maximum amplitude of the coefficients in that LL subband. Another three positive base values, determined from the L2 norms of the 4D vectors in the H, V, and D subbands, respectively, are used to normalize the vectors in their corresponding subbands such that the L2 norm of each normalized 4D vector is not greater than 1. Hence, 12 base values (four for each of the three planes) used in normalization have to be stored and transmitted to the decoder of the proposed method. After normalization and coefficient classification, the three LL subbands of the transformed Y, U, and V planes are processed by CSPECK, and the other coefficients (i.e. the H, D, and V subbands) of the transformed Y, U, and V planes are coded by CSPECK with RVQ.
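The vector formation and normalization steps described above can be sketched for a single subband as follows. The helper names and the 4 × 4 toy subband are assumptions for illustration; in the method itself, one base value is shared by all subbands of the same type (H, D, or V) within a plane, which this single-subband sketch does not model.

```python
import numpy as np

def blocks_to_vectors(subband):
    """Split a subband with even dimensions into 2x2 blocks and flatten
    each block to [c(i,j), c(i,j+1), c(i+1,j), c(i+1,j+1)]."""
    h, w = subband.shape
    return (subband.reshape(h // 2, 2, w // 2, 2)
                   .transpose(0, 2, 1, 3)
                   .reshape(-1, 4))

def normalize_vectors(vectors):
    """Scale the vectors by the largest L2 norm (the base value) so that
    every normalized vector has L2 norm at most 1."""
    base = float(np.max(np.linalg.norm(vectors, axis=1)))
    if base == 0.0:                 # all-zero subband: nothing to scale
        return vectors.copy(), 1.0
    return vectors / base, base

subband = np.arange(16.0).reshape(4, 4)          # hypothetical coefficients
vectors = blocks_to_vectors(subband)
print(vectors[0].tolist())                       # [0.0, 1.0, 4.0, 5.0]
normalized, base = normalize_vectors(vectors)
print(np.max(np.linalg.norm(normalized, axis=1)) <= 1.0 + 1e-12)  # True

# One LL base value plus three vector base values per plane, three planes:
print(3 * (1 + 3))                               # 12
```

The decoder multiplies the reconstructed vectors by the transmitted base value to undo the scaling.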

    As in the monochrome application, since the scaling coefficients in the LL subbands contain more important information than the coefficients in the H, D, or V subbands, one quantization cycle of the proposed hybrid method includes two CSPECK quantization passes of the LL subbands and one RVQ-CSPECK quantization pass of the H, D, and V subbands (Fig. 7).

    Fig. 6. The block diagram of the application of the proposed method for chroma subsampling images.

    Fig. 7. One quantization cycle of the proposed hybrid method, which includes two CSPECK quantization passes and one RVQ-CSPECK quantization pass.

    2.3 Memory Requirement and Computational Complexity of the Proposed Hybrid Method

    In this subsection, the memory requirement and computational complexity of the proposed hybrid codec are discussed, assuming that the coefficients in the H, D, and V subbands far outnumber the scaling coefficients in the LL subband. Under this assumption, the RVQ-SPECK is the dominant part of the proposed hybrid method, and the computational complexity of the method can be approximated by that of the RVQ-SPECK part.

    Regarding memory, the proposed hybrid method needs extra memory for storing codebooks and parameters. Assume that three p-stage RVQs are used and that the codebook sizes of the H, D, and V subbands for the ith VQ stage are the same and equal to $m_i$ words. Then all the codebooks need $3 \sum_{i=1}^{p} m_i$ words. The proposed hybrid method also needs memory to store the 12 base values for normalization and the threshold information. On the other hand, the proposed hybrid method requires less memory than the scalar SPECK for the list of significant vectors and the list of insignificant sets. Since each vector in RVQ-SPECK contains 4 coefficients, the length of an RVQ-SPECK list is about one fourth of the length of the corresponding list in SPECK (e.g. LSP and LIP).
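As a worked instance of the codebook memory count, take the stage sizes reported in Section 3 of this paper (64 words in the first stage and 32 words in each later stage). The two cases below correspond to the 7-stage and 10-stage RVQs used there; the arithmetic is only an illustration of the formula above.

```latex
\[
3 \sum_{i=1}^{p} m_i = 3\,(64 + 6 \times 32) = 768 \ \text{words} \quad (p = 7),
\qquad
3 \sum_{i=1}^{p} m_i = 3\,(64 + 9 \times 32) = 1056 \ \text{words} \quad (p = 10).
\]
```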

    Regarding computational complexity in encoding, the proposed hybrid method uses L2 norms to test the significance of the vectors at each stage, whereas SPECK uses a 1-bit comparison to test significance for each bit plane. Hence, a single significance test on a vector in the proposed hybrid method is several times more costly than a significance test in SPECK. On the other hand, the proposed hybrid method has the advantage that its total number of significance tests is smaller than that of SPECK. If an N × N gray-level test image with $n_{\max}$ = 11 is coded by SPECK and by the proposed hybrid method with p-stage RVQs, then the ratio of the number of significance tests in SPECK to that in the proposed hybrid method can be estimated by

    $$\frac{N \times N \times n_{\max}}{(N/2) \times (N/2) \times p} = \frac{4\, n_{\max}}{p}. \qquad (2)$$

    For a 512 × 512 gray-level test image with $n_{\max}$ = 12 and p = 7, the significance-test ratio of Eq. (2) is 6.9. Hence, for this example, the SPECK encoder needs 6.9 times as many significance tests as the proposed hybrid encoder. Although SPECK can use a simple bitwise operation for the significance test, it suffers from the growth of the number of significance tests for large images. Both SPECK and the proposed hybrid method use the same algorithm to locate significant coefficients, but the proposed method usually has shorter significance paths because it uses a 4D vector instead of a single pixel (coefficient). When the proposed hybrid method locates a significant vector, SPECK needs one more quadtree partition and 4 more significance tests to complete the significance path.
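The ratio in Eq. (2) is easy to check numerically. The sketch below evaluates it for the 512 × 512 example with $n_{\max}$ = 12 and p = 7; the function name is illustrative, and the estimate counts one test per coefficient per bit plane for SPECK and one test per 2 × 2 vector per stage for the hybrid method, as the equation does.

```python
def significance_test_ratio(n, n_max, p):
    """Estimated ratio of significance tests, SPECK over hybrid, per Eq. (2):
    (N * N * n_max) / ((N/2) * (N/2) * p) = 4 * n_max / p."""
    speck_tests = n * n * n_max
    hybrid_tests = (n // 2) * (n // 2) * p
    return speck_tests / hybrid_tests

ratio = significance_test_ratio(512, 12, 7)
print(round(ratio, 1))      # 6.9
```

Note that the image size N cancels out, so the ratio depends only on the number of bit planes and the number of RVQ stages.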

    For decoding, no significance test is needed for SPECK or the proposed hybrid method, and the computational complexity of both methods is greatly reduced. The decoder of the proposed hybrid method is implemented as simple look-up tables, and the total number of significant vectors is about one quarter of the number of significant pixels in SPECK. The actual computational complexity depends on the image characteristics, codebook size, bit allocation of codewords, and so on. From the above discussion, the proposed hybrid decoder is as efficient as SPECK, which is consistent with our experimental results.

    To summarize, the proposed hybrid method is suitable for applications with asymmetric complexity, in which images can be encoded offline but need to be decoded fast.

    3. EXPERIMENTAL RESULTS

    In this section, two applications of the proposed hybrid compression method are presented. The first, in subsection 3.1, is gray-level still image compression, and the second, in subsection 3.2, is the compression of chroma subsampling images. The simulation platform is an IBM PC running Windows XP, and SPECK, SPIHT, and the proposed hybrid method are implemented in Matlab.

    Linear-phase biorthogonal wavelet filters with 9/7 coefficients are used in this paper. The number of wavelet decomposition levels in our experiments is 4. Fig. 2 shows the classification of a 4-level transformed image, whose coefficients are classified into four types: LL, H, V, and D. The lowest-frequency coefficients in subband LL are normalized such that their magnitudes are in the range [0, 1), and these coefficients are coded by the scalar SPECK. The wavelet coefficients in subbands of types H, V, and D are coded by SPECK with RVQ. For the coefficient vectors of the H, D, and V subbands, we empirically choose the number of stages in the RVQs as 10 and 7 for 256 × 256 and 512 × 512 test images, respectively. Because the characteristics of the H, V, and D subbands are different, each category has its own codebooks. Therefore, 30 and 21 codebooks are trained by the K-means algorithm for 256 × 256 and 512 × 512 monochrome images, respectively. The codebook size of the first RVQ stage is 64 words, and that of each of the other RVQ stages is 32 words. Each codeword is a 4D vector in $R^4$.
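One common way to obtain the per-stage codebooks of an RVQ is to train them sequentially, running K-means on the residuals left by the previous stage. The sketch below follows that scheme with a small hand-written K-means; the text states only that K-means is used, so the sequential training order, the function names, the iteration count, and the random training data are all assumptions for illustration. With three subband types, this procedure would be run three times, giving 3 × 10 = 30 or 3 × 7 = 21 codebooks.

```python
import numpy as np

def kmeans(data, k, iters=20, seed=0):
    """Plain Lloyd iterations; initial centers are k distinct training vectors."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):                 # keep the old center if a cell is empty
                centers[j] = members.mean(axis=0)
    return centers

def train_rvq(vectors, stage_sizes, iters=20):
    """Train one codebook per stage on the residuals of the previous stage."""
    residual = np.asarray(vectors, dtype=float)
    codebooks = []
    for m in stage_sizes:
        cb = kmeans(residual, m, iters)
        dists = ((residual[:, None, :] - cb[None, :, :]) ** 2).sum(axis=2)
        residual = residual - cb[dists.argmin(axis=1)]
        codebooks.append(cb)
    return codebooks, residual

rng = np.random.default_rng(1)
train = rng.normal(size=(500, 4))            # stand-in for 4D coefficient vectors
codebooks, residual = train_rvq(train, [8, 4, 4])
print([cb.shape for cb in codebooks])        # [(8, 4), (4, 4), (4, 4)]
print(np.sum(residual ** 2) < np.sum(train ** 2))   # True
```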

    3.1 Gray-Level Still Image Compression

    In this subsection, we compare the proposed hybrid method with the SPECK and SPIHT image codecs by encoding and decoding some test images (Fig. 8). Both 256 × 256 and 512 × 512 test images are used. The proposed hybrid method and SPECK are compared on 256 × 256 test images first, and then the three methods (including SPIHT) are compared on 512 × 512 test images. The compression rate is measured in bits per pixel (bpp), and the peak signal-to-noise ratio (PSNR), measured in dB, is used to evaluate the decoded image quality. For the simulation of 256 × 256 monochrome images, 41 images, which do not include the three test images, are used to train the codebooks of the RVQs of the proposed method.
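The two evaluation measures can be computed as in the following sketch. It uses the standard PSNR definition for 8-bit images, $10 \log_{10}(255^2 / \mathrm{MSE})$; the paper does not spell the formula out, so the peak value of 255 is an assumption based on the 8-bit test images, and the function names are illustrative.

```python
import numpy as np

def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = np.asarray(original, dtype=float) - np.asarray(decoded, dtype=float)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")         # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

def bits_per_pixel(num_bits, width, height):
    """Compression rate in bpp for a coded bit-stream of num_bits bits."""
    return num_bits / (width * height)

print(bits_per_pixel(65536, 256, 256))                   # 1.0
print(round(psnr(np.zeros((2, 2)), np.ones((2, 2))), 2)) # 48.13
```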

    Table 1 shows the simulation results for the 256 × 256 test images, and Figs. 9-11 show the PSNR-bpp curves for the 3 test images, where the horizontal and vertical axes are the compression rate in bpp and the PSNR value in dB, respectively. For the 256 × 256 monochrome image Lena, the proposed hybrid coder outperforms the SPECK coder by 1.67 dB at 1.0 bpp, and by 0.48 dB on average from 0.1 bpp to 1.5 bpp. For the 256 × 256 monochrome image Barbara, the proposed hybrid coder outperforms the SPECK coder by 1.23 dB at 1.1 bpp, and by 0.49 dB on average. For the third 256 × 256 gray-level image, Goldhill, the proposed hybrid coder outperforms the SPECK coder by 0.73 dB at 1.5 bpp, and by 0.43 dB on average. The experimental results for more test images obtained from the USC (University of Southern California) image database are shown in Fig. 12, in which the curve denotes the average improvement of the proposed hybrid coder over the pure SPECK coder. The results show that the proposed hybrid coder is preferable to the SPECK coder in terms of the PSNR-bpp curves.

    Fig. 8. Three 8-bit gray-level 256 × 256 test images: (a) Lena, (b) Barbara, (c) Goldhill.

    Table 1. Simulation results of 256 × 256 test images (PSNR in dB).

              Lena                Barbara             Goldhill
    bpp     SPECK   Proposed    SPECK   Proposed    SPECK   Proposed
    1.5     40.89   41.68       39.48   39.59       33.61   34.34
    1.4     40.51   40.82       39.02   39.21       33.22   33.92
    1.3     40.09   40.07       38.56   38.79       32.85   33.53
    1.2     39.58   39.68       37.60   38.38       32.48   33.19
    1.1     39.05   39.19       36.21   37.44       32.10   32.64
    1.0     36.96   38.63       35.71   36.41       31.48   31.72
    0.9     36.41   37.24       35.20   35.46       30.56   30.97
    0.8     35.78   36.02       34.64   34.87       30.10   30.43
    0.7     35.08   35.34       34.00   34.31       29.61   29.98
    0.6     33.74   34.55       32.37   33.59       29.09   29.48
    0.5     32.39   32.85       31.65   31.90       28.54   29.00
    0.4     31.43   31.56       30.80   30.95       27.44   27.76
    0.3     29.33   30.44       29.79   30.09       26.74   26.89
    0.25    28.72   28.97       28.59   28.93       26.29   26.49
    0.2     28.01   28.14       27.99   28.33       25.77   26.03
    0.125   25.89   26.58       25.84   26.84       24.46   24.92
    0.1     25.25   25.44       25.22   25.89       24.11   24.49

    [Fig. 9 plot: PSNR (dB) versus bpp; curves: Proposed, SPECK.]

    Fig. 9. The experimental results of the 256 × 256 gray-level image Lena.

    Fig. 10. The experimental results of the 256 × 256 gray-level image Barbara.

    Fig. 11. The experimental results of the 256 × 256 gray-level image Goldhill.

    Fig. 12. The average improvements of the proposed hybrid coder compared with the original SPECK on more 256 × 256 test images.
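The figures quoted for Lena can be recomputed from the Lena columns of Table 1. The sketch below copies those two columns, takes the per-rate PSNR difference, and reports the average gain and the rate at which the gain is largest; the variable names are illustrative.

```python
# Lena columns of Table 1 (256 x 256), PSNR in dB, from 1.5 bpp down to 0.1 bpp.
bpp = [1.5, 1.4, 1.3, 1.2, 1.1, 1.0, 0.9, 0.8, 0.7,
       0.6, 0.5, 0.4, 0.3, 0.25, 0.2, 0.125, 0.1]
speck = [40.89, 40.51, 40.09, 39.58, 39.05, 36.96, 36.41, 35.78, 35.08,
         33.74, 32.39, 31.43, 29.33, 28.72, 28.01, 25.89, 25.25]
proposed = [41.68, 40.82, 40.07, 39.68, 39.19, 38.63, 37.24, 36.02, 35.34,
            34.55, 32.85, 31.56, 30.44, 28.97, 28.14, 26.58, 25.44]

gains = [p - s for p, s in zip(proposed, speck)]
average_gain = sum(gains) / len(gains)
best = max(range(len(gains)), key=gains.__getitem__)

print(round(average_gain, 2))               # 0.48
print(bpp[best], round(gains[best], 2))     # 1.0 1.67
```

The only rate at which SPECK is ahead for this image is 1.3 bpp, where the difference is 0.02 dB.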

    For the experiments on 512 × 512 gray-level still images, the proposed hybrid method, SPECK, and SPIHT (with arithmetic coding) are simulated and compared with each other. SPIHT is selected for comparison because it is a wavelet-based method with very good performance. A set of codebooks was trained by the K-means method on 8 training images downloaded from the USC image database. The number of stages of each RVQ in the proposed hybrid method was empirically reduced to 7, since using fewer stages in an RVQ usually gives better performance (higher PSNR values) at low bit rates. The vectors used for 512 × 512 images are also 4D vectors (Fig. 3) in the vector space $R^4$. The 7 thresholds of the 3 RVQs in the proposed hybrid method are $2^{-1}, 2^{-2}, 2^{-3}, 2^{-4}, 2^{-5}, 2^{-6}$, and 0. Table 2 shows the simulation results of the proposed hybrid method, SPECK, and SPIHT (with arithmetic coding) on 512 × 512 test images. SPECK and SPIHT are two state-of-the-art techniques, and which one performs better usually depends on the image characteristics. According to the results in Table 2, although we cannot guarantee that the proposed hybrid method always has the best performance, it appears to improve on the SPECK codec for most images, especially at low bit rates. Three 0.25-bpp decoded images of SPECK, the proposed method, and SPIHT are shown in Fig. 13; it is difficult to see differences among these images at a glance. By carefully inspecting the reconstructed images in Fig. 13, we found that the image from the SPECK codec is smoother than the others, and that the proposed hybrid codec preserves more small details of the original images.

    Chao et al. proposed a vector SPECK [27] for still gray-level image compression.

    Three types of VQs (full search VQ, tree-structured VQ, and entropy constrained VQ)

    were used in their method at the same time, and the vector dimension and vector entries

    depend on the subbands and quantization levels where the vector is located. A large

  • 7/30/2019 Set Partitioning

    12/17

    SHENG-FUU LIN, HSI-CHIN HSINAND CHIEN-KUN SU1022

    Table 2. Simulation results for 512512 test images.

    PSNR (dB)

    Lena Barbara Goldhill

    bpp SPECK Proposed SPIHT SPECK Proposed SPIHT SPECK Proposed SPIHT1.0 40.44 40.29 39.89 35.23 36.18 36.77 34.89 35.42 35.82

    0.9 39.99 40.03 39.39 34.67 34.83 35.96 34.46 35.00 35.31

    0.8 39.54 39.74 38.69 34.00 34.32 35.01 33.99 34.52 34.78

    0.7 38.89 39.40 38.14 32.85 33.75 33.88 33.49 34.06 34.15

    0.6 37.46 39.01 37.53 31.40 33.00 32.72 32.72 33.39 33.36

    0.5 36.87 37.38 36.78 30.62 31.02 31.63 31.71 32.32 32.55

    0.4 36.03 36.62 35.82 29.68 30.10 30.33 31.03 31.59 31.69

    0.3 34.07 35.61 34.42 28.00 28.95 28.54 30.18 30.72 30.79

    0.25 33.46 34.16 33.65 27.30 27.98 27.60 29.61 30.20 30.15

    0.2 32.64 33.29 32.71 26.49 26.92 26.66 28.69 29.26 29.39

    0.1 29.48 30.31 29.82 23.99 24.75 24.37 27.03 27.62 27.63

    Table 3. Experiment results of SPECK, JPEG2000, and vector SPECK from [27].

    Lena

    Bit rate SPECK JPEG2000 Vector SPECK

    0.125 30.96 30.92 31.25

    0.2 32.99 32.96 33.47

    0.25 34.03 34.09 34.33

    (a) (b) (c)

    Fig. 13. Decoded images of (a) SPECK, (b) the proposed hybrid method, and (c) SPIHT under 0.25-

    bpp condition.

    amount (1,500) of training images and the Lloyd splitting method are used for training
    codebooks. Vector SPECK can outperform the JPEG2000 codec under low bit-rate conditions
    at the cost of added complexity, but it does not handle the lower bit planes for n = 3, 2, 1,
    and 0. Compared with vector SPECK, the proposed hybrid method offers low complexity and
    a wide bit-rate range. Table 3 shows some experimental data from [27], which used 5
    decomposition levels, the 9/7 DWT, and arithmetic coding in SPECK. Since the conditions
    of Tables 2 and 3 differ, the results for the same method in the two tables are not equal.
    Hence, we compare only the PSNR gain of the proposed hybrid method over SPECK in Table 2
    with the gains of JPEG2000 and vector SPECK over SPECK in Table 3. For the Lena image at
    0.25 bpp, vector SPECK outperforms SPECK by 0.30 dB and JPEG2000 outperforms SPECK by
    0.06 dB (Table 3), while the proposed hybrid method outperforms SPECK by 0.70 dB (Table 2).
    This indicates that the proposed hybrid method is competitive and efficient.
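    The gain-over-SPECK comparison can be reproduced with a few lines of arithmetic. The
    sketch below simply copies the 0.25-bpp Lena entries from Tables 2 and 3; the names
    TABLE2, TABLE3, and gains_over_speck are illustrative and do not come from the paper.

```python
# PSNR values (dB) for Lena at 0.25 bpp, copied from Tables 2 and 3.
TABLE2 = {"SPECK": 33.46, "Proposed": 34.16, "SPIHT": 33.65}
TABLE3 = {"SPECK": 34.03, "JPEG2000": 34.09, "Vector SPECK": 34.33}


def gains_over_speck(row):
    """Return each coder's PSNR gain (dB) over SPECK within the same table."""
    base = row["SPECK"]
    return {
        name: round(psnr - base, 2)
        for name, psnr in row.items()
        if name != "SPECK"
    }


if __name__ == "__main__":
    print("Table 2 gains:", gains_over_speck(TABLE2))
    print("Table 3 gains:", gains_over_speck(TABLE3))
```

    Because the two tables were produced under different conditions, only gains measured
    against the SPECK baseline of the same table are compared, never absolute PSNR values
    across tables.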

    3.2 Chroma Subsampling Image Compression

    The goal of the simulation is to compare the performance of the proposed hybrid

    coder with that of the CSPECK coder for YUV 4:2:0 images. Based on the simulation

    results, we can choose a proper coder for applications with such a format, e.g. MPEG-4,

    PAL DV, DVCAM, HDV, JPEG/JFIF, H.261, VC-1, and MJPEG. The test images used in the
    simulation have a 256 × 256 Y (luminance) plane and 128 × 128 U and V (chrominance)
    planes. The 9/7-tap biorthogonal wavelet transform is performed on each plane
    separately, with four decomposition levels. For a CSPECK codec, the decoder needs to
    know the maximum number of binary bit planes (n_max) used for coding the transformed
    image.

    Excluding the three LL subbands of the Y, U, and V planes, the other coefficients (in the
    H, V, and D subbands of each plane) of the transformed image are coded by using CSPECK
    with three 10-stage RVQs. In our experiments, 55 color images are used to generate 90
    codebooks for the proposed codec, since the RVQs have 10 stages and each of the three
    YUV planes has three types (H, V, and D) of 4D coefficient vectors (10 × 3 × 3 = 90).
    For the vectors in the H subbands of the Y plane, 128 vectors are selected as the basis
    vectors for the vectors with L2 norms in [0.5, 1), and each of the other 9 codebooks of
    the H subbands in the Y plane has 64 codewords. The same basis vector arrangement is
    used in the D and V subbands of the Y plane. Because the human eye is less sensitive to
    chrominance information than to luminance information, fewer basis vectors are used in
    the U and V planes. For the vectors in the H, D, or V subbands of the U or V plane, 32
    basis vectors are used in the highest (i.e. 10th) stage of the RVQs, and each of the
    other 9 stages has 16 basis vectors. All the codebooks are trained by using the simple
    K-means method. The equivalent bit-per-pixel (ebpp) value defined in Eq. (3) is used to
    represent the compression rate for decoding a coded YUV 4:2:0 image:

    \[ \mathrm{ebpp} = \frac{\text{number of bits used}}{256^{2} + 2 \times 128^{2}} \tag{3} \]
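    As a quick check of Eq. (3), the following sketch converts between a bit budget and the
    ebpp value for the plane sizes used in this section. The function names ebpp and
    bit_budget, and their default arguments, are illustrative assumptions rather than part
    of the described codec.

```python
def total_samples(y_size=256, chroma_size=128):
    """Samples in a YUV 4:2:0 image: one Y plane plus the U and V planes."""
    return y_size ** 2 + 2 * chroma_size ** 2


def ebpp(bits_used, y_size=256, chroma_size=128):
    """Equivalent bits per pixel, following Eq. (3)."""
    return bits_used / total_samples(y_size, chroma_size)


def bit_budget(target_ebpp, y_size=256, chroma_size=128):
    """Inverse of Eq. (3): bits available at a given ebpp (rounded to an integer)."""
    return round(target_ebpp * total_samples(y_size, chroma_size))


if __name__ == "__main__":
    print(total_samples())    # 98304 samples for 256x256 Y and 128x128 U, V
    print(ebpp(98304))        # 1.0
    print(bit_budget(0.5))    # 49152
```

    With these plane sizes the denominator is 98,304, so a budget of 98,304 bits corresponds
    to exactly 1.0 ebpp.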

    The 256 × 256 color test image Goldhill is used for simulation, and the curves in
    Figs. 14-16 show the simulation results. The test image is originally 256 × 256 in each
    of the R, G, and B planes (true color space), so it had to be preprocessed before
    simulation. First, the test image was transformed to the YUV space. Then, the U and V
    planes were downsampled to 128 × 128 pixels by taking the arithmetic mean of each group
    of four adjacent values. We compare the PSNR values of the proposed hybrid method with
    those of the CSPECK coder in the Y, U, and V planes, respectively. The PSNR values are
    improved by 1.11 dB for the Y plane, 0.99 dB for the U plane, and 2.31 dB for the V plane
    at the bit budget of 98,304 bits (1.0 ebpp). For the same image, the average PSNR values
    (from 0.1 ebpp to 1.5 ebpp) of the proposed method are higher than those of CSPECK by
    0.66 dB, 1.21 dB, and 2.22 dB in the Y, U, and V planes, respectively.
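    The chroma downsampling step can be sketched as follows. This is a minimal illustration
    that assumes the four adjacent values form non-overlapping 2 × 2 blocks and that a plane
    is stored as a list of rows; the function name downsample_2x2_mean is hypothetical, and
    the RGB-to-YUV conversion is not covered because the text does not specify the matrix
    used.

```python
def downsample_2x2_mean(plane):
    """Halve both dimensions by averaging each non-overlapping 2x2 block.

    Assumption: `plane` is a non-empty list of equal-length rows with even
    height and width (e.g. a 256x256 U or V plane).
    """
    rows, cols = len(plane), len(plane[0])
    if rows % 2 or cols % 2:
        raise ValueError("plane dimensions must be even")
    return [
        [
            (plane[r][c] + plane[r][c + 1]
             + plane[r + 1][c] + plane[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


if __name__ == "__main__":
    u_plane = [[float((r + c) % 256) for c in range(256)] for r in range(256)]
    u_small = downsample_2x2_mean(u_plane)
    print(len(u_small), len(u_small[0]))  # 128 128
```

    Applied to a 256 × 256 chrominance plane, the function returns a 128 × 128 plane, which
    matches the 4:2:0 layout used in the simulation.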

    Fig. 14. The Y-plane experimental results of the chroma subsampling image Goldhill.

    Fig. 15. The U-plane experimental results of the chroma subsampling image Goldhill.

    Fig. 16. The V-plane experimental results of the chroma subsampling image Goldhill.

    Based on the simulation results, the proposed method achieves larger average improvements
    in the two chrominance planes (U and V) than in the Y plane, since the colors
    (chrominance information) of the four neighbors in a 2 × 2 block are usually similar. On
    the other hand, the luminance values are more likely to change abruptly than the
    chrominance values because of sharp edges and corners. Even so, the proposed method also
    achieves good results in the Y plane. The major added cost of the proposed method is the
    need to train codebooks and determine parameters before encoding. Since the most
    time-consuming step, codebook design, can be done off-line and the codebook sizes of the
    RVQs are small, the proposed hybrid method is efficient in both time and bit budget.

    4. CONCLUSIONS

    In this paper, we propose a hybrid image coder, based on SPECK and RVQ, for still
    gray-level and chroma subsampling images. Experimental comparisons with SPIHT and SPECK
    (two state-of-the-art algorithms) show that the proposed hybrid method is efficient for
    image compression. Depending on the target application, the proposed hybrid codec can be
    designed to improve its low bit-rate or high bit-rate performance by using a short RVQ or
    a long RVQ. We have also shown that the proposed hybrid method performs particularly well
    on the chrominance planes (i.e. the U and V planes in the YUV color space) in chroma
    subsampling image compression. Because VQ is asymmetric (encoding requires a codebook
    search, whereas decoding is a table lookup), the proposed hybrid method is suitable for
    applications whose workload is also asymmetric and dominated by decoding (e.g. image
    archiving in an image database). Although the proposed hybrid codec is asymmetric, using
    RVQ instead of full-search VQ keeps the added complexity affordable and worthwhile.
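    To illustrate why RVQ keeps the search cost manageable, the sketch below implements a
    generic greedy multistage (residual) quantizer in plain Python. It is not the codec
    described in this paper: the toy 2-D codebooks and the names nearest, rvq_encode,
    rvq_decode, and search_cost are assumptions made only for illustration.

```python
def nearest(codebook, vector):
    """Index of the codeword closest to `vector` (squared Euclidean distance)."""
    def dist2(codeword):
        return sum((a - b) ** 2 for a, b in zip(codeword, vector))
    return min(range(len(codebook)), key=lambda i: dist2(codebook[i]))


def rvq_encode(stages, vector):
    """Greedy encoding: quantize, subtract the codeword, pass the residual on."""
    residual = list(vector)
    indices = []
    for codebook in stages:
        i = nearest(codebook, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices


def rvq_decode(stages, indices):
    """Decoding is one table lookup per stage followed by a sum."""
    out = [0.0] * len(stages[0][0])
    for codebook, i in zip(stages, indices):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out


def search_cost(stage_sizes):
    """Codeword comparisons: stage-wise RVQ search vs. one flat codebook
    holding every combination of stage codewords."""
    flat = 1
    for size in stage_sizes:
        flat *= size
    return sum(stage_sizes), flat


if __name__ == "__main__":
    # Toy two-stage quantizer for 2-D vectors (illustrative values only).
    stages = [
        [[0.0, 0.0], [4.0, 4.0]],
        [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    ]
    idx = rvq_encode(stages, [5.0, 4.0])
    print(idx, rvq_decode(stages, idx))
    # Stage sizes of the Y-plane codebooks in Section 3.2, assuming (for
    # illustration) that a vector is searched through all ten stages.
    print(search_cost([128] + [64] * 9))
```

    Under that assumption, a stage-wise search over the Y-plane codebook sizes needs 704
    codeword comparisons, whereas a single flat codebook with the same number of
    reproduction vectors would hold 2^61 entries. The encoder carries the search cost, while
    the decoder only sums looked-up codewords.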

    ACKNOWLEDGEMENTS

    The authors would like to thank the anonymous reviewers for their comments that

    significantly helped improve this paper.

    REFERENCES

    1. H. G. Musmann, P. Pirsch, and H. J. Grallert, "Advances in picture coding," in Proceedings of IEEE, Vol. 73, 1985, pp. 523-548.
    2. R. J. Clarke, Transform Coding of Images, Academic Press, New York, 1985.
    3. O. J. Kwon and R. Chellappa, "Region adaptive subband image coding," IEEE Transactions on Image Processing, Vol. 7, 1998, pp. 632-648.
    4. K. R. Rao and J. J. Hwang, Techniques and Standards for Image Video and Audio Coding, Prentice Hall, New Jersey, 1996.
    5. W. B. Pennebaker and J. L. Mitchell, JPEG Still Image Data Compression Standards, Van Nostrand, New York, 1993.
    6. JPEG2000 Core Coding System (Part 1), ISO/IEC 15444-1, Dec. 2000.
    7. B. E. Usevitch, "A tutorial on modern lossy wavelet image compression: Foundations of JPEG2000," IEEE Signal Processing Magazine, Vol. 18, 2001, pp. 22-35.
    8. J. M. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients," IEEE Transactions on Signal Processing, Vol. 41, 1993, pp. 3445-3462.
    9. A. Said and W. A. Pearlman, "A new, fast, and efficient image codec based on set partitioning in hierarchical trees," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 6, 1996, pp. 243-250.
    10. S. D. Servetto, K. Ramchandran, and M. T. Orchard, "Image coding based on a morphological representation of wavelet data," IEEE Transactions on Image Processing, Vol. 8, 1999, pp. 1161-1174.
    11. E. S. Hong and R. E. Ladner, "Group testing for image compression," IEEE Transactions on Image Processing, Vol. 11, 2002, pp. 901-911.
    12. A. Islam and W. A. Pearlman, "An embedded and efficient low-complexity hierarchical image coder," in Proceedings of SPIE Visual Communications and Image Processing, Vol. 3653, 1999, pp. 294-305.
    13. W. A. Pearlman, A. Islam, N. Nagaraj, and A. Said, "Efficient, low-complexity image coding with a set-partitioning embedded block coder," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 14, 2004, pp. 1219-1235.
    14. G. Strang and T. Nguyen, Wavelets and Filter Banks, Wellesley-Cambridge, MA, 1996.
    15. C. Chrysafis, A. Said, A. Drukarev, and W. A. Pearlman, "SBHP - A low complexity wavelet coder," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2000, pp. 2035-2038.
    16. S. T. Hsiang and J. W. Woods, "Embedded image coding using zero blocks of subband/wavelet coefficients and context modeling," in Proceedings of IEEE International Conference on Circuits and Systems, 2000, pp. 662-665.
    17. C. E. Shannon, "A mathematical theory of communication," The Bell System Technical Journal, Vol. 27, 1948, pp. 379-423, 623-656.
    18. A. Gersho and R. M. Gray, Vector Quantization and Signal Compression, Kluwer Academic Publishers, MA, 1992.
    19. S. Gupta and A. Gersho, "Feature predictive vector quantization of multispectral images," IEEE Transactions on Geoscience and Remote Sensing, Vol. 30, 1992, pp. 491-501.
    20. C. K. Su, H. C. Hsin, and S. F. Lin, "Wavelet tree classification and hybrid coding for image compression," IEE Proceedings of Vision, Image, and Signal Processing, Vol. 152, 2005, pp. 752-756.
    21. T. K. Abdel-Galil, E. F. El-Saadany, A. M. Youssef, and M. M. Salama, "Disturbance classification using hidden Markov models and vector quantization," IEEE Transactions on Power Delivery, Vol. 20, 2005, pp. 2129-2135.
    22. C. F. Barnes, "Residual quantizers," Ph.D. Dissertation, Department of Electrical and Computer Engineering, Brigham Young University, Provo, UT, 1989.
    23. F. Kossentini, M. J. T. Smith, and C. F. Barnes, "Image coding using entropy-constrained residual vector quantization," IEEE Transactions on Image Processing, Vol. 4, 1995, pp. 1349-1357.
    24. Y. Shoham, "Hierarchical vector quantization with application to speech waveform coding," Ph.D. Dissertation, Department of Electrical and Computer Engineering, University of California at Santa Barbara, 1985.
    25. B. H. Juang and A. H. Gray, "Multiple stage vector quantization for speech coding," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 1, 1982, pp. 597-600.
    26. G. Xie and H. Shen, "Highly scalable, low-complexity image coding using zeroblocks of wavelet coefficients," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 15, 2005, pp. 762-770.
    27. C. C. Chao and R. M. Gray, "Image compression with a vector SPECK algorithm," in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 2, 2006, pp. 445-448.

    Sheng-Fuu Lin was born in Taiwan, R.O.C., in 1954. He received the B.S. and M.S. degrees
    in Mathematics from National Taiwan Normal University in 1976 and 1979, respectively, a
    second M.S. degree in Computer Science from the University of Maryland in 1985, and the
    Ph.D. degree in Electrical Engineering from the University of Illinois, Champaign, in
    1988. Since 1988, he has been on the faculty of the Department of Electrical and Control
    Engineering at National Chiao Tung University, Hsinchu, Taiwan. His research interests
    include fuzzy theory, automatic target recognition, scheduling, image processing, and
    image recognition. Professor Lin is a member of the IEEE Control Society, Chinese Fuzzy
    System Association, and Chinese Automatic Control Society.

    Hsi-Chin Hsin received the M.S. and Ph.D. degrees in Electrical Engineering from the
    University of Pittsburgh, Pittsburgh, PA, in 1992 and 1995, respectively. He is a Professor

    in the Department of Computer Science and Information Engi-

    neering at National United University, Taiwan. His research in-

    terests include wavelet transform, image processing, CORDIC,

    DSP architectures and system on chip.

    Chien-Kun Su was born in 1962. He received the

    B.S. degree from National Taiwan University, Taiwan, in 1989,

    M.S. degree from the University of Southern California, U.S.A.,

    in 1992, and the Ph.D. degree from National Chiao Tung Uni-

    versity, Taiwan, in 2008. He has been on the faculty of the De-

    partment of Electrical Engineering at Chung Hua University,

    Hsinchu, Taiwan since 1995. His research interests include im-

    age processing and computer vision.