
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 26, 1011-1027 (2010)


Hybrid Image Compression Based on Set-Partitioning Embedded Block Coder and Residual Vector Quantization

SHENG-FUU LIN, HSI-CHIN HSIN* AND CHIEN-KUN SU+

Department of Electrical and Control Engineering
National Chiao Tung University
Hsinchu, 300 Taiwan

*Department of Computer Science and Information Engineering
National United University
Miaoli, 360 Taiwan

+Department of Electrical Engineering
Chung Hua University
Hsinchu, 300 Taiwan

A hybrid image coding scheme based on the set-partitioning embedded block coder (SPECK) and residual vector quantization (RVQ) is proposed for image compression. The scaling coefficients of an image are coded with the original SPECK algorithm, and the wavelet coefficients are coded with SPECK combined with RVQ. Combining SPECK with RVQ for the high-frequency wavelet coefficients takes account of the energy clustering property of the wavelet transform. Experimental results show that, for gray-level still images, the proposed hybrid RVQ-SPECK coder outperforms SPECK; e.g., the peak signal-to-noise ratio (PSNR) is improved by 1.67 dB and 0.69 dB at a compression rate of 1 bit per pixel for the 256 × 256 gray-level Lena and Barbara images, respectively. An application to chroma subsampling images is also presented, in which the proposed method usually outperforms the color SPECK method. The PSNR values are improved by 1.11 dB for the Y plane, 0.99 dB for the U plane, and 2.31 dB for the V plane at a bit budget of 81,920 bits for the test image Goldhill. In addition to high coding efficiency, the proposed method preserves embeddedness, low decoding complexity, and exact bit-rate control.

Keywords: image compression, residual vector quantization (RVQ), set-partitioning embedded block coder (SPECK), chroma subsampling images, embeddedness

1. INTRODUCTION

Driven by the need for high-quality images, fast transmission, and less storage space, the demand for image compression is increasing. Differential pulse code modulation, transform coding, subband coding, and many other image compression techniques have been developed [1-3]. State-of-the-art techniques can compress typical images by a factor ranging from 10 to 50 with acceptable quality [4]. The Joint Photographic Experts Group (JPEG) image standard [5], known as the most widely used transform-coding based algorithm, performs well at moderate compression ratios. Recently, the wavelet-based multiresolution representation has received a lot of attention in compression applications, as manifested in the JPEG2000 standard [6, 7]. Many wavelet-based image coding algorithms, such as the embedded zero-tree wavelets (EZW) [8], set partitioning in hierarchical trees (SPIHT) [9], morphological representation of wavelet data (MRWD) [10], group testing for wavelets (GTW) [11], and the set-partitioning embedded block coder (SPECK) [12, 13], have been proposed with great success. In the wavelet domain, the higher detailed components of an image are projected onto the shorter basis functions with higher spatial resolutions, and the lower detailed components are projected onto the larger basis functions with narrower bandwidths; this matches the characteristics of the human visual system [14].

Received July 8, 2008; revised November 20, 2008; accepted February 12, 2009.
Communicated by Liang-Gee Chen.

SPECK exploits the well-defined hierarchical structure of the wavelet transform, together with the energy clustering within high-frequency subbands, so that the significant wavelet transform coefficients of an image can be coded as early as possible. SPECK has been incorporated into the verification model of JPEG 2000 in a form known as subband hierarchical block partitioning (SBHP) [15]. Another variant of SPECK, the embedded zero block coding (EZBC) [16], is much more complicated; it combines SPECK with a context-based adaptive arithmetic coder to improve the compression performance.

According to Shannon's theory [17, 18], vector quantization (VQ) can significantly reduce the number of bits needed to code a signal, compared with scalar quantization. Hence, VQ plays an important role in many applications, e.g., speech recognition, volume rendering, and image compression. Gupta et al. utilized VQ to compress multispectral satellite images [19]. Su et al. developed a hybrid coding system using SPIHT and VQ for image compression [20]. Abdel-Galil et al. applied VQ to power systems for classifying power quality disturbances [21]. When the code vector and codebook sizes become large enough, the distortion of the vector quantizer approaches the lower bound of the distortion-rate relation. However, both the computational complexity and the memory requirement of the vector quantizer increase exponentially. Hence, an unconstrained full-search vector quantizer usually uses small vectors. To reduce the computational complexity and memory requirements of VQ, several variants of the original VQ have been proposed in the literature, such as residual vector quantization (RVQ) [22, 23, 25], hierarchical VQ [24], and tree-structured VQ (TSVQ) [18]. Each VQ variant makes a compromise between computational complexity and performance.

RVQ, or multistage VQ [25], is a VQ variant with lower computational complexity. Because the decoder of an RVQ is constrained by a direct-sum codebook structure and the encoder typically uses a suboptimal stage-sequential search procedure, RVQ suffers some performance degradation relative to unconstrained VQ. To code high-frequency wavelet coefficients with energy clustering efficiently, while balancing the complexity and performance of the image coder, a hybrid coder using SPECK and RVQ is proposed for image compression. Specifically, the significant high-frequency wavelet coefficients of an image are coded as coefficient vectors, which can be efficiently located by using the significance coding procedure of SPECK. Recently, Chao et al. proposed a vector SPECK algorithm for gray-level still image compression [27], a variation on SPECK that uses VQ to code the significant coefficients. They used a very sophisticated VQ method to improve compression efficiency at the cost of added complexity. The hybrid method proposed in this paper and the vector SPECK method were developed independently and differ in many implementation details, although both methods involve SPECK and VQ and both perform well.



The remainder of this paper proceeds as follows. Section 2 describes the proposed hybrid image coder, which combines SPECK and RVQ. Experimental results are given in Section 3, and conclusions are given in Section 4.

2. THE PROPOSED HYBRID IMAGE COMPRESSION METHOD

The SPECK algorithm, proposed by Pearlman et al., is a simple, efficient image coder with coding scalability. By recursively partitioning a significant block of a transformed image, SPECK locates the significant coefficients in the block and performs scalar quantization on these coefficients to generate a coded bit-stream. Since vector quantization is more efficient than scalar quantization according to Shannon's rate-distortion theory, we were motivated to develop a hybrid image coder that combines SPECK with VQ. To reduce the computational complexity, RVQ was selected to be combined with SPECK in the proposed hybrid codec, and experimental results show that the proposed hybrid method is efficient in image compression. Subsection 2.1 discusses the application to still gray-level images, and subsection 2.2 presents an application of the proposed hybrid method to chroma subsampling images. Subsection 2.3 discusses the computational complexity and memory requirement of the proposed hybrid method in gray-level image compression.

2.1 Application for Gray-Level Still Images

A hybrid image coding system combining SPECK with RVQ is proposed to improve the compression performance, and Fig. 1 shows its block diagram, which can be used directly for still gray-level image compression. In the first block, an input gray-level image is transformed by a 2D discrete wavelet transform (DWT) to generate its transformed image for further processing. For example, Fig. 2 shows the result of a 4-decomposition-level 2D DWT. The coefficients of the transformed image are classified into two parts. One is the LL subband, which contains the scaling coefficients, and the other is the set of high-frequency subbands, which contains all the transformed coefficients except those in the LL subband. The scaling coefficients represent the lowest frequency component of an image, and they can be coded efficiently by using the original (scalar) SPECK algorithm. The wavelet coefficients in the high-frequency subbands, on the other hand, are coded by using SPECK with RVQ. Finally, the coded bit-stream is obtained by multiplexing the two outputs.

[Figure 1: block diagram. Input image → DWT; LL subband → Scalar SPECK; H, D, V subbands → SPECK with RVQ; both outputs → Mux → coded bit-stream.]

Fig. 1. The proposed hybrid image coder.




Fig. 2. The partition and assignment of a 4-decomposition-level transformed image.

For the quantization of the coefficients in the LL subband, the original SPECK starts from the most significant bit plane $n_{\max}$, where

$$n_{\max} = \left\lfloor \log_2 \left( \max_{c_{ij} \in LL} |c_{ij}| \right) \right\rfloor, \tag{1}$$

and $c_{ij}$ represents a coefficient in the LL subband. The scaling coefficients whose magnitudes are greater than or equal to $2^{n_{\max}}$ are located in the first pass. Then the coefficients whose magnitudes are in the interval $[2^{n_{\max}-1}, 2^{n_{\max}})$ are located in the second pass, and the procedure goes on until all the coefficients are located or the bit budget is exhausted. In the proposed hybrid method, all the coefficients in the LL subband are normalized by the absolute value of the coefficient with the largest magnitude before sorting. Hence, the normalized scaling coefficients with magnitudes in $[2^{-1}, 2^{0})$ are located in the first pass. The coefficients with magnitudes in the interval $[2^{-2}, 2^{-1})$ are located in the second pass, and the procedure goes on until all the coefficients are located or the bit budget is exhausted.
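The bit-plane bookkeeping described above can be illustrated with a minimal Python sketch. The function names `n_max` and `first_pass_index` are hypothetical, and the sketch assumes that the largest coefficient, whose normalized magnitude is exactly 1, is located in the first pass; the text does not state how that boundary case is handled.

```python
import math

def n_max(coeffs):
    """Most significant bit plane of Eq. (1): floor(log2(max |c|))."""
    peak = max(abs(c) for c in coeffs)
    if peak == 0:
        raise ValueError("all coefficients are zero")
    # frexp gives peak = f * 2**e with 0.5 <= f < 1, so floor(log2(peak)) = e - 1.
    _, exponent = math.frexp(peak)
    return exponent - 1

def first_pass_index(coeffs):
    """Pass (0 = first) in which each normalized coefficient is located.

    Pass k covers normalized magnitudes in [2**-(k+1), 2**-k).
    A zero coefficient is never located and maps to None.
    """
    peak = max(abs(c) for c in coeffs)
    passes = []
    for c in coeffs:
        m = abs(c) / peak
        if m == 0:
            passes.append(None)
            continue
        _, exponent = math.frexp(m)      # m lies in [2**(e-1), 2**e)
        passes.append(max(0, -exponent))  # assumption: m == 1 joins pass 0
    return passes

if __name__ == "__main__":
    ll = [100.0, -37.5, 12.0, 3.0, 0.0]
    print(n_max(ll), first_pass_index(ll))
```

Working with the binary exponent from `math.frexp` avoids rounding problems that a floating-point logarithm can introduce at exact powers of two.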

The coefficients in the high-frequency subbands are classified into three categories, H (horizontal), D (diagonal), and V (vertical), as shown in Fig. 2. All the coefficients in the high-frequency subbands are partitioned into 2 × 2 blocks, and each 2 × 2 block forms a corresponding 4D vector (Fig. 3). The three types of vectors in H, D, and V are normalized by the maximum L2 norm of their respective category, such that the L2 norm of each vector is less than or equal to one. If the L2 norm of a vector is greater than or equal to the threshold, then the vector is significant and the block or subband containing this vector is also significant. Because the vectors are normalized, the thresholds for a 7-stage RVQ are $2^{-1}$, $2^{-2}$, $2^{-3}$, $2^{-4}$, $2^{-5}$, $2^{-6}$, and 0. For a 4-decomposition-level transformed image, in the initialization step, the H, D, and V blocks in the 4th decomposition level (i.e., at the top-left corner in Fig. 2) form the S set (significant set), and the other H, D, and V subbands form the I set (insignificant set) in the RVQ-SPECK block of the proposed hybrid method (Fig. 1). The sorting pass of the significant vectors of the high-frequency subbands and the definitions of S and I in the proposed hybrid method are identical to those of the scalar SPECK, except that the significance path of the SPECK with RVQ ends when the block is 2 × 2.

[Figure 3: the 2 × 2 coefficient block with first row $c_{i,j}$, $c_{i,j+1}$ and second row $c_{i+1,j}$, $c_{i+1,j+1}$ maps to the 4D vector $[c_{i,j}, c_{i,j+1}, c_{i+1,j}, c_{i+1,j+1}]$.]

Fig. 3. A 2 × 2 coefficient block and its corresponding 4D vector.

Fig. 4. The signal flow diagram of a p-stage RVQ.

Generally speaking, full-search VQ performs better than RVQ, but RVQ was selected for the proposed hybrid method because of its low complexity and acceptable performance. The signal flow diagram of a p-stage RVQ is shown in Fig. 4, where $x_i$ ($1 \le i \le p$) is the input vector of the ith VQ stage in the p-stage RVQ, and $\hat{x}_i$ is the code vector that has the smallest distance to $x_i$. The residual $x_{i+1} = x_i - \hat{x}_i$ is the input vector of the (i + 1)th VQ stage in the RVQ system. Because the characteristics of the H, D, and V subbands are different, three RVQs are used, one each for the H, D, and V subbands, in the RVQ-SPECK part of the proposed hybrid method.
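A minimal NumPy sketch of the steps just described may help: forming 4D vectors from 2 × 2 blocks, normalizing by the maximum L2 norm, testing significance, and running the stage-sequential residual search of Fig. 4. All function names are hypothetical, the codebooks are assumed to be given, and the sketch does not model how the thresholds are coupled to the RVQ stages, which the text leaves to the SPECK sorting procedure.

```python
import numpy as np

def blocks_to_vectors(subband):
    """Split an even-sized subband into 2x2 blocks, one 4D row per block,
    ordered [c(i,j), c(i,j+1), c(i+1,j), c(i+1,j+1)] as in Fig. 3."""
    h, w = subband.shape
    blocks = subband.reshape(h // 2, 2, w // 2, 2).swapaxes(1, 2)
    return blocks.reshape(-1, 4)

def normalize_vectors(vectors):
    """Scale so the largest L2 norm is 1; also return the base value."""
    base = float(np.linalg.norm(vectors, axis=1).max())
    return vectors / base, base

def is_significant(vector, threshold):
    """A vector is significant when its L2 norm reaches the threshold."""
    return float(np.linalg.norm(vector)) >= threshold

def rvq_encode(x, codebooks):
    """Stage-sequential search: choose the nearest codeword in each stage
    and pass the residual x_{i+1} = x_i - xhat_i to the next stage."""
    residual = np.asarray(x, dtype=float)
    indices = []
    for codebook in codebooks:
        distances = np.sum((codebook - residual) ** 2, axis=1)
        k = int(np.argmin(distances))
        indices.append(k)
        residual = residual - codebook[k]
    return indices

def rvq_decode(indices, codebooks):
    """Direct-sum reconstruction: add the selected codeword of every stage."""
    return np.sum([cb[k] for cb, k in zip(codebooks, indices)], axis=0)
```

The decoder only sums table entries, which is consistent with the low decoding complexity claimed for the method.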

Since the information in the lowest frequency subband LL of an image is usually more important than that in the high-frequency subbands, the bit-plane resolution of the scalar SPECK is set higher than that of the RVQ-SPECK. Thus, the transmission rate of the scalar SPECK is usually faster than that of the RVQ-SPECK. Based on the simulation results, the transmission rate of the scalar SPECK is set empirically to twice that of the RVQ-SPECK, i.e., one pass of the proposed hybrid method includes two SPECK passes and one RVQ-SPECK pass. Finally, the output coded bit-stream contains the overhead, the binary output of SPECK, and the binary output of RVQ-SPECK, arranged as shown in Fig. 5.
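The two-to-one interleaving can be written down as a small scheduling sketch. The function name `hybrid_schedule` is hypothetical, and the pairing of each RVQ-SPECK threshold with two consecutive LL bit-plane thresholds is an assumption inferred from the description above, not a detail confirmed by the text.

```python
def hybrid_schedule(rvq_thresholds):
    """List the passes of the hybrid coder in transmission order.

    Each cycle holds two scalar SPECK passes on the normalized LL subband
    followed by one RVQ-SPECK pass on the H, D, and V vectors.
    """
    schedule = []
    for cycle, rvq_threshold in enumerate(rvq_thresholds):
        # Assumed: LL thresholds halve on every scalar pass, starting at 2**-1.
        schedule.append(("SPECK", 2.0 ** -(2 * cycle + 1)))
        schedule.append(("SPECK", 2.0 ** -(2 * cycle + 2)))
        schedule.append(("RVQ-SPECK", rvq_threshold))
    return schedule

if __name__ == "__main__":
    seven_stage = [2.0 ** -k for k in range(1, 7)] + [0.0]
    for name, threshold in hybrid_schedule(seven_stage):
        print(f"{name:10s} threshold {threshold}")
```

Because the passes are emitted in a fixed order, truncating the stream after any pass still leaves a decodable prefix, which is what the embedded property relies on.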

Fig. 5. The coded bit-stream of the proposed hybrid method.




The decoder of the proposed method can be implemented by simply reversing the processing steps of the encoder. Apart from the overhead, the bits in the compressed file are ordered by importance, so the proposed method is embedded. The proposed encoder (decoder) can terminate the coding (decoding) process at any point, so it achieves exact bit-rate control, which is an important requirement of modern codecs.

The compression performance can be improved by applying arithmetic coding after SPECK, at the cost of increased computational complexity. As an example, the PSNR value of the decoded 512 × 512 Lena image can be improved by 0.22 dB at a compression rate of 1 bpp by using SPECK with arithmetic coding [26]. For system simplicity, arithmetic coding is not performed in our experiments.

2.2 Application for Chroma Subsampling Images

Chroma subsampling balances efficiency and quality in sampling, and a similar approach is used in picture formats to save bandwidth (memory) while maintaining good quality. CIF (Common Intermediate Format) and QCIF (Quarter CIF) are two such formats in H.261. For CIF, the size of the luminance plane is 352 × 288, and the size of each of the two chrominance planes is 176 × 144. Each chrominance plane contains only one quarter of the data (pixels) of the luminance plane, since the human eye is less sensitive to chrominance information than to luminance information. The image sequence format of MPEG-4 is CIF or 4:2:0, and we discuss below how to use the proposed hybrid method for the compression of the popular YUV 4:2:0 images.

Fig. 6 shows the block diagram of the application of the proposed hybrid method to chroma subsampling images. First, the Y, U, and V planes are each transformed by a 2D discrete wavelet transform. Then each transformed plane is processed like the transformed image in the still gray-level image case. The transformed coefficients of each plane are partitioned into LL, H, D, and V subbands. The scaling coefficients of each LL subband, which are processed by color SPECK (CSPECK) [12, 13], are normalized by the maximum amplitude of the coefficients in that LL subband. Three further positive base values, determined from the L2 norms of the 4D vectors in the H, V, and D subbands, respectively, are used to normalize the vectors in their corresponding subbands such that the L2 norm of each normalized 4D vector is not greater than 1. Hence, 12 base values (four for each of the three planes) have to be stored and transmitted to the decoder of the proposed method. After normalization and coefficient classification, the three LL subbands of the transformed Y, U, and V planes are processed by CSPECK, and the other coefficients (i.e., the H, D, and V subbands) of the transformed Y, U, and V planes are coded by CSPECK with RVQ.

As in the monochrome application, since the scaling coefficients in the LL subbands contain more important information than the coefficients in the H, D, or V subbands, one quantization cycle of the proposed hybrid method includes two CSPECK quantization passes of the LL subbands and one RVQ-CSPECK quantization pass of the H, D, and V subbands (Fig. 7).




Fig. 6. The block diagram of the application of the proposed method for chroma subsampling images.

Fig. 7. One quantization cycle of the proposed hybrid method, which includes two CSPECK quantization passes and one RVQ-CSPECK quantization pass.




2.3 Memory Requirement and Computational Complexity of the Proposed Hybrid Method

This subsection discusses the memory requirement and computational complexity of the proposed hybrid codec, assuming that the coefficients in the H, D, and V subbands far outnumber the scaling coefficients in the LL subband. Under this assumption, the RVQ-SPECK part dominates, and the computational complexity of the proposed hybrid method can be approximated by that of the RVQ-SPECK part.

Regarding memory, the proposed hybrid method needs extra memory for storing codebooks and parameters. Assume that three p-stage RVQs are used and that the codebook sizes of the H, D, and V subbands for the ith VQ stage are the same and equal to $m_i$ words. Then all the codebooks need $3\sum_{i=1}^{p} m_i$ words. The proposed hybrid method also needs memory to store the 12 base values for normalization and the threshold information. On the other hand, the proposed hybrid method has an advantage over the scalar SPECK in the lengths of the list of significant vectors and the list of insignificant sets. Since each vector in RVQ-SPECK contains 4 coefficients, the length of an RVQ-SPECK list is about one fourth of the length of the corresponding list in SPECK (e.g., LSP and LIP).
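As a worked instance of the codebook count, the stage sizes reported later in Section 3 for monochrome images (64 codewords in the first stage, 32 in each remaining stage) can be substituted into the sum. Reading "words" as codewords, each a 4D vector, is an assumption here.

```latex
% 7-stage RVQs (512 x 512 images): m_1 = 64, m_2 = ... = m_7 = 32
3\sum_{i=1}^{7} m_i = 3\,(64 + 6 \cdot 32) = 3 \cdot 256 = 768 \text{ codewords}

% 10-stage RVQs (256 x 256 images): m_1 = 64, m_2 = ... = m_{10} = 32
3\sum_{i=1}^{10} m_i = 3\,(64 + 9 \cdot 32) = 3 \cdot 352 = 1056 \text{ codewords}
```

With four scalar entries per codeword, these correspond to 3072 and 4224 stored values, respectively, under the same assumption.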

Regarding the computational complexity of encoding, the proposed hybrid method uses L2 norms to test the significance of the vectors at each stage, whereas SPECK uses a 1-bit comparison to test significance for each bit plane. Hence, a single significance test in the proposed hybrid method is several times more expensive than a significance test in SPECK. Although each test is more expensive, the proposed hybrid method has the advantage that its total number of significance tests is smaller than that of SPECK. If an N × N gray-level test image with $n_{\max} = 11$ is coded by SPECK and by the proposed hybrid method with p-stage RVQs, then the ratio of the number of significance tests in SPECK to that in the proposed hybrid method can be estimated by

$$\frac{N \times N \times n_{\max}}{(N/2) \times (N/2) \times p} = \frac{4\,n_{\max}}{p}. \tag{2}$$

For a 512 × 512 gray-level test image with $n_{\max} = 12$ and p = 7, the significance-test ratio of Eq. (2) is 6.9. Hence, for this example, the SPECK encoder needs 6.9 times as many significance tests as the proposed hybrid encoder. Although SPECK can use a simple bitwise operation for the significance test, it suffers from the growth of the number of significance tests for large images. Both SPECK and the proposed hybrid method use the same algorithm to locate significant coefficients, but the proposed method usually has shorter significance paths because it uses a 4D vector instead of a single pixel (coefficient). When the proposed hybrid method has located a significant vector, SPECK needs one more quadtree partition and 4 more significance tests to complete the significance path.
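The estimate of Eq. (2) is easy to check numerically. The following sketch uses a hypothetical function name and simply counts one test per coefficient per bit plane for SPECK and one test per 2 × 2 vector per RVQ stage for the hybrid method, which is the counting model the equation implies.

```python
def significance_test_ratio(n, n_max, p):
    """Ratio of SPECK significance tests to hybrid-method tests, Eq. (2).

    n     -- image side length (N x N image, N even)
    n_max -- number of bit planes counted for SPECK
    p     -- number of RVQ stages in the hybrid method
    """
    speck_tests = n * n * n_max
    hybrid_tests = (n // 2) * (n // 2) * p
    return speck_tests / hybrid_tests

if __name__ == "__main__":
    ratio = significance_test_ratio(512, 12, 7)
    print(f"{ratio:.1f}")  # 4 * 12 / 7, about 6.9
```

The image size cancels out, so the ratio depends only on the number of bit planes and the number of RVQ stages.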

In decoding, no significance test is needed for either SPECK or the proposed hybrid method, so the computational complexity of both methods is greatly reduced. The proposed hybrid decoder is implemented with simple look-up tables, and the total number of significant vectors is about one quarter of the number of significant pixels in SPECK. The actual computational complexity depends on the image characteristics, the codebook sizes, the bit allocation of codewords, and so on. From the above discussion, the proposed hybrid decoder is as efficient as SPECK, which is consistent with our experimental results.

To summarize, the proposed hybrid method is suitable for asymmetric-complexity applications, in which images can be encoded off-line but must be decoded fast.

3. EXPERIMENTAL RESULTS

In this section, two applications of the proposed hybrid compression method are presented. The first application, in subsection 3.1, is gray-level still image compression, and the second, in subsection 3.2, is the compression of chroma subsampling images. The simulation platform is an IBM PC running Windows XP, and SPECK, SPIHT, and the proposed hybrid method are implemented in Matlab.

Linear-phase biorthogonal wavelet filters with 9/7 coefficients are used in this paper. The number of wavelet decomposition levels in our experiments is 4. Fig. 2 shows the classification of a 4-level transformed image, whose coefficients are classified into four types: LL, H, V, and D. The lowest frequency coefficients in subband LL are normalized such that their magnitudes are in the range [0, 1), and these coefficients are coded by using the scalar SPECK. The wavelet coefficients in subbands of types H, V, and D are coded by using SPECK with RVQ. For the coefficient vectors of the H, D, and V subbands, we empirically choose the number of RVQ stages as 10 and 7 for 256 × 256 and 512 × 512 test images, respectively. Because the characteristics of the H, V, and D subbands are different, each category has its own codebooks. Therefore, 30 and 21 codebooks are trained by using the K-means algorithm for 256 × 256 and 512 × 512 monochrome images, respectively. The codebook size of the first RVQ stage is 64 words, and that of each of the other RVQ stages is 32 words. Each codeword is a 4D vector in $\mathbb{R}^4$.
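To make the codebook training step concrete, here is a minimal NumPy sketch that trains one RVQ with a plain Lloyd-style K-means per stage. The text only states that K-means was used; the stage-by-stage training on residuals shown here is the conventional RVQ recipe and is an assumption about the authors' procedure, as are the function names, the iteration count, and the random initialization.

```python
import numpy as np

def nearest(data, centers):
    """Index of the closest center (squared L2 distance) for every row."""
    d = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(data, k, iters=20, seed=0):
    """Plain Lloyd iterations, initialized from k distinct training vectors."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = nearest(data, centers)
        for j in range(k):
            members = data[labels == j]
            if len(members):            # keep the old center if a cell is empty
                centers[j] = members.mean(axis=0)
    return centers

def train_rvq(vectors, sizes, iters=20, seed=0):
    """Train one codebook per stage; each stage sees the previous residuals."""
    residual = np.asarray(vectors, dtype=float)
    codebooks = []
    for stage, k in enumerate(sizes):
        codebook = kmeans(residual, k, iters, seed + stage)
        codebooks.append(codebook)
        residual = residual - codebook[nearest(residual, codebook)]
    return codebooks, residual

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    training = rng.normal(size=(2000, 4))       # stand-in for 4D wavelet vectors
    books, res = train_rvq(training, [64] + [32] * 6)
    print(len(books), float((res ** 2).mean()))
```

In the setting of the paper, one such set of codebooks would be trained for each of the H, D, and V categories, giving 21 codebooks for the 7-stage case.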

3.1 Gray-Level Still Image Compression

In this subsection, we compare the proposed hybrid method with the SPECK and SPIHT image codecs by encoding and decoding some test images (Fig. 8). Both 256 × 256 and 512 × 512 test images are used. The proposed hybrid method and SPECK are compared on 256 × 256 test images first, and then the three methods (including SPIHT) are compared on 512 × 512 test images. The compression rate is measured in bits per pixel (bpp), and the peak signal-to-noise ratio (PSNR), measured in dB, is used to evaluate the decoded image quality. For the simulation of 256 × 256 monochrome images, 41 images, which do not include the three test images, are used to train the codebooks of the RVQs of the proposed method.
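The two evaluation measures can be sketched as follows. The PSNR below uses the standard definition for 8-bit images with a peak value of 255; the text does not spell out its formula, so this definition and the function names are assumptions.

```python
import numpy as np

def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    original = np.asarray(original, dtype=float)
    decoded = np.asarray(decoded, dtype=float)
    mse = np.mean((original - decoded) ** 2)
    if mse == 0:
        return float("inf")             # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))

def bits_per_pixel(num_bits, height, width):
    """Compression rate in bpp for a gray-level image."""
    return num_bits / (height * width)

if __name__ == "__main__":
    a = np.zeros((4, 4))
    b = np.ones((4, 4))                 # every pixel off by one gray level
    print(round(psnr(a, b), 2), bits_per_pixel(65536, 256, 256))
```

For a 256 × 256 image, a rate of 1 bpp therefore corresponds to a budget of 65,536 bits.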

(a) Lena. (b) Barbara. (c) Goldhill.

Fig. 8. Three 8-bit gray-level 256 × 256 test images.

Table 1. Simulation results of 256 × 256 test images (PSNR in dB).

            Lena               Barbara            Goldhill
bpp      SPECK  Proposed    SPECK  Proposed    SPECK  Proposed
1.5      40.89  41.68       39.48  39.59       33.61  34.34
1.4      40.51  40.82       39.02  39.21       33.22  33.92
1.3      40.09  40.07       38.56  38.79       32.85  33.53
1.2      39.58  39.68       37.60  38.38       32.48  33.19
1.1      39.05  39.19       36.21  37.44       32.10  32.64
1.0      36.96  38.63       35.71  36.41       31.48  31.72
0.9      36.41  37.24       35.20  35.46       30.56  30.97
0.8      35.78  36.02       34.64  34.87       30.10  30.43
0.7      35.08  35.34       34.00  34.31       29.61  29.98
0.6      33.74  34.55       32.37  33.59       29.09  29.48
0.5      32.39  32.85       31.65  31.90       28.54  29.00
0.4      31.43  31.56       30.80  30.95       27.44  27.76
0.3      29.33  30.44       29.79  30.09       26.74  26.89
0.25     28.72  28.97       28.59  28.93       26.29  26.49
0.2      28.01  28.14       27.99  28.33       25.77  26.03
0.125    25.89  26.58       25.84  26.84       24.46  24.92
0.1      25.25  25.44       25.22  25.89       24.11  24.49

Fig. 9. The experimental results of the 256 × 256 gray-level image Lena (PSNR in dB versus bpp; curves: Proposed, SPECK).

Fig. 10. The experimental results of the 256 × 256 gray-level image Barbara.

Fig. 11. The experimental results of the 256 × 256 gray-level image Goldhill.

Fig. 12. The average improvements of the proposed hybrid coder compared with the original SPECK on more 256 × 256 test images.

Table 1 shows the simulation results for the 256 × 256 test images, and Figs. 9-11 show the PSNR-bpp curves for the three test images, where the horizontal and vertical axes are the compression rate in bpp and the PSNR in dB, respectively. For the 256 × 256 monochrome image Lena, the proposed hybrid coder outperforms the SPECK coder by 1.67 dB at 1.0 bpp and by 0.48 dB on average from 0.1 bpp to 1.5 bpp. For the 256 × 256 monochrome image Barbara, the proposed hybrid coder outperforms the SPECK coder by 1.23 dB at 1.1 bpp and by 0.49 dB on average. For the third 256 × 256 gray-level image, Goldhill, the proposed hybrid coder outperforms the SPECK coder by 0.73 dB at 1.5 bpp and by 0.43 dB on average. The experimental results for more test images obtained from the USC (University of Southern California) image database are shown in Fig. 12, in which the curve denotes the average improvement of the proposed hybrid coder over the pure SPECK coder. These results show that the proposed hybrid coder is preferable to the SPECK coder in terms of the PSNR-bpp curves.
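The gains quoted for the 256 × 256 images can be recomputed directly from the Table 1 values. The sketch below copies the table rows and takes the unweighted mean over the 17 listed rates; that averaging rule and the helper names are assumptions, although the unweighted mean does reproduce the quoted averages.

```python
# Rows of Table 1: (bpp, Lena SPECK, Lena proposed, Barbara SPECK,
# Barbara proposed, Goldhill SPECK, Goldhill proposed), PSNR in dB.
TABLE_1 = [
    (1.5, 40.89, 41.68, 39.48, 39.59, 33.61, 34.34),
    (1.4, 40.51, 40.82, 39.02, 39.21, 33.22, 33.92),
    (1.3, 40.09, 40.07, 38.56, 38.79, 32.85, 33.53),
    (1.2, 39.58, 39.68, 37.60, 38.38, 32.48, 33.19),
    (1.1, 39.05, 39.19, 36.21, 37.44, 32.10, 32.64),
    (1.0, 36.96, 38.63, 35.71, 36.41, 31.48, 31.72),
    (0.9, 36.41, 37.24, 35.20, 35.46, 30.56, 30.97),
    (0.8, 35.78, 36.02, 34.64, 34.87, 30.10, 30.43),
    (0.7, 35.08, 35.34, 34.00, 34.31, 29.61, 29.98),
    (0.6, 33.74, 34.55, 32.37, 33.59, 29.09, 29.48),
    (0.5, 32.39, 32.85, 31.65, 31.90, 28.54, 29.00),
    (0.4, 31.43, 31.56, 30.80, 30.95, 27.44, 27.76),
    (0.3, 29.33, 30.44, 29.79, 30.09, 26.74, 26.89),
    (0.25, 28.72, 28.97, 28.59, 28.93, 26.29, 26.49),
    (0.2, 28.01, 28.14, 27.99, 28.33, 25.77, 26.03),
    (0.125, 25.89, 26.58, 25.84, 26.84, 24.46, 24.92),
    (0.1, 25.25, 25.44, 25.22, 25.89, 24.11, 24.49),
]

def mean_gain(rows, speck_col, proposed_col):
    """Unweighted mean PSNR gain of the proposed coder over SPECK, in dB."""
    gains = [row[proposed_col] - row[speck_col] for row in rows]
    return sum(gains) / len(gains)

def gain_at(rows, bpp, speck_col, proposed_col):
    """PSNR gain at one tabulated bit rate."""
    for row in rows:
        if row[0] == bpp:
            return row[proposed_col] - row[speck_col]
    raise KeyError(bpp)

if __name__ == "__main__":
    for name, col in (("Lena", 1), ("Barbara", 3), ("Goldhill", 5)):
        print(f"{name}: mean gain {mean_gain(TABLE_1, col, col + 1):.2f} dB")
```

The per-rate gains vary considerably, from about zero (Lena at 1.3 bpp) to more than 1 dB, so the averages summarize a fairly uneven curve.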

For the experiments on 512 × 512 gray-level still images, the proposed hybrid method, SPECK, and SPIHT (with arithmetic coding) are simulated and compared with each other. SPIHT is selected for comparison because it is a widely used wavelet-based method with very good performance. A set of codebooks was trained by the K-means method on 8 training images downloaded from the USC image database. The number of stages of each RVQ in the proposed hybrid method was empirically reduced to 7, since using fewer stages in an RVQ usually gives better performance (higher PSNR values) at low bit rates. The vectors used for 512 × 512 images are also 4D vectors (Fig. 3) in the vector space $\mathbb{R}^4$. The 7 thresholds of the 3 RVQs in the proposed hybrid method are $2^{-1}$, $2^{-2}$, $2^{-3}$, $2^{-4}$, $2^{-5}$, $2^{-6}$, and 0. Table 2 shows the simulation results of the proposed hybrid method, SPECK, and SPIHT (with arithmetic coding) on 512 × 512 test images. SPECK and SPIHT are two state-of-the-art techniques, and which one performs better usually depends on the image characteristics. According to the results in Table 2, although we cannot guarantee that the proposed hybrid method always has the best performance, it improves on the SPECK codec for most images, especially at low bit rates. Three 0.25-bpp decoded images of SPECK, the proposed method, and SPIHT are shown in Fig. 13, and it is difficult to see differences among these images at a glance. By carefully inspecting the reconstructed images in Fig. 13, we found that the image of the SPECK codec is smoother than the others and that the proposed hybrid codec preserves more small details of the original images.

Table 2. Simulation results for 512 × 512 test images (PSNR in dB).

            Lena                       Barbara                    Goldhill
bpp      SPECK  Proposed  SPIHT     SPECK  Proposed  SPIHT     SPECK  Proposed  SPIHT
1.0      40.44  40.29     39.89     35.23  36.18     36.77     34.89  35.42     35.82
0.9      39.99  40.03     39.39     34.67  34.83     35.96     34.46  35.00     35.31
0.8      39.54  39.74     38.69     34.00  34.32     35.01     33.99  34.52     34.78
0.7      38.89  39.40     38.14     32.85  33.75     33.88     33.49  34.06     34.15
0.6      37.46  39.01     37.53     31.40  33.00     32.72     32.72  33.39     33.36
0.5      36.87  37.38     36.78     30.62  31.02     31.63     31.71  32.32     32.55
0.4      36.03  36.62     35.82     29.68  30.10     30.33     31.03  31.59     31.69
0.3      34.07  35.61     34.42     28.00  28.95     28.54     30.18  30.72     30.79
0.25     33.46  34.16     33.65     27.30  27.98     27.60     29.61  30.20     30.15
0.2      32.64  33.29     32.71     26.49  26.92     26.66     28.69  29.26     29.39
0.1      29.48  30.31     29.82     23.99  24.75     24.37     27.03  27.62     27.63

Table 3. Experimental results of SPECK, JPEG2000, and vector SPECK from [27] (Lena, PSNR in dB).

Bit rate (bpp)   SPECK   JPEG2000   Vector SPECK
0.125            30.96   30.92      31.25
0.2              32.99   32.96      33.47
0.25             34.03   34.09      34.33

(a) (b) (c)

Fig. 13. Decoded images of (a) SPECK, (b) the proposed hybrid method, and (c) SPIHT at 0.25 bpp.

Chao et al. proposed a vector SPECK [27] for still gray-level image compression. Three types of VQ (full-search VQ, tree-structured VQ, and entropy-constrained VQ) were used together in their method, and the vector dimension and vector entries depend on the subband and quantization level in which the vector is located. A large number (1,500) of training images and the Lloyd splitting method are used for training the codebooks. Vector SPECK can outperform the JPEG2000 codec at low bit rates at the cost of added complexity, but it does not handle the lower bit planes for n = 3, 2, 1, and 0. Compared with vector SPECK, the proposed hybrid method features low complexity and a wide bit-rate range. Table 3 shows some experimental data from [27], where 5 decomposition levels, the 9/7 DWT, and arithmetic coding were used in SPECK. Since the conditions of Tables 2 and 3 are different, the results for the same method in Tables 2 and 3 are not equal. Hence, we only compare the difference between SPECK and the proposed hybrid method in Table 2 with the difference between SPECK and JPEG2000 (or the vector SPECK) in Table 3. For the Lena image at 0.25 bpp, the vector SPECK outperforms SPECK by 0.3 dB and JPEG2000 outperforms SPECK by 0.06 dB (Table 3), whereas the proposed hybrid method outperforms SPECK by 0.7 dB (Table 2). This indicates that the proposed hybrid method is very competitive and efficient.

3.2 Chroma Subsampling Image Compression

The goal of the simulation is to compare the performance of the proposed hybrid coder with that of the CSPECK coder for YUV 4:2:0 images. Based on the simulation results, we can choose a proper coder for applications with such a format, e.g., MPEG-4, PAL DV, DVCAM, HDV, JPEG/JFIF, H.261, VC-1, and MJPEG. The test images used in the simulation have a 256 × 256 Y (luminance) plane and 128 × 128 U and V (chrominance) planes. The 9/7-tap biorthogonal wavelet transform is performed on each plane separately, and the number of decomposition levels is four. For a CSPECK codec, the decoder needs to know the maximum number of binary bit planes ($n_{\max}$) used for coding the transformed image.

Excluding the threeLL subbands of Y, U, and V planes, the other coefficients (in H,

V, andD subbands of each Y, U, or V plane) of the transformed image are coded by usingthe CSPECK with three 10-stage RVQs. In our experiments, 55 color images are used to

generate 90 codebooks for the proposed codec, since the RVQs are 10-stage and there

are three YUV planes that each has three types (H, V, andD) of 4D coefficient vectors.

For the vectors inHsubbands of Y plane, 128 vectors are selected to be the basis vectors

for the vectors withL2 norms in [0.5, 1), and each of the other 9 codebooks of theHsub-

bands in Y plane has 64 codewords. The same basis vector arrangement as that used in the

Hsubband is used in the D and Vsubbands in Y plane. Because the human eye is less

sensitive to the chrominance information than to the luminance information, fewer basis

vectors are used in U plane or V plane. For the vectors inH,D, or Vsubbands of plane U

or plane V, 32 basis vectors are used in the highest ( i.e. 10th) stage of the RVQs, and

each of the other 9 stages has 16 basis vectors. All the codebooks are trained by using the

simple K-means method. The equivalent bit-per-pixel (ebpp) value defined in Eq. (3) isused for representing the compression rate for decoding a coded YUV 4:2:0 image:

$$\mathrm{ebpp} = \frac{\text{number of bits used}}{256^2 + 2 \times 128^2} \qquad (3)$$
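Eq. (3) can be checked numerically with a small helper. The function name `ebpp` and its default arguments are illustrative choices, not part of the original codec.

```python
def ebpp(bits_used: int, y_side: int = 256, c_side: int = 128) -> float:
    """Equivalent bits per pixel of a coded YUV 4:2:0 image, following Eq. (3)."""
    samples = y_side ** 2 + 2 * c_side ** 2  # one Y plane plus the U and V planes
    return bits_used / samples

print(ebpp(98_304))  # 1.0
print(ebpp(49_152))  # 0.5
```

The denominator is 65,536 + 2 × 16,384 = 98,304 samples, so a budget of 98,304 bits corresponds to exactly 1.0 ebpp.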

The 256 × 256 color test image Goldhill is used for simulation, and the curves in Figs. 14-16 show the simulation results. The test image is originally of size 256 × 256 in the R, G, and B planes (true color space), so it had to be preprocessed before simulation.

First, the test image was transformed to the YUV space. Then, the U and V planes were

downsampled to 128 × 128 pixels, where the downsampling method was to calculate the arithmetic mean of the adjacent four-point values. We compare the PSNR values of the

proposed hybrid method with those of the CSPECK coder in the Y, U, and V planes, respectively.

It can be seen that the PSNR values can be improved by 1.11 dB for the Y plane, 0.99 dB for the U plane, and 2.31 dB for the V plane at the bit budget of 98,304 bits (1.0

ebpp). For the same image, the average PSNR values (from 0.1 ebpp to 1.5 ebpp) of the

proposed method are higher than those of CSPECK by 0.66 dB, 1.21 dB, and 2.22 dB in

Y, U, and V planes, respectively.
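The chroma preprocessing and the per-plane quality measure described above can be sketched as follows. This is a minimal illustration that assumes the "adjacent four-point values" are non-overlapping 2 × 2 blocks and that samples are 8-bit (peak value 255). The function names are hypothetical, and the RGB-to-YUV conversion is omitted because the text does not state which conversion matrix was used.

```python
import math

def downsample_2x2_mean(plane):
    """Halve each dimension by averaging every non-overlapping 2x2 block."""
    rows, cols = len(plane), len(plane[0])
    assert rows % 2 == 0 and cols % 2 == 0, "even dimensions assumed"
    return [
        [
            (plane[r][c] + plane[r][c + 1]
             + plane[r + 1][c] + plane[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]

def psnr(original, decoded, peak=255.0):
    """PSNR in dB between two equally sized planes (peak=255 assumes 8 bits)."""
    n = len(original) * len(original[0])
    mse = sum(
        (a - b) ** 2
        for row_a, row_b in zip(original, decoded)
        for a, b in zip(row_a, row_b)
    ) / n
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

u_full = [[10, 20, 30, 40],
          [10, 20, 30, 40],
          [50, 50, 70, 90],
          [50, 50, 70, 90]]
print(downsample_2x2_mean(u_full))  # [[15.0, 35.0], [50.0, 80.0]]
```

In the experiment, `psnr` would be evaluated separately on the Y, U, and V planes of the decoded image, which is how the three curves are obtained.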




Fig. 14. The Y-plane experimental results of the chroma subsampling image Goldhill.

Fig. 15. The U-plane experimental results of the chroma subsampling image Goldhill.

Fig. 16. The V-plane experimental results of the chroma subsampling image Goldhill.

Based on the simulation results, it is clear that the proposed method achieves a superior improvement in the two chrominance planes (U and V), since the colors (chrominance

information) of the four neighbors in a 2 × 2 block are usually similar. On the other hand, the luminance values are more likely to change abruptly than the chrominance values

are, because of sharp edges and corners. Even so, the proposed method also achieves

good results in the Y plane. The major added cost of the proposed method is the need

to train codebooks and determine parameters before encoding. Since the most time-consuming

step, codebook design, can be done off-line and the codebook sizes of the RVQs are

small, the proposed hybrid method is efficient in time and bit budget.
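To make the remark about small codebooks concrete, the sketch below tallies the codebook sizes reported in this section. It assumes, purely for illustration, that every stage sends a fixed-length index and that all ten stages are used. The actual coder is embedded and may stop earlier. All names in the code are hypothetical.

```python
STAGES = 10
ORIENTATIONS = ("H", "V", "D")

# Codeword counts per subband orientation, as stated in the text: one larger
# codebook plus nine smaller ones. The order inside each list is not meaningful.
STAGE_SIZES = {
    "Y": [128] + [64] * 9,
    "U": [32] + [16] * 9,
    "V": [32] + [16] * 9,
}

def stored_codewords(sizes):
    """Codewords kept for one RVQ; a stage-by-stage search visits each once."""
    return sum(sizes)

def index_bits(sizes):
    """Bits per vector if every stage sends a fixed-length index (assumption)."""
    return sum(k.bit_length() - 1 for k in sizes)  # log2 of a power of two

total_codebooks = len(STAGE_SIZES) * len(ORIENTATIONS) * STAGES
total_codewords = len(ORIENTATIONS) * sum(
    stored_codewords(s) for s in STAGE_SIZES.values()
)

print(total_codebooks)  # 90
print(total_codewords)  # 3168
for plane, sizes in STAGE_SIZES.items():
    bits = index_bits(sizes)
    # A single-stage full-search VQ at the same index rate needs 2**bits codewords.
    print(plane, stored_codewords(sizes), bits, 2 ** bits)
```

Under these assumptions a Y-plane RVQ stores 704 four-dimensional codewords and addresses 61 index bits per vector, whereas a single-stage codebook at the same index rate would need 2^61 codewords.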

4. CONCLUSIONS

In this paper, we propose a hybrid image coder, which is based on SPECK and RVQ,

for still gray-level and chroma subsampling images. Experimental comparisons with SPIHT and SPECK

(two state-of-the-art algorithms) have shown that the proposed

hybrid method is efficient for image compression. Depending on the application of

interest, the proposed hybrid codec can be flexibly designed to improve its low




bit-rate or high bit-rate performance by using a short RVQ or a long RVQ. We have also

shown that the proposed hybrid method has superior performance for the chrominance

planes (i.e. U and V planes in YUV color space) in chroma subsampling image compression.

Because VQ is asymmetric (encoding requires a codebook search, whereas decoding is a table lookup), the proposed hybrid method is suitable

for those applications whose load is also asymmetric and is heavy on the decoding side

(e.g. the image archiving of an image database). Although the proposed hybrid codec is

asymmetric, using RVQ instead of full-search VQ makes the increased complexity

affordable and worthwhile.
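The encoder/decoder asymmetry can be seen in a generic multistage residual VQ, sketched below with toy codebooks. This is a plain greedy RVQ, not the bit-plane-coupled variant used inside the proposed coder, and the function names and codebook values are invented for illustration.

```python
def nearest(codebook, vector):
    """Index of the codeword with the smallest squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vector)),
    )

def rvq_encode(codebooks, vector):
    """Greedy encoding: quantize, subtract, and pass the residual onward."""
    residual = list(vector)
    indices = []
    for codebook in codebooks:
        i = nearest(codebook, residual)  # search cost grows with codebook size
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices

def rvq_decode(codebooks, indices):
    """Decoding is one table lookup per stage followed by a vector sum."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, i in zip(codebooks, indices):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out

# Two toy stages of 4D codewords (hypothetical values).
codebooks = [
    [[4.0, 4.0, 0.0, 0.0], [0.0, 0.0, 4.0, 4.0]],
    [[1.0, 0.0, 0.0, 0.0], [0.0, -1.0, 0.0, 0.0]],
]
indices = rvq_encode(codebooks, [4.0, 3.0, 0.0, 0.0])
print(indices)                         # [0, 1]
print(rvq_decode(codebooks, indices))  # [4.0, 3.0, 0.0, 0.0]
```

The encoder compares the residual against every codeword of each stage, so its cost scales with the sum of the stage sizes, while the decoder only adds the selected codewords.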

ACKNOWLEDGEMENTS

The authors would like to thank the anonymous reviewers for their comments that

significantly helped improve this paper.


Sheng-Fuu Lin was born in Taiwan, R.O.C., in

1954. He received the B.S. and M.S. degrees in Mathematics from

National Taiwan Normal University in 1976 and 1979, respectively,

the second M.S. degree in Computer Science from the

University of Maryland in 1985, and the Ph.D. degree in Electrical

Engineering from the University of Illinois, Champaign, in

1988. Since 1988, he has been on the faculty of the Department




of Electrical and Control Engineering at National Chiao Tung University, Hsinchu, Taiwan.

His research interests include fuzzy theory, automatic target recognition, scheduling,

image processing, and image recognition. Professor Lin is a member of the IEEE Control

Society, Chinese Fuzzy System Association, and Chinese Automatic Control Society.

Hsi-Chin Hsin received the M.S. and Ph.D. degrees

in Electrical Engineering from the University of Pittsburgh,

Pittsburgh, PA, in 1992 and 1995, respectively. He is a Professor

in the Department of Computer Science and Information Engineering

at National United University, Taiwan. His research interests

include wavelet transform, image processing, CORDIC,

DSP architectures and system on chip.

Chien-Kun Su was born in 1962. He received the

B.S. degree from National Taiwan University, Taiwan, in 1989,

the M.S. degree from the University of Southern California, U.S.A.,

in 1992, and the Ph.D. degree from National Chiao Tung University,

Taiwan, in 2008. He has been on the faculty of the Department

of Electrical Engineering at Chung Hua University,

Hsinchu, Taiwan since 1995. His research interests include image

processing and computer vision.