7/30/2019 Set Partitioning
1/17
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 26, 1011-1027 (2010)
Hybrid Image Compression Based on Set-Partitioning
Embedded Block Coder and Residual Vector Quantization
SHENG-FUU LIN, HSI-CHIN HSIN* AND CHIEN-KUN SU+
Department of Electrical and Control Engineering
National Chiao Tung University
Hsinchu, 300 Taiwan
*Department of Computer Science and Information Engineering
National United University
Miaoli, 360 Taiwan
+Department of Electrical Engineering
Chung Hua University
Hsinchu, 300 Taiwan
A hybrid image coding scheme based on the set-partitioning embedded block coder (SPECK) and residual vector quantization (RVQ) is proposed for image compression. The scaling coefficients of an image are coded with the original SPECK algorithm, and the wavelet coefficients are coded with SPECK combined with RVQ. Combining SPECK with RVQ for the high-frequency wavelet coefficients takes advantage of the energy clustering property of the wavelet transform. Experimental results show that, for gray-level still images, the proposed hybrid RVQ-SPECK coder outperforms SPECK; e.g., the peak signal-to-noise ratio (PSNR) values are improved by 1.67 dB and 0.69 dB at a compression rate of 1 bit per pixel for the 256 × 256 gray-level Lena and Barbara images, respectively. An application to chroma subsampling images is also presented, in which the proposed method usually outperforms the color SPECK method. The PSNR values are improved by 1.11 dB for the Y plane, 0.99 dB for the U plane, and 2.31 dB for the V plane at the bit budget of 81,920 bits for the test image Goldhill. In addition to high coding efficiency, the proposed method preserves the features of embeddedness, low decoding complexity, and exact bit-rate control.
Keywords: image compression, residual vector quantization (RVQ), set-partitioning embedded block coder (SPECK), chroma subsampling images, embeddedness
1. INTRODUCTION
Driven by the needs for high-quality images, fast transmission, and reduced storage space, image compression is increasingly in demand. Differential pulse code modulation, transform coding, subband coding, and many other image compression techniques have been developed [1-3]. State-of-the-art techniques can compress typical images by a factor ranging from 10 to 50 with acceptable quality [4]. The Joint Photographic Experts Group (JPEG) image standard [5], known as the most widely used transform-coding-based algorithm, performs well at moderate compression ratios. Recently, the wavelet-based multiresolution representation has received a lot of attention in compression applications, as manifested in the JPEG2000 standard [6, 7]. Many wavelet-based image coding algorithms, such as the embedded zero-tree wavelets (EZW) [8], set partitioning in
Received July 8, 2008; revised November 20, 2008; accepted February 12, 2009.
Communicated by Liang-Gee Chen.
hierarchical trees (SPIHT) [9], morphological representation of wavelet data (MRWD) [10], group testing for wavelets (GTW) [11], and the set-partitioning embedded block coder (SPECK) [12, 13], have been proposed with great success. In the wavelet domain, the higher detailed components of an image are projected onto the shorter basis functions with higher spatial resolutions, and the lower detailed components are projected onto the longer basis functions with narrower bandwidths; this matches the characteristics of the human visual system [14].
SPECK exploits the well-defined hierarchical structure of the wavelet transform and the energy clustering within high-frequency subbands, so that the significant wavelet transform coefficients of an image can be coded as early as possible. SPECK has been incorporated into the verification model of JPEG 2000 in the form known as subband hierarchical block partitioning (SBHP) [15]. Another variant of SPECK, embedded zero block coding (EZBC) [16], is much more complicated: it combines SPECK with a context-based adaptive arithmetic coder to improve the compression performance.
According to Shannon's theory [17, 18], vector quantization (VQ) can significantly reduce the number of bits needed to code a signal compared with scalar quantization. Hence, VQ plays an important role in many applications, e.g., speech recognition, volume rendering, and image compression. Gupta et al. utilized VQ to compress multispectral satellite images [19]. Su et al. developed a hybrid coding system using SPIHT and VQ for image compression [20]. Abdel-Galil et al. applied VQ to power systems for classifying power quality disturbances [21]. When the code vector and codebook sizes become large enough, the distortion of the vector quantizer approaches the lower bound of the distortion-rate relation. However, both the computational complexity and the memory requirement of the vector quantizer then increase exponentially. Hence, an unconstrained full-search vector quantizer usually uses small vectors. To reduce the computational complexity and memory requirements of VQ, several variants of the original VQ have been proposed in the literature, such as residual vector quantization (RVQ) [22, 23, 25], hierarchical VQ [24], and tree-structured VQ (TSVQ) [18]. Each VQ variant makes a compromise between computational complexity and performance.
RVQ, or multistage VQ [25], is a VQ variant with lower computational complexity. Because the decoder of an RVQ is constrained by a direct-sum codebook structure and the encoder typically uses a suboptimal stage-sequential search procedure, RVQ suffers some performance degradation. To code high-frequency wavelet coefficients with energy clustering efficiently while balancing the complexity and performance of the image coder, a hybrid coder using SPECK and RVQ is proposed for image compression. Specifically, the significant high-frequency wavelet coefficients of an image are coded as coefficient vectors, which can be efficiently located by the significance coding procedure of SPECK. Recently, Chao et al. proposed a vector SPECK algorithm for gray-level still image compression [27], a variation on SPECK that uses VQ to code the significant coefficients. They used a very sophisticated VQ method to improve compression efficiency at the cost of added complexity. The hybrid method proposed in this paper and the vector SPECK method were developed independently and differ in many implementation details, although both methods involve SPECK and VQ and perform well.
The remainder of this paper proceeds as follows. Section 2 describes the proposed
hybrid image coder which combines SPECK and RVQ. Experimental results are given in
section 3, and conclusions are given in section 4.
2. THE PROPOSED HYBRID IMAGE COMPRESSION METHOD
The SPECK algorithm, proposed by Pearlman et al., is a simple, efficient image coder with coding scalability. By recursively partitioning a significant block of a transformed image, SPECK locates the significant coefficients in the block and performs scalar quantization on them to generate a coded bit-stream. Since vector quantization is more efficient than scalar quantization according to Shannon's rate-distortion theory, we were motivated to develop a hybrid image coder that combines SPECK with VQ. To reduce the computational complexity, RVQ was selected to be combined with SPECK to constitute the proposed hybrid codec, and experimental results show that the proposed hybrid method is efficient in image compression. Subsection 2.1 discusses the application to gray-level still images, and subsection 2.2 presents an application of the proposed hybrid method to chroma subsampling images. Subsection 2.3 discusses the computational complexity and memory requirement of the proposed hybrid method in gray-level image compression.
2.1 Application for Gray-Level Still Images
A hybrid image coding system combining SPECK with RVQ is therefore proposed to improve the compression performance. Fig. 1 shows the block diagram, which can be used directly for gray-level still image compression. In the first block, an input gray-level image is transformed by the 2D discrete wavelet transform (DWT) to generate its transformed image for further processing. For example, Fig. 2 shows the result of a 4-decomposition-level 2D DWT. The coefficients of the transformed image are classified into two parts. One is the LL subband, which contains the scaling coefficients; the other is the set of high-frequency subbands, which includes all the coefficients of the transformed image except those in the LL subband. The scaling coefficients represent the lowest frequency component of an image, and they can be coded efficiently by the original (scalar) SPECK algorithm. The wavelet coefficients in the high-frequency subbands, on the other hand, are coded by SPECK with RVQ. Finally, the coded bit-stream is obtained by multiplexing the two outputs.
[Fig. 1 block diagram: Input image → DWT; LL subband → Scalar SPECK; H, D, V subbands → SPECK with RVQ; both outputs → Mux → Coded bit-stream.]
Fig. 1. The proposed hybrid image coder.
Fig. 2. The partition and assignment of a 4-decomposition-level transformed image.
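The subband layout of Fig. 2 can be checked numerically. The following Python sketch is an illustration only and is not part of the original experiments; the function names `subband_sizes` and `coefficient_counts` are hypothetical, and a square image whose side is divisible by 2^levels is assumed. It counts the scaling coefficients (LL) and the wavelet coefficients (H, D, V) of a dyadic decomposition.

```python
def subband_sizes(n, levels=4):
    """Return (ll_side, [(level, side), ...]) for an n x n image.

    Level k (1 = finest) contributes three detail subbands (H, D, V),
    each of side n / 2**k; the LL subband has side n / 2**levels.
    """
    if n % (2 ** levels) != 0:
        raise ValueError("image side must be divisible by 2**levels")
    detail = [(k, n >> k) for k in range(1, levels + 1)]
    return n >> levels, detail


def coefficient_counts(n, levels=4):
    """Count scaling (LL) and wavelet (H, D, V) coefficients."""
    ll_side, detail = subband_sizes(n, levels)
    ll = ll_side * ll_side
    high = sum(3 * side * side for _, side in detail)
    return ll, high


if __name__ == "__main__":
    for n in (256, 512):
        ll, high = coefficient_counts(n)
        print(n, "LL:", ll, "H/D/V:", high)
```

For a 4-level decomposition the LL subband holds only 1/256 of all coefficients, so most of the coding work falls on the high-frequency subbands.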
For the quantization of the coefficients in the LL subband, the original SPECK starts from the most significant bit plane $n_{\max}$, where

$$n_{\max} = \left\lfloor \log_2 \left( \max_{c_{ij} \in LL} |c_{ij}| \right) \right\rfloor, \tag{1}$$

and $c_{ij}$ represents a coefficient in the LL subband. The scaling coefficients whose magnitudes are greater than or equal to $2^{n_{\max}}$ are located in the first pass. Then, the coefficients whose magnitudes are in the interval $[2^{n_{\max}-1}, 2^{n_{\max}})$ are located in the second pass, and the procedure goes on until all the coefficients are located or the bit budget is exhausted. In the proposed hybrid method, all the coefficients in the LL subband are normalized by the absolute value of the coefficient with the largest magnitude before sorting. Hence, the normalized scaling coefficients with magnitudes in $[2^{-1}, 2^{0})$ are located in the first pass. The coefficients with magnitudes in the interval $[2^{-2}, 2^{-1})$ are located in the second pass, and the procedure goes on until all the coefficients are located or the bit budget is exhausted.
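The bit-plane passes of the scalar SPECK described above can be illustrated with a short Python sketch. It is a simplified illustration rather than the authors' implementation: it assumes the floor form of Eq. (1) used in the standard SPECK formulation, ignores set partitioning and refinement, and uses the hypothetical names `n_max` and `first_significant_pass`.

```python
import math


def n_max(coeffs):
    """Eq. (1), assuming the floor form: floor(log2(max |c|))."""
    peak = max(abs(c) for c in coeffs)
    if peak <= 0:
        raise ValueError("all coefficients are zero")
    return math.floor(math.log2(peak))


def first_significant_pass(c, nmax):
    """1-based index of the pass in which |c| first becomes significant.

    Pass 1 uses the threshold 2**nmax, pass 2 uses 2**(nmax - 1), and so on.
    A zero coefficient never becomes significant, so None is returned.
    """
    mag = abs(c)
    if mag == 0:
        return None
    return nmax - math.floor(math.log2(mag)) + 1


if __name__ == "__main__":
    coeffs = [100, -37, 12, 3, 0]
    top = n_max(coeffs)
    for c in coeffs:
        print(c, first_significant_pass(c, top))
```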
The coefficients in the high-frequency subbands are classified into three categories, H (horizontal), D (diagonal), and V (vertical), as shown in Fig. 2. All the coefficients in the high-frequency subbands are partitioned into 2 × 2 blocks, and each 2 × 2 block forms a corresponding 4D vector (Fig. 3). The three types of vectors in H, D, and V are normalized by the maximum $L_2$ norms of the three categories, respectively, such that the $L_2$ norm of each vector is less than or equal to one. If the $L_2$ norm of a vector is greater than or equal to the threshold, then the vector is significant, and the block or subband containing this vector is also significant. Because the vectors are normalized, the thresholds for a 7-stage RVQ are $2^{-1}$, $2^{-2}$, $2^{-3}$, $2^{-4}$, $2^{-5}$, $2^{-6}$, and 0. For a 4-decomposition-level transformed image, in the initialization step, the H, D, and V blocks in the 4th decomposition level (i.e., at the top-left corner in Fig. 2) form the S set (significant set), and the other H, D, and V subbands form the I set (insignificant set) in the RVQ-SPECK
[Fig. 3 illustration: the 2 × 2 coefficient block with rows ($c_{i,j}$, $c_{i,j+1}$) and ($c_{i+1,j}$, $c_{i+1,j+1}$) maps to the 4D vector [$c_{i,j}$, $c_{i,j+1}$, $c_{i+1,j}$, $c_{i+1,j+1}$].]
Fig. 3. A 2 × 2 coefficient block and its corresponding 4D vector.
Fig. 4. The signal flow diagram of a p-stage RVQ.
block of the proposed hybrid method (Fig. 1). The sorting pass of the significant vectors of the high-frequency subbands and the definitions of S and I in the proposed hybrid method are identical to those of the scalar SPECK, except that the significance path of the SPECK with RVQ ends when the block is 2 × 2.
Generally speaking, full-search VQ performs better than RVQ, but RVQ was selected for the proposed hybrid method because of its low complexity and acceptable performance. The signal flow diagram of a p-stage RVQ is shown in Fig. 4, where $x_i$ ($1 \le i \le p$) is the input vector of the ith VQ stage in the p-stage RVQ, and $\hat{x}_i$ is the code vector that has the smallest distance to $x_i$. The residual $x_i - \hat{x}_i$ is $x_{i+1}$, the input vector for the (i + 1)th VQ stage in the RVQ system. Because the characteristics of the H, D, and V subbands are different, three RVQs are used for the H, D, and V subbands, respectively, in the RVQ-SPECK part of the proposed hybrid method.
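The stage-sequential search of a p-stage RVQ can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: the function names are hypothetical, the squared Euclidean distance is assumed as the distance measure, and the tiny codebooks in the usage example are made up for demonstration.

```python
def nearest(codebook, x):
    """Index of the code vector with the smallest squared distance to x."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(range(len(codebook)), key=lambda k: dist2(codebook[k]))


def rvq_encode(x, codebooks):
    """Quantize x stage by stage; each stage codes the previous residual."""
    indices = []
    residual = list(x)
    for cb in codebooks:
        k = nearest(cb, residual)
        indices.append(k)
        residual = [r - c for r, c in zip(residual, cb[k])]
    return indices


def rvq_decode(indices, codebooks):
    """Direct-sum reconstruction: add the chosen code vector of each stage.

    Passing only the first few indices gives a coarser reconstruction.
    """
    x_hat = [0.0] * len(codebooks[0][0])
    for k, cb in zip(indices, codebooks):
        x_hat = [a + c for a, c in zip(x_hat, cb[k])]
    return x_hat


if __name__ == "__main__":
    stage1 = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
    stage2 = [[0.25, 0.0, 0.0, 0.0], [0.0, 0.25, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
    idx = rvq_encode([1.2, 0.1, 0.0, 0.0], [stage1, stage2])
    print(idx, rvq_decode(idx, [stage1, stage2]))
```

Because the decoder only sums table entries, decoding is much cheaper than encoding, and a truncated index list still yields a usable approximation.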
Since the information in the lowest frequency subband LL of an image is usually more important than that in the high-frequency subbands, the bit-plane resolution in the scalar SPECK is set to be higher than that of the RVQ-SPECK. Thus, the transmission rate of the scalar SPECK is usually faster than that of the RVQ-SPECK. Based on the simulation results, the transmission rate of the scalar SPECK is empirically set to twice that of the RVQ-SPECK, i.e., one pass of the proposed hybrid method includes two SPECK passes and one RVQ-SPECK pass. Finally, the output coded bit-stream contains the overhead, the binary output of SPECK, and the binary output of RVQ-SPECK, arranged as shown in Fig. 5.
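The two-to-one pass ratio can be written down as a simple schedule. The Python sketch below is illustrative only; the function name `coding_schedule` is hypothetical, and the order of the passes within one cycle (both SPECK passes first) is an assumption, since the text fixes only the ratio.

```python
def coding_schedule(cycles):
    """List the passes of `cycles` hybrid passes.

    Each hybrid pass consists of two scalar SPECK passes on the LL subband
    and one RVQ-SPECK pass on the H, D, and V subbands (order assumed).
    """
    order = []
    for _ in range(cycles):
        order += ["SPECK", "SPECK", "RVQ-SPECK"]
    return order


if __name__ == "__main__":
    print(coding_schedule(2))
```

After c hybrid passes, the scalar SPECK has processed 2c bit planes while the RVQ-SPECK has processed c stages.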
Fig. 5. The coded bit-stream of the proposed hybrid method.
The decoder of the proposed method can be implemented by simply reversing the processing steps of the encoder. Apart from the overhead of the compressed file, the bits in the compressed file are ordered by importance, so the proposed method is embedded. The proposed encoder (decoder) can terminate the coding (decoding) process at any point, so it achieves exact bit-rate control, which is an important requirement of modern codecs.
The compression performance can be improved by applying arithmetic coding after SPECK, but at the cost of increased computational complexity. For example, the PSNR value of the decoded 512 × 512 Lena image can be improved by 0.22 dB at a compression rate of 1 bpp by using SPECK with arithmetic coding [26]. For system simplicity, arithmetic coding is not performed in our experiments.
2.2 Application for Chroma Subsampling Images
The chroma subsampling format is used to balance efficiency and quality in sampling, and a similar method is used in picture formats to save bandwidth (memory) while maintaining good quality. CIF (Common Intermediate Format) and QCIF (Quarter CIF) are two such formats in H.261. For CIF, the size of the luminance plane is 352 × 288, and the size of each of the two chrominance planes is 176 × 144. Each of the two chrominance planes contains only one quarter of the data (pixels) of the luminance plane, since the human eye is less sensitive to chrominance information than to luminance information. The image sequence format of MPEG-4 is CIF or 4:2:0, and we discuss below how to use the proposed hybrid method to compress the popular YUV 4:2:0 images.
Fig. 6 shows the block diagram of the application of the proposed hybrid method to chroma subsampling images. First, the Y, U, and V planes are each transformed by the 2D discrete wavelet transform. Then, each transformed plane is processed like the transformed image in the gray-level still image case. The transformed coefficients of each plane are partitioned into LL, H, D, and V subbands. The scaling coefficients of each LL subband, which are processed by color SPECK (CSPECK) [12, 13], are normalized by the maximum amplitude of the coefficients in that LL subband. Another three positive base values, determined from the $L_2$ norms of the 4D vectors in the H, V, and D subbands, respectively, are used to normalize the vectors in their corresponding subbands such that the $L_2$ norm of each normalized 4D vector is not greater than 1. Hence, 12 base values (four for each of the three planes) used in normalization have to be stored and transmitted to the decoder of the proposed method. After normalization and coefficient classification, the three LL subbands of the transformed Y, U, and V planes are processed by CSPECK, and the other coefficients (i.e., the H, D, and V subbands) of the transformed Y, U, and V planes are coded by CSPECK with RVQ.
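The formation and normalization of the 4D vectors can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, a subband is represented as a nested list with even dimensions, and the base value is taken to be the largest $L_2$ norm in the subband category.

```python
import math


def blocks_to_vectors(subband):
    """Flatten each 2 x 2 block to [c(i,j), c(i,j+1), c(i+1,j), c(i+1,j+1)]."""
    rows, cols = len(subband), len(subband[0])
    vectors = []
    for i in range(0, rows, 2):
        for j in range(0, cols, 2):
            vectors.append([subband[i][j], subband[i][j + 1],
                            subband[i + 1][j], subband[i + 1][j + 1]])
    return vectors


def l2_norm(v):
    """Euclidean (L2) norm of a vector."""
    return math.sqrt(sum(c * c for c in v))


def normalize(vectors):
    """Scale all vectors by the largest L2 norm; return (base, scaled vectors)."""
    base = max(l2_norm(v) for v in vectors)
    if base == 0:
        return 0.0, [list(v) for v in vectors]
    return base, [[c / base for c in v] for v in vectors]


if __name__ == "__main__":
    sub = [[3, 0, 0, 0],
           [4, 0, 0, 1]]
    base, vecs = normalize(blocks_to_vectors(sub))
    print(base, vecs)
```

The base value returned by `normalize` is the quantity that must be sent to the decoder so that the vectors can be rescaled after decoding.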
As in the monochrome application, since the scaling coefficients in the LL subbands contain more important information than the coefficients in the H, D, or V subbands, one quantization cycle of the proposed hybrid method includes two CSPECK quantization passes of the LL subbands and one RVQ-CSPECK quantization pass of the H, D, and V subbands (Fig. 7).
Fig. 6. The block diagram of the application of the proposed method for chroma subsampling images.
Fig. 7. One quantization cycle of the proposed hybrid method, which includes two CSPECK quantization passes and one RVQ-CSPECK quantization pass.
2.3 Memory Requirement and Computational Complexity of the Proposed Hybrid Method
In this subsection, the memory requirement and computational complexity of the proposed hybrid codec are discussed, assuming that the coefficients in the H, D, and V subbands far outnumber the scaling coefficients in the LL subband. Therefore, the computational complexity of the proposed hybrid method can be approximated by that of the RVQ-SPECK part, which is the dominant part of the proposed hybrid method.
Regarding memory, the proposed hybrid method needs extra memory for storing codebooks and parameters. Assume that three p-stage RVQs are used and that the codebook sizes of the H, D, and V subbands for the ith VQ stage are the same and equal to $m_i$ words. Then, all the codebooks need $3\sum_{i=1}^{p} m_i$ words. The proposed hybrid method also needs memory to store the 12 base values for normalization and the threshold information. On the other hand, the proposed hybrid method has an advantage over the scalar SPECK in the lengths of the list of significant vectors and the list of insignificant sets. Since each vector in RVQ-SPECK contains 4 coefficients, the length of an RVQ-SPECK list is about one fourth of the length of the corresponding list in SPECK (e.g., LSP and LIP).
Regarding the computational complexity of encoding, the proposed hybrid method uses $L_2$ norms for the significance test of the vectors at each stage, whereas SPECK uses a 1-bit comparison to test significance for each bit plane. Hence, each significant-vector test in the proposed hybrid method is several times more expensive than a significance test in SPECK. Although each significance test of the proposed method is more complicated than that of SPECK, the proposed hybrid method has the advantage that its total number of significance tests is smaller. If an N × N gray-level test image with $n_{\max} = 11$ is coded by SPECK and by the proposed hybrid method with p-stage RVQs, then the ratio of the number of significance tests in SPECK to that in the proposed hybrid method can be estimated by

$$\frac{N \times N \times n_{\max}}{(N/2) \times (N/2) \times p} = \frac{4\,n_{\max}}{p}. \tag{2}$$

For a 512 × 512 gray-level test image with $n_{\max} = 12$ and p = 7, the significance-test ratio of Eq. (2) is 6.9. Hence, for this example, the SPECK encoder needs 6.9 times as many significance tests as the proposed hybrid encoder. Although SPECK can use a simple bitwise operation for the significance test, it suffers from the growth in the number of significance tests for large images. Both SPECK and the proposed hybrid method use the same algorithm to locate significant coefficients, but the proposed method usually has shorter significance paths because it uses a 4D vector instead of a single pixel (coefficient). When the proposed hybrid method locates a significant vector, SPECK needs one more quadtree partition and four more significance tests to complete the significance path.
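The estimate of Eq. (2) is easy to verify numerically. The Python sketch below is illustrative; the function name `significance_test_ratio` is hypothetical, and the counts follow the simplifying assumption of the text that SPECK tests N × N coefficients over $n_{\max}$ bit planes while the hybrid coder tests (N/2) × (N/2) vectors over p stages.

```python
def significance_test_ratio(n, n_max, p):
    """Ratio of SPECK significance tests to hybrid-coder tests, as in Eq. (2)."""
    speck_tests = n * n * n_max                # one test per coefficient per bit plane
    hybrid_tests = (n // 2) * (n // 2) * p     # one test per 4D vector per RVQ stage
    return speck_tests / hybrid_tests


if __name__ == "__main__":
    print(round(significance_test_ratio(512, 12, 7), 1))
```

The image size cancels out, so the ratio depends only on $n_{\max}$ and p.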
In decoding, no significance test is needed for SPECK or the proposed hybrid method, so the computational complexity of both methods is greatly reduced. The proposed hybrid decoder is implemented with simple look-up tables, and the total number of significant vectors is about one quarter of the number of significant pixels in
SPECK. The actual computational complexity depends on the image characteristics, codebook size, bit allocation of codewords, and so on. From the above discussion, the proposed hybrid decoder is as efficient as SPECK, which is consistent with our experimental results.
To summarize, the proposed hybrid method is suitable for asymmetric-complexity applications in which images can be encoded offline but must be decoded quickly.
3. EXPERIMENTAL RESULTS
In this section, two applications of the proposed hybrid compression method are presented. The first application, in subsection 3.1, is gray-level still image compression, and the second, in subsection 3.2, is the compression of chroma subsampling images. The simulation platform is an IBM PC running Windows XP, and SPECK, SPIHT, and the proposed hybrid method are implemented in MATLAB.
Linear-phase biorthogonal wavelet filters with 9/7 coefficients are used in this paper. The number of wavelet decomposition levels in our experiments is 4. Fig. 2 shows the classification of a 4-level transformed image, whose coefficients are classified into four types: LL, H, V, and D. The lowest frequency coefficients in subband LL are normalized such that their magnitudes are in the range [0, 1), and these coefficients are coded by the scalar SPECK. The wavelet coefficients in subbands of types H, V, and D are coded by SPECK with RVQ. For the coefficient vectors of the H, D, and V subbands, we empirically choose the number of RVQ stages as 10 and 7 for 256 × 256 and 512 × 512 test images, respectively. Because the characteristics of the H, V, and D subbands are different, each category has its own codebooks. Therefore, 30 and 21 codebooks are trained by the K-means algorithm for 256 × 256 and 512 × 512 monochrome images, respectively. The codebook size of the first RVQ stage is 64 words, and that of each of the other RVQ stages is 32 words. Each codeword is a 4D vector in $\mathbb{R}^4$.
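The codebook counts quoted above, and the memory estimate of subsection 2.3, follow directly from the stage numbers and codebook sizes. The Python sketch below is a small check, with the hypothetical function name `rvq_codebook_budget`; it counts one word per 4D codeword, as in the text.

```python
def rvq_codebook_budget(stages, first_size=64, other_size=32, categories=3):
    """Return (number of codebooks, total codewords) for the H, D, V RVQs.

    Each category has one codebook per stage; the first stage holds
    `first_size` codewords and every other stage holds `other_size`.
    """
    per_category = first_size + (stages - 1) * other_size
    return categories * stages, categories * per_category


if __name__ == "__main__":
    print(rvq_codebook_budget(10))  # 256 x 256 images
    print(rvq_codebook_budget(7))   # 512 x 512 images
```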
3.1 Gray-Level Still Image Compression
In this subsection, we compare the proposed hybrid method with the SPECK and SPIHT image codecs by encoding and decoding some test images (Fig. 8). Both 256 × 256 and 512 × 512 test images are used. The proposed hybrid method and SPECK are compared on 256 × 256 test images first, and then the three methods (including SPIHT) are compared on 512 × 512 test images. The compression rate is measured in bits per pixel (bpp), and the peak signal-to-noise ratio (PSNR), measured in dB, is used to evaluate the decoded image quality. For the simulation of 256 × 256 monochrome images, 41 images, which do not include the three test images, are used to train the codebooks of the RVQs of the proposed method.
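The two measures used in the experiments can be computed as follows. The Python sketch assumes the standard PSNR definition for 8-bit images, $10 \log_{10}(255^2 / \mathrm{MSE})$, which the text does not spell out; the function names `psnr` and `bits_per_pixel` are hypothetical, and images are represented as nested lists of pixel values.

```python
import math


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally sized 2D lists of pixel values."""
    count = 0
    squared_error = 0
    for row_o, row_d in zip(original, decoded):
        for a, b in zip(row_o, row_d):
            squared_error += (a - b) ** 2
            count += 1
    if squared_error == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak * peak * count / squared_error)


def bits_per_pixel(total_bits, width, height):
    """Compression rate in bpp for a coded image of the given size."""
    return total_bits / (width * height)


if __name__ == "__main__":
    a = [[10, 20], [30, 40]]
    b = [[11, 19], [30, 42]]
    print(psnr(a, b), bits_per_pixel(65536, 256, 256))
```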
Table 1 shows the simulation results for the 256 × 256 test images, and Figs. 9-11 show the PSNR-bpp curves for the three test images, where the horizontal and vertical axes are the compression rate in bpp and the PSNR value in dB, respectively. For the 256 × 256 monochrome image Lena, the proposed hybrid coder outperforms the SPECK coder by 1.67 dB at 1.0 bpp, and by 0.48 dB on average from 0.1 bpp to 1.5 bpp. For the 256 × 256 monochrome image Barbara, the proposed hybrid coder outperforms the SPECK coder by 1.23 dB at 1.1 bpp, and by 0.49 dB on average. For the third 256 × 256 gray-level image
(a) Lena. (b) Barbara. (c) Goldhill.
Fig. 8. Three 8-bit gray-level 256 × 256 test images.
Table 1. Simulation results of 256 × 256 test images (PSNR in dB).
         Lena               Barbara            Goldhill
bpp      SPECK   Proposed   SPECK   Proposed   SPECK   Proposed
1.5      40.89   41.68      39.48   39.59      33.61   34.34
1.4      40.51   40.82      39.02   39.21      33.22   33.92
1.3      40.09   40.07      38.56   38.79      32.85   33.53
1.2      39.58   39.68      37.60   38.38      32.48   33.19
1.1      39.05   39.19      36.21   37.44      32.10   32.64
1.0      36.96   38.63      35.71   36.41      31.48   31.72
0.9      36.41   37.24      35.20   35.46      30.56   30.97
0.8      35.78   36.02      34.64   34.87      30.10   30.43
0.7      35.08   35.34      34.00   34.31      29.61   29.98
0.6      33.74   34.55      32.37   33.59      29.09   29.48
0.5      32.39   32.85      31.65   31.90      28.54   29.00
0.4      31.43   31.56      30.80   30.95      27.44   27.76
0.3      29.33   30.44      29.79   30.09      26.74   26.89
0.25     28.72   28.97      28.59   28.93      26.29   26.49
0.2      28.01   28.14      27.99   28.33      25.77   26.03
0.125    25.89   26.58      25.84   26.84      24.46   24.92
0.1      25.25   25.44      25.22   25.89      24.11   24.49
[Fig. 9 plot: PSNR (dB) versus bpp curves for the proposed coder and SPECK.]
Fig. 9. The experimental results of the 256 × 256 gray-level image Lena.
Fig. 10. The experimental results of the 256 × 256 gray-level image Barbara.
Fig. 11. The experimental results of the 256 × 256 gray-level image Goldhill.
Fig. 12. The average improvements of the proposed hybrid coder compared with the original SPECK on more 256 × 256 test images.
Goldhill, the proposed hybrid coder outperforms the SPECK coder by 0.73 dB at 1.5 bpp, and by 0.43 dB on average. The experimental results for more test images, obtained from the USC (University of Southern California) image database, are shown in Fig. 12, where the curve denotes the average improvement of the proposed hybrid coder over the pure SPECK coder. The results show that the proposed hybrid coder is preferable to the SPECK coder in terms of the PSNR-bpp curves.
For the experiments on 512 × 512 gray-level still images, the proposed hybrid method, SPECK, and SPIHT (with arithmetic coding) are simulated and compared with each other. SPIHT is selected for comparison because it is a wavelet-based method with very good performance and is widely used as a benchmark. A set of codebooks was trained with the K-means method on 8 training images downloaded from the USC image database. The number of stages of each RVQ in the proposed hybrid method was empirically reduced to 7, since using fewer stages in an RVQ usually yields better performance (higher PSNR values) at low bit rates. The vectors used for 512 × 512 images are also 4D vectors (Fig. 3) in the vector space $\mathbb{R}^4$. The 7 thresholds of the 3 RVQs in the proposed hybrid method are $2^{-1}$, $2^{-2}$, $2^{-3}$, $2^{-4}$, $2^{-5}$, $2^{-6}$, and 0. Table 2 shows the simulation results of the proposed hybrid method, SPECK, and SPIHT (with arithmetic coding) on 512 × 512 test images. SPECK and SPIHT are two state-of-the-art techniques, and which one performs better usually depends on the image characteristics. According to the results in Table 2, although we cannot guarantee that the proposed hybrid method always has the best performance, it improves on the SPECK codec for most images, especially under low bit-rate conditions. Three 0.25-bpp decoded images of SPECK, the proposed method, and SPIHT are shown in Fig. 13, and it is difficult to see differences among these images at a glance. By carefully inspecting the reconstructed images in Fig. 13, we found that the image from the SPECK codec is smoother than the others and that the proposed hybrid codec preserves more small details of the original images.
Chao et al. proposed a vector SPECK [27] for still gray-level image compression.
Three types of VQs (full search VQ, tree-structured VQ, and entropy constrained VQ)
were used in their method at the same time, and the vector dimension and vector entries
depend on the subbands and quantization levels where the vector is located. A large
Table 2. Simulation results for 512 × 512 test images (PSNR in dB).
        Lena                       Barbara                    Goldhill
bpp     SPECK   Proposed   SPIHT   SPECK   Proposed   SPIHT   SPECK   Proposed   SPIHT
1.0     40.44   40.29      39.89   35.23   36.18      36.77   34.89   35.42      35.82
0.9     39.99   40.03      39.39   34.67   34.83      35.96   34.46   35.00      35.31
0.8     39.54   39.74      38.69   34.00   34.32      35.01   33.99   34.52      34.78
0.7     38.89   39.40      38.14   32.85   33.75      33.88   33.49   34.06      34.15
0.6     37.46   39.01      37.53   31.40   33.00      32.72   32.72   33.39      33.36
0.5     36.87   37.38      36.78   30.62   31.02      31.63   31.71   32.32      32.55
0.4     36.03   36.62      35.82   29.68   30.10      30.33   31.03   31.59      31.69
0.3     34.07   35.61      34.42   28.00   28.95      28.54   30.18   30.72      30.79
0.25    33.46   34.16      33.65   27.30   27.98      27.60   29.61   30.20      30.15
0.2     32.64   33.29      32.71   26.49   26.92      26.66   28.69   29.26      29.39
0.1     29.48   30.31      29.82   23.99   24.75      24.37   27.03   27.62      27.63
Table 3. Experimental results of SPECK, JPEG2000, and vector SPECK from [27].
Lena
Bit rate   SPECK   JPEG2000   Vector SPECK
0.125      30.96   30.92      31.25
0.2        32.99   32.96      33.47
0.25       34.03   34.09      34.33
Fig. 13. Decoded images of (a) SPECK, (b) the proposed hybrid method, and (c) SPIHT under the 0.25-bpp condition.
amount (1,500) of training images and the Lloyd splitting method are used for training codebooks. Vector SPECK can outperform the JPEG2000 codec under low bit-rate conditions at the cost of added complexity, but it does not handle the lower bit planes for n = 3, 2, 1, and 0. Compared with vector SPECK, the proposed hybrid method has the features of low complexity and a wide bit-rate range. Table 3 shows some experimental data from [27], which were obtained with 5 decomposition levels, the 9/7 DWT, and arithmetic coding in SPECK. Since the conditions of Tables 2 and 3 are different, the results for the same method in Tables 2 and 3 are not equal. Hence, we only compare the differences between SPECK and the proposed hybrid method in Table 2 with the differences between SPECK and JPEG2000 (or the vector SPECK) in Table 3. For the Lena image in the 0.25-bpp case, the vector SPECK
outperforms SPECK by 0.3 dB and JPEG2000 outperforms SPECK by 0.06 dB (Table 3), while the proposed hybrid method outperforms SPECK by 0.7 dB (Table 2). Hence, the proposed hybrid method is very competitive and efficient.
3.2 Chroma Subsampling Image Compression
The goal of the simulation is to compare the performance of the proposed hybrid coder with that of the CSPECK coder for YUV 4:2:0 images. Based on the simulation results, a proper coder can be chosen for applications with such a format, e.g., MPEG-4, PAL DV, DVCAM, HDV, JPEG/JFIF, H.261, VC-1, and MJPEG. The test images used in the simulation have a 256 × 256 Y (luminance) plane and 128 × 128 U and V (chrominance) planes. The 9/7-tap biorthogonal wavelet transform is performed on each plane separately, and the number of decomposition levels is four. For a CSPECK codec, the decoder needs to know the maximum number of binary bit planes ($n_{\max}$) that is used for coding the transformed image.
Excluding the three LL subbands of the Y, U, and V planes, the other coefficients (in
the H, V, and D subbands of each Y, U, or V plane) of the transformed image are coded
by using the CSPECK with three 10-stage RVQs. In our experiments, 55 color images are
used to generate 90 codebooks for the proposed codec, since the RVQs have 10 stages and
each of the three YUV planes has three types (H, V, and D) of 4D coefficient vectors.
For the vectors in the H subbands of the Y plane, 128 vectors are selected to be the
basis vectors for the vectors with L2 norms in [0.5, 1), and each of the other 9
codebooks of the H subbands in the Y plane has 64 codewords. The same basis vector
arrangement as that used in the H subbands is used in the D and V subbands of the Y
plane. Because the human eye is less sensitive to chrominance information than to
luminance information, fewer basis vectors are used in the U and V planes. For the
vectors in the H, D, or V subbands of the U or V plane, 32 basis vectors are used in
the highest (i.e. 10th) stage of the RVQs, and each of the other 9 stages has 16 basis
vectors. All the codebooks are trained by using the simple K-means method. The
equivalent bit-per-pixel (ebpp) value defined in Eq. (3) is used for representing the
compression rate for decoding a coded YUV 4:2:0 image:
\[
\mathrm{ebpp} = \frac{\text{number of bits used}}{256^{2} + 2 \times 128^{2}} \tag{3}
\]
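As a quick check of Eq. (3), the following minimal Python sketch divides the number of bits used by the total sample count of one YUV 4:2:0 image (one 256 × 256 Y plane plus two 128 × 128 chrominance planes). The function names `samples_per_image`, `ebpp`, and `bit_budget` are illustrative choices made here and do not come from the paper.

```python
Y_SIZE = 256  # side length of the Y plane, as stated in the text
C_SIZE = 128  # side length of each of the U and V planes


def samples_per_image(y_size=Y_SIZE, c_size=C_SIZE):
    """Total samples in one YUV 4:2:0 image: one Y plane plus U and V."""
    return y_size * y_size + 2 * c_size * c_size


def ebpp(bits_used, y_size=Y_SIZE, c_size=C_SIZE):
    """Equivalent bit-per-pixel value of Eq. (3)."""
    return bits_used / samples_per_image(y_size, c_size)


def bit_budget(target_ebpp, y_size=Y_SIZE, c_size=C_SIZE):
    """Inverse of Eq. (3): bits available at a given ebpp (rounded)."""
    return round(target_ebpp * samples_per_image(y_size, c_size))


print(samples_per_image())  # 98304
print(ebpp(98304))          # 1.0
print(bit_budget(0.5))      # 49152
```

With these plane sizes the denominator is 98,304 samples, so a budget of 98,304 bits corresponds to exactly 1.0 ebpp.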
The 256 × 256 color test image Goldhill is used for simulation, and the curves in
Figs. 14-16 show the simulation results. The test image is originally 256 × 256 in
each of the R, G, and B planes (true color space), so it had to be preprocessed before
simulation. First, the test image was transformed to the YUV space. Then, the U and V
planes were downsampled to 128 × 128 pixels by taking the arithmetic mean of each
group of four adjacent values. We compare the PSNR values of the proposed hybrid
method with those of the CSPECK coder in the Y, U, and V planes, respectively. The
PSNR values are improved by 1.11 dB for the Y plane, 0.99 dB for the U plane, and
2.31 dB for the V plane at the bit budget of 98,304 bits (1.0 ebpp). For the same
image, the average PSNR values (from 0.1 ebpp to 1.5 ebpp) of the proposed method are
higher than those of CSPECK by 0.66 dB, 1.21 dB, and 2.22 dB in the Y, U, and V
planes, respectively.
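The chrominance downsampling step and the per-plane PSNR comparison described above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: planes are lists of rows, the four adjacent values are taken to be non-overlapping 2 × 2 blocks, and PSNR uses a peak value of 255 for 8-bit samples. The RGB-to-YUV conversion is omitted because the text does not specify the matrix used, and the names `downsample_2x2_mean` and `psnr` are hypothetical.

```python
import math


def downsample_2x2_mean(plane):
    """Halve each dimension by averaging every non-overlapping 2 x 2 block."""
    rows, cols = len(plane), len(plane[0])
    if rows % 2 or cols % 2:
        raise ValueError("plane dimensions must be even")
    return [
        [
            (plane[r][c] + plane[r][c + 1]
             + plane[r + 1][c] + plane[r + 1][c + 1]) / 4.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def psnr(original, decoded, peak=255.0):
    """PSNR in dB between two equally sized planes (assumed 8-bit peak)."""
    squared_error = 0.0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for a, b in zip(row_o, row_d):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical planes
    return 10.0 * math.log10(peak * peak / mse)


# Toy 4 x 4 chrominance plane, reduced to 2 x 2.
u_plane = [
    [10, 20, 30, 40],
    [30, 40, 50, 60],
    [0, 0, 8, 8],
    [4, 4, 8, 8],
]
print(downsample_2x2_mean(u_plane))  # [[25.0, 45.0], [2.0, 8.0]]
```

Applied to a 256 × 256 U or V plane, `downsample_2x2_mean` returns the 128 × 128 plane used in the simulation; `psnr` would then be evaluated separately for the Y, U, and V planes.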
Fig. 14. The Y-plane experimental results of the chroma subsampling image Goldhill.
Fig. 15. The U-plane experimental results of the chroma subsampling image Goldhill.
Fig. 16. The V-plane experimental results of the chroma subsampling image Goldhill.
Based on the simulation results, the proposed method is particularly effective in the
two chrominance planes (U and V), since the colors (chrominance information) of the
four neighbors in a 2 × 2 block are usually similar. On the other hand, the luminance
values are more likely to change abruptly than the chrominance values, because of
sharp edges and corners. Even so, the proposed method also achieves good results in
the Y plane. The major added cost of the proposed method is the need to train
codebooks and determine parameters before encoding. Since the most time-consuming
step, codebook design, can be done off-line and the codebook sizes of the RVQs are
small, the proposed hybrid method is efficient in both time and bit budget.
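To make the statement about small codebooks concrete, the sketch below tallies the codebook configuration given in Section 3.2: 10 stages per RVQ, three subband types (H, V, D) per plane, 4D vectors, one 128-codeword and nine 64-codeword codebooks per subband type in the Y plane, and one 32-codeword and nine 16-codeword codebooks per subband type in each of the U and V planes. The dictionary layout and function names are illustrative assumptions, and the order of the sizes within each list does not affect the totals.

```python
VECTOR_DIM = 4                   # 4D coefficient vectors
SUBBAND_TYPES = ("H", "V", "D")  # one RVQ per subband type and plane

# Codewords per stage for one subband type of each plane (10 stages each).
STAGE_SIZES = {
    "Y": [128] + [64] * 9,
    "U": [32] + [16] * 9,
    "V": [32] + [16] * 9,
}


def codebook_count():
    """Number of stage codebooks over all planes and subband types."""
    stages = sum(len(sizes) for sizes in STAGE_SIZES.values())
    return stages * len(SUBBAND_TYPES)


def total_codewords():
    """Number of codewords over all codebooks."""
    words = sum(sum(sizes) for sizes in STAGE_SIZES.values())
    return words * len(SUBBAND_TYPES)


def total_scalars():
    """Number of stored scalar components over all codebooks."""
    return total_codewords() * VECTOR_DIM


print(codebook_count())   # 90
print(total_codewords())  # 3168
print(total_scalars())    # 12672
```

The count of 90 codebooks matches the figure quoted in Section 3.2, and the whole set amounts to 3,168 four-dimensional codewords.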
4. CONCLUSIONS
In this paper, we propose a hybrid image coder, based on SPECK and RVQ,
for still gray-level and chroma subsampling images. Compared with SPIHT and SPECK
(two state-of-the-art algorithms), the experimental results have shown that the proposed
hybrid method is efficient for image compression. Depending on the target
application, the proposed hybrid codec can be designed to improve its low
bit-rate or high bit-rate performance by using a short RVQ or a long RVQ. We have also
shown that the proposed hybrid method has superior performance for the chrominance
planes (i.e. the U and V planes in the YUV color space) in chroma subsampling image
compression. Because of the asymmetric complexity of VQ, the proposed hybrid method is
suitable for applications whose load is also asymmetric and is heavy on the decoding
side (e.g. the image archiving of an image database). Although the proposed hybrid
codec is asymmetric, using RVQ instead of full-search VQ keeps the increased
complexity affordable and worthwhile.
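The complexity argument can be illustrated with a rough count. The sketch below compares the number of codewords examined by a stage-by-stage RVQ search with the size of a single unstructured codebook that would need the same index length. It assumes that every one of the ten stages is used for a vector and that stages are searched sequentially; how the coder interleaves RVQ stages with SPECK passes is not modelled, and the function names are hypothetical.

```python
def rvq_search_cost(stage_sizes):
    """Codewords examined by a sequential, stage-by-stage RVQ search."""
    return sum(stage_sizes)


def index_bits(stage_sizes):
    """Bits needed to address one codeword in every stage."""
    return sum((n - 1).bit_length() for n in stage_sizes)


def full_search_cost(stage_sizes):
    """Codewords in one unstructured codebook with the same index length."""
    return 2 ** index_bits(stage_sizes)


# Y-plane configuration from Section 3.2: one 128-codeword stage, nine of 64.
y_stages = [128] + [64] * 9

print(rvq_search_cost(y_stages))   # 704
print(index_bits(y_stages))        # 61
print(full_search_cost(y_stages))  # 2305843009213693952
```

Under these assumptions the encoder compares each vector with at most 704 codewords, whereas an exhaustive search over a codebook addressed by the same 61 bits would be impractical.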
ACKNOWLEDGEMENTS
The authors would like to thank the anonymous reviewers for their comments that
significantly helped improve this paper.
REFERENCES
1. H. G. Musmann, P. Pirsch, and H. J. Grallert, Advances in picture coding, in Pro-ceedings of IEEE, Vol. 73, 1985, pp. 523-548.
2. R. J. Clarke, Transform Coding of Images, Academic Press, New York, 1985.3. O. J. Kwon and R. Chellappa, Region adaptive subband image coding, IEEE Trans-
actions on Image Processing, Vol. 7, 1988, pp. 632-648.
4. K. R. Rao and J. J. Hwang, Techniques and Standards for Image Video and AudioCoding, Prentice Hall, New Jersey, 1996.
5. W. B. Pennebaker and J. L. Mitchell, JPEG Still Image Data Compression Stan-dards, Van Nostrand, New York, 1993.
6. JPEG2000 Core Coding System (Part 1), ISO/IEC 15444-1, Dec. 2000.7. B. E. Usevitch, A tutorial on modern lossy wavelet image compression: Founda-
tions of JPEG2000,IEEE Signal ProcessingMagazine, Vol. 18, 2001, pp. 22-35.8. J. M. Shapiro, Embedded image coding using zerotrees of wavelet coefficients,
IEEE Transactions on Signal Processing, Vol. 41, 1993, pp. 3445-3462.
9. A. Said and W. A. Pearlman, A new, fast, and efficient image codec based on setpartitioning in hierarchical trees,IEEE Transactions on Circuits Systems for Video
Technology, Vol. 6, 1996, pp. 243-250.
10. S. D. Servetto, K. Ramchandran, and M. T. Orchard, Image coding based on a mor-phological representation of wavelet data,IEEE Transactions on Image Processing,
Vol. 8, 1999, pp. 1161-1174.
11. E. S. Hong and R. E. Ladner, Group testing for image compression, IEEE Trans-actions on Image Processing, Vol. 11, 2002, pp. 901-911.
12. A. Islam and W. A. Pearlman, An embedded and efficient low-complexity hierar-chical image coder, in Proceedings of SPIE Visual Communications and Image
Processing, Vol. 3653, 1999, pp. 294-305.
13. W. A. Pearlman, A. Islam, N. Nagaraj, and A. Said, Efficient, low-complexity im-age coding with a set-partitioning embedded block coder, IEEE Transactions on
Circuits Systems for Video Technology, Vol. 14, 2004, pp. 1219-1235.
14. G. Strang and T. Nguyen, Wavelets and Filter Banks, Wellesley-Cambridge, MA,
7/30/2019 Set Partitioning
16/17
SHENG-FUU LIN, HSI-CHIN HSINAND CHIEN-KUN SU1026
1996.
15. C. Chrysafis, A. Said, A. Drukarev, and W. A. Pearlman, SBHP A low complex-ity wavelet coder, in Proceedings of IEEE International Conference on Acoustics,
Speech, and Signal Processing, 2000, pp. 2035-2038.
16. S. T. Hsiang and J. W. Woods, Embedded image coding using zero blocks of sub-band/wavelet coefficients and context modeling, in Proceedings of IEEE Interna-
tional Conference on Circuits and Systems, 2000, pp. 662-665.
17. C. E. Shannon, A mathematical theory of communication, The Bell System Tech-nical Journal, Vol. 27, 1948, pp. 379-423, 623-656.
18. A. Gersho and R. M. Gray, Vector Quantization and Signal Compression, KluwerAcdemic Publishers, MA, 1992.
19. S. Gupta and A. Gersho, Feature predictive vector quantization of multispectral im-ages,IEEE Transactions on Geoscience and Remote Sensing, Vol. 30, 1992, pp. 491-
501.
20. C. K. Su, H. C. Hsin, and S. F. Lin, Wavelet tree classification and hybrid codingfor image compression,IEE Proceedings of Vision,Image, and Signal Processing,
Vol. 152, 2005, pp. 752-756.21. T. K. Abdel-Galil, E. F. El-Saadany, A. M. Youssef, and M. M. Salama, Distur-
bance classification using hidden Markov models and vector quantization, IEEE
Transactions on Power Delivery, Vol. 20, 2005, pp. 2129-2135.
22. C. F. Barnes, Residual quantizers, Ph.D. Dissertation, Department of Electricaland Computer Engineering, BrigHam Young University, Provo, UT, 1989.
23. F. Kossentini, M. J. T. Smith, and C. F. Barnes, Image coding using entropy-con-strained residual vector quantization,IEEE Transactions on Image Processing, Vol.
4, 1995, pp. 1349-1357.
24. Y. Shoham, Hierachical vector quantization with application to speech waveformcoding, Ph.D. Dissertation, Department of Electrical and Computer Engineering,
University of California at Santa Barbara, 1985.
25.B. H. Juang and A. H. Gray, Multiple stage vector quantization for speech coding,in Proceedings of IEEE International Conference on Acoustics, Speech, Signal Proc-
essing, Vol. 1, 1982, pp. 597-600.
26. G. Xie and H. Shen, Highly scalable, low-complexity image coding using zero-blocks of wavelet coefficients, IEEE Transactions on Circuits Systems for Video
Technology, Vol. 15, 2005, pp. 762-770.
27. C. C. Chao and R. M. Gray, Image compression with a vector SPECK algorithm,in Proceedings of IEEE International Conference on Acoustics, Speech, Signal Proc-
essing, Vol. 2, 2006, pp. 445-448.
Sheng-Fuu Lin was born in Taiwan, R.O.C., in
1954. He received the B.S. and M.S. degrees in Mathematics from
National Taiwan Normal University in 1976 and 1979, respectively,
the second M.S. degree in Computer Science from the
University of Maryland in 1985, and the Ph.D. degree in Electrical
Engineering from the University of Illinois, Champaign, in
1988. Since 1988, he has been on the faculty of the Department
of Electrical and Control Engineering at National Chiao Tung University, Hsinchu,
Taiwan. His research interests include fuzzy theory, automatic target recognition,
scheduling, image processing, and image recognition. Professor Lin is a member of the
IEEE Control Society, Chinese Fuzzy System Association, and Chinese Automatic Control
Society.
Hsi-Chin Hsin received the M.S. and Ph.D. degrees
in Electrical Engineering from the University of Pittsburgh,
Pittsburgh, PA, in 1992 and 1995, respectively. He is a Professor
in the Department of Computer Science and Information Engineering
at National United University, Taiwan. His research interests
include wavelet transform, image processing, CORDIC,
DSP architectures, and system on chip.
Chien-Kun Su was born in 1962. He received the
B.S. degree from National Taiwan University, Taiwan, in 1989, the
M.S. degree from the University of Southern California, U.S.A.,
in 1992, and the Ph.D. degree from National Chiao Tung University,
Taiwan, in 2008. He has been on the faculty of the Department
of Electrical Engineering at Chung Hua University,
Hsinchu, Taiwan, since 1995. His research interests include image
processing and computer vision.