
© 2003 by CRC Press LLC

chapter eight

Compression of color images

Ricardo de Queiroz

Xerox Corporation

Contents

8.1 Compression basics
8.2 Compression models
    8.2.1 Transform coding
    8.2.2 Predictive coding
    8.2.3 Rate-distortion trade-off
    8.2.4 Distortion measure
8.3 Standard image coders
8.4 Multidimensional color model and transforms
8.5 Color transforms
8.6 Compressing RGB images
8.7 Compressing CMYK images
8.8 Summary
References

In this chapter, we intend to cover the basic aspects of color image compression. Basic aspects of color images can be found elsewhere in this book, and compression details can also be easily found in the literature. Even though we give a very brief introduction to compression methods, we intend to explore the issues pertaining to the intersection of the two topics (compression and color) without exploring them individually.

8.1 Compression basics

Image compression methods rely on the removal of information within images to reduce the amount of data necessary to represent them.1–7 The information to be removed is usually characterized as one of two classes, statistically redundant or visually irrelevant. The removal of (statistically) redundant data often yields reversible or lossless compression; i.e., the image data can be completely retrieved from the compressed data. On the other hand, removing the (visually) irrelevant information inflicts losses on the reconstructed image; it is hoped, however, that such losses are not objectionable for a given destination viewer.

Statistical redundancy is the extra amount of bits used to represent a given sequence of symbols. For example, if a sequence of symbols from the alphabet {A, B, C} is to be encoded, there are several ways to encode the symbols into binary code words. Examples of code words are 00, 01, 10, 001, 100, 111, 0, 10, 11, etc. Some representations are clearly longer than others. In other words, if we start from a given representation, by changing the binary encoding, one might represent the same information with fewer bits. For example, if the probabilities of occurrence are

$$p(A) = 0.5, \qquad p(B) = 0.2, \qquad p(C) = 0.3,$$

then the average bit rate R per symbol achievable by each code is

$$\{A = 00,\ B = 10,\ C = 01\} \;\Rightarrow\; R = 2 \text{ bits/symbol}$$

$$\{A = 0,\ B = 10,\ C = 11\} \;\Rightarrow\; R = 1.5 \text{ bits/symbol}$$

which clearly shows how the rate could be reduced by 25% by only modifying the code. There are several means by which one can encode a particular sequence of symbols. The above examples are for instantaneous codes where each symbol is assigned a unique code. Instantaneous codes are those for which no code word is a prefix of another code word. (For example, 01 is a prefix of 010.) Also, the above examples are block codes; i.e., each symbol is independently mapped to a code string and vice versa.8

If the symbols have probabilities $p_i$, the entropy of the source is defined to be

$$H = -\sum_i p_i \log_2 p_i \qquad (8.1)$$

One cannot encode a source with average rate below the entropy rate without losses; i.e., H is the lower bound for coding a particular source.8 Furthermore, one can always construct codes with length $L_i$ such that the average length $L = \sum_i L_i p_i$ is bounded by $H \le L \le H + 1$.

Huffman codes are typical examples of uniquely decodable and instantaneous block codes that are optimized for a particular source.2,3,7,8 However, like any block code, Huffman codes cannot have less than one bit per symbol. So, when the entropy is far below one bit per symbol, Huffman codes are not so efficient. In fact, one can encode the symbols at an average rate very close to H, even if $H \ll 1$. One example is the arithmetic coder,3 which combines multiple symbols and codes to achieve higher performance.
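As a quick numerical check of the three-symbol example, the following minimal sketch computes the average rate of the two codes, the source entropy, and whether each code is instantaneous. Python is our choice for illustration, and the function names are hypothetical; none of this appears in the original text.

```python
import math
from itertools import permutations

# Symbol probabilities and the two codes from the example in the text.
probs = {"A": 0.5, "B": 0.2, "C": 0.3}
fixed_code = {"A": "00", "B": "10", "C": "01"}
variable_code = {"A": "0", "B": "10", "C": "11"}


def average_rate(code, probs):
    """Average code length in bits/symbol: sum of p_i * L_i."""
    return sum(probs[s] * len(code[s]) for s in probs)


def entropy(probs):
    """Source entropy in bits/symbol: -sum of p_i * log2(p_i)."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def is_instantaneous(code):
    """True if no code word is a prefix of another code word."""
    return not any(b.startswith(a) for a, b in permutations(code.values(), 2))


H = entropy(probs)
for name, code in (("fixed", fixed_code), ("variable", variable_code)):
    R = average_rate(code, probs)
    print(f"{name:8s} R = {R:.3f} bits/symbol, instantaneous: {is_instantaneous(code)}")
print(f"entropy  H = {H:.3f} bits/symbol")
```

The variable-length code reaches 1.5 bits/symbol against an entropy of roughly 1.49 bits/symbol, so it already sits very close to the lower bound for this source.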

Insofar as there is no distortion to the data, this reduction in the data size can be viewed as statistical redundancy extraction. Yet another type of information that can be removed is the visually irrelevant one. To remove it, the image data are manipulated (preprocessed) so as to improve the statistical redundancy extraction step. However, an attempt is made to make all image changes invisible to the final observer. If the changes are perceptible, they are supposed to have low visibility, at least. The most common method to remove visual redundancy is to remove image details of little importance or frequency components that are not discernible. The removal of information from the image data is performed through the quantization operation.

A quantizer Q maps the input (a random variable, e.g., X) into another symbol of a reduced set. For example, if X assumes numbers in the real line, it can be broken into q intervals by defining $q - 1$ decision levels, $t_n$, in the real axis that effectively divide the axis into q intervals. In this example, X is assigned to level k if $X \in \{x \mid t_{k-1} \le x < t_k\}$. The inverse quantization operation ($Q^{-1}$) restores a number in the domain of X from the received coded level number k. Typically, the reconstructed value lies within the interval to which the original value belongs; i.e., the reconstructed value for the kth level would lie within the interval $[t_{k-1}, t_k)$. In other words, quantization maps the input (continuous or discrete) to a set of integer numbers, while the inverse quantization maps the range back into the input domain, i.e.,

$$X \xrightarrow{\;Q\;} k \xrightarrow{\;Q^{-1}\;} \hat{X}$$

As a many-to-one mapping, quantization implies losses. However, losses are the price paid to largely reduce the number of possible input values and the data entropy. A typical quantization is the uniform one with center reconstruction where

$$x_q = \mathrm{round}\!\left(\frac{x}{\Delta}\right), \qquad \hat{x} = x_q \Delta \qquad (8.2)$$

This form of quantization can be very efficiently implemented and is often used in compression standards because of its low complexity. A variation of the uniform quantizer is the one including a small dead zone in the middle,

$$x_q = \left\lfloor \frac{x}{\Delta} \right\rfloor, \qquad \hat{x} = \left(x_q + \tfrac{1}{2}\right)\Delta \qquad (8.3)$$

where $\lfloor \cdot \rfloor$ means a round-off toward zero or "floor" function. For negative values the half-step offset is applied with the sign of $x_q$, and $x_q = 0$ is reconstructed as zero. The function of the dead zone is to nullify numbers close to zero, which is an efficient means to remove noise.
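The two quantizers can be sketched in a few lines. This is an illustrative sketch only: the uniform quantizer rounds to the nearest level and reconstructs at $x_q \Delta$, the center of the rounding interval, while the dead-zone quantizer truncates toward zero. Applying the half-step reconstruction offset away from zero and mapping level 0 back to zero is our assumption about how the dead-zone reconstruction is meant to behave for negative and near-zero inputs; the function names are hypothetical.

```python
import math


def uniform_quantize(x, step):
    """Nearest-level index; floor(x/step + 0.5) avoids Python's round-half-to-even."""
    return math.floor(x / step + 0.5)


def uniform_dequantize(level, step):
    """Center reconstruction for the rounding quantizer."""
    return level * step


def deadzone_quantize(x, step):
    """Round-off toward zero: every |x| < step maps to level 0 (the dead zone)."""
    return math.trunc(x / step)


def deadzone_dequantize(level, step):
    """Assumed reconstruction: level 0 -> 0, otherwise the middle of the interval."""
    if level == 0:
        return 0.0
    return math.copysign(abs(level) + 0.5, level) * step


step = 2.0
for x in (-5.0, -1.9, 0.4, 1.9, 7.3):
    ku = uniform_quantize(x, step)
    kd = deadzone_quantize(x, step)
    print(f"x={x:5.1f}  uniform: k={ku:2d} -> {uniform_dequantize(ku, step):5.1f}   "
          f"dead zone: k={kd:2d} -> {deadzone_dequantize(kd, step):5.1f}")
```

Note how the inputs -1.9, 0.4, and 1.9 all fall in the dead zone and are reconstructed as zero, whereas the plain uniform quantizer maps -1.9 and 1.9 to nonzero levels.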

To remove only statistical redundancy implies lossless compression; i.e., the data after decompression are bit by bit identical to the original. When lossy compression is applied, the data cannot be recovered thereafter, thus implying the use of some form of data quantization. Lossy compression is much more efficient than lossless compression; i.e., by accepting data losses (not too visible), much more compression can be applied to a particular image. This is a general trade-off in compression: rate vs. distortion. Example coders will be discussed later on as well as rate-distortion paradigms.

As far as compression techniques are concerned, this chapter is merely introductory, and the reader is referred to the many texts in image compression to understand compression techniques in more detail.1–7 The central focus of this chapter is to cover representation of color image information and its interaction with the compression.

8.2 Compression models

8.2.1 Transform coding

Transform coding is perhaps the most popular method of compression. A typical compression system based on transform coding is depicted in Figure 8.1. The input image data is transformed, quantized, and encoded. The encoding step provides lossless compression and simply converts numbers (symbols) to some binary representation in a reversible manner. The quantization step implies losses but does not compress the data. In fact, it controls the compression and the distortion. The transform step does not control distortion or compression, nor does it provide the compression by encoding the data. However, it is the key enabler for successful image compression. Its job is to convert the image pixels into something that would make sense to quantize and encode. Figure 8.1b depicts the transform and quantization steps in greater detail. The data are transformed, and each transformed sample may undergo different quantization. The process of selecting quantizers is historically tied to the process of bit allocation.2,4,6,7 This is so because, in the past, each quantizer used to be associated with its own encoder; i.e., a quantizer result was directly mapped into output binary codes. In this situation, the transform output was directly related to the compressed bit rate. In fact, under some reasonable conditions, the compression efficiency can be related to energy compaction,7,9 which is a measure of how much energy is concentrated in a few transformed coefficients. A useful measure can be the ratio of the geometric to the arithmetic mean of the energies of the transformed samples if we apply orthogonal transforms.7 In this scenario, for very simple quantization and coding, with a few additional assumptions,7 the optimal transform can be found as the one that provides the most energy compaction. The Karhunen–Loeve transform (KLT)2,4,7,9 achieves optimal compaction. Whereas, in the past, each quantizer was associated with its own encoding method, it is modern practice to analyze multiple quantized samples at once before entropy coding the whole quantized data stream. Even so, the connection between energy compaction and compression efficiency is clear from practice.

Figure 8.1 Basic steps of typical transform coding systems: (a) transform, quantizer, and encoder; (b) details of the transform and quantization steps, where the transformed samples for each channel commonly are processed using different quantizers; (c) typical coding system for color images, exemplified for an RGB input image. After a pixel-wise color transform, each resulting color plane is compressed independently, while the resulting compressed bit streams are multiplexed to produce the output.

As an example of how energy compaction can be useful, let us consider the example in Figure 8.2. Recall that an orthogonal transformation is a simple rotation of the input space. Refer to Figure 8.2, where two contiguous samples in an image are being considered. For every pair of pixels, one point is plotted in the ($x_n$, $x_{n+1}$) plane. Because neighbor pixels are likely to be similar, it is expected that points would cluster along a diagonal. Hence, an axis rotation of π/4 might align one of the resulting axes to the data cluster. With this, one achieves compaction of the energy necessary to represent the data around one axis ($y_{n+1}$) so that $y_n$ tends to have small values. If one had the ability to keep only one of the samples, it is clear that, in the ($y_n$, $y_{n+1}$) domain, the distortion incurred by discarding (setting to 0) $y_n$ is much smaller than that incurred by discarding either $x_n$ or $x_{n+1}$. Hence, the advantage of providing a transform step that yields energy compaction becomes clear.

Figure 8.2 An orthogonal transformation is a simple axis rotation. In an image, neighbor pixels ($x_n$, $x_{n+1}$) are likely to have similar values. A rotation of π/4 provides a representation that is better aligned with the data; i.e., there is compaction of energy in one of the variables ($y_{n+1}$).

The π/4 rotation we just described is accomplished, for example, with a size-two discrete cosine transform (DCT).9 In practical compression, vectors with eight samples are transformed using the size-eight DCT.9 Let X be a matrix containing a block with M × M pixels. In two dimensions, pixel blocks are transformed by a separable transform such that

$$Y = D X D^T, \qquad X = D^T Y D \qquad (8.4)$$

where Y is the matrix with the transformed data and D is the size-M DCT matrix whose entries $d_{ij}$ are given by

$$d_{ij} = \sqrt{\frac{2}{M}}\, a_i \cos\!\left(\frac{(2j+1)\, i\, \pi}{2M}\right), \qquad a_0 = \frac{1}{\sqrt{2}} \ \text{and}\ a_i = 1 \ \text{for}\ i > 0 \qquad (8.5)$$

The DCT is asymptotically optimal for some image models and performs well for most images.9 In fact, for an order-1 autoregressive model, the DCT approaches the KLT as the correlation between samples increases. The DCT works well to compact the input energy into few transformed samples, as illustrated in Figure 8.3.
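To make the separable block transform concrete, the sketch below builds the size-M DCT matrix from its cosine definition (orthonormal scaling, $a_0 = 1/\sqrt{2}$), applies $Y = D X D^T$ to a smooth 8 × 8 block, and measures how much of the energy lands in the DC coefficient. It is a plain-Python illustration under our own assumptions: the test block (a gentle ramp) and all function names are hypothetical, and a practical coder would use a fast DCT rather than explicit matrix products.

```python
import math


def dct_matrix(M):
    """Size-M DCT matrix: d_ij = sqrt(2/M) * a_i * cos((2j+1) i pi / (2M))."""
    rows = []
    for i in range(M):
        a = 1 / math.sqrt(2) if i == 0 else 1.0
        rows.append([math.sqrt(2 / M) * a * math.cos((2 * j + 1) * i * math.pi / (2 * M))
                     for j in range(M)])
    return rows


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def dct2(X):
    """Forward separable transform Y = D X D^T."""
    D = dct_matrix(len(X))
    return matmul(matmul(D, X), transpose(D))


def idct2(Y):
    """Inverse separable transform X = D^T Y D."""
    D = dct_matrix(len(Y))
    return matmul(matmul(transpose(D), Y), D)


# The size-two DCT is the pi/4 rotation: rows (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
D2 = dct_matrix(2)

# Hypothetical smooth block: a gentle ramp, typical of neighboring pixels being similar.
block = [[100 + 2 * m + 3 * n for n in range(8)] for m in range(8)]
coeffs = dct2(block)
rebuilt = idct2(coeffs)

total_energy = sum(c * c for row in coeffs for c in row)
dc_fraction = coeffs[0][0] ** 2 / total_energy
max_error = max(abs(a - b) for ra, rb in zip(block, rebuilt) for a, b in zip(ra, rb))
print(f"fraction of energy in the DC coefficient: {dc_fraction:.4f}")
print(f"max reconstruction error: {max_error:.2e}")
```

For this block more than 99% of the energy is carried by a single coefficient, which is the energy compaction the text describes; less regular image blocks spread more energy into the low-frequency neighbors of the DC term.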

Figure 8.3 The eight basis functions of the DCT in the one-dimensional case (a) and its two-dimensional separable counterpart (b). The pixels of the 8 × 8 block are projected into each of the bases shown. The projection enables energy compaction. For example, the signal shown in (c) is a scan line of a test image. By breaking the vector into eight-sample segments and transforming each one, the energy of the DCT coefficients is depicted in (d). Note the concentration of energy in a few low-frequency coefficients.

Another powerful transform tool in image compression is the wavelet transform.10,11 The wavelet transform is based on an elementary two-channel filter bank. Each bank decomposes the image into lowpass and highpass sub-bands. The filter bank has a lowpass and a highpass filter, and each is followed by a 2:1 decimator. At every stage of the two-channel filter bank, the number of input samples is maintained; i.e., for N input samples, there are N/2 lowpass output samples and N/2 highpass ones. The decomposition is also said to be the analysis. To recompose the original signal from the lowpass and highpass sub-bands, in a process called the synthesis, one uses 1:2 up-samplers followed by the lowpass and highpass filters. The filtered output for both channels is added to resynthesize the input data. For the two-dimensional case, the image is decomposed into four decimated sub-bands by applying analysis filter banks in each of the horizontal and vertical directions as depicted in Figure 8.4a for the analysis and Figure 8.4b for the synthesis section. The wavelet transform is an association of filter banks as in Figure 8.4c such that the lowpass output of one stage is input to another stage. In this way, the image in Figure 8.4d is decomposed into the sub-bands depicted in Figure 8.4e.

Figure 8.4 Wavelet transform of an image is based on the elementary two-channel filter bank, which, in two dimensions, decomposes the image into lowpass and highpass sub-bands in each of the horizontal and vertical directions (a). The decomposition (analysis) is achieved through filters, followed by 2:1 down-samplers. The synthesis (b) is formed by 1:2 up-samplers, followed by filters. The wavelet transform is the association of filter banks (c) such that the lowpass of one stage is input into another stage. In this way, the image in (d) is decomposed into the sub-bands depicted in (e).
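The two-channel analysis and synthesis structure, and its cascading on the lowpass branch, can be illustrated in one dimension. The sketch below uses the Haar filter pair purely because it is the simplest perfect-reconstruction choice; the text does not name a particular filter, so the filters, the function names, and the test signal are all our assumptions. For these two-tap filters, filtering followed by 2:1 decimation reduces to operating on non-overlapping sample pairs.

```python
import math

S = 1 / math.sqrt(2)  # Haar filter tap magnitude (orthonormal scaling)


def haar_analysis(x):
    """One two-channel stage: N samples in, N/2 lowpass and N/2 highpass samples out."""
    pairs = [(x[2 * i], x[2 * i + 1]) for i in range(len(x) // 2)]
    low = [S * (a + b) for a, b in pairs]
    high = [S * (a - b) for a, b in pairs]
    return low, high


def haar_synthesis(low, high):
    """Inverse stage: up-sample, filter, and add the two channels."""
    x = []
    for l, h in zip(low, high):
        x.extend((S * (l + h), S * (l - h)))
    return x


def wavelet_decompose(x, stages):
    """Cascade stages on the lowpass output; len(x) must be divisible by 2**stages."""
    low, bands = list(x), []
    for _ in range(stages):
        low, high = haar_analysis(low)
        bands.append(high)
    return low, bands


def wavelet_reconstruct(low, bands):
    for high in reversed(bands):
        low = haar_synthesis(low, high)
    return low


signal = [10, 12, 14, 13, 40, 42, 41, 39]
low, bands = wavelet_decompose(signal, stages=3)
rebuilt = wavelet_reconstruct(low, bands)

print("sub-band sizes (high to low frequency):", [len(b) for b in bands], "+", len(low))
print("max reconstruction error:", max(abs(a - b) for a, b in zip(signal, rebuilt)))
```

The sample count is preserved at every stage, and the highest-frequency band holds the most samples while the final lowpass band holds just one, which is the resolution trade-off that distinguishes the wavelet decomposition from a block transform.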

The different sub-bands contain information at a different orientation and different scale. However, unlike the DCT, higher-frequency bands are wider and have more samples. In other words, some bands have poor frequency resolution (wide bandwidth) but high spatial resolution (better feature location). Some other bands have better frequency resolution but poor spatial location (narrow bandwidth and just a few samples to represent the whole image). This resolution trade-off is an attractive feature of the wavelet transform and has been extensively studied.10,11

In a manner similar to block transforms, the wavelet transform aims at transforming the image data into a representation where a few coefficients convey most of the information to reconstruct the image. Referring back to Figure 8.1, the transform is used to convert the image data to a more suitable representation. It is left to the quantization step to perform the job of actually reducing the amount of information to be conveyed to the decoder. Then, the encoder is left with the job of actually compressing the quantized transformed data. For example, for 8-bit image data, the transformed samples can be floating point numbers, which are quantized to integer numbers using a fixed-point representation of, for example, 16 bits (a 1:2 expansion so far). The entropy encoding step is the one that achieves the compression, properly enabled by the previous steps.

8.2.2 Predictive coding

In predictive coding,2,6 the image data are not compressed in the manner described in Figure 8.1. Instead, one predicts the image pixels and computes a prediction error. The prediction error is actually the information that is quantized and encoded. In general, predictive coding is similar to DPCM systems,2 which were commonly used before computer technology made transform coding affordable. Figure 8.5 depicts a typical predictive coding system. The current pixel is predicted from information conveyed through the past (already processed) pixels, and only the prediction error is quantized and encoded. The decoder applies the same prediction step as the encoder. Given the predicted value, the decoder simply adds the decoded error to the predicted value so as to produce the reconstructed pixel. A feedback loop is required at the encoder side to synchronize encoder and decoder. However, predictive systems are often used for "lossless" compression; i.e., the data are not quantized and, hence, there is no need for the feedback loop. There are several methods for adaptive or nonadaptive prediction. Typically, the pixel illustrated in Figure 8.5c and labeled "X" is predicted from its neighbors (pixels labeled "A" through "H") using some weighted combination. Commonly, the immediate neighbors (pixels "A" through "D") are used to predict the current pixel "X."
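The role of the feedback loop is easiest to see in a one-dimensional sketch. Below, each pixel of a scan line is predicted from the previously reconstructed pixel, which is the simplest possible predictor and our own choice for illustration; the predictor, the initial prediction of zero, and the function names are assumptions rather than anything prescribed by the text. Because the encoder predicts from its own reconstruction rather than from the original data, encoder and decoder stay synchronized and the quantization error does not accumulate.

```python
import math


def dpcm_encode(pixels, step):
    """Return quantized prediction errors for one scan line."""
    levels = []
    prediction = 0  # encoder and decoder must start from the same state
    for x in pixels:
        error = x - prediction
        level = math.floor(error / step + 0.5)  # uniform quantizer on the error
        levels.append(level)
        # Local decoder (feedback loop): track what the receiver will reconstruct.
        prediction = prediction + level * step
    return levels


def dpcm_decode(levels, step):
    """Add each dequantized error to the running prediction."""
    pixels = []
    prediction = 0
    for level in levels:
        reconstructed = prediction + level * step
        pixels.append(reconstructed)
        prediction = reconstructed
    return pixels


scan_line = [100, 102, 105, 103, 90, 91]

lossy_levels = dpcm_encode(scan_line, step=4)
lossy = dpcm_decode(lossy_levels, step=4)
print("levels:", lossy_levels)
print("reconstruction:", lossy)

# With a unit step and integer pixels nothing is lost, i.e., "lossless" operation.
lossless = dpcm_decode(dpcm_encode(scan_line, step=1), step=1)
print("lossless round trip:", lossless == scan_line)
```

After the first pixel the transmitted levels are small integers clustered around zero, which is what makes the subsequent entropy coding effective.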


8.2.3 Rate-distortion trade-off

To select compression systems or features, one needs to somehow evaluate the compression performance. If we ignore computational complexity, we can look at "how much compression can we achieve" and at "what is the image quality" as the major issues in compression. The cost of compressing an image can be expressed in the amount of data needed to record the information, i.e., the amount of bits consumed. In compression jargon, this is the rate (R) achieved by compressing a given image. The benefit of compressing the image is the image quality, or how well we approximate the original image, assuming possible lossy compression. By defining some measure of distortion D as the distance between the images before and after compression, we can quantify the compression process as a rate-distortion trade-off.12 Commonly, the more distortion is allowed, the lower the achievable rate. Conversely, the more bits are spent (higher R), typically, the more accurate the representation (lower D).
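The trade-off can be observed with a toy experiment: quantize a signal with increasingly coarse steps and record a rate and a distortion for each. In the sketch below the rate is estimated by the first-order entropy of the quantizer levels (which assumes an ideal entropy coder) and the distortion is the mean squared error between the samples and their reconstruction. The synthetic signal, the step values, and the function names are all illustrative assumptions.

```python
import math
from collections import Counter


def entropy_bits(symbols):
    """First-order entropy of a symbol sequence, in bits per symbol."""
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in Counter(symbols).values())


def rd_point(samples, step):
    """Quantize with the given step; return (rate estimate, mean squared error)."""
    levels = [math.floor(x / step + 0.5) for x in samples]
    reconstruction = [k * step for k in levels]
    rate = entropy_bits(levels)
    distortion = sum((x - y) ** 2 for x, y in zip(samples, reconstruction)) / len(samples)
    return rate, distortion


# Hypothetical integer-valued test signal standing in for image data.
samples = [round(60 * math.sin(0.05 * n) + 25 * math.sin(0.31 * n)) for n in range(1024)]

rd_curve = [(step, *rd_point(samples, step)) for step in (1, 2, 4, 8, 16, 32)]
for step, rate, distortion in rd_curve:
    print(f"step={step:2d}  R={rate:5.2f} bits/sample  D={distortion:7.2f}")
```

Each quantizer step yields one point in the rate-distortion plane; the finest step gives zero distortion at the highest rate, and the coarsest step gives the lowest rate at the highest distortion.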

Figure 8.5 In predictive coding (DPCM), the pixels are predicted from past values, and the prediction error is quantized, encoded, and transmitted (a). The receiver (b) decodes the error and adds it to its own predicted value. For synchronizing transmitter (a) and receiver (b) predictions, the transmitter incorporates a local decoder in the form of a feedback loop. A prediction template is illustrated in (c). In this template, pixels "A" through "H" are used to predict pixel "X."

Compressors are guided by parameters that are set by the user or application. Examples of parameters are quantizer steps, etc. For a given image, for each parameter choice, the (de)compression system achieves a particular rate $R_0$ and a particular distortion $D_0$; i.e., an instantiation of the coder is a point on the RD plane. Let the set of all realizable RD points be Ω. If Ψ is the set of all parameters to be adjusted, the coder can be seen as a map from the domain of Ψ to points in Ω. We always want to minimize rate and distortion at the same time; that is, we want to have better quality at higher compression ratios:

$$\min_{\Psi}\,\{D \mid R \le R_T\} \quad \text{or} \quad \min_{\Psi}\,\{R \mid D \le D_T\}, \qquad (R, D) \in \Omega \qquad (8.6)$$

where $R_T$ and $D_T$ are the bounds imposed on the rate and on the distortion, respectively. The points satisfying the above equation make the lower (typically) convex hull (LCH) of Ω as illustrated in Figure 8.6. Theoretically, coders should operate at a point on the LCH for best performance. Typically, if optimization is involved, to get Equation 8.6, one can minimize the cost function R + λD. If we move a line with inclination –1/λ, the first point in Ω it touches is the point that minimizes R + λD. This point belongs to the LCH. Actually, varying λ would move the operational point on the LCH.

The important message is that compression performance has a two-dimensional range. It does not matter if a coding method compresses more than another without regard to the distortion. Even if both rate and distortion changes are computed, a fair comparison between methods would require comparing RD curves and regions. One good method is to fix either R or D and compare the other dimension.

Figure 8.6 In all lossy coders, there is an RD trade-off. While Ω is the space of all attainable RD points, there is typically a lower convex hull (LCH) to the distribution. The LCH contains the desirable operational points. Often, the cost function is chosen as R + λD, which is minimized by the point in Ω that first touches the line with inclination –1/λ.

8.2.4 Distortion measure

The computation of distortion involves some measure of the distance between the original and decompressed images. A common measure of distortion between images is the mean squared error (MSE). If the N color channels of the original image are denoted as $C_k(m, n)$ for a pixel at position (m, n), then

$$\mathrm{MSE} = \frac{1}{N_P N} \sum_{k} \sum_{m,n} \left( C_k(m, n) - \hat{C}_k(m, n) \right)^2 \qquad (8.7)$$

where $\hat{C}_k$ denotes the reconstructed image planes, and there are $N_P$ pixels in the image. In practice, the MSE is often presented in another form: the peak signal-to-noise ratio (PSNR). PSNR is by far the most commonly used distortion measure for images and is defined for eight bits per pixel (bpp) images as

$$\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{255^2}{\mathrm{MSE}}\right) \qquad (8.8)$$

While PSNR is easy to compute, it provides a poor approximation of the perceived difference between images, except when the PSNR values are relatively high. In general, MSE-like measures do not anticipate human visual characteristics. For example, if one compares an image with a version of itself spatially shifted by a couple of pixels, the resulting MSE number will be high, even though the images are almost indistinguishable. Also, MSE or PSNR are not commonly applied to color images. Nevertheless, if one compares compression methods yielding high PSNR (e.g., above 30 to 35 dB), an advantage of 1 dB typically implies noticeably better image quality.

The perfect distortion measure still eludes researchers and is the subject of intense debates (see Reference 13, Chapters 11 through 15). Another measure of distortion better suited to color images is CIE's ∆E14,15 and its extension, the so-called spatial ∆E (S∆E).16 In the S∆E, the image is mapped into a device-independent CIELAB color space (and, optionally, further into a linear luminance-chrominance color space). The luminance is filtered using a spatial filter that provides a linear shift invariant approximation to the spatial response of the human visual system (HVS) for luminance signals. Another filter is applied to the chrominance channels, where the filter was properly designed for approximating the HVS chrominance sensitivity. The average CIE ∆E between the filtered images results in the final S∆E number. The whole process is illustrated in Figure 8.7.

8.3 Standard image coders

So far, we have not explained image compression methods in detail. There are two good reasons for this. First, the compression field is so vast that it is impossible to cover it here with reasonable depth. Second, there are a few standard image compression systems that are widely used. The inner workings of common compression engines are well known and available to many implementers. Hence, one can buy them off the shelf, and we can treat them as black boxes that are used to compress a number of monochrome images. The reader is referred to Figure 8.1c for a diagram of a typical color image compression system. After a pixel-by-pixel color transform, each transformed image plane is fed to monochrome compressors whose compressed bitstreams are multiplexed somehow to produce the output compressed stream. The color spaces and color transformations are often application dependent, and we will devote the future sections to the discussion of color transforms and the nature of the resulting color planes. For this discussion, in this chapter, the reader may take these standard compression schemes at face value and treat them almost as black boxes.

The most popular image compression scheme is the JPEG standard.17

Details along with the standard’s text can be easily found in the literature.17

JPEG compresses up to four color planes independently, and details such ascolor spaces are left to the application level. Application information can beconveyed to the receiver through so-called application markers. Interestinglyenough, JPEG is highly popular because of its publicly available implemen-tation developed by the Independent JPEG Group (IJG).18 In the absence ofany standard method for describing color, the IJG defined the “JFIF” appli-cation marker. When present, among other things, it indicates to the decoderthat the color space was YCbCr, which is a simple linear transform of theinput RGB data that will be described later.

Figure 8.7 Distance between two color images for computing distortion: spatial ∆E (S∆E). The color is converted to device-independent CIELAB. The channels are spatially filtered using separate filters for the luminance and chrominance planes. The average ∆E (CIE76) between the filtered CIELAB images can be used as a distortion measure.

JPEG encodes each color plane following the steps in Figure 8.1. The image is broken into blocks of 8 × 8 pixels, and each block is transformed using the DCT, the transformed samples being uniformly quantized. Each of the 64 quantizers is governed by the entries of the “quantizer table,” which contains the quantization steps and is the main degree of freedom in JPEG compression. The quantized data of a block are scanned following a zig-zag path as shown in Figure 8.8. The resulting vector is input to an entropy coder, which combines run-length counting and variable length coding of the counts. The DCT samples are expected to be small for high frequencies, and most are set to zero (discarded) by the quantization process. Effectively, most samples are zero at the end of the zig-zag path, and this fact is exploited for compression. In essence, the more zero-valued quantized samples there are in the zig-zag scanned vector, the higher the compression. A better introduction and complete description should be sought elsewhere.17 For now, we just need to think of it as a coder that processes each color plane independently. JPEG is available in virtually all software that involves digital images.
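The quantize-then-scan step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DCT block and the flat quantizer table are hypothetical, the function names are introduced here, and the entropy coder is left out.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    def key(pos):
        i, j = pos
        s = i + j  # index of the anti-diagonal
        # Odd anti-diagonals are walked with the row increasing,
        # even ones with the column increasing.
        return (s, i if s % 2 else j)
    return sorted(((i, j) for i in range(n) for j in range(n)), key=key)


def quantize(block, table):
    """Uniform quantization: divide each sample by its step and round."""
    n = len(block)
    return [[int(round(block[i][j] / table[i][j])) for j in range(n)]
            for i in range(n)]


def zigzag_scan(block):
    """Read a square block into a one-dimensional list along the zig-zag path."""
    return [block[i][j] for i, j in zigzag_order(len(block))]


def trailing_zeros(vector):
    """Length of the run of zeros at the end of the scanned vector."""
    count = 0
    for value in reversed(vector):
        if value != 0:
            break
        count += 1
    return count


# Hypothetical DCT block whose magnitude decays with frequency,
# and a hypothetical flat quantizer table with step 16.
dct_block = [[400.0 / (1 + i + j) ** 2 for j in range(8)] for i in range(8)]
table = [[16] * 8 for _ in range(8)]
scanned = zigzag_scan(quantize(dct_block, table))
print(scanned[:10], trailing_zeros(scanned))
```

Larger quantizer steps lengthen the zero run at the end of the vector, which is what the run-length stage of the entropy coder takes advantage of.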

Another compression system that has recently been approved as an international standard is JPEG 2000.19–21 JPEG 2000 is meant as an improvement on DCT-based JPEG, but, instead of solely increasing the compression efficiency, it provides increased functionality via new compression paradigms. Conceptually, JPEG 2000 is a transform coder that is based on the steps depicted in Figure 8.1. It employs a wavelet transform and uniform quantization. Commonly, the step sizes are very small, which, in effect, makes the quantization an integer representation of the wavelet transformed data. The data can be represented in binary code so that the transformed coefficients can be seen as formed by bit planes as illustrated in Figure 8.9.

Figure 8.8 Zig-zag scanning of the DCT samples of a JPEG block of pixels into a one-dimensional vector.

Figure 8.9 Each sub-band of the wavelet transform is represented via the sign and magnitude of each of its coefficients, thus forming bit planes (a sign plane, followed by magnitude planes from MSB to LSB). Each bit plane of each sub-band is compressed using arithmetic coding.


In essence, the most significant bit plane is encoded first, followed by the other planes. Compression is achieved by applying arithmetic coders to compress the bit planes, using contextual modeling of the probabilities.19–21
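The sign-and-magnitude bit-plane representation can be illustrated with a minimal Python sketch. The coefficient values and function names below are hypothetical, and the arithmetic coding and context modeling of the actual standard are not modeled; the sketch only shows how decoding the most significant planes first yields a coarse approximation that is refined as more planes arrive.

```python
def to_bit_planes(coeffs, num_planes):
    """Split integers into a sign list and magnitude bit planes, MSB first."""
    signs = [1 if c < 0 else 0 for c in coeffs]
    planes = []
    for bit in range(num_planes - 1, -1, -1):  # from MSB down to LSB
        planes.append([(abs(c) >> bit) & 1 for c in coeffs])
    return signs, planes


def from_bit_planes(signs, planes, num_planes):
    """Rebuild coefficients from the first len(planes) planes received."""
    mags = [0] * len(signs)
    for k, plane in enumerate(planes):
        bit = num_planes - 1 - k
        mags = [m | (b << bit) for m, b in zip(mags, plane)]
    return [-m if s else m for m, s in zip(mags, signs)]


coeffs = [37, -5, 0, 12, -20]  # hypothetical quantized wavelet coefficients
signs, planes = to_bit_planes(coeffs, num_planes=6)
coarse = from_bit_planes(signs, planes[:3], num_planes=6)  # three MSB planes only
exact = from_bit_planes(signs, planes, num_planes=6)       # all six planes
print(coarse, exact)
```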

Compressed data for each bit plane, and for each sub-band, can be encapsulated separately. As a result, data for reconstructing the image can be progressively sent to the receiver. The progression can be either by resolution (sending all bit planes for one sub-band level before proceeding to the next higher resolution level) or by quality (sending the information of one bit plane for all sub-bands before proceeding to the next bit plane). The process is illustrated in Figure 8.10.
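The two progression orders amount to swapping two nested loops. The sketch below is a deliberate simplification: `packet_order` is a hypothetical helper, a packet is identified only by a (resolution level, bit plane) pair, and the additional structure of a real JPEG 2000 codestream is ignored.

```python
from itertools import product


def packet_order(num_levels, num_planes, progression):
    """List (resolution_level, bit_plane) packets in transmission order.

    Level 0 is the coarsest resolution; bit plane 0 is the most significant.
    """
    if progression == "resolution":
        # All bit planes of one level before moving to the next level.
        return [(lvl, bp)
                for lvl, bp in product(range(num_levels), range(num_planes))]
    if progression == "quality":
        # One bit plane for all levels before moving to the next bit plane.
        return [(lvl, bp)
                for bp, lvl in product(range(num_planes), range(num_levels))]
    raise ValueError("progression must be 'resolution' or 'quality'")


print(packet_order(2, 2, "resolution"))
print(packet_order(2, 2, "quality"))
```

Because the packets themselves are identical in both cases, the file can be written once and reordered later, which is the point made in Figure 8.10.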

JPEG 2000 was meant from its conception to be feature rich, and it offers many features beyond those described here. The reader is encouraged to read the standard itself, or one of the many papers on the JPEG 2000 effort, to appreciate its full feature set.19–21 For our purposes, it suffices to say that it is an efficient state-of-the-art wavelet-based compression method that can be used to compress a multitude of color planes. JPEG 2000 provides for a few color transformations. Most interesting is the reversible color transform, which will be discussed later.

Another important compression method is the JPEG-LS standard.22,23 It is a low-complexity compressor aimed at lossless or near-lossless image compression. JPEG-LS is based on pixel prediction and prediction error encoding (a very sophisticated DPCM coder, in essence). Prediction is adaptive, and compression is based on variable length coding.
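As an illustration of adaptive pixel prediction, the sketch below uses the median edge detector (MED) predictor associated with the LOCO-I algorithm underlying JPEG-LS. Everything else is an assumption made for the example: the helper names are hypothetical, missing neighbors at the image border are simply treated as zero, and the context modeling and variable length coding of the real standard are omitted.

```python
def med_predict(a, b, c):
    """MED predictor: a = left, b = above, c = above-left neighbor."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def _pixel(img, r, c):
    # Assumption for this sketch: neighbors outside the image count as 0.
    return img[r][c] if r >= 0 and c >= 0 else 0


def prediction_errors(img):
    """Lossless DPCM: replace every pixel by its prediction error."""
    return [[img[r][c] - med_predict(_pixel(img, r, c - 1),
                                     _pixel(img, r - 1, c),
                                     _pixel(img, r - 1, c - 1))
             for c in range(len(img[0]))]
            for r in range(len(img))]


def reconstruct(errors):
    """Invert prediction_errors by repeating the same predictions in raster order."""
    img = [[0] * len(errors[0]) for _ in errors]
    for r in range(len(errors)):
        for c in range(len(errors[0])):
            pred = med_predict(_pixel(img, r, c - 1),
                               _pixel(img, r - 1, c),
                               _pixel(img, r - 1, c - 1))
            img[r][c] = errors[r][c] + pred
    return img


image = [[10, 10, 50, 50],
         [10, 10, 50, 50],
         [12, 12, 52, 52]]  # hypothetical image with a vertical edge
errors = prediction_errors(image)
print(errors)
```

Most prediction errors are zero or small, so they are cheaper to encode than the pixel values themselves.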

All these JPEG coders were devised to fulfill different purposes and to address different image compression needs, but all of them treat each color image channel independently. Hence, for the following discussion, it would be sufficient to assume the compression will be performed using any one of these compressors (JPEG, JPEG 2000, or JPEG-LS).

Figure 8.10 The data for each bit plane and for each sub-band are packetized. The compressor can write the file once, and the transmitter or decoder can decide how to read the data by parsing and gathering the compressed data in any suitable order (progressive by resolution or by quality; lossy, scalable to lossless; regions of interest).


8.4 Multidimensional color model and transforms

A color image is represented as a finite number of color image planes. Each color is obtained by filtering the image (pixel) spectrum and by measuring the resulting luminosity energy. In this way, each pixel color is represented by a few values, corresponding to a few filters. Usually, three filters (RGB) are used, but multispectral data as well as subtractive spaces such as CMYK use more than three channels. A sampled color image is an array of N-tuples: every pixel is a vector. For a continuous (not spatially sampled) image, another interpretation is that an image is a Riemannian surface in (N + 2)-D space. For the typical three-channel case, the image is a parametric surface in five-dimensional space, where the parameters are chosen as the spatial coordinates x and y. One and only one point of the surface is mapped to each point in the xy plane. To see this, imagine a scan line of the image (one-dimensional signal) and one single-color signal (monochrome). Then, the “image” is a simple function as in Figure 8.3c. If the image is a single two-dimensional plane (monochrome), it is a surface; i.e., it is parameterized in two dimensions as illustrated in Figure 8.11a and 8.11b. To add more than one color signal, imagine a one-dimensional signal (scan line) and two color signals. The “image” would be a line in three dimensions as depicted in Figure 8.11c. Extending the space to two dimensions and the number of color channels to three, it is easy to conclude that a three-color image is a surface in five-dimensional space parameterized by two of the five axes. This abstraction is useful to re-emphasize that there is correlation not only across color planes or across space within a color plane, but also across both color planes and space coordinates. In other words, correlation exists within a five-dimensional space! Furthermore, this simple topological representation is linear.

Of interest to us are the properties of color images related to compression. In particular, well-correlated signals or regions without much detail tend to compress better than more noisy or detailed images; i.e., smooth regions compress better than regions involving too many edges.

Let $C_k(m, n)$ denote the kth color plane at pixel position (m, n), and let c(m, n) be a vector containing all such color planes; i.e., $c^T(m, n) = [C_0(m, n), C_1(m, n), \ldots, C_{N-1}(m, n)]$. Let us form a composite vector

$$u^T = [c^T(m, n), \ldots, c^T(m + k_1, n + k_2), \ldots], \qquad (k_1, k_2) \in \Xi \qquad (8.9)$$

where Ξ is a set determining a neighborhood of K pixels around the origin. A measure of correlation among neighbor pixels across color planes can be made via the autocorrelation matrix of u, i.e.,

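$$R_u = E\{u u^T\} = \begin{bmatrix} \Gamma_0 & \Gamma_1 & \cdots & \Gamma_K \\ \Gamma_1 & \Gamma_0 & & \vdots \\ \vdots & & \ddots & \\ \Gamma_K & \cdots & & \Gamma_0 \end{bmatrix} \qquad (8.10)$$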


where Γ0 is the plain correlation across color planes, while ΓK represents the correlation of color values over spatially displaced pixels.

To decorrelate the samples of u, one can use a transformation A such that y = Au. The KLT we mentioned earlier is the one such that

$$R_y = E\{y y^T\} = A R_u A^T \qquad (8.11)$$

is a diagonal matrix.2,4,5,7,9 This can be accomplished by choosing the rows of A as being the eigenvectors of Ru. Note that the transform A will simultaneously decorrelate the image within and across color planes. In other words, it achieves both color and space decorrelation.
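The reason the eigenvectors diagonalize the autocorrelation matrix can be written out in one line. The symbols $a_i$ (the ith row of A, taken as a column vector), $\lambda_i$ (its eigenvalue), and $\delta_{ij}$ (the Kronecker delta) are introduced here for illustration, and the eigenvectors are assumed to be orthonormal, which is always possible for a symmetric matrix such as $R_u$:

```latex
R_u a_j = \lambda_j a_j, \quad a_i^T a_j = \delta_{ij}
\;\;\Longrightarrow\;\;
\left[ A R_u A^T \right]_{ij} = a_i^T R_u a_j = \lambda_j \, a_i^T a_j = \lambda_j \, \delta_{ij} .
```

Hence $R_y$ has the eigenvalues $\lambda_j$ on its diagonal and zeros elsewhere; the eigenvalues are the variances of the transformed samples.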

Figure 8.11 Geometric representation of color images: (a) a monochrome image displayed as light intensity, (b) the same image represented as a surface in three dimensions, and (c) a one-dimensional image with two color channels represented as a parametric line in three dimensions.

The following example will illustrate the process. Assume the immediate horizontal, vertical, and diagonal neighbors so that $\Gamma_h = E\{c(m, n)c^T(m + 1, n)\}$, $\Gamma_v = E\{c(m, n)c^T(m, n + 1)\}$, and $\Gamma_d = E\{c(m, n)c^T(m + 1, n + 1)\}$. For the RGB image shown in Figure 8.12, $u^T = [r(m, n), g(m, n), b(m, n), r(m, n + 1), \ldots, g(m + 1, n + 1), b(m + 1, n + 1)]$ and the following correlation matrices were found:

$$
\begin{aligned}
\Gamma_0 &= \begin{bmatrix} 0.9757 & 0.9551 & 0.8661 \\ 0.9551 & 1.0000 & 0.9242 \\ 0.8661 & 0.9242 & 0.9185 \end{bmatrix}, &
\Gamma_h &= \begin{bmatrix} 0.9392 & 0.9195 & 0.8346 \\ 0.9195 & 0.9630 & 0.8912 \\ 0.8346 & 0.8912 & 0.8862 \end{bmatrix}, \\
\Gamma_v &= \begin{bmatrix} 0.9295 & 0.9073 & 0.8212 \\ 0.9073 & 0.9521 & 0.8788 \\ 0.8212 & 0.8788 & 0.8749 \end{bmatrix}, &
\Gamma_d &= \begin{bmatrix} 0.9096 & 0.8889 & 0.8059 \\ 0.8889 & 0.9329 & 0.8624 \\ 0.8059 & 0.8624 & 0.8591 \end{bmatrix}
\end{aligned}
\qquad (8.12)
$$

As expected, the within-pixel cross correlation is the largest, followed by the correlation among planes of horizontal or vertical neighbor pixels. What is less expected is the large correlation among samples of different planes and pixels. The matrix A that diagonalizes Ru is shown in Figure 8.13, while the standard deviations of the transformed signals are

$$
[\operatorname{diag}(R_y)]^{1/2} = [\,1.0000 \;\; 0.1716 \;\; 0.1448 \;\; 0.1196 \;\; 0.0788 \;\; 0.0684 \;\; 0.0221 \;\; 0.0116 \;\; 0.0113 \;\; 0.0071 \;\; 0.0050 \;\; 0.0014\,] \qquad (8.13)
$$

Figure 8.12 Color (RGB) channels of the image used for the multidimensional transform example.

$$
A = \left[\begin{array}{rrrrrrrrrrrr}
0.2888 & 0.2974 & 0.2796 & 0.2888 & 0.2974 & 0.2796 & 0.2888 & 0.2974 & 0.2796 & 0.2888 & 0.2974 & 0.2796 \\
-0.3490 & 0.0022 & 0.3580 & -0.3490 & 0.0022 & 0.3580 & -0.3490 & 0.0022 & 0.3580 & -0.3490 & 0.0022 & 0.3580 \\
0.2913 & 0.2951 & 0.2794 & 0.2913 & 0.2951 & 0.2794 & -0.2913 & -0.2951 & -0.2794 & -0.2913 & -0.2951 & -0.2794 \\
-0.2953 & -0.2995 & -0.2704 & 0.2953 & 0.2995 & 0.2704 & -0.2953 & -0.2995 & -0.2704 & 0.2953 & 0.2995 & 0.2704 \\
0.2117 & -0.4020 & 0.2089 & 0.2117 & -0.4020 & 0.2089 & 0.2117 & -0.4020 & 0.2089 & 0.2117 & -0.4020 & 0.2089 \\
-0.2868 & -0.2961 & -0.2830 & 0.2868 & 0.2961 & 0.2830 & 0.2868 & 0.2961 & 0.2830 & -0.2868 & -0.2961 & -0.2830 \\
0.3372 & 0.0008 & -0.3691 & -0.3372 & -0.0008 & 0.3691 & 0.3372 & 0.0008 & -0.3691 & -0.3372 & -0.0008 & 0.3691 \\
-0.2215 & 0.4004 & -0.2015 & 0.2215 & -0.4004 & 0.2015 & -0.2215 & 0.4004 & -0.2015 & 0.2215 & -0.4004 & 0.2015 \\
-0.3739 & 0.3292 & 0.0422 & -0.3739 & 0.3292 & 0.0422 & 0.3739 & -0.3292 & -0.0422 & 0.3739 & -0.3292 & -0.0422 \\
-0.2014 & -0.1988 & 0.4122 & 0.2014 & 0.1988 & -0.4122 & 0.2014 & 0.1988 & -0.4122 & -0.2014 & -0.1988 & 0.4122 \\
0.1591 & 0.2336 & -0.4125 & 0.1591 & 0.2336 & -0.4125 & -0.1591 & -0.2336 & 0.4125 & -0.1591 & -0.2336 & 0.4125 \\
0.3566 & -0.3504 & 0.0052 & -0.3566 & 0.3504 & -0.0052 & -0.3566 & 0.3504 & -0.0052 & 0.3566 & -0.3504 & 0.0052
\end{array}\right]
$$

Figure 8.13 Transformation matrix A for the example image.


Note the rapid decay of the energy of the transformed samples, i.e., the high energy compaction, which is often a sign of high compression. The multidimensional transform will outperform the compaction provided by separate color transforms followed by linear spatial transforms (such as DCT) of the same sizes. Despite the theoretical advantage, these transforms are not commonly known or used. They are image dependent, and some image-independent separate transforms also provide reasonable performance, as we will discuss later.
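The degree of energy compaction can be quantified directly from the standard deviations listed in Equation 8.13. The short Python sketch below treats the squared standard deviation of each transformed sample as its energy (which assumes zero-mean samples) and accumulates the fraction of the total; the function name is introduced here for illustration.

```python
# Standard deviations of the 12 transformed samples, from Equation 8.13.
std_devs = [1.0000, 0.1716, 0.1448, 0.1196, 0.0788, 0.0684,
            0.0221, 0.0116, 0.0113, 0.0071, 0.0050, 0.0014]


def cumulative_energy_fraction(std_devs):
    """Fraction of the total energy captured by the first k samples, for each k."""
    energies = [s * s for s in std_devs]
    total = sum(energies)
    fractions, running = [], 0.0
    for energy in energies:
        running += energy
        fractions.append(running / total)
    return fractions


fractions = cumulative_energy_fraction(std_devs)
for k in (1, 3, 6):
    print(f"first {k} samples: {100 * fractions[k - 1]:.1f}% of the energy")
```

For this example, the first transformed sample alone carries roughly 93% of the total energy.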

8.5 Color transforms

Often, the spatial and color transforms are independent, as depicted in Figure 8.1c. Each pixel is first transformed separately to remove color redundancy, then a spatial transform is applied to reduce the spatial redundancy.

If the color transform is a linear matrix Q, the within-pixel color transform process amounts to constraining the space–color transform to be a block diagonal matrix,

A = diag(Q, Q, …, Q) (8.14)

The reason for using color transforms is to enhance the compression performance. As we discussed, compression is improved if we reduce both rate and distortion or a cost function that is a linear combination of both. Typically, “smooth” images (i.e., those lacking too many details and sharp edges) are more easily compressed than textured and detailed images. By “more easily,” we mean achieving higher compression for the same distortion or less distortion for the same compression. How do we choose Q in Equations 8.11 and 8.14 so as to favor compression? It can be shown that, if the transform has the form of Equation 8.14, and if we make Ry = B⊗D, we can optimize the RD trade-off, given some mild conditions, where B is some Toeplitz matrix, D is a diagonal matrix, and ⊗ denotes the Kronecker product.

Decorrelation of the color planes without spatial considerations is achieved by the pixel-wise Karhunen-Loeve transform (KLT).4,5 To find the KLT, one just uses Ξ = {(0, 0)} in Equation 8.9, or u = c(m, n), so that Ru = Γ0 in Equation 8.10, and A = Q is selected as the matrix containing the eigenvectors of Ru = Γ0. The KLT approach is general and should provide good performance for any color space, including multispectral data.
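A pixel-wise KLT can be computed for the example image using the matrix Γ0 of Equation 8.12. The sketch below uses the classical Jacobi rotation method for symmetric matrices, chosen here only because it needs no external library; the function names are hypothetical, and any eigen-decomposition routine would serve equally well.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def klt(r, sweeps=50, tol=1e-20):
    """Return (Q, variances); the rows of Q are eigenvectors of symmetric r."""
    n = len(r)
    d = [row[:] for row in r]
    v = [[float(i == j) for j in range(n)] for i in range(n)]  # accumulated rotations
    for _ in range(sweeps):
        off = sum(d[p][q] ** 2 for p in range(n) for q in range(p + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if d[p][q] == 0.0:
                    continue
                if d[p][p] == d[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * d[p][q] / (d[q][q] - d[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                rot = [[float(i == j) for j in range(n)] for i in range(n)]
                rot[p][p], rot[q][q], rot[p][q], rot[q][p] = c, c, s, -s
                d = matmul(transpose(rot), matmul(d, rot))  # zeroes d[p][q]
                v = matmul(v, rot)
    order = sorted(range(n), key=lambda i: -d[i][i])  # largest variance first
    q_matrix = [[v[row][i] for row in range(n)] for i in order]
    return q_matrix, [d[i][i] for i in order]


# Within-pixel RGB correlation matrix from Equation 8.12.
gamma0 = [[0.9757, 0.9551, 0.8661],
          [0.9551, 1.0000, 0.9242],
          [0.8661, 0.9242, 0.9185]]
Q, variances = klt(gamma0)
Ry = matmul(matmul(Q, gamma0), transpose(Q))
print(variances)
```

The resulting `Ry` is diagonal up to rounding error, and almost all of the variance falls on the first transformed plane, which plays the role of a luminance-like channel.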

The disadvantage of the KLT is the fact that one needs to gather the statistics of the image (Ru). However, some useful transforms provide reasonable plane decorrelation for most typical images. Color spaces such as YIQ, YUV, and YES are simple linear transformations of linear RGB planes. In fact, a very important (perhaps the most used) color transformation is the one that brings RGB into YCbCr. YCbCr is a variant of YUV and is defined by the following matrix transformation:


$$
Q_{YCC} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.168 & -0.332 & 0.5 \\ 0.5 & -0.418 & -0.082 \end{bmatrix}, \qquad
Q_{YCC}^{-1} = \begin{bmatrix} 1 & 0 & 1.402 \\ 1 & -0.344 & -0.714 \\ 1 & 1.772 & 0 \end{bmatrix} \qquad (8.15)
$$

In the JPEG 2000 jargon, the above transform is also referred to as an irreversible color transform (ICT),19–21 as it uses floating point numbers, and the YCbCr samples need to be re-quantized. However, in JPEG 2000, one can use reversible wavelets that would allow lossless compression. All the efforts to provide lossy-to-lossless scalability in the JPEG 2000 architecture would be in vain if the ICT were used. The JPEG 2000 committee decided to approve an approximation of ICT as an optional transform that would allow total reversibility while providing reasonable decorrelation. Such an approximation is the reversible color transform (RCT), which is defined by the following formulae:19

$$
Y_0 = \left\lfloor \frac{C_0 + 2C_1 + C_2}{4} \right\rfloor, \qquad
Y_1 = C_2 - C_1, \qquad
Y_2 = C_0 - C_1 \qquad (8.16)
$$

where Y0, Y1, Y2 are the transformed color planes, and ⌊⋅⌋ denotes the “floor” operator; i.e., round down to the nearest integer. The original color planes can be perfectly reconstructed from an integer representation of Y0, Y1, Y2 as

$$
C_1 = Y_0 - \left\lfloor \frac{Y_1 + Y_2}{4} \right\rfloor, \qquad
C_0 = Y_2 + C_1, \qquad
C_2 = Y_1 + C_1 \qquad (8.17)
$$

Note that Y1 and Y2 require one more precision bit than Y0.

Other important color spaces for compression are CIELAB and CIELUV, which are covered elsewhere in this handbook. Color fax systems demand compression of images in a CIELAB color space.24 In all these cases, the conversion takes RGB data into some luminance–chrominance color space. An important aspect of these color spaces is that the human eye has lower sensitivity to high-frequency components (details) of the chrominance images. Hence, it is easy to subsample, or rather compress, more aggressively the chrominance components.
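The irreversible and reversible color transforms of Equations 8.15 through 8.17 can be sketched in Python as follows. The function names are introduced here for illustration; the matrix entries are the three-decimal values quoted in Equation 8.15, and the RCT relies on Python's `//` operator, which implements the floor operation for negative values as well.

```python
# Forward matrix of Equation 8.15 (three-decimal values as quoted in the text).
Q_YCC = [[0.299, 0.587, 0.114],
         [-0.168, -0.332, 0.5],
         [0.5, -0.418, -0.082]]


def rgb_to_ycbcr(r, g, b):
    """Irreversible transform: a floating-point matrix product."""
    return tuple(row[0] * r + row[1] * g + row[2] * b for row in Q_YCC)


def rct_forward(c0, c1, c2):
    """Reversible transform on integers (Equation 8.16)."""
    return (c0 + 2 * c1 + c2) // 4, c2 - c1, c0 - c1


def rct_inverse(y0, y1, y2):
    """Exact integer inverse (Equation 8.17)."""
    c1 = y0 - (y1 + y2) // 4
    return y2 + c1, c1, y1 + c1


print(rgb_to_ycbcr(128, 128, 128))  # a gray pixel has (nearly) zero chrominance
print(rct_forward(200, 30, 90), rct_inverse(*rct_forward(200, 30, 90)))
```

The round trip through `rct_forward` and `rct_inverse` is exact because C1 is an integer, so the floor in Equation 8.16 only acts on the term (Y1 + Y2)/4, which the decoder can recompute. For 8-bit inputs, Y1 and Y2 range from –255 to 255, which is why they need one extra bit.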


Other noteworthy transformations are those pertaining to the compression of CMYK image data.25–27 JPEG’s SPIFF file header specification28 defines the YCbCrK color space as a derivation of CMYK data. Let us start with the negative (inverse) of CMY, i.e., C0C1C2C3 are in fact RGBK. Then, the transformation from RGBK (CMYK) to YCbCrK is simply

$$
Q_{YCCK} = \begin{bmatrix} Q_{YCC} & \begin{matrix} 0 \\ 0 \\ 0 \end{matrix} \\ \begin{matrix} 0 & 0 & 0 \end{matrix} & 1 \end{bmatrix} \qquad (8.18)
$$

An improved version of the above transformation is the one that brings the data into the Y+Y–CbCr or YYCC color space. If we invert all CMYK channels so that C0C1C2C3 becomes RGBW, where W (white) is just the inverse of the K (black or key) channel, then the transform is defined as

$$
Q_{YYCC} = \begin{bmatrix} 1/2 & 0 & 0 & 1/2 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1/2 & 0 & 0 & -1/2 \end{bmatrix}
\begin{bmatrix} Q_{YCC} & \begin{matrix} 0 \\ 0 \\ 0 \end{matrix} \\ \begin{matrix} 0 & 0 & 0 \end{matrix} & 1 \end{bmatrix} \qquad (8.19)
$$

which can be implemented as in the flow graph depicted in Figure 8.14. Note that CMYK planes are often device-specific data. Simple linear transformations of CMYK data will only aim to compact the data better, while remaining oriented to a particular device.

Figure 8.14 Color transform implementation from CMYK to Y+Y–CbCr (or YYCC), where I means inversion (negative).

Several devices (e.g., printers with more than four inks) use additional colorants, for example, by adding orange colorants to the CMYK set. Multispectral data is also relevant and may use many color planes. For these cases, there is no common transform method. The KLT would work for all cases, but sometimes the image statistics are unavailable. There are some proposals of using conventional transforms such as DCT or wavelets to transform the samples across color planes. Nevertheless, their efficacy is still being studied and naturally varies with each case.
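The CMYK-to-YYCC transform of Equation 8.19 can be sketched as below. This is an illustration under stated assumptions: the planes are taken to be normalized to the range [0, 1], inversion is taken as one minus the value, and `cmyk_to_yycc` is a name introduced here. The output order (Y+, Cb, Cr, Y–) follows the rows of the first matrix in Equation 8.19.

```python
# Forward YCbCr matrix of Equation 8.15.
Q_YCC = [[0.299, 0.587, 0.114],
         [-0.168, -0.332, 0.5],
         [0.5, -0.418, -0.082]]


def cmyk_to_yycc(c, m, y, k):
    """Map one CMYK pixel (values in [0, 1]) to (Y+, Cb, Cr, Y-)."""
    r, g, b, w = 1 - c, 1 - m, 1 - y, 1 - k  # inversion: CMYK -> RGBW
    lum, cb, cr = (row[0] * r + row[1] * g + row[2] * b for row in Q_YCC)
    # Sum and difference of the RGB-derived luminance and the W plane.
    return (lum + w) / 2, cb, cr, (lum - w) / 2


print(cmyk_to_yycc(0.0, 0.0, 0.0, 0.0))  # unprinted paper
print(cmyk_to_yycc(0.0, 0.0, 0.0, 1.0))  # pure black ink
```

When the K plane is strongly correlated with the luminance of the inverted CMY planes, the difference channel Y– carries little energy, which is what makes this transform attractive for compression.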

8.6 Compressing RGB images

As we discussed, transforms that aim to decorrelate the color planes can also produce planes that are more easily compressed. Figure 8.15 shows an example of three color planes before and after color transformation to YCbCr. Note how the CbCr channels are much smoother than the RGB channels. If we go back to the image planes shown in Figure 8.12, apply a transformation to YCbCr, and then remeasure the correlation as in Equation 8.12, we obtain

$$
\begin{aligned}
\Gamma_0 &= \begin{bmatrix} 1.0000 & 0.1140 & 0.0204 \\ 0.1140 & 0.0559 & 0.0239 \\ 0.0204 & 0.0239 & 0.0739 \end{bmatrix}, &
\Gamma_h &= \begin{bmatrix} 0.9424 & 0.1103 & 0.0196 \\ 0.1103 & 0.0541 & 0.0232 \\ 0.0196 & 0.0232 & 0.0715 \end{bmatrix}, \\
\Gamma_v &= \begin{bmatrix} 0.9281 & 0.1116 & 0.0232 \\ 0.1116 & 0.0539 & 0.0234 \\ 0.0232 & 0.0234 & 0.0712 \end{bmatrix}, &
\Gamma_d &= \begin{bmatrix} 0.9030 & 0.1089 & 0.0220 \\ 0.1089 & 0.0531 & 0.0231 \\ 0.0220 & 0.0231 & 0.0804 \end{bmatrix}
\end{aligned}
\qquad (8.20)
$$

Figure 8.15 YCbCr color channels compared to the original RGB ones. Note the lack of details in the chrominance (CbCr) channels.


Note how decorrelated the planes are as compared to the RGB channels in Equation 8.12. The cross correlations are largely reduced, and the energy is concentrated in one channel. If we model the image planes as a first-order Markov process with correlation coefficient ρ, in this example, this coefficient is about 0.92 to 0.94 for the luminance channel and about 0.96 to 0.97 for the chrominance ones. This fact is additional evidence that the chrominance channels are typically “smoother” than luminance channels. Similar conclusions can be reached by inspecting the Fourier transforms of the resulting YCbCr color planes.
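The correlation coefficients quoted above can be recovered from the diagonal entries of the matrices in Equation 8.20. The sketch below assumes that the plane order in those matrices is Y, Cb, Cr and estimates ρ for each plane as the ratio between the one-pixel-lag correlation and the zero-lag correlation; the dictionary and function names are introduced here for illustration.

```python
# Diagonal entries of the matrices in Equation 8.20 (plane order assumed Y, Cb, Cr).
gamma0_diag = {"Y": 1.0000, "Cb": 0.0559, "Cr": 0.0739}
gamma_h_diag = {"Y": 0.9424, "Cb": 0.0541, "Cr": 0.0715}
gamma_v_diag = {"Y": 0.9281, "Cb": 0.0539, "Cr": 0.0712}


def markov_rho(lag1, lag0):
    """First-order Markov estimate: rho = R(1) / R(0) for each plane."""
    return {plane: lag1[plane] / lag0[plane] for plane in lag0}


rho_h = markov_rho(gamma_h_diag, gamma0_diag)
rho_v = markov_rho(gamma_v_diag, gamma0_diag)
for plane in ("Y", "Cb", "Cr"):
    print(f"{plane}: horizontal {rho_h[plane]:.3f}, vertical {rho_v[plane]:.3f}")
```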

It is clear that the transformation RGB–YCbCr produces “compression-friendly” color planes. The question is, how do we compare this space with, for example, device-independent CIELAB? To answer this question, RD plots are shown in Figure 8.16, comparing JPEG compression using both CIELAB and YCbCr. In these experiments, default (example) luminance and chrominance tables were used, and distortion is given as both PSNR and S∆E. Also, the RD curves in Figure 8.16 were obtained by varying a scaling parameter for both the luminance and chrominance quantizer tables. This scaling is also known as the quality factor and gives one knob to regulate compression; i.e., a single parameter yields the RD points, hence a curve in RD space. The plots shown in Figure 8.16 are averages over several images. For the same rate, YCbCr typically yields higher PSNR or lower S∆E. Conversely, for the same distortion, YCbCr typically demands less rate than CIELAB. The plots in Figure 8.16 are typical and serve to illustrate that it is commonly advantageous to compress RGB images using the transformation to YCbCr instead of compressing under CIELAB. The exception is for very low bit rates, where CIELAB becomes more competitive.

Figure 8.16 RD plots comparing compression using YCbCr and CIELAB (PSNR in dB and S∆E plotted against rate in bpp). Results show average distortion for several images using the JPEG coder. The YCbCr space is typically slightly superior to CIELAB, except for very low bit rates.

A typical JPEG (or JPEG 2000) compressor implementation would treat the luminance channel differently from the chrominance ones. The differentiation can be to employ different quantizer tables and to subsample the chrominance planes. Often, CbCr will be reduced by a factor of two in each direction before compression. The claim is that, because we are less sensitive to the high frequency of chrominance components, one could reduce the image planes from the start without loss of visual fidelity. The problem with that argument is that, by reducing the data, one increases distortion and compression. If one does not reduce the chrominance planes but instead relies on increasing the quantizer steps, one would also increase distortion and compression. The question is, which one is a better trade-off? Because of subsampling, data are irreversibly lost no matter how high the bit rate. So, for high enough bit rates, it is better not to subsample the planes. For low rates, subsampling artifacts might be better than compression artifacts. So, the best approach may actually depend on the bit-rate target. There is a breakpoint at which the curves with and without subsampling would cross. To clarify this issue, RD plots comparing the performance of a JPEG coder with and without chrominance (Cb and Cr) subsampling are shown in Figure 8.17. The plots were obtained in the same conditions described for Figure 8.16. For rates above a certain breakpoint, it is always better not to subsample the CbCr planes. This breakpoint is commonly around 0.2 bpp, i.e., a compression ratio of about 120:1 (starting with the original 24-bpp RGB image).
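The arithmetic behind chrominance subsampling and the quoted breakpoint can be sketched as follows. The averaging of 2 × 2 blocks is only one possible downsampling filter and is an assumption of this example, as are the function names; the text does not prescribe a particular filter.

```python
def subsample_2x2(plane):
    """Halve a plane in each direction by averaging 2 x 2 blocks.

    Assumes the plane has even dimensions.
    """
    return [[(plane[r][c] + plane[r][c + 1]
              + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
             for c in range(0, len(plane[0]), 2)]
            for r in range(0, len(plane), 2)]


def samples_per_pixel(subsample_chroma):
    """One luminance sample per pixel plus two chrominance planes."""
    return 1 + 2 * (0.25 if subsample_chroma else 1.0)


def compression_ratio(original_bpp, compressed_bpp):
    return original_bpp / compressed_bpp


print(subsample_2x2([[1, 3], [5, 7]]))
print(samples_per_pixel(False), samples_per_pixel(True))
print(compression_ratio(24, 0.2))  # the breakpoint rate quoted for 24-bpp RGB
```

Subsampling halves the number of samples handed to the coder (from 3 to 1.5 per pixel) before any quantization takes place, which is why the discarded detail cannot be recovered at any bit rate.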

It is safe to say that, for the average compression of RGB images, one would be better off using the YCbCr transformation and not subsampling chrominance planes.

8.7 Compressing CMYK images

CMYK data is targeted for a particular device. The correlation between K and the other channels can change drastically from device to device. A color transform for compressing CMYK images likely would not work across all devices. For that reason, we would be content with a “good” solution that works across most devices, as the alternative is a case-by-case study.

Figure 8.17 RD plots comparing the performance of a JPEG coder with and without chrominance subsampling (PSNR in dB and S∆E plotted against rate in bpp). For rates above a certain breakpoint, it is always better not to subsample the CbCr (or ab) planes. This breakpoint is commonly ≈0.2 bpp.


The key difference between compressing CMYK and RGB images lies in the concept of luminance–chrominance models.25–27 Even though “luminance” and “chrominance” are derived from colorimetric principles (e.g., luminance aligned to an achromatic axis), as far as compression is concerned, what determines an image's luminance is the spatial characteristics of the color planes. With three-plane images, it is easy to define luminance and chrominance. In Figure 8.15, it is natural to assign the channel that most resembles a “monochrome” version of the image as the luminance channel. However, if we replace the Y channel by any of the RGB channels, i.e., an RGB to RCbCr transformation, one would likely designate R as the luminance. This is so because the “chrominance” channels are definitely distinct from what one perceives as a monochrome version of the color image. Figure 8.18a through 8.18d shows typical CMYK color planes, already inverted (i.e., 1 – C, 1 – M, 1 – Y, 1 – K, or RGBW). This particular rendering was performed for a given xerographic device. Note the large correlation between the K color plane and each of the others. That, of course, depends on the strategy for calculating K from CMY. If we use the transformation from CMYK to YYCC depicted in Figure 8.14, the resulting color planes corresponding to those in Figure 8.18a through 8.18d are shown in Figure 8.18e through 8.18h. Back to our discussion on luminance vs. chrominance, Figure 8.18e likely contains what most of us would call “luminance,” whereas Figure 8.18g and 8.18h contain images that we would call “chrominance.” However, it is difficult to fit the image in Figure 8.18f to any of the “models.” It contains typical luminance and chrominance spatial characteristics. This ambiguity is amplified for multispectral data where there are more channels, and any transformation of the input data will produce some with both luminance and chrominance spatial characteristics. So, why do we need to designate a color plane as luminance or chrominance anyway? It is common for compressors to have more aggressive settings for chrominance planes than for luminance. By tagging the channel as chrominance (thus, visually less important), one can exploit the benefits of more aggressive compression while being more conservative in compressing the luminance. This is a complex issue beyond the scope of this chapter. It is advisable, however, in the YYCC case to apply “chrominance” settings to two of the channels while applying “luminance” settings to the other two.

Figure 8.18 CMYK color planes for a test image. The inverses of CMYK planes are shown in (a) through (d), respectively. After the transformation to YYCC space, the color planes are (e) Y+, (f) Y–, (g) Cb, and (h) Cr.

There are several transform options for the compression of CMYK data, including

1. No transform, i.e., compress CMYK independently
2. Compress YCbCrK planes
3. Compress YYCC planes

Figure 8.19 shows RD plots for the different compression schemes for several images. Distortion is given as both S∆E and PSNR. The plots include results with and without subsampling of the CbCr planes in the YCbCrK and YYCC schemes. YYCC typically outperforms the others for most bit rates. As in the case of compressing RGB images, chrominance subsampling is effective only for high compression ratios, i.e., ratios larger than about 150:1 (less than roughly 0.2 bpp, starting with CMYK at 8 bpp each). Summarizing, YYCC space without chrominance subsampling seems to be a good choice for compressing CMYK data.

Figure 8.19 Compression performance. For a number of CMYK images rendered for different devices, the RD plots compare the JPEG compression of the planes under three different color spaces (CMYK, YCbCrK, and YYCC). Distortion, plotted against rate in bits/pel, was computed as both S∆E and PSNR. For the YCbCrK and YYCC cases, tests were performed with and without chrominance subsampling.

8.8 Summary

In this chapter, we exposed the reader to the basic aspects of the compression of digital color images. That includes basic compression concepts such as coding and quantization and the fact that compression is achieved by removing the statistical redundancy and the visual irrelevancy contained in the image. In fact, removing these two types of information determines whether one attains lossy or lossless compression. We have presented the transform coding model, along with the motivation for energy compaction image transformation in the context of compression. For that, the DCT and the wavelet transform were described.

Another model is the predictive coding model, which is popular for lossless and near-lossless compression. The key factor in devising and applying compressors to images is to understand that there is always a rate-distortion trade-off in setting up compression parameters. The best operational point lies somewhere on the LCH of the RD points. As for distortion, popular objective distortion measures are the PSNR and the S∆E distance. Standard coders, including JPEG 2000, JPEG, and JPEG-LS, exist and are ready for use. We should use them but understand the effects of their parameter choices on the image quality, along with the choice of the proper color transform.

We did not intend to describe compression systems in detail, which are often designed for monochrome images. Furthermore, the popular compression systems are international standards, which are very well covered elsewhere. The focus of this chapter was on the interaction between the color image representation and existing compression systems.

A multidimensional color model was discussed to show that there might be strong correlation within and across pixels, simultaneously. We have shown simple color and spatial correlation measurements. Even though there is a multidimensional correlation, most of the compression systems apply color transforms independently from the spatial transform. Popular color transforms for compression were discussed in detail. That includes the transformation from RGB to YCbCr and the KLT. The YCbCr transformation is the irreversible color transform. This, and the reversible color transform, make up the main color transform options for JPEG 2000. Apart from the KLT, other color transforms for non-RGB data were discussed, including the transformations from CMYK to YCbCrK and to YYCC. The concepts of luminance vs. chrominance were discussed with respect to the compression settings. Compression of RGB and CMYK images was discussed, comparing color transforms and other settings.

We aimed at presenting a few basic compression concepts applied to color images. We regard this chapter as an introduction to the subject, so rather than serving as a thorough reference, we hope it will inspire the reader to explore the subject further in the following references and through independent experimentation.

References

1. Rao, K. R. and Hwang, J., Techniques and Standards for Image, Video, and Audio Coding, Prentice-Hall, Upper Saddle River, NJ, 1996.

2. Netravali, A. and Haskell, B., Digital Pictures: Representation and Compression, Plenum Press, New York, 1988.

3. Storer, J. A., Ed., Image and Text Compression, Kluwer Academic, Norwell, MA, 1992.

4. Pratt, W., Digital Image Processing, John Wiley & Sons, New York, 1978.

5. Rao, K. R. and Yip, P., Eds., The Transform and Data Compression Handbook, CRC Press, Boca Raton, FL, 2001.

6. Rabbani, M. and Jones, P., Digital Image Compression Techniques, SPIE Press, Bellingham, WA, 1991.

7. Gersho, A. and Gray, R., Vector Quantization and Signal Compression, Kluwer Academic, Norwell, MA, 1992.

8. Gallager, R. G., Information Theory and Reliable Communication, John Wiley & Sons, New York, 1968.

9. Rao, K. R. and Yip, P., Discrete Cosine Transform, Algorithms, Advantages and Applications, Academic Press, San Diego, CA, 1990.

10. Vetterli, M. and Kovacevic, J., Wavelets and Subband Coding, Prentice-Hall, Englewood Cliffs, NJ, 1995.

11. Strang, G. and Nguyen, T., Wavelets and Filter Banks, Wellesley-Cambridge, Wellesley, MA, 1996.

12. Gray, R., Source Coding Theory, Kluwer Academic, Norwell, MA, 1993.

13. Watson, A. B., Ed., Digital Images and Human Vision, MIT Press, Cambridge, MA, 1993.

14. CIE Publication No. 15.2, Colorimetry, Bureau Central de la CIE, Vienna, 1986.

15. Robertson, A. R., Historical development of CIE recommended color difference equations, Color Res. Appl., 15(3), 167–170, 1990.

16. Zhang, X. M. and Wandell, B. A., A spatial extension to CIELAB for digital color image reproduction, in Proc. Soc. for Info. Display Symp., 1996.

17. Pennebaker, W. B. and Mitchell, J. L., JPEG: Still Image Compression Standard, Van Nostrand Reinhold, New York, 1993.

18. Independent JPEG Group Library, http://www.ijg.org.

19. Taubman, D. and Marcellin, M., JPEG 2000: Image Compression Fundamentals, Standards, and Practice, Kluwer Academic Press, Dordrecht, the Netherlands, 2001.

20. Marcellin, M., Gormish, M. J., Bilgin, A., and Boliek, M., An overview of JPEG-2000, in Proc. 2000 Data Compression Conference, Snowbird, Utah, March 2000.

21. Christopoulos, C., Skodras, A., and Ebrahimi, T., The JPEG 2000 still image coding system: an overview, IEEE Trans. Consumer Electronics, 46(4), 1103–1127, 2000.

22. ISO/IEC FCD 14495-1 - Lossless and near-lossless compression of continuous-tone still images, http://www.jpeg.org/public/fcd14495p.pdf, 1997.

23. Weinberger, M., Seroussi, G., and Sapiro, G., From LOCO-I to JPEG-LS standard, in Proc. Int. Conf. Image Proc. (ICIP’99), 24A01.7, Kobe, Japan, 1999.


24. Buckley, R., Venable, D., and McIntyre, L., New developments in color facsimile and internet fax, in Proc. IS&T’s Fifth Color Imaging Conference, Scottsdale, AZ, November 1997, 296–300.

25. de Queiroz, R., On independent color space transformations for the compression of CMYK images, IEEE Trans. Image Processing, 8, 1446–1451, 1999.

26. Van Assche, S., Denecker, K., and De Neve, P., Evaluation of lossless compression techniques for high-resolution RGB and CMYK color images, J. Electronic Imaging, 8, 415–421, 1999.

27. Van Assche, S., Denecker, K., Philips, W., and Lemahieu, I., A comparison of lossless compression techniques for prepress color images, in IS&T/SPIE Symp. on Electronic Imaging: Visual Communications and Image Processing, in Proc. SPIE, 3653, San Jose, CA, January 1999, 1376–1383.

28. ISO/IEC CD 10918-3, Info. Technology - Digital Compression and Coding of Continuous Tone Still Images - Part 3: Extensions, November 13, 1994.
