ENEE631 Digital Image Processing (Spring'04)

Transform Coding and JPEG

Spring ’04 Instructor: Min Wu

ECE Department, Univ. of Maryland, College Park

www.ajconline.umd.edu (select ENEE631 S’04) [email protected]

Based on ENEE631 Spring’04 Section 10

Lec13 – Transf. Coding & JPEG

UMCP ENEE631 Slides (created by M. Wu © 2004)


Transform Coding

Use transform to pack energy into only a few coeff.

How many bits to be allocated for each coeff.?
– More bits for coeff. with high variance $\sigma_k^2$ to keep total MSE small
– Also determined by perceptual importance

(Figure from Jain’s Fig. 11.15)

UMCP ENEE408G Slides (created by M. Wu & R. Liu © 2002)


Zonal Coding and Threshold Coding

Zonal coding
– Only transmit a small predetermined zone of transformed coeff.

Threshold coding
– Transmit coeff. that are above certain thresholds

Compare
– Threshold coding is inherently adaptive: it introduces smaller distortion for the same # of coded coeff.
– Threshold coding needs overhead to specify the indices of the coded coeff.; run-length coding helps reduce this overhead
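The comparison between zonal and threshold coding can be illustrated with a minimal Python sketch. The 4x4 coefficient block, the triangular zone shape (`u + v < zone`) and the threshold value are all hypothetical choices made for illustration; the slides do not prescribe them.

```python
def zonal_mask(m, zone):
    """Keep coefficient (u, v) when u + v < zone: a fixed low-frequency triangle."""
    return [[u + v < zone for v in range(m)] for u in range(m)]


def threshold_mask(coeffs, threshold):
    """Keep every coefficient whose magnitude exceeds the threshold."""
    return [[abs(c) > threshold for c in row] for row in coeffs]


def apply_mask(coeffs, mask):
    """Zero out the coefficients that are not kept."""
    return [[c if keep else 0.0 for c, keep in zip(crow, mrow)]
            for crow, mrow in zip(coeffs, mask)]


def count_kept(mask):
    return sum(keep for row in mask for keep in row)


def squared_error(a, b):
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical block with one strong coefficient outside the low-frequency zone.
coeffs = [
    [80.0, 12.0, 3.0, 0.5],
    [10.0, 2.0, 0.5, 0.0],
    [1.0, 0.5, 0.0, 20.0],
    [0.5, 0.0, 0.0, 0.0],
]

z_mask = zonal_mask(4, 2)                 # keeps (0,0), (0,1), (1,0)
t_mask = threshold_mask(coeffs, 11.0)     # keeps 80, 12 and 20

zonal_error = squared_error(coeffs, apply_mask(coeffs, z_mask))
threshold_error = squared_error(coeffs, apply_mask(coeffs, t_mask))
```

Both masks keep three coefficients here, but the threshold mask adapts to where the energy actually is, at the cost of having to signal which positions were kept.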



Determining Block Size

Why block based?
– High transform computation complexity for large blocks
   $O(m \log m \cdot m) = O(m^2 \log m)$ per block in transf., for $MN/m^2$ blocks
   complexity in bit allocation
– Block transform captures local info. better than global transform

Rate & complexity vs. block size (figure)
– Commonly used block size ~ 8x8

UMCP ENEE631 Slides (created by M. Wu © 2001)


Block-based Transform Coding

Encoder
– Step-1 Divide an image into m x m blocks and perform transform
– Step-2 Determine bit-allocation for coefficients
– Step-3 Design quantizer and quantize coefficients (lossy!)
– Step-4 Encode quantized coefficients

Decoder
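Steps 1 and 3 of the encoder can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transform is taken to be an orthonormal 2-D DCT-II written in its direct (slow) form, and the quantizer is a uniform one with a single hypothetical step size for all coefficients.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square m x m block (direct form, for clarity)."""
    m = len(block)

    def alpha(k):
        return math.sqrt(1.0 / m) if k == 0 else math.sqrt(2.0 / m)

    out = [[0.0] * m for _ in range(m)]
    for u in range(m):
        for v in range(m):
            s = 0.0
            for x in range(m):
                for y in range(m):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * m))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * m)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def quantize(coeffs, step):
    """Uniform quantization with one step size (the lossy step)."""
    return [[round(c / step) for c in row] for row in coeffs]


# A flat 8x8 block: all energy should end up in the DC coefficient.
flat_block = [[10.0] * 8 for _ in range(8)]
flat_coeffs = dct_2d(flat_block)
flat_quantized = quantize(flat_coeffs, 16.0)
```

For the flat block the DC coefficient equals 8 times the sample value and every AC coefficient is numerically zero, which is the energy-compaction behaviour the slides rely on.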




How to Encode Quantized Coeff. in Each Block

Basic tools
– Entropy coding (Huffman, etc.) and run-length coding
– Predictive coding ~ esp. for DC

Ordering
– zig-zag scan for block-DCT to better achieve run-length coding gain: low-frequency coefficients first, then high-frequency coefficients

(Figure: zig-zag scan of an 8x8 block from DC through AC01 ... AC07, AC70 ... AC77; axes: horizontal frequency, vertical frequency)
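The zig-zag ordering can be generated by walking the anti-diagonals of the block and alternating direction. The sketch below is one way to do it in Python; the function names are illustrative, and the starting direction (first step to the right of DC) follows the usual JPEG convention.

```python
def zigzag_order(m=8):
    """Return the (row, col) pairs of an m x m block in zig-zag order."""
    order = []
    for s in range(2 * m - 1):                       # s = row + col: one anti-diagonal
        rows = range(max(0, s - m + 1), min(s, m - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)                    # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def zigzag_scan(block):
    """Flatten a square block into a list following the zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


order = zigzag_order(8)
```

After quantization most high-frequency coefficients are zero, so this ordering tends to place them in one long run at the end of the list.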



Summary: List of Compression Tools

Lossless encoding tools
– Entropy coding: Huffman, Lempel-Ziv, and others (Arithmetic coding)
– Run-length coding

Lossy tools for reducing redundancy
– Quantization: scalar quantizer vs. vector quantizer
– Truncations: discard unimportant parts of data

Facilitating compression via Prediction
– Encode prediction parameters and residues with fewer bits

Facilitating compression via Transforms
– Transform into a domain with improved energy compaction



Put Basic Tools Together: JPEG Image Compression Standard


JPEG Compression Standard (early 1990s)

JPEG - Joint Photographic Experts Group
– Compression standard of generic continuous-tone still image
– Became an international standard in 1992

Allow for lossy and lossless encoding of still images
– Part-1 DCT-based lossy compression
   average compression ratio 15:1
– Part-2 Predictive-based lossless compression

Sequential, Progressive, Hierarchical modes
– Sequential ~ encoded in a single left-to-right, top-to-bottom scan
– Progressive ~ encoded in multiple scans to first produce a quick, rough decoded image when the transmission time is long
– Hierarchical ~ encoded at multiple resolutions to allow accessing low resolution without full decompression



Baseline JPEG Algorithm

“Baseline”
– Simple, lossy compression
   Subset of other DCT-based modes of JPEG standard

A few basics
– 8x8 block-DCT based coding
– Shift to zero-mean by subtracting 128, giving range [-128, 127]
   Allows using signed integer to represent both DC and AC coeff.
– Color (YCbCr / YUV) and downsample
   Color components can have lower spatial resolution than luminance
– Interleaving color components

$$
\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} =
\begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.147 & -0.289 & 0.436 \\ 0.615 & -0.515 & -0.100 \end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
$$

(Based on Wang’s video book Chapt.1)

=> Flash demo on Baseline JPEG algorithm by Dr. Ken Lam (HK PolyTech Univ.)
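The color conversion, level shift and chroma downsampling listed under "A few basics" can be sketched as follows. The matrix rows are the coefficients read off the slide, with the signs of the chroma rows assumed from the standard YUV definition; the 2x2 averaging filter and all function names are illustrative choices, not something the slides specify.

```python
# Rows give Y, Cb, Cr as weighted sums of (R, G, B); chroma signs are assumed.
RGB_TO_YCBCR = (
    (0.299, 0.587, 0.114),
    (-0.147, -0.289, 0.436),
    (0.615, -0.515, -0.100),
)


def rgb_to_ycbcr(r, g, b):
    """Convert one pixel; chroma comes out centered on zero."""
    return tuple(wr * r + wg * g + wb * b for wr, wg, wb in RGB_TO_YCBCR)


def level_shift(sample):
    """Map an 8-bit sample in [0, 255] to the signed range [-128, 127]."""
    return sample - 128


def downsample_2x2(plane):
    """Average each 2x2 neighbourhood (assumes even width and height)."""
    return [
        [(plane[i][j] + plane[i][j + 1] + plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
         for j in range(0, len(plane[0]), 2)]
        for i in range(0, len(plane), 2)
    ]


gray_y, gray_cb, gray_cr = rgb_to_ycbcr(128, 128, 128)
small_plane = downsample_2x2([[1, 3], [5, 7]])
```

A gray pixel maps to zero chroma because each chroma row of the matrix sums to zero, which is why the chroma planes of natural images carry little energy and tolerate downsampling.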


Block Diagram of JPEG Baseline

(Figure from Wallace’s JPEG tutorial (1993))


(Figure panels: 475 x 330 x 3 = 157 KB; luminance. From Liu’s EE330 (Princeton))


Y U V (Y Cb Cr) Components

Assign more bits to Y, fewer bits to Cb and Cr

(Figure from Liu’s EE330 (Princeton))


Lossless Coding Part in JPEG

Differentially encode DC
– (lossy part: DC coefficients are quantized first; the differences between the quantized DC values of neighboring blocks are then coded.)

AC coefficients in one block
– Zig-zag scan after quantization for better run-length
   save bits in coding consecutive zeros
– Represent each AC run-length using entropy coding
   use shorter codes for more likely AC run-length symbols



Lossy Part in JPEG

Quantization (adaptive bit allocation)
– Different quantization step size for different coeff. bands
– Use same quantization matrix for all blocks in one image
– Choose quantization matrix to best suit the image
– Different quantization matrices for luminance and color components

Default quantization table
– “Generic” over a variety of images

Quality factor “Q”
– Scale the quantization table
– Medium quality Q = 50% ~ no scaling
– High quality Q = 100% ~ unit quantization step size
– Poor quality ~ small Q, larger quantization step
   visible artifacts like ringing and blockiness
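One common way to realize the quality factor is sketched below. It assumes the integer scaling convention popularized by the IJG libjpeg software (the slides do not state the exact rule), and the 2x2 base table is a hypothetical stand-in for a full 8x8 quantization table.

```python
def scale_quant_table(base_table, quality):
    """Scale a base quantization table for a quality factor in [1, 100].

    Assumed convention: the scale is 5000/Q percent below Q = 50 and
    (200 - 2Q) percent otherwise; entries are clamped to [1, 255].
    """
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [[min(255, max(1, (q * scale + 50) // 100)) for q in row]
            for row in base_table]


# Hypothetical 2x2 excerpt of a base table, for illustration only.
base = [[16, 11], [12, 12]]

medium = scale_quant_table(base, 50)    # no scaling
high = scale_quant_table(base, 100)     # unit step size everywhere
poor = scale_quant_table(base, 10)      # 5x larger steps
```

Under this convention Q = 50 reproduces the base table and Q = 100 gives unit steps, matching the two reference points on the slide.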



Uncompressed (100KB)

JPEG 75% (18KB)

JPEG 50% (12KB)

JPEG 30% (9KB)

JPEG 10% (5KB)



JPEG Compression (Q=75% & 30%)

(Figure panels: Q=75%, 45 KB; Q=30%, 22 KB. From Liu’s EE330 (Princeton))


Y Cb Cr After JPEG (Q=30%)

(Figure panels include JPEG Cb and JPEG Cr. From Liu’s EE330 (Princeton))


Lossless Coding Part in JPEG: Details

Differentially encode DC
– ( SIZE, AMPLITUDE ), with amplitude range in [-2048, 2047]

AC coefficients in one block
– Zig-zag scan for better run-length
– Represent each AC with a pair of symbols
   Symbol-1: ( RUNLENGTH, SIZE )  Huffman coded
   Symbol-2: AMPLITUDE  Variable length coded

RUNLENGTH in [0, 15]: # of consecutive zero-valued AC coefficients preceding the nonzero AC coefficient

SIZE in [0, 10] (in unit of bits): # of bits used to encode AMPLITUDE

AMPLITUDE in range of [-1024, 1023]
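The symbol pairs described above can be produced with a short routine. This is a sketch rather than a bitstream-exact encoder: it stops before Huffman coding, and it assumes the two special symbols of baseline JPEG that the slide does not list, namely (15, 0) for a run of 16 zeros and (0, 0) for end-of-block. Function names are illustrative.

```python
def size_category(value):
    """SIZE: number of bits needed for |value| (0 when value is 0)."""
    return abs(value).bit_length()


def dc_symbol(dc, prev_dc):
    """(SIZE, AMPLITUDE) for the difference to the previous block's quantized DC."""
    diff = dc - prev_dc
    return (size_category(diff), diff)


def ac_symbols(zigzag_ac):
    """Map quantized AC coefficients (zig-zag order) to ((RUNLENGTH, SIZE), AMPLITUDE)."""
    symbols = []
    run = 0
    for c in zigzag_ac:
        if c == 0:
            run += 1
            continue
        while run > 15:                       # assumed ZRL symbol: 16 zeros
            symbols.append(((15, 0), None))
            run -= 16
        symbols.append(((run, size_category(c)), c))
        run = 0
    if run > 0:                               # assumed EOB symbol: only zeros remain
        symbols.append(((0, 0), None))
    return symbols


# 63 AC coefficients: 5, two zeros, -3, seventeen zeros, 1, then zeros to the end.
example_ac = [5, 0, 0, -3] + [0] * 17 + [1] + [0] * 41
example_symbols = ac_symbols(example_ac)
```

The long tail of zeros collapses into a single end-of-block symbol, which is where most of the run-length gain comes from.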


Table is from slides at Gonzalez/ Woods DIP book website (Chapter 8)


Bit Allocation in Image Coding


Revisiting Quantizer

Quantizer achieves compression in a lossy way
– Lloyd-Max quantizer minimizes MSE distortion with a given rate

Need at least how many # bits for certain amount of error?
– Rate-Distortion theory

Rate distortion func. of a r.v.
– Minimum average rate $R_D$ bits/sample required to represent this r.v. while allowing a fixed distortion $D$
– For Gaussian r.v. and MSE

$$
R_D = \begin{cases} \frac{1}{2}\log_2(\sigma^2 / D), & D \le \sigma^2 \\ 0, & D > \sigma^2 \ \text{(just use the mean)} \end{cases}
$$

   Convex shape (slope is becoming milder)

(Figure: $R_D$ vs. $D$ curve, reaching zero at $D = \sigma^2$)

(See info. theory course for details on R-D theory)
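The Gaussian rate-distortion function is simple enough to evaluate directly. The sketch below, with illustrative function names, computes the rate for a given distortion and the inverse relation $D = \sigma^2 2^{-2R}$ that follows from it.

```python
import math


def gaussian_rate(variance, distortion):
    """R_D in bits/sample for a Gaussian r.v. under MSE distortion."""
    if distortion >= variance:
        return 0.0                      # just use the mean: no bits needed
    return 0.5 * math.log2(variance / distortion)


def gaussian_distortion(variance, rate):
    """Inverse relation: distortion reached with `rate` bits/sample."""
    return variance * 2.0 ** (-2.0 * rate)


one_bit = gaussian_rate(4.0, 1.0)
two_bits = gaussian_rate(4.0, 0.25)
```

Each additional bit divides the achievable MSE by four, about 6 dB per bit.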

UMCP ENEE631 Slides (created by M. Wu © 2001/2004)


Bit Allocation Among Indep. r.v. of Different Var.

Problem formulation
– To encode a set of independent Gaussian r.v. $\{X_1, \ldots, X_n\}$, $X_k \sim N(0, \sigma_k^2)$
– Allocate $R_k$ bits to represent each r.v. $X_k$, incurring distortion $D_k$
– Total bit cost is $R = R_1 + \ldots + R_n$
– Total MSE distortion $D = D_1 + \ldots + D_n$

What is the best bit allocation $\{R_1, \ldots, R_n\}$ such that R is minimized and total distortion $D \le D_{(req)}$?

What is the best bit allocation $\{R_1, \ldots, R_n\}$ such that D is minimized and total rate $R \le R_{(req)}$?

– Recall $R_k = \max( \frac{1}{2} \log(\sigma_k^2 / D_k), 0 )$
– Solving the constrained optimization problem using Lagrange multiplier $\lambda \ge 0$



Rate/Distortion Allocation via Reverse Water-filling

How many bits to be allocated for each coeff.?
– Determined by the variance of the coefficients
– More bits for high variance $\sigma_k^2$ to keep total MSE small

Optimal solution for Gaussian: “Reverse Water-filling”
– Idea: Try to keep the same amount of error in each freq. band; no need to spend bit resource to represent coeff. with variance smaller than the water level
– Results based on R-D function & via Lagrange-multiplier optimization
   given D, determine the water level and then $R_D$; or vice versa for given $R_D$

=> “Equal slope” idea can be extended to other convex (operational) R-D functions
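Reverse water-filling can be sketched numerically: search for the water level at which the per-component distortions, each capped at its variance, add up to the target. The bisection search and the function name are illustrative choices; the sketch assumes strictly positive variances and a positive target distortion.

```python
import math


def reverse_water_filling(variances, total_distortion, iters=100):
    """Return (water_level, distortions, rates in bits) for independent Gaussians."""
    if total_distortion >= sum(variances):
        # Target is loose enough that no component needs any bits.
        return max(variances), list(variances), [0.0] * len(variances)
    lo, hi = 0.0, max(variances)
    for _ in range(iters):                       # bisection on the water level
        level = 0.5 * (lo + hi)
        if sum(min(level, v) for v in variances) > total_distortion:
            hi = level
        else:
            lo = level
    level = 0.5 * (lo + hi)
    distortions = [min(level, v) for v in variances]
    rates = [0.5 * math.log2(v / d) for v, d in zip(variances, distortions)]
    return level, distortions, rates


level_a, dist_a, rates_a = reverse_water_filling([16.0, 4.0, 1.0], 3.0)
level_b, dist_b, rates_b = reverse_water_filling([16.0, 4.0, 1.0], 6.0)
```

In the second call the water level (2.5) lies above the smallest variance, so that component gets zero bits and contributes its full variance as distortion.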



Key Result in Rate Allocation: Equal R-D Slope

Keep the slope in R-D curve the same (not necessarily Gaussian r.v.)
– Otherwise the bits can be applied to other r.v. for better “return” in reducing the overall distortion

If all r.v. are Gaussian, the slopes of their R-D curves at the same distortion are identical for any two r.v.
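The Gaussian case can be checked by differentiating the R-D function. A short worked step, using $\sigma_k^2$ and $D_k$ for the variance and distortion of the k-th r.v.:

```latex
R_k(D_k) = \tfrac{1}{2}\log_2\frac{\sigma_k^2}{D_k}
\quad\Longrightarrow\quad
\frac{dR_k}{dD_k} = -\frac{1}{2\ln 2}\cdot\frac{1}{D_k},
\qquad 0 < D_k < \sigma_k^2 .
```

The slope depends only on $D_k$ and not on $\sigma_k^2$, so for Gaussian r.v. with nonzero rate, equal slope is the same as equal distortion, which is the water level.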



More on Rate-Distortion Based Bit Allocation in Image Coding


Bit Allocation

How many bits to be allocated for each coeff.?
– Determined by the variance of coeff.
– More bits for high variance $\sigma_k^2$ to keep total MSE small

$$
\min D = \frac{1}{N}\sum_{k=0}^{N-1} E\left[\,|v(k) - v_q(k)|^2\right] = \frac{1}{N}\sum_{k=0}^{N-1} \sigma_k^2 f(n_k)
\quad \text{subject to} \quad \frac{1}{N}\sum_{k=0}^{N-1} n_k = B \ \text{bits/coeff.}
$$

where $\sigma_k^2$ -- variance of k-th coeff. v(k); $n_k$ -- # bits allocated for v(k); $f(\cdot)$ -- quantization distortion func.

“Inverse Water-filling” (from info. theory)
– Try to keep same amount of error in each freq. band

See Jain’s pp501 for detailed bit-allocation algorithm
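With integer bit counts, one simple way to approach this minimization is to hand out bits one at a time to the coefficient whose distortion drops the most. The sketch below is a generic greedy procedure and not necessarily the algorithm on the cited page of Jain; it assumes the distortion function $f(n) = 2^{-2n}$, which is the Gaussian R-D relation and is a hypothetical choice here.

```python
def greedy_bit_allocation(variances, total_bits, f=lambda n: 2.0 ** (-2 * n)):
    """Assign `total_bits` integer bits, each to the largest marginal distortion drop."""
    bits = [0] * len(variances)
    for _ in range(total_bits):
        gains = [v * (f(n) - f(n + 1)) for v, n in zip(variances, bits)]
        k = gains.index(max(gains))     # ties go to the lowest index
        bits[k] += 1
    return bits


def total_distortion(variances, bits, f=lambda n: 2.0 ** (-2 * n)):
    """Average distortion (1/N) * sum of sigma_k^2 * f(n_k)."""
    return sum(v * f(n) for v, n in zip(variances, bits)) / len(variances)


bits_3 = greedy_bit_allocation([16.0, 4.0, 1.0], 3)
bits_6 = greedy_bit_allocation([16.0, 4.0, 1.0], 6)
```

With three bits the residual errors are 1, 1 and 1: the same error in every band, as the water-filling idea suggests.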


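Details on Reverse Water-filling Solution (cont’d)

$$
\min_{\{D_i\}} \sum_{i=1}^{n} \frac{1}{2}\ln\frac{\sigma_i^2}{D_i}, \quad \text{subj. to} \ \sum_{i=1}^{n} D_i = D
$$

Construct func. using Lagrange multiplier

$$
J(D) = \sum_{i=1}^{n} \frac{1}{2}\ln\frac{\sigma_i^2}{D_i} + \lambda \sum_{i=1}^{n} D_i
$$

Necessary condition: keep the same marginal gain

$$
\frac{\partial J}{\partial D_i} = -\frac{1}{2}\,\frac{1}{D_i} + \lambda = 0 \ \text{for all } i, \ \text{and} \ \sum_{i=1}^{n} D_i = D
$$

Taking into account that the rate of each r.v. cannot be negative:

$$
\min \sum_{i=1}^{n} \max\left( \frac{1}{2}\ln\frac{\sigma_i^2}{D_i},\, 0 \right), \quad \text{subj. to} \ \sum_{i=1}^{n} D_i = D
$$

Necessary condition for choosing $\lambda$

$$
\frac{\partial J}{\partial D_i} \begin{cases} = 0, & \text{if } D_i < \sigma_i^2 \\ \le 0, & \text{if } D_i = \sigma_i^2 \end{cases}
\qquad
D_i = \begin{cases} \frac{1}{2\lambda}, & \text{if } \frac{1}{2\lambda} < \sigma_i^2 \\ \sigma_i^2, & \text{if } \frac{1}{2\lambda} \ge \sigma_i^2 \end{cases}
\qquad \text{s.t.} \ \sum_{i=1}^{n} D_i = D
$$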


Lagrangian Opt. for Indep. Budget Constraint

Previous: fix distortion, minimize total rate

Alternative: fix total rate (bit budget), minimize distortion

(Discrete) Lagrangian optimization on general source
– “Constant slope optimization” (e.g. in Box 6 Fig. 17 of Ortega’s tutorial)
– Need to determine the quantizer q(i) for each coding unit i

$$
\min \sum_{i=1}^{n} \left( d_{i,q(i)} + \lambda\, r_{i,q(i)} \right)
$$

– Lagrangian cost for each coding unit: $d_{i,q(i)} + \lambda\, r_{i,q(i)}$
   use a line with slope $-\lambda$ to intersect with each operating point (Fig. 14)
– For a given operating quality $\lambda$, the minimum can be computed independently for each coding unit

$$
\min \sum_{i=1}^{n} \left( d_{i,q(i)} + \lambda\, r_{i,q(i)} \right) = \sum_{i=1}^{n} \min \left( d_{i,q(i)} + \lambda\, r_{i,q(i)} \right)
$$

=> Find operating quality $\lambda$ satisfying the rate constraint

(Figures: Box 6 Fig. 17 and Fig. 14 of Ortega’s tutorial)
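The discrete Lagrangian search can be sketched as two nested steps: for a fixed multiplier each coding unit independently picks the operating point with the smallest cost d + λ·r, and an outer bisection adjusts λ until the total rate meets the budget. The operating points below are hypothetical, and the sketch assumes the lowest-rate points of all units fit within the budget.

```python
def pick_operating_points(units, lam):
    """Per coding unit, choose the (rate, distortion) point minimizing d + lam * r."""
    return [min(points, key=lambda p: p[1] + lam * p[0]) for points in units]


def allocate(units, rate_budget, lam_hi=1e6, iters=60):
    """Bisection on lam: smallest multiplier whose independent choices meet the budget."""
    lam_lo = 0.0
    for _ in range(iters):
        lam = 0.5 * (lam_lo + lam_hi)
        total_rate = sum(r for r, _ in pick_operating_points(units, lam))
        if total_rate > rate_budget:
            lam_lo = lam          # over budget: penalize rate more
        else:
            lam_hi = lam          # within budget: try a smaller penalty
    return lam_hi, pick_operating_points(units, lam_hi)


# Hypothetical operating points (rate in bits, distortion) for two coding units.
units = [
    [(0, 16.0), (1, 4.0), (2, 1.0), (3, 0.25)],
    [(0, 4.0), (1, 1.0), (2, 0.25)],
]

lam_3, choice_3 = allocate(units, rate_budget=3)
lam_5, choice_5 = allocate(units, rate_budget=5)
```

Only points on the convex hull of each unit's R-D characteristic can be selected this way, so some budgets (2 bits in this example) are not met exactly and the search settles for a lower total rate.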


Bridging the Theory and Ad-hoc Practice

Operational R-D curves
– Directly achievable with practical implementation

Tradeoff! Tradeoff! Tradeoff!
– Rate vs. Distortion
   how close are the operational points to info. theory bounds?
– Also tradeoff among practical considerations
   Cost of memory, computation, delay, etc.

Often narrowing down to an optimization problem:
~ optimize an objective func. subj. to a set of constraints

Model mismatch problem
– How good are our assumptions on source modeling?
– How well does a coding algorithm tolerate model mismatch?


Basic Steps in R-D Optimization

Determine source’s statistics
– Simplified models: Gaussian, Laplacian, Generalized Gaussian
– Obtain from training samples

Obtain operational R-D points/functions/curves
– E.g., compute distortion for each candidate quantizer

Determine objective func. and constraints

Search for optimal operational R-D points
– Lagrange multiplier approach
   select only operating points on the convex hull of overall R-D characteristics (Fig. 19 of Ortega’s tutorial)
   may handle different coding units independently
– Dynamic programming approach
   not constrained by convex hull but has larger search space (higher complexity)
