ENEE631 Digital Image Processing (Spring'04)
Transform Coding and JPEG
Spring ’04 Instructor: Min Wu
ECE Department, Univ. of Maryland, College Park
www.ajconline.umd.edu (select ENEE631 S’04) [email protected]
Based on ENEE631 Spring’04, Section 10
ENEE631 Digital Image Processing (Spring'04) Lec13 – Transf. Coding & JPEG [4]
Transform Coding
UMCP ENEE631 Slides (created by M.Wu © 2004)

Transform Coding
Use a transform to pack energy into only a few coeff.
How many bits should be allocated to each coeff.?
– More bits for coeff. with high variance $\sigma_k^2$, to keep total MSE small
– Also determined by perceptual importance
From Jain’s Fig.11.15
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)

Zonal Coding and Threshold Coding
Zonal coding
– Only transmit a small, predetermined zone of transform coeff.
Threshold coding
– Transmit coeff. whose magnitudes are above certain thresholds
Compare
– Threshold coding is inherently adaptive: it introduces smaller distortion for the same # of coded coeff.
– Threshold coding needs overhead to specify the indices of the coded coeff.; run-length coding helps to reduce this overhead
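The comparison above can be illustrated with a small sketch. The 4x4 coefficient block, the triangular zone shape, and the threshold value are all hypothetical choices made for illustration, and the squared error is measured in the coefficient domain (which equals the pixel-domain error only if the transform is orthonormal).

```python
def zonal_mask(n, zone):
    """Keep the fixed low-frequency triangle u + v < zone (chosen in advance)."""
    return [[(u + v) < zone for v in range(n)] for u in range(n)]


def threshold_mask(coeffs, thresh):
    """Keep whichever coefficients exceed the threshold in magnitude."""
    return [[abs(c) > thresh for c in row] for row in coeffs]


def apply_mask(coeffs, mask):
    """Zero out every coefficient that is not selected by the mask."""
    return [[c if keep else 0 for c, keep in zip(crow, mrow)]
            for crow, mrow in zip(coeffs, mask)]


def squared_error(a, b):
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def count_kept(mask):
    return sum(keep for row in mask for keep in row)


# Hypothetical 4x4 block of transform coefficients with one strong
# coefficient (30) lying outside the low-frequency zone.
block = [[80, 12, 3, 1],
         [10, 4, 1, 0],
         [2, 1, 30, 0],
         [1, 0, 0, 0]]

zmask = zonal_mask(4, zone=2)             # fixed zone: 3 coefficients
tmask = threshold_mask(block, thresh=11)  # adaptive: also 3 coefficients

zonal_err = squared_error(block, apply_mask(block, zmask))
thresh_err = squared_error(block, apply_mask(block, tmask))
print(count_kept(zmask), zonal_err)
print(count_kept(tmask), thresh_err)
```

With the same number of coded coefficients, the threshold mask follows the large off-zone coefficient and so leaves less error, but a decoder would also need the positions in `tmask`, which is the overhead mentioned above.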

Determining Block Size
Why block based?
– High transform computation complexity for large blocks: O( m log m × m ) = O( m^2 log m ) per block in transf., for (MN/m^2) blocks, plus complexity in bit allocation
– Block transform captures local info. better than global transform
Rate & complexity vs. block size
– Commonly used block size ~ 8x8
From Jain’s Fig.11.16
UMCP ENEE631 Slides (created by M.Wu © 2001)

Block-based Transform Coding
Encoder
– Step-1 Divide an image into m x m blocks and perform transform
– Step-2 Determine bit-allocation for coefficients
– Step-3 Design quantizer and quantize coefficients (lossy!)
– Step-4 Encode quantized coefficients
Decoder
From Jain’s Fig.11.17
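Steps 1 and 3 of the encoder can be sketched as follows. This is a minimal illustration, assuming an orthonormal DCT-II as the block transform and a single uniform step size for all coefficients (a real coder would use the bit allocation of Step 2 to pick a step per coefficient); the function names are hypothetical.

```python
import math


def split_blocks(image, m):
    """Step 1a: cut an image (list of rows) into m x m blocks.
    Assumes both image dimensions are multiples of m."""
    h, w = len(image), len(image[0])
    return [[row[x:x + m] for row in image[y:y + m]]
            for y in range(0, h, m) for x in range(0, w, m)]


def dct_1d(x):
    """Orthonormal 1-D DCT-II of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Step 1b: separable 2-D DCT (transform rows, then columns)."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def quantize(coeffs, step):
    """Step 3: uniform scalar quantization (the lossy step)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


flat = [[100] * 8 for _ in range(8)]   # a constant 8x8 block
coeffs = dct_2d(flat)                  # all energy lands in the DC term
levels = quantize(coeffs, 16)
```

For a constant block the transform packs everything into the DC coefficient, so only one quantized value is nonzero and Step 4 has very little to encode.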

How to Encode Quantized Coeff. in Each Block
Basic tools
– Entropy coding (Huffman, etc.) and run-length coding
– Predictive coding ~ esp. for DC
Ordering
– Zig-zag scan for block-DCT to better achieve run-length coding gain: low-frequency coefficients first, then high-frequency coefficients
[Figure: 8x8 block-DCT coefficient grid with horizontal-frequency and vertical-frequency axes; labeled coefficients DC, AC01, AC07, AC70, AC77]
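The zig-zag ordering can be generated by walking the anti-diagonals of the block and alternating direction. The sketch below uses (row, column) indices with the DC term at (0, 0); the helper names are illustrative only.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):                 # s = row + col indexes a diagonal
        diag = [(u, s - u) for u in range(n) if 0 <= s - u < n]
        # Even diagonals are walked from bottom-left to top-right,
        # odd diagonals from top-right to bottom-left.
        order.extend(diag[::-1] if s % 2 == 0 else diag)
    return order


def zigzag_scan(block):
    """Flatten a square block into a 1-D list, low frequencies first."""
    return [block[u][v] for u, v in zigzag_order(len(block))]


# A typical quantized block: a few low-frequency values, zeros elsewhere.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0] = 50, -3, 2
scan = zigzag_scan(block)
```

After the scan, the nonzero values sit at the front of `scan` and the zeros form one long trailing run, which is what makes run-length coding effective.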

Summary: List of Compression Tools
Lossless encoding tools
– Entropy coding: Huffman, Lempel-Ziv, and others (Arithmetic coding)
– Run-length coding
Lossy tools for reducing redundancy
– Quantization: scalar quantizer vs. vector quantizer
– Truncations: discard unimportant parts of data
Facilitating compression via Prediction
– Encode prediction parameters and residues with fewer bits
Facilitating compression via Transforms
– Transform into a domain with improved energy compaction

Put Basic Tools Together:
JPEG Image Compression Standard

JPEG Compression Standard (early 1990s)
JPEG - Joint Photographic Experts Group
– Compression standard for generic continuous-tone still images
– Became an international standard in 1992
Allows for lossy and lossless encoding of still images
– Part-1 DCT-based lossy compression (average compression ratio 15:1)
– Part-2 Predictive-based lossless compression
Sequential, Progressive, Hierarchical modes
– Sequential ~ encoded in a single left-to-right, top-to-bottom scan
– Progressive ~ encoded in multiple scans, to first produce a quick, rough decoded image when the transmission time is long
– Hierarchical ~ encoded at multiple resolutions, to allow accessing low resolution without full decompression

Baseline JPEG Algorithm
“Baseline”
– Simple, lossy compression
– Subset of other DCT-based modes of JPEG standard
A few basics
– 8x8 block-DCT based coding
– Shift to zero-mean by subtracting 128, giving the range [-128, 127]; this allows using signed integers to represent both DC and AC coeff.
– Color (YCbCr / YUV) and downsampling: color components can have lower spatial resolution than luminance
– Interleaving color components
=> Flash demo on Baseline JPEG algorithm by Dr. Ken Lam (HK PolyTech Univ.)
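$$\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.147 & -0.289 & 0.436 \\ 0.615 & -0.515 & -0.100 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$
(Based on Wang’s video book Chapt.1)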
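A minimal sketch of the color step, assuming the conversion matrix quoted on this slide with its standard signs (so that each chroma row sums to zero) and 2x2 averaging as the chroma downsampling filter. The averaging filter and the function names are illustrative assumptions; actual JPEG files commonly use a slightly different, offset-by-128 YCbCr definition.

```python
# Rows give Y, Cb, Cr as weighted sums of (R, G, B).
RGB_TO_YCC = [( 0.299,  0.587,  0.114),
              (-0.147, -0.289,  0.436),
              ( 0.615, -0.515, -0.100)]


def rgb_to_ycc(r, g, b):
    """Convert one RGB pixel to (Y, Cb, Cr)."""
    return tuple(wr * r + wg * g + wb * b for wr, wg, wb in RGB_TO_YCC)


def downsample_2x2(plane):
    """Halve the resolution of a chroma plane by averaging 2x2 neighborhoods.
    Assumes even width and height."""
    h, w = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1] +
              plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


y, cb, cr = rgb_to_ycc(128, 128, 128)   # a gray pixel carries no chroma
small = downsample_2x2([[1, 3], [5, 7]])
```

Keeping Y at full resolution and downsampling only Cb and Cr by two in each direction halves the total number of samples (1 + 1/4 + 1/4 of the pixel count instead of 3).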

Block Diagram of JPEG Baseline
From Wallace’s JPEG tutorial (1993)

[Figure: color image, 475 x 330 pixels (about 157 K pixels) x 3 color components, and its luminance component]
From Liu’s EE330 (Princeton)

Y U V (Y Cb Cr) Components
Assign more bits to Y, fewer bits to Cb and Cr
From Liu’s EE330 (Princeton)

Lossless Coding Part in JPEG
Differentially encode DC
– (lossy part: DC coefficients are quantized before the differences are taken.)
AC coefficients in one block
– Zig-zag scan after quantization for better run-length: saves bits in coding consecutive zeros
– Represent each AC run-length using entropy coding: use shorter codes for more likely AC run-length symbols

Lossy Part in JPEG
Quantization (adaptive bit allocation)
– Different quantization step size for different coeff. bands
– Use same quantization matrix for all blocks in one image
– Choose quantization matrix to best suit the image
– Different quantization matrices for luminance and color components
Default quantization table
– “Generic” over a variety of images
Quality factor “Q”
– Scale the quantization table
– Medium quality Q = 50% ~ no scaling
– High quality Q = 100% ~ unit quantization step size
– Poor quality ~ small Q, larger quantization step; visible artifacts like ringing and blockiness
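One common way to realize the quality factor is the scaling rule used by the Independent JPEG Group's libjpeg, sketched below. It is an assumption that the slide refers to this convention (the JPEG standard itself does not define Q), but the rule does reproduce the two anchor points above: Q = 50 leaves the table unchanged and Q = 100 gives unit step sizes. The base table is the example luminance table from Annex K of the JPEG standard.

```python
# Example luminance quantization table (JPEG standard, Annex K).
BASE_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def scaled_table(quality, base=BASE_LUMA):
    """Scale the base table for a quality setting in 1..100 (IJG convention)."""
    quality = max(1, min(100, quality))
    # Percentage scale factor: 100 means "use the base table as is".
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Round to nearest and clamp each step size to the 8-bit range 1..255.
    return [[max(1, min(255, (q * scale + 50) // 100)) for q in row]
            for row in base]


def quantize_block(coeffs, table):
    """Divide each DCT coefficient by its own step size and round."""
    return [[int(round(c / q)) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]


q50 = scaled_table(50)
q100 = scaled_table(100)
q10 = scaled_table(10)
```

At low Q the high-frequency step sizes saturate at 255, so most high-frequency coefficients quantize to zero, which is where the ringing and blockiness come from.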
Uncompressed (100KB)
JPEG 75% (18KB)
JPEG 50% (12KB)
JPEG 30% (9KB)
JPEG 10% (5KB)

JPEG Compression (Q=75% & 30%)
45 KB (Q=75%), 22 KB (Q=30%)
From Liu’s EE330 (Princeton)

Y Cb Cr After JPEG (Q=30%)
[Figure panels: JPEG Cb, JPEG Cr]
From Liu’s EE330 (Princeton)

Lossless Coding Part in JPEG: Details
Differentially encode DC
– ( SIZE, AMPLITUDE ), with amplitude range in [-2048, 2047]
AC coefficients in one block
– Zig-zag scan for better run-length
– Represent each AC with a pair of symbols
Symbol-1: ( RUNLENGTH, SIZE ), Huffman coded
Symbol-2: AMPLITUDE, variable-length coded
RUNLENGTH [0,15]: # of consecutive zero-valued AC coefficients preceding the nonzero AC coefficient
SIZE [0 to 10, in units of bits]: # of bits used to encode AMPLITUDE
AMPLITUDE in range of [-1023, 1023]
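The symbol pairs can be produced as in the sketch below, which stops before the actual Huffman coding. Two special symbols come from the JPEG baseline specification rather than from this slide: (15, 0) stands for a run of sixteen zeros, and (0, 0) marks end-of-block when only zeros remain. The function names are hypothetical.

```python
def size_category(value):
    """SIZE: number of bits needed to represent |value| (0 for value 0)."""
    return abs(value).bit_length()


def dc_symbol(dc, prev_dc):
    """DC is coded as the difference from the previous block's DC."""
    diff = dc - prev_dc
    return (size_category(diff), diff)


def ac_symbols(ac):
    """Turn the 63 zig-zag-ordered AC coefficients of one block into
    ((RUNLENGTH, SIZE), AMPLITUDE) pairs."""
    symbols = []
    run = 0
    # Index of the last nonzero coefficient (-1 if the block is all zero).
    last = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    for v in ac[:last + 1]:
        if v == 0:
            run += 1
            if run == 16:                  # RUNLENGTH only goes up to 15
                symbols.append(((15, 0), None))
                run = 0
        else:
            symbols.append(((run, size_category(v)), v))
            run = 0
    if last < len(ac) - 1:                 # trailing zeros: end-of-block
        symbols.append(((0, 0), None))
    return symbols


example = ac_symbols([5, 0, 0, -3] + [0] * 59)
```

Here `example` holds three entries: the value 5 with no preceding zeros, the value -3 after a run of two zeros, and the end-of-block marker that replaces the 59 trailing zeros.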
Table is from slides at Gonzalez/ Woods DIP book website (Chapter 8)

Bit Allocation in Image Coding

Revisiting Quantizer
Quantizer achieves compression in a lossy way
– Lloyd-Max quantizer minimizes MSE distortion for a given rate
At least how many bits are needed for a certain amount of error?
– Rate-Distortion theory
Rate distortion func. of a r.v.
– Minimum average rate $R_D$ bits/sample required to represent this r.v. while allowing a fixed distortion D
– For Gaussian r.v. and MSE:
$$R_D = \begin{cases} \frac{1}{2}\log_2\left(\sigma^2 / D\right), & 0 \le D \le \sigma^2 \\ 0, & D > \sigma^2 \ \text{(just use the mean)} \end{cases}$$
Convex shape (slope is becoming milder)
(See info. theory course for details on R-D theory)
[Figure: $R_D$ versus D, with the curve reaching zero at $D = \sigma^2$]
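As a quick numerical check of the Gaussian rate-distortion function above, the sketch below evaluates it in both directions. The function names are illustrative.

```python
import math


def gaussian_rate(variance, distortion):
    """Minimum bits/sample for a Gaussian source at a given MSE distortion."""
    if distortion >= variance:
        return 0.0                      # just use the mean: no bits needed
    return 0.5 * math.log2(variance / distortion)


def gaussian_distortion(variance, rate):
    """Inverse relation: smallest MSE achievable at a given rate."""
    return variance * 2.0 ** (-2.0 * rate)


r = gaussian_rate(1.0, 0.25)        # distortion at one quarter of the variance
d = gaussian_distortion(1.0, 2.0)   # distortion at 2 bits/sample
```

Each additional bit divides the achievable distortion by four, which is why the curve flattens as D grows toward the variance.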
UMCP ENEE631 Slides (created by M.Wu © 2001/2004)

Bit Allocation Among Indep. r.v. of Different Var.
Problem formulation
– To encode a set of independent Gaussian r.v. {X1, …, Xn}, Xk ~ N( 0, $\sigma_k^2$ )
– Allocate Rk bits to represent each r.v. Xk, incurring distortion Dk
– Total bit cost is R = R1+…+Rn
– Total MSE distortion D = D1+…+Dn
What is the best bit allocation {R1, …, Rn} such that R is minimized and total distortion D ≤ D(req)?
What is the best bit allocation {R1, …, Rn} such that D is minimized and total rate R ≤ R(req)?
– Recall Rk = max( ½ log2( $\sigma_k^2$ / Dk ), 0 )
– Solve the constrained optimization problem using a Lagrange multiplier

Rate/Distortion Allocation via Reverse Water-filling
How many bits to be allocated for each coeff.?
– Determined by the variance of the coefficients
– More bits for high variance $\sigma_k^2$ to keep total MSE small
Optimal solution for Gaussian: “Reverse Water-filling”
– Idea: try to keep the same amount of error in each freq. band, and spend no bits on coeff. whose variance is smaller than the water level
– Results based on R-D function & via Lagrange-multiplier optimization: given D, determine the water level and then $R_D$; or vice versa for given $R_D$
=> “Equal slope” idea can be extended to other convex (operational) R-D functions
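A small sketch of reverse water-filling for independent Gaussian components, assuming the total distortion D is given and the water level is found by bisection. The bisection search, the symbol `theta` for the water level, and the function name are choices made for this illustration.

```python
import math


def reverse_water_filling(variances, total_distortion, iters=100):
    """Return (water level, per-component distortions, per-component rates)."""
    if total_distortion >= sum(variances):
        # The distortion budget covers everything: spend no bits at all.
        return max(variances), list(variances), [0.0] * len(variances)
    lo, hi = 0.0, max(variances)
    for _ in range(iters):
        theta = (lo + hi) / 2.0
        # Components below the water level contribute their full variance.
        if sum(min(theta, v) for v in variances) < total_distortion:
            lo = theta
        else:
            hi = theta
    theta = (lo + hi) / 2.0
    dists = [min(theta, v) for v in variances]
    rates = [0.5 * math.log2(v / d) if d < v else 0.0
             for v, d in zip(variances, dists)]
    return theta, dists, rates


# Three components with variances 16, 4, 1 and a total distortion of 6.
theta, dists, rates = reverse_water_filling([16.0, 4.0, 1.0], 6.0)
```

In this example the water level settles at 2.5: the two large-variance components each carry distortion 2.5, while the component with variance 1 lies below the water level, is reproduced by its mean, and receives zero bits.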

Key Result in Rate Allocation: Equal R-D Slope
Keep the slope in R-D curve the same (not necessarily Gaussian r.v.)
– Otherwise the bits can be applied to other r.v. for better “return” in reducing the overall distortion
If all r.v. are Gaussian, the slopes of their R-D curves are identical at the same distortion, since dR/dD = −1/(2 D ln 2) does not depend on the variance

More on Rate-Distortion Based
Bit Allocation in Image Coding

Bit Allocation
How many bits to be allocated for each coeff.?
– Determined by the variance of coeff.
– More bits for high variance $\sigma_k^2$ to keep total MSE small
$$\min D = \frac{1}{N}\sum_{k=0}^{N-1} E\left[\,|v(k) - v_q(k)|^2\,\right] = \frac{1}{N}\sum_{k=0}^{N-1} \sigma_k^2 f(n_k)$$
$$\text{subject to}\quad \frac{1}{N}\sum_{k=0}^{N-1} n_k = B \ \text{bits/coeff.}$$
where $\sigma_k^2$ -- variance of k-th coeff. v(k); $n_k$ -- # bits allocated for v(k); $f(\cdot)$ -- quantization distortion func.
“Inverse Water-filling” (from info. theory)
– Try to keep same amount of error in each freq. band
See Jain’s pp501 for detailed bit-allocation algorithm
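
Details on Reverse Water-filling Solution (cont’d)
$$\min \sum_{i=1}^{n} \frac{1}{2}\ln\frac{\sigma_i^2}{D_i}, \quad \text{subj. to } \sum_{i=1}^{n} D_i = D$$
Construct func. using Lagrange multiplier:
$$J(D) = \sum_{i=1}^{n} \frac{1}{2}\ln\frac{\sigma_i^2}{D_i} + \lambda \sum_{i=1}^{n} D_i$$
Necessary condition (keep the same marginal gain):
$$\frac{\partial J}{\partial D_i} = -\frac{1}{2}\,\frac{1}{D_i} + \lambda = 0 \ \text{ for all } i, \quad \text{so } D_i = \frac{1}{2\lambda} \text{ is the same for all } i$$
With the rate clipped at zero, the problem becomes
$$\min \sum_{i=1}^{n} \max\left\{ \frac{1}{2}\ln\frac{\sigma_i^2}{D_i},\ 0 \right\}, \quad \text{subj. to } \sum_{i=1}^{n} D_i = D$$
Necessary condition for choosing $\lambda$:
$$\frac{\partial J}{\partial D_i} \begin{cases} = 0, & \text{if } D_i < \sigma_i^2 \\ \le 0, & \text{if } D_i = \sigma_i^2 \end{cases} \qquad D_i = \begin{cases} \theta, & \text{if } \theta < \sigma_i^2 \\ \sigma_i^2, & \text{if } \theta \ge \sigma_i^2 \end{cases} \quad \text{s.t. } \sum_{i=1}^{n} D_i = D$$
where $\theta = 1/(2\lambda)$ denotes the water level.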

Lagrangian Opt. for Indep. Budget Constraint
Previous: fix distortion, minimize total rate
Alternative: fix total rate (bit budget), minimize distortion
(Discrete) Lagrangian optimization on general source
– “Constant slope optimization” (e.g. in Box 6 Fig. 17 of Ortega’s tutorial)
– Need to determine the quantizer q(i) for each coding unit i
– Lagrangian cost for each coding unit: $d_{i,q(i)} + \lambda\, r_{i,q(i)}$; use a line with slope $-\lambda$ to intersect with each operating point (Fig.14)
– For a given operating quality $\lambda$, the minimum can be computed independently for each coding unit:
$$\min \sum_{i=1}^{n} \left( d_{i,q(i)} + \lambda\, r_{i,q(i)} \right) = \sum_{i=1}^{n} \min \left( d_{i,q(i)} + \lambda\, r_{i,q(i)} \right)$$
=> Find operating quality $\lambda$ satisfying the rate constraint
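The constant-slope search can be sketched as below: for a trial λ every coding unit independently picks the operating point with the smallest Lagrangian cost, and λ is then adjusted by bisection until the total rate fits the budget. The operating points, the bisection bracket [0, 1e6], and the function names are hypothetical values chosen for illustration; the sketch also assumes every unit has an operating point cheap enough that the budget can be met.

```python
def best_choice(points, lam):
    """Index of the (rate, distortion) point minimizing d + lam * r."""
    return min(range(len(points)),
               key=lambda j: points[j][1] + lam * points[j][0])


def total_rate(units, lam):
    return sum(u[best_choice(u, lam)][0] for u in units)


def allocate(units, budget, iters=60):
    """Find a multiplier whose independent per-unit minima meet the budget."""
    lo, hi = 0.0, 1e6            # assumed bracket: hi must satisfy the budget
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if total_rate(units, mid) > budget:
            lo = mid             # too many bits: penalize rate more heavily
        else:
            hi = mid
    return hi, [best_choice(u, hi) for u in units]


# Two coding units, each with four candidate quantizers given as
# (rate in bits, distortion) operating points.
units = [
    [(0, 100), (1, 40), (2, 10), (3, 4)],
    [(0, 20), (1, 8), (2, 3), (3, 1)],
]
lam, choice = allocate(units, budget=4)
```

For a 4-bit budget the search assigns 3 bits to the first unit and 1 bit to the second, for a total distortion of 12. A search of this kind only reaches operating points on the convex hull of the R-D characteristics, so it can leave part of the budget unused when the exact target lies between hull points.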

[Figures: from Box 6 Fig. 17 of Ortega’s tutorial; from Fig. 14 of Ortega’s tutorial]

Bridging the Theory and Ad-hoc Practice
Operational R-D curves
– Directly achievable with practical implementation
Tradeoff! Tradeoff! Tradeoff!
– Rate vs. Distortion: how close are the operational points to info. theory bounds?
– Also tradeoff among practical considerations: cost of memory, computation, delay, etc.
Often narrowing down to an optimization problem: ~ optimize an objective func. subj. to a set of constraints
Model mismatch problem
– How good are our assumptions on source modeling?
– How well does a coding algorithm tolerate model mismatch?

Basic Steps in R-D Optimization
Determine source’s statistics
– Simplified models: Gaussian, Laplacian, Generalized Gaussian
– Obtain from training samples
Obtain operational R-D points/functions/curves
– E.g., compute distortion for each candidate quantizer
Determine objective func. and constraints
Search for optimal operational R-D points
– Lagrange multiplier approach: selects only operating points on the convex hull of overall R-D characteristics (Fig.19 of Ortega’s tutorial); may handle different coding units independently
– Dynamic programming approach: not constrained by convex hull, but has larger search space (higher complexity)