Image Compression
• Introduction
- The goal of image compression is to reduce the amount of data required to represent a digital image.
- The idea is to remove redundant data from the image (i.e., data which do not affect image quality significantly)
- Image compression is very important for image storage and image transmission
f(x,y) → Compress → transmit (channel) / store & retrieve (storage device) → Decompress → f̂(x,y)
• Image Storage Applications
- Educational and business documents
- Medical images
- Weather maps
- Fingerprints (FBI database)
• Image Transmission Applications
- Remote sensing via satellite
- Military communications via aircraft, radar, and sonar
- Teleconferencing
- Facsimile transmission (FAX)
• Compression Rates
- Advanced compression techniques can achieve compression ratios in the range 10:1 to 50:1 without visibly affecting image quality.
- Very high compression ratios of up to 2000:1 can be achieved in compressing video signals.
- In order for a compression system to be useful, compression and decompression must be very fast.
• Compression Techniques
Lossless:
- Information preserving
- Low compression ratios
Lossy:
- Not information preserving
- High compression ratios
Tradeoff: image quality vs compression ratio
• Classification of Methods
• Fundamentals
- Data compression implies reducing the amount of data required to represent a given quantity of information
- Data and information are not synonymous!
Data: the means by which information is conveyed
The same amount of information can be represented by various amounts of data
Example:
Your wife, Helen, will meet you at Logan Airport in Boston at 5 minutes past 6:00 pm tomorrow night
Your wife will meet you at Logan Airport at 5 minutes past 6:00 pm tomorrow night
Helen will meet you at Logan at 6:00 pm tomorrow night
- Data redundancy is a mathematically quantifiable entity
Data Set 1 (image): n1 information-carrying units (e.g., bits)
Data Set 2 (compressed image): n2 information-carrying units (e.g., bits)
Compression ratio:
CR = n1 / n2
Relative data redundancy:
RD = 1 − 1/CR
if n2 = n1, then CR = 1, RD = 0
if n2 << n1, then CR → ∞, RD → 1
if n2 >> n1, then CR → 0, RD → −∞
Example:
If CR = 10 (i.e., 10:1), then RD = 1 − 1/10 = 0.9
(90% of the data in data set 1 is redundant)
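These two measures are easy to check numerically; a minimal sketch (the 10,000-bit figure below is just an illustrative assumption):

```python
def compression_ratio(n1, n2):
    """CR = n1/n2: carrying units before vs. after compression."""
    return n1 / n2

def relative_redundancy(cr):
    """RD = 1 - 1/CR: fraction of the original data that is redundant."""
    return 1.0 - 1.0 / cr

cr = compression_ratio(10_000, 1_000)  # hypothetical 10,000-bit image stored in 1,000 bits
rd = relative_redundancy(cr)           # CR = 10.0, RD = 0.9 (90% redundant)
```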
• Data Redundancies
- Coding redundancy
- Interpixel redundancy
- Psychovisual redundancy
(data compression tries to reduce one or more of these redundancies)
Coding Redundancy
- Data compression can be achieved by encoding the data using an appropriate encoding scheme
• Elements of an encoding scheme
Code: a list of symbols (letters, numbers, bits etc.)
Code word: a sequence of symbols used to represent a piece of information or an event (e.g., gray levels)
Word length: number of symbols in each code word
Example: (binary code, symbols: 0,1, length: 3)
0: 000    4: 100
1: 001    5: 101
2: 010    6: 110
3: 011    7: 111
• Avoiding coding redundancy
- When the codes have not been selected according to the probabilities of the events, there is coding redundancy.
Idea: assign fewer symbols (bits) to the more probable events (gray levels in our case)
• Method (Variable Length Encoding)
- Use the histogram of the image for the construction of the codes
l(rk): # of bits for rk
Average # of bits: Lavg = E(l(rk)) = Σ_{k=0}^{L−1} l(rk) P(rk)
Total # of bits: N M Lavg
Example:
- Assume an image with L = 8 gray levels
- Assume l(rk) = 3 for all k:
Lavg = Σ_{k=0}^{7} 3 P(rk) = 3 Σ_{k=0}^{7} P(rk) = 3 bits
- Total number of bits: 3NM
- Assume instead a variable-length code matched to the probabilities of the gray levels:
Lavg = Σ_{k=0}^{7} l(rk) P(rk) = 2.7 bits
CR = 3/2.7 = 1.11 (about 10% savings)
RD = 1 − 1/1.11 = 0.099
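The comparison can be reproduced numerically; the histogram and code lengths below are assumed values chosen to match the Lavg = 2.7 figure:

```python
def average_length(lengths, probs):
    """Lavg = sum over k of l(rk) * P(rk)."""
    return sum(l * p for l, p in zip(lengths, probs))

# assumed gray-level probabilities for an L = 8 image
probs = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]

fixed_lavg = average_length([3] * 8, probs)                    # 3-bit fixed code
var_lavg = average_length([2, 2, 2, 3, 4, 5, 6, 6], probs)    # shorter codes for likely levels

cr = fixed_lavg / var_lavg   # ~1.11
rd = 1 - 1 / cr              # ~0.099
```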
Interpixel redundancy
- Structural or geometric relationships between the objects in an image can be used for efficient image compression.
correlation: f(x) ∘ g(x) = ∫_{−∞}^{∞} f*(a) g(x + a) da
autocorrelation: the special case g(x) = f(x)
- Interpixel redundancy implies that any pixel value can be reasonably predicted by its neighbors
- For example, differences between adjacent pixels can be used to represent an image
Run-length encoding:
(1,63) (0,87) (1,37) (0,5) (1,4) (0,556) (1,62) (0,210)
Using 11 bits/pair: 8 × 11 = 88 bits are required (compared to the 1024 bits of the original binary row!)
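A minimal sketch of the run-length mapping used above (pairs are (value, run length), scanning left to right):

```python
def run_length_encode(bits):
    """Encode a binary sequence as (value, run-length) pairs."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1       # extend the current run
        else:
            runs.append([b, 1])    # start a new run
    return [(v, n) for v, n in runs]

row = [1] * 63 + [0] * 87 + [1] * 37   # start of the 1024-pixel row above
run_length_encode(row)                 # [(1, 63), (0, 87), (1, 37)]
```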
Psychovisual redundancy
- Takes advantage of the peculiarities of the human visual system
- The eye does not respond with equal sensitivity to all visual information
- Human perception searches for important features (e.g., edges, texture, etc.) and does not perform quantitative analysis of every pixel in the image
Example: image compression based on gray-scale quantization (lossy compression)
Fidelity Criteria
- How close is f (x, y) to f̂ (x, y) ?
( f̂ (x, y) = f (x, y) + e(x, y))
f(x,y) → Compress → g(x,y) → Decompress → f̂(x,y)
• Criteria
Objective: mathematically defined criteria
Subjective: based on the opinion of a number of observers
• Objective fidelity criteria
- Root-mean-square (rms) error:
erms = √[ (1/MN) Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} ( f̂(x,y) − f(x,y) )² ]
- Mean-square signal-to-noise ratio:
SNRms = [ Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f̂(x,y)² ] / [ Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} ( f̂(x,y) − f(x,y) )² ]
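Both objective criteria are straightforward to compute; a sketch on plain nested lists (images as M×N arrays):

```python
import math

def rms_error(f, fhat):
    """erms = sqrt( (1/MN) * sum of (fhat - f)^2 over all pixels )."""
    M, N = len(f), len(f[0])
    se = sum((fhat[x][y] - f[x][y]) ** 2 for x in range(M) for y in range(N))
    return math.sqrt(se / (M * N))

def snr_ms(f, fhat):
    """Mean-square SNR: sum of fhat^2 over sum of (fhat - f)^2.
    (Undefined when the reconstruction is exact: the error sum is zero.)"""
    signal = sum(v * v for row in fhat for v in row)
    noise = sum((fhat[x][y] - f[x][y]) ** 2
                for x in range(len(f)) for y in range(len(f[0])))
    return signal / noise
```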
Image Compression Models
f(x,y) → [Source encoder] → [Channel encoder] → Channel → [Channel decoder] → [Source decoder] → f̂(x,y)

Source encoder: removes redundancies (compression)
Channel encoder: produces a noise-tolerant representation (additional bits are included to guarantee detection and correction of errors due to transmission over the channel, e.g., Hamming coding)
• Encoder
f(x,y) → [Mapper] → [Quantizer] → [Symbol encoder] → to channel encoder

Mapper: transforms the input data into a format that facilitates reduction of interpixel redundancies (reversible)
Quantizer: reduces the accuracy of the mapper's output in accordance with some pre-established fidelity criterion; removes psychovisual redundancies (not reversible in general)
Symbol encoder: assigns the shortest code words to the most frequently occurring output values; removes coding redundancies (reversible)
• Decoder
[Channel decoder] → [Symbol decoder] → [Inverse mapper] → f̂(x,y)

- The inverse operations are performed
- Quantization is irreversible in general, so there is no "de-quantizer" block
Measuring Information
- What is the information content of a message ?
- What is the minimum amount of data that is sufficient to completely describe an image without loss of information?
• Modeling the information generation process
- Assume that the generation of information is a probabilistic process
- A random event E which occurs with probability P(E) contains
I(E) = log(1/P(E)) = − log P(E) units of information
(note that when P(E) = 1, then I(E) = 0: no information!)
- Suppose that the gray-level value of a pixel is generated by a random variable; then
I(rk) = − log P(rk)
Entropy: the average information content of an image:
H = − Σ_{k=0}^{L−1} P(rk) log P(rk)
(if log2 is used, the units are bits/pixel)
Redundancy:
R = Lavg − H
(note that if Lavg = H, then R = 0: no redundancy)
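A minimal sketch of both quantities, with probabilities taken from the image histogram:

```python
import math

def entropy(probs):
    """H = -sum of P(rk) * log2 P(rk), in bits/pixel.
    Zero-probability levels contribute nothing (0 * log 0 taken as 0)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def redundancy(lavg, probs):
    """R = Lavg - H; zero when the average code length matches the entropy."""
    return lavg - entropy(probs)

entropy([0.25, 0.25, 0.25, 0.25])  # 2.0 bits/pixel (uniform 4-level histogram)
entropy([1.0])                     # 0.0 -- a constant image carries no information
```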
• Estimating entropy
- Use the image histogram
- It is not easy to estimate entropy accurately ...
First-order estimate of H:
H = − Σ_{k=0}^{3} P(rk) log2 P(rk) = 1.81 bits/pixel
Total # of bits: 4 × 8 × 1.81 ≈ 58 bits
Second-order estimate of H:
Relative frequencies of blocks (pairs) of pixels can be used!
H = 2.5/2 = 1.25 bits/pixel
- The above estimates give only a lower bound on the compression that can be achieved through variable-length coding alone
- Differences between higher-order estimates of entropy and the first-order estimate indicate the presence of interpixel redundancies
- Example: consider the difference image (keep the first pixel of each row and replace each remaining pixel by its difference from its left neighbor)
Compute H for the difference image:
H = − Σ_{k=0}^{2} P(rk) log2 P(rk) = 1.41 bits/pixel
1.41 bits/pixel > 1.25 bits/pixel (the 2nd-order estimate of H)
This implies that a better mapping can be found!
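These estimates can be reproduced on a hypothetical 4x8 image (an assumed example chosen to be consistent with the 1.81 and 1.41 bits/pixel figures above):

```python
from collections import Counter
import math

def first_order_entropy(values):
    """First-order entropy estimate from the histogram of `values`."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

# assumed 4x8 image: each row is [21, 21, 21, 95, 169, 243, 243, 243]
img = [[21, 21, 21, 95, 169, 243, 243, 243]] * 4

pixels = [v for row in img for v in row]
h1 = first_order_entropy(pixels)   # ~1.81 bits/pixel (4 distinct gray levels)

# difference mapping: keep each row's first pixel, then left-neighbor differences
diffs = [v for row in img
         for v in [row[0]] + [b - a for a, b in zip(row, row[1:])]]
hd = first_order_entropy(diffs)    # ~1.41 bits/pixel -- only 3 distinct symbols
```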
Lossless Compression
f(x,y) → Compression → Decompression → f̂(x,y)
e(x,y) = f̂(x,y) − f(x,y) = 0
• Huffman coding
Source → ak (gray levels) → Encode → Decode → ak
(compression can be achieved by encoding the ak appropriately)
- It belongs to the class of variable-length coding techniques
- It creates the optimal code for a set of source symbols which are encoded one at a time (constraint!)
Optimal code: minimizes the number of code symbols per source symbol
Method
Forward Pass
1. Sort probabilities per symbol
2. Combine the lowest two probabilities
3. Repeat step 2 until only two probabilities remain
Backward Pass
1. Assign code symbols going backwards
- What is Lavg?
Lavg = E(l(ak)) = Σ_{k=1}^{6} l(ak) P(ak)
     = 3×0.1 + 1×0.4 + 5×0.06 + 4×0.1 + 5×0.04 + 2×0.3 = 2.2 bits/symbol
- Assume fixed-length binary codes; what is Lavg in this case?
With 6 symbols, we need a 3-bit code:
(a1: 000, a2: 001, a3: 010, a4: 011, a5: 100, a6: 101)
Lavg = Σ_{k=1}^{6} l(ak) P(ak) = Σ_{k=1}^{6} 3 P(ak) = 3 Σ_{k=1}^{6} P(ak) = 3 bits/symbol
- After the code has been created, coding/decoding can be implemented using a look-up table
- Decoding can be done in an unambiguous way!
Example (a code consistent with the lengths above: a1 = 011, a2 = 1, a3 = 01010, a4 = 0100, a5 = 01011, a6 = 00):
010100111100 → 01010 | 011 | 1 | 1 | 00 → a3 a1 a2 a2 a6
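The forward/backward passes above are equivalent to the classic heap-based construction; a sketch using the example's six source symbols (tie-breaking may produce different code words than the table, but Lavg is the same):

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a Huffman code for {symbol: probability}; returns {symbol: bit string}."""
    tick = count()  # tie-breaker so the heap never compares the code dicts
    heap = [(p, next(tick), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)  # combine the two lowest probabilities,
        p2, _, c2 = heapq.heappop(heap)  # prefixing their code words with 0 / 1
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tick), merged))
    return heap[0][2]

probs = {"a1": 0.1, "a2": 0.4, "a3": 0.06, "a4": 0.1, "a5": 0.04, "a6": 0.3}
code = huffman_code(probs)
lavg = sum(len(code[s]) * p for s, p in probs.items())  # 2.2 bits/symbol
```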
• Run-length coding
- Code each contiguous group of 0's and 1's, encountered in a left-to-right scan of a row, by its length
- Additional compression can be achieved by encoding the lengths of the runs using variable-length coding.
Lossy Compression
• Transform Coding
Idea: transform the image into a domain where compression can be performed more efficiently
Warning: the transformation itself does not compress the image !!!
- Example: (inverse) Fourier transform
f(x,y) = (1/N) Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} F(u,v) e^{j2π(ux+vy)/N},  x,y = 0,1,...,N−1
We have seen that the magnitude of the FT decreases as u, v increase...
Some of the F(u,v) coefficients will be close to zero, so we can discard them and reconstruct f(x,y) using only the most significant F(u,v) coefficients, e.g.,
f̂(x,y) = (1/N) Σ_{u=0}^{N/2−1} Σ_{v=0}^{N/2−1} F(u,v) e^{j2π(ux+vy)/N},  x,y = 0,1,...,N−1
where Σ_{x,y} ( f̂(x,y) − f(x,y) )² is very small!
• Transform Selection
f(x,y) = Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} T(u,v) h(x,y,u,v)
- T(u,v) can be generated using various transformations:
DFT
DCT (discrete cosine transform)
KLT (Karhunen-Loeve transform)
etc.
- It can be shown that if an image is transformed using the KLT, the energy of the image is more tightly packed than with the other transforms
- From a practical point of view, this means that fewer KLT coefficients are needed to reconstruct a "good" image!
- In practice, the DCT is preferred: it has good packing ability and is much less computationally intensive!
• Definition of DCT
C(u,v) = α(u) α(v) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) cos( (2x+1)uπ / 2N ) cos( (2y+1)vπ / 2N ),
u,v = 0,1,...,N−1

f(x,y) = Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} α(u) α(v) C(u,v) cos( (2x+1)uπ / 2N ) cos( (2y+1)vπ / 2N ),
x,y = 0,1,...,N−1

α(u) = √(1/N) if u = 0,  √(2/N) if u > 0   (and similarly for α(v))
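A direct (unoptimized, O(N⁴)) sketch of this transform pair, useful for checking the definitions on small blocks:

```python
import math

def alpha(u, N):
    """Normalization: sqrt(1/N) for u = 0, sqrt(2/N) otherwise."""
    return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)

def dct2(f):
    """Forward 2-D DCT of an N x N block, per the C(u,v) definition above."""
    N = len(f)
    return [[alpha(u, N) * alpha(v, N) *
             sum(f[x][y]
                 * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                 * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct2(C):
    """Inverse 2-D DCT, reconstructing f(x,y) from the coefficients C(u,v)."""
    N = len(C)
    return [[sum(alpha(u, N) * alpha(v, N) * C[u][v]
                 * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                 * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]
```

Because the transform is orthonormal, `idct2(dct2(f))` recovers `f` up to floating-point error, and `C[0][0]` is the (scaled) DC average of the block.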
- Basis set of functions for a 4x4 image (cosines of different frequencies)
- DCT minimizes "blocking artifacts" (boundaries between subimages do not become very visible)
• Subimage Selection
• JPEG compression algorithm (sequential)
Steps
1. Divide the image into 8x8 subimages; for each subimage do:
2. Shift the gray levels to the range [-128, 127]
3. Apply the DCT (64 coefficients will be obtained: 1 DC coefficient, 63 AC coefficients)
4. Quantize the coefficients (reduce the amplitude of coefficients that do not contribute a lot):
Cq(u,v) = Round[ C(u,v) / Q(u,v) ]
5. Order the coefficients using zig-zag ordering (to place nonzero coefficients first and to create long runs of zeros --> good for run-length encoding!)
6. Encode the coefficients as symbols
7. Encode symbol1 and symbol2 using variable-length encoding (Huffman coding or arithmetic coding can be used)
- Quantization
for i = 0 to n-1
    for j = 0 to n-1
        Q[i,j] = 1 + (1 + i + j) * quality
1 ≤ quality ≤ 25 (quality = 1: best quality, low compression; quality = 25: worst quality, high compression)
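A sketch of this simple distance-based quantizer (the function names are illustrative; note that Python's built-in `round` rounds ties to even, which is close enough here):

```python
def quant_matrix(n=8, quality=1):
    """Q[i][j] = 1 + (1 + i + j) * quality: coarser steps for higher frequencies."""
    return [[1 + (1 + i + j) * quality for j in range(n)] for i in range(n)]

def quantize(C, Q):
    """Cq(u,v) = round(C(u,v) / Q(u,v)); small high-frequency coefficients become 0."""
    n = len(C)
    return [[round(C[u][v] / Q[u][v]) for v in range(n)] for u in range(n)]
```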
- Zig-zag ordering
DC encoding: the DC coefficient is encoded as the difference from the DC coefficient of the previous subimage
AC encoding: each nonzero AC coefficient is represented by two symbols:
symbol1 = (RUN-LENGTH, SIZE)      symbol2 = (AMPLITUDE)
RUN-LENGTH: # of zeros preceding this coefficient (4 bits, 0 ≤ RUN-LENGTH ≤ 15)
SIZE: # of bits needed to represent AMPLITUDE
AMPLITUDE: the coefficient's value, in [−1023, 1024] (up to 10 bits)
(if RUN-LENGTH > 15, then the symbol (15,0) means RUN-LENGTH = 16)
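The zig-zag scan itself can be generated by sorting the block indices by anti-diagonal, alternating the traversal direction on each diagonal; a sketch:

```python
def zigzag_order(n=8):
    """(u, v) index pairs of an n x n block in zig-zag order: grouped by
    anti-diagonal u + v, reversing direction on alternate diagonals."""
    return sorted(((u, v) for u in range(n) for v in range(n)),
                  key=lambda t: (t[0] + t[1],
                                 t[0] if (t[0] + t[1]) % 2 else t[1]))

zigzag_order(4)[:6]  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```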
- Encoding (symbols)