
Compression

"lossless" : f[x,y] { g[x,y] = Decompress ( Compress ( f[x,y] ) |

“lossy”: quality measures (sketched in code below)
• e²_rms = (1/MN) Σ_{x,y} ( g[x,y] − f[x,y] )²
• SNR_rms = (1/MN) Σ_{x,y} g[x,y]² / e²_rms

• subjective: how does it look to the eye
• application: how does it influence the final results

For both lossless and lossy:
• the attained compression ratio
• the time and memory needed for compression
• the time and memory needed for decompression
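The two objective measures above are direct to compute; a minimal numpy sketch (the array names f and g follow the slides, the function name is mine):

```python
import numpy as np

def quality_measures(f: np.ndarray, g: np.ndarray):
    """e2_rms and SNR_rms of a decompressed M x N image g against the
    original f, following the definitions above."""
    f = f.astype(np.float64)
    g = g.astype(np.float64)
    mn = f.size                              # M * N pixels
    e2_rms = np.sum((g - f) ** 2) / mn       # mean-squared error
    # For a lossless codec e2_rms is 0 and the SNR is undefined.
    snr_rms = np.sum(g ** 2) / mn / e2_rms   # mean signal power / error power
    return e2_rms, snr_rms
```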

Coding redundancy

Average number of bits per symbol for code 2: 2.7 bits

Compression ratio versus a 3-bit fixed-length code: Cr = 3/2.7 ≈ 1.11
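The code table itself is not reproduced in these notes; the sketch below uses an 8-symbol distribution and code lengths chosen so the average comes out at the quoted 2.7 bits (an assumption, not necessarily the slide's table):

```python
import numpy as np

# Hypothetical 8-level source: probabilities and the lengths of the
# variable-length code words assigned to each level.
p = np.array([0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02])
lengths = np.array([2, 2, 2, 3, 4, 5, 6, 6])

l_avg = float(np.sum(p * lengths))   # average bits per symbol: 2.70
cr = 3.0 / l_avg                     # vs. a 3-bit fixed-length code: 1.11
print(f"average length = {l_avg:.2f} bits, Cr = {cr:.2f}")
```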

Interpixel redundancy

Run Length Encoding (RLE):

For the whole binary image: Cr = 2.63
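A sketch of row-wise RLE for a binary image; the encoding format (first value plus run lengths) is one common choice, not necessarily the slide's:

```python
import numpy as np

def rle_encode_row(row: np.ndarray):
    """Encode one binary image row as (first_value, run_lengths)."""
    change = np.flatnonzero(np.diff(row)) + 1        # positions where the value flips
    bounds = np.concatenate(([0], change, [row.size]))
    return int(row[0]), np.diff(bounds)

def rle_decode_row(first: int, runs: np.ndarray) -> np.ndarray:
    values = (first + np.arange(runs.size)) % 2      # runs alternate 0/1
    return np.repeat(values, runs)

row = np.array([0, 0, 0, 1, 1, 0, 1, 1, 1, 1])
first, runs = rle_encode_row(row)
assert np.array_equal(rle_decode_row(first, runs), row)
print(first, runs)    # 0 [3 2 1 4]
```

Long runs compress well; an image of single-pixel runs would instead expand, which is why the attained Cr depends on the image.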

Psycho-visual redundancy

General Model

The "mapper" transforms the data to make it suitable for reducing the inter-pixel redundancies. This step is generally reversible and can reduce the amount of data, e.g. RLE, but not in transformations to the Fourier or Discrete Cosinus domains.

The "quantizer" reduces the precision of the output of the mapper according to the determined reliability criteria. This especially reduces psycho-visual redundancies and is irreversible.

The "symbol encoder" makes a static or variable length of code to represent the quantizer's output. It reduces the coding redundancy and is reversible.

Information theory

Questions such as "what is the minimum amount of data needed to represent an image?" are answered by information theory. The generation of information is modeled as a statistical process that can be measured in a way that agrees with our intuition of information.

An event E with a probability P(E) carries I(E) = −log_r P(E) r-ary units of information.

P(E) = 1/2 then: I(E) = −log₂(1/2) = 1 bit of information.

If a source generates symbols a_i with probabilities P(a_i), then the average information per output is H(z) = −Σ_i P(a_i) log_r P(a_i), the uncertainty or entropy of the source.

This is maximal when every symbol has an equal probability (1/N). It indicates the minimal average length (for r=2 in bits per symbol) needed to code the symbols.
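A minimal entropy calculator matching the definition above:

```python
import numpy as np

def entropy(p: np.ndarray, r: float = 2.0) -> float:
    """H(z) = -sum_i P(a_i) log_r P(a_i), in r-ary units per symbol."""
    p = p[p > 0]                       # 0 * log 0 is taken as 0
    return float(-np.sum(p * np.log(p) / np.log(r)))

print(entropy(np.array([0.5, 0.5])))          # 1.0 bit
print(entropy(np.array([0.25] * 4)))          # 2.0 bits (uniform: maximal)
print(entropy(np.array([0.9, 0.05, 0.05])))   # ~0.57 bits: very compressible
```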

Huffman coding

Under the condition that the symbols are coded one by one, Huffman coding generates an optimal code for the given set of symbols and probabilities. It is a block code:
• every source symbol is mapped to a fixed sequence of code symbols
• instantaneous: each code word can be decoded without reference to the symbols that follow it
• uniquely decodable
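A compact sketch of Huffman's construction, which repeatedly merges the two least probable symbol groups; the symbols and probabilities are illustrative:

```python
import heapq

def huffman_code(probs: dict) -> dict:
    """Build a Huffman prefix code for {symbol: probability}."""
    # Heap of (probability, tiebreak, {symbol: code-so-far}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # two least probable groups
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

code = huffman_code({"a": 0.4, "b": 0.3, "c": 0.1, "d": 0.1, "e": 0.1})
print(code)   # {'a': '0', 'b': '10', ...}: shorter words for likelier symbols
```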

Lempel Ziv Welch coding

This translates variable-length strings of source symbols (with roughly equal probabilities) into codes of fixed (or predictable) length.

The method is adaptive: the table of symbol strings is built up in one pass over the data set, during both compression and decompression.

Like Huffman, this is a symbol encoder, which can be used directly on the input or after a mapper and quantizer.

It is used in GIF, TIFF, PDF and in Unix compress.
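A minimal LZW compressor over bytes; decompression, which rebuilds the same table from the code stream, is omitted for brevity:

```python
def lzw_compress(data: bytes) -> list[int]:
    """LZW: map variable-length byte strings to fixed-size table indices.
    The string table is rebuilt identically during decompression."""
    table = {bytes([i]): i for i in range(256)}   # initial single-byte entries
    w, out = b"", []
    for b in data:
        wb = w + bytes([b])
        if wb in table:
            w = wb                        # grow the current match
        else:
            out.append(table[w])          # emit code for longest known string
            table[wb] = len(table)        # add the new string to the table
            w = bytes([b])
    if w:
        out.append(table[w])
    return out

print(lzw_compress(b"TOBEORNOTTOBEORTOBEORNOT"))
```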

Predictive coding

1D: p_n = round( Σ_{i=1}^{m} a_i f_{n−i} ) ; the first m values of f must be transmitted in another way. Only the prediction error e_n = f_n − p_n is then coded.

2D: p[x,y] = round( a₁ f[x,y−1] + a₂ f[x−1,y] )
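A sketch of the 1D scheme, including the error-transmission step; the function names are mine:

```python
import numpy as np

def predictive_encode(f: np.ndarray, a=(1.0,)):
    """1-D lossless predictive coding: transmit the first m samples as-is,
    then only the prediction errors e_n = f_n - p_n."""
    m = len(a)
    f = f.astype(np.int64)
    e = f.copy()                         # e[:m] holds the raw start values
    for n in range(m, f.size):
        p = round(sum(a[i] * f[n - 1 - i] for i in range(m)))
        e[n] = f[n] - p                  # residuals cluster around 0
    return e

def predictive_decode(e: np.ndarray, a=(1.0,)):
    m = len(a)
    f = e.copy()
    for n in range(m, e.size):
        p = round(sum(a[i] * f[n - 1 - i] for i in range(m)))
        f[n] = e[n] + p
    return f

f = np.array([100, 102, 105, 105, 104, 108])
e = predictive_encode(f)
assert np.array_equal(predictive_decode(e), f)
print(e)    # [100 2 3 0 -1 4]: small values, cheap to entropy-code
```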

Lossy predictive coding

Delta modulation

Delta modulation is a simple but well-known form of lossy predictive coding:

p_n = α · pin_{n−1}, with α < 1 (here, "pin" stands for the predictor input, i.e. the previously reconstructed sample)

q_n = ζ · sign(e_n), which can be represented by a 1-bit value: − or +
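A sketch of delta modulation with assumed parameter values α = 0.9 and ζ = 4:

```python
import numpy as np

def delta_modulate(f, alpha=0.9, zeta=4.0):
    """1-bit lossy coding: p_n = alpha * previous reconstructed sample,
    q_n = +/- zeta. Returns the bit stream and the reconstruction."""
    bits, recon = [], [float(f[0])]          # first sample sent verbatim
    for x in f[1:]:
        p = alpha * recon[-1]                # prediction from decoded output
        q = zeta if x - p >= 0 else -zeta    # quantized error: one bit
        bits.append(q > 0)
        recon.append(p + q)                  # the decoder sees only p + q
    return bits, np.array(recon)

f = np.array([10, 12, 15, 20, 26, 27, 28, 27])
bits, recon = delta_modulate(f)
print(bits)
print(recon.round(1))   # tracks f, with slope overload on fast ramps
```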

Differential Pulse Code modulation

With DPCM, p_n = Σ_{i=1}^{m} α_i pin_{n−i}. Under the assumption that the quantization error (e_n − q_n) is small, the optimal values of α_i can be found by minimizing E{e_n²} = E{ (f_n − p_n)² }.

These calculations are almost never done for each single image but rather for a few typical images or models of them.
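Minimizing E{e_n²} over a training signal is a linear least-squares problem; a sketch (ignoring the quantizer, per the small-error assumption above):

```python
import numpy as np

def optimal_dpcm_coeffs(train: np.ndarray, m: int = 2) -> np.ndarray:
    """Minimize E{(f_n - sum_i a_i f_{n-i})^2} over a training signal:
    a least-squares fit of each sample on its m predecessors."""
    f = train.astype(np.float64)
    # Rows of X: [f_{n-1}, ..., f_{n-m}] ; target y: f_n
    X = np.column_stack([f[m - i - 1 : f.size - i - 1] for i in range(m)])
    y = f[m:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a

rng = np.random.default_rng(0)
train = np.cumsum(rng.normal(size=1000))   # smooth, correlated test signal
print(optimal_dpcm_coeffs(train, m=2))     # near [1, 0] for a random walk
```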

[Figure: original image]

4 prediction methods (fig. 8.24)

Predictions:
• 0.97 f(x,y−1)
• 0.5 ( f(x,y−1) + f(x−1,y) )
• 0.75 ( f(x,y−1) + f(x−1,y) ) − 0.5 f(x−1,y−1)
• 0.97 f(x,y−1) or 0.97 f(x−1,y)

Lloyd-Max quantizer

Instead of one quantization level, more levels can be used. They may be spaced unequally, e.g. with a factor of 2 between them. With a Lloyd-Max quantizer the levels are optimized to achieve a minimum error.

Adjusting the quantizer scale per block of n (e.g. 16) pixels, using a restricted set of scale factors (for example 4), yields a substantial improvement of the error in the decoded image with a small reduction of the compression ratio.
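A sketch of the classic Lloyd iteration for a minimum mean-squared-error scalar quantizer; the initialization and sample distribution are my choices:

```python
import numpy as np

def lloyd_max(samples: np.ndarray, levels: int = 4, iters: int = 50):
    """Lloyd's iteration: alternate between nearest-level assignment
    and centroid update until the levels settle."""
    x = np.sort(samples.astype(np.float64))
    r = np.quantile(x, (np.arange(levels) + 0.5) / levels)  # initial levels
    for _ in range(iters):
        t = (r[:-1] + r[1:]) / 2            # decision boundaries: midpoints
        idx = np.searchsorted(t, x)         # assign samples to levels
        for k in range(levels):
            if np.any(idx == k):
                r[k] = x[idx == k].mean()   # centroid of each cell
    return r, t

rng = np.random.default_rng(1)
r, t = lloyd_max(rng.laplace(scale=10, size=5000), levels=4)
print("levels:", r.round(2), "boundaries:", t.round(2))
```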

Adaptation

Using the 3-point prediction with the best of 4 quantizers per block of 16 pixels (error images shown ×8). Signalling the chosen quantizer costs 2 bits per 16 pixels, i.e. 0.125 bits/pixel, giving compression rates in bits/pixel of:

1.0 → 1.125
2.0 → 2.125
3.0 → 3.125

Transform coding

JPEG makes use of 8×8 sub-images: a DCT transformation, quantization of the 64 coefficients by dividing by a quantization matrix [e.g. fig. 8.37b], and a zigzag ordering [fig. 8.36d] of the coefficients followed by a Huffman encoder, with the DC component coded separately.

It uses a YUV color model; for the U and V components, blocks of 2×2 pixels are combined into one pixel. The quantization matrices can be scaled to yield several compression ratios. There are standard coding tables and quantization matrices, but the user can also supply others to obtain better results for a certain image.
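A sketch of the per-block pipeline on a single 8×8 sub-image; the flat quantization matrix is a placeholder for JPEG's perceptually tuned table, and the zigzag/Huffman stages are omitted:

```python
import numpy as np
from scipy.fft import dct, idct

def dct2(b):
    return dct(dct(b, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(b):
    return idct(idct(b, axis=0, norm="ortho"), axis=1, norm="ortho")

# Placeholder quantization matrix: JPEG's real tables (cf. fig. 8.37b)
# are coarser for high frequencies; scaling Q trades quality for ratio.
Q = np.full((8, 8), 16.0)

block = np.tile(np.linspace(0, 255, 8), (8, 1))   # one 8x8 sub-image
coeffs = np.round(dct2(block - 128) / Q)          # quantization: the lossy step
restored = idct2(coeffs * Q) + 128                # the decoder's approximation
print(np.abs(restored - block).max())             # small reconstruction error
```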

Examples

[Figure: original (zoomed) and DCT-coded results with 2×2, 4×4 and 8×8 sub-images, and DCT with normalization array; compression ratios 34:1 (3.42) and 67:1 (6.33).]

Wavelet transform

[Table: wavelet type, operations per pixel, number of zeros (< 1.5)]

4 wavelets

Wavelet compression ratios

34:1 (2.29) 67:1 (2.96) 108:1 (3.72) 167:1 (4.73)
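A sketch of wavelet-based compression by coefficient thresholding, assuming the PyWavelets (pywt) package is available; the threshold 1.5 mirrors the zero-count criterion in the table above:

```python
import numpy as np
import pywt   # PyWavelets, assumed available

rng = np.random.default_rng(2)
img = np.cumsum(np.cumsum(rng.normal(size=(64, 64)), 0), 1)  # smooth test image

# Multi-level 2-D wavelet transform, then discard small coefficients.
coeffs = pywt.wavedec2(img, "db2", level=3)
arr, slices = pywt.coeffs_to_array(coeffs)
kept = np.where(np.abs(arr) < 1.5, 0.0, arr)      # zero the small values
print("zeroed:", np.mean(kept == 0).round(2))     # fraction of zeros

recon = pywt.waverec2(
    pywt.array_to_coeffs(kept, slices, output_format="wavedec2"), "db2")
print("max error:", np.abs(recon - img).max().round(3))
```

The long runs of zeros are what the subsequent symbol encoder exploits to reach the quoted ratios.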

JPEG 2000

• uses wavelets (optionally on parts, tiles, of the image)

• different for error-free and lossy compression

• gray and color images (up to 16-bit signed values)

• conversion to (about) YCbCr color space

– Cb, Cr components peak around 0

• complicated coding of wavelet values

– organised in layers and finally packets

– allowing more and more refined decoding (and storage)

– and access to parts of the image

Fractal compression

GIF original image (161×261 pixels, 8 bits/pixel); JPEG compression 15:1; JPEG compression 36:1; fractal compression 36:1.

MPEG (1,2,4) video

• I-frame (intraframe or independent frame): JPEG-like
• P-frame (predictive frame): the difference between the frame and a motion-compensated prediction from the previous I- or P-frame
• B-frame (bidirectional): predicted from the previous I- or P-frame and the next P-frame

File formats

The header contains information about:
• type: black and white, 8-bit gray level/color, 3-byte color
• size: number of rows, columns and bands, number of images
• compression method and its possible parameters
• data format: for example bands or colors per pixel or separated
• origin of the image or conditions during acquisition
• manipulations previously done on the image

A few well-known formats are:
• GIF: for binary, gray-level and 8-bit color images
• TIFF: a multi-type format with many possibilities
• JFIF: JPEG coded, for color or gray images of natural origin
• MPEG: for a series of images
• PBM, PGM, PPM: the PBMPLUS formats
• BMP: Microsoft's format