Video Coding

TSBK01 Image Coding and Data Compression

Lecture 10

Jörgen Ahlberg

Outline

I. Colour coding

II. Moving images: From 2D to 3D?

III. Hybrid coding

IV. Video coding standards

Part I Colour Coding

The base colours of colour television are

– Red: 700 nm

– Green: 546 nm

– Blue: 435 nm

Three base colours are enough to synthesize any visible colour!

[Figure: the RGB colour space, with axes R, G and B]

The Colour Vector

In this plane, the luminance Y = R + G + B = 1

The PAL colours

Y = 0.30R + 0.59G + 0.11B

Cr = R - Y = 0.70R - 0.59G - 0.11B

Cb = B - Y = -0.30R - 0.59G + 0.89B

Y: luminance; Cr, Cb: chrominance
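The conversion can be sketched in a few lines of Python. This is a minimal illustration, assuming the luminance weights 0.30, 0.59 and 0.11 apply to R, G and B respectively and that all signals lie in the range 0..1; the function names are hypothetical and not part of the lecture.

```python
def rgb_to_ycrcb(r, g, b):
    """Map R, G, B (each in 0..1) to luminance Y and colour differences Cr, Cb."""
    y = 0.30 * r + 0.59 * g + 0.11 * b
    cr = r - y  # expands to 0.70R - 0.59G - 0.11B
    cb = b - y  # expands to -0.30R - 0.59G + 0.89B
    return y, cr, cb


def ycrcb_to_rgb(y, cr, cb):
    """Invert the mapping: recover R and B directly, then solve the Y equation for G."""
    r = y + cr
    b = y + cb
    g = (y - 0.30 * r - 0.11 * b) / 0.59
    return r, g, b


# Grey tones (R = G = B) carry no chrominance: Cr and Cb come out as (about) zero.
y, cr, cb = rgb_to_ycrcb(0.5, 0.5, 0.5)
print(round(y, 6), round(cr, 6), round(cb, 6))
```

Since Cr and Cb are plain differences against Y, the change of basis is invertible and loses nothing by itself; any loss comes from how the components are subsampled and quantized afterwards.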

[Figure: a matrix converts the R, G and B signals into Y, R-Y and B-Y]

• Change basis to YUV (almost the same as YCrCb).

– For more info on colour spaces, see the colour FAQ at www.poynton.com/Poynton-color.html

• The Human Visual System perceives the luminance in higher resolution than the chrominance!

→ Subsample the colour components.

Digital Colour Coding

[Figure: relative sizes of the Y, U and V components with 4:2:0 and 4:2:2 subsampling]
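A rough sketch of the two subsampling schemes, assuming that chroma samples are simply averaged (real systems may use other filters and sample positions) and that a plane is a list of rows with even dimensions. The helper names are made up for illustration.

```python
def subsample_422(plane):
    """4:2:2: halve the horizontal chroma resolution by averaging sample pairs."""
    return [[(row[x] + row[x + 1]) / 2 for x in range(0, len(row), 2)]
            for row in plane]


def subsample_420(plane):
    """4:2:0: halve the chroma resolution in both directions by averaging 2x2 blocks."""
    out = []
    for y in range(0, len(plane), 2):
        row = []
        for x in range(0, len(plane[0]), 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append(total / 4)
        out.append(row)
    return out


def bits_per_pixel(chroma_fraction, bits=8):
    """Average bits per pixel when each chroma plane keeps `chroma_fraction` of its samples."""
    return bits * (1 + 2 * chroma_fraction)


u_plane = [[1, 3, 5, 7],
           [1, 3, 5, 7],
           [2, 2, 2, 2],
           [4, 4, 4, 4]]
print(subsample_420(u_plane))
print(bits_per_pixel(1.0), bits_per_pixel(0.5), bits_per_pixel(0.25))
```

With 8 bits per sample this gives 24 bits per pixel without subsampling, 16 with 4:2:2 and 12 with 4:2:0, so subsampling alone halves the data in the 4:2:0 case before any further compression.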

Part II Coding of Moving Images

Principle I - Extend known methods to 3D

Coding Method      Performance (bpp)   Complexity   Decoding complexity
PCM                6 – 8               Low          Low
VQ                 0.5 – 2             Very high    Low
Predictive         2 – 5               Low          Low
Transform          0.5 – 1.5           High         High
Subband/Wavelet    0.1 – 1.0           High         High
Fractal            0.1 – 0.5           Very high    Low

Extending 2D Methods

• Predictive coding

– 3D predictors

– Motion compensated predictors

• Transform coding

– 3D transforms

• Subband coding

– 3D subband filters

BUT! The properties of the image signal are different in the temporal and the spatial domain!

Thus

Principle II: Hybrid methods

Hybrid predictive/transform coding is very popular.

Part III Hybrid Coding

• Combine predictive coding and transform coding.

• Use predictive coding to predict the next frame in the sequence.

• Use transform coding to code the prediction error.

Transform Coding

[Block diagram: T → Q → VLC]

T: Transform; Q: Quantizer; VLC: Variable Length Coder

Predictive Coding

[Block diagram: Q, Q-1, VLC and P]

Q: Quantizer; Q-1: Inverse quantizer (reconstructor); P: Predictor
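The roles of Q, Q-1 and P can be illustrated with a one-dimensional DPCM loop. This is only a sketch under simple assumptions: a uniform quantizer with step size `step`, and a predictor that uses the previous reconstructed sample. The function names are hypothetical.

```python
def dpcm_encode(samples, step):
    """Quantize the prediction error; predict from the previous reconstructed sample."""
    indices = []
    prediction = 0
    for s in samples:
        error = s - prediction
        q = round(error / step)               # Q: quantizer index sent to the VLC
        indices.append(q)
        prediction = prediction + q * step    # Q-1, then P: same value the decoder gets
    return indices


def dpcm_decode(indices, step):
    """Mirror of the encoder's feedback loop."""
    out = []
    prediction = 0
    for q in indices:
        prediction = prediction + q * step
        out.append(prediction)
    return out


samples = [10, 12, 15, 15, 9]
decoded = dpcm_decode(dpcm_encode(samples, step=2), step=2)
print(decoded)
```

The encoder predicts from reconstructed values rather than from the original samples, so encoder and decoder stay in step and the quantization error never exceeds half a step per sample.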

Hybrid Coding

[Block diagram: T, Q and VLC in the forward path; Q-1, T-1 and P in the feedback loop]
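A toy version of the hybrid loop, to show how the blocks fit together. The assumptions are loud ones: each "frame" is just 8 samples, T is an orthonormal 8-point DCT, Q is uniform with step size `step`, P is the previous reconstructed frame (no motion compensation), and the VLC stage is left out. All names are illustrative.

```python
import math

N = 8  # samples per toy frame


def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def dct(block):
    """T: orthonormal DCT-II of N samples."""
    return [_scale(k) * sum(block[n] * math.cos(math.pi * (n + 0.5) * k / N)
                            for n in range(N))
            for k in range(N)]


def idct(coeffs):
    """T-1: inverse of dct()."""
    return [sum(_scale(k) * coeffs[k] * math.cos(math.pi * (n + 0.5) * k / N)
                for k in range(N))
            for n in range(N)]


def hybrid_encode(frames, step):
    reference = [0.0] * N                       # P: previous reconstructed frame
    coded = []
    for frame in frames:
        residual = [f - r for f, r in zip(frame, reference)]
        levels = [round(c / step) for c in dct(residual)]      # T and Q
        coded.append(levels)
        recon = idct([lv * step for lv in levels])              # Q-1 and T-1
        reference = [r + e for r, e in zip(reference, recon)]   # decoder's view
    return coded


def hybrid_decode(coded, step):
    reference = [0.0] * N
    frames = []
    for levels in coded:
        recon = idct([lv * step for lv in levels])
        reference = [r + e for r, e in zip(reference, recon)]
        frames.append(reference)
    return frames


frames = [[50, 52, 54, 56, 58, 60, 62, 64]] * 2   # a static scene
coded = hybrid_encode(frames, step=4)
print(coded[0])
print(coded[1])
```

For a static scene the second frame's residual is only the quantization error of the first, so nearly all of its levels are zero, which is what makes the prediction step pay off.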

Frame Prediction

Intra-coded I-frame

Predictively coded P-frames

Better prediction if it can compensate for motion!

Motion Compensation

Motion Compensated Hybrid Coding

[Block diagram: TQ, TQ-1, P, ME and two VLCs]

ME: Motion estimation; TQ: Transform + quantization

Motion Compensation

• Typically one motion vector per macroblock (4 transform blocks)

• Motion estimation is a time-consuming process

– Hierarchical motion estimation

– Maximum length of motion vectors

– Clever search strategies

• Motion vector accuracy:

– Integer, half or quarter pixel

– Bilinear interpolation
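To make the cost of motion estimation concrete, here is a sketch of the simplest strategy: an exhaustive integer-pixel search with a maximum vector length, using the sum of absolute differences (SAD) as the matching criterion. The SAD criterion, the function names and the half-pixel helper are assumptions for illustration; the lecture does not prescribe them.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in `cur` and a displaced block in `ref`."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def full_search(cur, ref, bx, by, size=16, max_len=15):
    """Try every integer vector up to `max_len` pixels; return (best vector, its SAD)."""
    height, width = len(ref), len(ref[0])
    best_vec, best_cost = None, None
    for dy in range(-max_len, max_len + 1):
        for dx in range(-max_len, max_len + 1):
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue  # candidate block would fall outside the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_vec, best_cost = (dx, dy), cost
    return best_vec, best_cost


def half_pel_sample(ref, x2, y2):
    """Bilinear sample at half-pixel position (x2/2, y2/2), non-negative coordinates."""
    x0, y0 = x2 // 2, y2 // 2
    x1, y1 = x0 + x2 % 2, y0 + y2 % 2
    return (ref[y0][x0] + ref[y0][x1] + ref[y1][x0] + ref[y1][x1]) / 4


# Synthetic test: the current frame is the reference shifted by (2, 1).
ref = [[16 * y + x for x in range(12)] for y in range(12)]
cur = [[ref[(y + 1) % 12][(x + 2) % 12] for x in range(12)] for y in range(12)]
print(full_search(cur, ref, bx=4, by=4, size=4, max_len=3))
```

The search evaluates (2·max_len + 1)² candidates per macroblock, each costing one SAD over the whole block, which is why a bound on the vector length, hierarchical estimation and smarter search patterns matter in practice.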

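Part IV Video Coding Standards

[Figure: applications and standards along a bitrate axis, with ticks at 8, 16, 64 and 384 kbit/s and at 1.5, 5 and 20 Mbit/s]

Bitrate ranges: very low, low, medium and high bitrate

Applications: mobile videophone, videophone over PSTN, ISDN videophone, Video CD, digital TV, HDTV

Standards: H.263, H.261, MPEG-4, MPEG-1, MPEG-2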

Standards

• H.26x

– Standards for real time communication like video telephony and video conferencing.

– Standardized by ITU.

• MPEG

– Standards for stored video data like movies on CDs, DVDs, etc.

– Standardized by ISO.

H.261

• Standard for ISDN picture phones in 1990.

• Motion compensation:

– One motion vector per macroblock.

– One macroblock = four 8×8 luminance blocks + two chrominance blocks (one U and one V).

– Motion vectors max 15 pixels long in each direction.

• Format:

– CIF (352×288) or QCIF (176×144)

– 7.5 – 30 frames/s.

• Bitrate: Multiple of 64 kbit/s (= ISDN) including audio.

• Quality: Acceptable for small motion at 128 kbit/s.
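A back-of-the-envelope calculation shows how much compression these figures imply. It assumes 8 bits per sample and a macroblock covering 16×16 luminance pixels (four 8×8 blocks) plus one 8×8 U and one 8×8 V block; the helper names are hypothetical.

```python
def macroblocks(width, height, mb_size=16):
    """Number of macroblocks in a frame (16x16 luminance pixels each)."""
    return (width // mb_size) * (height // mb_size)


def raw_bitrate(width, height, fps, bits_per_sample=8):
    """Uncompressed bitrate in bit/s for 4:2:0 video built from macroblocks."""
    samples_per_mb = 6 * 8 * 8   # four luminance blocks + one U block + one V block
    return macroblocks(width, height) * samples_per_mb * bits_per_sample * fps


cif = raw_bitrate(352, 288, 30)      # 396 macroblocks per frame
qcif = raw_bitrate(176, 144, 7.5)    # 99 macroblocks per frame
print(cif, cif / 128_000)
print(qcif, qcif / 128_000)
```

Under these assumptions, CIF at 30 frames/s is about 36.5 Mbit/s uncompressed, roughly 285 times the 128 kbit/s channel, while QCIF at 7.5 frames/s needs only about an 18-fold reduction. Since the channel also carries audio, the video itself must be compressed somewhat more than these ratios suggest.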

H.263

• Standard for picture telephones over analog subscriber lines in 1995.

• Format:

– CIF, QCIF or Sub-QCIF.

– Usually less than 10 frames/s.

• Bitrate: Typically 20 – 30 kbit/s.

• Quality: With new options as good as H.261 (at half the bitrate).

MPEG

• Moving Picture Experts Group – a committee under ISO and IEC.

• Original plan:

– MPEG-1 for 1.5 Mbit/s (Video CD)

– MPEG-2 for 10 Mbit/s (Digital TV)

– MPEG-3 for 40 Mbit/s (HDTV)

• What happened:

– MPEG-1 for 1.5 Mbit/s (Video CD)

– MPEG-2 for 2 – 60 Mbit/s (TV and HDTV)

– MPEG-4, -7 and -21 for other things.

MPEG-1

• ISO/IEC standard in 1991.

• Target bitrate around 1.5 Mbit/s (Video CD).

• Properties:

– Bi-directionally predictively coded frames ("B-frames", see next slide).

– More flexible than H.261.

– Almost JPEG for intra frames.

• Format:

– CIF

– No interlace.

– 24 – 30 frames/s.

MPEG Frame Types

Group of frames (GOF): I B B P B B P B B P B B I

Intra-coded I-frame

Predictively coded P-frames

Bi-directionally predictively coded B-frames
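Because a B-frame is predicted from the reference frames on both sides of it, a decoder needs the later reference before it can reconstruct the B-frames in between. The sketch below reorders a display-order sequence accordingly. It is an illustration of that dependency under the assumption that frames are labelled by type and index, not a description of any particular bitstream syntax.

```python
def coding_order(display_order):
    """Move each run of B-frames after the reference frame (I or P) that follows it."""
    out = []
    pending_b = []
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)      # must wait for the next reference frame
        else:
            out.append(frame)            # I- or P-frame goes first
            out.extend(pending_b)        # then the B-frames that depend on it
            pending_b = []
    out.extend(pending_b)                # trailing B-frames, if any
    return out


display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print(coding_order(display))
```

The reordering is one reason B-frames add delay and memory requirements at both ends, which is a cost on top of the extra motion search they need.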

MPEG coding of I-frames

• Intracoded

• 8×8 DCT

• Arbitrary weighting matrix for coefficients

• Predictive coding of DC-coefficients

• Uniform quantization

• Zig-zag, run-level, entropy coding
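The last steps of this chain can be sketched as follows. The quantization rule (divide by weight times a scale factor, then round) and the "EOB" end-of-block marker are simplifying assumptions, and the names are hypothetical; the DCT and the entropy coder are omitted.

```python
def quantize(coeffs, weights, scale):
    """Uniform quantization with a per-coefficient weighting matrix."""
    return [[round(c / (w * scale)) for c, w in zip(c_row, w_row)]
            for c_row, w_row in zip(coeffs, weights)]


def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zig-zag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def run_level(levels):
    """Turn a scanned coefficient list into (zero run, level) pairs plus an end marker."""
    pairs = []
    run = 0
    for v in levels:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")   # trailing zeros are implied by the end-of-block marker
    return pairs


# A block with a few low-frequency DCT coefficients and a flat weighting matrix.
coeffs = [[0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[1][0], coeffs[1][1], coeffs[2][0] = 96, 40, -24, 17, 3
weights = [[16] * 8 for _ in range(8)]

levels = quantize(coeffs, weights, scale=0.5)
scan = [levels[r][c] for r, c in zigzag_order()]
print(scan[0])               # DC level, coded predictively on its own
print(run_level(scan[1:]))   # AC levels as run-level pairs
```

The zig-zag scan visits low frequencies first, so after quantization the nonzero levels cluster at the start of the scan and the long tail of zeros collapses into a single end marker.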

MPEG coding of P-frames

• Motion compensated prediction from I- or P-frame.

• Half-pixel accuracy of motion vectors, bilinear interpolation.

• Predictive coding of motion vectors.

• Prediction error coded in the same way as an I-frame.
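Predictive coding of motion vectors exploits the fact that neighbouring macroblocks tend to move together. A minimal sketch, assuming the predictor is simply the previous macroblock's vector (starting from zero) and that the differences are what gets entropy coded; the names are illustrative.

```python
def encode_vectors(vectors):
    """Replace each motion vector by its difference from the previous one."""
    prev = (0, 0)
    diffs = []
    for dx, dy in vectors:
        diffs.append((dx - prev[0], dy - prev[1]))
        prev = (dx, dy)
    return diffs


def decode_vectors(diffs):
    """Accumulate the differences to recover the original vectors."""
    prev = (0, 0)
    vectors = []
    for ddx, ddy in diffs:
        prev = (prev[0] + ddx, prev[1] + ddy)
        vectors.append(prev)
    return vectors


row = [(3, -1), (3, -1), (4, -1), (4, 0)]   # vectors along one row of macroblocks
print(encode_vectors(row))
```

For smooth motion most differences are zero or close to it, so a variable length code can give them short codewords.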

MPEG coding of B-frames

• Motion compensated prediction from two consecutive I- or P-frames.

– Forward prediction only (1 vector/macroblock).

– Backward prediction only (1 vector/macroblock).

– Average of fwd and bwd (2 vectors/macroblock).

• Otherwise as P-frames.
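The choice between the three prediction modes can be sketched per macroblock as below. The example assumes the forward and backward candidates have already been motion compensated, that blocks are flattened into lists, and that the encoder picks the mode with the smallest sum of absolute differences; real encoders also weigh the cost of sending one or two vectors. The names are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equally long sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def predict_b_block(block, fwd, bwd):
    """Pick forward, backward or averaged prediction, whichever matches `block` best."""
    average = [(f + b) / 2 for f, b in zip(fwd, bwd)]
    candidates = {"forward": fwd, "backward": bwd, "average": average}
    mode = min(candidates, key=lambda m: sad(block, candidates[m]))
    residual = [x - p for x, p in zip(block, candidates[mode])]
    return mode, residual


block = [10, 20, 30, 40]
fwd = [8, 18, 28, 38]      # from the previous reference frame
bwd = [12, 22, 32, 42]     # from the next reference frame
print(predict_b_block(block, fwd, bwd))
```

Averaging helps when the scene changes gradually between the two references, for example during a fade, while the single-direction modes cover content that is only visible in one of them.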

MPEG-2

• ISO/IEC standard in 1994.

• Properties:

– Handles interlace (optimized for TV)

– Even more flexible than MPEG-1

• Format:

– 352×288

– 704×576 (25 frames/s) or 720×480 (30 frames/s)

– 1440×1152 or 1920×1080 (HDTV)

• Bitrate:

– 2 – 60 Mbit/s

– ~4 Mbit/s: Image quality similar to PAL / NTSC / SECAM.

– 18 – 20 Mbit/s: HDTV.

MPEG-2 (cont.)

• Profiles:

– Simple profile without B-frames.

– Scalable profiles.

• Experience shows that:

– At 1.5 – 2 Mbit/s MPEG-2 is not better than MPEG-1.

– With manual interaction during coding, good quality can be achieved at 3 – 4 Mbit/s.

– Problems with implementing the full standard have caused compatibility problems.

– Buffering and rate control are hard problems.

MPEG-4

• ISO/IEC standard in 1998, version 2 in 1999

• Instead of frames as coding units, MPEG-4 uses audio-visual objects

• Focus is not primarily on compression, but on content-based functionality

• Contains definitions of:

– Media object types (video, audio, text, graphics, ...)

– Parameters for describing the objects

– Bitstream syntax for the (compressed) parameters

– Scene description, file format, streaming, synchronization, ...

• Allows mixing of media objects.

Parts of the MPEG-4 standard

• Part 1, Systems, contains

– The bitstream syntax and the binary "language" for scene description

– Computer graphics object descriptions

– Multiplexing, transport, ...

• Part 2, Visual, contains

– Video coding

– Still image coding

– Texture coding, ...

• Part 3, Audio, contains a toolbox of audio coders for different applications

• ...

Structure of an MPEG-4 Decoder

[Block diagram: Bitstream → MUX → three A/V object decoders → Compositor → Audio/Video scene]

A video frame

[Figure: a video frame made up of a background VOP and two further VOPs]

MPEG-4 (Natural) Video

• Instead of frames: Video Object Planes (VOPs)

• Coded with Shape Adaptive DCT (SA-DCT)

[Figure: alpha map and SA-DCT]

MPEG-4 Video Coding

[Block diagram: TQ, TQ-1, predictor, motion estimation, shape coding, three VLCs and a mux]

TQ: Transform + quantization

Synthetic/Natural Hybrid Coding

• Mix traditional video with 2D/3D graphics

– Compose virtual environments

– Easy to add text, graphs, images, etc.

• High compression

• Receive objects from separate sources

– Use predefined or locally defined objects

• Scalability

– Progressive decoding

– Better terminal gives better quality.

Synthetic Objects

• 2D/3D graphics

– Lines, polygons

– Still images

– Image/video mapping on polygon meshes

• VRML scenes and objects

• Animated people

• More on animation and virtual characters in Lecture 12!

• Synthetic audio

• More on natural and synthetic audio in Lecture 11!

[Figure with the following labels:]

Computer graphics generated virtual environment

Natural video object

Natural video object mapped on a mesh

Still image or natural video object mapped on an animated mesh

All mixed in the decoder

Virtual Environments

• Downloaded virtual environment

• Different environments for different users

• Simple change between environments

• Synthetic environments are cheaper than real ones

Tools for Synthetic Objects

• Wavelet-based still image compression

– Scalable quality and resolution

– Progressive decoding

– Can be mapped on 2D or 3D meshes

• Compression of 2D and 3D meshes

– Mesh geometry and animation

– Transmit vertex coordinates and let the receiving terminal calculate the polygons

– A moving or still image can be mapped on the mesh (texture mapping).

More Tools for Synthetic Objects

• Face and Body Animation

• Text-to-speech (TTS) interface

• View-dependent scalable texture

– Information about the user's view position in a 3D scene is transmitted on a back-channel

– Only the necessary texture information is transmitted to the user

View-dependent Scalable Texture

Original texture

The texture is mapped on a surface

What the user sees

Other formats

• Microsoft, RealVideo, QuickTime, ...

• All are variations of the hybrid coder used in MPEG coders, with some extra features.

New Stuff

ITU and ISO in cooperation:

H.264 = MPEG-4 part 10

Finished in 2003.

H.264 / MPEG-4 part 10

• 4×4 integer transform (approximating DCT).

• Prediction of blocks of sizes up to 16×16.

• Motion vectors for blocks of sizes 4×4 up to 16×16.

• Up to 5 reference images for prediction.

• Non-uniform quantization.

• Arithmetic coding of run-level pairs.
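The idea of an integer transform that approximates the DCT can be illustrated with the 4×4 matrix commonly quoted for the H.264 forward core transform. The slide does not give the matrix, so treat it as an assumption here; the scaling that would make the transform orthonormal is normally folded into the quantizer and is left out of this sketch.

```python
# Commonly cited H.264 forward core transform matrix (assumed, not from the slides).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Product of two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_transform(block):
    """Y = CF * X * CF^T, computed exactly in integer arithmetic."""
    return matmul(matmul(CF, block), transpose(CF))


flat = [[3] * 4 for _ in range(4)]     # a constant block
print(forward_transform(flat))         # only the DC coefficient is nonzero
print(matmul(CF, transpose(CF)))       # rows are orthogonal, with unequal norms
```

Since every entry is a small integer, encoder and decoder compute exactly the same result with additions and shifts, which avoids the mismatch that floating-point DCT implementations can cause.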

What about the sound?

• MPEG-1

– Audio layer I, II and III (mp3).

• MPEG-2

– Four channels, same codec as in MPEG-1.

– AAC (Advanced Audio Coding) added later.

• MPEG-4

– AAC

– Two speech coders

– Structured audio

– And more...

More on audio coding in Lecture 11.

Conclusion

• Colour coding

– Change basis from RGB to YUV

– Colour components are compressed harder than the luminance

• Moving image coding

– Hybrid coding: Motion compensated predictive coding and transform coding of the prediction error

– I-, P-, and B-frames

– Object-based coding (MPEG-4) mixing synthetic and natural audio & video

Conclusion (cont.)

• Standards

– MPEG-1: Video CD

– MPEG-2: Digital TV

– MPEG-4: Multimedia

– H.261: ISDN videophone

– H.263: PSTN videophone

– H.264 / MPEG-4 part 10: Universal video

That was the last slide!