

www.site.uottawa.ca/~elsaddik
www.el-saddik.com

04_Compression © elsaddik

Multimedia Communications

Multimedia Technologies & Applications

Prof. Dr. Abdulmotaleb El Saddik
Multimedia Communications Research Laboratory

School of Information Technology and Engineering
University of Ottawa

Ottawa, Ontario, Canada

elsaddik @ site.uottawa.ca

abed @ mcrlab.uottawa.ca


Content

1. Motivation

2. Requirements - General

3. Fundamentals - Categories

4. Source Coding

5. Entropy Coding

6. Hybrid Coding: Basic Encoding Steps

7. JPEG

8. H.261 and related ITU Standards

9. MPEG-1

10. MPEG-2

11. MPEG-4

12. Wavelets

13. Fractal Image Compression

14. Basic Audio and Speech Coding Schemes

15. Conclusion


Video Compression

In video streams, two types of redundancy can be exploited:
➢ Spatial redundancy
➢ Temporal redundancy

Recall that spatial redundancy is what JPEG and other still-image algorithms exploit.
➢ There are two groups of video compression products:
  ❖ Based purely on spatial redundancy
  ❖ Based on both spatial and temporal redundancy


Spatial-Redundancy-Only Video Compression

➢ Called Motion JPEG
➢ Compresses each frame individually, without reference to any other frame in the sequence
  ❖ → thus does not consider inter-frame redundancies
➢ Audio is not supported in an integrated fashion
➢ Motion JPEG hardware (chips, boards) for near real-time compression/decompression is available, but storage and retrieval from a hard disc still takes a second or more.
  ❖ High-quality video requires fast SCSI discs or caching of short video sequences in large memory buffers.


JPEG for full-motion video

➢ Advantages:
  ❖ Loss of frames does not affect other frames
  ❖ Less encoding complexity and delay
  ❖ Easier editing
➢ Disadvantages:
  ❖ Network-based JPEG applications are unlikely, since Motion JPEG is bandwidth-intensive
    • Typical rate for studio-quality TV: 10 to 20 Mbps

In short, lower data rates (that is, stronger compression) are needed.


Spatial and temporal redundancy video compression – MPEG

We have seen with JPEG how spatial redundancy can be exploited. In addition to spatial redundancy, MPEG utilises the fact that frames in a sequence are similar to each other. This is known as temporal redundancy.

A few definitions are required here:
➢ Macroblocks
  ❖ A macroblock is a 16x16 pixel block, composed of four 8x8 luminance blocks and two colour-difference blocks
➢ Motion vectors
  ❖ A motion vector indicates the spatial translation of a macroblock between two frames.
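
One way to find the motion vector of a macroblock is block matching. The sketch below is a minimal illustration only: it assumes an exhaustive search with the sum of absolute differences (SAD) as matching cost, which is one common choice and not something MPEG prescribes. The function names, the toy 4x4 block size and the search range are hypothetical and chosen to keep the example small.

```python
def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(ref[ry + dy][rx + dx] - cur[cy + dy][cx + dx])
    return total


def find_motion_vector(ref, cur, cx, cy, size=16, search=7):
    """Exhaustive search in the reference frame for the block of the
    current frame whose top-left corner is (cx, cy).
    Returns (dx, dy, cost); (dx, dy) points from the current block
    position to the best-matching position in the reference frame."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(ref, cur, rx, ry, cx, cy, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def make_frame(px, py, width=12, height=12, size=4):
    """Toy frame: black background with one bright size x size patch."""
    frame = [[0] * width for _ in range(height)]
    for y in range(py, py + size):
        for x in range(px, px + size):
            frame[y][x] = 200
    return frame


# The patch moves 2 pixels right and 1 pixel down between the frames.
reference = make_frame(3, 3)
current = make_frame(5, 4)
vector = find_motion_vector(reference, current, 5, 4, size=4, search=3)
print(vector)
```

With a perfect match the cost is zero, so only the motion vector needs to be transmitted for this block; in real video the remaining difference is what gets coded.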


Macroblocks

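[Figure: macroblock layout. Y: four 8x8 blocks numbered 0, 1 (top row) and 2, 3 (bottom row); CB: block 4; CR: block 5]

$$Y = 0.2125 \cdot r + 0.7154 \cdot g + 0.0721 \cdot b$$

$$c_b = b - Y, \qquad c_r = r - Y$$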
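
As a small worked example, the luminance and colour-difference signals can be computed directly from the weights shown on the slide, Y = 0.2125 r + 0.7154 g + 0.0721 b with c_b = b − Y and c_r = r − Y. The sketch assumes components normalised to the range 0..1 and omits the scaling and offsets that practical YCbCr formats apply; the function name is hypothetical.

```python
def rgb_to_y_cb_cr(r, g, b):
    """Luminance Y and colour differences (b - Y, r - Y), slide weights."""
    y = 0.2125 * r + 0.7154 * g + 0.0721 * b
    cb = b - y
    cr = r - y
    return y, cb, cr


white = rgb_to_y_cb_cr(1.0, 1.0, 1.0)   # no colour difference expected
blue = rgb_to_y_cb_cr(0.0, 0.0, 1.0)    # low luminance, large b - Y
print(white)
print(blue)
```

Because the three weights sum to 1, any grey value (r = g = b) produces colour differences of zero, which is why the chrominance blocks of a macroblock carry little information in unsaturated regions.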


Macroblocks


Motion Vectors: Basis image

Motion Vectors: 2nd Image with motion


Motion Vectors: Difference without motion compensation

Motion Vectors: Difference with motion compensation


Motion estimation for different frames

[Figure: frames I, B and P. Parts of the B-frame are available from the earlier frame (I), other parts from the later frame (P)]


MPEG

➢ Moving Picture Experts Group (MPEG)
  ❖ ISO/IEC working group(s)
  ❖ ISO/IEC JTC1/SC29/WG11
  ❖ ISO IS 11172 since 3/93
➢ Coding of combined:
  ❖ video and audio information
➢ Starting point: MPEG-1
  ❖ Audio/video at about 1.5 Mbit/s
  ❖ Based on experiences with JPEG and H.261
➢ Follow-up standards
  ❖ MPEG-2
  ❖ MPEG-4
  ❖ MPEG-7
  ❖ MPEG-21


MPEG

➢ MPEG
  ❖ allows coding comparison across multiple frames and can therefore yield compression ratios of 50:1 to 200:1
  ❖ MPEG chips
    • provide VHS quality at 1.2 to 1.5 Mbps with a 200:1 ratio
    • can also give broadcast video quality at 6 Mbps with a 50:1 ratio
➢ The algorithm is asymmetrical:
  ❖ compression is more complex than decompression


MPEG - Video: Processing Step

4 types of frames:
➢ I-frames (intra-coded frames):
  ❖ Real-time demands in decoding, and sometimes in encoding too
  ❖ Compression of I-frames is the lowest in MPEG
  ❖ I-frames are the points for random access in MPEG streams
  ❖ Coding and decoding like JPEG
  ❖ Structured in 8x8 blocks, within macroblocks of 16x16, that are DCT coded, quantized and entropy coded
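
The first two steps of I-frame coding, the forward DCT of an 8x8 block followed by quantization, can be sketched as follows. This is a hedged illustration rather than a conforming encoder: it assumes a single uniform quantizer step of 16 (real encoders use a quantization matrix and a scale factor), it skips the level shift and the entropy coder, and the function names are hypothetical.

```python
import math

N = 8  # block size used by JPEG and MPEG


def fdct_8x8(block):
    """Forward 2-D DCT (type II) of an 8x8 block of samples."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            cu = 1 / math.sqrt(2) if u == 0 else 1.0
            cv = 1 / math.sqrt(2) if v == 0 else 1.0
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * cu * cv * total
    return out


def quantize(coeffs, step=16):
    """Uniform quantization with one step size (an assumption)."""
    return [[round(c / step) for c in row] for row in coeffs]


# A flat grey block has energy only in the DC coefficient.
flat = [[128] * N for _ in range(N)]
levels = quantize(fdct_8x8(flat))
print(levels[0][0], sum(abs(v) for row in levels for v in row))
```

For smooth image regions most AC coefficients quantize to zero, which is what makes the subsequent run-length and entropy coding effective.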


MPEG - Video: Processing Step

➢ P-frames (predictive coded frames):
  ❖ Require about 1/3 of the data of I-frames
  ❖ Reference to previous I- or P-frames
  ❖ Motion vector calculated
    • MPEG does not define how to determine the motion vector
    • The difference between similar macroblocks is DCT coded
  ❖ DC and AC coefficients are run-length coded
➢ B-frames (bi-directional predictive coded frames):
  ❖ Reference to previous and subsequent (I or P) frames
  ❖ One or two motion vectors are encoded
  ❖ Interpolation between matching macroblocks allowed (both directions)
➢ D-frames (DC-coded frames):
  ❖ Only the DC coefficients of the DCT are coded
  ❖ For fast forward and rewind


MPEG Video-frame sequence

Display order:

  I  B  B  P  B  B  P  B  B  I
  1  2  3  4  5  6  7  8  9  10

• I frame: intra frame
• P frame: predicted frame
• B frame: bidirectionally interpolated frame

The MPEG-coded sequence is transmitted in a different order:

  I  P  B  B  P  B  B  I  B  B
  1  4  2  3  7  5  6  10 8  9

Sequence
• Defined by the application
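
The reordering above follows from the fact that a B-frame can only be decoded once both of its reference frames have arrived. A minimal sketch of that rule, assuming each run of B-frames is simply held back until the next I- or P-frame has been sent (the function name and the handling of trailing B-frames are illustrative assumptions):

```python
def transmission_order(frame_types):
    """Map a display-order string such as 'IBBPBBPBBI' to the list of
    display numbers (1-based) in the order they would be transmitted."""
    order = []
    pending_b = []  # B-frames waiting for their later reference frame
    for number, frame_type in enumerate(frame_types, start=1):
        if frame_type == "B":
            pending_b.append(number)
        else:
            # An I- or P-frame is sent first, then the B-frames before it.
            order.append(number)
            order.extend(pending_b)
            pending_b = []
    # Assumption: trailing B-frames without a later reference go last.
    order.extend(pending_b)
    return order


sequence = transmission_order("IBBPBBPBBI")
print(sequence)
```

Applied to the ten-frame sequence on the slide, this reproduces the transmitted order 1 4 2 3 7 5 6 10 8 9.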


MPEG in a Nutshell

➢ I-frames are self-contained but less compressed than P- and B-frames.
➢ B-frames are the most compressed frames.

Typical sequences of frames are:
➢ I BBB P BBB I…
➢ I BB P BB P BB I…
➢ I BB P BB P BB P BB I...


MPEG Video-Coding Procedure

[Figure: two block diagrams]

I frame: Video in -> Colourspace converter (RGB->YUV) -> FDCT -> Quantization -> Entropy encoder -> Compressed data

P / B frame: Video in -> Colourspace converter (RGB->YUV) -> difference (+/-) against the Reference frame, located by the Motion estimator -> Error terms -> FDCT -> Entropy encoder -> Compressed data


MPEG Encoder: One possible implementation

[Figure: encoder block diagram]

Blocks: IN, Frame recorder, DCT, Quantize, Variable-length coder, Transmit buffer, OUT, Prediction encoder, De-quantize, Inverse DCT, Motion predictor, Reference frame, Rate controller

Signal labels: Scale factor, Buffer fullness, Prediction, Motion vectors, DC


MPEG- Audio Coding

➢ Sampling compatible with the encoding of CD-DA and DAT:
  ❖ Sampling rates:
    • 32 kHz, 44.1 kHz, 48 kHz
  ❖ Sampling precision:
    • 16 bit/sample
➢ Audio channels:
  ❖ Mono (single, 1 channel)
  ❖ Stereo (2 channels)
    • dual channel mode (independent, e.g., bilingual)
    • optional: joint stereo (exploits redundancy and irrelevancy)


MPEG Audio

➢ Application example: DAB (Digital Audio Broadcasting)
➢ Uses MPEG Layer 2 (compression also known as “MUSICAM”)
  ❖ MUSICAM = Masking pattern adapted Universal Subband Integrated Coding And Multiplexing
➢ Delays for VLSI implementation:
  ❖ max. 30 ms encoding
  ❖ max. 10 ms decoding
➢ SW codec delays vary for different layers, implementations and computers (a rule of thumb may be 50/100/150 ms for Layer 1/2/3, which makes MP3 rather inappropriate for real-time conversation)


MPEG-Audio Coding

➢ An FFT is applied to the audio signal and the spectrum is split into 32 non-interleaved sub-bands
  ❖ For each sub-band, the amplitude of the audio signal is calculated
  ❖ The noise level is determined simultaneously with the FFT, using a “psychoacoustic” model
    • Rough quantization at a high noise level and fine quantization at a low noise level

[Figure: Uncompressed audio -> Sub-band coding (32 sub-bands) -> Quantization -> Entropy coding -> Compressed audio; the Psychoacoustical model controls the quantization]


MPEG- Audio Coding

➢ Defines 3 layers of quality, with different complexity of encoder/decoder
  ❖ "higher layer" means "more complex" & "can handle lower layers"
➢ Data rates
  ❖ 14 fixed data rates per layer, between 32 kbit/s and 448 kbit/s
    • In steps of 16 kbit/s
  ❖ Layer 1: max. 448 kbit/s (ca. 1:4 compression, e.g. used as PASC in DCC)
  ❖ Layer 2: max. 384 kbit/s (ca. 1:6-8, common, e.g. as MUSICAM in DAB)
  ❖ Layer 3: max. 320 kbit/s (ca. 1:10-12, the famous MP3)
  ❖ Higher data rates are allowed for the modes:
    • “stereo”
    • “joint stereo”
    • “dual channel”
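
To put the approximate compression ratios into numbers, the short calculation below starts from uncompressed CD-quality stereo (44.1 kHz, 16 bit/sample, 2 channels, as given for MPEG audio sampling) and divides by the ratios quoted for each layer. It is a back-of-the-envelope sketch; the variable and function names are hypothetical, and the results are rough targets rather than the fixed rates of the standard.

```python
SAMPLE_RATE_HZ = 44_100   # CD-DA sampling rate
BITS_PER_SAMPLE = 16
CHANNELS = 2              # stereo

# Uncompressed PCM rate in kbit/s.
pcm_kbps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS / 1000

# Approximate compression ratios 1:n quoted per layer.
RATIOS = {"Layer 1": (4,), "Layer 2": (6, 8), "Layer 3": (10, 12)}


def compressed_rates(pcm_rate_kbps, layer_ratios):
    """Bit rates in kbit/s that result from each quoted ratio."""
    return {layer: [pcm_rate_kbps / n for n in ratios]
            for layer, ratios in layer_ratios.items()}


rates = compressed_rates(pcm_kbps, RATIOS)
print(pcm_kbps)
for layer, values in rates.items():
    print(layer, [round(v, 1) for v in values])
```

The Layer 3 range of roughly 118 to 141 kbit/s is consistent with the familiar practice of encoding stereo MP3 at about 128 kbit/s.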


MPEG- Audio and Video Data Streams

Audio Data Stream Layers
➢ 1. Frames
➢ 2. Audio access units
➢ 3. Slots (4 bytes in Layer 1 (low complexity), 1 byte in Layers 2 & 3)

Video Data Stream Layers
➢ 1. Video sequence layer
➢ 2. Group of pictures layer
➢ 3. Single picture layer
➢ 4. Slice layer
➢ 5. Macroblock layer
➢ 6. Block layer


MPEG Layers

MPEG Layers

➢ Each picture is divided into m horizontal slices
➢ Each slice contains n macroblocks
➢ Each macroblock consists of 16x16 pixels, a total of 256 pixels
➢ Each block is composed of 8x8 pixels, a total of 64 pixels

[Figure: nesting of Picture, Slice, MacroBlock and Block]
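
The hierarchy can be made concrete by counting its elements for one picture size. The sketch below uses a 352 x 288 picture as an assumed example and counts only luminance blocks; how many macroblocks go into each slice is left open, since the slide only states m slices of n macroblocks. The function name is hypothetical.

```python
MACROBLOCK_SIZE = 16  # pixels per side
BLOCK_SIZE = 8        # pixels per side


def picture_structure(width, height):
    """Count macroblocks and luminance blocks in one picture."""
    if width % MACROBLOCK_SIZE or height % MACROBLOCK_SIZE:
        raise ValueError("this sketch assumes dimensions that are multiples of 16")
    columns = width // MACROBLOCK_SIZE
    rows = height // MACROBLOCK_SIZE
    macroblocks = columns * rows
    blocks_per_macroblock = (MACROBLOCK_SIZE // BLOCK_SIZE) ** 2
    return {
        "macroblock_columns": columns,
        "macroblock_rows": rows,
        "macroblocks": macroblocks,
        "pixels_per_macroblock": MACROBLOCK_SIZE ** 2,
        "pixels_per_block": BLOCK_SIZE ** 2,
        "luminance_blocks": macroblocks * blocks_per_macroblock,
    }


info = picture_structure(352, 288)
print(info)
```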


MPEG - Follow-up

➢ MPEG-2:
  ❖ Higher data rates for high-quality audio/video
  ❖ Multiple layers and profiles
  ❖ Studio-quality TV and CD-quality audio channels. 4 to 6 Mbps typically.
➢ MPEG-3
  ❖ Initially HDTV
  ❖ MPEG-2 scaled up to subsume MPEG-3
➢ MPEG-4:
  ❖ Initially, lower data rates for e.g. mobile communication
  ❖ Then: focus on coding and additional functionalities based on image contents
  ❖ Video conferencing at very low bit rates: 4.8 to 64 kbps, with 10 fps.
➢ MPEG-7 (EC = "experimental core" status):
  ❖ Content description
  ❖ Basis for search and retrieval
  ❖ See section on databases
➢ MPEG-21 (upcoming):
  ❖ Framework for multimedia business, delivery... what’s missing?
  ❖ Maybe eCommerce focus --> e.g., security, watermarking?


MPEG 2

➢ From MPEG-1 to MPEG-2
  ❖ Improvement in quality
    • from VCR to TV to HDTV
➢ No CD-ROM based constraints
  ❖ Higher data rates
    • MPEG-1: about 1.5 Mbit/s
    • MPEG-2: 2-100 Mbit/s
➢ Prominent role for digital TV in DVB (digital video broadcasting)
  ❖ Commercial MPEG-2 realizations available


MPEG 2

➢ An international standard (1994)
➢ CBR (constant bit rate) and VBR (variable bit rate) video
➢ Picture quality higher than that of current NTSC, PAL and SECAM broadcast systems
➢ Compression to bit rates in the range of:
  ❖ 60 Mbps for HDTV
  ❖ 15 Mbps for NTSC, PAL and SECAM
  ❖ 4-15 Mbps for TV signals conforming to CCIR 601
➢ MPEG-2 consists of five profiles (Simple (does not support B-frames), Main, Next, ..), each having four levels:
  ❖ High level Type 1: 1152 lines per frame (lpf), 1920 pixels per line (ppl), 60 fps -> 60 Mbps


MPEG-2 Video Profiles and Levels

LAYERS and PROFILES

Level (pixels/line x lines)   Simple        Main          SNR Scalable  Spatial Scalable  High
                              Profile       Profile       Profile       Profile           Profile
B-frames                      No            Yes           Yes           Yes               Yes
Scalability                   Not scalable  Not scalable  SNR           SNR or spatial    SNR or spatial
High Level (1920 x 1152)      -             80 Mbps       -             -                 100 Mbps
High-1440 Level (1440 x 1152) -             60 Mbps       -             60 Mbps           80 Mbps
Main Level (720 x 576)        15 Mbps       15 Mbps       15 Mbps       -                 20 Mbps
Low Level (352 x 288)         -             4 Mbps        4 Mbps        -                 -

Signal-to-noise (SNR) scaling: noise introduced by quantization errors and block structures


MPEG 2 Audio

Two modest extensions to MPEG-1 audio:
1. "low sample rate extension" (LSE):
  ❖ 1/2 of all MPEG-1 rates: 16, 22.05, 24 kHz
  ❖ quantization down to 8 bits/sample
2. "multichannel extension": more channels, i.e. up to
  ❖ 5 full-bandwidth channels (surround system)
    • left and right front
    • center (in front)
    • left and right back
  ❖ "multilingual extension": 7 more, i.e. up to 12 channels (multiple languages, commentary)
➢ Backward compatibility with MPEG-1 audio
  ❖ Only three MPEG-2 audio codecs will not provide backward compatibility (in the range of 256-448 kbps)


MPEG-2 System Definition

➢ Steps
  ❖ Audio and video combined to “Packetized Elementary Stream” (PES)
  ❖ PES combined to “Program Stream” or “Transport Stream”
➢ Program Stream
  ❖ Error-free environment
  ❖ Packets of variable length
  ❖ One single stream with one timing reference
➢ Transport Stream
  ❖ Designed for “noisy” (lossy) media channels
  ❖ Multiplex of various programs with one or more time bases
  ❖ Packets of 188 bytes
➢ Conversion between Program and Transport Streams possible
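
The fixed 188-byte packet is what makes the Transport Stream easy to resynchronise on lossy channels. As an illustration, the sketch below reads a few fields from the 4-byte packet header. The field layout (sync byte 0x47, 13-bit PID, 4-bit continuity counter) comes from the MPEG-2 Systems specification, ISO/IEC 13818-1, and is not shown on the slide; the function name and the hand-built test packet are hypothetical.

```python
TS_PACKET_SIZE = 188  # bytes, fixed for every Transport Stream packet
SYNC_BYTE = 0x47      # first byte of every packet


def parse_ts_header(packet):
    """Extract selected fields from the 4-byte Transport Stream header."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a Transport Stream packet must be 188 bytes long")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing sync byte 0x47")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return {
        "transport_error": bool(b1 & 0x80),
        "payload_unit_start": bool(b1 & 0x40),
        "pid": ((b1 & 0x1F) << 8) | b2,      # 13-bit packet identifier
        "continuity_counter": b3 & 0x0F,     # 4-bit counter per PID
    }


# Hand-built packet: payload start flag set, PID 0x100, counter 5.
example = bytes([0x47, 0x41, 0x00, 0x15]) + bytes(184)
header = parse_ts_header(example)
print(header)
```

A demultiplexer would use the PID to route each packet to the right program and the continuity counter to detect lost packets.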


MPEG 2 Elementary Streams

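[Figure: Audio source -> Audio encoder -> MPEG2 encoded Audio -> Audio Packetized Unit; Video source -> Video encoder -> MPEG2 encoded Video -> Video Packetized Unit; both packetized units, together with Time Sync. Information from the System clock, enter the MPEG2 System multiplexer and encoder, which outputs the MPEG2 stream]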


MPEG 2 Streams

[Figure: an ISO 11172 stream as a sequence of packs (Pack 1, Pack 2, ……..). Labels: Pack Header, System Header, Video Packet (several per pack), Audio Packet; annotation: 188 bytes]



MPEG 4

➢ MPEG-4 (ISO 14496) originally:
  ❖ Targeted at systems with very scarce resources
  ❖ To support applications like
    • Mobile communication
    • Videophone and E-mail
  ❖ Max. data rates and dimensions (roughly):
    • VLBV “Very Low Bit-rate Video”
    • Between 4800 and 64000 bits/s
    • 176 columns x 144 lines x 10 frames/s
    • Largely covered by H.263 (QCIF)
➢ Therefore re-orientation:
  ❖ Goal to provide enhanced functionality
  ❖ to allow for analysis and manipulation of image contents
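
A quick calculation shows how demanding these very low bit rates are. The sketch below estimates the uncompressed rate of the 176 x 144 picture at 10 frames/s and the compression ratio needed to reach 64000 and 4800 bits/s. It assumes 12 bits per pixel (8-bit samples with 4:2:0 chroma subsampling), which is a typical choice but not stated on the slide; the names are hypothetical.

```python
COLUMNS = 176
LINES = 144
FRAMES_PER_SECOND = 10
BITS_PER_PIXEL = 12   # assumption: 8-bit samples, 4:2:0 chroma subsampling

raw_bits_per_second = COLUMNS * LINES * BITS_PER_PIXEL * FRAMES_PER_SECOND


def required_ratio(raw_rate, channel_rate):
    """Compression ratio needed to fit raw video into a channel."""
    return raw_rate / channel_rate


ratio_at_64000 = required_ratio(raw_bits_per_second, 64_000)
ratio_at_4800 = required_ratio(raw_bits_per_second, 4_800)
print(raw_bits_per_second, round(ratio_at_64000, 2), round(ratio_at_4800, 1))
```

Under these assumptions the upper end of the range needs a ratio of about 48:1 and the lower end more than 600:1.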


MPEG 4

MPEG-4: Schedule for Standardization
➢ 1993: Work started
➢ 1997: Committee Draft
➢ 1998: Final Committee Draft
➢ 1998: Draft International Standard
➢ 1999-2000: International Standard

➢ Again
  ❖ Started from the original goal of providing an audio-visual coding standard for very-low-bit-rate channels (e.g., for mobile applications)
  ❖ Evolved into a complex tool kit
  ❖ MPEG-4 innovates the MPEG-2 information production and consumption paradigm by the way audio and video information is represented
  ❖ Deals with audio and video no longer as packaged “bitstreams”, produced by encoding, but as “audio-visual objects” (AVOs)


MPEG 4 - Technical information

➢ Objects are organized in a hierarchical fashion.
➢ Each object has its own description element.
  ❖ Allows handling of the object
➢ One or more primitive media objects can be combined.
➢ Techniques from the Virtual Reality Modeling Language (VRML).

[Figure: primitive media objects and a compound media object; labels: Voice, Background, Image, Talking person]


Video objects

➢ Divide video components
  ❖ Person and background
➢ Camera position information


MPEG 4 -- Media streams

➢ One or more media streams
➢ Descriptors for the objects and the stream

[Figure labels: Media stream, Decompression, Scene description, Composition]


Scene description

➢ Grouping of the objects
  ❖ Directed acyclic graph
➢ Positioning the objects
  ❖ Special attributes

[Figure: scene graph with root Scene and child nodes such as Room and Person; further nodes elided (…)]
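
The idea of grouping objects in a graph and positioning them through attributes can be sketched with a small data structure. This is an illustrative model only, not the MPEG-4 scene description format: the class, its `position` attribute and the `Voice` child are hypothetical, while the node names Scene, Room and Person are taken from the figure.

```python
from dataclasses import dataclass, field


@dataclass
class SceneNode:
    """One node of a scene graph; children group the contained objects."""
    name: str
    position: tuple = (0.0, 0.0, 0.0)   # hypothetical positioning attribute
    children: list = field(default_factory=list)

    def add(self, child):
        self.children.append(child)
        return child


def walk(node, depth=0):
    """Depth-first traversal yielding (depth, name) pairs."""
    yield depth, node.name
    for child in node.children:
        yield from walk(child, depth + 1)


scene = SceneNode("Scene")
room = scene.add(SceneNode("Room"))
person = scene.add(SceneNode("Person", position=(1.0, 0.0, 2.0)))
person.add(SceneNode("Voice"))

listing = list(walk(scene))
for depth, name in listing:
    print("  " * depth + name)
```

Because each object is a separate node, a receiver could move, replace or drop the Person without touching the Room, which is the kind of content-based manipulation MPEG-4 aims for.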


‘New or Improved’ MPEG4 Functionalities

➢ Content-Based Scalability
➢ Content-Based Manipulation and Bitstream Editing
➢ Content-Based Multimedia Data Access Tools
➢ Hybrid Natural and Synthetic Data Coding
➢ Coding of Multiple Concurrent Data Streams
➢ Improved Coding Efficiency
➢ Robustness in Error-Prone Environments
➢ Improved Temporal Random Access


Content-Based Scalability

➢ MPEG4 provides the ability to achieve scalability with a fine granularity in content, spatial resolution, temporal resolution, quality and complexity.
➢ Content scalability may imply a prioritization of the objects in the scene. Combining more than one kind of scalability may yield interesting scene representations, where the more relevant objects are represented with higher spatial-temporal resolution.
➢ Example uses:
  ❖ user selection of the decoded quality of individual objects in the scene;
  ❖ database browsing at different scales, resolutions, and qualities.


Content-Based Manipulation and Bitstream Editing

➢ MPEG4 provides a syntax and coding schemes to support content-based manipulation and bitstream editing without the need for transcoding.
➢ This means the user should be able to access one specific object in the scene/bitstream and perhaps change some of its characteristics.
➢ Example uses:
  ❖ home movie production and editing; interactive home shopping;
  ❖ insertion of a sign language interpreter or subtitles.


Content-Based Multimedia Data Access Tools

➢ MPEG4 shall provide efficient data access and organisation based on the audio-visual content
  ❖ Access tools may be
    • indexing, hyperlinking, querying, browsing, uploading, downloading, and deleting.
➢ Example uses:
  ❖ content-based retrieval of information from on-line libraries and travel information databases


Hybrid Natural and Synthetic Data Coding

➢ MPEG4 supports:
  ❖ efficient methods for combining synthetic scenes with natural scenes (e.g. text and graphics overlays),
  ❖ the ability to code and manipulate natural and synthetic audio and video data, and
  ❖ decoder-controllable methods of mixing synthetic data with ordinary video and audio, allowing for interactivity.
➢ Harmonious integration of natural and synthetic audio-visual objects.
➢ First step towards the integration of all types of audio-visual information.
➢ Example uses:
  ❖ virtual reality applications;
  ❖ animations and synthetic audio (e.g. MIDI) can be mixed with ordinary audio and video in a game;
  ❖ graphics can be rendered from different viewpoints.


Coding of Multiple Concurrent Data Streams

➢ Ability to efficiently code multiple views/soundtracks of a scene, with sufficient synchronisation between the resulting elementary streams.
➢ For stereoscopic and multiview video applications, MPEG4 shall include the ability to exploit redundancy in multiple views of the same scene, also permitting solutions that are compatible with normal (mono) video. This functionality should provide efficient representations of 3D natural objects, provided a sufficient number of views is available. Again, this may require a complex analysis process. It is expected that this functionality could substantially benefit applications such as virtual reality, where until now almost only synthetic objects have been used.
➢ Example uses:
  ❖ multimedia entertainment, e.g. virtual reality games, 3D movies;
  ❖ training and flight simulations;
  ❖ multimedia presentations and education.


Improved Coding Efficiency

➢ The growth of mobile networks creates a strong need for improved coding efficiency.
➢ MPEG4 is required to provide subjectively better audio-visual quality than existing or other emerging standards (such as H.263) at comparable bit rates.
➢ The results of the MPEG4 video subjective tests, held in November 1995, showed however that, in terms of coding efficiency, the available coding standards still perform very well in comparison with most of the other coding techniques proposed.
➢ Example uses:
  ❖ efficient transmission of audio-visual data on low-bandwidth channels;
  ❖ efficient storage of audio-visual data on limited-capacity media, such as chip cards.


Robustness in Error-Prone Environments

➢ Universal accessibility implies access to applications over a variety of wireless and wired networks and storage media.
➢ MPEG4 shall provide an error robustness capability, particularly for low bit-rate applications under severe error conditions.
➢ The idea is not to replace the error control techniques implemented by the network but to provide resilience against the residual errors, e.g. through selective forward error correction, error containment or error concealment.
➢ Example uses:
  ❖ transmitting from a database over a wireless network;
  ❖ communicating with a mobile terminal;
  ❖ gathering audio-visual data from a remote location


Improved Temporal Random Access

➢ MPEG4 shall provide efficient methods to randomly access, within a limited time and with fine resolution, parts of an audio-visual sequence. This includes ‘conventional’ random access at very low bit rates.

➢ Example uses:
  ❖ audio-visual data can be randomly accessed from a remote terminal over limited-capacity media;
  ❖ a ‘fast forward’ can be performed on a single audio-visual object in the sequence.


MPEG7

➢ Increasing availability of multimedia content
➢ Increasing creation of multimedia content
➢ Increasing use of multimedia content by machines
➢ The need for searching, categorizing, describing, managing and filtering

→ Great need for a standard description

➢ MPEG-7 proposes such a standard
➢ MPEG-7 does not deal with implementation
  ❖ (Great for a Master's thesis)


MPEG 21

➢ MPEG-21 Multimedia Framework
  ❖ The vision for MPEG-21 is to define a multimedia framework to enable transparent and augmented use of multimedia resources across a wide range of networks and devices used by different communities


MPEG 21

Seven architectural ‘elements’ in the Multimedia Framework:

1. Digital Item Declaration
2. Digital Items Representation
3. Digital Item Identification and Description
4. Content Management and Usage
5. Intellectual Property Management and Protection
6. Terminals and Networks
7. Event Reporting

MPEG-21

[Figure: the MPEG-21 multimedia framework. User A and User B are linked by a transaction/use/relationship, with content and authorization/value exchange flowing between them. The framework elements shown, with their examples:]

• Digital Item Declaration (example: item, resource)
• Identification and Description (example: unique identifier)
• Content Representation (example: natural and synthetic)
• IPMP (example: encryption, authentication, watermarking)
• Terminals & Networks (example: resource management, QoS)
• Content Management and Usage (example: storage management, personalization)
• Event Reporting (metrics and interfaces)

Event reporting, by creating metrics and interfaces, further describes specific interactions.

MPEG relations

[Figure: diagram relating MPEG-1, MPEG-2, MPEG-4 and MPEG-7]

Standards for Narrow-Band Videoconferencing

• H.320:
  • Standard for videoconferencing over ISDN lines
• H.324:
  • Standard for videoconferencing over POTS (Plain Old Telephone Service)
• H.32x's umbrella specification structure:

  Umbrella   Audio    Video    Control   Multiplex   Modem
  H.324      G.723    H.263    H.245     H.223       V.34
  H.320      G.722    H.261    H.242     H.221

H.261 and related ITU Standards

• Video codec for audiovisual services at p x 64 kbit/s
  • ("p-times-sixty-four", i.e. multiples of 64 kbit/s)
  • ITU/CCITT standard from 1990
    • ITU = International Telecommunication Union
    • CCITT = Consultative Committee for International Telegraph and Telephone
  • For ISDN
  • With p = 1, ..., 30
• Technical issues:
  • Real-time encoding/decoding
  • Max. signal delay of 150 ms
  • Constant data rate
  • Implementation in hardware (main goal) and software

H.261 – Resolution Format

• Unlike JPEG, H.261 defines a very precise image format
  • Image components:
    • Luminance signal (Y)
    • Two color difference signals (Cb, Cr)
  • Subsampling according to CCIR 601 (4:1:1): each color difference component carries a quarter as many samples as Y
    • ITU-R 601 (formerly CCIR 601) designates a "raw" digital video format with 704 x 480 pixels
    • CCIR = International Radio Consultative Committee

Two resolution formats are specified:

• Optional
  • Common Intermediate Format (CIF) resolution
    • Y: 352 x 288 pixels
    • At 29.97 frames/s approx. 36.46 Mbps (uncompressed), i.e. ~570 x 64 kbps
• Mandatory
  • Quarter Common Intermediate Format (QCIF) resolution (half the CIF resolution in each dimension, i.e. a quarter of the pixels)
    • Y: 176 x 144 pixels
    • At 29.97 frames/s approx. 9.115 Mbps (uncompressed)
• All H.261 implementations must be able to encode and decode QCIF; CIF is optional
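
The uncompressed rates quoted on this slide can be reproduced with a few lines of Python. This is only a sketch: it assumes 8 bits per sample and that Cb and Cr each carry a quarter as many samples as Y (12 bits per pixel on average), which the slide does not spell out but which matches its figures. The function and constant names are illustrative.

```python
FRAME_RATE = 29.97       # frames/s, as quoted on the slide
BITS_PER_SAMPLE = 8      # assumption: 8-bit samples
CHANNEL_BPS = 64_000     # one 64 kbps channel (the "64" in p x 64)


def raw_bitrate(width, height, frame_rate=FRAME_RATE):
    """Uncompressed bit rate in bit/s for a Y/Cb/Cr picture.

    Assumes Cb and Cr each carry a quarter as many samples as Y,
    so one frame holds 1.5 * width * height samples.
    """
    samples_per_frame = width * height * 3 // 2
    return samples_per_frame * BITS_PER_SAMPLE * frame_rate


cif = raw_bitrate(352, 288)
qcif = raw_bitrate(176, 144)

print(f"CIF : {cif / 1e6:.2f} Mbps, about {cif / CHANNEL_BPS:.0f} x 64 kbps")
print(f"QCIF: {qcif / 1e6:.3f} Mbps")
```

QCIF has a quarter of the CIF pixel count, so its raw rate is a quarter of the CIF rate.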

H.261 (p x 64) Video Compression

• DCT-based compression algorithm, like JPEG, with
  • differential PCM (DPCM) with motion estimation for interframe coding and
  • variable word-length entropy coding (such as Huffman)
• Very high compression ratios for full-color, real-time motion video transmission
• Combines intraframe and interframe coding
• Optimized for applications such as
  • video-conferencing, which are not motion-intensive
• Limited motion search and estimation strategies
• Compression ratios from 100:1 to 2,000:1
• Covers the entire ISDN channel capacity (p x 64 kbps, p = 1, 2, ..., 30)
  • for p = 1 or 2: videophone, desktop video-conferencing applications
  • for p = 6 or higher, more complex pictures are transmitted; good for group video-conferencing
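
To put the p x 64 kbps channel rates in perspective, the sketch below divides the uncompressed QCIF and CIF rates quoted on the resolution-format slide (9.115 and 36.46 Mbps at 29.97 frames/s) by the channel rate for a few values of p. It is plain arithmetic that assumes the full frame rate is kept; the function name and the chosen values of p are illustrative only.

```python
RAW_BPS = {"QCIF": 9.115e6, "CIF": 36.46e6}  # uncompressed rates from the slides
CHANNEL_BPS = 64_000                          # one 64 kbps channel


def required_ratio(raw_bps, p):
    """Compression ratio needed to fit raw_bps into p x 64 kbps."""
    if not 1 <= p <= 30:
        raise ValueError("H.261 defines p = 1, ..., 30")
    return raw_bps / (p * CHANNEL_BPS)


for p in (1, 2, 6, 30):
    ratios = ", ".join(
        f"{name} {required_ratio(bps, p):.0f}:1" for name, bps in RAW_BPS.items()
    )
    print(f"p = {p:2d} ({p * 64:4d} kbps): {ratios}")
```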

H.261

• Intraframe coding takes no advantage of redundancy between frames.
  • Intraframe coding yields the "reference frame" f0
  • each 8x8 block is transformed by DCT
  • DCT with the same quantization factor for all AC values
  • this factor may be adjusted by loopback filter
  • intraframes are rare (bandwidth! main application: videophone)
• Interframe coding (corresponds to P frames of MPEG) → motion estimation
  • interframes f1, f2, f3, ... are coded relative to f0 (differential encoding)
  • Search for a similar macroblock (16x16) in the previous image
  • Position of this macroblock defines the motion vector
  • Search range is up to the implementation:
    • max. ± 15 pixels
    • but: the motion vector may also always be 0 ("bad" software encoder)
    • e.g. H.261 also allows a simple implementation that considers only the differences between macroblocks located in the same position, thus a zero motion vector
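
The macroblock search described on this slide can be sketched as an exhaustive block-matching search. The sum of absolute differences (SAD) as matching criterion and the full search are assumptions chosen for illustration; the slide only says that the search range is up to the implementation. Function names are hypothetical, and frames are plain lists of rows of luminance values.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def find_motion_vector(cur, ref, bx, by, n=16, search=15):
    """Full search over displacements within +/- `search` pixels.

    Returns (dx, dy, cost) for the best-matching block in `ref`.
    Candidates reaching outside the frame are skipped. The zero vector
    is the starting candidate and is only replaced by a strictly
    cheaper match.
    """
    height, width = len(ref), len(ref[0])
    best = (0, 0, sad(cur, ref, bx, by, 0, 0, n))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best[2]:
                best = (dx, dy, cost)
    return best


# Toy example: a 12 x 12 frame whose content moves 2 pixels to the right.
ref = [[(3 * x + 7 * y) % 31 for x in range(12)] for y in range(12)]
cur = [[ref[y][x - 2] if x >= 2 else 0 for x in range(12)] for y in range(12)]
print(find_motion_vector(cur, ref, bx=4, by=4, n=4, search=3))
```

In this sketch the vector points from the block in the current frame to its match in the previous frame. Setting `search=0` gives the simple zero-motion-vector encoder mentioned on the slide, which only codes the difference between co-located macroblocks.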

Main Differences between H.261 and H.263

• Extension to H.261
• Max. bitrate: H.263 approx. 2.5 x H.261; lowest bitrates suitable for modems
• Base-level differences (always on)
  • No filter for HF noise in feedback loop
  • Motion vectors produced with 1/2-pixel resolution
  • Picture format for sub-QCIF (128x96)
  • Huffman tables designed specifically for low bit rates
  • JPEG is the still picture mode
• Optional-level differences (negotiated)
  • Unlimited search space for motion vectors → a fast encoder can do better
  • Syntax-based arithmetic coding
  • Advanced prediction mode
  • PB-frames (2 combined pictures: 1 B- and 1 P-frame)

Main Differences between H.261 and H.263

• N.B. H.261 is fully contained within H.263

[Figure: H.261 shown as a subset of H.263]

Source Image Formats

Format   Name                                        Pixels        H.261 Encoder/Decoder   H.263 Encoder/Decoder
SQCIF    Sub Quarter Common Intermediate Format      128 x 96      optional                required
QCIF     Quarter Common Intermediate Format          176 x 144     required                required
CIF      Common Intermediate Format                  352 x 288     optional                optional
4CIF     4 Times Common Intermediate Format          704 x 576     not defined             optional
16CIF    16 Times Common Intermediate Format         1408 x 1152   not defined             optional

Conclusion

JPEG
• Very general format with good compression ratio
• SW and HW for baseline mode available

H.261 / H.263
• Established standard in the telecom world
• Preferably realized in hardware

MPEG-1, MPEG-2, MPEG-4, MPEG-7
• MPEG-2 with data rates between 2 and 100 Mbps
• MPEG-4, MPEG-7: object coding, content description

Proprietary systems: QuickTime, DVI, CD-I, ...
• Products that make use of other standards
• Migration towards use of the standards

Encoding Rates of Various Standards

Standard                              Data Rate          Compression
JPEG (for video)                      10-20 Mbps         7-27 times
MPEG-1                                1.2-2.0 Mbps       100 times
H.261                                 64 kbps-2 Mbps     24 times
DVI                                   1.2-1.5 Mbps       160 times
CD-I                                  1.2-1.5 Mbps       100 times
MPEG-2                                4-60 Mbps          30-100 times
CCIR 723                              32-45 Mbps         3-5 times
CCIR 601/D-1                          140-270 Mbps       Reference
PictureTel SG3                        0.1-1.5 Mbps       100 times
Software compression (small window)   ~2 Mbps            6 times

N.B. For JPEG, 640 x 480 pixels, 24-bit colour and 15 fps were assumed.