Outline
Video/Image Compression
Still Image Compression
– JPEG / JPEG 2000
• “Joint Photographic Experts Group”
Video Compression
– H.261, H.263, H.263+, MPEG-1, MPEG-2, MPEG-4, MPEG-7, MPEG-21.
Transform coding
Encoder: Image block -> T -> Transform coefficients -> Q -> Zigzag Scan (2D->1D) -> Entropy coding -> Bitstream
Decoder: Bitstream -> Inverse entropy coding -> Inverse Zigzag Scan (1D->2D) -> Q-1 -> Reconstructed transform coefficients -> T-1 -> Reconstructed image block
Example of JPEG Coding (Encoder)
Steps: Transform coding (DCT) -> Quantization -> Zigzag Scan -> Entropy Coding (bit stream)

Image block:
52 55 61 66 70 61 64 73
63 59 66 90 109 85 69 72
62 59 68 113 144 104 66 73
63 58 71 122 154 106 70 69
67 61 68 104 126 88 68 70
79 65 60 70 77 68 58 75
85 71 64 59 55 61 65 83
87 79 69 68 65 76 78 94

DCT coefficients (after transform coding):
-415 -29 -62 25 55 -20 -1 3
7 -21 -62 9 11 -7 -6 6
-46 8 77 -25 -30 10 7 -5
-50 13 35 -15 -9 6 0 3
11 -8 -13 -2 -1 1 -4 1
-10 1 3 -3 -1 0 2 -1
-4 -1 2 -1 2 -3 1 -2
-1 -1 -1 -2 -1 -1 0 -1

Quantization table:
16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55
14 13 16 24 40 57 69 56
14 17 22 29 51 87 80 62
18 22 37 56 68 109 103 77
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99

Quantized coefficients (each DCT coefficient is divided by its table entry and rounded, e.g. -415/16 = -26):
-26 -3 -6 2 2 0 0 0
1 -2 -4 0 0 0 0 0
-3 1 5 -1 -1 0 0 0
-4 1 2 -1 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0

Zigzag scan (2D->1D):
-26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB

Entropy coding (number->binary):
1010110 0100 001 0100 0101 100001 0110 100011 001 100011 001 001 100101 11100110 110110 0110 11110100 000 1010
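The quantization and zigzag steps of this example can be sketched in a few lines of Python. This is a minimal illustration, not a JPEG implementation: the function names are made up for this sketch, the DCT coefficients and quantization table are copied from the example, and rounding uses Python's built-in round(), which rounds the single tie in this block (-20/40) to 0.

```python
# DCT coefficients and quantization table copied from the encoder example.
DCT = [
    [-415, -29, -62, 25, 55, -20, -1, 3],
    [7, -21, -62, 9, 11, -7, -6, 6],
    [-46, 8, 77, -25, -30, 10, 7, -5],
    [-50, 13, 35, -15, -9, 6, 0, 3],
    [11, -8, -13, -2, -1, 1, -4, 1],
    [-10, 1, 3, -3, -1, 0, 2, -1],
    [-4, -1, 2, -1, 2, -3, 1, -2],
    [-1, -1, -1, -2, -1, -1, 0, -1],
]
QTABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]


def zigzag_order(n=8):
    """(row, col) pairs along anti-diagonals, alternating direction."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Odd diagonals run top-right to bottom-left, even ones the other way.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))


def zigzag_scan(block):
    """Flatten a block in zigzag order, drop trailing zeros, append EOB."""
    seq = [block[r][c] for r, c in zigzag_order(len(block))]
    while seq and seq[-1] == 0:
        seq.pop()
    return seq + ["EOB"]


quantized = quantize(DCT, QTABLE)
sequence = zigzag_scan(quantized)
print(sequence)
```

Because most high-frequency coefficients quantize to zero, the zigzag order pushes them to the end of the sequence, where a single EOB symbol replaces them.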
Example of JPEG Coding (Decoder)
Steps: Inverse Entropy Coding (bit stream) -> Inverse Zigzag Scan -> Inverse Quantization -> Inverse Transform coding (DCT)

Bit stream:
1010110 0100 001 0100 0101 100001 0110 100011 001 100011 001 001 100101 11100110 110110 0110 11110100 000 1010

Inverse entropy coding (binary->number):
-26 -3 1 -3 -2 -6 2 -4 1 -4 1 1 5 0 2 0 0 -1 2 0 0 0 0 0 -1 -1 EOB

Inverse zigzag scan (1D->2D):
-26 -3 -6 2 2 0 0 0
1 -2 -4 0 0 0 0 0
-3 1 5 -1 -1 0 0 0
-4 1 2 -1 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0

Inverse quantization (each value is multiplied by its quantization table entry):
-416 -33 -60 32 48 0 0 0
12 -24 -56 0 0 0 0 0
-42 13 80 -24 -40 0 0 0
-56 17 44 -29 0 0 0 0
18 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0

Inverse transform (reconstructed image block):
58 64 67 64 59 62 70 78
56 55 67 89 98 88 74 69
60 50 70 119 141 116 80 64
69 51 71 128 149 115 77 68
74 53 64 105 115 84 65 72
76 57 56 74 75 57 57 74
83 69 59 60 61 61 67 83
93 81 67 62 69 80 84 84
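The inverse zigzag scan and inverse quantization steps of the decoder can be sketched as follows. The helper names are hypothetical, the quantization table is the one used in the encoder example, and the fifth value of the decoded sequence is taken as -2 so that it agrees with the coefficient matrix entry in row 2, column 2.

```python
QTABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

# Decoded number sequence, without the EOB marker.
SEQUENCE = [-26, -3, 1, -3, -2, -6, 2, -4, 1, -4, 1, 1, 5, 0,
            2, 0, 0, -1, 2, 0, 0, 0, 0, 0, -1, -1]


def zigzag_order(n=8):
    """(row, col) pairs along anti-diagonals, alternating direction."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))


def inverse_zigzag(seq, n=8):
    """Place the 1D sequence back into an n x n block; the rest stays zero."""
    block = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(seq, zigzag_order(n)):
        block[r][c] = value
    return block


def dequantize(block, table):
    """Multiply each quantized value by its quantization table entry."""
    return [[v * q for v, q in zip(brow, qrow)]
            for brow, qrow in zip(block, table)]


restored = inverse_zigzag(SEQUENCE)
dequantized = dequantize(restored, QTABLE)
for row in dequantized:
    print(row)
```

The dequantized values differ from the encoder's DCT coefficients (for example -416 instead of -415), which is where the loss in this coding scheme comes from.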
Main Ideas of Still Image Coding (Intra Coding)
Block-based coding
Transform coding (DCT)
Quantization
Zigzag scan
DPCM (Differential PCM)
Entropy coding (Variable-length coding)
– Huffman coding
– Run-length coding
– Arithmetic coding
Main Ideas of Video Coding (Inter Coding)
Intra coding
– Block-based coding, transform coding, quantization, zigzag scan, DPCM, entropy coding
Inter coding
– Intra coding for residual
– Motion estimation/compensation
Image/Video Redundancy
Spatial redundancy (figure: pixel values 253, 255)
Temporal redundancy (figure: block A in frame N-1, block B in frame N)
– Use A to code B
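Video Compression
• Encoder For Still Image
Image block -> T -> Transform coefficients -> Q -> Zigzag Scan (2D->1D) -> Entropy coding -> Bitstream
• Encoder For Video Sequence
Same chain, plus a feedback loop: Q-1 -> Reconstructed transform coefficients -> T-1 -> Reconstructed image block -> MC, whose output is subtracted (-) from the input image block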
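Results of DCT Coding (JPEG)
PSNR (Peak Signal-to-Noise Ratio) and MSE (Mean Square Error), with the sums taken over the pixels of the 256 × 256 image:
$$\mathrm{MSE} = \frac{1}{256^2} \sum_i \sum_j \left( x_{ij} - \hat{x}_{ij} \right)^2$$
$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} \ \mathrm{dB}$$
Result: $\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{36} = 32.6\ \mathrm{dB}$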
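As a quick check of the PSNR figure, the two definitions can be written out directly. This is a small sketch with made-up function names; images are assumed to be 8-bit, so the peak value is 255.

```python
import math


def mse(original, reconstructed):
    """Mean square error between two equally sized 2D pixel arrays."""
    total = 0
    count = 0
    for row_o, row_r in zip(original, reconstructed):
        for a, b in zip(row_o, row_r):
            total += (a - b) ** 2
            count += 1
    return total / count


def psnr(mse_value, peak=255):
    """PSNR in dB for a given MSE and peak pixel value."""
    return 10 * math.log10(peak ** 2 / mse_value)


# An MSE of 36 corresponds to about 32.6 dB.
print(round(psnr(36), 1))
```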
Results of Motion Compensation Coding
PSNR = 22.68 dB, MSE = 6.50, MAE = 25
Bits for motion vector = 1002 bits
Residual Image -> DCT Coding -> Coded Image
PSNR = 43.35 dB
Bit Rate = 21957 bits/frame
Compression ratio = (256 * 256 * 8) / 21957 = 23.9
ITU-T Recommendation H.261 (previously “CCITT Recommendation”)
Video Codec for Audiovisual Services at p×64 kbit/s
Geneva, 1990; revised at Helsinki, 1993
H.261 vs. p×64
Recommendation H.261 describes the video coding and decoding methods for the moving picture component of audiovisual services (videophone, videoconference, etc.) at rates of p×64 kbit/s, where p is in the range 1 to 30.
=> p×64 (called “p times sixty-four”) coder
H.261 vs. MPEG
The H.261 specification has already been implemented by several manufacturers. Its target is telecommunications at rates as low as 64 kbit/s. MPEG is defined for higher bit rates, 0.9 Mbit/s to 1.5 Mbit/s, and consequently for higher quality.
H.261
Video codec for audiovisual services
– ISDN videophone and video conferencing
– Low bit rates, low delay
1984: at m×384 kbit/s (m = 1, …, 5)
1988-90: at p×64 kbit/s (p = 1, …, 30)
Motion Estimation
For each 16*16 superblock (SB), ME searches for the best match in the reference frame and returns a motion vector MV = (X, Y).
Both X and Y have integer values not exceeding ±15.
Only the difference (residual) between the SB and the best match is DCT encoded.
Coding of Motion Vectors
Differential coding
VLC for MV difference
Example:
MVD        Code
…          …
-7 & 25    0000 0111
-6 & 26    0000 1001
-5 & 27    0000 1011
-4 & 28    0000 111
-3 & 29    0001 1
-2 & 30    0011
-1         011
0          1
1          010
2 & -30    0010
3 & -29    0001 0
4 & -28    0000 110
5 & -27    0000 1010
6 & -26    0000 1000
7 & -25    0000 0110
…          …
MV:   15 14 -13 12 …
MVD:  -1 -27 25 …
Code: 011 00001010 00000111 …
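The differential coding in this example can be reproduced with a short sketch. It only contains the part of the VLC table shown on the slide (differences -7 to 7 and their paired values), and the function names are invented for illustration. Since each codeword stands for a pair of differences 32 apart, the decoder keeps whichever candidate gives a vector inside ±15.

```python
# Excerpt of the MVD table from the slide (spaces removed from the codes).
MVD_VLC = {
    -7: "00000111", -6: "00001001", -5: "00001011", -4: "0000111",
    -3: "00011", -2: "0011", -1: "011", 0: "1", 1: "010", 2: "0010",
    3: "00010", 4: "0000110", 5: "00001010", 6: "00001000", 7: "00000110",
}
CODE_TO_MVD = {code: d for d, code in MVD_VLC.items()}


def wrap(diff):
    """Map a difference to its representative in -16..15 (d and d +/- 32 share a code)."""
    return (diff + 16) % 32 - 16


def encode_mvd(prev_mv, mv):
    """Codeword for one MV component given the preceding MB's component."""
    return MVD_VLC[wrap(mv - prev_mv)]  # KeyError outside the excerpt


def decode_mvd(prev_mv, code):
    """Recover the MV component; only one of the pair lands in -15..15."""
    d = CODE_TO_MVD[code]
    for candidate in (prev_mv + d, prev_mv + d - 32, prev_mv + d + 32):
        if -15 <= candidate <= 15:
            return candidate
    raise ValueError("no candidate within the permitted range")


mvs = [15, 14, -13, 12]
codes = [encode_mvd(a, b) for a, b in zip(mvs, mvs[1:])]
print(codes)  # ['011', '00001010', '00000111']

decoded = [mvs[0]]
for code in codes:
    decoded.append(decode_mvd(decoded[-1], code))
print(decoded)  # [15, 14, -13, 12]
```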
Motion Compensation(MC) & Motion Estimation (ME)
MC is optional for each MB (MTYPE => MB based).
Only one MV for each MB.
The ME compares a 16x16 superblock of the luminance (Y) component against a small search area of the previously transmitted image.
Both horizontal and vertical components of these motion vectors have integer values not exceeding ±15.
The MV is used for all 4 Y blocks. The MV for both Cb and Cr is derived by halving the component values of the MB MV.
[NOT in H.261] The displacement with the smallest absolute superblock difference, determined by the sum of the absolute values of the pel-to-pel differences throughout the block, is taken as the MV for that MB.
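A full-search block matcher using the sum of absolute differences (SAD) could look like the sketch below. It is one possible encoder strategy, not something H.261 prescribes; the function names, the frame layout (lists of rows) and the truncation toward zero when halving the vector for chrominance are assumptions of this sketch.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute pel-to-pel differences for displacement (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def full_search(cur, ref, bx, by, n=16, search=15):
    """Try every integer displacement within +/- search; keep the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best_mv = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference picture.
            if by + dy < 0 or bx + dx < 0:
                continue
            if by + dy + n > height or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost


def chroma_mv(mv):
    """Halve both components for Cb/Cr (truncation toward zero assumed)."""
    return (int(mv[0] / 2), int(mv[1] / 2))


# Toy 12x12 frames: the current frame is the reference shifted by (2, 1).
ref = [[12 * y + x for x in range(12)] for y in range(12)]
cur = [[ref[min(y + 1, 11)][min(x + 2, 11)] for x in range(12)] for y in range(12)]
mv, cost = full_search(cur, ref, bx=4, by=4, n=4, search=2)
print(mv, cost)  # (2, 1) 0
```

A real encoder would use n=16 and search=15 on full frames, which is why fast search algorithms matter for implementation.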
Quantization
The number of quantizers is 1 for the INTRA dc coefficient and 31 for all other coefficients.
Within a MB, the same quantizer is used for all coefficients except the INTRA dc one.
The equations for the quantizer can be written in terms of the MB quantization factor Q, sometimes termed MQUANT:
– C(u,v) = F(u,v) / 2Q if Q is odd
– C(u,v) = (F(u,v) ± 1) / 2Q if Q is even (+ for F > 0, - for F < 0)
Quantization for INTRA dc term:
– C = (F+4) / 8 with inverse F = 8C.
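The quantizer equations can be tried out with a small sketch. It reads the even-Q rule as C = (F ± 1) / 2Q with the sign following F, and it assumes that divisions truncate toward zero; neither detail is stated explicitly on the slide, and the function names are made up here.

```python
def quantize_intra_dc(f):
    """INTRA dc term: C = (F + 4) / 8, using integer division."""
    return (f + 4) // 8


def dequantize_intra_dc(c):
    """Inverse for the INTRA dc term: F = 8C."""
    return 8 * c


def quantize_ac(f, q):
    """All other coefficients, for a quantization factor Q in 1..31."""
    if not 1 <= q <= 31:
        raise ValueError("Q must be in 1..31")
    if q % 2 == 1:
        return int(f / (2 * q))            # Q odd: C = F / 2Q
    sign = (f > 0) - (f < 0)               # +1, 0 or -1
    return int((f + sign) / (2 * q))       # Q even: C = (F +/- 1) / 2Q


print(quantize_intra_dc(1020), dequantize_intra_dc(128))  # 128 1024
print(quantize_ac(20, 3), quantize_ac(21, 4), quantize_ac(-21, 4))  # 3 2 -2
```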
Loop Filter (FIL)
The filter is separable into one-dimensional horizontal and vertical functions.
The function is non-recursive with coefficients of ¼, ½, ¼ except at block edges.
The function has coefficients of 0, 1, 0 at block edges.
The filter is switched on/off for all 6 blocks in a MB according to MTYPE.
(Figure: filter taps ×¼, ×½, ×¼)
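The separable filter can be sketched as a 1D pass over rows followed by the same pass over columns. This is an illustration only: the function names are invented, the block is a list of rows, and the final rounding back to integer pixel values is left out.

```python
def filter_1d(values):
    """Taps 1/4, 1/2, 1/4 inside the block; 0, 1, 0 at the two edges."""
    out = list(values)  # edge samples are passed through unchanged
    for i in range(1, len(values) - 1):
        out[i] = (values[i - 1] + 2 * values[i] + values[i + 1]) / 4
    return out


def loop_filter(block):
    """Apply the 1D filter horizontally, then vertically."""
    rows = [filter_1d(row) for row in block]
    cols = [filter_1d(col) for col in zip(*rows)]   # columns of the row-filtered block
    return [list(row) for row in zip(*cols)]        # transpose back to rows


# A single bright pixel is spread over its 3x3 neighbourhood.
impulse = [[0] * 8 for _ in range(8)]
impulse[3][3] = 16
smoothed = loop_filter(impulse)
print(smoothed[3][3], smoothed[3][2], smoothed[2][2])  # 4.0 2.0 1.0
```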
Decoder
Source format
– Pictures are coded as luminance and two colour difference components (Y, Cb, and Cr).
CIF (Common Intermediate Format)
– Y: 352 × 288
– Cb, Cr: 176 × 144
Decoder
QCIF (Quarter-CIF)
– Y: 176 × 144
– Cb, Cr: 88 × 72
CIF for NTSC (National Television System Committee) input (MPEG SIF 525)
– Y: 352 × 240
– Cb, Cr: 176 × 120
All codecs must be able to operate using QCIF. Some codecs can also operate with CIF.
H.261 Video Formats
Video format   Luminance (Y)               Chrominance (Cb, Cr)
               pixels/line  lines/frame    pixels/line  lines/frame
CIF            352          288            176          144
QCIF           176          144            88           72
(Figure legend: Y pixel, Cb/Cr pixel, block boundary)
Arrangements of data structure in H.261
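QCIF picture: 176 × 144 pixels, divided into three GOBs (1, 2, 3).
GOB (Group Of Blocks): 176 × 48 pixels, containing 33 MBs numbered in three rows of 11:
1 2 3 4 5 6 7 8 9 10 11
12 13 14 15 16 17 18 19 20 21 22
23 24 25 26 27 28 29 30 31 32 33
MB (Macro Block): 16 × 16 luminance pixels, made up of four 8 × 8 Y blocks (Y1 Y2 / Y3 Y4), plus one 8 × 8 U block and one 8 × 8 V block.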
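The MB numbering inside a GOB maps to pixel positions by simple arithmetic. The sketch below assumes the layout in the figure (33 MBs of 16 × 16 pixels in three rows of 11); the function name is hypothetical.

```python
MB_SIZE = 16        # luminance pixels per MB side
MBS_PER_ROW = 11    # 176 / 16
MBS_PER_GOB = 33    # three rows of 11


def mb_origin_in_gob(mb_number):
    """Top-left luminance pixel (x, y) of MB 1..33 relative to its GOB."""
    if not 1 <= mb_number <= MBS_PER_GOB:
        raise ValueError("MB number must be in 1..33")
    row, col = divmod(mb_number - 1, MBS_PER_ROW)
    return col * MB_SIZE, row * MB_SIZE


print(mb_origin_in_gob(1), mb_origin_in_gob(12), mb_origin_in_gob(33))
# (0, 0) (0, 16) (160, 32)
```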
Data Structure of Compressed Bitstream in H.261
Picture layer: Picture Header | GOB data | … | GOB data
GOB layer: GOB Header | MB data | … | MB data
MB layer: MB Header | Block data | … | Block data
Block layer: TCOEFF | … | TCOEFF
(Figure legend: fixed length code, variable length code)
Structure of picture layer
Picture start code (PSC) (20 bits)
– 0000 0000 0000 0001 0000
Temporal reference (TR) (5 bits)
– TR is formed by incrementing its value in the previously transmitted picture header by one plus the number of non-transmitted pictures since the last transmitted one. (Only the five LSBs are used.)
PSC TR PTYPE PEI PSPARE… PEI … GOB data
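The temporal reference update amounts to a 5-bit counter. A minimal sketch, with a made-up function name:

```python
def next_temporal_reference(prev_tr, skipped_pictures):
    """TR of the next transmitted picture; only the five LSBs are kept."""
    return (prev_tr + 1 + skipped_pictures) & 0x1F


print(next_temporal_reference(5, 0))   # 6
print(next_temporal_reference(30, 2))  # 1 (33 wraps around modulo 32)
```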
Structure of picture layer
Type information (PTYPE) (6 bits), where Bit 1 is the MSB
– Bit 1: Split screen indicator
– Bit 2: Document camera indicator, “0” off, “1” on
– Bit 3: Freeze picture release, “0” off, “1” on
– Bit 4: Source format, “0” QCIF, “1” CIF
– Bit 5: Optional still image mode HI_RES, “0” on, “1” off
– Bit 6: Spare
Extra insertion information (PEI) (1 bit)
– “1” signals the presence of the following optional data field.
GOB Layer
Group of blocks start code (GBSC) (16 bits)
– 0000 0000 0000 0001 (if followed by “0000”, it is treated as a PSC)
Group number (GN) (4 bits)
– GN indicates the position of the group of blocks. 13, 14 and 15 are reserved for future use. 0 (0000) is used in the PSC.
GBSC GN GQUANT GEI GSPARE… GEI … MB data
GOB Layer
Quantizer information (GQUANT) (5 bits)
– The quantizer to be used in the GOB until overridden by any subsequent MQUANT.
Extra insertion information (GEI) (1 bit)
– “1” signals the presence of the following optional data field.
Spare information (GSPARE) (0/8/16… bits)
– If GEI = “1”, then the following 8 bits of data are GSPARE.
MB Layer
Macroblock address (MBA) (Variable length: TABLE 1)
– MBA indicates the position of a MB within a GOB. It is the difference between the absolute addresses of the MB and the last transmitted MB.
Type information (MTYPE) (Variable length: TABLE 2)
MBA MTYPE MQUANT MVD CBP Block data
MB Layer
Quantizer (MQUANT) (5 bits)
– MQUANT is present only if so indicated by MTYPE (1, 3, 6, 9).
MB Layer
Motion vector data (MVD) (Variable length: TABLE 3)
– MVD is obtained from the MV (for the MB) by subtracting the vector of the preceding MB. The vector of the preceding MB is regarded as zero in the following three situations:
• 1) evaluating MVD for MB 1, 12, 23.
• 2) evaluating MVD for MBs in which MBA does not represent a difference of 1.
• 3) MTYPE of the previous MB was not MC.
– Each MVD codeword stands for a pair of difference values. Only one of the pair will yield a MV falling within the permitted range.
MB Layer
Coded block pattern (CBP) (Variable length: TABLE 4)
– CBP is present if indicated by MTYPE (2, 3, 5, 6, 8, 9). The codeword gives a pattern number signifying those blocks in the MB for which at least one transform coefficient is transmitted.
– CBP = 32P1 + 16P2 + 8P3 + 4P4 + 2P5 + P6, where Pn = 1 if any coefficient is present for block n, else 0.
– Block numbering: 1 to 4 are the Y blocks (1 2 / 3 4), 5 is Cb, 6 is Cr.
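The pattern number is a 6-bit mask with block 1 as the most significant bit. The sketch below computes it from the six flags P1..P6 and recovers the coded block numbers again; the function names are invented, and the mapping from pattern number to VLC codeword (TABLE 4) is not covered.

```python
def cbp_from_flags(flags):
    """CBP = 32*P1 + 16*P2 + 8*P3 + 4*P4 + 2*P5 + P6 for six 0/1 flags."""
    weights = (32, 16, 8, 4, 2, 1)
    return sum(w for w, p in zip(weights, flags) if p)


def coded_blocks(cbp):
    """Block numbers (1..6) that have at least one transmitted coefficient."""
    return [n for n in range(1, 7) if cbp & (1 << (6 - n))]


print(cbp_from_flags([1, 1, 1, 1, 0, 0]))  # 60: all four Y blocks, no chroma
print(coded_blocks(60))                    # [1, 2, 3, 4]
```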
Block Layer
Transform coefficients (TCOEFF) (Variable length: TABLE 5)
– TCOEFF is always present for all six blocks in a MB when MTYPE indicates INTRA. In other cases MTYPE and CBP signal which blocks have coefficient data transmitted for them.
– The most commonly occurring combination of successive zeros (RUN) and the following value (LEVEL) are encoded with variable length codes in TABLE 5. Other combinations of (RUN, LEVEL) are encoded with a 20-bit word consisting of 6 bits ESCAPE, 6 bits RUN and 8 bits LEVEL.
Block Layer
There are two code tables in TABLE 5:
– 1) Used for the first transmitted LEVEL in INTER, INTER+MC, and INTER+MC+FIL blocks (EOB is not included).
– 2) Used for all other LEVELs (EOB is included), except the first one in INTRA blocks, which is fixed-length coded with 8 bits.
Coefficients after the last non-zero one are not transmitted. EOB is always the last item in blocks for which coefficients are transmitted.
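The (RUN, LEVEL) representation and the 20-bit escape word can be illustrated as follows. The VLC entries of TABLE 5 are not reproduced. The ESCAPE bit pattern 0000 01 and the two's-complement coding of LEVEL are taken from my reading of the Recommendation rather than from the slides, so treat them as assumptions; the function names are made up.

```python
ESCAPE = "000001"  # 6-bit escape prefix (assumed value, see lead-in)


def run_level_pairs(coeffs):
    """Turn a zigzag-ordered coefficient list into (RUN, LEVEL) pairs.

    RUN counts the zeros preceding each non-zero LEVEL. Zeros after the
    last non-zero coefficient produce no pair; EOB follows instead.
    """
    pairs = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs


def escape_word(run, level):
    """20-bit word: 6 bits ESCAPE, 6 bits RUN, 8 bits LEVEL."""
    if not 0 <= run <= 63:
        raise ValueError("RUN must fit in 6 bits")
    if level == 0 or not -127 <= level <= 127:
        raise ValueError("LEVEL must be non-zero and fit in 8 bits")
    return ESCAPE + format(run, "06b") + format(level & 0xFF, "08b")


print(run_level_pairs([5, 0, 0, -2, 1, 0, 0, 0]))  # [(0, 5), (2, -2), (0, 1)]
print(escape_word(3, -2))                          # 00000100001111111110
```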
Structure of H.261 Bitstream
PSC TR PTYPE PEI PSPARE… PEI … GOB data
GBSC GN GQUANT GEI GSPARE… GEI … MB data
MBA MTYPE MQUANT MVD CBP Block data …
…
Coding of H.261 Bitstream
(Syntax diagram; the figure distinguishes fixed-length from variable-length fields.)
Picture Layer: PSC, TR, PTYPE, PEI, PSPARE, then GOB Layer
GOB Layer: GBSC, GN, GQUANT, GEI, GSPARE, then MB Layer
MB Layer: MBA (or MBA stuffing), MTYPE, MQUANT, MVD, CBP, then Block Layer
Block Layer: TCOEFF, EOB
H.263
H.263 = (H.261) + (MPEG-like features)
Compared to H.261
– More allowable picture formats
– Half-pixel motion estimation, no loop filter
– Different VLC tables at macroblock and block levels
– Four negotiable options
3~4 dB better PSNR than H.261 at <64 kbps
H.263 Video Formats
Format      Sub-CIF  QCIF  CIF  4CIF  16CIF
Pels/line   128      176   352  704   1408
Lines       96       144   288  576   1152
Four Negotiable Options
Unrestricted Motion Vector: motion vectors can point outside the picture, with a range of -31.5 to 31.5 instead of -16 to 15.5
Advanced Prediction Mode: 8×8 motion vectors, overlapped block motion compensation, and motion vectors can point outside the picture
Syntax-based Arithmetic Coding (about 5% decrease in bit rate)
PB-frame
H.263+ 12 Optional Modes
Annex D: New Unrestricted Motion Vector (mv range up to +/- 256)
Annex I: Advanced Intra Coding
Annex J: Deblocking Filter
Annex M: Improved PB-Frame
Annex O: Temporal, Spatial, and SNR Scalability
Annex P: Reference Picture Resampling
Annex Q: Reduced Resolution Update
H.263+ Optional Modes
Annex S: Alternative Inter VLC
Annex T: Modified Quantization
Error Resilience
– Annex K: Slice Structured
– Annex R: Independent Segment Decoding
– Annex N: Reference Picture Selection
Codec Implementation Issues
Fast algorithm for motion estimation
Fast algorithm for DCT/IDCT
Huffman table implementation
Program design
– Program diagram
– Memory access (frame stores)
– Register assignment
– Program redundancy
Supplemental Enhancement Information
Enhanced features
Picture freeze and release
Tagging information
– Snapshot
– Video segment start/end
– Progressive refinement start/end
Chroma key
Can be discarded by decoders that do not understand it
H.263++ and H.263L
H.263++ (year 2000)
– Backward compatible with H.263 and H.263+
– Technical proposals on
• Error resilience
• 4×4 motion compensation and transform
• Adaptive quantization
• Long-term/background memory
• De-blocking and de-ringing filters
• …
H.263L (year 2002)
– Not necessarily backward compatible with H.263-type encoders