1
H.264 Advanced Video CODEC
IC-SOC, Hawaii, November 22, 2004
Youn-Long Lin
Department of Computer Science
National Tsing Hua University
2
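Video Coding Flow
[Figure: block diagram of the coding flow]
RGB -> Color Space Conversion -> YUV
YUV -> Prediction (Redundancy Reduction) -> Residual
Residual -> Transformation To Frequency Domain -> Coefficients
Coefficients -> Quantization -> Large Coefficients
Large Coefficients -> Entropy Coding -> Bit Stream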
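As a rough illustration of the first stage of the flow, the sketch below converts one RGB pixel to YUV. The BT.601 weights and the function name `rgb_to_yuv` are assumptions; the slides do not say which conversion matrix is used.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using BT.601 weights (assumed, not from the slides)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma: weighted sum of R, G, B
    u = 0.492 * (b - y)                     # blue colour difference
    v = 0.877 * (r - y)                     # red colour difference
    return y, u, v


if __name__ == "__main__":
    # A grey pixel carries no colour difference, so U and V are close to zero.
    print(rgb_to_yuv(128, 128, 128))
```

Most of the signal energy ends up in Y, which is why the later stages treat luma and chroma separately.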
3
Video Coding Standards
MPEG-2 Video:
  - Big success in digital TV, DVD, ...
  - Designed for high bit-rates > 1.5 Mbit/s
  - Not suitable for wireless applications
H.263/MPEG-4 Video:
  - Designed also for low bit-rates < 100 kbit/s
  - Contains some network adaptation features
  - Suitable for wireless applications (chosen by 3GPP)
H.26L: New ITU-T Q.6/SG16 (VCEG) project
  - Superior design for all bit-rates and error resilience
  - Contains improved network adaptation
  - Will replace H.263/MPEG-4 Video in wireless?
  - Also called H.264 or MPEG-4 Part 10
4
Standard | MPEG-1 | MPEG-2 | MPEG-4 | H.264
MB size | 16*16 | 16*16 (frame), 16*8 (field) | 16*16 | 16*16
Block size | 8*8 | 8*8 | 16*16, 8*8, 16*8 | 8*8, 8*16, 16*8, 16*16, 4*8, 8*4, 4*4
Transform | DCT | DCT | DCT/Wavelet | 4*4 int transform
Quant. step size | constant increment | constant increment | Vector quant. | Increase at the rate of 12.5%
Entropy coding | VLC | VLC | VLC | VLC, CAVLC and CABAC
ME, MC | Yes | Yes | Yes | 16 MVs per MB
Pixel accuracy | ½ pel | ½ pel | ¼ pel | ¼ pel
Profiles | No | 5 profiles | 8 profiles | 3 profiles
Ref frame | One frame | One frame | One frame | 5 frames
Picture type | I, P, B, D | I, P, B | I, P, B | I, P, B, SI, SP
Transmission rate | Up to 1.5 Mbps | 2-15 Mbps | 64 kbps ~ 2 Mbps | 64 kbps ~ 150 Mbps
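The H.264 quantizer step size entry can be made concrete with a small sketch. It assumes the step-size table commonly given for H.264, in which six base values repeat and double every 6 QP; the names `QSTEP_BASE` and `qstep` are illustrative only. Doubling every 6 steps corresponds to an average growth of 2^(1/6), about 12% per QP step, close to the figure quoted in the table.

```python
# Base step sizes for QP 0..5 (assumed from the commonly cited H.264 table).
QSTEP_BASE = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qstep(qp):
    """Return the quantizer step size for a QP in the range 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in 0..51")
    # The six base values repeat, doubling once for every 6 QP steps.
    return QSTEP_BASE[qp % 6] * (1 << (qp // 6))


if __name__ == "__main__":
    for qp in (0, 6, 12, 28, 51):
        print(qp, qstep(qp))
```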
5
H.264 Profiles
[Figure: diagram of coding tools by profile]
Profiles: Baseline profile, Main profile, Extended profile
Coding tools shown: I slices, P slices, B slices, SP and SI slices, CAVLC, CABAC, Weighted prediction, Interlace, Data partitioning, Slice Group and ASO, Redundant Slices
6
H.264 Decoding Profile
[Pie chart]
MC: 18%
Pic Rec: 3%
DF: 15%
Intra Pred: 10%
IDCT/IQ: 10%
Frame Level Decoding: 20%
CABAC: 24%
7
H.264 Decoder Block Diagram
[Figure: block diagram; legend: Software, Hardware, Memory]
Input: H.264 stream, raw stream
Processing blocks: VLC, CABAC, MC, Intra pred, IDCT/IQ, Pic Rec, DF, Picture manage
Memories: Para mem, MB info mem, Coeff mem, Pred mem, Residual mem, reconstruct mem, unfilter mem, MV mem, Ref idx mem, picnum mem, refMB mem, Ref frame mem
8
Entropy Decoding
H.264/AVC entropy coding methods
  - VLC
  - CAVLC
  - CABAC
CABAC can save up to 7% of bit-rate in comparison with CAVLC
10
Variable Length Coding
  - Exp-Golomb code is used universally for all symbols
  - VLC with regular construction
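The regular construction of the Exp-Golomb code makes decoding very short: count leading zeros, then read that many suffix bits. The sketch below shows the unsigned ue(v) and signed se(v) mappings; the `BitReader` class and the function names are illustrative helpers, not part of the presented design.

```python
class BitReader:
    """Minimal MSB-first bit reader over a bytes object (illustrative helper)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current bit position

    def read_bit(self):
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def read_bits(self, n):
        value = 0
        for _ in range(n):
            value = (value << 1) | self.read_bit()
        return value


def read_ue(br):
    """Decode one unsigned Exp-Golomb code word: N zeros, a 1, then N info bits."""
    leading_zeros = 0
    while br.read_bit() == 0:
        leading_zeros += 1
    return (1 << leading_zeros) - 1 + br.read_bits(leading_zeros)


def read_se(br):
    """Decode one signed Exp-Golomb code word: code numbers 1, 2, 3, 4 map to +1, -1, +2, -2."""
    k = read_ue(br)
    magnitude = (k + 1) >> 1
    return magnitude if k & 1 else -magnitude


if __name__ == "__main__":
    # Bit string 1 | 010 | 011 | 00100 (zero padded) holds code numbers 0, 1, 2, 3.
    br = BitReader(bytes([0xA6, 0x40]))
    print([read_ue(br) for _ in range(4)])
```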
11
VLC decoding
[Figure: flowchart]
Input: bit stream & syntax element
Decisions: syntax = u(v)? ; syntax = (se(v) | me(v))? ; syntax = me(v)? (each with Yes / No branches)
Steps: Get VLC length, Code Num, linfo_ue (Exp-Golomb), linfo_se, linfo_cbpinter/intra
Outputs: me, se, ue, u_v
Signal: me_disable = 1
13
CABAC algorithm
[Figure: flowchart, boxes listed in source order]
  - Init new slice
  - Initialize context table
  - Get 2 bytes from bit stream
  - Initialize codIOffset, codIRange
  - Decide next syntax element to be decoded
  - Decide context from neighbor syntax element
  - Init decode new macroblock
  - Normal decoding process
  - Get one byte from bit stream
  - Bypass decoding process
  - Terminal decoding process
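Two of the three decoding processes named in the flowchart, bypass and terminal, are small enough to sketch. The code below follows the arithmetic decoding engine as described in the H.264 standard (9-bit initialization of the offset, range starting at 510), which is an assumption here: the presented hardware fetches 2 bytes at initialization and its internals are not shown on the slide. The normal (context-coded) process is omitted because it needs the standard's state and range tables. The class and method names are illustrative.

```python
class CabacBypassDecoder:
    """Sketch of the CABAC decoding engine, bypass and terminate paths only."""

    def __init__(self, data):
        # MSB-first bit iterator over the input bytes.
        self.bits = iter(int(b) for byte in data for b in format(byte, "08b"))
        self.cod_i_range = 510                   # initial range
        self.cod_i_offset = self._read_bits(9)   # initial offset: first 9 bits

    def _read_bits(self, n):
        value = 0
        for _ in range(n):
            # Assumption: reading past the end of the data yields zero bits.
            value = (value << 1) | next(self.bits, 0)
        return value

    def decode_bypass(self):
        """Decode one equiprobable bin: shift in one bit, compare with the range."""
        self.cod_i_offset = (self.cod_i_offset << 1) | self._read_bits(1)
        if self.cod_i_offset >= self.cod_i_range:
            self.cod_i_offset -= self.cod_i_range
            return 1
        return 0

    def decode_terminate(self):
        """Decode the end-of-slice style bin; renormalize only when it is 0."""
        self.cod_i_range -= 2
        if self.cod_i_offset >= self.cod_i_range:
            return 1
        while self.cod_i_range < 256:
            self.cod_i_range <<= 1
            self.cod_i_offset = (self.cod_i_offset << 1) | self._read_bits(1)
        return 0


if __name__ == "__main__":
    dec = CabacBypassDecoder(bytes([0x7F, 0x80]))
    print(dec.decode_bypass(), dec.decode_bypass(), dec.decode_terminate())
```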
14
Handshaking
[Figure: handshaking timing diagram]
Read parameter: start_read_pa, end_read_pa
Build Table: Init_slice, end_buld
Decoder: Init_cabac, end_cabac, end_slice
Read & write 24 macroblock: Init_rw, end_rw_macroblock
Read & write 24 macroblock: end_rw_coeff
Annotation: 90 clock cycle
15
CABAC Experimental results
  - Gate count: 138,226 gates
  - 200 MHz, based on TSMC 0.13 µm standard cell library
  - 2 to 3 cycles to generate one bit of data
  - Sufficient to decode Main profile CIF video stream at 30 fps
17
Index of Sub_Blocks
Scanning order of residual blocks within a macroblock (16x16 pixels)
Block -1 (16x16 Intra mode only)
Luma (16x16), 4x4 blocks 0 to 15:
   0  1  4  5
   2  3  6  7
   8  9 12 13
  10 11 14 15
Blocks 16 and 17: 2x2 blocks, coefficient positions 00, 01, 10, 11
Cb (8 x 8), 4x4 blocks 18 to 21:
  18 19
  20 21
Cr (8 x 8), 4x4 blocks 22 to 25:
  22 23
  24 25
Coefficient positions within a 4x4 block:
  00 01 02 03
  10 11 12 13
  20 21 22 23
  30 31 32 33
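The luma numbering in this scanning order is a 2x2 scan nested inside a 2x2 scan, so a block index maps to a position with a few bit operations. The sketch below is one way to write that mapping; the function name `luma_block_position` is illustrative.

```python
def luma_block_position(idx):
    """Map a luma 4x4 block index (0..15) to its (row, col) within the macroblock."""
    if not 0 <= idx <= 15:
        raise ValueError("luma 4x4 block index must be 0..15")
    # Bits 0 and 2 of the index select the column, bits 1 and 3 select the row.
    col = (idx & 1) | ((idx >> 1) & 2)
    row = ((idx >> 1) & 1) | ((idx >> 2) & 2)
    return row, col


if __name__ == "__main__":
    grid = [[0] * 4 for _ in range(4)]
    for idx in range(16):
        r, c = luma_block_position(idx)
        grid[r][c] = idx
    for line in grid:
        print(line)
```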
18
16-Way Parallel Inverse Transform
[Figure: datapath diagram]
Inputs: 16 coefficients c00 to c33
Row by row stage: 16 adders, with 1/2 scaling on 8 of the inputs
Reorder: coefficients regrouped for the column pass (c00 c20, c10 c30, c01 c21, c11 c31, c02 c22, c12 c32, c03 c23, c13 c33)
Column by column stage: 16 adders, with 1/2 scaling on 8 of the inputs
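In software terms, the row pass, the 1/2 scalers and the column pass of the datapath correspond to the 4x4 inverse integer transform of H.264. The sketch below follows the butterfly as defined in the standard and processes one row or column at a time rather than 16 values in parallel; it is not a model of the presented hardware, and the function names are illustrative.

```python
def _butterfly(a, b, c, d):
    """One 1-D inverse transform; the >> 1 terms are the 1/2 scalers."""
    e0 = a + c
    e1 = a - c
    e2 = (b >> 1) - d
    e3 = b + (d >> 1)
    return [e0 + e3, e1 + e2, e1 - e2, e0 - e3]


def inverse_transform_4x4(w):
    """Apply the inverse transform to a 4x4 list of rescaled coefficients."""
    # Row by row pass.
    rows = [_butterfly(*row) for row in w]
    # Column by column pass; cols[j][i] is the value at row i, column j.
    cols = [_butterfly(*(rows[i][j] for i in range(4))) for j in range(4)]
    # Final rounding and scaling by 64.
    return [[(cols[j][i] + 32) >> 6 for j in range(4)] for i in range(4)]


if __name__ == "__main__":
    dc_only = [[64, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    print(inverse_transform_4x4(dc_only))
```

A DC-only block spreads evenly over all 16 output samples, which is a quick sanity check for any implementation.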
19
Synthesis Report
Synthesized using the Synopsys Design Compiler with TSMC 0.13 μm standard cell library
Module: "itrans & rescale"
Frequency: 100 MHz
Gate Count: 53,623
Power Consumption (Dynamic Power): 17.7 mW
24
H.264 Motion Compensation
Variable block size:
  Macroblock types: 16x16, 16x8, 8x16, 8x8
  Sub-macroblock types: 8x8, 8x4, 4x8, 4x4
26
6-tap Filter Block Diagram
[Figure: datapath diagram]
Inputs: pixel_1 to pixel_6, each multiplied by its coefficient (coeff1 to coeff6)
Structure: six multipliers, pipeline registers after each stage, and an adder tree that sums the six products
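The arithmetic performed by such a filter can be sketched in a few lines. The tap values (1, -5, 20, 20, -5, 1) and the rounding are those the H.264 standard defines for luma half-pel interpolation; the slide does not list its coeff1 to coeff6 values, so treating them as these taps is an assumption, as are the names below.

```python
# Luma half-pel filter taps from the H.264 standard (assumed for coeff1..coeff6).
TAPS = (1, -5, 20, 20, -5, 1)


def half_pel(pixels):
    """Interpolate the half-pel sample between pixels[2] and pixels[3]."""
    if len(pixels) != 6:
        raise ValueError("the filter needs exactly six input pixels")
    acc = sum(c * p for c, p in zip(TAPS, pixels))
    # Round, divide by 32 (the taps sum to 32) and clip to the 8-bit range.
    return min(255, max(0, (acc + 16) >> 5))


if __name__ == "__main__":
    print(half_pel([100] * 6))
    print(half_pel([10, 20, 30, 40, 50, 60]))
```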
27
Pipeline Scheduling
[Figure: pipeline schedule, the same stages repeated with an offset for successive blocks]
Stages: Picture ordering, Motion vector generation, Get reference frame, Interpolation, Weight prediction
Memory accesses shown: Read MV & RefIdx Memory, Write MV & RefIdx Memory
28
MC Memory Usage
Global memory
  - Parameter information memory
  - CABAC memory
  - Reference frame memory
Local memory (total 26,112 bits)
  - mv memory (2 x 16 x 23 x 16 = 11,776 bits)
  - Reference pixel memory (2 x 8 x 16 x 16 = 4,096 bits)
  - Down pixel memory (2 x 8 x 3 x 16 = 768 bits)
  - Left pixel memory (2 x 8 x 2 x 16 = 512 bits)
  - Right pixel memory (2 x 8 x 3 x 16 = 768 bits)
  - Temporal pixel memory (2 x 8 x 16 x 16 = 4,096 bits)
  - Predict pixel memory (2 x 8 x 16 x 16 = 4,096 bits)
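The local memory budget can be checked mechanically. The short script below multiplies out the dimensions given for each memory and sums them; the dictionary keys are shortened labels chosen here, and the meaning of the individual factors is not stated on the slide.

```python
from math import prod

# Dimensions of each local memory, copied from the list above.
LOCAL_MEMORIES = {
    "mv": (2, 16, 23, 16),
    "reference pixel": (2, 8, 16, 16),
    "down pixel": (2, 8, 3, 16),
    "left pixel": (2, 8, 2, 16),
    "right pixel": (2, 8, 3, 16),
    "temporal pixel": (2, 8, 16, 16),
    "predict pixel": (2, 8, 16, 16),
}

# Size in bits of each memory, and the overall total.
sizes = {name: prod(dims) for name, dims in LOCAL_MEMORIES.items()}
total_bits = sum(sizes.values())

if __name__ == "__main__":
    for name, bits in sizes.items():
        print(f"{name}: {bits} bits")
    print("total:", total_bits, "bits")
```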
29
MC Synthesis Report
Synthesized using the Synopsys Design Compiler with TSMC 0.13 μm standard cell library
Frequency: 100 MHz
Gate Count: 78,056
Power Consumption: 156 mW
31
Deblocking Filter Introduction
Deblocking filter can achieve up to 9% bit-rate saving without degrading video quality
Reference from EBU Technical Review
32
Deblocking Filter Algorithm
The deblocking filter process can be divided into
  - horizontal filter across vertical edges
  - vertical filter across horizontal edges
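The two passes imply a fixed edge order within a macroblock. The sketch below only enumerates that order for the luma block, assuming edges on a 4-sample grid with vertical edges processed left to right before horizontal edges top to bottom, as in the H.264 standard. The filtering itself (boundary strength and threshold decisions) is not shown, and the function name is illustrative.

```python
def luma_edge_order(mb_size=16, grid=4):
    """List the luma edges of one macroblock in filtering order."""
    order = []
    # Pass 1: horizontal filtering across each vertical edge, left to right.
    for x in range(0, mb_size, grid):
        order.append(("vertical", x))
    # Pass 2: vertical filtering across each horizontal edge, top to bottom.
    for y in range(0, mb_size, grid):
        order.append(("horizontal", y))
    return order


if __name__ == "__main__":
    for direction, position in luma_edge_order():
        print(direction, "edge at sample", position)
```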
33
Horizontal Filter across Vertical Edges
[Figure: Macroblock N, a 16x16 luma block and two 8x8 chroma blocks, with positions numbered 0 to 95]
34
Vertical Filter across Horizontal Edges
[Figure: Macroblock N, a 16x16 luma block and two 8x8 chroma blocks, with positions numbered 0 to 95]
36
Integration Overview
MB decoder IP for H.264/AVC
IP integration
  - CABAC
  - MC, Intra Prediction
  - IDCT/IQ
  - Picture Reconstruction
  - DF
  - Main Controller
AMBA interface
FPGA prototyping, HW/SW Co-Verification
Main Profile; CIF 30 fps
37
MB Decoder FSM
[Figure: state diagram]
States: Start, Slice, MB_type, "CABAC, I_decode", "CABAC, PB_decode", I_decode, PB_decode, Slice done
Transition labels: CABAC, I_MB, PB_MB, End_MB, Last_MB? (NO / Yes)
38
Current Progress
IP integration
  - Main Controller, Deblocking Filter
  - Picture Reconstruction
HW/SW Co-Simulation for reference software and deblocking filter
FPGA prototyping for CABAC