Combined scalability coding based on the scalable extension of H.264/AVC
Sangseok Park, PhD candidate
June 13, 2008
2
Abstract
Scalable Video Coding (SVC) of H.264/AVC
  Annex G of the latest SVC draft [JVT-X201]
  Spatial, temporal, and quality scalabilities
  Final Draft International Standard (FDIS) in July 2007
Simplified Fine-Granular Scalability (FGS) [JVT-W111]
  Combination of the significant and refinement coding passes
  Introduction of a code type method
  SVC phase 2
Bit-Depth Scalability (BDS) design
  Based on Inter-Layer Prediction (ILP) [Schwarz07]
  Reverse tone-mapping process
  Low-complexity motion-compensation structure
  SVC phase 2
[JVT-X201] T. Wiegand et al., “Joint Draft ITU-T Rec. H.264 | ISO/IEC 14496-10 / Amd.3 Scalable video coding,” JVT Document JVT-X201, Geneva, Switzerland, Jul. 2007.
[JVT-W111] M. Karczewicz, S. Park, and H. Chung, “Report of core experiment on FGS simplification (CE1),” JVT Document JVT-W111, San Jose, California, Apr. 2007.
[Schwarz07] H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the Scalable Video Coding Extension of the H.264/AVC standard,” IEEE Trans. CSVT, vol. 17(9), pp. 1103-1120, Sept. 2007.
3
Scalability: one bit-stream can adapt itself to networks or terminals by dropping or truncating parts of the bit-stream, without severe degradation of the content [HHI website]
  Good coding efficiency
  Low decoder complexity
  Much better than simulcast, the simplest scalability approach, which combines several independent bit streams
[HHI website] HHI's Image Communication, “The Scalable Video Coding Amendment of the H.264/AVC Standard,” http://ip.hhi.de/imagecom_G1/savce/.
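The idea of adapting one bit-stream by dropping parts of it can be sketched as below. This is only an illustration: the `Packet` record, its three layer ids, and the `extract` helper are hypothetical names chosen for this sketch and are not the SVC bit-stream syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet record; the ids mirror the three scalability axes."""
    spatial: int
    temporal: int
    quality: int
    payload: bytes


def extract(stream, max_spatial, max_temporal, max_quality):
    """Keep only the packets at or below the requested operating point."""
    return [
        p for p in stream
        if p.spatial <= max_spatial
        and p.temporal <= max_temporal
        and p.quality <= max_quality
    ]


stream = [
    Packet(0, 0, 0, b"base"),
    Packet(0, 1, 0, b"base-t1"),
    Packet(0, 0, 1, b"base-q1"),
    Packet(1, 0, 0, b"enh-s1"),
    Packet(1, 1, 1, b"enh-s1-t1-q1"),
]

# A terminal that only handles the base layer drops everything else.
base_only = extract(stream, 0, 0, 0)
# A terminal that wants the full frame rate at base resolution and quality.
low_res_full_rate = extract(stream, 0, 1, 0)
```

A simulcast system would instead store one complete, independent stream per operating point, which is why the slide describes it as the simplest but less efficient approach.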
4
Quality scalability
CGS (Coarse Grain Scalability) coding
  Small number of bit-extraction points
  Coarse bit rate variation
  Quantization steps decrease from the SNR base layer to the enhancement layer
FGS (Fine Grain Scalability) coding
  Progressive refinement (PR) slices
  Can be truncated at arbitrary extraction points
  A cyclical scanning order of transform coefficients is used instead of the zig-zag scanning order
  High complexity
[Figure labels: Macroblock (MB), Slice]
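A rough sketch of what "decreasing quantization steps" means for CGS: the base layer quantizes coefficients with a coarse step, and the enhancement layer re-quantizes the remaining error with a finer step. The step sizes, the sample coefficients, and the plain uniform quantizer are assumptions made for illustration; the actual H.264/AVC quantizer and SVC inter-layer residual prediction are more involved.

```python
import numpy as np


def quantize(x, step):
    """Uniform quantizer: map values to integer levels."""
    return np.round(x / step).astype(np.int64)


def dequantize(levels, step):
    """Reconstruct values from integer levels."""
    return levels * step


coeffs = np.array([37.0, -12.5, 4.2, -0.8, 91.3])  # illustrative transform coefficients
base_step, enh_step = 16.0, 4.0                    # illustrative step sizes

# Base layer: coarse quantization.
base_rec = dequantize(quantize(coeffs, base_step), base_step)

# Enhancement layer: quantize what the base layer missed, with a finer step.
enh_levels = quantize(coeffs - base_rec, enh_step)
enh_rec = base_rec + dequantize(enh_levels, enh_step)

base_err = np.abs(coeffs - base_rec).max()
enh_err = np.abs(coeffs - enh_rec).max()
```

Decoding only the base layer gives the coarse reconstruction; adding the enhancement layer bounds the error by half of the finer step.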
5
SVC encoder block diagram
[Figure: H.264/AVC compatible scalable video encoder. Block labels: input video; 2D spatial decimation; base layer, 1st spatial enhancement layer, and 2nd spatial enhancement layer, each with hierarchical motion-compensated prediction (texture and motion outputs), transform/quantization, entropy encoder, and progressive refinement coding (FGS) producing an FGS enhancement layer; inter-layer prediction in intra, motion, and residual; H.264/AVC compatible base layer bitstream; spatial scalability; CGS; multiplex; scalable bit-stream.]
6
Simplified FGS encoder block diagram
EOB: End of block, VLC: Variable length coder, CBP: Coded block pattern, BL: Base layer, EL: Enhancement layer
[Figure: flow chart from a frame of the FGS layer to the bit-stream. Block labels:
  Prescan MBs to find the VLC code table for coding of significant coefficients
  Prescan MBs to find refinement coefficients (SIGNIFICANT and NOT CODED coefficients are refinement coefficients)
  Encode MB header (luma scan index == 0)
  Encode luma and chroma CBP (VLC)
  Encode luma and chroma significant and refinement coefficients in case CBP4x4 != 0
  Encode luma refinement coefficients
  EOB shift array, statistically obtained; select the best codebook out of 5
  VLC writer (three instances), each writing to the bit-stream
  One part of the original flow is marked "This part is skipped for simplification"]
Combining refinement coefficients:
  Sign of BL = coeffBL < 0 ? 1 : 0
  Sign of EL = coeffEL < 0 ? 1 : 0
  Sign = XOR(sign of BL, sign of EL)
  Symbol = Sign ? 2 : 1
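The rule for combining refinement coefficients can be written out as a small function. This is a direct transcription of the four lines on the slide; the function name and argument names are chosen here for illustration only.

```python
def refinement_symbol(coeff_bl: int, coeff_el: int) -> int:
    """Return the refinement symbol from the base-layer and enhancement-layer coefficients."""
    sign_bl = 1 if coeff_bl < 0 else 0   # Sign of BL
    sign_el = 1 if coeff_el < 0 else 0   # Sign of EL
    sign = sign_bl ^ sign_el             # XOR of the two sign bits
    return 2 if sign else 1              # Symbol = Sign ? 2 : 1
```

The symbol is 1 when the refinement has the same sign as the base-layer coefficient and 2 when the signs differ, so a single symbol carries the refinement sign relative to the base layer.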
7
Results for FGS (Fine Granular Scalability)
  Performed on the basis of JSVM 7.10 in C++ [JSVM]
  The proposed method was verified and accepted into the SVC extension of H.264/AVC at the 23rd JVT meeting, San Jose, CA [JVT-W200]
  The average improvement over all tested CIF sequences is a 0.46% bitrate reduction, while the complexity of the original FGS encoder is reduced so much that the high-level syntax decreased up to one-third of that of the original FGS encoder [JVT-W200]
[JSVM] JSVM 8.10, RWTH CVS server.
[JVT-W200] T. Wiegand et al., “Meeting Report, Draft 7,” JVT Document JVT-W200, San Jose, CA, Apr. 2007.
8
Bit-Depth Scalability (BDS)
A new scalability, called bit-depth scalability, is needed for high dynamic range (HDR) content such as high-accuracy video, remote sensing, medical applications, and digital animation movies, now that HDR cameras and display devices have been developed.
  A bit depth of 8 gives 256 levels, i.e. two to three orders of magnitude of dynamic range.
  A bit depth of 10 gives 1024 levels, i.e. three to four orders of magnitude of dynamic range.
Backward compatibility should be considered.
  The content can be viewed simultaneously on both current low dynamic range (LDR) devices and HDR devices.
However, the current SVC does not support bit-depth scalability.
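The relation between bit depth, number of levels, and orders of magnitude can be checked with a short calculation. The helper name is chosen here for illustration; the 12-bit case is included because 12 bpp sequences appear later in the results.

```python
import math


def levels_and_orders(bits: int):
    """Return (number of code values, base-10 orders of magnitude) for a bit depth."""
    levels = 2 ** bits
    return levels, math.log10(levels)


for bits in (8, 10, 12):
    levels, orders = levels_and_orders(bits)
    print(f"{bits:2d} bpp: {levels:5d} levels, about {orders:.2f} orders of magnitude")
```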
[Figure: BDS use case. Labels: HDR camera, HDR sequence, post-production, TM processing, LDR sequence, backward compatibility, SVC bit-stream, bit-stream extraction, HDR devices, LDR devices.]
9
Tone-mapping (TM) and inverse tone-mapping (iTM) reduce or extend the dynamic range.
  TM: converts HDR sequences into LDR sequences, e.g. 10 bpp to 8 bpp.
  iTM: converts LDR sequences into HDR sequences, e.g. 8 bpp to 10 bpp, but it is not an exact mathematical inverse because information has been lost.
  Both should preserve the human perception of the scene.
  HDR images cannot be viewed on conventional monitors, but they can be viewed after TM processing.
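Why iTM cannot be an exact inverse can be seen with the simplest possible pair of operators. The sketch below assumes TM is a 2-bit right shift (10 bpp to 8 bpp) and iTM is a 2-bit left shift; real tone-mapping operators are usually non-linear and perceptually motivated, so this is only a stand-in.

```python
import numpy as np


def tone_map(hdr10):
    """Assumed TM: drop the two least significant bits (10 bpp -> 8 bpp)."""
    return (hdr10 >> 2).astype(np.uint8)


def inverse_tone_map(ldr8):
    """Assumed iTM: shift back up by two bits (8 bpp -> 10 bpp)."""
    return ldr8.astype(np.uint16) << 2


hdr = np.array([0, 1, 2, 3, 560, 561, 1023], dtype=np.uint16)
ldr = tone_map(hdr)
rec = inverse_tone_map(ldr)

# The two dropped bits are gone, so several HDR values collapse to one.
max_error = np.abs(hdr.astype(int) - rec.astype(int)).max()
```

The difference between `hdr` and `rec` is exactly the information that an enhancement layer has to carry.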
10
The coding flow of the enhancement layer for each macroblock is arranged as follows
11
[Figure: SVC structure for BDS. Block labels: high bit-depth input data (10 bit); tone mapping; low bit-depth input data (8 bit); the enhancement layer and the base layer, each with transform, quantization, entropy coding, scaling and inverse transform, and deblocking filter; prediction (inter) and prediction (inter/intra); inverse tone mapping (10 bit); RD optimization; multiplex; bit stream. Inter-layer intra prediction: the collocated MB of the base layer (intra MBs) in frame n supplies reconstructed pixels to the enhancement layer. Inter-layer motion prediction: the collocated MB of the base layer (inter MBs) in frame n supplies MVs and reference index to the enhancement layer.]
[Park08] S. Park and K. R. Rao, “Bit-Depth Scalable Video Coding Based on H.264/AVC,” IEICE Trans. Fundamentals Letter, vol. E91-A, no. 6, pp. 1541-1544, June 2008.
12
Generate a mapping function
Inverse tone-mapping (iTM) expands the dynamic range of LDR (Low Dynamic Range) sequences
Linear scaling is the simplest approach
  $$\hat{L}_{10}(x,y) = L_{8}^{LDR}(x,y) \ll 2$$
  $$Residual(x,y) = L_{10}^{HDR}(x,y) - \hat{L}_{10}(x,y)$$
  It causes severe noise at the borders between bright and dark areas and makes contrast change sharply
Mapping Function (MF) approach
  Uses the arithmetic mean, which is not computationally expensive and is easy to use, to obtain a one-to-many mapping function [Mantiuk06]
  $$sum_j = \sum_{i \,:\, L_{LDR}(i) = j} L_{HDR}(i)$$
  $$MF_j = \frac{1}{hist(j)} \, sum_j$$
  i: pixel index, running over the pixels of a frame; j: bin index, one of the $2^{\text{bits per pixel in the base layer}}$ possible LDR pixel values
  $L_{LDR}(i)$ is a pixel value in an LDR sequence, $L_{HDR}(i)$ is a pixel value in an HDR (High Dynamic Range) sequence, and $hist(j)$ is the number of pixels that fall into bin j
Mapping information is sent in a sequence parameter set (SPS) for the entire sequence
  It can be overridden, depending on the features of each frame, by sending it in a picture parameter set (PPS) or a slice header
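The mapping function described above (accumulate the HDR values per LDR bin, count the pixels per bin, then average) can be sketched with NumPy. The function name and the toy frames are illustrative, and leaving unused bins at 0 is an assumption of this sketch: the slides do not say how bins with no pixels are filled.

```python
import numpy as np


def build_mapping_function(ldr, hdr, base_bits=8):
    """Return (MF, hist): per-bin mean HDR value and per-bin pixel count."""
    bins = 2 ** base_bits
    j = ldr.ravel().astype(np.int64)              # bin index of every pixel
    sums = np.bincount(j, weights=hdr.ravel().astype(np.float64), minlength=bins)
    hist = np.bincount(j, minlength=bins)         # number of pixels per bin
    mf = np.zeros(bins)                           # assumption: unused bins stay 0
    used = hist > 0
    mf[used] = sums[used] / hist[used]            # arithmetic mean per bin
    return mf, hist


# Toy co-located frames: LDR values 150 and 50 map to several HDR values.
ldr = np.array([[150, 50], [150, 50]], dtype=np.uint8)
hdr = np.array([[560, 200], [564, 204]], dtype=np.uint16)

mf, hist = build_mapping_function(ldr, hdr)
prediction = mf[ldr]                              # inverse tone-mapped frame
```

Averaging collapses the one-to-many relation (150 maps to both 560 and 564) into a single table entry per LDR value, which keeps the table small enough to send in a parameter set.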
13
Bit stream in H.264/AVC [Wiegand03]
  A VCL (Video Coding Layer) NAL (Network Abstraction Layer) unit contains the values of the samples in the video pictures
  A non-VCL NAL unit contains associated additional information such as parameter sets
  One frame can be one slice or be split into several slices, but one frame corresponds to one slice in my research
[Figure: building the mapping function from a 4CIF 8 bpp frame and the co-located 4CIF 10 bpp frame. The pixel index i runs from 0 to 704*576-1 = 405503. Example values: $L_{LDR}(i)$ = 150 and 50 in the 8 bpp frame correspond to $L_{HDR}(i)$ = 560 and 200 in the 10 bpp frame. For each bin j = 0, 1, ..., 255, $sum_j$ holds the accumulated sum and $hist(j)$ holds the number of counts; $MF_j$ is obtained by averaging with each count.]
[Figure: one bit stream laid out as SPS#1, PPS#1, Slice #1 or Frame #1, PPS#2, Slice #2 or Frame #2, ... The SPS and PPS are non-VCL NAL units; the slices are VCL NAL units.]
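The VCL versus non-VCL distinction is visible in the first byte of every H.264/AVC NAL unit, which holds `forbidden_zero_bit` (1 bit), `nal_ref_idc` (2 bits), and `nal_unit_type` (5 bits). The sketch below covers only a handful of base H.264/AVC types; the SVC extension adds further NAL unit types that are not handled here, and the helper names are illustrative.

```python
def parse_nal_header(first_byte: int):
    """Split the one-byte H.264/AVC NAL unit header into its three fields."""
    forbidden_zero_bit = (first_byte >> 7) & 0x01
    nal_ref_idc = (first_byte >> 5) & 0x03
    nal_unit_type = first_byte & 0x1F
    return forbidden_zero_bit, nal_ref_idc, nal_unit_type


# A few base H.264/AVC NAL unit types (not an exhaustive list).
NAL_NAMES = {1: "non-IDR slice", 5: "IDR slice", 7: "SPS", 8: "PPS"}


def is_vcl(nal_unit_type: int) -> bool:
    """In base H.264/AVC, types 1 to 5 carry coded slice data (VCL)."""
    return 1 <= nal_unit_type <= 5


for byte in (0x67, 0x68, 0x65, 0x41):
    _, ref_idc, nal_type = parse_nal_header(byte)
    kind = "VCL" if is_vcl(nal_type) else "non-VCL"
    print(f"0x{byte:02X}: {NAL_NAMES.get(nal_type, 'other')} ({kind}), nal_ref_idc={ref_idc}")
```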
[Wiegand03] T. Wiegand et al., “Overview of the H.264/AVC video coding standard,” IEEE Trans. CSVT, vol. 13(7), pp. 560-576, July 2003.
14
$$\hat{L}_{10}^{1}(x,y) = scaling\_factor * L_{8}^{LDR}(x,y) + offset(x,y)$$
$$Residual(x,y) = L_{10}^{HDR}(x,y) - scaling\_factor * L_{8}^{LDR}(x,y)$$
$$offset(x,y) = \frac{1}{W*H} \sum_{i=m}^{m+W-1} \sum_{j=n}^{n+H-1} Residual(x+i,\, y+j)$$
$$\hat{L}_{10}^{2}(x,y) = MF\left(L_{8}^{LDR}(x,y)\right)$$
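The two inter-layer predictors can be compared side by side in a small sketch. Several things here are assumptions made for illustration and are not taken from the slides: the offset is computed as the mean residual over non-overlapping blocks, the block size is 2x2, the scaling factor is 4, and the mapping table is a hand-made example.

```python
import numpy as np


def predict_scaling(ldr8, hdr10, scaling_factor=4, block=(2, 2)):
    """Predictor 1: scaling_factor * L_LDR plus a per-block offset (mean residual)."""
    scaled = scaling_factor * ldr8.astype(np.int64)
    residual = hdr10.astype(np.int64) - scaled
    bh, bw = block
    rows, cols = residual.shape                   # must be multiples of the block size
    blocks = residual.reshape(rows // bh, bh, cols // bw, bw)
    offset = blocks.mean(axis=(1, 3))             # one offset per block
    offset_full = np.repeat(np.repeat(offset, bh, axis=0), bw, axis=1)
    return scaled + offset_full


def predict_mf(ldr8, mf):
    """Predictor 2: look every LDR pixel up in the mapping function table."""
    return mf[ldr8]


ldr = np.array([[10, 10, 200, 200], [10, 10, 200, 200]], dtype=np.uint8)
hdr = np.array([[44, 44, 790, 790], [44, 44, 790, 790]], dtype=np.uint16)

pred1 = predict_scaling(ldr, hdr)

mf = np.arange(256) * 4.0                         # hypothetical table: linear ramp ...
mf[10], mf[200] = 44.0, 790.0                     # ... with two hand-set entries
pred2 = predict_mf(ldr, mf)
```

In this toy case both predictors reproduce the HDR frame exactly. The practical difference is in the side information: the first needs an offset per block, the second needs one table per sequence, picture, or slice.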
15
Experimental results

Rate points, four per video sequence (HHI versus Proposed)

           HHI                         Proposed
Bitrate (kbps)  PSNR-Y (dB)   Bitrate (kbps)  PSNR-Y (dB)

10bit sequences
       3215.13        43.85          3132.28        43.70
       7644.33        46.90          7533.29        46.71
      15829.76        50.58         15699.18        50.36
      29116.24        54.53         29399.27        54.37

       6935.62        38.94          5131.48        39.52
      12931.06        42.66          9982.40        42.87
      24121.86        46.44         19280.89        46.29
      45838.05        50.34         37380.40        50.10

       3650.80        43.13          3584.24        42.93
       8993.74        46.01          8961.20        45.87
      19258.09        49.60         19330.82        49.53
      36366.92        53.50         36799.18        53.50

       3472.78        41.76          3343.08        41.61
       7662.78        44.47          7304.21        44.22
      16603.69        47.33         15805.15        46.98
      35965.69        50.71         34531.20        50.39

       5141.98        40.72          5004.92        40.64
      14637.22        43.63         14440.12        43.55
      32431.34        47.35         32152.78        47.32
      60242.00        51.49         60407.29        51.53

       2446.87        42.85          2694.82        42.84
       5914.61        44.96          6051.15        44.83
      14376.38        47.63         14523.98        47.50
      33552.38        51.08         33785.44        50.97

12bit sequences
      63264.40        43.67         40684.40        47.70
     107487.60        47.75         72171.92        50.68
     175330.16        51.49        124266.40        53.43
     288465.84        54.97        219020.96        56.29

      82232.00        42.12         58211.76        44.76
     138702.08        45.78        100101.04        47.88
     223707.36        49.16        167135.76        50.80
     364473.12        52.59        283502.64        53.78

Summary per video sequence (Proposed relative to HHI)

Video Sequences              ΔPSNR (dB)   ΔBitrate (%)
Waves                             -0.18           5.99
Plane                             -0.11           3.06
Staples                           -0.01           0.02
Freeway                            1.41         -22.22
Night                             -0.12           2.62
CapitolRecords                    -0.14           2.98
Average of 10bit sequences         0.14          -1.26
library                            4.72         -52.37
Sunrise                            3.81         -44.86
Average of 12bit sequences         4.27         -48.61
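The two "Average" rows can be reproduced from the per-sequence values. The short check below uses only numbers that appear in the results; the dictionary layout and the helper name are chosen here for illustration.

```python
import math

# (PSNR difference in dB, bitrate difference in %) per sequence, as reported.
ten_bit = {
    "Waves": (-0.18, 5.99),
    "Plane": (-0.11, 3.06),
    "Staples": (-0.01, 0.02),
    "Freeway": (1.41, -22.22),
    "Night": (-0.12, 2.62),
    "CapitolRecords": (-0.14, 2.98),
}
twelve_bit = {
    "library": (4.72, -52.37),
    "Sunrise": (3.81, -44.86),
}


def column_means(table):
    """Arithmetic mean of each column over all sequences in the table."""
    psnr = sum(v[0] for v in table.values()) / len(table)
    rate = sum(v[1] for v in table.values()) / len(table)
    return psnr, rate


avg10 = column_means(ten_bit)
avg12 = column_means(twelve_bit)
print(f"10bit: {avg10[0]:.3f} dB, {avg10[1]:.3f} %")
print(f"12bit: {avg12[0]:.3f} dB, {avg12[1]:.3f} %")
```

The 10bit average is dominated by Freeway; the other five 10bit sequences each show a small PSNR loss and a small bitrate increase.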
16
A coding gain of 0.14 dB, or a 1.26% reduction in bit rate, is obtained on average for the 10 bits/pixel test sequences.
The coding gain for the 12 bits/pixel test sequences averages 4.27 dB, or a 48.61% reduction in bit rate.
This approach keeps the increase in complexity to a minimum by avoiding motion estimation in the enhancement layer.
It increases the robustness of quality when the mapping function table is not updated frequently.
17
Future Works related to H.265
H.265 design project from the VCEG meeting in Geneva, Apr. 2008 [Lee08]
  Progressive-scan (only)
  Picture sizes: QVGA, VGA, 1080p60, 2kx4k
  Frame rates: 12.5/15, 24/25/30, 50/60, 100/120
  Picture size/grid conversion within the design (e.g. 4:4:4, 4:2:0, 8bpp, 10bpp, 12bpp)
  Sampling grid: 4:2:0, 4:4:4, Bayer Color Array
  Views: 1, N > 1
  Complexity issues: portable encoders, parallelism, memory bandwidth, asymmetry (can shift the balance from encoder to decoder for videoconferencing, surveillance, and mobile camcorders)
[Lee08] From Dr. Yung-Lyul Lee of Sejong University in Korea, presently a visiting professor at UTA