IMPLEMENTATION OF COMPLEXITY REDUCTION ALGORITHM FOR INTRA MODE SELECTION IN H.264/AVC

Santosh Kumar Muniyappa

Guided by: Dr. Rao

Contents
- Need for video compression
- Motivation
- Video coding standards, video formats and quality
- Overview of H.264
- Complexity reduction algorithm for intra mode selection
- Experimental results
- Conclusions and future work
- References

Need for Video Compression
- Growing market for high bit-rate connections and large storage capacity
- Insufficient resources to handle uncompressed video
- Sending high-resolution compressed video is a better proposition than sending a low-resolution, uncompressed stream over a high bit-rate transmission channel

Motivation [2]
- Removing redundancy in a video clip
- Only a small percentage of any particular frame is new information
- Highly complex process
- Reduce the overall complexity to a level suitable for handheld devices

Timeline of Video Development [10]
- Interoperability between encoders and decoders from different manufacturers
- Build a video platform that lets video codecs, audio codecs, transport protocols, security and rights management interact in well-defined and consistent ways

Video Format [19]
- CIF: common intermediate format
- SIF: source input format
- SDTV: standard definition television

Format                Resolution   Video applications
Sub-QCIF              128 x 96     Mobile applications
Quarter CIF (QCIF)    176 x 144    Mobile applications
SIF (525)             352 x 240    Video conferencing
CIF/SIF (625)         352 x 288    Video conferencing
4CIF/4SIF (625)       704 x 576    SDTV, DVD video

Quality
- Specify, evaluate and compare
- Visual quality is inherently subjective
- PSNR: most widely used objective quality measurement
  $\mathrm{PSNR}_{dB} = 10 \log_{10} \left( (2^n - 1)^2 / \mathrm{MSE} \right)$, where n = number of bits per pixel
- SSIM: represents perceptual image quality based on structural information
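The PSNR formula above can be sketched in a few lines of Python. This is a minimal illustration, not the measurement code used for the results; the function names `mse` and `psnr_db` and the sample values are made up for the example, and frames are assumed to be flat lists of pixel values.

```python
import math

def mse(reference, test):
    """Mean squared error between two equally sized lists of pixel values."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)

def psnr_db(reference, test, bits_per_pixel=8):
    """PSNR in dB: 10 * log10((2^n - 1)^2 / MSE), n = bits per pixel."""
    error = mse(reference, test)
    if error == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    peak = (1 << bits_per_pixel) - 1  # 255 for 8-bit video
    return 10.0 * math.log10(peak ** 2 / error)

# Hypothetical pixel values, for illustration only
original = [52, 55, 61, 66]
decoded = [53, 55, 60, 68]
print(round(psnr_db(original, decoded), 2))
```

A higher value means the decoded samples are closer to the original; identical inputs give an unbounded PSNR, which the sketch reports as infinity.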

OVERVIEW OF H.264 / AVC STANDARD

- First published by the JVT in 2003
- Achieves substantially higher video compression and a network-friendly video representation
- Built on the concepts of earlier standards such as MPEG-2 and MPEG-4 Visual
- About 50% reduction in bit-rate relative to earlier standards
- Error resilience
- Supports various interactive (video telephony) and non-interactive applications (broadcast, streaming, storage, video on demand)

H.264 Video Codec [3]

Slice Types [3]
- I (intra) slice: contains references only to itself.
- P (predictive) slice: uses one or more recently decoded slices as a reference (or prediction) for picture construction.
- B (bi-predictive) slice: works similarly to P slices, except that past and future I or P slices may be used as reference pictures.
- SI and SP (switching) slices: may be used for transitions between two different H.264 video streams.

H.264 Encoder [27]

H.264 Decoder [27]

Intra Prediction
- I pictures usually have a large amount of information present in the frame.
- The spatial correlation between adjacent macroblocks in a given frame is exploited.
- H.264 offers nine modes for intra prediction of 4x4 luminance blocks.
- H.264 offers four modes for intra prediction of 16x16 luminance blocks.
- For 8x8 chrominance blocks, H.264 supports four prediction modes similar to those of 16x16 luminance blocks.

Intra Prediction Modes for 4x4 Luminance Blocks [27]
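As a rough illustration of how spatial correlation is exploited, the sketch below builds three of the nine 4x4 predictions (vertical, horizontal and DC) from the neighbouring pixels and compares them by sum of absolute differences. It assumes both the upper and left neighbours are available; the function names and sample pixel values are illustrative only, and the six diagonal modes are omitted.

```python
def predict_vertical(top):
    """Vertical mode: copy the four pixels above the block down each column."""
    return [list(top) for _ in range(4)]

def predict_horizontal(left):
    """Horizontal mode: copy the four pixels left of the block across each row."""
    return [[p] * 4 for p in left]

def predict_dc(top, left):
    """DC mode: fill the block with the rounded mean of the eight neighbours."""
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]

def sad(block, prediction):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(x - y)
               for row, prow in zip(block, prediction)
               for x, y in zip(row, prow))

# Hypothetical block with vertical stripes and matching upper neighbours
top = [10, 20, 30, 40]
left = [10, 10, 10, 10]
block = [[10, 20, 30, 40] for _ in range(4)]

errors = {
    "vertical": sad(block, predict_vertical(top)),
    "horizontal": sad(block, predict_horizontal(left)),
    "dc": sad(block, predict_dc(top, left)),
}
print(min(errors, key=errors.get), errors)
```

For this striped block the vertical prediction reproduces the block exactly, so only a zero residual would need to be coded.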

Intra Prediction Modes for 16x16 Luminance Blocks and 8x8 Chrominance Blocks [27]

Inter Prediction [1]
- Takes advantage of the temporal redundancies that exist among successive frames.
- Temporal prediction in P frames involves predicting from one or more past frames, known as reference frames.

Inter Prediction Modes [11]

Integer Transform and Quantization
Transform:
- The prediction error block is expressed in the form of transform coefficients.
- H.264 employs a purely integer spatial transform, which is a rough approximation of the DCT.
Quantization:
- A significant portion of the data compression takes place here.
- Fifty-two different quantization step sizes can be chosen.
- Step sizes increase at a compounding rate of approximately 12.5%.
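The compounding growth of the step size can be checked numerically. The sketch below uses the six base step sizes commonly tabulated for QP 0 to 5, with the step size doubling for every increase of 6 in QP; treating that table as exact is an assumption of this example, and `qstep` is an illustrative name rather than a function of any reference encoder.

```python
# Commonly tabulated base step sizes for QP = 0..5 (assumed here)
BASE_QSTEP = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]

def qstep(qp):
    """Quantization step size for QP in 0..51; doubles every 6 QP values."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return BASE_QSTEP[qp % 6] * (1 << (qp // 6))

steps = [qstep(qp) for qp in range(52)]

# Average growth factor per QP step across the whole range
average_growth = (steps[-1] / steps[0]) ** (1 / 51)

print(len(steps), steps[0], steps[-1])
print(round((average_growth - 1) * 100, 1), "% average increase per step")
```

Under these assumptions the average increase per step comes out close to 12%, in line with the approximate figure quoted above, and the range spans 0.625 to 224.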

De-blocking Filter and Entropy Coding
De-blocking filter:
- Removes the blocking artifacts caused by the block-based encoding pattern
- In-loop de-blocking filter
Entropy coding:
- Assigns shorter code-words to symbols with higher probabilities of occurrence, and longer code-words to symbols that occur less frequently.
- CAVLC and CABAC

H.264 Profiles [27]

Complexity Reduction Algorithm For Intra Mode Selection

- The existing intra prediction algorithm in H.264 using RDO examines all possible combinations of coding modes.
- The mode combinations for each MB sum to 592.
- This research proposes a complexity reduction algorithm using simple directional masks and neighboring modes.
- It reduces the number of mode combinations to at most 132, with negligible loss of PSNR and SSIM and a slight bit-rate increase.

Rate Distortion Optimization
- Lagrangian multiplier: $\lambda_{MODE} = 0.85 \times 2^{(QP-12)/3}$
- $\mathrm{Cost} = D + \lambda_{MODE} \times R$, where D is the distortion and R is the rate
- N8: number of modes of an 8x8 chroma block
- N4: number of modes of a 4x4 luma block
- N16: number of modes of a 16x16 luma block
- Computing the best mode for one macroblock: $N8 \times (16 \times N4 + N16) = 4 \times (16 \times 9 + 4) = 592$
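The Lagrangian cost and the exhaustive mode count above can be reproduced with a short sketch. The function names are illustrative, and the distortion and rate values in the usage line are made-up numbers, not measurements.

```python
def lambda_mode(qp):
    """Lagrangian multiplier: 0.85 * 2^((QP - 12) / 3)."""
    return 0.85 * 2 ** ((qp - 12) / 3)

def rd_cost(distortion, rate_bits, qp):
    """Rate-distortion cost: D + lambda_MODE * R."""
    return distortion + lambda_mode(qp) * rate_bits

def mode_combinations(n8, n4, n16):
    """Mode combinations per macroblock: N8 * (16 * N4 + N16)."""
    return n8 * (16 * n4 + n16)

# Exhaustive search: 4 chroma modes, 9 modes per 4x4 block, 4 modes per 16x16 block
print(mode_combinations(n8=4, n4=9, n16=4))

# Hypothetical distortion and rate for one candidate mode at QP = 28
print(round(rd_cost(distortion=1200, rate_bits=85, qp=28), 1))
```

Because the multiplier doubles for every increase of 3 in QP, rate is weighted much more heavily at coarse quantization than at fine quantization.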

Proposed Intra Mode Selection Algorithm for a 4x4 Luma Block [16]
- $\mathrm{Diff} = |a - m| + |b - n| + |c - o| + |d - p|$, for the vertical direction
- $\mathrm{Diff} = |a - d| + |e - h| + |i - l| + |m - p|$, for the horizontal direction
- $\mathrm{Diff} = |c - i| + 2|d - m| + |h - n|$, for the diagonal down-left direction
- $\mathrm{Diff} = |b - l| + 2|a - p| + |e - o|$, for the diagonal down-right direction
- $\mathrm{Diff} = |a - n| + 2|b - o| + |c - p|$, for the vertical-right direction
- $\mathrm{Diff} = |a - h| + 2|e - l| + |i - p|$, for the horizontal-down direction
- $\mathrm{Diff} = |b - m| + 2|c - n| + |d - o|$, for the vertical-left direction
- $\mathrm{Diff} = |e - d| + 2|i - h| + |m - l|$, for the horizontal-up direction
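The eight directional masks can be evaluated as follows. This sketch assumes the pixels a to p label the 4x4 block in raster order (a to d on the top row, m to p on the bottom row), which is what the vertical and horizontal formulas imply; the function name and the sample block are illustrative.

```python
def directional_diffs(block):
    """Return the Diff value of each directional mask for a 4x4 block.

    `block` is a list of four rows of four pixel values, labelled a..p
    in raster order (assumed from the vertical and horizontal formulas).
    """
    (a, b, c, d), (e, f, g, h), (i, j, k, l), (m, n, o, p) = block
    return {
        "vertical": abs(a - m) + abs(b - n) + abs(c - o) + abs(d - p),
        "horizontal": abs(a - d) + abs(e - h) + abs(i - l) + abs(m - p),
        "diagonal_down_left": abs(c - i) + 2 * abs(d - m) + abs(h - n),
        "diagonal_down_right": abs(b - l) + 2 * abs(a - p) + abs(e - o),
        "vertical_right": abs(a - n) + 2 * abs(b - o) + abs(c - p),
        "horizontal_down": abs(a - h) + 2 * abs(e - l) + abs(i - p),
        "vertical_left": abs(b - m) + 2 * abs(c - n) + abs(d - o),
        "horizontal_up": abs(e - d) + 2 * abs(i - h) + abs(m - l),
    }

# Hypothetical block whose columns are constant (a vertical structure)
striped = [[10, 20, 30, 40] for _ in range(4)]
diffs = directional_diffs(striped)
print(min(diffs, key=diffs.get), diffs)
```

A small Diff means the pixels change little along that direction, so the corresponding prediction mode is a promising candidate. Each mask touches only a handful of pixels, which keeps this pre-selection far cheaper than a full RDO evaluation of every mode.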

Continued [16]
- Step 1: For a 4x4 luma block, obtain avg and S.
- Step 2a: If S is larger than a threshold, T1, carry out the RDO procedure for at most four candidate modes: the two modes with the minimum and second-minimum Diff, and at most two modes from adjacent blocks. In this case, the DC mode of adjacent blocks is excluded from the RDO procedure.
- Step 2b: If S is smaller than the threshold, T1, carry out the RDO procedure for at most four candidate modes: the one mode with minimum Diff, at most two modes from adjacent blocks, and DC mode.

Note: T1 is set to 32.
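The candidate selection in Steps 1 to 2b might be sketched as below. Several points are assumptions of this example rather than statements from the slides: avg is taken as the mean of the 16 pixels and S as the sum of absolute deviations from avg, the case S equal to T1 is handled by the Step 2b branch, and modes are represented by plain strings.

```python
T1 = 32
DC = "dc"

def block_activity(block):
    """Assumed Step 1: avg = block mean, S = sum of |pixel - avg|."""
    pixels = [p for row in block for p in row]
    avg = sum(pixels) / len(pixels)
    s = sum(abs(p - avg) for p in pixels)
    return avg, s

def candidate_modes_4x4(diffs, s, neighbour_modes, t1=T1):
    """Return at most four candidate modes to pass to the RDO procedure.

    diffs: mapping of directional mode name to its Diff value.
    neighbour_modes: modes of the adjacent blocks (at most two are used).
    """
    ranked = sorted(diffs, key=diffs.get)  # smallest Diff first
    neighbours = list(neighbour_modes)[:2]
    if s > t1:
        # Step 2a: two best directions plus non-DC neighbour modes
        candidates = ranked[:2] + [m for m in neighbours if m != DC]
    else:
        # Step 2b: best direction, neighbour modes, and DC
        candidates = ranked[:1] + neighbours + [DC]
    unique = []
    for mode in candidates:  # drop duplicates, keep order
        if mode not in unique:
            unique.append(mode)
    return unique

# Hypothetical Diff values and neighbour modes
example_diffs = {"vertical": 0, "horizontal": 120, "vertical_right": 40}
print(candidate_modes_4x4(example_diffs, s=50, neighbour_modes=["dc", "horizontal"]))
print(candidate_modes_4x4(example_diffs, s=10, neighbour_modes=["dc", "horizontal"]))
```

Either branch yields at most four candidates per 4x4 block instead of the nine examined by the exhaustive search.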

Proposed Intra Mode Selection Algorithm for a 16x16 Luma Block
- Step 1: Examine the sizes of the adjacent blocks: if both blocks (upper block and left block) are 16x16, go to Step 2; otherwise go to Step 4.
- Step 2: Examine the modes of the adjacent blocks: if both modes are the same, go to Step 3; otherwise select as the best mode for the 16x16 luma block whichever of the two adjacent modes, modeA and modeB, results in the minimum SATD (sum of absolute transformed differences).
- Step 3: If both adjacent modes are DC mode, go to Step 4; otherwise select as the best mode for the 16x16 luma block whichever of the adjacent mode and DC mode results in the minimum SATD.
- Step 4: Let V be a vertical difference between the upper boundary pixels of the current block and the boundary pixels of the upper block, and H be a horizontal difference between the left boundary pixels of the current block and the boundary pixels of the left block.

Continued [16]

Continued
- Step 5: Obtain candidate modes by using the two difference values, V and H:
  - if $|V - H|$ is smaller than $2 \times T2$, candidate modes are DC mode and plane mode.
  - if $(V - H)$ is larger than $T2$, candidate modes are DC mode and horizontal mode.
  - if $(V - H)$ is smaller than $-T2$, candidate modes are DC mode and vertical mode.
- The threshold T2 is set equal to 8.
- Finally, select the best mode among all candidate modes by choosing the mode with minimum SATD.
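The decision flow of Steps 1 to 5 might be sketched as follows. The neighbour representation (a `(block_size, mode)` tuple, or `None` when a neighbour is unavailable), the string mode names and the `satd_of` callback are all hypothetical. Because the listed thresholds overlap, the sketch also assumes the conditions of Step 5 are tested in the order given, with the first match winning. Computing V, H and SATD is left to the caller.

```python
T2 = 8
VERTICAL, HORIZONTAL, DC, PLANE = "vertical", "horizontal", "dc", "plane"

def boundary_candidates(v, h, t2=T2):
    """Step 5: derive two candidate modes from the boundary differences."""
    d = v - h
    if abs(d) < 2 * t2:
        return [DC, PLANE]
    if d > t2:
        return [DC, HORIZONTAL]
    return [DC, VERTICAL]  # remaining case: v - h <= -2 * t2

def candidates_16x16(upper, left, v, h, t2=T2):
    """Steps 1-5: candidate modes for a 16x16 luma block.

    upper, left: (block_size, mode) of the adjacent blocks, or None.
    """
    both_16 = all(nb is not None and nb[0] == 16 for nb in (upper, left))
    if both_16:
        mode_a, mode_b = upper[1], left[1]
        if mode_a != mode_b:
            return [mode_a, mode_b]  # Step 2: the two adjacent modes
        if mode_a != DC:
            return [mode_a, DC]  # Step 3: adjacent mode versus DC
    return boundary_candidates(v, h, t2)  # Steps 4 and 5

def best_mode(candidates, satd_of):
    """Pick the candidate with minimum SATD; satd_of maps a mode to its SATD."""
    return min(candidates, key=satd_of)

# Hypothetical neighbours, boundary differences and SATD values
print(candidates_16x16((16, VERTICAL), (16, HORIZONTAL), v=0, h=0))
print(candidates_16x16((16, DC), (16, DC), v=30, h=5))
print(best_mode([DC, PLANE], {DC: 10, PLANE: 4}.get))
```

Every path ends with two candidates, so at most two SATD evaluations are needed per 16x16 block instead of four.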

Experimental Results
- I-only encoding.
- QP values of 10, 20, 28, 34, and 40.
- QCIF: 100 frames; CIF: 30 frames.
- The results were compared with the exhaustive search of JM in terms of the change in PSNR, bit-rate, SSIM, compression ratio, and encoding time.
- Platform: Intel Pentium Dual Core processor at 2 GHz with 2 GB of memory.

Test Sequences [39]

CIF and QCIF video sequences: (a) bridge-close (b) bridge-far (c) coastguard (d) container (e) mobile

[Result plots for the test sequences. Legend: CR = complexity reduction algorithm, JM = Joint Model]

Conclusions
- Encoding time, PSNR, bit-rate, and SSIM were compared between the JM reference software and the complexity reduced encoder at QP values of 10, 20, 28, 34 and 40.
- The complexity reduced algorithm gives an average reduction of 24.93% in encoding time, with negligible loss of PSNR and SSIM and a small increase in bit-rate at nearly the same compression ratio.
- Hence, this approach of reducing the number of mode combinations in the spatial domain using simple directional masks can find application in low-complexity devices such as mobile phones or other handheld devices.

Future Work
- The complexity reduction algorithm was implemented for CIF and QCIF format video sequences. The idea can be extended to other video formats such as 4SIF and HD.
- The complexity reduction algorithm was integrated into the JM 18.0 reference software. It can also be integrated into other open-source H.264 software such as x264, followed by a performance analysis.
- Since the aim is to reduce the overall complexity to a level suitable for handheld devices with limited computing resources, algorithms that reduce the mode combinations in inter prediction can also be integrated with this complexity reduced intra-prediction algorithm.

References
[1] I. E. G. Richardson, H.264 and MPEG-4 video compression: video coding for next-generation multimedia, Wiley, 2003.
[2] Interframe coding pictures, Axis Communications, http://www.axis.com/products/video/about_networkvideo/compression.htm
[3] I. E. Richardson, The H.264 advanced video compression standard, 2nd Edition, Wiley, 2010.
[4] Draft ITU-T Recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264/ISO/IEC 14496-10 AVC), Mar. 2003.
[5] ITU-T Recommendation H.120, Codecs for videoconferencing using primary digital group transmission, March 1993.
[6] ITU-T Recommendation H.261, Video codec for audiovisual services at p x 64 kbit/s, December 1990; revised March 1993.
[7] ITU-T Recommendation H.262, Information technology - Generic coding of moving pictures and associated audio information: Video, July 1995.
[8] ISO/IEC 11172-5, Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbps, November 1998.
[9] ITU-T Recommendation H.263, Video coding for low bit rate communication, February 1998.
[10] G. J. Sullivan, Overview of international video coding standards (preceding H.264/AVC), ITU, 2005.
[11] I. E. G. Richardson, H.264 / MPEG-4 Part 10 White Paper, www.vcodex.com.
[12] M. Jafari and S. Kasaei, "Fast intra- and inter-prediction mode decision in H.264 advanced video coding," International Journal of Computer Science and Network Security, vol. 8, no. 5, pp. 1-6, May 2008.
[13] F. Pan et al., "Fast intra mode decision algorithm for H.264/AVC video coding," in Proc. IEEE Int. Conf. Image Process., pp. 781-784, Singapore, Oct. 2004.
[14] F. Fu et al., "Fast intra prediction algorithm in H.264/AVC," in Proc. 7th Int. Conf. Signal Process., pp. 1191-1194, Beijing, China, Sep. 2004.
[15] Y. Zhang et al., "Fast 4x4 intra-prediction mode selection for H.264," in Proc. Int. Conf. Multimedia Expo, pp. 1151-1154, Taipei, Taiwan, Jun. 2004.
[16] J. Kim et al., "Complexity reduction algorithm for intra mode selection in H.264/AVC video coding," in J. Blanc-Talon et al. (Eds.), ACIVS 2006, LNCS 4179, pp. 454-465, Springer-Verlag Berlin Heidelberg, 2006.
[17] JM reference software, Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, http://iphome.hhi.de/suehring/tml/
[18] Recommendation ITU-R BT.601-7, Studio encoding parameters of digital television for standard 4:3 and wide-screen 16:9 aspect ratios, BT Series, March 2011.
[19] Open source article on video formats, Wikipedia, http://en.wikipedia.org/wiki/Common_Intermediate_Format
[20] Z. Wang et al., "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004.
[21] ITU-T Recommendation H.264, Advanced video coding for generic audiovisual services, November 2007.
[22] ISO/IEC 11172-4, Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbps, Part 4, March 1998.
[23] ITU Telecom. Standardization Sector of ITU, Video coding for low bitrate communication, Draft ITU-T Recommendation H.263 Version 2, Sept. 1997.
[24] ITU-T, "Video coding for low bit rate communication," ITU-T Recommendation H.263; version 1, Nov. 1995; version 2, Jan. 1998; version 3, Nov. 2000.
[25] ISO/IEC 14496-2, Information technology - Coding of audio-visual objects, Part 2, December 2001.
[26] Monash University, Multimedia webpage, http://www.ctie.monash.edu.au/EMERGE/multimedia/H261_263/H03.HTM
[27] S. Kwon, A. Tamhankar, and K. Rao, "Overview of H.264/MPEG-4 part 10," Journal of Visual Communication and Image Representation, vol. 17, no. 2, pp. 186-216, April 2006.
[28] T. Purushotham, "Low complexity H.264 encoder using machine learning," M.S. Thesis, E.E. Dept, UTA, 2010.
[29] H. Yadav, "Optimization of the deblocking filter in H.264 codec for real time implementation," M.S. Thesis, E.E. Dept, UTA, 2006.
[30] G. Sullivan, P. Topiwala, and A. Luthra, "The H.264/AVC advanced video coding standard: overview and introduction to the fidelity range extensions," in SPIE Conference on Applications of Digital Image Processing XXVII, vol. 5558, pp. 53-74, 2004.
[31] K. R. Rao and P. Yip, Discrete Cosine Transform, Academic Press, 1990.
[32] Y. Huh, K. Panusopone and K. R. Rao, "Variable block size coding of images with hybrid quantization," IEEE Trans. Circuits and Systems for Video Technology, vol. 6, pp. 679-685, Dec. 1996.
[33] J. Ribas-Corbera and D. L. Neuhoff, "Optimizing block size in motion compensation," Journal of Electronic Imaging, vol. 7, pp. 155-165, Jan. 1998.
[34] M. Wien, "Variable block size transforms for H.264/AVC," IEEE Trans. Circuits and Systems for Video Technology, vol. 13, pp. 604-613, July 2003.
[35] M. Ravassi, M. Mattavelli and C. Clerc, "JVT/H.26L decoder complexity analysis," Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG, doc. JVT-D153, Klagenfurt, Austria, 22-26 July 2002 (available via anonymous ftp from ftp://ftp.imtc-files.org/jvt-experts/).
[36] M.-T. Sun, T.-D. Wu and J.-N. Hwang, "Dynamic bit allocation in video combining for multipoint conferencing," IEEE Trans. Circuits and Syst. II, vol. 45, no. 5, pp. 644-648, May 1998.
[37] O. Werner, "Re-quantization for transcoding of MPEG-2 intra frames," IEEE Trans. Image Processing, vol. 8, no. 2, pp. 179-191, Feb. 1999.
[38] A. Luthra, G. Sullivan and T. Wiegand, "Introduction to the special issue on the H.264/AVC video coding standard," IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, issue 7, pp. 557-559, July 2003.
[39] Repository for freely-redistributable test sequences, media.xiph.org.
[40] D. Marpe, T. Wiegand and G. Sullivan, "The H.264/MPEG4 advanced video coding standard and its applications," IEEE Communications Magazine, vol. 44, no. 8, pp. 134-143, Aug. 2006.
[41] ITU-T and ISO/IEC JTC 1, Advanced Video Coding for Generic Audiovisual Services, ITU-T Rec. H.264 & ISO/IEC 14496-10, Version 1, May 2003; Version 2, Jan. 2004; Version 3 (with High family of profiles), Sept. 2004; Version 4, July 2005 [Online]. Available: http://www.itu.int/rec/T-REC-H.264.
[42] G. J. Sullivan, "The H.264 / MPEG4-AVC video coding standard and its deployment status," Proc. SPIE Conf. Visual Communications and Image Processing (VCIP), Beijing, China, July 2005.
[43] T. Wiegand et al., "Rate-constrained coder control and comparison of video coding standards," IEEE Trans. Circuits Systems Video Technol., vol. 13, no. 7, pp. 688-703, July 2003.
[44] A. Puri, X. Chen and A. Luthra, "Video coding using the H.264/MPEG-4 AVC compression standard," Signal Processing: Image Communication, vol. 19, pp. 793-849, Oct. 2004.
[45] T. Stockhammer, D. Kontopodis, and T. Wiegand, "Rate-distortion optimization for H.26L video coding in packet loss environment," in Proc. Packet Video Workshop 2002, Pittsburgh, PA, April 2002.
[46] K. R. Rao and J. J. Hwang, Techniques and standards for digital image/video/audio coding, Englewood Cliffs, NJ: Prentice Hall, 1996.

Thank You
