[Lecture Notes in Electrical Engineering] Ubiquitous Information Technologies and Applications Volume 214 || Fast Coding Algorithm for High Efficient Video Coding (HEVC)

Fast Coding Algorithm for High Efficient Video Coding (HEVC)

Jong-Hyeok Lee, Chang-Ki Kim, Jeong-Bae Lee and Byung-Gyu Kim

Abstract The JCT-VC is developing the next-generation video coding standard, called High Efficiency Video Coding (HEVC). In HEVC, the block structure has three units: the coding unit (CU), the prediction unit (PU), and the transform unit (TU). The CU is the basic unit of region splitting, like the macroblock. A CU is split recursively into four blocks of equal size, starting from the treeblock. We use the mode information of the current CU to avoid unnecessary CU splitting. To reduce the computational complexity further, we propose a depth range selection mechanism (DRSM) that selects an adaptive depth range. The DRSM makes splitting effective by processing only the CUs that need to be processed. Experimental results show that the proposed algorithm reduces encoding time by about 48 % for RandomAccess HE compared to the HEVC test model (HM) 6.0 encoder, with a BD-bitrate loss of 1.2 %.

Keywords HEVC · Video coding · Motion estimation · Motion detection

J.-H. Lee (✉) · C.-K. Kim · J.-B. Lee · B.-G. Kim
Department of Computer Engineering, SunMoon University, A-san, Republic of Korea
e-mail: [email protected]

C.-K. Kim
e-mail: [email protected]

J.-B. Lee
e-mail: [email protected]

B.-G. Kim
e-mail: [email protected]

Y.-H. Han et al. (eds.), Ubiquitous Information Technologies and Applications, Lecture Notes in Electrical Engineering 214, DOI: 10.1007/978-94-007-5857-5_31, © Springer Science+Business Media Dordrecht 2013


1 Introduction

In the past, the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) developed the H.264/MPEG-4 Advanced Video Coding (H.264/AVC) standard, which significantly improved compression performance over prior standards [1].

Although H.264/AVC has significant compression performance, users of video coding technology always want faster video services with higher quality. Maintaining both high speed and high quality is important, but there is a trade-off between them. This matters for the newest broadcasting services and for high-resolution video sequences (e.g., 1080@60 Hz Full High Definition (HD), including 2D-to-3D stereoscopic content, and Ultra Definition video content).

Therefore, to improve video coding efficiency, ISO/IEC MPEG and ITU-T VCEG recently formed the Joint Collaborative Team on Video Coding (JCT-VC). The JCT-VC is developing the next-generation video coding standard, called High Efficiency Video Coding (HEVC). The JCT-VC has held its 7th meeting to date. The compression performance of HEVC is typically 40–45 % better than that of H.264/AVC anchors in terms of gross BD-rate based on PSNR. This means that HEVC achieves better compression efficiency than H.264/AVC for high-resolution video.

For HEVC, several contributions achieve a high speed-up with negligible BD-bitrate loss. In [2], the coded block flag (CBF) was used to reduce complexity by early termination of CU encoding: the algorithm skips the encoding of the next PU of a CU if the CBF of the CU is zero (cbf = 0) for luma and chroma (luma, u, v). Choi et al. [3] proposed a tree-pruning algorithm that terminates CU processing early. To reduce computational complexity, it uses the mode information of the current CU: if the best prediction mode of the current CU is SKIP, the sub-tree computations are skipped. Like [2, 3], Yang et al. [4] proposed an early skip detection algorithm. Their motivation was the early skip conditions of H.264/AVC, of which there are three: first, the best reference picture is the previous one; second, the motion vector difference (MVD) is (0, 0); third, the coded block pattern (CBP) is zero. Their algorithm uses two conditions and changes the processing order of the PU mode search. The first condition is that the MVD is (0, 0); the second is that the CBF is 0 for luma and chroma.

In [5], an adaptive coding unit early termination algorithm based on an early SKIP detection technique was proposed. Three tests were performed to find the statistical characteristics of SKIP mode, and they showed that the current CU and its neighboring CUs are highly correlated. Based on these correlations, [5] proposes a method that adaptively adjusts a weighting factor. In [6], conditional probabilities were estimated for the optimal intra direction of the current block, and from this calculation a most probable mode (MPM) was defined from the neighboring blocks. Leng et al. showed that a large CU is very efficient for high-resolution, slow-motion, or large-QP video sequences [7].


We propose an effective coding unit selection algorithm for HEVC based on an adaptive depth range selection mechanism. The proposed algorithm terminates CU processing early by exploiting the correlation between CUs and between depths, which is similar to block correlation. We also determine an adaptive depth range for CU processing. Section 2 describes the proposed method. Experimental results are shown in Sect. 3, and concluding remarks are given in Sect. 4.

2 Proposed Fast Scheme

HEVC has an independent block structure with a highly flexible hierarchy based on the generic quadtree scheme. The block structure has three units: the coding unit (CU), the prediction unit (PU), and the transform unit (TU). The CU is the basic unit of region splitting, like the macroblock, but it differs from the macroblock: it has no fixed maximum size, and it allows recursive splitting into four equally sized CUs, starting from the largest coding unit (LCU), which makes it content-adaptive. The LCU is analogous to the largest macroblock in previous standards such as H.264/AVC. The maximum allowed size of the luma block in an LCU is 64 × 64. A CU can therefore be as large as the LCU or as small as 8 × 8.

The PU is the basic unit of intra/inter prediction. Each CU may contain one or more PUs, each of which may be as large as the CU or as small as 8 × 4 or 4 × 8 in the luma block. The N × N PU type is allowed only when the corresponding CU size is greater than the smallest allowed CU size, which is called the smallest coding unit (SCU).

The TU is the basic unit of the transform and quantization processes. Each CU may contain one or more TUs, on which the transform and quantization are performed. The maximum quadtree depth is adjustable and is specified in the slice header syntax or in the configuration file.

Fig. 1 CU processing performs recursive splitting into four blocks of equal size


HEVC has very high complexity because of the sub-tree computations for each CU in the quadtree structure. This process performs recursive splitting into four blocks of equal size, from the LCU down to the SCU. Figure 1 shows the CU processing order with recursive splitting. We refer to a depth as upper or lower according to whether it lies higher or lower in the tree.
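To give a sense of the size of this search, the following minimal sketch counts the CUs that a full quadtree search visits in one LCU, assuming the 64 × 64 LCU and 8 × 8 SCU sizes stated above. The function name and constants are illustrative and are not taken from the HM software.

```python
LCU_SIZE = 64  # largest coding unit (luma), as stated in the text
SCU_SIZE = 8   # smallest coding unit (luma), as stated in the text


def cu_counts_per_depth(lcu_size=LCU_SIZE, scu_size=SCU_SIZE):
    """Return (depth, cu_size, number_of_cus) for every quadtree depth."""
    counts = []
    depth, size = 0, lcu_size
    while size >= scu_size:
        # Each split produces four equally sized CUs, so depth k holds 4**k CUs.
        counts.append((depth, size, 4 ** depth))
        size //= 2
        depth += 1
    return counts


levels = cu_counts_per_depth()
total_cus = sum(count for _, _, count in levels)
for depth, size, count in levels:
    print(f"depth {depth}: {count:2d} CUs of {size}x{size}")
print("CUs visited by a full search:", total_cus)
```

Under these assumptions a full search evaluates 1 + 4 + 16 + 64 = 85 CUs per LCU, which is the workload that early termination and depth range selection try to cut down.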

To reduce the high complexity of the quadtree structure, the proposed method uses the mode information of the CU and a depth range selection mechanism (DRSM).

2.1 Early CU Termination

First, we decide how far down the quadtree each CU has to be processed. By using the block correlation between the split CUs and the previous CU (in the upper depth), we can reduce the complexity. In the HM reference software, the key function for evaluating the various prediction modes is TEncCU::xCompressCU.

Processing each CU is highly complex because of the computations for its split sub-CUs. To reduce the encoding complexity, our method employs SKIP mode information. Figure 2 shows the probability that sub-CUs are in SKIP mode when the higher CU is in SKIP mode, measured over 100 frames of each Class D sequence. The probability is very high when the higher CU is in SKIP mode. This means that if a 32 × 32 CU in depth 1 is in SKIP mode, the 16 × 16 CUs within it are likely to be in SKIP mode as well.

The proposed method adopts the SKIP-mode termination condition of [3]. This kind of early CU termination effectively reduces the computational complexity of HEVC.
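The statistic behind this argument is a conditional probability: the share of sub-CUs coded in SKIP mode among those whose higher CU is in SKIP mode. The sketch below shows one way such a value could be estimated from logged mode decisions. The observation format and the sample data are hypothetical and do not reproduce the measurements reported here.

```python
from typing import Iterable, Tuple


def skip_given_parent_skip(observations: Iterable[Tuple[bool, bool]]) -> float:
    """Estimate P(sub-CU is SKIP | higher CU is SKIP).

    Each observation is a hypothetical (parent_is_skip, sub_cu_is_skip) pair.
    """
    parent_skip = 0
    both_skip = 0
    for parent_is_skip, sub_cu_is_skip in observations:
        if parent_is_skip:
            parent_skip += 1
            if sub_cu_is_skip:
                both_skip += 1
    if parent_skip == 0:
        raise ValueError("no observation has a SKIP parent")
    return both_skip / parent_skip


# Made-up sample: 10 sub-CUs under SKIP parents, 9 of them SKIP themselves.
sample = [(True, True)] * 9 + [(True, False)] + [(False, True)] * 5
probability = skip_given_parent_skip(sample)
print(f"P(sub-CU SKIP | parent SKIP) = {probability:.2f}")
```

Observations whose parent is not in SKIP mode are ignored, since they say nothing about the condition being tested.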

Fig. 2 The probability of SKIP mode of sub-CUs when the higher CU is in SKIP mode


2.2 Depth Range Selection Mechanism (DRSM)

The proposed algorithm can already reduce computational complexity with the early termination condition alone. To reduce the complexity further, we propose a depth range selection mechanism (DRSM), as follows:

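$$
\mathrm{Depth}_d =
\begin{cases}
d - 1, & \text{if upper depth},\\
d, & \text{if current depth},\\
d + 1, & \text{if lower depth},\\
0, & \text{if LCU (CU in root)},
\end{cases}
\tag{1}
$$

$$
CU_{di} =
\begin{cases}
CU_0, & \text{if } CU = LCU,\\
CU_{di}\ (i = 0, 1, 2, 3), & \text{otherwise},
\end{cases}
\tag{2}
$$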

In the above equations, d and i are the depth level and the processing order, respectively. If the current depth is d, the upper depth and the lower depth are d − 1 and d + 1, respectively, according to the depth order (like the relation between parent and child in a tree structure). The LCU (i.e., the root CU) is denoted CU_0. To make the process effective, we studied depth correlation in the same way as block correlation. Depths d − 1 and d + 1 also have some spatial correlation, as shown in Fig. 1. In the CU processing order, the LCU is always checked first. We therefore do not use the LCU to learn properties of the frame, such as whether it is stationary. In other words, processing the LCU is necessary for complexity reduction: if the LCU is determined to be in SKIP mode and the CU processing is then terminated, a large amount of computation is saved because the CUs under the LCU are not estimated.

Therefore, we consider CU_{1i} (i = 0, 1, 2, 3) in depth 1 under the root CU. After the LCU has been processed, the CU process handles the CUs in depth 1 one at a time. These CUs under the LCU are correlated, similarly to block correlation. Figure 3 shows the SKIP probability of the upper and lower depths when the current depth is in SKIP mode. As Fig. 3 shows, CUs in upper and lower depths are strongly correlated, like a large macroblock and its sub-macroblocks. For CUs of the same size (in the same depth) and their sub-CUs (CUs in the lower depth), the SKIP probability is high, while that of the upper depth is less than those of the same depth and the lower depth.

Based on this analysis, the proposed method checks at which depth SKIP mode occurs. The depth level with SKIP mode, found among CU_{10} to CU_{13}, selects an adaptive depth range. If depth d is selected, the proposed method processes the three depths d − 1, d, and d + 1; all other depths are skipped.
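The depth range selection can be illustrated with a small helper that maps a SKIP depth d to the range of depths still to be processed. This is a minimal sketch that assumes depths 0 to 3 (64 × 64 LCU down to 8 × 8 SCU) and clamps the range at the tree boundaries. The clamping and the function name are assumptions made for illustration, not details given in the paper.

```python
from typing import Optional, Tuple

MIN_DEPTH = 0  # LCU (64x64)
MAX_DEPTH = 3  # SCU (8x8)


def depth_range(skip_depth: Optional[int]) -> Tuple[int, int]:
    """Return the inclusive (lowest, highest) depths to process.

    skip_depth is the depth d at which SKIP mode was found, or None if no
    SKIP depth has been determined yet (all depths are then processed).
    """
    if skip_depth is None:
        return MIN_DEPTH, MAX_DEPTH
    # Keep d - 1, d and d + 1, clamped to the depths that exist in the tree.
    return max(MIN_DEPTH, skip_depth - 1), min(MAX_DEPTH, skip_depth + 1)


for d in (None, 0, 1, 2, 3):
    print(d, depth_range(d))
```

For example, a SKIP depth of 1 keeps depths 0 to 2, so the 8 × 8 CUs at depth 3 are never estimated.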


2.3 The Overall Procedure

Fig. 3 Probability of SKIP mode for the higher CU and the sub-CUs when the current split CU is in SKIP mode

Fig. 4 The overall flow of the proposed algorithm

As shown in Fig. 4, the overall procedure is based on early CU termination and the depth range selection mechanism (DRSM), as follows:

(1) For the current CU, check whether SKIP mode is detected once the processing of the current CU has finished. This search proceeds sequentially from the LCU down to the SCU until SKIP mode is found. If it is found, go to Step (2). Otherwise, split the CU and repeat the search. (If no SKIP mode is found, this amounts to a full CU search.)

(2) (Check the depth range) Check the depth range before estimating the LCU or a split CU within the LCU. If a SKIP depth d has been determined by previous processing, go to Step (3). Otherwise, go to Step (4).

(3) Compare the depth of the split CU with the SKIP depth d. If the split CU lies in a depth above d − 1 or below d + 1, skip the current depth and split the CU by going to Step (1). Otherwise, go to Step (4).

(4) Check whether the mode of the current CU is SKIP. If it is, go to Step (5). Otherwise, continue with the next CU.

(5) (Decision on SKIP depth d and early CU termination) If the SKIP depth d has not been determined yet, set it to the depth of the CU currently being estimated. Further processing of this CU is skipped because it is in SKIP mode, and the next CU is processed by going to Step (1).
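Steps (1) to (5) can be read as a recursive quadtree traversal with two shortcuts. The sketch below is one possible interpretation rather than the HM implementation: `is_skip` is a hypothetical callback standing in for the encoder's mode decision, the SKIP depth is tracked per LCU, and CUs are identified by their path of child indices from the LCU. All of these are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

MAX_DEPTH = 3  # depth 0 is the 64x64 LCU, depth 3 the 8x8 SCU

CuIndex = Tuple[int, ...]  # path of child indices from the LCU; () is the LCU


@dataclass
class LcuState:
    """Hypothetical per-LCU bookkeeping (not an HM data structure)."""
    skip_depth: Optional[int] = None  # SKIP depth d, unset until a SKIP CU is found
    evaluated: List[CuIndex] = field(default_factory=list)


def process_cu(index: CuIndex,
               is_skip: Callable[[CuIndex], bool],
               state: LcuState) -> None:
    depth = len(index)
    # Steps (2)-(3): once d is known, only depths d - 1 .. d + 1 are estimated.
    in_range = (state.skip_depth is None
                or state.skip_depth - 1 <= depth <= state.skip_depth + 1)
    if in_range:
        state.evaluated.append(index)
        # Step (4): is the best mode of this CU SKIP?
        if is_skip(index):
            # Step (5): fix d on the first SKIP CU, then prune the sub-tree.
            if state.skip_depth is None:
                state.skip_depth = depth
            return
    # Step (1): no SKIP here (or this depth was skipped), so split into four.
    if depth < MAX_DEPTH:
        for i in range(4):
            process_cu(index + (i,), is_skip, state)


# Toy run: only the second depth-1 CU is in SKIP mode.
state = LcuState()
process_cu((), lambda idx: idx == (1,), state)
print(state.skip_depth, len(state.evaluated))  # prints "1 33"
```

In this toy run the first depth-1 CU is searched in full because no SKIP depth is known yet. After the second one sets d = 1, the remaining two are searched only down to depth 2, so 33 CUs are evaluated where a full search would evaluate 85.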

3 Experimental Results

The proposed algorithm was implemented on HM 6.0 (the HEVC reference software). The test condition was random access, main (RA-Main). The standard sequences of Classes A to F were used with all frames and with various QP values (22, 27, 32, 37). The details of the encoding environment are the same as in JCTVC-H1100 [8].

Figure 5 shows the RD performance. The RD curves of the proposed method are very similar to those of the original HM 6.0 software, with negligible loss of quality and BD-rate. This means that the proposed algorithm maintains reliable video quality while speeding up the HM encoder by about 48 %.

Fig. 5 Rate-distortion (RD) curves for (a) BQTerrace (Class B) and (b) BQSquare (Class D) sequences in the random access, main condition


Table 1 compares JCTVC-F092 [3] with the proposed algorithm. ΔB is the change in total bit rate (in percent), ΔY-PSNR is the change in Y-PSNR (in dB), and ΔT is the time-saving factor (in percent). "+" means an increase and "−" a decrease. All measurements are based on a total of 50 frames of each test sequence in all classes. From the results in Table 1, the proposed algorithm achieves, on average, a 45.06 % time saving with only a 0.05 dB loss in Y-PSNR and a 0.09 % decrease in total bit rate. JCTVC-F092 [3], a contribution to the HEVC standardization, achieves about 40.62 % encoding time saving, a 0.04 dB Y-PSNR loss, and a 0.70 % total bit rate reduction.

Although the bit rate is slightly higher than that of the original reference software (HM 6.0) and JCTVC-F092 [3] for Nebuta, BasketballDrive, and SlideEditing at some QP values, our algorithm reduces complexity by about 10 % to at most 13 % more than JCTVC-F092 [3], with similar Y-PSNR, for RaceHorses (Classes C and D), BlowingBubbles (Class D), and ChinaSpeed (Class F) at some QP values. Table 2 shows the overall performance of the proposed fast algorithm. It achieves a time-saving factor of about 48 % on average with an average BD-rate loss of 1.2 % for the Y component. JCTVC-F092 [3] achieves a time-saving factor of about 44 % with a BD-rate loss of 0.4 % for the Y component. These results show that the proposed algorithm speeds up the HM encoder with a negligible loss of quality and BD-rate compared with the other method [3].

Table 1 Experimental results for JCTVC-F092 [3] and the proposed algorithm on HM 6.0 (random access, main)

                 JCTVC-F092 [3]                      Proposed algorithm
                 ΔB (%)   ΔY-PSNR (dB)   ΔT (%)      ΔB (%)   ΔY-PSNR (dB)   ΔT (%)
Class A          −0.74    −0.040         −37.85      −0.15    −0.042         −42.48
Class B          −0.78    −0.035         −47.03      −0.01    −0.044         −49.90
Class C          −0.77    −0.054         −31.00      −0.03    −0.055         −36.19
Class D          −0.74    −0.053         −32.56      −0.31    −0.054         −39.62
Class F          −0.53    −0.033         −56.77      −0.12    −0.032         −59.33
Total average    −0.71    −0.043         −41.04      −0.12    −0.045         −45.50

Table 2 Overall performance of JCTVC-F092 [3] and the proposed algorithm (random access, main; BD-rate in %)

                 JCTVC-F092 [3]           Proposed algorithm
                 Y       U       V        Y       U       V
Class A          0.4     −0.6    −0.5     1.2     0.6     0.6
Class B          0.6     −0.4    −0.3     1.7     1.3     1.2
Class C          0.5     0.1     0.1      1.2     1.1     1.3
Class D          0.5     −0.5    −0.1     0.8     0.9     0.6
Class F          0.2     −0.1    −0.1     0.8     0.9     0.6
Overall          0.4     −0.3    −0.2     1.2     0.8     0.8
Enc time (%)     56                       52
Dec time (%)     99                       99

The time-saving factors in Tables 1 and 2 differ because they were calculated differently. In the JCT-VC evaluation, the encoding time (enc time) is averaged with the geometric mean, whereas the average time in Table 1 is an arithmetic mean. With the same measurement, however, the proposed algorithm achieved a speed-up factor of up to 13 % compared with JCTVC-F092 [3]. This means that the proposed algorithm yields a faster encoding system than [3].
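The effect of the averaging method can be checked with the class-level ΔT values of the proposed algorithm from Table 1. The sketch below is only an illustration: it averages the five class rows, whereas the figures in Table 2 are presumably derived from per-sequence encoding times, so it is not expected to reproduce the 52 % encoding time reported there.

```python
from statistics import geometric_mean, mean

# Time savings (%) of the proposed algorithm for Classes A, B, C, D, F (Table 1).
proposed_saving = [42.48, 49.90, 36.19, 39.62, 59.33]

# Arithmetic mean of the savings, as used for the "Total average" row.
arith_saving = mean(proposed_saving)

# Geometric mean is taken over encoding-time ratios (proposed / reference).
time_ratios = [1 - saving / 100 for saving in proposed_saving]
geo_ratio = geometric_mean(time_ratios)
geo_saving = (1 - geo_ratio) * 100

print(f"arithmetic-mean saving: {arith_saving:.2f} %")
print(f"geometric-mean saving:  {geo_saving:.2f} %")
```

The arithmetic mean reproduces the 45.50 % in the "Total average" row. Because the geometric mean of the time ratios can never exceed their arithmetic mean, the geometric-mean saving comes out at least as large, which matches the direction of the gap between the two tables.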

4 Conclusion

We have proposed a fast scheme that uses the mode information of the current CU and a depth range selection mechanism (DRSM). To reduce the encoding complexity, our method employs SKIP mode information and an analysis of the depth level at which SKIP mode occurs in the first tree. The designed DRSM makes splitting effective by processing only the CUs that need to be processed. Experiments verified a speed-up factor of 36–59 % with very small loss of quality.

Acknowledgments The research was supported by the Korea Science and Engineering Foundation (KOSEF) grant funded by the Korea government (MEST), under Grant NRF-2010-0024786.

References

1. Wiegand, T., Sullivan, G.J.: The H.264/AVC video coding standard. IEEE Signal Process. Mag. 24(2), 148–153 (2007)

2. Lim, J.Y., Lee, Y.-L.: Early termination of CU encoding to reduce HEVC complexity. JCTVC-F045, JCT-VC document (2011)

3. Jang, E.S., Choi, K., Park, S.-H.: Coding tree pruning based CU early termination. JCTVC-F092, JCT-VC document (2011)

4. Won, K., Lee, H., Jeon, B., Yang, J., Kim, J.: Early skip detection for HEVC. JCTVC-G543, JCT-VC document (2011)

5. Kim, J., Jeong, S., Cho, S., Choi, J.S.: Adaptive coding unit early termination algorithm for HEVC. In: International Conference on Consumer Electronics (ICCE). Las Vegas, Jan 2012

6. Zhao, L., Zhang, L., Ma, S., Zhao, D.: Fast mode decision algorithm for intra prediction in HEVC. In: International Conference on Visual Communications and Image Processing (VCIP) (2011)

7. Leng, J., Lei, S., Ikenaga, T., Sakaida, S.: Content based hierarchical fast coding unit decision algorithm for HEVC. In: International Conference on Multimedia and Signal Processing (2011)

8. Bossen, F.: Common test conditions and software reference configurations. JCTVC-E700, JCT-VC document, Jan 2011

