[IEEE 2012 IEEE 16th International Symposium on Consumer Electronics - (ISCE 2012) - Harrisburg, PA, USA (2012.06.4-2012.06.6)] 2012 IEEE 16th International Symposium on Consumer Electronics

Abstract—In this paper, we propose a coding unit (CU) depth-based adaptive loop filter (ALF) decision method to reduce encoder complexity. In high efficiency video coding (HEVC), ALF is designed as a Wiener filter that minimizes the distortion between the original frame and the coded frame. To design the optimal filter for each CU, the filter design process is repeated 12 times per CU. This repetition accounts for about 77 percent of the total ALF computation time, which substantially increases encoder complexity. The proposed method simplifies the filter design with a CU depth-based ALF decision, which is derived from the observation that many of these computations are ineffective. Experimental results show that the proposed method reduces the encoding time of the repeated filter design process by about 13 percent relative to the HEVC test model (i.e., HM 5.0), with a 0.1 percent BD-rate increase.

I. INTRODUCTION

High efficiency video coding (HEVC) is the latest video coding standard and yields the best compression performance among existing video coding standards [1]. Compared to previous video coding standards such as MPEG-4 Part 2 and MPEG-4 Part 10 AVC/H.264, HEVC improves coding efficiency by about 50% [2]. To achieve this higher coding efficiency, HEVC introduces a new coding structure (e.g., coding unit (CU), prediction unit (PU), and transform unit (TU)) and efficient tools such as the adaptive loop filter (ALF), sample-adaptive offset (SAO), and short-distance intra prediction (SDIP). However, the use of these coding tools dramatically increases the overall computational complexity at the encoder side [3]. ALF is typically regarded as one of the most computationally intensive coding tools. For this reason, many researchers have tried to reduce the computational complexity of ALF.

ALF is introduced in HEVC for higher coding gain and better visual quality by minimizing the distortion between the original frame and the coded frame [4]-[7]. Nevertheless, the large number of additions and multiplications required for the filtering operation is burdensome, especially in intra-frame coding. For example, we found that ALF takes about 22 percent of the encoding time in the intra-only configuration of the HEVC test model (i.e., HM 5.0).

To reduce the complexity of ALF, the joint collaborative team on video coding (JCT-VC) has explored reducing the number of encoding passes, reducing the filter size and shape, and efficient mode classification for frame-level ALF [8]-[10]. In [8], one-pass and two-pass encoding algorithms are studied in place of the conventional filtering passes. By deriving the current filtering pass and filtering distortion from previously coded filter information, the method decreases the coding time by about four percent with a marginal loss of coding gain. In [9], a 5x5 filter shape with nine coefficients and an 11x5 filter shape with 8 coefficients are presented to reduce ALF computational complexity; both have far fewer coefficients than the conventional approach. In [10], a low-complexity adaptation of the block-based filter is introduced for the frame-level ALF process, and the method shows a 4 percent decrease in decoding time. These methods for fast ALF filtering reduce the computational complexity effectively, although considerable scope for improvement remains.

Manuscript received March 11, 2012. This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 201100000001179), Korea, and by the Seoul R&BD Program (PA100094), Korea. S. Park, K. Choi, and E. S. Jang are with the Department of Electronics and Computer Engineering, Hanyang University, Seoul, Korea (e-mail: [email protected]).

In this paper, we propose a CU depth-based ALF decision method for fast HEVC encoding that eliminates unnecessary, overlapping procedures in the ALF encoding process. The proposed method retains only the essential encoding steps, based on an observed relationship between the CU size and the ALF control map (i.e., the ALF on/off decision map). This CU depth-based ALF decision method makes a significant reduction in the number of operations possible.

The paper is organized as follows. Section II describes the ALF encoding process and presents the proposed method. Section III gives our experimental results. Section IV presents the conclusions.

II. PROPOSED METHOD

A. ALF encoding process

ALF is designed as a Wiener filter with the goal of minimizing the mean square error caused by quantization of transform coefficients. In HEVC (i.e., HM 5.0), the ALF encoding process applies the filter coefficients to the reconstructed pixels after the sample adaptive offset or deblocking filter process. The ALF encoding process consists of two parts: frame-level ALF adaptation and CU-level ALF adaptation.
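The Wiener design goal stated above can be written compactly. The following is a sketch in notation that does not appear in the paper: s[n] is assumed to be an original sample, y[n] the vector of reconstructed samples inside the filter support, and h the vector of filter coefficients.

```latex
\begin{aligned}
\mathbf{h}^{*} &= \arg\min_{\mathbf{h}} \; E\!\left[\left(s[n] - \mathbf{h}^{T}\mathbf{y}[n]\right)^{2}\right], \\
\mathbf{R}\,\mathbf{h}^{*} &= \mathbf{p}, \qquad
\mathbf{R} = E\!\left[\mathbf{y}[n]\,\mathbf{y}[n]^{T}\right], \qquad
\mathbf{p} = E\!\left[s[n]\,\mathbf{y}[n]\right].
\end{aligned}
```

Here R plays the role of the autocorrelation of the reconstructed samples and p the cross-correlation between the original and reconstructed samples, so each filter design pass amounts to accumulating these statistics and solving one linear system.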

In frame-level ALF adaptation, the HEVC encoder evaluates the use of ALF over a whole frame. If the resulting RD cost is lower than the RD cost before ALF adaptation, the encoder signals the use of ALF and stores the minimum cost for later comparison with the CU-level ALF adaptation.

Notably, the frame-level adaptation already provides some parameters for the CU-level ALF adaptation. During frame-level ALF adaptation, the encoder selects the best filter shape between the snowflake shape and the cross shape, as well as its coefficients, for the determined ALF mode (e.g., block-based classification or region-based classification). Because these two parameters have already been evaluated, the CU-level ALF adaptation does not need to repeat the procedure that determines the filter shape and its coefficients.

CU Depth-based ALF Decision for Fast HEVC Encoding

Sanghyo Park, Kiho Choi, and Euee S. Jang, Senior Member, IEEE

978-1-4673-1356-8/12/$31.00 ©2012 IEEE

2012 ISCE 1569582395

The details of the CU-level ALF adaptation are shown in Fig. 1. The first step is to initialize the control map based on the determined CU size and the control map depth. For example, the initial block size is set to the block size indicated by the control map depth if that block size is bigger than the determined CU size. Otherwise, the determined CU size is chosen as the initial block size. The second step is to make the ALF on/off decision for each block by comparing the SSD values with and without ALF. The third step is to update the filter coefficients based on the pixel values in the regions where ALF is to be applied. By extracting the autocorrelation and the cross-correlation from the updated set of pixels, the encoder can obtain the optimal filter coefficients. This process, the so-called redesign procedure, is performed three times. Finally, the CU-level ALF adaptation concludes when the control depth reaches its maximum value (i.e., three).
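The pass structure described above can be sketched as follows. This is a minimal illustration, not HM 5.0 code: the function names, the constants, and the use of a strict "less than" for the SSD comparison are assumptions made here, and blocks are modeled as flat lists of sample values.

```python
# Hypothetical sketch of the CU-level ALF adaptation structure (not HM 5.0 code).

MAX_CONTROL_DEPTH = 3    # control depths 0..3, as described in the text
REDESIGN_ITERATIONS = 3  # the redesign procedure runs three times per depth


def ssd(block_a, block_b):
    """Sum of squared differences between two equally sized sample lists."""
    return sum((a - b) ** 2 for a, b in zip(block_a, block_b))


def alf_on(original, unfiltered, filtered):
    """Turn ALF on for a block only if filtering lowers the SSD to the original.

    Treating ties as "off" is an assumption of this sketch.
    """
    return ssd(original, filtered) < ssd(original, unfiltered)


def count_baseline_passes():
    """Count filter design passes: one per (control depth, redesign iteration)."""
    passes = 0
    for _depth in range(MAX_CONTROL_DEPTH + 1):
        for _iteration in range(REDESIGN_ITERATIONS):
            passes += 1  # one on/off decision followed by one coefficient update
    return passes
```

Four control depths times three redesign iterations give the 12 repetitions of the filter design process mentioned in the abstract.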

In HEVC, the ALF adaptation that obtains the optimal ALF filter provides better coding efficiency; however, the increased complexity of the adaptation poses a challenge for the encoder. In particular, the control depth loop and the redesign loop of the ALF adaptation constitute the critical passes that significantly increase the computational complexity. In this paper, we tackle these computationally intensive passes with a fast CU depth-based ALF decision method that eliminates the adaptation loops and decides the ALF on/off blocks efficiently in the control map.

B. CU depth-based ALF decision

To reduce the computational complexity of the CU-level ALF adaptation, we investigate a strategy that removes the adaptation loops for the control map depth and the redesign procedure. In addition, we design the proposed method to make the ALF on/off decision adaptively based on a statistical analysis.

The first strategy is to reduce the adaptation loop for the control map depth from four iterations to one. Evaluating various block sizes for the on/off control map is essential to achieving an optimal ALF on/off map. The problem is that the processing for some depths is redundant if the ALF on/off decisions of the blocks at one control map depth are the same as those at the other depths. For example, the evaluation at control depths 0, 1, and 2 (i.e., depth 0: 64x64 block size, depth 1: 32x32 block size, and depth 2: 16x16 block size) is unnecessary when most blocks in the on/off control map of the last control map depth yield decisions similar to those of the other control map depths.

In this case, the ALF on/off decision at the last control map depth can be regarded as the essential part, so the other depths can be removed. The conditional probabilities in Table I support this assumption. According to the table, the HEVC encoder makes the ALF-off decision regardless of the control map depth if ALF is turned off for most blocks at the last control map depth. For example, P(B100|A100) shows that if ALF is off for all pixels at the last depth, then, with a probability of 100 percent, it is also off for all pixels at every other depth. Based on this observation, we present a one-pass control map depth procedure, in which the loops for depths zero to two are removed.

Fig. 1. ALF control map in the encoding process.

TABLE I
THE CONDITIONAL PROBABILITY OF ALF APPLIED PIXELS BASED ON THE CONTROL MAP DEPTH

Probability     Depth 0 (%)   Depth 1 (%)   Depth 2 (%)
P(B100|A100)        100           100           100
P(B95|A95)           96            95            94
P(B90|A90)           87            84            91

A and B denote events defined on the share of pixels in each LCU to which ALF is not applied: A at the last control map depth, and B at the depth given in the column heading. P(B100|A100) is the probability that no pixel uses ALF at the given depth when no pixel uses ALF at the last control map depth. P(B95|A95) and P(B90|A90) are the corresponding probabilities when the share of pixels not using ALF is more than 95 percent and more than 90 percent, respectively. The statistics were calculated over 20 frames of the BQTerrace sequence with a quantization parameter of 27.

TABLE II
PROBABILITY OF THE NUMBER OF BLOCKS APPLIED BY ALF

CU depth   ParkScene (%)   BQTerrace (%)   BQMall (%)   PartyScene (%)   BasketballPass (%)
0               33              48             34             48                 99
1               64              74             45             63                 96
2               84              90             68             78                 98
3               84              94             72             83                 99
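A statistic of the kind reported in Table I could be estimated from per-LCU measurements along the following lines. This is only a sketch of one plausible reading of the table: the function name, the input format (one ALF-off pixel fraction per LCU and depth), and the use of "greater than or equal" for the threshold are assumptions, not the authors' procedure.

```python
def conditional_off_probability(last_depth_off, other_depth_off, threshold):
    """Estimate P(B_t | A_t) from per-LCU fractions of ALF-off pixels.

    A_t: at least `threshold` of an LCU's pixels are ALF-off at the last depth.
    B_t: the same condition holds at the other control map depth.
    Returns None when no LCU satisfies A_t.
    """
    # Keep only the LCUs for which the conditioning event A_t holds.
    conditioned = [
        other
        for last, other in zip(last_depth_off, other_depth_off)
        if last >= threshold
    ]
    if not conditioned:
        return None
    hits = sum(1 for other in conditioned if other >= threshold)
    return hits / len(conditioned)
```

With a threshold of 1.0 the function corresponds to the P(B100|A100) row, and with 0.95 or 0.90 to the other two rows, under the stated reading of the table.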

The second strategy is to remove unnecessary SSD computations. The ALF on/off decision is made by comparing SSD values. If the on/off decision can be estimated early by an efficient method, the SSD calculation can be skipped. With this motivation, we examined the ratio of ALF-applied blocks according to the CU size, as shown in Table II. Interestingly, most blocks with a CU size of 8x8 or 16x16 are determined to use CU-level ALF. Considering that ALF is likely to reduce errors easily in small block regions, this observation is unsurprising, and it gives a hint for reducing the computational complexity. Based on this strong correlation, the proposed method bypasses the SSD calculations by setting ALF on in advance for blocks whose determined CU size is 8x8 or 16x16. The evaluation for the other sizes is the same as in the conventional method.

The third strategy is to reduce the number of iterations of the redesign procedure. Because we always enable ALF on both 8x8 and 16x16 blocks and evaluate the use of ALF only on 32x32 and 64x64 blocks, the ALF control map changes little. This means that the coefficients obtained by updating the autocorrelation and cross-correlation remain similar even when the redesign procedure is repeated. Thus, the proposed method runs the redesign procedure only once. Overall, the proposed method is depicted in Fig. 2.
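The three strategies can be summarized in a short sketch. It is illustrative only and is not taken from HM 5.0 or from the authors' implementation: the names `proposed_alf_on`, `SMALL_CU_SIZES`, and `count_proposed_passes` are hypothetical, blocks are modeled as flat lists of samples, and ties in the SSD comparison are assumed to leave ALF off.

```python
# Hypothetical sketch of the CU depth-based ALF decision (not HM 5.0 code).

SMALL_CU_SIZES = {8, 16}  # CU sizes for which ALF is set on without an SSD check


def ssd(block_a, block_b):
    """Sum of squared differences between two equally sized sample lists."""
    return sum((a - b) ** 2 for a, b in zip(block_a, block_b))


def proposed_alf_on(cu_size, original=None, unfiltered=None, filtered=None):
    """Decide ALF on/off for one block under the proposed method."""
    if cu_size in SMALL_CU_SIZES:
        return True  # second strategy: bypass the SSD calculation entirely
    # 32x32 and 64x64 blocks keep the conventional SSD comparison.
    return ssd(original, filtered) < ssd(original, unfiltered)


def count_proposed_passes():
    """One control map depth (first strategy) times one redesign (third strategy)."""
    control_depth_passes = 1
    redesign_iterations = 1
    return control_depth_passes * redesign_iterations
```

Sample data are only needed for the larger CU sizes, which is where the saving in SSD computations comes from in this sketch.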

III. EXPERIMENTAL RESULTS

For the performance evaluation, we measured the total execution time of the proposed method and compared our encoding time with that of HM 5.0 [11]. The evaluation was conducted in the intra-only configuration under the common test conditions defined by JCT-VC [12]. The testing environment is specified in Table III.

Table IV shows the total encoding time of HM 5.0 and the proposed method in the intra-only configuration. In addition, the encoding times of ALF control map are presented to show how much impact the simplification of the proposed method has on the encoding process.

Our experimental results show that the total encoding time of the proposed method is reduced by about eight percent on average compared to that of HM 5.0. In addition, the encoding time of the CU-level ALF was measured; with the proposed method it is reduced to 87 percent of that of HM 5.0 on average, with negligible peak signal-to-noise ratio (PSNR) loss (i.e., 0.1 dB). The proposed method dramatically simplifies the encoding process of the CU-level ALF adaptation by eliminating the loop that searches for the maximum depth of ALF and by reducing the repetition of the redesign process.
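The Enc T comparison column of Table IV can be reproduced from the reference and tested encoding times. The sketch below recomputes three entries; the helper name `percent_change` and the dictionary layout are choices made for this illustration, while the times are copied from Table IV.

```python
def percent_change(reference, tested):
    """Relative change of `tested` against `reference`, in percent."""
    return (tested - reference) / reference * 100.0


# (sequence, QP) -> (reference Enc T [s], tested Enc T [s]), copied from Table IV
ENC_TIMES = {
    ("ParkScene", 22): (4745.78, 4446.69),
    ("ParkScene", 37): (3706.88, 3109.13),
    ("BQTerrace", 22): (10804.11, 11580.27),
}

# Negative values mean the proposed method encodes faster than HM 5.0.
ENC_TIME_CHANGE = {
    key: percent_change(reference, tested)
    for key, (reference, tested) in ENC_TIMES.items()
}
```

The sign convention matters when reading the table: a few QP 22 entries, such as BQTerrace, are positive, meaning the measured encoding time increased for those runs.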

IV. CONCLUSION

ALF can achieve good coding efficiency in HEVC, but at the cost of heavy encoding complexity, especially in the intra-only case. Our proposed algorithm targets the most costly part of ALF in the encoding process, the CU-level ALF adaptation, and reduces it from 12 passes to 1 by means of a CU depth-based decision. We suggest that this algorithm be used for fast HEVC encoding in the intra-only case.

Fig. 2. The proposed method of ALF control map in the encoding process.

TABLE III
TEST CONDITIONS AND SOFTWARE REFERENCE CONFIGURATIONS

Test sequences             Class B (1080p): ParkScene, BQTerrace
                           Class C (WVGA): BQMall, PartyScene
                           Class D (WQVGA): BasketballPass, BQSquare
Total frames to be coded   10 seconds of video duration
Software                   HM 5.0
Quantization parameter     22, 27, 32 and 37
Others                     Intra-only configuration, high-efficiency setting


REFERENCES

[1] G. Sullivan and J.-R. Ohm, "Recent developments in standardization of high efficiency video coding (HEVC)," Proc. of SPIE, vol. 7798, pp. 7798-30, Aug. 2010.

[2] T. Wiegand, J.-R. Ohm, G. J. Sullivan, W.-J. Han, R. Joshi, T. K. Tan, and K. Ugur, "Special Section on the Joint Call for Proposals on High Efficiency Video Coding (HEVC) Standardization," IEEE Trans. Circuits Syst. Video Tech., vol. 20, no. 12, Dec. 2010.

[3] K. Choi and E. S. Jang, "Fast coding unit decision method based on coding tree pruning for high efficiency video coding," SPIE Opt. Eng., vol. 51, Mar. 2012.

[4] K. McCann, et al., Samsung's Response to the Call for Proposals on Video Compression Technology, document JCTVC-A124, Joint Collaborative Team on Video Coding (JCT-VC), Dresden, Germany, Apr. 2010.

[5] T. Chujoh, A. Tanizawa, and T. Yamakage, Description of video coding technology proposal by TOSHIBA, document JCTVC-A117, Joint Collaborative Team on Video Coding (JCT-VC), Dresden, Germany, Apr. 2010.

[6] Y.-W. Huang, et al., A Technical Description of MediaTek's Proposal to the JCT-VC CfP, document JCTVC-A109, Joint Collaborative Team on Video Coding (JCT-VC), Dresden, Germany, Apr. 2010.

[7] M. Karczewicz, et al., Video coding technology proposal by Qualcomm Inc., document JCTVC-A121, Joint Collaborative Team on Video Coding (JCT-VC), Dresden, Germany, Apr. 2010.

[8] I. S. Chong, et al., CE8 subset 0: Improved ALF N pass encoding, document JCTVC-G1023, Joint Collaborative Team on Video Coding (JCT-VC), Geneva, Switzerland, Nov. 2011.

[9] W. Lai, F. C. A. Fernandes, and I.-K. Kim, CE8 Subtest 1: Block-based filter adaptation with features on subtest of pixels, document JCTVC-F301, Joint Collaborative Team on Video Coding (JCT-VC), Torino, Italy, Jul. 2011.

[10] W. Lai, F. C. A. Fernandes, H. Guermazi, F. Kossentini, and M. Horowitz, CE8 Subtest 4: ALF using vertical-size 5 filters with up to 9 coefficients, document JCTVC-F303, Joint Collaborative Team on Video Coding (JCT-VC), Torino, Italy, Jul. 2011.

[11] High Efficiency Video Coding Test Model software 5.0, https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware.

[12] F. Bossen, Common test conditions and software reference configurations, document JCTVC-G1200, Joint Collaborative Team on Video Coding (JCT-VC), Geneva, Switzerland, Nov. 2011.

TABLE IV
RESULTS OF THE COMPARISON BETWEEN HM 5.0 AND THE PROPOSED METHOD WITH RESPECT TO TOTAL ENCODING TIME

                                      Reference                   Tested                      Comparison
Class            Sequence        QP    Bitrate Y PSNR     Enc T    Bitrate Y PSNR     Enc T  Bitrate  Y PSNR   Enc T
                                        (kbps)   (dB)       [s]     (kbps)   (dB)       [s]      (%)    (dB)     [%]
Class B (1080p)  ParkScene       22   53564.89  41.85   4745.78   53413.89  41.82   4446.69    -0.28   -0.03   -6.30
                                 27   28877.10  38.79   4295.17   28864.78  38.78   3797.59    -0.04   -0.01  -11.58
                                 32   14970.93  35.77   3867.95   14967.25  35.77   3334.18    -0.02    0.00  -13.80
                                 37    7361.16  32.91   3706.88    7361.16  32.91   3109.13     0.00    0.00  -16.13
                 BQTerrace       22  184831.07  42.57  10804.11  184648.27  42.56  11580.27    -0.10   -0.02    7.18
                                 27   81047.17  37.18  11193.44   80994.74  37.17  10067.33    -0.06    0.00  -10.06
                                 32   41207.63  34.65   9842.99   41193.16  34.65   8574.11    -0.04    0.00  -12.89
                                 37   22180.71  32.26   9213.00   22175.19  32.26   8036.14    -0.02    0.00  -12.77
Class C (WVGA)   BQMall          22   23725.59  42.02   2205.78   23685.34  42.00   2140.00    -0.17   -0.02   -2.98
                                 27   14154.10  39.16   2027.04   14132.06  39.15   1881.92    -0.16   -0.02   -7.16
                                 32    8323.95  36.14   1917.70    8313.08  36.13   1622.54    -0.13   -0.01  -15.39
                                 37    4745.49  33.06   1858.45    4742.31  33.06   1577.05    -0.07    0.00  -15.14
                 PartyScene      22   44444.33  41.16   2163.18   44438.52  41.16   2167.27    -0.01    0.00    0.19
                                 27   27535.21  36.97   2084.07   27509.79  36.96   1929.66    -0.09   -0.01   -7.41
                                 32   16427.43  33.19   1914.37   16417.98  33.19   1628.28    -0.06   -0.01  -14.94
                                 37    8922.66  29.51   1685.35    8921.58  29.51   1501.96    -0.01    0.00  -10.88
Class D (WQVGA)  BasketballPass  22    5380.99  43.10    437.77    5365.70  43.06    426.86    -0.28   -0.04   -2.49
                                 27    3200.12  39.53    420.78    3194.16  39.52    391.38    -0.19   -0.02   -6.99
                                 32    1827.66  36.13    393.11    1826.62  36.12    349.04    -0.06   -0.01  -11.21
                                 37    1018.11  32.94    375.46    1017.66  32.93    314.53    -0.04    0.00  -16.23
                 BQSquare        22   13252.70  41.26    611.24   13249.68  41.26    621.83    -0.02    0.00    1.73
                                 27    8426.48  36.89    585.39    8413.75  36.88    544.33    -0.15   -0.01   -7.01
                                 32    5182.26  33.14    550.10    5181.48  33.14    491.76    -0.02    0.00  -10.61
                                 37    3073.70  29.66    503.62    3073.51  29.66    439.97    -0.01    0.00  -12.64
All                                   25986.73  36.66   3225.11   25962.57  36.65   2957.24    -0.09   -0.01   -8.31
