1. Objective Of The Project - Ut Arlington · Web viewAMVP- Advance motion vector prediction. AP- Above Predictor. ARP- Above Right Predictor. B-frame- Bi-predictive frame. BMA -

Department of Electrical Engineering

The University of Texas at Arlington

A Project Proposal on

Early termination for TZSearch in HEVC motion estimation.

Under the guidance of Dr. K. R. Rao

For the fulfillment of the course Multimedia Processing (EE5359)Spring 2016

Submitted by

Rajath Shivananda (1001096626)

1

TABLE OF CONTENTS

1. Objective of the project.....................................................................................................................8

2. H.265 / High Efficiency Video Coding.................................................................................................8

2.1 Introduction.................................................................................................................................8

2.2 Encoder and Decoder in HEVC...................................................................................................10

2.3 Features of HEVC.......................................................................................................................13

2.3.1 Coding tree units and coding tree block (CTB) structure:.......................................................14

2.3.2 Coding units (CUs) and coding blocks (CBs):...........................................................................14

2.3.4 TUs and transform blocks:......................................................................................................16

3. Inter Picture Prediction in HEVC......................................................................................................16

3.1 Introduction...............................................................................................................................16

3.2 Motion Vector Prediction...........................................................................................................16

3.3 Advanced Motion Vector Prediction Process.............................................................................17

3.4 Merge Mode..............................................................................................................................18

3.5 Motion Compensation...............................................................................................................18

3.6 Fractional Sample Interpolation.................................................................................................21

3.7 Weighted Sample Prediction......................................................................................................21

4. Block Matching [22][23]...................................................................................................................22

4.1 Motion Estimation Algorithms [22] [23]....................................................................................23

4.2 Full Search Algorithm.................................................................................................................23

4.3 TZSearch Algorithm [25]............................................................................................................24

5. Proposed Algorithm [16]..................................................................................................................26

5.1 Introduction [16]........................................................................................................................26

5.2 Median Predictors [16]..............................................................................................................26

6. Configuration Profile........................................................................................................................35

6.1 Introduction...............................................................................................................................35

6.1.1 Low Delay................................................................................................................................35

6.1.2 Random Access.......................................................................................................................36

6.1.3 Custom Profile (Random Access Early)....................................................................................37

7. Test Sequences................................................................................................................................38

8. Experimental Results.......................................................................................................................40

2

8.1 Test Conditions..........................................................................................................................40

8.2 Comparison of low delay, random access and random access early for different QP values and different video sequences................................................................................................................41

Conclusion: -........................................................................................................................................44

REFERENCES.........................................................................................................................................45

3

Acknowledgement

I would like to acknowledge Dr. K.R.Rao for his continuous support and guidance during the course of the project. We thank him for providing necessary feedback and dedicating his precious time in reviewing the reports and presentation slides at each step.

I would also like extend our gratitude towards Mr. Tuan Ho for helping us understand Inter prediction and addressing other issues faced during the course of the project without which the project would not have been successful.

4

List of Acronyms and Abbreviations

AVC- Advanced Video Coding.

AMVP- Advance motion vector prediction.

AP- Above Predictor.

ARP- Above Right Predictor.

B-frame- Bi-predictive frame.

BMA - Block Matching Algorithm.

CABAC- Context Adaptive Binary Arithmetic Coding.

CTB- Coding Tree Block.

CTU- Coding Tree Unit.

CU- Coding Unit.

CB- Coding Block

DCT- Discrete Cosine Transform.

GOP – Group of Pictures

HDTV- High Definition Television.

HEVC- High Efficiency Video Coding.

HM- HEVC Test Model.

I-frame- Intra-coded frame.

ICASSP- International Conference on Acoustics, Speech and Signal Processing.

JCT- Joint Collaborative Team.

JCT-VC- Joint Collaborative Team on Video Coding.

JM- H.264 Test Model.

JPEG- Joint Photographic Experts Group.

KBPS - Kilo Bits Per Second.

LCU - Large Coding Unit.

LDSP- Large Diamond Search Pattern

LP – Left Predictor.

5

MV- Motion Vector.

MP - Median Predictor.

MC - Motion Compensation.

ME - Motion Estimation.

MPEG - Motion Picture Experts Group.

P-frame: Predicted frame.

PC- Prediction Chunking.

PU- Prediction Unit.

PB- Prediction Block.

PSNR- Peak Signal to Noise Ratio.

SAD- Sum of absolute Difference.

SCU- Small coding unit.

SSD -Sum of Squared Differences.

QP: Quantization Parameter.

RD: Rate Distortion

TB: Transform Block.

TU: Transform Unit.

TZSearch: Test Zone Search

6

Abstract [14]

The TZSearch algorithm was adopted in the high efficiency video coding reference software HM as a fast Motion Estimation (ME) algorithm for its excellent performance in reducing ME time and maintaining a comparable Rate Distortion (RD) performance. However, the multiple initial search point decision and the hybrid block matching search contribute a relatively high computational complexity to TZSearch. Based on the statistical analysis of the probability of median predictor to be selected as the final best point in the large Coding Units (CUs) (64x64, 32x32) and small CUs (16x16, 8x8) as well as the center-biased characteristic of the final best search point in ME process two early terminations for TZSearch are proposed. Experimental results shows that 38.96% encoding time is saved, while the RD performance degradation is quite acceptable [16].

7

1. Objective of the project

In this project, TZSearch is used as the best block matching algorithm compared to full search algorithm. Proposed algorithm is followed to terminate TZSearch algorithm to reduce the computational time by 38.96 % [16]. Median predictors are used as the best initial search point which is about 67.41% (Average). Further Experiment is conducted using 3 configuration profiles which are random access, low delay and custom configuration profile. The results for these different configuration profiles are compared and the best configuration profile is selected. Lastly, different video sequences are used and their PSNR and Bitrate are compared.

2. H.265 / High Efficiency Video Coding

2.1 Introduction

High Efficiency Video Coding (HEVC) [2] is an international standard for video compression developed by a working group of ISO/IEC MPEG (Moving Picture Experts Group) and ITU-T VCEG (Video Coding Experts Group). The main goal of HEVC standard is to significantly improve compression performance compared to existing standards (such as H.264/Advanced Video Coding [2]) in the range of 50% bit rate reduction at similar visual quality [4].

HEVC is designed to address existing applications of H.264/MPEG-4 AVC and to focus on two key issues: increased video resolution and increased use of parallel processing architectures [4]. It primarily targets consumer applications as pixel formats are limited to 4:2:0 8-bit and 4:2:0 10-bit. The next revision of the standard, will enable new use-cases with the support of additional pixel formats such as 4:2:2 and 4:4:4 and bit depth higher than 10-bit [5], embedded bit-stream scalability and 3D video [6].

8

Figure 1. YVU format for 10-bit 4:2:2, 8-bit 4:2:2 and 8-bit 4:2:0 [43].

Figure 2. Evolution of video coding standards over the years [13]

9

Figure 3. Comparison of 10-bit and 8-bit color [43].

Figure 4. Comparison of YVU format for 4:2:2 and 4:4:4 [43].

2.2 Encoder and Decoder in HEVC

Figure 5. Block diagram of HEVC CODEC [7]

Source video, consisting of a sequence of video frames, is encoded or compressed by a video encoder to create a compressed video bit stream. The compressed bit stream is stored or transmitted. A video decoder decompresses the bit stream to create a sequence of decoded frames [7].

The video encoder performs the following steps: Partitioning each picture into multiple units. Predicting each unit using inter or intra prediction, and subtracting the prediction from the

unit. Transforming and quantizing the residual (the difference between the original picture unit

and the prediction). Entropy encoding transform output, prediction information, mode information and headers.

10

The video decoder performs the following steps: Entropy decoding and extracting the elements of the coded sequence. Rescaling and inverting the transform stage. Predicting each unit and adding the prediction to the output of the inverse transform. Reconstructing a decoded video image.

11

The Figure 6 and Figure 7 represent the detailed block diagrams of HEVC encoder and decoder respectively:

Figure 6. Block Diagram of HEVC Encoder [2]

Figure 7. Block Diagram of HEVC Decoder [8]

12

2.3 Features of HEVC

The video coding layer of HEVC employs the same hybrid approach (inter-/intra-picture prediction and 2-D transform coding) used in all video compression standards. Figure 6 depicts the block diagram of a hybrid video encoder, which can create a bit-stream conforming to the HEVC standard. Figure 7 shows the HEVC decoder block diagram. An encoding algorithm producing an HEVC compliant bit-stream would typically proceed as follows. Each picture is split into block-shaped regions, with the exact block partitioning being conveyed to the decoder. The first picture of a video sequence (and the first picture at each clean random access point in a video sequence) is coded using only intra-picture prediction (that uses prediction of data spatially from region-to-region within the same picture, but has no dependence on other pictures). For all remaining pictures of a sequence or between random access points, inter-picture temporally predictive coding modes are typically used for most blocks.

The encoding process for inter-picture prediction consists of choosing motion data comprising the selected reference picture and motion vector to be applied for predicting the samples of each block. The encoder and decoder generate identical inter-picture prediction signals by applying MC using the MV and mode decision data, which are transmitted as side information. The residual signal of the intra- or inter-picture prediction, which is the difference between the original block and its prediction, is transformed by a linear spatial transform. The transform coefficients are then scaled, quantized, entropy coded, and transmitted together with the prediction information. The encoder duplicates the decoder processing loop (see gray-shaded boxes in Figure.4) such that both will generate identical predictions for subsequent data. Therefore, the quantized transform coefficients are constructed by inverse scaling and are then inverse transformed to duplicate the decoded approximation of the residual signal. The residual is then added to the prediction, and the result of that addition may then be fed into one or two loop filters to smooth out artifacts induced by block-wise processing and quantization.

The final picture representation (that is a duplicate of the output of the decoder) is stored in a decoded picture buffer to be used for the prediction of subsequent pictures. In general, the order of encoding or decoding processing of pictures often differs from the order in which they arrive from the source; necessitating a distinction between the decoding order (i.e., bit-stream order) and the output order (i.e., display order) for a decoder. Video material to be encoded by HEVC is generally expected to be input as progressive scan imagery (either due to the source video originating in that format or resulting from de-interlacing prior to encoding). No explicit coding features are present in the HEVC design to support the use of interlaced scanning, as interlaced scanning is no longer used for displays and is becoming substantially less common for distribution. However, a metadata syntax has been provided in HEVC to allow an encoder to indicate that interlace-scanned video has been sent by coding each field (i.e., the even or odd numbered lines of each video frame) of interlaced video as a separate picture or that it has been sent by coding each interlaced frame as an HEVC coded picture. This provides an efficient method of coding interlaced video without burdening decoders with a need to support a special decoding process for it. In the following, the various features involved in hybrid video coding using HEVC are highlighted as follows.

13

2.3.1 Coding tree units and coding tree block (CTB) structure:

The core of the coding layer in previous standards was the macroblock, containing a 16×16 block of luma samples and, in the usual case of 4:2:0 colour sampling, two corresponding 8×8 blocks of chroma samples; whereas the analogous structure in HEVC is the coding tree unit (CTU), which has a size selected by the encoder and can be larger than a traditional macroblock. The CTU consists of a luma CTB and the corresponding chroma CTBs and syntax elements. The size L×L of a luma CTB can be chosen as L = 16, 32, or 64 samples, with the larger sizes typically enabling better compression. HEVC then supports a partitioning of the CTBs into smaller blocks using a tree structure and quad tree-like signalling [10]. The partitioning of CTBs into CBs ranging from 64*64 down to 8*8 is shown in Figure 8.

Figure 8. 64*64 CTBs split into CBs [9]

2.3.2 Coding units (CUs) and coding blocks (CBs):

The quad tree syntax of the CTU specifies the size and positions of its luma and chroma CBs. The root of the quadtree is associated with the CTU. Hence, the size of the luma CTB is the largest supported size for a luma CB. The splitting of a CTU into luma and chroma CBs is signalled jointly. One luma CB and ordinarily two chroma CBs, together with associated syntax, form a coding unit (CU) as shown in Figure 9. A CTB may contain only one CU or may be split to form multiple CUs, and each CU has an associated partitioning into prediction units (PUs) and a tree of transform units (TUs).

14

Figure 9. CUs split into CBs [9]

2.3.3 Prediction units and prediction blocks (PBs):

The decision whether to code a picture area using interpicture or intrapicture prediction is made at the CU level. A PU partitioning structure has its root at the CU level. Depending on the basic prediction-type decision, the luma and chroma CBs can then be further split in size and predicted from luma and chroma prediction blocks (PBs) as shown in figure 10 HEVC supports variable PB sizes from 64×64 down to 4×4 samples.

Figure 10. Partitioning of Prediction Blocks from Coding Blocks [9]

15

2.3.4 TUs and transform blocks:

The prediction residual is coded using block transforms. A TU tree structure has its root at the CU level. The luma CB residual may be identical to the luma transform block (TB) or may be further split into smaller luma TBs as shown in Figure 11. The same applies to the chroma TBs. Integer basis functions similar to those of the discrete cosine transform (DCT) are defined for the square TB sizes 4×4, 8×8, 16×16, and 32×32. For the 4×4 transform of luma intrapicture prediction residuals, an integer transform derived from a form of discrete sine transform (DST) is alternatively specified.

Figure 11. Partitioning of Transform Blocks from Coding Blocks [9]

3. Inter Picture Prediction in HEVC

3.1 Introduction

Motion estimation (ME) and motion compensation is seen as one of the most important methods of exploiting redundancy in motion pictures. Its importance is so high that 50% to 70% of encoder complexity is dedicated to the motion estimation process [18]. However, as we move towards higher resolution videos, computational complexity is becoming a bigger concern. This is why motion estimation is seen as a major savings area in terms of computational expense. However, in HEVC and the previous video coding standard H.264/MPEG-4(AVC), motion estimation using multiple reference frames was also introduced which added to the complexity of the motion estimation process. While it provided the ability to improve PSNR, it also added extra computational cost.

3.2 Motion Vector Prediction

Like the AVC, the HEVC standard has two reference lists: L0 and L1. They can hold 16 references each, but the maximum total number of unique pictures is 8. This means that to find a maximum output, the same picture has to be added more than once. The encoder may choose to do this to be able to predict the

16

same picture with different weights (weighted prediction). The HEVC standard uses more complex motion prediction than AVC. The HEVC standard uses candidate list indexing. There are two MV prediction modes: merge and AMVP (advanced motion vector prediction). The encoder decides between these two modes for each PU and signals it in the bit stream with a flag. Only the AMVP process can result in any desired MV, since it is the only one that codes an MV delta. Each mode builds a list of candidate MVs, and then selects one of them using an index coded in the bit stream.

Figure 12. Position of spatial candidates of motion information [19]

Figure 13. Quad-tree splitting flag. 1- Level 1 (L1), 0 – Level (L0) [20]

3.3 Advanced Motion Vector Prediction Process

17

AMVP process is performed once for each MV; so once per L0 or L1 PU for unidirectional PU, or twice for a bidirectional PU. The bit stream specifies the reference picture to use for each MV. A two-deep candidate list is formed:

First, a left predictor is obtained. a0 is preferred over a1, same list is preferred over the opposite list, and neighbor is preferred that points to the same picture over one that does not. If no neighbor points to the same picture, the motion vector is scaled to match the picture distance (similar process as the AVC temporal direct mode). If all this results are in a valid candidate, a motion vector is added to the candidate list.

Second, upper predictor is obtained. b0 is preferred over b1, b1is preferred over b2 and the neighbor MV that points to the same picture is preferred for motion vector prediction over the one that are not in same picture. Neighbor scaling for the upper predictor is only done if it was not done for the left neighbor, ensuring no more than one scaling operation per PU. If the candidate is found, it is added to the list. If the list still contains less than two candidates, a temporal candidate (scaled MV according to picture distance) is obtained, which is co-located with the right bottom of the PU. If the candidate lies outside the CTB row, or outside the picture, or if the co-located PU is intra, center position is tried again. If temporal candidate is found, it is added to the list. If the candidate list is still empty, a (0, 0) motion vector is added until full. Finally, with the transmitted index, right candidate is selected and is added in the transmitted MV delta.

3.4 Merge Mode

The merge process results in a candidate list of up to five entries deep, configured in the slice header. Each entry ends up being L0, L1 or bidirectional. Four spatial candidates are added in this order: a1, b1, b0, a0, b2. A candidate is not added to the list if it is the same as one of the earlier candidates. Then, if the list still has room, a temporal candidate is added, which is found by the same process as in AMVP (Advance Motion Vector Prediction) process. Then, if the list still has room, bidirectional candidates are added and formed by making combinations of the L0 and L1 vectors of the other candidates already in the list. Then finally if the list still is not full, (0, 0) MVs are added with increasing reference indices. The final motion vector is obtained by picking one of the up-to-five candidates as signaled in the bit stream.

The HEVC sub-samples the temporal motion vectors on a 16x16 grid. Thus, decoder only needs to make room for two motion vectors (L0 and L1) per 16x16 regions in the picture when it allocates the temporal motion vector buffer. When the decoder calculates the co-located position, lower 4 bits are zeroed out of the x/y position, snapping the location to a multiple of 16 and picture is considered to be co-located, that is signaled in the slice header.

18

3.5 Motion Compensation

Like MPEG-4/AVC [1], HEVC specifies motion vectors in 1/4-pel, but uses an 8tap filter for luma (all positions), and a 4-tap 1/8-pel filter for chroma as shown in figure 32. Because of the 8-tap filter, any given NxM sized block requires extra pixels on all sides (3 left/above, 4 right and below) to provide the filter with the data it needs. With small blocks like an 8x4, (8+7) x (4+7) = 15x11 pixels are needed. The HEVC standard limits the smallest block to be uni-directional and 4x4 is not supported since more small blocks require more memory read, thus increasing more memory access, more time and more power.

The HEVC standard also supports weighted prediction for both uni- and bidirectional PUs. However, the weights are always explicitly transmitted in the slice header, there is no implicit weighted prediction like in MPEG-4/AVC [19]. Quarter-sample precision is used for the motion vectors. 7-tap (weights: -1, 4, -10, 58, 17, -5, 1) or 8-tap (weights: -1, 4, -11, 40, 40, -11, 4, 1) filters are used for interpolation of fractional-sample positions as shown in figure 3-3. Similar to H.264/MPEG-4 AVC [19], multiple reference pictures are used as shown in figure 14. For each PB, either one or two motion vectors can be transmitted, resulting either in unipredictive or bipredictive coding, respectively. A scaling and offset operation can be applied to the prediction signal/signals in a manner known as weighted prediction.

Figure 14. Integer (Ai, j) and fractional sample (a i, j) position for luma interpolation [1]

19

Figure 15. Multiple pictures used as reference for the current picture for motion

compensation [2]

Figure 14 shows the positions labeled with upper-case letters, A i, j, representing the available luma samples at integer sample locations, whereas the other positions labeled with lower-case letters represent samples at non-integer sample locations, which need to be generated by interpolation. The samples labeled a0,j, b0,j, c0,j, d0,0, h0,0, and n0,0 are derived from the samples Ai,j by applying the eight-tap filter for half-sample positions and the seven-tap filter for the quarter-sample positions as follows [19]:

Where the constant B ≥ 8 is the bit depth of the reference samples (typically B = 8 for most applications). In these formulas, the symbol, >>, denotes an arithmetic right shift operation. The samples labeled e0,0, f0,0, g0,0, i0,0, j0,0, k0,0, p0,0, q0,0, and r0,0 can be derived by applying the corresponding filters to samples located at vertically adjacent a0,j, b0,j and c0,j positions as follows [19]:

20

3.6 Fractional Sample Interpolation

Interpolation tasks arise naturally in the context of video coding because the true displacements of objects from one picture to another are independent of the sampling grid of cameras. Therefore, in MCP, fractional-sample accuracy is used to more accurately capture continuous motion. Samples available at integer positions are filtered to estimate values at fractional positions. This spatial domain operation can be seen in the frequency domain as introducing phase delays to individual frequency components. An ideal interpolation filter for band-limited signals induces a constant phase delay to all frequencies and does not alter their magnitudes. The efficiency of MCP is limited by many factors—the spectral content of original and already reconstructed pictures, camera noise level, motion blur, quantization noise in reconstructed pictures, etc. Similar to H.264/AVC, HEVC supports motion vectors with quarter-pixel accuracy for the luma component and one-eighth pixel accuracy for chroma components. If the motion vector has a half or quarter-pixel accuracy, samples at fractional positions need to be interpolated using the samples at integer-sample positions. The interpolation process in HEVC introduces several improvements over H.264/AVC that contributes to the significant coding efficiency increase of HEVC [21].

3.7 Weighted Sample Prediction

Similar to H.264/AVC, HEVC includes a weighted prediction (WP) tool that is particularly useful for coding sequences with fades. In WP, a multiplicative weighting factor and an additive offset are applied to the motion compensated prediction. In principle, WP replaces the inter prediction signal P by a linearly weighted prediction signal.

Where w is an Illumination Compensation weight and o is an offset.

21

The inputs to the WP process are: the width and the height of the luma prediction block, prediction samples to be weighted, the prediction list utilization flags, the reference indices for each list, and the color component index. Weighting factors w0 and w1, and offsets o0 and o1 are determined using the data transmitted in the bit stream. The subscripts indicate the reference picture list to which the weight and the offset are applied. The output of this process is the array of prediction sample values. In HEVC weight and offset parameters are explicitly signaled (explicit mode). Optimal solutions are obtained when the Illumination Compensation weights, motion estimation and Rate Distortion Optimization (RDO) are considered jointly [21]. However, practical systems usually employ simplified techniques, such as determining approximate weights by considering picture-to-picture mean variation [21].

4. Block Matching [22][23]

The MPEG and H.26X standards [20] use block-matching technique for motion estimation /compensation. In the block-matching technique, each current frame is divided into equal-size blocks, called source blocks. Each source block is associated with a search region in the reference frame. The objective of block-matching is to find a candidate block in the search region best matched to the source block. The relative distances between a source block and its candidate blocks are called motion vectors.

Figure.16. Block matching scenario [22]

X: Source block for block-matchingBx: Search area associated with X

22

MV: Motion vector

4.1 Motion Estimation Algorithms [22] [23]

• Full Search Algorithm

• TZSearch Algorithm

4.2 Full Search Algorithm

In video coding, Full search algorithm based on block matching finds optimal motion vectors which minimize the matching differences between reference blocks and candidate blocks in search area. Full search algorithm has been widely used in video coding applications because of its simple and easy hardware implementation. However, high computational cost of the full search algorithm with very large search area has been considered as a serious problem for realizing fast real-time video coding.

In figure 17 an example for full search algorithm is shown where the blue colored pixels are previously determined SAD values of the current block before determining the best match which is purple in color. It can be seen that it takes numerous computations before the best match is found.

23

Figure 17. Full Search algorithm [23]

4.3 TZSearch Algorithm [25]

Motion estimation is an essential process in HEVC. It finds the best matched block position in the past (or future frames) for every block in the current video frame. Full search algorithms searching all the blocks in reference frame search window can find the most accurate matching block, but they are also too time-consuming [26]. So fast motion estimation algorithms searching only blocks which are likely to be the best matched block position is widely used. TZ search method is adopted as the fast integer pixel motion estimation method in HM. It has four steps as described in the following [27]:

1. Start Search Center: Establish a set of search centers, including the motion vector obtained from median prediction, the motion vector of the left, the up and the upper right position in the corresponding block of the reference frame, the motion vector at (0,0) position. Choose the point which has the smallest matching error as search center of next step.

2. Diamond or Square Search: Determine the search range and the search pattern. Run the search with different stride lengths from 1 through 64, in multiples of 2. Then performing 2 point search to check only the 2 untested points if the optimal point is around the search center with stride length being 1. Find out the smallest matching error point as search center of step 3.

3. Raster Search: If the distance between the optimal point obtained from step 2 and current search center called best distance is 0, stop the search. Otherwise, if it is greater than the value of iRaster which is set appropriately, raster search is performed and the value of iRaster is used as the stride length of raster search.

4. Raster/Star Refinement: Set the optimal point from step 3 as the starting point. Raster refinement performs 8 point diamond or square search with the best distance decreasing according to the exponential of 2 and updating every step of the current starting point location until the best distance is 0. The star refinement is similar to step 2 except for the optimal point being the starting point every round. The refinement process will only start if the best distance is greater than zero. When the best distance equals to 0, the search stops.

Some of the commonly used TZSearch patterns are shown in the Figure 18.

24

Figure 18. TZSearch patterns

4.3.1 Diamond Search Algorithm.

Figure.19 Diamond search scenario for ME [28] [29]

The Diamond Search algorithm employs two search patterns.

1. Large diamond search pattern (LDSP) comprises nine checking points from which eight points surround the center one to compose a diamond shape.

25

2. Small diamond search pattern (SDSP) consisting of five checking points forms a small diamond shape.

5. Proposed Algorithm [16]

5.1 Introduction [16]

The TZSearch algorithm was adopted in the high efficiency video coding reference software HM as a fast Motion Estimation (ME) algorithm for its excellent performance in reducing ME time and maintaining a comparable Rate Distortion (RD) performance. However, the multiple initial search point decision and the hybrid block matching search contribute a relatively high computational complexity to TZSearch. Based on the statistical analysis of the probability of median predictor to be selected as the final best point in the large Coding Units (CUs) (64X64, 32X32) and small CUs (16X16, 8X8) as well as the center-biased characteristic of the final best search point in ME process, two early terminations for TZSearch are proposed.

5.2 Median Predictors [16]

Flexible size representation technique contribute the largest proportion of encoding time to HEVC encoder. The search procedure of TZSearch includes two steps, initial search point decision and block matching search, respectively. The first step is to determine the initial search point by using a set of predictors which includes Median Predictor (MP) [10], Left Predictor (LP), Above Predictor (AP), Above-Right Predictor (ARP) and (0, 0). LP, AP and ARP are corresponding to the Motion Vector (MV) of left, top, top right block of the current block, respectively. After the initial search point is determined, the hybrid block matching search, including multiple diamond/square search and raster search, are used to locate the best matching block which is with the minimum RD cost. However, the computational complexity of multiple initial search point decision and hybrid block matching search is still relatively high. If these two processes can be simplified, much more encoding time will be saved.

26

Figure 20. Median Predictors [30]

In video sequences, there are a large number of blocks with static or quite slow motion activity. For these blocks, they have the largest probability to select the MP as the final best point in ME process. From experimental results it can be seen that approximately 62.26 % of the final best search point are Median Predictors. Table 1 shows the possibility of selecting Median Predictors as the final best search point for different video sequences and different QP.

QPSequence 24 28 32 36 AverageBQMall 58.74 60.99 60.66 62.70 60.77Johnny 61.44 64.15 67.35 70.84 65.94Modisode2 65.55 68.15 70.55 70.24 68.62ParkScene 42.22 46.84 45.24 49.66 45.99Average 56.9875 60.0325 60.95 63.36 60.33

Table 1. Possibility of selecting Median Predictors as the final best search point, Unit(%).

Code Used:-

The code below is taken from HM 16.0. This particular part of the code helps in determining the initial best point to start the TZSearch.

Void TEncSearch::xTZSearch(const TComDataCU* const pcCU,const TComPattern* const pcPatternKey,const Pel* const piRefY,const Int iRefStride,const TComMv* const pcMvSrchRngLT,const TComMv* const pcMvSrchRngRB,TComMv& rcMv,Distortion& ruiSAD,const TComMv* const pIntegerMv2Nx2NPred,const Bool bExtendedSettings){const Bool bUseAdaptiveRaster = bExtendedSettings;const Int iRaster = 5;const Bool bTestOtherPredictedMV = bExtendedSettings;const Bool bTestZeroVector = true;const Bool bTestZeroVectorStart = bExtendedSettings;const Bool bTestZeroVectorStop = false;const Bool bFirstSearchDiamond = true; // 1 = xTZ8PointDiamondSearch 0 = xTZ8PointSquareSearchconst Bool bFirstCornersForDiamondDist1 = bExtendedSettings;const Bool bFirstSearchStop = m_pcEncCfg->getFastMEAssumingSmootherMVEnabled();const UInt uiFirstSearchRounds = 3; // first search stop X rounds after best match (must be >=1)const Bool bEnableRasterSearch = true;const Bool bAlwaysRasterSearch = bExtendedSettings; // true: BETTER but factor 2 slowerconst Bool bRasterRefinementEnable = false; // enable either raster refinement or star refinementconst Bool bRasterRefinementDiamond = false; // 1 = xTZ8PointDiamondSearch 0 = xTZ8PointSquareSearchconst Bool bRasterRefinementCornersForDiamondDist1 = bExtendedSettings;const Bool bStarRefinementEnable = true; // enable either star refinement or raster refinementconst Bool bStarRefinementDiamond = true; // 1 = xTZ8PointDiamondSearch 0 = xTZ8PointSquareSearch

27

const Bool bStarRefinementCornersForDiamondDist1 = bExtendedSettings;const Bool bStarRefinementStop = false;const UInt uiStarRefinementRounds = 2; // star refinement stop X rounds after best match (must be >=1)const Bool bNewZeroNeighbourhoodTest = bExtendedSettings;

UInt uiSearchRange = m_iSearchRange;pcCU->clipMv(rcMv);#if ME_ENABLE_ROUNDING_OF_MVSrcMv.divideByPowerOf2(2);#elsercMv >>= 2;#endif// init TZSearchStructIntTZSearchStruct cStruct;cStruct.iYStride = iRefStride;cStruct.piRefY = piRefY;cStruct.uiBestSad = MAX_UINT;

// set rcMv (Median predictor) as start point and as best pointxTZSearchHelp(pcPatternKey, cStruct, rcMv.getHor(), rcMv.getVer(), 0, 0);

// test whether one of PRED_A, PRED_B, PRED_C MV is better start point than Median predictorif (bTestOtherPredictedMV){for (UInt index = 0; index < NUM_MV_PREDICTORS; index++){TComMv cMv = m_acMvPredictors[index];pcCU->clipMv(cMv);#if ME_ENABLE_ROUNDING_OF_MVScMv.divideByPowerOf2(2);#elsecMv >>= 2;#endifif (cMv != rcMv && (cMv.getHor() != cStruct.iBestX && cMv.getVer() != cStruct.iBestY)){// only test cMV if not obviously previously tested.xTZSearchHelp(pcPatternKey, cStruct, cMv.getHor(), cMv.getVer(), 0, 0);}}}

// test whether zero Mv is better start point than Median predictorif (bTestZeroVector){if ((rcMv.getHor() != 0 || rcMv.getVer() != 0) &&(0 != cStruct.iBestX || 0 != cStruct.iBestY)){// only test 0-vector if not obviously previously tested.xTZSearchHelp(pcPatternKey, cStruct, 0, 0, 0, 0);}}

Int iSrchRngHorLeft = pcMvSrchRngLT->getHor();Int iSrchRngHorRight = pcMvSrchRngRB->getHor();Int iSrchRngVerTop = pcMvSrchRngLT->getVer();Int iSrchRngVerBottom = pcMvSrchRngRB->getVer();

28

if (pIntegerMv2Nx2NPred != 0){TComMv integerMv2Nx2NPred = *pIntegerMv2Nx2NPred;integerMv2Nx2NPred <<= 2;pcCU->clipMv(integerMv2Nx2NPred);#if ME_ENABLE_ROUNDING_OF_MVSintegerMv2Nx2NPred.divideByPowerOf2(2);#elseintegerMv2Nx2NPred >>= 2;#endifif ((rcMv != integerMv2Nx2NPred) &&(integerMv2Nx2NPred.getHor() != cStruct.iBestX || integerMv2Nx2NPred.getVer() != cStruct.iBestY)){// only test integerMv2Nx2NPred if not obviously previously tested.xTZSearchHelp(pcPatternKey, cStruct, integerMv2Nx2NPred.getHor(), integerMv2Nx2NPred.getVer(), 0, 0);}

// reset search rangeTComMv cMvSrchRngLT;TComMv cMvSrchRngRB;Int iSrchRng = m_iSearchRange;TComMv currBestMv(cStruct.iBestX, cStruct.iBestY);currBestMv <<= 2;xSetSearchRange(pcCU, currBestMv, iSrchRng, cMvSrchRngLT, cMvSrchRngRB);iSrchRngHorLeft = cMvSrchRngLT.getHor();iSrchRngHorRight = cMvSrchRngRB.getHor();iSrchRngVerTop = cMvSrchRngLT.getVer();iSrchRngVerBottom = cMvSrchRngRB.getVer();}

// start searchInt iDist = 0;Int iStartX = cStruct.iBestX;Int iStartY = cStruct.iBestY;

const Bool bBestCandidateZero = (cStruct.iBestX == 0) && (cStruct.iBestY == 0);

// first search around best position up to now.// The following works as a "subsampled/log" window search around the best candidatefor (iDist = 1; iDist <= (Int)uiSearchRange; iDist *= 2){if (bFirstSearchDiamond == 1){xTZ8PointDiamondSearch(pcPatternKey, cStruct, pcMvSrchRngLT, pcMvSrchRngRB, iStartX, iStartY, iDist, bFirstCornersForDiamondDist1);}else{xTZ8PointSquareSearch(pcPatternKey, cStruct, pcMvSrchRngLT, pcMvSrchRngRB, iStartX, iStartY, iDist);}

if (bFirstSearchStop && (cStruct.uiBestRound >= uiFirstSearchRounds)) // stop criterion{break;}

29

}

if (!bNewZeroNeighbourhoodTest){// test whether zero Mv is a better start point than Median predictorif (bTestZeroVectorStart && ((cStruct.iBestX != 0) || (cStruct.iBestY != 0))){xTZSearchHelp(pcPatternKey, cStruct, 0, 0, 0, 0);if ((cStruct.iBestX == 0) && (cStruct.iBestY == 0)){// test its neighborhoodfor (iDist = 1; iDist <= (Int)uiSearchRange; iDist *= 2){xTZ8PointDiamondSearch(pcPatternKey, cStruct, pcMvSrchRngLT, pcMvSrchRngRB, 0, 0, iDist, false);if (bTestZeroVectorStop && (cStruct.uiBestRound > 0)) // stop criterion{break;}}}}}else{// Test also zero neighbourhood but with half the range// It was reported that the original (above) search scheme using bTestZeroVectorStart did not// make sense since one would have already checked the zero candidate earlier// and thus the conditions for that test would have not been satisfiedif (bTestZeroVectorStart == true && bBestCandidateZero != true){for (iDist = 1; iDist <= ((Int)uiSearchRange >> 1); iDist *= 2){xTZ8PointDiamondSearch(pcPatternKey, cStruct, pcMvSrchRngLT, pcMvSrchRngRB, 0, 0, iDist, false);if (bTestZeroVectorStop && (cStruct.uiBestRound > 2)) // stop criterion{break;}}}}

// calculate only 2 missing points instead 8 points if cStruct.uiBestDistance == 1if (cStruct.uiBestDistance == 1){cStruct.uiBestDistance = 0;xTZ2PointSearch(pcPatternKey, cStruct, pcMvSrchRngLT, pcMvSrchRngRB);}

// raster search if distance is too bigif (bUseAdaptiveRaster){int iWindowSize = iRaster;Int iSrchRngRasterLeft = iSrchRngHorLeft;Int iSrchRngRasterRight = iSrchRngHorRight;Int iSrchRngRasterTop = iSrchRngVerTop;Int iSrchRngRasterBottom = iSrchRngVerBottom;

30

if (!(bEnableRasterSearch && (((Int)(cStruct.uiBestDistance) > iRaster)))){iWindowSize++;iSrchRngRasterLeft /= 2;iSrchRngRasterRight /= 2;iSrchRngRasterTop /= 2;iSrchRngRasterBottom /= 2;}cStruct.uiBestDistance = iWindowSize;for (iStartY = iSrchRngRasterTop; iStartY <= iSrchRngRasterBottom; iStartY += iWindowSize){for (iStartX = iSrchRngRasterLeft; iStartX <= iSrchRngRasterRight; iStartX += iWindowSize){xTZSearchHelp(pcPatternKey, cStruct, iStartX, iStartY, 0, iWindowSize);}}}else{if (bEnableRasterSearch && (((Int)(cStruct.uiBestDistance) > iRaster) || bAlwaysRasterSearch)){cStruct.uiBestDistance = iRaster;for (iStartY = iSrchRngVerTop; iStartY <= iSrchRngVerBottom; iStartY += iRaster){for (iStartX = iSrchRngHorLeft; iStartX <= iSrchRngHorRight; iStartX += iRaster){xTZSearchHelp(pcPatternKey, cStruct, iStartX, iStartY, 0, iRaster);}}}}

// raster refinement

if (bRasterRefinementEnable && cStruct.uiBestDistance > 0){while (cStruct.uiBestDistance > 0){iStartX = cStruct.iBestX;iStartY = cStruct.iBestY;if (cStruct.uiBestDistance > 1){iDist = cStruct.uiBestDistance >>= 1;if (bRasterRefinementDiamond == 1){xTZ8PointDiamondSearch(pcPatternKey, cStruct, pcMvSrchRngLT, pcMvSrchRngRB, iStartX, iStartY, iDist, bRasterRefinementCornersForDiamondDist1);}else{xTZ8PointSquareSearch(pcPatternKey, cStruct, pcMvSrchRngLT, pcMvSrchRngRB, iStartX, iStartY, iDist);}}

// calculate only 2 missing points instead 8 points if cStruct.uiBestDistance == 1

31

if (cStruct.uiBestDistance == 1){cStruct.uiBestDistance = 0;if (cStruct.ucPointNr != 0){xTZ2PointSearch(pcPatternKey, cStruct, pcMvSrchRngLT, pcMvSrchRngRB);}}}}

// star refinementif (bStarRefinementEnable && cStruct.uiBestDistance > 0){while (cStruct.uiBestDistance > 0){iStartX = cStruct.iBestX;iStartY = cStruct.iBestY;cStruct.uiBestDistance = 0;cStruct.ucPointNr = 0;for (iDist = 1; iDist < (Int)uiSearchRange + 1; iDist *= 2){if (bStarRefinementDiamond == 1){xTZ8PointDiamondSearch(pcPatternKey, cStruct, pcMvSrchRngLT, pcMvSrchRngRB, iStartX, iStartY, iDist, bStarRefinementCornersForDiamondDist1);}else{xTZ8PointSquareSearch(pcPatternKey, cStruct, pcMvSrchRngLT, pcMvSrchRngRB, iStartX, iStartY, iDist);}if (bStarRefinementStop && (cStruct.uiBestRound >= uiStarRefinementRounds)) // stop criterion{break;}}

// calculate only 2 missing points instead 8 points if cStrukt.uiBestDistance == 1if (cStruct.uiBestDistance == 1){cStruct.uiBestDistance = 0;if (cStruct.ucPointNr != 0){xTZ2PointSearch(pcPatternKey, cStruct, pcMvSrchRngLT, pcMvSrchRngRB);}}}}

// write out best matchrcMv.set(cStruct.iBestX, cStruct.iBestY);static unsigned long int top = 0, topright = 0, topleft = 0, down = 0, downleft = 0, downright = 0, left = 0, right = 0; //Static Variables declared to count all the prediction directionif (cStruct.iBestX <= -1){if (cStruct.iBestY == 0)

32

{cout << "Left-";left++;cout << left << endl;}}if (cStruct.iBestX == 0){if (cStruct.iBestY >= 1){cout << "TOP-";top++;cout << top << endl;}}if (cStruct.iBestX >= 1){if (cStruct.iBestY >= 1){cout << "TOP_ABOVE_RIGHT-";topright++;cout << topright << endl;}}/*if (cStruct.iBestX ==0){if (cStruct.iBestY == 0){cout <<"Center" << endl;}}*/if (cStruct.iBestX == 0){if (cStruct.iBestY <= -1){cout << "Down-";down++;cout << down << endl;

}}if (cStruct.iBestX >= 1){if (cStruct.iBestY <= -1){cout << "Down_Right-";downright++;cout << downright << endl;}}if (cStruct.iBestX <= -1){if (cStruct.iBestY <= -1){cout << "Down_Left-";downleft++;

33

cout << downleft << endl;

}}if (cStruct.iBestX <= -1){if (cStruct.iBestY >= 1){cout << "Above_Left-";topleft++;cout << topleft<<endl;}}if (cStruct.iBestX >= 1){if (cStruct.iBestY >= 1){cout << "Right-";right++;cout << right<<endl;}}ruiSAD = cStruct.uiBestSad - m_pcRdCost->getCostOfVectorWithPredictor(cStruct.iBestX, cStruct.iBestY);}In the code mentioned above static variables are used to help find out the direction of the prediction used as the best point. Further TZSearch is terminated using the algorithm shown in figure 21.

34

Figure 21. Algorithm to terminate TZSearch.

6. Configuration Profile

6.1 Introduction

Configuration profile is required by the HM software to perform encoding and decoding on the video sequences based on the parameter values defined in the configuration file. In this project 3 configuration profiles are considered with different Quantization parameters (QP).

6.1.1 Low Delay

In this profile one I-frame is introduced in the beginning of the encoder followed by B-frames or P-frames. I-frame are called Intra frame and B-frame is Bi-Directional frame. In this profile PSNR is degraded because there is only one I-frame followed by B or P frames.

In figure 22 show the summary of Low Delay profile where PSNR of I-frame is higher than PSNR of B-frame.

35

Figure 22. Low Delay Profile Summary. Bitrate in sec, Y, U, V-PSNR in db, Total # of frames 150.

6.1.2 Random Access

In this profile I-frames are introduced in between B-frames or P-frames. This technique of introducing I-frames in between every 30 B or P frames which helps to improve overall PSNR because I-frame does not use any reference picture for prediction instead uses its own picture for prediction hence the possibility of have error is less. Where as in B-frames and P- frames, prediction is based on reference picture which are B-frame and P-frame respectively, so the possibility of having error is higher.

In figure 23 it can be seen that the average PSNR of random access profile is improved compared to low delay profile.

36

Figure 23. Random Access Profile Summary. Bitrate in sec, Y, U, V-PSNR in db, Total # of frames 150.

6.1.3 Custom Profile (Random Access Early)

In this profile Coding Unit is terminated when best match is found which saves the encoding time by 40%. In figure 24, the encoding time is reduced by 40% with comprisable decrease in PSNR.

37

Figure 24. Custom Profile Summary. Bitrate in sec, Y, U, V-PSNR in dB, Total # of frames 150.

7. Test Sequences

Figure 25. A frame from Mobisode2 video sequence. Resolution – 416x240. 38

Figure 26. A frame from BQMall video sequence. Resolution – 832x480.

Figure 27. A frame from Johnny video sequence. Resolution – 1280x720.

39

Figure 28. A frame from Park Scene video sequence. Resolution – 1920x1080.

8. Experimental Results

8.1 Test Conditions

Test Conditions

Frame Rate 30

Total Number of Frames 60

GOP 8

Search Range 64

CU Size / Depth 64/4

Inter frames intervals 32

QP 24, 28, 32, 36

Table 2. Test Conditions

40

8.2 Comparison of low delay, random access and random access early for different QP values and different video sequences.

PSNR in DB for Different QPVideo Sequence Profile 24 28 32 36Modisode2 Low_delay 44.0872 41.8229 39.8071 37.9595

Random_access 44.6688 42.4969 40.5565 38.7318Random_access_early 44.6429 42.4997 40.5335 386617

Table 3. PSNR for low delay, random access and random access early profile.

Bitrate in kbps for Different QPVideo Sequence Profile 24 28 32 36Modisode2 Low_delay 154.0520 88.0560 53.4080 34.7000

Random_access 131.4360 77.4760 46.8400 30.4360Random_access_early 134.7200 78.7480 47.5880 30.3840

Table 4. Bitrate for low delay, random access and random access early profile.

Encoding Time in Seconds for Different QPVideo Sequence Profile 24 28 32 36Modisode2 Low_delay 213.243 154.464 179.762 138.741

Random_access 123.502 115.912 104.691 113.527Random_access_early 78.859 72.575 63.303 66.9806Encoding Time saved (%). 36.14759 37.38785 39.53348 41.00029

Table 5. Encoding Time for low delay, random access and random access early profile.

PSNR in DB for Different QPVideo Sequence Profile 24 28 32 36BQMall Low_delay 39.3297 36.9636 34.6412 32.4062


Table 6. PSNR for low delay, random access and random access early profile

Bitrate in kbps for Different QPVideo Sequence Profile 24 28 32 36BQMall Low_delay 2049.2600 1118.9640 634.7920 371.7720


Table 7. Bitrate for low delay, random access and random access early profile

41

Encoding Time in Seconds for Different QPVideo Sequence Profile 24 28 32 36BQMall Low_delay 1115.699 852.405 767.365 703.206


Table 8. Encoding Time for low delay, random access and random access early profile

PSNR in DB for Different QPVideo Sequence Profile 24 28 32 36Johnny Low_delay 39.3297 36.9636 34.6412 32.4062

Random_access 43.4299 42.1872 40.7531 39.0332

Random_access_early 43.4189 42.1770 40.7362 39.0224Table 9. PSNR for low delay, random access and random access early profile

Bitrate in kbps for Different QPVideo Sequence Profile 24 28 32 36Johnny Low_delay 2049.2600 1118.9640 634.7920 371.7720


Table 10. Bitrate for low delay, random access and random access early profile

Encoding Time in Seconds for Different QPVideo Sequence Profile 24 28 32 36Johnny Low_delay 1115.699 852.405 767.365 703.206


Table 11. Encoding Time for low delay, random access and random access early profile

PSNR in DB for Different QPVideo Sequence Profile 24 28 32 36ParkScene Low_delay 39.4567 37.3035 35.2910 33.4236

Random_access 39.7192 37.7745 35.8569 34.0347

Random_access_early 39.7196 37.7595 35.8355 34.0067 Table 12. PSNR for low delay, random access and random access early profile

Table 13. Bitrate for low delay, random access and random access early profile42

Encoding Time in Seconds for Different QPVideo Sequence Profile 24 28 32 36ParkScene Low_delay 5042.252 4590.547 4266.833 3280.501


Table 14. Encoding Time for low delay, random access and random access early profile.

Bitrate in kbps for Different QPVideo Sequence Profile 24 28 32 36ParkScene Low_delay 5042.252 4590.547 4266.833 3280.501


43

Conclusion: -

Hence from the results shown in table 1 it can be said that about 60% of the motion vectors are median predictors. And these median predictors are terminated using TZSearch which saves encoding time by 38%. The results can be seen from table 3 through table 14.

44

REFERENCES

[1] HEVC overview http://www.apsipa2013.org/wpcontent/uploads/2013/09/Tutorial_8_NextGenerationVideoCoding_Part_2.pdf

[2] D. Marpe et al, “The H.264/MPEG4 advanced video coding standard and its applications”, IEEE Communications Magazine, Vol. 44, pp. 134-143, Aug. 2006.

[3] B. Bross et al, “High Efficiency Video Coding (HEVC) Text Specification Draft 10”, Document JCTVC-L1003, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC),Mar.2013.

[4] G.J. Sullivan et al, “Overview of the high efficiency video coding (HEVC) standard”, IEEE Trans. Circuits and Systems for Video Technology, vol. 22,no.12, pp. 1649 – 1668, Dec 2012.

[5] HEVC white paper - Ateme: http://www.ateme.com/an-introduction-to-uhdtv-and-hevc

[6] G.J. Sullivan et al, “Standardized Extensions of High Efficiency Video Coding (HEVC)”, IEEE Journal of selected topics in Signal Processing, Vol. 7, No. 6, pp. 1001-1016, Dec. 2013.

[7] HEVC tutorial by I.E.G. Richardson: http://www.vcodex.com/h265.html

[8] C. Fogg, “Suggested figures for the HEVC specification”, ITU-T / ISO-IEC Document: JCTVC J0292r1, July 2012.

[9] U.S.M. Dayananda, “Study and Performance comparison of HEVC and H.264 video codecs” Final project report , EE Dept., UTA, Arlington, TX, Dec. 2011 available on http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html

[10] HM Software Manual - https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/

[11] Visual studio: http://www.dreamspark.com

[12] Tortoise SVN: http://tortoisesvn.net/downloads.html

[13] Multimedia processing course website: http://www.uta.edu/faculty/krrao/dip/

[14] C. E. Rhee et al, “A Survey of Fast Mode Decision Algorithms for Inter-Prediction and Their Applications to High Efficiency Video Coding”, IEEE Transactions on Consumer Electronics, vol 58, no. 4, pp 1375-1383, Dec. 2012.

[15] R. Li, B. Zeng, and M.L. Lio, “A new three-step search algorithm for block motion estimation” IEEE Trans. Circuits and Systems for Video Technology, vol. 4, no. 4, pp. 438-443,August 1994.

[16] Z. Pan et al, “Early termination for TZSearch in HEVC motion estimation”, IEEE ICASSP 2013, pp. 1389-1392, June 2013.

45

[17] X. Zhang, S. Wang, S. Ma, “Early termination of coding unit splitting for HEVC”, Signal & Information Processing Association Annual Summit and Conference. Page(s):1-4, December 2012.

[18] Ahmad asghar, Muhammad atiq, Rai Ammad khan, Nadeem a. khan, “Motion Estimation and Inter Prediction Mode Selection in HEVC”, Recent Researches in Telecommunications, Informatics, Electronics and Signal Processing, page(s): 351 – 357, December 2013.

[19] G. Sullivan et al, “Overview of the high efficiency video coding (HEVC) standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp 1649-1668, December 2012.

[20]HEVC tutorial http://www.apsipa2013.org/wpcontent/uploads/2013/09/Tutorial_8_NextGenerationVideoCoding_Part_2.pdf.

[21] V. Sze, M. Budagavi and G.J. Sullivan (editors), “High Efficiency Video Coding (HEVC) Algorithms and Architectures” Springer, 2014.

[22] L.C.Manikandan et al, “A new survey on Block Matching Algorithms in Video Coding” in International Journal of Engineering Research, vol. 3, pp.121-125, February 2014.

[23] M. Jakubowski and G. Pastuszak, “Block-based motion estimation algorithms-a survey,” Journal of Opto-Electronics Review, vol. 21, pp 86-102, March 2013.

[24] Maria Santamaria and Maria Trujilo,”A comparision of block-matching motion estimation algorithms” on October 4th 2012.

[25] L. Xufeng, et al,”Fast motion estimation for HEVC.” 2014 IEEE International Symposium on Broadband Multimedia and Broadcasting (BMSB), IEEE, December 2014.

[26] N. Purnachand, et al, “Fast motion estimation algorithm for HEVC,” in Consumer Electronics - Berlin (ICCE-Berlin), 2012 IEEE second International Conference on Consumer Electronics - Berlin (ICCE-Berlin), 2012 IEEE International Conference on. Berlin: pp. 34–37, IEEE, March 2012.

[27] X.-l. Tang, et al, “An analysis of TZ search algorithm in JMVC,” in Green Circuits and Systems (ICGCS), 2010 International Conference on, ser. Green Circuits and Systems (ICGCS), 2010 International Conference on. Shanghai: pp. 516–520, IEEE, September 2010.

[28] L.N.A. Alves, and A. Navarro, " Fast Motion Estimation Algorithm for HEVC ", Proc IEEE International Conf. on Consumer Electronics - vol.11 , pp. 11 - 14 , ICCE Berlin , Germany, September 2012

[29] M. Jakubowski and G. Pastuszak, “Block-based motion estimation algorithms-a survey,” Journal of Opto-Electronics Review, vol. 21, pp 86-102, March 2013.

46

[30] Z. Pan, S. Kwong, L. Xu, Y. Zhang, T. Zhao, “Predictive and distribution-oriented fast motion estimation for H.264/AVC” Journal of Real-Time Image Processing, vol. 9, page(s): 597 – 607, December 2014.

[31] P. Nalluri et al, “High Speed SAD Architectures for variable block size estimation in HEVC video coding”, IEEE International Conference on Image Processing (ICIP). Page(s): 1233 – 1237, October 2014.

[32] M. A. B. Ayed, et al, “TZ Search pattern search improvement for HEVC motion estimation modules,” Advanced Technologies for Signal and Image Processing (ATSIP). Page(s): 95 – 99, March 2014.2014.

[33] Introduction to Motion estimation and Motion compensation---> http://www.cmlab.csie.ntu.edu.tw/cml/dsp/training/coding/motion/me1.html

[34] HM Software Manual - https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/

[35] Visual studio: http://www.dreamspark.com

[36] Tortoise SVN: http://tortoisesvn.net/downloads.html

[37] Tutorials---> N. Ling, “High efficiency video coding and its 3D extension: A research perspective,” Keynote Speech, IEEE Conference on Industrial Electronics and Applications, Singapore, July 2012.

[38] Tutorials---> X. Wang et al, “Paralleling variable block size motion estimation of HEVC on CPU plus GPU platform”, IEEE International Conference on Multimedia and Expo workshop, 2013.

[39] Tutorials---> H.R. Tohidpour, et al, “Content adaptive complexity reduction scheme for quality/fidelity scalable HEVC”, IEEE International Conference on Image Processing, pp. 1744-1748, June 2013.

[40] HEVC tutorial 2014 ISCAS ---> http://www.rle.mit.edu/eems/wp-content/uploads/2014/06/H.265-HEVC-Tutorial-2014-ISCAS.pdf

[41] Video Lecture on Digital Voice and Picture Communication by Prof.S. Sengupta, Department of Electronics and Electrical Communication Engineering IIT Kharagpur -> https://www.youtube.com/watch?v=Tm4C2ZFd3zE.

[42] Lecture on video coding standards ->http://nptel.iitm.ac.in

[43] YUV format figures -> http://www.dvxuser.com/V6/showthread.php?336009-Request-4K-10-bit-4-2-2-internal-for-the-DVX200!/page6.

47

Documents

1. Objective Of The Project - Ut Arlington · Web viewAMVP- Advance motion vector prediction. AP- Above Predictor. ARP- Above Right Predictor. B-frame- Bi-predictive frame. BMA -