Low Complexity Multiview Video Coding

of 36

Low Complexity Multiview Video Coding

Shadan Khattak

June 10, 2014

Centre for Electronic and Communications Engineering

De Montfort University


Outline

1. Background

2. Preliminaries

3. Fast Encoding Techniques for Multiview Video Coding

A. Previous Disparity Vector – Disparity Estimation (PDV-DE)

B. Stereo Motion Consistency Constraint based Motion and Disparity Estimation (SMCC-MDE)

C. Complete Low-Complexity Multiview Video Coding (CLCMVC)

4. Early SKIP Mode Decision for View Synthesis Prediction Enhanced Multiview Video Coding

5. Consistent Error Concealment of Multiview plus Depth Video Broadcast


Background

• Increasing trend of feature-rich devices
  • 3DTV, FTV, Immersive Teleconferencing

• Growing number of views

• Multiview Video Coding
  • 20-50% better compression
  • 10-19x increased complexity
  • Complexity reduction is required

• Multiview Video Transmission
  • Consistency constraint on the reconstructed video
  • Consistent reconstruction is required


Ref: Samsung, LG


Preliminaries:


MVC Approach

• Encode views dependently

• Frames can refer to other frames from the same view as well as those from the neighbouring views

Conventional Approach

• Encode each view independently

• Frames can refer to other frames from the same view.


1. Fast Encoding Techniques for MVC


1A – Previous Disparity Vector – Disparity Estimation (PDV-DE)

Motivation 1:

• Effect of including Disparity Estimation (DE) in the MVC encoder:

• Increases the search space => increased complexity

• Reduction in redundancy is required for fast encoding



Motivation 2:

• Previous disparity vector (PDV) as search centre

• Analysis of optimal Disparity Vectors

• For frames at different temporal levels (TLs)

• For different macroblock (MB) types

• Simple: View-neighbourhood used partition size 16x16

• Normal: View-neighbourhood used partition sizes 16x16, 16x8, 8x16

• Complex: View-neighbourhood used partition sizes 16x16, 16x8, 8x16, 8x8
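
The three MB types above can be written as a small classifier. This is a minimal sketch, assuming the type depends only on the set of partition sizes used in the view neighbourhood. The function name `classify_mb` and the string labels for partition sizes are hypothetical and are not taken from JMVM.

```python
# Illustrative sketch only: names and the exact membership test are assumptions.
SIMPLE_SIZES = {"16x16"}
NORMAL_SIZES = {"16x16", "16x8", "8x16"}


def classify_mb(neighbourhood_partitions):
    """Classify an MB from the partition sizes used in its view neighbourhood."""
    used = set(neighbourhood_partitions)
    if used <= SIMPLE_SIZES:
        return "Simple"    # nothing other than 16x16 was used
    if used <= NORMAL_SIZES:
        return "Normal"    # 16x8 and/or 8x16 appear, but no 8x8
    return "Complex"       # at least one partition outside the Normal set


print(classify_mb(["16x16", "8x16"]))  # Normal
```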

[Figure panels: TL4, TL3]


Algorithm Description

• Default search range: (±64, ±64)

• New search range:
  • (±64, ±64)
    • TL3 and MB is neither Simple nor Normal
    • TL1 or TL2
  • (±32, ±32), (±16, ±16), or (±8, ±8)
    • According to the flowchart
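
The selection rule above can be sketched as follows. The flowchart itself is not reproduced in this text, so it is represented by a hypothetical callable `flowchart_choice`. The function name, argument names, and the decision to delegate every remaining case to the flowchart are assumptions made for illustration.

```python
FULL_RANGE = 64               # default search range (±64, ±64)
REDUCED_RANGES = (32, 16, 8)  # reduced ranges listed above


def select_search_range(temporal_level, mb_type, flowchart_choice):
    """Return the symmetric search range for disparity estimation.

    flowchart_choice is a hypothetical stand-in for the flowchart; it takes
    (temporal_level, mb_type) and must return 32, 16 or 8.
    """
    if temporal_level in (1, 2):
        return FULL_RANGE
    if temporal_level == 3 and mb_type not in ("Simple", "Normal"):
        return FULL_RANGE
    reduced = flowchart_choice(temporal_level, mb_type)
    if reduced not in REDUCED_RANGES:
        raise ValueError("flowchart must pick 32, 16 or 8")
    return reduced
```

Passing the flowchart as a parameter keeps the two conditions stated on the slide separate from the part of the decision that cannot be recovered here.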



Results:

• JMVM 6.0 (with Fast TZ mode enabled) as baseline

• Reduction in encoding time: ~35%

• Effect on compression: negligible

• Performance analysis:
  • At different bitrates: largely consistent
  • For different sequences: largely consistent

Seq.      QP    ∆PSNR   ∆Bitrate (%)  ∆Time (%)  ∆Search Points (%)
Ballroom  20    0.01     0.12         35.29      38.66
          24    0.01    -0.19         35.52      39.15
          28    0.00    -0.48         34.63      38.20
          32    0.01    -0.62         32.95      37.69
          36    0.04    -0.12         32.08      36.88
          Avg.  0.01    -0.26         34.09      38.12
Exit      20    0.02     0.40         35.50      37.55
          24    0.02     0.48         37.59      39.90
          28    0.01     0.41         36.95      39.71
          32    0.02     0.28         36.09      39.04
          36    0.05     0.47         36.24      39.44
          Avg.  0.03     0.41         36.47      39.13
Average         0.02     0.07         35.28      38.62


1B – Stereo Motion Consistency Constraint based Motion and Disparity Estimation (SMCC-MDE)

Motivation:

• Stereo motion consistency constraint (SMCC):
  • A pixel-based geometrical constraint between motion and disparity vectors of stereo videos


$MV_{0,t} + DV_{1,t-1} = DV_{1,t} + MV_{1,t}$ (1)

$MV_{1,t} = MV_{0,t} + DV_{1,t-1} - DV_{1,t}$ (2)

$DV_{1,t} = DV_{1,t-1} + MV_{0,t} - MV_{1,t}$ (3)
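
As a worked illustration of the constraint, the sketch below predicts the motion vector of view 1 from Eq. (2) and then checks that the loop in Eq. (1) closes. The function names and the example vectors are made up for illustration; vectors are plain (x, y) tuples.

```python
def predict_mv1(mv0_t, dv1_prev, dv1_t):
    """Eq. (2): MV1,t = MV0,t + DV1,t-1 - DV1,t, applied per component."""
    return tuple(m + dp - d for m, dp, d in zip(mv0_t, dv1_prev, dv1_t))


def smcc_residual(mv0_t, dv1_prev, dv1_t, mv1_t):
    """Left side minus right side of Eq. (1); (0, 0) means the loop closes."""
    return tuple((m0 + dp) - (d + m1)
                 for m0, dp, d, m1 in zip(mv0_t, dv1_prev, dv1_t, mv1_t))


mv0_t, dv1_prev, dv1_t = (3, -1), (12, 0), (10, 1)   # made-up example vectors
mv1_t = predict_mv1(mv0_t, dv1_prev, dv1_t)
print(mv1_t)                                         # (5, -2)
print(smcc_residual(mv0_t, dv1_prev, dv1_t, mv1_t))  # (0, 0)
```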




• Extension of SMCC to block-based MVC
  • Non-aligned blocks => Maximum overlap rule

• Goal: Predict motion vector MV1,t and disparity vector DV1,t



Algorithm Description (Cont.)

• Obtain estimates of MV1,t and DV1,t:




• Refine the estimated motion and disparity vectors:


$SR = \max\left(\mathrm{Est}(MV_{1,t}^{x}),\ \mathrm{Est}(MV_{1,t}^{y})\right)$
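
A refinement step of this kind could look like the sketch below. Several details are assumptions rather than the method as presented: the search is centred on the estimated vector, component magnitudes are used so that SR is non-negative, and the matching cost is a plain sum of absolute differences (SAD). Frames are lists of rows and all names are hypothetical.

```python
def search_range(est_mv):
    """SR from the larger component of the estimated vector (magnitudes assumed)."""
    return max(abs(est_mv[0]), abs(est_mv[1]))


def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block and its displaced match."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))


def refine(cur, ref, bx, by, n, est_mv):
    """Full search within +/-SR of est_mv; returns the lowest-SAD vector."""
    sr = search_range(est_mv)
    h, w = len(ref), len(ref[0])
    best, best_cost = None, None
    for dy in range(est_mv[1] - sr, est_mv[1] + sr + 1):
        for dx in range(est_mv[0] - sr, est_mv[0] + sr + 1):
            # Skip candidates whose block would fall outside the reference frame.
            if not (0 <= bx + dx <= w - n and 0 <= by + dy <= h - n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best
```

With an accurate estimate the window is small, so only a few candidates are evaluated compared with a fixed (±64, ±64) search.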



Results

• JMVM 6.0 (with Fast TZ mode enabled) as baseline

• Reduction in encoding time: ~41%

• Reduction in compression performance: negligible

• Performance analysis:
  • More time savings at lower QPs
  • Estimated motion vectors depend on differences between motion vectors of two frames
  • This difference is generally large when coarse quantization (higher QPs) is used

Seq.      QP    ∆PSNR   ∆Bitrate (%)  ∆Time (%)  ∆Search Points (%)
Ballroom  20     0.00    0.30         46.42      48.19
          24     0.00    0.53         46.09      47.66
          28    -0.01    0.81         45.28      48.48
          32    -0.01    0.74         43.87      47.96
          36     0.00    0.77         38.20      45.57
          Avg.   0.00    0.63         43.97      47.57
Exit      20    -0.01    0.04         42.25      46.62
          24     0.00    0.30         40.64      40.88
          28    -0.01    0.57         40.39      40.67
          32    -0.02    0.05         39.00      39.68
          36     0.01    0.31         35.73      37.73
          Avg.   0.00    0.25         39.60      41.12
Average          0.00    0.44         41.78      44.34


1C – Complete Low-Complexity Multiview Video Coding

Motivation

• Overall complexity of the MVC encoder can be broken down into four levels

1. Mode Selection

2. Prediction Direction

3. Reference Direction

4. Block matching


Level 1: Different prediction modes

Level 2: Different prediction directions



Motivation (Cont.)

• PDV-DE and SMCC-MDE work at level 4

• Combining them with state-of-the-art methods for other levels can result in a complete low-complexity encoding solution for MVC.


Level 3: Different reference directions

Level 4: Block matching (Search range)



Results

[Figure: comparison of [5] and [2]+[4]+[5] for High Motion, Low Motion, Large Disparity, Small Disparity, High Bit rates and Low Bit rates]

• It is possible to add up gains by combining state-of-the-art methods that work at different levels.
  • e.g., [2]+[4]+[5] significantly better than [5].

• Scope for improvement:
  • High motion, Large disparity, High bit rates



Results (Cont.)

[Figure: comparison of [5], [2]+[4]+[5] and CLCMVC for High Motion, Low Motion, Large Disparity, Small Disparity, High Bit rates and Low Bit rates]

• Reduction in encoding time: ~93.7%

• Reduction in compression performance: negligible

• Performance analysis:
  • Improvement over other methods more significant at high bitrates
  • Consistently achieves over 90% reduction in encoding time.



Results


2. Bayesian Early SKIP mode decision method for View Synthesis Prediction Enhanced MVC



Motivation

• View Synthesis Prediction (VSP) modes have been shown to improve the compression performance of MVC
  • VSP modes already incorporated in the test models of upcoming compression standards (3D-AVC)

• Testing additional modes further increases the computational complexity of the MVC encoder

• Mode selection can be formulated as a Bayesian decision problem
  • i.e., select mode mi, if:
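
For reference, the textbook Bayes decision rule has the form below. The exact expression shown on the original slide is not recoverable from this text, so this is the standard formulation, assuming $x$ is the observed feature, $P(m_i)$ the prior probability of mode $m_i$, and $p(x \mid m_i)$ its class-conditional density:

```latex
\[
P(m_i)\, p(x \mid m_i) \;>\; P(m_j)\, p(x \mid m_j) \quad \text{for all } j \neq i
\]
```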



Motivation (Cont.)

• P(mi) and p(x|mi) can be estimated from V2 (Fig. 1).

• The PDF of the random variable x can be modelled using a log-normal distribution (Fig. 2).


Fig. 1

Fig. 2



• To account for Bayes error and the fact that P(mi) and p(x|mi) are estimates from a different view, a tolerance threshold e is introduced.
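
A decision of this kind might be sketched as follows. This is an illustration under stated assumptions, not the method as presented: the problem is reduced to SKIP versus all other modes, the feature x is left unspecified, and the tolerance e is applied as a margin on the posterior probability. All function and parameter names are hypothetical.

```python
import math


def lognormal_pdf(x, mu, sigma):
    """Density of a log-normal distribution with log-mean mu and log-std sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def early_skip(x, prior_skip, skip_params, other_params, e=0.0):
    """Return True if SKIP can be chosen without testing the other modes.

    prior_skip, skip_params and other_params (each a (mu, sigma) pair) are
    assumed to have been estimated from a previously coded view.
    Assumption: SKIP must exceed a posterior of 0.5 by the margin e.
    """
    score_skip = prior_skip * lognormal_pdf(x, *skip_params)
    score_other = (1.0 - prior_skip) * lognormal_pdf(x, *other_params)
    total = score_skip + score_other
    if total == 0.0:
        return False  # no evidence either way: fall back to full mode search
    return score_skip / total >= 0.5 + e
```

A larger e makes the early decision more conservative, so more macroblocks go through the full mode search.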



Results

• Reduction in complexity: ~30%
  • Around 12% more reduction compared to a baseline method.

• Reduction in compression performance: negligible

• Performance analysis:
  • More time saving at higher QPs.
  • Larger modes are selected more often at higher QPs.


3. Consistent Error Concealment Technique for Multiview Video Plus Depth Broadcast


Motivation

• Inconsistent reconstruction of frames leads to flickering artifacts in 3D videos.

• The availability of multiple views and their associated depth maps in MVD representation can be useful in detecting inconsistencies between frames.




• Create a set C of candidate reconstructed blocks

• Evaluate ICF for each candidate reconstruction

• Choose the candidate with the smallest ICF
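
The three steps above amount to a minimum search over the candidate set. The sketch below assumes the ICF is available as a callable. How the ICF is computed is not specified in this text, so `icf` is a hypothetical parameter and the toy cost in the usage line is made up.

```python
def conceal_block(candidates, icf):
    """Return the candidate reconstruction with the smallest ICF value.

    candidates: the set C of candidate reconstructed blocks (any iterable).
    icf: hypothetical callable mapping a candidate to its ICF value.
    """
    candidates = list(candidates)
    if not candidates:
        raise ValueError("candidate set C is empty")
    return min(candidates, key=icf)


# Toy usage: blocks are short lists and the stand-in cost is their sum.
print(conceal_block([[5, 5], [1, 2], [9, 0]], icf=sum))  # [1, 2]
```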



Candidate Reconstructed MBs



PSNR Results


[Figure: zoomed, cropped portions comparing baseline BMA and the proposed method]

[Figure: further comparisons of baseline BMA and the proposed method]

[Figure: full size frames and zoomed, cropped portions of the original frame, the frame reconstructed using baseline BMA, and the frame reconstructed using the proposed method]


Questions?
