Guillaume Laroche, Joel Jung, Beatrice Pesquet-Popescu
CSVT 2008
For the purpose of reducing the bitrate, the paper proposes two schemes:
A competition-based spatial-temporal scheme for the prediction of motion vectors
A competition-based SKIP mode that increases the number of skipped macroblocks
Introduction
  MV prediction and selection
MV and SKIP mode competition
  Competition-based MV coding
  Competition-based SKIP mode
  Multiple reference frames
  MV competition for B-slices
Experimental Results
Conclusion
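[Figure: current block mv in Frame N with its spatial neighbors mva, mvb, mvc and mvd; collocated block mvcol in Frame N-1 with its neighbors mv0 to mv7]
mvcol is the motion vector of the block in Frame N-1 that is collocated with the current block “mv” in Frame N.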
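We choose the predictor as the median of the neighboring motion vectors:
$$p = \mathrm{median}\{mv_a, mv_b, mv_c\}$$
The motion vector residual is given by:
$$\varepsilon_{mv} = mv - p$$
εmv : motion vector residual
mv : motion vector
p : motion vector predictor (MVp)
[Figure: current block E with its neighboring blocks A, B and C]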
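As a rough illustration of the two definitions above (p = median{mva, mvb, mvc} and εmv = mv − p), the following Python sketch computes a median predictor and the resulting residual. It assumes integer (x, y) vectors and a component-wise median, as in H.264; the function names and the sample vectors are made up for the example.

```python
from statistics import median

def median_predictor(mv_a, mv_b, mv_c):
    """Component-wise median of three (x, y) motion vectors."""
    return (median([mv_a[0], mv_b[0], mv_c[0]]),
            median([mv_a[1], mv_b[1], mv_c[1]]))

def mv_residual(mv, p):
    """Residual that would be coded instead of the vector itself."""
    return (mv[0] - p[0], mv[1] - p[1])

# Hypothetical neighboring vectors and current vector.
p = median_predictor((4, -2), (6, 0), (5, 3))
eps = mv_residual((7, 1), p)
```

The closer the predictor is to the actual vector, the smaller the residual and the fewer bits it costs, which is what motivates letting several predictors compete.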
For a skipped MB, only the mode itself needs to be transmitted.
SKIP mode is mostly used in static background regions.
Direct mode has two types: spatial and temporal.
Spatial direct mode uses neighboring MVs to predict the MV.
In temporal direct mode, the list0 and list1 predicted vectors are scaled from the collocated vector.
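[Figure: Ref0, Ref1, Ref2 and the current B frame, with the collocated vectors mvcolL1 and mvcolL0, the predicted vectors mv^0_{L1} and mv^1_{L1}, and the temporal distances dL0, dL0L1 and dL0L2]
$$mv_{L1}^{0} = \frac{mv_{colL1}}{d_{L0L1}} \times d_{L0}$$
$$mv_{L1}^{1} = \frac{mv_{colL1}}{d_{L0L1}} \times (d_{L0} - d_{L0L1})$$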
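The temporal direct scaling can be sketched in Python as follows, reading the slide's formulas as mv0 = mvcolL1 × dL0 / dL0L1 and mv1 = mvcolL1 × (dL0 − dL0L1) / dL0L1. This is only an illustration: it uses exact fractions, whereas a real codec works with fixed-point arithmetic and rounding, and the function name and sample values are assumptions.

```python
from fractions import Fraction

def temporal_direct(mv_col_l1, d_l0, d_l0l1):
    """Scale the collocated vector into a (list0, list1) pair.

    d_l0   : temporal distance from the current B frame to Ref0
    d_l0l1 : temporal distance between Ref0 and Ref1
    """
    s0 = Fraction(d_l0, d_l0l1)
    s1 = Fraction(d_l0 - d_l0l1, d_l0l1)
    mv0 = tuple(c * s0 for c in mv_col_l1)
    mv1 = tuple(c * s1 for c in mv_col_l1)
    return mv0, mv1

# Hypothetical case: B frame one interval after Ref0, Ref1 three intervals after Ref0.
mv0, mv1 = temporal_direct((6, -3), d_l0=1, d_l0l1=3)
```

With these values the two vectors point in opposite directions, and their difference equals the collocated vector, as expected under the constant-motion assumption.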
Coding choices are made by minimizing the RD-criterion:
$$J = D + \lambda R$$
D : distortion
λR : weighted rate, with the corresponding bitrate components:
$$\lambda R = R_r + \lambda_m R_m + \lambda_{mv} R_{mv} + \lambda_o R_o$$
Rr : the rate for the block residue (luma + chroma)
Rm : the rate of the macroblock mode (SKIP or intra/inter prediction and macroblock partition type)
Rmv : the rate of the motion vector residue
Ro : the rate of the other components (header, CBP…)
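A minimal Python sketch of an RD-based decision, assuming the cost has the form J = D + (weighted sum of the rate components Rr, Rm, Rmv, Ro). The per-component weights, the distortion values and the bit counts below are hypothetical and only serve to show how the candidate with the lowest J is picked.

```python
def rd_cost(distortion, rates, weights):
    """J = D + weighted rate; rates and weights are keyed by component name."""
    return distortion + sum(weights[k] * rates[k] for k in rates)

# Hypothetical weights for the residue, mode, motion vector and other components.
weights = {"r": 5, "m": 5, "mv": 5, "o": 5}

# Hypothetical (distortion, bits per component) for two candidate modes.
candidates = {
    "SKIP": (900, {"m": 1}),
    "INTER_16x16": (400, {"r": 60, "m": 3, "mv": 8, "o": 4}),
}

costs = {mode: rd_cost(d, rates, weights) for mode, (d, rates) in candidates.items()}
best_mode = min(costs, key=costs.get)
```

SKIP spends almost no bits but accepts whatever distortion the predicted block gives, so it wins only when that distortion is small enough.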
For SKIP mode, the RD-criterion becomes:
$$J_{SKIP} = D_{SKIP} + \lambda_m R_m$$
since none of Ro, Rr, or Rmv needs to be transmitted in SKIP mode.
In practice, the cost λmRm is negligible compared with the distortion.
Predictor set: spatial predictors
mva, mvb, mvc, mvd, the H.264 median predictor mvH.264, and the extended spatial predictor mvspaEXT, where
$$mv_{spaEXT} = \mathrm{median}\{mv_a, mv_b, mv_c\}$$
if all 3 vectors are available. Otherwise mvspaEXT is equal to mva if it is available, otherwise to mvb, otherwise to mvc, or to 0 if none is available.
[Figure: current block mv in Frame N with its spatial neighbors mva, mvb, mvc and mvd]
Ref: J. Jung and G. Laroche, “Competition-based scheme for motion vector selection and coding” ITU-T VCEG, Klagenfurt, Austria, 2006, Information VCEG-AC06
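The fallback rule for mvspaEXT can be sketched in Python as below. The sketch assumes that an unavailable neighbor is represented by None, that the median is taken component-wise, and that the fallback order is mva, then mvb, then mvc, then the zero vector; the function name is made up for the example.

```python
from statistics import median

def mv_spa_ext(mv_a=None, mv_b=None, mv_c=None):
    """Extended spatial predictor with fallback when neighbors are missing."""
    neighbors = (mv_a, mv_b, mv_c)
    if all(mv is not None for mv in neighbors):
        # All three vectors available: component-wise median.
        return (median(mv[0] for mv in neighbors),
                median(mv[1] for mv in neighbors))
    for mv in neighbors:
        # Otherwise take the first available vector in the order a, b, c.
        if mv is not None:
            return mv
    return (0, 0)  # no neighbor available
```

For example, `mv_spa_ext(None, (1, 2), (3, 4))` falls back to mvb because mva is missing.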
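Predictor set: temporal predictors
mvcol, mvtf, mvtm5, mvtm9, where
$$mv_{tm5} = \mathrm{median}\{mv_{col}, \{mv_i, 0 \le i < 4\}\}$$
$$mv_{tm9} = \mathrm{median}\{mv_{col}, \{mv_i, 0 \le i < 8\}\}$$
[Figure: collocated block mvcol in Frame N-1 with its neighbors mv0 to mv7, and current block mv in Frame N with its neighbors mva, mvb, mvc and mvd; a second panel shows mvtf, mvcol and mvH.264 for the current block and the collocated block across Ref1, Ref0 and the current frame]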
Predictor set: spatial-temporal predictor
$$mv_{spt} = \mathrm{median}\{mv_{col}, mv_{col}, mv_a, mv_b, mv_c\}$$
Counting mvcol twice gives a higher importance to the mvcol value.
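To see how counting mvcol twice shifts the median toward the temporal candidate, here is a small Python sketch of mvspt = median{mvcol, mvcol, mva, mvb, mvc}. The component-wise median and the sample vectors are assumptions made for the illustration.

```python
from statistics import median

def mv_spt(mv_col, mv_a, mv_b, mv_c):
    """Spatial-temporal predictor: median over five vectors, mv_col counted twice."""
    vectors = [mv_col, mv_col, mv_a, mv_b, mv_c]
    return (median(v[0] for v in vectors), median(v[1] for v in vectors))

# Hypothetical vectors.
mv_col, mv_a, mv_b, mv_c = (4, 4), (0, 0), (2, 2), (6, 6)
spt = mv_spt(mv_col, mv_a, mv_b, mv_c)

# Purely spatial median of the same neighbors, for comparison.
spatial_only = (median([mv_a[0], mv_b[0], mv_c[0]]),
                median([mv_a[1], mv_b[1], mv_c[1]]))
```

In this example the spatial median is (2, 2), while the doubled mvcol pulls the spatial-temporal median to (4, 4).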
Choices of MV:
Adaptive choices
  Based on content or statistical criteria
  No need to transmit the index of the mode if the decoder is able to determine the mode
Exhaustive choices
  All possible predictions are tested
  A mode needs to be transmitted in the bit stream
  An index i and a residual εmvi are associated with each predictor:
$$p_i \in P, \quad \varepsilon_{mv_i} = mv - p_i, \quad i \in [1, n]$$
where n is the number of predictors in the defined predictor set P
For the selection of the MV, the bitrate of the motion vector residue Rmv is replaced by Rmv/mm to yield:
$$\lambda R = R_r + \lambda_m R_m + \lambda_{mv} R_{mv/mm} + \lambda_o R_o$$
where Rmv/mm contains the cost of the residual εmvi and the cost of the index information i:
$$R_{mv/mm} = \min_{i = 1, \dots, n} \left[ \mathrm{cost}(\varepsilon_{mv_i}) + \mathrm{cost}(i) \right]$$
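The exhaustive competition can be sketched in Python as follows: every predictor in the set is tried, and the one whose residual plus index costs the fewest bits is kept. The bit-cost model is an assumption, not the paper's: residual components are costed as signed Exp-Golomb codes (the se(v) length used by H.264 CAVLC) and the index as a fixed-length code. Indices are 0-based here, whereas the slides count from 1.

```python
def se_bits(v):
    """Length in bits of the signed Exp-Golomb code for integer v."""
    k = 2 * abs(v) - (1 if v > 0 else 0)   # map signed value to code number
    return 2 * (k + 1).bit_length() - 1

def index_bits(n):
    """Assumed fixed-length index cost; 0 bits when there is a single predictor."""
    return (n - 1).bit_length()

def select_predictor(mv, predictors):
    """Return (index, residual, bits) of the cheapest predictor."""
    n = len(predictors)
    best = None
    for i, p in enumerate(predictors):
        eps = (mv[0] - p[0], mv[1] - p[1])
        bits = se_bits(eps[0]) + se_bits(eps[1]) + index_bits(n)
        if best is None or bits < best[2]:
            best = (i, eps, bits)
    return best

# Hypothetical set with two predictors.
choice = select_predictor((5, -3), [(0, 0), (5, -2)])
```

Here the second predictor wins with 5 bits against 13 for the first, even though signalling the index costs one extra bit.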
We change the equation
$$J_{SKIP} = D_{SKIP} + \lambda_m R_m$$
to
$$J_{SKIP}^{i} = D_{SKIP}^{i} + \lambda_m R_m(i), \quad i = 1, \dots, n_s$$
JSKIPi : RD cost
DSKIPi : distortion related to pi, with
$$p_i \in P_s$$
where Ps is the set of motion vectors for the SKIP mode and ns is the number of predictors in Ps
If SKIP mode is chosen, the index of the predictor is sent.
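A small Python sketch of the competition-based SKIP decision, assuming each candidate cost has the form DSKIPi + λm × (mode bits for index i). It makes several simplifications that are not from the slides: frames are plain lists of rows, vectors are integer-pel, no bounds checking is done, distortion is the sum of squared differences, and the bit counts are hypothetical.

```python
def block_at(frame, x, y, size):
    """Extract a size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]

def ssd(a, b):
    """Sum of squared differences between two equally sized blocks."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def skip_costs(cur, ref, x, y, size, predictors, mode_bits, lam_m):
    """RD cost of skipping the block with each candidate predictor."""
    target = block_at(cur, x, y, size)
    costs = []
    for p, bits in zip(predictors, mode_bits):
        pred = block_at(ref, x + p[0], y + p[1], size)
        costs.append(ssd(target, pred) + lam_m * bits)
    return costs

ref = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
cur = [[0, 0, 0, 0], [0, 6, 7, 0], [0, 10, 11, 0], [0, 0, 0, 0]]

costs = skip_costs(cur, ref, x=1, y=1, size=2,
                   predictors=[(0, 0), (1, 0)], mode_bits=[1, 2], lam_m=1)
best_index = min(range(len(costs)), key=costs.__getitem__)
```

The second predictor matches the block exactly, so it is chosen despite its slightly higher signalling cost.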
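Assuming an object moves with constant speed, the predictor mvcolR0 is scaled according to the temporal distance of the reference picture used by the current block and the temporal distance between Ref0 and Refj:
$$mv_{Scol}^{R_0} = \frac{mv_{col}^{R_0}}{d_j} \times d_i$$
mvScolR0 : scaled predictor
Ref0 : previous reference frame
di : temporal distance between the current frame and Refi, the reference picture used by the current block
dj : temporal distance between Ref0 and Refj
[Figure: current frame, Ref0, Refi and Refj, showing mv, mvcolR0 and the distances di and dj]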
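Under the constant-speed assumption, the scaling amounts to multiplying the collocated vector by a ratio of temporal distances. The Python sketch below assumes that ratio is di / dj, with di the distance from the current frame to the reference used by the current block and dj the distance spanned by the collocated vector (Ref0 to Refj). Exact fractions are used for clarity; the function name and values are hypothetical.

```python
from fractions import Fraction

def scale_collocated(mv_col_r0, d_i, d_j):
    """Rescale a collocated vector spanning d_j frames to a span of d_i frames."""
    s = Fraction(d_i, d_j)
    return tuple(c * s for c in mv_col_r0)

# Hypothetical case: the collocated vector spans two frame intervals,
# while the current block refers to a frame one interval away.
mv_scol = scale_collocated((8, -4), d_i=1, d_j=2)
```

Halving the temporal span halves the expected displacement, giving (4, -2) here.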
Another predictor: the sum of temporally successive collocated vectors
Consider that all the MVs in each reference frame point only to their first previous frame. In this configuration, mvScoli is the scaled MV collocated in Refi pointing to Refi+1.
The sum of these successive temporal predictors, mvTsumj, is defined by:
$$mv_{Tsum}^{j} = \sum_{i=0}^{j} mv_{Scol}^{i}, \quad j \in \mathbb{N}$$
j : the reference frame number of the current predictor block
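Reading the definition as a sum of the collocated vectors mvScoli for i from 0 to j, the predictor can be sketched in Python as below. The list layout (entry i holds mvScoli) and the sample vectors are assumptions made for the example.

```python
def mv_tsum(scaled_cols, j):
    """Sum the successive collocated vectors mvScol0 .. mvScolj.

    scaled_cols[i] is assumed to hold the vector collocated in Ref_i
    that points to Ref_(i+1).
    """
    xs, ys = zip(*scaled_cols[: j + 1])
    return (sum(xs), sum(ys))

# Hypothetical chain of collocated vectors for Ref0, Ref1 and Ref2.
scaled_cols = [(2, -1), (3, 0), (1, 1)]
tsum_0 = mv_tsum(scaled_cols, 0)
tsum_2 = mv_tsum(scaled_cols, 2)
```

For reference frame 0 the result is just the first collocated vector; for farther reference frames the per-frame motion accumulates.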
We consider mvtfsumj, a sum of predictors derived from the predictor mvtf:
$$mv_{tfsum}^{j} = \sum_{i=0}^{j} mv_{Stf}^{R_i}, \quad j \in \mathbb{N}$$
mvStfRi is the MV at the position given by mvStfRi-1 in Refi-1 pointing to Refi, except mvStfR0, which is mvScol0.
[Figure: Ref3, Ref2, Ref1, Ref0 and the current B frame, showing the collocated vectors mvScol0 to mvScol3 and the chained vectors mvStfR0 (= mvScol0) to mvStfR3]
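No modification of the Direct mode is proposed.
The MV resulting from the spatial Direct mode is not considered in the set of predictors.
Consider the case of N successively coded B-frames.
[Figure: Ref0, Ref1, Ref2 and the current B frame, with the collocated vectors mvcolL1 and mvcolL0, the predicted vectors mv^0_{L1}, mv^1_{L1}, mv^0_{L2} and mv^1_{L2}, and the temporal distances dL0, dL0L1 and dL0L2]
$$mv_{L1}^{0} = \frac{mv_{colL1}}{d_{L0L1}} \times d_{L0}$$
$$mv_{L1}^{1} = \frac{mv_{colL1}}{d_{L0L1}} \times (d_{L0} - d_{L0L1})$$
$$mv_{L2}^{0} = \frac{mv_{colL0}}{d_{L0L2}} \times d_{L0}$$
$$mv_{L2}^{1} = \frac{mv_{colL0}}{d_{L0L2}} \times (d_{L0} - d_{L0L1})$$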
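The vectors mvcolB-1L0 and mvcolB-1L1 are used for the scaling of the predictor pairs (mv^0_{L3}, mv^1_{L3}) and (mv^0_{L4}, mv^1_{L4}), respectively.
[Figure: Ref0, Ref1, the previous B frame B-1 and the current B frame, with the collocated vectors mvcolB-1L0 and mvcolB-1L1, the predicted vectors mv^0_{L3}, mv^1_{L3}, mv^0_{L4} and mv^1_{L4}, and the temporal distances dL0, dL0L1 and dL0B-1]
$$mv_{L3}^{0} = \frac{mv_{colL0}^{B-1}}{d_{L0B-1}} \times d_{L0}$$
$$mv_{L3}^{1} = \frac{mv_{colL0}^{B-1}}{d_{L0B-1}} \times (d_{L0} - d_{L0L1})$$
$$mv_{L4}^{0} = \frac{mv_{colL1}^{B-1}}{d_{L0B-1} - d_{L0L1}} \times d_{L0}$$
$$mv_{L4}^{1} = \frac{mv_{colL1}^{B-1}}{d_{L0L1} - d_{L0B-1}} \times (d_{L0L1} - d_{L0})$$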
Bitrate saving on the first and second B-frames for CIF sequences
First predictor: mvH.264
mvcolL1: MV collocated in the future frame, without scaling
mvBcol = (collocated block == intra mode) ? mva : mvScolL1
The results for mvScolL0 and mvScolL1 show that the MV field of a B-frame is more correlated with the future reference frame.
Two profiles: Baseline profile, High profile
32*32 search range
8*8 transform
4 reference frames
Test set: 9 CIF, 4 SD (640*480), and 2 720p (1280*720) sequences
QP = 28, 32, 36, 40
Predictor sets: 11 predictors in the set P:
$$mv_{H.264},\ mv_a,\ mv_b,\ mv_c,\ mv_{Scol}^{R_0},\ mv_{tm5},\ mv_{tm9},\ mv_{tf},\ mv_{spt},\ mv_{Tsum}^{j},\ mv_{tfsum}^{j}$$
Percentage of the selection of each proposed predictor for MV competition for the CIF test set in the Baseline profile:
Comparing P sets containing two predictors
For all CIF sequences, mvH.264 is combined one by one with each predictor.
The bitrate savings for different pairs of predictors:
Selecting the optimal number of predictors in the sets
The P sets of MV predictors are:
$$P_1 = \{mv_{H.264}\}$$
$$P_2 = \{mv_{H.264}, mv_{Scol}^{R_0}\}$$
$$P_4 = \{mv_{H.264}, mv_{Scol}^{R_0}, mv_a, mv_{tm9}\}$$
The Ps sets of MV predictors for the SKIP mode are:
$$P_{s1} = \{mv_{H.264}\}$$
$$P_{s2} = \{mv_{spaEXT}, mv_a\}$$
$$P_{s4} = \{mv_{spaEXT}, mv_{H.264}, mv_{Scol}^{R_0}, mv_a\}$$
Spatial and temporal predictor competition
Temporal predictors are useful
The temporal selection is correlated with the reference frame
Percentage increase in the number of macroblocks encoded with the SKIP mode
For sequences with large objects and fluid motion
A spatial predictor as the second predictor is less efficient for sequences with static background
A compression gain is obtained for all test sequences
For sequences with simple or no motion, SKIP mode is widely used, so the gains are lower.
Sequences with fast or complex motion take full advantage of the temporal prediction.
RD curves for 4 of the test sequences
At low bitrates, motion information tends to become a significant part of the total bitstream.
The bitrate reduction is not related to the resolution, but to the frame rate.
The problem is modified due to the presence of B pictures and multiple reference frames:
Is the P set used for the P-frames in the Baseline profile still suited to the High profile, where the temporal distance between P-frames is increased?
Which set is best suited to the B-frames, and is it the same for all the B-frames between two P-frames?
The same sets as the ones proposed for the Baseline profile give the best results.
The temporal distance between two P-frames is larger, so the temporal correlation between motion vector fields is smaller.
Distribution of the predictor selection in the High IBBP profile for the P- and B-frames
Bitrate saving in the High IBBP profile (only computed for CIF sequences)
Bitrate reduction for each sequence
The gain is lower than in the Baseline profile, which is explained by the results obtained on P-frames.
The average bitrate reductions for the Baseline and High profiles are 7.7% and 4.3%, respectively.
The MV predictors are selected via an RD-criterion that considers the cost of the residual and of the predictor index.
Adapting the predictor set to the statistical characteristics of the sequence should increase the bitrate savings even further.