Video Watermarking
Real-time Labeling of MPEG-2 Compressed Video
G. C. Langelaar, R. L. Lagendijk, and J. Biemond
ITS, ICTG, Delft University of Technology
Multimedia Security
Outline
• Introduction to MPEG video compression standards
• Watermarking MPEG compressed video sequences
– Labeling in the bit domain
– Labeling in the coefficient domain
• Conclusion
Introduction to MPEG Video Compression Standard
MPEG Video Encoding
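[Figure: MPEG video encoder block diagram. I-frame path: RGB input, optional conversion to another color space (YIQ), DCT, Quant, Zig-Zag scan, DPCM and RLE, Huffman or arithmetic coding, output bit-stream 01001... P/B-frame path: motion estimation against the forward and backward frame buffers (YIQ) yields a motion vector and a difference image; IQuant and IDCT reconstruct and update the frame buffers.]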
Discrete Cosine Transform
• DCT
– Converts the spatial-domain signal (an 8×8 block of pixels) into the frequency domain

Pixel block x(n,m):
88 84 83 84 85 86 83 82
86 82 82 83 82 83 83 81
82 82 84 87 87 87 81 84
81 86 87 89 82 82 84 87
81 84 83 87 85 89 80 81
81 85 85 86 81 89 81 85
82 81 86 83 86 89 81 84
88 88 90 84 85 88 88 81

DCT coefficients DCT(k,l):
675  1 -6  2 -2  0  5 -5
 -4  1  2  1  5  1 -3  0
  2  3  4  6 -2  2  1  5
 -3 -1  0  2  0 -2  2 -4
  4  3  1 -1 -2  1 -3  1
  1 -2  0 -3  2 -1  1  1
  3  0 -1  0 -1 -1  0 -2
 -1 -1 -5  5  2 -2  2  0

$$\mathrm{DCT}(k,l) = \frac{c(k)}{2}\,\frac{c(l)}{2} \sum_{n=0}^{7}\sum_{m=0}^{7} x(n,m)\,\cos\!\left(\frac{(2n+1)k\pi}{16}\right)\cos\!\left(\frac{(2m+1)l\pi}{16}\right), \qquad c(0)=\frac{1}{\sqrt{2}},\quad c(x)=1 \text{ for } x \neq 0$$
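As a rough illustration of the 8×8 DCT on this slide, the sketch below evaluates the standard two-dimensional DCT directly on the slide's pixel block. The function names are hypothetical, and the direct four-fold loop is chosen for readability, not speed.

```python
import math

# Pixel block copied from the slide (row n, column m).
PIXELS = [
    [88, 84, 83, 84, 85, 86, 83, 82],
    [86, 82, 82, 83, 82, 83, 83, 81],
    [82, 82, 84, 87, 87, 87, 81, 84],
    [81, 86, 87, 89, 82, 82, 84, 87],
    [81, 84, 83, 87, 85, 89, 80, 81],
    [81, 85, 85, 86, 81, 89, 81, 85],
    [82, 81, 86, 83, 86, 89, 81, 84],
    [88, 88, 90, 84, 85, 88, 88, 81],
]

def c(k):
    """Normalisation factor: 1/sqrt(2) for the DC index, 1 otherwise."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def dct_8x8(x):
    """Direct evaluation of the 8x8 two-dimensional DCT."""
    out = [[0.0] * 8 for _ in range(8)]
    for k in range(8):
        for l in range(8):
            s = 0.0
            for n in range(8):
                for m in range(8):
                    s += (x[n][m]
                          * math.cos((2 * n + 1) * k * math.pi / 16)
                          * math.cos((2 * m + 1) * l * math.pi / 16))
            out[k][l] = (c(k) / 2) * (c(l) / 2) * s
    return out

coeffs = dct_8x8(PIXELS)
dc = coeffs[0][0]  # equals the pixel sum divided by 8
```

With this normalisation the DC term is the pixel sum divided by 8, which agrees with the value 675 shown for the DC coefficient after rounding up.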
Quantization
• Quantization
– Lossy data compression technique
– Each coefficient is divided by a quantizer step size and the fraction is discarded, e.g. 4/2 = 2, 5/2 = 2

DCT coefficients X(i,j):
675  1 -6  2 -2  0  5 -5
 -4  1  2  1  5  1 -3  0
  2  3  4  6 -2  2  1  5
 -3 -1  0  2  0 -2  2 -4
  4  3  1 -1 -2  1 -3  1
  1 -2  0 -3  2 -1  1  1
  3  0 -1  0 -1 -1  0 -2
 -1 -1 -5  5  2 -2  2  0

Quantized coefficients Xq(i,j):
168  0 -1  0  0  0  1 -1
 -1  0  0  0  1  0  0  0
  0  0  1  1  0  0  0  1
  0  0  0  0  0  0  0 -1
  1  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0
  0  0 -1  1  0  0  0  0

Xq(i,j) = X(i,j) / Q(i,j)
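The numbers on this slide are consistent with a uniform step size Q(i,j) = 4 and truncation toward zero. That step size is inferred from the example, not stated on the slide, so the following is only a sketch under that assumption.

```python
# DCT coefficients copied from the slide.
DCT_COEFFS = [
    [675, 1, -6, 2, -2, 0, 5, -5],
    [-4, 1, 2, 1, 5, 1, -3, 0],
    [2, 3, 4, 6, -2, 2, 1, 5],
    [-3, -1, 0, 2, 0, -2, 2, -4],
    [4, 3, 1, -1, -2, 1, -3, 1],
    [1, -2, 0, -3, 2, -1, 1, 1],
    [3, 0, -1, 0, -1, -1, 0, -2],
    [-1, -1, -5, 5, 2, -2, 2, 0],
]

def quantize(block, step=4):
    """Divide every coefficient by `step` and truncate toward zero.

    step=4 is an assumption inferred from the slide's example values.
    """
    return [[int(v / step) for v in row] for row in block]

quantized = quantize(DCT_COEFFS)
nonzero_before = sum(v != 0 for row in DCT_COEFFS for v in row)
nonzero_after = sum(v != 0 for row in quantized for v in row)
```

Comparing `nonzero_before` with `nonzero_after` shows how many coefficients the quantizer turns into zeros, which is what makes the later zigzag and run-length stages effective.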
ZigZag Scan
• ZigZag Scan
– Reorders the coefficients so that the zeros end up in long consecutive runs
– Increases the compression gain of the run-length coding that follows

Quantized block:
168  0 -1  0  0  0  1 -1
 -1  0  0  0  1  0  0  0
  0  0  1  1  0  0  0  1
  0  0  0  0  0  0  0 -1
  1  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0
  0  0 -1  1  0  0  0  0

Zigzag output:
168,0,-1,0,0,0,0,0,0,0,0,1, 0,1,0,0,0,1,1,...,1,0,0,0,-1, 0,0,0,0,0,0,0,0,0,0
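A minimal sketch of the conventional zigzag ordering follows. The helper names are hypothetical, and because the sequence printed on the slide is abbreviated, the sketch is not guaranteed to reproduce it digit for digit.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag order.

    Positions are grouped by anti-diagonal (row + col). Odd diagonals are
    walked with increasing row, even diagonals with decreasing row.
    """
    positions = [(i, j) for i in range(n) for j in range(n)]
    return sorted(
        positions,
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]),
    )

def zigzag_scan(block):
    """Flatten a square block into a list following the zigzag order."""
    return [block[i][j] for i, j in zigzag_order(len(block))]

order = zigzag_order()
small = zigzag_scan([[1, 2], [3, 4]])
```

The scan starts at the DC coefficient and moves toward the high frequencies, so coefficients that the quantizer has zeroed tend to cluster at the end of the list.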
Run Length Encode
• Run Length Encode
– Encodes a run of consecutive zeros and the nonzero value that follows it into one symbol

Zigzag sequence:
168,0,-1,0,0,0,0,0,0,0,0,1, 0,1,0,0,0,1,1,...,1,0,0,0,-1, 0,0,0,0,0,0,0,0,0,0
RLE output:
(1,-1) (8,1) (1,1) (3,1) (0,1) (9,1) (3,-1) (14,1) (4,-1) (0,1) (3,-1) EOB

RLE(x0,x1,x2,…) = (zeros, nonzero) ...
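The run-length step can be sketched as below. The input is the beginning of the AC sequence from the slide (everything after the DC value 168, up to the elided part), padded with a few trailing zeros; the function name is hypothetical.

```python
def run_length_encode(ac_coeffs):
    """Turn a list of AC coefficients into (zero_run, nonzero_value) pairs.

    Trailing zeros are not coded; an end-of-block marker replaces them.
    """
    pairs, run = [], 0
    for value in ac_coeffs:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")
    return pairs

# First AC coefficients of the slide's zigzag sequence, then trailing zeros.
ac = [0, -1] + [0] * 8 + [1, 0, 1, 0, 0, 0, 1, 1] + [0] * 5
symbols = run_length_encode(ac)
```

Here `symbols` starts with the same pairs as the slide, (1,-1) (8,1) (1,1) (3,1) (0,1), and ends with the EOB marker.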
DPCM
• Differential Pulse Code Modulation
– Uses the previous DC value to predict the next DC value

DC values of successive blocks: 138, 140, 136, 138
DPCM output: 138 => 2 => -4 => 2

X_DPCM = X_i - X_{i-1}
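The DC prediction can be sketched in a few lines. The function names are hypothetical, and the first value is passed through unchanged, as in the slide's example.

```python
def dpcm_encode(dc_values):
    """Keep the first DC value, then store each value minus its predecessor."""
    return [dc_values[0]] + [
        cur - prev for prev, cur in zip(dc_values, dc_values[1:])
    ]

def dpcm_decode(differences):
    """Rebuild the DC values by accumulating the differences."""
    values = [differences[0]]
    for diff in differences[1:]:
        values.append(values[-1] + diff)
    return values

encoded = dpcm_encode([138, 140, 136, 138])
decoded = dpcm_decode(encoded)
```

The differences are much smaller than the DC values themselves, so they can be coded with shorter codewords.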
Variable Length Codes
• Variable Length Coding
– The length of a codeword is determined by the probability of the event it represents
– Short codewords represent frequently occurring events

Message: ABAC

                       A    B    C    Coded ABAC   Length
Fixed length code      00   01   10   00010010     8 bits
Variable length code   1    01   00   101100       6 bits
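The two code tables from this slide can be tried out directly. The encoder and the greedy prefix decoder below are a minimal sketch with hypothetical function names.

```python
FIXED = {"A": "00", "B": "01", "C": "10"}
VARIABLE = {"A": "1", "B": "01", "C": "00"}

def encode(message, code):
    """Concatenate the codeword of every symbol in the message."""
    return "".join(code[symbol] for symbol in message)

def decode(bits, code):
    """Decode a prefix-free code by growing a buffer until it matches."""
    inverse = {word: symbol for symbol, word in code.items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:
            out.append(inverse[buffer])
            buffer = ""
    return "".join(out)

fixed_bits = encode("ABAC", FIXED)
variable_bits = encode("ABAC", VARIABLE)
```

Because "A" occurs twice in the message and has the shortest codeword, the variable length code saves two bits here.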
Motion Estimation
Frame N:
88 84 83 84 85 86 83 82
86 82 82 83 82 83 83 81
82 82 84 87 87 87 81 84
81 86 87 89 82 82 84 87
81 84 83 87 85 89 80 81
81 85 85 86 81 89 81 85
82 81 86 83 86 89 81 84
88 88 90 84 85 88 88 81

Frame N+1:
84 82 83 81 85 86 83 81
82 82 81 83 82 83 83 81
83 82 84 87 87 87 81 88
81 85 86 88 82 82 84 87
81 84 85 87 85 89 84 81
82 85 81 84 81 89 81 83
81 87 86 83 86 89 81 84
88 82 87 84 87 89 84 81

Difference image (Frame N+1 - Frame N):
-4 -2  0 -3  0  0  0 -1
-4  0 -1  0  0  0  0  0
 1  0  0  0  0  0  0  4
 0 -1 -1 -1  0  0  0  0
 0  0  2  0  0  0  4  0
 1  0 -4 -2  0  0  0 -2
-1  6  0  0  0  0  0  0
 0 -6 -3  0  2  1 -4  0
Motion Vector
Motion Compensation
• Motion Compensation
– The motion vector between the current frame and the reference frame is used to build a prediction of the current frame from the reference frame
[Figure: reference frame, current frame, and reconstructed frame]
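Motion estimation and compensation can be sketched with a full-search block matcher. Everything here is an assumption made for illustration: the sum of absolute differences (SAD) as matching cost, the search range of ±2 pixels, the synthetic reference frame, and all function names.

```python
def sad(ref, block, top, left):
    """Sum of absolute differences between `block` and `ref` at (top, left)."""
    return sum(
        abs(ref[top + r][left + c] - block[r][c])
        for r in range(len(block))
        for c in range(len(block[0]))
    )

def estimate_motion(ref, block, top, left, search=2):
    """Full search around (top, left); returns the (dy, dx) with lowest SAD."""
    h, w = len(block), len(block[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if 0 <= t <= len(ref) - h and 0 <= l <= len(ref[0]) - w:
                cost = sad(ref, block, t, l)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best[1], best[2]

def compensate(ref, top, left, mv, h, w):
    """Cut the predicted block out of the reference frame."""
    dy, dx = mv
    return [row[left + dx:left + dx + w] for row in ref[top + dy:top + dy + h]]

# Synthetic 8x8 reference frame; the current block sits at (2, 2) in the
# current frame but its content comes from (1, 3) in the reference frame.
ref = [[(7 * r + 13 * c) % 31 for c in range(8)] for r in range(8)]
cur_block = [row[3:7] for row in ref[1:5]]
mv = estimate_motion(ref, cur_block, top=2, left=2)
prediction = compensate(ref, 2, 2, mv, 4, 4)
residual = [[a - b for a, b in zip(ra, rb)]
            for ra, rb in zip(cur_block, prediction)]
```

Only the motion vector and the residual need to be coded. When the match is exact, as in this synthetic case, the residual is all zeros.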
P,B-Frames
• The P-picture uses motion compensation (MC) to remove the temporal redundancy between consecutive frames
• The B-picture is introduced to increase the frame rate without increasing the bit-rate too much
[Figure: a B-picture between two I or P pictures, with forward prediction from one and backward prediction from the other]
GOP Layer
• A GOP (group of pictures) consists of the three picture types I, P, and B

Temporal order :  1  2  3  4  5  6  7  8  9 10 11 12 13
Picture type   :  I  B  B  P  B  B  P  B  B  P  B  B  P
Coding order   :  1  3  4  2  6  7  5  9 10  8 12 13 11
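The reordering in this table follows from one rule: a B-picture can only be coded after the I or P picture that follows it in display order. A small sketch with a hypothetical function name; B-pictures with no later anchor picture are simply appended at the end, which is an assumption.

```python
def coding_positions(picture_types):
    """Return, for each picture in display order, its position in coding order."""
    order, pending_b = [], []
    for index, ptype in enumerate(picture_types):
        if ptype == "B":
            pending_b.append(index)      # must wait for the next I or P picture
        else:
            order.append(index)          # anchor picture is coded first
            order.extend(pending_b)      # then the B-pictures that precede it
            pending_b = []
    order.extend(pending_b)              # assumption: leftover B-pictures go last
    positions = [0] * len(picture_types)
    for position, index in enumerate(order, start=1):
        positions[index] = position
    return positions

positions = coding_positions("IBBPBBPBBPBBP")
```

For the 13-picture pattern on this slide, `positions` matches the coding order row of the table.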
I,P,B Frame
• Difference between picture types
                     I Frame   P Frame   B Frame
Compression ratio    Low       Good      Best
Random access        Best      Hard      Hardest
Complexity           Normal    High      Highest
Watermarking MPEG Compressed Video Sequence
Motivation
• Digital video recording devices are entering the market.
– The development and success of such devices depend on technological advances and the existence of adequate copy protection methods
Simple One-Copy Only Scenario
[Flowchart: detection/labeling module in a video recorder. Labeled or unlabeled data enters the module. Label present? Yes: stop recording. No: add the label, then store the labeled data.]
Requirements of Video Labeling
• Fidelity
– The quality of the video should not be significantly affected
• Robustness
– Against re-encoding at a different bit-rate
• Blind detection
• Directly embedding into the video data
– For different platforms, interfaces, and data file formats
• Complexity
– Real-time embedding/detection
– Cost of labeling modules in consumer players
• Increasing the bit-stream length is forbidden
– To avoid hardware buffer overflow
Labeling on Compressed Data
• Decoding, labeling, and re-encoding is computationally expensive
• A labeling scheme for MPEG video should avoid
– Requiring a full decoding operation
– Drastically decreasing the quality of the video
– Increasing the size of the labeled video data
• We need techniques that can be applied directly to the MPEG compressed data
Labeling MPEG Video Streams
• A real-time labeling algorithm for compressed video should closely follow the compression standard to avoid computationally demanding operations
– DCT/IDCT
– Motion estimation
• The algorithm should work on the lowest layer
– Quantized DCT-block
DCT-block Representing Domains
Labeling in the Bit Domain
Bit Domain Labeling
• To embed a label bit-stream L, consisting of bits Li (i=1,2,…,l), in the MPEG stream
– Suitable VLCs are selected
– The LSB of their quantized level is forced to the value of Li
Selecting Suitable VLCs
• Label-bit carrying VLC (lc-VLC)
– A codeword for which another codeword exists with
  • the same run length
  • a level difference of 1
  • the same codeword length
• Such a change of VLC will yield perceptually invisible degradations after decoding
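The selection rule can be sketched on a toy codeword table. The table below is hypothetical and its bit strings are not taken from the MPEG-2 VLC tables; only the three criteria (same run, level difference of 1, same codeword length) come from the slide.

```python
# Hypothetical toy table: (run, level) -> codeword bits, sign bit omitted.
# Illustrative only; these are NOT the MPEG-2 codewords.
TOY_VLC = {
    (0, 1): "11",
    (0, 2): "0100",
    (0, 3): "00101",
    (1, 1): "011",
    (1, 2): "000110",
    (1, 3): "000111",
    (2, 4): "00001100",
    (2, 5): "00001101",
}

def find_lc_pairs(table):
    """Map every lc-VLC symbol to its counterpart.

    Two symbols form a pair when they share the run length, their levels
    differ by 1, and their codewords have the same length. Each symbol is
    used in at most one pair.
    """
    partner = {}
    for run, level in sorted(table):
        other = (run, level + 1)
        if (run, level) in partner or other not in table:
            continue
        if len(table[other]) == len(table[(run, level)]):
            partner[(run, level)] = other
            partner[other] = (run, level)
    return partner

lc_pairs = find_lc_pairs(TOY_VLC)
```

Since the two levels of a pair differ by 1, their least significant bits always differ, so each pair can carry exactly one label bit without changing the stream length.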
Example of lc-VLC
Embedding in the Bit Domain
[Flowchart: embedding procedure]
1. Test the codewords in a macroblock until an lc-VLC codeword is found.
2. Is the LSB of its level equal to the label bit Li?
   Yes: the codeword is unchanged.
   No: replace the lc-VLC codeword with the other codeword of its pair.
3. The procedure is repeated until all label bits are embedded.
Extraction in the Bit Domain
[Flowchart: extraction procedure]
1. Test the codewords in a macroblock until an lc-VLC codeword is found.
2. Assign the value represented by its LSB to Li.
• The procedure is repeated for i=1,…,l
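Bit domain embedding and extraction can be sketched on a stream of already parsed (run, level) symbols. The pairing table is hypothetical (it is not the MPEG-2 table), signs are ignored, and real code would operate on the codeword bits of the stream rather than on Python tuples.

```python
# Hypothetical lc-VLC pairs: each symbol maps to the symbol with the same run,
# the same codeword length, and a level that differs by 1.
LC_PARTNER = {
    (1, 2): (1, 3), (1, 3): (1, 2),
    (2, 4): (2, 5), (2, 5): (2, 4),
}

def embed_bits(symbols, label_bits):
    """Force the level LSB of successive lc-VLC symbols to the label bits.

    Returns the labeled symbols and the number of label bits embedded.
    """
    out, i = [], 0
    for sym in symbols:
        if sym in LC_PARTNER and i < len(label_bits):
            if (sym[1] & 1) != label_bits[i]:
                sym = LC_PARTNER[sym]    # swap in the counterpart codeword
            i += 1
        out.append(sym)
    return out, i

def extract_bits(symbols, count):
    """Read the level LSB of the first `count` lc-VLC symbols."""
    bits = [level & 1 for run, level in symbols if (run, level) in LC_PARTNER]
    return bits[:count]

stream = [(0, 1), (1, 2), (0, 3), (2, 5), (1, 3), (0, 2)]
labeled, embedded = embed_bits(stream, [1, 1, 0])
recovered = extract_bits(labeled, embedded)
```

Extraction needs neither the original stream nor any key, which is why re-labeling with a different bit-stream overwrites the label just as easily.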
The Label Bit-rate
• The maximum number of label bits that can be added to the video stream per second
– Determined by the number of lc-VLC codewords
– Not known in advance
– Experimental evaluation is required first
Evaluating the Label Bit-rate
• An MPEG-2 video sequence is tested
– 10 seconds
– 720x576 pixels
– 25 frames per second
– GOP length is 12
– Containing I, P, B frames
Evaluating the Label Bit-rate (cont.)
[Tables: measured label bit-rates, (a) intra coded macro-blocks only, (b) intra and inter coded macro-blocks]
Video Quality Degradation
• Informal subjective tests
– The labeling does not introduce any visible artifacts in 4, 6, and 8 Mb/s coded streams
– At bit-rates below 2 Mb/s, the unlabeled bit-stream already contains too many coding artifacts
Video Quality Degradation (cont.)
• Numerical measurements
– Difference images are amplified by 60
[Figure panels: unlabeled I-frame; frame difference (4 Mb/s); frame difference (8 Mb/s)]
Most differences are located around the edges and in the textured areas
Distributions of the lc-VLC Codewords
• According to the experiments, it appears that the lc-VLCs are fairly uniformly distributed over the DCT spectrum
– Each non-zero DCT-coefficient represented by a VLC has an equal probability of being modified
• Why this conclusion? Smooth areas have fewer nonzero DCT coefficients, and thus fewer VLC codewords
Distributions of the lc-VLC Codewords (cont.)
• The maximum local degradation should be as low as possible
– The number of lc-VLC codewords per block should be limited
[Figure annotations: 87% of all blocks; 64% of all lc-VLC; 11% of all blocks]
Distributions of the lc-VLC Codewords (cont.)
• Using a threshold mechanism to limit the number of lc-VLC replacements per DCT block to Tm
The effect of limiting the number of lc-VLC replacements per block
Drifts due to Prediction
• ∆MSE=MSEl-MSEu
– MSEl: the MSE per frame between the uncompressed sequence and the labeled sequence coded at 8 Mb/s
– MSEu: the MSE per frame between the uncompressed sequence and the sequence coded at 8 Mb/s
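The drift measure can be sketched per frame as follows. Frames are plain lists of pixel rows here, and the function names and the tiny 2×2 frames are made up for illustration.

```python
def mse(frame_a, frame_b):
    """Mean squared error between two frames of equal size."""
    count = sum(len(row) for row in frame_a)
    total = sum(
        (p - q) ** 2
        for row_a, row_b in zip(frame_a, frame_b)
        for p, q in zip(row_a, row_b)
    )
    return total / count

def delta_mse(uncompressed, labeled_decoded, unlabeled_decoded):
    """Extra MSE caused by the label: MSE_l minus MSE_u."""
    return mse(uncompressed, labeled_decoded) - mse(uncompressed, unlabeled_decoded)

# Made-up 2x2 frames, for illustration only.
original = [[10, 10], [10, 10]]
coded = [[10, 11], [10, 10]]          # coding error only
coded_labeled = [[10, 12], [10, 10]]  # coding error plus label
drift = delta_mse(original, coded_labeled, coded)
```

Plotting this difference frame by frame shows how a label embedded in one picture propagates through prediction into the pictures that depend on it.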
Robustness of Bit Domain Embedding
• Bit domain labeling is efficient and simple, but the label can also be removed without significantly affecting the quality of the video
– Decoding and re-encoding using another bit-rate
– Labeling the video stream again using another random label bit-stream
• Suitable for consumer applications
Labeling in the Coefficient Domain
Coefficient Domain Labeling
• Coefficient domain labeling embeds the label bit-stream L consisting of bits Li, i=1,…,l, in the I-frames only.
• Each bit of the label string has its own label-bit carrying region (lc-region) in an I-frame.
Example of lc-region
• The first label bit is located in the top-left corner of an I-frame
Embedding in the Coefficient Domain
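• A label bit is embedded in an lc-region by introducing an energy difference between the high-frequency DCT-coefficients of the top half of the lc-region (denoted by lc-subregion A) and the bottom half (denoted by B)
• Definition of the total energy in A, where u is the zigzag index of a coefficient and b runs over the n/2 DCT-blocks of the subregion
$$E_A(c) = \sum_{b=0}^{n/2-1} \; \sum_{u \in S(c)} \big(\mathrm{DCTcoef}(u,b)\big)^2, \qquad S(c) = \{\, u \in \{0,\dots,63\} \mid u > c \,\}$$
S(c) is defined by the cut-off point c
• The energy difference D
$$D = E_A - E_B$$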
Embedding in the Coefficient Domain (cont.)
• Label bit “0” is defined as D>0, and label bit “1” as D<0
Embedding in the Coefficient Domain (cont.)
• If a bit “0” must be embedded, all energy after the cut-off point in the DCT-blocks of lc-subregion B is eliminated by setting the corresponding DCT-coefficients to zero. Thus,
$$D = E_A - E_B = E_A - 0 = +E_A$$
• If a bit “1” must be embedded, all energy after the cut-off point in the DCT-blocks of lc-subregion A is eliminated, so that
$$D = E_A - E_B = 0 - E_B = -E_B$$
Why Using this Energy Difference?
• The coefficients can be forced to zero by shifting the EOB marker towards the DC-coefficient
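Moving the EOB marker forward can be sketched on the run-level pairs of one block. The sketch assumes that the DC coefficient has zigzag index 0 and that every coefficient with an index larger than the cut-off c is dropped; the function name is hypothetical.

```python
def truncate_after_cutoff(pairs, c):
    """Keep only the (run, level) pairs whose zigzag index is <= c.

    `pairs` describe the AC coefficients that follow the DC term (index 0).
    Dropping the tail is equivalent to placing the EOB marker earlier.
    """
    kept, index = [], 0
    for run, level in pairs:
        index += run + 1      # zigzag index of this nonzero coefficient
        if index > c:
            break             # this pair and all later ones are discarded
        kept.append((run, level))
    return kept

# Nonzero coefficients at zigzag indices 2, 11, 13 and 17.
pairs = [(1, -1), (8, 1), (1, 1), (3, 1)]
kept = truncate_after_cutoff(pairs, c=12)
```

No codeword has to be re-encoded: the remaining pairs are copied as they are and the block only gets shorter, so neither a DCT nor a re-quantization step is needed.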
The Side Effects of Coefficient Elimination
• Since coefficients are removed to add a label, the labeled compressed video stream will always be smaller than the unlabeled video stream.
• If it is necessary to keep the original size of the compressed video sequence, stuffing bits can be inserted.
Factors Determining EA and EB
• The spatial contents of the sub-regions A and B
• The cut-off point c
Determining the Cut-off Points
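• Labeling
$$c = \max(c_T), \qquad c_T = \{\, c_{tmp} \in \{0,\dots,63\} \mid (E_A(c_{tmp}) > D) \wedge (E_B(c_{tmp}) > D) \,\}$$
• Extraction
$$c_A = \min(c_T), \qquad c_T = \{\, c_{tmp} \in \{0,\dots,63\} \mid E_A(c_{tmp}) < T \,\}$$
$$c_B = \min(c_T), \qquad c_T = \{\, c_{tmp} \in \{0,\dots,63\} \mid E_B(c_{tmp}) < T \,\}$$
$$c = \min(c_A, c_B)$$
D and n together determine the visibility and robustness
0 < T < D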
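The coefficient domain scheme can be sketched for one lc-region as below. Several points are assumptions, not statements from the slides: the energy is summed over zigzag indices u > c, the cut-off for labeling is the largest c at which both subregions still hold more than D energy, extraction uses the smaller of the two cut-offs at which the energy falls below T, and all names and test values are made up.

```python
def energy(blocks, c):
    """Energy of all coefficients with zigzag index u > c in the given blocks."""
    return sum(v * v for block in blocks for v in block[c + 1:])

def embed_bit(region, bit, D):
    """Embed one label bit in an lc-region of n blocks (64 zigzag values each).

    Returns (labeled_region, cutoff), or None when no cut-off leaves more
    than D energy in both halves.
    """
    half = len(region) // 2
    a, b = region[:half], region[half:]
    candidates = [c for c in range(64) if energy(a, c) > D and energy(b, c) > D]
    if not candidates:
        return None
    c = max(candidates)

    def cut(blocks):
        # Zero every coefficient after the cut-off point.
        return [block[:c + 1] + [0] * (63 - c) for block in blocks]

    labeled = a + cut(b) if bit == 0 else cut(a) + b
    return labeled, c

def extract_bit(region, T):
    """Recover the label bit from the sign of the energy difference."""
    half = len(region) // 2
    a, b = region[:half], region[half:]
    c_a = min(c for c in range(64) if energy(a, c) < T)
    c_b = min(c for c in range(64) if energy(b, c) < T)
    c = min(c_a, c_b)
    return 0 if energy(a, c) - energy(b, c) > 0 else 1

def make_block(dc, ac):
    """Helper: 64-value block with a DC term and a few AC values by index."""
    block = [0] * 64
    block[0] = dc
    for index, value in ac.items():
        block[index] = value
    return block

region = [make_block(100, {10: 5, 20: 3}), make_block(100, {10: 4, 20: 4})]
labeled0, cutoff = embed_bit(region, 0, D=20)
labeled1, _ = embed_bit(region, 1, D=20)
```

With T below D, the subregion that kept its coefficients still holds more than T energy at the extracted cut-off while the other holds less, so the sign of the difference survives in this example.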
Labeling Procedure
Label Extracting Procedure
Discarded Coefficients
• D=20, T=15
• In the 1.4 Mb/s compressed stream only 75% of the extracted label bits are correct, because not enough high-frequency coefficients exist at that bit-rate.
Video Quality Degradation
[Figure panels: unlabeled I-frame; frame difference (4 Mb/s); frame difference (8 Mb/s)]
• Numerical measurements
– Difference images are amplified by 60
– Fewer differences per frame, but larger differences per block
Histogram of the cut-off Points
• The lower the bit-rate is, the lower the cut-off points have to be because of the lack of high frequency components in the compressed video stream
Discarded VLC Codewords Per Block
[Figure annotations, sequence coded at 8 Mb/s: 95% of all blocks; 10% of blocks have energy above the cut-off point]
Limiting Discarded VLC Codewords
• Some coefficients will not be eliminated due to the limit
• Error-correction codes can be used
MSE of different Frame Types
The sequence is coded at 8 Mb/s.
The degradation now fades out for P and B frames.
Robustness of Coefficient Domain Labeling
• Possible attacks
– Re-encoding
– Trans-coding
• Trans-coding an 8 Mb/s sequence at a lower bit-rate
Conclusions
• Comparison of the two labeling schemes
– The bit domain labeling method is computationally highly efficient and has a high label bit-rate, but it is not robust against re-labeling or trans-coding
– The coefficient domain labeling method is more complex and has a lower label bit-rate, but it is more robust
Conclusions (cont.)
• Both schemes are suitable for a real-time copy protection system for digital video recording devices
The European ACTS SMASH Project
• The question of which labeling method should be used for a digital video recording device is currently under investigation within the European ACTS SMASH project
References for video watermarking
• 1. G. C. Langelaar and R. L. Lagendijk, “Optimal Differential Energy Watermarking of DCT Encoded Images and Video,” IEEE Trans. on Image Processing, vol. 10, no. 1, pp. 148-158, Jan. 2001
• 2. G. C. Langelaar, R. L. Lagendijk, and J. Biemond, “Real-Time Labeling of MPEG-2 Compressed Video,” http://ict.ewi.tudelft.nl/pub/inald/realtime.pdf