Watermarking, steganography and content forensics
Ingemar J. Cox
Introduction
Watermarking
Steganography
Content forensics
Watermarking
Watermarking is the practice of imperceptibly altering a Work (image, song, etc.) to embed a
message about that Work.
Watermarking
The primary motivation for watermarking has been to protect content
Muzak: the first commercial watermarking
The first skyscraper was built in Chicago in 1885
Muzak: the first commercial watermarking
The elevator was an essential element
In the 1930’s passenger elevators were new and frightening
Music in elevators was introduced to calm passengers
Muzak was the dominant supplier
Nirvana - On a plain
Rockabye Baby - On a plain
Muzak: the first commercial watermarking
Emil Hembrooke, “Identification of sound and like signals”, US Patent 3,004,104. Filed 1954, issued 1961.
“The present invention makes possible the positive identification of the origin of a musical presentation and thereby constitutes an effective means of preventing such piracy, i.e. it can be likened to a watermark in paper.”
In use until the mid 1980’s
UCL Adastral Park Postgraduate Campus
Applications of digital watermarking
Broadcast Monitoring: Nielsen/Digimarc, Teletrax/Philips
Owner Identification: Verimatrix (IPTV), Widevine Technologies
Proof of Ownership
Transaction Tracking: Thomson/Technicolor (Philips) for Oscar screeners, Cinea/Dolby for digital cinema
Applications of digital watermarking
Content Authentication: Signum Technologies
Copy Control: Verance (HD-DVD, DVD-audio)
Legacy systems: Tektronix for syncing sound and video (lipsync), MarkAny for syncing lyrics with music (mp3 players)
Watermarking
Why not use cryptography?
Cryptography assumes:
1. Alice and Bob trust one another
2. Communication between Alice and Bob succeeds
However, Alice (Hollywood) cannot trust Bob (consumer)
And if communication fails, watermark protection fails
Watermarking
Watermarking is NOT cryptography
Watermarking
Watermarking IS communications
Watermarking
The content is more important than the message
So the watermark/message must be imperceptible
And often, the message payload is small
But, to be practical, a watermark must also be robust
Watermarking
Spread spectrum communications: content modeled as noise (high noise regime)
Communications with side information: content modeled as side information
Watermarking as communications
[Diagram: message m → Transmitter → x → (+ Noise) → y → Receiver → message m′]
x is limited by a power constraint: $\sum_i x^2[i] \le p$
Watermarking as communications
[Diagram: message m → Embedder → x → (+ Noise) → (+ Noise) → y → Detector → message m′]
x is limited by a power constraint: $\sum_i x^2[i] \le p$
Spread spectrum communications
Requirements:
Unobtrusive
Survive common distortions, e.g. lossy compression
Spread spectrum communications:
Originally developed for military communications
Difficult for enemy to detect
Difficult for enemy to jam
Spread spectrum communications
Let’s consider embedding an 8-bit message in an image: 01100101
Spread spectrum communications
Since we have an 8-bit message, spread each bit over all pixels
Spread spectrum watermarking
Spread spectrum communications
Each bit is represented by a “chip” sequence: a pseudo-random number sequence
Spread spectrum communications
[Figure: binary pattern]
Spread spectrum communications
Detect each bit using linear correlation
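The embedding and detection steps above can be sketched as follows. This is only an illustration under stated assumptions: the "image" is a synthetic list of Gaussian pixel values, the chip sequences are ±1 values drawn from a keyed generator, and the names `chip_sequences`, `embed`, `detect` and the strength `alpha` are hypothetical, not taken from the slides. The detector subtracts the mean before correlating so that the image's average brightness does not swamp the correlation.

```python
import random

def chip_sequences(n_bits, n_pixels, key):
    """One pseudo-random +/-1 chip sequence per message bit, seeded by a shared key."""
    rng = random.Random(key)
    return [[rng.choice((-1.0, 1.0)) for _ in range(n_pixels)] for _ in range(n_bits)]

def embed(cover, bits, chips, alpha):
    """Add +alpha*chip for a 1 bit and -alpha*chip for a 0 bit, over all pixels."""
    marked = list(cover)
    for bit, chip in zip(bits, chips):
        sign = 1.0 if bit else -1.0
        for i, c in enumerate(chip):
            marked[i] += alpha * sign * c
    return marked

def linear_correlation(signal, chip):
    """Mean-removed linear correlation between the received signal and one chip sequence."""
    mean = sum(signal) / len(signal)
    return sum((s - mean) * c for s, c in zip(signal, chip)) / len(signal)

def detect(signal, chips):
    """Decide each bit from the sign of its correlation."""
    return [1 if linear_correlation(signal, chip) > 0 else 0 for chip in chips]

rng = random.Random(0)
cover = [rng.gauss(128.0, 20.0) for _ in range(20000)]   # synthetic cover "image"
bits = [0, 1, 1, 0, 0, 1, 0, 1]                          # the 8-bit message 01100101
chips = chip_sequences(len(bits), len(cover), key=42)
marked = embed(cover, bits, chips, alpha=2.0)
noisy = [p + rng.gauss(0.0, 5.0) for p in marked]        # channel distortion
recovered = detect(noisy, chips)
print(recovered)
```

Because each bit is spread over every pixel, the cover and the channel noise average out in the correlation while the chip sequence adds up coherently.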
Perceptual modelling
In the previous example, the random pattern was added equally to all parts of the image
But some areas are more (less) sensitive than others
We can identify these areas using perceptual models
Same models used for lossy compression
Must embed in perceptually SIGNIFICANT regions to be robust to lossy compression
[Figures, in order: “Original image”, “No perceptual modeling”, “Perceptual modeling”, “Original image”]
Communications with side information
Spread spectrum watermarking models the cover Work as noise
However, the cover Work is:
Not random
Completely known at the time of embedding
Watermarking as communications
[Diagram: message m → Embedder → x → (+ Noise) → (+ Noise) → y → Detector → message m′]
x is limited by a power constraint: $\sum_i x^2[i] \le p$
Communications with side information
Host signal need not interfere with watermark message
Potential for much greater payloads
Dirty Paper Coding
Writing on dirty paper
[Figure sequence; the final frames show the letter “A”]
Watermarking with side information
Communications: one-to-one mapping between message and code
Communications with side information: one-to-many mapping between message and codes
Implementations:
Quantization index modulation (Chen and Wornell)
Dirty paper trellis coding
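A minimal sketch of scalar quantization index modulation, to make the one-to-many mapping concrete. Each bit owns a whole lattice of codewords (bit 0: multiples of `delta`; bit 1: the same lattice shifted by `delta/2`), and the embedder picks the codeword closest to the cover sample. The function names and the step size `delta` are illustrative assumptions; practical schemes add dithering and distortion compensation that are not shown here.

```python
def qim_embed(sample, bit, delta):
    """Move the sample to the nearest point of the lattice that encodes `bit`."""
    offset = bit * delta / 2.0
    return round((sample - offset) / delta) * delta + offset

def qim_detect(sample, delta):
    """Return the bit whose lattice lies closest to the received sample."""
    d0 = abs(sample - qim_embed(sample, 0, delta))
    d1 = abs(sample - qim_embed(sample, 1, delta))
    return 0 if d0 <= d1 else 1

delta = 8.0
for cover_sample in (13.7, 200.2, 87.0, 45.5):
    for bit in (0, 1):
        marked = qim_embed(cover_sample, bit, delta)
        # Detection survives any perturbation smaller than delta / 4.
        print(cover_sample, bit, marked, qim_detect(marked + 1.9, delta))
```

The cover sample only decides which codeword is chosen, so it does not act as noise at the detector.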
END OF PART ONE
Steganography
Steganography is the practice of undetectably altering a Work, to embed a message.
Watermarking
Watermarking is the practice of imperceptibly altering a Work (image, song, etc.) to embed a
message about that Work.
Steganography
Motivation:
Spies
Dissidents
Terrorism
Organized crime (little or no evidence to support motivation)
Child pornography (little or no evidence to support motivation)
The Technical Mujahid
TABLE OF CONTENTS
Section 1: Covert Communications and Hiding Secrets Inside Images
Section 2: Designing Jihadi Websites from A-Z
Section 3: Smart Weapons, Short Range Shoulder-Fired Missiles
Section 4: The Secrets of the Mujahideen, an Inside Perspective
Section 5-6: Video Technology and Subtitling Video Clips
send technical articles to
http://www.teqanymag.arabform.com
History of steganography
Herodotus: tattooing a slave’s shaved head
Aeneas the Tactician: modifying height of letters, marking letters with holes
Cardan’s Grille
Francis Bacon: italic and normal fonts
History of steganography
SALT-II: signed June 18, 1979 by Jimmy Carter and Leonid Brezhnev
History of steganography
The Prisoner’s Problem (G. J. Simmons)
[Diagram: Alice → message m → Embedder → x → (+ Cover Work) → y → Extractor → message m′ → Bob; the Warden observes y]
y is limited by a statistical constraint
The science behind steganography
Anderson and Petitcolas: “thought experiment”
Imagine a perfect compressor for music
Compressor: music in → random string out
Decompressor: random string in → music out
The science behind steganography
Then take message and encrypt it
Input encrypted message into decompressor
Outputs music!
Alice sends music to Bob
Bob now compresses music
Output is encrypted message
Decrypts to obtain message
Statistical Steganography
Cachin: provided the first information-theoretic definition of steganographic security
Perfectly secure: $D_{KL}(P_C \| P_S) = 0$
ε-secure: $D_{KL}(P_C \| P_S) \le \varepsilon$
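As a small worked example of the quantity in this definition, the sketch below computes the Kullback-Leibler divergence between a cover distribution and a stego distribution given as probability lists over the same alphabet. The function name and the two toy distributions are assumptions for illustration; the logarithm is taken base 2, so the result is in bits.

```python
import math

def kl_divergence(p_cover, p_stego):
    """D_KL(P_C || P_S) in bits for two discrete distributions on the same alphabet."""
    total = 0.0
    for pc, ps in zip(p_cover, p_stego):
        if pc > 0.0:
            if ps == 0.0:
                return math.inf   # the cover produces a symbol the stego source never does
            total += pc * math.log2(pc / ps)
    return total

identical = kl_divergence([0.5, 0.5], [0.5, 0.5])      # perfectly secure case
distorted = kl_divergence([0.5, 0.5], [0.25, 0.75])    # detectable change in statistics
print(identical, distorted)
```

A scheme is ε-secure in this sense when the second kind of value never exceeds ε.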
Statistical steganography
Assumes the Warden knows the distribution of cover Works, i.e. PC
LSB embedding
One of the earliest forms of digital steganography
Simply flip the least significant bit to encode the hidden message
Assumes that the LSBs are random
They’re not!
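A minimal sketch of LSB embedding on a list of 8-bit pixel values. The function names and the sample pixels are illustrative assumptions; the message bits simply overwrite the least significant bit of the first pixels, so no pixel changes by more than 1.

```python
def lsb_embed(pixels, bits):
    """Overwrite the least significant bit of the first len(bits) pixels."""
    if len(bits) > len(pixels):
        raise ValueError("message longer than cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit   # clear the LSB, then set it to the message bit
    return stego

def lsb_extract(pixels, n_bits):
    """Read the message back from the least significant bits."""
    return [p & 1 for p in pixels[:n_bits]]

cover = [200, 13, 77, 254, 0, 255, 128, 9]
message = [1, 0, 0, 1, 1, 0, 1, 0]
stego = lsb_embed(cover, message)
print(stego, lsb_extract(stego, len(message)))
```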
LSB steganalysis
Histogram attack: H(i) is the frequency of intensity i
Assume we embed a bit in every LSB
Then half the time we change an odd number to an even number, e.g. 1→0, 3→2, …
And half the time we change an even number to an odd number, e.g. 0→1, 2→3, …
Then, in expectation, $H_s(2i) = H_s(2i+1)$ for $i = 0, \dots, 127$
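The pair-equalization effect can be checked numerically. The sketch below is a toy demonstration under loud assumptions: the "cover" is synthetic and deliberately built so that even intensities are more common than odd ones, and `pair_imbalance` is a hypothetical chi-square-style score over the pairs (2i, 2i+1), not a named published detector. After random bits are written into every LSB, the score collapses.

```python
import random
from collections import Counter

def pair_imbalance(pixels):
    """Sum over pairs (2i, 2i+1) of (H(2i) - mean)^2 / mean; small when the pair is equalized."""
    hist = Counter(pixels)
    score = 0.0
    for i in range(128):
        even, odd = hist[2 * i], hist[2 * i + 1]
        if even + odd:
            expected = (even + odd) / 2.0
            score += (even - expected) ** 2 / expected
    return score

rng = random.Random(3)
# Assumed cover: even intensities occur about 70% of the time within each pair.
cover = [2 * rng.randrange(128) + (1 if rng.random() < 0.3 else 0) for _ in range(20000)]
# Full-capacity LSB embedding of a random (for example, encrypted) message.
stego = [(p & ~1) | rng.getrandbits(1) for p in cover]

cover_score = pair_imbalance(cover)
stego_score = pair_imbalance(stego)
print(cover_score, stego_score)
```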
First-order statistics
Stochastic modulation: maintains first-order statistics, but not higher-order statistics
Various algorithms are available, e.g. OutGuess
Model-based steganography
Sallee
Split cover Work, c, into two parts: $c_a$ (unaltered) and $c_b$ (altered)
Model the conditional distribution $P(c_b \mid c_a)$
Generate distribution using an arithmetic entropy encoder/decoder
Choosing the cover text
Message more important than the cover Work
Given a message, we can choose which cover Work to hide it in
Not possible for watermarking
Correlated steganography
Choosing the cover text
If our hidden message is an image, X, choose a cover image, Y, that is similar
The number of bits needed to encode X is H(X)
The number of bits needed to encode X given Y is H(X|Y)
Thus the number of bits needed to encode the hidden image may be much less
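A small worked example of the saving, assuming empirical entropies estimated from paired samples and the identity H(X|Y) = H(X,Y) − H(Y). The function names and the toy pixel lists are illustrative assumptions; here Y reveals the upper bit of each 2-bit value of X, so only one of the two bits remains to be encoded.

```python
import math
from collections import Counter

def entropy(samples):
    """Empirical entropy in bits of a sequence of hashable symbols."""
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in Counter(samples).values())

def conditional_entropy(xs, ys):
    """H(X|Y) = H(X,Y) - H(Y), estimated from paired samples."""
    return entropy(list(zip(xs, ys))) - entropy(ys)

hidden = [0, 1, 2, 3, 0, 1, 2, 3]   # X: pixels of the hidden image
cover = [0, 0, 1, 1, 0, 0, 1, 1]    # Y: a correlated cover (here, X // 2)

h_x = entropy(hidden)
h_x_given_y = conditional_entropy(hidden, cover)
print(h_x, h_x_given_y)
```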
Minimizing the embedding distortion
Matrix embedding
“Wet paper” codes
Communications with side information
Coding for defective memory
Imagine I have a USB memory stick
It can store 3 bits of information
Worse still, it’s faulty: one of the bits is stuck at “1”
I can therefore send you 2 bits of information
But which 2 bits did I send?
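One way to answer the question is a syndrome code, sketched below under stated assumptions: the 2×3 parity-check matrix `H` is a choice made for this example (any two of its columns are linearly independent over GF(2)), and `write`/`read` are hypothetical names. The writer knows which cell is stuck and searches for a 3-bit word that respects it and whose syndrome equals the 2-bit message; the reader only computes the syndrome and never needs to know which cell was stuck.

```python
from itertools import product

# Assumed parity-check matrix over GF(2); its columns are (1,0), (0,1), (1,1).
H = [(1, 0, 1),
     (0, 1, 1)]

def syndrome(word):
    """Compute H * word over GF(2)."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)

def write(message, stuck_pos, stuck_val):
    """Find a 3-bit word that keeps the stuck cell unchanged and has syndrome == message."""
    for word in product((0, 1), repeat=3):
        if word[stuck_pos] == stuck_val and syndrome(word) == tuple(message):
            return word
    raise ValueError("no word satisfies the constraint")

def read(word):
    """The receiver recovers the message as the syndrome of the stored word."""
    return syndrome(word)

# The third cell is stuck at 1; every 2-bit message can still be stored.
for message in product((0, 1), repeat=2):
    stored = write(message, stuck_pos=2, stuck_val=1)
    print(message, stored, read(stored))
```

The same idea carries over to steganography, where the "stuck" cells are the cover elements the embedder chooses not to modify.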
Matrix embedding/Wet paper coding
[Figure, built up over several slides: the eight 3-bit words 000 to 111, the four 2-bit messages 00, 01, 10, 11, and the word 010]
Wet paper codes
Several differences between coding for defective memory and steganography:
The number of “stuck at” elements is much greater
The number of “stuck at” elements varies
Real-time performance not required
Syndrome codes
LT-codes
Blind steganalysis
How does the Warden know the distribution of cover Works, PC ?
Analytic models
Machine learning: neural networks, support vector machines (SVM), etc.
END OF PART TWO
Content forensics
How can we be certain that an image, audio conversation or video is authentic?
Active technology: insert authentication signature
Cryptography
Watermarking
Passive (non-intrusive) technology: content analysis
Recent history: 2003
LA Times 2003
Recent history: 2004
US Democratic Presidential Nomination 2004
Recent history: 2005
The Star, May 2005
Recent history: 2006
Reuters August 2006
Digital forensics
On August 7th 2006, Reuters withdrew all 920 photographs by a freelance Lebanese photographer from its database after a review showed Adnan Hajj had altered two images.
Content forensics
Like steganalysis, look for statistical anomalies in the content.
Content forensics: source identification
Identify which camera took an image
CCD cameras exhibit several sources of noise: dark current, shot noise, photoresponse non-uniformity noise (PRNU)
Estimate PRNU (a naturally occurring watermark), detect using correlation
Lukáš, Fridrich and Goljan; Bayram, Sencar and Memon
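The estimate-then-correlate idea can be illustrated with a heavily simplified simulation. Everything here is an assumption made for the sketch: the sensors are simulated, every shot is a flat-field image of uniform brightness (so no denoising filter is needed to isolate the noise residual), and the function names are hypothetical. It is not the procedure from the cited work, which operates on natural images.

```python
import random

def normalized_correlation(a, b):
    """Correlation coefficient between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den

def shoot(prnu, rng, brightness=100.0, noise_std=2.0):
    """Simulated flat-field shot: multiplicative PRNU plus additive random noise."""
    return [brightness * (1.0 + k) + rng.gauss(0.0, noise_std) for k in prnu]

def residual(image):
    """Relative deviation of each pixel from the image mean."""
    mean = sum(image) / len(image)
    return [p / mean - 1.0 for p in image]

def estimate_fingerprint(images):
    """Average the residuals of several shots so that random noise cancels out."""
    residuals = [residual(img) for img in images]
    return [sum(column) / len(residuals) for column in zip(*residuals)]

rng = random.Random(1)
n_pixels = 2000
prnu_a = [rng.gauss(0.0, 0.02) for _ in range(n_pixels)]   # camera A's sensor pattern
prnu_b = [rng.gauss(0.0, 0.02) for _ in range(n_pixels)]   # camera B's sensor pattern
fingerprint_a = estimate_fingerprint([shoot(prnu_a, rng) for _ in range(20)])
fingerprint_b = estimate_fingerprint([shoot(prnu_b, rng) for _ in range(20)])

test_image = shoot(prnu_a, rng)                            # a new shot from camera A
match = normalized_correlation(fingerprint_a, residual(test_image))
mismatch = normalized_correlation(fingerprint_b, residual(test_image))
print(match, mismatch)
```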
Content forensics: detecting re-sampling
Detecting re-sampling: linear, bi-cubic, etc.
Introduces correlation between neighboring pixels
Use EM algorithm to estimate both the re-sampling amount and the correlation (Popescu and Farid)
Content forensics: double JPEG compression
Introduces artifacts in the histogram of the DCT coefficients
Content forensics
Detecting lighting inconsistencies: estimate light source direction
Content forensics: copy-move forgery
Copy-move forgery: copy portion of image and paste in another location
Introduces correlation!
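A minimal sketch of how that correlation can be exploited, assuming an uncompressed grayscale image stored as a list of rows and a hypothetical helper `find_duplicate_blocks`. It looks for pixel-identical blocks, which only works when the forged image has not been recompressed or retouched; practical detectors match robust block features instead.

```python
import random

def find_duplicate_blocks(image, block):
    """Return pairs of top-left corners whose block x block neighbourhoods are identical."""
    seen = {}
    matches = []
    rows, cols = len(image), len(image[0])
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            key = tuple(tuple(image[r + dr][c:c + block]) for dr in range(block))
            if key in seen:
                matches.append((seen[key], (r, c)))
            else:
                seen[key] = (r, c)
    return matches

rng = random.Random(7)
image = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]

# Simulated forgery: copy the 8x8 region at (2, 3) and paste it at (20, 18).
for dr in range(8):
    for dc in range(8):
        image[20 + dr][18 + dc] = image[2 + dr][3 + dc]

matches = find_duplicate_blocks(image, block=4)
offsets = {(b[0] - a[0], b[1] - a[1]) for a, b in matches}
print(len(matches), offsets)
```

Many matched blocks sharing one displacement vector is the signature of a copied region.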
THE END