Steganography in digital images

• Copyright protection
- “Signature” or “watermark” of the creator/sender
- Invisible
- Hard to remove
- Robust to processing
- 64 bits are likely enough

• Fingerprinting (traitor tracing)
- Identifier of the recipient/customer
- Requirements as before

• Authentication (verifying data integrity and origin)
- Invisible
- Fragile
- Hard to modify the image while preserving the watermark

Data hiding applications

Covert (stealth) communication. The goal is to hide the real message in some other (cover) content. The cover has no value other than as a decoy.

Statistical undetectability: no statistical test should exist that could distinguish between clean objects and those containing a secret message.

Large payload: it is a communication method!

Robustness may or may not be an issue. If the channel is noise-free, no robustness is needed.

A successful attack on steganography detects the presence of the message; it does not necessarily read it!

Steganography

Steganography is not watermarking

What do they have in common?

- Both hide data

- Often similar tools are used

What is different about them?

- A digital watermark contains information about the cover object in which it is embedded, while the cover in stego is just a decoy.

- The presence of a digital watermark is often advertised rather than concealed, while the presence of a steganographic message should stay secret (an observer should not be able to tell that anything is hidden inside).

- A watermark is usually a few bits (typically 1 to 60 bits), while steganography strives for a large payload (it is used for communication, after all).

Both are privacy tools involving keys that enable two or more parties to communicate privately.

Crypto makes the message unintelligible to those not possessing the correct keys, but the existence of a secret message is obvious (overt).

Stego conceals the very presence of the message (covert); the communicated object is just a decoy.

Steganography vs. cryptography

Schwarzenegger’s letter

Steganography is sometimes called
- Secret writing
- Concealed writing
- Covert communication
- Stealth communication
- Data hiding
- Electronic invisible ink
- The prisoners’ problem

Word origin

From Greek Steganos (covered) and graphia (writing)

Steganography

• ~470 B.C. First written evidence by Greek historian Herodotus.

• Term coined by Johannes Trithemius in 1499.

• Steganography in its modern form is only ~15 years old.

Data-hiding software

Number of data-hiding software packages released per year. Data provided courtesy of Neil Johnson.

Stego software by media type

Data provided courtesy of Neil Johnson.

Three fundamental types of steganography

1. Steganography by cover selection

Sender selects a cover from a large set of available covers so that the required message is communicated.

2. Steganography by cover synthesis

Sender creates the cover that communicates the desired message.

3. Steganography by cover modification

Sender modifies an existing cover in order to convey the required message.

Steganography by cover modification

[Figure: block diagram of the channel between Alice and Bob. Labels: image source, cover object, stego-object, message bits 00101…1, compression and encryption, decryption and decompression, encryption key, stego key.]

Communication is monitored by a warden looking for suspicious artifacts.

Main requirement: undetectability (no algorithm can distinguish stego objects from cover objects with success better than random guessing).

Warden: passive, active, or malicious.

Steganographic security

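Cover source: random variable $x$ on $\mathcal{X}$, $x \sim p_x$
Stego source: random variable $y$ on $\mathcal{X}$, $y \sim p_y$

Measure of security: Kullback-Leibler divergence (relative entropy)

$D_{KL}(p_x \| p_y) = \sum_i p_x(i) \log \frac{p_x(i)}{p_y(i)}$

Perfect security: $D_{KL}(p_x \| p_y) = 0$
$\epsilon$-security: $D_{KL}(p_x \| p_y) < \epsilon$

The warden observes objects drawn from either $p_x$ or $p_y$.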
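The divergence above can be evaluated directly for small discrete distributions. The following is a minimal sketch, assuming the cover and stego distributions are given as lists of probabilities over the same symbols; the function name `kl_divergence` and the toy probabilities are illustrative assumptions, not values from the slides.

```python
import math

def kl_divergence(p_x, p_y):
    """D_KL(p_x || p_y) = sum_i p_x(i) * log(p_x(i) / p_y(i)), in nats."""
    total = 0.0
    for px_i, py_i in zip(p_x, p_y):
        if px_i == 0.0:
            continue  # terms with p_x(i) = 0 contribute nothing
        if py_i == 0.0:
            return math.inf  # the cover emits a symbol the stego source never does
        total += px_i * math.log(px_i / py_i)
    return total

# Toy distributions over four symbols (illustrative values, not measured data).
p_cover = [0.40, 0.10, 0.30, 0.20]
p_stego = [0.25, 0.25, 0.25, 0.25]

d_same = kl_divergence(p_cover, p_cover)  # identical sources: perfect security
d_diff = kl_divergence(p_cover, p_stego)  # differing sources: positive divergence
print(d_same, d_diff)
```

A divergence of zero means the warden has no statistical basis for telling the two sources apart; any positive value gives the warden something to test for.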

LSB embedding and its analysis

LSB embedding

Cover image is grayscale with M×N pixels. All pixels are 8-bit integers in {0, …, 255}.

To embed a message:
• Visit pixels pseudo-randomly.
• Embed one bit at every pixel.
- message bit → LSB of the pixel value
• Continue until all bits are embedded.

To read the message:
• Follow the same path and read the LSBs of the visited pixels.

Example: the pixel value is 11 = (00001011)_2.
• To embed bit “0”, change 11 to 10.
• To embed bit “1”, no change is needed.
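The embedding and reading steps can be sketched in a few lines. This is a minimal illustration, assuming the image is a flat list of 8-bit values, that sender and receiver share an integer stego key seeding the pseudo-random path, and that the receiver knows the message length; the names `embedding_path`, `lsb_embed`, and `lsb_extract` are hypothetical.

```python
import random

def embedding_path(num_pixels, num_bits, stego_key):
    """Pseudo-random sequence of distinct pixel indices derived from the key."""
    return random.Random(stego_key).sample(range(num_pixels), num_bits)

def lsb_embed(pixels, bits, stego_key):
    """Return a copy of `pixels` with `bits` written into the LSBs along the path."""
    if len(bits) > len(pixels):
        raise ValueError("payload exceeds one bit per pixel")
    stego = list(pixels)
    for index, bit in zip(embedding_path(len(pixels), len(bits), stego_key), bits):
        stego[index] = (stego[index] & ~1) | bit  # replace the LSB with the message bit
    return stego

def lsb_extract(stego, num_bits, stego_key):
    """Follow the same path and read the LSBs of the visited pixels."""
    return [stego[i] & 1 for i in embedding_path(len(stego), num_bits, stego_key)]

# The example from the slide: pixel value 11 = (00001011)_2.
assert (11 & ~1) | 0 == 10  # embedding "0" changes 11 to 10
assert (11 & ~1) | 1 == 11  # embedding "1" needs no change

cover = [11, 200, 37, 0, 255, 128, 64, 90]  # flattened toy "image"
message = [0, 1, 1, 0, 1]
stego = lsb_embed(cover, message, stego_key=2024)
recovered = lsb_extract(stego, len(message), stego_key=2024)
```

Because both sides derive the path from the same key, the receiver visits the pixels in the same order without any side information stored in the image.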

LSB embedding (continued)

Maximal payload that can be embedded: M×N bits.

Assume we embed a payload of $m \le MN$ bits.
Relative payload: $\alpha = m/(MN)$ bits per pixel (bpp), $0 \le \alpha \le 1$.

When embedding $m$ bits, we make on average $m/2$ changes.
Change rate: $\beta = (m/2)/(MN) = \alpha/2$, $0 \le \beta \le 1/2$.

Change rate = probability of making an embedding change.
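The claim that the change rate is half the relative payload can be checked with a small simulation. The sketch below assumes random message bits (for example, an encrypted payload) and a cover with uniformly random pixel values; it writes the relative payload as `alpha`, and the function name `measured_change_rate` is hypothetical.

```python
import random

def measured_change_rate(num_pixels, alpha, seed=0):
    """Embed alpha * num_pixels random bits by LSB replacement and
    return the fraction of all pixels that actually changed."""
    rng = random.Random(seed)
    cover = [rng.randrange(256) for _ in range(num_pixels)]
    m = int(alpha * num_pixels)                 # payload in bits
    path = rng.sample(range(num_pixels), m)     # pseudo-random embedding path
    changes = 0
    for index in path:
        bit = rng.randrange(2)                  # random message bit
        if (cover[index] & 1) != bit:           # a change is needed only on mismatch
            changes += 1
    return changes / num_pixels

alpha = 0.5
beta = measured_change_rate(num_pixels=100_000, alpha=alpha)
print(f"relative payload {alpha}, measured change rate {beta:.4f}")
```

Each visited pixel already carries the right LSB with probability 1/2, which is where the factor of two comes from.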

LSB embedding is very popular

• General (can be applied to any digital file consisting of numerical data)
• Extremely simple
• Fast
• High capacity (1 bit per pixel, embedding efficiency 2)
• Does not require any software present on the computer

A single Perl command line on UNIX is enough (source: A. Ker, Oxford University):

perl -n0777e '$_=unpack"b*",$_;split/(\s+)/,<STDIN>,5;@_[8]=~s{.}{$&&v254|chop()&v1}ge;print@_' <input.pgm >output.pgm secrettextfile

The LSB plane of images resembles random noise, so is embedding undetectable?

Example: LSB plane of Lenna

[Figure: LSB bitplane of a never-compressed image. Black dot = odd pixel value; white dot = even pixel value.]

Properties of LSB flipping

LSBflip is an involution (its own inverse), i.e., LSBflip(LSBflip(x)) = x for all x

LSB flipping induces a permutation on {0, …, 255}

0 ↔ 1, 2 ↔ 3, 4 ↔ 5, …, 254 ↔ 255

LSB flipping is “asymmetrical” (e.g., 3 may change to 2 but never to 4)

| LSBflip(x) – x | = 1 for all x (embedding distortion is 1 per changed pixel)

LSBflip(x) = x + 1 – 2(x mod 2)
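These properties are easy to confirm exhaustively over all 256 pixel values. The sketch below implements the closed-form expression from the slide; the function name `lsb_flip` is just a Python spelling of LSBflip.

```python
def lsb_flip(x):
    """Flip the least significant bit: x + 1 - 2 * (x mod 2)."""
    return x + 1 - 2 * (x % 2)

values = range(256)

# Applying the flip twice restores every value.
assert all(lsb_flip(lsb_flip(x)) == x for x in values)
# The flip permutes {0, ..., 255} ...
assert sorted(lsb_flip(x) for x in values) == list(values)
# ... by swapping the pairs (0, 1), (2, 3), ..., (254, 255).
assert all(lsb_flip(x) == x ^ 1 for x in values)
# Every flip changes the value by exactly 1.
assert all(abs(lsb_flip(x) - x) == 1 for x in values)
# Asymmetry: 3 goes down to 2 and 2 goes up to 3; 3 never becomes 4.
assert lsb_flip(3) == 2 and lsb_flip(2) == 3
```

The asymmetry is what steganalysis exploits: even values can only increase and odd values can only decrease, so values never leave their pair.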

Effect of LSB embedding on histogram

[Figure: histogram bins of the LSB flipping pair 2i, 2i+1 in the cover and stego images, with the parts untouched by embedding marked.]

LSB flipping pair 2i, 2i+1:

$h_c[2i]$ = number of pixels with value $2i$ in the cover image
$h_c[2i+1]$ = number of pixels with value $2i+1$ in the cover image

For a fully embedded image, the stego histogram $h_s$ satisfies (in expectation):

$h_s[2i] = h_s[2i+1] = (h_c[2i] + h_c[2i+1])/2$

“Twin peaks” in the histogram

• The peaks can be tested for using a chi-square test
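The equalization of histogram pairs, and a chi-square statistic that tests for it, can be sketched as follows. This is one common form of a pairs-of-values test, not necessarily the exact test the slides refer to; the toy cover (even values twice as likely as odd ones) and the names `histogram` and `pair_chi_square` are assumptions made for illustration.

```python
import random

def histogram(pixels):
    """Count how many pixels take each value in {0, ..., 255}."""
    h = [0] * 256
    for p in pixels:
        h[p] += 1
    return h

def pair_chi_square(h):
    """Sum of (h[2i] - h[2i+1])^2 / (h[2i] + h[2i+1]) over non-empty pairs.
    If the two bins of every pair are equalized, the sum is roughly
    chi-square distributed with one degree of freedom per pair."""
    statistic, pairs = 0.0, 0
    for i in range(128):
        even, odd = h[2 * i], h[2 * i + 1]
        if even + odd > 0:
            statistic += (even - odd) ** 2 / (even + odd)
            pairs += 1
    return statistic, pairs

rng = random.Random(1)
# Toy cover (an assumption): even values are twice as likely as odd ones.
cover = [2 * rng.randrange(128) + int(rng.random() < 1 / 3) for _ in range(60_000)]
# Full embedding: every LSB is replaced by a random message bit.
stego = [(p & ~1) | rng.randrange(2) for p in cover]

h_cover, h_stego = histogram(cover), histogram(stego)
chi_cover, dof = pair_chi_square(h_cover)
chi_stego, _ = pair_chi_square(h_stego)
print(f"cover: {chi_cover:.0f}, stego: {chi_stego:.0f}, degrees of freedom: {dof}")
```

A statistic close to the number of degrees of freedom is consistent with equalized pairs (full embedding), while a much larger value indicates unbalanced pairs. Converting the statistic to a p-value requires a chi-square distribution function, for example `scipy.stats.chi2.sf`.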

Spread-spectrum steganography

SS steganography

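Spread each bit $b \in \{-1, 1\}$ among $s$ pixels $x_1, \dots, x_s$:

$y_i = x_i + b\,e_i$

$e_i$ = spreading sequence, $e_i \sim N(0, \sigma^2)$

Maximal payload = $MN/s$ (bits) or $1/s$ (bpp).

To read the message:

$\frac{1}{s}\sum_i y_i e_i = \frac{1}{s}\sum_i x_i e_i + \frac{1}{s}\, b \sum_i e_i^2$

• First term: $\frac{1}{s}\sum_i x_i e_i \sim N(0, E(x)\sigma^2/s)$, where $E(x)$ = energy per pixel. This term will be small with high probability.
• Second term: $\frac{1}{s}\, b \sum_i e_i^2 \approx b\sigma^2$

Decision: correlation > 0 ⇒ b = 1; correlation < 0 ⇒ b = -1.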
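The spreading and correlation steps can be sketched as below. This is a simplified illustration under stated assumptions: the stego values are left as real numbers (no rounding or clipping to 0..255), the Gaussian spreading sequence is regenerated from a shared integer key, and the parameters (s = 20000, standard deviation 8) are chosen only so that the correlation term clearly dominates for 8-bit pixel values. The names `spreading_sequence`, `ss_embed`, and `ss_extract` are hypothetical.

```python
import random

def spreading_sequence(length, sigma, stego_key):
    """Zero-mean Gaussian spreading sequence e_i derived from the shared key."""
    rng = random.Random(stego_key)
    return [rng.gauss(0.0, sigma) for _ in range(length)]

def ss_embed(cover, bits, s, sigma, stego_key):
    """y_i = x_i + b * e_i, each bit b in {-1, +1} spread over s consecutive pixels."""
    e = spreading_sequence(len(bits) * s, sigma, stego_key)
    stego = [float(x) for x in cover]
    for k, b in enumerate(bits):
        for i in range(k * s, (k + 1) * s):
            stego[i] += b * e[i]
    return stego

def ss_extract(stego, num_bits, s, sigma, stego_key):
    """Correlate each block with the spreading sequence and take the sign."""
    e = spreading_sequence(num_bits * s, sigma, stego_key)
    bits = []
    for k in range(num_bits):
        corr = sum(stego[i] * e[i] for i in range(k * s, (k + 1) * s)) / s
        bits.append(1 if corr > 0 else -1)
    return bits

s, sigma, key = 20_000, 8.0, 99
bits = [1, -1, -1, 1]
rng = random.Random(7)
cover = [rng.randrange(256) for _ in range(len(bits) * s)]  # toy cover pixels
stego = ss_embed(cover, bits, s, sigma, key)
recovered = ss_extract(stego, len(bits), s, sigma, key)
```

With these numbers the cover carries only MN/s = 4 bits in 80,000 pixels, which illustrates how much payload spreading costs.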

SS steganography (continued)

Spreading buys us robustness at the expense of s-times lower payload.

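Consider a distorted signal: $z_i = y_i + n_i$

$\frac{1}{s}\sum_i z_i e_i = \frac{1}{s}\sum_i x_i e_i + \frac{1}{s}\sum_i n_i e_i + \frac{1}{s}\, b \sum_i e_i^2$

• $\frac{1}{s}\sum_i x_i e_i \sim N(0, E(x)\sigma^2/s)$ and $\frac{1}{s}\sum_i n_i e_i \sim N(0, E(n)\sigma^2/s)$: both terms will be small with high probability.
• $\frac{1}{s}\, b \sum_i e_i^2 \approx b\sigma^2$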
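One way to see why the interference terms shrink with the spreading factor is to compute their variance. The derivation below is a sketch under the assumption that the spreading samples $e_i$ are independent, zero-mean Gaussians with variance $\sigma^2$, independent of the cover and of the noise, with $x_i$ and $n_i$ treated as fixed.

```latex
\operatorname{Var}\!\left(\frac{1}{s}\sum_{i=1}^{s} x_i e_i\right)
  = \frac{1}{s^2}\sum_{i=1}^{s} x_i^2\,\sigma^2
  = \frac{\sigma^2}{s}\,E(x),
\qquad E(x) = \frac{1}{s}\sum_{i=1}^{s} x_i^2 ,

\operatorname{Var}\!\left(\frac{1}{s}\sum_{i=1}^{s} n_i e_i\right)
  = \frac{\sigma^2}{s}\,E(n),
\qquad
\frac{1}{s}\, b \sum_{i=1}^{s} e_i^2 \;\approx\; b\,\sigma^2
\quad \text{(law of large numbers)} .
```

The useful term has magnitude about $\sigma^2$, while the combined interference has standard deviation $\sigma\sqrt{(E(x)+E(n))/s}$, so reliable decoding needs $s$ to be large compared with $(E(x)+E(n))/\sigma^2$.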

Steganalysis in the wide sense

Traditional steganalysis: a steganography system is considered broken when the mere presence of a hidden message is detected.

Forensic analysis: detection of the message may not be sufficient; often, other information would be useful:

• identification of the embedding algorithm (LSB, ±1, …)
• the stego software used (F5, OutGuess, Steganos, …)
• the stego key (StegoSuite © by Wetstones, Inc.)
• the hidden bit-stream
• the decrypted message
