Watermarking Relational Databases
CSC 574/474 Information System Security
Cryptography vs. Steganography
Cryptography
  Encryption: translate information into an unintelligible form
  Decryption: decode to retrieve the information
  Attackers cannot recover the information
Steganography
  Hide information in a seemingly ordinary message
  “Security through obscurity”: attackers don’t know where to find the information
Steganography Examples
Greek messengers
  Message tattooed onto a shaved head
Invisible ink in a cover letter
Bits hidden in pictures
  Sounds familiar?
Hide one image in another
  Least significant bits
Other forms?
Example (figures not reproduced)
Courtesy: http://www.petitcolas.net/fabien/steganography/image%5Fdowngrading/old/
Courtesy: http://www.petitcolas.net/fabien/steganography/image_downgrading/index.html
Illustration of A Steganographic System
http://www.vu.union.edu/~shoemakc/watermarking/watermarking.html
Digital Watermarks
Insert marks into the original data
Used to demonstrate ownership: images, video, audio, software…
  Other uses?
Should not significantly affect the quality of the original data
Should not be easy to destroy
Deter rather than prevent illegal copying
Watermarking Databases
Why?
  Data in a database is intellectual property
Is it possible?
  Some numerical data do not need to be precise to be useful
    Example?
  Some data are imprecise in nature
    Example?
What Makes Watermarking Databases Different
Dealing with multiple objects (tuples) instead of one
Tuple order does not matter
After dropping part of the database, the remaining part is still valuable
Desirable Features
Detectability
  Allow undetectable marks
Robustness
  Benign updates, malicious attacks
Incremental updatability
  Do not need to recompute watermarks during updates
Desirable Features
Imperceptibility
  Preserve usefulness of the database
Blind system
  Does not need the original database for detection
Key-based system
  The watermarking scheme is open
  Only the private key matters
Attacks
Benign updates
Malicious attacks
  Bit attack
  Rounding attack
  Subset attack
  Mix-and-match attack
  Additive attack
  Invertibility attack
Basic Setup
n tuples, v numerical attributes, P: primary key
e: number of least significant bits available for marking
1/r: fraction of tuples marked
w: number of marked tuples (n/r)
a: confidence parameter
t: minimum number of correct marks for detection
H: a one-way hash function
K: private key
F: a MAC function, F(m) = H(K || H(K || m))
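
A minimal sketch of the MAC F defined above. The slides do not name a hash function, so SHA-256 stands in for H here as an assumption, "||" is read as byte concatenation, and the digest is interpreted as a big-endian integer so that it can later be reduced mod r, v, and e.

```python
import hashlib


def H(data: bytes) -> bytes:
    # One-way hash function H. SHA-256 is an assumption, not taken from the slides.
    return hashlib.sha256(data).digest()


def F(key: bytes, message: bytes) -> int:
    # F(m) = H(K || H(K || m)), returned as a non-negative integer.
    return int.from_bytes(H(key + H(key + message)), "big")
```

For example, F(b"secret", b"42") % r == 0 would decide whether the tuple with primary key 42 is marked.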
Watermark Insertion Algo.
Tuple layout: P A1 A2 … Ai … Av
(1) If F(P) mod r = 0, then the tuple should be marked
(2) Choose to mark Ai, where i = F(P) mod v
(3) Choose the jth bit to mark, where j = F(P) mod e
Bits of Ai: bk bk-1 … be-1 … b1, with bj the bit chosen for marking
bj = 0 if H(K || P) is even, bj = 1 otherwise
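
The insertion steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: SHA-256 stands in for H, attributes are non-negative integers, indices i and j are zero-based, and the name mark_tuple is hypothetical.

```python
import hashlib


def H(data: bytes) -> bytes:
    # Assumed hash function; the slides do not specify one.
    return hashlib.sha256(data).digest()


def F(key: bytes, pk: bytes) -> int:
    # F(P) = H(K || H(K || P)) as an integer.
    return int.from_bytes(H(key + H(key + pk)), "big")


def mark_tuple(key: bytes, pk: bytes, attrs: list, r: int, e: int) -> list:
    """Return a copy of attrs with at most one low-order bit overwritten."""
    f = F(key, pk)
    marked = list(attrs)
    if f % r != 0:                 # (1) only about 1/r of the tuples are marked
        return marked
    i = f % len(attrs)             # (2) attribute index, zero-based
    j = f % e                      # (3) bit index within the e low-order bits
    bit = int.from_bytes(H(key + pk), "big") % 2   # 0 if H(K || P) is even, else 1
    marked[i] = (marked[i] & ~(1 << j)) | (bit << j)
    return marked
```

Marking is idempotent: applying mark_tuple to an already marked tuple with the same key returns it unchanged, and any attribute value moves by less than 2^e.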
Watermark Detection Algo.
1. Determine whether a tuple is marked
2. Determine which attribute is marked
   totalcount++
3. Determine which bit is marked
4. Check whether the jth bit is the same as the expected mark
   If so, matchcount++
5. Check whether the threshold t is met
How to determine threshold t?
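
The detection steps above can be sketched in the same style. As before, SHA-256 for H, integer attributes, zero-based indices, and the function names are assumptions made for illustration. Note that detection is blind: only the key and the suspect relation are needed.

```python
import hashlib


def H(data: bytes) -> bytes:
    # Assumed hash function; the slides do not specify one.
    return hashlib.sha256(data).digest()


def F(key: bytes, pk: bytes) -> int:
    return int.from_bytes(H(key + H(key + pk)), "big")


def expected_bit(key: bytes, pk: bytes) -> int:
    # 0 if H(K || P) is even, 1 otherwise.
    return int.from_bytes(H(key + pk), "big") % 2


def detect(key: bytes, rows, r: int, e: int, t: int):
    """rows yields (pk, attrs) pairs. Returns (suspected, matchcount, totalcount)."""
    totalcount = matchcount = 0
    for pk, attrs in rows:
        f = F(key, pk)
        if f % r != 0:                  # step 1: tuple was not selected for marking
            continue
        totalcount += 1
        i = f % len(attrs)              # step 2: which attribute
        j = f % e                       # step 3: which bit
        if (attrs[i] >> j) & 1 == expected_bit(key, pk):   # step 4
            matchcount += 1
    return matchcount >= t, matchcount, totalcount         # step 5
```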
Operations on Watermarked Databases
Query?
Updates?
  Insertion
  Deletion
  Modification
How to Determine Threshold t
1. The probability that bj is not changed by watermarking is __
2. Out of w checks, the probability that t matches by chance is __
3. What is the probability the detection algorithm makes a wrong decision?
bj = 0 if H(K || P) is even, bj = 1 otherwise
How to Determine Threshold t
1. 0.5
2. C(w, t) * 0.5^w
3. (C(w, t) + C(w, t+1) + … + C(w, w)) * 0.5^w    (1)
Let a be the tolerable error rate. We have to choose the minimum t such that (1) < a
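
Expression (1) and the choice of t can be evaluated directly. The sketch below uses hypothetical function names and returns None when even t = w cannot push the false-hit probability below a.

```python
from math import comb


def false_hit_probability(w: int, t: int) -> float:
    # Expression (1): probability of at least t chance matches in w checks.
    return sum(comb(w, i) for i in range(t, w + 1)) / 2 ** w


def min_threshold(w: int, a: float):
    # Smallest t with (1) < a, or None if no such t exists for this w.
    for t in range(w + 1):
        if false_hit_probability(w, t) < a:
            return t
    return None
```

With w = 10 and a = 0.05, the tail probability is 56/1024 at t = 8 and 11/1024 at t = 9, so the minimum threshold is 9.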
Robustness Against Attacks
Bit-flipping attack
  Choose s tuples from the n tuples and flip all the e least significant bits; the chance of erasing the watermark is
  Sum_{i = w-t+1, …, w} C(w, i) * C(n-w, s-i) / C(n, s)
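
The sum above is a hypergeometric tail: the watermark is erased when at least w-t+1 of the w marked tuples fall among the s attacked ones. A small sketch with a hypothetical function name, skipping terms where i exceeds s:

```python
from math import comb


def erase_probability(n: int, w: int, t: int, s: int) -> float:
    # Probability that flipping s of n tuples hits at least w - t + 1 marked tuples.
    hits = 0
    for i in range(w - t + 1, w + 1):
        if 0 <= i <= s:            # cannot hit more marked tuples than were attacked
            hits += comb(w, i) * comb(n - w, s - i)
    return hits / comb(n, s)
```

For n = 4, w = 2, t = 2, s = 1, a single flipped tuple erases the watermark whenever it is one of the two marked tuples, giving 2/4 = 0.5.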
Mix-and-Match Attack
Mallory takes a fraction k of the database
Mixes it with his own relation
Creates a new relation of size n
For Alice to detect the watermark: k*n/r + 0.5*(1-k)*n/r >= t
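
The detection condition above can be rearranged to k >= 2t/w - 1 with w = n/r. The sketch below evaluates both forms. The function names are hypothetical, and the 0.5 factor is the chance that a selected bit in Mallory's own tuples matches by accident.

```python
def expected_matches(n: int, r: int, k: float) -> float:
    # k*n/r marks survive in Alice's tuples; the rest match with probability 0.5.
    w = n / r
    return k * w + 0.5 * (1 - k) * w


def min_fraction(n: int, r: int, t: int) -> float:
    # Smallest k for which the expected number of matches reaches t.
    w = n / r
    return max(0.0, 2 * t / w - 1)
```

With n = 1000, r = 10, and t = 60, Mallory must keep at least about 20% of Alice's tuples for the expected match count to reach the threshold.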
Additive Attack
Mallory inserts his own watermark into Alice’s database
How to determine who is the original owner?
  If the two watermarking schemes mark the same bit of the same tuple
  Then?
Invertibility Attack
Mallory finds a key that yields a satisfactory watermark on the database
Affected by a
  The larger a is, is it easier or harder to find such a key?
Design Tradeoffs
↓ a ↓ false hits ↑ missed watermark
↓ r ↑ robustness ↑ data errors
↑ v ↑ robustness
↑ e ↑ robustness ↑ data errors
Comments on the Paper
Simple yet effective idea
Thorough analysis
  Coming up with a good approach is hard
  Analyzing, validating, and completing the approach is even harder
No data on key length and hash function. What is their impact on performance?
Discussion
Possible attacks
  Frequent updates of the same tuple?
  Side channels
    Watermarking a tuple requires extra time
Basic assumption
  The owner’s database is secured
Regulations or laws regarding database copyright?
Discussion
How to handle non-numerical data
  Every change is significant
  But we have to make changes
    Minimize the number of changes
    Encode the message in cross-tuple properties
      E.g., attribute frequency histogram
Discussion
Watermarking semi-structured data, e.g., XML?
  Attribute or element values can be similarly watermarked
    Defining the key is an issue
  The structure of the semi-structured data may also need to be watermarked
Further Reading
Watermarking Relational Databases by Rakesh Agrawal and Jerry Kiernan, International Conference on Very Large Data Bases (VLDB), 2002.
Rights Assessment for Discrete Digital Data, Ph.D thesis, by Radu Sion, Purdue University.