C. Lucchese, M. Vlachos, D. Rajan, P.S. Yu
University of Venice, Italy; IBM Research; University of Chicago
(Figure: example trajectory; axes X, Y and Time)
Objective: Ownership seal with Mining Guarantees
The output of database and data mining operations is the same as on the original data
Final Destination
Embed a stamp so that we can claim ownership of the data
NN Search
Clustering
Classification
The trajectories are modified imperceptibly, but their neighboring objects are not distorted
…
Applications: Database Search
Search operations remain the same
– outsource data to a mining company
– maintain principal rights of the dataset
We want to retain the Nearest Neighbors of each object.
(Figure: object x, its nearest neighbor NN(x), and other objects y1, y2)
Watermark does not change the nearest neighbor
Determine the maximum watermark embedding power p that maintains the NN for all objects:
D_p(x, NN(x)) < D_p(x, y)
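The condition above can be checked directly on a small dataset. The sketch below is only an illustration, assuming each object is a row of real numbers and Euclidean distance is used; the function names and the toy data are hypothetical, not taken from the slides.

```python
import numpy as np

def nearest_neighbors(data):
    """Return, for every row of `data`, the index of its nearest other row."""
    diff = data[:, None, :] - data[None, :, :]
    dist = np.linalg.norm(diff, axis=2)      # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)           # an object is not its own neighbor
    return dist.argmin(axis=1)

def nn_preserved(original, modified):
    """True if every object keeps the same nearest neighbor."""
    return bool(np.array_equal(nearest_neighbors(original),
                               nearest_neighbors(modified)))

# Toy data (hypothetical): two well-separated pairs of objects.
original = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
slightly_changed = original + 0.01 * np.array([[1, -1], [-1, 1], [1, 1], [-1, -1]])
heavily_changed = original.copy()
heavily_changed[1] = [5.0, 4.5]              # object 1 moves next to object 2

print(nn_preserved(original, slightly_changed))  # True
print(nn_preserved(original, heavily_changed))   # False
```

A small distortion keeps every nearest neighbor, while moving one object across the dataset does not.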
Applications: Classification Preservation
Dataset of time-series/trajectories with class labels (Class A, Class B)
Objective: Distort the data imperceptibly so that class labels are maintained.
(Figure: modified dataset including watermark; one outcome is marked Acceptable, the other Unacceptable)
Applications: Clustering Preservation
The results of clustering remain the same
– geodesic distances will remain the same
– hierarchical clustering will not be affected
De Brazza Monkey Male
De Brazza Monkey Juvenile Male
Juvenile Baboon
Mandrill2 male
Orangutan juvenile
Orangutan2 male
Gray-necked Owl Monkey Male
Gray-necked Owl Monkey Female
Red Howler Monkey Male
Mandrill male
Common Chimpanzee male
Common Chimpanzee Male 2
Mantled Howler Monkey
The secret key is embedded in a domain resilient to common trajectory transformations
Frequency Domain
(Figure: original data → FT → magnitude and phase; the watermark modifies the magnitudes while the phase stays the same; watermarked magnitudes → inverse FT → watermarked data)
Additive Embedding in Magnitudes
Example: w = [-1 1 -1 -1 1 1], with embedding power p
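A minimal sketch of additive embedding in the Fourier magnitudes follows. It assumes a trajectory is stored as a complex sequence z = x + iy, that the key is added at hypothetical positions idx = 1..6, and that the update is simply magnitude + p·w; the slides do not state these details, so treat them as illustrative choices.

```python
import numpy as np

def embed(z, w, p, idx):
    """Add p * w to the Fourier magnitudes of z at positions idx; keep the phases."""
    Z = np.fft.fft(z)
    mag, phase = np.abs(Z), np.angle(Z)
    # Clip at zero so a magnitude never becomes negative (a choice of this sketch).
    mag[idx] = np.maximum(mag[idx] + p * np.asarray(w, dtype=float), 0.0)
    return np.fft.ifft(mag * np.exp(1j * phase))

# Synthetic closed curve (hypothetical): coefficient k has magnitude 64 / k.
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
z = sum(np.exp(1j * k * t) / k for k in range(1, 7))

w = np.array([-1, 1, -1, -1, 1, 1])   # the example key from the slide
idx = np.arange(1, 7)                 # assumed embedding positions
p = 2.0                               # assumed embedding power

zw = embed(z, w, p, idx)
xw, yw = zw.real, zw.imag             # watermarked coordinates
print(np.max(np.abs(zw - z)))         # largest point displacement
```

Because the sequence is complex, the inverse transform needs no symmetry constraint: the real and imaginary parts of the result are the watermarked x and y coordinates.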
Techniques are also applicable to image shapes (shapes can be treated as trajectories)
Example: Red Howler Monkey Male (Alouatta seniculus seniculus)
Conversion of the skull shape into a two-dimensional sequence
Orangutan skull → extracted shape → embed the key in the k most important coefficients
Secret information is hidden in some of the frequency components
(Figure: shape reconstructed from 2, 4, 8, 16, 32 and 64 coefficients)
Select the frequency coefficients that best describe the shape of the trajectory
One can select either the highest-energy coefficients or the low-frequency coefficients. Removing the watermark then becomes difficult without destroying the important trajectory characteristics
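As an illustration of selecting the highest-energy coefficients, the sketch below picks the k largest Fourier magnitudes of a complex sequence and rebuilds the shape from them. Skipping the DC term during selection and keeping it during reconstruction are assumptions of this sketch, as are the function names and the synthetic curve. Low-frequency selection would replace the sort with a fixed set of indices.

```python
import numpy as np

def top_k_indices(z, k):
    """Indices of the k Fourier coefficients with the largest magnitude."""
    mag = np.abs(np.fft.fft(z))
    mag[0] = -1.0                       # never select the DC term (assumption)
    return np.sort(np.argsort(mag)[-k:])

def reconstruct(z, idx):
    """Rebuild z from the DC term plus the coefficients listed in idx."""
    Z = np.fft.fft(z)
    kept = np.zeros_like(Z)
    kept[0] = Z[0]                      # keep the mean so the shape stays in place
    kept[idx] = Z[idx]
    return np.fft.ifft(kept)

# Synthetic closed curve (hypothetical) with an offset of 5.
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
z = 5.0 + sum(np.exp(1j * k * t) / k for k in range(1, 7))

errors = [float(np.max(np.abs(z - reconstruct(z, top_k_indices(z, k)))))
          for k in (2, 4, 6)]
print(top_k_indices(z, 3))
print(errors)                           # shrinks as more coefficients are kept
```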
The key is detected very efficiently even when it is inserted with low embedding power
Frequency Domain
(Figure: watermarked data → FT → magnitude and phase; the magnitudes are correlated with the watermark w = [-1 1 -1 -1 1 1])
Detection of the embedded key is virtually perfect
Better detection (semi-blind): remove the ‘background noise’ bias before the embedding and during the detection, then compare the detection score against a threshold
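One plausible reading of correlation-based detection is sketched below. It assumes the owner keeps the original magnitudes at the key positions and subtracts them as the "background" before correlating with the key; this interpretation, the threshold p / 2, and all names are assumptions of the sketch, not details given in the slides.

```python
import numpy as np

def embed(z, w, p, idx):
    """Add p * w to the Fourier magnitudes at idx; phases are unchanged."""
    Z = np.fft.fft(z)
    mag, phase = np.abs(Z), np.angle(Z)
    mag[idx] = np.maximum(mag[idx] + p * w, 0.0)
    return np.fft.ifft(mag * np.exp(1j * phase))

def detection_score(suspect, reference_mag, w, idx):
    """Mean of (suspect magnitude - reference magnitude) * w over the key positions."""
    mag = np.abs(np.fft.fft(suspect))[idx]
    return float(np.mean((mag - reference_mag) * w))

t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
z = sum(np.exp(1j * k * t) / k for k in range(1, 7))   # synthetic curve

idx = np.arange(1, 7)
key = np.array([-1.0, 1, -1, -1, 1, 1])
wrong_key = np.array([1.0, 1, 1, -1, -1, 1])            # orthogonal to key
p = 2.0
threshold = p / 2                                       # assumed threshold

reference_mag = np.abs(np.fft.fft(z))[idx]              # kept by the owner
marked = embed(z, key, p, idx)

score_key = detection_score(marked, reference_mag, key, idx)
score_wrong = detection_score(marked, reference_mag, wrong_key, idx)
score_unmarked = detection_score(z, reference_mag, key, idx)
print(score_key > threshold, score_wrong > threshold, score_unmarked > threshold)
```

With the background removed, the score for the correct key equals the embedding power, while a key orthogonal to it and the unmarked data both score near zero.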
Example of using our technique for spanning tree preservation
(Figure: MST before watermarking and MST after watermarking)
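Spanning tree preservation can be verified by comparing the edge sets of the minimum spanning tree before and after a modification. The sketch below uses a plain Prim's algorithm on Euclidean distances; the function name and toy points are hypothetical, and the small perturbation merely stands in for a watermark.

```python
import numpy as np

def mst_edges(points):
    """Edges of the Euclidean minimum spanning tree (simple Prim's algorithm)."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    n = len(points)
    in_tree = [0]                        # grow the tree from object 0
    edges = set()
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j not in in_tree and (best is None or dist[i, j] < dist[best]):
                    best = (i, j)        # cheapest edge leaving the tree so far
        in_tree.append(best[1])
        edges.add(frozenset(best))       # undirected edge
    return edges

points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
perturbed = points + 0.01 * np.array([[1, -1], [-1, 1], [1, 1], [-1, -1]])

print(mst_edges(points) == mst_edges(perturbed))  # True
```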
Finding the maximum embedding power
We need to examine, for each power p, how many times the following condition is violated:
D_p(x, NN(x)) < D_p(x, y)
(Figure: object x, its nearest neighbor NN(x), and other objects y, z)
Express the distance as a function of the embedding power of the key
The proposed fast algorithm prunes a significant part of the search space
Running Time
Our approach can embed the hidden information more than 300 times faster than the brute-force approach
The fast search techniques find the same result as the exhaustive search, but are 2-3 orders of magnitude faster
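For reference, a brute-force baseline might look like the sketch below: it tries a grid of embedding powers in ascending order, counts nearest-neighbor violations for each, and stops at the first power that changes any neighbor. This is not the fast algorithm from the slides, which is not described here; the random-walk data, the power grid, and all names are hypothetical.

```python
import numpy as np

def embed_all(data, w, p, idx):
    """Watermark every row (a complex trajectory) with the same key and power."""
    Z = np.fft.fft(data, axis=1)
    mag, phase = np.abs(Z), np.angle(Z)
    mag[:, idx] = np.maximum(mag[:, idx] + p * w, 0.0)
    return np.fft.ifft(mag * np.exp(1j * phase), axis=1)

def nearest(data):
    """Index of the nearest other row for every row."""
    dist = np.linalg.norm(data[:, None, :] - data[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    return dist.argmin(axis=1)

def violations(data, w, p, idx):
    """Number of objects whose nearest neighbor changes at power p."""
    return int(np.sum(nearest(data) != nearest(embed_all(data, w, p, idx))))

def max_power(data, w, idx, powers):
    """Largest power on the ascending grid reached before any violation."""
    best = 0.0
    for p in powers:
        if violations(data, w, p, idx) > 0:
            break
        best = float(p)
    return best

rng = np.random.default_rng(0)
steps = rng.normal(size=(20, 64)) + 1j * rng.normal(size=(20, 64))
data = np.cumsum(steps, axis=1)          # 20 random-walk trajectories

key = np.array([-1.0, 1, -1, -1, 1, 1])
idx = np.arange(1, 7)
powers = np.linspace(0.5, 20.0, 40)

best = max_power(data, key, idx, powers)
print(best, violations(data, key, best, idx))
```

Each candidate power requires recomputing all pairwise distances, which is the cost that a parametrized distance and pruning would avoid.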
The efficient key embedding and detection allow for effective key recovery even under attacks:
– Geometric attacks: perfect detection under translation, rotation, and scaling attacks
– Gaussian noise attack: has to destroy the data in order to be effective
– Decimation attack: can be perfectly withstood
– Data reduction attack: not effective, even when pruning 50% of the dataset
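The behavior under geometric attacks can be illustrated with the Fourier magnitudes of a complex sequence z = x + iy. This is a small worked example on a synthetic curve, not the authors' evaluation: translation changes only the DC coefficient, rotation changes only the phases, and scaling multiplies every magnitude by the same factor. How the scale factor is compensated during detection is not stated in the slides.

```python
import numpy as np

def magnitudes(z):
    """Fourier magnitudes of a complex sequence."""
    return np.abs(np.fft.fft(z))

# Synthetic closed curve (hypothetical).
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
z = np.cos(t) + 0.3 * np.cos(3 * t) + 1j * (np.sin(t) + 0.3 * np.sin(2 * t))

translated = z + (4.0 - 2.5j)            # shift by a constant vector
rotated = z * np.exp(1j * np.pi / 6)     # rotate by 30 degrees around the origin
scaled = 1.7 * z                         # uniform scaling

same_after_translation = np.allclose(magnitudes(translated)[1:], magnitudes(z)[1:])
same_after_rotation = np.allclose(magnitudes(rotated), magnitudes(z))
scaled_by_factor = np.allclose(magnitudes(scaled), 1.7 * magnitudes(z))
print(same_after_translation, same_after_rotation, scaled_by_factor)
```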