Cryptography For Privacy Preserving Data Mining
CSE 4120: Technical Writing & Seminar
Submitted by: MD. Mesbah Uddin Khan
Roll – 0707059
Dated: June 14, 2011
Things we need to know
• Privacy
• Privacy Preserving
• Privacy Preserving Computation
• Secure Computations
• Privacy Preserving Data Mining
• Cryptography
Privacy (1/2)
Let's consider the following scenario:
• Separate medical institutions wish to conduct joint research while preserving the privacy of their patients.
• In this scenario, privileged information must be protected, yet it must also remain usable for research.
How can we solve this problem?
Privacy (2/2)
Therefore we need a protocol which…
• is secure, i.e. it gives the original parties what they would otherwise need a trusted third party for: someone who does the computation and hands only the results back to the original parties.
• limits information leakage in distributed computation.
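As an illustration of such a protocol, here is a minimal sketch of a secure sum based on additive secret sharing. The slides do not name a specific protocol, so this technique is an assumption chosen for illustration. It assumes semi-honest parties that do not all collude, and it simulates every party inside one process. The names MODULUS, make_shares and secure_sum are hypothetical.

```python
import secrets

# Assumed public modulus; it must be larger than any possible total.
MODULUS = 2**61 - 1

def make_shares(value, n_parties, modulus=MODULUS):
    """Split value into n_parties random shares that sum to value mod modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]

def secure_sum(private_values, modulus=MODULUS):
    """Simulate the parties: each one only ever sees shares, never raw inputs."""
    n = len(private_values)
    # Party i splits its input; share j is sent to party j.
    all_shares = [make_shares(v, n, modulus) for v in private_values]
    # Party j adds the shares it received and publishes only that subtotal.
    subtotals = [sum(all_shares[i][j] for i in range(n)) % modulus
                 for j in range(n)]
    # The published subtotals add up to the true total.
    return sum(subtotals) % modulus

# Example: three institutions with private patient counts.
patient_counts = [120, 45, 310]
total = secure_sum(patient_counts)
```

Each individual share is uniformly random, so a single party learns nothing about another party's input beyond what the final total reveals. No third party is involved.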
Privacy Preserving
Ultra-large databases hold a large number of transactional records.
Privacy preserving protocols are designed in order to preserve privacy even in the presence of adversarial participants.
Adversarial participants attempt to gather information about the inputs of their peers.
Adversarial participants
Two types of adversaries:
• Semi-honest adversary
◦ also known as a passive, or honest-but-curious, adversary
• Malicious adversary
◦ may arbitrarily deviate from the protocol specification
Privacy Preserving Computations (1/3)
• Classification
◦ Separate parties try to build decision trees without disclosing the contents of their private databases
◦ Algorithms: ID3, Gain Ratio, Gini Index, etc.
• Data Clustering
◦ Both parties want to jointly perform data clustering
◦ Performed based on data clustering principles
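For the classification case, the split criteria behind ID3 (information gain) and the Gini index can be sketched as below. This is a plain, non-private computation on a single list of labels, shown only to make the quantities concrete; in the privacy-preserving setting the parties would have to compute them jointly without exchanging records. The function names and the row format (a dict per record) are assumptions made for this example.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on attribute attr (used by ID3)."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: one attribute that separates the classes perfectly.
rows = [{"fever": "yes"}, {"fever": "yes"}, {"fever": "no"}, {"fever": "no"}]
labels = ["sick", "sick", "healthy", "healthy"]
gain = information_gain(rows, labels, "fever")
```

A decision-tree builder picks, at each node, the attribute with the highest gain (or the lowest weighted Gini index).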
Privacy Preserving Computations (2/3)
• Mining Association Rules
◦ Both parties jointly find the association rules from their databases without revealing the information in the individual databases.
• Fraud Detection
◦ Two parties want to cooperate in preventing fraud in their systems, without sharing their data patterns.
◦ Each private database contains sensitive data.
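For association rule mining, the quantities the parties need are the global support and confidence of a rule. The sketch below assumes the transactions are split between the parties by rows, so each party only contributes local counts. It combines those counts in the clear, which a real protocol would avoid (for instance by adding them with a secure sum). All names and the sample data are hypothetical.

```python
def local_count(transactions, itemset):
    """Number of local transactions that contain every item of itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))

def global_support(databases, itemset):
    """Fraction of all transactions, across parties, containing itemset."""
    total = sum(len(db) for db in databases)
    hits = sum(local_count(db, itemset) for db in databases)
    return hits / total

def global_confidence(databases, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent across all parties."""
    both = set(antecedent) | set(consequent)
    hits_both = sum(local_count(db, both) for db in databases)
    hits_ante = sum(local_count(db, antecedent) for db in databases)
    return hits_both / hits_ante

# Hypothetical private databases of two parties.
db_a = [{"bread", "milk"}, {"bread"}, {"milk", "eggs"}]
db_b = [{"bread", "milk"}, {"bread", "milk", "eggs"}]

support = global_support([db_a, db_b], {"bread", "milk"})
confidence = global_confidence([db_a, db_b], {"bread"}, {"milk"})
```

Only the per-party counts enter the final formulas, which is why the raw transactions never need to be pooled.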
Privacy Preserving Computations (3/3)
• Profile Matching
◦ Mr. X has a database of hackers' profiles.
◦ Mr. Y has recently traced the behavior of a person whom he suspects of being a hacker.
◦ Now, if Mr. Y wants to check whether his suspicion is correct, he needs to check Mr. X's database.
◦ Mr. X's database needs to be protected because it contains sensitive hacker-related information.
◦ Therefore, when Mr. Y enters the hacker's behavior and searches Mr. X's database, he cannot view the whole database; instead, he only gets the comparison results for the matching behavior.
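One possible way to realize this kind of matching is a Diffie-Hellman-style private comparison, sketched below. The slides do not specify a protocol, so this is an assumed technique: it handles exact matches only, assumes semi-honest parties, and uses a toy modulus that is far too small for real use. Both parties are simulated in one function, and all names are hypothetical.

```python
import hashlib
import secrets
from math import gcd

P = 2**127 - 1  # toy prime modulus (assumption); not a secure parameter

def hash_to_group(item):
    """Map a string to an integer in [1, P-1]."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1

def random_exponent():
    """Secret exponent coprime to P-1, so blinding is one-to-one."""
    while True:
        e = secrets.randbelow(P - 3) + 2
        if gcd(e, P - 1) == 1:
            return e

def blind(value, key):
    """Blind a group element with a secret key; blinding commutes."""
    return pow(value, key, P)

def matches(query, profiles):
    """Return True if Mr. Y's query equals one of Mr. X's profiles."""
    key_x = random_exponent()  # known only to Mr. X
    key_y = random_exponent()  # known only to Mr. Y
    # Mr. Y -> Mr. X: the query, blinded with Mr. Y's key.
    blinded_query = blind(hash_to_group(query), key_y)
    # Mr. X -> Mr. Y: the query blinded again, plus his blinded profiles.
    double_query = blind(blinded_query, key_x)
    blinded_profiles = [blind(hash_to_group(p), key_x) for p in profiles]
    # Mr. Y applies his key to Mr. X's values and compares.
    double_profiles = {blind(v, key_y) for v in blinded_profiles}
    return double_query in double_profiles

# Hypothetical behavior profiles held by Mr. X.
profiles_x = ["port scan then brute force", "sql injection probe"]
found = matches("sql injection probe", profiles_x)
```

Because (h^a)^b = (h^b)^a mod P, equal items end up with equal doubly blinded values, while Mr. Y only ever sees Mr. X's profiles in blinded form.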
Two distinct problems
1. Secure Computation:
◦ which functions can be safely computed.
◦ safety means that the privacy of individuals is preserved.
2. Privacy Preserving Data Mining:
◦ compute results while minimizing the damage to privacy.
◦ compute the results without pooling the data, and in a way that reveals nothing but the final results of the data mining computation.
Cryptography
• Cryptography is the practice and study of hiding information.
Concluding Remarks
• Functions can be computed efficiently using specialized constructions.
• A secure protocol for computing a certain function will always be more costly than a naive protocol that provides no security.
Thank you all…