HAL Id: hal-01956155
https://hal.archives-ouvertes.fr/hal-01956155
Submitted on 14 Dec 2018
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Towards Scalable, Efficient and Privacy Preserving Machine Learning
Rania Talbi, Sara Bouchenak
To cite this version: Rania Talbi, Sara Bouchenak. Towards Scalable, Efficient and Privacy Preserving Machine Learning. Middleware ’18 Doctoral Symposium, Dec 2018, Rennes, France. ⟨hal-01956155⟩
Rania Talbi, Sara Bouchenak
INSA Lyon, France
{firstname.lastname}@insa-lyon.fr
Poster sections: Context and Motivation, Objectives, Related work, Design principles, Preliminary results, References
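Context and Motivation
[Figure: a network of companies $C_i$, one of which is a fraudulent company $C_F$, each with its local bank transactions $B_i$; a central supervision authority $A$ applies data mining for fraud detection to the union of the local transactions, $M(\bigcup_i B_i)$.]
$B_i$: Local bank transactions of $C_i$
$C_F$: Fraudulent company
$C_i$: Company i
$A$: Central Supervision Authority
$M$: Data mining for fraud detection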
December 10th, Middleware 2018’s doctoral symposium, Rennes, France.
DynAmic Privacy Preserving machine Learning Framework (DAPPLE)
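[Figure: DAPPLE architecture, with two panels titled "Privacy Preserving Classifier Learning" and "Privacy Preserving Class Prediction". Elements shown: data owners $DO_1, \ldots, DO_n$ with their encrypted chunks $[S_k^1]_{pk_1}, \ldots, [S_k^n]_{pk_n}$; the CSP with the encrypted data model $[w_k]_{pk_w}$, labeled "Incremental update of the data model"; and a querier $Q_j$ with $[X_j]_{pk_j}$ and $[C_j]_{pk_j}$.]
$DO_i$: Data owner i
$Q_j$: Classification querier j
$[w_k]_{pk_w}$: Encrypted data model
CSP: Classification Service Provider
$[X_j]_{pk_j}$: Encrypted classification query
$[C_j]_{pk_j}$: Encrypted classification response
$[S_k^i]_{pk_i}$: Encrypted local training data chunk from data owner $DO_i$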
Objectives
§ Minimize the computational costs incurred by privacy preservation.
§ Provide an end-to-end privacy preserving outsourced data classification service.
§ Enable a set of mutually untrusted data owners to have a global view of the union of their data without breaching the privacy of any one of them.
§ Enable dynamic data model updates when new training data samples are available.
Preliminary results
§ We have used a synthetic dataset for fraud detection in a B2B network.
§ This dataset contains 1000 bank transactions with 9 attributes each.
§ We compare our work to the Ciphermed framework [8].
Related work
PPML approaches differ along several dimensions:
§ Different ML algorithms: clustering [1], classification [2], association rule mining [3], ...
§ Different privacy-preservation objectives: ML output protection, original data protection
§ Different architectures: distributed [4], outsourced [5]
§ Privacy preservation techniques: cryptographic techniques (SMC/HE, GC, OT) and non-cryptographic techniques (PP data publishing techniques)
[Figure: three trade-off charts, each over the axes Privacy, Runtime, and Utility; the plotted values are not recoverable.]
Design principles
§ Cryptographic based protection (data model, training data, classification queries and responses)
§ Decent privacy and utility levels
§ Partial homomorphic encryption (PHE) based building blocks
§ Efficient runtime
§ Entirely outsourced ML computations over encrypted data
§ Combine PHE with cryptographic blinding (DTPKC cryptosystem [6])
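Ex.: $[x]_{pk} \otimes [r]_{pk} = [x \oplus r]_{pk}$
Protocol between two parties $U_1$ and $U_2$:
§ (1) Blind inputs
§ (2) Partially decrypt blinded values
§ (3) Decrypt blinded values
§ (4) Run operation over blinded values
§ (5) Remove blinding from the result
[Figure: message exchange between $U_1$ and $U_2$, with arrows labeled (2) and (4).]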
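The blinding steps above can be sketched with a toy additively homomorphic scheme. The code below is an illustration under several assumptions and is not the DAPPLE implementation: it uses textbook Paillier with tiny hard-coded primes as a stand-in for the DTPKC cryptosystem [6], it collapses partial and full decryption into a single `decrypt` call by a party assumed to hold the whole secret key, it assumes that $U_1$ blinds and unblinds while $U_2$ computes, and it picks multiplication as the example operation. All function names are hypothetical.

```python
import math
import random

# Toy Paillier key with tiny primes: for illustration only, NOT secure.
P, Q = 1789, 1861
N = P * Q
N2 = N * N
G = N + 1
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)  # valid decryption constant because G = N + 1


def encrypt(m: int) -> int:
    """Encrypt m (reduced mod N) under the toy public key."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m % N, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Decrypt with the toy secret key (stand-in for DTPKC's two-step decryption)."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


def add(c1: int, c2: int) -> int:
    """Ciphertext product gives plaintext sum: [a] (x) [b] = [a + b]."""
    return (c1 * c2) % N2


def scalar_mul(c: int, k: int) -> int:
    """Ciphertext power gives plaintext scaling: [a]^k = [k * a]."""
    return pow(c, k % N, N2)


def u2_multiply_blinded(cx_b: int, cy_b: int) -> int:
    """Assumed role of U2: decrypt the blinded values, multiply, re-encrypt."""
    xb, yb = decrypt(cx_b), decrypt(cy_b)
    return encrypt(xb * yb)


def secure_multiply(cx: int, cy: int) -> int:
    """Assumed role of U1: blind inputs, delegate to U2, remove the blinding."""
    r1, r2 = random.randrange(1, N), random.randrange(1, N)
    cx_b = add(cx, encrypt(r1))  # blind: [x + r1]
    cy_b = add(cy, encrypt(r2))  # blind: [y + r2]
    ch = u2_multiply_blinded(cx_b, cy_b)  # [(x + r1)(y + r2)]
    # Unblind: (x + r1)(y + r2) - r2*x - r1*y - r1*r2 = x*y
    c = add(ch, scalar_mul(cx, -r2))
    c = add(c, scalar_mul(cy, -r1))
    return add(c, encrypt(-r1 * r2))


if __name__ == "__main__":
    print(decrypt(secure_multiply(encrypt(12), encrypt(34))))  # 408
```

In this sketch $U_2$ only ever sees $x + r_1$ and $y + r_2$, which are uniformly masked modulo N, while $U_1$ only handles ciphertexts.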
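§ We implemented the VFDT incremental decision tree learning algorithm [7].
[Figure: result panels A and B, with the legend below; the plotted values are not recoverable.]
Naive approach: a combination of low-level PP building blocks
1st optimization: use inline building blocks
2nd optimization: parallel computing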
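For readers unfamiliar with VFDT [7], its split decision rests on the Hoeffding bound: after n samples at a leaf, the best attribute is chosen once its observed gain exceeds the runner-up's by more than $\epsilon = \sqrt{R^2 \ln(1/\delta) / (2n)}$. The sketch below shows that test on plaintext values only; how the poster's framework evaluates it over encrypted data is not described here, and the function names, parameter names, and default tie threshold are assumptions made for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Epsilon such that the true mean lies within epsilon of the observed
    mean with probability 1 - delta, after n independent observations."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, value_range: float,
                 delta: float, n: int, tie_threshold: float = 0.05) -> bool:
    """VFDT-style test: split when the best attribute is reliably better than
    the runner-up, or when the bound is tight enough to call it a tie."""
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


if __name__ == "__main__":
    # Hypothetical gains observed at one leaf after 1000 samples.
    print(should_split(0.30, 0.10, value_range=1.0, delta=1e-7, n=1000))
```

Because the bound shrinks as n grows, a leaf that cannot yet separate two close attributes simply waits for more samples, which is what makes the learner incremental.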
References
§ [1] X. Hu et al.: Privacy-Preserving K-Means Clustering Upon Negative Databases. ICONIP (4) 2018.
§ [2] S. Kim et al.: Privacy-Preserving Naive Bayes Classification Using Fully Homomorphic Encryption. ICONIP (4) 2018: 349-358.
§ [3] L. Liu et al.: Privacy-Preserving Mining of Association Rule on Outsourced Cloud Data from Multiple Parties. ACISP 2018: 431-451.
§ [4] H. Yu et al.: Privacy-Preserving SVM Classification on Vertically Partitioned Data. PAKDD 2006: 647-656.
§ [5] T. Li et al.: Outsourced privacy-preserving classification service over encrypted data. J. Network and Computer Applications 106: 100-110 (2018).
§ [6] X. Liu et al.: An Efficient Privacy-Preserving Outsourced Calculation Toolkit With Multiple Keys. IEEE Trans. Information Forensics and Security 11(11): 2401-2414 (2016).
§ [7] P. Domingos et al.: Mining high-speed data streams. KDD 2000: 71-80.
§ [8] R. Bost et al.: Machine Learning Classification over Encrypted Data. NDSS 2015.
2018 ACM/IFIP International Middleware Conference, Doctoral Symposium, December 10-14th 2018, Rennes, France