17
Research Article Social Security and Privacy for Social IoT Polymorphic Value Set: A Solution to Inference Attacks on Social Networks Yunpeng Gao 1 and Nan Zhang 2 1 School of Engineering and Applied Science, George Washington University, Washington, DC, USA 2 Kogod School of Business, American University, Washington, DC, USA Correspondence should be addressed to Yunpeng Gao; [email protected] Received 27 April 2019; Accepted 4 August 2019; Published 28 August 2019 Guest Editor: Houbing Song Copyright © 2019 Yunpeng Gao and Nan Zhang. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Social Internet of ings (SIoT) integrates social network schemes into Internet of ings (IoT), which provides opportunities for IoT objects to form social communities. Existing social network models have been adopted by SIoT paradigm. e wide dis- tribution of IoT objects and openness of social networks, however, make it more challenging to preserve privacy of IoT users. In this paper, we present a novel framework that preserves privacy against inference attacks on social network data through ranked retrieval models. We propose PVS, a privacy-preserving framework that involves the design of polymorphic value sets and ranking functions. PVS enables polymorphism of private attributes by allowing them to respond to different queries in different ways. We begin this work by identifying two classes of adversaries, authenticity-ignorant adversary, and authenticity-knowledgeable adversary, based on their knowledge of the distribution of private attributes. Next, we define the measurement functions of utility loss and propose PVSV and PVST that preserve privacy against authenticity-ignorant and authenticity-knowledgeable adver- saries, respectively. We take into account the utility loss of query results in the design of PVSV and PVST. Finally, we show that PVSV and PVST meet the privacy guarantee with acceptable utility loss in extensive experiments over real-world datasets. 1. Introduction 1.1. Motivation. SIoT integrates social network schemes into IoT systems, which provides opportunities for IoT objects to form social networks. e SIoT paradigm is promising as it is believed that SIoT structures are helpful in enhancing the navigability of IoT networks, identifying levels of trust- worthiness and reusing existing social network models [5]. In this scenario, privacy and security issues have been ex- tensively studied [6, 7, 24, 48]. However, current studies in privacy preservation of IoT systems focus on access control [23, 46], communication and authentication protocols [4, 34, 44], and attribute-based encryption [39, 43]. e features of social network have not been thoroughly considered. e nature of online social networks (OSN) requires sharing of information. User information, including activity patterns and descriptive attributes, is mined and analyzed to improve user experience of OSN applications. ird party users also take advantage of the huge amount of data col- lected by social networks [21]. As part of the improvement of user experience, ranked retrieval models have been exten- sively studied and applied to many OSN features, e.g., link prediction and recommendation systems [32, 36, 47]. For instance, given a user’s information (which can be a tuple in a database), a ranked retrieval model returns a ranking result that serves OSN features, e.g., “People you may know” and “Recommended for you.” Furthermore, many OSN pro- viders improve the accuracy of ranked results by taking into account private attributes of users in the ranked retrieval model. For example, sensitive demographics such as race, religion, and income can help in friend recommendation features as intuitively people sharing similar demographics are more likely to be interested in each other. OSN providers relieve users’ concern of privacy leakage by allowing them to mark attributes as “private” and hiding Hindawi Security and Communication Networks Volume 2019, Article ID 5498375, 16 pages https://doi.org/10.1155/2019/5498375

Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

Research ArticleSocial Security and Privacy for Social IoT Polymorphic ValueSet A Solution to Inference Attacks on Social Networks

Yunpeng Gao 1 and Nan Zhang 2

1School of Engineering and Applied Science George Washington University Washington DC USA2Kogod School of Business American University Washington DC USA

Correspondence should be addressed to Yunpeng Gao ypgaogwuedu

Received 27 April 2019 Accepted 4 August 2019 Published 28 August 2019

Guest Editor Houbing Song

Copyright copy 2019 Yunpeng Gao and Nan Zhang is is an open access article distributed under the Creative CommonsAttribution License which permits unrestricted use distribution and reproduction in anymedium provided the original work isproperly cited

Social Internet ofings (SIoT) integrates social network schemes into Internet ofings (IoT) which provides opportunities forIoT objects to form social communities Existing social network models have been adopted by SIoT paradigm e wide dis-tribution of IoT objects and openness of social networks however make it more challenging to preserve privacy of IoT users Inthis paper we present a novel framework that preserves privacy against inference attacks on social network data through rankedretrieval modelsWe propose PVS a privacy-preserving framework that involves the design of polymorphic value sets and rankingfunctions PVS enables polymorphism of private attributes by allowing them to respond to dierent queries in dierent ways Webegin this work by identifying two classes of adversaries authenticity-ignorant adversary and authenticity-knowledgeableadversary based on their knowledge of the distribution of private attributes Next we dene the measurement functions of utilityloss and propose PVSV and PVST that preserve privacy against authenticity-ignorant and authenticity-knowledgeable adver-saries respectively We take into account the utility loss of query results in the design of PVSV and PVST Finally we show thatPVSV and PVST meet the privacy guarantee with acceptable utility loss in extensive experiments over real-world datasets

1 Introduction

11Motivation SIoT integrates social network schemes intoIoT systems which provides opportunities for IoTobjects toform social networkse SIoTparadigm is promising as it isbelieved that SIoT structures are helpful in enhancing thenavigability of IoT networks identifying levels of trust-worthiness and reusing existing social network models [5]In this scenario privacy and security issues have been ex-tensively studied [6 7 24 48] However current studies inprivacy preservation of IoT systems focus on access control[23 46] communication and authentication protocols[4 34 44] and attribute-based encryption [39 43] efeatures of social network have not been thoroughlyconsidered

e nature of online social networks (OSN) requiressharing of information User information including activitypatterns and descriptive attributes is mined and analyzed to

improve user experience of OSN applications ird partyusers also take advantage of the huge amount of data col-lected by social networks [21] As part of the improvement ofuser experience ranked retrieval models have been exten-sively studied and applied to many OSN features eg linkprediction and recommendation systems [32 36 47] Forinstance given a userrsquos information (which can be a tuple ina database) a ranked retrieval model returns a ranking resultthat serves OSN features eg ldquoPeople you may knowrdquo andldquoRecommended for yourdquo Furthermore many OSN pro-viders improve the accuracy of ranked results by taking intoaccount private attributes of users in the ranked retrievalmodel For example sensitive demographics such as racereligion and income can help in friend recommendationfeatures as intuitively people sharing similar demographicsare more likely to be interested in each other

OSN providers relieve usersrsquo concern of privacy leakageby allowing them to mark attributes as ldquoprivaterdquo and hiding

HindawiSecurity and Communication NetworksVolume 2019 Article ID 5498375 16 pageshttpsdoiorg10115520195498375

private attributes from profiling pages and ranked resultsUsers believe that their privacy is well protected since privateattributes are invisible to the public in their profiles or anyranked results However Rahman et al [35] proposed RankInference and showed that privacy of private attributes in theranked retrieval model is not guaranteed In their approachthe value of a private attribute can be inferred through aranked retrieval interface given the premises that the domainof the private attribute is finite and that the ranking functionhas both monotonicity property and additivity property[35] To be more specific given monotonicity and additivityconditions an adversary is always able to find a pair ofdifferential queries qθ and qθprime such that (a) qθ and qθprime sharethe same predicate on all attributes except for a privateattribute B1 (b) the value of B1 in qθprime is θwhile the value of B1in qθ is not and (c) the ranked result of qθ contains thevictim tuple v while the ranked result of qθprime does not Rahmanet al showed that given the above conditions the adversarywas able to conclude that the value of vrsquos B1 is not equal to θFurthermore the adversary was able to infer the value of vrsquosB1 by finding more differential queries and excluding morevalues from the domain of B1 as long as the domain is finite

Privacy issues arise when private attributes of users aretaken into account in the ranked retrieval model Intuitivelythe issues can be solved by removing private attributes fromranking functions which decreases the utility of many OSNfeatures the recommendation system cannot provide ac-curate results From OSN providersrsquo perspective removingprivate attributes is not a practical solution )erefore thiswork aims not only to address the issue of rank inference butalso to propose a framework that preserves the privacy ofusers against all inference attacks through the ranked re-trieval model while minimizes utility loss

12 Related Work Encryption technologies have been usedthroughout history in security and privacy preservationSearching on encrypted data [10 38] has been introduced toensure data privacy Cao et al [9] proposed a scheme thatallows privacy-preserving ranked search over encrypteddata A query consisting of multiple keywords is conductedby searching over encrypted documents with securek-nearest neighbor (kNN) technique Chen et al [11] tookinto consideration correlations between documents beforeconducting a search query and achieved better performanceVertical fragmentation [13 15 16 22] has been applied toencrypted data which hides identities of users by separatingidentifier attributes with descriptive attributes Howeverencrypted searching is developed to preserve privacy againstadversaries in a cloud computing environment Adversariescan still access decrypted searching results from whichprivate attribute values can be inferred

As ONS has been emerging as an important source forbig data many studies have been carried out for privacy-preserving data mining (PPDM) and privacy-preservingdata publishing (PPDP) Perturbative methods implementthe ldquocamouflagerdquo paradigm where original data are directlymodified [28] Agrawal et al [3] proposed an algorithm thatperturbs data with random additive noise Liu et al proposed

data perturbation with multiplicative noise However ran-dom noise has predictable structures in the spectral domainand thus privacy provided by additive noise is questionable[25] Furthermore additive or multiplicative noise can onlybe applied to numerical data Data swapping approaches[19 30 31] perturb data by swapping values between recordsthat are close to each other Other distance-based ap-proaches include [12] in which data points are perturbedwithout changing their relevant closeness relationships and[2] in which data points are clustered and each data pointrsquosvalue is replaced by the value of the cluster center Howeverthose approaches rely on a universal measurement ofcloseness between data points in multidimensional spaceFurthermore they are limited by the distributions of datapoints

Generalization is the process of replacing a group ofvalues with a more general value that can represent thegroup Suppression is the ultimate state of generalizationsuch that the representative value is ldquonot applicablerdquo and as aresult the group of values is removed from the dataset [45]k-anonymity [40 41] is a widely studied approach thatpreserves privacy of records by grouping at least k recordsinto an equivalence class )e attribute values of the k re-cords are suppressed so that the k records are in-distinguishable from adversaries Machanavajjhala et al [27]proposed L-diversity that focuses on attribute privacyL-diversity forces each equivalent class to have at least ldifferent values for each attribute Li et al [26] proposedT-closeness that further considers the distribution of at-tribute values T-closeness sets a threshold for the variancebetween the distribution of a private attribute in eachequivalence class and the distribution of the same attributein the entire dataset However those approaches are de-veloped to preserve privacy of published data while theranked retrieval model of ONS does not directly revealprivate attributes to the public Furthermore suppression ofprivate attribute values introduces unnecessary utility loss tothe ranked retrieval model Equivalent Set [20] was proposedto preserve privacy against inference attacks through theranked retrieval model )is approach groups differenttuples into a set such that they are indistinguishable inranked results However this approach requires that tuplesin the same equivalent set have different values in everyprivate attribute and have the same value in every publicattribute Assume that a kNN query q is sent to a rankedretrieval interface protected by Equivalent Set and that atuple t is equal to q in most private attributes Since the othertuples in the same equivalent set of t are different from t inevery private attribute they must be different from q in mostprivate attributes too )erefore the original rank of t givenq should be much higher than that of any other tuple in thesame equivalent set In this case the rank of t given q will besignificantly lowered by Equivalent Set in order to achieveindistinguishability which reduces the accuracy of theranked result of q Furthermore it is possible that there is notprime such that tprime is different from t in every private attribute andis same with t in every public attribute In this case we haveto suppress the private attribute values of t which furtherintroduces utility loss

2 Security and Communication Networks

Differential privacy [8 17 18] is another widely studiedframework that preserves privacy of published datasets orhidden databases It imposes a strong guarantee of privacyon tuples in statistical databases by adding noise to theprocess of query results However the ranked retrievalmodel outputs ranks of tuples instead of their values oraggregate statistics We cannot directly add noise to rankedresults as the rank of a tuple is determined by not only thetuple itself but also by other tuples in the database Fur-thermore the optimization of the ranked retrieval model hasnot been considered

13 Contributions )is work presents a novel scheme forprivacy-preserving the ranked retrieval model We start withan introduction to the adversary model and introduce ourdefinition of privacy guarantee We identify two categoriesof adversaries based on their prior knowledge and assumethat adversaries can launch optimal inference attacksthrough ranked results

We propose the polymorphic value set (PVS) a privacy-preserving framework for the ranked retrieval model Dif-ferent from existing methods PVS does not directly modifyvalues of tuples or query results Instead PVS enablespolymorphism of private attributes such that a private at-tribute of a tuple can respond to different queries in differentways We prove that our framework meets the privacyguarantee stated in Problem Statement For adversaries withand without prior knowledge we design and implement thepolymorphic value set with true values (PVST) and poly-morphic value set with Virtual Values (PVSV) respectivelyIn the design of PVST and PVSV we consider utility loss inthe ranked retrieval model and propose a practical mea-surement of utility loss We prove that the task of mini-mizing utility loss is NP-hard and present two heuristicalgorithms that implement PVST and PVSV respectivelyWe run our implementations of PVST and PVSV on a real-world dataset from eHarmony [29] that contains 486464tuples)e experiments yield excellent results with respect toprivacy guarantee and utility loss

)e remainder of this paper is organized as followsProblem Statement introduces the adversary model and theprivacy guarantee Privacy-Preserving Framework in-troduces our privacy-preserving framework Frameworkwith Virtual Values presents the design and implementationof PVSV along with our analysis of utility loss )eimplementation of PVST and analysis of utility loss arepresented in Framework with True Values ExperimentalResults contains our experimental evaluation of PVSV andPVST In Conclusions we conclude this paper with asummary of our key contributions and a discussion of someopen problems

2 Problem Statement

21 Ranked Retrieval Model In information retrieval wehave witnessed extensive research in the ranked retrievalmodel Unlike the Boolean retrieval model where only re-sults that exactly match the predicates can be returned theranked retrieval model allows users to retrieve a list of

records sorted by a proprietary ranking function)erefore theranked retrieval model provides an alternative solution forusers seeking results sorted by their relevance to the query

As discussed in the introduction many OSN applica-tions have been using the ranked retrieval model to processincoming queries Upon a query q the ranked retrievalmodel would calculate each tuple trsquos score according to aproprietary score function s(t ∣ q) and return top-k tuples indescending order of their scores )e attributes of tuplescould be either categorical or numerical In this paper weconsider only categorical data which does not limit thescope of our research Actually numerical data can betreated as categorical data by categorizing the numericaldomain into small intervals such that no more than onetuple in the database falls into the same interval Withoutloss of generality we also assume that there is no duplicatetuple that is equal to another tuple in every attribute

We now formalize our ranked retrieval model with cat-egorical attributes Consider an n-tuple database D with mpublic attributes A1 Am and mprime private attributesB1 Bmprime Let VA

i (resp VBj ) denote the value domain of

Ai(resp Bj) Let t[A]i(resp t[B]j) denote the value of t inAi(resp Bj) Upon query q the score function s(t ∣ q)

computes a score for each tuple t isin D )e ranked retrievalmodel will then sort all tuples in D in the descending orderand return them as the ranked result We consider the casewhere the score function is linear )erefore s(t ∣ q) can bedefined as

s(t ∣ q) 1113944m

i1w

Ai ρ q Ai1113858 1113859 t Ai1113858 1113859( 1113857 + 1113944

mprime

j1w

Bi ρ q Bi1113858 1113859 t Bi1113858 1113859( 1113857

(1)

where wAi (resp wB

j ) isin (0 1) is the weight of attributeAi(resp Bj) in the score function and the matching functionρ(q[Ai] t[Ai]) (resp ρ(q[Bi] t[Bi])) indicates if tmatches qin attribute Ai (resp Bj))erefore the value of ρ(β1 β2) is 1if β1 is equal to β2 and the value of ρ(β1 β2) is 0 if β1 is notequal to β2 Note that our ranked retrieval model satisfies themonotonicity and additivity properties defined in [35]

22 AdversaryModel In Motivation we mentioned that wedo not make any assumption about the method adopted byan adversary when attacking a database We also assume thatthe adversary has prior knowledge about the metadata oftables in the database as well as the proprietary rankingfunction Furthermore the adversary is assumed to be ableto issue queries to the ranked retrieval model view rankedresults and insert tuples to the database As a result theadversary is able to retrieve all public attribute values bycrawling the database through the query interface [37] Wedenote the set of queries issued by the adversary as QA theset of tuples inserted by the adversary as IA and the cor-responding set of ranked results as RD RD is fully de-termined by QA and IA given fixed D We name allinformation regarding a tuple t isin D that an adversary canfind in RD as the trace of t and denote the trace of t as Tt )etrace of t includes but not limited to the rank of t and the

Security and Communication Networks 3

relationship between t and any other tuple (eg t has ahigher or lower rank than another tuple tprime) in a rankedresult )erefore given fixed D IA and QA Tt is fully de-termined by the attribute values of t

Another capability of the adversary we model is theadversaryrsquos prior knowledge Consider an extreme case wherethe adversary knows the equivalence relation between two at-tributes B1 and A1 In this case even without RD the adversaryis still able to infer the value of B1 of any t isin D In reality anadversary can acquire such attribute correlations by being orconsulting an expert in the domain of the dataset or by adoptingdata mining methods [42] For example based on the personalinformation (eg gender ethnicity age and blood type whichcan be used to infer the gene) stored as public attributes andpublished in public medical data repositories genetic epide-miologists can generally conclude that an individual does nothave some diseases merely based on the fact that these diseaseswould never be found by the candidate gene among historicmedical datasets )erefore an adversary with prior knowledgecould eliminate the possibility of a certain tuple in the databaseWemodel prior knowledge as a function PK(t ∣ D) that takes asinput t andD and returns either 0 or 1 PK(t ∣ D) 0 indicatesthat given prior knowledge the possibility of t isin D is zeroPK(t ∣ D) 1 indicates that the adversary cannot eliminate thepossibility of t isin D As prior knowledge helps adversaries inlaunching an inference attack adversaries can be partitionedinto two classes adversaries with prior knowledge and adver-saries without prior knowledge of the dataset

Definition 1 We name adversaries without prior knowledgeof the authenticity of any tuple as authenticity-ignorant ad-versaries For authenticity-ignorant adversaries PK(t ∣ D)

always outputs 1 We name adversaries with such priorknowledge as authenticity-knowledgeable adversaries Forauthenticity-knowledgeable adversaries PK(t ∣ D) 1 ift isin D and PK(t ∣ D) 0 if t notin D

)e objective of both classes of adversaries is to maxi-mize the following g(v Bj) value when inferring the value ofvictim tuple v in attribute Bj

g v Bj1113872 1113873 Pr v Bj1113960 1113961 a1113872 1113873 (2)

where Pr(v[Bj] a) is the probability of v[Bj] a and a isthe value inferred by the adversary given prior knowledgeand ranked results

For authenticity-ignorant adversaries PK(t ∣ D) alwaysoutputs 1 regardless of t and D We only assume the caseswhere users input true information to the databases)ereforefor authenticity-knowledgeable adversaries PK(t ∣ D) 1 ift isin D and PK(t ∣ D) 0 if t notin D In this paper we assume astrong adversary that can infer the private attribute values of anarbitrary tuple given the premise that the trace of the targettuple is unique )e premise can be easily met because as longas there is no duplicate tuple in the dataset the adversary canalways construct well-designed IA andQA such that the trace ofthe target tuple is different from the trace of any other tuple)erefore the adversary can always find a such thatPr(v[Bj] a) 100

23ProblemStatement Aprivacy breach can be described by asuccessful inference of a private attribute value in the databaseWe view privacy of v[Bj] as the upper bound on the possibilitythat an adversary succeeds in inferring the value of v[Bj] Notethat we do not make any assumption about the adversaryrsquosattacking method Our objective in this paper is to present aframework that sets an upper bound on the probability ofsuccessful inference of an arbitrary private attribute for tuplet isin D )erefore we define the objective of the framework as

forallt isin D j isin 1 mprime1113864 1113865 g v Bj1113872 1113873le ε (3)

We present the upper bound ϵ as our privacy guaranteeHowever a privacy-preserving framework should pro-

vide not only a privacy guarantee but also a notion ofutilitymdashafter all a framework that removes all private at-tribute values or replaces them with randomly generatedvalues can surely preserve privacy )erefore we use ameasurement based on the variance of ranked results beforeand after adopting our framework Given D and a set of allpossible queries denoted as Q we define the utility loss forour ranked retrieval model as follows

U 1113944tisinD

1113944qisinQ

Rank(t ∣ q) minus Rankprime(t ∣ q)1113868111386811138681113868

1113868111386811138681113868 (4)

where Rank(t ∣ q) and Rankprime(t ∣ q) refer to the ranks oftuple t in the ranked result given query q before and afterapplying our frameworks respectively

3 Privacy-Preserving Framework

)e only information an adversary can obtain from a da-tabase through the ranked retrieval model is public attributevalues and ranked results For an adversary without priorknowledge information regarding private attribute valuescan only be retrieved from ranked results )erefore inorder to preserve privacy we have to modify the rankedretrieval model such that the adversary cannot retrieve anyuseful information about private attributes from rankedresults

An idea is to group different tuples together in rankedresults As in our prior work [20] we can group two tuples v1and v2 together such that they share the same rank in anyranked result )is can be achieved by adopting a newranking function sprime(t ∣ q) such that sprime(v1 ∣ q) sprime(v2 ∣ q) forall q If v1 and v2 have different values on every privateattributes then the adversary is unable to infer the privatevalues of v1 since v1 and v2 are indistinguishable in anyranked results However this method suffers from highutility loss In order to preserve the privacy of all privateattributes v1 and v2 have to be different over all privateattributes)us the original scores of v1 and v2 ie s(v1 ∣ q)

and s(v2 ∣ q) differ a lot which leads to a higher variancebetween the rank of v1 before and after adopting thismethod

In this work we present a novel framework that pre-serves privacy of private attributes while minimizes theutility loss We observe that for a tuple vrsquos private attributev[Bj] if there are at least two potential values for v[Bj] and

4 Security and Communication Networks

an adversary cannot differentiate any one of them then theprivacy of v[Bj] can be preserved For instance if theprobability of v[Bj] a is equal to the probability ofv[Bj] aprime given ranked results and prior knowledge thenthe adversary cannot exclude any one of them If both theprobabilities are 50 then the adversary may choose torandomly pick a value from a and aprime as the inferred result Inthis case g(v Bj) will not exceed 50 and privacy of v[Bj]

can be preserved To prove this statement suppose that v isan arbitrary tuple in databaseD and we want to preserve theprivacy of v[Bj] Let β

Bj be an arbitrary value in VB

j v[Bj]1113966 1113967We construct tuple vprime such that vprime and v differ in only oneattribute Bj vprime[Bj] βB

j We also construct databaseDprime suchthat D and Dprime differ in only one tuple v isin D while vprime isin DprimeWe define a new score function sprime(t ∣ q)

sprime(t ∣ q) 1113944m

i1w

Ai ρ q Ai1113858 1113859 t Ai1113858 1113859( 1113857 + 1113944

mprime

j1w

Bi ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873

(5)

where

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 if tne v and tne vprime

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 Max ρ q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 vprime Bj1113960 11139611113872 11138731113872 1113873

(6)

if t v or t vprimeImagine a case where an adversary queriesD and Dprime with

the same query workload QA We denote the ranked resultsfrom D as RD and the ranked results from Dprime as RDprime As inthe new score function sprime(v ∣ q) sprime(vprime ∣ q) for forallq isin QA RD

is identical to RDprime )erefore given only ranked results theadversary cannot tell the difference between the two data-bases being queried Furthermore even if we exchange thevalues of v and vprime the adversary still cannot observe anychange inRD or RDprime As a result the value of v[Bj] and β

Bj are

equivalent from the perspective of the adversary and theprivacy of v[Bj] can be well preserved Intuitively v can beseen as a tuple that has two polymorphic forms in Bj v[Bj]

and βBj When calculating the score of v with sprime(t ∣ q) we

always choose the value that can maximize sprime(v ∣ q)We can further extend the statement to a more general

case For each tuple v isin D and each private attribute Bj wecan select e distinct values β1 β2 βe from VB

j v[Bj]1113966 1113967elt |VB

j | minus 1 )e new score function can be defined as

sprime(t ∣ q) 1113944m

i1w

Ai ρ q Ai1113858 1113859 t Ai1113858 1113859( 1113857 + 1113944

mprime

j1w

Bi ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873

(7)

whereρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 if tne v

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 Max ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 β11113872 11138731113872

ρ q Bj1113960 1113961 βe1113872 11138731113873 if t v

(8)

Consider e tuples v(1)prime v(e)prime which are identical to v

in all attributes except for Bj Let v(i)prime[Bj] be βi

foralli 1 e )en for forallq we have sprime(v ∣ q) sprime(v(i)prime ∣ q)As a result from the adversaryrsquos perspective there are e + 1potential values for v[Bj] v[Bj] β1 βe which are in-distinguishable from each other from any ranked results)eprivacy of v[Bj] can be preserved by grouping it with eequivalent values

As such we introduce the construction of the poly-morphic value set (PVS) We put v[Bj] into a set in which allvalues are indistinguishable when calculating the score withrespect to Bj in the ranking function ie ρprime(q[Bj] v[Bj])We name the above set as the polymorphic value set anddenote the polymorphic value set of tuple v in attribute Bj asP

Bj

v We define PBj

v as follows

Definition 2 PBj

v is a set containing all indistinguishablevalues of tuple vrsquos private attribute Bj Assigning v[Bj] withan arbitrary value in P

Bj

v will not change the value ofsprime(v ∣ q)forallq ie ρprime(q[Bj] v[Bj]) ρprime(q[Bj] β)forallβ isin P

Bj

v

and forallq

Since the adversary cannot distinguish different values inP

Bj

v by launching any inference attacks based only on rankedresults the privacy guarantee of v[Bj] is

ε 1

PBj

v

11138681113868111386811138681113868

11138681113868111386811138681113868

(9)

In order to achieve the privacy guarantee defined in (3)for each v isin D and each private attribute Bj we have toensure that (a) there is one and only one polymorphic valueset P

Bj

v for vrsquos attribute Bj and (b) the privacy guaranteedefined in (9) is always valid

4 Framework with Virtual Values

41 Design In this section we introduce how polymorphicvalue sets can be constructed with generated values to meetthe privacy guarantee in (9) against authenticity-ignorantadversaries We name values generated by our framework asvirtual values

As we proved in Privacy-Preserving Framework anauthenticity-ignorant adversary cannot distinguish the valueof v[Bj] from other valid values in P

Bj

v given ranked resultsSince an adversary without prior knowledge cannot validatethe authenticity of any value in P

Bj

v all values in PBj

v areldquovalidrdquo in the perspective of the adversary no matter if theyare generated by our framework or collected from real datain D )erefore we observe that P

Bj

v can be formed by anyvalues in VB

j In order to achieve the privacy guarantee of ε we have to

ensure that |PBj

v |gt (1ε) forallv isin D and forallj isin 1 mprimeAn intuitive algorithm to generate P

Bj

v of size 1ε is torandomly pick (1ε) minus 1 values from VB

j v[Bj] Specifically

let the initial PBj

v v[Bj]1113966 1113967 )en we can randomly pick

distinct values fromVBj v[Bj] and insert them intoP

Bj

v untilPBj

v

contains at least (1ε) distinct values In the same manner wecan construct a polymorphic value set for each tuplersquos eachprivate attribute

Security and Communication Networks 5

411 Privacy Guarantee For a databaseDwhere every tuplevrsquos every private attribute Bj is included by one polymorphicvalue set with virtual values (PVSV) whose size is at least l ifthe adversary has no prior knowledge of D a privacy level ofε (1l) is achieved

For an authenticity-ignorant adversary PK(v ∣ D) 1forallv As proved in Privacy-Preserving Framework for anauthenticity-ignorant adversary it is impossible to distin-guish v[Bj] with at least l minus 1 other values )us we haveg(v Bj)le (1l) for forallv isin D and forallj isin 1 mprime Accordingto equation (3) a privacy guarantee of (1l) can be achieved

42 Utility Optimization In this section we discuss how toreduce utility loss caused by polymorphic value sets Weintroduced a metric of utility loss in (4) that calculates thesum of difference in ranked results given all possible queriesTo practically calculate utility loss we limit the range ofqueries to a finite set named query workload In practice thequery workload of a database D can be a set of queries thatare more frequently issued than any other queries A queryworkload may contain duplicate queries which reflect thedistribution of frequent queries With a query workloaddenoted as Q we can define the practical utility loss as

UQ 1113944qisinQ

1113944visinD

Rank(v ∣ q) minus Rankprime(v ∣ q)1113868111386811138681113868

1113868111386811138681113868 (10)

In order to reduce utility loss we have to find assign-ments of P

Bj

v for forallv isin D and forallj isin 1 mprime such that theoverall UQ can be minimized Without loss of generality weonly consider constructing polymorphic value sets of size 2In this case the privacy guarantee is 12 For each v[Bj] weneed to find a value that is indistinguishable from v[Bj] Wedenote the polymorphic value of v[Bj] as vprime[Bj]

Definition 3 We define the 2-PVSV problem as follows givendatabase D and query workload Q find a polymorphic valuevprime[Bj] from VB

j v[Bj]1113966 1113967 for each v[Bj] forallv isin D andforallj isin 1 mprime such that UQ defined in (10) is minimized

Theorem 1 4e 2-PVSV problem is NP-hard

)e proof of )eorem 1 in detail can be found in Ap-pendix A

43 Heuristic Algorithm We have proved that the 2-PVSVproblem is an NP-hard problem that may not be solved inpolynomial time)erefore we propose PVSV-Constructora heuristic algorithm that can return an approximate so-lution in polynomial time

We observe that |s(v ∣ q) minus sprime(v ∣ q)| is relevant to|Rank(v ∣ q) minus Rankprime(v ∣ q)| A smaller difference betweens(v ∣ q) and sprime(v| q) leads to a smaller difference betweenRank(v ∣ q) and Rankprime(v ∣ q) )erefore |Rank(v ∣ q)minus

Rankprime(v ∣ q)| can be approximately minimized by a solutionthat minimizes |s(v ∣ q) minus sprime(v ∣ q)| As such we propose anapproximation of UQ that calculates the score differencebefore and after adopting our framework We denote thescore difference as US

Q and define USQ as follows

USQ 1113944

qisinQ1113944visinD

s(v ∣ q) minus sprime(v ∣ q)1113868111386811138681113868

1113868111386811138681113868 (11)

As shown in Algorithm 1 given input database D thenumber of public and private attributes m and mprime re-spectively query workload Q and privacy guarantee ϵPVSV-Constructor constructs an equivalent value set foreach v isin D and j isin 1 mprime that minimizes US

Q Recall thedefinition of sprime(t| q) in (7) and we have

USQ 1113944

visinD1113944

mprime

j11113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

(12)

We denote the score difference of v[Bj] contributed byP

Bj

v as U(PBj

v )

U PBj

v1113874 1113875 1113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

(13)

)erefore USQ is the sum of U(P

Bj

v ) over all tuples andprivate attributes

USQ 1113944

visinD1113944

mprime

j1U P

Bj

v1113874 1113875 (14)

Since the construction of each PBj

v is independent fromother polymorphic value sets we can minimize US

Q by mini-

mizing the score difference contributed by each PBj

v forforallj isin 1 mprime1113864 1113865 and forallv isin D Note that |ρprime(q[Bj] v[Bj]) minus

ρ(q[Bj] v[Bj])| 0 if q[Bj] v[Bj] or q[Bj] notin PBj

v Alsonote that |ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| 1 when

q[Bj]ne v[Bj] and q[Bj] isin PBj

v )erefore if forallq isin Q

v[Bj] q[Bj] then any assignment of PBj

v cannot contribute

to a higher U(PBj

v ) since ρprime(q[Bj] v[Bj]) 1 ρ(q[Bj]

v[Bj]) 1 and |ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| is always

zero In this situation U(PBj

v ) 0 On the contrary if existq isin Qv[Bj]ne q[Bj] then ρ(q[Bj] v[Bj]) 0 and thus|ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| ρprime(q[Bj] v[Bj])

According to equation (13) the value of U(PBj

v ) is

1113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

1113944

qisinQq Bj1113858 1113859nev Bj1113858 1113859

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873

(15)

In order to minimize U(PBj

v ) we have to find an as-signment of P

Bj

v which minimizes |1113864q isin Q ∣ q[Bj]ne

v[Bj]existβ isin PBj

v q[Bj] β1113865| Consider the simplest case

where we want to construct PBj

v of size 2 v[Bj] β1113966 1113967 We have

6 Security and Communication Networks

1113944

qisinQq Bj1113858 1113859nev Bj1113858 1113859

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 w

Bj q isin Q ∣ q Bj1113960 1113961 β1113966 111396711138681113868111386811138681113868

11138681113868111386811138681113868

(16)

In this case β has to be a value in VBj v[Bj] such that β

has the lowest frequency among all q[Bj] values for q isin QNote that β does not have to be an element of q[Bj] ∣1113966

q isin Q If β notin q[Bj] ∣ q isin Q1113966 1113967 its frequency is 0 Similarly if we

want to construct PBj

v of size k then we should insert the k minus 1least frequent values among all q[Bj] values into P

Bj

v In line 4 of Algorithm 1 we initialize the polymorphic

value set of v[Bj] by inserting v[Bj] into PBj

v For v[Bj] aset Set is constructed from VB

j v[Bj]1113966 1113967 which containsall values that can be inserted into P

Bj

v In order tominimize U(P

Bj

v ) we insert the k minus 1 least frequentvalues from Set into P

Bj

v by their frequencies inq[Bj] ∣ q isin Q1113966 1113967 )e above construction of P

Bj

v is repeatedfor each t isin D and j isin 1 mprime )e computationalcomplexity of Algorithm 1 is O(|D| middot |Q| middot mprime)

5 Framework with True Values

51 Authenticity-Knowledgeable Adversaries As we men-tioned in the adversary model an authenticity-knowl-edgeable adversary is able to tell if t isin D is possible As aresult the authenticity-knowledgeable adversary can launcha more efficient attack on private attributes by examining theauthenticity of values learned from RA We show a simplecase where an authenticity-knowledgeable adversary breaksthe privacy guarantee provided by polymorphic values setsconstructed with virtual values Consider that the objectiveof the adversary is to infer the value of v[B0] and B0 is theonly private attribute in D PB0

v is the polymorphic value setgenerated for v[B0] and |PB0

v | 1ε Without prior knowl-edge Pv[B0] ε and the privacy guarantee is achievedHowever for a value β isin PB0

v v[B0]1113864 1113865 if the adversary canconclude that PK(vprime ∣ D) 0 for any tuple vprime such that vprime isequal to v in all public attributes and vprime[B0] β then theadversary can exclude β from PB0

v )erefore

g v B01113858 1113859( 1113857 ε

1 minus εgt ε (17)

and equation (9) no longer holdsAs shown above if values in PB0

v are marked by anadversary as invalid for v given PK(vprime|D) then the adversarycan successfully break the privacy guarantee defined inframework with virtual values

52 Design In this section we propose polymorphic valuesets with true values (PVST) that construct polymorphicvalue sets with values that cannot be excluded by PK(t ∣ D)PVST considers nontrivial prior knowledge of adversariesand presents the same degree of privacy guarantee in-troduced in (9)

We have shown above that the implementation withvirtual values can be compromised by adversaries with priorknowledge Consider a tuple v isin D Given privacy guaranteeϵ we construct mprime polymorphic values sets PB0

v PB

mprimev for

each private attributes of v where |PBj

v |ge 1ε Let set MAv be

MAv v A01113858 11138591113864 1113865 times middot middot middot times v Am1113858 11138591113864 1113865 times P

B0v times middot middot middot times P

Bmprimev (18)

MAv is the Cartesian product of sets each of which

contains vrsquos all equivalent values in an attribute For publicattribute Ai the corresponding set is v[Ai]1113864 1113865 since publicattribute values are open to the adversary For private at-tribute Bj the corresponding set is P

Bj

v )erefore MAv

contains all possible tuples that are indistinguishable with v

with respect to RA (including v itself )With prior knowledge on PK(t ∣ D) an adversary is able

to exclude a value β from PBj

v if

forallt isin t ∣ t isinMAv t Bj1113960 1113961 β1113966 1113967PK(t ∣ D) 0 (19)

As described above it is safe for the adversary to con-clude that βne v[Bj] if there is no t isinMA

v such that t[Bj] βand PK(t ∣ D) 1 Alternatively if for every β in P

Bj

v thereis a t such that t[Bj] β and PK(t ∣ D) 1 then the ad-

versary cannot exclude any value in PBj

v and thereforePv[Bj]le ε is guaranteed Since PK(t ∣ D) 1 iff t isin D weconstruct polymorphic value sets with true values fromt isin D

Definition 4 If there exists l distinct values β1 βl isin PBj

v

such that forallr isin 1 l βr holds the following property

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv isin D and forallj isin 1 mprime1113864 1113865

(1) k (1ε) minus 1(2) for t in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) Set VBj v[Bj]1113966 1113967

(6) Sort elements of Set by their frequencies in q[Bj]| q isin Q1113966 1113967 in ascending order(7) Remove the last |Set| minus k + 1 elements in Set(8) P

Bj

v PBj

v cup Set(9) end(10) end

ALGORITHM 1 PVSV-Constructor

Security and Communication Networks 7

existt isinMAv t Bj1113960 1113961 βr and t isin D (20)

)en we say that PBj

v covers l true values We denoteT

Bj

v β1 βl1113864 1113865 as the true value set of t in BjPrivacy guarantee for a database D where every tuplersquos

every private attribute is included by one equivalent value setwhich covers at least l true values a privacy level of ε 1l isachieved

Assume that the adversaryrsquos objective is to infer the valueof v[Bj] As mentioned in the adversary model we make noassumption on the attacking methods adopted by an ad-versary Consider MA

v defined in (18) forallt tprime isinMAv and

forallq isin QA s(t ∣ q) s(tprime ∣ q) )erefore the adversary cannotdistinguish tuples in MA

v by observing RA In this situationthe adversary would use PK(t ∣ D) to exclude all tuple t inMA

v such that PK(t ∣ D) 0 However as PBj

v covers l truevalues there exists l tuples t1 tl isinMA

v such thatPK(t ∣ D) 1 and tr[Bj]ne ts[Bj] for arbitraryr s isin 1 l As a result values in t1[Bj] tl[Bj]1113966 1113967 areindistinguishable given RA and PK(t ∣ D) )us Pv[Bj] 1lAccording to (3) a privacy level of 1l is achieved

53 Utility Optimization An intuitive method of con-structing P

Bj

v is to insert the value of Bj of all tuples that sharethe same public attribute values with v into P

Bj

v ie PBj

v

PBj

v 1113864vprime[Bj]|foralli and vprime isin D vprime[Ai] v[Ai]1113865 )e privacyguarantee is met if P

Bj

v covers at least 1ε true valuesNevertheless utility loss cannot be ignored as in the intuitivemethod s(vprime| q) would be the same for all vprime isin 1113864vprime| foralliand vprime isin D vprime[Ai] v[Ai]1113865 and thus information of privateattributes are missing in the ranked result of vprime )ereforethe size of P

Bj

v is critical in balancing privacy and utilityWith no loss of generality we limit the size of each poly-morphic value set to 2 We show that constructing suchpolymorphic value sets is an NP-hard problem

Definition 5 We define the 2-PVST problem as followsgiven database D and a query workload Q construct P

Bj

v ofsize 2 for each tuple v isin D forallj isin 1 mprime1113864 1113865 and minimizeUQ defined in (10)

Theorem 2 4e 2-PVST problem is NP-hard

)e proof of )eorem 2 can be found in Appendix B

54 Heuristic Algorithm We have shown that the 2-PVSTproblem is NP-hard In this subsection we present PVST-Constructor a heuristic algorithm that constructs PVSTwithin polynomial time PVST-Constructor tries to mini-mize |P

Bj

v | and 1113936qisinQ|sprime(v ∣ q) minus s(v ∣ q)| for each v isin D

and j isin 1 mprime with the greedy algorithm)e pseudo-code of PVST-Constructor is shown in

Algorithm 2 In lines 2 to 6 we initialize each PBj

v withv[Bj] )en for each v isin D PVST-Constructor constructsPB1

v PBmprime

v by finding a vprime with the greedy algorithm andinserting vprime[Bj] into P

Bj

v forallj isin 1 mprime)e above process

will be taken multiple times until |PBj

v |ge 1ε Since every vprime isa real tuple existing in D and we insert at least k tuples in theabove processes P

Bj

v covers at least k + 1 true values andthus the privacy guarantee is met As mentioned in UtilityOptimization the size of P

Bj

v is critical in minimizing utilityloss )erefore the heuristic algorithm tries to minimize theutility loss by minimizing the size of each P

Bj

v We count thenumber of polymorphic value sets of v if the set contains lessthan k values that can be enlarged by inserting vprimersquos privateattribute values We denote this count as covervprime

covervprime j ∣ PBj

v

11138681113868111386811138681113868

11138681113868111386811138681113868lt k vprime Bj1113960 1113961 notin PBj

v1113882 1113883

1113868111386811138681113868111386811138681113868

1113868111386811138681113868111386811138681113868 (21)

Tuple vprime with a higher covervprime value can enlarge the sizeof more polymorphic value sets of v and thus we can reducethe number of tuples that we have to insert into PB1

v PBmprime

v Furthermore we take into consideration queries in Q In

order to minimize UQ we have to minimize |sprime(v ∣ q) minus

s(v ∣ q)| for q isin Q Note that for each attribute Bj|ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| 1 if v[Bj]ne q[Bj] andq[Bj] isin P

Bj

v We denote the value of 1113936j|ρprime(q[Bj] v[Bj]) minus

ρ(q[Bj] v[Bj])| as lossvprime )us we have

lossvprime 1113944qisinQ

1113944

j1mprime

ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

1113944qisinQ

1113944

j1mprime

δ v vprime q j( 1113857

(22)where δ(v vprime q j) 1 if v[Bj]ne q[Bj] and vprime[Bj] q[Bj]

and δ(v vprime q j) 0 otherwise )erefore tuple vprime with asmaller lossvprime can reduce the value of 1113936j|ρprime(q[Bj]

v[Bj]) minus ρ(q[Bj] v[Bj])| and thus we can reduce the valueof UQ

)e computation of covervprime and lossvprime is done in line 11)en we compute scorevprime the score of vprime that indicates howpreferable vprime is relative to other tuples in D v Weadopt the greedy algorithm to find the next vprime for PB1

v

PB

mprimev ie in each iteration we choose the tuple

that has the highest score value In line 15 we insert theprivate attribute values of the chosen tuple (denoted as vmax)into PB1

v PB

mprimev )e above process is repeated until

the sizes of PB1v P

Bmprime

v are no less than k

6 Experimental Results

61 Experimental Setup To validate PVSV-Constructor andPVST-Constructor algorithms we conducted experiments on areal world dataset [29] from eHarmony which contains 58attributes and 486464 tuples We removed 5 noncategoricalattributes and randomly picked 20 categorical attributes fromthe remaining 53 attributes )e domain sizes of the 20 at-tributes range from 2 to 15 After removing duplicate tuples werandomly picked 300000 tuples as our testing bed

By default we use the ranking function from the rankedretrieval model with all weights set to 1 All experimentalresults were obtained on a Mac machine running Mac OS

8 Security and Communication Networks

with 8GB of RAM )e algorithms were implemented inPython

62 Privacy )e privacy guarantee of PVSV-Constructorand PVST-Constructor were tested by performing RankedInference attack [35] including Point-Query In-QueryPoint-QueryampInsert and In-QueryampInsert attackingmethods on the dataset From a total of 20 attributes 5attributes were randomly chosen as public attributes andanother 5 attributes were randomly chosen as private at-tributes We randomly picked 20000 distinct tuples fromthe dataset as the testing bed and randomly generated 10tuples as the query workload For PVSV-Constructor weconstructed a polymorphic value set of size 2 for eachprivate attribute and each tuple in the testing bed ForPVST-Constructor we constructed a polymorphic valueset that covers at least two true values for each privateattribute and each tuple We randomly picked 1000 tuplesfrom the testing bed as our targets and performed 1000Rank Inference attacks (250 attacks for each of the fourmethods) on the five private attributes of target tuples Wemeasured the attack success guess rates based on the fre-quency of successful inference among all inference at-tempts Figure 1 shows the success guess rates of RankInference attacks on the unprotected testing bed the testingbed with PVSV and the testing bed with PVST As the sizeof each polymorphic value set is 2 the success guess rateson PVSV are around 50 which are significantly lowerthan those of unprotected dataset We also observe that thesuccess guess rates on PVSTare slightly lower than those of

PVSV )e reason is that PVSV-Constructor will beinserting values into tuple vrsquos polymorphic value sets untilall vrsquos polymorphic value sets cover at least 2 true values)us some polymorphic value sets of v may contain morethan 2 values

63 Utility In this subsection we quantify utility loss ofPVSV-Constructor and PVST-Constructor algorithms)e privacy guarantee of both PVSV-Constructor andPVST-Constructor is 12 ie the polymorphic value setsconstructed by PVSV-Constructor contains 2 values andthe polymorphic value sets constructed by PVST-Con-structor contains at least 2 true values )e key param-eters here are the size of query workload Q the size ofdatabase D the number of public and private attributesand the weight ratios in the ranking function We ran-domly generated 20 tuples as the query workload Bydefault we picked 10 tuples from Q and set |D| 300 000m 10 and mprime 10 )erefore we randomly picked 10attributes from the testing bed and set them as publicattributes )e rest of the 10 attributes were set as privateattributes

Many recommendation systems of ONS applicationsfeature top-k recommendation [1 14 47] where the rankedresult contains a set of k tuples that will be of interest to acertain user as it is impractical and unnecessary to return alltuples in the database to the user )erefore we introduceaverage top-k utility loss Uaul a variant of UQ that focuses onutility loss of the top-k tuples in a ranked result Uaul isdefined as

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv and forallBj

(1) k (1ε) minus 1(2) for v in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) end(6) end(7) for v in D do(8) while existj isin 1 mprime1113864 1113865 |P

Bj

v |lt k + 1 do(9) for vprime isin t isin D|forallAi t[Ai] v[Ai]1113864 1113865 do(10) compute covervprime lossvprime(11) scorevprime covervprime 11 + exp(minus lossvprime )

(12) end(13) vmax argmaxvprimeisin |tisinD|forallAit[Ai]v[Ai] scorevprime

(14) for j isin 1 mprime1113864 1113865 do

(15) PBj

v PBj

v cup tmax[Bj]1113966 1113967

(16) end(17) end(18) end

ALGORITHM 2 PVST-Constructor

Security and Communication Networks 9

Uaul 1

|Q|1113944

|Q|

i1

1Dprime

11138681113868111386811138681113868111386811138681113868

1113944

visinDprime

min Rank(v ∣ q) k + 11113864 1113865 minus min Rankprime(v ∣ q) k + 11113864 11138651113868111386811138681113868

1113868111386811138681113868

k (23)

where Dprime v isin D |Rank(v | q))lt k or Rank(v | q))lt k1113864 1113865Intuitively Uaul represents the average percentage rankdifference relative to k over all queries and all tuples in top-kUaul is equivalent to UQ|Q||D|2 when k |D| By default weset k 100

631 Evaluation of Uaul with Varying k We first discuss theaverage top-k utility loss of PVSV-Constructor PVST-Constructor and a baseline algorithm on varying k withother parameters set to default values For a tuple vrsquos at-tribute Bj the baseline algorithm constructs P

Bj

v with v[Bj]

and a value randomly picked from VBj v[Bj]1113966 1113967 )e results

are presented in Figure 2 which shows that the Uaul ofPVSV-Constructor is significantly lower than that of thebaseline algorithm )e Uaul of PVST-Constructor is alsolower than that of the baseline algorithm when kge 5 eventhough the baseline algorithm cannot preserve privacyagainst authenticity-knowledgeable adversaries )e exper-imental results show that both PVSV and PVST can reduceutility loss with respect to rank differences Also note thatPVST-Constructor constructs polymorphic value sets withtrue values and thus from Figure 3 we can see that somepolymorphic value sets constructed by PVST-Constructorcontains more than 2 values

632 Evaluation of Uaul with Varying Sizes of QFigure 4 presents the average top-k utility loss of PVSV-Constructor on varying |Q| When |Q| is set to 1 5 or 10 werandomly picked 1 5 or 10 queries from the original Qrespectively With increasing number of queries in the queryworkload the Uaul of PVSV-Constructor increases mono-tonically )e reason is that PVSV-Constructor alwaysgenerates the least frequent value (denoted as vprime[Bj]) in1113864q[Bj] | q isin Q1113865 for v[Bj] If |Q| is small then it is possiblethat vprime[Bj] notin 1113864q[Bj] | q isin Q1113865 and thus s(v ∣ q) sprime(v ∣ q)

forallq isin Q However a larger query workload covers moreprivate attributes values and thus it will be harder forPVSV-Constructor to generate a value for v[Bj] that has noimpact on the rank of v for any q isin Q Figure 5 presents theaverage top-k utility loss of PVST-Constructor on varying|Q| We can see that the size of Q has no significant impacton the Uaul of PVST-Constructor as the PVST-Constructoralways pick a tuple vprime that is different from v in most at-tributes and then insert vprime[Bj] into P

Bj

v

633 Evaluation of Uaul with Varying Sizes of the DatasetFigures 6 and 7 depict the impact of the size of datasets onUaul of PVSV-Constructor and PVST-Constructor Datasetsof 100000 and 200000 tuples were randomly sampled from

QPoint QIn QIPoint QIIn

Atta

ck su

cces

s gue

ss ra

te

Without protectionPVSVPVST

1

09

08

07

06

05

04

03

02

01

0

Figure 1 Attack success guess

10 Security and Communication Networks

the testing bed of 300000 tuples As expected |D| has nosignificant impact on the Uaul of PVSV-Constructor sincePVSV-Constructor generates values from the domain ofeach private attributes |D| has no impact on the Uaul ofPVST-Constructor which indicates that a dataset contain-ing 100000 tuples is sufficient for PVST-Constructor togenerate polymorphic value sets with true values

634 Evaluation of Uaul with Varying m We investigate theimpact of the number of private and public attributes on

average top-k utility loss Figure 8 presents theUaul of PVSV-Constructor with fixed mprime and varying m and with fixed mand varying mprime When mmprime is set to 5 we randomly re-moved 5 attributes from the testing bed When mmprime is set to

0 20 40 60 80 100k

RandomPVSVPVST

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 2 Utility loss vs k

1 2 3 4 5+Size of PVST

3

25

2

15

1

05

0

Coun

t 10

5

Figure 3 Size of polymorphic value set (PVST)

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

00 10 20

Figure 4 Utility loss vs |Q|

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

00 10 20

Figure 5 Utility loss vs |Q|

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 6 Utility loss vs |D|

Security and Communication Networks 11

15 we added 5 more categorical attributes randomly chosenfrom the unused attributes We observe that the Uaulmonotonically decreases with increasing number of publicattributes and monotonically increases with increasingnumber of private attributes As expected a higher pro-portion of public attributes leads to less variant betweens(v ∣ q) and sprime(v ∣ q) e results of the same experimentwith PVST-Constructor are shown in Figure 9 Uaul in-creases as increasing number of private attributes as ex-pected However Uaul also increases slightly with increasingnumber of public attributes is is due to the fact that withmore public attributes there will be few tuples that share thesame public attribute values Since PVST-Constructor in-serts only private attribute values from tuples sharing samepublic attribute values more values will be inserted to eachpolymorphic value set which introduces higher utility loss

635 Evaluation of Uaul with Varying Weight RatiosFigures 10 and 11 illustrate the impact of weight ratios on theaverage utility loss e experiment was conducted with a

xed private attribute weight of 1 and varying public at-tribute weights of 1 2 and 3 As expected Uaul of bothPVSV-Constructor and PVST-Constructor decreases as

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 10 Utility loss vs weight ratio

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 11 Utility loss vs weight ratio

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss1

09

08

07

06

05

04

03

02

01

0

Figure 7 Utility loss vs |D|

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 8 Utility loss vs m and mprime

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 9 Utility loss vs m and mprime

12 Security and Communication Networks

increasing weight ratio of public attributes)e reason is thatas the public attribute weight increases the part in sprime(v ∣ q)

caused by private attributes decreases )erefore less impactwould be made to sprime(v ∣ q) by PVSV and PVST As a resultsprime(v ∣ q) would be closer to s(v ∣ q) and the utility loss couldbe decreased

7 Conclusions

In this paper we proposed a novel framework that preservesprivacy of private attributes against arbitrary attacks throughthe ranked retrieval model Furthermore we identify twocategories of adversaries based on varying adversarial ca-pabilities For each kind of adversaries we presentedimplementation of our framework Our experimental resultssuggest that our implementations efficiently preserve privacyagainst Rank Inference attack [35] Moreover the imple-mentations significantly reduce utility loss with respect tothe variance in ranked results

It is our hope that this paper can motivate further re-search in privacy preservation of SIoT with consideration ofsocial network features andor variant information retrievalmodels eg text mining

Appendix

A 2-PVSV

In this subsection we prove that constructing an optimal PVSVfor each attribute of a tuple v is an NP-hard problem

Definition A1 For a tuple v in D we create PBj

v for each BjWe say v satisfies query q if the following hold for any tuple t(tne v) in D (1) If s(v ∣ q)lt s(t ∣ q) then sprime(v ∣ q)lt sprime(t ∣ q)and (2) if s(v ∣ q)ge s(t ∣ q) then sprime(v ∣ q)ge sprime(t ∣ q)

For a database containing 2 tuples the 2-PVSV problemcan be redefined as follows given a query workload Q adatabase D construct PB1

v PBmprime

v of size 2 such that v

satisfies the most queries in Q

Definition A2 Max-3Sat Problem given a 3-CNF formulaΦ find the truth assignment that satisfies that most clauses

Lemma A1 Max-3Sat leP 2-PVSV Problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVSV instance Without loss of generality we suppose thatΦ is a conjunction of l clauses and each clause is a dis-junction of 3 literals from set X x1 xn1113864 1113865 We constructdatabase D as follows D has 0 public attributes and n + 1private attributes B1 Bn C1 Let VB

j be the attribute do-main of Bj j isin 1 n and VC

1 be the attribute domain ofC1 Let VB

i minus 1 0 1 2 n + 2 and VC1 0 1 Two

tuples v1 and v2 are inserted into D v1[Bj] 0 and v2[Bj]

j + 2 for j isin 1 n while v1[C1] 0 and v2[C1] 1 Wesimplify the score function defined in (1) by setting all weightsto 1 Also note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null

We construct query workload Q based on Φ For eachclause Lk isin Φ we construct a query qk such that qk[Bi] i iffthe corresponding literal xi is a positive literal in Lk andq[Bi] i + 1 iff xi is a negative literal in the clause Forexample given a clause (x1orx2orx3) the correspondingquery q should satisfy q[B1] 1 q[B2] 3 q[B3] 3 andq[C1] 0 All other attributes in q are set to null by default)erefore s(v1 ∣ q) 1 and s(v2 ∣ q) 0 forallq isin Q

Since s(v2 ∣ q)lt s(v1 ∣ q) forallq isin Q in order to minimizeutility loss the value of sprime(v2 ∣ q) should be as small aspossible We observe that the minimum possible sprime(v2 ∣ q) is1 because PC1

v2must be 0 1 as VC

1 0 1 Without loss ofgenerality we assume that we have already constructedPB1

v2 P

Bmprime

v2 PC1v2

for v2 such that sprime(v2 ∣ q) 1 forallq isin QNow we have an instance of 2-PVSV problem that given

D Q constructs PB1v1

PBmprime

v1 PC1

v1that satisfies the most

queries in Q Since VC1 0 1 PC1

v1must be 0 1 too as

PC1v1

contains two distinct values )erefore for an ar-bitrary query q isin Q we have ρprime(q[C] v1[C]) 1 andρprime(q[C] v2[C]) 1 In order to let v1 satisfy q we have toensure that sprime(v1 ∣ q)gt sprime(v2 ∣ q) 1 )us we have

sprime v1 ∣ q( 1113857gt 1

⟺1113944mprime

i1ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857gt 1

⟺existi isin 1 mprime1113864 1113865 ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857 1

⟺existi isin 1 mprime1113864 1113865 q Bi1113858 1113859 isin PBi

v1

(A1)

Now we show that the solution of 2-PVSV problemconstructed above can answer the corresponding Max-3Sat Problem Suppose that we have the solution PB1

v1

PB

mprimev1 such that v1 satisfies the most queries in Q As in

(A1) if v1 satisfies qk we have qk[Bi] isin PBiv1 If v1 does not

satisfy qk then qk[Bi] notin PBiv1 Recall that we assign value i or

i + 1 to qk[Bi] if xi is a positive or negative literal in clauseLk respectively )erefore the assignment of xi given PBi

v1is

xi true if i isin PBi

v1

xi false if i + 1 isin PBi

v1

(A2)

Furthermore from equation (A1) we have

sprime v1 ∣ q( 1113857gt 1⟺L true (A3)

where L is qrsquos corresponding clause in Φ Note thati i + 1 nsubQ as |PBi

v1| 2 and 0 isin PBi

v1 )us the value of xi is

either true or falseAs we proved above v1 satisfies qi if and only if Li is true

given assignment x1 xl1113864 1113865 constructed according toequation (A2) )erefore x1 xl1113864 1113865 satisfies the mostclauses in ϕ if and only if v1 satisfies the most queries in Q

We now prove that function f can be conducted inpolynomial time Given a formula Φ with n variables and lclauses we construct a 2-PVSV instance with 2 tuples each

Security and Communication Networks 13

of which has n + 1 attributes and l queries each of which has4 attributes )erefore O(2n + 4l) assignments are neededand f can be conducted in polynomial time

Proof of 4eorem 1 We now prove that the 2-PVSVproblem is NP-hard In Lemma 3 we proved that Max-3SatProblem can be reduced to the 2-PVSV problem in poly-nomial time Furthermore as Max-3Sat Problem is a NP-hard problem [33] the 2-PVSV problem is NP-hard

B 2-PVST

In this section we prove that constructing an optimal PVSTfor each private attribute of a tuple v isin D is NP-hard We usethe definition of v satisfying q from (9))e 2-PVSTproblemcan be redefined as follows

Definition B1 2-PVSTproblem given a query workload Qa database D the optimization problem of 2-PVST is toconstruct an arbitrary tuple vrsquos polymorphic setsPB1

v PBmprime

v that satisfies the most queries in Q )e size ofeach polymorphic vale set is 2

Lemma B1 Max-3Sat le P 2-PVST problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVST instance We assume that Φ is a conjunction of lclauses and each clause is a disjunction of 3 literals from setX x1 xn1113864 1113865 Let D have no public attribute and n + 1private attributes B1 Bn C1 For each literal xi we inserttwo tuples vi and vi

prime into D where

vi C11113858 1113859 1 vi Bi1113858 1113859 1 vi Bj1113960 1113961 minus 1foralljne i

viprime C11113858 1113859 1 vi

prime Bi1113858 1113859 minus 1 viprime Bj1113960 1113961 1foralljne i

(B1)

)en we insert tuple v into D where v[C1] 0 andv[Bj] 0 forallj isin 1 mprime We also set the domain of eachBj as minus 1 0 1 and the domain of C1 as 0 1

Without loss of generality the score function defined in(1) is simplified by setting all weights to 1

Next we construct query workload Q For each clauseLk isin Φ we construct a query qk in which qk[Bj] minus 1 iff xj isa positive literal in Lk and qk[Bj] 1 iff xj is a negativeliteral in Lk We also set qk[C1] 1 )e rest of the attributevalues are set to null by default )erefore ifLk (x1orne x2orx3) then we have qk[C1] 1 qk[B1] 1qk[B2] 1 qk[B3] minus 1 and qk[Bj] null forj isin 4 mprime

Note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null )uss(v qk) 0 forallq isin Q and s(vi qk) s(vi

prime qk)ge 1 forallq isin Q Weobserve that for q isin Q and i isin 1 n s(v ∣ q)lt s(vi ∣ q)

and s(v ∣ q)lt s(viprime ∣ q) In order to reduce UQ we have to

maximize the number of q isin Q such that sprime(v ∣ q)lt sprime(vi ∣ q)

and sprime(v ∣ q)lt sprime(viprime ∣ q) )e maximum value of sprime(vi ∣ q)

and sprime(viprime ∣ q) is 4 which can be achieved by inserting vi[Bj]

into PBj

viprime and inserting vi

prime[Bj] into PBj

vi Since PC1

v 0 1 thevalue of sprime(v ∣ q) is at least 1 )erefore we have

sprime(v ∣ q)lt sprime vi ∣ q( 1113857

⟺sprime(v ∣ q)lt 4

⟺1113944

mprime

j1ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873lt 3

⟺existj isin 1 mprime ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 0

⟺existj isin 1 mprime v Bj1113960 1113961ne q Bj1113960 1113961

⟺existj isin 1 mprime minus q Bj1113960 1113961 isin PBj

v

(B2)

Since |PBj

v | is limited to 2 PBj

v 0 1 or 0 minus 1 ie wemust insert either virsquos or vi

primersquos private attribute values intoPB1

v PBmv Recall that we assign minus 1 or 1 to qk[Bj] if xj

is positive or negative respectively in Lk In order toreduce Max-3Sat to 2-PSVT we assign literals in the fol-lowing way

xj true if minus 1 isin PBj

v

xj false if 1 isin PBj

v (B3)

)erefore we denote the corresponding clause of q as Land from equation (B2) we have

existj isin 1 mprime1113864 1113865 minus q Bj1113960 1113961 isin PBj

v

⟺existj isin 1 mprime1113864 1113865 xj true if xj is positive in L

orxj false if xj is negative in L

⟺L true(B4)

From the above equation we observe that v satisfying qk

is equivalent to Lk being true)erefore we can reduceMax-3Sat to 2-PVST Given a solution to the 2-PVSTproblem thatmaximizes the number of queries satisfied by v assignmentx1 xn1113864 1113865 produced by equation (B3) can satisfy thelargest number of clauses in Φ

Given a Max-3Sat instance Φ the construction of the 2-PVST instance can be conducted in polynomial time as weconstruct n queries with 3 attributes and 2n + 1 tuples with nattributes)e transformation from the optimal solution of 2-PVST to the optimal solution of Max-3Sat also takes poly-nomial time as we assign the value to each one of x1 xn

once )erefore f is a polynomial time function

Proof of 4eorem 2 In 4 we proved that a Max-3Sat in-stance can be reduced to a 2-PSVT instance in polynomialtime Furthermore as Max-3Sat Problem is an NP-hardproblem the 2-PVSV problem is an NP-hard problem

Data Availability

)e data used to support the findings of this study areavailable from the corresponding author upon request

14 Security and Communication Networks

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 2: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

private attributes from profiling pages and ranked resultsUsers believe that their privacy is well protected since privateattributes are invisible to the public in their profiles or anyranked results However Rahman et al [35] proposed RankInference and showed that privacy of private attributes in theranked retrieval model is not guaranteed In their approachthe value of a private attribute can be inferred through aranked retrieval interface given the premises that the domainof the private attribute is finite and that the ranking functionhas both monotonicity property and additivity property[35] To be more specific given monotonicity and additivityconditions an adversary is always able to find a pair ofdifferential queries qθ and qθprime such that (a) qθ and qθprime sharethe same predicate on all attributes except for a privateattribute B1 (b) the value of B1 in qθprime is θwhile the value of B1in qθ is not and (c) the ranked result of qθ contains thevictim tuple v while the ranked result of qθprime does not Rahmanet al showed that given the above conditions the adversarywas able to conclude that the value of vrsquos B1 is not equal to θFurthermore the adversary was able to infer the value of vrsquosB1 by finding more differential queries and excluding morevalues from the domain of B1 as long as the domain is finite

Privacy issues arise when private attributes of users aretaken into account in the ranked retrieval model Intuitivelythe issues can be solved by removing private attributes fromranking functions which decreases the utility of many OSNfeatures the recommendation system cannot provide ac-curate results From OSN providersrsquo perspective removingprivate attributes is not a practical solution )erefore thiswork aims not only to address the issue of rank inference butalso to propose a framework that preserves the privacy ofusers against all inference attacks through the ranked re-trieval model while minimizes utility loss

12 Related Work Encryption technologies have been usedthroughout history in security and privacy preservationSearching on encrypted data [10 38] has been introduced toensure data privacy Cao et al [9] proposed a scheme thatallows privacy-preserving ranked search over encrypteddata A query consisting of multiple keywords is conductedby searching over encrypted documents with securek-nearest neighbor (kNN) technique Chen et al [11] tookinto consideration correlations between documents beforeconducting a search query and achieved better performanceVertical fragmentation [13 15 16 22] has been applied toencrypted data which hides identities of users by separatingidentifier attributes with descriptive attributes Howeverencrypted searching is developed to preserve privacy againstadversaries in a cloud computing environment Adversariescan still access decrypted searching results from whichprivate attribute values can be inferred

As ONS has been emerging as an important source forbig data many studies have been carried out for privacy-preserving data mining (PPDM) and privacy-preservingdata publishing (PPDP) Perturbative methods implementthe ldquocamouflagerdquo paradigm where original data are directlymodified [28] Agrawal et al [3] proposed an algorithm thatperturbs data with random additive noise Liu et al proposed

data perturbation with multiplicative noise However ran-dom noise has predictable structures in the spectral domainand thus privacy provided by additive noise is questionable[25] Furthermore additive or multiplicative noise can onlybe applied to numerical data Data swapping approaches[19 30 31] perturb data by swapping values between recordsthat are close to each other Other distance-based ap-proaches include [12] in which data points are perturbedwithout changing their relevant closeness relationships and[2] in which data points are clustered and each data pointrsquosvalue is replaced by the value of the cluster center Howeverthose approaches rely on a universal measurement ofcloseness between data points in multidimensional spaceFurthermore they are limited by the distributions of datapoints

Generalization is the process of replacing a group ofvalues with a more general value that can represent thegroup Suppression is the ultimate state of generalizationsuch that the representative value is ldquonot applicablerdquo and as aresult the group of values is removed from the dataset [45]k-anonymity [40 41] is a widely studied approach thatpreserves privacy of records by grouping at least k recordsinto an equivalence class )e attribute values of the k re-cords are suppressed so that the k records are in-distinguishable from adversaries Machanavajjhala et al [27]proposed L-diversity that focuses on attribute privacyL-diversity forces each equivalent class to have at least ldifferent values for each attribute Li et al [26] proposedT-closeness that further considers the distribution of at-tribute values T-closeness sets a threshold for the variancebetween the distribution of a private attribute in eachequivalence class and the distribution of the same attributein the entire dataset However those approaches are de-veloped to preserve privacy of published data while theranked retrieval model of ONS does not directly revealprivate attributes to the public Furthermore suppression ofprivate attribute values introduces unnecessary utility loss tothe ranked retrieval model Equivalent Set [20] was proposedto preserve privacy against inference attacks through theranked retrieval model )is approach groups differenttuples into a set such that they are indistinguishable inranked results However this approach requires that tuplesin the same equivalent set have different values in everyprivate attribute and have the same value in every publicattribute Assume that a kNN query q is sent to a rankedretrieval interface protected by Equivalent Set and that atuple t is equal to q in most private attributes Since the othertuples in the same equivalent set of t are different from t inevery private attribute they must be different from q in mostprivate attributes too )erefore the original rank of t givenq should be much higher than that of any other tuple in thesame equivalent set In this case the rank of t given q will besignificantly lowered by Equivalent Set in order to achieveindistinguishability which reduces the accuracy of theranked result of q Furthermore it is possible that there is notprime such that tprime is different from t in every private attribute andis same with t in every public attribute In this case we haveto suppress the private attribute values of t which furtherintroduces utility loss

2 Security and Communication Networks

Differential privacy [8 17 18] is another widely studiedframework that preserves privacy of published datasets orhidden databases It imposes a strong guarantee of privacyon tuples in statistical databases by adding noise to theprocess of query results However the ranked retrievalmodel outputs ranks of tuples instead of their values oraggregate statistics We cannot directly add noise to rankedresults as the rank of a tuple is determined by not only thetuple itself but also by other tuples in the database Fur-thermore the optimization of the ranked retrieval model hasnot been considered

13 Contributions )is work presents a novel scheme forprivacy-preserving the ranked retrieval model We start withan introduction to the adversary model and introduce ourdefinition of privacy guarantee We identify two categoriesof adversaries based on their prior knowledge and assumethat adversaries can launch optimal inference attacksthrough ranked results

We propose the polymorphic value set (PVS) a privacy-preserving framework for the ranked retrieval model Dif-ferent from existing methods PVS does not directly modifyvalues of tuples or query results Instead PVS enablespolymorphism of private attributes such that a private at-tribute of a tuple can respond to different queries in differentways We prove that our framework meets the privacyguarantee stated in Problem Statement For adversaries withand without prior knowledge we design and implement thepolymorphic value set with true values (PVST) and poly-morphic value set with Virtual Values (PVSV) respectivelyIn the design of PVST and PVSV we consider utility loss inthe ranked retrieval model and propose a practical mea-surement of utility loss We prove that the task of mini-mizing utility loss is NP-hard and present two heuristicalgorithms that implement PVST and PVSV respectivelyWe run our implementations of PVST and PVSV on a real-world dataset from eHarmony [29] that contains 486464tuples)e experiments yield excellent results with respect toprivacy guarantee and utility loss

)e remainder of this paper is organized as followsProblem Statement introduces the adversary model and theprivacy guarantee Privacy-Preserving Framework in-troduces our privacy-preserving framework Frameworkwith Virtual Values presents the design and implementationof PVSV along with our analysis of utility loss )eimplementation of PVST and analysis of utility loss arepresented in Framework with True Values ExperimentalResults contains our experimental evaluation of PVSV andPVST In Conclusions we conclude this paper with asummary of our key contributions and a discussion of someopen problems

2 Problem Statement

21 Ranked Retrieval Model In information retrieval wehave witnessed extensive research in the ranked retrievalmodel Unlike the Boolean retrieval model where only re-sults that exactly match the predicates can be returned theranked retrieval model allows users to retrieve a list of

records sorted by a proprietary ranking function)erefore theranked retrieval model provides an alternative solution forusers seeking results sorted by their relevance to the query

As discussed in the introduction many OSN applica-tions have been using the ranked retrieval model to processincoming queries Upon a query q the ranked retrievalmodel would calculate each tuple trsquos score according to aproprietary score function s(t ∣ q) and return top-k tuples indescending order of their scores )e attributes of tuplescould be either categorical or numerical In this paper weconsider only categorical data which does not limit thescope of our research Actually numerical data can betreated as categorical data by categorizing the numericaldomain into small intervals such that no more than onetuple in the database falls into the same interval Withoutloss of generality we also assume that there is no duplicatetuple that is equal to another tuple in every attribute

We now formalize our ranked retrieval model with cat-egorical attributes Consider an n-tuple database D with mpublic attributes A1 Am and mprime private attributesB1 Bmprime Let VA

i (resp VBj ) denote the value domain of

Ai(resp Bj) Let t[A]i(resp t[B]j) denote the value of t inAi(resp Bj) Upon query q the score function s(t ∣ q)

computes a score for each tuple t isin D )e ranked retrievalmodel will then sort all tuples in D in the descending orderand return them as the ranked result We consider the casewhere the score function is linear )erefore s(t ∣ q) can bedefined as

s(t ∣ q) 1113944m

i1w

Ai ρ q Ai1113858 1113859 t Ai1113858 1113859( 1113857 + 1113944

mprime

j1w

Bi ρ q Bi1113858 1113859 t Bi1113858 1113859( 1113857

(1)

where wAi (resp wB

j ) isin (0 1) is the weight of attributeAi(resp Bj) in the score function and the matching functionρ(q[Ai] t[Ai]) (resp ρ(q[Bi] t[Bi])) indicates if tmatches qin attribute Ai (resp Bj))erefore the value of ρ(β1 β2) is 1if β1 is equal to β2 and the value of ρ(β1 β2) is 0 if β1 is notequal to β2 Note that our ranked retrieval model satisfies themonotonicity and additivity properties defined in [35]

22 AdversaryModel In Motivation we mentioned that wedo not make any assumption about the method adopted byan adversary when attacking a database We also assume thatthe adversary has prior knowledge about the metadata oftables in the database as well as the proprietary rankingfunction Furthermore the adversary is assumed to be ableto issue queries to the ranked retrieval model view rankedresults and insert tuples to the database As a result theadversary is able to retrieve all public attribute values bycrawling the database through the query interface [37] Wedenote the set of queries issued by the adversary as QA theset of tuples inserted by the adversary as IA and the cor-responding set of ranked results as RD RD is fully de-termined by QA and IA given fixed D We name allinformation regarding a tuple t isin D that an adversary canfind in RD as the trace of t and denote the trace of t as Tt )etrace of t includes but not limited to the rank of t and the

Security and Communication Networks 3

relationship between t and any other tuple (eg t has ahigher or lower rank than another tuple tprime) in a rankedresult )erefore given fixed D IA and QA Tt is fully de-termined by the attribute values of t

Another capability of the adversary we model is theadversaryrsquos prior knowledge Consider an extreme case wherethe adversary knows the equivalence relation between two at-tributes B1 and A1 In this case even without RD the adversaryis still able to infer the value of B1 of any t isin D In reality anadversary can acquire such attribute correlations by being orconsulting an expert in the domain of the dataset or by adoptingdata mining methods [42] For example based on the personalinformation (eg gender ethnicity age and blood type whichcan be used to infer the gene) stored as public attributes andpublished in public medical data repositories genetic epide-miologists can generally conclude that an individual does nothave some diseases merely based on the fact that these diseaseswould never be found by the candidate gene among historicmedical datasets )erefore an adversary with prior knowledgecould eliminate the possibility of a certain tuple in the databaseWemodel prior knowledge as a function PK(t ∣ D) that takes asinput t andD and returns either 0 or 1 PK(t ∣ D) 0 indicatesthat given prior knowledge the possibility of t isin D is zeroPK(t ∣ D) 1 indicates that the adversary cannot eliminate thepossibility of t isin D As prior knowledge helps adversaries inlaunching an inference attack adversaries can be partitionedinto two classes adversaries with prior knowledge and adver-saries without prior knowledge of the dataset

Definition 1 We name adversaries without prior knowledgeof the authenticity of any tuple as authenticity-ignorant ad-versaries For authenticity-ignorant adversaries PK(t ∣ D)

always outputs 1 We name adversaries with such priorknowledge as authenticity-knowledgeable adversaries Forauthenticity-knowledgeable adversaries PK(t ∣ D) 1 ift isin D and PK(t ∣ D) 0 if t notin D

)e objective of both classes of adversaries is to maxi-mize the following g(v Bj) value when inferring the value ofvictim tuple v in attribute Bj

g v Bj1113872 1113873 Pr v Bj1113960 1113961 a1113872 1113873 (2)

where Pr(v[Bj] a) is the probability of v[Bj] a and a isthe value inferred by the adversary given prior knowledgeand ranked results

For authenticity-ignorant adversaries PK(t ∣ D) alwaysoutputs 1 regardless of t and D We only assume the caseswhere users input true information to the databases)ereforefor authenticity-knowledgeable adversaries PK(t ∣ D) 1 ift isin D and PK(t ∣ D) 0 if t notin D In this paper we assume astrong adversary that can infer the private attribute values of anarbitrary tuple given the premise that the trace of the targettuple is unique )e premise can be easily met because as longas there is no duplicate tuple in the dataset the adversary canalways construct well-designed IA andQA such that the trace ofthe target tuple is different from the trace of any other tuple)erefore the adversary can always find a such thatPr(v[Bj] a) 100

23ProblemStatement Aprivacy breach can be described by asuccessful inference of a private attribute value in the databaseWe view privacy of v[Bj] as the upper bound on the possibilitythat an adversary succeeds in inferring the value of v[Bj] Notethat we do not make any assumption about the adversaryrsquosattacking method Our objective in this paper is to present aframework that sets an upper bound on the probability ofsuccessful inference of an arbitrary private attribute for tuplet isin D )erefore we define the objective of the framework as

forallt isin D j isin 1 mprime1113864 1113865 g v Bj1113872 1113873le ε (3)

We present the upper bound ϵ as our privacy guaranteeHowever a privacy-preserving framework should pro-

vide not only a privacy guarantee but also a notion ofutilitymdashafter all a framework that removes all private at-tribute values or replaces them with randomly generatedvalues can surely preserve privacy )erefore we use ameasurement based on the variance of ranked results beforeand after adopting our framework Given D and a set of allpossible queries denoted as Q we define the utility loss forour ranked retrieval model as follows

U 1113944tisinD

1113944qisinQ

Rank(t ∣ q) minus Rankprime(t ∣ q)1113868111386811138681113868

1113868111386811138681113868 (4)

where Rank(t ∣ q) and Rankprime(t ∣ q) refer to the ranks oftuple t in the ranked result given query q before and afterapplying our frameworks respectively

3 Privacy-Preserving Framework

)e only information an adversary can obtain from a da-tabase through the ranked retrieval model is public attributevalues and ranked results For an adversary without priorknowledge information regarding private attribute valuescan only be retrieved from ranked results )erefore inorder to preserve privacy we have to modify the rankedretrieval model such that the adversary cannot retrieve anyuseful information about private attributes from rankedresults

An idea is to group different tuples together in rankedresults As in our prior work [20] we can group two tuples v1and v2 together such that they share the same rank in anyranked result )is can be achieved by adopting a newranking function sprime(t ∣ q) such that sprime(v1 ∣ q) sprime(v2 ∣ q) forall q If v1 and v2 have different values on every privateattributes then the adversary is unable to infer the privatevalues of v1 since v1 and v2 are indistinguishable in anyranked results However this method suffers from highutility loss In order to preserve the privacy of all privateattributes v1 and v2 have to be different over all privateattributes)us the original scores of v1 and v2 ie s(v1 ∣ q)

and s(v2 ∣ q) differ a lot which leads to a higher variancebetween the rank of v1 before and after adopting thismethod

In this work we present a novel framework that pre-serves privacy of private attributes while minimizes theutility loss We observe that for a tuple vrsquos private attributev[Bj] if there are at least two potential values for v[Bj] and

4 Security and Communication Networks

an adversary cannot differentiate any one of them then theprivacy of v[Bj] can be preserved For instance if theprobability of v[Bj] a is equal to the probability ofv[Bj] aprime given ranked results and prior knowledge thenthe adversary cannot exclude any one of them If both theprobabilities are 50 then the adversary may choose torandomly pick a value from a and aprime as the inferred result Inthis case g(v Bj) will not exceed 50 and privacy of v[Bj]

can be preserved To prove this statement suppose that v isan arbitrary tuple in databaseD and we want to preserve theprivacy of v[Bj] Let β

Bj be an arbitrary value in VB

j v[Bj]1113966 1113967We construct tuple vprime such that vprime and v differ in only oneattribute Bj vprime[Bj] βB

j We also construct databaseDprime suchthat D and Dprime differ in only one tuple v isin D while vprime isin DprimeWe define a new score function sprime(t ∣ q)

sprime(t ∣ q) 1113944m

i1w

Ai ρ q Ai1113858 1113859 t Ai1113858 1113859( 1113857 + 1113944

mprime

j1w

Bi ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873

(5)

where

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 if tne v and tne vprime

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 Max ρ q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 vprime Bj1113960 11139611113872 11138731113872 1113873

(6)

if t v or t vprimeImagine a case where an adversary queriesD and Dprime with

the same query workload QA We denote the ranked resultsfrom D as RD and the ranked results from Dprime as RDprime As inthe new score function sprime(v ∣ q) sprime(vprime ∣ q) for forallq isin QA RD

is identical to RDprime )erefore given only ranked results theadversary cannot tell the difference between the two data-bases being queried Furthermore even if we exchange thevalues of v and vprime the adversary still cannot observe anychange inRD or RDprime As a result the value of v[Bj] and β

Bj are

equivalent from the perspective of the adversary and theprivacy of v[Bj] can be well preserved Intuitively v can beseen as a tuple that has two polymorphic forms in Bj v[Bj]

and βBj When calculating the score of v with sprime(t ∣ q) we

always choose the value that can maximize sprime(v ∣ q)We can further extend the statement to a more general

case For each tuple v isin D and each private attribute Bj wecan select e distinct values β1 β2 βe from VB

j v[Bj]1113966 1113967elt |VB

j | minus 1 )e new score function can be defined as

sprime(t ∣ q) 1113944m

i1w

Ai ρ q Ai1113858 1113859 t Ai1113858 1113859( 1113857 + 1113944

mprime

j1w

Bi ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873

(7)

whereρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 if tne v

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 Max ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 β11113872 11138731113872

ρ q Bj1113960 1113961 βe1113872 11138731113873 if t v

(8)

Consider e tuples v(1)prime v(e)prime which are identical to v

in all attributes except for Bj Let v(i)prime[Bj] be βi

foralli 1 e )en for forallq we have sprime(v ∣ q) sprime(v(i)prime ∣ q)As a result from the adversaryrsquos perspective there are e + 1potential values for v[Bj] v[Bj] β1 βe which are in-distinguishable from each other from any ranked results)eprivacy of v[Bj] can be preserved by grouping it with eequivalent values

As such we introduce the construction of the poly-morphic value set (PVS) We put v[Bj] into a set in which allvalues are indistinguishable when calculating the score withrespect to Bj in the ranking function ie ρprime(q[Bj] v[Bj])We name the above set as the polymorphic value set anddenote the polymorphic value set of tuple v in attribute Bj asP

Bj

v We define PBj

v as follows

Definition 2 PBj

v is a set containing all indistinguishablevalues of tuple vrsquos private attribute Bj Assigning v[Bj] withan arbitrary value in P

Bj

v will not change the value ofsprime(v ∣ q)forallq ie ρprime(q[Bj] v[Bj]) ρprime(q[Bj] β)forallβ isin P

Bj

v

and forallq

Since the adversary cannot distinguish different values inP

Bj

v by launching any inference attacks based only on rankedresults the privacy guarantee of v[Bj] is

ε 1

PBj

v

11138681113868111386811138681113868

11138681113868111386811138681113868

(9)

In order to achieve the privacy guarantee defined in (3)for each v isin D and each private attribute Bj we have toensure that (a) there is one and only one polymorphic valueset P

Bj

v for vrsquos attribute Bj and (b) the privacy guaranteedefined in (9) is always valid

4 Framework with Virtual Values

41 Design In this section we introduce how polymorphicvalue sets can be constructed with generated values to meetthe privacy guarantee in (9) against authenticity-ignorantadversaries We name values generated by our framework asvirtual values

As we proved in Privacy-Preserving Framework anauthenticity-ignorant adversary cannot distinguish the valueof v[Bj] from other valid values in P

Bj

v given ranked resultsSince an adversary without prior knowledge cannot validatethe authenticity of any value in P

Bj

v all values in PBj

v areldquovalidrdquo in the perspective of the adversary no matter if theyare generated by our framework or collected from real datain D )erefore we observe that P

Bj

v can be formed by anyvalues in VB

j In order to achieve the privacy guarantee of ε we have to

ensure that |PBj

v |gt (1ε) forallv isin D and forallj isin 1 mprimeAn intuitive algorithm to generate P

Bj

v of size 1ε is torandomly pick (1ε) minus 1 values from VB

j v[Bj] Specifically

let the initial PBj

v v[Bj]1113966 1113967 )en we can randomly pick

distinct values fromVBj v[Bj] and insert them intoP

Bj

v untilPBj

v

contains at least (1ε) distinct values In the same manner wecan construct a polymorphic value set for each tuplersquos eachprivate attribute

Security and Communication Networks 5

411 Privacy Guarantee For a databaseDwhere every tuplevrsquos every private attribute Bj is included by one polymorphicvalue set with virtual values (PVSV) whose size is at least l ifthe adversary has no prior knowledge of D a privacy level ofε (1l) is achieved

For an authenticity-ignorant adversary PK(v ∣ D) 1forallv As proved in Privacy-Preserving Framework for anauthenticity-ignorant adversary it is impossible to distin-guish v[Bj] with at least l minus 1 other values )us we haveg(v Bj)le (1l) for forallv isin D and forallj isin 1 mprime Accordingto equation (3) a privacy guarantee of (1l) can be achieved

42 Utility Optimization In this section we discuss how toreduce utility loss caused by polymorphic value sets Weintroduced a metric of utility loss in (4) that calculates thesum of difference in ranked results given all possible queriesTo practically calculate utility loss we limit the range ofqueries to a finite set named query workload In practice thequery workload of a database D can be a set of queries thatare more frequently issued than any other queries A queryworkload may contain duplicate queries which reflect thedistribution of frequent queries With a query workloaddenoted as Q we can define the practical utility loss as

UQ 1113944qisinQ

1113944visinD

Rank(v ∣ q) minus Rankprime(v ∣ q)1113868111386811138681113868

1113868111386811138681113868 (10)

In order to reduce utility loss we have to find assign-ments of P

Bj

v for forallv isin D and forallj isin 1 mprime such that theoverall UQ can be minimized Without loss of generality weonly consider constructing polymorphic value sets of size 2In this case the privacy guarantee is 12 For each v[Bj] weneed to find a value that is indistinguishable from v[Bj] Wedenote the polymorphic value of v[Bj] as vprime[Bj]

Definition 3 We define the 2-PVSV problem as follows givendatabase D and query workload Q find a polymorphic valuevprime[Bj] from VB

j v[Bj]1113966 1113967 for each v[Bj] forallv isin D andforallj isin 1 mprime such that UQ defined in (10) is minimized

Theorem 1 4e 2-PVSV problem is NP-hard

)e proof of )eorem 1 in detail can be found in Ap-pendix A

43 Heuristic Algorithm We have proved that the 2-PVSVproblem is an NP-hard problem that may not be solved inpolynomial time)erefore we propose PVSV-Constructora heuristic algorithm that can return an approximate so-lution in polynomial time

We observe that |s(v ∣ q) minus sprime(v ∣ q)| is relevant to|Rank(v ∣ q) minus Rankprime(v ∣ q)| A smaller difference betweens(v ∣ q) and sprime(v| q) leads to a smaller difference betweenRank(v ∣ q) and Rankprime(v ∣ q) )erefore |Rank(v ∣ q)minus

Rankprime(v ∣ q)| can be approximately minimized by a solutionthat minimizes |s(v ∣ q) minus sprime(v ∣ q)| As such we propose anapproximation of UQ that calculates the score differencebefore and after adopting our framework We denote thescore difference as US

Q and define USQ as follows

USQ 1113944

qisinQ1113944visinD

s(v ∣ q) minus sprime(v ∣ q)1113868111386811138681113868

1113868111386811138681113868 (11)

As shown in Algorithm 1 given input database D thenumber of public and private attributes m and mprime re-spectively query workload Q and privacy guarantee ϵPVSV-Constructor constructs an equivalent value set foreach v isin D and j isin 1 mprime that minimizes US

Q Recall thedefinition of sprime(t| q) in (7) and we have

USQ 1113944

visinD1113944

mprime

j11113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

(12)

We denote the score difference of v[Bj] contributed byP

Bj

v as U(PBj

v )

U PBj

v1113874 1113875 1113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

(13)

)erefore USQ is the sum of U(P

Bj

v ) over all tuples andprivate attributes

USQ 1113944

visinD1113944

mprime

j1U P

Bj

v1113874 1113875 (14)

Since the construction of each PBj

v is independent fromother polymorphic value sets we can minimize US

Q by mini-

mizing the score difference contributed by each PBj

v forforallj isin 1 mprime1113864 1113865 and forallv isin D Note that |ρprime(q[Bj] v[Bj]) minus

ρ(q[Bj] v[Bj])| 0 if q[Bj] v[Bj] or q[Bj] notin PBj

v Alsonote that |ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| 1 when

q[Bj]ne v[Bj] and q[Bj] isin PBj

v )erefore if forallq isin Q

v[Bj] q[Bj] then any assignment of PBj

v cannot contribute

to a higher U(PBj

v ) since ρprime(q[Bj] v[Bj]) 1 ρ(q[Bj]

v[Bj]) 1 and |ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| is always

zero In this situation U(PBj

v ) 0 On the contrary if existq isin Qv[Bj]ne q[Bj] then ρ(q[Bj] v[Bj]) 0 and thus|ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| ρprime(q[Bj] v[Bj])

According to equation (13) the value of U(PBj

v ) is

1113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

1113944

qisinQq Bj1113858 1113859nev Bj1113858 1113859

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873

(15)

In order to minimize U(PBj

v ) we have to find an as-signment of P

Bj

v which minimizes |1113864q isin Q ∣ q[Bj]ne

v[Bj]existβ isin PBj

v q[Bj] β1113865| Consider the simplest case

where we want to construct PBj

v of size 2 v[Bj] β1113966 1113967 We have

6 Security and Communication Networks

1113944

qisinQq Bj1113858 1113859nev Bj1113858 1113859

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 w

Bj q isin Q ∣ q Bj1113960 1113961 β1113966 111396711138681113868111386811138681113868

11138681113868111386811138681113868

(16)

In this case β has to be a value in VBj v[Bj] such that β

has the lowest frequency among all q[Bj] values for q isin QNote that β does not have to be an element of q[Bj] ∣1113966

q isin Q If β notin q[Bj] ∣ q isin Q1113966 1113967 its frequency is 0 Similarly if we

want to construct PBj

v of size k then we should insert the k minus 1least frequent values among all q[Bj] values into P

Bj

v In line 4 of Algorithm 1 we initialize the polymorphic

value set of v[Bj] by inserting v[Bj] into PBj

v For v[Bj] aset Set is constructed from VB

j v[Bj]1113966 1113967 which containsall values that can be inserted into P

Bj

v In order tominimize U(P

Bj

v ) we insert the k minus 1 least frequentvalues from Set into P

Bj

v by their frequencies inq[Bj] ∣ q isin Q1113966 1113967 )e above construction of P

Bj

v is repeatedfor each t isin D and j isin 1 mprime )e computationalcomplexity of Algorithm 1 is O(|D| middot |Q| middot mprime)

5 Framework with True Values

51 Authenticity-Knowledgeable Adversaries As we men-tioned in the adversary model an authenticity-knowl-edgeable adversary is able to tell if t isin D is possible As aresult the authenticity-knowledgeable adversary can launcha more efficient attack on private attributes by examining theauthenticity of values learned from RA We show a simplecase where an authenticity-knowledgeable adversary breaksthe privacy guarantee provided by polymorphic values setsconstructed with virtual values Consider that the objectiveof the adversary is to infer the value of v[B0] and B0 is theonly private attribute in D PB0

v is the polymorphic value setgenerated for v[B0] and |PB0

v | 1ε Without prior knowl-edge Pv[B0] ε and the privacy guarantee is achievedHowever for a value β isin PB0

v v[B0]1113864 1113865 if the adversary canconclude that PK(vprime ∣ D) 0 for any tuple vprime such that vprime isequal to v in all public attributes and vprime[B0] β then theadversary can exclude β from PB0

v )erefore

g v B01113858 1113859( 1113857 ε

1 minus εgt ε (17)

and equation (9) no longer holdsAs shown above if values in PB0

v are marked by anadversary as invalid for v given PK(vprime|D) then the adversarycan successfully break the privacy guarantee defined inframework with virtual values

52 Design In this section we propose polymorphic valuesets with true values (PVST) that construct polymorphicvalue sets with values that cannot be excluded by PK(t ∣ D)PVST considers nontrivial prior knowledge of adversariesand presents the same degree of privacy guarantee in-troduced in (9)

We have shown above that the implementation withvirtual values can be compromised by adversaries with priorknowledge Consider a tuple v isin D Given privacy guaranteeϵ we construct mprime polymorphic values sets PB0

v PB

mprimev for

each private attributes of v where |PBj

v |ge 1ε Let set MAv be

MAv v A01113858 11138591113864 1113865 times middot middot middot times v Am1113858 11138591113864 1113865 times P

B0v times middot middot middot times P

Bmprimev (18)

MAv is the Cartesian product of sets each of which

contains vrsquos all equivalent values in an attribute For publicattribute Ai the corresponding set is v[Ai]1113864 1113865 since publicattribute values are open to the adversary For private at-tribute Bj the corresponding set is P

Bj

v )erefore MAv

contains all possible tuples that are indistinguishable with v

with respect to RA (including v itself )With prior knowledge on PK(t ∣ D) an adversary is able

to exclude a value β from PBj

v if

forallt isin t ∣ t isinMAv t Bj1113960 1113961 β1113966 1113967PK(t ∣ D) 0 (19)

As described above it is safe for the adversary to con-clude that βne v[Bj] if there is no t isinMA

v such that t[Bj] βand PK(t ∣ D) 1 Alternatively if for every β in P

Bj

v thereis a t such that t[Bj] β and PK(t ∣ D) 1 then the ad-

versary cannot exclude any value in PBj

v and thereforePv[Bj]le ε is guaranteed Since PK(t ∣ D) 1 iff t isin D weconstruct polymorphic value sets with true values fromt isin D

Definition 4 If there exists l distinct values β1 βl isin PBj

v

such that forallr isin 1 l βr holds the following property

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv isin D and forallj isin 1 mprime1113864 1113865

(1) k (1ε) minus 1(2) for t in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) Set VBj v[Bj]1113966 1113967

(6) Sort elements of Set by their frequencies in q[Bj]| q isin Q1113966 1113967 in ascending order(7) Remove the last |Set| minus k + 1 elements in Set(8) P

Bj

v PBj

v cup Set(9) end(10) end

ALGORITHM 1 PVSV-Constructor

Security and Communication Networks 7

existt isinMAv t Bj1113960 1113961 βr and t isin D (20)

)en we say that PBj

v covers l true values We denoteT

Bj

v β1 βl1113864 1113865 as the true value set of t in BjPrivacy guarantee for a database D where every tuplersquos

every private attribute is included by one equivalent value setwhich covers at least l true values a privacy level of ε 1l isachieved

Assume that the adversaryrsquos objective is to infer the valueof v[Bj] As mentioned in the adversary model we make noassumption on the attacking methods adopted by an ad-versary Consider MA

v defined in (18) forallt tprime isinMAv and

forallq isin QA s(t ∣ q) s(tprime ∣ q) )erefore the adversary cannotdistinguish tuples in MA

v by observing RA In this situationthe adversary would use PK(t ∣ D) to exclude all tuple t inMA

v such that PK(t ∣ D) 0 However as PBj

v covers l truevalues there exists l tuples t1 tl isinMA

v such thatPK(t ∣ D) 1 and tr[Bj]ne ts[Bj] for arbitraryr s isin 1 l As a result values in t1[Bj] tl[Bj]1113966 1113967 areindistinguishable given RA and PK(t ∣ D) )us Pv[Bj] 1lAccording to (3) a privacy level of 1l is achieved

53 Utility Optimization An intuitive method of con-structing P

Bj

v is to insert the value of Bj of all tuples that sharethe same public attribute values with v into P

Bj

v ie PBj

v

PBj

v 1113864vprime[Bj]|foralli and vprime isin D vprime[Ai] v[Ai]1113865 )e privacyguarantee is met if P

Bj

v covers at least 1ε true valuesNevertheless utility loss cannot be ignored as in the intuitivemethod s(vprime| q) would be the same for all vprime isin 1113864vprime| foralliand vprime isin D vprime[Ai] v[Ai]1113865 and thus information of privateattributes are missing in the ranked result of vprime )ereforethe size of P

Bj

v is critical in balancing privacy and utilityWith no loss of generality we limit the size of each poly-morphic value set to 2 We show that constructing suchpolymorphic value sets is an NP-hard problem

Definition 5 We define the 2-PVST problem as followsgiven database D and a query workload Q construct P

Bj

v ofsize 2 for each tuple v isin D forallj isin 1 mprime1113864 1113865 and minimizeUQ defined in (10)

Theorem 2 4e 2-PVST problem is NP-hard

)e proof of )eorem 2 can be found in Appendix B

54 Heuristic Algorithm We have shown that the 2-PVSTproblem is NP-hard In this subsection we present PVST-Constructor a heuristic algorithm that constructs PVSTwithin polynomial time PVST-Constructor tries to mini-mize |P

Bj

v | and 1113936qisinQ|sprime(v ∣ q) minus s(v ∣ q)| for each v isin D

and j isin 1 mprime with the greedy algorithm)e pseudo-code of PVST-Constructor is shown in

Algorithm 2 In lines 2 to 6 we initialize each PBj

v withv[Bj] )en for each v isin D PVST-Constructor constructsPB1

v PBmprime

v by finding a vprime with the greedy algorithm andinserting vprime[Bj] into P

Bj

v forallj isin 1 mprime)e above process

will be taken multiple times until |PBj

v |ge 1ε Since every vprime isa real tuple existing in D and we insert at least k tuples in theabove processes P

Bj

v covers at least k + 1 true values andthus the privacy guarantee is met As mentioned in UtilityOptimization the size of P

Bj

v is critical in minimizing utilityloss )erefore the heuristic algorithm tries to minimize theutility loss by minimizing the size of each P

Bj

v We count thenumber of polymorphic value sets of v if the set contains lessthan k values that can be enlarged by inserting vprimersquos privateattribute values We denote this count as covervprime

covervprime j ∣ PBj

v

11138681113868111386811138681113868

11138681113868111386811138681113868lt k vprime Bj1113960 1113961 notin PBj

v1113882 1113883

1113868111386811138681113868111386811138681113868

1113868111386811138681113868111386811138681113868 (21)

Tuple vprime with a higher covervprime value can enlarge the sizeof more polymorphic value sets of v and thus we can reducethe number of tuples that we have to insert into PB1

v PBmprime

v Furthermore we take into consideration queries in Q In

order to minimize UQ we have to minimize |sprime(v ∣ q) minus

s(v ∣ q)| for q isin Q Note that for each attribute Bj|ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| 1 if v[Bj]ne q[Bj] andq[Bj] isin P

Bj

v We denote the value of 1113936j|ρprime(q[Bj] v[Bj]) minus

ρ(q[Bj] v[Bj])| as lossvprime )us we have

lossvprime 1113944qisinQ

1113944

j1mprime

ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

1113944qisinQ

1113944

j1mprime

δ v vprime q j( 1113857

(22)where δ(v vprime q j) 1 if v[Bj]ne q[Bj] and vprime[Bj] q[Bj]

and δ(v vprime q j) 0 otherwise )erefore tuple vprime with asmaller lossvprime can reduce the value of 1113936j|ρprime(q[Bj]

v[Bj]) minus ρ(q[Bj] v[Bj])| and thus we can reduce the valueof UQ

)e computation of covervprime and lossvprime is done in line 11)en we compute scorevprime the score of vprime that indicates howpreferable vprime is relative to other tuples in D v Weadopt the greedy algorithm to find the next vprime for PB1

v

PB

mprimev ie in each iteration we choose the tuple

that has the highest score value In line 15 we insert theprivate attribute values of the chosen tuple (denoted as vmax)into PB1

v PB

mprimev )e above process is repeated until

the sizes of PB1v P

Bmprime

v are no less than k

6 Experimental Results

61 Experimental Setup To validate PVSV-Constructor andPVST-Constructor algorithms we conducted experiments on areal world dataset [29] from eHarmony which contains 58attributes and 486464 tuples We removed 5 noncategoricalattributes and randomly picked 20 categorical attributes fromthe remaining 53 attributes )e domain sizes of the 20 at-tributes range from 2 to 15 After removing duplicate tuples werandomly picked 300000 tuples as our testing bed

By default we use the ranking function from the rankedretrieval model with all weights set to 1 All experimentalresults were obtained on a Mac machine running Mac OS

8 Security and Communication Networks

with 8GB of RAM )e algorithms were implemented inPython

62 Privacy )e privacy guarantee of PVSV-Constructorand PVST-Constructor were tested by performing RankedInference attack [35] including Point-Query In-QueryPoint-QueryampInsert and In-QueryampInsert attackingmethods on the dataset From a total of 20 attributes 5attributes were randomly chosen as public attributes andanother 5 attributes were randomly chosen as private at-tributes We randomly picked 20000 distinct tuples fromthe dataset as the testing bed and randomly generated 10tuples as the query workload For PVSV-Constructor weconstructed a polymorphic value set of size 2 for eachprivate attribute and each tuple in the testing bed ForPVST-Constructor we constructed a polymorphic valueset that covers at least two true values for each privateattribute and each tuple We randomly picked 1000 tuplesfrom the testing bed as our targets and performed 1000Rank Inference attacks (250 attacks for each of the fourmethods) on the five private attributes of target tuples Wemeasured the attack success guess rates based on the fre-quency of successful inference among all inference at-tempts Figure 1 shows the success guess rates of RankInference attacks on the unprotected testing bed the testingbed with PVSV and the testing bed with PVST As the sizeof each polymorphic value set is 2 the success guess rateson PVSV are around 50 which are significantly lowerthan those of unprotected dataset We also observe that thesuccess guess rates on PVSTare slightly lower than those of

PVSV )e reason is that PVSV-Constructor will beinserting values into tuple vrsquos polymorphic value sets untilall vrsquos polymorphic value sets cover at least 2 true values)us some polymorphic value sets of v may contain morethan 2 values

63 Utility In this subsection we quantify utility loss ofPVSV-Constructor and PVST-Constructor algorithms)e privacy guarantee of both PVSV-Constructor andPVST-Constructor is 12 ie the polymorphic value setsconstructed by PVSV-Constructor contains 2 values andthe polymorphic value sets constructed by PVST-Con-structor contains at least 2 true values )e key param-eters here are the size of query workload Q the size ofdatabase D the number of public and private attributesand the weight ratios in the ranking function We ran-domly generated 20 tuples as the query workload Bydefault we picked 10 tuples from Q and set |D| 300 000m 10 and mprime 10 )erefore we randomly picked 10attributes from the testing bed and set them as publicattributes )e rest of the 10 attributes were set as privateattributes

Many recommendation systems of ONS applicationsfeature top-k recommendation [1 14 47] where the rankedresult contains a set of k tuples that will be of interest to acertain user as it is impractical and unnecessary to return alltuples in the database to the user )erefore we introduceaverage top-k utility loss Uaul a variant of UQ that focuses onutility loss of the top-k tuples in a ranked result Uaul isdefined as

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv and forallBj

(1) k (1ε) minus 1(2) for v in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) end(6) end(7) for v in D do(8) while existj isin 1 mprime1113864 1113865 |P

Bj

v |lt k + 1 do(9) for vprime isin t isin D|forallAi t[Ai] v[Ai]1113864 1113865 do(10) compute covervprime lossvprime(11) scorevprime covervprime 11 + exp(minus lossvprime )

(12) end(13) vmax argmaxvprimeisin |tisinD|forallAit[Ai]v[Ai] scorevprime

(14) for j isin 1 mprime1113864 1113865 do

(15) PBj

v PBj

v cup tmax[Bj]1113966 1113967

(16) end(17) end(18) end

ALGORITHM 2 PVST-Constructor

Security and Communication Networks 9

Uaul 1

|Q|1113944

|Q|

i1

1Dprime

11138681113868111386811138681113868111386811138681113868

1113944

visinDprime

min Rank(v ∣ q) k + 11113864 1113865 minus min Rankprime(v ∣ q) k + 11113864 11138651113868111386811138681113868

1113868111386811138681113868

k (23)

where Dprime v isin D |Rank(v | q))lt k or Rank(v | q))lt k1113864 1113865Intuitively Uaul represents the average percentage rankdifference relative to k over all queries and all tuples in top-kUaul is equivalent to UQ|Q||D|2 when k |D| By default weset k 100

631 Evaluation of Uaul with Varying k We first discuss theaverage top-k utility loss of PVSV-Constructor PVST-Constructor and a baseline algorithm on varying k withother parameters set to default values For a tuple vrsquos at-tribute Bj the baseline algorithm constructs P

Bj

v with v[Bj]

and a value randomly picked from VBj v[Bj]1113966 1113967 )e results

are presented in Figure 2 which shows that the Uaul ofPVSV-Constructor is significantly lower than that of thebaseline algorithm )e Uaul of PVST-Constructor is alsolower than that of the baseline algorithm when kge 5 eventhough the baseline algorithm cannot preserve privacyagainst authenticity-knowledgeable adversaries )e exper-imental results show that both PVSV and PVST can reduceutility loss with respect to rank differences Also note thatPVST-Constructor constructs polymorphic value sets withtrue values and thus from Figure 3 we can see that somepolymorphic value sets constructed by PVST-Constructorcontains more than 2 values

632 Evaluation of Uaul with Varying Sizes of QFigure 4 presents the average top-k utility loss of PVSV-Constructor on varying |Q| When |Q| is set to 1 5 or 10 werandomly picked 1 5 or 10 queries from the original Qrespectively With increasing number of queries in the queryworkload the Uaul of PVSV-Constructor increases mono-tonically )e reason is that PVSV-Constructor alwaysgenerates the least frequent value (denoted as vprime[Bj]) in1113864q[Bj] | q isin Q1113865 for v[Bj] If |Q| is small then it is possiblethat vprime[Bj] notin 1113864q[Bj] | q isin Q1113865 and thus s(v ∣ q) sprime(v ∣ q)

forallq isin Q However a larger query workload covers moreprivate attributes values and thus it will be harder forPVSV-Constructor to generate a value for v[Bj] that has noimpact on the rank of v for any q isin Q Figure 5 presents theaverage top-k utility loss of PVST-Constructor on varying|Q| We can see that the size of Q has no significant impacton the Uaul of PVST-Constructor as the PVST-Constructoralways pick a tuple vprime that is different from v in most at-tributes and then insert vprime[Bj] into P

Bj

v

633 Evaluation of Uaul with Varying Sizes of the DatasetFigures 6 and 7 depict the impact of the size of datasets onUaul of PVSV-Constructor and PVST-Constructor Datasetsof 100000 and 200000 tuples were randomly sampled from

QPoint QIn QIPoint QIIn

Atta

ck su

cces

s gue

ss ra

te

Without protectionPVSVPVST

1

09

08

07

06

05

04

03

02

01

0

Figure 1 Attack success guess

10 Security and Communication Networks

the testing bed of 300000 tuples As expected |D| has nosignificant impact on the Uaul of PVSV-Constructor sincePVSV-Constructor generates values from the domain ofeach private attributes |D| has no impact on the Uaul ofPVST-Constructor which indicates that a dataset contain-ing 100000 tuples is sufficient for PVST-Constructor togenerate polymorphic value sets with true values

634 Evaluation of Uaul with Varying m We investigate theimpact of the number of private and public attributes on

average top-k utility loss Figure 8 presents theUaul of PVSV-Constructor with fixed mprime and varying m and with fixed mand varying mprime When mmprime is set to 5 we randomly re-moved 5 attributes from the testing bed When mmprime is set to

0 20 40 60 80 100k

RandomPVSVPVST

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 2 Utility loss vs k

1 2 3 4 5+Size of PVST

3

25

2

15

1

05

0

Coun

t 10

5

Figure 3 Size of polymorphic value set (PVST)

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

00 10 20

Figure 4 Utility loss vs |Q|

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

00 10 20

Figure 5 Utility loss vs |Q|

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 6 Utility loss vs |D|

Security and Communication Networks 11

15 we added 5 more categorical attributes randomly chosenfrom the unused attributes We observe that the Uaulmonotonically decreases with increasing number of publicattributes and monotonically increases with increasingnumber of private attributes As expected a higher pro-portion of public attributes leads to less variant betweens(v ∣ q) and sprime(v ∣ q) e results of the same experimentwith PVST-Constructor are shown in Figure 9 Uaul in-creases as increasing number of private attributes as ex-pected However Uaul also increases slightly with increasingnumber of public attributes is is due to the fact that withmore public attributes there will be few tuples that share thesame public attribute values Since PVST-Constructor in-serts only private attribute values from tuples sharing samepublic attribute values more values will be inserted to eachpolymorphic value set which introduces higher utility loss

635 Evaluation of Uaul with Varying Weight RatiosFigures 10 and 11 illustrate the impact of weight ratios on theaverage utility loss e experiment was conducted with a

xed private attribute weight of 1 and varying public at-tribute weights of 1 2 and 3 As expected Uaul of bothPVSV-Constructor and PVST-Constructor decreases as

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 10 Utility loss vs weight ratio

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 11 Utility loss vs weight ratio

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss1

09

08

07

06

05

04

03

02

01

0

Figure 7 Utility loss vs |D|

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 8 Utility loss vs m and mprime

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 9 Utility loss vs m and mprime

12 Security and Communication Networks

increasing weight ratio of public attributes)e reason is thatas the public attribute weight increases the part in sprime(v ∣ q)

caused by private attributes decreases )erefore less impactwould be made to sprime(v ∣ q) by PVSV and PVST As a resultsprime(v ∣ q) would be closer to s(v ∣ q) and the utility loss couldbe decreased

7 Conclusions

In this paper we proposed a novel framework that preservesprivacy of private attributes against arbitrary attacks throughthe ranked retrieval model Furthermore we identify twocategories of adversaries based on varying adversarial ca-pabilities For each kind of adversaries we presentedimplementation of our framework Our experimental resultssuggest that our implementations efficiently preserve privacyagainst Rank Inference attack [35] Moreover the imple-mentations significantly reduce utility loss with respect tothe variance in ranked results

It is our hope that this paper can motivate further re-search in privacy preservation of SIoT with consideration ofsocial network features andor variant information retrievalmodels eg text mining

Appendix

A 2-PVSV

In this subsection we prove that constructing an optimal PVSVfor each attribute of a tuple v is an NP-hard problem

Definition A1 For a tuple v in D we create PBj

v for each BjWe say v satisfies query q if the following hold for any tuple t(tne v) in D (1) If s(v ∣ q)lt s(t ∣ q) then sprime(v ∣ q)lt sprime(t ∣ q)and (2) if s(v ∣ q)ge s(t ∣ q) then sprime(v ∣ q)ge sprime(t ∣ q)

For a database containing 2 tuples the 2-PVSV problemcan be redefined as follows given a query workload Q adatabase D construct PB1

v PBmprime

v of size 2 such that v

satisfies the most queries in Q

Definition A2 Max-3Sat Problem given a 3-CNF formulaΦ find the truth assignment that satisfies that most clauses

Lemma A1 Max-3Sat leP 2-PVSV Problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVSV instance Without loss of generality we suppose thatΦ is a conjunction of l clauses and each clause is a dis-junction of 3 literals from set X x1 xn1113864 1113865 We constructdatabase D as follows D has 0 public attributes and n + 1private attributes B1 Bn C1 Let VB

j be the attribute do-main of Bj j isin 1 n and VC

1 be the attribute domain ofC1 Let VB

i minus 1 0 1 2 n + 2 and VC1 0 1 Two

tuples v1 and v2 are inserted into D v1[Bj] 0 and v2[Bj]

j + 2 for j isin 1 n while v1[C1] 0 and v2[C1] 1 Wesimplify the score function defined in (1) by setting all weightsto 1 Also note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null

We construct query workload Q based on Φ For eachclause Lk isin Φ we construct a query qk such that qk[Bi] i iffthe corresponding literal xi is a positive literal in Lk andq[Bi] i + 1 iff xi is a negative literal in the clause Forexample given a clause (x1orx2orx3) the correspondingquery q should satisfy q[B1] 1 q[B2] 3 q[B3] 3 andq[C1] 0 All other attributes in q are set to null by default)erefore s(v1 ∣ q) 1 and s(v2 ∣ q) 0 forallq isin Q

Since s(v2 ∣ q)lt s(v1 ∣ q) forallq isin Q in order to minimizeutility loss the value of sprime(v2 ∣ q) should be as small aspossible We observe that the minimum possible sprime(v2 ∣ q) is1 because PC1

v2must be 0 1 as VC

1 0 1 Without loss ofgenerality we assume that we have already constructedPB1

v2 P

Bmprime

v2 PC1v2

for v2 such that sprime(v2 ∣ q) 1 forallq isin QNow we have an instance of 2-PVSV problem that given

D Q constructs PB1v1

PBmprime

v1 PC1

v1that satisfies the most

queries in Q Since VC1 0 1 PC1

v1must be 0 1 too as

PC1v1

contains two distinct values )erefore for an ar-bitrary query q isin Q we have ρprime(q[C] v1[C]) 1 andρprime(q[C] v2[C]) 1 In order to let v1 satisfy q we have toensure that sprime(v1 ∣ q)gt sprime(v2 ∣ q) 1 )us we have

sprime v1 ∣ q( 1113857gt 1

⟺1113944mprime

i1ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857gt 1

⟺existi isin 1 mprime1113864 1113865 ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857 1

⟺existi isin 1 mprime1113864 1113865 q Bi1113858 1113859 isin PBi

v1

(A1)

Now we show that the solution of 2-PVSV problemconstructed above can answer the corresponding Max-3Sat Problem Suppose that we have the solution PB1

v1

PB

mprimev1 such that v1 satisfies the most queries in Q As in

(A1) if v1 satisfies qk we have qk[Bi] isin PBiv1 If v1 does not

satisfy qk then qk[Bi] notin PBiv1 Recall that we assign value i or

i + 1 to qk[Bi] if xi is a positive or negative literal in clauseLk respectively )erefore the assignment of xi given PBi

v1is

xi true if i isin PBi

v1

xi false if i + 1 isin PBi

v1

(A2)

Furthermore from equation (A1) we have

sprime v1 ∣ q( 1113857gt 1⟺L true (A3)

where L is qrsquos corresponding clause in Φ Note thati i + 1 nsubQ as |PBi

v1| 2 and 0 isin PBi

v1 )us the value of xi is

either true or falseAs we proved above v1 satisfies qi if and only if Li is true

given assignment x1 xl1113864 1113865 constructed according toequation (A2) )erefore x1 xl1113864 1113865 satisfies the mostclauses in ϕ if and only if v1 satisfies the most queries in Q

We now prove that function f can be conducted inpolynomial time Given a formula Φ with n variables and lclauses we construct a 2-PVSV instance with 2 tuples each

Security and Communication Networks 13

of which has n + 1 attributes and l queries each of which has4 attributes )erefore O(2n + 4l) assignments are neededand f can be conducted in polynomial time

Proof of 4eorem 1 We now prove that the 2-PVSVproblem is NP-hard In Lemma 3 we proved that Max-3SatProblem can be reduced to the 2-PVSV problem in poly-nomial time Furthermore as Max-3Sat Problem is a NP-hard problem [33] the 2-PVSV problem is NP-hard

B 2-PVST

In this section we prove that constructing an optimal PVSTfor each private attribute of a tuple v isin D is NP-hard We usethe definition of v satisfying q from (9))e 2-PVSTproblemcan be redefined as follows

Definition B1 2-PVSTproblem given a query workload Qa database D the optimization problem of 2-PVST is toconstruct an arbitrary tuple vrsquos polymorphic setsPB1

v PBmprime

v that satisfies the most queries in Q )e size ofeach polymorphic vale set is 2

Lemma B1 Max-3Sat le P 2-PVST problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVST instance We assume that Φ is a conjunction of lclauses and each clause is a disjunction of 3 literals from setX x1 xn1113864 1113865 Let D have no public attribute and n + 1private attributes B1 Bn C1 For each literal xi we inserttwo tuples vi and vi

prime into D where

vi C11113858 1113859 1 vi Bi1113858 1113859 1 vi Bj1113960 1113961 minus 1foralljne i

viprime C11113858 1113859 1 vi

prime Bi1113858 1113859 minus 1 viprime Bj1113960 1113961 1foralljne i

(B1)

)en we insert tuple v into D where v[C1] 0 andv[Bj] 0 forallj isin 1 mprime We also set the domain of eachBj as minus 1 0 1 and the domain of C1 as 0 1

Without loss of generality the score function defined in(1) is simplified by setting all weights to 1

Next we construct query workload Q For each clauseLk isin Φ we construct a query qk in which qk[Bj] minus 1 iff xj isa positive literal in Lk and qk[Bj] 1 iff xj is a negativeliteral in Lk We also set qk[C1] 1 )e rest of the attributevalues are set to null by default )erefore ifLk (x1orne x2orx3) then we have qk[C1] 1 qk[B1] 1qk[B2] 1 qk[B3] minus 1 and qk[Bj] null forj isin 4 mprime

Note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null )uss(v qk) 0 forallq isin Q and s(vi qk) s(vi

prime qk)ge 1 forallq isin Q Weobserve that for q isin Q and i isin 1 n s(v ∣ q)lt s(vi ∣ q)

and s(v ∣ q)lt s(viprime ∣ q) In order to reduce UQ we have to

maximize the number of q isin Q such that sprime(v ∣ q)lt sprime(vi ∣ q)

and sprime(v ∣ q)lt sprime(viprime ∣ q) )e maximum value of sprime(vi ∣ q)

and sprime(viprime ∣ q) is 4 which can be achieved by inserting vi[Bj]

into PBj

viprime and inserting vi

prime[Bj] into PBj

vi Since PC1

v 0 1 thevalue of sprime(v ∣ q) is at least 1 )erefore we have

sprime(v ∣ q)lt sprime vi ∣ q( 1113857

⟺sprime(v ∣ q)lt 4

⟺1113944

mprime

j1ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873lt 3

⟺existj isin 1 mprime ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 0

⟺existj isin 1 mprime v Bj1113960 1113961ne q Bj1113960 1113961

⟺existj isin 1 mprime minus q Bj1113960 1113961 isin PBj

v

(B2)

Since |PBj

v | is limited to 2 PBj

v 0 1 or 0 minus 1 ie wemust insert either virsquos or vi

primersquos private attribute values intoPB1

v PBmv Recall that we assign minus 1 or 1 to qk[Bj] if xj

is positive or negative respectively in Lk In order toreduce Max-3Sat to 2-PSVT we assign literals in the fol-lowing way

xj true if minus 1 isin PBj

v

xj false if 1 isin PBj

v (B3)

)erefore we denote the corresponding clause of q as Land from equation (B2) we have

existj isin 1 mprime1113864 1113865 minus q Bj1113960 1113961 isin PBj

v

⟺existj isin 1 mprime1113864 1113865 xj true if xj is positive in L

orxj false if xj is negative in L

⟺L true(B4)

From the above equation we observe that v satisfying qk

is equivalent to Lk being true)erefore we can reduceMax-3Sat to 2-PVST Given a solution to the 2-PVSTproblem thatmaximizes the number of queries satisfied by v assignmentx1 xn1113864 1113865 produced by equation (B3) can satisfy thelargest number of clauses in Φ

Given a Max-3Sat instance Φ the construction of the 2-PVST instance can be conducted in polynomial time as weconstruct n queries with 3 attributes and 2n + 1 tuples with nattributes)e transformation from the optimal solution of 2-PVST to the optimal solution of Max-3Sat also takes poly-nomial time as we assign the value to each one of x1 xn

once )erefore f is a polynomial time function

Proof of 4eorem 2 In 4 we proved that a Max-3Sat in-stance can be reduced to a 2-PSVT instance in polynomialtime Furthermore as Max-3Sat Problem is an NP-hardproblem the 2-PVSV problem is an NP-hard problem

Data Availability

)e data used to support the findings of this study areavailable from the corresponding author upon request

14 Security and Communication Networks

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 3: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

Differential privacy [8 17 18] is another widely studiedframework that preserves privacy of published datasets orhidden databases It imposes a strong guarantee of privacyon tuples in statistical databases by adding noise to theprocess of query results However the ranked retrievalmodel outputs ranks of tuples instead of their values oraggregate statistics We cannot directly add noise to rankedresults as the rank of a tuple is determined by not only thetuple itself but also by other tuples in the database Fur-thermore the optimization of the ranked retrieval model hasnot been considered

13 Contributions )is work presents a novel scheme forprivacy-preserving the ranked retrieval model We start withan introduction to the adversary model and introduce ourdefinition of privacy guarantee We identify two categoriesof adversaries based on their prior knowledge and assumethat adversaries can launch optimal inference attacksthrough ranked results

We propose the polymorphic value set (PVS) a privacy-preserving framework for the ranked retrieval model Dif-ferent from existing methods PVS does not directly modifyvalues of tuples or query results Instead PVS enablespolymorphism of private attributes such that a private at-tribute of a tuple can respond to different queries in differentways We prove that our framework meets the privacyguarantee stated in Problem Statement For adversaries withand without prior knowledge we design and implement thepolymorphic value set with true values (PVST) and poly-morphic value set with Virtual Values (PVSV) respectivelyIn the design of PVST and PVSV we consider utility loss inthe ranked retrieval model and propose a practical mea-surement of utility loss We prove that the task of mini-mizing utility loss is NP-hard and present two heuristicalgorithms that implement PVST and PVSV respectivelyWe run our implementations of PVST and PVSV on a real-world dataset from eHarmony [29] that contains 486464tuples)e experiments yield excellent results with respect toprivacy guarantee and utility loss

)e remainder of this paper is organized as followsProblem Statement introduces the adversary model and theprivacy guarantee Privacy-Preserving Framework in-troduces our privacy-preserving framework Frameworkwith Virtual Values presents the design and implementationof PVSV along with our analysis of utility loss )eimplementation of PVST and analysis of utility loss arepresented in Framework with True Values ExperimentalResults contains our experimental evaluation of PVSV andPVST In Conclusions we conclude this paper with asummary of our key contributions and a discussion of someopen problems

2 Problem Statement

21 Ranked Retrieval Model In information retrieval wehave witnessed extensive research in the ranked retrievalmodel Unlike the Boolean retrieval model where only re-sults that exactly match the predicates can be returned theranked retrieval model allows users to retrieve a list of

records sorted by a proprietary ranking function)erefore theranked retrieval model provides an alternative solution forusers seeking results sorted by their relevance to the query

As discussed in the introduction many OSN applica-tions have been using the ranked retrieval model to processincoming queries Upon a query q the ranked retrievalmodel would calculate each tuple trsquos score according to aproprietary score function s(t ∣ q) and return top-k tuples indescending order of their scores )e attributes of tuplescould be either categorical or numerical In this paper weconsider only categorical data which does not limit thescope of our research Actually numerical data can betreated as categorical data by categorizing the numericaldomain into small intervals such that no more than onetuple in the database falls into the same interval Withoutloss of generality we also assume that there is no duplicatetuple that is equal to another tuple in every attribute

We now formalize our ranked retrieval model with cat-egorical attributes Consider an n-tuple database D with mpublic attributes A1 Am and mprime private attributesB1 Bmprime Let VA

i (resp VBj ) denote the value domain of

Ai(resp Bj) Let t[A]i(resp t[B]j) denote the value of t inAi(resp Bj) Upon query q the score function s(t ∣ q)

computes a score for each tuple t isin D )e ranked retrievalmodel will then sort all tuples in D in the descending orderand return them as the ranked result We consider the casewhere the score function is linear )erefore s(t ∣ q) can bedefined as

s(t ∣ q) 1113944m

i1w

Ai ρ q Ai1113858 1113859 t Ai1113858 1113859( 1113857 + 1113944

mprime

j1w

Bi ρ q Bi1113858 1113859 t Bi1113858 1113859( 1113857

(1)

where wAi (resp wB

j ) isin (0 1) is the weight of attributeAi(resp Bj) in the score function and the matching functionρ(q[Ai] t[Ai]) (resp ρ(q[Bi] t[Bi])) indicates if tmatches qin attribute Ai (resp Bj))erefore the value of ρ(β1 β2) is 1if β1 is equal to β2 and the value of ρ(β1 β2) is 0 if β1 is notequal to β2 Note that our ranked retrieval model satisfies themonotonicity and additivity properties defined in [35]

22 AdversaryModel In Motivation we mentioned that wedo not make any assumption about the method adopted byan adversary when attacking a database We also assume thatthe adversary has prior knowledge about the metadata oftables in the database as well as the proprietary rankingfunction Furthermore the adversary is assumed to be ableto issue queries to the ranked retrieval model view rankedresults and insert tuples to the database As a result theadversary is able to retrieve all public attribute values bycrawling the database through the query interface [37] Wedenote the set of queries issued by the adversary as QA theset of tuples inserted by the adversary as IA and the cor-responding set of ranked results as RD RD is fully de-termined by QA and IA given fixed D We name allinformation regarding a tuple t isin D that an adversary canfind in RD as the trace of t and denote the trace of t as Tt )etrace of t includes but not limited to the rank of t and the

Security and Communication Networks 3

relationship between t and any other tuple (eg t has ahigher or lower rank than another tuple tprime) in a rankedresult )erefore given fixed D IA and QA Tt is fully de-termined by the attribute values of t

Another capability of the adversary we model is theadversaryrsquos prior knowledge Consider an extreme case wherethe adversary knows the equivalence relation between two at-tributes B1 and A1 In this case even without RD the adversaryis still able to infer the value of B1 of any t isin D In reality anadversary can acquire such attribute correlations by being orconsulting an expert in the domain of the dataset or by adoptingdata mining methods [42] For example based on the personalinformation (eg gender ethnicity age and blood type whichcan be used to infer the gene) stored as public attributes andpublished in public medical data repositories genetic epide-miologists can generally conclude that an individual does nothave some diseases merely based on the fact that these diseaseswould never be found by the candidate gene among historicmedical datasets )erefore an adversary with prior knowledgecould eliminate the possibility of a certain tuple in the databaseWemodel prior knowledge as a function PK(t ∣ D) that takes asinput t andD and returns either 0 or 1 PK(t ∣ D) 0 indicatesthat given prior knowledge the possibility of t isin D is zeroPK(t ∣ D) 1 indicates that the adversary cannot eliminate thepossibility of t isin D As prior knowledge helps adversaries inlaunching an inference attack adversaries can be partitionedinto two classes adversaries with prior knowledge and adver-saries without prior knowledge of the dataset

Definition 1 We name adversaries without prior knowledgeof the authenticity of any tuple as authenticity-ignorant ad-versaries For authenticity-ignorant adversaries PK(t ∣ D)

always outputs 1 We name adversaries with such priorknowledge as authenticity-knowledgeable adversaries Forauthenticity-knowledgeable adversaries PK(t ∣ D) 1 ift isin D and PK(t ∣ D) 0 if t notin D

)e objective of both classes of adversaries is to maxi-mize the following g(v Bj) value when inferring the value ofvictim tuple v in attribute Bj

g v Bj1113872 1113873 Pr v Bj1113960 1113961 a1113872 1113873 (2)

where Pr(v[Bj] a) is the probability of v[Bj] a and a isthe value inferred by the adversary given prior knowledgeand ranked results

For authenticity-ignorant adversaries PK(t ∣ D) alwaysoutputs 1 regardless of t and D We only assume the caseswhere users input true information to the databases)ereforefor authenticity-knowledgeable adversaries PK(t ∣ D) 1 ift isin D and PK(t ∣ D) 0 if t notin D In this paper we assume astrong adversary that can infer the private attribute values of anarbitrary tuple given the premise that the trace of the targettuple is unique )e premise can be easily met because as longas there is no duplicate tuple in the dataset the adversary canalways construct well-designed IA andQA such that the trace ofthe target tuple is different from the trace of any other tuple)erefore the adversary can always find a such thatPr(v[Bj] a) 100

23ProblemStatement Aprivacy breach can be described by asuccessful inference of a private attribute value in the databaseWe view privacy of v[Bj] as the upper bound on the possibilitythat an adversary succeeds in inferring the value of v[Bj] Notethat we do not make any assumption about the adversaryrsquosattacking method Our objective in this paper is to present aframework that sets an upper bound on the probability ofsuccessful inference of an arbitrary private attribute for tuplet isin D )erefore we define the objective of the framework as

forallt isin D j isin 1 mprime1113864 1113865 g v Bj1113872 1113873le ε (3)

We present the upper bound ϵ as our privacy guaranteeHowever a privacy-preserving framework should pro-

vide not only a privacy guarantee but also a notion ofutilitymdashafter all a framework that removes all private at-tribute values or replaces them with randomly generatedvalues can surely preserve privacy )erefore we use ameasurement based on the variance of ranked results beforeand after adopting our framework Given D and a set of allpossible queries denoted as Q we define the utility loss forour ranked retrieval model as follows

U 1113944tisinD

1113944qisinQ

Rank(t ∣ q) minus Rankprime(t ∣ q)1113868111386811138681113868

1113868111386811138681113868 (4)

where Rank(t ∣ q) and Rankprime(t ∣ q) refer to the ranks oftuple t in the ranked result given query q before and afterapplying our frameworks respectively

3 Privacy-Preserving Framework

)e only information an adversary can obtain from a da-tabase through the ranked retrieval model is public attributevalues and ranked results For an adversary without priorknowledge information regarding private attribute valuescan only be retrieved from ranked results )erefore inorder to preserve privacy we have to modify the rankedretrieval model such that the adversary cannot retrieve anyuseful information about private attributes from rankedresults

An idea is to group different tuples together in rankedresults As in our prior work [20] we can group two tuples v1and v2 together such that they share the same rank in anyranked result )is can be achieved by adopting a newranking function sprime(t ∣ q) such that sprime(v1 ∣ q) sprime(v2 ∣ q) forall q If v1 and v2 have different values on every privateattributes then the adversary is unable to infer the privatevalues of v1 since v1 and v2 are indistinguishable in anyranked results However this method suffers from highutility loss In order to preserve the privacy of all privateattributes v1 and v2 have to be different over all privateattributes)us the original scores of v1 and v2 ie s(v1 ∣ q)

and s(v2 ∣ q) differ a lot which leads to a higher variancebetween the rank of v1 before and after adopting thismethod

In this work we present a novel framework that pre-serves privacy of private attributes while minimizes theutility loss We observe that for a tuple vrsquos private attributev[Bj] if there are at least two potential values for v[Bj] and

4 Security and Communication Networks

an adversary cannot differentiate any one of them then theprivacy of v[Bj] can be preserved For instance if theprobability of v[Bj] a is equal to the probability ofv[Bj] aprime given ranked results and prior knowledge thenthe adversary cannot exclude any one of them If both theprobabilities are 50 then the adversary may choose torandomly pick a value from a and aprime as the inferred result Inthis case g(v Bj) will not exceed 50 and privacy of v[Bj]

can be preserved To prove this statement suppose that v isan arbitrary tuple in databaseD and we want to preserve theprivacy of v[Bj] Let β

Bj be an arbitrary value in VB

j v[Bj]1113966 1113967We construct tuple vprime such that vprime and v differ in only oneattribute Bj vprime[Bj] βB

j We also construct databaseDprime suchthat D and Dprime differ in only one tuple v isin D while vprime isin DprimeWe define a new score function sprime(t ∣ q)

sprime(t ∣ q) 1113944m

i1w

Ai ρ q Ai1113858 1113859 t Ai1113858 1113859( 1113857 + 1113944

mprime

j1w

Bi ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873

(5)

where

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 if tne v and tne vprime

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 Max ρ q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 vprime Bj1113960 11139611113872 11138731113872 1113873

(6)

if t v or t vprimeImagine a case where an adversary queriesD and Dprime with

the same query workload QA We denote the ranked resultsfrom D as RD and the ranked results from Dprime as RDprime As inthe new score function sprime(v ∣ q) sprime(vprime ∣ q) for forallq isin QA RD

is identical to RDprime )erefore given only ranked results theadversary cannot tell the difference between the two data-bases being queried Furthermore even if we exchange thevalues of v and vprime the adversary still cannot observe anychange inRD or RDprime As a result the value of v[Bj] and β

Bj are

equivalent from the perspective of the adversary and theprivacy of v[Bj] can be well preserved Intuitively v can beseen as a tuple that has two polymorphic forms in Bj v[Bj]

and βBj When calculating the score of v with sprime(t ∣ q) we

always choose the value that can maximize sprime(v ∣ q)We can further extend the statement to a more general

case For each tuple v isin D and each private attribute Bj wecan select e distinct values β1 β2 βe from VB

j v[Bj]1113966 1113967elt |VB

j | minus 1 )e new score function can be defined as

sprime(t ∣ q) 1113944m

i1w

Ai ρ q Ai1113858 1113859 t Ai1113858 1113859( 1113857 + 1113944

mprime

j1w

Bi ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873

(7)

whereρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 if tne v

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 Max ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 β11113872 11138731113872

ρ q Bj1113960 1113961 βe1113872 11138731113873 if t v

(8)

Consider e tuples v(1)prime v(e)prime which are identical to v

in all attributes except for Bj Let v(i)prime[Bj] be βi

foralli 1 e )en for forallq we have sprime(v ∣ q) sprime(v(i)prime ∣ q)As a result from the adversaryrsquos perspective there are e + 1potential values for v[Bj] v[Bj] β1 βe which are in-distinguishable from each other from any ranked results)eprivacy of v[Bj] can be preserved by grouping it with eequivalent values

As such we introduce the construction of the poly-morphic value set (PVS) We put v[Bj] into a set in which allvalues are indistinguishable when calculating the score withrespect to Bj in the ranking function ie ρprime(q[Bj] v[Bj])We name the above set as the polymorphic value set anddenote the polymorphic value set of tuple v in attribute Bj asP

Bj

v We define PBj

v as follows

Definition 2 PBj

v is a set containing all indistinguishablevalues of tuple vrsquos private attribute Bj Assigning v[Bj] withan arbitrary value in P

Bj

v will not change the value ofsprime(v ∣ q)forallq ie ρprime(q[Bj] v[Bj]) ρprime(q[Bj] β)forallβ isin P

Bj

v

and forallq

Since the adversary cannot distinguish different values inP

Bj

v by launching any inference attacks based only on rankedresults the privacy guarantee of v[Bj] is

ε 1

PBj

v

11138681113868111386811138681113868

11138681113868111386811138681113868

(9)

In order to achieve the privacy guarantee defined in (3)for each v isin D and each private attribute Bj we have toensure that (a) there is one and only one polymorphic valueset P

Bj

v for vrsquos attribute Bj and (b) the privacy guaranteedefined in (9) is always valid

4 Framework with Virtual Values

41 Design In this section we introduce how polymorphicvalue sets can be constructed with generated values to meetthe privacy guarantee in (9) against authenticity-ignorantadversaries We name values generated by our framework asvirtual values

As we proved in Privacy-Preserving Framework anauthenticity-ignorant adversary cannot distinguish the valueof v[Bj] from other valid values in P

Bj

v given ranked resultsSince an adversary without prior knowledge cannot validatethe authenticity of any value in P

Bj

v all values in PBj

v areldquovalidrdquo in the perspective of the adversary no matter if theyare generated by our framework or collected from real datain D )erefore we observe that P

Bj

v can be formed by anyvalues in VB

j In order to achieve the privacy guarantee of ε we have to

ensure that |PBj

v |gt (1ε) forallv isin D and forallj isin 1 mprimeAn intuitive algorithm to generate P

Bj

v of size 1ε is torandomly pick (1ε) minus 1 values from VB

j v[Bj] Specifically

let the initial PBj

v v[Bj]1113966 1113967 )en we can randomly pick

distinct values fromVBj v[Bj] and insert them intoP

Bj

v untilPBj

v

contains at least (1ε) distinct values In the same manner wecan construct a polymorphic value set for each tuplersquos eachprivate attribute

Security and Communication Networks 5

411 Privacy Guarantee For a databaseDwhere every tuplevrsquos every private attribute Bj is included by one polymorphicvalue set with virtual values (PVSV) whose size is at least l ifthe adversary has no prior knowledge of D a privacy level ofε (1l) is achieved

For an authenticity-ignorant adversary PK(v ∣ D) 1forallv As proved in Privacy-Preserving Framework for anauthenticity-ignorant adversary it is impossible to distin-guish v[Bj] with at least l minus 1 other values )us we haveg(v Bj)le (1l) for forallv isin D and forallj isin 1 mprime Accordingto equation (3) a privacy guarantee of (1l) can be achieved

42 Utility Optimization In this section we discuss how toreduce utility loss caused by polymorphic value sets Weintroduced a metric of utility loss in (4) that calculates thesum of difference in ranked results given all possible queriesTo practically calculate utility loss we limit the range ofqueries to a finite set named query workload In practice thequery workload of a database D can be a set of queries thatare more frequently issued than any other queries A queryworkload may contain duplicate queries which reflect thedistribution of frequent queries With a query workloaddenoted as Q we can define the practical utility loss as

UQ 1113944qisinQ

1113944visinD

Rank(v ∣ q) minus Rankprime(v ∣ q)1113868111386811138681113868

1113868111386811138681113868 (10)

In order to reduce utility loss we have to find assign-ments of P

Bj

v for forallv isin D and forallj isin 1 mprime such that theoverall UQ can be minimized Without loss of generality weonly consider constructing polymorphic value sets of size 2In this case the privacy guarantee is 12 For each v[Bj] weneed to find a value that is indistinguishable from v[Bj] Wedenote the polymorphic value of v[Bj] as vprime[Bj]

Definition 3 We define the 2-PVSV problem as follows givendatabase D and query workload Q find a polymorphic valuevprime[Bj] from VB

j v[Bj]1113966 1113967 for each v[Bj] forallv isin D andforallj isin 1 mprime such that UQ defined in (10) is minimized

Theorem 1 4e 2-PVSV problem is NP-hard

)e proof of )eorem 1 in detail can be found in Ap-pendix A

43 Heuristic Algorithm We have proved that the 2-PVSVproblem is an NP-hard problem that may not be solved inpolynomial time)erefore we propose PVSV-Constructora heuristic algorithm that can return an approximate so-lution in polynomial time

We observe that |s(v ∣ q) minus sprime(v ∣ q)| is relevant to|Rank(v ∣ q) minus Rankprime(v ∣ q)| A smaller difference betweens(v ∣ q) and sprime(v| q) leads to a smaller difference betweenRank(v ∣ q) and Rankprime(v ∣ q) )erefore |Rank(v ∣ q)minus

Rankprime(v ∣ q)| can be approximately minimized by a solutionthat minimizes |s(v ∣ q) minus sprime(v ∣ q)| As such we propose anapproximation of UQ that calculates the score differencebefore and after adopting our framework We denote thescore difference as US

Q and define USQ as follows

USQ 1113944

qisinQ1113944visinD

s(v ∣ q) minus sprime(v ∣ q)1113868111386811138681113868

1113868111386811138681113868 (11)

As shown in Algorithm 1 given input database D thenumber of public and private attributes m and mprime re-spectively query workload Q and privacy guarantee ϵPVSV-Constructor constructs an equivalent value set foreach v isin D and j isin 1 mprime that minimizes US

Q Recall thedefinition of sprime(t| q) in (7) and we have

USQ 1113944

visinD1113944

mprime

j11113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

(12)

We denote the score difference of v[Bj] contributed byP

Bj

v as U(PBj

v )

U PBj

v1113874 1113875 1113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

(13)

)erefore USQ is the sum of U(P

Bj

v ) over all tuples andprivate attributes

USQ 1113944

visinD1113944

mprime

j1U P

Bj

v1113874 1113875 (14)

Since the construction of each PBj

v is independent fromother polymorphic value sets we can minimize US

Q by mini-

mizing the score difference contributed by each PBj

v forforallj isin 1 mprime1113864 1113865 and forallv isin D Note that |ρprime(q[Bj] v[Bj]) minus

ρ(q[Bj] v[Bj])| 0 if q[Bj] v[Bj] or q[Bj] notin PBj

v Alsonote that |ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| 1 when

q[Bj]ne v[Bj] and q[Bj] isin PBj

v )erefore if forallq isin Q

v[Bj] q[Bj] then any assignment of PBj

v cannot contribute

to a higher U(PBj

v ) since ρprime(q[Bj] v[Bj]) 1 ρ(q[Bj]

v[Bj]) 1 and |ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| is always

zero In this situation U(PBj

v ) 0 On the contrary if existq isin Qv[Bj]ne q[Bj] then ρ(q[Bj] v[Bj]) 0 and thus|ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| ρprime(q[Bj] v[Bj])

According to equation (13) the value of U(PBj

v ) is

1113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

1113944

qisinQq Bj1113858 1113859nev Bj1113858 1113859

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873

(15)

In order to minimize U(PBj

v ) we have to find an as-signment of P

Bj

v which minimizes |1113864q isin Q ∣ q[Bj]ne

v[Bj]existβ isin PBj

v q[Bj] β1113865| Consider the simplest case

where we want to construct PBj

v of size 2 v[Bj] β1113966 1113967 We have

6 Security and Communication Networks

1113944

qisinQq Bj1113858 1113859nev Bj1113858 1113859

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 w

Bj q isin Q ∣ q Bj1113960 1113961 β1113966 111396711138681113868111386811138681113868

11138681113868111386811138681113868

(16)

In this case β has to be a value in VBj v[Bj] such that β

has the lowest frequency among all q[Bj] values for q isin QNote that β does not have to be an element of q[Bj] ∣1113966

q isin Q If β notin q[Bj] ∣ q isin Q1113966 1113967 its frequency is 0 Similarly if we

want to construct PBj

v of size k then we should insert the k minus 1least frequent values among all q[Bj] values into P

Bj

v In line 4 of Algorithm 1 we initialize the polymorphic

value set of v[Bj] by inserting v[Bj] into PBj

v For v[Bj] aset Set is constructed from VB

j v[Bj]1113966 1113967 which containsall values that can be inserted into P

Bj

v In order tominimize U(P

Bj

v ) we insert the k minus 1 least frequentvalues from Set into P

Bj

v by their frequencies inq[Bj] ∣ q isin Q1113966 1113967 )e above construction of P

Bj

v is repeatedfor each t isin D and j isin 1 mprime )e computationalcomplexity of Algorithm 1 is O(|D| middot |Q| middot mprime)

5 Framework with True Values

51 Authenticity-Knowledgeable Adversaries As we men-tioned in the adversary model an authenticity-knowl-edgeable adversary is able to tell if t isin D is possible As aresult the authenticity-knowledgeable adversary can launcha more efficient attack on private attributes by examining theauthenticity of values learned from RA We show a simplecase where an authenticity-knowledgeable adversary breaksthe privacy guarantee provided by polymorphic values setsconstructed with virtual values Consider that the objectiveof the adversary is to infer the value of v[B0] and B0 is theonly private attribute in D PB0

v is the polymorphic value setgenerated for v[B0] and |PB0

v | 1ε Without prior knowl-edge Pv[B0] ε and the privacy guarantee is achievedHowever for a value β isin PB0

v v[B0]1113864 1113865 if the adversary canconclude that PK(vprime ∣ D) 0 for any tuple vprime such that vprime isequal to v in all public attributes and vprime[B0] β then theadversary can exclude β from PB0

v )erefore

g v B01113858 1113859( 1113857 ε

1 minus εgt ε (17)

and equation (9) no longer holdsAs shown above if values in PB0

v are marked by anadversary as invalid for v given PK(vprime|D) then the adversarycan successfully break the privacy guarantee defined inframework with virtual values

52 Design In this section we propose polymorphic valuesets with true values (PVST) that construct polymorphicvalue sets with values that cannot be excluded by PK(t ∣ D)PVST considers nontrivial prior knowledge of adversariesand presents the same degree of privacy guarantee in-troduced in (9)

We have shown above that the implementation withvirtual values can be compromised by adversaries with priorknowledge Consider a tuple v isin D Given privacy guaranteeϵ we construct mprime polymorphic values sets PB0

v PB

mprimev for

each private attributes of v where |PBj

v |ge 1ε Let set MAv be

MAv v A01113858 11138591113864 1113865 times middot middot middot times v Am1113858 11138591113864 1113865 times P

B0v times middot middot middot times P

Bmprimev (18)

MAv is the Cartesian product of sets each of which

contains vrsquos all equivalent values in an attribute For publicattribute Ai the corresponding set is v[Ai]1113864 1113865 since publicattribute values are open to the adversary For private at-tribute Bj the corresponding set is P

Bj

v )erefore MAv

contains all possible tuples that are indistinguishable with v

with respect to RA (including v itself )With prior knowledge on PK(t ∣ D) an adversary is able

to exclude a value β from PBj

v if

forallt isin t ∣ t isinMAv t Bj1113960 1113961 β1113966 1113967PK(t ∣ D) 0 (19)

As described above it is safe for the adversary to con-clude that βne v[Bj] if there is no t isinMA

v such that t[Bj] βand PK(t ∣ D) 1 Alternatively if for every β in P

Bj

v thereis a t such that t[Bj] β and PK(t ∣ D) 1 then the ad-

versary cannot exclude any value in PBj

v and thereforePv[Bj]le ε is guaranteed Since PK(t ∣ D) 1 iff t isin D weconstruct polymorphic value sets with true values fromt isin D

Definition 4 If there exists l distinct values β1 βl isin PBj

v

such that forallr isin 1 l βr holds the following property

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv isin D and forallj isin 1 mprime1113864 1113865

(1) k (1ε) minus 1(2) for t in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) Set VBj v[Bj]1113966 1113967

(6) Sort elements of Set by their frequencies in q[Bj]| q isin Q1113966 1113967 in ascending order(7) Remove the last |Set| minus k + 1 elements in Set(8) P

Bj

v PBj

v cup Set(9) end(10) end

ALGORITHM 1 PVSV-Constructor

Security and Communication Networks 7

existt isinMAv t Bj1113960 1113961 βr and t isin D (20)

)en we say that PBj

v covers l true values We denoteT

Bj

v β1 βl1113864 1113865 as the true value set of t in BjPrivacy guarantee for a database D where every tuplersquos

every private attribute is included by one equivalent value setwhich covers at least l true values a privacy level of ε 1l isachieved

Assume that the adversaryrsquos objective is to infer the valueof v[Bj] As mentioned in the adversary model we make noassumption on the attacking methods adopted by an ad-versary Consider MA

v defined in (18) forallt tprime isinMAv and

forallq isin QA s(t ∣ q) s(tprime ∣ q) )erefore the adversary cannotdistinguish tuples in MA

v by observing RA In this situationthe adversary would use PK(t ∣ D) to exclude all tuple t inMA

v such that PK(t ∣ D) 0 However as PBj

v covers l truevalues there exists l tuples t1 tl isinMA

v such thatPK(t ∣ D) 1 and tr[Bj]ne ts[Bj] for arbitraryr s isin 1 l As a result values in t1[Bj] tl[Bj]1113966 1113967 areindistinguishable given RA and PK(t ∣ D) )us Pv[Bj] 1lAccording to (3) a privacy level of 1l is achieved

53 Utility Optimization An intuitive method of con-structing P

Bj

v is to insert the value of Bj of all tuples that sharethe same public attribute values with v into P

Bj

v ie PBj

v

PBj

v 1113864vprime[Bj]|foralli and vprime isin D vprime[Ai] v[Ai]1113865 )e privacyguarantee is met if P

Bj

v covers at least 1ε true valuesNevertheless utility loss cannot be ignored as in the intuitivemethod s(vprime| q) would be the same for all vprime isin 1113864vprime| foralliand vprime isin D vprime[Ai] v[Ai]1113865 and thus information of privateattributes are missing in the ranked result of vprime )ereforethe size of P

Bj

v is critical in balancing privacy and utilityWith no loss of generality we limit the size of each poly-morphic value set to 2 We show that constructing suchpolymorphic value sets is an NP-hard problem

Definition 5 We define the 2-PVST problem as followsgiven database D and a query workload Q construct P

Bj

v ofsize 2 for each tuple v isin D forallj isin 1 mprime1113864 1113865 and minimizeUQ defined in (10)

Theorem 2 4e 2-PVST problem is NP-hard

)e proof of )eorem 2 can be found in Appendix B

54 Heuristic Algorithm We have shown that the 2-PVSTproblem is NP-hard In this subsection we present PVST-Constructor a heuristic algorithm that constructs PVSTwithin polynomial time PVST-Constructor tries to mini-mize |P

Bj

v | and 1113936qisinQ|sprime(v ∣ q) minus s(v ∣ q)| for each v isin D

and j isin 1 mprime with the greedy algorithm)e pseudo-code of PVST-Constructor is shown in

Algorithm 2 In lines 2 to 6 we initialize each PBj

v withv[Bj] )en for each v isin D PVST-Constructor constructsPB1

v PBmprime

v by finding a vprime with the greedy algorithm andinserting vprime[Bj] into P

Bj

v forallj isin 1 mprime)e above process

will be taken multiple times until |PBj

v |ge 1ε Since every vprime isa real tuple existing in D and we insert at least k tuples in theabove processes P

Bj

v covers at least k + 1 true values andthus the privacy guarantee is met As mentioned in UtilityOptimization the size of P

Bj

v is critical in minimizing utilityloss )erefore the heuristic algorithm tries to minimize theutility loss by minimizing the size of each P

Bj

v We count thenumber of polymorphic value sets of v if the set contains lessthan k values that can be enlarged by inserting vprimersquos privateattribute values We denote this count as covervprime

covervprime j ∣ PBj

v

11138681113868111386811138681113868

11138681113868111386811138681113868lt k vprime Bj1113960 1113961 notin PBj

v1113882 1113883

1113868111386811138681113868111386811138681113868

1113868111386811138681113868111386811138681113868 (21)

Tuple vprime with a higher covervprime value can enlarge the sizeof more polymorphic value sets of v and thus we can reducethe number of tuples that we have to insert into PB1

v PBmprime

v Furthermore we take into consideration queries in Q In

order to minimize UQ we have to minimize |sprime(v ∣ q) minus

s(v ∣ q)| for q isin Q Note that for each attribute Bj|ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| 1 if v[Bj]ne q[Bj] andq[Bj] isin P

Bj

v We denote the value of 1113936j|ρprime(q[Bj] v[Bj]) minus

ρ(q[Bj] v[Bj])| as lossvprime )us we have

lossvprime 1113944qisinQ

1113944

j1mprime

ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

1113944qisinQ

1113944

j1mprime

δ v vprime q j( 1113857

(22)where δ(v vprime q j) 1 if v[Bj]ne q[Bj] and vprime[Bj] q[Bj]

and δ(v vprime q j) 0 otherwise )erefore tuple vprime with asmaller lossvprime can reduce the value of 1113936j|ρprime(q[Bj]

v[Bj]) minus ρ(q[Bj] v[Bj])| and thus we can reduce the valueof UQ

)e computation of covervprime and lossvprime is done in line 11)en we compute scorevprime the score of vprime that indicates howpreferable vprime is relative to other tuples in D v Weadopt the greedy algorithm to find the next vprime for PB1

v

PB

mprimev ie in each iteration we choose the tuple

that has the highest score value In line 15 we insert theprivate attribute values of the chosen tuple (denoted as vmax)into PB1

v PB

mprimev )e above process is repeated until

the sizes of PB1v P

Bmprime

v are no less than k

6 Experimental Results

61 Experimental Setup To validate PVSV-Constructor andPVST-Constructor algorithms we conducted experiments on areal world dataset [29] from eHarmony which contains 58attributes and 486464 tuples We removed 5 noncategoricalattributes and randomly picked 20 categorical attributes fromthe remaining 53 attributes )e domain sizes of the 20 at-tributes range from 2 to 15 After removing duplicate tuples werandomly picked 300000 tuples as our testing bed

By default we use the ranking function from the rankedretrieval model with all weights set to 1 All experimentalresults were obtained on a Mac machine running Mac OS

8 Security and Communication Networks

with 8GB of RAM )e algorithms were implemented inPython

62 Privacy )e privacy guarantee of PVSV-Constructorand PVST-Constructor were tested by performing RankedInference attack [35] including Point-Query In-QueryPoint-QueryampInsert and In-QueryampInsert attackingmethods on the dataset From a total of 20 attributes 5attributes were randomly chosen as public attributes andanother 5 attributes were randomly chosen as private at-tributes We randomly picked 20000 distinct tuples fromthe dataset as the testing bed and randomly generated 10tuples as the query workload For PVSV-Constructor weconstructed a polymorphic value set of size 2 for eachprivate attribute and each tuple in the testing bed ForPVST-Constructor we constructed a polymorphic valueset that covers at least two true values for each privateattribute and each tuple We randomly picked 1000 tuplesfrom the testing bed as our targets and performed 1000Rank Inference attacks (250 attacks for each of the fourmethods) on the five private attributes of target tuples Wemeasured the attack success guess rates based on the fre-quency of successful inference among all inference at-tempts Figure 1 shows the success guess rates of RankInference attacks on the unprotected testing bed the testingbed with PVSV and the testing bed with PVST As the sizeof each polymorphic value set is 2 the success guess rateson PVSV are around 50 which are significantly lowerthan those of unprotected dataset We also observe that thesuccess guess rates on PVSTare slightly lower than those of

PVSV )e reason is that PVSV-Constructor will beinserting values into tuple vrsquos polymorphic value sets untilall vrsquos polymorphic value sets cover at least 2 true values)us some polymorphic value sets of v may contain morethan 2 values

63 Utility In this subsection we quantify utility loss ofPVSV-Constructor and PVST-Constructor algorithms)e privacy guarantee of both PVSV-Constructor andPVST-Constructor is 12 ie the polymorphic value setsconstructed by PVSV-Constructor contains 2 values andthe polymorphic value sets constructed by PVST-Con-structor contains at least 2 true values )e key param-eters here are the size of query workload Q the size ofdatabase D the number of public and private attributesand the weight ratios in the ranking function We ran-domly generated 20 tuples as the query workload Bydefault we picked 10 tuples from Q and set |D| 300 000m 10 and mprime 10 )erefore we randomly picked 10attributes from the testing bed and set them as publicattributes )e rest of the 10 attributes were set as privateattributes

Many recommendation systems of ONS applicationsfeature top-k recommendation [1 14 47] where the rankedresult contains a set of k tuples that will be of interest to acertain user as it is impractical and unnecessary to return alltuples in the database to the user )erefore we introduceaverage top-k utility loss Uaul a variant of UQ that focuses onutility loss of the top-k tuples in a ranked result Uaul isdefined as

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv and forallBj

(1) k (1ε) minus 1(2) for v in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) end(6) end(7) for v in D do(8) while existj isin 1 mprime1113864 1113865 |P

Bj

v |lt k + 1 do(9) for vprime isin t isin D|forallAi t[Ai] v[Ai]1113864 1113865 do(10) compute covervprime lossvprime(11) scorevprime covervprime 11 + exp(minus lossvprime )

(12) end(13) vmax argmaxvprimeisin |tisinD|forallAit[Ai]v[Ai] scorevprime

(14) for j isin 1 mprime1113864 1113865 do

(15) PBj

v PBj

v cup tmax[Bj]1113966 1113967

(16) end(17) end(18) end

ALGORITHM 2 PVST-Constructor

Security and Communication Networks 9

Uaul 1

|Q|1113944

|Q|

i1

1Dprime

11138681113868111386811138681113868111386811138681113868

1113944

visinDprime

min Rank(v ∣ q) k + 11113864 1113865 minus min Rankprime(v ∣ q) k + 11113864 11138651113868111386811138681113868

1113868111386811138681113868

k (23)

where Dprime v isin D |Rank(v | q))lt k or Rank(v | q))lt k1113864 1113865Intuitively Uaul represents the average percentage rankdifference relative to k over all queries and all tuples in top-kUaul is equivalent to UQ|Q||D|2 when k |D| By default weset k 100

631 Evaluation of Uaul with Varying k We first discuss theaverage top-k utility loss of PVSV-Constructor PVST-Constructor and a baseline algorithm on varying k withother parameters set to default values For a tuple vrsquos at-tribute Bj the baseline algorithm constructs P

Bj

v with v[Bj]

and a value randomly picked from VBj v[Bj]1113966 1113967 )e results

are presented in Figure 2 which shows that the Uaul ofPVSV-Constructor is significantly lower than that of thebaseline algorithm )e Uaul of PVST-Constructor is alsolower than that of the baseline algorithm when kge 5 eventhough the baseline algorithm cannot preserve privacyagainst authenticity-knowledgeable adversaries )e exper-imental results show that both PVSV and PVST can reduceutility loss with respect to rank differences Also note thatPVST-Constructor constructs polymorphic value sets withtrue values and thus from Figure 3 we can see that somepolymorphic value sets constructed by PVST-Constructorcontains more than 2 values

632 Evaluation of Uaul with Varying Sizes of QFigure 4 presents the average top-k utility loss of PVSV-Constructor on varying |Q| When |Q| is set to 1 5 or 10 werandomly picked 1 5 or 10 queries from the original Qrespectively With increasing number of queries in the queryworkload the Uaul of PVSV-Constructor increases mono-tonically )e reason is that PVSV-Constructor alwaysgenerates the least frequent value (denoted as vprime[Bj]) in1113864q[Bj] | q isin Q1113865 for v[Bj] If |Q| is small then it is possiblethat vprime[Bj] notin 1113864q[Bj] | q isin Q1113865 and thus s(v ∣ q) sprime(v ∣ q)

forallq isin Q However a larger query workload covers moreprivate attributes values and thus it will be harder forPVSV-Constructor to generate a value for v[Bj] that has noimpact on the rank of v for any q isin Q Figure 5 presents theaverage top-k utility loss of PVST-Constructor on varying|Q| We can see that the size of Q has no significant impacton the Uaul of PVST-Constructor as the PVST-Constructoralways pick a tuple vprime that is different from v in most at-tributes and then insert vprime[Bj] into P

Bj

v

633 Evaluation of Uaul with Varying Sizes of the DatasetFigures 6 and 7 depict the impact of the size of datasets onUaul of PVSV-Constructor and PVST-Constructor Datasetsof 100000 and 200000 tuples were randomly sampled from

QPoint QIn QIPoint QIIn

Atta

ck su

cces

s gue

ss ra

te

Without protectionPVSVPVST

1

09

08

07

06

05

04

03

02

01

0

Figure 1 Attack success guess

10 Security and Communication Networks

the testing bed of 300000 tuples As expected |D| has nosignificant impact on the Uaul of PVSV-Constructor sincePVSV-Constructor generates values from the domain ofeach private attributes |D| has no impact on the Uaul ofPVST-Constructor which indicates that a dataset contain-ing 100000 tuples is sufficient for PVST-Constructor togenerate polymorphic value sets with true values

634 Evaluation of Uaul with Varying m We investigate theimpact of the number of private and public attributes on

average top-k utility loss Figure 8 presents theUaul of PVSV-Constructor with fixed mprime and varying m and with fixed mand varying mprime When mmprime is set to 5 we randomly re-moved 5 attributes from the testing bed When mmprime is set to

0 20 40 60 80 100k

RandomPVSVPVST

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 2 Utility loss vs k

1 2 3 4 5+Size of PVST

3

25

2

15

1

05

0

Coun

t 10

5

Figure 3 Size of polymorphic value set (PVST)

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

00 10 20

Figure 4 Utility loss vs |Q|

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

00 10 20

Figure 5 Utility loss vs |Q|

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 6 Utility loss vs |D|

Security and Communication Networks 11

15 we added 5 more categorical attributes randomly chosenfrom the unused attributes We observe that the Uaulmonotonically decreases with increasing number of publicattributes and monotonically increases with increasingnumber of private attributes As expected a higher pro-portion of public attributes leads to less variant betweens(v ∣ q) and sprime(v ∣ q) e results of the same experimentwith PVST-Constructor are shown in Figure 9 Uaul in-creases as increasing number of private attributes as ex-pected However Uaul also increases slightly with increasingnumber of public attributes is is due to the fact that withmore public attributes there will be few tuples that share thesame public attribute values Since PVST-Constructor in-serts only private attribute values from tuples sharing samepublic attribute values more values will be inserted to eachpolymorphic value set which introduces higher utility loss

635 Evaluation of Uaul with Varying Weight RatiosFigures 10 and 11 illustrate the impact of weight ratios on theaverage utility loss e experiment was conducted with a

xed private attribute weight of 1 and varying public at-tribute weights of 1 2 and 3 As expected Uaul of bothPVSV-Constructor and PVST-Constructor decreases as

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 10 Utility loss vs weight ratio

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 11 Utility loss vs weight ratio

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss1

09

08

07

06

05

04

03

02

01

0

Figure 7 Utility loss vs |D|

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 8 Utility loss vs m and mprime

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 9 Utility loss vs m and mprime

12 Security and Communication Networks

increasing weight ratio of public attributes)e reason is thatas the public attribute weight increases the part in sprime(v ∣ q)

caused by private attributes decreases )erefore less impactwould be made to sprime(v ∣ q) by PVSV and PVST As a resultsprime(v ∣ q) would be closer to s(v ∣ q) and the utility loss couldbe decreased

7 Conclusions

In this paper we proposed a novel framework that preservesprivacy of private attributes against arbitrary attacks throughthe ranked retrieval model Furthermore we identify twocategories of adversaries based on varying adversarial ca-pabilities For each kind of adversaries we presentedimplementation of our framework Our experimental resultssuggest that our implementations efficiently preserve privacyagainst Rank Inference attack [35] Moreover the imple-mentations significantly reduce utility loss with respect tothe variance in ranked results

It is our hope that this paper can motivate further re-search in privacy preservation of SIoT with consideration ofsocial network features andor variant information retrievalmodels eg text mining

Appendix

A 2-PVSV

In this subsection we prove that constructing an optimal PVSVfor each attribute of a tuple v is an NP-hard problem

Definition A1 For a tuple v in D we create PBj

v for each BjWe say v satisfies query q if the following hold for any tuple t(tne v) in D (1) If s(v ∣ q)lt s(t ∣ q) then sprime(v ∣ q)lt sprime(t ∣ q)and (2) if s(v ∣ q)ge s(t ∣ q) then sprime(v ∣ q)ge sprime(t ∣ q)

For a database containing 2 tuples the 2-PVSV problemcan be redefined as follows given a query workload Q adatabase D construct PB1

v PBmprime

v of size 2 such that v

satisfies the most queries in Q

Definition A2 Max-3Sat Problem given a 3-CNF formulaΦ find the truth assignment that satisfies that most clauses

Lemma A1 Max-3Sat leP 2-PVSV Problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVSV instance Without loss of generality we suppose thatΦ is a conjunction of l clauses and each clause is a dis-junction of 3 literals from set X x1 xn1113864 1113865 We constructdatabase D as follows D has 0 public attributes and n + 1private attributes B1 Bn C1 Let VB

j be the attribute do-main of Bj j isin 1 n and VC

1 be the attribute domain ofC1 Let VB

i minus 1 0 1 2 n + 2 and VC1 0 1 Two

tuples v1 and v2 are inserted into D v1[Bj] 0 and v2[Bj]

j + 2 for j isin 1 n while v1[C1] 0 and v2[C1] 1 Wesimplify the score function defined in (1) by setting all weightsto 1 Also note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null

We construct query workload Q based on Φ For eachclause Lk isin Φ we construct a query qk such that qk[Bi] i iffthe corresponding literal xi is a positive literal in Lk andq[Bi] i + 1 iff xi is a negative literal in the clause Forexample given a clause (x1orx2orx3) the correspondingquery q should satisfy q[B1] 1 q[B2] 3 q[B3] 3 andq[C1] 0 All other attributes in q are set to null by default)erefore s(v1 ∣ q) 1 and s(v2 ∣ q) 0 forallq isin Q

Since s(v2 ∣ q)lt s(v1 ∣ q) forallq isin Q in order to minimizeutility loss the value of sprime(v2 ∣ q) should be as small aspossible We observe that the minimum possible sprime(v2 ∣ q) is1 because PC1

v2must be 0 1 as VC

1 0 1 Without loss ofgenerality we assume that we have already constructedPB1

v2 P

Bmprime

v2 PC1v2

for v2 such that sprime(v2 ∣ q) 1 forallq isin QNow we have an instance of 2-PVSV problem that given

D Q constructs PB1v1

PBmprime

v1 PC1

v1that satisfies the most

queries in Q Since VC1 0 1 PC1

v1must be 0 1 too as

PC1v1

contains two distinct values )erefore for an ar-bitrary query q isin Q we have ρprime(q[C] v1[C]) 1 andρprime(q[C] v2[C]) 1 In order to let v1 satisfy q we have toensure that sprime(v1 ∣ q)gt sprime(v2 ∣ q) 1 )us we have

sprime v1 ∣ q( 1113857gt 1

⟺1113944mprime

i1ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857gt 1

⟺existi isin 1 mprime1113864 1113865 ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857 1

⟺existi isin 1 mprime1113864 1113865 q Bi1113858 1113859 isin PBi

v1

(A1)

Now we show that the solution of 2-PVSV problemconstructed above can answer the corresponding Max-3Sat Problem Suppose that we have the solution PB1

v1

PB

mprimev1 such that v1 satisfies the most queries in Q As in

(A1) if v1 satisfies qk we have qk[Bi] isin PBiv1 If v1 does not

satisfy qk then qk[Bi] notin PBiv1 Recall that we assign value i or

i + 1 to qk[Bi] if xi is a positive or negative literal in clauseLk respectively )erefore the assignment of xi given PBi

v1is

xi true if i isin PBi

v1

xi false if i + 1 isin PBi

v1

(A2)

Furthermore from equation (A1) we have

sprime v1 ∣ q( 1113857gt 1⟺L true (A3)

where L is qrsquos corresponding clause in Φ Note thati i + 1 nsubQ as |PBi

v1| 2 and 0 isin PBi

v1 )us the value of xi is

either true or falseAs we proved above v1 satisfies qi if and only if Li is true

given assignment x1 xl1113864 1113865 constructed according toequation (A2) )erefore x1 xl1113864 1113865 satisfies the mostclauses in ϕ if and only if v1 satisfies the most queries in Q

We now prove that function f can be conducted inpolynomial time Given a formula Φ with n variables and lclauses we construct a 2-PVSV instance with 2 tuples each

Security and Communication Networks 13

of which has n + 1 attributes and l queries each of which has4 attributes )erefore O(2n + 4l) assignments are neededand f can be conducted in polynomial time

Proof of 4eorem 1 We now prove that the 2-PVSVproblem is NP-hard In Lemma 3 we proved that Max-3SatProblem can be reduced to the 2-PVSV problem in poly-nomial time Furthermore as Max-3Sat Problem is a NP-hard problem [33] the 2-PVSV problem is NP-hard

B 2-PVST

In this section we prove that constructing an optimal PVSTfor each private attribute of a tuple v isin D is NP-hard We usethe definition of v satisfying q from (9))e 2-PVSTproblemcan be redefined as follows

Definition B1 2-PVSTproblem given a query workload Qa database D the optimization problem of 2-PVST is toconstruct an arbitrary tuple vrsquos polymorphic setsPB1

v PBmprime

v that satisfies the most queries in Q )e size ofeach polymorphic vale set is 2

Lemma B1 Max-3Sat le P 2-PVST problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVST instance We assume that Φ is a conjunction of lclauses and each clause is a disjunction of 3 literals from setX x1 xn1113864 1113865 Let D have no public attribute and n + 1private attributes B1 Bn C1 For each literal xi we inserttwo tuples vi and vi

prime into D where

vi C11113858 1113859 1 vi Bi1113858 1113859 1 vi Bj1113960 1113961 minus 1foralljne i

viprime C11113858 1113859 1 vi

prime Bi1113858 1113859 minus 1 viprime Bj1113960 1113961 1foralljne i

(B1)

)en we insert tuple v into D where v[C1] 0 andv[Bj] 0 forallj isin 1 mprime We also set the domain of eachBj as minus 1 0 1 and the domain of C1 as 0 1

Without loss of generality the score function defined in(1) is simplified by setting all weights to 1

Next we construct query workload Q For each clauseLk isin Φ we construct a query qk in which qk[Bj] minus 1 iff xj isa positive literal in Lk and qk[Bj] 1 iff xj is a negativeliteral in Lk We also set qk[C1] 1 )e rest of the attributevalues are set to null by default )erefore ifLk (x1orne x2orx3) then we have qk[C1] 1 qk[B1] 1qk[B2] 1 qk[B3] minus 1 and qk[Bj] null forj isin 4 mprime

Note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null )uss(v qk) 0 forallq isin Q and s(vi qk) s(vi

prime qk)ge 1 forallq isin Q Weobserve that for q isin Q and i isin 1 n s(v ∣ q)lt s(vi ∣ q)

and s(v ∣ q)lt s(viprime ∣ q) In order to reduce UQ we have to

maximize the number of q isin Q such that sprime(v ∣ q)lt sprime(vi ∣ q)

and sprime(v ∣ q)lt sprime(viprime ∣ q) )e maximum value of sprime(vi ∣ q)

and sprime(viprime ∣ q) is 4 which can be achieved by inserting vi[Bj]

into PBj

viprime and inserting vi

prime[Bj] into PBj

vi Since PC1

v 0 1 thevalue of sprime(v ∣ q) is at least 1 )erefore we have

sprime(v ∣ q)lt sprime vi ∣ q( 1113857

⟺sprime(v ∣ q)lt 4

⟺1113944

mprime

j1ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873lt 3

⟺existj isin 1 mprime ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 0

⟺existj isin 1 mprime v Bj1113960 1113961ne q Bj1113960 1113961

⟺existj isin 1 mprime minus q Bj1113960 1113961 isin PBj

v

(B2)

Since |PBj

v | is limited to 2 PBj

v 0 1 or 0 minus 1 ie wemust insert either virsquos or vi

primersquos private attribute values intoPB1

v PBmv Recall that we assign minus 1 or 1 to qk[Bj] if xj

is positive or negative respectively in Lk In order toreduce Max-3Sat to 2-PSVT we assign literals in the fol-lowing way

xj true if minus 1 isin PBj

v

xj false if 1 isin PBj

v (B3)

)erefore we denote the corresponding clause of q as Land from equation (B2) we have

existj isin 1 mprime1113864 1113865 minus q Bj1113960 1113961 isin PBj

v

⟺existj isin 1 mprime1113864 1113865 xj true if xj is positive in L

orxj false if xj is negative in L

⟺L true(B4)

From the above equation we observe that v satisfying qk

is equivalent to Lk being true)erefore we can reduceMax-3Sat to 2-PVST Given a solution to the 2-PVSTproblem thatmaximizes the number of queries satisfied by v assignmentx1 xn1113864 1113865 produced by equation (B3) can satisfy thelargest number of clauses in Φ

Given a Max-3Sat instance Φ the construction of the 2-PVST instance can be conducted in polynomial time as weconstruct n queries with 3 attributes and 2n + 1 tuples with nattributes)e transformation from the optimal solution of 2-PVST to the optimal solution of Max-3Sat also takes poly-nomial time as we assign the value to each one of x1 xn

once )erefore f is a polynomial time function

Proof of 4eorem 2 In 4 we proved that a Max-3Sat in-stance can be reduced to a 2-PSVT instance in polynomialtime Furthermore as Max-3Sat Problem is an NP-hardproblem the 2-PVSV problem is an NP-hard problem

Data Availability

)e data used to support the findings of this study areavailable from the corresponding author upon request

14 Security and Communication Networks

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 4: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

relationship between t and any other tuple (eg t has ahigher or lower rank than another tuple tprime) in a rankedresult )erefore given fixed D IA and QA Tt is fully de-termined by the attribute values of t

Another capability of the adversary we model is theadversaryrsquos prior knowledge Consider an extreme case wherethe adversary knows the equivalence relation between two at-tributes B1 and A1 In this case even without RD the adversaryis still able to infer the value of B1 of any t isin D In reality anadversary can acquire such attribute correlations by being orconsulting an expert in the domain of the dataset or by adoptingdata mining methods [42] For example based on the personalinformation (eg gender ethnicity age and blood type whichcan be used to infer the gene) stored as public attributes andpublished in public medical data repositories genetic epide-miologists can generally conclude that an individual does nothave some diseases merely based on the fact that these diseaseswould never be found by the candidate gene among historicmedical datasets )erefore an adversary with prior knowledgecould eliminate the possibility of a certain tuple in the databaseWemodel prior knowledge as a function PK(t ∣ D) that takes asinput t andD and returns either 0 or 1 PK(t ∣ D) 0 indicatesthat given prior knowledge the possibility of t isin D is zeroPK(t ∣ D) 1 indicates that the adversary cannot eliminate thepossibility of t isin D As prior knowledge helps adversaries inlaunching an inference attack adversaries can be partitionedinto two classes adversaries with prior knowledge and adver-saries without prior knowledge of the dataset

Definition 1 We name adversaries without prior knowledgeof the authenticity of any tuple as authenticity-ignorant ad-versaries For authenticity-ignorant adversaries PK(t ∣ D)

always outputs 1 We name adversaries with such priorknowledge as authenticity-knowledgeable adversaries Forauthenticity-knowledgeable adversaries PK(t ∣ D) 1 ift isin D and PK(t ∣ D) 0 if t notin D

)e objective of both classes of adversaries is to maxi-mize the following g(v Bj) value when inferring the value ofvictim tuple v in attribute Bj

g v Bj1113872 1113873 Pr v Bj1113960 1113961 a1113872 1113873 (2)

where Pr(v[Bj] a) is the probability of v[Bj] a and a isthe value inferred by the adversary given prior knowledgeand ranked results

For authenticity-ignorant adversaries PK(t ∣ D) alwaysoutputs 1 regardless of t and D We only assume the caseswhere users input true information to the databases)ereforefor authenticity-knowledgeable adversaries PK(t ∣ D) 1 ift isin D and PK(t ∣ D) 0 if t notin D In this paper we assume astrong adversary that can infer the private attribute values of anarbitrary tuple given the premise that the trace of the targettuple is unique )e premise can be easily met because as longas there is no duplicate tuple in the dataset the adversary canalways construct well-designed IA andQA such that the trace ofthe target tuple is different from the trace of any other tuple)erefore the adversary can always find a such thatPr(v[Bj] a) 100

23ProblemStatement Aprivacy breach can be described by asuccessful inference of a private attribute value in the databaseWe view privacy of v[Bj] as the upper bound on the possibilitythat an adversary succeeds in inferring the value of v[Bj] Notethat we do not make any assumption about the adversaryrsquosattacking method Our objective in this paper is to present aframework that sets an upper bound on the probability ofsuccessful inference of an arbitrary private attribute for tuplet isin D )erefore we define the objective of the framework as

forallt isin D j isin 1 mprime1113864 1113865 g v Bj1113872 1113873le ε (3)

We present the upper bound ϵ as our privacy guaranteeHowever a privacy-preserving framework should pro-

vide not only a privacy guarantee but also a notion ofutilitymdashafter all a framework that removes all private at-tribute values or replaces them with randomly generatedvalues can surely preserve privacy )erefore we use ameasurement based on the variance of ranked results beforeand after adopting our framework Given D and a set of allpossible queries denoted as Q we define the utility loss forour ranked retrieval model as follows

U 1113944tisinD

1113944qisinQ

Rank(t ∣ q) minus Rankprime(t ∣ q)1113868111386811138681113868

1113868111386811138681113868 (4)

where Rank(t ∣ q) and Rankprime(t ∣ q) refer to the ranks oftuple t in the ranked result given query q before and afterapplying our frameworks respectively

3 Privacy-Preserving Framework

)e only information an adversary can obtain from a da-tabase through the ranked retrieval model is public attributevalues and ranked results For an adversary without priorknowledge information regarding private attribute valuescan only be retrieved from ranked results )erefore inorder to preserve privacy we have to modify the rankedretrieval model such that the adversary cannot retrieve anyuseful information about private attributes from rankedresults

An idea is to group different tuples together in rankedresults As in our prior work [20] we can group two tuples v1and v2 together such that they share the same rank in anyranked result )is can be achieved by adopting a newranking function sprime(t ∣ q) such that sprime(v1 ∣ q) sprime(v2 ∣ q) forall q If v1 and v2 have different values on every privateattributes then the adversary is unable to infer the privatevalues of v1 since v1 and v2 are indistinguishable in anyranked results However this method suffers from highutility loss In order to preserve the privacy of all privateattributes v1 and v2 have to be different over all privateattributes)us the original scores of v1 and v2 ie s(v1 ∣ q)

and s(v2 ∣ q) differ a lot which leads to a higher variancebetween the rank of v1 before and after adopting thismethod

In this work we present a novel framework that pre-serves privacy of private attributes while minimizes theutility loss We observe that for a tuple vrsquos private attributev[Bj] if there are at least two potential values for v[Bj] and

4 Security and Communication Networks

an adversary cannot differentiate any one of them then theprivacy of v[Bj] can be preserved For instance if theprobability of v[Bj] a is equal to the probability ofv[Bj] aprime given ranked results and prior knowledge thenthe adversary cannot exclude any one of them If both theprobabilities are 50 then the adversary may choose torandomly pick a value from a and aprime as the inferred result Inthis case g(v Bj) will not exceed 50 and privacy of v[Bj]

can be preserved To prove this statement suppose that v isan arbitrary tuple in databaseD and we want to preserve theprivacy of v[Bj] Let β

Bj be an arbitrary value in VB

j v[Bj]1113966 1113967We construct tuple vprime such that vprime and v differ in only oneattribute Bj vprime[Bj] βB

j We also construct databaseDprime suchthat D and Dprime differ in only one tuple v isin D while vprime isin DprimeWe define a new score function sprime(t ∣ q)

sprime(t ∣ q) 1113944m

i1w

Ai ρ q Ai1113858 1113859 t Ai1113858 1113859( 1113857 + 1113944

mprime

j1w

Bi ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873

(5)

where

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 if tne v and tne vprime

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 Max ρ q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 vprime Bj1113960 11139611113872 11138731113872 1113873

(6)

if t v or t vprimeImagine a case where an adversary queriesD and Dprime with

the same query workload QA We denote the ranked resultsfrom D as RD and the ranked results from Dprime as RDprime As inthe new score function sprime(v ∣ q) sprime(vprime ∣ q) for forallq isin QA RD

is identical to RDprime )erefore given only ranked results theadversary cannot tell the difference between the two data-bases being queried Furthermore even if we exchange thevalues of v and vprime the adversary still cannot observe anychange inRD or RDprime As a result the value of v[Bj] and β

Bj are

equivalent from the perspective of the adversary and theprivacy of v[Bj] can be well preserved Intuitively v can beseen as a tuple that has two polymorphic forms in Bj v[Bj]

and βBj When calculating the score of v with sprime(t ∣ q) we

always choose the value that can maximize sprime(v ∣ q)We can further extend the statement to a more general

case For each tuple v isin D and each private attribute Bj wecan select e distinct values β1 β2 βe from VB

j v[Bj]1113966 1113967elt |VB

j | minus 1 )e new score function can be defined as

sprime(t ∣ q) 1113944m

i1w

Ai ρ q Ai1113858 1113859 t Ai1113858 1113859( 1113857 + 1113944

mprime

j1w

Bi ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873

(7)

whereρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 if tne v

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 Max ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 β11113872 11138731113872

ρ q Bj1113960 1113961 βe1113872 11138731113873 if t v

(8)

Consider e tuples v(1)prime v(e)prime which are identical to v

in all attributes except for Bj Let v(i)prime[Bj] be βi

foralli 1 e )en for forallq we have sprime(v ∣ q) sprime(v(i)prime ∣ q)As a result from the adversaryrsquos perspective there are e + 1potential values for v[Bj] v[Bj] β1 βe which are in-distinguishable from each other from any ranked results)eprivacy of v[Bj] can be preserved by grouping it with eequivalent values

As such we introduce the construction of the poly-morphic value set (PVS) We put v[Bj] into a set in which allvalues are indistinguishable when calculating the score withrespect to Bj in the ranking function ie ρprime(q[Bj] v[Bj])We name the above set as the polymorphic value set anddenote the polymorphic value set of tuple v in attribute Bj asP

Bj

v We define PBj

v as follows

Definition 2 PBj

v is a set containing all indistinguishablevalues of tuple vrsquos private attribute Bj Assigning v[Bj] withan arbitrary value in P

Bj

v will not change the value ofsprime(v ∣ q)forallq ie ρprime(q[Bj] v[Bj]) ρprime(q[Bj] β)forallβ isin P

Bj

v

and forallq

Since the adversary cannot distinguish different values inP

Bj

v by launching any inference attacks based only on rankedresults the privacy guarantee of v[Bj] is

ε 1

PBj

v

11138681113868111386811138681113868

11138681113868111386811138681113868

(9)

In order to achieve the privacy guarantee defined in (3)for each v isin D and each private attribute Bj we have toensure that (a) there is one and only one polymorphic valueset P

Bj

v for vrsquos attribute Bj and (b) the privacy guaranteedefined in (9) is always valid

4 Framework with Virtual Values

41 Design In this section we introduce how polymorphicvalue sets can be constructed with generated values to meetthe privacy guarantee in (9) against authenticity-ignorantadversaries We name values generated by our framework asvirtual values

As we proved in Privacy-Preserving Framework anauthenticity-ignorant adversary cannot distinguish the valueof v[Bj] from other valid values in P

Bj

v given ranked resultsSince an adversary without prior knowledge cannot validatethe authenticity of any value in P

Bj

v all values in PBj

v areldquovalidrdquo in the perspective of the adversary no matter if theyare generated by our framework or collected from real datain D )erefore we observe that P

Bj

v can be formed by anyvalues in VB

j In order to achieve the privacy guarantee of ε we have to

ensure that |PBj

v |gt (1ε) forallv isin D and forallj isin 1 mprimeAn intuitive algorithm to generate P

Bj

v of size 1ε is torandomly pick (1ε) minus 1 values from VB

j v[Bj] Specifically

let the initial PBj

v v[Bj]1113966 1113967 )en we can randomly pick

distinct values fromVBj v[Bj] and insert them intoP

Bj

v untilPBj

v

contains at least (1ε) distinct values In the same manner wecan construct a polymorphic value set for each tuplersquos eachprivate attribute

Security and Communication Networks 5

411 Privacy Guarantee For a databaseDwhere every tuplevrsquos every private attribute Bj is included by one polymorphicvalue set with virtual values (PVSV) whose size is at least l ifthe adversary has no prior knowledge of D a privacy level ofε (1l) is achieved

For an authenticity-ignorant adversary PK(v ∣ D) 1forallv As proved in Privacy-Preserving Framework for anauthenticity-ignorant adversary it is impossible to distin-guish v[Bj] with at least l minus 1 other values )us we haveg(v Bj)le (1l) for forallv isin D and forallj isin 1 mprime Accordingto equation (3) a privacy guarantee of (1l) can be achieved

42 Utility Optimization In this section we discuss how toreduce utility loss caused by polymorphic value sets Weintroduced a metric of utility loss in (4) that calculates thesum of difference in ranked results given all possible queriesTo practically calculate utility loss we limit the range ofqueries to a finite set named query workload In practice thequery workload of a database D can be a set of queries thatare more frequently issued than any other queries A queryworkload may contain duplicate queries which reflect thedistribution of frequent queries With a query workloaddenoted as Q we can define the practical utility loss as

UQ 1113944qisinQ

1113944visinD

Rank(v ∣ q) minus Rankprime(v ∣ q)1113868111386811138681113868

1113868111386811138681113868 (10)

In order to reduce utility loss we have to find assign-ments of P

Bj

v for forallv isin D and forallj isin 1 mprime such that theoverall UQ can be minimized Without loss of generality weonly consider constructing polymorphic value sets of size 2In this case the privacy guarantee is 12 For each v[Bj] weneed to find a value that is indistinguishable from v[Bj] Wedenote the polymorphic value of v[Bj] as vprime[Bj]

Definition 3 We define the 2-PVSV problem as follows givendatabase D and query workload Q find a polymorphic valuevprime[Bj] from VB

j v[Bj]1113966 1113967 for each v[Bj] forallv isin D andforallj isin 1 mprime such that UQ defined in (10) is minimized

Theorem 1 4e 2-PVSV problem is NP-hard

)e proof of )eorem 1 in detail can be found in Ap-pendix A

43 Heuristic Algorithm We have proved that the 2-PVSVproblem is an NP-hard problem that may not be solved inpolynomial time)erefore we propose PVSV-Constructora heuristic algorithm that can return an approximate so-lution in polynomial time

We observe that |s(v ∣ q) minus sprime(v ∣ q)| is relevant to|Rank(v ∣ q) minus Rankprime(v ∣ q)| A smaller difference betweens(v ∣ q) and sprime(v| q) leads to a smaller difference betweenRank(v ∣ q) and Rankprime(v ∣ q) )erefore |Rank(v ∣ q)minus

Rankprime(v ∣ q)| can be approximately minimized by a solutionthat minimizes |s(v ∣ q) minus sprime(v ∣ q)| As such we propose anapproximation of UQ that calculates the score differencebefore and after adopting our framework We denote thescore difference as US

Q and define USQ as follows

USQ 1113944

qisinQ1113944visinD

s(v ∣ q) minus sprime(v ∣ q)1113868111386811138681113868

1113868111386811138681113868 (11)

As shown in Algorithm 1 given input database D thenumber of public and private attributes m and mprime re-spectively query workload Q and privacy guarantee ϵPVSV-Constructor constructs an equivalent value set foreach v isin D and j isin 1 mprime that minimizes US

Q Recall thedefinition of sprime(t| q) in (7) and we have

USQ 1113944

visinD1113944

mprime

j11113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

(12)

We denote the score difference of v[Bj] contributed byP

Bj

v as U(PBj

v )

U PBj

v1113874 1113875 1113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

(13)

)erefore USQ is the sum of U(P

Bj

v ) over all tuples andprivate attributes

USQ 1113944

visinD1113944

mprime

j1U P

Bj

v1113874 1113875 (14)

Since the construction of each PBj

v is independent fromother polymorphic value sets we can minimize US

Q by mini-

mizing the score difference contributed by each PBj

v forforallj isin 1 mprime1113864 1113865 and forallv isin D Note that |ρprime(q[Bj] v[Bj]) minus

ρ(q[Bj] v[Bj])| 0 if q[Bj] v[Bj] or q[Bj] notin PBj

v Alsonote that |ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| 1 when

q[Bj]ne v[Bj] and q[Bj] isin PBj

v )erefore if forallq isin Q

v[Bj] q[Bj] then any assignment of PBj

v cannot contribute

to a higher U(PBj

v ) since ρprime(q[Bj] v[Bj]) 1 ρ(q[Bj]

v[Bj]) 1 and |ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| is always

zero In this situation U(PBj

v ) 0 On the contrary if existq isin Qv[Bj]ne q[Bj] then ρ(q[Bj] v[Bj]) 0 and thus|ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| ρprime(q[Bj] v[Bj])

According to equation (13) the value of U(PBj

v ) is

1113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

1113944

qisinQq Bj1113858 1113859nev Bj1113858 1113859

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873

(15)

In order to minimize U(PBj

v ) we have to find an as-signment of P

Bj

v which minimizes |1113864q isin Q ∣ q[Bj]ne

v[Bj]existβ isin PBj

v q[Bj] β1113865| Consider the simplest case

where we want to construct PBj

v of size 2 v[Bj] β1113966 1113967 We have

6 Security and Communication Networks

1113944

qisinQq Bj1113858 1113859nev Bj1113858 1113859

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 w

Bj q isin Q ∣ q Bj1113960 1113961 β1113966 111396711138681113868111386811138681113868

11138681113868111386811138681113868

(16)

In this case β has to be a value in VBj v[Bj] such that β

has the lowest frequency among all q[Bj] values for q isin QNote that β does not have to be an element of q[Bj] ∣1113966

q isin Q If β notin q[Bj] ∣ q isin Q1113966 1113967 its frequency is 0 Similarly if we

want to construct PBj

v of size k then we should insert the k minus 1least frequent values among all q[Bj] values into P

Bj

v In line 4 of Algorithm 1 we initialize the polymorphic

value set of v[Bj] by inserting v[Bj] into PBj

v For v[Bj] aset Set is constructed from VB

j v[Bj]1113966 1113967 which containsall values that can be inserted into P

Bj

v In order tominimize U(P

Bj

v ) we insert the k minus 1 least frequentvalues from Set into P

Bj

v by their frequencies inq[Bj] ∣ q isin Q1113966 1113967 )e above construction of P

Bj

v is repeatedfor each t isin D and j isin 1 mprime )e computationalcomplexity of Algorithm 1 is O(|D| middot |Q| middot mprime)

5 Framework with True Values

51 Authenticity-Knowledgeable Adversaries As we men-tioned in the adversary model an authenticity-knowl-edgeable adversary is able to tell if t isin D is possible As aresult the authenticity-knowledgeable adversary can launcha more efficient attack on private attributes by examining theauthenticity of values learned from RA We show a simplecase where an authenticity-knowledgeable adversary breaksthe privacy guarantee provided by polymorphic values setsconstructed with virtual values Consider that the objectiveof the adversary is to infer the value of v[B0] and B0 is theonly private attribute in D PB0

v is the polymorphic value setgenerated for v[B0] and |PB0

v | 1ε Without prior knowl-edge Pv[B0] ε and the privacy guarantee is achievedHowever for a value β isin PB0

v v[B0]1113864 1113865 if the adversary canconclude that PK(vprime ∣ D) 0 for any tuple vprime such that vprime isequal to v in all public attributes and vprime[B0] β then theadversary can exclude β from PB0

v )erefore

g v B01113858 1113859( 1113857 ε

1 minus εgt ε (17)

and equation (9) no longer holdsAs shown above if values in PB0

v are marked by anadversary as invalid for v given PK(vprime|D) then the adversarycan successfully break the privacy guarantee defined inframework with virtual values

52 Design In this section we propose polymorphic valuesets with true values (PVST) that construct polymorphicvalue sets with values that cannot be excluded by PK(t ∣ D)PVST considers nontrivial prior knowledge of adversariesand presents the same degree of privacy guarantee in-troduced in (9)

We have shown above that the implementation withvirtual values can be compromised by adversaries with priorknowledge Consider a tuple v isin D Given privacy guaranteeϵ we construct mprime polymorphic values sets PB0

v PB

mprimev for

each private attributes of v where |PBj

v |ge 1ε Let set MAv be

MAv v A01113858 11138591113864 1113865 times middot middot middot times v Am1113858 11138591113864 1113865 times P

B0v times middot middot middot times P

Bmprimev (18)

MAv is the Cartesian product of sets each of which

contains vrsquos all equivalent values in an attribute For publicattribute Ai the corresponding set is v[Ai]1113864 1113865 since publicattribute values are open to the adversary For private at-tribute Bj the corresponding set is P

Bj

v )erefore MAv

contains all possible tuples that are indistinguishable with v

with respect to RA (including v itself )With prior knowledge on PK(t ∣ D) an adversary is able

to exclude a value β from PBj

v if

forallt isin t ∣ t isinMAv t Bj1113960 1113961 β1113966 1113967PK(t ∣ D) 0 (19)

As described above it is safe for the adversary to con-clude that βne v[Bj] if there is no t isinMA

v such that t[Bj] βand PK(t ∣ D) 1 Alternatively if for every β in P

Bj

v thereis a t such that t[Bj] β and PK(t ∣ D) 1 then the ad-

versary cannot exclude any value in PBj

v and thereforePv[Bj]le ε is guaranteed Since PK(t ∣ D) 1 iff t isin D weconstruct polymorphic value sets with true values fromt isin D

Definition 4 If there exists l distinct values β1 βl isin PBj

v

such that forallr isin 1 l βr holds the following property

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv isin D and forallj isin 1 mprime1113864 1113865

(1) k (1ε) minus 1(2) for t in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) Set VBj v[Bj]1113966 1113967

(6) Sort elements of Set by their frequencies in q[Bj]| q isin Q1113966 1113967 in ascending order(7) Remove the last |Set| minus k + 1 elements in Set(8) P

Bj

v PBj

v cup Set(9) end(10) end

ALGORITHM 1 PVSV-Constructor

Security and Communication Networks 7

existt isinMAv t Bj1113960 1113961 βr and t isin D (20)

)en we say that PBj

v covers l true values We denoteT

Bj

v β1 βl1113864 1113865 as the true value set of t in BjPrivacy guarantee for a database D where every tuplersquos

every private attribute is included by one equivalent value setwhich covers at least l true values a privacy level of ε 1l isachieved

Assume that the adversaryrsquos objective is to infer the valueof v[Bj] As mentioned in the adversary model we make noassumption on the attacking methods adopted by an ad-versary Consider MA

v defined in (18) forallt tprime isinMAv and

forallq isin QA s(t ∣ q) s(tprime ∣ q) )erefore the adversary cannotdistinguish tuples in MA

v by observing RA In this situationthe adversary would use PK(t ∣ D) to exclude all tuple t inMA

v such that PK(t ∣ D) 0 However as PBj

v covers l truevalues there exists l tuples t1 tl isinMA

v such thatPK(t ∣ D) 1 and tr[Bj]ne ts[Bj] for arbitraryr s isin 1 l As a result values in t1[Bj] tl[Bj]1113966 1113967 areindistinguishable given RA and PK(t ∣ D) )us Pv[Bj] 1lAccording to (3) a privacy level of 1l is achieved

53 Utility Optimization An intuitive method of con-structing P

Bj

v is to insert the value of Bj of all tuples that sharethe same public attribute values with v into P

Bj

v ie PBj

v

PBj

v 1113864vprime[Bj]|foralli and vprime isin D vprime[Ai] v[Ai]1113865 )e privacyguarantee is met if P

Bj

v covers at least 1ε true valuesNevertheless utility loss cannot be ignored as in the intuitivemethod s(vprime| q) would be the same for all vprime isin 1113864vprime| foralliand vprime isin D vprime[Ai] v[Ai]1113865 and thus information of privateattributes are missing in the ranked result of vprime )ereforethe size of P

Bj

v is critical in balancing privacy and utilityWith no loss of generality we limit the size of each poly-morphic value set to 2 We show that constructing suchpolymorphic value sets is an NP-hard problem

Definition 5 We define the 2-PVST problem as followsgiven database D and a query workload Q construct P

Bj

v ofsize 2 for each tuple v isin D forallj isin 1 mprime1113864 1113865 and minimizeUQ defined in (10)

Theorem 2 4e 2-PVST problem is NP-hard

)e proof of )eorem 2 can be found in Appendix B

54 Heuristic Algorithm We have shown that the 2-PVSTproblem is NP-hard In this subsection we present PVST-Constructor a heuristic algorithm that constructs PVSTwithin polynomial time PVST-Constructor tries to mini-mize |P

Bj

v | and 1113936qisinQ|sprime(v ∣ q) minus s(v ∣ q)| for each v isin D

and j isin 1 mprime with the greedy algorithm)e pseudo-code of PVST-Constructor is shown in

Algorithm 2 In lines 2 to 6 we initialize each PBj

v withv[Bj] )en for each v isin D PVST-Constructor constructsPB1

v PBmprime

v by finding a vprime with the greedy algorithm andinserting vprime[Bj] into P

Bj

v forallj isin 1 mprime)e above process

will be taken multiple times until |PBj

v |ge 1ε Since every vprime isa real tuple existing in D and we insert at least k tuples in theabove processes P

Bj

v covers at least k + 1 true values andthus the privacy guarantee is met As mentioned in UtilityOptimization the size of P

Bj

v is critical in minimizing utilityloss )erefore the heuristic algorithm tries to minimize theutility loss by minimizing the size of each P

Bj

v We count thenumber of polymorphic value sets of v if the set contains lessthan k values that can be enlarged by inserting vprimersquos privateattribute values We denote this count as covervprime

covervprime j ∣ PBj

v

11138681113868111386811138681113868

11138681113868111386811138681113868lt k vprime Bj1113960 1113961 notin PBj

v1113882 1113883

1113868111386811138681113868111386811138681113868

1113868111386811138681113868111386811138681113868 (21)

Tuple vprime with a higher covervprime value can enlarge the sizeof more polymorphic value sets of v and thus we can reducethe number of tuples that we have to insert into PB1

v PBmprime

v Furthermore we take into consideration queries in Q In

order to minimize UQ we have to minimize |sprime(v ∣ q) minus

s(v ∣ q)| for q isin Q Note that for each attribute Bj|ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| 1 if v[Bj]ne q[Bj] andq[Bj] isin P

Bj

v We denote the value of 1113936j|ρprime(q[Bj] v[Bj]) minus

ρ(q[Bj] v[Bj])| as lossvprime )us we have

lossvprime 1113944qisinQ

1113944

j1mprime

ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

1113944qisinQ

1113944

j1mprime

δ v vprime q j( 1113857

(22)where δ(v vprime q j) 1 if v[Bj]ne q[Bj] and vprime[Bj] q[Bj]

and δ(v vprime q j) 0 otherwise )erefore tuple vprime with asmaller lossvprime can reduce the value of 1113936j|ρprime(q[Bj]

v[Bj]) minus ρ(q[Bj] v[Bj])| and thus we can reduce the valueof UQ

)e computation of covervprime and lossvprime is done in line 11)en we compute scorevprime the score of vprime that indicates howpreferable vprime is relative to other tuples in D v Weadopt the greedy algorithm to find the next vprime for PB1

v

PB

mprimev ie in each iteration we choose the tuple

that has the highest score value In line 15 we insert theprivate attribute values of the chosen tuple (denoted as vmax)into PB1

v PB

mprimev )e above process is repeated until

the sizes of PB1v P

Bmprime

v are no less than k

6 Experimental Results

61 Experimental Setup To validate PVSV-Constructor andPVST-Constructor algorithms we conducted experiments on areal world dataset [29] from eHarmony which contains 58attributes and 486464 tuples We removed 5 noncategoricalattributes and randomly picked 20 categorical attributes fromthe remaining 53 attributes )e domain sizes of the 20 at-tributes range from 2 to 15 After removing duplicate tuples werandomly picked 300000 tuples as our testing bed

By default we use the ranking function from the rankedretrieval model with all weights set to 1 All experimentalresults were obtained on a Mac machine running Mac OS

8 Security and Communication Networks

with 8GB of RAM )e algorithms were implemented inPython

62 Privacy )e privacy guarantee of PVSV-Constructorand PVST-Constructor were tested by performing RankedInference attack [35] including Point-Query In-QueryPoint-QueryampInsert and In-QueryampInsert attackingmethods on the dataset From a total of 20 attributes 5attributes were randomly chosen as public attributes andanother 5 attributes were randomly chosen as private at-tributes We randomly picked 20000 distinct tuples fromthe dataset as the testing bed and randomly generated 10tuples as the query workload For PVSV-Constructor weconstructed a polymorphic value set of size 2 for eachprivate attribute and each tuple in the testing bed ForPVST-Constructor we constructed a polymorphic valueset that covers at least two true values for each privateattribute and each tuple We randomly picked 1000 tuplesfrom the testing bed as our targets and performed 1000Rank Inference attacks (250 attacks for each of the fourmethods) on the five private attributes of target tuples Wemeasured the attack success guess rates based on the fre-quency of successful inference among all inference at-tempts Figure 1 shows the success guess rates of RankInference attacks on the unprotected testing bed the testingbed with PVSV and the testing bed with PVST As the sizeof each polymorphic value set is 2 the success guess rateson PVSV are around 50 which are significantly lowerthan those of unprotected dataset We also observe that thesuccess guess rates on PVSTare slightly lower than those of

PVSV )e reason is that PVSV-Constructor will beinserting values into tuple vrsquos polymorphic value sets untilall vrsquos polymorphic value sets cover at least 2 true values)us some polymorphic value sets of v may contain morethan 2 values

63 Utility In this subsection we quantify utility loss ofPVSV-Constructor and PVST-Constructor algorithms)e privacy guarantee of both PVSV-Constructor andPVST-Constructor is 12 ie the polymorphic value setsconstructed by PVSV-Constructor contains 2 values andthe polymorphic value sets constructed by PVST-Con-structor contains at least 2 true values )e key param-eters here are the size of query workload Q the size ofdatabase D the number of public and private attributesand the weight ratios in the ranking function We ran-domly generated 20 tuples as the query workload Bydefault we picked 10 tuples from Q and set |D| 300 000m 10 and mprime 10 )erefore we randomly picked 10attributes from the testing bed and set them as publicattributes )e rest of the 10 attributes were set as privateattributes

Many recommendation systems of ONS applicationsfeature top-k recommendation [1 14 47] where the rankedresult contains a set of k tuples that will be of interest to acertain user as it is impractical and unnecessary to return alltuples in the database to the user )erefore we introduceaverage top-k utility loss Uaul a variant of UQ that focuses onutility loss of the top-k tuples in a ranked result Uaul isdefined as

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv and forallBj

(1) k (1ε) minus 1(2) for v in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) end(6) end(7) for v in D do(8) while existj isin 1 mprime1113864 1113865 |P

Bj

v |lt k + 1 do(9) for vprime isin t isin D|forallAi t[Ai] v[Ai]1113864 1113865 do(10) compute covervprime lossvprime(11) scorevprime covervprime 11 + exp(minus lossvprime )

(12) end(13) vmax argmaxvprimeisin |tisinD|forallAit[Ai]v[Ai] scorevprime

(14) for j isin 1 mprime1113864 1113865 do

(15) PBj

v PBj

v cup tmax[Bj]1113966 1113967

(16) end(17) end(18) end

ALGORITHM 2 PVST-Constructor

Security and Communication Networks 9

Uaul 1

|Q|1113944

|Q|

i1

1Dprime

11138681113868111386811138681113868111386811138681113868

1113944

visinDprime

min Rank(v ∣ q) k + 11113864 1113865 minus min Rankprime(v ∣ q) k + 11113864 11138651113868111386811138681113868

1113868111386811138681113868

k (23)

where Dprime v isin D |Rank(v | q))lt k or Rank(v | q))lt k1113864 1113865Intuitively Uaul represents the average percentage rankdifference relative to k over all queries and all tuples in top-kUaul is equivalent to UQ|Q||D|2 when k |D| By default weset k 100

631 Evaluation of Uaul with Varying k We first discuss theaverage top-k utility loss of PVSV-Constructor PVST-Constructor and a baseline algorithm on varying k withother parameters set to default values For a tuple vrsquos at-tribute Bj the baseline algorithm constructs P

Bj

v with v[Bj]

and a value randomly picked from VBj v[Bj]1113966 1113967 )e results

are presented in Figure 2 which shows that the Uaul ofPVSV-Constructor is significantly lower than that of thebaseline algorithm )e Uaul of PVST-Constructor is alsolower than that of the baseline algorithm when kge 5 eventhough the baseline algorithm cannot preserve privacyagainst authenticity-knowledgeable adversaries )e exper-imental results show that both PVSV and PVST can reduceutility loss with respect to rank differences Also note thatPVST-Constructor constructs polymorphic value sets withtrue values and thus from Figure 3 we can see that somepolymorphic value sets constructed by PVST-Constructorcontains more than 2 values

632 Evaluation of Uaul with Varying Sizes of QFigure 4 presents the average top-k utility loss of PVSV-Constructor on varying |Q| When |Q| is set to 1 5 or 10 werandomly picked 1 5 or 10 queries from the original Qrespectively With increasing number of queries in the queryworkload the Uaul of PVSV-Constructor increases mono-tonically )e reason is that PVSV-Constructor alwaysgenerates the least frequent value (denoted as vprime[Bj]) in1113864q[Bj] | q isin Q1113865 for v[Bj] If |Q| is small then it is possiblethat vprime[Bj] notin 1113864q[Bj] | q isin Q1113865 and thus s(v ∣ q) sprime(v ∣ q)

forallq isin Q However a larger query workload covers moreprivate attributes values and thus it will be harder forPVSV-Constructor to generate a value for v[Bj] that has noimpact on the rank of v for any q isin Q Figure 5 presents theaverage top-k utility loss of PVST-Constructor on varying|Q| We can see that the size of Q has no significant impacton the Uaul of PVST-Constructor as the PVST-Constructoralways pick a tuple vprime that is different from v in most at-tributes and then insert vprime[Bj] into P

Bj

v

633 Evaluation of Uaul with Varying Sizes of the DatasetFigures 6 and 7 depict the impact of the size of datasets onUaul of PVSV-Constructor and PVST-Constructor Datasetsof 100000 and 200000 tuples were randomly sampled from

QPoint QIn QIPoint QIIn

Atta

ck su

cces

s gue

ss ra

te

Without protectionPVSVPVST

1

09

08

07

06

05

04

03

02

01

0

Figure 1 Attack success guess

10 Security and Communication Networks

the testing bed of 300000 tuples As expected |D| has nosignificant impact on the Uaul of PVSV-Constructor sincePVSV-Constructor generates values from the domain ofeach private attributes |D| has no impact on the Uaul ofPVST-Constructor which indicates that a dataset contain-ing 100000 tuples is sufficient for PVST-Constructor togenerate polymorphic value sets with true values

634 Evaluation of Uaul with Varying m We investigate theimpact of the number of private and public attributes on

average top-k utility loss Figure 8 presents theUaul of PVSV-Constructor with fixed mprime and varying m and with fixed mand varying mprime When mmprime is set to 5 we randomly re-moved 5 attributes from the testing bed When mmprime is set to

0 20 40 60 80 100k

RandomPVSVPVST

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 2 Utility loss vs k

1 2 3 4 5+Size of PVST

3

25

2

15

1

05

0

Coun

t 10

5

Figure 3 Size of polymorphic value set (PVST)

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

00 10 20

Figure 4 Utility loss vs |Q|

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

00 10 20

Figure 5 Utility loss vs |Q|

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 6 Utility loss vs |D|

Security and Communication Networks 11

15 we added 5 more categorical attributes randomly chosenfrom the unused attributes We observe that the Uaulmonotonically decreases with increasing number of publicattributes and monotonically increases with increasingnumber of private attributes As expected a higher pro-portion of public attributes leads to less variant betweens(v ∣ q) and sprime(v ∣ q) e results of the same experimentwith PVST-Constructor are shown in Figure 9 Uaul in-creases as increasing number of private attributes as ex-pected However Uaul also increases slightly with increasingnumber of public attributes is is due to the fact that withmore public attributes there will be few tuples that share thesame public attribute values Since PVST-Constructor in-serts only private attribute values from tuples sharing samepublic attribute values more values will be inserted to eachpolymorphic value set which introduces higher utility loss

635 Evaluation of Uaul with Varying Weight RatiosFigures 10 and 11 illustrate the impact of weight ratios on theaverage utility loss e experiment was conducted with a

xed private attribute weight of 1 and varying public at-tribute weights of 1 2 and 3 As expected Uaul of bothPVSV-Constructor and PVST-Constructor decreases as

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 10 Utility loss vs weight ratio

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 11 Utility loss vs weight ratio

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss1

09

08

07

06

05

04

03

02

01

0

Figure 7 Utility loss vs |D|

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 8 Utility loss vs m and mprime

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 9 Utility loss vs m and mprime

12 Security and Communication Networks

increasing weight ratio of public attributes)e reason is thatas the public attribute weight increases the part in sprime(v ∣ q)

caused by private attributes decreases )erefore less impactwould be made to sprime(v ∣ q) by PVSV and PVST As a resultsprime(v ∣ q) would be closer to s(v ∣ q) and the utility loss couldbe decreased

7 Conclusions

In this paper we proposed a novel framework that preservesprivacy of private attributes against arbitrary attacks throughthe ranked retrieval model Furthermore we identify twocategories of adversaries based on varying adversarial ca-pabilities For each kind of adversaries we presentedimplementation of our framework Our experimental resultssuggest that our implementations efficiently preserve privacyagainst Rank Inference attack [35] Moreover the imple-mentations significantly reduce utility loss with respect tothe variance in ranked results

It is our hope that this paper can motivate further re-search in privacy preservation of SIoT with consideration ofsocial network features andor variant information retrievalmodels eg text mining

Appendix

A 2-PVSV

In this subsection we prove that constructing an optimal PVSVfor each attribute of a tuple v is an NP-hard problem

Definition A1 For a tuple v in D we create PBj

v for each BjWe say v satisfies query q if the following hold for any tuple t(tne v) in D (1) If s(v ∣ q)lt s(t ∣ q) then sprime(v ∣ q)lt sprime(t ∣ q)and (2) if s(v ∣ q)ge s(t ∣ q) then sprime(v ∣ q)ge sprime(t ∣ q)

For a database containing 2 tuples the 2-PVSV problemcan be redefined as follows given a query workload Q adatabase D construct PB1

v PBmprime

v of size 2 such that v

satisfies the most queries in Q

Definition A2 Max-3Sat Problem given a 3-CNF formulaΦ find the truth assignment that satisfies that most clauses

Lemma A1 Max-3Sat leP 2-PVSV Problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVSV instance Without loss of generality we suppose thatΦ is a conjunction of l clauses and each clause is a dis-junction of 3 literals from set X x1 xn1113864 1113865 We constructdatabase D as follows D has 0 public attributes and n + 1private attributes B1 Bn C1 Let VB

j be the attribute do-main of Bj j isin 1 n and VC

1 be the attribute domain ofC1 Let VB

i minus 1 0 1 2 n + 2 and VC1 0 1 Two

tuples v1 and v2 are inserted into D v1[Bj] 0 and v2[Bj]

j + 2 for j isin 1 n while v1[C1] 0 and v2[C1] 1 Wesimplify the score function defined in (1) by setting all weightsto 1 Also note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null

We construct query workload Q based on Φ For eachclause Lk isin Φ we construct a query qk such that qk[Bi] i iffthe corresponding literal xi is a positive literal in Lk andq[Bi] i + 1 iff xi is a negative literal in the clause Forexample given a clause (x1orx2orx3) the correspondingquery q should satisfy q[B1] 1 q[B2] 3 q[B3] 3 andq[C1] 0 All other attributes in q are set to null by default)erefore s(v1 ∣ q) 1 and s(v2 ∣ q) 0 forallq isin Q

Since s(v2 ∣ q)lt s(v1 ∣ q) forallq isin Q in order to minimizeutility loss the value of sprime(v2 ∣ q) should be as small aspossible We observe that the minimum possible sprime(v2 ∣ q) is1 because PC1

v2must be 0 1 as VC

1 0 1 Without loss ofgenerality we assume that we have already constructedPB1

v2 P

Bmprime

v2 PC1v2

for v2 such that sprime(v2 ∣ q) 1 forallq isin QNow we have an instance of 2-PVSV problem that given

D Q constructs PB1v1

PBmprime

v1 PC1

v1that satisfies the most

queries in Q Since VC1 0 1 PC1

v1must be 0 1 too as

PC1v1

contains two distinct values )erefore for an ar-bitrary query q isin Q we have ρprime(q[C] v1[C]) 1 andρprime(q[C] v2[C]) 1 In order to let v1 satisfy q we have toensure that sprime(v1 ∣ q)gt sprime(v2 ∣ q) 1 )us we have

sprime v1 ∣ q( 1113857gt 1

⟺1113944mprime

i1ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857gt 1

⟺existi isin 1 mprime1113864 1113865 ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857 1

⟺existi isin 1 mprime1113864 1113865 q Bi1113858 1113859 isin PBi

v1

(A1)

Now we show that the solution of 2-PVSV problemconstructed above can answer the corresponding Max-3Sat Problem Suppose that we have the solution PB1

v1

PB

mprimev1 such that v1 satisfies the most queries in Q As in

(A1) if v1 satisfies qk we have qk[Bi] isin PBiv1 If v1 does not

satisfy qk then qk[Bi] notin PBiv1 Recall that we assign value i or

i + 1 to qk[Bi] if xi is a positive or negative literal in clauseLk respectively )erefore the assignment of xi given PBi

v1is

xi true if i isin PBi

v1

xi false if i + 1 isin PBi

v1

(A2)

Furthermore from equation (A1) we have

sprime v1 ∣ q( 1113857gt 1⟺L true (A3)

where L is qrsquos corresponding clause in Φ Note thati i + 1 nsubQ as |PBi

v1| 2 and 0 isin PBi

v1 )us the value of xi is

either true or falseAs we proved above v1 satisfies qi if and only if Li is true

given assignment x1 xl1113864 1113865 constructed according toequation (A2) )erefore x1 xl1113864 1113865 satisfies the mostclauses in ϕ if and only if v1 satisfies the most queries in Q

We now prove that function f can be conducted inpolynomial time Given a formula Φ with n variables and lclauses we construct a 2-PVSV instance with 2 tuples each

Security and Communication Networks 13

of which has n + 1 attributes and l queries each of which has4 attributes )erefore O(2n + 4l) assignments are neededand f can be conducted in polynomial time

Proof of 4eorem 1 We now prove that the 2-PVSVproblem is NP-hard In Lemma 3 we proved that Max-3SatProblem can be reduced to the 2-PVSV problem in poly-nomial time Furthermore as Max-3Sat Problem is a NP-hard problem [33] the 2-PVSV problem is NP-hard

B 2-PVST

In this section we prove that constructing an optimal PVSTfor each private attribute of a tuple v isin D is NP-hard We usethe definition of v satisfying q from (9))e 2-PVSTproblemcan be redefined as follows

Definition B1 2-PVSTproblem given a query workload Qa database D the optimization problem of 2-PVST is toconstruct an arbitrary tuple vrsquos polymorphic setsPB1

v PBmprime

v that satisfies the most queries in Q )e size ofeach polymorphic vale set is 2

Lemma B1 Max-3Sat le P 2-PVST problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVST instance We assume that Φ is a conjunction of lclauses and each clause is a disjunction of 3 literals from setX x1 xn1113864 1113865 Let D have no public attribute and n + 1private attributes B1 Bn C1 For each literal xi we inserttwo tuples vi and vi

prime into D where

vi C11113858 1113859 1 vi Bi1113858 1113859 1 vi Bj1113960 1113961 minus 1foralljne i

viprime C11113858 1113859 1 vi

prime Bi1113858 1113859 minus 1 viprime Bj1113960 1113961 1foralljne i

(B1)

)en we insert tuple v into D where v[C1] 0 andv[Bj] 0 forallj isin 1 mprime We also set the domain of eachBj as minus 1 0 1 and the domain of C1 as 0 1

Without loss of generality the score function defined in(1) is simplified by setting all weights to 1

Next we construct query workload Q For each clauseLk isin Φ we construct a query qk in which qk[Bj] minus 1 iff xj isa positive literal in Lk and qk[Bj] 1 iff xj is a negativeliteral in Lk We also set qk[C1] 1 )e rest of the attributevalues are set to null by default )erefore ifLk (x1orne x2orx3) then we have qk[C1] 1 qk[B1] 1qk[B2] 1 qk[B3] minus 1 and qk[Bj] null forj isin 4 mprime

Note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null )uss(v qk) 0 forallq isin Q and s(vi qk) s(vi

prime qk)ge 1 forallq isin Q Weobserve that for q isin Q and i isin 1 n s(v ∣ q)lt s(vi ∣ q)

and s(v ∣ q)lt s(viprime ∣ q) In order to reduce UQ we have to

maximize the number of q isin Q such that sprime(v ∣ q)lt sprime(vi ∣ q)

and sprime(v ∣ q)lt sprime(viprime ∣ q) )e maximum value of sprime(vi ∣ q)

and sprime(viprime ∣ q) is 4 which can be achieved by inserting vi[Bj]

into PBj

viprime and inserting vi

prime[Bj] into PBj

vi Since PC1

v 0 1 thevalue of sprime(v ∣ q) is at least 1 )erefore we have

sprime(v ∣ q)lt sprime vi ∣ q( 1113857

⟺sprime(v ∣ q)lt 4

⟺1113944

mprime

j1ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873lt 3

⟺existj isin 1 mprime ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 0

⟺existj isin 1 mprime v Bj1113960 1113961ne q Bj1113960 1113961

⟺existj isin 1 mprime minus q Bj1113960 1113961 isin PBj

v

(B2)

Since |PBj

v | is limited to 2 PBj

v 0 1 or 0 minus 1 ie wemust insert either virsquos or vi

primersquos private attribute values intoPB1

v PBmv Recall that we assign minus 1 or 1 to qk[Bj] if xj

is positive or negative respectively in Lk In order toreduce Max-3Sat to 2-PSVT we assign literals in the fol-lowing way

xj true if minus 1 isin PBj

v

xj false if 1 isin PBj

v (B3)

)erefore we denote the corresponding clause of q as Land from equation (B2) we have

existj isin 1 mprime1113864 1113865 minus q Bj1113960 1113961 isin PBj

v

⟺existj isin 1 mprime1113864 1113865 xj true if xj is positive in L

orxj false if xj is negative in L

⟺L true(B4)

From the above equation we observe that v satisfying qk

is equivalent to Lk being true)erefore we can reduceMax-3Sat to 2-PVST Given a solution to the 2-PVSTproblem thatmaximizes the number of queries satisfied by v assignmentx1 xn1113864 1113865 produced by equation (B3) can satisfy thelargest number of clauses in Φ

Given a Max-3Sat instance Φ the construction of the 2-PVST instance can be conducted in polynomial time as weconstruct n queries with 3 attributes and 2n + 1 tuples with nattributes)e transformation from the optimal solution of 2-PVST to the optimal solution of Max-3Sat also takes poly-nomial time as we assign the value to each one of x1 xn

once )erefore f is a polynomial time function

Proof of 4eorem 2 In 4 we proved that a Max-3Sat in-stance can be reduced to a 2-PSVT instance in polynomialtime Furthermore as Max-3Sat Problem is an NP-hardproblem the 2-PVSV problem is an NP-hard problem

Data Availability

)e data used to support the findings of this study areavailable from the corresponding author upon request

14 Security and Communication Networks

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 5: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

an adversary cannot differentiate any one of them then theprivacy of v[Bj] can be preserved For instance if theprobability of v[Bj] a is equal to the probability ofv[Bj] aprime given ranked results and prior knowledge thenthe adversary cannot exclude any one of them If both theprobabilities are 50 then the adversary may choose torandomly pick a value from a and aprime as the inferred result Inthis case g(v Bj) will not exceed 50 and privacy of v[Bj]

can be preserved To prove this statement suppose that v isan arbitrary tuple in databaseD and we want to preserve theprivacy of v[Bj] Let β

Bj be an arbitrary value in VB

j v[Bj]1113966 1113967We construct tuple vprime such that vprime and v differ in only oneattribute Bj vprime[Bj] βB

j We also construct databaseDprime suchthat D and Dprime differ in only one tuple v isin D while vprime isin DprimeWe define a new score function sprime(t ∣ q)

sprime(t ∣ q) 1113944m

i1w

Ai ρ q Ai1113858 1113859 t Ai1113858 1113859( 1113857 + 1113944

mprime

j1w

Bi ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873

(5)

where

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 if tne v and tne vprime

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 Max ρ q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 vprime Bj1113960 11139611113872 11138731113872 1113873

(6)

if t v or t vprimeImagine a case where an adversary queriesD and Dprime with

the same query workload QA We denote the ranked resultsfrom D as RD and the ranked results from Dprime as RDprime As inthe new score function sprime(v ∣ q) sprime(vprime ∣ q) for forallq isin QA RD

is identical to RDprime )erefore given only ranked results theadversary cannot tell the difference between the two data-bases being queried Furthermore even if we exchange thevalues of v and vprime the adversary still cannot observe anychange inRD or RDprime As a result the value of v[Bj] and β

Bj are

equivalent from the perspective of the adversary and theprivacy of v[Bj] can be well preserved Intuitively v can beseen as a tuple that has two polymorphic forms in Bj v[Bj]

and βBj When calculating the score of v with sprime(t ∣ q) we

always choose the value that can maximize sprime(v ∣ q)We can further extend the statement to a more general

case For each tuple v isin D and each private attribute Bj wecan select e distinct values β1 β2 βe from VB

j v[Bj]1113966 1113967elt |VB

j | minus 1 )e new score function can be defined as

sprime(t ∣ q) 1113944m

i1w

Ai ρ q Ai1113858 1113859 t Ai1113858 1113859( 1113857 + 1113944

mprime

j1w

Bi ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873

(7)

whereρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 if tne v

ρprime q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 Max ρ q Bj1113960 1113961 t Bj1113960 11139611113872 1113873 ρ q Bj1113960 1113961 β11113872 11138731113872

ρ q Bj1113960 1113961 βe1113872 11138731113873 if t v

(8)

Consider e tuples v(1)prime v(e)prime which are identical to v

in all attributes except for Bj Let v(i)prime[Bj] be βi

foralli 1 e )en for forallq we have sprime(v ∣ q) sprime(v(i)prime ∣ q)As a result from the adversaryrsquos perspective there are e + 1potential values for v[Bj] v[Bj] β1 βe which are in-distinguishable from each other from any ranked results)eprivacy of v[Bj] can be preserved by grouping it with eequivalent values

As such we introduce the construction of the poly-morphic value set (PVS) We put v[Bj] into a set in which allvalues are indistinguishable when calculating the score withrespect to Bj in the ranking function ie ρprime(q[Bj] v[Bj])We name the above set as the polymorphic value set anddenote the polymorphic value set of tuple v in attribute Bj asP

Bj

v We define PBj

v as follows

Definition 2 PBj

v is a set containing all indistinguishablevalues of tuple vrsquos private attribute Bj Assigning v[Bj] withan arbitrary value in P

Bj

v will not change the value ofsprime(v ∣ q)forallq ie ρprime(q[Bj] v[Bj]) ρprime(q[Bj] β)forallβ isin P

Bj

v

and forallq

Since the adversary cannot distinguish different values inP

Bj

v by launching any inference attacks based only on rankedresults the privacy guarantee of v[Bj] is

ε 1

PBj

v

11138681113868111386811138681113868

11138681113868111386811138681113868

(9)

In order to achieve the privacy guarantee defined in (3)for each v isin D and each private attribute Bj we have toensure that (a) there is one and only one polymorphic valueset P

Bj

v for vrsquos attribute Bj and (b) the privacy guaranteedefined in (9) is always valid

4 Framework with Virtual Values

41 Design In this section we introduce how polymorphicvalue sets can be constructed with generated values to meetthe privacy guarantee in (9) against authenticity-ignorantadversaries We name values generated by our framework asvirtual values

As we proved in Privacy-Preserving Framework anauthenticity-ignorant adversary cannot distinguish the valueof v[Bj] from other valid values in P

Bj

v given ranked resultsSince an adversary without prior knowledge cannot validatethe authenticity of any value in P

Bj

v all values in PBj

v areldquovalidrdquo in the perspective of the adversary no matter if theyare generated by our framework or collected from real datain D )erefore we observe that P

Bj

v can be formed by anyvalues in VB

j In order to achieve the privacy guarantee of ε we have to

ensure that |PBj

v |gt (1ε) forallv isin D and forallj isin 1 mprimeAn intuitive algorithm to generate P

Bj

v of size 1ε is torandomly pick (1ε) minus 1 values from VB

j v[Bj] Specifically

let the initial PBj

v v[Bj]1113966 1113967 )en we can randomly pick

distinct values fromVBj v[Bj] and insert them intoP

Bj

v untilPBj

v

contains at least (1ε) distinct values In the same manner wecan construct a polymorphic value set for each tuplersquos eachprivate attribute

Security and Communication Networks 5

411 Privacy Guarantee For a databaseDwhere every tuplevrsquos every private attribute Bj is included by one polymorphicvalue set with virtual values (PVSV) whose size is at least l ifthe adversary has no prior knowledge of D a privacy level ofε (1l) is achieved

For an authenticity-ignorant adversary PK(v ∣ D) 1forallv As proved in Privacy-Preserving Framework for anauthenticity-ignorant adversary it is impossible to distin-guish v[Bj] with at least l minus 1 other values )us we haveg(v Bj)le (1l) for forallv isin D and forallj isin 1 mprime Accordingto equation (3) a privacy guarantee of (1l) can be achieved

42 Utility Optimization In this section we discuss how toreduce utility loss caused by polymorphic value sets Weintroduced a metric of utility loss in (4) that calculates thesum of difference in ranked results given all possible queriesTo practically calculate utility loss we limit the range ofqueries to a finite set named query workload In practice thequery workload of a database D can be a set of queries thatare more frequently issued than any other queries A queryworkload may contain duplicate queries which reflect thedistribution of frequent queries With a query workloaddenoted as Q we can define the practical utility loss as

UQ 1113944qisinQ

1113944visinD

Rank(v ∣ q) minus Rankprime(v ∣ q)1113868111386811138681113868

1113868111386811138681113868 (10)

In order to reduce utility loss we have to find assign-ments of P

Bj

v for forallv isin D and forallj isin 1 mprime such that theoverall UQ can be minimized Without loss of generality weonly consider constructing polymorphic value sets of size 2In this case the privacy guarantee is 12 For each v[Bj] weneed to find a value that is indistinguishable from v[Bj] Wedenote the polymorphic value of v[Bj] as vprime[Bj]

Definition 3 We define the 2-PVSV problem as follows givendatabase D and query workload Q find a polymorphic valuevprime[Bj] from VB

j v[Bj]1113966 1113967 for each v[Bj] forallv isin D andforallj isin 1 mprime such that UQ defined in (10) is minimized

Theorem 1 4e 2-PVSV problem is NP-hard

)e proof of )eorem 1 in detail can be found in Ap-pendix A

43 Heuristic Algorithm We have proved that the 2-PVSVproblem is an NP-hard problem that may not be solved inpolynomial time)erefore we propose PVSV-Constructora heuristic algorithm that can return an approximate so-lution in polynomial time

We observe that |s(v ∣ q) minus sprime(v ∣ q)| is relevant to|Rank(v ∣ q) minus Rankprime(v ∣ q)| A smaller difference betweens(v ∣ q) and sprime(v| q) leads to a smaller difference betweenRank(v ∣ q) and Rankprime(v ∣ q) )erefore |Rank(v ∣ q)minus

Rankprime(v ∣ q)| can be approximately minimized by a solutionthat minimizes |s(v ∣ q) minus sprime(v ∣ q)| As such we propose anapproximation of UQ that calculates the score differencebefore and after adopting our framework We denote thescore difference as US

Q and define USQ as follows

USQ 1113944

qisinQ1113944visinD

s(v ∣ q) minus sprime(v ∣ q)1113868111386811138681113868

1113868111386811138681113868 (11)

As shown in Algorithm 1 given input database D thenumber of public and private attributes m and mprime re-spectively query workload Q and privacy guarantee ϵPVSV-Constructor constructs an equivalent value set foreach v isin D and j isin 1 mprime that minimizes US

Q Recall thedefinition of sprime(t| q) in (7) and we have

USQ 1113944

visinD1113944

mprime

j11113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

(12)

We denote the score difference of v[Bj] contributed byP

Bj

v as U(PBj

v )

U PBj

v1113874 1113875 1113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

(13)

)erefore USQ is the sum of U(P

Bj

v ) over all tuples andprivate attributes

USQ 1113944

visinD1113944

mprime

j1U P

Bj

v1113874 1113875 (14)

Since the construction of each PBj

v is independent fromother polymorphic value sets we can minimize US

Q by mini-

mizing the score difference contributed by each PBj

v forforallj isin 1 mprime1113864 1113865 and forallv isin D Note that |ρprime(q[Bj] v[Bj]) minus

ρ(q[Bj] v[Bj])| 0 if q[Bj] v[Bj] or q[Bj] notin PBj

v Alsonote that |ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| 1 when

q[Bj]ne v[Bj] and q[Bj] isin PBj

v )erefore if forallq isin Q

v[Bj] q[Bj] then any assignment of PBj

v cannot contribute

to a higher U(PBj

v ) since ρprime(q[Bj] v[Bj]) 1 ρ(q[Bj]

v[Bj]) 1 and |ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| is always

zero In this situation U(PBj

v ) 0 On the contrary if existq isin Qv[Bj]ne q[Bj] then ρ(q[Bj] v[Bj]) 0 and thus|ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| ρprime(q[Bj] v[Bj])

According to equation (13) the value of U(PBj

v ) is

1113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

1113944

qisinQq Bj1113858 1113859nev Bj1113858 1113859

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873

(15)

In order to minimize U(PBj

v ) we have to find an as-signment of P

Bj

v which minimizes |1113864q isin Q ∣ q[Bj]ne

v[Bj]existβ isin PBj

v q[Bj] β1113865| Consider the simplest case

where we want to construct PBj

v of size 2 v[Bj] β1113966 1113967 We have

6 Security and Communication Networks

1113944

qisinQq Bj1113858 1113859nev Bj1113858 1113859

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 w

Bj q isin Q ∣ q Bj1113960 1113961 β1113966 111396711138681113868111386811138681113868

11138681113868111386811138681113868

(16)

In this case β has to be a value in VBj v[Bj] such that β

has the lowest frequency among all q[Bj] values for q isin QNote that β does not have to be an element of q[Bj] ∣1113966

q isin Q If β notin q[Bj] ∣ q isin Q1113966 1113967 its frequency is 0 Similarly if we

want to construct PBj

v of size k then we should insert the k minus 1least frequent values among all q[Bj] values into P

Bj

v In line 4 of Algorithm 1 we initialize the polymorphic

value set of v[Bj] by inserting v[Bj] into PBj

v For v[Bj] aset Set is constructed from VB

j v[Bj]1113966 1113967 which containsall values that can be inserted into P

Bj

v In order tominimize U(P

Bj

v ) we insert the k minus 1 least frequentvalues from Set into P

Bj

v by their frequencies inq[Bj] ∣ q isin Q1113966 1113967 )e above construction of P

Bj

v is repeatedfor each t isin D and j isin 1 mprime )e computationalcomplexity of Algorithm 1 is O(|D| middot |Q| middot mprime)

5 Framework with True Values

51 Authenticity-Knowledgeable Adversaries As we men-tioned in the adversary model an authenticity-knowl-edgeable adversary is able to tell if t isin D is possible As aresult the authenticity-knowledgeable adversary can launcha more efficient attack on private attributes by examining theauthenticity of values learned from RA We show a simplecase where an authenticity-knowledgeable adversary breaksthe privacy guarantee provided by polymorphic values setsconstructed with virtual values Consider that the objectiveof the adversary is to infer the value of v[B0] and B0 is theonly private attribute in D PB0

v is the polymorphic value setgenerated for v[B0] and |PB0

v | 1ε Without prior knowl-edge Pv[B0] ε and the privacy guarantee is achievedHowever for a value β isin PB0

v v[B0]1113864 1113865 if the adversary canconclude that PK(vprime ∣ D) 0 for any tuple vprime such that vprime isequal to v in all public attributes and vprime[B0] β then theadversary can exclude β from PB0

v )erefore

g v B01113858 1113859( 1113857 ε

1 minus εgt ε (17)

and equation (9) no longer holdsAs shown above if values in PB0

v are marked by anadversary as invalid for v given PK(vprime|D) then the adversarycan successfully break the privacy guarantee defined inframework with virtual values

52 Design In this section we propose polymorphic valuesets with true values (PVST) that construct polymorphicvalue sets with values that cannot be excluded by PK(t ∣ D)PVST considers nontrivial prior knowledge of adversariesand presents the same degree of privacy guarantee in-troduced in (9)

We have shown above that the implementation withvirtual values can be compromised by adversaries with priorknowledge Consider a tuple v isin D Given privacy guaranteeϵ we construct mprime polymorphic values sets PB0

v PB

mprimev for

each private attributes of v where |PBj

v |ge 1ε Let set MAv be

MAv v A01113858 11138591113864 1113865 times middot middot middot times v Am1113858 11138591113864 1113865 times P

B0v times middot middot middot times P

Bmprimev (18)

MAv is the Cartesian product of sets each of which

contains vrsquos all equivalent values in an attribute For publicattribute Ai the corresponding set is v[Ai]1113864 1113865 since publicattribute values are open to the adversary For private at-tribute Bj the corresponding set is P

Bj

v )erefore MAv

contains all possible tuples that are indistinguishable with v

with respect to RA (including v itself )With prior knowledge on PK(t ∣ D) an adversary is able

to exclude a value β from PBj

v if

forallt isin t ∣ t isinMAv t Bj1113960 1113961 β1113966 1113967PK(t ∣ D) 0 (19)

As described above it is safe for the adversary to con-clude that βne v[Bj] if there is no t isinMA

v such that t[Bj] βand PK(t ∣ D) 1 Alternatively if for every β in P

Bj

v thereis a t such that t[Bj] β and PK(t ∣ D) 1 then the ad-

versary cannot exclude any value in PBj

v and thereforePv[Bj]le ε is guaranteed Since PK(t ∣ D) 1 iff t isin D weconstruct polymorphic value sets with true values fromt isin D

Definition 4 If there exists l distinct values β1 βl isin PBj

v

such that forallr isin 1 l βr holds the following property

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv isin D and forallj isin 1 mprime1113864 1113865

(1) k (1ε) minus 1(2) for t in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) Set VBj v[Bj]1113966 1113967

(6) Sort elements of Set by their frequencies in q[Bj]| q isin Q1113966 1113967 in ascending order(7) Remove the last |Set| minus k + 1 elements in Set(8) P

Bj

v PBj

v cup Set(9) end(10) end

ALGORITHM 1 PVSV-Constructor

Security and Communication Networks 7

existt isinMAv t Bj1113960 1113961 βr and t isin D (20)

)en we say that PBj

v covers l true values We denoteT

Bj

v β1 βl1113864 1113865 as the true value set of t in BjPrivacy guarantee for a database D where every tuplersquos

every private attribute is included by one equivalent value setwhich covers at least l true values a privacy level of ε 1l isachieved

Assume that the adversaryrsquos objective is to infer the valueof v[Bj] As mentioned in the adversary model we make noassumption on the attacking methods adopted by an ad-versary Consider MA

v defined in (18) forallt tprime isinMAv and

forallq isin QA s(t ∣ q) s(tprime ∣ q) )erefore the adversary cannotdistinguish tuples in MA

v by observing RA In this situationthe adversary would use PK(t ∣ D) to exclude all tuple t inMA

v such that PK(t ∣ D) 0 However as PBj

v covers l truevalues there exists l tuples t1 tl isinMA

v such thatPK(t ∣ D) 1 and tr[Bj]ne ts[Bj] for arbitraryr s isin 1 l As a result values in t1[Bj] tl[Bj]1113966 1113967 areindistinguishable given RA and PK(t ∣ D) )us Pv[Bj] 1lAccording to (3) a privacy level of 1l is achieved

53 Utility Optimization An intuitive method of con-structing P

Bj

v is to insert the value of Bj of all tuples that sharethe same public attribute values with v into P

Bj

v ie PBj

v

PBj

v 1113864vprime[Bj]|foralli and vprime isin D vprime[Ai] v[Ai]1113865 )e privacyguarantee is met if P

Bj

v covers at least 1ε true valuesNevertheless utility loss cannot be ignored as in the intuitivemethod s(vprime| q) would be the same for all vprime isin 1113864vprime| foralliand vprime isin D vprime[Ai] v[Ai]1113865 and thus information of privateattributes are missing in the ranked result of vprime )ereforethe size of P

Bj

v is critical in balancing privacy and utilityWith no loss of generality we limit the size of each poly-morphic value set to 2 We show that constructing suchpolymorphic value sets is an NP-hard problem

Definition 5 We define the 2-PVST problem as followsgiven database D and a query workload Q construct P

Bj

v ofsize 2 for each tuple v isin D forallj isin 1 mprime1113864 1113865 and minimizeUQ defined in (10)

Theorem 2 4e 2-PVST problem is NP-hard

)e proof of )eorem 2 can be found in Appendix B

54 Heuristic Algorithm We have shown that the 2-PVSTproblem is NP-hard In this subsection we present PVST-Constructor a heuristic algorithm that constructs PVSTwithin polynomial time PVST-Constructor tries to mini-mize |P

Bj

v | and 1113936qisinQ|sprime(v ∣ q) minus s(v ∣ q)| for each v isin D

and j isin 1 mprime with the greedy algorithm)e pseudo-code of PVST-Constructor is shown in

Algorithm 2 In lines 2 to 6 we initialize each PBj

v withv[Bj] )en for each v isin D PVST-Constructor constructsPB1

v PBmprime

v by finding a vprime with the greedy algorithm andinserting vprime[Bj] into P

Bj

v forallj isin 1 mprime)e above process

will be taken multiple times until |PBj

v |ge 1ε Since every vprime isa real tuple existing in D and we insert at least k tuples in theabove processes P

Bj

v covers at least k + 1 true values andthus the privacy guarantee is met As mentioned in UtilityOptimization the size of P

Bj

v is critical in minimizing utilityloss )erefore the heuristic algorithm tries to minimize theutility loss by minimizing the size of each P

Bj

v We count thenumber of polymorphic value sets of v if the set contains lessthan k values that can be enlarged by inserting vprimersquos privateattribute values We denote this count as covervprime

covervprime j ∣ PBj

v

11138681113868111386811138681113868

11138681113868111386811138681113868lt k vprime Bj1113960 1113961 notin PBj

v1113882 1113883

1113868111386811138681113868111386811138681113868

1113868111386811138681113868111386811138681113868 (21)

Tuple vprime with a higher covervprime value can enlarge the sizeof more polymorphic value sets of v and thus we can reducethe number of tuples that we have to insert into PB1

v PBmprime

v Furthermore we take into consideration queries in Q In

order to minimize UQ we have to minimize |sprime(v ∣ q) minus

s(v ∣ q)| for q isin Q Note that for each attribute Bj|ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| 1 if v[Bj]ne q[Bj] andq[Bj] isin P

Bj

v We denote the value of 1113936j|ρprime(q[Bj] v[Bj]) minus

ρ(q[Bj] v[Bj])| as lossvprime )us we have

lossvprime 1113944qisinQ

1113944

j1mprime

ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

1113944qisinQ

1113944

j1mprime

δ v vprime q j( 1113857

(22)where δ(v vprime q j) 1 if v[Bj]ne q[Bj] and vprime[Bj] q[Bj]

and δ(v vprime q j) 0 otherwise )erefore tuple vprime with asmaller lossvprime can reduce the value of 1113936j|ρprime(q[Bj]

v[Bj]) minus ρ(q[Bj] v[Bj])| and thus we can reduce the valueof UQ

)e computation of covervprime and lossvprime is done in line 11)en we compute scorevprime the score of vprime that indicates howpreferable vprime is relative to other tuples in D v Weadopt the greedy algorithm to find the next vprime for PB1

v

PB

mprimev ie in each iteration we choose the tuple

that has the highest score value In line 15 we insert theprivate attribute values of the chosen tuple (denoted as vmax)into PB1

v PB

mprimev )e above process is repeated until

the sizes of PB1v P

Bmprime

v are no less than k

6 Experimental Results

61 Experimental Setup To validate PVSV-Constructor andPVST-Constructor algorithms we conducted experiments on areal world dataset [29] from eHarmony which contains 58attributes and 486464 tuples We removed 5 noncategoricalattributes and randomly picked 20 categorical attributes fromthe remaining 53 attributes )e domain sizes of the 20 at-tributes range from 2 to 15 After removing duplicate tuples werandomly picked 300000 tuples as our testing bed

By default we use the ranking function from the rankedretrieval model with all weights set to 1 All experimentalresults were obtained on a Mac machine running Mac OS

8 Security and Communication Networks

with 8GB of RAM )e algorithms were implemented inPython

62 Privacy )e privacy guarantee of PVSV-Constructorand PVST-Constructor were tested by performing RankedInference attack [35] including Point-Query In-QueryPoint-QueryampInsert and In-QueryampInsert attackingmethods on the dataset From a total of 20 attributes 5attributes were randomly chosen as public attributes andanother 5 attributes were randomly chosen as private at-tributes We randomly picked 20000 distinct tuples fromthe dataset as the testing bed and randomly generated 10tuples as the query workload For PVSV-Constructor weconstructed a polymorphic value set of size 2 for eachprivate attribute and each tuple in the testing bed ForPVST-Constructor we constructed a polymorphic valueset that covers at least two true values for each privateattribute and each tuple We randomly picked 1000 tuplesfrom the testing bed as our targets and performed 1000Rank Inference attacks (250 attacks for each of the fourmethods) on the five private attributes of target tuples Wemeasured the attack success guess rates based on the fre-quency of successful inference among all inference at-tempts Figure 1 shows the success guess rates of RankInference attacks on the unprotected testing bed the testingbed with PVSV and the testing bed with PVST As the sizeof each polymorphic value set is 2 the success guess rateson PVSV are around 50 which are significantly lowerthan those of unprotected dataset We also observe that thesuccess guess rates on PVSTare slightly lower than those of

PVSV )e reason is that PVSV-Constructor will beinserting values into tuple vrsquos polymorphic value sets untilall vrsquos polymorphic value sets cover at least 2 true values)us some polymorphic value sets of v may contain morethan 2 values

63 Utility In this subsection we quantify utility loss ofPVSV-Constructor and PVST-Constructor algorithms)e privacy guarantee of both PVSV-Constructor andPVST-Constructor is 12 ie the polymorphic value setsconstructed by PVSV-Constructor contains 2 values andthe polymorphic value sets constructed by PVST-Con-structor contains at least 2 true values )e key param-eters here are the size of query workload Q the size ofdatabase D the number of public and private attributesand the weight ratios in the ranking function We ran-domly generated 20 tuples as the query workload Bydefault we picked 10 tuples from Q and set |D| 300 000m 10 and mprime 10 )erefore we randomly picked 10attributes from the testing bed and set them as publicattributes )e rest of the 10 attributes were set as privateattributes

Many recommendation systems of ONS applicationsfeature top-k recommendation [1 14 47] where the rankedresult contains a set of k tuples that will be of interest to acertain user as it is impractical and unnecessary to return alltuples in the database to the user )erefore we introduceaverage top-k utility loss Uaul a variant of UQ that focuses onutility loss of the top-k tuples in a ranked result Uaul isdefined as

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv and forallBj

(1) k (1ε) minus 1(2) for v in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) end(6) end(7) for v in D do(8) while existj isin 1 mprime1113864 1113865 |P

Bj

v |lt k + 1 do(9) for vprime isin t isin D|forallAi t[Ai] v[Ai]1113864 1113865 do(10) compute covervprime lossvprime(11) scorevprime covervprime 11 + exp(minus lossvprime )

(12) end(13) vmax argmaxvprimeisin |tisinD|forallAit[Ai]v[Ai] scorevprime

(14) for j isin 1 mprime1113864 1113865 do

(15) PBj

v PBj

v cup tmax[Bj]1113966 1113967

(16) end(17) end(18) end

ALGORITHM 2 PVST-Constructor

Security and Communication Networks 9

Uaul 1

|Q|1113944

|Q|

i1

1Dprime

11138681113868111386811138681113868111386811138681113868

1113944

visinDprime

min Rank(v ∣ q) k + 11113864 1113865 minus min Rankprime(v ∣ q) k + 11113864 11138651113868111386811138681113868

1113868111386811138681113868

k (23)

where Dprime v isin D |Rank(v | q))lt k or Rank(v | q))lt k1113864 1113865Intuitively Uaul represents the average percentage rankdifference relative to k over all queries and all tuples in top-kUaul is equivalent to UQ|Q||D|2 when k |D| By default weset k 100

631 Evaluation of Uaul with Varying k We first discuss theaverage top-k utility loss of PVSV-Constructor PVST-Constructor and a baseline algorithm on varying k withother parameters set to default values For a tuple vrsquos at-tribute Bj the baseline algorithm constructs P

Bj

v with v[Bj]

and a value randomly picked from VBj v[Bj]1113966 1113967 )e results

are presented in Figure 2 which shows that the Uaul ofPVSV-Constructor is significantly lower than that of thebaseline algorithm )e Uaul of PVST-Constructor is alsolower than that of the baseline algorithm when kge 5 eventhough the baseline algorithm cannot preserve privacyagainst authenticity-knowledgeable adversaries )e exper-imental results show that both PVSV and PVST can reduceutility loss with respect to rank differences Also note thatPVST-Constructor constructs polymorphic value sets withtrue values and thus from Figure 3 we can see that somepolymorphic value sets constructed by PVST-Constructorcontains more than 2 values

632 Evaluation of Uaul with Varying Sizes of QFigure 4 presents the average top-k utility loss of PVSV-Constructor on varying |Q| When |Q| is set to 1 5 or 10 werandomly picked 1 5 or 10 queries from the original Qrespectively With increasing number of queries in the queryworkload the Uaul of PVSV-Constructor increases mono-tonically )e reason is that PVSV-Constructor alwaysgenerates the least frequent value (denoted as vprime[Bj]) in1113864q[Bj] | q isin Q1113865 for v[Bj] If |Q| is small then it is possiblethat vprime[Bj] notin 1113864q[Bj] | q isin Q1113865 and thus s(v ∣ q) sprime(v ∣ q)

forallq isin Q However a larger query workload covers moreprivate attributes values and thus it will be harder forPVSV-Constructor to generate a value for v[Bj] that has noimpact on the rank of v for any q isin Q Figure 5 presents theaverage top-k utility loss of PVST-Constructor on varying|Q| We can see that the size of Q has no significant impacton the Uaul of PVST-Constructor as the PVST-Constructoralways pick a tuple vprime that is different from v in most at-tributes and then insert vprime[Bj] into P

Bj

v

633 Evaluation of Uaul with Varying Sizes of the DatasetFigures 6 and 7 depict the impact of the size of datasets onUaul of PVSV-Constructor and PVST-Constructor Datasetsof 100000 and 200000 tuples were randomly sampled from

QPoint QIn QIPoint QIIn

Atta

ck su

cces

s gue

ss ra

te

Without protectionPVSVPVST

1

09

08

07

06

05

04

03

02

01

0

Figure 1 Attack success guess

10 Security and Communication Networks

the testing bed of 300000 tuples As expected |D| has nosignificant impact on the Uaul of PVSV-Constructor sincePVSV-Constructor generates values from the domain ofeach private attributes |D| has no impact on the Uaul ofPVST-Constructor which indicates that a dataset contain-ing 100000 tuples is sufficient for PVST-Constructor togenerate polymorphic value sets with true values

634 Evaluation of Uaul with Varying m We investigate theimpact of the number of private and public attributes on

average top-k utility loss Figure 8 presents theUaul of PVSV-Constructor with fixed mprime and varying m and with fixed mand varying mprime When mmprime is set to 5 we randomly re-moved 5 attributes from the testing bed When mmprime is set to

0 20 40 60 80 100k

RandomPVSVPVST

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 2 Utility loss vs k

1 2 3 4 5+Size of PVST

3

25

2

15

1

05

0

Coun

t 10

5

Figure 3 Size of polymorphic value set (PVST)

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

00 10 20

Figure 4 Utility loss vs |Q|

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

00 10 20

Figure 5 Utility loss vs |Q|

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 6 Utility loss vs |D|

Security and Communication Networks 11

15 we added 5 more categorical attributes randomly chosenfrom the unused attributes We observe that the Uaulmonotonically decreases with increasing number of publicattributes and monotonically increases with increasingnumber of private attributes As expected a higher pro-portion of public attributes leads to less variant betweens(v ∣ q) and sprime(v ∣ q) e results of the same experimentwith PVST-Constructor are shown in Figure 9 Uaul in-creases as increasing number of private attributes as ex-pected However Uaul also increases slightly with increasingnumber of public attributes is is due to the fact that withmore public attributes there will be few tuples that share thesame public attribute values Since PVST-Constructor in-serts only private attribute values from tuples sharing samepublic attribute values more values will be inserted to eachpolymorphic value set which introduces higher utility loss

635 Evaluation of Uaul with Varying Weight RatiosFigures 10 and 11 illustrate the impact of weight ratios on theaverage utility loss e experiment was conducted with a

xed private attribute weight of 1 and varying public at-tribute weights of 1 2 and 3 As expected Uaul of bothPVSV-Constructor and PVST-Constructor decreases as

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 10 Utility loss vs weight ratio

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 11 Utility loss vs weight ratio

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss1

09

08

07

06

05

04

03

02

01

0

Figure 7 Utility loss vs |D|

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 8 Utility loss vs m and mprime

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 9 Utility loss vs m and mprime

12 Security and Communication Networks

increasing weight ratio of public attributes)e reason is thatas the public attribute weight increases the part in sprime(v ∣ q)

caused by private attributes decreases )erefore less impactwould be made to sprime(v ∣ q) by PVSV and PVST As a resultsprime(v ∣ q) would be closer to s(v ∣ q) and the utility loss couldbe decreased

7 Conclusions

In this paper we proposed a novel framework that preservesprivacy of private attributes against arbitrary attacks throughthe ranked retrieval model Furthermore we identify twocategories of adversaries based on varying adversarial ca-pabilities For each kind of adversaries we presentedimplementation of our framework Our experimental resultssuggest that our implementations efficiently preserve privacyagainst Rank Inference attack [35] Moreover the imple-mentations significantly reduce utility loss with respect tothe variance in ranked results

It is our hope that this paper can motivate further re-search in privacy preservation of SIoT with consideration ofsocial network features andor variant information retrievalmodels eg text mining

Appendix

A 2-PVSV

In this subsection we prove that constructing an optimal PVSVfor each attribute of a tuple v is an NP-hard problem

Definition A1 For a tuple v in D we create PBj

v for each BjWe say v satisfies query q if the following hold for any tuple t(tne v) in D (1) If s(v ∣ q)lt s(t ∣ q) then sprime(v ∣ q)lt sprime(t ∣ q)and (2) if s(v ∣ q)ge s(t ∣ q) then sprime(v ∣ q)ge sprime(t ∣ q)

For a database containing 2 tuples the 2-PVSV problemcan be redefined as follows given a query workload Q adatabase D construct PB1

v PBmprime

v of size 2 such that v

satisfies the most queries in Q

Definition A2 Max-3Sat Problem given a 3-CNF formulaΦ find the truth assignment that satisfies that most clauses

Lemma A1 Max-3Sat leP 2-PVSV Problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVSV instance Without loss of generality we suppose thatΦ is a conjunction of l clauses and each clause is a dis-junction of 3 literals from set X x1 xn1113864 1113865 We constructdatabase D as follows D has 0 public attributes and n + 1private attributes B1 Bn C1 Let VB

j be the attribute do-main of Bj j isin 1 n and VC

1 be the attribute domain ofC1 Let VB

i minus 1 0 1 2 n + 2 and VC1 0 1 Two

tuples v1 and v2 are inserted into D v1[Bj] 0 and v2[Bj]

j + 2 for j isin 1 n while v1[C1] 0 and v2[C1] 1 Wesimplify the score function defined in (1) by setting all weightsto 1 Also note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null

We construct query workload Q based on Φ For eachclause Lk isin Φ we construct a query qk such that qk[Bi] i iffthe corresponding literal xi is a positive literal in Lk andq[Bi] i + 1 iff xi is a negative literal in the clause Forexample given a clause (x1orx2orx3) the correspondingquery q should satisfy q[B1] 1 q[B2] 3 q[B3] 3 andq[C1] 0 All other attributes in q are set to null by default)erefore s(v1 ∣ q) 1 and s(v2 ∣ q) 0 forallq isin Q

Since s(v2 ∣ q)lt s(v1 ∣ q) forallq isin Q in order to minimizeutility loss the value of sprime(v2 ∣ q) should be as small aspossible We observe that the minimum possible sprime(v2 ∣ q) is1 because PC1

v2must be 0 1 as VC

1 0 1 Without loss ofgenerality we assume that we have already constructedPB1

v2 P

Bmprime

v2 PC1v2

for v2 such that sprime(v2 ∣ q) 1 forallq isin QNow we have an instance of 2-PVSV problem that given

D Q constructs PB1v1

PBmprime

v1 PC1

v1that satisfies the most

queries in Q Since VC1 0 1 PC1

v1must be 0 1 too as

PC1v1

contains two distinct values )erefore for an ar-bitrary query q isin Q we have ρprime(q[C] v1[C]) 1 andρprime(q[C] v2[C]) 1 In order to let v1 satisfy q we have toensure that sprime(v1 ∣ q)gt sprime(v2 ∣ q) 1 )us we have

sprime v1 ∣ q( 1113857gt 1

⟺1113944mprime

i1ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857gt 1

⟺existi isin 1 mprime1113864 1113865 ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857 1

⟺existi isin 1 mprime1113864 1113865 q Bi1113858 1113859 isin PBi

v1

(A1)

Now we show that the solution of 2-PVSV problemconstructed above can answer the corresponding Max-3Sat Problem Suppose that we have the solution PB1

v1

PB

mprimev1 such that v1 satisfies the most queries in Q As in

(A1) if v1 satisfies qk we have qk[Bi] isin PBiv1 If v1 does not

satisfy qk then qk[Bi] notin PBiv1 Recall that we assign value i or

i + 1 to qk[Bi] if xi is a positive or negative literal in clauseLk respectively )erefore the assignment of xi given PBi

v1is

xi true if i isin PBi

v1

xi false if i + 1 isin PBi

v1

(A2)

Furthermore from equation (A1) we have

sprime v1 ∣ q( 1113857gt 1⟺L true (A3)

where L is qrsquos corresponding clause in Φ Note thati i + 1 nsubQ as |PBi

v1| 2 and 0 isin PBi

v1 )us the value of xi is

either true or falseAs we proved above v1 satisfies qi if and only if Li is true

given assignment x1 xl1113864 1113865 constructed according toequation (A2) )erefore x1 xl1113864 1113865 satisfies the mostclauses in ϕ if and only if v1 satisfies the most queries in Q

We now prove that function f can be conducted inpolynomial time Given a formula Φ with n variables and lclauses we construct a 2-PVSV instance with 2 tuples each

Security and Communication Networks 13

of which has n + 1 attributes and l queries each of which has4 attributes )erefore O(2n + 4l) assignments are neededand f can be conducted in polynomial time

Proof of 4eorem 1 We now prove that the 2-PVSVproblem is NP-hard In Lemma 3 we proved that Max-3SatProblem can be reduced to the 2-PVSV problem in poly-nomial time Furthermore as Max-3Sat Problem is a NP-hard problem [33] the 2-PVSV problem is NP-hard

B 2-PVST

In this section we prove that constructing an optimal PVSTfor each private attribute of a tuple v isin D is NP-hard We usethe definition of v satisfying q from (9))e 2-PVSTproblemcan be redefined as follows

Definition B1 2-PVSTproblem given a query workload Qa database D the optimization problem of 2-PVST is toconstruct an arbitrary tuple vrsquos polymorphic setsPB1

v PBmprime

v that satisfies the most queries in Q )e size ofeach polymorphic vale set is 2

Lemma B1 Max-3Sat le P 2-PVST problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVST instance We assume that Φ is a conjunction of lclauses and each clause is a disjunction of 3 literals from setX x1 xn1113864 1113865 Let D have no public attribute and n + 1private attributes B1 Bn C1 For each literal xi we inserttwo tuples vi and vi

prime into D where

vi C11113858 1113859 1 vi Bi1113858 1113859 1 vi Bj1113960 1113961 minus 1foralljne i

viprime C11113858 1113859 1 vi

prime Bi1113858 1113859 minus 1 viprime Bj1113960 1113961 1foralljne i

(B1)

)en we insert tuple v into D where v[C1] 0 andv[Bj] 0 forallj isin 1 mprime We also set the domain of eachBj as minus 1 0 1 and the domain of C1 as 0 1

Without loss of generality the score function defined in(1) is simplified by setting all weights to 1

Next we construct query workload Q For each clauseLk isin Φ we construct a query qk in which qk[Bj] minus 1 iff xj isa positive literal in Lk and qk[Bj] 1 iff xj is a negativeliteral in Lk We also set qk[C1] 1 )e rest of the attributevalues are set to null by default )erefore ifLk (x1orne x2orx3) then we have qk[C1] 1 qk[B1] 1qk[B2] 1 qk[B3] minus 1 and qk[Bj] null forj isin 4 mprime

Note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null )uss(v qk) 0 forallq isin Q and s(vi qk) s(vi

prime qk)ge 1 forallq isin Q Weobserve that for q isin Q and i isin 1 n s(v ∣ q)lt s(vi ∣ q)

and s(v ∣ q)lt s(viprime ∣ q) In order to reduce UQ we have to

maximize the number of q isin Q such that sprime(v ∣ q)lt sprime(vi ∣ q)

and sprime(v ∣ q)lt sprime(viprime ∣ q) )e maximum value of sprime(vi ∣ q)

and sprime(viprime ∣ q) is 4 which can be achieved by inserting vi[Bj]

into PBj

viprime and inserting vi

prime[Bj] into PBj

vi Since PC1

v 0 1 thevalue of sprime(v ∣ q) is at least 1 )erefore we have

sprime(v ∣ q)lt sprime vi ∣ q( 1113857

⟺sprime(v ∣ q)lt 4

⟺1113944

mprime

j1ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873lt 3

⟺existj isin 1 mprime ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 0

⟺existj isin 1 mprime v Bj1113960 1113961ne q Bj1113960 1113961

⟺existj isin 1 mprime minus q Bj1113960 1113961 isin PBj

v

(B2)

Since |PBj

v | is limited to 2 PBj

v 0 1 or 0 minus 1 ie wemust insert either virsquos or vi

primersquos private attribute values intoPB1

v PBmv Recall that we assign minus 1 or 1 to qk[Bj] if xj

is positive or negative respectively in Lk In order toreduce Max-3Sat to 2-PSVT we assign literals in the fol-lowing way

xj true if minus 1 isin PBj

v

xj false if 1 isin PBj

v (B3)

)erefore we denote the corresponding clause of q as Land from equation (B2) we have

existj isin 1 mprime1113864 1113865 minus q Bj1113960 1113961 isin PBj

v

⟺existj isin 1 mprime1113864 1113865 xj true if xj is positive in L

orxj false if xj is negative in L

⟺L true(B4)

From the above equation we observe that v satisfying qk

is equivalent to Lk being true)erefore we can reduceMax-3Sat to 2-PVST Given a solution to the 2-PVSTproblem thatmaximizes the number of queries satisfied by v assignmentx1 xn1113864 1113865 produced by equation (B3) can satisfy thelargest number of clauses in Φ

Given a Max-3Sat instance Φ the construction of the 2-PVST instance can be conducted in polynomial time as weconstruct n queries with 3 attributes and 2n + 1 tuples with nattributes)e transformation from the optimal solution of 2-PVST to the optimal solution of Max-3Sat also takes poly-nomial time as we assign the value to each one of x1 xn

once )erefore f is a polynomial time function

Proof of 4eorem 2 In 4 we proved that a Max-3Sat in-stance can be reduced to a 2-PSVT instance in polynomialtime Furthermore as Max-3Sat Problem is an NP-hardproblem the 2-PVSV problem is an NP-hard problem

Data Availability

)e data used to support the findings of this study areavailable from the corresponding author upon request

14 Security and Communication Networks

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 6: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

411 Privacy Guarantee For a databaseDwhere every tuplevrsquos every private attribute Bj is included by one polymorphicvalue set with virtual values (PVSV) whose size is at least l ifthe adversary has no prior knowledge of D a privacy level ofε (1l) is achieved

For an authenticity-ignorant adversary PK(v ∣ D) 1forallv As proved in Privacy-Preserving Framework for anauthenticity-ignorant adversary it is impossible to distin-guish v[Bj] with at least l minus 1 other values )us we haveg(v Bj)le (1l) for forallv isin D and forallj isin 1 mprime Accordingto equation (3) a privacy guarantee of (1l) can be achieved

42 Utility Optimization In this section we discuss how toreduce utility loss caused by polymorphic value sets Weintroduced a metric of utility loss in (4) that calculates thesum of difference in ranked results given all possible queriesTo practically calculate utility loss we limit the range ofqueries to a finite set named query workload In practice thequery workload of a database D can be a set of queries thatare more frequently issued than any other queries A queryworkload may contain duplicate queries which reflect thedistribution of frequent queries With a query workloaddenoted as Q we can define the practical utility loss as

UQ 1113944qisinQ

1113944visinD

Rank(v ∣ q) minus Rankprime(v ∣ q)1113868111386811138681113868

1113868111386811138681113868 (10)

In order to reduce utility loss we have to find assign-ments of P

Bj

v for forallv isin D and forallj isin 1 mprime such that theoverall UQ can be minimized Without loss of generality weonly consider constructing polymorphic value sets of size 2In this case the privacy guarantee is 12 For each v[Bj] weneed to find a value that is indistinguishable from v[Bj] Wedenote the polymorphic value of v[Bj] as vprime[Bj]

Definition 3 We define the 2-PVSV problem as follows givendatabase D and query workload Q find a polymorphic valuevprime[Bj] from VB

j v[Bj]1113966 1113967 for each v[Bj] forallv isin D andforallj isin 1 mprime such that UQ defined in (10) is minimized

Theorem 1 4e 2-PVSV problem is NP-hard

)e proof of )eorem 1 in detail can be found in Ap-pendix A

43 Heuristic Algorithm We have proved that the 2-PVSVproblem is an NP-hard problem that may not be solved inpolynomial time)erefore we propose PVSV-Constructora heuristic algorithm that can return an approximate so-lution in polynomial time

We observe that |s(v ∣ q) minus sprime(v ∣ q)| is relevant to|Rank(v ∣ q) minus Rankprime(v ∣ q)| A smaller difference betweens(v ∣ q) and sprime(v| q) leads to a smaller difference betweenRank(v ∣ q) and Rankprime(v ∣ q) )erefore |Rank(v ∣ q)minus

Rankprime(v ∣ q)| can be approximately minimized by a solutionthat minimizes |s(v ∣ q) minus sprime(v ∣ q)| As such we propose anapproximation of UQ that calculates the score differencebefore and after adopting our framework We denote thescore difference as US

Q and define USQ as follows

USQ 1113944

qisinQ1113944visinD

s(v ∣ q) minus sprime(v ∣ q)1113868111386811138681113868

1113868111386811138681113868 (11)

As shown in Algorithm 1 given input database D thenumber of public and private attributes m and mprime re-spectively query workload Q and privacy guarantee ϵPVSV-Constructor constructs an equivalent value set foreach v isin D and j isin 1 mprime that minimizes US

Q Recall thedefinition of sprime(t| q) in (7) and we have

USQ 1113944

visinD1113944

mprime

j11113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

(12)

We denote the score difference of v[Bj] contributed byP

Bj

v as U(PBj

v )

U PBj

v1113874 1113875 1113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

(13)

)erefore USQ is the sum of U(P

Bj

v ) over all tuples andprivate attributes

USQ 1113944

visinD1113944

mprime

j1U P

Bj

v1113874 1113875 (14)

Since the construction of each PBj

v is independent fromother polymorphic value sets we can minimize US

Q by mini-

mizing the score difference contributed by each PBj

v forforallj isin 1 mprime1113864 1113865 and forallv isin D Note that |ρprime(q[Bj] v[Bj]) minus

ρ(q[Bj] v[Bj])| 0 if q[Bj] v[Bj] or q[Bj] notin PBj

v Alsonote that |ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| 1 when

q[Bj]ne v[Bj] and q[Bj] isin PBj

v )erefore if forallq isin Q

v[Bj] q[Bj] then any assignment of PBj

v cannot contribute

to a higher U(PBj

v ) since ρprime(q[Bj] v[Bj]) 1 ρ(q[Bj]

v[Bj]) 1 and |ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| is always

zero In this situation U(PBj

v ) 0 On the contrary if existq isin Qv[Bj]ne q[Bj] then ρ(q[Bj] v[Bj]) 0 and thus|ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| ρprime(q[Bj] v[Bj])

According to equation (13) the value of U(PBj

v ) is

1113944qisinQ

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

1113944

qisinQq Bj1113858 1113859nev Bj1113858 1113859

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873

(15)

In order to minimize U(PBj

v ) we have to find an as-signment of P

Bj

v which minimizes |1113864q isin Q ∣ q[Bj]ne

v[Bj]existβ isin PBj

v q[Bj] β1113865| Consider the simplest case

where we want to construct PBj

v of size 2 v[Bj] β1113966 1113967 We have

6 Security and Communication Networks

1113944

qisinQq Bj1113858 1113859nev Bj1113858 1113859

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 w

Bj q isin Q ∣ q Bj1113960 1113961 β1113966 111396711138681113868111386811138681113868

11138681113868111386811138681113868

(16)

In this case β has to be a value in VBj v[Bj] such that β

has the lowest frequency among all q[Bj] values for q isin QNote that β does not have to be an element of q[Bj] ∣1113966

q isin Q If β notin q[Bj] ∣ q isin Q1113966 1113967 its frequency is 0 Similarly if we

want to construct PBj

v of size k then we should insert the k minus 1least frequent values among all q[Bj] values into P

Bj

v In line 4 of Algorithm 1 we initialize the polymorphic

value set of v[Bj] by inserting v[Bj] into PBj

v For v[Bj] aset Set is constructed from VB

j v[Bj]1113966 1113967 which containsall values that can be inserted into P

Bj

v In order tominimize U(P

Bj

v ) we insert the k minus 1 least frequentvalues from Set into P

Bj

v by their frequencies inq[Bj] ∣ q isin Q1113966 1113967 )e above construction of P

Bj

v is repeatedfor each t isin D and j isin 1 mprime )e computationalcomplexity of Algorithm 1 is O(|D| middot |Q| middot mprime)

5 Framework with True Values

51 Authenticity-Knowledgeable Adversaries As we men-tioned in the adversary model an authenticity-knowl-edgeable adversary is able to tell if t isin D is possible As aresult the authenticity-knowledgeable adversary can launcha more efficient attack on private attributes by examining theauthenticity of values learned from RA We show a simplecase where an authenticity-knowledgeable adversary breaksthe privacy guarantee provided by polymorphic values setsconstructed with virtual values Consider that the objectiveof the adversary is to infer the value of v[B0] and B0 is theonly private attribute in D PB0

v is the polymorphic value setgenerated for v[B0] and |PB0

v | 1ε Without prior knowl-edge Pv[B0] ε and the privacy guarantee is achievedHowever for a value β isin PB0

v v[B0]1113864 1113865 if the adversary canconclude that PK(vprime ∣ D) 0 for any tuple vprime such that vprime isequal to v in all public attributes and vprime[B0] β then theadversary can exclude β from PB0

v )erefore

g v B01113858 1113859( 1113857 ε

1 minus εgt ε (17)

and equation (9) no longer holdsAs shown above if values in PB0

v are marked by anadversary as invalid for v given PK(vprime|D) then the adversarycan successfully break the privacy guarantee defined inframework with virtual values

52 Design In this section we propose polymorphic valuesets with true values (PVST) that construct polymorphicvalue sets with values that cannot be excluded by PK(t ∣ D)PVST considers nontrivial prior knowledge of adversariesand presents the same degree of privacy guarantee in-troduced in (9)

We have shown above that the implementation withvirtual values can be compromised by adversaries with priorknowledge Consider a tuple v isin D Given privacy guaranteeϵ we construct mprime polymorphic values sets PB0

v PB

mprimev for

each private attributes of v where |PBj

v |ge 1ε Let set MAv be

MAv v A01113858 11138591113864 1113865 times middot middot middot times v Am1113858 11138591113864 1113865 times P

B0v times middot middot middot times P

Bmprimev (18)

MAv is the Cartesian product of sets each of which

contains vrsquos all equivalent values in an attribute For publicattribute Ai the corresponding set is v[Ai]1113864 1113865 since publicattribute values are open to the adversary For private at-tribute Bj the corresponding set is P

Bj

v )erefore MAv

contains all possible tuples that are indistinguishable with v

with respect to RA (including v itself )With prior knowledge on PK(t ∣ D) an adversary is able

to exclude a value β from PBj

v if

forallt isin t ∣ t isinMAv t Bj1113960 1113961 β1113966 1113967PK(t ∣ D) 0 (19)

As described above it is safe for the adversary to con-clude that βne v[Bj] if there is no t isinMA

v such that t[Bj] βand PK(t ∣ D) 1 Alternatively if for every β in P

Bj

v thereis a t such that t[Bj] β and PK(t ∣ D) 1 then the ad-

versary cannot exclude any value in PBj

v and thereforePv[Bj]le ε is guaranteed Since PK(t ∣ D) 1 iff t isin D weconstruct polymorphic value sets with true values fromt isin D

Definition 4 If there exists l distinct values β1 βl isin PBj

v

such that forallr isin 1 l βr holds the following property

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv isin D and forallj isin 1 mprime1113864 1113865

(1) k (1ε) minus 1(2) for t in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) Set VBj v[Bj]1113966 1113967

(6) Sort elements of Set by their frequencies in q[Bj]| q isin Q1113966 1113967 in ascending order(7) Remove the last |Set| minus k + 1 elements in Set(8) P

Bj

v PBj

v cup Set(9) end(10) end

ALGORITHM 1 PVSV-Constructor

Security and Communication Networks 7

existt isinMAv t Bj1113960 1113961 βr and t isin D (20)

)en we say that PBj

v covers l true values We denoteT

Bj

v β1 βl1113864 1113865 as the true value set of t in BjPrivacy guarantee for a database D where every tuplersquos

every private attribute is included by one equivalent value setwhich covers at least l true values a privacy level of ε 1l isachieved

Assume that the adversaryrsquos objective is to infer the valueof v[Bj] As mentioned in the adversary model we make noassumption on the attacking methods adopted by an ad-versary Consider MA

v defined in (18) forallt tprime isinMAv and

forallq isin QA s(t ∣ q) s(tprime ∣ q) )erefore the adversary cannotdistinguish tuples in MA

v by observing RA In this situationthe adversary would use PK(t ∣ D) to exclude all tuple t inMA

v such that PK(t ∣ D) 0 However as PBj

v covers l truevalues there exists l tuples t1 tl isinMA

v such thatPK(t ∣ D) 1 and tr[Bj]ne ts[Bj] for arbitraryr s isin 1 l As a result values in t1[Bj] tl[Bj]1113966 1113967 areindistinguishable given RA and PK(t ∣ D) )us Pv[Bj] 1lAccording to (3) a privacy level of 1l is achieved

53 Utility Optimization An intuitive method of con-structing P

Bj

v is to insert the value of Bj of all tuples that sharethe same public attribute values with v into P

Bj

v ie PBj

v

PBj

v 1113864vprime[Bj]|foralli and vprime isin D vprime[Ai] v[Ai]1113865 )e privacyguarantee is met if P

Bj

v covers at least 1ε true valuesNevertheless utility loss cannot be ignored as in the intuitivemethod s(vprime| q) would be the same for all vprime isin 1113864vprime| foralliand vprime isin D vprime[Ai] v[Ai]1113865 and thus information of privateattributes are missing in the ranked result of vprime )ereforethe size of P

Bj

v is critical in balancing privacy and utilityWith no loss of generality we limit the size of each poly-morphic value set to 2 We show that constructing suchpolymorphic value sets is an NP-hard problem

Definition 5 We define the 2-PVST problem as followsgiven database D and a query workload Q construct P

Bj

v ofsize 2 for each tuple v isin D forallj isin 1 mprime1113864 1113865 and minimizeUQ defined in (10)

Theorem 2 4e 2-PVST problem is NP-hard

)e proof of )eorem 2 can be found in Appendix B

54 Heuristic Algorithm We have shown that the 2-PVSTproblem is NP-hard In this subsection we present PVST-Constructor a heuristic algorithm that constructs PVSTwithin polynomial time PVST-Constructor tries to mini-mize |P

Bj

v | and 1113936qisinQ|sprime(v ∣ q) minus s(v ∣ q)| for each v isin D

and j isin 1 mprime with the greedy algorithm)e pseudo-code of PVST-Constructor is shown in

Algorithm 2 In lines 2 to 6 we initialize each PBj

v withv[Bj] )en for each v isin D PVST-Constructor constructsPB1

v PBmprime

v by finding a vprime with the greedy algorithm andinserting vprime[Bj] into P

Bj

v forallj isin 1 mprime)e above process

will be taken multiple times until |PBj

v |ge 1ε Since every vprime isa real tuple existing in D and we insert at least k tuples in theabove processes P

Bj

v covers at least k + 1 true values andthus the privacy guarantee is met As mentioned in UtilityOptimization the size of P

Bj

v is critical in minimizing utilityloss )erefore the heuristic algorithm tries to minimize theutility loss by minimizing the size of each P

Bj

v We count thenumber of polymorphic value sets of v if the set contains lessthan k values that can be enlarged by inserting vprimersquos privateattribute values We denote this count as covervprime

covervprime j ∣ PBj

v

11138681113868111386811138681113868

11138681113868111386811138681113868lt k vprime Bj1113960 1113961 notin PBj

v1113882 1113883

1113868111386811138681113868111386811138681113868

1113868111386811138681113868111386811138681113868 (21)

Tuple vprime with a higher covervprime value can enlarge the sizeof more polymorphic value sets of v and thus we can reducethe number of tuples that we have to insert into PB1

v PBmprime

v Furthermore we take into consideration queries in Q In

order to minimize UQ we have to minimize |sprime(v ∣ q) minus

s(v ∣ q)| for q isin Q Note that for each attribute Bj|ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| 1 if v[Bj]ne q[Bj] andq[Bj] isin P

Bj

v We denote the value of 1113936j|ρprime(q[Bj] v[Bj]) minus

ρ(q[Bj] v[Bj])| as lossvprime )us we have

lossvprime 1113944qisinQ

1113944

j1mprime

ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

1113944qisinQ

1113944

j1mprime

δ v vprime q j( 1113857

(22)where δ(v vprime q j) 1 if v[Bj]ne q[Bj] and vprime[Bj] q[Bj]

and δ(v vprime q j) 0 otherwise )erefore tuple vprime with asmaller lossvprime can reduce the value of 1113936j|ρprime(q[Bj]

v[Bj]) minus ρ(q[Bj] v[Bj])| and thus we can reduce the valueof UQ

)e computation of covervprime and lossvprime is done in line 11)en we compute scorevprime the score of vprime that indicates howpreferable vprime is relative to other tuples in D v Weadopt the greedy algorithm to find the next vprime for PB1

v

PB

mprimev ie in each iteration we choose the tuple

that has the highest score value In line 15 we insert theprivate attribute values of the chosen tuple (denoted as vmax)into PB1

v PB

mprimev )e above process is repeated until

the sizes of PB1v P

Bmprime

v are no less than k

6 Experimental Results

61 Experimental Setup To validate PVSV-Constructor andPVST-Constructor algorithms we conducted experiments on areal world dataset [29] from eHarmony which contains 58attributes and 486464 tuples We removed 5 noncategoricalattributes and randomly picked 20 categorical attributes fromthe remaining 53 attributes )e domain sizes of the 20 at-tributes range from 2 to 15 After removing duplicate tuples werandomly picked 300000 tuples as our testing bed

By default we use the ranking function from the rankedretrieval model with all weights set to 1 All experimentalresults were obtained on a Mac machine running Mac OS

8 Security and Communication Networks

with 8GB of RAM )e algorithms were implemented inPython

62 Privacy )e privacy guarantee of PVSV-Constructorand PVST-Constructor were tested by performing RankedInference attack [35] including Point-Query In-QueryPoint-QueryampInsert and In-QueryampInsert attackingmethods on the dataset From a total of 20 attributes 5attributes were randomly chosen as public attributes andanother 5 attributes were randomly chosen as private at-tributes We randomly picked 20000 distinct tuples fromthe dataset as the testing bed and randomly generated 10tuples as the query workload For PVSV-Constructor weconstructed a polymorphic value set of size 2 for eachprivate attribute and each tuple in the testing bed ForPVST-Constructor we constructed a polymorphic valueset that covers at least two true values for each privateattribute and each tuple We randomly picked 1000 tuplesfrom the testing bed as our targets and performed 1000Rank Inference attacks (250 attacks for each of the fourmethods) on the five private attributes of target tuples Wemeasured the attack success guess rates based on the fre-quency of successful inference among all inference at-tempts Figure 1 shows the success guess rates of RankInference attacks on the unprotected testing bed the testingbed with PVSV and the testing bed with PVST As the sizeof each polymorphic value set is 2 the success guess rateson PVSV are around 50 which are significantly lowerthan those of unprotected dataset We also observe that thesuccess guess rates on PVSTare slightly lower than those of

PVSV )e reason is that PVSV-Constructor will beinserting values into tuple vrsquos polymorphic value sets untilall vrsquos polymorphic value sets cover at least 2 true values)us some polymorphic value sets of v may contain morethan 2 values

63 Utility In this subsection we quantify utility loss ofPVSV-Constructor and PVST-Constructor algorithms)e privacy guarantee of both PVSV-Constructor andPVST-Constructor is 12 ie the polymorphic value setsconstructed by PVSV-Constructor contains 2 values andthe polymorphic value sets constructed by PVST-Con-structor contains at least 2 true values )e key param-eters here are the size of query workload Q the size ofdatabase D the number of public and private attributesand the weight ratios in the ranking function We ran-domly generated 20 tuples as the query workload Bydefault we picked 10 tuples from Q and set |D| 300 000m 10 and mprime 10 )erefore we randomly picked 10attributes from the testing bed and set them as publicattributes )e rest of the 10 attributes were set as privateattributes

Many recommendation systems of ONS applicationsfeature top-k recommendation [1 14 47] where the rankedresult contains a set of k tuples that will be of interest to acertain user as it is impractical and unnecessary to return alltuples in the database to the user )erefore we introduceaverage top-k utility loss Uaul a variant of UQ that focuses onutility loss of the top-k tuples in a ranked result Uaul isdefined as

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv and forallBj

(1) k (1ε) minus 1(2) for v in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) end(6) end(7) for v in D do(8) while existj isin 1 mprime1113864 1113865 |P

Bj

v |lt k + 1 do(9) for vprime isin t isin D|forallAi t[Ai] v[Ai]1113864 1113865 do(10) compute covervprime lossvprime(11) scorevprime covervprime 11 + exp(minus lossvprime )

(12) end(13) vmax argmaxvprimeisin |tisinD|forallAit[Ai]v[Ai] scorevprime

(14) for j isin 1 mprime1113864 1113865 do

(15) PBj

v PBj

v cup tmax[Bj]1113966 1113967

(16) end(17) end(18) end

ALGORITHM 2 PVST-Constructor

Security and Communication Networks 9

Uaul 1

|Q|1113944

|Q|

i1

1Dprime

11138681113868111386811138681113868111386811138681113868

1113944

visinDprime

min Rank(v ∣ q) k + 11113864 1113865 minus min Rankprime(v ∣ q) k + 11113864 11138651113868111386811138681113868

1113868111386811138681113868

k (23)

where Dprime v isin D |Rank(v | q))lt k or Rank(v | q))lt k1113864 1113865Intuitively Uaul represents the average percentage rankdifference relative to k over all queries and all tuples in top-kUaul is equivalent to UQ|Q||D|2 when k |D| By default weset k 100

631 Evaluation of Uaul with Varying k We first discuss theaverage top-k utility loss of PVSV-Constructor PVST-Constructor and a baseline algorithm on varying k withother parameters set to default values For a tuple vrsquos at-tribute Bj the baseline algorithm constructs P

Bj

v with v[Bj]

and a value randomly picked from VBj v[Bj]1113966 1113967 )e results

are presented in Figure 2 which shows that the Uaul ofPVSV-Constructor is significantly lower than that of thebaseline algorithm )e Uaul of PVST-Constructor is alsolower than that of the baseline algorithm when kge 5 eventhough the baseline algorithm cannot preserve privacyagainst authenticity-knowledgeable adversaries )e exper-imental results show that both PVSV and PVST can reduceutility loss with respect to rank differences Also note thatPVST-Constructor constructs polymorphic value sets withtrue values and thus from Figure 3 we can see that somepolymorphic value sets constructed by PVST-Constructorcontains more than 2 values

632 Evaluation of Uaul with Varying Sizes of QFigure 4 presents the average top-k utility loss of PVSV-Constructor on varying |Q| When |Q| is set to 1 5 or 10 werandomly picked 1 5 or 10 queries from the original Qrespectively With increasing number of queries in the queryworkload the Uaul of PVSV-Constructor increases mono-tonically )e reason is that PVSV-Constructor alwaysgenerates the least frequent value (denoted as vprime[Bj]) in1113864q[Bj] | q isin Q1113865 for v[Bj] If |Q| is small then it is possiblethat vprime[Bj] notin 1113864q[Bj] | q isin Q1113865 and thus s(v ∣ q) sprime(v ∣ q)

forallq isin Q However a larger query workload covers moreprivate attributes values and thus it will be harder forPVSV-Constructor to generate a value for v[Bj] that has noimpact on the rank of v for any q isin Q Figure 5 presents theaverage top-k utility loss of PVST-Constructor on varying|Q| We can see that the size of Q has no significant impacton the Uaul of PVST-Constructor as the PVST-Constructoralways pick a tuple vprime that is different from v in most at-tributes and then insert vprime[Bj] into P

Bj

v

633 Evaluation of Uaul with Varying Sizes of the DatasetFigures 6 and 7 depict the impact of the size of datasets onUaul of PVSV-Constructor and PVST-Constructor Datasetsof 100000 and 200000 tuples were randomly sampled from

QPoint QIn QIPoint QIIn

Atta

ck su

cces

s gue

ss ra

te

Without protectionPVSVPVST

1

09

08

07

06

05

04

03

02

01

0

Figure 1 Attack success guess

10 Security and Communication Networks

the testing bed of 300000 tuples As expected |D| has nosignificant impact on the Uaul of PVSV-Constructor sincePVSV-Constructor generates values from the domain ofeach private attributes |D| has no impact on the Uaul ofPVST-Constructor which indicates that a dataset contain-ing 100000 tuples is sufficient for PVST-Constructor togenerate polymorphic value sets with true values

634 Evaluation of Uaul with Varying m We investigate theimpact of the number of private and public attributes on

average top-k utility loss Figure 8 presents theUaul of PVSV-Constructor with fixed mprime and varying m and with fixed mand varying mprime When mmprime is set to 5 we randomly re-moved 5 attributes from the testing bed When mmprime is set to

0 20 40 60 80 100k

RandomPVSVPVST

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 2 Utility loss vs k

1 2 3 4 5+Size of PVST

3

25

2

15

1

05

0

Coun

t 10

5

Figure 3 Size of polymorphic value set (PVST)

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

00 10 20

Figure 4 Utility loss vs |Q|

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

00 10 20

Figure 5 Utility loss vs |Q|

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 6 Utility loss vs |D|

Security and Communication Networks 11

15 we added 5 more categorical attributes randomly chosenfrom the unused attributes We observe that the Uaulmonotonically decreases with increasing number of publicattributes and monotonically increases with increasingnumber of private attributes As expected a higher pro-portion of public attributes leads to less variant betweens(v ∣ q) and sprime(v ∣ q) e results of the same experimentwith PVST-Constructor are shown in Figure 9 Uaul in-creases as increasing number of private attributes as ex-pected However Uaul also increases slightly with increasingnumber of public attributes is is due to the fact that withmore public attributes there will be few tuples that share thesame public attribute values Since PVST-Constructor in-serts only private attribute values from tuples sharing samepublic attribute values more values will be inserted to eachpolymorphic value set which introduces higher utility loss

635 Evaluation of Uaul with Varying Weight RatiosFigures 10 and 11 illustrate the impact of weight ratios on theaverage utility loss e experiment was conducted with a

xed private attribute weight of 1 and varying public at-tribute weights of 1 2 and 3 As expected Uaul of bothPVSV-Constructor and PVST-Constructor decreases as

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 10 Utility loss vs weight ratio

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 11 Utility loss vs weight ratio

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss1

09

08

07

06

05

04

03

02

01

0

Figure 7 Utility loss vs |D|

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 8 Utility loss vs m and mprime

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 9 Utility loss vs m and mprime

12 Security and Communication Networks

increasing weight ratio of public attributes)e reason is thatas the public attribute weight increases the part in sprime(v ∣ q)

caused by private attributes decreases )erefore less impactwould be made to sprime(v ∣ q) by PVSV and PVST As a resultsprime(v ∣ q) would be closer to s(v ∣ q) and the utility loss couldbe decreased

7 Conclusions

In this paper we proposed a novel framework that preservesprivacy of private attributes against arbitrary attacks throughthe ranked retrieval model Furthermore we identify twocategories of adversaries based on varying adversarial ca-pabilities For each kind of adversaries we presentedimplementation of our framework Our experimental resultssuggest that our implementations efficiently preserve privacyagainst Rank Inference attack [35] Moreover the imple-mentations significantly reduce utility loss with respect tothe variance in ranked results

It is our hope that this paper can motivate further re-search in privacy preservation of SIoT with consideration ofsocial network features andor variant information retrievalmodels eg text mining

Appendix

A 2-PVSV

In this subsection we prove that constructing an optimal PVSVfor each attribute of a tuple v is an NP-hard problem

Definition A1 For a tuple v in D we create PBj

v for each BjWe say v satisfies query q if the following hold for any tuple t(tne v) in D (1) If s(v ∣ q)lt s(t ∣ q) then sprime(v ∣ q)lt sprime(t ∣ q)and (2) if s(v ∣ q)ge s(t ∣ q) then sprime(v ∣ q)ge sprime(t ∣ q)

For a database containing 2 tuples the 2-PVSV problemcan be redefined as follows given a query workload Q adatabase D construct PB1

v PBmprime

v of size 2 such that v

satisfies the most queries in Q

Definition A2 Max-3Sat Problem given a 3-CNF formulaΦ find the truth assignment that satisfies that most clauses

Lemma A1 Max-3Sat leP 2-PVSV Problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVSV instance Without loss of generality we suppose thatΦ is a conjunction of l clauses and each clause is a dis-junction of 3 literals from set X x1 xn1113864 1113865 We constructdatabase D as follows D has 0 public attributes and n + 1private attributes B1 Bn C1 Let VB

j be the attribute do-main of Bj j isin 1 n and VC

1 be the attribute domain ofC1 Let VB

i minus 1 0 1 2 n + 2 and VC1 0 1 Two

tuples v1 and v2 are inserted into D v1[Bj] 0 and v2[Bj]

j + 2 for j isin 1 n while v1[C1] 0 and v2[C1] 1 Wesimplify the score function defined in (1) by setting all weightsto 1 Also note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null

We construct query workload Q based on Φ For eachclause Lk isin Φ we construct a query qk such that qk[Bi] i iffthe corresponding literal xi is a positive literal in Lk andq[Bi] i + 1 iff xi is a negative literal in the clause Forexample given a clause (x1orx2orx3) the correspondingquery q should satisfy q[B1] 1 q[B2] 3 q[B3] 3 andq[C1] 0 All other attributes in q are set to null by default)erefore s(v1 ∣ q) 1 and s(v2 ∣ q) 0 forallq isin Q

Since s(v2 ∣ q)lt s(v1 ∣ q) forallq isin Q in order to minimizeutility loss the value of sprime(v2 ∣ q) should be as small aspossible We observe that the minimum possible sprime(v2 ∣ q) is1 because PC1

v2must be 0 1 as VC

1 0 1 Without loss ofgenerality we assume that we have already constructedPB1

v2 P

Bmprime

v2 PC1v2

for v2 such that sprime(v2 ∣ q) 1 forallq isin QNow we have an instance of 2-PVSV problem that given

D Q constructs PB1v1

PBmprime

v1 PC1

v1that satisfies the most

queries in Q Since VC1 0 1 PC1

v1must be 0 1 too as

PC1v1

contains two distinct values )erefore for an ar-bitrary query q isin Q we have ρprime(q[C] v1[C]) 1 andρprime(q[C] v2[C]) 1 In order to let v1 satisfy q we have toensure that sprime(v1 ∣ q)gt sprime(v2 ∣ q) 1 )us we have

sprime v1 ∣ q( 1113857gt 1

⟺1113944mprime

i1ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857gt 1

⟺existi isin 1 mprime1113864 1113865 ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857 1

⟺existi isin 1 mprime1113864 1113865 q Bi1113858 1113859 isin PBi

v1

(A1)

Now we show that the solution of 2-PVSV problemconstructed above can answer the corresponding Max-3Sat Problem Suppose that we have the solution PB1

v1

PB

mprimev1 such that v1 satisfies the most queries in Q As in

(A1) if v1 satisfies qk we have qk[Bi] isin PBiv1 If v1 does not

satisfy qk then qk[Bi] notin PBiv1 Recall that we assign value i or

i + 1 to qk[Bi] if xi is a positive or negative literal in clauseLk respectively )erefore the assignment of xi given PBi

v1is

xi true if i isin PBi

v1

xi false if i + 1 isin PBi

v1

(A2)

Furthermore from equation (A1) we have

sprime v1 ∣ q( 1113857gt 1⟺L true (A3)

where L is qrsquos corresponding clause in Φ Note thati i + 1 nsubQ as |PBi

v1| 2 and 0 isin PBi

v1 )us the value of xi is

either true or falseAs we proved above v1 satisfies qi if and only if Li is true

given assignment x1 xl1113864 1113865 constructed according toequation (A2) )erefore x1 xl1113864 1113865 satisfies the mostclauses in ϕ if and only if v1 satisfies the most queries in Q

We now prove that function f can be conducted inpolynomial time Given a formula Φ with n variables and lclauses we construct a 2-PVSV instance with 2 tuples each

Security and Communication Networks 13

of which has n + 1 attributes and l queries each of which has4 attributes )erefore O(2n + 4l) assignments are neededand f can be conducted in polynomial time

Proof of 4eorem 1 We now prove that the 2-PVSVproblem is NP-hard In Lemma 3 we proved that Max-3SatProblem can be reduced to the 2-PVSV problem in poly-nomial time Furthermore as Max-3Sat Problem is a NP-hard problem [33] the 2-PVSV problem is NP-hard

B 2-PVST

In this section we prove that constructing an optimal PVSTfor each private attribute of a tuple v isin D is NP-hard We usethe definition of v satisfying q from (9))e 2-PVSTproblemcan be redefined as follows

Definition B1 2-PVSTproblem given a query workload Qa database D the optimization problem of 2-PVST is toconstruct an arbitrary tuple vrsquos polymorphic setsPB1

v PBmprime

v that satisfies the most queries in Q )e size ofeach polymorphic vale set is 2

Lemma B1 Max-3Sat le P 2-PVST problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVST instance We assume that Φ is a conjunction of lclauses and each clause is a disjunction of 3 literals from setX x1 xn1113864 1113865 Let D have no public attribute and n + 1private attributes B1 Bn C1 For each literal xi we inserttwo tuples vi and vi

prime into D where

vi C11113858 1113859 1 vi Bi1113858 1113859 1 vi Bj1113960 1113961 minus 1foralljne i

viprime C11113858 1113859 1 vi

prime Bi1113858 1113859 minus 1 viprime Bj1113960 1113961 1foralljne i

(B1)

)en we insert tuple v into D where v[C1] 0 andv[Bj] 0 forallj isin 1 mprime We also set the domain of eachBj as minus 1 0 1 and the domain of C1 as 0 1

Without loss of generality the score function defined in(1) is simplified by setting all weights to 1

Next we construct query workload Q For each clauseLk isin Φ we construct a query qk in which qk[Bj] minus 1 iff xj isa positive literal in Lk and qk[Bj] 1 iff xj is a negativeliteral in Lk We also set qk[C1] 1 )e rest of the attributevalues are set to null by default )erefore ifLk (x1orne x2orx3) then we have qk[C1] 1 qk[B1] 1qk[B2] 1 qk[B3] minus 1 and qk[Bj] null forj isin 4 mprime

Note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null )uss(v qk) 0 forallq isin Q and s(vi qk) s(vi

prime qk)ge 1 forallq isin Q Weobserve that for q isin Q and i isin 1 n s(v ∣ q)lt s(vi ∣ q)

and s(v ∣ q)lt s(viprime ∣ q) In order to reduce UQ we have to

maximize the number of q isin Q such that sprime(v ∣ q)lt sprime(vi ∣ q)

and sprime(v ∣ q)lt sprime(viprime ∣ q) )e maximum value of sprime(vi ∣ q)

and sprime(viprime ∣ q) is 4 which can be achieved by inserting vi[Bj]

into PBj

viprime and inserting vi

prime[Bj] into PBj

vi Since PC1

v 0 1 thevalue of sprime(v ∣ q) is at least 1 )erefore we have

sprime(v ∣ q)lt sprime vi ∣ q( 1113857

⟺sprime(v ∣ q)lt 4

⟺1113944

mprime

j1ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873lt 3

⟺existj isin 1 mprime ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 0

⟺existj isin 1 mprime v Bj1113960 1113961ne q Bj1113960 1113961

⟺existj isin 1 mprime minus q Bj1113960 1113961 isin PBj

v

(B2)

Since |PBj

v | is limited to 2 PBj

v 0 1 or 0 minus 1 ie wemust insert either virsquos or vi

primersquos private attribute values intoPB1

v PBmv Recall that we assign minus 1 or 1 to qk[Bj] if xj

is positive or negative respectively in Lk In order toreduce Max-3Sat to 2-PSVT we assign literals in the fol-lowing way

xj true if minus 1 isin PBj

v

xj false if 1 isin PBj

v (B3)

)erefore we denote the corresponding clause of q as Land from equation (B2) we have

existj isin 1 mprime1113864 1113865 minus q Bj1113960 1113961 isin PBj

v

⟺existj isin 1 mprime1113864 1113865 xj true if xj is positive in L

orxj false if xj is negative in L

⟺L true(B4)

From the above equation we observe that v satisfying qk

is equivalent to Lk being true)erefore we can reduceMax-3Sat to 2-PVST Given a solution to the 2-PVSTproblem thatmaximizes the number of queries satisfied by v assignmentx1 xn1113864 1113865 produced by equation (B3) can satisfy thelargest number of clauses in Φ

Given a Max-3Sat instance Φ the construction of the 2-PVST instance can be conducted in polynomial time as weconstruct n queries with 3 attributes and 2n + 1 tuples with nattributes)e transformation from the optimal solution of 2-PVST to the optimal solution of Max-3Sat also takes poly-nomial time as we assign the value to each one of x1 xn

once )erefore f is a polynomial time function

Proof of 4eorem 2 In 4 we proved that a Max-3Sat in-stance can be reduced to a 2-PSVT instance in polynomialtime Furthermore as Max-3Sat Problem is an NP-hardproblem the 2-PVSV problem is an NP-hard problem

Data Availability

)e data used to support the findings of this study areavailable from the corresponding author upon request

14 Security and Communication Networks

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 7: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

1113944

qisinQq Bj1113858 1113859nev Bj1113858 1113859

wBj ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 w

Bj q isin Q ∣ q Bj1113960 1113961 β1113966 111396711138681113868111386811138681113868

11138681113868111386811138681113868

(16)

In this case β has to be a value in VBj v[Bj] such that β

has the lowest frequency among all q[Bj] values for q isin QNote that β does not have to be an element of q[Bj] ∣1113966

q isin Q If β notin q[Bj] ∣ q isin Q1113966 1113967 its frequency is 0 Similarly if we

want to construct PBj

v of size k then we should insert the k minus 1least frequent values among all q[Bj] values into P

Bj

v In line 4 of Algorithm 1 we initialize the polymorphic

value set of v[Bj] by inserting v[Bj] into PBj

v For v[Bj] aset Set is constructed from VB

j v[Bj]1113966 1113967 which containsall values that can be inserted into P

Bj

v In order tominimize U(P

Bj

v ) we insert the k minus 1 least frequentvalues from Set into P

Bj

v by their frequencies inq[Bj] ∣ q isin Q1113966 1113967 )e above construction of P

Bj

v is repeatedfor each t isin D and j isin 1 mprime )e computationalcomplexity of Algorithm 1 is O(|D| middot |Q| middot mprime)

5 Framework with True Values

51 Authenticity-Knowledgeable Adversaries As we men-tioned in the adversary model an authenticity-knowl-edgeable adversary is able to tell if t isin D is possible As aresult the authenticity-knowledgeable adversary can launcha more efficient attack on private attributes by examining theauthenticity of values learned from RA We show a simplecase where an authenticity-knowledgeable adversary breaksthe privacy guarantee provided by polymorphic values setsconstructed with virtual values Consider that the objectiveof the adversary is to infer the value of v[B0] and B0 is theonly private attribute in D PB0

v is the polymorphic value setgenerated for v[B0] and |PB0

v | 1ε Without prior knowl-edge Pv[B0] ε and the privacy guarantee is achievedHowever for a value β isin PB0

v v[B0]1113864 1113865 if the adversary canconclude that PK(vprime ∣ D) 0 for any tuple vprime such that vprime isequal to v in all public attributes and vprime[B0] β then theadversary can exclude β from PB0

v )erefore

g v B01113858 1113859( 1113857 ε

1 minus εgt ε (17)

and equation (9) no longer holdsAs shown above if values in PB0

v are marked by anadversary as invalid for v given PK(vprime|D) then the adversarycan successfully break the privacy guarantee defined inframework with virtual values

52 Design In this section we propose polymorphic valuesets with true values (PVST) that construct polymorphicvalue sets with values that cannot be excluded by PK(t ∣ D)PVST considers nontrivial prior knowledge of adversariesand presents the same degree of privacy guarantee in-troduced in (9)

We have shown above that the implementation withvirtual values can be compromised by adversaries with priorknowledge Consider a tuple v isin D Given privacy guaranteeϵ we construct mprime polymorphic values sets PB0

v PB

mprimev for

each private attributes of v where |PBj

v |ge 1ε Let set MAv be

MAv v A01113858 11138591113864 1113865 times middot middot middot times v Am1113858 11138591113864 1113865 times P

B0v times middot middot middot times P

Bmprimev (18)

MAv is the Cartesian product of sets each of which

contains vrsquos all equivalent values in an attribute For publicattribute Ai the corresponding set is v[Ai]1113864 1113865 since publicattribute values are open to the adversary For private at-tribute Bj the corresponding set is P

Bj

v )erefore MAv

contains all possible tuples that are indistinguishable with v

with respect to RA (including v itself )With prior knowledge on PK(t ∣ D) an adversary is able

to exclude a value β from PBj

v if

forallt isin t ∣ t isinMAv t Bj1113960 1113961 β1113966 1113967PK(t ∣ D) 0 (19)

As described above it is safe for the adversary to con-clude that βne v[Bj] if there is no t isinMA

v such that t[Bj] βand PK(t ∣ D) 1 Alternatively if for every β in P

Bj

v thereis a t such that t[Bj] β and PK(t ∣ D) 1 then the ad-

versary cannot exclude any value in PBj

v and thereforePv[Bj]le ε is guaranteed Since PK(t ∣ D) 1 iff t isin D weconstruct polymorphic value sets with true values fromt isin D

Definition 4 If there exists l distinct values β1 βl isin PBj

v

such that forallr isin 1 l βr holds the following property

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv isin D and forallj isin 1 mprime1113864 1113865

(1) k (1ε) minus 1(2) for t in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) Set VBj v[Bj]1113966 1113967

(6) Sort elements of Set by their frequencies in q[Bj]| q isin Q1113966 1113967 in ascending order(7) Remove the last |Set| minus k + 1 elements in Set(8) P

Bj

v PBj

v cup Set(9) end(10) end

ALGORITHM 1 PVSV-Constructor

Security and Communication Networks 7

existt isinMAv t Bj1113960 1113961 βr and t isin D (20)

)en we say that PBj

v covers l true values We denoteT

Bj

v β1 βl1113864 1113865 as the true value set of t in BjPrivacy guarantee for a database D where every tuplersquos

every private attribute is included by one equivalent value setwhich covers at least l true values a privacy level of ε 1l isachieved

Assume that the adversaryrsquos objective is to infer the valueof v[Bj] As mentioned in the adversary model we make noassumption on the attacking methods adopted by an ad-versary Consider MA

v defined in (18) forallt tprime isinMAv and

forallq isin QA s(t ∣ q) s(tprime ∣ q) )erefore the adversary cannotdistinguish tuples in MA

v by observing RA In this situationthe adversary would use PK(t ∣ D) to exclude all tuple t inMA

v such that PK(t ∣ D) 0 However as PBj

v covers l truevalues there exists l tuples t1 tl isinMA

v such thatPK(t ∣ D) 1 and tr[Bj]ne ts[Bj] for arbitraryr s isin 1 l As a result values in t1[Bj] tl[Bj]1113966 1113967 areindistinguishable given RA and PK(t ∣ D) )us Pv[Bj] 1lAccording to (3) a privacy level of 1l is achieved

53 Utility Optimization An intuitive method of con-structing P

Bj

v is to insert the value of Bj of all tuples that sharethe same public attribute values with v into P

Bj

v ie PBj

v

PBj

v 1113864vprime[Bj]|foralli and vprime isin D vprime[Ai] v[Ai]1113865 )e privacyguarantee is met if P

Bj

v covers at least 1ε true valuesNevertheless utility loss cannot be ignored as in the intuitivemethod s(vprime| q) would be the same for all vprime isin 1113864vprime| foralliand vprime isin D vprime[Ai] v[Ai]1113865 and thus information of privateattributes are missing in the ranked result of vprime )ereforethe size of P

Bj

v is critical in balancing privacy and utilityWith no loss of generality we limit the size of each poly-morphic value set to 2 We show that constructing suchpolymorphic value sets is an NP-hard problem

Definition 5 We define the 2-PVST problem as followsgiven database D and a query workload Q construct P

Bj

v ofsize 2 for each tuple v isin D forallj isin 1 mprime1113864 1113865 and minimizeUQ defined in (10)

Theorem 2 4e 2-PVST problem is NP-hard

)e proof of )eorem 2 can be found in Appendix B

54 Heuristic Algorithm We have shown that the 2-PVSTproblem is NP-hard In this subsection we present PVST-Constructor a heuristic algorithm that constructs PVSTwithin polynomial time PVST-Constructor tries to mini-mize |P

Bj

v | and 1113936qisinQ|sprime(v ∣ q) minus s(v ∣ q)| for each v isin D

and j isin 1 mprime with the greedy algorithm)e pseudo-code of PVST-Constructor is shown in

Algorithm 2 In lines 2 to 6 we initialize each PBj

v withv[Bj] )en for each v isin D PVST-Constructor constructsPB1

v PBmprime

v by finding a vprime with the greedy algorithm andinserting vprime[Bj] into P

Bj

v forallj isin 1 mprime)e above process

will be taken multiple times until |PBj

v |ge 1ε Since every vprime isa real tuple existing in D and we insert at least k tuples in theabove processes P

Bj

v covers at least k + 1 true values andthus the privacy guarantee is met As mentioned in UtilityOptimization the size of P

Bj

v is critical in minimizing utilityloss )erefore the heuristic algorithm tries to minimize theutility loss by minimizing the size of each P

Bj

v We count thenumber of polymorphic value sets of v if the set contains lessthan k values that can be enlarged by inserting vprimersquos privateattribute values We denote this count as covervprime

covervprime j ∣ PBj

v

11138681113868111386811138681113868

11138681113868111386811138681113868lt k vprime Bj1113960 1113961 notin PBj

v1113882 1113883

1113868111386811138681113868111386811138681113868

1113868111386811138681113868111386811138681113868 (21)

Tuple vprime with a higher covervprime value can enlarge the sizeof more polymorphic value sets of v and thus we can reducethe number of tuples that we have to insert into PB1

v PBmprime

v Furthermore we take into consideration queries in Q In

order to minimize UQ we have to minimize |sprime(v ∣ q) minus

s(v ∣ q)| for q isin Q Note that for each attribute Bj|ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| 1 if v[Bj]ne q[Bj] andq[Bj] isin P

Bj

v We denote the value of 1113936j|ρprime(q[Bj] v[Bj]) minus

ρ(q[Bj] v[Bj])| as lossvprime )us we have

lossvprime 1113944qisinQ

1113944

j1mprime

ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

1113944qisinQ

1113944

j1mprime

δ v vprime q j( 1113857

(22)where δ(v vprime q j) 1 if v[Bj]ne q[Bj] and vprime[Bj] q[Bj]

and δ(v vprime q j) 0 otherwise )erefore tuple vprime with asmaller lossvprime can reduce the value of 1113936j|ρprime(q[Bj]

v[Bj]) minus ρ(q[Bj] v[Bj])| and thus we can reduce the valueof UQ

)e computation of covervprime and lossvprime is done in line 11)en we compute scorevprime the score of vprime that indicates howpreferable vprime is relative to other tuples in D v Weadopt the greedy algorithm to find the next vprime for PB1

v

PB

mprimev ie in each iteration we choose the tuple

that has the highest score value In line 15 we insert theprivate attribute values of the chosen tuple (denoted as vmax)into PB1

v PB

mprimev )e above process is repeated until

the sizes of PB1v P

Bmprime

v are no less than k

6 Experimental Results

61 Experimental Setup To validate PVSV-Constructor andPVST-Constructor algorithms we conducted experiments on areal world dataset [29] from eHarmony which contains 58attributes and 486464 tuples We removed 5 noncategoricalattributes and randomly picked 20 categorical attributes fromthe remaining 53 attributes )e domain sizes of the 20 at-tributes range from 2 to 15 After removing duplicate tuples werandomly picked 300000 tuples as our testing bed

By default we use the ranking function from the rankedretrieval model with all weights set to 1 All experimentalresults were obtained on a Mac machine running Mac OS

8 Security and Communication Networks

with 8GB of RAM )e algorithms were implemented inPython

62 Privacy )e privacy guarantee of PVSV-Constructorand PVST-Constructor were tested by performing RankedInference attack [35] including Point-Query In-QueryPoint-QueryampInsert and In-QueryampInsert attackingmethods on the dataset From a total of 20 attributes 5attributes were randomly chosen as public attributes andanother 5 attributes were randomly chosen as private at-tributes We randomly picked 20000 distinct tuples fromthe dataset as the testing bed and randomly generated 10tuples as the query workload For PVSV-Constructor weconstructed a polymorphic value set of size 2 for eachprivate attribute and each tuple in the testing bed ForPVST-Constructor we constructed a polymorphic valueset that covers at least two true values for each privateattribute and each tuple We randomly picked 1000 tuplesfrom the testing bed as our targets and performed 1000Rank Inference attacks (250 attacks for each of the fourmethods) on the five private attributes of target tuples Wemeasured the attack success guess rates based on the fre-quency of successful inference among all inference at-tempts Figure 1 shows the success guess rates of RankInference attacks on the unprotected testing bed the testingbed with PVSV and the testing bed with PVST As the sizeof each polymorphic value set is 2 the success guess rateson PVSV are around 50 which are significantly lowerthan those of unprotected dataset We also observe that thesuccess guess rates on PVSTare slightly lower than those of

PVSV )e reason is that PVSV-Constructor will beinserting values into tuple vrsquos polymorphic value sets untilall vrsquos polymorphic value sets cover at least 2 true values)us some polymorphic value sets of v may contain morethan 2 values

63 Utility In this subsection we quantify utility loss ofPVSV-Constructor and PVST-Constructor algorithms)e privacy guarantee of both PVSV-Constructor andPVST-Constructor is 12 ie the polymorphic value setsconstructed by PVSV-Constructor contains 2 values andthe polymorphic value sets constructed by PVST-Con-structor contains at least 2 true values )e key param-eters here are the size of query workload Q the size ofdatabase D the number of public and private attributesand the weight ratios in the ranking function We ran-domly generated 20 tuples as the query workload Bydefault we picked 10 tuples from Q and set |D| 300 000m 10 and mprime 10 )erefore we randomly picked 10attributes from the testing bed and set them as publicattributes )e rest of the 10 attributes were set as privateattributes

Many recommendation systems of ONS applicationsfeature top-k recommendation [1 14 47] where the rankedresult contains a set of k tuples that will be of interest to acertain user as it is impractical and unnecessary to return alltuples in the database to the user )erefore we introduceaverage top-k utility loss Uaul a variant of UQ that focuses onutility loss of the top-k tuples in a ranked result Uaul isdefined as

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv and forallBj

(1) k (1ε) minus 1(2) for v in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) end(6) end(7) for v in D do(8) while existj isin 1 mprime1113864 1113865 |P

Bj

v |lt k + 1 do(9) for vprime isin t isin D|forallAi t[Ai] v[Ai]1113864 1113865 do(10) compute covervprime lossvprime(11) scorevprime covervprime 11 + exp(minus lossvprime )

(12) end(13) vmax argmaxvprimeisin |tisinD|forallAit[Ai]v[Ai] scorevprime

(14) for j isin 1 mprime1113864 1113865 do

(15) PBj

v PBj

v cup tmax[Bj]1113966 1113967

(16) end(17) end(18) end

ALGORITHM 2 PVST-Constructor

Security and Communication Networks 9

Uaul 1

|Q|1113944

|Q|

i1

1Dprime

11138681113868111386811138681113868111386811138681113868

1113944

visinDprime

min Rank(v ∣ q) k + 11113864 1113865 minus min Rankprime(v ∣ q) k + 11113864 11138651113868111386811138681113868

1113868111386811138681113868

k (23)

where Dprime v isin D |Rank(v | q))lt k or Rank(v | q))lt k1113864 1113865Intuitively Uaul represents the average percentage rankdifference relative to k over all queries and all tuples in top-kUaul is equivalent to UQ|Q||D|2 when k |D| By default weset k 100

631 Evaluation of Uaul with Varying k We first discuss theaverage top-k utility loss of PVSV-Constructor PVST-Constructor and a baseline algorithm on varying k withother parameters set to default values For a tuple vrsquos at-tribute Bj the baseline algorithm constructs P

Bj

v with v[Bj]

and a value randomly picked from VBj v[Bj]1113966 1113967 )e results

are presented in Figure 2 which shows that the Uaul ofPVSV-Constructor is significantly lower than that of thebaseline algorithm )e Uaul of PVST-Constructor is alsolower than that of the baseline algorithm when kge 5 eventhough the baseline algorithm cannot preserve privacyagainst authenticity-knowledgeable adversaries )e exper-imental results show that both PVSV and PVST can reduceutility loss with respect to rank differences Also note thatPVST-Constructor constructs polymorphic value sets withtrue values and thus from Figure 3 we can see that somepolymorphic value sets constructed by PVST-Constructorcontains more than 2 values

632 Evaluation of Uaul with Varying Sizes of QFigure 4 presents the average top-k utility loss of PVSV-Constructor on varying |Q| When |Q| is set to 1 5 or 10 werandomly picked 1 5 or 10 queries from the original Qrespectively With increasing number of queries in the queryworkload the Uaul of PVSV-Constructor increases mono-tonically )e reason is that PVSV-Constructor alwaysgenerates the least frequent value (denoted as vprime[Bj]) in1113864q[Bj] | q isin Q1113865 for v[Bj] If |Q| is small then it is possiblethat vprime[Bj] notin 1113864q[Bj] | q isin Q1113865 and thus s(v ∣ q) sprime(v ∣ q)

forallq isin Q However a larger query workload covers moreprivate attributes values and thus it will be harder forPVSV-Constructor to generate a value for v[Bj] that has noimpact on the rank of v for any q isin Q Figure 5 presents theaverage top-k utility loss of PVST-Constructor on varying|Q| We can see that the size of Q has no significant impacton the Uaul of PVST-Constructor as the PVST-Constructoralways pick a tuple vprime that is different from v in most at-tributes and then insert vprime[Bj] into P

Bj

v

633 Evaluation of Uaul with Varying Sizes of the DatasetFigures 6 and 7 depict the impact of the size of datasets onUaul of PVSV-Constructor and PVST-Constructor Datasetsof 100000 and 200000 tuples were randomly sampled from

QPoint QIn QIPoint QIIn

Atta

ck su

cces

s gue

ss ra

te

Without protectionPVSVPVST

1

09

08

07

06

05

04

03

02

01

0

Figure 1 Attack success guess

10 Security and Communication Networks

the testing bed of 300000 tuples As expected |D| has nosignificant impact on the Uaul of PVSV-Constructor sincePVSV-Constructor generates values from the domain ofeach private attributes |D| has no impact on the Uaul ofPVST-Constructor which indicates that a dataset contain-ing 100000 tuples is sufficient for PVST-Constructor togenerate polymorphic value sets with true values

634 Evaluation of Uaul with Varying m We investigate theimpact of the number of private and public attributes on

average top-k utility loss Figure 8 presents theUaul of PVSV-Constructor with fixed mprime and varying m and with fixed mand varying mprime When mmprime is set to 5 we randomly re-moved 5 attributes from the testing bed When mmprime is set to

0 20 40 60 80 100k

RandomPVSVPVST

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 2 Utility loss vs k

1 2 3 4 5+Size of PVST

3

25

2

15

1

05

0

Coun

t 10

5

Figure 3 Size of polymorphic value set (PVST)

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

00 10 20

Figure 4 Utility loss vs |Q|

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

00 10 20

Figure 5 Utility loss vs |Q|

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 6 Utility loss vs |D|

Security and Communication Networks 11

15 we added 5 more categorical attributes randomly chosenfrom the unused attributes We observe that the Uaulmonotonically decreases with increasing number of publicattributes and monotonically increases with increasingnumber of private attributes As expected a higher pro-portion of public attributes leads to less variant betweens(v ∣ q) and sprime(v ∣ q) e results of the same experimentwith PVST-Constructor are shown in Figure 9 Uaul in-creases as increasing number of private attributes as ex-pected However Uaul also increases slightly with increasingnumber of public attributes is is due to the fact that withmore public attributes there will be few tuples that share thesame public attribute values Since PVST-Constructor in-serts only private attribute values from tuples sharing samepublic attribute values more values will be inserted to eachpolymorphic value set which introduces higher utility loss

635 Evaluation of Uaul with Varying Weight RatiosFigures 10 and 11 illustrate the impact of weight ratios on theaverage utility loss e experiment was conducted with a

xed private attribute weight of 1 and varying public at-tribute weights of 1 2 and 3 As expected Uaul of bothPVSV-Constructor and PVST-Constructor decreases as

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 10 Utility loss vs weight ratio

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 11 Utility loss vs weight ratio

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss1

09

08

07

06

05

04

03

02

01

0

Figure 7 Utility loss vs |D|

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 8 Utility loss vs m and mprime

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 9 Utility loss vs m and mprime

12 Security and Communication Networks

increasing weight ratio of public attributes)e reason is thatas the public attribute weight increases the part in sprime(v ∣ q)

caused by private attributes decreases )erefore less impactwould be made to sprime(v ∣ q) by PVSV and PVST As a resultsprime(v ∣ q) would be closer to s(v ∣ q) and the utility loss couldbe decreased

7 Conclusions

In this paper we proposed a novel framework that preservesprivacy of private attributes against arbitrary attacks throughthe ranked retrieval model Furthermore we identify twocategories of adversaries based on varying adversarial ca-pabilities For each kind of adversaries we presentedimplementation of our framework Our experimental resultssuggest that our implementations efficiently preserve privacyagainst Rank Inference attack [35] Moreover the imple-mentations significantly reduce utility loss with respect tothe variance in ranked results

It is our hope that this paper can motivate further re-search in privacy preservation of SIoT with consideration ofsocial network features andor variant information retrievalmodels eg text mining

Appendix

A 2-PVSV

In this subsection we prove that constructing an optimal PVSVfor each attribute of a tuple v is an NP-hard problem

Definition A1 For a tuple v in D we create PBj

v for each BjWe say v satisfies query q if the following hold for any tuple t(tne v) in D (1) If s(v ∣ q)lt s(t ∣ q) then sprime(v ∣ q)lt sprime(t ∣ q)and (2) if s(v ∣ q)ge s(t ∣ q) then sprime(v ∣ q)ge sprime(t ∣ q)

For a database containing 2 tuples the 2-PVSV problemcan be redefined as follows given a query workload Q adatabase D construct PB1

v PBmprime

v of size 2 such that v

satisfies the most queries in Q

Definition A2 Max-3Sat Problem given a 3-CNF formulaΦ find the truth assignment that satisfies that most clauses

Lemma A1 Max-3Sat leP 2-PVSV Problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVSV instance Without loss of generality we suppose thatΦ is a conjunction of l clauses and each clause is a dis-junction of 3 literals from set X x1 xn1113864 1113865 We constructdatabase D as follows D has 0 public attributes and n + 1private attributes B1 Bn C1 Let VB

j be the attribute do-main of Bj j isin 1 n and VC

1 be the attribute domain ofC1 Let VB

i minus 1 0 1 2 n + 2 and VC1 0 1 Two

tuples v1 and v2 are inserted into D v1[Bj] 0 and v2[Bj]

j + 2 for j isin 1 n while v1[C1] 0 and v2[C1] 1 Wesimplify the score function defined in (1) by setting all weightsto 1 Also note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null

We construct query workload Q based on Φ For eachclause Lk isin Φ we construct a query qk such that qk[Bi] i iffthe corresponding literal xi is a positive literal in Lk andq[Bi] i + 1 iff xi is a negative literal in the clause Forexample given a clause (x1orx2orx3) the correspondingquery q should satisfy q[B1] 1 q[B2] 3 q[B3] 3 andq[C1] 0 All other attributes in q are set to null by default)erefore s(v1 ∣ q) 1 and s(v2 ∣ q) 0 forallq isin Q

Since s(v2 ∣ q)lt s(v1 ∣ q) forallq isin Q in order to minimizeutility loss the value of sprime(v2 ∣ q) should be as small aspossible We observe that the minimum possible sprime(v2 ∣ q) is1 because PC1

v2must be 0 1 as VC

1 0 1 Without loss ofgenerality we assume that we have already constructedPB1

v2 P

Bmprime

v2 PC1v2

for v2 such that sprime(v2 ∣ q) 1 forallq isin QNow we have an instance of 2-PVSV problem that given

D Q constructs PB1v1

PBmprime

v1 PC1

v1that satisfies the most

queries in Q Since VC1 0 1 PC1

v1must be 0 1 too as

PC1v1

contains two distinct values )erefore for an ar-bitrary query q isin Q we have ρprime(q[C] v1[C]) 1 andρprime(q[C] v2[C]) 1 In order to let v1 satisfy q we have toensure that sprime(v1 ∣ q)gt sprime(v2 ∣ q) 1 )us we have

sprime v1 ∣ q( 1113857gt 1

⟺1113944mprime

i1ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857gt 1

⟺existi isin 1 mprime1113864 1113865 ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857 1

⟺existi isin 1 mprime1113864 1113865 q Bi1113858 1113859 isin PBi

v1

(A1)

Now we show that the solution of 2-PVSV problemconstructed above can answer the corresponding Max-3Sat Problem Suppose that we have the solution PB1

v1

PB

mprimev1 such that v1 satisfies the most queries in Q As in

(A1) if v1 satisfies qk we have qk[Bi] isin PBiv1 If v1 does not

satisfy qk then qk[Bi] notin PBiv1 Recall that we assign value i or

i + 1 to qk[Bi] if xi is a positive or negative literal in clauseLk respectively )erefore the assignment of xi given PBi

v1is

xi true if i isin PBi

v1

xi false if i + 1 isin PBi

v1

(A2)

Furthermore from equation (A1) we have

sprime v1 ∣ q( 1113857gt 1⟺L true (A3)

where L is qrsquos corresponding clause in Φ Note thati i + 1 nsubQ as |PBi

v1| 2 and 0 isin PBi

v1 )us the value of xi is

either true or falseAs we proved above v1 satisfies qi if and only if Li is true

given assignment x1 xl1113864 1113865 constructed according toequation (A2) )erefore x1 xl1113864 1113865 satisfies the mostclauses in ϕ if and only if v1 satisfies the most queries in Q

We now prove that function f can be conducted inpolynomial time Given a formula Φ with n variables and lclauses we construct a 2-PVSV instance with 2 tuples each

Security and Communication Networks 13

of which has n + 1 attributes and l queries each of which has4 attributes )erefore O(2n + 4l) assignments are neededand f can be conducted in polynomial time

Proof of 4eorem 1 We now prove that the 2-PVSVproblem is NP-hard In Lemma 3 we proved that Max-3SatProblem can be reduced to the 2-PVSV problem in poly-nomial time Furthermore as Max-3Sat Problem is a NP-hard problem [33] the 2-PVSV problem is NP-hard

B 2-PVST

In this section we prove that constructing an optimal PVSTfor each private attribute of a tuple v isin D is NP-hard We usethe definition of v satisfying q from (9))e 2-PVSTproblemcan be redefined as follows

Definition B1 2-PVSTproblem given a query workload Qa database D the optimization problem of 2-PVST is toconstruct an arbitrary tuple vrsquos polymorphic setsPB1

v PBmprime

v that satisfies the most queries in Q )e size ofeach polymorphic vale set is 2

Lemma B1 Max-3Sat le P 2-PVST problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVST instance We assume that Φ is a conjunction of lclauses and each clause is a disjunction of 3 literals from setX x1 xn1113864 1113865 Let D have no public attribute and n + 1private attributes B1 Bn C1 For each literal xi we inserttwo tuples vi and vi

prime into D where

vi C11113858 1113859 1 vi Bi1113858 1113859 1 vi Bj1113960 1113961 minus 1foralljne i

viprime C11113858 1113859 1 vi

prime Bi1113858 1113859 minus 1 viprime Bj1113960 1113961 1foralljne i

(B1)

)en we insert tuple v into D where v[C1] 0 andv[Bj] 0 forallj isin 1 mprime We also set the domain of eachBj as minus 1 0 1 and the domain of C1 as 0 1

Without loss of generality the score function defined in(1) is simplified by setting all weights to 1

Next we construct query workload Q For each clauseLk isin Φ we construct a query qk in which qk[Bj] minus 1 iff xj isa positive literal in Lk and qk[Bj] 1 iff xj is a negativeliteral in Lk We also set qk[C1] 1 )e rest of the attributevalues are set to null by default )erefore ifLk (x1orne x2orx3) then we have qk[C1] 1 qk[B1] 1qk[B2] 1 qk[B3] minus 1 and qk[Bj] null forj isin 4 mprime

Note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null )uss(v qk) 0 forallq isin Q and s(vi qk) s(vi

prime qk)ge 1 forallq isin Q Weobserve that for q isin Q and i isin 1 n s(v ∣ q)lt s(vi ∣ q)

and s(v ∣ q)lt s(viprime ∣ q) In order to reduce UQ we have to

maximize the number of q isin Q such that sprime(v ∣ q)lt sprime(vi ∣ q)

and sprime(v ∣ q)lt sprime(viprime ∣ q) )e maximum value of sprime(vi ∣ q)

and sprime(viprime ∣ q) is 4 which can be achieved by inserting vi[Bj]

into PBj

viprime and inserting vi

prime[Bj] into PBj

vi Since PC1

v 0 1 thevalue of sprime(v ∣ q) is at least 1 )erefore we have

sprime(v ∣ q)lt sprime vi ∣ q( 1113857

⟺sprime(v ∣ q)lt 4

⟺1113944

mprime

j1ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873lt 3

⟺existj isin 1 mprime ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 0

⟺existj isin 1 mprime v Bj1113960 1113961ne q Bj1113960 1113961

⟺existj isin 1 mprime minus q Bj1113960 1113961 isin PBj

v

(B2)

Since |PBj

v | is limited to 2 PBj

v 0 1 or 0 minus 1 ie wemust insert either virsquos or vi

primersquos private attribute values intoPB1

v PBmv Recall that we assign minus 1 or 1 to qk[Bj] if xj

is positive or negative respectively in Lk In order toreduce Max-3Sat to 2-PSVT we assign literals in the fol-lowing way

xj true if minus 1 isin PBj

v

xj false if 1 isin PBj

v (B3)

)erefore we denote the corresponding clause of q as Land from equation (B2) we have

existj isin 1 mprime1113864 1113865 minus q Bj1113960 1113961 isin PBj

v

⟺existj isin 1 mprime1113864 1113865 xj true if xj is positive in L

orxj false if xj is negative in L

⟺L true(B4)

From the above equation we observe that v satisfying qk

is equivalent to Lk being true)erefore we can reduceMax-3Sat to 2-PVST Given a solution to the 2-PVSTproblem thatmaximizes the number of queries satisfied by v assignmentx1 xn1113864 1113865 produced by equation (B3) can satisfy thelargest number of clauses in Φ

Given a Max-3Sat instance Φ the construction of the 2-PVST instance can be conducted in polynomial time as weconstruct n queries with 3 attributes and 2n + 1 tuples with nattributes)e transformation from the optimal solution of 2-PVST to the optimal solution of Max-3Sat also takes poly-nomial time as we assign the value to each one of x1 xn

once )erefore f is a polynomial time function

Proof of 4eorem 2 In 4 we proved that a Max-3Sat in-stance can be reduced to a 2-PSVT instance in polynomialtime Furthermore as Max-3Sat Problem is an NP-hardproblem the 2-PVSV problem is an NP-hard problem

Data Availability

)e data used to support the findings of this study areavailable from the corresponding author upon request

14 Security and Communication Networks

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 8: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

existt isinMAv t Bj1113960 1113961 βr and t isin D (20)

)en we say that PBj

v covers l true values We denoteT

Bj

v β1 βl1113864 1113865 as the true value set of t in BjPrivacy guarantee for a database D where every tuplersquos

every private attribute is included by one equivalent value setwhich covers at least l true values a privacy level of ε 1l isachieved

Assume that the adversaryrsquos objective is to infer the valueof v[Bj] As mentioned in the adversary model we make noassumption on the attacking methods adopted by an ad-versary Consider MA

v defined in (18) forallt tprime isinMAv and

forallq isin QA s(t ∣ q) s(tprime ∣ q) )erefore the adversary cannotdistinguish tuples in MA

v by observing RA In this situationthe adversary would use PK(t ∣ D) to exclude all tuple t inMA

v such that PK(t ∣ D) 0 However as PBj

v covers l truevalues there exists l tuples t1 tl isinMA

v such thatPK(t ∣ D) 1 and tr[Bj]ne ts[Bj] for arbitraryr s isin 1 l As a result values in t1[Bj] tl[Bj]1113966 1113967 areindistinguishable given RA and PK(t ∣ D) )us Pv[Bj] 1lAccording to (3) a privacy level of 1l is achieved

53 Utility Optimization An intuitive method of con-structing P

Bj

v is to insert the value of Bj of all tuples that sharethe same public attribute values with v into P

Bj

v ie PBj

v

PBj

v 1113864vprime[Bj]|foralli and vprime isin D vprime[Ai] v[Ai]1113865 )e privacyguarantee is met if P

Bj

v covers at least 1ε true valuesNevertheless utility loss cannot be ignored as in the intuitivemethod s(vprime| q) would be the same for all vprime isin 1113864vprime| foralliand vprime isin D vprime[Ai] v[Ai]1113865 and thus information of privateattributes are missing in the ranked result of vprime )ereforethe size of P

Bj

v is critical in balancing privacy and utilityWith no loss of generality we limit the size of each poly-morphic value set to 2 We show that constructing suchpolymorphic value sets is an NP-hard problem

Definition 5 We define the 2-PVST problem as followsgiven database D and a query workload Q construct P

Bj

v ofsize 2 for each tuple v isin D forallj isin 1 mprime1113864 1113865 and minimizeUQ defined in (10)

Theorem 2 4e 2-PVST problem is NP-hard

)e proof of )eorem 2 can be found in Appendix B

54 Heuristic Algorithm We have shown that the 2-PVSTproblem is NP-hard In this subsection we present PVST-Constructor a heuristic algorithm that constructs PVSTwithin polynomial time PVST-Constructor tries to mini-mize |P

Bj

v | and 1113936qisinQ|sprime(v ∣ q) minus s(v ∣ q)| for each v isin D

and j isin 1 mprime with the greedy algorithm)e pseudo-code of PVST-Constructor is shown in

Algorithm 2 In lines 2 to 6 we initialize each PBj

v withv[Bj] )en for each v isin D PVST-Constructor constructsPB1

v PBmprime

v by finding a vprime with the greedy algorithm andinserting vprime[Bj] into P

Bj

v forallj isin 1 mprime)e above process

will be taken multiple times until |PBj

v |ge 1ε Since every vprime isa real tuple existing in D and we insert at least k tuples in theabove processes P

Bj

v covers at least k + 1 true values andthus the privacy guarantee is met As mentioned in UtilityOptimization the size of P

Bj

v is critical in minimizing utilityloss )erefore the heuristic algorithm tries to minimize theutility loss by minimizing the size of each P

Bj

v We count thenumber of polymorphic value sets of v if the set contains lessthan k values that can be enlarged by inserting vprimersquos privateattribute values We denote this count as covervprime

covervprime j ∣ PBj

v

11138681113868111386811138681113868

11138681113868111386811138681113868lt k vprime Bj1113960 1113961 notin PBj

v1113882 1113883

1113868111386811138681113868111386811138681113868

1113868111386811138681113868111386811138681113868 (21)

Tuple vprime with a higher covervprime value can enlarge the sizeof more polymorphic value sets of v and thus we can reducethe number of tuples that we have to insert into PB1

v PBmprime

v Furthermore we take into consideration queries in Q In

order to minimize UQ we have to minimize |sprime(v ∣ q) minus

s(v ∣ q)| for q isin Q Note that for each attribute Bj|ρprime(q[Bj] v[Bj]) minus ρ(q[Bj] v[Bj])| 1 if v[Bj]ne q[Bj] andq[Bj] isin P

Bj

v We denote the value of 1113936j|ρprime(q[Bj] v[Bj]) minus

ρ(q[Bj] v[Bj])| as lossvprime )us we have

lossvprime 1113944qisinQ

1113944

j1mprime

ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 minus ρ q Bj1113960 1113961 v Bj1113960 11139611113872 111387311138681113868111386811138681113868

11138681113868111386811138681113868

1113944qisinQ

1113944

j1mprime

δ v vprime q j( 1113857

(22)where δ(v vprime q j) 1 if v[Bj]ne q[Bj] and vprime[Bj] q[Bj]

and δ(v vprime q j) 0 otherwise )erefore tuple vprime with asmaller lossvprime can reduce the value of 1113936j|ρprime(q[Bj]

v[Bj]) minus ρ(q[Bj] v[Bj])| and thus we can reduce the valueof UQ

)e computation of covervprime and lossvprime is done in line 11)en we compute scorevprime the score of vprime that indicates howpreferable vprime is relative to other tuples in D v Weadopt the greedy algorithm to find the next vprime for PB1

v

PB

mprimev ie in each iteration we choose the tuple

that has the highest score value In line 15 we insert theprivate attribute values of the chosen tuple (denoted as vmax)into PB1

v PB

mprimev )e above process is repeated until

the sizes of PB1v P

Bmprime

v are no less than k

6 Experimental Results

61 Experimental Setup To validate PVSV-Constructor andPVST-Constructor algorithms we conducted experiments on areal world dataset [29] from eHarmony which contains 58attributes and 486464 tuples We removed 5 noncategoricalattributes and randomly picked 20 categorical attributes fromthe remaining 53 attributes )e domain sizes of the 20 at-tributes range from 2 to 15 After removing duplicate tuples werandomly picked 300000 tuples as our testing bed

By default we use the ranking function from the rankedretrieval model with all weights set to 1 All experimentalresults were obtained on a Mac machine running Mac OS

8 Security and Communication Networks

with 8GB of RAM )e algorithms were implemented inPython

62 Privacy )e privacy guarantee of PVSV-Constructorand PVST-Constructor were tested by performing RankedInference attack [35] including Point-Query In-QueryPoint-QueryampInsert and In-QueryampInsert attackingmethods on the dataset From a total of 20 attributes 5attributes were randomly chosen as public attributes andanother 5 attributes were randomly chosen as private at-tributes We randomly picked 20000 distinct tuples fromthe dataset as the testing bed and randomly generated 10tuples as the query workload For PVSV-Constructor weconstructed a polymorphic value set of size 2 for eachprivate attribute and each tuple in the testing bed ForPVST-Constructor we constructed a polymorphic valueset that covers at least two true values for each privateattribute and each tuple We randomly picked 1000 tuplesfrom the testing bed as our targets and performed 1000Rank Inference attacks (250 attacks for each of the fourmethods) on the five private attributes of target tuples Wemeasured the attack success guess rates based on the fre-quency of successful inference among all inference at-tempts Figure 1 shows the success guess rates of RankInference attacks on the unprotected testing bed the testingbed with PVSV and the testing bed with PVST As the sizeof each polymorphic value set is 2 the success guess rateson PVSV are around 50 which are significantly lowerthan those of unprotected dataset We also observe that thesuccess guess rates on PVSTare slightly lower than those of

PVSV )e reason is that PVSV-Constructor will beinserting values into tuple vrsquos polymorphic value sets untilall vrsquos polymorphic value sets cover at least 2 true values)us some polymorphic value sets of v may contain morethan 2 values

63 Utility In this subsection we quantify utility loss ofPVSV-Constructor and PVST-Constructor algorithms)e privacy guarantee of both PVSV-Constructor andPVST-Constructor is 12 ie the polymorphic value setsconstructed by PVSV-Constructor contains 2 values andthe polymorphic value sets constructed by PVST-Con-structor contains at least 2 true values )e key param-eters here are the size of query workload Q the size ofdatabase D the number of public and private attributesand the weight ratios in the ranking function We ran-domly generated 20 tuples as the query workload Bydefault we picked 10 tuples from Q and set |D| 300 000m 10 and mprime 10 )erefore we randomly picked 10attributes from the testing bed and set them as publicattributes )e rest of the 10 attributes were set as privateattributes

Many recommendation systems of ONS applicationsfeature top-k recommendation [1 14 47] where the rankedresult contains a set of k tuples that will be of interest to acertain user as it is impractical and unnecessary to return alltuples in the database to the user )erefore we introduceaverage top-k utility loss Uaul a variant of UQ that focuses onutility loss of the top-k tuples in a ranked result Uaul isdefined as

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv and forallBj

(1) k (1ε) minus 1(2) for v in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) end(6) end(7) for v in D do(8) while existj isin 1 mprime1113864 1113865 |P

Bj

v |lt k + 1 do(9) for vprime isin t isin D|forallAi t[Ai] v[Ai]1113864 1113865 do(10) compute covervprime lossvprime(11) scorevprime covervprime 11 + exp(minus lossvprime )

(12) end(13) vmax argmaxvprimeisin |tisinD|forallAit[Ai]v[Ai] scorevprime

(14) for j isin 1 mprime1113864 1113865 do

(15) PBj

v PBj

v cup tmax[Bj]1113966 1113967

(16) end(17) end(18) end

ALGORITHM 2 PVST-Constructor

Security and Communication Networks 9

Uaul 1

|Q|1113944

|Q|

i1

1Dprime

11138681113868111386811138681113868111386811138681113868

1113944

visinDprime

min Rank(v ∣ q) k + 11113864 1113865 minus min Rankprime(v ∣ q) k + 11113864 11138651113868111386811138681113868

1113868111386811138681113868

k (23)

where Dprime v isin D |Rank(v | q))lt k or Rank(v | q))lt k1113864 1113865Intuitively Uaul represents the average percentage rankdifference relative to k over all queries and all tuples in top-kUaul is equivalent to UQ|Q||D|2 when k |D| By default weset k 100

631 Evaluation of Uaul with Varying k We first discuss theaverage top-k utility loss of PVSV-Constructor PVST-Constructor and a baseline algorithm on varying k withother parameters set to default values For a tuple vrsquos at-tribute Bj the baseline algorithm constructs P

Bj

v with v[Bj]

and a value randomly picked from VBj v[Bj]1113966 1113967 )e results

are presented in Figure 2 which shows that the Uaul ofPVSV-Constructor is significantly lower than that of thebaseline algorithm )e Uaul of PVST-Constructor is alsolower than that of the baseline algorithm when kge 5 eventhough the baseline algorithm cannot preserve privacyagainst authenticity-knowledgeable adversaries )e exper-imental results show that both PVSV and PVST can reduceutility loss with respect to rank differences Also note thatPVST-Constructor constructs polymorphic value sets withtrue values and thus from Figure 3 we can see that somepolymorphic value sets constructed by PVST-Constructorcontains more than 2 values

632 Evaluation of Uaul with Varying Sizes of QFigure 4 presents the average top-k utility loss of PVSV-Constructor on varying |Q| When |Q| is set to 1 5 or 10 werandomly picked 1 5 or 10 queries from the original Qrespectively With increasing number of queries in the queryworkload the Uaul of PVSV-Constructor increases mono-tonically )e reason is that PVSV-Constructor alwaysgenerates the least frequent value (denoted as vprime[Bj]) in1113864q[Bj] | q isin Q1113865 for v[Bj] If |Q| is small then it is possiblethat vprime[Bj] notin 1113864q[Bj] | q isin Q1113865 and thus s(v ∣ q) sprime(v ∣ q)

forallq isin Q However a larger query workload covers moreprivate attributes values and thus it will be harder forPVSV-Constructor to generate a value for v[Bj] that has noimpact on the rank of v for any q isin Q Figure 5 presents theaverage top-k utility loss of PVST-Constructor on varying|Q| We can see that the size of Q has no significant impacton the Uaul of PVST-Constructor as the PVST-Constructoralways pick a tuple vprime that is different from v in most at-tributes and then insert vprime[Bj] into P

Bj

v

633 Evaluation of Uaul with Varying Sizes of the DatasetFigures 6 and 7 depict the impact of the size of datasets onUaul of PVSV-Constructor and PVST-Constructor Datasetsof 100000 and 200000 tuples were randomly sampled from

QPoint QIn QIPoint QIIn

Atta

ck su

cces

s gue

ss ra

te

Without protectionPVSVPVST

1

09

08

07

06

05

04

03

02

01

0

Figure 1 Attack success guess

10 Security and Communication Networks

the testing bed of 300000 tuples As expected |D| has nosignificant impact on the Uaul of PVSV-Constructor sincePVSV-Constructor generates values from the domain ofeach private attributes |D| has no impact on the Uaul ofPVST-Constructor which indicates that a dataset contain-ing 100000 tuples is sufficient for PVST-Constructor togenerate polymorphic value sets with true values

634 Evaluation of Uaul with Varying m We investigate theimpact of the number of private and public attributes on

average top-k utility loss Figure 8 presents theUaul of PVSV-Constructor with fixed mprime and varying m and with fixed mand varying mprime When mmprime is set to 5 we randomly re-moved 5 attributes from the testing bed When mmprime is set to

0 20 40 60 80 100k

RandomPVSVPVST

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 2 Utility loss vs k

1 2 3 4 5+Size of PVST

3

25

2

15

1

05

0

Coun

t 10

5

Figure 3 Size of polymorphic value set (PVST)

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

00 10 20

Figure 4 Utility loss vs |Q|

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

00 10 20

Figure 5 Utility loss vs |Q|

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 6 Utility loss vs |D|

Security and Communication Networks 11

15 we added 5 more categorical attributes randomly chosenfrom the unused attributes We observe that the Uaulmonotonically decreases with increasing number of publicattributes and monotonically increases with increasingnumber of private attributes As expected a higher pro-portion of public attributes leads to less variant betweens(v ∣ q) and sprime(v ∣ q) e results of the same experimentwith PVST-Constructor are shown in Figure 9 Uaul in-creases as increasing number of private attributes as ex-pected However Uaul also increases slightly with increasingnumber of public attributes is is due to the fact that withmore public attributes there will be few tuples that share thesame public attribute values Since PVST-Constructor in-serts only private attribute values from tuples sharing samepublic attribute values more values will be inserted to eachpolymorphic value set which introduces higher utility loss

635 Evaluation of Uaul with Varying Weight RatiosFigures 10 and 11 illustrate the impact of weight ratios on theaverage utility loss e experiment was conducted with a

xed private attribute weight of 1 and varying public at-tribute weights of 1 2 and 3 As expected Uaul of bothPVSV-Constructor and PVST-Constructor decreases as

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 10 Utility loss vs weight ratio

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 11 Utility loss vs weight ratio

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss1

09

08

07

06

05

04

03

02

01

0

Figure 7 Utility loss vs |D|

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 8 Utility loss vs m and mprime

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 9 Utility loss vs m and mprime

12 Security and Communication Networks

increasing weight ratio of public attributes)e reason is thatas the public attribute weight increases the part in sprime(v ∣ q)

caused by private attributes decreases )erefore less impactwould be made to sprime(v ∣ q) by PVSV and PVST As a resultsprime(v ∣ q) would be closer to s(v ∣ q) and the utility loss couldbe decreased

7 Conclusions

In this paper we proposed a novel framework that preservesprivacy of private attributes against arbitrary attacks throughthe ranked retrieval model Furthermore we identify twocategories of adversaries based on varying adversarial ca-pabilities For each kind of adversaries we presentedimplementation of our framework Our experimental resultssuggest that our implementations efficiently preserve privacyagainst Rank Inference attack [35] Moreover the imple-mentations significantly reduce utility loss with respect tothe variance in ranked results

It is our hope that this paper can motivate further re-search in privacy preservation of SIoT with consideration ofsocial network features andor variant information retrievalmodels eg text mining

Appendix

A 2-PVSV

In this subsection we prove that constructing an optimal PVSVfor each attribute of a tuple v is an NP-hard problem

Definition A1 For a tuple v in D we create PBj

v for each BjWe say v satisfies query q if the following hold for any tuple t(tne v) in D (1) If s(v ∣ q)lt s(t ∣ q) then sprime(v ∣ q)lt sprime(t ∣ q)and (2) if s(v ∣ q)ge s(t ∣ q) then sprime(v ∣ q)ge sprime(t ∣ q)

For a database containing 2 tuples the 2-PVSV problemcan be redefined as follows given a query workload Q adatabase D construct PB1

v PBmprime

v of size 2 such that v

satisfies the most queries in Q

Definition A2 Max-3Sat Problem given a 3-CNF formulaΦ find the truth assignment that satisfies that most clauses

Lemma A1 Max-3Sat leP 2-PVSV Problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVSV instance Without loss of generality we suppose thatΦ is a conjunction of l clauses and each clause is a dis-junction of 3 literals from set X x1 xn1113864 1113865 We constructdatabase D as follows D has 0 public attributes and n + 1private attributes B1 Bn C1 Let VB

j be the attribute do-main of Bj j isin 1 n and VC

1 be the attribute domain ofC1 Let VB

i minus 1 0 1 2 n + 2 and VC1 0 1 Two

tuples v1 and v2 are inserted into D v1[Bj] 0 and v2[Bj]

j + 2 for j isin 1 n while v1[C1] 0 and v2[C1] 1 Wesimplify the score function defined in (1) by setting all weightsto 1 Also note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null

We construct query workload Q based on Φ For eachclause Lk isin Φ we construct a query qk such that qk[Bi] i iffthe corresponding literal xi is a positive literal in Lk andq[Bi] i + 1 iff xi is a negative literal in the clause Forexample given a clause (x1orx2orx3) the correspondingquery q should satisfy q[B1] 1 q[B2] 3 q[B3] 3 andq[C1] 0 All other attributes in q are set to null by default)erefore s(v1 ∣ q) 1 and s(v2 ∣ q) 0 forallq isin Q

Since s(v2 ∣ q)lt s(v1 ∣ q) forallq isin Q in order to minimizeutility loss the value of sprime(v2 ∣ q) should be as small aspossible We observe that the minimum possible sprime(v2 ∣ q) is1 because PC1

v2must be 0 1 as VC

1 0 1 Without loss ofgenerality we assume that we have already constructedPB1

v2 P

Bmprime

v2 PC1v2

for v2 such that sprime(v2 ∣ q) 1 forallq isin QNow we have an instance of 2-PVSV problem that given

D Q constructs PB1v1

PBmprime

v1 PC1

v1that satisfies the most

queries in Q Since VC1 0 1 PC1

v1must be 0 1 too as

PC1v1

contains two distinct values )erefore for an ar-bitrary query q isin Q we have ρprime(q[C] v1[C]) 1 andρprime(q[C] v2[C]) 1 In order to let v1 satisfy q we have toensure that sprime(v1 ∣ q)gt sprime(v2 ∣ q) 1 )us we have

sprime v1 ∣ q( 1113857gt 1

⟺1113944mprime

i1ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857gt 1

⟺existi isin 1 mprime1113864 1113865 ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857 1

⟺existi isin 1 mprime1113864 1113865 q Bi1113858 1113859 isin PBi

v1

(A1)

Now we show that the solution of 2-PVSV problemconstructed above can answer the corresponding Max-3Sat Problem Suppose that we have the solution PB1

v1

PB

mprimev1 such that v1 satisfies the most queries in Q As in

(A1) if v1 satisfies qk we have qk[Bi] isin PBiv1 If v1 does not

satisfy qk then qk[Bi] notin PBiv1 Recall that we assign value i or

i + 1 to qk[Bi] if xi is a positive or negative literal in clauseLk respectively )erefore the assignment of xi given PBi

v1is

xi true if i isin PBi

v1

xi false if i + 1 isin PBi

v1

(A2)

Furthermore from equation (A1) we have

sprime v1 ∣ q( 1113857gt 1⟺L true (A3)

where L is qrsquos corresponding clause in Φ Note thati i + 1 nsubQ as |PBi

v1| 2 and 0 isin PBi

v1 )us the value of xi is

either true or falseAs we proved above v1 satisfies qi if and only if Li is true

given assignment x1 xl1113864 1113865 constructed according toequation (A2) )erefore x1 xl1113864 1113865 satisfies the mostclauses in ϕ if and only if v1 satisfies the most queries in Q

We now prove that function f can be conducted inpolynomial time Given a formula Φ with n variables and lclauses we construct a 2-PVSV instance with 2 tuples each

Security and Communication Networks 13

of which has n + 1 attributes and l queries each of which has4 attributes )erefore O(2n + 4l) assignments are neededand f can be conducted in polynomial time

Proof of 4eorem 1 We now prove that the 2-PVSVproblem is NP-hard In Lemma 3 we proved that Max-3SatProblem can be reduced to the 2-PVSV problem in poly-nomial time Furthermore as Max-3Sat Problem is a NP-hard problem [33] the 2-PVSV problem is NP-hard

B 2-PVST

In this section we prove that constructing an optimal PVSTfor each private attribute of a tuple v isin D is NP-hard We usethe definition of v satisfying q from (9))e 2-PVSTproblemcan be redefined as follows

Definition B1 2-PVSTproblem given a query workload Qa database D the optimization problem of 2-PVST is toconstruct an arbitrary tuple vrsquos polymorphic setsPB1

v PBmprime

v that satisfies the most queries in Q )e size ofeach polymorphic vale set is 2

Lemma B1 Max-3Sat le P 2-PVST problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVST instance We assume that Φ is a conjunction of lclauses and each clause is a disjunction of 3 literals from setX x1 xn1113864 1113865 Let D have no public attribute and n + 1private attributes B1 Bn C1 For each literal xi we inserttwo tuples vi and vi

prime into D where

vi C11113858 1113859 1 vi Bi1113858 1113859 1 vi Bj1113960 1113961 minus 1foralljne i

viprime C11113858 1113859 1 vi

prime Bi1113858 1113859 minus 1 viprime Bj1113960 1113961 1foralljne i

(B1)

)en we insert tuple v into D where v[C1] 0 andv[Bj] 0 forallj isin 1 mprime We also set the domain of eachBj as minus 1 0 1 and the domain of C1 as 0 1

Without loss of generality the score function defined in(1) is simplified by setting all weights to 1

Next we construct query workload Q For each clauseLk isin Φ we construct a query qk in which qk[Bj] minus 1 iff xj isa positive literal in Lk and qk[Bj] 1 iff xj is a negativeliteral in Lk We also set qk[C1] 1 )e rest of the attributevalues are set to null by default )erefore ifLk (x1orne x2orx3) then we have qk[C1] 1 qk[B1] 1qk[B2] 1 qk[B3] minus 1 and qk[Bj] null forj isin 4 mprime

Note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null )uss(v qk) 0 forallq isin Q and s(vi qk) s(vi

prime qk)ge 1 forallq isin Q Weobserve that for q isin Q and i isin 1 n s(v ∣ q)lt s(vi ∣ q)

and s(v ∣ q)lt s(viprime ∣ q) In order to reduce UQ we have to

maximize the number of q isin Q such that sprime(v ∣ q)lt sprime(vi ∣ q)

and sprime(v ∣ q)lt sprime(viprime ∣ q) )e maximum value of sprime(vi ∣ q)

and sprime(viprime ∣ q) is 4 which can be achieved by inserting vi[Bj]

into PBj

viprime and inserting vi

prime[Bj] into PBj

vi Since PC1

v 0 1 thevalue of sprime(v ∣ q) is at least 1 )erefore we have

sprime(v ∣ q)lt sprime vi ∣ q( 1113857

⟺sprime(v ∣ q)lt 4

⟺1113944

mprime

j1ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873lt 3

⟺existj isin 1 mprime ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 0

⟺existj isin 1 mprime v Bj1113960 1113961ne q Bj1113960 1113961

⟺existj isin 1 mprime minus q Bj1113960 1113961 isin PBj

v

(B2)

Since |PBj

v | is limited to 2 PBj

v 0 1 or 0 minus 1 ie wemust insert either virsquos or vi

primersquos private attribute values intoPB1

v PBmv Recall that we assign minus 1 or 1 to qk[Bj] if xj

is positive or negative respectively in Lk In order toreduce Max-3Sat to 2-PSVT we assign literals in the fol-lowing way

xj true if minus 1 isin PBj

v

xj false if 1 isin PBj

v (B3)

)erefore we denote the corresponding clause of q as Land from equation (B2) we have

existj isin 1 mprime1113864 1113865 minus q Bj1113960 1113961 isin PBj

v

⟺existj isin 1 mprime1113864 1113865 xj true if xj is positive in L

orxj false if xj is negative in L

⟺L true(B4)

From the above equation we observe that v satisfying qk

is equivalent to Lk being true)erefore we can reduceMax-3Sat to 2-PVST Given a solution to the 2-PVSTproblem thatmaximizes the number of queries satisfied by v assignmentx1 xn1113864 1113865 produced by equation (B3) can satisfy thelargest number of clauses in Φ

Given a Max-3Sat instance Φ the construction of the 2-PVST instance can be conducted in polynomial time as weconstruct n queries with 3 attributes and 2n + 1 tuples with nattributes)e transformation from the optimal solution of 2-PVST to the optimal solution of Max-3Sat also takes poly-nomial time as we assign the value to each one of x1 xn

once )erefore f is a polynomial time function

Proof of 4eorem 2 In 4 we proved that a Max-3Sat in-stance can be reduced to a 2-PSVT instance in polynomialtime Furthermore as Max-3Sat Problem is an NP-hardproblem the 2-PVSV problem is an NP-hard problem

Data Availability

)e data used to support the findings of this study areavailable from the corresponding author upon request

14 Security and Communication Networks

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 9: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

with 8GB of RAM )e algorithms were implemented inPython

62 Privacy )e privacy guarantee of PVSV-Constructorand PVST-Constructor were tested by performing RankedInference attack [35] including Point-Query In-QueryPoint-QueryampInsert and In-QueryampInsert attackingmethods on the dataset From a total of 20 attributes 5attributes were randomly chosen as public attributes andanother 5 attributes were randomly chosen as private at-tributes We randomly picked 20000 distinct tuples fromthe dataset as the testing bed and randomly generated 10tuples as the query workload For PVSV-Constructor weconstructed a polymorphic value set of size 2 for eachprivate attribute and each tuple in the testing bed ForPVST-Constructor we constructed a polymorphic valueset that covers at least two true values for each privateattribute and each tuple We randomly picked 1000 tuplesfrom the testing bed as our targets and performed 1000Rank Inference attacks (250 attacks for each of the fourmethods) on the five private attributes of target tuples Wemeasured the attack success guess rates based on the fre-quency of successful inference among all inference at-tempts Figure 1 shows the success guess rates of RankInference attacks on the unprotected testing bed the testingbed with PVSV and the testing bed with PVST As the sizeof each polymorphic value set is 2 the success guess rateson PVSV are around 50 which are significantly lowerthan those of unprotected dataset We also observe that thesuccess guess rates on PVSTare slightly lower than those of

PVSV )e reason is that PVSV-Constructor will beinserting values into tuple vrsquos polymorphic value sets untilall vrsquos polymorphic value sets cover at least 2 true values)us some polymorphic value sets of v may contain morethan 2 values

63 Utility In this subsection we quantify utility loss ofPVSV-Constructor and PVST-Constructor algorithms)e privacy guarantee of both PVSV-Constructor andPVST-Constructor is 12 ie the polymorphic value setsconstructed by PVSV-Constructor contains 2 values andthe polymorphic value sets constructed by PVST-Con-structor contains at least 2 true values )e key param-eters here are the size of query workload Q the size ofdatabase D the number of public and private attributesand the weight ratios in the ranking function We ran-domly generated 20 tuples as the query workload Bydefault we picked 10 tuples from Q and set |D| 300 000m 10 and mprime 10 )erefore we randomly picked 10attributes from the testing bed and set them as publicattributes )e rest of the 10 attributes were set as privateattributes

Many recommendation systems of ONS applicationsfeature top-k recommendation [1 14 47] where the rankedresult contains a set of k tuples that will be of interest to acertain user as it is impractical and unnecessary to return alltuples in the database to the user )erefore we introduceaverage top-k utility loss Uaul a variant of UQ that focuses onutility loss of the top-k tuples in a ranked result Uaul isdefined as

(i) Input ε D m mprime Q(ii) Output P

Bj

v forallv and forallBj

(1) k (1ε) minus 1(2) for v in D do(3) for j isin 1 mprime1113864 1113865 do(4) P

Bj

v v[Bj]1113966 1113967

(5) end(6) end(7) for v in D do(8) while existj isin 1 mprime1113864 1113865 |P

Bj

v |lt k + 1 do(9) for vprime isin t isin D|forallAi t[Ai] v[Ai]1113864 1113865 do(10) compute covervprime lossvprime(11) scorevprime covervprime 11 + exp(minus lossvprime )

(12) end(13) vmax argmaxvprimeisin |tisinD|forallAit[Ai]v[Ai] scorevprime

(14) for j isin 1 mprime1113864 1113865 do

(15) PBj

v PBj

v cup tmax[Bj]1113966 1113967

(16) end(17) end(18) end

ALGORITHM 2 PVST-Constructor

Security and Communication Networks 9

Uaul 1

|Q|1113944

|Q|

i1

1Dprime

11138681113868111386811138681113868111386811138681113868

1113944

visinDprime

min Rank(v ∣ q) k + 11113864 1113865 minus min Rankprime(v ∣ q) k + 11113864 11138651113868111386811138681113868

1113868111386811138681113868

k (23)

where Dprime v isin D |Rank(v | q))lt k or Rank(v | q))lt k1113864 1113865Intuitively Uaul represents the average percentage rankdifference relative to k over all queries and all tuples in top-kUaul is equivalent to UQ|Q||D|2 when k |D| By default weset k 100

631 Evaluation of Uaul with Varying k We first discuss theaverage top-k utility loss of PVSV-Constructor PVST-Constructor and a baseline algorithm on varying k withother parameters set to default values For a tuple vrsquos at-tribute Bj the baseline algorithm constructs P

Bj

v with v[Bj]

and a value randomly picked from VBj v[Bj]1113966 1113967 )e results

are presented in Figure 2 which shows that the Uaul ofPVSV-Constructor is significantly lower than that of thebaseline algorithm )e Uaul of PVST-Constructor is alsolower than that of the baseline algorithm when kge 5 eventhough the baseline algorithm cannot preserve privacyagainst authenticity-knowledgeable adversaries )e exper-imental results show that both PVSV and PVST can reduceutility loss with respect to rank differences Also note thatPVST-Constructor constructs polymorphic value sets withtrue values and thus from Figure 3 we can see that somepolymorphic value sets constructed by PVST-Constructorcontains more than 2 values

632 Evaluation of Uaul with Varying Sizes of QFigure 4 presents the average top-k utility loss of PVSV-Constructor on varying |Q| When |Q| is set to 1 5 or 10 werandomly picked 1 5 or 10 queries from the original Qrespectively With increasing number of queries in the queryworkload the Uaul of PVSV-Constructor increases mono-tonically )e reason is that PVSV-Constructor alwaysgenerates the least frequent value (denoted as vprime[Bj]) in1113864q[Bj] | q isin Q1113865 for v[Bj] If |Q| is small then it is possiblethat vprime[Bj] notin 1113864q[Bj] | q isin Q1113865 and thus s(v ∣ q) sprime(v ∣ q)

forallq isin Q However a larger query workload covers moreprivate attributes values and thus it will be harder forPVSV-Constructor to generate a value for v[Bj] that has noimpact on the rank of v for any q isin Q Figure 5 presents theaverage top-k utility loss of PVST-Constructor on varying|Q| We can see that the size of Q has no significant impacton the Uaul of PVST-Constructor as the PVST-Constructoralways pick a tuple vprime that is different from v in most at-tributes and then insert vprime[Bj] into P

Bj

v

633 Evaluation of Uaul with Varying Sizes of the DatasetFigures 6 and 7 depict the impact of the size of datasets onUaul of PVSV-Constructor and PVST-Constructor Datasetsof 100000 and 200000 tuples were randomly sampled from

QPoint QIn QIPoint QIIn

Atta

ck su

cces

s gue

ss ra

te

Without protectionPVSVPVST

1

09

08

07

06

05

04

03

02

01

0

Figure 1 Attack success guess

10 Security and Communication Networks

the testing bed of 300000 tuples As expected |D| has nosignificant impact on the Uaul of PVSV-Constructor sincePVSV-Constructor generates values from the domain ofeach private attributes |D| has no impact on the Uaul ofPVST-Constructor which indicates that a dataset contain-ing 100000 tuples is sufficient for PVST-Constructor togenerate polymorphic value sets with true values

634 Evaluation of Uaul with Varying m We investigate theimpact of the number of private and public attributes on

average top-k utility loss Figure 8 presents theUaul of PVSV-Constructor with fixed mprime and varying m and with fixed mand varying mprime When mmprime is set to 5 we randomly re-moved 5 attributes from the testing bed When mmprime is set to

0 20 40 60 80 100k

RandomPVSVPVST

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 2 Utility loss vs k

1 2 3 4 5+Size of PVST

3

25

2

15

1

05

0

Coun

t 10

5

Figure 3 Size of polymorphic value set (PVST)

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

00 10 20

Figure 4 Utility loss vs |Q|

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

00 10 20

Figure 5 Utility loss vs |Q|

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 6 Utility loss vs |D|

Security and Communication Networks 11

15 we added 5 more categorical attributes randomly chosenfrom the unused attributes We observe that the Uaulmonotonically decreases with increasing number of publicattributes and monotonically increases with increasingnumber of private attributes As expected a higher pro-portion of public attributes leads to less variant betweens(v ∣ q) and sprime(v ∣ q) e results of the same experimentwith PVST-Constructor are shown in Figure 9 Uaul in-creases as increasing number of private attributes as ex-pected However Uaul also increases slightly with increasingnumber of public attributes is is due to the fact that withmore public attributes there will be few tuples that share thesame public attribute values Since PVST-Constructor in-serts only private attribute values from tuples sharing samepublic attribute values more values will be inserted to eachpolymorphic value set which introduces higher utility loss

635 Evaluation of Uaul with Varying Weight RatiosFigures 10 and 11 illustrate the impact of weight ratios on theaverage utility loss e experiment was conducted with a

xed private attribute weight of 1 and varying public at-tribute weights of 1 2 and 3 As expected Uaul of bothPVSV-Constructor and PVST-Constructor decreases as

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 10 Utility loss vs weight ratio

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 11 Utility loss vs weight ratio

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss1

09

08

07

06

05

04

03

02

01

0

Figure 7 Utility loss vs |D|

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 8 Utility loss vs m and mprime

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 9 Utility loss vs m and mprime

12 Security and Communication Networks

increasing weight ratio of public attributes)e reason is thatas the public attribute weight increases the part in sprime(v ∣ q)

caused by private attributes decreases )erefore less impactwould be made to sprime(v ∣ q) by PVSV and PVST As a resultsprime(v ∣ q) would be closer to s(v ∣ q) and the utility loss couldbe decreased

7 Conclusions

In this paper we proposed a novel framework that preservesprivacy of private attributes against arbitrary attacks throughthe ranked retrieval model Furthermore we identify twocategories of adversaries based on varying adversarial ca-pabilities For each kind of adversaries we presentedimplementation of our framework Our experimental resultssuggest that our implementations efficiently preserve privacyagainst Rank Inference attack [35] Moreover the imple-mentations significantly reduce utility loss with respect tothe variance in ranked results

It is our hope that this paper can motivate further re-search in privacy preservation of SIoT with consideration ofsocial network features andor variant information retrievalmodels eg text mining

Appendix

A 2-PVSV

In this subsection we prove that constructing an optimal PVSVfor each attribute of a tuple v is an NP-hard problem

Definition A1 For a tuple v in D we create PBj

v for each BjWe say v satisfies query q if the following hold for any tuple t(tne v) in D (1) If s(v ∣ q)lt s(t ∣ q) then sprime(v ∣ q)lt sprime(t ∣ q)and (2) if s(v ∣ q)ge s(t ∣ q) then sprime(v ∣ q)ge sprime(t ∣ q)

For a database containing 2 tuples the 2-PVSV problemcan be redefined as follows given a query workload Q adatabase D construct PB1

v PBmprime

v of size 2 such that v

satisfies the most queries in Q

Definition A2 Max-3Sat Problem given a 3-CNF formulaΦ find the truth assignment that satisfies that most clauses

Lemma A1 Max-3Sat leP 2-PVSV Problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVSV instance Without loss of generality we suppose thatΦ is a conjunction of l clauses and each clause is a dis-junction of 3 literals from set X x1 xn1113864 1113865 We constructdatabase D as follows D has 0 public attributes and n + 1private attributes B1 Bn C1 Let VB

j be the attribute do-main of Bj j isin 1 n and VC

1 be the attribute domain ofC1 Let VB

i minus 1 0 1 2 n + 2 and VC1 0 1 Two

tuples v1 and v2 are inserted into D v1[Bj] 0 and v2[Bj]

j + 2 for j isin 1 n while v1[C1] 0 and v2[C1] 1 Wesimplify the score function defined in (1) by setting all weightsto 1 Also note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null

We construct query workload Q based on Φ For eachclause Lk isin Φ we construct a query qk such that qk[Bi] i iffthe corresponding literal xi is a positive literal in Lk andq[Bi] i + 1 iff xi is a negative literal in the clause Forexample given a clause (x1orx2orx3) the correspondingquery q should satisfy q[B1] 1 q[B2] 3 q[B3] 3 andq[C1] 0 All other attributes in q are set to null by default)erefore s(v1 ∣ q) 1 and s(v2 ∣ q) 0 forallq isin Q

Since s(v2 ∣ q)lt s(v1 ∣ q) forallq isin Q in order to minimizeutility loss the value of sprime(v2 ∣ q) should be as small aspossible We observe that the minimum possible sprime(v2 ∣ q) is1 because PC1

v2must be 0 1 as VC

1 0 1 Without loss ofgenerality we assume that we have already constructedPB1

v2 P

Bmprime

v2 PC1v2

for v2 such that sprime(v2 ∣ q) 1 forallq isin QNow we have an instance of 2-PVSV problem that given

D Q constructs PB1v1

PBmprime

v1 PC1

v1that satisfies the most

queries in Q Since VC1 0 1 PC1

v1must be 0 1 too as

PC1v1

contains two distinct values )erefore for an ar-bitrary query q isin Q we have ρprime(q[C] v1[C]) 1 andρprime(q[C] v2[C]) 1 In order to let v1 satisfy q we have toensure that sprime(v1 ∣ q)gt sprime(v2 ∣ q) 1 )us we have

sprime v1 ∣ q( 1113857gt 1

⟺1113944mprime

i1ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857gt 1

⟺existi isin 1 mprime1113864 1113865 ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857 1

⟺existi isin 1 mprime1113864 1113865 q Bi1113858 1113859 isin PBi

v1

(A1)

Now we show that the solution of 2-PVSV problemconstructed above can answer the corresponding Max-3Sat Problem Suppose that we have the solution PB1

v1

PB

mprimev1 such that v1 satisfies the most queries in Q As in

(A1) if v1 satisfies qk we have qk[Bi] isin PBiv1 If v1 does not

satisfy qk then qk[Bi] notin PBiv1 Recall that we assign value i or

i + 1 to qk[Bi] if xi is a positive or negative literal in clauseLk respectively )erefore the assignment of xi given PBi

v1is

xi true if i isin PBi

v1

xi false if i + 1 isin PBi

v1

(A2)

Furthermore from equation (A1) we have

sprime v1 ∣ q( 1113857gt 1⟺L true (A3)

where L is qrsquos corresponding clause in Φ Note thati i + 1 nsubQ as |PBi

v1| 2 and 0 isin PBi

v1 )us the value of xi is

either true or falseAs we proved above v1 satisfies qi if and only if Li is true

given assignment x1 xl1113864 1113865 constructed according toequation (A2) )erefore x1 xl1113864 1113865 satisfies the mostclauses in ϕ if and only if v1 satisfies the most queries in Q

We now prove that function f can be conducted inpolynomial time Given a formula Φ with n variables and lclauses we construct a 2-PVSV instance with 2 tuples each

Security and Communication Networks 13

of which has n + 1 attributes and l queries each of which has4 attributes )erefore O(2n + 4l) assignments are neededand f can be conducted in polynomial time

Proof of 4eorem 1 We now prove that the 2-PVSVproblem is NP-hard In Lemma 3 we proved that Max-3SatProblem can be reduced to the 2-PVSV problem in poly-nomial time Furthermore as Max-3Sat Problem is a NP-hard problem [33] the 2-PVSV problem is NP-hard

B 2-PVST

In this section we prove that constructing an optimal PVSTfor each private attribute of a tuple v isin D is NP-hard We usethe definition of v satisfying q from (9))e 2-PVSTproblemcan be redefined as follows

Definition B1 2-PVSTproblem given a query workload Qa database D the optimization problem of 2-PVST is toconstruct an arbitrary tuple vrsquos polymorphic setsPB1

v PBmprime

v that satisfies the most queries in Q )e size ofeach polymorphic vale set is 2

Lemma B1 Max-3Sat le P 2-PVST problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVST instance We assume that Φ is a conjunction of lclauses and each clause is a disjunction of 3 literals from setX x1 xn1113864 1113865 Let D have no public attribute and n + 1private attributes B1 Bn C1 For each literal xi we inserttwo tuples vi and vi

prime into D where

vi C11113858 1113859 1 vi Bi1113858 1113859 1 vi Bj1113960 1113961 minus 1foralljne i

viprime C11113858 1113859 1 vi

prime Bi1113858 1113859 minus 1 viprime Bj1113960 1113961 1foralljne i

(B1)

)en we insert tuple v into D where v[C1] 0 andv[Bj] 0 forallj isin 1 mprime We also set the domain of eachBj as minus 1 0 1 and the domain of C1 as 0 1

Without loss of generality the score function defined in(1) is simplified by setting all weights to 1

Next we construct query workload Q For each clauseLk isin Φ we construct a query qk in which qk[Bj] minus 1 iff xj isa positive literal in Lk and qk[Bj] 1 iff xj is a negativeliteral in Lk We also set qk[C1] 1 )e rest of the attributevalues are set to null by default )erefore ifLk (x1orne x2orx3) then we have qk[C1] 1 qk[B1] 1qk[B2] 1 qk[B3] minus 1 and qk[Bj] null forj isin 4 mprime

Note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null )uss(v qk) 0 forallq isin Q and s(vi qk) s(vi

prime qk)ge 1 forallq isin Q Weobserve that for q isin Q and i isin 1 n s(v ∣ q)lt s(vi ∣ q)

and s(v ∣ q)lt s(viprime ∣ q) In order to reduce UQ we have to

maximize the number of q isin Q such that sprime(v ∣ q)lt sprime(vi ∣ q)

and sprime(v ∣ q)lt sprime(viprime ∣ q) )e maximum value of sprime(vi ∣ q)

and sprime(viprime ∣ q) is 4 which can be achieved by inserting vi[Bj]

into PBj

viprime and inserting vi

prime[Bj] into PBj

vi Since PC1

v 0 1 thevalue of sprime(v ∣ q) is at least 1 )erefore we have

sprime(v ∣ q)lt sprime vi ∣ q( 1113857

⟺sprime(v ∣ q)lt 4

⟺1113944

mprime

j1ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873lt 3

⟺existj isin 1 mprime ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 0

⟺existj isin 1 mprime v Bj1113960 1113961ne q Bj1113960 1113961

⟺existj isin 1 mprime minus q Bj1113960 1113961 isin PBj

v

(B2)

Since |PBj

v | is limited to 2 PBj

v 0 1 or 0 minus 1 ie wemust insert either virsquos or vi

primersquos private attribute values intoPB1

v PBmv Recall that we assign minus 1 or 1 to qk[Bj] if xj

is positive or negative respectively in Lk In order toreduce Max-3Sat to 2-PSVT we assign literals in the fol-lowing way

xj true if minus 1 isin PBj

v

xj false if 1 isin PBj

v (B3)

)erefore we denote the corresponding clause of q as Land from equation (B2) we have

existj isin 1 mprime1113864 1113865 minus q Bj1113960 1113961 isin PBj

v

⟺existj isin 1 mprime1113864 1113865 xj true if xj is positive in L

orxj false if xj is negative in L

⟺L true(B4)

From the above equation we observe that v satisfying qk

is equivalent to Lk being true)erefore we can reduceMax-3Sat to 2-PVST Given a solution to the 2-PVSTproblem thatmaximizes the number of queries satisfied by v assignmentx1 xn1113864 1113865 produced by equation (B3) can satisfy thelargest number of clauses in Φ

Given a Max-3Sat instance Φ the construction of the 2-PVST instance can be conducted in polynomial time as weconstruct n queries with 3 attributes and 2n + 1 tuples with nattributes)e transformation from the optimal solution of 2-PVST to the optimal solution of Max-3Sat also takes poly-nomial time as we assign the value to each one of x1 xn

once )erefore f is a polynomial time function

Proof of 4eorem 2 In 4 we proved that a Max-3Sat in-stance can be reduced to a 2-PSVT instance in polynomialtime Furthermore as Max-3Sat Problem is an NP-hardproblem the 2-PVSV problem is an NP-hard problem

Data Availability

)e data used to support the findings of this study areavailable from the corresponding author upon request

14 Security and Communication Networks

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 10: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

Uaul 1

|Q|1113944

|Q|

i1

1Dprime

11138681113868111386811138681113868111386811138681113868

1113944

visinDprime

min Rank(v ∣ q) k + 11113864 1113865 minus min Rankprime(v ∣ q) k + 11113864 11138651113868111386811138681113868

1113868111386811138681113868

k (23)

where Dprime v isin D |Rank(v | q))lt k or Rank(v | q))lt k1113864 1113865Intuitively Uaul represents the average percentage rankdifference relative to k over all queries and all tuples in top-kUaul is equivalent to UQ|Q||D|2 when k |D| By default weset k 100

631 Evaluation of Uaul with Varying k We first discuss theaverage top-k utility loss of PVSV-Constructor PVST-Constructor and a baseline algorithm on varying k withother parameters set to default values For a tuple vrsquos at-tribute Bj the baseline algorithm constructs P

Bj

v with v[Bj]

and a value randomly picked from VBj v[Bj]1113966 1113967 )e results

are presented in Figure 2 which shows that the Uaul ofPVSV-Constructor is significantly lower than that of thebaseline algorithm )e Uaul of PVST-Constructor is alsolower than that of the baseline algorithm when kge 5 eventhough the baseline algorithm cannot preserve privacyagainst authenticity-knowledgeable adversaries )e exper-imental results show that both PVSV and PVST can reduceutility loss with respect to rank differences Also note thatPVST-Constructor constructs polymorphic value sets withtrue values and thus from Figure 3 we can see that somepolymorphic value sets constructed by PVST-Constructorcontains more than 2 values

632 Evaluation of Uaul with Varying Sizes of QFigure 4 presents the average top-k utility loss of PVSV-Constructor on varying |Q| When |Q| is set to 1 5 or 10 werandomly picked 1 5 or 10 queries from the original Qrespectively With increasing number of queries in the queryworkload the Uaul of PVSV-Constructor increases mono-tonically )e reason is that PVSV-Constructor alwaysgenerates the least frequent value (denoted as vprime[Bj]) in1113864q[Bj] | q isin Q1113865 for v[Bj] If |Q| is small then it is possiblethat vprime[Bj] notin 1113864q[Bj] | q isin Q1113865 and thus s(v ∣ q) sprime(v ∣ q)

forallq isin Q However a larger query workload covers moreprivate attributes values and thus it will be harder forPVSV-Constructor to generate a value for v[Bj] that has noimpact on the rank of v for any q isin Q Figure 5 presents theaverage top-k utility loss of PVST-Constructor on varying|Q| We can see that the size of Q has no significant impacton the Uaul of PVST-Constructor as the PVST-Constructoralways pick a tuple vprime that is different from v in most at-tributes and then insert vprime[Bj] into P

Bj

v

633 Evaluation of Uaul with Varying Sizes of the DatasetFigures 6 and 7 depict the impact of the size of datasets onUaul of PVSV-Constructor and PVST-Constructor Datasetsof 100000 and 200000 tuples were randomly sampled from

QPoint QIn QIPoint QIIn

Atta

ck su

cces

s gue

ss ra

te

Without protectionPVSVPVST

1

09

08

07

06

05

04

03

02

01

0

Figure 1 Attack success guess

10 Security and Communication Networks

the testing bed of 300000 tuples As expected |D| has nosignificant impact on the Uaul of PVSV-Constructor sincePVSV-Constructor generates values from the domain ofeach private attributes |D| has no impact on the Uaul ofPVST-Constructor which indicates that a dataset contain-ing 100000 tuples is sufficient for PVST-Constructor togenerate polymorphic value sets with true values

634 Evaluation of Uaul with Varying m We investigate theimpact of the number of private and public attributes on

average top-k utility loss Figure 8 presents theUaul of PVSV-Constructor with fixed mprime and varying m and with fixed mand varying mprime When mmprime is set to 5 we randomly re-moved 5 attributes from the testing bed When mmprime is set to

0 20 40 60 80 100k

RandomPVSVPVST

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 2 Utility loss vs k

1 2 3 4 5+Size of PVST

3

25

2

15

1

05

0

Coun

t 10

5

Figure 3 Size of polymorphic value set (PVST)

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

00 10 20

Figure 4 Utility loss vs |Q|

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

00 10 20

Figure 5 Utility loss vs |Q|

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 6 Utility loss vs |D|

Security and Communication Networks 11

15 we added 5 more categorical attributes randomly chosenfrom the unused attributes We observe that the Uaulmonotonically decreases with increasing number of publicattributes and monotonically increases with increasingnumber of private attributes As expected a higher pro-portion of public attributes leads to less variant betweens(v ∣ q) and sprime(v ∣ q) e results of the same experimentwith PVST-Constructor are shown in Figure 9 Uaul in-creases as increasing number of private attributes as ex-pected However Uaul also increases slightly with increasingnumber of public attributes is is due to the fact that withmore public attributes there will be few tuples that share thesame public attribute values Since PVST-Constructor in-serts only private attribute values from tuples sharing samepublic attribute values more values will be inserted to eachpolymorphic value set which introduces higher utility loss

635 Evaluation of Uaul with Varying Weight RatiosFigures 10 and 11 illustrate the impact of weight ratios on theaverage utility loss e experiment was conducted with a

xed private attribute weight of 1 and varying public at-tribute weights of 1 2 and 3 As expected Uaul of bothPVSV-Constructor and PVST-Constructor decreases as

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 10 Utility loss vs weight ratio

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 11 Utility loss vs weight ratio

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss1

09

08

07

06

05

04

03

02

01

0

Figure 7 Utility loss vs |D|

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 8 Utility loss vs m and mprime

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 9 Utility loss vs m and mprime

12 Security and Communication Networks

increasing weight ratio of public attributes)e reason is thatas the public attribute weight increases the part in sprime(v ∣ q)

caused by private attributes decreases )erefore less impactwould be made to sprime(v ∣ q) by PVSV and PVST As a resultsprime(v ∣ q) would be closer to s(v ∣ q) and the utility loss couldbe decreased

7 Conclusions

In this paper we proposed a novel framework that preservesprivacy of private attributes against arbitrary attacks throughthe ranked retrieval model Furthermore we identify twocategories of adversaries based on varying adversarial ca-pabilities For each kind of adversaries we presentedimplementation of our framework Our experimental resultssuggest that our implementations efficiently preserve privacyagainst Rank Inference attack [35] Moreover the imple-mentations significantly reduce utility loss with respect tothe variance in ranked results

It is our hope that this paper can motivate further re-search in privacy preservation of SIoT with consideration ofsocial network features andor variant information retrievalmodels eg text mining

Appendix

A 2-PVSV

In this subsection we prove that constructing an optimal PVSVfor each attribute of a tuple v is an NP-hard problem

Definition A1 For a tuple v in D we create PBj

v for each BjWe say v satisfies query q if the following hold for any tuple t(tne v) in D (1) If s(v ∣ q)lt s(t ∣ q) then sprime(v ∣ q)lt sprime(t ∣ q)and (2) if s(v ∣ q)ge s(t ∣ q) then sprime(v ∣ q)ge sprime(t ∣ q)

For a database containing 2 tuples the 2-PVSV problemcan be redefined as follows given a query workload Q adatabase D construct PB1

v PBmprime

v of size 2 such that v

satisfies the most queries in Q

Definition A2 Max-3Sat Problem given a 3-CNF formulaΦ find the truth assignment that satisfies that most clauses

Lemma A1 Max-3Sat leP 2-PVSV Problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVSV instance Without loss of generality we suppose thatΦ is a conjunction of l clauses and each clause is a dis-junction of 3 literals from set X x1 xn1113864 1113865 We constructdatabase D as follows D has 0 public attributes and n + 1private attributes B1 Bn C1 Let VB

j be the attribute do-main of Bj j isin 1 n and VC

1 be the attribute domain ofC1 Let VB

i minus 1 0 1 2 n + 2 and VC1 0 1 Two

tuples v1 and v2 are inserted into D v1[Bj] 0 and v2[Bj]

j + 2 for j isin 1 n while v1[C1] 0 and v2[C1] 1 Wesimplify the score function defined in (1) by setting all weightsto 1 Also note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null

We construct query workload Q based on Φ For eachclause Lk isin Φ we construct a query qk such that qk[Bi] i iffthe corresponding literal xi is a positive literal in Lk andq[Bi] i + 1 iff xi is a negative literal in the clause Forexample given a clause (x1orx2orx3) the correspondingquery q should satisfy q[B1] 1 q[B2] 3 q[B3] 3 andq[C1] 0 All other attributes in q are set to null by default)erefore s(v1 ∣ q) 1 and s(v2 ∣ q) 0 forallq isin Q

Since s(v2 ∣ q)lt s(v1 ∣ q) forallq isin Q in order to minimizeutility loss the value of sprime(v2 ∣ q) should be as small aspossible We observe that the minimum possible sprime(v2 ∣ q) is1 because PC1

v2must be 0 1 as VC

1 0 1 Without loss ofgenerality we assume that we have already constructedPB1

v2 P

Bmprime

v2 PC1v2

for v2 such that sprime(v2 ∣ q) 1 forallq isin QNow we have an instance of 2-PVSV problem that given

D Q constructs PB1v1

PBmprime

v1 PC1

v1that satisfies the most

queries in Q Since VC1 0 1 PC1

v1must be 0 1 too as

PC1v1

contains two distinct values )erefore for an ar-bitrary query q isin Q we have ρprime(q[C] v1[C]) 1 andρprime(q[C] v2[C]) 1 In order to let v1 satisfy q we have toensure that sprime(v1 ∣ q)gt sprime(v2 ∣ q) 1 )us we have

sprime v1 ∣ q( 1113857gt 1

⟺1113944mprime

i1ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857gt 1

⟺existi isin 1 mprime1113864 1113865 ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857 1

⟺existi isin 1 mprime1113864 1113865 q Bi1113858 1113859 isin PBi

v1

(A1)

Now we show that the solution of 2-PVSV problemconstructed above can answer the corresponding Max-3Sat Problem Suppose that we have the solution PB1

v1

PB

mprimev1 such that v1 satisfies the most queries in Q As in

(A1) if v1 satisfies qk we have qk[Bi] isin PBiv1 If v1 does not

satisfy qk then qk[Bi] notin PBiv1 Recall that we assign value i or

i + 1 to qk[Bi] if xi is a positive or negative literal in clauseLk respectively )erefore the assignment of xi given PBi

v1is

xi true if i isin PBi

v1

xi false if i + 1 isin PBi

v1

(A2)

Furthermore from equation (A1) we have

sprime v1 ∣ q( 1113857gt 1⟺L true (A3)

where L is qrsquos corresponding clause in Φ Note thati i + 1 nsubQ as |PBi

v1| 2 and 0 isin PBi

v1 )us the value of xi is

either true or falseAs we proved above v1 satisfies qi if and only if Li is true

given assignment x1 xl1113864 1113865 constructed according toequation (A2) )erefore x1 xl1113864 1113865 satisfies the mostclauses in ϕ if and only if v1 satisfies the most queries in Q

We now prove that function f can be conducted inpolynomial time Given a formula Φ with n variables and lclauses we construct a 2-PVSV instance with 2 tuples each

Security and Communication Networks 13

of which has n + 1 attributes and l queries each of which has4 attributes )erefore O(2n + 4l) assignments are neededand f can be conducted in polynomial time

Proof of 4eorem 1 We now prove that the 2-PVSVproblem is NP-hard In Lemma 3 we proved that Max-3SatProblem can be reduced to the 2-PVSV problem in poly-nomial time Furthermore as Max-3Sat Problem is a NP-hard problem [33] the 2-PVSV problem is NP-hard

B 2-PVST

In this section we prove that constructing an optimal PVSTfor each private attribute of a tuple v isin D is NP-hard We usethe definition of v satisfying q from (9))e 2-PVSTproblemcan be redefined as follows

Definition B1 2-PVSTproblem given a query workload Qa database D the optimization problem of 2-PVST is toconstruct an arbitrary tuple vrsquos polymorphic setsPB1

v PBmprime

v that satisfies the most queries in Q )e size ofeach polymorphic vale set is 2

Lemma B1 Max-3Sat le P 2-PVST problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVST instance We assume that Φ is a conjunction of lclauses and each clause is a disjunction of 3 literals from setX x1 xn1113864 1113865 Let D have no public attribute and n + 1private attributes B1 Bn C1 For each literal xi we inserttwo tuples vi and vi

prime into D where

vi C11113858 1113859 1 vi Bi1113858 1113859 1 vi Bj1113960 1113961 minus 1foralljne i

viprime C11113858 1113859 1 vi

prime Bi1113858 1113859 minus 1 viprime Bj1113960 1113961 1foralljne i

(B1)

)en we insert tuple v into D where v[C1] 0 andv[Bj] 0 forallj isin 1 mprime We also set the domain of eachBj as minus 1 0 1 and the domain of C1 as 0 1

Without loss of generality the score function defined in(1) is simplified by setting all weights to 1

Next we construct query workload Q For each clauseLk isin Φ we construct a query qk in which qk[Bj] minus 1 iff xj isa positive literal in Lk and qk[Bj] 1 iff xj is a negativeliteral in Lk We also set qk[C1] 1 )e rest of the attributevalues are set to null by default )erefore ifLk (x1orne x2orx3) then we have qk[C1] 1 qk[B1] 1qk[B2] 1 qk[B3] minus 1 and qk[Bj] null forj isin 4 mprime

Note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null )uss(v qk) 0 forallq isin Q and s(vi qk) s(vi

prime qk)ge 1 forallq isin Q Weobserve that for q isin Q and i isin 1 n s(v ∣ q)lt s(vi ∣ q)

and s(v ∣ q)lt s(viprime ∣ q) In order to reduce UQ we have to

maximize the number of q isin Q such that sprime(v ∣ q)lt sprime(vi ∣ q)

and sprime(v ∣ q)lt sprime(viprime ∣ q) )e maximum value of sprime(vi ∣ q)

and sprime(viprime ∣ q) is 4 which can be achieved by inserting vi[Bj]

into PBj

viprime and inserting vi

prime[Bj] into PBj

vi Since PC1

v 0 1 thevalue of sprime(v ∣ q) is at least 1 )erefore we have

sprime(v ∣ q)lt sprime vi ∣ q( 1113857

⟺sprime(v ∣ q)lt 4

⟺1113944

mprime

j1ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873lt 3

⟺existj isin 1 mprime ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 0

⟺existj isin 1 mprime v Bj1113960 1113961ne q Bj1113960 1113961

⟺existj isin 1 mprime minus q Bj1113960 1113961 isin PBj

v

(B2)

Since |PBj

v | is limited to 2 PBj

v 0 1 or 0 minus 1 ie wemust insert either virsquos or vi

primersquos private attribute values intoPB1

v PBmv Recall that we assign minus 1 or 1 to qk[Bj] if xj

is positive or negative respectively in Lk In order toreduce Max-3Sat to 2-PSVT we assign literals in the fol-lowing way

xj true if minus 1 isin PBj

v

xj false if 1 isin PBj

v (B3)

)erefore we denote the corresponding clause of q as Land from equation (B2) we have

existj isin 1 mprime1113864 1113865 minus q Bj1113960 1113961 isin PBj

v

⟺existj isin 1 mprime1113864 1113865 xj true if xj is positive in L

orxj false if xj is negative in L

⟺L true(B4)

From the above equation we observe that v satisfying qk

is equivalent to Lk being true)erefore we can reduceMax-3Sat to 2-PVST Given a solution to the 2-PVSTproblem thatmaximizes the number of queries satisfied by v assignmentx1 xn1113864 1113865 produced by equation (B3) can satisfy thelargest number of clauses in Φ

Given a Max-3Sat instance Φ the construction of the 2-PVST instance can be conducted in polynomial time as weconstruct n queries with 3 attributes and 2n + 1 tuples with nattributes)e transformation from the optimal solution of 2-PVST to the optimal solution of Max-3Sat also takes poly-nomial time as we assign the value to each one of x1 xn

once )erefore f is a polynomial time function

Proof of 4eorem 2 In 4 we proved that a Max-3Sat in-stance can be reduced to a 2-PSVT instance in polynomialtime Furthermore as Max-3Sat Problem is an NP-hardproblem the 2-PVSV problem is an NP-hard problem

Data Availability

)e data used to support the findings of this study areavailable from the corresponding author upon request

14 Security and Communication Networks

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 11: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

the testing bed of 300000 tuples As expected |D| has nosignificant impact on the Uaul of PVSV-Constructor sincePVSV-Constructor generates values from the domain ofeach private attributes |D| has no impact on the Uaul ofPVST-Constructor which indicates that a dataset contain-ing 100000 tuples is sufficient for PVST-Constructor togenerate polymorphic value sets with true values

634 Evaluation of Uaul with Varying m We investigate theimpact of the number of private and public attributes on

average top-k utility loss Figure 8 presents theUaul of PVSV-Constructor with fixed mprime and varying m and with fixed mand varying mprime When mmprime is set to 5 we randomly re-moved 5 attributes from the testing bed When mmprime is set to

0 20 40 60 80 100k

RandomPVSVPVST

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 2 Utility loss vs k

1 2 3 4 5+Size of PVST

3

25

2

15

1

05

0

Coun

t 10

5

Figure 3 Size of polymorphic value set (PVST)

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

00 10 20

Figure 4 Utility loss vs |Q|

|Q|

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

00 10 20

Figure 5 Utility loss vs |Q|

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 6 Utility loss vs |D|

Security and Communication Networks 11

15 we added 5 more categorical attributes randomly chosenfrom the unused attributes We observe that the Uaulmonotonically decreases with increasing number of publicattributes and monotonically increases with increasingnumber of private attributes As expected a higher pro-portion of public attributes leads to less variant betweens(v ∣ q) and sprime(v ∣ q) e results of the same experimentwith PVST-Constructor are shown in Figure 9 Uaul in-creases as increasing number of private attributes as ex-pected However Uaul also increases slightly with increasingnumber of public attributes is is due to the fact that withmore public attributes there will be few tuples that share thesame public attribute values Since PVST-Constructor in-serts only private attribute values from tuples sharing samepublic attribute values more values will be inserted to eachpolymorphic value set which introduces higher utility loss

635 Evaluation of Uaul with Varying Weight RatiosFigures 10 and 11 illustrate the impact of weight ratios on theaverage utility loss e experiment was conducted with a

xed private attribute weight of 1 and varying public at-tribute weights of 1 2 and 3 As expected Uaul of bothPVSV-Constructor and PVST-Constructor decreases as

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 10 Utility loss vs weight ratio

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 11 Utility loss vs weight ratio

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss1

09

08

07

06

05

04

03

02

01

0

Figure 7 Utility loss vs |D|

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 8 Utility loss vs m and mprime

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 9 Utility loss vs m and mprime

12 Security and Communication Networks

increasing weight ratio of public attributes)e reason is thatas the public attribute weight increases the part in sprime(v ∣ q)

caused by private attributes decreases )erefore less impactwould be made to sprime(v ∣ q) by PVSV and PVST As a resultsprime(v ∣ q) would be closer to s(v ∣ q) and the utility loss couldbe decreased

7 Conclusions

In this paper we proposed a novel framework that preservesprivacy of private attributes against arbitrary attacks throughthe ranked retrieval model Furthermore we identify twocategories of adversaries based on varying adversarial ca-pabilities For each kind of adversaries we presentedimplementation of our framework Our experimental resultssuggest that our implementations efficiently preserve privacyagainst Rank Inference attack [35] Moreover the imple-mentations significantly reduce utility loss with respect tothe variance in ranked results

It is our hope that this paper can motivate further re-search in privacy preservation of SIoT with consideration ofsocial network features andor variant information retrievalmodels eg text mining

Appendix

A 2-PVSV

In this subsection we prove that constructing an optimal PVSVfor each attribute of a tuple v is an NP-hard problem

Definition A1 For a tuple v in D we create PBj

v for each BjWe say v satisfies query q if the following hold for any tuple t(tne v) in D (1) If s(v ∣ q)lt s(t ∣ q) then sprime(v ∣ q)lt sprime(t ∣ q)and (2) if s(v ∣ q)ge s(t ∣ q) then sprime(v ∣ q)ge sprime(t ∣ q)

For a database containing 2 tuples the 2-PVSV problemcan be redefined as follows given a query workload Q adatabase D construct PB1

v PBmprime

v of size 2 such that v

satisfies the most queries in Q

Definition A2 Max-3Sat Problem given a 3-CNF formulaΦ find the truth assignment that satisfies that most clauses

Lemma A1 Max-3Sat leP 2-PVSV Problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVSV instance Without loss of generality we suppose thatΦ is a conjunction of l clauses and each clause is a dis-junction of 3 literals from set X x1 xn1113864 1113865 We constructdatabase D as follows D has 0 public attributes and n + 1private attributes B1 Bn C1 Let VB

j be the attribute do-main of Bj j isin 1 n and VC

1 be the attribute domain ofC1 Let VB

i minus 1 0 1 2 n + 2 and VC1 0 1 Two

tuples v1 and v2 are inserted into D v1[Bj] 0 and v2[Bj]

j + 2 for j isin 1 n while v1[C1] 0 and v2[C1] 1 Wesimplify the score function defined in (1) by setting all weightsto 1 Also note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null

We construct query workload Q based on Φ For eachclause Lk isin Φ we construct a query qk such that qk[Bi] i iffthe corresponding literal xi is a positive literal in Lk andq[Bi] i + 1 iff xi is a negative literal in the clause Forexample given a clause (x1orx2orx3) the correspondingquery q should satisfy q[B1] 1 q[B2] 3 q[B3] 3 andq[C1] 0 All other attributes in q are set to null by default)erefore s(v1 ∣ q) 1 and s(v2 ∣ q) 0 forallq isin Q

Since s(v2 ∣ q)lt s(v1 ∣ q) forallq isin Q in order to minimizeutility loss the value of sprime(v2 ∣ q) should be as small aspossible We observe that the minimum possible sprime(v2 ∣ q) is1 because PC1

v2must be 0 1 as VC

1 0 1 Without loss ofgenerality we assume that we have already constructedPB1

v2 P

Bmprime

v2 PC1v2

for v2 such that sprime(v2 ∣ q) 1 forallq isin QNow we have an instance of 2-PVSV problem that given

D Q constructs PB1v1

PBmprime

v1 PC1

v1that satisfies the most

queries in Q Since VC1 0 1 PC1

v1must be 0 1 too as

PC1v1

contains two distinct values )erefore for an ar-bitrary query q isin Q we have ρprime(q[C] v1[C]) 1 andρprime(q[C] v2[C]) 1 In order to let v1 satisfy q we have toensure that sprime(v1 ∣ q)gt sprime(v2 ∣ q) 1 )us we have

sprime v1 ∣ q( 1113857gt 1

⟺1113944mprime

i1ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857gt 1

⟺existi isin 1 mprime1113864 1113865 ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857 1

⟺existi isin 1 mprime1113864 1113865 q Bi1113858 1113859 isin PBi

v1

(A1)

Now we show that the solution of 2-PVSV problemconstructed above can answer the corresponding Max-3Sat Problem Suppose that we have the solution PB1

v1

PB

mprimev1 such that v1 satisfies the most queries in Q As in

(A1) if v1 satisfies qk we have qk[Bi] isin PBiv1 If v1 does not

satisfy qk then qk[Bi] notin PBiv1 Recall that we assign value i or

i + 1 to qk[Bi] if xi is a positive or negative literal in clauseLk respectively )erefore the assignment of xi given PBi

v1is

xi true if i isin PBi

v1

xi false if i + 1 isin PBi

v1

(A2)

Furthermore from equation (A1) we have

sprime v1 ∣ q( 1113857gt 1⟺L true (A3)

where L is qrsquos corresponding clause in Φ Note thati i + 1 nsubQ as |PBi

v1| 2 and 0 isin PBi

v1 )us the value of xi is

either true or falseAs we proved above v1 satisfies qi if and only if Li is true

given assignment x1 xl1113864 1113865 constructed according toequation (A2) )erefore x1 xl1113864 1113865 satisfies the mostclauses in ϕ if and only if v1 satisfies the most queries in Q

We now prove that function f can be conducted inpolynomial time Given a formula Φ with n variables and lclauses we construct a 2-PVSV instance with 2 tuples each

Security and Communication Networks 13

of which has n + 1 attributes and l queries each of which has4 attributes )erefore O(2n + 4l) assignments are neededand f can be conducted in polynomial time

Proof of 4eorem 1 We now prove that the 2-PVSVproblem is NP-hard In Lemma 3 we proved that Max-3SatProblem can be reduced to the 2-PVSV problem in poly-nomial time Furthermore as Max-3Sat Problem is a NP-hard problem [33] the 2-PVSV problem is NP-hard

B 2-PVST

In this section we prove that constructing an optimal PVSTfor each private attribute of a tuple v isin D is NP-hard We usethe definition of v satisfying q from (9))e 2-PVSTproblemcan be redefined as follows

Definition B1 2-PVSTproblem given a query workload Qa database D the optimization problem of 2-PVST is toconstruct an arbitrary tuple vrsquos polymorphic setsPB1

v PBmprime

v that satisfies the most queries in Q )e size ofeach polymorphic vale set is 2

Lemma B1 Max-3Sat le P 2-PVST problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVST instance We assume that Φ is a conjunction of lclauses and each clause is a disjunction of 3 literals from setX x1 xn1113864 1113865 Let D have no public attribute and n + 1private attributes B1 Bn C1 For each literal xi we inserttwo tuples vi and vi

prime into D where

vi C11113858 1113859 1 vi Bi1113858 1113859 1 vi Bj1113960 1113961 minus 1foralljne i

viprime C11113858 1113859 1 vi

prime Bi1113858 1113859 minus 1 viprime Bj1113960 1113961 1foralljne i

(B1)

)en we insert tuple v into D where v[C1] 0 andv[Bj] 0 forallj isin 1 mprime We also set the domain of eachBj as minus 1 0 1 and the domain of C1 as 0 1

Without loss of generality the score function defined in(1) is simplified by setting all weights to 1

Next we construct query workload Q For each clauseLk isin Φ we construct a query qk in which qk[Bj] minus 1 iff xj isa positive literal in Lk and qk[Bj] 1 iff xj is a negativeliteral in Lk We also set qk[C1] 1 )e rest of the attributevalues are set to null by default )erefore ifLk (x1orne x2orx3) then we have qk[C1] 1 qk[B1] 1qk[B2] 1 qk[B3] minus 1 and qk[Bj] null forj isin 4 mprime

Note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null )uss(v qk) 0 forallq isin Q and s(vi qk) s(vi

prime qk)ge 1 forallq isin Q Weobserve that for q isin Q and i isin 1 n s(v ∣ q)lt s(vi ∣ q)

and s(v ∣ q)lt s(viprime ∣ q) In order to reduce UQ we have to

maximize the number of q isin Q such that sprime(v ∣ q)lt sprime(vi ∣ q)

and sprime(v ∣ q)lt sprime(viprime ∣ q) )e maximum value of sprime(vi ∣ q)

and sprime(viprime ∣ q) is 4 which can be achieved by inserting vi[Bj]

into PBj

viprime and inserting vi

prime[Bj] into PBj

vi Since PC1

v 0 1 thevalue of sprime(v ∣ q) is at least 1 )erefore we have

sprime(v ∣ q)lt sprime vi ∣ q( 1113857

⟺sprime(v ∣ q)lt 4

⟺1113944

mprime

j1ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873lt 3

⟺existj isin 1 mprime ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 0

⟺existj isin 1 mprime v Bj1113960 1113961ne q Bj1113960 1113961

⟺existj isin 1 mprime minus q Bj1113960 1113961 isin PBj

v

(B2)

Since |PBj

v | is limited to 2 PBj

v 0 1 or 0 minus 1 ie wemust insert either virsquos or vi

primersquos private attribute values intoPB1

v PBmv Recall that we assign minus 1 or 1 to qk[Bj] if xj

is positive or negative respectively in Lk In order toreduce Max-3Sat to 2-PSVT we assign literals in the fol-lowing way

xj true if minus 1 isin PBj

v

xj false if 1 isin PBj

v (B3)

)erefore we denote the corresponding clause of q as Land from equation (B2) we have

existj isin 1 mprime1113864 1113865 minus q Bj1113960 1113961 isin PBj

v

⟺existj isin 1 mprime1113864 1113865 xj true if xj is positive in L

orxj false if xj is negative in L

⟺L true(B4)

From the above equation we observe that v satisfying qk

is equivalent to Lk being true)erefore we can reduceMax-3Sat to 2-PVST Given a solution to the 2-PVSTproblem thatmaximizes the number of queries satisfied by v assignmentx1 xn1113864 1113865 produced by equation (B3) can satisfy thelargest number of clauses in Φ

Given a Max-3Sat instance Φ the construction of the 2-PVST instance can be conducted in polynomial time as weconstruct n queries with 3 attributes and 2n + 1 tuples with nattributes)e transformation from the optimal solution of 2-PVST to the optimal solution of Max-3Sat also takes poly-nomial time as we assign the value to each one of x1 xn

once )erefore f is a polynomial time function

Proof of 4eorem 2 In 4 we proved that a Max-3Sat in-stance can be reduced to a 2-PSVT instance in polynomialtime Furthermore as Max-3Sat Problem is an NP-hardproblem the 2-PVSV problem is an NP-hard problem

Data Availability

)e data used to support the findings of this study areavailable from the corresponding author upon request

14 Security and Communication Networks

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 12: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

15 we added 5 more categorical attributes randomly chosenfrom the unused attributes We observe that the Uaulmonotonically decreases with increasing number of publicattributes and monotonically increases with increasingnumber of private attributes As expected a higher pro-portion of public attributes leads to less variant betweens(v ∣ q) and sprime(v ∣ q) e results of the same experimentwith PVST-Constructor are shown in Figure 9 Uaul in-creases as increasing number of private attributes as ex-pected However Uaul also increases slightly with increasingnumber of public attributes is is due to the fact that withmore public attributes there will be few tuples that share thesame public attribute values Since PVST-Constructor in-serts only private attribute values from tuples sharing samepublic attribute values more values will be inserted to eachpolymorphic value set which introduces higher utility loss

635 Evaluation of Uaul with Varying Weight RatiosFigures 10 and 11 illustrate the impact of weight ratios on theaverage utility loss e experiment was conducted with a

xed private attribute weight of 1 and varying public at-tribute weights of 1 2 and 3 As expected Uaul of bothPVSV-Constructor and PVST-Constructor decreases as

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 10 Utility loss vs weight ratio

0 1 2 3Weight ratio

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 11 Utility loss vs weight ratio

0 1 2 3|D|105

Ave

rage

top-k

utili

ty lo

ss1

09

08

07

06

05

04

03

02

01

0

Figure 7 Utility loss vs |D|

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

009

008

007

006

005

004

003

002

001

0

Figure 8 Utility loss vs m and mprime

0 5 10 15m or mprime

fix mprime = 10fix m = 10

Ave

rage

top-k

utili

ty lo

ss

1

09

08

07

06

05

04

03

02

01

0

Figure 9 Utility loss vs m and mprime

12 Security and Communication Networks

increasing weight ratio of public attributes)e reason is thatas the public attribute weight increases the part in sprime(v ∣ q)

caused by private attributes decreases )erefore less impactwould be made to sprime(v ∣ q) by PVSV and PVST As a resultsprime(v ∣ q) would be closer to s(v ∣ q) and the utility loss couldbe decreased

7 Conclusions

In this paper we proposed a novel framework that preservesprivacy of private attributes against arbitrary attacks throughthe ranked retrieval model Furthermore we identify twocategories of adversaries based on varying adversarial ca-pabilities For each kind of adversaries we presentedimplementation of our framework Our experimental resultssuggest that our implementations efficiently preserve privacyagainst Rank Inference attack [35] Moreover the imple-mentations significantly reduce utility loss with respect tothe variance in ranked results

It is our hope that this paper can motivate further re-search in privacy preservation of SIoT with consideration ofsocial network features andor variant information retrievalmodels eg text mining

Appendix

A 2-PVSV

In this subsection we prove that constructing an optimal PVSVfor each attribute of a tuple v is an NP-hard problem

Definition A1 For a tuple v in D we create PBj

v for each BjWe say v satisfies query q if the following hold for any tuple t(tne v) in D (1) If s(v ∣ q)lt s(t ∣ q) then sprime(v ∣ q)lt sprime(t ∣ q)and (2) if s(v ∣ q)ge s(t ∣ q) then sprime(v ∣ q)ge sprime(t ∣ q)

For a database containing 2 tuples the 2-PVSV problemcan be redefined as follows given a query workload Q adatabase D construct PB1

v PBmprime

v of size 2 such that v

satisfies the most queries in Q

Definition A2 Max-3Sat Problem given a 3-CNF formulaΦ find the truth assignment that satisfies that most clauses

Lemma A1 Max-3Sat leP 2-PVSV Problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVSV instance Without loss of generality we suppose thatΦ is a conjunction of l clauses and each clause is a dis-junction of 3 literals from set X x1 xn1113864 1113865 We constructdatabase D as follows D has 0 public attributes and n + 1private attributes B1 Bn C1 Let VB

j be the attribute do-main of Bj j isin 1 n and VC

1 be the attribute domain ofC1 Let VB

i minus 1 0 1 2 n + 2 and VC1 0 1 Two

tuples v1 and v2 are inserted into D v1[Bj] 0 and v2[Bj]

j + 2 for j isin 1 n while v1[C1] 0 and v2[C1] 1 Wesimplify the score function defined in (1) by setting all weightsto 1 Also note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null

We construct query workload Q based on Φ For eachclause Lk isin Φ we construct a query qk such that qk[Bi] i iffthe corresponding literal xi is a positive literal in Lk andq[Bi] i + 1 iff xi is a negative literal in the clause Forexample given a clause (x1orx2orx3) the correspondingquery q should satisfy q[B1] 1 q[B2] 3 q[B3] 3 andq[C1] 0 All other attributes in q are set to null by default)erefore s(v1 ∣ q) 1 and s(v2 ∣ q) 0 forallq isin Q

Since s(v2 ∣ q)lt s(v1 ∣ q) forallq isin Q in order to minimizeutility loss the value of sprime(v2 ∣ q) should be as small aspossible We observe that the minimum possible sprime(v2 ∣ q) is1 because PC1

v2must be 0 1 as VC

1 0 1 Without loss ofgenerality we assume that we have already constructedPB1

v2 P

Bmprime

v2 PC1v2

for v2 such that sprime(v2 ∣ q) 1 forallq isin QNow we have an instance of 2-PVSV problem that given

D Q constructs PB1v1

PBmprime

v1 PC1

v1that satisfies the most

queries in Q Since VC1 0 1 PC1

v1must be 0 1 too as

PC1v1

contains two distinct values )erefore for an ar-bitrary query q isin Q we have ρprime(q[C] v1[C]) 1 andρprime(q[C] v2[C]) 1 In order to let v1 satisfy q we have toensure that sprime(v1 ∣ q)gt sprime(v2 ∣ q) 1 )us we have

sprime v1 ∣ q( 1113857gt 1

⟺1113944mprime

i1ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857gt 1

⟺existi isin 1 mprime1113864 1113865 ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857 1

⟺existi isin 1 mprime1113864 1113865 q Bi1113858 1113859 isin PBi

v1

(A1)

Now we show that the solution of 2-PVSV problemconstructed above can answer the corresponding Max-3Sat Problem Suppose that we have the solution PB1

v1

PB

mprimev1 such that v1 satisfies the most queries in Q As in

(A1) if v1 satisfies qk we have qk[Bi] isin PBiv1 If v1 does not

satisfy qk then qk[Bi] notin PBiv1 Recall that we assign value i or

i + 1 to qk[Bi] if xi is a positive or negative literal in clauseLk respectively )erefore the assignment of xi given PBi

v1is

xi true if i isin PBi

v1

xi false if i + 1 isin PBi

v1

(A2)

Furthermore from equation (A1) we have

sprime v1 ∣ q( 1113857gt 1⟺L true (A3)

where L is qrsquos corresponding clause in Φ Note thati i + 1 nsubQ as |PBi

v1| 2 and 0 isin PBi

v1 )us the value of xi is

either true or falseAs we proved above v1 satisfies qi if and only if Li is true

given assignment x1 xl1113864 1113865 constructed according toequation (A2) )erefore x1 xl1113864 1113865 satisfies the mostclauses in ϕ if and only if v1 satisfies the most queries in Q

We now prove that function f can be conducted inpolynomial time Given a formula Φ with n variables and lclauses we construct a 2-PVSV instance with 2 tuples each

Security and Communication Networks 13

of which has n + 1 attributes and l queries each of which has4 attributes )erefore O(2n + 4l) assignments are neededand f can be conducted in polynomial time

Proof of 4eorem 1 We now prove that the 2-PVSVproblem is NP-hard In Lemma 3 we proved that Max-3SatProblem can be reduced to the 2-PVSV problem in poly-nomial time Furthermore as Max-3Sat Problem is a NP-hard problem [33] the 2-PVSV problem is NP-hard

B 2-PVST

In this section we prove that constructing an optimal PVSTfor each private attribute of a tuple v isin D is NP-hard We usethe definition of v satisfying q from (9))e 2-PVSTproblemcan be redefined as follows

Definition B1 2-PVSTproblem given a query workload Qa database D the optimization problem of 2-PVST is toconstruct an arbitrary tuple vrsquos polymorphic setsPB1

v PBmprime

v that satisfies the most queries in Q )e size ofeach polymorphic vale set is 2

Lemma B1 Max-3Sat le P 2-PVST problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVST instance We assume that Φ is a conjunction of lclauses and each clause is a disjunction of 3 literals from setX x1 xn1113864 1113865 Let D have no public attribute and n + 1private attributes B1 Bn C1 For each literal xi we inserttwo tuples vi and vi

prime into D where

vi C11113858 1113859 1 vi Bi1113858 1113859 1 vi Bj1113960 1113961 minus 1foralljne i

viprime C11113858 1113859 1 vi

prime Bi1113858 1113859 minus 1 viprime Bj1113960 1113961 1foralljne i

(B1)

)en we insert tuple v into D where v[C1] 0 andv[Bj] 0 forallj isin 1 mprime We also set the domain of eachBj as minus 1 0 1 and the domain of C1 as 0 1

Without loss of generality the score function defined in(1) is simplified by setting all weights to 1

Next we construct query workload Q For each clauseLk isin Φ we construct a query qk in which qk[Bj] minus 1 iff xj isa positive literal in Lk and qk[Bj] 1 iff xj is a negativeliteral in Lk We also set qk[C1] 1 )e rest of the attributevalues are set to null by default )erefore ifLk (x1orne x2orx3) then we have qk[C1] 1 qk[B1] 1qk[B2] 1 qk[B3] minus 1 and qk[Bj] null forj isin 4 mprime

Note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null )uss(v qk) 0 forallq isin Q and s(vi qk) s(vi

prime qk)ge 1 forallq isin Q Weobserve that for q isin Q and i isin 1 n s(v ∣ q)lt s(vi ∣ q)

and s(v ∣ q)lt s(viprime ∣ q) In order to reduce UQ we have to

maximize the number of q isin Q such that sprime(v ∣ q)lt sprime(vi ∣ q)

and sprime(v ∣ q)lt sprime(viprime ∣ q) )e maximum value of sprime(vi ∣ q)

and sprime(viprime ∣ q) is 4 which can be achieved by inserting vi[Bj]

into PBj

viprime and inserting vi

prime[Bj] into PBj

vi Since PC1

v 0 1 thevalue of sprime(v ∣ q) is at least 1 )erefore we have

sprime(v ∣ q)lt sprime vi ∣ q( 1113857

⟺sprime(v ∣ q)lt 4

⟺1113944

mprime

j1ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873lt 3

⟺existj isin 1 mprime ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 0

⟺existj isin 1 mprime v Bj1113960 1113961ne q Bj1113960 1113961

⟺existj isin 1 mprime minus q Bj1113960 1113961 isin PBj

v

(B2)

Since |PBj

v | is limited to 2 PBj

v 0 1 or 0 minus 1 ie wemust insert either virsquos or vi

primersquos private attribute values intoPB1

v PBmv Recall that we assign minus 1 or 1 to qk[Bj] if xj

is positive or negative respectively in Lk In order toreduce Max-3Sat to 2-PSVT we assign literals in the fol-lowing way

xj true if minus 1 isin PBj

v

xj false if 1 isin PBj

v (B3)

)erefore we denote the corresponding clause of q as Land from equation (B2) we have

existj isin 1 mprime1113864 1113865 minus q Bj1113960 1113961 isin PBj

v

⟺existj isin 1 mprime1113864 1113865 xj true if xj is positive in L

orxj false if xj is negative in L

⟺L true(B4)

From the above equation we observe that v satisfying qk

is equivalent to Lk being true)erefore we can reduceMax-3Sat to 2-PVST Given a solution to the 2-PVSTproblem thatmaximizes the number of queries satisfied by v assignmentx1 xn1113864 1113865 produced by equation (B3) can satisfy thelargest number of clauses in Φ

Given a Max-3Sat instance Φ the construction of the 2-PVST instance can be conducted in polynomial time as weconstruct n queries with 3 attributes and 2n + 1 tuples with nattributes)e transformation from the optimal solution of 2-PVST to the optimal solution of Max-3Sat also takes poly-nomial time as we assign the value to each one of x1 xn

once )erefore f is a polynomial time function

Proof of 4eorem 2 In 4 we proved that a Max-3Sat in-stance can be reduced to a 2-PSVT instance in polynomialtime Furthermore as Max-3Sat Problem is an NP-hardproblem the 2-PVSV problem is an NP-hard problem

Data Availability

)e data used to support the findings of this study areavailable from the corresponding author upon request

14 Security and Communication Networks

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 13: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

increasing weight ratio of public attributes)e reason is thatas the public attribute weight increases the part in sprime(v ∣ q)

caused by private attributes decreases )erefore less impactwould be made to sprime(v ∣ q) by PVSV and PVST As a resultsprime(v ∣ q) would be closer to s(v ∣ q) and the utility loss couldbe decreased

7 Conclusions

In this paper we proposed a novel framework that preservesprivacy of private attributes against arbitrary attacks throughthe ranked retrieval model Furthermore we identify twocategories of adversaries based on varying adversarial ca-pabilities For each kind of adversaries we presentedimplementation of our framework Our experimental resultssuggest that our implementations efficiently preserve privacyagainst Rank Inference attack [35] Moreover the imple-mentations significantly reduce utility loss with respect tothe variance in ranked results

It is our hope that this paper can motivate further re-search in privacy preservation of SIoT with consideration ofsocial network features andor variant information retrievalmodels eg text mining

Appendix

A 2-PVSV

In this subsection we prove that constructing an optimal PVSVfor each attribute of a tuple v is an NP-hard problem

Definition A1 For a tuple v in D we create PBj

v for each BjWe say v satisfies query q if the following hold for any tuple t(tne v) in D (1) If s(v ∣ q)lt s(t ∣ q) then sprime(v ∣ q)lt sprime(t ∣ q)and (2) if s(v ∣ q)ge s(t ∣ q) then sprime(v ∣ q)ge sprime(t ∣ q)

For a database containing 2 tuples the 2-PVSV problemcan be redefined as follows given a query workload Q adatabase D construct PB1

v PBmprime

v of size 2 such that v

satisfies the most queries in Q

Definition A2 Max-3Sat Problem given a 3-CNF formulaΦ find the truth assignment that satisfies that most clauses

Lemma A1 Max-3Sat leP 2-PVSV Problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVSV instance Without loss of generality we suppose thatΦ is a conjunction of l clauses and each clause is a dis-junction of 3 literals from set X x1 xn1113864 1113865 We constructdatabase D as follows D has 0 public attributes and n + 1private attributes B1 Bn C1 Let VB

j be the attribute do-main of Bj j isin 1 n and VC

1 be the attribute domain ofC1 Let VB

i minus 1 0 1 2 n + 2 and VC1 0 1 Two

tuples v1 and v2 are inserted into D v1[Bj] 0 and v2[Bj]

j + 2 for j isin 1 n while v1[C1] 0 and v2[C1] 1 Wesimplify the score function defined in (1) by setting all weightsto 1 Also note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null

We construct query workload Q based on Φ For eachclause Lk isin Φ we construct a query qk such that qk[Bi] i iffthe corresponding literal xi is a positive literal in Lk andq[Bi] i + 1 iff xi is a negative literal in the clause Forexample given a clause (x1orx2orx3) the correspondingquery q should satisfy q[B1] 1 q[B2] 3 q[B3] 3 andq[C1] 0 All other attributes in q are set to null by default)erefore s(v1 ∣ q) 1 and s(v2 ∣ q) 0 forallq isin Q

Since s(v2 ∣ q)lt s(v1 ∣ q) forallq isin Q in order to minimizeutility loss the value of sprime(v2 ∣ q) should be as small aspossible We observe that the minimum possible sprime(v2 ∣ q) is1 because PC1

v2must be 0 1 as VC

1 0 1 Without loss ofgenerality we assume that we have already constructedPB1

v2 P

Bmprime

v2 PC1v2

for v2 such that sprime(v2 ∣ q) 1 forallq isin QNow we have an instance of 2-PVSV problem that given

D Q constructs PB1v1

PBmprime

v1 PC1

v1that satisfies the most

queries in Q Since VC1 0 1 PC1

v1must be 0 1 too as

PC1v1

contains two distinct values )erefore for an ar-bitrary query q isin Q we have ρprime(q[C] v1[C]) 1 andρprime(q[C] v2[C]) 1 In order to let v1 satisfy q we have toensure that sprime(v1 ∣ q)gt sprime(v2 ∣ q) 1 )us we have

sprime v1 ∣ q( 1113857gt 1

⟺1113944mprime

i1ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857gt 1

⟺existi isin 1 mprime1113864 1113865 ρprime q Bi1113858 1113859 v1 Bi1113858 1113859( 1113857 1

⟺existi isin 1 mprime1113864 1113865 q Bi1113858 1113859 isin PBi

v1

(A1)

Now we show that the solution of 2-PVSV problemconstructed above can answer the corresponding Max-3Sat Problem Suppose that we have the solution PB1

v1

PB

mprimev1 such that v1 satisfies the most queries in Q As in

(A1) if v1 satisfies qk we have qk[Bi] isin PBiv1 If v1 does not

satisfy qk then qk[Bi] notin PBiv1 Recall that we assign value i or

i + 1 to qk[Bi] if xi is a positive or negative literal in clauseLk respectively )erefore the assignment of xi given PBi

v1is

xi true if i isin PBi

v1

xi false if i + 1 isin PBi

v1

(A2)

Furthermore from equation (A1) we have

sprime v1 ∣ q( 1113857gt 1⟺L true (A3)

where L is qrsquos corresponding clause in Φ Note thati i + 1 nsubQ as |PBi

v1| 2 and 0 isin PBi

v1 )us the value of xi is

either true or falseAs we proved above v1 satisfies qi if and only if Li is true

given assignment x1 xl1113864 1113865 constructed according toequation (A2) )erefore x1 xl1113864 1113865 satisfies the mostclauses in ϕ if and only if v1 satisfies the most queries in Q

We now prove that function f can be conducted inpolynomial time Given a formula Φ with n variables and lclauses we construct a 2-PVSV instance with 2 tuples each

Security and Communication Networks 13

of which has n + 1 attributes and l queries each of which has4 attributes )erefore O(2n + 4l) assignments are neededand f can be conducted in polynomial time

Proof of 4eorem 1 We now prove that the 2-PVSVproblem is NP-hard In Lemma 3 we proved that Max-3SatProblem can be reduced to the 2-PVSV problem in poly-nomial time Furthermore as Max-3Sat Problem is a NP-hard problem [33] the 2-PVSV problem is NP-hard

B 2-PVST

In this section we prove that constructing an optimal PVSTfor each private attribute of a tuple v isin D is NP-hard We usethe definition of v satisfying q from (9))e 2-PVSTproblemcan be redefined as follows

Definition B1 2-PVSTproblem given a query workload Qa database D the optimization problem of 2-PVST is toconstruct an arbitrary tuple vrsquos polymorphic setsPB1

v PBmprime

v that satisfies the most queries in Q )e size ofeach polymorphic vale set is 2

Lemma B1 Max-3Sat le P 2-PVST problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVST instance We assume that Φ is a conjunction of lclauses and each clause is a disjunction of 3 literals from setX x1 xn1113864 1113865 Let D have no public attribute and n + 1private attributes B1 Bn C1 For each literal xi we inserttwo tuples vi and vi

prime into D where

vi C11113858 1113859 1 vi Bi1113858 1113859 1 vi Bj1113960 1113961 minus 1foralljne i

viprime C11113858 1113859 1 vi

prime Bi1113858 1113859 minus 1 viprime Bj1113960 1113961 1foralljne i

(B1)

)en we insert tuple v into D where v[C1] 0 andv[Bj] 0 forallj isin 1 mprime We also set the domain of eachBj as minus 1 0 1 and the domain of C1 as 0 1

Without loss of generality the score function defined in(1) is simplified by setting all weights to 1

Next we construct query workload Q For each clauseLk isin Φ we construct a query qk in which qk[Bj] minus 1 iff xj isa positive literal in Lk and qk[Bj] 1 iff xj is a negativeliteral in Lk We also set qk[C1] 1 )e rest of the attributevalues are set to null by default )erefore ifLk (x1orne x2orx3) then we have qk[C1] 1 qk[B1] 1qk[B2] 1 qk[B3] minus 1 and qk[Bj] null forj isin 4 mprime

Note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null )uss(v qk) 0 forallq isin Q and s(vi qk) s(vi

prime qk)ge 1 forallq isin Q Weobserve that for q isin Q and i isin 1 n s(v ∣ q)lt s(vi ∣ q)

and s(v ∣ q)lt s(viprime ∣ q) In order to reduce UQ we have to

maximize the number of q isin Q such that sprime(v ∣ q)lt sprime(vi ∣ q)

and sprime(v ∣ q)lt sprime(viprime ∣ q) )e maximum value of sprime(vi ∣ q)

and sprime(viprime ∣ q) is 4 which can be achieved by inserting vi[Bj]

into PBj

viprime and inserting vi

prime[Bj] into PBj

vi Since PC1

v 0 1 thevalue of sprime(v ∣ q) is at least 1 )erefore we have

sprime(v ∣ q)lt sprime vi ∣ q( 1113857

⟺sprime(v ∣ q)lt 4

⟺1113944

mprime

j1ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873lt 3

⟺existj isin 1 mprime ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 0

⟺existj isin 1 mprime v Bj1113960 1113961ne q Bj1113960 1113961

⟺existj isin 1 mprime minus q Bj1113960 1113961 isin PBj

v

(B2)

Since |PBj

v | is limited to 2 PBj

v 0 1 or 0 minus 1 ie wemust insert either virsquos or vi

primersquos private attribute values intoPB1

v PBmv Recall that we assign minus 1 or 1 to qk[Bj] if xj

is positive or negative respectively in Lk In order toreduce Max-3Sat to 2-PSVT we assign literals in the fol-lowing way

xj true if minus 1 isin PBj

v

xj false if 1 isin PBj

v (B3)

)erefore we denote the corresponding clause of q as Land from equation (B2) we have

existj isin 1 mprime1113864 1113865 minus q Bj1113960 1113961 isin PBj

v

⟺existj isin 1 mprime1113864 1113865 xj true if xj is positive in L

orxj false if xj is negative in L

⟺L true(B4)

From the above equation we observe that v satisfying qk

is equivalent to Lk being true)erefore we can reduceMax-3Sat to 2-PVST Given a solution to the 2-PVSTproblem thatmaximizes the number of queries satisfied by v assignmentx1 xn1113864 1113865 produced by equation (B3) can satisfy thelargest number of clauses in Φ

Given a Max-3Sat instance Φ the construction of the 2-PVST instance can be conducted in polynomial time as weconstruct n queries with 3 attributes and 2n + 1 tuples with nattributes)e transformation from the optimal solution of 2-PVST to the optimal solution of Max-3Sat also takes poly-nomial time as we assign the value to each one of x1 xn

once )erefore f is a polynomial time function

Proof of 4eorem 2 In 4 we proved that a Max-3Sat in-stance can be reduced to a 2-PSVT instance in polynomialtime Furthermore as Max-3Sat Problem is an NP-hardproblem the 2-PVSV problem is an NP-hard problem

Data Availability

)e data used to support the findings of this study areavailable from the corresponding author upon request

14 Security and Communication Networks

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 14: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

of which has n + 1 attributes and l queries each of which has4 attributes )erefore O(2n + 4l) assignments are neededand f can be conducted in polynomial time

Proof of 4eorem 1 We now prove that the 2-PVSVproblem is NP-hard In Lemma 3 we proved that Max-3SatProblem can be reduced to the 2-PVSV problem in poly-nomial time Furthermore as Max-3Sat Problem is a NP-hard problem [33] the 2-PVSV problem is NP-hard

B 2-PVST

In this section we prove that constructing an optimal PVSTfor each private attribute of a tuple v isin D is NP-hard We usethe definition of v satisfying q from (9))e 2-PVSTproblemcan be redefined as follows

Definition B1 2-PVSTproblem given a query workload Qa database D the optimization problem of 2-PVST is toconstruct an arbitrary tuple vrsquos polymorphic setsPB1

v PBmprime

v that satisfies the most queries in Q )e size ofeach polymorphic vale set is 2

Lemma B1 Max-3Sat le P 2-PVST problem

Proof We construct a reduction function f(Φ) (D s Q)

which takes a Max-3Sat instance as input and returns a 2-PVST instance We assume that Φ is a conjunction of lclauses and each clause is a disjunction of 3 literals from setX x1 xn1113864 1113865 Let D have no public attribute and n + 1private attributes B1 Bn C1 For each literal xi we inserttwo tuples vi and vi

prime into D where

vi C11113858 1113859 1 vi Bi1113858 1113859 1 vi Bj1113960 1113961 minus 1foralljne i

viprime C11113858 1113859 1 vi

prime Bi1113858 1113859 minus 1 viprime Bj1113960 1113961 1foralljne i

(B1)

)en we insert tuple v into D where v[C1] 0 andv[Bj] 0 forallj isin 1 mprime We also set the domain of eachBj as minus 1 0 1 and the domain of C1 as 0 1

Without loss of generality the score function defined in(1) is simplified by setting all weights to 1

Next we construct query workload Q For each clauseLk isin Φ we construct a query qk in which qk[Bj] minus 1 iff xj isa positive literal in Lk and qk[Bj] 1 iff xj is a negativeliteral in Lk We also set qk[C1] 1 )e rest of the attributevalues are set to null by default )erefore ifLk (x1orne x2orx3) then we have qk[C1] 1 qk[B1] 1qk[B2] 1 qk[B3] minus 1 and qk[Bj] null forj isin 4 mprime

Note that ρ(q[Bj] t[Bj]) 0 if q[Bj] is null )uss(v qk) 0 forallq isin Q and s(vi qk) s(vi

prime qk)ge 1 forallq isin Q Weobserve that for q isin Q and i isin 1 n s(v ∣ q)lt s(vi ∣ q)

and s(v ∣ q)lt s(viprime ∣ q) In order to reduce UQ we have to

maximize the number of q isin Q such that sprime(v ∣ q)lt sprime(vi ∣ q)

and sprime(v ∣ q)lt sprime(viprime ∣ q) )e maximum value of sprime(vi ∣ q)

and sprime(viprime ∣ q) is 4 which can be achieved by inserting vi[Bj]

into PBj

viprime and inserting vi

prime[Bj] into PBj

vi Since PC1

v 0 1 thevalue of sprime(v ∣ q) is at least 1 )erefore we have

sprime(v ∣ q)lt sprime vi ∣ q( 1113857

⟺sprime(v ∣ q)lt 4

⟺1113944

mprime

j1ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873lt 3

⟺existj isin 1 mprime ρprime q Bj1113960 1113961 v Bj1113960 11139611113872 1113873 0

⟺existj isin 1 mprime v Bj1113960 1113961ne q Bj1113960 1113961

⟺existj isin 1 mprime minus q Bj1113960 1113961 isin PBj

v

(B2)

Since |PBj

v | is limited to 2 PBj

v 0 1 or 0 minus 1 ie wemust insert either virsquos or vi

primersquos private attribute values intoPB1

v PBmv Recall that we assign minus 1 or 1 to qk[Bj] if xj

is positive or negative respectively in Lk In order toreduce Max-3Sat to 2-PSVT we assign literals in the fol-lowing way

xj true if minus 1 isin PBj

v

xj false if 1 isin PBj

v (B3)

)erefore we denote the corresponding clause of q as Land from equation (B2) we have

existj isin 1 mprime1113864 1113865 minus q Bj1113960 1113961 isin PBj

v

⟺existj isin 1 mprime1113864 1113865 xj true if xj is positive in L

orxj false if xj is negative in L

⟺L true(B4)

From the above equation we observe that v satisfying qk

is equivalent to Lk being true)erefore we can reduceMax-3Sat to 2-PVST Given a solution to the 2-PVSTproblem thatmaximizes the number of queries satisfied by v assignmentx1 xn1113864 1113865 produced by equation (B3) can satisfy thelargest number of clauses in Φ

Given a Max-3Sat instance Φ the construction of the 2-PVST instance can be conducted in polynomial time as weconstruct n queries with 3 attributes and 2n + 1 tuples with nattributes)e transformation from the optimal solution of 2-PVST to the optimal solution of Max-3Sat also takes poly-nomial time as we assign the value to each one of x1 xn

once )erefore f is a polynomial time function

Proof of 4eorem 2 In 4 we proved that a Max-3Sat in-stance can be reduced to a 2-PSVT instance in polynomialtime Furthermore as Max-3Sat Problem is an NP-hardproblem the 2-PVSV problem is an NP-hard problem

Data Availability

)e data used to support the findings of this study areavailable from the corresponding author upon request

14 Security and Communication Networks

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 15: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

Conflicts of Interest

)e authors declare that they have no conflicts of interestregarding the publication of this paper

Acknowledgments

)is research was partially supported by the National Sci-ence Foundation (grant nos 0852674 0915834 1117297and 1343976) and the Army Research Office (grant noW911NF-15-1-0020)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge andData Engineering vol 17 no 6 pp 734ndash749 2005

[2] C C Aggarwal and S Y Philip ldquoA condensation approach toprivacy preserving data miningrdquo in Proceedings of the In-ternational Conference on Extending Database Technologypp 183ndash199 Springer Heraklion Crete Greece March 2004

[3] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoin ACM Sigmod Record vol 29 no 2 pp 439ndash450 ACM2000

[4] A Alcaide E Palomar J Montero-Castillo and A RibagordaldquoAnonymous authentication for privacy-preserving IoT tar-get-driven applicationsrdquo Computers amp Security vol 37pp 111ndash123 2013

[5] L Atzori A Iera G Morabito and M Nitti ldquo)e socialinternet of things (SIoT)mdashwhen social networks meet theinternet of things concept architecture and network char-acterizationrdquo Computer Networks vol 56 no 16 pp 3594ndash3608 2012

[6] Z Cai Z He X Guan and Y Li ldquoCollective data-sanitizationfor preventing sensitive information inference attacks insocial networksrdquo IEEE Transactions on Dependable and SecureComputing vol 15 no 4 pp 577ndash590 2018

[7] Z Cai and X Zheng ldquoA private and efficient mechanism for datauploading in smart cyber-physical systemsrdquo IEEE Transactionson Network Science and Engineering vol 24 p 1 2018

[8] Z Cai X Zheng and J Yu ldquoA differential-private frameworkfor urban traffic flows estimation via taxi companiesrdquo IEEETransactions on Industrial Informatics vol 17 2019

[9] N Cao C Wang M Li K Ren and W Lou ldquoPrivacy-preserving multi-keyword ranked search over encryptedcloud datardquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 1 pp 222ndash233 2014

[10] Y-C Chang and M Mitzenmacher ldquoPrivacy preservingkeyword searches on remote encrypted datardquo in Proceedingsof the International Conference on Applied Cryptography andNetwork Security pp 442ndash455 Springer New York NY USAJune 2005

[11] C Chen X Zhu P Shen et al ldquoAn efficient privacy-pre-serving ranked keyword search methodrdquo IEEE Transactionson Parallel and Distributed Systems vol 27 no 4 pp 951ndash9632016

[12] K Chen and L Liu ldquoPrivacy preserving data classificationwith rotation perturbationrdquo in Proceedings of the Fifth IEEEInternational Conference on Data Mining (ICDMrsquo05) p 4IEEE Houston TX USA November 2005

[13] V Ciriani S D C Di Vimercati S Foresti S JajodiaS Paraboschi and P Samarati ldquoFragmentation and en-cryption to enforce privacy in data storagerdquo in Proceedings of

the Fifth European symposium on research in computer se-curity pp 171ndash186 Springer Dresden Germany September2007

[14] M Deshpande and G Karypis ldquoItem-based top-N recom-mendation algorithmsrdquo ACM Transactions on InformationSystems vol 22 no 1 pp 143ndash177 2004

[15] S D C di Vimercati R F Erbacher S Foresti S JajodiaG Livraga and P Samarati ldquoEncryption and fragmentationfor data confidentiality in the cloudrdquo in Foundations of Se-curity Analysis and Design VII A Aldini J Lopez andF Martinelli Eds pp 212ndash243 Springer Berlin Germany2013

[16] S D C di Vimercati S Foresti G Livraga S Paraboschi andP Samarati ldquoConfidentiality protection in large databasesrdquo inA Comprehensive Guide through the Italian Database Researchover the Last 25 Years pp 457ndash472 Springer Berlin Ger-many 2018

[17] C Dwork ldquoDifferential privacyrdquo Encyclopedia of Cryptog-raphy and Security pp 338ndash340 2011

[18] C Dwork and A Roth ldquo)e algorithmic foundations ofdifferential privacyrdquo Foundations and Trendsreg in 4eoreticalComputer Science vol 9 no 3-4 pp 211ndash407 2014

[19] S E Fienberg and J McIntyre ldquoData swapping variations ona theme by dalenius and reissrdquo in Privacy in Statistical Da-tabases pp 14ndash29 Springer Berlin Germany 2004

[20] Y Gao T Yan and N Zhang ldquoA privacy-preservingframework for rank inferencerdquo in Proceedings of the IEEESymposium on Privacy-Aware Computing (PAC) pp 180-181IEEE Washington DC USA August 2017

[21] Z He Z Cai and J Yu ldquoLatent-data privacy preserving withcustomized data utility for social network datardquo IEEETransactions on Vehicular Technology vol 67 no 1pp 665ndash673 2017

[22] T Hong S Mei Z Wang and J Ren ldquoA novel verticalfragmentation method for privacy protection based on en-tropy minimization in a relational databaserdquo Symmetryvol 10 no 11 p 637 2018

[23] X Huang R Fu B Chen T Zhang and A Roscoe ldquoUserinteractive internet of things privacy preserved access con-trolrdquo in Proceedings of the International Conference for In-ternet Technology and Secured Transactions pp 597ndash602IEEE London UK December 2012

[24] Y Huo X Fan L Ma X Cheng Z Tian and D ChenldquoSecure communications in tiered 5G wireless networks withcooperative jammingrdquo IEEE Transactions on Wireless Com-munications vol 18 no 6 pp 3265ndash3280 2019

[25] K Liu H Kargupta and J Ryan ldquoRandom projection-basedmultiplicative data perturbation for privacy preserving dis-tributed data miningin latexrdquo IEEE Transactions on knowledgeand Data Engineering vol 18 no 1 pp 92ndash106 2005

[26] N Li T Li and S Venkatasubramanian ldquot-closeness privacybeyond k-anonymity and l-diversityrdquo in Proceedings of theIEEE 23rd International Conference on Data Engineeringpp 106ndash115 IEEE Istanbul Turkey April 2007

[27] A Machanavajjhala D Kifer J Gehrke andM Venkitasubramaniam ldquoL-diversityrdquo ACMTransactions onKnowledge Discovery from Data vol 1 no 1 pp 3ndashes 2007

[28] S Matwin ldquoPrivacy-preserving data mining techniquessurvey and challengesrdquo in Studies in Applied PhilosophyEpistemology and Rational Ethics pp 209ndash221 SpringerBerlin Germany 2013

[29] B McFee and G Lanckriet ldquoMetric learning to rankrdquo inProceedings of the 27th International Conference on MachineLearning (ICMLrsquo10) Haifa Israel June 2010

Security and Communication Networks 15

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 16: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

[30] K Muralidhar and R Sarathy ldquoData shufflingmdasha newmasking approach for numerical datardquo Management Sciencevol 52 no 5 pp 658ndash670 2006

[31] J Nin J Herranz and V Torra ldquoRethinking rank swapping todecrease disclosure riskrdquo Data amp Knowledge Engineeringvol 64 no 1 pp 346ndash364 2008

[32] K Okamoto W Chen and X-Y Li ldquoRanking of closenesscentrality for large-scale social networksrdquo in Frontiers inAlgorithmics pp 186ndash195 Springer Berlin Germany 2008

[33] C H Papadimitriou and M Yannakakis ldquoOptimizationapproximation and complexity classesrdquo Journal of computerand system sciences vol 43 no 3 pp 425ndash440 1991

[34] L Peng R-C Wang X-Y Su and C Long ldquoPrivacy pro-tection based on key-changed mutual authentication protocolin internet of thingsrdquo in Communications in Computer andInformation Science pp 345ndash355 Springer Berlin Germany2013

[35] M F Rahman W Liu S )irumuruganathan N Zhang andG Das ldquoPrivacy implications of database rankingrdquo Pro-ceedings of the VLDB Endowment vol 8 no 10 pp 1106ndash1117 2015

[36] R Schenkel T Crecelius M Kacimi et al ldquoEfficient top-kquerying over social-tagging networksrdquo in Proceedings of the31st Annual International ACM SIGIR Conference on Researchand Development in Information Retrieval pp 523ndash530ACM Singapore July 2008

[37] C Sheng N Zhang Y Tao and X Jin ldquoOptimal algorithmsfor crawling a hidden database in the webrdquo Proceedings of theVLDB Endowment vol 5 no 11 pp 1112ndash1123 2012

[38] D X Song D Wagner and A Perrig ldquoPractical techniquesfor searches on encrypted datardquo in Proceeding 2000 IEEESymposium on Security and Privacy SampP 2000 pp 44ndash55IEEE Berkeley CA USA May 2000

[39] J Su D Cao B Zhao X Wang and I You ldquoePASS anexpressive attribute-based signature scheme with privacy andan unforgeability guarantee for the Internet of)ingsrdquo FutureGeneration Computer Systems vol 33 pp 11ndash18 2014

[40] L Sweeney ldquok-anonymity a Model for Protecting PrivacyrdquoInternational Journal of Uncertainty Fuzziness and Knowl-edge-Based Systems vol 10 no 5 pp 557ndash570 2002

[41] T Tassa A Mazza and A Gionis ldquok-concealment An al-ternative model of k-type anonymityrdquo Transactions on DataPrivacy vol 5 no 1 pp 189ndash222 2012

[42] J H Y F W Wang J C W G K Koperski D LiY L A R N Stefanovic and B X O R Z Dbminer ldquoAsystem for mining knowledge in large relational databasesrdquo inProceedings of the International Conference on KnowledgeDiscovery and Data Mining and Knowledge Discovery(KDDrsquo96) pp 250ndash255 San Diego CA USA August 1996

[43] X Wang J Zhang E M Schooler and M Ion ldquoPerformanceevaluation of attribute-based encryption toward data privacyin the IoTrdquo in Proceedings of the IEEE International Con-ference on Communications (ICC) pp 725ndash730 IEEE SydneyAustralia June 2014

[44] Y Wang and Q Wen ldquoA privacy enhanced dns scheme forthe internet of thingsrdquo in Proceedings of the IET InternationalConference on Communication Technology and Application(ICCTA 2011) )e Institution of Engineering amp TechnologyBeijing China October 2011

[45] Y Xu T Ma M Tang and W Tian ldquoA survey of privacypreserving data publishing using generalization and sup-pressionrdquo Applied Mathematics amp Information Sciencesvol 8 no 3 pp 1103ndash1116 2014

[46] J-C Yang and B-X Fang ldquoSecurity model and key tech-nologies for the internet of thingsrdquo 4e Journal of ChinaUniversities of Posts and Telecommunications vol 18pp 109ndash112 2011

[47] X Yang H Steck Y Guo and Y Liu ldquoOn top-k recom-mendation using social networksrdquo in Proceedings of the sixthACM conference on Recommender systemsmdashRecSysrsquo12pp 67ndash74 ACM Dublin Ireland September 2012

[48] X Zheng Z Cai J Yu C Wang and Y Li ldquoFollow but notrack privacy preserved profile publishing in cyber-physicalsocial systemsrdquo IEEE Internet of 4ings Journal vol 4 no 6pp 1868ndash1878 2017

16 Security and Communication Networks

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 17: Research Article - Hindawi Publishing Corporationdownloads.hindawi.com/journals/scn/2019/5498375.pdf · [4, 34, 44], and attribute-based encryption [39, 43]. ’e features of social

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom