
Structural and Message based Private Friend Recommendation · web.mst.edu/~wjiang/PFR-Tech.pdf · 2012. 4. 25.


Structural and Message based Private Friend Recommendation

Bharath K. Samanthula and Wei Jiang
Department of Computer Science, Missouri S&T
Rolla, Missouri 65409
Email: {bspq8, wjiang}@mst.edu

Abstract

The rapid growth of online social networks has opened new doors for various business applications, such as promoting a new product to customers. In addition, friend recommendation is an important tool for suggesting potential candidates as friends to users in order to enhance the development of the entire network structure. Existing friend recommendation methods utilize the social network structure and/or user profile information. However, these techniques are no longer applicable if the privacy of users is taken into consideration. In this paper, we propose a two-phase private friend recommendation protocol for recommending friends to a given target user based on the network structure as well as the real message interactions between users. Our protocol computes the recommendation scores of all users who are within a radius of $h$ from the target user in a privacy-preserving manner. In addition, we show the practical applicability of our approach through empirical analysis.

1 Introduction

Online social networks [1] such as Facebook and Google+ have been emerging as a new communication service for users to share information. Along this direction, social network analysis [2–4] has emerged as an important tool for many business intelligence applications [5], such as identifying potential customers and promoting items based on their interests. It also helps users to make new friends through social recommendations, thereby providing a medium for users to expand their social connections and share information of interest.

Discovering new friends for a given target user $A$ is equivalent to solving the link prediction [6] problem for $A$ in the corresponding social network. Given a snapshot of the social network, the link prediction problem aims at inferring the new interactions that are likely to happen among its nodes. In our case, the nodes of the social network are the users and an edge between two users indicates a friendship between them. Briefly, friend recommendations can be performed as follows. (i) The social closeness (hereafter referred to as the recommendation score) between $A$ and each potential candidate is computed. (ii) The candidates with the Top-K scores are recommended as new friends to $A$.

Figure 1: A sample social network for Lee with $h = 3$

Recently, Bi-Ru Dai et al. [7] proposed a new friend recommendation algorithm (denoted as CSM, meaning "Combine Structure and Messages") that utilizes the real messages communicated between the users as well as the network structure. However, as users are more concerned about their privacy [8–10], many online social networks have provided various privacy settings for users to keep their data private. In general, users are allowed to keep their friend lists, profile information, etc., as private information. Under this scenario, the computation of recommendation scores is non-trivial. In this paper, we propose a private friend recommendation algorithm based on the similarity metric proposed in [7]. Our method computes the recommendation scores between $A$ and all potential users who are within $h$ hops of $A$ in a privacy-preserving manner. Figure 1 shows a sample network for target user Lee with $h = 3$. In practice, following the six degrees of separation proposed by Stanley Milgram [11], any two persons can get acquainted with each other within six hops in the network (i.e., $1 < h \le 6$).

1.1 Problem Definition

Consider a social network graph $G_s$ with the nodes denoting the users and the (directed) weighted edge between any two nodes denoting the number of real message interactions between them. Since the message interaction can be bi-directional, we take the minimum number of messages, as mentioned in [7, 12], as the actual weight of the edge (denoting the strength of the relationship). A sample of the minimum message interactions between various users (for $h = 3$) in Lee's network is shown in Figure 2. In general, if user $A$ sends $n_1$ messages to $B$ and $B$ sends $n_2$ messages to $A$, then the weight of the edge is taken as $\min(n_1, n_2)$.
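The edge-weight rule above can be illustrated with a short Python sketch. The message log format (a list of sender/receiver pairs) and the function name `edge_weights` are assumptions made for illustration only; the paper does not prescribe how message counts are stored.

```python
from collections import defaultdict

def edge_weights(message_log):
    """Return undirected weights min(n1, n2) from (sender, receiver) records."""
    sent = defaultdict(int)
    for sender, receiver in message_log:
        sent[(sender, receiver)] += 1
    weights = {}
    for (u, v), n1 in sent.items():
        n2 = sent.get((v, u), 0)          # messages in the opposite direction
        weights[frozenset((u, v))] = min(n1, n2)
    return weights

# Hypothetical log: A sends 5 messages to B, B sends 3 to A, A sends 2 to C.
log = [("A", "B")] * 5 + [("B", "A")] * 3 + [("A", "C")] * 2
w = edge_weights(log)
```

In this sketch a one-directional exchange (A to C) gets weight 0, which is what taking the minimum implies.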

For a target user $A$ (for whom we wish to recommend friends), we generate a candidate network with $A$ as the root, in which an edge between two users denotes the (minimum) number of real message interactions between them. Note that the users who are 1-hop away from $A$ are actually his/her friends. In order to generate the candidate network, we have to remove the links between users at the same level. For example, consider Figure 2. We can generate the candidate network by simply removing the link between Hall and Cox (since they are on the same level). The recommendation score (RS) between $A$ and a user $U$ who is $l$ hops ($2 \le l \le h$) away from $A$ in the candidate network is computed as follows [7]:

$$RS(A,U) = \left(\sum_{k}\Big(|P_k(A,U)| * \prod_{i} C(S_{i-1}, S_i)\Big)\right) * \frac{D_U}{TN} \tag{1}$$

where $P_k(A,U)$ denotes all the intermediate users belonging to the $k$th shortest path starting from $A$ (root) to user $U$, $|P_k(A,U)|$ is the total number of messages along path $P_k(A,U)$, and $L(i)$ is the set of all users at level $i$ (i.e., $i$ hops away from $A$). $S_i \in P_k(A,U) \cap L(i)$, for $i = 1, \ldots, l-1$, where $U \in L(l)$. Note that $S_0$ denotes the root user $A$. $C(S_{i-1}, S_i)$ denotes the proportion of messages between users $S_i$ and $S_{i-1}$ to the total number of messages at level $i$. Here, user $S_{i-1}$ is the parent of user $S_i$ in the corresponding candidate network; $D_U$ denotes the degree of $U$ and $TN$ denotes the total number of users in the candidate network.
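A minimal, non-private Python sketch of the candidate network and Equation 1 is given below. It uses a fragment of Lee's network: the weights on Lee-Hall, Lee-Cox, Hall-Butler, Cox-Butler and Butler-Fox are the ones quoted in the paper's worked example, while the remaining weights, the degree of Fox and the network size are assumptions chosen only so that the level totals come to 5 and 13. All function names are hypothetical.

```python
from collections import deque
from fractions import Fraction

# Edge weights are min(n1, n2). Values not quoted in the text are assumptions.
EDGES = {
    ("Lee", "Hall"): 2, ("Lee", "Cox"): 2, ("Lee", "Bell"): 1,
    ("Hall", "Cox"): 1,                               # same-level link, dropped below
    ("Hall", "Ford"): 2, ("Hall", "Butler"): 1,
    ("Cox", "Butler"): 4, ("Cox", "Cole"): 3, ("Bell", "Kelly"): 3,
    ("Butler", "Fox"): 4,
}

def hop_levels(root, edges):
    """Breadth-first hop distance of every user reachable from root."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    level = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    return level

def candidate_network(root, edges):
    """Drop same-level links; return levels and a parent -> {child: weight} map."""
    level = hop_levels(root, edges)
    children = {u: {} for u in level}
    for (u, v), w in edges.items():
        if level[u] == level[v]:
            continue
        parent, child = (u, v) if level[u] < level[v] else (v, u)
        children[parent][child] = w
    return level, children

def shortest_paths(root, target, children):
    """All root-to-target paths; every path in the candidate network is shortest."""
    if root == target:
        return [[root]]
    return [[root] + tail
            for child in children[root]
            for tail in shortest_paths(child, target, children)]

def recommendation_score(root, target, edges):
    level, children = candidate_network(root, edges)
    level_sum = {}                                    # level_sum[i] = M_{i-1,i}
    for parent, kids in children.items():
        for w in kids.values():
            i = level[parent] + 1
            level_sum[i] = level_sum.get(i, 0) + w
    degree = sum(1 for e in edges if target in e)     # D_U, from the full friend list
    total = Fraction(0)
    for path in shortest_paths(root, target, children):
        hops = list(zip(path, path[1:]))
        messages = sum(children[p][c] for p, c in hops)          # |P_k(A,U)|
        proportion = Fraction(1)
        for p, c in hops[:-1]:                                   # i = 1 .. l-1
            proportion *= Fraction(children[p][c], level_sum[level[c]])
        total += messages * proportion
    return total * Fraction(degree, len(level))       # multiply by D_U / TN
```

With these assumed values, Fox has degree 1 and the fragment contains 9 users, so `recommendation_score("Lee", "Fox", EDGES)` evaluates to (94/65) * (1/9).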

When the privacy of users is taken into consideration, the computation of the above recommendation score is not straightforward. More specifically, in this paper, we assume the following private information (PI) for user $U$:

(i). PI 1 - Friendship: The friendship between any two users $U$ and $V$ is not revealed to any other user.

(ii). PI 2 - Strength of Friendship: The weight of the edge between $U$ and $V$, denoted as $C_{U,V}$, is not revealed to users other than $U$ and $V$.

(iii). PI 3 - Degree: The size of the friend list of $U$ is not revealed to other users.

Without loss of generality, let $U_1, \ldots, U_n$ be the set of potential candidates who are at most $l$ hops ($2 \le l \le h$) away from $A$. The goal of this paper is to develop a private friend recommendation protocol, which is formally defined as follows:

$$PFR(A, F(A), U_1, \ldots, U_n) \rightarrow \Gamma \tag{2}$$

where $F(A)$ denotes the friend list of user $A$. $\Gamma$ is defined as:

$$\Gamma = \{\langle \widetilde{RS}(A,U_1), U_1\rangle, \ldots, \langle \widetilde{RS}(A,U_n), U_n\rangle\}$$

Here, $\widetilde{RS}(A,U_j)$ is the new recommendation score for $U_j$, which is related to the actual score $RS(A,U_j)$ (based on Equation 1) as below, for $1 \le j \le n$:

$$\widetilde{RS}(A,U_j) = M_h * TN * RS(A,U_j)$$

$M_h$ is the normalizing factor (more details are given in Section 3) for a user $h$ hops away from $A$, and $TN$ is the number of users in the candidate network (which is a constant). At the end of the PFR protocol, the values of $\widetilde{RS}(A,U_j)$ and $U_j$, for $1 \le j \le n$, are known only to $A$ and the privacy of each user (PI 1, 2, and 3) is preserved.

[Figure 2: Message interaction between different users in Lee's network. Levels shown: Lee; Hall, Cox, Bell; Ford, Butler, Cole, Kelly; Shaw, Ray, Fox, Ryan, Hart, Jones. Edge labels give the minimum number of messages exchanged.]

1.2 Main Contribution

The proposed protocol computes the recommendation scores between a target user $A$ and all potential candidates who are at most $l$ hops (of friendship) away from $A$ in a privacy-preserving manner. The main contributions of this paper are:

• Security - The proposed protocol guarantees that the friend lists, the strengths of friendships, and the friend list sizes of each user are kept private.

• Accuracy - Our protocol computes recommendation scores that are scaled by a constant factor $M_h * TN$; therefore, the relative ordering among the scores is preserved. Hence, our protocol guarantees the same effectiveness as the CSM method [7].

• Efficiency - The computation costs incurred by the internal users in the proposed protocol are negligible; therefore, the proposed protocol is very efficient from the internal users' perspective.

The paper is organized as follows. We discuss the existing related work in Section 2. Section 3 presents the new scoring function along with a running example. The proposed PFR protocol is discussed in Section 4. Section 5 discusses the empirical results based on various parameters. We conclude the paper along with future work in Section 6.

2 Related work

The friend recommendation score between any two given users can be computed based on the network topology and/or the user profile contents [13, 14]. Recently, researchers have focused on developing hybrid friend recommendation algorithms [15, 16] to take advantage of both metrics. As an independent work, Schuchuan et al. [12] proposed a graph-based friend recommendation algorithm using the weighted minimum-message ratio as the scoring metric. This work was later improved in [7] by taking the evolution of the entire network into consideration. Nevertheless, the above methods are not applicable if the privacy of users is taken into consideration.

Ashwin et al. [17] analyzed the trade-offs between accuracy and privacy for private friend recommendation algorithms based on differential privacy. Our work is entirely different from theirs since the security definition in our paper is taken from the well-known secure multiparty computation (SMC) [18].

| Notation | Description |
|---|---|
| HPEnc | An additive homomorphic probabilistic encryption system |
| $T$ | A trusted party (such as the network administrator) |
| $\langle E, D\rangle$ | A pair of HPEnc-based encryption and decryption functions |
| $\langle pk, pr\rangle$ | A public and private key pair corresponding to $\langle E, D\rangle$ |
| $C_{U,V}$ | Minimum number of messages exchanged between $U$ and $V$ |
| $M_l, M'_l$ | Normalization and scalar factors for a user $\in L(l)$ |

Table 1: COMMON NOTATIONS

3 Order Preserving Scoring Function

The original scoring function given in Equation 1 contains a rational factor $C(S_{i-1}, S_i)$ which varies with $i$, for $1 \le i \le l-1$. Since the encryption operations used later work on integer plaintexts, we define here a new scoring function (producing an integer value) based on Equation 1 such that the relative rankings among the final recommendation scores are preserved. Table 1 presents the common notations used extensively in the paper.

3.1 Normalization Factor

Given a snapshot of the network for $A$, we define the normalization factor for a user $l$ hops away from $A$ (where $2 \le l \le h$) as:

$$M_l = \prod_{i=1}^{l-1} M_{i-1,i} \tag{3}$$

where $M_{i-1,i}$, denoting the total number of messages exchanged between the users at $L(i-1)$ and $L(i)$, is as given below.

$$M_{i-1,i} = \sum_{\substack{U \in L(i-1) \\ V \in L(i)}} C_{U,V}$$

We explicitly assume $M_1 = 1$ since users who are 1-hop from $A$ are already friends of $A$. For any two potential candidates $U$ and $V$ who are $l$ hops away from $A$ ($2 \le l \le h$), we observe that $U$ and $V$ have the same normalization factor.


Example 1. Refer to Figure 2 and consider the potential candidate Cole, who is 2 hops away from Lee. Here, $L(0) = \langle Lee\rangle$ and $L(1) = \langle Hall, Cox, Bell\rangle$. Hence, the normalization factor for Cole is $M_2$, which is equal to $M_{0,1} = C_{Lee,Hall} + C_{Lee,Cox} + C_{Lee,Bell} = 5$. Similarly, $M_{1,2} = 13$. By substituting these values in Equation 3, the normalization factor for users at level 3 is $M_3 = \prod_{i=1}^{2} M_{i-1,i} = M_{0,1} * M_{1,2} = 65$.

Observation 1. For any user $U \in L(i-1)$ and $V \in L(i)$, one can observe that the value of $C(U,V)$ is equivalent to $\frac{C_{U,V}}{M_{i-1,i}}$. Therefore, for a user $U$ at level $l$, the rational factor in Equation 1 can be simplified as follows:

$$\prod_{i=1}^{l-1} C(S_{i-1}, S_i) = \prod_{i=1}^{l-1} \frac{C_{S_{i-1},S_i}}{M_{i-1,i}} = \frac{1}{M_l} \prod_{i=1}^{l-1} C_{S_{i-1},S_i}$$
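Equation 3 and Observation 1 can be checked numerically with a few lines of Python. This is only a sketch: the level totals 5 and 13 come from Example 1, the two per-edge counts are the values the paper quotes for the path Lee, Hall, Butler, and the names `LEVEL_SUMS` and `normalization_factor` are hypothetical.

```python
from fractions import Fraction
from math import prod

LEVEL_SUMS = {1: 5, 2: 13}          # M_{0,1} and M_{1,2} from Example 1

def normalization_factor(l, level_sums):
    """M_l = product of M_{i-1,i} for i = 1..l-1 (empty product, i.e. 1, for l = 1)."""
    return prod(level_sums[i] for i in range(1, l))

# Observation 1 on the path Lee -> Hall -> Butler (C_{Lee,Hall} = 2, C_{Hall,Butler} = 1).
path_counts = [2, 1]
lhs = prod(Fraction(c, LEVEL_SUMS[i]) for i, c in enumerate(path_counts, start=1))
rhs = Fraction(prod(path_counts), normalization_factor(3, LEVEL_SUMS))
```

Both `lhs` (the product of per-level proportions) and `rhs` (the product of raw counts divided by $M_3$) come out as 2/65.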

3.2 Scalar Factor

Given a target user $A$ and $h$, we define the scalar factor for a user at level $l$, for $1 \le l \le h$, as follows:

$$M'_l = \frac{M_h}{M_l} = \frac{M_{0,1} * \ldots * M_{h-2,h-1}}{M_{0,1} * \ldots * M_{l-2,l-1}} \tag{4}$$

where $M_l$ is the normalization factor for a user belonging to $L(l)$. In addition, we observe that $M'_l$ is the same for all users who are at the same level $l$. Furthermore, when $l = h$, we have $M'_h = 1$. Similarly, we have $M'_1 = M_h$. From Figure 2, the scalar factor for Cole is given by:

$$M'_2 = \frac{M_3}{M_2} = \frac{M_{0,1} * M_{1,2}}{M_{0,1}} = M_{1,2} = 13$$
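Because the common factors in Equation 4 cancel, the scalar factor is always an integer and equals the product of the remaining level totals. A small Python sketch under that reading (the function name `scalar_factors` and the dictionary layout are assumptions):

```python
from math import prod

def scalar_factors(level_sums, h):
    """level_sums[i] holds M_{i-1,i}; returns {l: M'_l} for l = 1..h.

    M'_l = M_h / M_l = product of M_{i-1,i} for i = l..h-1.
    """
    return {l: prod(level_sums[i] for i in range(l, h)) for l in range(1, h + 1)}

factors = scalar_factors({1: 5, 2: 13}, h=3)   # level totals from Example 1
```

For Lee's network with $h = 3$ this gives $M'_1 = 65$, $M'_2 = 13$ and $M'_3 = 1$, matching the values stated above.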

Definition 1. For any given target user $A$ and potential candidate $U$ who is $l$ hops away from $A$, we define the new scoring function (denoted as $\widetilde{RS}(A,U)$) as follows:

$$\widetilde{RS}(A,U) = M'_l * \left(\sum_{k}\Big(|P_k(A,U)| * \prod_{i} C_{S_{i-1},S_i}\Big)\right) * D_U \tag{5}$$

Note that $C_{S_{i-1},S_i}$ is the number of messages exchanged between parent user $S_{i-1}$ and $S_i$ on the $k$th shortest path from $A$ to $U$, for $i = 1, \ldots, l-1$. Based on Equation 4 and Observation 1, we can rewrite Equation 5 as below.

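$$\begin{aligned}
\widetilde{RS}(A,U) &= \frac{M_h}{M_l} \left(\sum_{k}\Big(|P_k(A,U)| * \prod_{i} C_{S_{i-1},S_i}\Big)\right) * D_U \\
&= M_h * \left(\sum_{k}\Big(|P_k(A,U)| * \prod_{i} \frac{C_{S_{i-1},S_i}}{M_{i-1,i}}\Big)\right) * D_U \\
&= M_h * TN * \left(\sum_{k}\Big[|P_k(A,U)| * \prod_{i} C(S_{i-1}, S_i)\Big]\right) * \frac{D_U}{TN} \\
&= M_h * TN * RS(A,U)
\end{aligned}$$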

The values of $M_h$ and $TN$ are constants for any given snapshot of the social network (for a fixed $h$). Therefore, the relative ordering among the recommendation scores of the potential candidates based on Equation 5 is preserved. That is, for any two potential users $U$ and $V$, if $RS(A,U) > RS(A,V)$, then the new scoring function guarantees that $\widetilde{RS}(A,U) > \widetilde{RS}(A,V)$ for any fixed $h$ and $A$.


3.3 Computation of Similarity Score

Refer to Figure 2 and let us consider the case of computing the recommendation score between Lee and Fox. Here, Fox has two shortest paths from Lee: $P_1(Lee, Fox) = \{Lee, Hall, Butler, Fox\}$ and $P_2(Lee, Fox) = \{Lee, Cox, Butler, Fox\}$. The total (minimum) number of messages along the first path, i.e., $|P_1(Lee, Fox)|$, is 7. Similarly, $|P_2(Lee, Fox)| = 10$. Along $P_1(Lee, Fox)$, we have two internal users, Hall and Butler, who are respectively 1 and 2 hops away from Lee. In addition, we have $C_{Lee,Hall} = 2$ and $C_{Hall,Butler} = 1$. Similarly, for the path $P_2(Lee, Fox)$, we have $C_{Lee,Cox} = 2$ and $C_{Cox,Butler} = 4$. Since Fox is 3 hops away from Lee, her scaling factor $M'_3$ is 1. By substituting the above values in Equation 5, we have:

$$\widetilde{RS}(Lee, Fox) = 1 * [7 * 2 * 1 + 10 * 2 * 4] * D_{Fox} = 94 * D_{Fox}$$

Whereas, the actual score from Equation 1 is:

$$RS(Lee, Fox) = \left[7 * \frac{2}{5} * \frac{1}{13} + 10 * \frac{2}{5} * \frac{4}{13}\right] * \frac{D_{Fox}}{TN} = \frac{1}{65} * 94 * \frac{D_{Fox}}{TN}$$

where $D_{Fox}$ is the degree (size of the friend list) of Fox and $TN$ denotes the size of the candidate network. It is clear that $\widetilde{RS}(Lee, Fox) = M_h * TN * RS(Lee, Fox)$, where $M_h = 65$.
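The arithmetic of this example, and the claim that the two scores differ only by the constant $M_h * TN$, can be reproduced with the following Python sketch. The path data are the numbers quoted above. The degree of Fox is not stated in the text, so `degree = 3` is an assumption, and `tn = 14` assumes that all 14 users named in Figure 2 belong to the candidate network.

```python
from fractions import Fraction

# One tuple per shortest path: (|P_k|, [C_{S_{i-1},S_i} for i = 1..l-1]).
PATHS = [(7, [2, 1]), (10, [2, 4])]
LEVEL_SUMS = [5, 13]                 # M_{0,1}, M_{1,2}

def integer_score(paths, scalar, degree):
    """Equation 5: integer-valued score scaled by M'_l."""
    total = 0
    for messages, counts in paths:
        term = messages
        for c in counts:
            term *= c
        total += term
    return scalar * total * degree

def original_score(paths, level_sums, degree, tn):
    """Equation 1: rational score using per-level message proportions."""
    total = Fraction(0)
    for messages, counts in paths:
        term = Fraction(messages)
        for c, m in zip(counts, level_sums):
            term *= Fraction(c, m)
        total += term
    return total * Fraction(degree, tn)

degree, tn = 3, 14                   # assumed values, for illustration only
new = integer_score(PATHS, scalar=1, degree=degree)
old = original_score(PATHS, LEVEL_SUMS, degree, tn)
```

Here `new` equals `94 * degree`, and multiplying `old` by $M_h * TN = 65 * 14$ recovers the same integer.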

4 Proposed Protocol

In this section, we present our private friend recommendation (PFR) protocol, which computes the recommendation scores between the target user $A$ and all potential candidates who are at most $h$ hops ($h > 1$) away from $A$ based on Equation 5. We explicitly make the following assumptions:

1. If $U \in F(V)$, then $V \in F(U)$, and $C_{U,V}$ is known only to $U$ and $V$. We assume $F(A) = \langle B_1, \ldots, B_m\rangle$.

2. Each user has a unique user ID (for example, a Facebook user ID is generally at most a 128-bit integer).

3. There exists a third party $T$ (e.g., the network administrator) who generates a pair of encryption and decryption functions $\langle E, D\rangle$ for $A$ based on an additive homomorphic and probabilistic encryption scheme (such as the Paillier cryptosystem [19]). Let $pk$ and $pr$ be the corresponding public and private key pair such that $pr$ is known only to $T$ and $pk$ is public. In addition, let $N$ be the group size (usually of 1024 bits). For any two given plaintexts $m_1, m_2 \in \mathbb{Z}_N$, the HPEnc system exhibits the following properties:

(a) Homomorphic Addition: $E_{pk}(m_1 + m_2) \leftarrow E_{pk}(m_1) * E_{pk}(m_2) \bmod N^2$;

(b) Homomorphic Multiplication: $E_{pk}(m_1 * m_2) \leftarrow E_{pk}(m_2)^{m_1} \bmod N^2$;

(c) Semantic Security: The encryption scheme is semantically secure as defined in [20, 21]. Briefly, given a set of ciphertexts, an adversary cannot deduce any additional information about the plaintexts.
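Properties (a) and (b) can be observed on a toy Paillier instance. The sketch below is not a secure implementation: the primes are tiny values chosen only so the example runs instantly, the generator is fixed to $g = N + 1$, and the function names are hypothetical. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random

def keygen(p=293, q=433):
    """Toy Paillier keys with g = n + 1. The small primes are an assumption for the demo."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)                                # inverse of lam modulo n
    return n, (lam, mu)

def encrypt(n, m):
    """Probabilistic encryption: c = (n + 1)^m * r^n mod n^2 with random r."""
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    """m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) / n."""
    lam, mu = priv
    l_value = (pow(c, lam, n * n) - 1) // n
    return l_value * mu % n

n, priv = keygen()
n2 = n * n
c_sum = encrypt(n, 7) * encrypt(n, 5) % n2      # property (a): encrypts 7 + 5
c_scaled = pow(encrypt(n, 7), 6, n2)            # property (b): encrypts 6 * 7
```

Decrypting `c_sum` and `c_scaled` with the private key yields 12 and 42 respectively, even though neither sum nor product was computed on plaintexts.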

To generate the candidate network, we need to omit the messages between users who are at the same level. For example, in Figure 2, we should not consider $C_{Hall,Cox}$ for computing the recommendation scores in the PFR protocol (as mentioned in [7, 12]). Thus, to explicitly generate the candidate network, we include an initialization step as follows. Initially, $A$ generates a counter $t = h - 1$ and passes it over to his/her friends. Upon receiving the counter, each intermediate user $U$ stores the value of the received counter (locally) and also stores the parent user who sent the counter to $U$ (denoted as $Pr(U)$). After this, $U$ decrements the counter by 1 and sends it to his/her friends. This process continues until the users at $h$ hops from $A$ receive a counter of $t = 0$. Since a user can receive multiple counter values, we have the following observations.

Observation 2. Consider a user $U$, who is $l$ hops away from $A$ with $1 \le l \le h$, receiving multiple $t$ values. We address the following two cases:

Case 1: If the values of the counters are the same, then $U$ has multiple shortest paths (with the parents of $U$ on the same level). In this case, $U$ considers one of the parents (which can be chosen randomly) as the actual parent $Pr(U)$, and any further communication happens only with that parent. For example, in Figure 2, Hart receives $t = 0$ from both Cole and Kelly. Therefore, he can pick one of them, say Kelly, as $Pr(U)$.

Case 2: $U$ receives different values of $t$, which happens when $U$ receives counters from parents who are at different levels. In this case, $U$ selects one of the parent users who sent the maximum $t$ value as $Pr(U)$. In the PFR protocol, the child users of $U$ (denoted as $Ch(U)$) are the users belonging to $F(U) - R(U)$, where $R(U)$ is the set of users who have sent a counter value to $U$. The important observation here is that $U$ omits the messages exchanged with the users who have sent smaller counter values (and also discards the corresponding counters). This further implies that $U$ considers only the messages exchanged between him/her and either $Pr(U)$ or $Ch(U)$ (therefore forming a candidate network by omitting messages with users on the same level). An example of this case is user Cox (refer to Figure 2). Here, Cox receives $t = 2$ and $t = 1$ from Lee and Hall respectively. Therefore, Cox treats Lee as the actual parent user and omits $C_{Cox,Hall}$ for the rest of the computation (and also discards the counter $t = 1$ received from Hall).
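The initialization step and Observation 2 can be simulated with a round-based Python sketch. It makes several assumptions that the text leaves open: messages are delivered in synchronous rounds, a user does not send the counter back to anyone in $R(U)$, ties between equally good parents are broken alphabetically rather than randomly, and the friendship edges form a fragment of Figure 2 chosen for illustration. The function names are hypothetical.

```python
def friend_lists(edges):
    """Build symmetric friend lists F(U) from undirected friendship edges."""
    friends = {}
    for u, v in edges:
        friends.setdefault(u, set()).add(v)
        friends.setdefault(v, set()).add(u)
    return friends

def initialize(root, friends, h):
    """Return (t, Pr, Ch) for every user the counter reaches."""
    t = {root: h}                            # root acts as if it held h, so friends get h - 1
    parent = {root: None}
    senders = {u: set() for u in friends}    # R(U): users who sent U a counter
    frontier = [root]
    while frontier:
        inbox = {}
        for u in frontier:
            if t[u] == 0:
                continue                     # users at hop h stop forwarding
            for v in friends[u] - senders[u]:
                inbox.setdefault(v, []).append((t[u] - 1, u))
        frontier = []
        for v, messages in inbox.items():
            senders[v].update(sender for _, sender in messages)
            if v not in t:                   # later (smaller) counters are discarded
                best = max(counter for counter, _ in messages)
                t[v] = best
                parent[v] = min(s for c, s in messages if c == best)
                frontier.append(v)
    children = {u: (friends[u] - senders[u]) if t[u] > 0 else set() for u in t}
    return t, parent, children

EDGES = [("Lee", "Hall"), ("Lee", "Cox"), ("Lee", "Bell"), ("Hall", "Cox"),
         ("Hall", "Ford"), ("Hall", "Butler"), ("Cox", "Butler"),
         ("Cox", "Cole"), ("Bell", "Kelly"), ("Butler", "Fox")]
t, parent, children = initialize("Lee", friend_lists(EDGES), h=3)
```

In this run Cox keeps $t = 2$ with Lee as parent and drops the counter later received from Hall, so Hall and Cox do not appear in each other's child sets.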

At the end of the initialization step, based on Observation 2, each internal user $U$ has the values of $t$, $pk$, $Pr(U)$ and $Ch(U)$. Apart from the above initialization step, the proposed PFR protocol mainly consists of the following two phases:

Phase 1 - Secure Computation of Scalar Factors: During Phase 1, $A$ computes the list of encrypted scalar factors (denoted as $\Phi$, with $\Phi_{l-1}$ denoting the encrypted scalar factor for level $l$, where $2 \le l \le h$) in a privacy-preserving manner. This phase utilizes a secure multiplication protocol as a building block (needed only if $h > 3$, as in Algorithm 1). At the end, only $A$ knows $\Phi$ and nothing is revealed to other users.

Phase 2 - Secure Computation of Recommendation Scores: Following Phase 1, $A$ (with $\Phi$ as input), $T$, and the other internal users jointly compute the recommendation scores of all potential candidates who are $l$ hops away from $A$, where $2 \le l \le h$. This phase utilizes a secure multiplication and addition protocol as a building block. The final recommendation scores and the corresponding user IDs are revealed only to $A$ and nothing is revealed to other users.

Algorithm 1 PFR
Require: $pr$ is private to $T$, $h$ is private to $A$, and $pk$ is public; $U_j$ knows $t$, $Pr(U_j)$ and $Ch(U_j)$ from the initialization step
{Steps 1 - 7 performed by $U_j$ with $t \ge 2$}
 1: $s \leftarrow |Ch(U_j)|$
 2: $X_{U_j} \leftarrow E_{pk}(\sum_{i=1}^{s} C_{U_j,V_i})$, where $V_i \in Ch(U_j)$
 3: $L_{U_j}[t-1] \leftarrow X_{U_j}$
 4: if $t > 2$ and $U_j$ received $L_{V_i}$ from $V_i$ then
 5:     $L_{U_j}[k] \leftarrow \prod_{i=1}^{s} L_{V_i}[k] \bmod N^2$, for $1 \le k \le t-2$
 6: end if
 7: send $L_{U_j}$ to $Pr(U_j)$
{Steps 8 - 16 performed by $A$ and $T$}
 8: $\Phi_{h-1} = E_{pk}(1)$
 9: if $h \ge 3$ then
10:     $L_A[k] \leftarrow \prod_{i=1}^{m} L_{B_i}[k] \bmod N^2$, for $1 \le k \le h-2$
11:     if $h = 3$ then
12:         $\Phi_1 \leftarrow L_A$
13:     else
14:         Compute $\Phi$ using $L_A$ as input to the SMP protocol
15:     end if
16: end if
{Steps 17 - 21 performed by $A$}
17: for all $B_i \in Ch(A)$ do
18:     $\alpha_1 \leftarrow E_{pk}(C_{A,B_i})$
19:     $\alpha_i \leftarrow \Phi_{i-1}^{C_{A,B_i}} \bmod N^2$, for $2 \le i \le h$
20:     send $A$, $\Phi$, and $\alpha$ to $B_i$
21: end for
{Steps 22 - 36 performed by $U_j$}
22: receive $A$, $\Phi$, and $\alpha$ from $Y = Pr(U_j)$
23: if $U_j \in Pr(A)$ then
24:     send $A$, $\Phi$ and $\alpha$ to each $V_i \in Ch(U_j)$
25: else
26:     compute $\beta_j \leftarrow \alpha_1^{D_{U_j}} \bmod N^2$
27:     compute $\gamma_j \leftarrow \alpha_2 * \Phi_1^{C_{Y,U_j}} \bmod N^2$
28:     $Z_j \leftarrow \{E_{pk}(U_j), \langle\beta_j, \gamma_j\rangle\}$
29:     send $Z_j$ to $A$
30: end if
31: if $t > 0$ then
32:     $\Phi_i \leftarrow \Phi_{i+1}$, for $1 \le i \le t$
33:     $\alpha_1 \leftarrow \alpha_1^{C_{Y,U_j}} \bmod N^2$
34:     $\alpha_i \leftarrow \alpha_{i+1} * \Phi_{i-1}^{C_{Y,U_j}} \bmod N^2$, for $2 \le i \le t+1$
35:     send $A$, $\Phi$ and $\alpha$ to each $V_i \in Ch(U_j)$
36: end if
{Step 37 performed by $A$ and $T$}
37: $(\widetilde{RS}(A,U_j), U_j) \leftarrow SMP_A(Z_j)$, for each $Z_j$
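To see why the two shares produced in steps 26 and 27 of Algorithm 1 recover Equation 5, the steps can be traced on plaintexts for a candidate two hops from $A$. In the sketch below a ciphertext product is replaced by a plaintext sum and a ciphertext exponentiation by a plaintext product, following the homomorphic properties assumed earlier. It also assumes that step 37 multiplies the two shares of a path. The function name and the example values for $C_{Cox,Cole}$ and the degree are hypothetical.

```python
def partial_scores_level2(c_ab, c_bu, degree_u, scalar_2):
    """Plaintext trace of steps 18-19 and 26-27 for a candidate U two hops from A.

    c_ab = C_{A,B}, c_bu = C_{B,U}, where B is the friend of A on the path to U.
    """
    phi_1 = scalar_2                  # plaintext inside Phi_1 = E(M'_2)
    alpha_1 = c_ab                    # step 18: E(C_{A,B})
    alpha_2 = phi_1 * c_ab            # step 19: Phi_1 ^ C_{A,B}
    beta = alpha_1 * degree_u         # step 26: alpha_1 ^ D_U
    gamma = alpha_2 + phi_1 * c_bu    # step 27: alpha_2 * Phi_1 ^ C_{B,U}
    return beta, gamma

# Cole via Cox: C_{Lee,Cox} = 2 and M'_2 = 13 are from the text;
# C_{Cox,Cole} = 3 and D_Cole = 2 are assumed for illustration.
beta, gamma = partial_scores_level2(c_ab=2, c_bu=3, degree_u=2, scalar_2=13)
score = beta * gamma
equation_5 = 13 * ((2 + 3) * 2) * 2   # M'_2 * (|P| * C_{A,B}) * D_U
```

Here `gamma` carries $M'_2 * |P_k(A,U)|$ and `beta` carries $C_{A,B} * D_U$, so their product matches Equation 5 for a single shortest path.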

At the beginning of the protocol, $A$ chooses the value of $h$ and executes the initialization step as explained earlier (see footnote 1). Then, during Phase 1, $A$ decides whether there is a need to take the help of other users in order to generate $\Phi$. If $h = 2$, $A$ computes $\Phi$ locally. Otherwise, for $h > 2$, $A$ computes $\Phi$ with the help of internal users. After this, during Phase 2, $A$ sends the necessary information to $B_i$ along with his/her user ID and $\Phi$, for $1 \le i \le m$. Then, each intermediate user $U_j$ receives the necessary information from $Pr(U_j)$, generates his/her encrypted partial scores (see footnote 2) only if $U_j$ is not already a friend of $A$, and sends the encrypted partial scores to $A$. In addition, if the value of $t$ (stored during the initialization step) of $U_j$ is greater than 0, he/she computes the necessary information and sends it to his/her corresponding child friends. After receiving all the encrypted partial scores, $A$ and $T$ engage in a secure multiplication and addition protocol to compute the recommendation score for each potential candidate $U_j$. At the end of this step, only $A$ knows the user IDs of all potential friends along with their recommendation scores (computed based on Equation 5). The overall steps involved in the PFR protocol are highlighted in Algorithm 1. Now, we discuss the steps involved in each of the two phases in detail.

4.1 Phase 1 - Secure Computation of Scalar Factors:

To start with, $A$ decides the value of $h$, i.e., the radius (number of hops) up to which the potential candidates are explored. As mentioned earlier, the value of $h$ is at most 6 in practice and is always greater than 1 [11]. If the value of $h$ is 2, then only the child friends of $A$'s friends are considered as the potential candidates. Since the scalar factor for users at $l = 2$ is then $M'_2 = 1$, $A$ simply sets $\Phi_1 = E_{pk}(1)$ for security reasons. When $h > 2$, $A$ does not have the necessary information to compute the encryption of the scalar factors (such as $M'_3$), since the potential candidates can belong to any $L(l)$, where $2 \le l \le h$. Therefore, when $h > 2$, $A$ computes $\Phi$ with the help of the internal users who are at most $h - 2$ hops away from $A$. We observe that the users who are at most $h - 2$ hops away from $A$ are sufficient to generate the encryption of all scalar factors because the partial sums of $M_{h-2,h-1}$ are known to the users belonging to $L(h-2)$. Note that, irrespective of the value of $h$, $\Phi_{h-1} = E_{pk}(1)$ always holds. Phase 1 involves steps 1 to 16 as shown in Algorithm 1.

In order to compute $\Phi$ for $h > 2$, $A$ simply waits for the internal users with $t \ge 2$ to send in the aggregated data. To start with, each internal user $U_j$ (including $B_i$) performs the following operations based on the value of $t$:

1. Computes $X_{U_j} = E_{pk}(\sum_{i=1}^{s} C_{U_j,V_i})$, where $V_i$ is a child friend of $U_j$ and $s = |Ch(U_j)|$.

2. Creates a new vector $L_{U_j}$ of size $t - 1$ and sets $L_{U_j}[t-1]$ to $X_{U_j}$.

3. If $t > 2$, $U_j$ receives $L_{V_i}$ from $V_i$ and updates $L_{U_j}$ by aggregating the $L_{V_i}$ component-wise:

$$L_{U_j}[k] = \prod_{i=1}^{s} L_{V_i}[k] \bmod N^2, \text{ for } 1 \le k \le t-2$$

4. Sends $L_{U_j}$ to $Pr(U_j)$.

The above process forwards the aggregated data at each internal user in a bottom-up fashion (in encrypted form). At the end, $A$ receives $L_{B_i}$ from $B_i$, for $1 \le i \le m$. After this, $A$ generates the final aggregated encrypted list ($L_A$) and proceeds as follows:

1. $L_A[k] = \prod_{i=1}^{m} L_{B_i}[k] \bmod N^2$, for $1 \le k \le |L_{B_i}|$, where $L_{B_i}$ denotes the aggregated list received from $B_i$. Observe that $|L_{B_i}| = h - 2$, for $1 \le i \le m$.

2. Assigns the encrypted scalar factor for level $h$ as $\Phi_{h-1} = E_{pk}(1)$. If $h = 3$, sets $\Phi_1 \leftarrow L_A[1]$. Else, let $L_A = \langle E_{pk}(x_1), \ldots, E_{pk}(x_{h-2})\rangle$. Using the secure multiplication (SMP) protocol, as shown in Algorithm 2, $A$ and $T$ jointly compute $\Phi$ using $L_A$ as below.

$$\Phi_l \leftarrow E_{pk}\Big(\prod_{j=1}^{h-l-1} x_j\Big) \bmod N^2, \text{ for } 1 \le l \le h-2$$

Footnote 1: Note that $h$ is always greater than 1, because $h = 1$ denotes the users who are 1-hop away from $A$, who are already friends of $A$.

Footnote 2: Note that the number of encrypted partial scores sent by each potential candidate $U_j$ depends on the actual number of shortest paths from $A$ to $U_j$, which is directly determined by the underlying network structure (which varies for each target user $A$).

Algorithm 2 SMP$(E_{pk}(a), E_{pk}(b)) \rightarrow E_{pk}(a * b)$
Require: $A$ has $E_{pk}(a)$ and $E_{pk}(b)$
1: $A$:
   (a). Pick two random numbers $r_a, r_b \in \mathbb{Z}_N$
   (b). $z_a \leftarrow E_{pk}(a) * E_{pk}(r_a) \bmod N^2$
   (c). $z_b \leftarrow E_{pk}(b) * E_{pk}(r_b) \bmod N^2$; send $z_a, z_b$ to $T$
2: $T$:
   (a). Receive $z_a$ and $z_b$ from $A$
   (b). $u_a \leftarrow D_{pr}(z_a)$; $u_b \leftarrow D_{pr}(z_b)$
   (c). Compute $u = u_a * u_b \bmod N$
   (d). $v \leftarrow E_{pk}(u)$; send $v$ to $A$
3: $A$:
   (a). Receive $v$ from $T$
   (b). $s \leftarrow v * E_{pk}(a)^{N - r_b} \bmod N^2$
   (c). $s' \leftarrow s * E_{pk}(b)^{N - r_a} \bmod N^2$
   (d). $E_{pk}(a * b) \leftarrow s' * E_{pk}(r_a * r_b)^{N-1} \bmod N^2$

The SMP protocol is one of the basic building blocks in the field of secure multiparty computation (SMC) [18]. The basic concept of the SMP protocol is based on the following property, which holds for any given $a, b \in \mathbb{Z}_N$:

$$a * b = (a + r_1) * (b + r_2) - a * r_2 - b * r_1 - r_1 * r_2 \tag{6}$$

where $r_1$ and $r_2$ are random values (corresponding to $r_a$ and $r_b$ in Algorithm 2) and the arithmetic operations are performed under $\mathbb{Z}_N$. Briefly, given that $A$ has input $E_{pk}(a)$ and $E_{pk}(b)$, $A$ and $T$ jointly compute $E_{pk}(a * b)$. The output of the SMP protocol, that is, $E_{pk}(a * b)$, is known only to $A$, and the values of $a$ and $b$ are not revealed to either $A$ or $T$.
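The three steps of the SMP protocol can be exercised end to end on a toy Paillier instance. The sketch below runs both parties inside one function purely for illustration, uses tiny assumed primes with $g = N + 1$, and is not a secure implementation. All names are hypothetical. It requires Python 3.8 or later.

```python
import math
import random

P, Q = 293, 433                              # toy primes (assumption); far too small for real use
N, N2 = P * Q, (P * Q) ** 2
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)

def enc(m):
    """Paillier encryption with g = N + 1."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return pow(N + 1, m % N, N2) * pow(r, N, N2) % N2

def dec(c):
    """Decryption; in the protocol only T holds LAM and MU."""
    return (pow(c, LAM, N2) - 1) // N * MU % N

def smp(ea, eb):
    """Return an encryption of a * b given encryptions of a and b."""
    # Step 1 (A): blind both inputs with random r_a, r_b.
    ra, rb = random.randrange(N), random.randrange(N)
    za = ea * enc(ra) % N2
    zb = eb * enc(rb) % N2
    # Step 2 (T): decrypt the blinded values, multiply, re-encrypt.
    v = enc(dec(za) * dec(zb) % N)
    # Step 3 (A): remove a*r_b, b*r_a and r_a*r_b homomorphically (Equation 6).
    s = v * pow(ea, N - rb, N2) % N2
    s = s * pow(eb, N - ra, N2) % N2
    return s * pow(enc(ra * rb % N), N - 1, N2) % N2

product = dec(smp(enc(6), enc(7)))
```

In this sketch $T$ only ever sees $a + r_a$ and $b + r_b$ modulo $N$, and $A$ only handles ciphertexts, which mirrors the roles described above.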

Theorem 1. The output of Phase 1 is the list of encrypted scalar factors (in order) for each level. That is,

$$\Phi_l = E_{pk}(M'_{l+1}), \text{ for } 1 \le l \le h-1$$

where $M'_{l+1}$ is the scalar factor for users at $l + 1$ hops away from $A$.

Proof. For $h = 2$, it is clear that $M'_2 = 1$ and $\Phi_1 = E_{pk}(1) = E_{pk}(M'_2)$. Note that, irrespective of the value of $h$, we always have $\Phi_{h-1} = E_{pk}(1) = E_{pk}(M'_h)$. When $h \ge 3$, initially the internal user $X$ with $t = 2$ (denoting level $h - 2$) sends $L_X = E_{pk}(\sum_{i=1}^{|Ch(X)|} C_{X,Y_i})$ to $Z$, where $Y_i \in Ch(X)$ and $Z = Pr(X)$. Then, $Z$ aggregates the data received from $Ch(Z)$. Without loss of generality, let $Z$ receive $L_{X_1}, \ldots, L_{X_d}$, where $X_i \in Ch(Z)$. Then, the aggregated entry in $L_Z$ is $L_Z[1] = L_{X_1}[1] * \ldots * L_{X_d}[1]$. In addition, $Z$ sets $L_Z[2] = E_{pk}(\sum_{i=1}^{|Ch(Z)|} C_{Z,X_i})$. Since we are aggregating data component-wise, the $l$th component in $L_Z$ is equivalent to the encryption of the sum of the (minimum) number of messages exchanged between users at $L(h-l-1)$ and $L(h-l)$ under the sub-tree of $Z$. (Note that, following from Observation 2, if $X_i$ has multiple parents, then he/she will send $L_{X_i}$ only to the actual parent user $Pr(X_i)$.) This aggregation process continues at each level in a bottom-up fashion. Finally, when $A$ computes $L_A$ (by aggregating the $L_{B_i}$'s component-wise, for $1 \le i \le m$), we observe that the $l$th component in $L_A$ is equivalent to the encryption of the sum of the (minimum) number of messages exchanged between users at $L(h-l-1)$ and $L(h-l)$, that is, $L_A[l] = E_{pk}(M_{h-l-1,h-l})$, for $1 \le l \le h-2$. In addition, if $L_A = \langle E_{pk}(x_1), \ldots, E_{pk}(x_{h-2})\rangle$, then $x_l = M_{h-l-1,h-l}$, for $1 \le l \le h-2$. Based on the above discussion, we consider the following two scenarios:

Scenario 1: When $h = 3$, we have $|L_A| = 1$ and $\Phi_1$ gives the encrypted scalar factor for users at level 2 as shown below.

$$\begin{aligned}
\Phi_1 &= L_A[1] \\
&= E_{pk}(M_{1,2}) \\
&= E_{pk}\left(\frac{M_{0,1} * M_{1,2}}{M_{0,1}}\right) \\
&= E_{pk}\left(\frac{M_3}{M_2}\right) \\
&= E_{pk}(M'_2)
\end{aligned}$$


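Scenario 2: On the other hand, when $h > 3$, $A$ and $T$ engage in the SMP protocol (Step 14 in Algorithm 1). Following from the SMP protocol, the value of $\Phi_l$, for $1 \le l \le h-2$, can be formulated as shown below:

$$\begin{aligned}
\Phi_l &= E_{pk}\Big(\prod_{j=1}^{h-l-1} x_j\Big) \\
&= E_{pk}\Big(\prod_{j=1}^{h-l-1} M_{h-j-1,h-j}\Big) \\
&= E_{pk}\Big(\prod_{k=l}^{h-2} M_{k,k+1}\Big) \\
&= E_{pk}\left(\frac{M_{0,1} * \cdots * M_{h-2,h-1}}{M_{0,1} * \cdots * M_{l-1,l}}\right) \\
&= E_{pk}\left(\frac{M_h}{M_{l+1}}\right) \\
&= E_{pk}(M'_{l+1})
\end{aligned}$$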

If the maximum value of $h$ is 6 (sufficient for most situations), the maximum size of $L'$ is 4. Therefore, Phase 1 is bounded by 2 initiations of the SMP protocol.
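As a plaintext sanity check of Theorem 1, the following sketch compares the product that the SMP protocol would produce, $\prod_{j=1}^{h-l-1} x_j$, with the direct definition $M_h / M_{l+1}$. It works on unencrypted integers only. The function names and the sample per-level sums for $h = 6$ are hypothetical; the values $M_{0,1} = 5$ and $M_{1,2} = 13$ are the ones implied by Example 2.

```python
from math import prod

def factors_via_smp_product(M):
    """M[k] holds M_{k,k+1}, the message sum between levels k and k+1 (k = 0..h-2).
    All entries are assumed to be positive integers."""
    h = len(M) + 1
    # x_l = M_{h-l-1,h-l} is the plaintext behind L_A[l], for 1 <= l <= h-2.
    x = {l: M[h - l - 1] for l in range(1, h - 1)}
    # Phi_l hides the product x_1 * ... * x_{h-l-1}.
    phi = {l: prod(x[j] for j in range(1, h - l)) for l in range(1, h - 1)}
    phi[h - 1] = 1  # Phi_{h-1} always hides 1
    return phi

def factors_via_definition(M):
    """M'_{l+1} = M_h / M_{l+1}, with M_{l+1} = M_{0,1} * ... * M_{l-1,l}."""
    h = len(M) + 1
    M_h = prod(M)
    return {l: M_h // prod(M[:l]) for l in range(1, h)}

if __name__ == "__main__":
    print(factors_via_smp_product([5, 13]))             # h = 3, as in Example 2
    print(factors_via_smp_product([5, 13, 7, 3, 11]))   # h = 6, hypothetical sums
```

Both functions agree for every radius tried here, which mirrors the chain of equalities in the proof.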

4.2 Phase 2 - Secure Computation of Recommendation Scores:

During Phase 2, $A$ with input $\Phi$, along with $T$ and the internal users, jointly compute the recommendation score for each potential candidate. The main steps involved in Phase 2 are shown as steps 17 to 37 in Algorithm 1. To start with, $A$ computes a vector $\alpha$ of size $h$ as follows:

$$\alpha_1 = E_{pk}(C_{A,B_i})$$

$$\alpha_i = \Phi_{i-1}^{C_{A,B_i}} \bmod N^2, \quad \text{for } 2 \le i \le t+1$$

After this, $A$ sends $A$, $\Phi$, and $\alpha$ to $B_i$, for $1 \le i \le m$. Then, each internal user $U_j$ receives the values of $A$, $\Phi$, and $\alpha$ from $Pr(U_j)$ and checks whether $A$ is already a friend of $U_j$ (this case happens only if $U_j$ is equal to one of the $B_i$'s). If $A \in F(U_j)$, then $U_j$ simply forwards $A$, $\Phi$, and $\alpha$ to each of his/her child friends. Otherwise, $U_j$ computes the encryption of shares of his/her recommendation score as below:

$$\beta_j = \alpha_1^{D_{U_j}} \bmod N^2; \qquad \gamma_j = \alpha_2 * \Phi_1^{C_{Y,U_j}} \bmod N^2$$

where $D_{U_j}$ denotes the degree of $U_j$ and $Y$ is the parent friend of $U_j$. After this, $U_j$ sends $Z_j = \{E_{pk}(U_j), \langle \beta_j, \gamma_j \rangle\}$ to $A$. Note that $U_j$ can receive multiple pairs of $(\Phi, \alpha)$, which occurs only when there exist multiple shortest paths from $A$ to $U_j$. Under this scenario, $U_j$ creates the encrypted partial scores for each pair of $(\Phi, \alpha)$ and simply appends them to $Z_j$ as follows.

$$Z_j = \{E_{pk}(U_j), \langle \beta_{1,j}, \gamma_{1,j} \rangle, \ldots, \langle \beta_{s,j}, \gamma_{s,j} \rangle\}$$

where each $\langle \beta_{l,j}, \gamma_{l,j} \rangle$, for $1 \le l \le s$, is computed as explained above for each pair of $(\Phi, \alpha)$ and $s$ denotes the number of such pairs (the number of shortest paths to $U_j$ from $A$). In addition, if $t > 0$, then $U_j$ proceeds as follows:

• Update $\Phi$ and $\alpha$:

– $\Phi_i = \Phi_{i+1}$, for $1 \le i \le t$

– $\alpha_1 = \alpha_1^{C_{Y,U_j}} \bmod N^2$

– $\alpha_i = \alpha_{i+1} * \Phi_{i-1}^{C_{Y,U_j}}$, for $2 \le i \le t+1$

• Send $A$, $\Phi$, and $\alpha$ to his/her child friends. If $U_j$ receives multiple pairs of $(\Phi, \alpha)$, $U_j$ updates each pair as above and sends all updated pairs to the child friends.
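The homomorphic steps above can be sketched with a toy Paillier instance. This is a minimal illustration under stated assumptions, not the authors' implementation: the primes are far too small for real use, the helper names (`enc`, `dec`, `partial_scores`, `forward_update`) are hypothetical, and `forward_update` assumes the update rules are applied in the listed order, so that $\alpha_i$ is computed from the already-shifted $\Phi$. Multiplying ciphertexts adds plaintexts, and raising a ciphertext to a constant multiplies its plaintext.

```python
import random
from math import gcd

# Toy Paillier parameters (assumption: illustration only, insecure key size).
P, Q = 1009, 1013
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // gcd(P - 1, Q - 1)   # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)                            # valid for generator g = N + 1

def enc(m):
    """E_pk(m) with generator g = N + 1 and fresh randomness."""
    r = random.randrange(1, N)
    while gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (1 + (m % N) * N) * pow(r, N, N2) % N2

def dec(c):
    """D_pr(c)."""
    return (pow(c, LAM, N2) - 1) // N * MU % N

def partial_scores(alpha, phi, degree, c_parent):
    """beta_j = alpha_1^D mod N^2 and gamma_j = alpha_2 * Phi_1^C mod N^2."""
    beta = pow(alpha[0], degree, N2)
    gamma = alpha[1] * pow(phi[0], c_parent, N2) % N2
    return beta, gamma

def forward_update(alpha, phi, c_parent):
    """Vectors a user with t > 0 passes on to child friends."""
    new_phi = phi[1:]                                   # Phi_i = Phi_{i+1}
    new_alpha = [pow(alpha[0], c_parent, N2)]           # alpha_1 = alpha_1^C
    for i in range(1, len(new_phi) + 1):                # alpha_i = alpha_{i+1} * Phi_{i-1}^C
        new_alpha.append(alpha[i + 1] * pow(new_phi[i - 1], c_parent, N2) % N2)
    return new_alpha, new_phi

if __name__ == "__main__":
    phi = [enc(13), enc(1)]                              # Phi_1, Phi_2 from Example 2
    alpha = [enc(2), pow(phi[0], 2, N2), pow(phi[1], 2, N2)]
    beta, gamma = partial_scores(alpha, phi, degree=3, c_parent=2)
    print(dec(beta), dec(gamma))                         # degree 3 is hypothetical
```

With the Example 2 values for Hall's branch, a hypothetical degree of 3, and a parent edge weight of 2, the decrypted shares are 6 and 52, the latter matching Ford's entry in Table 2.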

Upon receiving the entries from all potential candidates, $A$ and $T$ engage in a secure multiplication and addition (SMPA) protocol for all entries corresponding to each potential candidate, as explained in Algorithm 3.

Let us consider the entry $Z_j = \{E_{pk}(U_j), \langle \beta_{1,j}, \gamma_{1,j} \rangle, \ldots, \langle \beta_{s,j}, \gamma_{s,j} \rangle\}$, where $s$ denotes the number of shortest paths from $A$ to $U_j$. In addition, let $\beta_{l,j} = E_{pk}(a_{l,j})$ and $\gamma_{l,j} = E_{pk}(b_{l,j})$, for $1 \le l \le s$. The goal of the SMPA protocol is to securely compute $a_{1,j} * b_{1,j} + \cdots + a_{s,j} * b_{s,j}$ as output without revealing the values of $a_{l,j}$ and $b_{l,j}$, for $1 \le l \le s$, to either $A$ or $T$. At the end of the SMPA protocol, only user $A$ knows the recommendation score corresponding to $U_j$, for $1 \le j \le n$.

The main steps involved in the SMPA protocol are shown in Algorithm 3. Initially, user $A$ randomizes each encrypted tuple $\langle \beta_{l,j}, \gamma_{l,j} \rangle$, for $1 \le l \le s$, as follows:

$$u_{l,j} = \beta_{l,j} * E_{pk}(r_{l,j}) \bmod N^2$$

$$v_{l,j} = \gamma_{l,j} * E_{pk}(r'_{l,j}) \bmod N^2$$

where $r_{l,j}$ and $r'_{l,j}$ are randomly chosen in $Z_N$. $A$ also randomizes $E_{pk}(U_j)$ and performs the homomorphic operations in steps 1(b) to 1(g), which yield $\lambda$ and $w$. The values $r_j$ and $r$ are also random numbers in $Z_N$. After this, $A$ sends the randomized values $u_{l,j}$ and $v_{l,j}$, for $1 \le l \le s$, to $T$ along with the values of $w$ and $\lambda$. Upon receiving the values, $T$ decrypts $u_{l,j}$ and $v_{l,j}$, for $1 \le l \le s$, and multiplies and adds them as shown below (at this step $a_{l,j}$ and $b_{l,j}$ denote the randomized plaintexts seen by $T$, that is, the original values offset by $r_{l,j}$ and $r'_{l,j}$):

• For $1 \le l \le s$, $a_{l,j} = D_{pr}(u_{l,j})$ and $b_{l,j} = D_{pr}(v_{l,j})$

• $c = \sum_{l=1}^{s} a_{l,j} * b_{l,j} \bmod N$.

Furthermore, $T$ decrypts $w$ and $\lambda$: $z = D_{pr}(w)$ and $s_2 = D_{pr}(\lambda)$, and computes $s_1 = z + c \bmod N$. Then, $T$ sends $s_1$ and $s_2$ to $A$. Finally, $A$ removes the randomness from $s_1$ and $s_2$ to get the actual score and user ID $U_j$ as follows:

$$RS(A, U_j) = s_1 - r \bmod N; \qquad U_j = s_2 - r_j \bmod N$$

Here, $RS(A, U_j)$ is the recommendation score for user $U_j$ based on the order-preserving scoring function mentioned in Section 3. Note that $(N - 1)$ represents "-1" under $Z_N$.


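Algorithm 3 SMPA

Require: $A$'s input is $Z_j$

1: $A$:

(a). for $1 \le l \le s$ do:

• $u_{l,j} \leftarrow \beta_{l,j} * E_{pk}(r_{l,j}) \bmod N^2$, where $r_{l,j} \in Z_N$

• $v_{l,j} \leftarrow \gamma_{l,j} * E_{pk}(r'_{l,j}) \bmod N^2$, where $r'_{l,j} \in Z_N$

(b). $\lambda \leftarrow E_{pk}(U_j) * E_{pk}(r_j) \bmod N^2$, where $r_j \in Z_N$

(c). $E_{pk}(\tilde{r}) \leftarrow E_{pk}\big(\sum_{l=1}^{s} r_{l,j} * r'_{l,j}\big)$

(d). $E_{pk}(r_1) \leftarrow \prod_{l=1}^{s} \beta_{l,j}^{r'_{l,j}} \bmod N^2$

(e). $E_{pk}(r_2) \leftarrow \prod_{l=1}^{s} \gamma_{l,j}^{r_{l,j}} \bmod N^2$

(f). $\tau \leftarrow E_{pk}(r) * E_{pk}(\tilde{r})^{N-1} \bmod N^2$, where $r \in Z_N$

(g). $w \leftarrow \tau * E_{pk}(r_1)^{N-1} * E_{pk}(r_2)^{N-1} \bmod N^2$

(h). Send $w$, $\lambda$ and $u_{l,j}, v_{l,j}$, for $1 \le l \le s$, to $T$

2: $T$:

(a). Receive parameters from $A$

(b). $a_{l,j} \leftarrow D_{pr}(u_{l,j})$; $b_{l,j} \leftarrow D_{pr}(v_{l,j})$, for $1 \le l \le s$

(c). $c \leftarrow \sum_{l=1}^{s} a_{l,j} * b_{l,j} \bmod N$

(d). $z \leftarrow D_{pr}(w)$; $s_1 \leftarrow z + c \bmod N$

(e). $s_2 \leftarrow D_{pr}(\lambda)$; send $s_1$ and $s_2$ to $A$

3: $A$:

(a). Receive $s_1$ and $s_2$ from $T$

(b). $RS(A, U_j) \leftarrow s_1 - r \bmod N$ (recommendation score)

(c). $U_j \leftarrow s_2 - r_j \bmod N$ (corresponding user ID)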
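To see why the blinding terms cancel, the following sketch simulates the SMPA steps with a toy Paillier instance. It is an illustration under stated assumptions rather than the authors' C implementation: the primes are insecure, both roles run inside one function (in the protocol only $T$ holds the private key), the helper names are hypothetical, and steps 1(d) and 1(e) are taken to use the original, unrandomized $\beta_{l,j}$ and $\gamma_{l,j}$, which is the reading under which $s_1 - r$ equals $\sum_l a_{l,j} * b_{l,j}$.

```python
import random
from math import gcd

# Toy Paillier parameters (assumption: illustration only, insecure key size).
P, Q = 1009, 1013
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)

def enc(m):
    r = random.randrange(1, N)
    while gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (1 + (m % N) * N) * pow(r, N, N2) % N2

def dec(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N

def smpa(enc_uid, pairs):
    """pairs is a list of (beta, gamma) ciphertexts; returns (score, user id)."""
    s = len(pairs)
    # Step 1 (A): blind every share and prepare the correction term w.
    r_l = [random.randrange(N) for _ in range(s)]
    rp_l = [random.randrange(N) for _ in range(s)]
    r_j, r = random.randrange(N), random.randrange(N)
    u = [beta * enc(r_l[i]) % N2 for i, (beta, _) in enumerate(pairs)]
    v = [gamma * enc(rp_l[i]) % N2 for i, (_, gamma) in enumerate(pairs)]
    lam_ct = enc_uid * enc(r_j) % N2
    r_tilde = sum(x * y for x, y in zip(r_l, rp_l)) % N
    e_r1 = e_r2 = 1
    for i, (beta, gamma) in enumerate(pairs):
        e_r1 = e_r1 * pow(beta, rp_l[i], N2) % N2    # hides sum a_l * r'_l
        e_r2 = e_r2 * pow(gamma, r_l[i], N2) % N2    # hides sum b_l * r_l
    tau = enc(r) * pow(enc(r_tilde), N - 1, N2) % N2  # hides r - r_tilde
    w = tau * pow(e_r1, N - 1, N2) * pow(e_r2, N - 1, N2) % N2
    # Step 2 (T): decrypt blinded shares, multiply and add.
    c = sum(dec(ui) * dec(vi) for ui, vi in zip(u, v)) % N
    s1 = (dec(w) + c) % N
    s2 = dec(lam_ct)
    # Step 3 (A): remove the randomness.
    return (s1 - r) % N, (s2 - r_j) % N

if __name__ == "__main__":
    # Butler's two shares from Table 2 with a hypothetical degree of 3 and ID 42.
    print(smpa(enc(42), [(enc(6), enc(39)), (enc(6), enc(78))]))
```

The result is only meaningful while the true score stays below $N$; with the toy modulus used here that limits scores to about one million.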

Theorem 2. The output of Phase 2 is the list of recommendation scores along with the corresponding user IDs. That is, for any given entry $Z_j$, we have:

$$s_{1,j} - r \bmod N = RS(A, U_j)$$

$$s_{2,j} - r_j \bmod N = U_j$$

where $s_{1,j}$ and $s_{2,j}$ are the values sent to $A$ from $T$ corresponding to the entry $Z_j$ during the SMPA protocol, for $1 \le j \le n$.

Proof. Without loss of generality, consider a potential user $U_j$ who receives $A$ and $(\Phi, \alpha)$ pairs from his/her parent friends. Let us assume that $U_j$ receives $s$ different $(\Phi, \alpha)$ pairs (representing $s$ shortest paths from $A$ to $U_j$) and let $\beta_{k,j}, \gamma_{k,j}$ denote the encrypted partial scores corresponding to the $k$th pair $(\Phi_k, \alpha_k)$ (denoting the $k$th shortest path from $A$ to $U_j$), for $1 \le k \le s$. $U_j$ computes the encrypted partial shares for the $k$th pair, where $1 \le k \le s$, as follows:

$$\beta_{k,j} = \alpha_{1,k}^{D_{U_j}} \bmod N^2 = E_{pk}\Big(D_{U_j} * \prod_i C_{S_{i-1},S_i}\Big)$$

$$\gamma_{k,j} = \alpha_{2,k} * \Phi_{1,k}^{C_{Y,U_j}} \bmod N^2 = E_{pk}(M'_l * |P_k(A, U_j)|)$$

where $\alpha_{y,k}$ (resp., $\Phi_{y,k}$) denotes the $y$th component of vector $\alpha_k$ (resp., $\Phi_k$); $i = 1, \ldots, l-1$; $l = L(U_j)$; and $S_{i-1} = Pr(S_i)$ along the $k$th path from $A$ to $U_j$. After this, $U_j$ sends $Z_j = \{E_{pk}(U_j), \langle \beta_{1,j}, \gamma_{1,j} \rangle, \ldots, \langle \beta_{s,j}, \gamma_{s,j} \rangle\}$ to $A$. Upon receiving it, $A$ and $T$ engage in the SMPA protocol. As mentioned earlier, let $\beta_{k,j} = E_{pk}(a_{k,j})$ and $\gamma_{k,j} = E_{pk}(b_{k,j})$, for $1 \le k \le s$. Since the SMPA protocol securely multiplies each $(\beta_{k,j}, \gamma_{k,j})$ pair (within encryption) and then adds them, the output of the SMPA protocol can be formulated as follows:

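$$\begin{aligned}
s_{1,j} - r \bmod N &= \sum_{k=1}^{s} a_{k,j} * b_{k,j} \\
&= \sum_{k=1}^{s} \Big(D_{U_j} * \prod_i C_{S_{i-1},S_i}\Big) * \big(M'_l * |P_k(A, U_j)|\big) \\
&= M'_l \sum_{k=1}^{s} \Big(|P_k(A, U_j)| * \prod_i C_{S_{i-1},S_i}\Big) * D_{U_j} \\
&= RS(A, U_j)
\end{aligned}$$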

Similarly, we can show that $s_{2,j} - r_j \bmod N = U_j$.

During the actual implementation, the SMPA protocol can be initiated in parallel, as the computation for any given potential user $U_j$ is independent of the others. Thus, overall, the SMPA requires one round of communication between $A$ and $T$.

Example 2. We show various intermediate steps and results involved in the PFR protocol using Figure 2 as an example. We have $h = 3$ and Lee as the target user. Following from the initialization step, users 1 hop away from Lee, that is, $\langle$Hall, Cox, Bell$\rangle$, have a value of $t = 2$. Similarly, $\langle$Ford, Butler, Cole, Kelly$\rangle$ have a value of $t = 1$, whereas $\langle$Shaw, Ray, Fox, Ryan, Hart, Jones$\rangle$ have $t = 0$. Each of them is aware of $pk$ and also of their parent and child friend(s).

Phase 1: Initially, Hall computes $L_{\text{Hall}}[1] = E_{pk}(C_{\text{Hall},\text{Ford}} + C_{\text{Hall},\text{Butler}}) = E_{pk}(3)$. Similarly, Cox and Bell compute $L_{\text{Cox}}[1] = E_{pk}(7)$ and $L_{\text{Bell}}[1] = E_{pk}(3)$ respectively. Observe that $C_{\text{Hall},\text{Cox}}$ is not included in $L_{\text{Hall}}[1]$ and $L_{\text{Cox}}[1]$ since Hall and Cox are at the same level from Lee. After this, Hall, Cox, and Bell send $L_{\text{Hall}}$, $L_{\text{Cox}}$, and $L_{\text{Bell}}$, respectively, to Lee. Upon receiving the values, Lee computes $L_{\text{Lee}}[1] = L_{\text{Hall}}[1] * L_{\text{Cox}}[1] * L_{\text{Bell}}[1] \bmod N^2 = E_{pk}(13)$. Then, Lee sets the encrypted scalar factors as follows:

$$\Phi_1 = L_{\text{Lee}}[1] = E_{pk}(13); \qquad \Phi_2 = E_{pk}(1)$$

$\{E_{pk}(\text{Ford}), \langle E_{pk}(2 * D_{\text{Ford}}), E_{pk}(52) \rangle\}$
$\{E_{pk}(\text{Butler}), \langle E_{pk}(2 * D_{\text{Butler}}), E_{pk}(39) \rangle, \langle E_{pk}(2 * D_{\text{Butler}}), E_{pk}(78) \rangle\}$
$\{E_{pk}(\text{Cole}), \langle E_{pk}(2 * D_{\text{Cole}}), E_{pk}(65) \rangle\}$
$\{E_{pk}(\text{Kelly}), \langle E_{pk}(D_{\text{Kelly}}), E_{pk}(52) \rangle\}$

Table 2: Encrypted partial scores corresponding to each potential candidate at level 2 based on the PFR protocol

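Phase 2: During Phase 2, Lee computes a (different) encrypted vector $\alpha$ for each of his friends. Without loss of generality, consider user Hall. Lee creates $\alpha$ for Hall as follows.

$$\begin{aligned}
\alpha &= \langle E_{pk}(C_{\text{Lee},\text{Hall}}), \Phi_1^{C_{\text{Lee},\text{Hall}}}, \Phi_2^{C_{\text{Lee},\text{Hall}}} \rangle \\
&= \langle E_{pk}(2), E_{pk}(2 * 13), E_{pk}(2 * 1) \rangle
\end{aligned}$$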

Then, Lee sends $\langle \text{Lee}, \Phi, \alpha \rangle$ to Hall, who further forwards them to Ford and Butler. The final entries (that are sent to Lee) from all potential users at level 2 are shown in Table 2. Finally, Lee and $T$ engage in the SMPA protocol to get the scaled recommendation scores. E.g., the recommendation score for Ford is $RS(\text{Lee}, \text{Ford}) = 2 * D_{\text{Ford}} * 4 * 13 = 104 * D_{\text{Ford}}$. It is clear that $RS(\text{Lee}, \text{Ford}) = M_h * TN * RS$, where the actual recommendation score for Ford is $RS = 4 * \frac{2}{5} * \frac{D_{\text{Ford}}}{TN}$ and $M_h = 65$.
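The numbers in Example 2 can be replayed on plaintexts. The sketch below is a hedged reconstruction: Figure 2 is not reproduced here, so the edge weights are assumptions inferred from the Phase 1 sums (3, 7, 3) and from the Table 2 entries, and the function names are hypothetical. Each $\beta$ share is tracked only by the coefficient that multiplies the candidate's degree.

```python
# Assumed message counts C_{U,V}, inferred from Table 2 and the Phase 1 sums.
FIRST_HOP = {"Hall": 2, "Cox": 2, "Bell": 1}
SECOND_HOP = {
    ("Hall", "Ford"): 2,
    ("Hall", "Butler"): 1,
    ("Cox", "Butler"): 4,
    ("Cox", "Cole"): 3,
    ("Bell", "Kelly"): 3,
}

def phase1_sums():
    """Plaintext behind L_X[1] for each friend X of Lee."""
    sums = {}
    for (parent, _child), c in SECOND_HOP.items():
        sums[parent] = sums.get(parent, 0) + c
    return sums

def table2_plaintexts():
    """Per candidate: list of (coefficient of D in beta, plaintext of gamma)."""
    phi1 = sum(phase1_sums().values())            # plaintext of Phi_1
    entries = {}
    for (parent, child), c in SECOND_HOP.items():
        c_first = FIRST_HOP[parent]
        beta_coeff = c_first                      # alpha_1 = E(C_{Lee,parent})
        gamma = c_first * phi1 + phi1 * c         # alpha_2 * Phi_1^C on plaintexts
        entries.setdefault(child, []).append((beta_coeff, gamma))
    return entries

def scaled_score_coeff(entry):
    """Coefficient of D in the scaled score: sum of beta_coeff * gamma."""
    return sum(b * g for b, g in entry)

if __name__ == "__main__":
    table = table2_plaintexts()
    for name, entry in table.items():
        print(name, entry, scaled_score_coeff(entry))
```

For Ford the coefficient is 104, in agreement with $RS(\text{Lee}, \text{Ford}) = 104 * D_{\text{Ford}}$; Butler's two shortest paths contribute one pair each.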

4.3 Security Analysis

During Phase 1, each internal user sends the encrypted aggregated data only to $Pr(U)$. Thus, the privacy of individual users is preserved as per the security definition of SMC [18]. In addition, during the SMP protocol, $A$ first randomizes the values of $L_A$ and sends them to $T$. Therefore, the simulated view of $T$ is indistinguishable from the real view (trusted third party model). Furthermore, since $T$ sends only the encrypted scalar factors to $A$, neither the values of the $x_i$'s nor the $\Phi_i$'s are revealed to $A$, for $1 \le i \le h-2$. Therefore, the privacy of $A$ and $U_j$ is preserved, for $1 \le j \le n$.

In Phase 2, $A$ initially sends $\{A, \Phi, \alpha\}$ to each $B_i$, for $1 \le i \le m$. Each internal user $U_j$ computes his/her encrypted partial scores using $D_{U_j}$ and $C_{Y_i,U_j}$, where $Y_i$ is the parent friend of $U_j$ (with multiple parents denoting multiple shortest paths from $A$ to $U_j$). Then, $U_j$ sends his/her entry $Z_j$ in encrypted form to $A$. Here, the privacy of each $U_j$ is preserved under the assumption that the number of shortest paths to $U_j$ can be revealed to $A$. However, we emphasize that this problem can be solved using the source privacy technique [22]. During the SMPA protocol, the values of each entry are randomized in $Z_N$ and sent to $T$. That is, the values of $D_{U_j} * \prod_i C_{S_{i-1},S_i}$ and $M'_l * |P_k(A, U_j)|$, for $1 \le k \le s$, are randomized and sent to $T$. Therefore, the privacy of $A$ and $U_j$ is preserved further. In addition, the final output sent to $A$ is the actual output, and the intermediate values are never revealed to $A$ or to the internal users.


4.4 Complexity Analysis

We analyze the computation and communication costs of each party in the PFR protocol.

4.4.1 Computation Cost

For Phase 1, the computation cost of each internal user $U_j$ depends on his/her counter value and number of child friends. In addition, irrespective of the counter value, $U_j$ has to perform one encryption operation. Therefore, the computation complexity of $U_j$ is bounded by one encryption and $O(t * |Ch(U_j)|)$ homomorphic addition operations. The computation complexity of $A$, in contrast, mainly depends on $h$ and $m$. If $h = 2$, then $A$ simply performs one encryption operation. However, when $h = 3$, $A$'s computation complexity is bounded by $O(h * m)$ homomorphic additions and one encryption. On the other hand, if $h > 3$, the computation complexity of $A$ mainly comes from the SMP protocol, which depends on the values of $h$ and $m$. That is, $A$'s computation complexity is bounded by $O(h * m)$ homomorphic additions and $O(h)$ encryption operations, whereas the computation complexity of $T$ is bounded by $O(h)$ decryption operations (coming from the SMP protocol).

In Phase 2, the computation complexity of each internal user (excluding the $B_i$'s) depends on his/her $t$ and $s$ (the number of shortest paths from $A$ to $U_j$). Therefore, $U_j$'s computation cost is bounded by $O(t * s)$ exponentiations and homomorphic additions. On the other hand, $A$ has to initially compute $\alpha$, which depends on the value of $h$, for each $B_i$. Therefore, the computation complexity of $A$ for computing all $\alpha$ vectors is bounded by $O(h * m)$ encryption and exponentiation operations. In addition, during the SMPA protocol, $A$ has to randomize all components of each potential candidate. Let $n$ denote the number of potential candidates and $s$ be the maximum number of shortest paths; then the computation cost of $A$ in the SMPA protocol is bounded by $O(s * n)$ encryption and exponentiation operations. Overall, during Phase 2, the computation complexity of $A$ is bounded by $O(s * n)$ encryption and exponentiation operations (under the assumption $s * n > h * m$), whereas the computation complexity of $T$ is bounded by $O(s * n)$ decryption operations.

4.4.2 Communication Cost

Without loss of generality, let $p$ denote the Paillier encryption key size (in this paper we fix it to 1,024 bits, which is a commonly accepted key size). During Phase 1, the communication complexity between any two internal users is bounded by $O(p * t)$ bits, whereas that between $A$ and all $B_i$'s is bounded by $O(p * h * m)$ bits. In addition, for the SMP protocol, the communication complexity between $A$ and $T$ is bounded by $O(p * h)$ bits.

Additionally, during Phase 2, the communication cost between any two internal users is bounded by $O(p * t)$ bits, where $t$ is the counter of the corresponding parent user. Since $A$ has to send $\Phi$ and $\alpha$ to each $B_i$, for $1 \le i \le m$, the communication cost between $A$ and all $B_i$'s is bounded by $O(p * h * m)$ bits. In addition, since each potential candidate sends the encrypted partial shares to $A$, the communication cost between $A$ and all potential candidates is bounded by $O(p * s * n)$ bits. Finally, during the SMPA protocol, $A$ forwards the randomized shares of all $n$ candidates to $T$, so the communication cost between $A$ and $T$ is bounded by $O(p * s * n)$ bits.

Figure 3: Empirical Results. (a) Complexity: Phase 1, time (milliseconds) versus number of child friends, for $A$, $T$, and $U_j$. (b) Complexity: Phase 2, time (milliseconds) versus number of shortest paths ($s$), for $A$, $T$, and $U_j$. (c) Complexity: PFR for varying $h$ and $s$, time (seconds) versus radius ($h$), for $s = 1$ and $s = 5$.

5 Empirical Analysis

As mentioned earlier, the effectiveness of PFR is the same as that of the CSM method [7]. Therefore, in this section, we empirically analyze the computational cost of PFR. The PFR protocol was implemented in C, and experiments were performed on an Intel® Xeon® Six-Core™ 3.07GHz PC with 12GB memory. We fix the Paillier key size to 1,024 bits (since it is a commonly accepted key size) for all the experiments.

5.1 Computation cost

We first analyze the computation time of Phases 1 and 2 separately for $A$, $T$, and internal user $U_j$. Here, we assume that the minimum number of messages exchanged between any two users $U$ and $V$ (i.e., $C_{U,V}$) is uniformly distributed in $[1, 100]$. Since the run time of $U_j$ depends on the level to which he/she belongs, we present the average of the computation costs of all internal users at different levels.

For Phase 1, we fix the value of $h$ to 6 and compute the run time of $A$, $T$, and $U_j$ by varying the number of child friends from 50 to 250 (note that users at levels $h-1$ and $h$ are not involved in Phase 1). As shown in Figure 3(a), the computation times for $A$, $T$, and $U_j$ are 79, 30, and 4.25 milliseconds respectively when the number of child friends is 50. In addition, the computation time of $U_j$ varies only slightly (due to less expensive homomorphic operations), from 4.25 to 6 milliseconds, when the number of child friends of $U_j$ is varied from 50 to 250. However, for $A$ the computation time remains the same since the homomorphic addition operations are negligible compared to the encryption operations involved in SMP. Since $h$ is fixed, the encryption cost in SMP remains the same irrespective of the number of child friends of $A$. Therefore, the computation cost of $A$ and $T$ remains the same in Phase 1.

During Phase 2, the computation time to find the recommendation score for each potential candidate $U_j$ mainly depends on the cost of SMPA, which in turn depends on the number of shortest paths ($s$) from $A$ to $U_j$. Thus, we analyze the computation cost for $A$, $T$, and $U_j$ based on varying values of $s$ with $h = 6$. The computation cost of $U_j$ (averaged over different levels) varies from 0.66 to 1.83 milliseconds when $s$ varies from 5 to 25, as shown in Figure 3(b). However, due to encryption costs in SMPA, the computation cost of $A$ varies from 179 to 875 milliseconds when $s$ is changed from 5 to 25. From Figure 3(b), a similar trend can be observed for $T$.

Finally, we compute the total run time of PFR based on varying values of $h$ and $s$. Since the total cost depends on the number of potential candidates in the network ($n$), we fix it to 100. As shown in Figure 3(c), we observe that the total time does not change much for varying values of $h$ when $s$ is fixed. For example, when $s = 1$, the total time for PFR to compute 100 recommendation scores (since $n = 100$) varies from 4.5 to 4.63 seconds when the value of $h$ is changed from 2 to 6. A similar trend can be observed for $s = 5$. As expected, for any given value of $h$, the computation time of PFR increases by almost a factor of 5 when $s$ is changed from 1 to 5. E.g., when $h = 6$, the computation time of PFR varies from 4.63 to 22.62 seconds when $s$ is changed from 1 to 5. Also, we observed that for any fixed values of $h$ and $s$, the running time of PFR grows almost linearly with $n$.
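The scaling claims in this paragraph can be checked with a few lines of arithmetic on the reported timings. The snippet only restates the figures quoted above; the variable names are illustrative.

```python
# Reported total run times in seconds for n = 100 potential candidates.
N_CANDIDATES = 100
t_h2_s1 = 4.5     # h = 2, s = 1
t_h6_s1 = 4.63    # h = 6, s = 1
t_h6_s5 = 22.62   # h = 6, s = 5

# Growth when s goes from 1 to 5 at h = 6 (the text says "almost a factor of 5").
ratio_s = t_h6_s5 / t_h6_s1

# Relative growth when h goes from 2 to 6 at s = 1, in percent.
growth_h_percent = 100 * (t_h6_s1 - t_h2_s1) / t_h2_s1

# Average time per recommendation score at h = 6, s = 1, in milliseconds.
per_candidate_ms = 1000 * t_h6_s1 / N_CANDIDATES

if __name__ == "__main__":
    print(round(ratio_s, 2), round(growth_h_percent, 1), round(per_candidate_ms, 1))
```

The ratio comes out just under 5, while widening the radius from 2 to 6 adds only about 3% to the total time.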

These results show that most of the significant computation (more than 99%) is between $A$ and $T$. The computation cost incurred on all internal nodes is negligible. In addition, for $A$ and $T$ the computation time grows linearly with $s$ and $n$. Additionally, when the size of the encryption key doubles, the computation time increases by almost a factor of 6. In our experiments, we fixed the key size to 1,024 bits, producing 2,048-bit encrypted values, which is sufficiently secure for most applications.

6 Conclusion

Due to privacy concerns [8–10], many online social networks are providing users with various privacy settings. Existing friend recommendation algorithms do not take privacy into account; therefore, they are not applicable in private environments. In this paper, we propose a new private friend recommendation (PFR) algorithm based on the network structure as well as real messages exchanged between users. The proposed PFR protocol computes the recommendation scores of all users within a radius of $h$ from a target user. We also showed the practical applicability of our approach via experimental results.

One issue with the PFR protocol is related to potential users $U$ and $V$ who are friends and are also at the same distance from $A$ (forming a triangle with $Pr(U)$). Under this scenario, $U$ and $V$ can learn that they both share a common friend $Pr(U)$. Another issue with the PFR protocol is the revelation of the number of shortest paths to $A$, because the number of entries in $Z_j$ (which is revealed to $A$ during Phase 2) corresponding to user $U_j$ depends on the number of shortest paths from $A$ to $U_j$. However, we emphasize that the above two issues can be solved using the source privacy technique [22], at the price of extra communication and computation cost for the PFR protocol. We will investigate the above issues in the future.


References

[1] R. Kumar, J. Novak, and A. Tomkins, "Structure and evolution of online social networks," in Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, 2006, pp. 611–617.

[2] A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel, and B. Bhattacharjee, "Measurement and analysis of online social networks," in Proceedings of the 7th ACM SIGCOMM conference on Internet measurement (IMC '07), 2007, pp. 29–42.

[3] J. Srivastava, M. A. Ahmad, N. Pathak, and D. K. W. Hsu, Data mining based social network analysis from online behavior. Tutorial at the 8th SIAM International Conference on Data Mining (SDM '08), 2008.

[4] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge, UK, 1994.

[5] F. Bonchi, C. Castillo, A. Gionis, and A. Jaimes, "Social network analysis and mining for business applications," ACM Transactions on Intelligent Systems and Technology, vol. 2, pp. 22:1–22:37, May 2011.

[6] D. Liben-Nowell and J. Kleinberg, "The link-prediction problem for social networks," Journal of American Society for Information Science and Technology, vol. 58, pp. 1019–1031, May 2007.

[7] B.-R. Dai, C.-Y. Lee, and C.-H. Chung, "A framework of recommendation system based on both network structure and messages," in International Conference on Advances in Social Networks Analysis and Mining (ASONAM '11), July 2011, pp. 709–714.

[8] B. Krishnamurthy and C. E. Wills, "On the leakage of personally identifiable information via online social networks," SIGCOMM Computer Communication Review, vol. 40, no. 1, pp. 112–117, 2010.

[9] J. Delgado, E. Rodríguez, and S. Llorente, "User's privacy in applications provided through social networks," in Proceedings of second ACM SIGMM Workshop on Social Media, 2010, pp. 39–44.

[10] Y. Yang, J. Lutes, F. Li, B. Luo, and P. Liu, "Stalking online: on user privacy in social networks," in Proceedings of the second ACM conference on Data and Application Security and Privacy (CODASPY '12). ACM, 2012, pp. 37–48.

[11] S. Milgram, "The small world problem," Psychology Today, vol. 2, pp. 60–67, 1967.

[12] S. Lo and C. Lin, "Wmr–a graph-based algorithm for friend recommendation," in Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI '06). IEEE Computer Society, 2006, pp. 121–128.


[13] X. Xie, "Potential friend recommendation in online social network," in IEEE/ACM Int'l Conference on Cyber, Physical and Social Computing (CPSCom), December 2010, pp. 831–835.

[14] J. Naruchitparames, M. Güneş, and S. Louis, "Friend recommendations in social networks using genetic algorithms and network topology," in IEEE Congress on Evolutionary Computation (CEC), June 2011, pp. 2207–2214.

[15] J. Chen, W. Geyer, C. Dugan, M. Muller, and I. Guy, "Make new friends, but keep the old: recommending people on social networking sites," in Proceedings of the 27th international conference on Human factors in computing systems, 2009, pp. 201–210.

[16] L. Gou, F. You, J. Guo, L. Wu, and X. L. Zhang, "Sfviz: interest-based friends exploration and recommendation in social networks," in Proceedings of Visual Information Communication - International Symposium (VINCI '11). ACM, 2011, pp. 1–10.

[17] A. Machanavajjhala, A. Korolova, and A. D. Sarma, "Personalized social recommendations: accurate or private," Proc. VLDB Endowment, vol. 4, pp. 440–450, April 2011.

[18] O. Goldreich, The Foundations of Cryptography. Cambridge University Press, 2004, vol. 2, ch. General Cryptographic Protocols.

[19] P. Paillier, "Public-key cryptosystems based on composite degree residuosity classes," in Proceedings of the 17th international conference on Theory and application of cryptographic techniques. Springer-Verlag, 1999, pp. 223–238.

[20] S. Goldwasser, S. Micali, and C. Rackoff, "The knowledge complexity of interactive proof systems," SIAM Journal on Computing, vol. 18, pp. 186–208, February 1989.

[21] O. Goldreich, The Foundations of Cryptography. Cambridge University Press, 2004, vol. 2, ch. Encryption Schemes.

[22] W. Jiang, L. Si, and J. Li, "Protecting source privacy in federated search," in Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 2007, pp. 761–762.
