Transcript
Page 1: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of RecommenderSystems

Neil Hurley

Complex Adaptive System LaboratoryComputer Science and Informatics

University College Dublin

Clique Strategic Research Clusterclique.ucd.ie

October 2011

RecSys 2011: Tutorial on Recommender Robustness

Page 2: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and Privacy

RecSys 2011: Tutorial on Recommender Robustness

Page 3: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and Privacy

RecSys 2011: Tutorial on Recommender Robustness

Page 4: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and Privacy

RecSys 2011: Tutorial on Recommender Robustness

Page 5: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and Privacy

RecSys 2011: Tutorial on Recommender Robustness

Page 6: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and Privacy

RecSys 2011: Tutorial on Recommender Robustness

Page 7: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 8: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 9: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 10: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Defining the Problem

Recommender Systems use personal information aboutend-users to make useful personalised recommendations.

When ratings are provided explicitly, recommender algorithmshave been designed on the assumption that the providedinformation is correct.

However . . .

“One can have, some claim, as many electronic per-sonas as one has time and energy to create”

–Judith Donath (1998) as quoted in Douceur (2002)

How does the system perform if multiple identities are used totry to deliberately bias the recommender output?

RecSys 2011: Tutorial on Recommender Robustness

Page 11: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Defining the Problem

In 2002, John Douceur of Microsoft Research coined the termSybil Attack to refer to an attack against identity onpeer-to-peer systems in which an individual entitymasquerades as multiple separate entities

“If the local entity has no direct physical knowledge of remote entities, it perceives them onlyas informational abstractions that we call identities. The system must ensure that distinctidentities refer to distinct entities; otherwise, when the local entity selects a subset of identitiesto redundantly perform a remote operation, it can be duped into selecting a single remoteentity multiple times, thereby defeating the redundancy”

In the same year, the first paper (O’Mahony et al. 2002)appeared on the vulnerability of Recommender Systems tomalicious strategies for “recommendation promotion” – laterdubbed profile injection attacks.

RecSys 2011: Tutorial on Recommender Robustness

Page 12: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Defining the Problem

Robustness refers to the ability of a system to operate understressful conditions.

While there are many possible stresses that can be applied toRecommender Systems, research on RS robustness hasfocused on performance when the dataset is stressedspecifically when

the dataset is full of noisy, erroneous data;typically, imagined to have been corrupted through a concertedsybil attack, with an aim of biasing the recommender output.

RecSys 2011: Tutorial on Recommender Robustness

Page 13: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Robust RS Research

The goal of robust recommendation is to prevent attackersfrom manipulating an RS through large-scale insertion of falseuser profiles: a profile injection attack

An attack is a concerted effort to bias the results of arecommender system by the insertion of a large number ofprofiles using false identities or sybils.Each identity is referred to as an attack profile.Research has concentrated on attacks designed to achieve aparticular recommendation outcome

A Product Push attack: attempt to secure positiverecommendations for an item or items;A Product Nuke attack: attempt to secure negativerecommendations for an item or items.

We can also think of attacks that aim to simply destroy theaccuracy of the system.

RecSys 2011: Tutorial on Recommender Robustness

Page 14: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Robust RS Research

The goal of robust recommendation is to prevent attackersfrom manipulating an RS through large-scale insertion of falseuser profiles: a profile injection attackAn attack is a concerted effort to bias the results of arecommender system by the insertion of a large number ofprofiles using false identities or sybils.

Each identity is referred to as an attack profile.Research has concentrated on attacks designed to achieve aparticular recommendation outcome

A Product Push attack: attempt to secure positiverecommendations for an item or items;A Product Nuke attack: attempt to secure negativerecommendations for an item or items.

We can also think of attacks that aim to simply destroy theaccuracy of the system.

RecSys 2011: Tutorial on Recommender Robustness

Page 15: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Robust RS Research

The goal of robust recommendation is to prevent attackersfrom manipulating an RS through large-scale insertion of falseuser profiles: a profile injection attackAn attack is a concerted effort to bias the results of arecommender system by the insertion of a large number ofprofiles using false identities or sybils.Each identity is referred to as an attack profile.

Research has concentrated on attacks designed to achieve aparticular recommendation outcome

A Product Push attack: attempt to secure positiverecommendations for an item or items;A Product Nuke attack: attempt to secure negativerecommendations for an item or items.

We can also think of attacks that aim to simply destroy theaccuracy of the system.

RecSys 2011: Tutorial on Recommender Robustness

Page 16: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Robust RS Research

The goal of robust recommendation is to prevent attackersfrom manipulating an RS through large-scale insertion of falseuser profiles: a profile injection attackAn attack is a concerted effort to bias the results of arecommender system by the insertion of a large number ofprofiles using false identities or sybils.Each identity is referred to as an attack profile.Research has concentrated on attacks designed to achieve aparticular recommendation outcome

A Product Push attack: attempt to secure positiverecommendations for an item or items;A Product Nuke attack: attempt to secure negativerecommendations for an item or items.

We can also think of attacks that aim to simply destroy theaccuracy of the system.

RecSys 2011: Tutorial on Recommender Robustness

Page 17: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Robust RS Research

The goal of robust recommendation is to prevent attackersfrom manipulating an RS through large-scale insertion of falseuser profiles: a profile injection attackAn attack is a concerted effort to bias the results of arecommender system by the insertion of a large number ofprofiles using false identities or sybils.Each identity is referred to as an attack profile.Research has concentrated on attacks designed to achieve aparticular recommendation outcome

A Product Push attack: attempt to secure positiverecommendations for an item or items;

A Product Nuke attack: attempt to secure negativerecommendations for an item or items.

We can also think of attacks that aim to simply destroy theaccuracy of the system.

RecSys 2011: Tutorial on Recommender Robustness

Page 18: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Robust RS Research

The goal of robust recommendation is to prevent attackersfrom manipulating an RS through large-scale insertion of falseuser profiles: a profile injection attackAn attack is a concerted effort to bias the results of arecommender system by the insertion of a large number ofprofiles using false identities or sybils.Each identity is referred to as an attack profile.Research has concentrated on attacks designed to achieve aparticular recommendation outcome

A Product Push attack: attempt to secure positiverecommendations for an item or items;A Product Nuke attack: attempt to secure negativerecommendations for an item or items.

We can also think of attacks that aim to simply destroy theaccuracy of the system.

RecSys 2011: Tutorial on Recommender Robustness

Page 19: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Robust RS Research

The goal of robust recommendation is to prevent attackersfrom manipulating an RS through large-scale insertion of falseuser profiles: a profile injection attackAn attack is a concerted effort to bias the results of arecommender system by the insertion of a large number ofprofiles using false identities or sybils.Each identity is referred to as an attack profile.Research has concentrated on attacks designed to achieve aparticular recommendation outcome

A Product Push attack: attempt to secure positiverecommendations for an item or items;A Product Nuke attack: attempt to secure negativerecommendations for an item or items.

We can also think of attacks that aim to simply destroy theaccuracy of the system.

RecSys 2011: Tutorial on Recommender Robustness

Page 20: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Robust RS Research

We assume that the attacker has no direct access to theratings database – manipulation achieved via the creation offalse profiles only.

We ignore system-level methods (e.g. Captchas) forpreventing the generation of false identities or ratings

Focus is on the recommendation algorithm’s ability to resistmanipulation either by

Identifying false profiles from their statistical properties andignoring or lessening their impact on the generation ofrecommendations; orGenerating recommendations in a manner that is inherentlyinsensitive to manipulation.

RecSys 2011: Tutorial on Recommender Robustness

Page 21: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Robust RS Research

We assume that the attacker has no direct access to theratings database – manipulation achieved via the creation offalse profiles only.

We ignore system-level methods (e.g. Captchas) forpreventing the generation of false identities or ratings

Focus is on the recommendation algorithm’s ability to resistmanipulation either by

Identifying false profiles from their statistical properties andignoring or lessening their impact on the generation ofrecommendations; orGenerating recommendations in a manner that is inherentlyinsensitive to manipulation.

RecSys 2011: Tutorial on Recommender Robustness

Page 22: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Robust RS Research

We assume that the attacker has no direct access to theratings database – manipulation achieved via the creation offalse profiles only.

We ignore system-level methods (e.g. Captchas) forpreventing the generation of false identities or ratings

Focus is on the recommendation algorithm’s ability to resistmanipulation either by

Identifying false profiles from their statistical properties andignoring or lessening their impact on the generation ofrecommendations; orGenerating recommendations in a manner that is inherentlyinsensitive to manipulation.

RecSys 2011: Tutorial on Recommender Robustness

Page 23: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Robust RS Research

We assume that the attacker has no direct access to theratings database – manipulation achieved via the creation offalse profiles only.

We ignore system-level methods (e.g. Captchas) forpreventing the generation of false identities or ratings

Focus is on the recommendation algorithm’s ability to resistmanipulation either by

Identifying false profiles from their statistical properties andignoring or lessening their impact on the generation ofrecommendations;

orGenerating recommendations in a manner that is inherentlyinsensitive to manipulation.

RecSys 2011: Tutorial on Recommender Robustness

Page 24: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Robust RS Research

We assume that the attacker has no direct access to theratings database – manipulation achieved via the creation offalse profiles only.

We ignore system-level methods (e.g. Captchas) forpreventing the generation of false identities or ratings

Focus is on the recommendation algorithm’s ability to resistmanipulation either by

Identifying false profiles from their statistical properties andignoring or lessening their impact on the generation ofrecommendations; orGenerating recommendations in a manner that is inherentlyinsensitive to manipulation.

RecSys 2011: Tutorial on Recommender Robustness

Page 25: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Example

RecSys 2011: Tutorial on Recommender Robustness

Page 26: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Example

RecSys 2011: Tutorial on Recommender Robustness

Page 27: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Example

RecSys 2011: Tutorial on Recommender Robustness

Page 28: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Threats to Reputation Systems I

It is interesting to compare the scenario studied in RS researchwith the threats identified for reputation systems in the 2007ENISA report (Carrara and Hogben 2007):

1 Whitewashing attack: reseting a poor reputation byrejoining the system with a new identity.

2 Sybil attack or pseudospoofing: the attacker uses multipleidentities (sybils) in order to manipulate a reputation score.

3 Impersonation and reputation theft: acquiring the identityof another and stealing her reputation.

4 Bootstrap issues and related threats: the initial reputationof a newcomer may be particularly vulnerable to attack.

RecSys 2011: Tutorial on Recommender Robustness

Page 29: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Threats to Reputation Systems II

5 Extortion: co-ordinated campaigns aimed at blackmail bydamaging an individual’s reputation for malicious motives.

6 Denial-of-reputation: attack designed to damage reputationand create an opportunity for blackmail in order to have thereputation cleaned.

7 Ballot stuffing and bad mouthing: reporting of a falsereputation score; the attackers collude to givepositive/negative feedback, to increase or lower a reputation.

8 Collusion: multiple users conspire to influence a givenreputation.

9 Repudiation of data and transaction: denial that atransaction occurred, or denial of the existence of data forwhich individual is responsible.

RecSys 2011: Tutorial on Recommender Robustness

Page 30: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Threats to Reputation Systems III

10 Recommender dishonesty: dishonest reputation scoring.

11 Privacy threats for voters and reputation owners: forexample, anonymity improves the accuracy of votes.

12 Social threats: Discriminatory behaviour, herd behaviour,penalisation of innovative, controversial opinions, vocalminority effect etc.

13 Threats to the underlying networks: e.g. denial of serviceattack.

14 Trust topology threats: e.g.targeting most highly influentialnodes.

15 Threats to ratings: exploiting features of metrics used bythe system to calculate the aggregate reputation rating

RecSys 2011: Tutorial on Recommender Robustness

Page 31: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

The Recommendation Attack Game

An attack has an associated context-dependent cost

real dollar cost if the insertion of a sybil rating requires thepurchase of the corresponding item;time/effort cost;risk cost, associated with the likelihood of being detected;

In any case, we may model the cost as proportional to

The number of sybil profiles created ;the total number of constituent ratings within the sybil profiles.

RecSys 2011: Tutorial on Recommender Robustness

Page 32: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

The Recommendation Attack Game

An attack has an associated context-dependent cost

real dollar cost if the insertion of a sybil rating requires thepurchase of the corresponding item;

time/effort cost;risk cost, associated with the likelihood of being detected;

In any case, we may model the cost as proportional to

The number of sybil profiles created ;the total number of constituent ratings within the sybil profiles.

RecSys 2011: Tutorial on Recommender Robustness

Page 33: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

The Recommendation Attack Game

An attack has an associated context-dependent cost

real dollar cost if the insertion of a sybil rating requires thepurchase of the corresponding item;time/effort cost;

risk cost, associated with the likelihood of being detected;

In any case, we may model the cost as proportional to

The number of sybil profiles created ;the total number of constituent ratings within the sybil profiles.

RecSys 2011: Tutorial on Recommender Robustness

Page 34: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

The Recommendation Attack Game

An attack has an associated context-dependent cost

real dollar cost if the insertion of a sybil rating requires thepurchase of the corresponding item;time/effort cost;risk cost, associated with the likelihood of being detected;

In any case, we may model the cost as proportional to

The number of sybil profiles created ;the total number of constituent ratings within the sybil profiles.

RecSys 2011: Tutorial on Recommender Robustness

Page 35: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

The Recommendation Attack Game

An attack has an associated context-dependent cost

real dollar cost if the insertion of a sybil rating requires thepurchase of the corresponding item;time/effort cost;risk cost, associated with the likelihood of being detected;

In any case, we may model the cost as proportional to

The number of sybil profiles created ;the total number of constituent ratings within the sybil profiles.

RecSys 2011: Tutorial on Recommender Robustness

Page 36: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

The Recommendation Attack Game

An attack has an associated context-dependent cost

real dollar cost if the insertion of a sybil rating requires thepurchase of the corresponding item;time/effort cost;risk cost, associated with the likelihood of being detected;

In any case, we may model the cost as proportional to

The number of sybil profiles created ;

the total number of constituent ratings within the sybil profiles.

RecSys 2011: Tutorial on Recommender Robustness

Page 37: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

The Recommendation Attack Game

An attack has an associated context-dependent cost

real dollar cost if the insertion of a sybil rating requires thepurchase of the corresponding item;time/effort cost;risk cost, associated with the likelihood of being detected;

In any case, we may model the cost as proportional to

The number of sybil profiles created ;the total number of constituent ratings within the sybil profiles.

RecSys 2011: Tutorial on Recommender Robustness

Page 38: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Impact Curve

Figure: Impact curve from Burke et al. (2011)

RecSys 2011: Tutorial on Recommender Robustness

Page 39: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Notation

Consider R to be an n×m database of ratings provided by aset of n genuine users for m items in a system catalogue.

Let G be the set of n genuine users, and letra = (ra,1, . . . , ra,m)T represent the set of ratings provided byuser a for each item.

ra,i ∈ L, some quality label, typically a numerical value oversome discrete range. Write ra,i = ∅ if user a has not rateditem i.

Let A be the set of attack profiles of size nA = number ofprofiles and mA = number of ratings. Let ai denote a singleattack profile 1 ≤ i ≤ nA.

Let R′ be the (n+ nA)×m database of genuine and attackprofiles available to the recommendation algorithmpost-attack.

RecSys 2011: Tutorial on Recommender Robustness

Page 40: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Notation

Consider R to be an n×m database of ratings provided by aset of n genuine users for m items in a system catalogue.

Let G be the set of n genuine users, and letra = (ra,1, . . . , ra,m)T represent the set of ratings provided byuser a for each item.

ra,i ∈ L, some quality label, typically a numerical value oversome discrete range. Write ra,i = ∅ if user a has not rateditem i.

Let A be the set of attack profiles of size nA = number ofprofiles and mA = number of ratings. Let ai denote a singleattack profile 1 ≤ i ≤ nA.

Let R′ be the (n+ nA)×m database of genuine and attackprofiles available to the recommendation algorithmpost-attack.

RecSys 2011: Tutorial on Recommender Robustness

Page 41: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Notation

Consider R to be an n×m database of ratings provided by aset of n genuine users for m items in a system catalogue.

Let G be the set of n genuine users, and letra = (ra,1, . . . , ra,m)T represent the set of ratings provided byuser a for each item.

ra,i ∈ L, some quality label, typically a numerical value oversome discrete range. Write ra,i = ∅ if user a has not rateditem i.

Let A be the set of attack profiles of size nA = number ofprofiles and mA = number of ratings. Let ai denote a singleattack profile 1 ≤ i ≤ nA.

Let R′ be the (n+ nA)×m database of genuine and attackprofiles available to the recommendation algorithmpost-attack.

RecSys 2011: Tutorial on Recommender Robustness

Page 42: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Notation

Consider R to be an n×m database of ratings provided by aset of n genuine users for m items in a system catalogue.

Let G be the set of n genuine users, and letra = (ra,1, . . . , ra,m)T represent the set of ratings provided byuser a for each item.

ra,i ∈ L, some quality label, typically a numerical value oversome discrete range. Write ra,i = ∅ if user a has not rateditem i.

Let A be the set of attack profiles of size nA = number ofprofiles and mA = number of ratings. Let ai denote a singleattack profile 1 ≤ i ≤ nA.

Let R′ be the (n+ nA)×m database of genuine and attackprofiles available to the recommendation algorithmpost-attack.

RecSys 2011: Tutorial on Recommender Robustness

Page 43: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Notation

Let cA = cA(nA,mA) denote the cost of mounting an attack.

The recommendation algorithm may be represented as afunction φ, that uses the rating database R and a user’shistory of previous ratings, ru, to make a set of predictionsp(u, i) = φ(i, R, ru) ∈ L on the set of items.

More generally, φ(.) results in a probability mass function overthe label space L for each predicted item.

Let l(φ(., R′, .), φ(., R, .)) be a loss function representing thequality loss of the recommendation process due to an attack.

This function may depend on the goal of the attack e.g. theoverall loss in accuracy, if the attack seeks to distort generalrecommendation performance,or the shift in ratings for some targeted set of items over sometargeted set of users, in a focused attack.

RecSys 2011: Tutorial on Recommender Robustness

Page 44: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Notation

Let cA = cA(nA,mA) denote the cost of mounting an attack.

The recommendation algorithm may be represented as afunction φ, that uses the rating database R and a user’shistory of previous ratings, ru, to make a set of predictionsp(u, i) = φ(i, R, ru) ∈ L on the set of items.

More generally, φ(.) results in a probability mass function overthe label space L for each predicted item.

Let l(φ(., R′, .), φ(., R, .)) be a loss function representing thequality loss of the recommendation process due to an attack.

This function may depend on the goal of the attack e.g. theoverall loss in accuracy, if the attack seeks to distort generalrecommendation performance,or the shift in ratings for some targeted set of items over sometargeted set of users, in a focused attack.

RecSys 2011: Tutorial on Recommender Robustness

Page 45: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Notation

Let cA = cA(nA,mA) denote the cost of mounting an attack.

The recommendation algorithm may be represented as afunction φ, that uses the rating database R and a user’shistory of previous ratings, ru, to make a set of predictionsp(u, i) = φ(i, R, ru) ∈ L on the set of items.

More generally, φ(.) results in a probability mass function overthe label space L for each predicted item.

Let l(φ(., R′, .), φ(., R, .)) be a loss function representing thequality loss of the recommendation process due to an attack.

This function may depend on the goal of the attack e.g. theoverall loss in accuracy, if the attack seeks to distort generalrecommendation performance,or the shift in ratings for some targeted set of items over sometargeted set of users, in a focused attack.

RecSys 2011: Tutorial on Recommender Robustness

Page 46: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Notation

Let cA = cA(nA,mA) denote the cost of mounting an attack.

The recommendation algorithm may be represented as afunction φ, that uses the rating database R and a user’shistory of previous ratings, ru, to make a set of predictionsp(u, i) = φ(i, R, ru) ∈ L on the set of items.

More generally, φ(.) results in a probability mass function overthe label space L for each predicted item.

Let l(φ(., R′, .), φ(., R, .)) be a loss function representing thequality loss of the recommendation process due to an attack.

This function may depend on the goal of the attack e.g. theoverall loss in accuracy, if the attack seeks to distort generalrecommendation performance,or the shift in ratings for some targeted set of items over sometargeted set of users, in a focused attack.

RecSys 2011: Tutorial on Recommender Robustness

Page 47: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Notation

Let cA = cA(nA,mA) denote the cost of mounting an attack.

The recommendation algorithm may be represented as afunction φ, that uses the rating database R and a user’shistory of previous ratings, ru, to make a set of predictionsp(u, i) = φ(i, R, ru) ∈ L on the set of items.

More generally, φ(.) results in a probability mass function overthe label space L for each predicted item.

Let l(φ(., R′, .), φ(., R, .)) be a loss function representing thequality loss of the recommendation process due to an attack.

This function may depend on the goal of the attack e.g. theoverall loss in accuracy, if the attack seeks to distort generalrecommendation performance,

or the shift in ratings for some targeted set of items over sometargeted set of users, in a focused attack.

RecSys 2011: Tutorial on Recommender Robustness

Page 48: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

Notation

Let cA = cA(nA,mA) denote the cost of mounting an attack.

The recommendation algorithm may be represented as afunction φ, that uses the rating database R and a user’shistory of previous ratings, ru, to make a set of predictionsp(u, i) = φ(i, R, ru) ∈ L on the set of items.

More generally, φ(.) results in a probability mass function overthe label space L for each predicted item.

Let l(φ(., R′, .), φ(., R, .)) be a loss function representing thequality loss of the recommendation process due to an attack.

This function may depend on the goal of the attack e.g. theoverall loss in accuracy, if the attack seeks to distort generalrecommendation performance,or the shift in ratings for some targeted set of items over sometargeted set of users, in a focused attack.

RecSys 2011: Tutorial on Recommender Robustness

Page 49: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

The Recommender Attack Game

Then the manipulation game, from the attacker’s point ofview is to choose the attack A∗ that maximises the loss for agiven cost bound cmax.

Attack Goal

A∗ = arg max{A|cA≤cmax}minφ l(φ(R′), φ(R))

While the system designer strives to find a recommendationalgorithm that militates against attack

Defense Goal

φ∗ = arg minφ max{A|cA≤cmax} l(φ(R′), φ(R))

RecSys 2011: Tutorial on Recommender Robustness

Page 50: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Profile Injection Attacks

The Recommender Attack Game

Then the manipulation game, from the attacker’s point ofview is to choose the attack A∗ that maximises the loss for agiven cost bound cmax.

Attack Goal

A∗ = arg max{A|cA≤cmax}minφ l(φ(R′), φ(R))

While the system designer strives to find a recommendationalgorithm that militates against attack

Defense Goal

φ∗ = arg minφ max{A|cA≤cmax} l(φ(R′), φ(R))

RecSys 2011: Tutorial on Recommender Robustness

Page 51: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Relevance of Robustness

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 52: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Relevance of Robustness

Our Original Motivation

Inspired by Work in Digital Watermarking . . .

RecSys 2011: Tutorial on Recommender Robustness

Page 53: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Relevance of Robustness

Our Original Motivation

Inspired by Work in Digital Watermarking . . .

RecSys 2011: Tutorial on Recommender Robustness

Page 54: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Relevance of Robustness

How Realistic Is this Scenario?

news.bbc.co.uk/2/hi/4741259.stm (2001) “ ... ads for films including Hollow Man and A Knight’s Tale quoted

praise from a reviewer called David Manning, who was exposed as being invented ...”

http://tinyurl.com/3d6d969 (Sept. 2003) “AuctionBytes conducted a reader survey to find out how serious

these problems were. According to the survey, 39% of respondents felt that feedback retaliation was a very big

problem on eBay. Nineteen percent of respondents had received retaliatory feedback within the previous 6

months, and 16% had been a victim of feedback extortion within the previous 6 months.”

http://tinyurl.com/n3d5lp (June 2009) “Elsevier officials said Monday that it was a mistake for the publishing

giant’s marketing division to offer $25 Amazon gift cards to anyone who would give a new textbook five stars in a

review posted on Amazon or Barnes & Noble. While those popular Web sites’ customer reviews have long been

known to be something less than scientific, and prone to manipulation if an author has friends write on behalf of

a new work, the idea that a major academic publisher would attempt to pay for good reviews angered some

professors who received the e-mail pitch..”

RecSys 2011: Tutorial on Recommender Robustness

Page 55: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Relevance of Robustness

How Realistic Is this Scenario?

news.bbc.co.uk/2/hi/4741259.stm (2001) “ ... ads for films including Hollow Man and A Knight’s Tale quoted

praise from a reviewer called David Manning, who was exposed as being invented ...”

http://tinyurl.com/3d6d969 (Sept. 2003) “AuctionBytes conducted a reader survey to find out how serious

these problems were. According to the survey, 39% of respondents felt that feedback retaliation was a very big

problem on eBay. Nineteen percent of respondents had received retaliatory feedback within the previous 6

months, and 16% had been a victim of feedback extortion within the previous 6 months.”

http://tinyurl.com/n3d5lp (June 2009) “Elsevier officials said Monday that it was a mistake for the publishing

giant’s marketing division to offer $25 Amazon gift cards to anyone who would give a new textbook five stars in a

review posted on Amazon or Barnes & Noble. While those popular Web sites’ customer reviews have long been

known to be something less than scientific, and prone to manipulation if an author has friends write on behalf of

a new work, the idea that a major academic publisher would attempt to pay for good reviews angered some

professors who received the e-mail pitch..”

RecSys 2011: Tutorial on Recommender Robustness

Page 56: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Relevance of Robustness

How Realistic Is this Scenario?

news.bbc.co.uk/2/hi/4741259.stm (2001) “ ... ads for films including Hollow Man and A Knight’s Tale quoted

praise from a reviewer called David Manning, who was exposed as being invented ...”

http://tinyurl.com/3d6d969 (Sept. 2003) “AuctionBytes conducted a reader survey to find out how serious

these problems were. According to the survey, 39% of respondents felt that feedback retaliation was a very big

problem on eBay. Nineteen percent of respondents had received retaliatory feedback within the previous 6

months, and 16% had been a victim of feedback extortion within the previous 6 months.”

http://tinyurl.com/n3d5lp (June 2009) “Elsevier officials said Monday that it was a mistake for the publishing

giant’s marketing division to offer $25 Amazon gift cards to anyone who would give a new textbook five stars in a

review posted on Amazon or Barnes & Noble. While those popular Web sites’ customer reviews have long been

known to be something less than scientific, and prone to manipulation if an author has friends write on behalf of

a new work, the idea that a major academic publisher would attempt to pay for good reviews angered some

professors who received the e-mail pitch..”

RecSys 2011: Tutorial on Recommender Robustness

Page 57: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Relevance of Robustness

How Realistic Is this Scenario?

http://tinyurl.com/cfmqce (2009) Paul Lamere cites an example of “precision hacking” of a Time Poll.

http://tinyurl.com/mupy7d (2009) Hotel review manipulation

Lang et al. (2010) Social Manipulation in Buzznet “...Hey, I know you don’t know me, but could you do me a

huge fav and vote for me in this contest? All you have to do is buzz me...”

RecSys 2011: Tutorial on Recommender Robustness

Page 58: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Relevance of Robustness

How Realistic Is this Scenario?

http://tinyurl.com/cfmqce (2009) Paul Lamere cites an example of “precision hacking” of a Time Poll.

http://tinyurl.com/mupy7d (2009) Hotel review manipulation

Lang et al. (2010) Social Manipulation in Buzznet “...Hey, I know you don’t know me, but could you do me a

huge fav and vote for me in this contest? All you have to do is buzz me...”

RecSys 2011: Tutorial on Recommender Robustness

Page 59: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Relevance of Robustness

How Realistic Is this Scenario?

http://tinyurl.com/cfmqce (2009) Paul Lamere cites an example of “precision hacking” of a Time Poll.

http://tinyurl.com/mupy7d (2009) Hotel review manipulation

Lang et al. (2010) Social Manipulation in Buzznet “...Hey, I know you don’t know me, but could you do me a

huge fav and vote for me in this contest? All you have to do is buzz me...”

RecSys 2011: Tutorial on Recommender Robustness

Page 60: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Relevance of Robustness

How Realistic Is this Scenario?

http://tinyurl.com/cfmqce (2009) Paul Lamere cites an example of “precision hacking” of a Time Poll.

http://tinyurl.com/mupy7d (2009) Hotel review manipulation

Lang et al. (2010) Social Manipulation in Buzznet “...Hey, I know you don’t know me, but could you do me a

huge fav and vote for me in this contest? All you have to do is buzz me...”

All examples of types of manipulation attacks onrecommender systems.

No hard evidence of automated shilling attacks by sybilinsertion bots.

RecSys 2011: Tutorial on Recommender Robustness

Page 61: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Measuring Robustness

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 62: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Measuring Robustness

Simple Measures of Attack Impact

Average Prediction Shift

The change in an item’s predicted rating before and after attack,averaged over all predictions or over predictions that are targettedby the attack.

pshift(u, i) = φ(i,R′, ru)− φ(i,R, ru)

Average Hit Ratio

The average likelihood over tested users that a top-Nrecommendation will recommend an item that is the target of anattack. For each such item i and each tested user, u, in a test setof t users, let h(u, i) = 1 if i ∈ Ru, the recommended set. Then,H(i) = 1

t

∑u∈U h(u, i)

RecSys 2011: Tutorial on Recommender Robustness

Page 63: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Measuring Robustness

Simple Measures of Attack Impact

Average Prediction Shift

The change in an item’s predicted rating before and after attack,averaged over all predictions or over predictions that are targettedby the attack.

pshift(u, i) = φ(i,R′, ru)− φ(i,R, ru)

Average Hit Ratio

The average likelihood over tested users that a top-Nrecommendation will recommend an item that is the target of anattack. For each such item i and each tested user, u, in a test setof t users, let h(u, i) = 1 if i ∈ Ru, the recommended set. Then,H(i) = 1

t

∑u∈U h(u, i)

RecSys 2011: Tutorial on Recommender Robustness

Page 64: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Measuring Robustness

Other Measures of Attack Impact

Considering φ(i,R, ru) as a pmf over L, attack impact can bemeasured in terms of the change in this distribution as aresult of the attack.

For instance (Yan and Roy 2009), measure impact in terms ofthe average Kullback-Liebler distance between φ(i,R, ru) andφ(i,R′, ru) over the set of inspected items.

Resnick and Sami (2007) propose loss functions L(l, q) wherel ∈ L, q = φ(i,R, ru) where the true label of item i is l

e.g. the quadratic loss over a two rating scale [HI,LO], withq the probability of HI is given by

L(HI, q) = (1− q)2; L(LO, q) = q2

RecSys 2011: Tutorial on Recommender Robustness

Page 65: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

What is Robustness?

Measuring Robustness

Other Measures of Attack Impact

Considering φ(i,R, ru) as a pmf over L, attack impact can bemeasured in terms of the change in this distribution as aresult of the attack.

For instance (Yan and Roy 2009), measure impact in terms ofthe average Kullback-Liebler distance between φ(i,R, ru) andφ(i,R′, ru) over the set of inspected items.

Resnick and Sami (2007) propose loss functions L(l, q) wherel ∈ L, q = φ(i,R, ru) where the true label of item i is l

e.g. the quadratic loss over a two rating scale [HI,LO], withq the probability of HI is given by

L(HI, q) = (1− q)2; L(LO, q) = q2

RecSys 2011: Tutorial on Recommender Robustness

Page 66: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 67: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Forming Attack Profiles

(Mobasher et al. 2007) introduce the following notation forthe components of an attack profile:

RecSys 2011: Tutorial on Recommender Robustness

Page 68: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Forming Attack Profiles

(Mobasher et al. 2007) introduce the following notation forthe components of an attack profile:

IT , the target item(s) receive typically the maximum (resp.minimum) rating for a push (resp. nuke) attack.

RecSys 2011: Tutorial on Recommender Robustness

Page 69: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Forming Attack Profiles

(Mobasher et al. 2007) introduce the following notation forthe components of an attack profile:

IS , the selected item(s) are chosen and rated in a manner tosupport the attack.

RecSys 2011: Tutorial on Recommender Robustness

Page 70: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Forming Attack Profiles

(Mobasher et al. 2007) introduce the following notation forthe components of an attack profile:

IF , the filler items(s) fill out the remainder of the ratings inthe attack profile.

RecSys 2011: Tutorial on Recommender Robustness

Page 71: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Forming Attack Profiles

The goal of attack profile creation must be to

Effectively support the purpose of the attack – selection oftarget rating and IS ;

But, remain unobtrusive so as to avoid detection and filtering– selection of the filler items IF .

Must also consider what information is available to theattacker

Typically assume some knowledge of the statistics of the ratingdatabase is available;May also have knowledge of the recommendation algorithm –an informed attack.

Note Kerkchoff’s principal – avoid “security throughobscurity”.

RecSys 2011: Tutorial on Recommender Robustness

Page 72: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Forming Attack Profiles

The goal of attack profile creation must be to

Effectively support the purpose of the attack – selection oftarget rating and IS ;But, remain unobtrusive so as to avoid detection and filtering– selection of the filler items IF .

Must also consider what information is available to theattacker

Typically assume some knowledge of the statistics of the ratingdatabase is available;May also have knowledge of the recommendation algorithm –an informed attack.

Note Kerkchoff’s principal – avoid “security throughobscurity”.

RecSys 2011: Tutorial on Recommender Robustness

Page 73: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Forming Attack Profiles

The goal of attack profile creation must be to

Effectively support the purpose of the attack – selection oftarget rating and IS ;But, remain unobtrusive so as to avoid detection and filtering– selection of the filler items IF .

Must also consider what information is available to theattacker

Typically assume some knowledge of the statistics of the ratingdatabase is available;May also have knowledge of the recommendation algorithm –an informed attack.

Note Kerkchoff’s principal – avoid “security throughobscurity”.

RecSys 2011: Tutorial on Recommender Robustness

Page 74: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Forming Attack Profiles

The goal of attack profile creation must be to

Effectively support the purpose of the attack – selection oftarget rating and IS ;But, remain unobtrusive so as to avoid detection and filtering– selection of the filler items IF .

Must also consider what information is available to theattacker

Typically assume some knowledge of the statistics of the ratingdatabase is available;

May also have knowledge of the recommendation algorithm –an informed attack.

Note Kerkchoff’s principal – avoid “security throughobscurity”.

RecSys 2011: Tutorial on Recommender Robustness

Page 75: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Forming Attack Profiles

The goal of attack profile creation must be to

Effectively support the purpose of the attack – selection oftarget rating and IS ;But, remain unobtrusive so as to avoid detection and filtering– selection of the filler items IF .

Must also consider what information is available to theattacker

Typically assume some knowledge of the statistics of the ratingdatabase is available;May also have knowledge of the recommendation algorithm –an informed attack.

Note Kerkchoff’s principal – avoid “security throughobscurity”.

RecSys 2011: Tutorial on Recommender Robustness

Page 76: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Forming Attack Profiles

The goal of attack profile creation must be to

Effectively support the purpose of the attack – selection oftarget rating and IS ;But, remain unobtrusive so as to avoid detection and filtering– selection of the filler items IF .

Must also consider what information is available to theattacker

Typically assume some knowledge of the statistics of the ratingdatabase is available;May also have knowledge of the recommendation algorithm –an informed attack.

Note Kerkchoff’s principal – avoid “security throughobscurity”.

RecSys 2011: Tutorial on Recommender Robustness

Page 77: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 78: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

User-based kNN Attack

The first CF profile insertion attack (O’Mahony et al. 2002),was an informed attack that exploited a particular weakness inthe basic version of Resnick’s user-based kNN algorithm.

User-based CF predicts a rating pa,j for item j, user a asfollows:

– Form a neighbourhood by picking the top-k most similar usersto a

– Pearson Correlation

wa,i =P

j(ra,j−ra)(ri,j−ri)√Pj∈Na∩Ni

(ra,j−ra)2P

j(ri,j−ri)2

– Make a prediction by taking a weighted average of

neighbours ratings using: pa,j = ra + κ∑n

i=1 wa,i(ri,j − ri)where κ is a normalising factor

RecSys 2011: Tutorial on Recommender Robustness

Page 79: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

User-based kNN Attack

The first CF profile insertion attack (O’Mahony et al. 2002),was an informed attack that exploited a particular weakness inthe basic version of Resnick’s user-based kNN algorithm.

User-based CF predicts a rating pa,j for item j, user a asfollows:

– Form a neighbourhood by picking the top-k most similar usersto a

– Pearson Correlation

wa,i =P

j(ra,j−ra)(ri,j−ri)√Pj∈Na∩Ni

(ra,j−ra)2P

j(ri,j−ri)2

– Make a prediction by taking a weighted average of

neighbours ratings using: pa,j = ra + κ∑n

i=1 wa,i(ri,j − ri)where κ is a normalising factor

RecSys 2011: Tutorial on Recommender Robustness

Page 80: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

User-based kNN Attack

The first CF profile insertion attack (O’Mahony et al. 2002),was an informed attack that exploited a particular weakness inthe basic version of Resnick’s user-based kNN algorithm.

User-based CF predicts a rating pa,j for item j, user a asfollows:

– Form a neighbourhood by picking the top-k most similar usersto a

– Pearson Correlation

wa,i =P

j(ra,j−ra)(ri,j−ri)√Pj∈Na∩Ni

(ra,j−ra)2P

j(ri,j−ri)2

– Make a prediction by taking a weighted average of

neighbours ratings using: pa,j = ra + κ∑n

i=1 wa,i(ri,j − ri)where κ is a normalising factor

RecSys 2011: Tutorial on Recommender Robustness

Page 81: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

User-based kNN Attack

Correlation calculated over the items which the target userand attack profile have in common. A small intersection setcan lead to high correlations.

Therefore a small profile of popular items will have a largecorrelation with users who have rated these items

Implies a low-cost, effective attack on a large proportion ofthe userbase.

However, not at all unobtrusive.

RecSys 2011: Tutorial on Recommender Robustness

Page 82: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

User-based kNN Attack

Correlation calculated over the items which the target userand attack profile have in common. A small intersection setcan lead to high correlations.

Therefore a small profile of popular items will have a largecorrelation with users who have rated these items

Implies a low-cost, effective attack on a large proportion ofthe userbase.

However, not at all unobtrusive.

RecSys 2011: Tutorial on Recommender Robustness

Page 83: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

User-based kNN Attack

Correlation calculated over the items which the target userand attack profile have in common. A small intersection setcan lead to high correlations.

Therefore a small profile of popular items will have a largecorrelation with users who have rated these items

Implies a low-cost, effective attack on a large proportion ofthe userbase.

However, not at all unobtrusive.

RecSys 2011: Tutorial on Recommender Robustness

Page 84: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

User-based kNN Attack

Correlation calculated over the items which the target userand attack profile have in common. A small intersection setcan lead to high correlations.

Therefore a small profile of popular items will have a largecorrelation with users who have rated these items

Implies a low-cost, effective attack on a large proportion ofthe userbase.

However, not at all unobtrusive.

RecSys 2011: Tutorial on Recommender Robustness

Page 85: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

Other Attacks Strategies

Many other attack variants proposed by (Lam and Riedl 2004) and(Mobasher et al. 2007). Assuming a push attack – a target item isgiven the maximum rating and the attack profile is filled out asfollows:

Random Attack Randomly chosen filler items get randomlydrawn rating values.

Average Attack Randomly chosen filler items. Ratings drawnfrom normal distribution with item means set tothose of rating database.

Probe Attack Filler items filled by starting with a set of seedratings, querying the recommender system to fillthe ratings of the remaining items.

RecSys 2011: Tutorial on Recommender Robustness

Page 86: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

Other Attacks Strategies

Many other attack variants proposed by (Lam and Riedl 2004) and(Mobasher et al. 2007). Assuming a push attack – a target item isgiven the maximum rating and the attack profile is filled out asfollows:

Random Attack Randomly chosen filler items get randomlydrawn rating values.

Average Attack Randomly chosen filler items. Ratings drawnfrom normal distribution with item means set tothose of rating database.

Probe Attack Filler items filled by starting with a set of seedratings, querying the recommender system to fillthe ratings of the remaining items.

RecSys 2011: Tutorial on Recommender Robustness

Page 87: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

Other Attacks Strategies

Many other attack variants proposed by (Lam and Riedl 2004) and(Mobasher et al. 2007). Assuming a push attack – a target item isgiven the maximum rating and the attack profile is filled out asfollows:

Random Attack Randomly chosen filler items get randomlydrawn rating values.

Average Attack Randomly chosen filler items. Ratings drawnfrom normal distribution with item means set tothose of rating database.

Probe Attack Filler items filled by starting with a set of seedratings, querying the recommender system to fillthe ratings of the remaining items.

RecSys 2011: Tutorial on Recommender Robustness

Page 88: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

Other Attacks Strategies

Many other attack variants proposed by (Lam and Riedl 2004) and(Mobasher et al. 2007). Assuming a push attack – a target item isgiven the maximum rating and the attack profile is filled out asfollows:

Segment Attack Identifying popular items in a particular usersegment, these are given maximum ratingand remaining filler items are given minimumrating.

Bandwagon Attack A set of popular items is given the maximumrating, while remaining filler items are givenrandom ratings.

RecSys 2011: Tutorial on Recommender Robustness

Page 89: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

Other Attacks Strategies

Many other attack variants proposed by (Lam and Riedl 2004) and(Mobasher et al. 2007). Assuming a push attack – a target item isgiven the maximum rating and the attack profile is filled out asfollows:

Segment Attack Identifying popular items in a particular usersegment, these are given maximum ratingand remaining filler items are given minimumrating.

Bandwagon Attack A set of popular items is given the maximumrating, while remaining filler items are givenrandom ratings.

RecSys 2011: Tutorial on Recommender Robustness

Page 90: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

Evaluation User-based kNNToward Trustworthy Recommender Systems • 23:21

Fig. 8. Comparison of attacks against user-based algorithm, prediction shift (left) and hit ratio(right). The baseline in the right panel indicates the hit ratio results prior to attack.

target item. The filler size, in this case, is the proportion of the remaining itemsthat are assigned random ratings based on the overall data distribution acrossthe whole database (see Figure 4). In the case of the MovieLens data, thesefrequently-rated items are predictable box office successes including such titlesas Star Wars, Return of Jedi, Titanic, etc. The attack profiles consist of highratings given to these popular titles in conjunction with high ratings for thepushed movie. Figure 7 shows the effect of filler size on the effectiveness ofthis attack. In this particular experiment, we selected only a single popularmovie as the “bandwagon” movie with which to associate the target and usedan attack size of 10% (i.e., the number of attack profiles were about 10% of thesize of the original database).

For the bandwagon attack, the best results were obtained by using a 3%filler size (set IF containing ratings for approximately 3% of the movies in thedatabase). This number seems to correspond closely to the average number ofmovies per user in the database. For subsequent experiments with this attackmodel we used five “bandwagon” movies. The five movies chosen were those withthe most ratings in the database. Obviously, this is a form of system-specificknowledge, increasing the knowledge requirements of this attack somewhat.However, we verified the general popularity of these movies using externaldata sources2,3 and found they would be among anyone’s list of movies likelyto have been seen by many viewers.

Figure 8 shows the results of a comparative experiment examining threealgorithms at different attack sizes. The algorithms included the average attack(3% filler size), the bandwagon attack (using one frequently rated item and3% filler size), and the random attack (6% filler size). These parameters werechosen pessimistically as they are the versions of each attack that were foundto be most effective. We see that even without system-specific data an attacklike the bandwagon attack can be successful at higher attack levels. The moreknowledge-intensive average attack was still better, with the best performance

2http://www.the-numbers.com/movies/records/inflation.html.3http://www.imdb.com/boxoffice/alltimegross.

ACM Transactions on Internet Technology, Vol. 7, No. 4, Article 23, Publication date: October 2007.

Results from (Mobasher et al. 2007) on the Movielens 100K dataset. User-based kNN algorithm withk = 20. All users who have rated at least 20 items.

Attack Size given as a percentage of the total number of users in the dataset on the x-axis. Results shownfor different filler sizes, written as a percentage of the total number of items.

Bandwagon attack uses a single popular item and 3% filler size.

Baseline refers to hit-ratio pre-attack

RecSys 2011: Tutorial on Recommender Robustness

Page 91: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

Evaluation Item-based kNN

Results from (Mobasher et al. 2007) on the Movielens 100K dataset. Item-based kNN algorithm withk = 20.

RecSys 2011: Tutorial on Recommender Robustness

Page 92: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Strategies

Attacking kNN Algorithms

Evaluation Item-based kNNToward Trustworthy Recommender Systems • 23:23

Fig. 10. Prediction shift and hit ratio results for the horror movie segment in the item-basedalgorithm.

the target item for use as the segment portion of the attack profile IS . In therealm of movies, we might imagine selecting movies of a similar genre or moviescontaining the same actors.

If we evaluate the segmented attack based on its average impact on all users,there is nothing remarkable. The attack has an effect but does not approach thenumbers reached by the average attack. However, we must recall our marketsegment assumption: namely, that recommendations made to in-segment usersare much more useful to the attacker than recommendations to other users. Ourfocus must therefore be with the “in-segment” users, those users who have ratedthe segment movies highly and presumably are desirable customers for pusheditems that are similar: an attacker using the Horror segment would presumablybe interested in pushing a new movie of this type.

To build our segmented attack profiles, we identified the user segment asall users who had given above average scores (4 or 5) to any three of the fiveselected horror movies, namely, Alien, Psycho, The Shining, Jaws, and TheBirds.4 For this set of five movies, we then selected all combinations of threemovies that had at least 50 users, support, and chose 50 of those users randomlyand averaged the results.

The power of the segmented attack is emphasized in Figure 10, in which theimpact of the attack is compared within the targeted user segment and withinthe set of all users. The left panel in the figure shows the comparison in termsof prediction shift and varying attack sizes, while the right panel depicts thehit ratio at 1% attack.5 While the segmented attack does show some impactagainst the system as a whole, it truly succeeded in its mission: to push theattacked movie precisely to those users defined by the segment. Indeed, in thecase of in-segment users, the hit ratio was much higher than average attack.The chart also depicts the effect of hit ratio before any attack. Clearly thesegmented attack had a bigger impact than any other attack we have previouslyexamined against item-based algorithm. Our prediction shift results show thatthe segmented attack was more effective against in-segment users than even

4The list was generated from online sources of the popular horror films: http://www.imdb.

com/chart/horror and http://www.filmsite.org/afi100thrillers1.html.5Note that our previous hit ratio figures have used 10% attack size. Here we see comparableperformance with 1/10 of the number of attack profiles.

ACM Transactions on Internet Technology, Vol. 7, No. 4, Article 23, Publication date: October 2007.

Results from (Mobasher et al. 2007) on the Movielens 100K dataset. Item-based kNN algorithm withk = 20.

Prediction shift and hit ratio results for the Horror Movie Segment.

RecSys 2011: Tutorial on Recommender Robustness

Page 93: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 94: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Profile Injection Attacks

Two well-studied attacks :

Construct spam profile with max (or min) rating for targeteditem.Choose a set of filler items at randomRandom Attack – insert ratings in filler items according toN (µ, σ)Average Attack – insert ratings in filler item i according to

N (µi, σi)

For evaluations, genuine profiles are drawn from 1,000,000rating Movielens dataset.

RecSys 2011: Tutorial on Recommender Robustness

Page 95: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Detection Flow

RecSys 2011: Tutorial on Recommender Robustness

Page 96: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

PCA-based Attack Detection

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 97: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

PCA-based Attack Detection

PCA Detector (Mehta et al. 2007)

A method of identifying and removing a cluster ofhighly-correlated attack profiles.

A cluster C defined by an indicator vector x such that xi = 1if user ui ∈ C and xi = 0 otherwise.S a profile similarity matrix, with eigenvectors/values ei, λi

Quadratic form

xT Sx =∑

i∈C,j∈C

S(i, j) =m∑

i=1

(x.ei)2λi .

Find x that correlates most with first few eigenvectors of S.

RecSys 2011: Tutorial on Recommender Robustness

Page 98: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

PCA-based Attack Detection

PCA Detector (Mehta et al. 2007)

Success of PCA depends on how S is calculated.S = Z0, covariance of profiles ignoring missing values.

RecSys 2011: Tutorial on Recommender Robustness

Page 99: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

PCA-based Attack Detection

PCA Detector (Mehta et al. 2007)

Success of PCA depends on how S is calculated.S = Z1, covariance of profiles treating missing values as 0.

RecSys 2011: Tutorial on Recommender Robustness

Page 100: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

PCA-based Attack Detection

PCA Detector (Mehta et al. 2007)

Success of PCA depends on how S is calculated.Small Overlap → low genuine/attack covariance.

RecSys 2011: Tutorial on Recommender Robustness

Page 101: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 102: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Neyman-Pearson Statistical Detection

y ∈ Y be an observation i.e. a user profile over all possibleprofiles YTwo hypotheses

H0 – that y is a genuine profile, associated pdf fY|H1(y)

H1 – that y is an attack profile, associated pdf fY|H0(y)

N-P criterion sets a bound α on false alarm probability pf andmaximises the good detection probability pD

ψ∗(y) ={H1 if l(y) > ηH0 if l(y) < η

where l(y) =fY|H1

(y)

fY|H0(y)

(likelihood ratio)

RecSys 2011: Tutorial on Recommender Robustness

Page 103: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Modelling Attack Profiles

As attacks follow well-defined construction, easy to constructmodel:

m∏i=1

Pr[Yi = yi]

=m∏i=1

Pr[Yi = φ]θi (Pr[Yi = yi|Yi 6= φ]Pr[Yi 6= φ])1−θi

Which items are rated?

RecSys 2011: Tutorial on Recommender Robustness

Page 104: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Modelling Attack Profiles

As attacks follow well-defined construction, easy to constructmodel:

m∏i=1

Pr[Yi = yi]

=m∏i=1

Pr[Yi = φ]θi (Pr[Yi = yi|Yi 6= φ]Pr[Yi 6= φ])1−θi

What ratings used?

RecSys 2011: Tutorial on Recommender Robustness

Page 105: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Modelling Attack Profiles

As attacks follow well-defined construction, easy to constructmodel:

m∏i=1

Pr[Yi = yi]

=m∏i=1

Pr[Yi = φ]θi

(Q

(yi − 1

2 − µiσi

)−Q

(yi + 1

2 − µiσi

))1−θi

Q(.) = Gaussian Q-function

RecSys 2011: Tutorial on Recommender Robustness

Page 106: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Modelling Genuine Profiles

Simple model – identical to attack model except thatprobability of selecting a filler item is estimated aspi , Pr[Yi = φ] from a dataset of genuine profiles.

m∏i=1

Pr[Yi = yi]

=m∏i=1

piθi (Pr[Yi = yi|Yi 6= φ](1− pi))1−θi

Not a realistic model of genuine ratings – assumes user’sratings are all independent

But sufficient (almost) to distinguish from attack profiles.

RecSys 2011: Tutorial on Recommender Robustness

Page 107: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Random Attack (Filler Size=5%)

RecSys 2011: Tutorial on Recommender Robustness

Page 108: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Average Attack (Filler Size=5%)

RecSys 2011: Tutorial on Recommender Robustness

Page 109: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Lesson Learned

Filler item selection is key to the success of the standardattacks on k-NN user-based algorithm.

Low (attack profile / genuine profile) overlap (few commonratings) makes extreme correlations possible – hence attackprofiles are unusually influential. (Basis of original attackproposed in O’Mahony et al. (2002).)But also allows for successful detection – also highlyperceptible.

RecSys 2011: Tutorial on Recommender Robustness

Page 110: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Attack Obfuscation

Several obfuscation strategies evaluated previously (Williamset al. 2006).

Effective obfuscation must try to reduce the statisticaldifferences between genuine and attack profiles.

RecSys 2011: Tutorial on Recommender Robustness

Page 111: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Average over Popular (AoP) attack

It is clear that a major weakness of the average and randomattacks is their unrealistic selection of filler items.

To be imperceptible an attacker must choose items to rate ina similar fashion to genuine users.

Average Over Popular Attack – identical to average

attack, but filler items are chosen from x-% most popularitems.

AoP is a less perceptible but also less effective attack onkNN user-based algorithm.

Nevertheless it does work . . .

RecSys 2011: Tutorial on Recommender Robustness

Page 112: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

AoP Prediction Shift (Attack Size=3%)

RecSys 2011: Tutorial on Recommender Robustness

Page 113: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

AoP Hits (rating of ≥ 4) (Attack Size=3%)

RecSys 2011: Tutorial on Recommender Robustness

Page 114: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

AoP PCA Detection

RecSys 2011: Tutorial on Recommender Robustness

Page 115: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Improved Genuine Profile Model

Adopt model that takes account of correlations betweenratings.

Assume ratings follow multivariate normal distribution –parameters:

µ0, m-dimensional vector of mean-ratings for each item; andΣ0, m×m matrix of item correlations.

Assume attack profiles also multivariate gaussian withparameters µ1, Σ1

Hence, attacked database can be modelled as a GaussianMixture Model.

RecSys 2011: Tutorial on Recommender Robustness

Page 116: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Improved Genuine Profile Model

DifficultyVery high dimensional vectorsMissing values

Factor Model : Assume that the rating matrix can berepresented by the linear model

YT = AXT + N ,

X : n× k matrix, xu,j = extent user u likes category jA : m× k matrix ai,j = extent item i belongs in category jN : independent noiseX(i, :) assumed independent normal.

Expectation maximisation to learn A from dataset ([Canny,2002]).

RecSys 2011: Tutorial on Recommender Robustness

Page 117: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Improved Genuine Profile Model

N-P test on multivariate normal k-dimensional vectorobtained by projection with A

w = ATYT = AT (AXT + N)

X : n× k matrix, xu,j = extent user u likes category jA : m× k matrix ai,j = extent item i belongs in category jN : independent noiseX(i, :) assumed independent normal.

RecSys 2011: Tutorial on Recommender Robustness

Page 118: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Projection by Clustering

N-P test on multivariate normal k-dimensional vectorobtained by projection with P

w = PTYT

Obtain a clustering of item-set into k clusters of similar items.P : n× k projection matrix sums all ratings belonging to acluster.

RecSys 2011: Tutorial on Recommender Robustness

Page 119: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Supervised AoP Detection

N-P test reduces to

(w−µw0)TΣw0−1(w−µw0)−(w−µw1)TΣw1

−1(w−µw1) ≶ η

Need

Database of genuine profiles andDatabase of attack profiles

to train the detector.

RecSys 2011: Tutorial on Recommender Robustness

Page 120: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

AoP Detection (Factor Analysis)

Actual attack=20% AoP Attack. Plot shows training ondifferent attack types.

RecSys 2011: Tutorial on Recommender Robustness

Page 121: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

AoP Detection (Clustering)

Actual attack=20% AoP Attack. Plot shows training ondifferent attack types.

RecSys 2011: Tutorial on Recommender Robustness

Page 122: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

Unsupervised AoP Detection

Unsupervised Gaussian mixture model to learn modelparameters

µw0, Σw0, µw1, Σw1, a0 and a1, where ai is the probabilitythat a profile belongs in cluster i.

Implementation from William Wong, Purdue University,http://web.ics.purdue.edu/~wong17/.

RecSys 2011: Tutorial on Recommender Robustness

Page 123: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Statistical Attack Detection

AoP Detection

RecSys 2011: Tutorial on Recommender Robustness

Page 124: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Cost-Benefit Analysis

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 125: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Cost-Benefit Analysis

Cost-Benefit Analysis

In O’Mahony et al. (2006), we examined a simple cost-benefitmodel

The ROI for an attack on item j given by:

ROIj =cjN(n′j−nj) −

Pk∈Ij

ckPk∈Ij

ck

Simplify by assuming cp = cq, ∀ p, q ∈ IROIj = N(n′j−nj) − sj

sj

sj – total number of ratings inserted in an attack on item j

Approximated the fractions nj & n′j using count of goodpredictions and browser-to-buyer conversion probability

RecSys 2011: Tutorial on Recommender Robustness

Page 126: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Cost-Benefit Analysis

Results

Profitj = c(αj(GP′j − GPj)− sj

); c = $10, α = 10%

0 2000 4000 6000 8000 10000−200

0

200

400

600

N

Pro

fit (

$)

r2 = 0.9894

RecSys 2011: Tutorial on Recommender Robustness

Page 127: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Cost-Benefit Analysis

Cost-benefit Analysis

Vu et al. (2010) examines the adversarial cost to attacking aranking system in terms of nA = number of profiles andmA=number of ratings.

They assume a rating function r(u, i) ∈ {−1, 0, 1} and aquality-popularity score for each item

f(i) =∑u∈G

r(u, i)

RecSys 2011: Tutorial on Recommender Robustness

Page 128: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Detection

Cost-Benefit Analysis

Cost-benefit Analysis

Using a trust mechanism that can detect a malicious ratingwith probability γ, they show that a ranking function of theform

f(i) =∑

u∈G∪Ar(u, i)t(u, i)

can be used to design a ranking system in which the minimumadversarial cost in expectation to boost the rank of an itemfrom k to 1 includes the cost of posting mA=nA ratings, with

nA = (x1 + xk)1− 2ε+ εγ

1− γ

where xi denotes the number of honest ratings for item i andε is the probability of an erroneous rating.

RecSys 2011: Tutorial on Recommender Robustness

Page 129: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 130: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Model-based Algorithms

Attack takes effect when model is re-trained, using corruptedratings database.

RecSys 2011: Tutorial on Recommender Robustness

Page 131: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Model-based Algorithms

Attack takes effect when model is re-trained, using corruptedratings database.

RecSys 2011: Tutorial on Recommender Robustness

Page 132: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Model-based Algorithms

Attack takes effect when model is re-trained, using corruptedratings database.

RecSys 2011: Tutorial on Recommender Robustness

Page 133: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Model-based Algorithms

Attack takes effect when model is re-trained, using corruptedratings database.

RecSys 2011: Tutorial on Recommender Robustness

Page 134: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Model-based Algorithms

Attack takes effect when model is re-trained, using corruptedratings database.

RecSys 2011: Tutorial on Recommender Robustness

Page 135: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Model-based Algorithms

Attack takes effect when model is re-trained, using corruptedratings database.

RecSys 2011: Tutorial on Recommender Robustness

Page 136: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Model-based Algorithms

Attack takes effect when model is re-trained, using corruptedratings database.

RecSys 2011: Tutorial on Recommender Robustness

Page 137: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Manipulation-resistance of Model-based Algorithms

Initial work such as Mobasher et al. (2006) showed thatmodel-based algorithms are more resistant to manipulationthan memory-based algorithms.

RecSys 2011: Tutorial on Recommender Robustness

Page 138: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Manipulation-resistance of Model-based Algorithms

Initial work such as Mobasher et al. (2006) showed thatmodel-based algorithms are more resistant to manipulationthan memory-based algorithms.

RecSys 2011: Tutorial on Recommender Robustness

Page 139: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Manipulation-resistance of Model-based Algorithms

Initial work such as Mobasher et al. (2006) showed thatmodel-based algorithms are more resistant to manipulationthan memory-based algorithms.

The user-base is clustered into segments using k-means orPLSA clustering and users matched to closest segmentprofiles.

RecSys 2011: Tutorial on Recommender Robustness

Page 140: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Manipulation-resistance of Model-based Algorithms

Initial work such as Mobasher et al. (2006) showed thatmodel-based algorithms are more resistant to manipulationthan memory-based algorithms.

Sybil profiles tend to be clustered into same segment, therebyreducing power of attack.

RecSys 2011: Tutorial on Recommender Robustness

Page 141: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Manipulation-resistance of Model-based Algorithms

Initial work such as Mobasher et al. (2006) showed thatmodel-based algorithms are more resistant to manipulationthan memory-based algorithms.

However, as shown in Cheng and Hurley (2009a), a diversifiedattack can create less similar but yet still effective attackprofiles.

RecSys 2011: Tutorial on Recommender Robustness

Page 142: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Manipulation-resistance of Model-based Algorithms

From Mobasher et al. (2006): Evaluation on MovieLens 100Kdataset

RecSys 2011: Tutorial on Recommender Robustness

Page 143: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Manipulation-resistance of Model-based Algorithms

From Cheng and Hurley (2009a): Evaluation on MovieLens 1Mdataset

RecSys 2011: Tutorial on Recommender Robustness

Page 144: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Manipulation-resistance of Matrix Factorisation

Modern matrix factorization algorithms use a least squaresstep to find the factors. This is known to be sensitive tooutliers.

RecSys 2011: Tutorial on Recommender Robustness

Page 145: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Manipulation-resistance of Matrix Factorisation

Cheng and Hurley (2010) Prediction shift of basic Bellkoralgorithm kNN for a single randomly chosen unpopular item.

RecSys 2011: Tutorial on Recommender Robustness

Page 146: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Summary of Robustness Results on Standard Algorithms

A number of general attack strategies have been proposedthat work effectively in particular on memory-basedalgorithms.

Standard attacks are generally detectable:Cost effectiveness implies that a sybil profile should beunusually influential.

Special purpose (informed) attacks can be tailored towardsparticular CF algorithms

e.g. a high diversity attack proved effective against thek-means clustering algorithm.

Sybil profiles can be obfuscated to make them less detectableSelection of filler items is Achille’s heal of standard averageattack, but also explains its power.Obfuscation reduces the effectiveness of attacks butmemory-based algorithms are still vulnerable.

RecSys 2011: Tutorial on Recommender Robustness

Page 147: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Summary of Robustness Results on Standard Algorithms

A number of general attack strategies have been proposedthat work effectively in particular on memory-basedalgorithms.

Standard attacks are generally detectable:

Cost effectiveness implies that a sybil profile should beunusually influential.

Special purpose (informed) attacks can be tailored towardsparticular CF algorithms

e.g. a high diversity attack proved effective against thek-means clustering algorithm.

Sybil profiles can be obfuscated to make them less detectableSelection of filler items is Achille’s heal of standard averageattack, but also explains its power.Obfuscation reduces the effectiveness of attacks butmemory-based algorithms are still vulnerable.

RecSys 2011: Tutorial on Recommender Robustness

Page 148: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Robustness of Model-based Algorithms

Summary of Robustness Results on Standard Algorithms

A number of general attack strategies have been proposedthat work effectively in particular on memory-basedalgorithms.

Standard attacks are generally detectable:Cost effectiveness implies that a sybil profile should beunusually influential.

Special purpose (informed) attacks can be tailored towardsparticular CF algorithms

e.g. a high diversity attack proved effective against thek-means clustering algorithm.

Sybil profiles can be obfuscated to make them less detectableSelection of filler items is Achille’s heal of standard averageattack, but also explains its power.Obfuscation reduces the effectiveness of attacks butmemory-based algorithms are still vulnerable.

RecSys 2011: Tutorial on Recommender Robustness

Page 149: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 150: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

VarSelectSVD

Some early work (O’Mahony et al. 2004) appliedneighbourhood filtering to the standard user-based algorithm,to cluster and filter suspicious neighbours.

Mehta and Nejdl (2008) introduces the the VarSelectSVDmethod – a matrix factorisation strategy taking robustnessinto account.

An SVD factorization of the rating matrix R requires thefollowing to be solved:

arg minG,H‖R−GH‖

Users are flagged as suspicious using PCA clustering.Flagged users do not contribute to the update of righteigenvector – they do not contribute to the model.

RecSys 2011: Tutorial on Recommender Robustness

Page 151: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

VarSelectSVD

Some early work (O’Mahony et al. 2004) appliedneighbourhood filtering to the standard user-based algorithm,to cluster and filter suspicious neighbours.

Mehta and Nejdl (2008) introduces the the VarSelectSVDmethod – a matrix factorisation strategy taking robustnessinto account.

An SVD factorization of the rating matrix R requires thefollowing to be solved:

arg minG,H‖R−GH‖

Users are flagged as suspicious using PCA clustering.Flagged users do not contribute to the update of righteigenvector – they do not contribute to the model.

RecSys 2011: Tutorial on Recommender Robustness

Page 152: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

VarSelectSVD

Some early work (O’Mahony et al. 2004) appliedneighbourhood filtering to the standard user-based algorithm,to cluster and filter suspicious neighbours.

Mehta and Nejdl (2008) introduces the the VarSelectSVDmethod – a matrix factorisation strategy taking robustnessinto account.

An SVD factorization of the rating matrix R requires thefollowing to be solved:

arg minG,H‖R−GH‖

Users are flagged as suspicious using PCA clustering.Flagged users do not contribute to the update of righteigenvector – they do not contribute to the model.

RecSys 2011: Tutorial on Recommender Robustness

Page 153: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

VarSelectSVD

Some early work (O’Mahony et al. 2004) appliedneighbourhood filtering to the standard user-based algorithm,to cluster and filter suspicious neighbours.

Mehta and Nejdl (2008) introduces the the VarSelectSVDmethod – a matrix factorisation strategy taking robustnessinto account.

An SVD factorization of the rating matrix R requires thefollowing to be solved:

arg minG,H‖R−GH‖

Users are flagged as suspicious using PCA clustering.

Flagged users do not contribute to the update of righteigenvector – they do not contribute to the model.

RecSys 2011: Tutorial on Recommender Robustness

Page 154: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

VarSelectSVD

Some early work (O’Mahony et al. 2004) appliedneighbourhood filtering to the standard user-based algorithm,to cluster and filter suspicious neighbours.

Mehta and Nejdl (2008) introduces the the VarSelectSVDmethod – a matrix factorisation strategy taking robustnessinto account.

An SVD factorization of the rating matrix R requires thefollowing to be solved:

arg minG,H‖R−GH‖

Users are flagged as suspicious using PCA clustering.Flagged users do not contribute to the update of righteigenvector – they do not contribute to the model.

RecSys 2011: Tutorial on Recommender Robustness

Page 155: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

VarSelectSVD

RecSys 2011: Tutorial on Recommender Robustness

Page 156: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

VarSelectSVD Results

MAE, Prediction Shift and Hit Ratio for 5% Average Attack,7% Filler on Movielens 1M

RecSys 2011: Tutorial on Recommender Robustness

Page 157: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Manipulation Resistance Through Robust Statistics

Robust statistics describes an alternative approach to classicalstatistics where the motivation is to produce estimators thatare not unduly affected by small departures from the model.

Robust regression uses a bounded cost function which limitsthe effect of outliers.

Two Approaches

RecSys 2011: Tutorial on Recommender Robustness

Page 158: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Manipulation Resistance Through Robust Statistics

Robust statistics describes an alternative approach to classicalstatistics where the motivation is to produce estimators thatare not unduly affected by small departures from the model.

Robust regression uses a bounded cost function which limitsthe effect of outliers.

Two Approaches

1 M-estimators –

arg minG,H

∑rij 6=0

ρ(rij−gihj) where ρ(r) ={ 1

2γ r2 |r| ≤ γ

|r| − γ2 |r| > γ

RecSys 2011: Tutorial on Recommender Robustness

Page 159: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Manipulation Resistance Through Robust Statistics

Robust statistics describes an alternative approach to classicalstatistics where the motivation is to produce estimators thatare not unduly affected by small departures from the model.

Robust regression uses a bounded cost function which limitsthe effect of outliers.

Two Approaches

2 Least Trimmed Squares –

arg minG,H

h∑i=1

e2(i) where e2(1) ≤ e2(2) ≤ · · · ≤ e

2(n)

RecSys 2011: Tutorial on Recommender Robustness

Page 160: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Robust Statistics Results on Bellkor Method

RecSys 2011: Tutorial on Recommender Robustness

Page 161: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Provably Manipulation Resistant Algorithms

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 162: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Provably Manipulation Resistant Algorithms

The Influence Limiter (Resnick and Sami 2007)

Output of recommendation systemqj passes through aninfluence-limiting process toproduce a modified output qj .

qj is a weighted average of qj−1

and qj

The weighting depends on thereputation that user j hasaccumulated with respect to thetarget.

RecSys 2011: Tutorial on Recommender Robustness

Page 163: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Provably Manipulation Resistant Algorithms

The Influence Limiter (Resnick and Sami 2007)

A scoring function assignsreputation to j based on whetheror not the target actually likes theitem.

Tuned so that honest users canreach full credibility after O(log n)steps.

The main result states that thetotal impact of n sybils in terms ofperformance reduction is bounded

by −neλ .

RecSys 2011: Tutorial on Recommender Robustness

Page 164: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Provably Manipulation Resistant Algorithms

Yan and Roy (2009)’s Manipulation Robust Algorithm

A class of CF algorithms – which the authors’ call linear CFalgorithms – is presented in which the manipulationdistortion is bounded above by

1n

11−r .

where r is the fraction of the data that is generated bymanipulators and n is the number of products that havealready been rated by a user whose future ratings are to bepredicted.

“Suppose a CF system that accepts binary ratings predictsfuture ratings correctly 80% of the time in the absence ofmanipulation. If 10% of all ratings are provided bymanipulators, according to our bound, the system can maintaina 75% rate of correct predictions by requiring each new user torate at least 21 products before receiving recommendations.”

RecSys 2011: Tutorial on Recommender Robustness

Page 165: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Provably Manipulation Resistant Algorithms

Yan and Roy (2009)’s Manipulation Robust Algorithm

Let r ∈ L be a rating in the label space.

Let ν be a permutation of the items such that νn is the nth

item rated by an active user, a.

Let rn−1a be the active user’s profile after rating n− 1 items.

Then a CF algorithm may be described as

A PMF over the rating that the active user gives for the νn

item.Depending on the active user’s current profilethe training set, Rthe order in which the user rates items.

RecSys 2011: Tutorial on Recommender Robustness

Page 166: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Provably Manipulation Resistant Algorithms

Yan and Roy (2009)’s Manipulation Robust Algorithm

Let r ∈ L be a rating in the label space.

Let ν be a permutation of the items such that νn is the nth

item rated by an active user, a.

Let rn−1a be the active user’s profile after rating n− 1 items.

Then a CF algorithm may be described as

P (ra,νn = r|rn−1a ,R,ν)

A PMF over the rating that the active user gives for the νn

item.Depending on the active user’s current profilethe training set, Rthe order in which the user rates items.

RecSys 2011: Tutorial on Recommender Robustness

Page 167: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Provably Manipulation Resistant Algorithms

Yan and Roy (2009)’s Manipulation Robust Algorithm

Let r ∈ L be a rating in the label space.

Let ν be a permutation of the items such that νn is the nth

item rated by an active user, a.

Let rn−1a be the active user’s profile after rating n− 1 items.

Then a CF algorithm may be described as

P (ra,νn = r|rn−1a ,R,ν)

A PMF over the rating that the active user gives for the νn

item.

Depending on the active user’s current profilethe training set, Rthe order in which the user rates items.

RecSys 2011: Tutorial on Recommender Robustness

Page 168: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Provably Manipulation Resistant Algorithms

Yan and Roy (2009)’s Manipulation Robust Algorithm

Let r ∈ L be a rating in the label space.

Let ν be a permutation of the items such that νn is the nth

item rated by an active user, a.

Let rn−1a be the active user’s profile after rating n− 1 items.

Then a CF algorithm may be described as

P (ra,νn = r|rn−1a ,R,ν)

A PMF over the rating that the active user gives for the νn

item.

Depending on the active user’s current profile

the training set, Rthe order in which the user rates items.

RecSys 2011: Tutorial on Recommender Robustness

Page 169: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Provably Manipulation Resistant Algorithms

Yan and Roy (2009)’s Manipulation Robust Algorithm

Let r ∈ L be a rating in the label space.

Let ν be a permutation of the items such that νn is the nth

item rated by an active user, a.

Let rn−1a be the active user’s profile after rating n− 1 items.

Then a CF algorithm may be described as

P (ra,νn = r|rn−1a ,R,ν)

A PMF over the rating that the active user gives for the νn

item.Depending on the active user’s current profile

the training set, R

the order in which the user rates items.

RecSys 2011: Tutorial on Recommender Robustness

Page 170: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Provably Manipulation Resistant Algorithms

Yan and Roy (2009)’s Manipulation Robust Algorithm

Let r ∈ L be a rating in the label space.

Let ν be a permutation of the items such that νn is the nth

item rated by an active user, a.

Let rn−1a be the active user’s profile after rating n− 1 items.

Then a CF algorithm may be described as

P (ra,νn = r|rn−1a ,R,ν)

A PMF over the rating that the active user gives for the νn

item.Depending on the active user’s current profilethe training set, R

the order in which the user rates items.

RecSys 2011: Tutorial on Recommender Robustness

Page 171: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Provably Manipulation Resistant Algorithms

Yan and Roy (2009)’s Manipulation Robust Algorithm

A probabilistic CF algorithm predicts independently of theorder ν.

Let R′ = (R,Y) be a database divided in proportion (1− r)to r between R and Y.

Then a linear probabilistic CF algorithm is one in which

P (.|rn−1a , (R,Y)) = (1− r)P (.|rn−1

a ,R) + rP (.|rn−1a ,Y)

As the active user rates more and more products, his ratingswill tend to be distinguished as sampled from P (.|rn−1

a ,R)and hence the influence of P (.|rn−1

a ,Y) diminishes as n grows.

RecSys 2011: Tutorial on Recommender Robustness

Page 172: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Attack Resistant Recommendation Algorithms

Provably Manipulation Resistant Algorithms

Yan and Roy (2009)’s Manipulation Robust Algorithm

The kNN algorithm is not linear and does not satisfy thebound

The Naive Bayes (NB) algorithm is an asymptotically linearCF algorithm.

Kernel Density Estimation (KDE) is a linear CF algorithm.

RecSys 2011: Tutorial on Recommender Robustness

Page 173: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

Outline

1 What is Robustness?Profile Injection AttacksRelevance of RobustnessMeasuring Robustness

2 Attack StrategiesAttacking kNN Algorithms

3 Attack DetectionPCA-based Attack DetectionStatistical Attack DetectionCost-Benefit Analysis

4 Robustness of Model-based Algorithms

5 Attack Resistant Recommendation AlgorithmsProvably Manipulation Resistant Algorithms

6 Stability, Trust and PrivacyRecSys 2011: Tutorial on Recommender Robustness

Page 174: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

Robustness and Related Concepts

Stability refers to a recommender system’s ability to provideconsistent recommendations

in the face of noise in the dataset.using different training sample;over time, if new ratings ‘agree’ with past ratings

Adomavicius and Zhang (2010), explores stability from thepoint of view of adding a CF algorithm’s predictions to thedataset as new ratings, and measuring the prediction shiftthat is incurred.

RecSys 2011: Tutorial on Recommender Robustness

Page 175: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

Robustness and Related Concepts

Stability refers to a recommender system’s ability to provideconsistent recommendations

in the face of noise in the dataset.

using different training sample;over time, if new ratings ‘agree’ with past ratings

Adomavicius and Zhang (2010), explores stability from thepoint of view of adding a CF algorithm’s predictions to thedataset as new ratings, and measuring the prediction shiftthat is incurred.

RecSys 2011: Tutorial on Recommender Robustness

Page 176: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

Robustness and Related Concepts

Stability refers to a recommender system’s ability to provideconsistent recommendations

in the face of noise in the dataset.using different training sample;

over time, if new ratings ‘agree’ with past ratings

Adomavicius and Zhang (2010), explores stability from thepoint of view of adding a CF algorithm’s predictions to thedataset as new ratings, and measuring the prediction shiftthat is incurred.

RecSys 2011: Tutorial on Recommender Robustness

Page 177: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

Robustness and Related Concepts

Stability refers to a recommender system’s ability to provideconsistent recommendations

in the face of noise in the dataset.using different training sample;over time, if new ratings ‘agree’ with past ratings

Adomavicius and Zhang (2010), explores stability from thepoint of view of adding a CF algorithm’s predictions to thedataset as new ratings, and measuring the prediction shiftthat is incurred.

RecSys 2011: Tutorial on Recommender Robustness

Page 178: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

Robustness and Related Concepts

Amatriain et al. (2009) carries out a user study of test re-testreliability and shows that inconsistencies negatively impact thequality of the predictions.

RecSys 2011: Tutorial on Recommender Robustness

Page 179: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

Robustness and Related Concepts

Trust has been widely explored in collaborative filtering

. . . avoid manipulation by only using rating of trusted users orbuild trust based on user behaviour (c.f. the Influence Limiter). . . nevertheless systems based on trust are still manipulatable– tinyurl.com/6z6k6kr two fictitious Facebook userssuccessfully made friends with 95 users within two weeks.

RecSys 2011: Tutorial on Recommender Robustness

Page 180: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

Robustness and Related Concepts

Trust has been widely explored in collaborative filtering

. . . avoid manipulation by only using rating of trusted users orbuild trust based on user behaviour (c.f. the Influence Limiter)

. . . nevertheless systems based on trust are still manipulatable– tinyurl.com/6z6k6kr two fictitious Facebook userssuccessfully made friends with 95 users within two weeks.

RecSys 2011: Tutorial on Recommender Robustness

Page 181: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

Robustness and Related Concepts

Trust has been widely explored in collaborative filtering

. . . avoid manipulation by only using rating of trusted users orbuild trust based on user behaviour (c.f. the Influence Limiter). . . nevertheless systems based on trust are still manipulatable– tinyurl.com/6z6k6kr two fictitious Facebook userssuccessfully made friends with 95 users within two weeks.

RecSys 2011: Tutorial on Recommender Robustness

Page 182: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

Robustness and Related Concepts

Privacy is an important security concern from thepoint-of-view of the end-user

In Cheng and Hurley (2009b), we demonstrate that a systemarchitecture to support privacy can provide new opportunitiesfor manipulation attacks.Differential privacy has been studied in the context ofrecommender systems by a number of researchers. Oneapproach to differential privacy is the use of robust statistics.The connection to manipulation resistance may be worthpursuing.

RecSys 2011: Tutorial on Recommender Robustness

Page 183: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

Robustness and Related Concepts

Privacy is an important security concern from thepoint-of-view of the end-user

In Cheng and Hurley (2009b), we demonstrate that a systemarchitecture to support privacy can provide new opportunitiesfor manipulation attacks.

Differential privacy has been studied in the context ofrecommender systems by a number of researchers. Oneapproach to differential privacy is the use of robust statistics.The connection to manipulation resistance may be worthpursuing.

RecSys 2011: Tutorial on Recommender Robustness

Page 184: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

Robustness and Related Concepts

Privacy is an important security concern from thepoint-of-view of the end-user

In Cheng and Hurley (2009b), we demonstrate that a systemarchitecture to support privacy can provide new opportunitiesfor manipulation attacks.Differential privacy has been studied in the context ofrecommender systems by a number of researchers. Oneapproach to differential privacy is the use of robust statistics.The connection to manipulation resistance may be worthpursuing.

RecSys 2011: Tutorial on Recommender Robustness

Page 185: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

More complicated shilling scenarios

We have focused in this tutorial on rating manipulation – thecreation of sybil profiles and ratings that distort arecommendation system’s output.

However, in the real world, shilling can have a morecomplicated form.

e.g. The text of hotel reviews can be used to persuade users toselect certain hotels.Wu et al. (2010) has carried out work on identifying suspiciousreviews in TripAdvisor.

RecSys 2011: Tutorial on Recommender Robustness

Page 186: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

Conclusion

We’ve reviewed research that has been carried out in lastnumber of years on robustness of RS.The conclusions are quite positive from system managers’points-of-view:

If desired, recommendation algorithms that are largelymanipulation-resistant may be adopted.Filtering strategies can effectively find unusual rating patterns.Obfuscating attack profiles to avoid filtering generally resultsin less effective attacks.

Good recommendation systems are personalised and hence,should be sensitive to the peculiarities of each user’s ratingbehaviour.

A system designer needs to find the right balance betweensensitivity to the melting pot of human behaviour andavoiding easy manipulation.

RecSys 2011: Tutorial on Recommender Robustness

Page 187: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

Thank You

My research is sponsored by Science Foundation Ireland undergrant 08/SRC/I1407: Clique: Graph and Network Analysis Cluster

RecSys 2011: Tutorial on Recommender Robustness

Page 188: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

References I

Adomavicius, G. and Zhang, J.: 2010, On the stability ofrecommendation algorithms, Proceedings of the fourth ACMconference on Recommender systems, RecSys ’10, ACM, NewYork, NY, USA, pp. 47–54.URL: http://doi.acm.org/10.1145/1864708.1864722

Amatriain, X., Pujol, J. M. and Oliver, N.: 2009, I like it . . . I likeit not: Evaluating user ratings noise in recommender systems,Proceedings of the 17th International Conference on UserModeling, Adaptation, and Personalization: formerly UM andAH, UMAP ’09, Springer-Verlag, Berlin, Heidelberg,pp. 247–258.URL: http://dx.doi.org/10.1007/978-3-642-02247-0 24

RecSys 2011: Tutorial on Recommender Robustness

Page 189: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

References II

Burke, R., O’Mahony, M. P. and Hurley, N. J.: 2011, Robustcollaborative recommendation, in F. Ricci, L. Rokach, B. Shapiraand P. B. Kantor (eds), Recommender Systems Handbook,Springer, pp. 805–835.

Carrara, E. and Hogben, G.: 2007, Enisa position paper no.2,reputation-based systems: a security analysis, Technical report,ENISA.

Cheng, Z. and Hurley, N.: 2009a, Effective diverse and obfuscatedattacks on model-based recommender systems, in L. D.Bergman, A. Tuzhilin, R. D. Burke, A. Felfernig andL. Schmidt-Thieme (eds), RecSys, ACM, pp. 141–148.

RecSys 2011: Tutorial on Recommender Robustness

Page 190: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

References III

Cheng, Z. and Hurley, N.: 2009b, Trading robustness for privacy indecentralized recommender systems, in K. Z. Haigh andN. Rychtyckyj (eds), IAAI, AAAI.

Cheng, Z. and Hurley, N.: 2010, Robust collaborative filtering byleast trimmed squares matrix factorisation, Proceedings of theInternational Conference on Tools in Artificial Intelligence.

Donath, J.: 1998, Communities in Cyberspace, Routledge, chapterIdentity and Deception in the Virtual Community.

Douceur, J.: 2002, The sybil attack, Proceedings of the FirstInternational Workshop on Peer-to-Peer Systems.

Lam, S. K. and Riedl, J.: 2004, Shilling recommender systems forfun and profit, In Proceedings of the 13th International WorldWide Web Conference pp. 393–402.

RecSys 2011: Tutorial on Recommender Robustness

Page 191: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

References IV

Lang, J., Spear, M. and Wu, S. F.: 2010, Social manipulation ofonline recommender systems, Proceedings of the Secondinternational conference on Social informatics, SocInfo’10,Springer-Verlag, Berlin, Heidelberg, pp. 125–139.URL: http://dl.acm.org/citation.cfm?id=1929326.1929336

Mehta, B., Hofmann, T. and Fankhauser, P.: 2007, Lies andpropaganda: Detecting spam users in collaborative filtering,Proceedings of the 12th international conference on Intelligentuser interfaces, pp. 14–21.

Mehta, B. and Nejdl, W.: 2008, Attack resistant collaborativefiltering, Proceedings of the 31st annual international ACMSIGIR conference on Research and development in informationretrieval, SIGIR ’08, ACM, New York, NY, USA, pp. 75–82.URL: http://doi.acm.org/10.1145/1390334.1390350

RecSys 2011: Tutorial on Recommender Robustness

Page 192: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

References V

Mobasher, B., Burke, R., Bhaumik, R. and Williams, C.: 2007,Toward trustworthy recommender systems: An analysis of attackmodels and algorithm robustness, ACM Transactions on InternetTechnology 7(4).

Mobasher, B., Burke, R. D. and Sandvig, J. J.: 2006, Model-basedcollaborative filtering as a defense against profile injectionattacks, AAAI, AAAI Press.

O’Mahony, M. P., Hurley, N. J. and Silvestre, C. C. M.: 2004, Anevaluation of neighbourhood formation on the performance ofcollaborative filtering, Artificial Intelligence Review21(1), 215–228.

RecSys 2011: Tutorial on Recommender Robustness

Page 193: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

References VI

O’Mahony, M. P., Hurley, N. J. and Silvestre, G. C. M.: 2002,Promoting recommendations: An attack on collaborativefiltering, in A. Hameurlain, R. Cicchetti and R. Traunmuller(eds), DEXA, Vol. 2453 of Lecture Notes in Computer Science,Springer, pp. 494–503.

O’Mahony, M. P., Hurley, N. J. and Silvestre, G. C. M.: 2006,Attacking recommender systems: The cost of promotion,Workshop on Recommender Systems at the 17th EuropeanConference on Artificial Intelligence (ECAI’06), 28th–29th, Rivadel Garda, Italy.

RecSys 2011: Tutorial on Recommender Robustness

Page 194: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

References VII

Resnick, P. and Sami, R.: 2007, The influence limiter: provablymanipulation-resistant recommender systems, RecSys ’07:Proceedings of the 2007 ACM conference on Recommendersystems, ACM, New York, NY, USA, pp. 25–32.

Vu, L.-H., Papaioannou, T. and Aberer, K.: 2010, Impact of trustmanagement and information sharing to adversarial cost inranking systems, in M. Nishigaki, A. Jsang, Y. Murayama andS. Marsh (eds), Trust Management IV, Vol. 321 of IFIPAdvances in Information and Communication Technology,Springer Boston, pp. 108–124.

RecSys 2011: Tutorial on Recommender Robustness

Page 195: Tutorial on Robustness of Recommender Systems

Tutorial on Robustness of Recommender Systems

Stability, Trust and Privacy

References VIII

Williams, C., Mobasher, B., Burke, R., Bhaumik, R. and Sandvig,J.: 2006, Detection of obfuscated attacks in collaborativerecommender systems, In Proceedings of the 17th EuropeanConference on Artificial Intelligence (ECAI’06) .

Wu, G., Greene, D. and Cunningham, P.: 2010, Merging multiplecriteria to identify suspicious reviews, in X. Amatriain,M. Torrens, P. Resnick and M. Zanker (eds), RecSys, ACM,pp. 241–244.

Yan, X. and Roy, B. V.: 2009, Manipulation robustness ofcollaborative filtering systems, CoRR abs/0903.0064.

RecSys 2011: Tutorial on Recommender Robustness


Recommended