Shebuti Rayana*, Leman Akoglu
SDM 2016, May 6, 2016, Miami, Florida, USA

Collective Opinion Spam Detection using Active Inference



How do consumers learn about product quality?

Advertisements

Consumer review websites (Yelp, TripAdvisor etc.)

Impact of consumer reviews on sales?

A +1 star-rating increase raises revenue by 5–9% (Harvard Business School study by M. Luca, "Reviews, Reputation, and Revenue: The Case of Yelp.com")


Paid/biased reviewers write fake reviews

to unjustly promote or demote products or businesses

The problem? Humans are only slightly better than chance at spotting them (Ott et al. 2011, "Finding Deceptive Opinion Spam by Any Stretch of the Imagination")


Online Review System

[Diagram: a review network plus metadata; a spammer writes a fake review for a target product; an Oracle (e.g., a human) provides labels under a budget.]


Method         | Review Network | Review Text | Review Behavior | Active Inference
---------------|----------------|-------------|-----------------|-----------------
Ott'2011       |                |      ✓      |                 |
Mukherjee'2013 |                |      ✓      |        ✓        |
Jindal'2008    |                |             |        ✓        |
Wang'2011      |       ✓        |             |        ✓        |
FraudEagle     |       ✓        |             |                 |
SpEagle        |       ✓        |      ✓      |        ✓        |
SpEagle+       |       ✓        |      ✓      |        ✓        |
SpEagle+EUCR   |       ✓        |      ✓      |        ✓        |        ✓


A network classification problem

Given:
• a user–review–product network (tri-partite): U —writes→ R —belongs-to→ P
• features extracted from metadata (i.e., text, behavior) for users, reviews, and products
• an Oracle (e.g., a human annotator, or Yelp.com)
• a budget B

Select queries and classify all network objects into type-specific classes:
• Users: 'benign' vs. 'spammer'
• Products: 'non-target' vs. 'target'
• Reviews: 'genuine' vs. 'fake'


Objective

• Wisely select "valuable" nodes
• Find a metric to quantify the "value" of a node
• Achieve improved performance over random selection


SpEagle: a collective classification approach (unsupervised)

The objective function uses pairwise Markov Random Fields; exact inference is NP-hard, so Loopy Belief Propagation (LBP) is used for approximate inference.

1) Repeat for each node i, sending a message to each neighbor j:

   m_{i→j}(y_j) = α Σ_{y_i} φ_i(y_i) ψ_{ij}(y_i, y_j) Π_{k ∈ N(i)\{j}} m_{k→i}(y_i)

2) At convergence, compute the belief of node i:

   b_i(y_i) = β φ_i(y_i) Π_{j ∈ N(i)} m_{j→i}(y_i)

where φ_i is the prior of node i, ψ_{ij} is the compatibility potential for the edge type, N(i) is the neighbor set of i, and α, β are normalization constants.
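The message-passing scheme above can be sketched in a few lines. This is a minimal illustration on a generic two-class pairwise MRF with a single shared toy potential, not the paper's actual implementation:

```python
import numpy as np

def lbp(edges, priors, psi, n_iter=50):
    """Loopy Belief Propagation on an undirected pairwise MRF.

    edges:  list of (i, j) node pairs
    priors: dict node -> prior phi_i (length-2 array)
    psi:    2x2 compatibility potential (shared by all edges here)
    """
    # one message per directed edge, initialized uniform
    msgs = {(i, j): np.full(2, 0.5)
            for a, b in edges for i, j in [(a, b), (b, a)]}
    nbrs = {}
    for a, b in edges:
        nbrs.setdefault(a, []).append(b)
        nbrs.setdefault(b, []).append(a)

    for _ in range(n_iter):
        new = {}
        for (i, j) in msgs:
            # product of prior and incoming messages, excluding j's
            h = priors[i].copy()
            for k in nbrs[i]:
                if k != j:
                    h *= msgs[(k, i)]
            m = psi.T @ h              # sum over y_i of phi*psi*msgs
            new[(i, j)] = m / m.sum()  # alpha: normalize
        msgs = new

    beliefs = {}
    for i in nbrs:
        b = priors[i].copy()
        for k in nbrs[i]:
            b *= msgs[(k, i)]
        beliefs[i] = b / b.sum()       # beta: normalize
    return beliefs
```

With a homophilic potential, a node with an informative prior pulls its uninformative neighbors toward its own class, which is the intuition behind propagating spam scores over the review network.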


Compatibility potential ψ: encodes how likely the classes of neighboring nodes are to co-occur across each edge type.

Prior φ: a spam score computed from metadata features [Rayana et al. KDD2015]: (i) review text, and (ii) behavior (timestamp, rating).

Classes — Users: 'benign' vs. 'spammer'; Products: 'non-target' vs. 'target'; Reviews: 'genuine' vs. 'fake'.


SpEagle can incorporate labels seamlessly (SpEagle+): it can use user, review, and/or product labels.

For labeled nodes, priors are set to:

• φ ← {ϵ, 1 − ϵ} for the spam category (i.e., fake, spammer, or target)
• φ ← {1 − ϵ, ϵ} for the non-spam category (i.e., genuine, benign, or non-target)

In SpEagle+, nodes are chosen randomly for labeling.
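A minimal sketch of the prior assignment for Oracle-labeled nodes, assuming the ordering (non-spam, spam) and an illustrative ϵ value (the paper's actual ϵ may differ):

```python
EPS = 0.1  # illustrative epsilon, not necessarily the paper's value

def labeled_prior(label):
    """Prior (P[non-spam], P[spam]) for a node labeled by the Oracle."""
    if label in ("fake", "spammer", "target"):   # spam category
        return (EPS, 1 - EPS)
    else:                                        # genuine / benign / non-target
        return (1 - EPS, EPS)
```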


Settings

• The existing inference model poses queries to select "valuable" nodes
• Selected nodes are labeled by an Oracle (e.g., a human)
• Labels are utilized at inference time

Objective

• Minimize labeling cost
• Maximize classification performance
▪ i.e., achieve higher accuracy within a budget


Objective: find "islands" of uncertainty.

EUCR incorporates three important characteristics of a valuable node:

i. Self-uncertainty of the node
ii. Density of the region it belongs to
iii. Proximity to other uncertain nodes

Step 1 – (i) + (ii): Calculate the Weighted UnCertainty (WUC) score,

   WUC(x) = w_x · H(x),  with  H(x) = − Σ_i b_x(y_i) log b_x(y_i)

where b_x(y_i) is the belief that node x belongs to class y_i, and the weight normalizes the user degree of review node x:

   w_x = (UD_x − minUD) / (maxUD − minUD)

• UD_x = user degree of review node x (i.e., the degree of the user who wrote x)
• minUD = minimum degree of a user node
• maxUD = maximum degree of a user node

Step 2 – Find the set S of the top k nodes having the highest WUC score.
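Step 1 can be sketched as follows, assuming the weight is the min–max normalized degree of the review's author (that form is suggested by the UD definitions above, but the exact normalization is an assumption):

```python
import math

def wuc(belief, user_degree, min_ud, max_ud):
    """Weighted UnCertainty score of a review node.

    belief:      class distribution b_x over {genuine, fake}
    user_degree: degree UD_x of the user who wrote review x
    """
    # (i) self-uncertainty: entropy of the belief
    entropy = -sum(p * math.log(p) for p in belief if p > 0)
    # (ii) region density: min-max normalized user degree (assumed form)
    weight = (user_degree - min_ud) / (max_ud - min_ud)
    return weight * entropy
```

Step 2 is then a top-k selection over all review nodes by this score.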


Step 3 – (iii): Proximity of node x ∈ S to node j, ∀j ∈ R, calculated using the Random Walk with Restart (RWR) probability,

   r_x = c · W · r_x + (1 − c) · e_x

where c = 0.85, W is the column-normalized adjacency matrix of G_R (the review–review network), and e_x is the indicator vector with entry 1 at x and 0 otherwise. The proximity of x to j is then the stationary probability r_x(j).

Step 4 – Select the most valuable node x ∈ S as the one with the largest uncertainty-weighted proximity to the other uncertain nodes.
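The RWR probabilities can be obtained by power iteration. A minimal sketch on a toy adjacency matrix (assumes no isolated nodes, so every column can be normalized; construction of the review–review graph itself is not shown):

```python
import numpy as np

def rwr(A, x, c=0.85, n_iter=100):
    """RWR proximity vector r_x for restart node x.

    A: adjacency matrix of the review-review network G_R
    Iterates r = c*W*r + (1-c)*e_x with W column-normalized.
    """
    W = A / A.sum(axis=0)            # column-normalize
    e = np.zeros(A.shape[0])
    e[x] = 1.0                       # restart distribution e_x
    r = e.copy()
    for _ in range(n_iter):
        r = c * (W @ r) + (1 - c) * e
    return r
```

Because W is column-stochastic, r stays a probability distribution; entries r_x(j) serve as the proximity of x to each node j.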


Uncertainty Sampling (US):

Valuable node – the data instance with the highest uncertainty

Metric – entropy of the final beliefs, H(x) = − Σ_i b_x(y_i) log b_x(y_i)

Query-by-Committee (QBC):

The committee consists of multiple members

Valuable node – the instance on which the committee members disagree most

Metric – average Kullback–Leibler (KL) divergence (soft voting (SV))
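The two metrics can be sketched as follows (natural log assumed; the KL term assumes strictly positive beliefs):

```python
import numpy as np

def entropy(b):
    """US metric: entropy of a node's final belief distribution."""
    b = np.asarray(b, dtype=float)
    nz = b[b > 0]
    return float(-(nz * np.log(nz)).sum())

def qbc_sv(member_beliefs):
    """QBC soft-vote metric: mean KL divergence of each committee
    member's belief from the committee's mean (consensus) belief."""
    B = np.asarray(member_beliefs, dtype=float)  # shape |C| x 2
    consensus = B.mean(axis=0)
    # assumes strictly positive entries in B and consensus
    kl = (B * np.log(B / consensus)).sum(axis=1)
    return float(kl.mean())
```

A unanimous committee yields zero disagreement, while members with opposite beliefs yield a large QBC-SV score.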


Build a committee C = {θ^(1), …, θ^(|C|)} with review-feature bagging: 4 out of 16 features are randomly selected without replacement for each member.

Calculate disagreement using the average KL divergence [McCallum+, 98]:

   QBC-SV(x) = (1/|C|) Σ_{m=1}^{|C|} KL( b_x^(m) ‖ b̄_x )

where b̄_x(y_i) = (1/|C|) Σ_m b_x^(m)(y_i), and KL(p ‖ q) = Σ_i p(y_i) log( p(y_i) / q(y_i) ).


Two kinds of disagreement:

(i) Most-sure disagreement (QBC-MS): strong and conflicting evidence
(ii) Least-sure disagreement (QBC-LS): no conclusive evidence

Objectives to optimize: members should disagree on node x, and the positive and negative evidence should both be large for most-sure, and both small for least-sure disagreement.

The overall evidence for node x combines the total positive evidence (from members voting for the spam class) and the total negative evidence (from members voting against it) [Sharma+, 13].


Find the set S of the top k nodes having the highest QBC-SV score.

Valuable node selection:

QBC-MS selects the node with maximum evidence from S.

QBC-LS selects the node with minimum evidence from S.


ALFNET (adapted) requires two classifiers:
1. a content-only (CO) classifier: logistic regression on features from metadata, and
2. a collective classifier (CC): SpEagle (metadata + review network).

Valuable node – the instance on which the decisions of CO and CC differ most.

Constraints on CO: it requires enough labels for training, is susceptible to class imbalance, and must be re-trained at each iteration.

Two Yelp datasets¹: recommended vs. non-recommended reviews

• YelpChi – hotel and restaurant reviews from Chicago
• YelpNYC – restaurant reviews from New York City

Settings: pool-based active inference
Query selection: review nodes only
Oracle: Yelp.com²

¹ Datasets are made available to the community.
² A spammer has at least one filtered review.


EUCR is superior to random selection and to the adapted existing approaches.

[Figure: Average Precision (AP) vs. budget, for YelpChi and YelpNYC.]

[Figure: NDCG@100 (left) and NDCG@1000 (right) vs. budget, for YelpChi and YelpNYC.]


EUCR provides:
- correct labels for almost as many users as the budget size
- enough fake reviews being labeled

RS & ALFNET – imbalanced fake vs. genuine reviews; ALFNET labels multiple reviews of the same user.

US & QBC – select (uncertain or disagreed-upon) nodes selfishly, without considering the neighborhood.

With a budget as small as 300, EUCR outperforms random selection and the other adapted baselines.

[Figure: per-budget results for YelpChi and YelpNYC, with EUCR highlighted.]


Main contributions:

• Adapted existing label acquisition approaches to the network inference setting
• Defined the characteristics of valuable nodes
• Proposed a new metric, Expected UnCertainty Reach (EUCR), to quantify node value
• Achieved improved performance
• Evaluated on two large real-world datasets from Yelp.com

Email [email protected] for code and data


For datasets email [email protected]

http://www.cs.stonybrook.edu/~datalab/