Recsplorer: Recommendation Algorithms Based on Precedence Mining Aditya Parameswaran Stanford University (Joint work with G. Koutrika, B. Bercovitz & H

Recsplorer: Recommendation Algorithms Based on Precedence Mining

Aditya Parameswaran, Stanford University

(Joint work with G. Koutrika, B. Bercovitz & H. Garcia-Molina)


Applications (Far too many!)


What’s New?

Collaborative filtering

Extracting patterns: ~10 yrs, but not used in recommendations!

Challenge: Aggregation & Sparsity

Sets, not Sequences

Won’t need ratings!

Lack of “similar people”


Motivating Example

User          q1      q2      q3      q4
u1            A : 5   B : 5   D : 5   -
u2            A : 1   E : 2   D : 4   F : 3
u3            G : 4   H : 2   E : 3   F : 3
u4            B : 2   G : 4   H : 4   E : 4
u (target)    A : 5   G : 4   E : 4   -

Similar users:

u3: G : 4, E : 3, H : 2

u4: G : 4, E : 4, H : 4

u (target): G : 4, E : 4

Recommend: H : 3

Ignore potentially useful information

Exploit patterns only among similar users

Sparsity of ratings, few recommendations


Motivating Example (contd.)

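User          q1      q2      q3      q4
u1            A : 5   B : 5   D : 5   -
u2            A : 1   E : 2   D : 4   F : 3
u3            G : 4   H : 2   E : 3   F : 3
u4            B : 2   G : 4   H : 4   E : 4
u (target)    A : 5   G : 4   E : 4   -

Patterns across all users:

A → D (u1: A : 5, D : 5; u2: A : 1, D : 4); the target user has A

E → F (u2: E : 2, F : 3; u3: E : 3, F : 3); the target user has E

G → H (u3, u4); the target user has G

Recommend: D, F, H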

Mine a larger portion of user histories

Exploit patterns across all users

More and better recommendations

User preferences, logical orders, interest evolution

u3: G : 4, H : 2

u4: G : 4, H : 4

u (target): G : 4

How to assign scores?
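Before any scoring, the pattern-mining step of the example can be pictured as counting ordered pairs across all histories. The sketch below is only an illustration under assumptions that are not on the slides: ratings are ignored, each item appears at most once per history, and a pattern is kept when at least `min_count` users exhibit it (`min_count`, `precedence_counts` and `candidates` are hypothetical names).

```python
from collections import Counter
from itertools import combinations

# Histories from the motivating example, in the order q1..q4 (ratings dropped).
histories = {
    "u1": ["A", "B", "D"],
    "u2": ["A", "E", "D", "F"],
    "u3": ["G", "H", "E", "F"],
    "u4": ["B", "G", "H", "E"],
}
target = ["A", "G", "E"]  # the target user u

def precedence_counts(histories):
    """Count, for every ordered pair (earlier, later), how many users took them in that order."""
    counts = Counter()
    for items in histories.values():
        # combinations() preserves list order, so the first element always comes earlier.
        for earlier, later in combinations(items, 2):
            counts[(earlier, later)] += 1
    return counts

def candidates(histories, target, min_count=2):
    """Items that follow something the target user already has, in at least min_count histories."""
    taken = set(target)
    counts = precedence_counts(histories)
    return sorted(
        {later for (earlier, later), n in counts.items()
         if earlier in taken and later not in taken and n >= min_count}
    )

print(candidates(histories, target))  # ['D', 'F', 'H']
```

The raw counts only say which items are candidates; turning them into scores is the question the following slides address.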


Goals

Quality of recommendations

Not enough!

Coverage

Goodness

Unexpectedness

Predictability

Not covered in this talk

Efficiency


Precedence Model

A prediction problem using conditional probabilities

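Given A, what is the probability that X will follow?

P[ X | A ]: Incorrect! It conditions on every history that contains A.

User    Hist
u1      -
u2      A
u3      X
u4      A X
u5      X A

P[ X | A ] = 1/3 (of the three histories that contain A, X follows A only in u4)

P[ X | A with no X preceding ] = 1/2 (u5 is excluded because X precedes A)

Notation: P[X | A^{¬X}], where A^{¬X} denotes A with no X preceding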
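The two probabilities on this slide can be checked by counting. The sketch below is a minimal illustration, assuming the "-" history of u1 means that u1 has neither A nor X and that an item appears at most once per history; the function names are hypothetical.

```python
from fractions import Fraction

# Histories from the slide; "-" for u1 is read as "contains neither A nor X" (assumption).
histories = {
    "u1": [],
    "u2": ["A"],
    "u3": ["X"],
    "u4": ["A", "X"],
    "u5": ["X", "A"],
}

def x_follows_a(hist, a, x):
    """True if both items occur and x comes after a."""
    return a in hist and x in hist and hist.index(x) > hist.index(a)

def p_naive(histories, x, a):
    """P[x | a]: condition on every history that contains a."""
    pool = [h for h in histories.values() if a in h]
    hits = sum(x_follows_a(h, a, x) for h in pool)
    return Fraction(hits, len(pool)) if pool else Fraction(0)

def p_no_x_preceding(histories, x, a):
    """P[x | a with no x preceding]: drop histories where x already came before a."""
    pool = [h for h in histories.values() if a in h and x not in h[: h.index(a)]]
    hits = sum(x_follows_a(h, a, x) for h in pool)
    return Fraction(hits, len(pool)) if pool else Fraction(0)

print(p_naive(histories, "X", "A"))           # 1/3
print(p_no_x_preceding(histories, "X", "A"))  # 1/2
```

The only difference between the two functions is the conditioning pool: u5 is counted in the first and excluded from the second.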

Algorithm 1: Single Item Max-Confidence

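Current user’s history U: D1, D2, D3, …, Dm

Candidate item: X

Consider only pairs (Di, X) whose support sup(Di, X) meets the threshold θ

P[X | Dm^{¬X}] (Dm^{¬X}: Dm with no X preceding)

score(X) = max_i P[X | Di^{¬X}]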
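A rough sketch of this scoring rule follows. Several details are assumptions rather than statements from the slides: support is taken to be the number of histories in which Di occurs with no X before it, the threshold test is written as `sup >= theta`, items are unique within a history, and the data are the four histories of the motivating example with ratings dropped.

```python
# Histories of u1..u4 from the motivating example (ratings dropped).
histories = [
    ["A", "B", "D"],
    ["A", "E", "D", "F"],
    ["G", "H", "E", "F"],
    ["B", "G", "H", "E"],
]

def confidence_and_support(histories, d, x):
    """Estimate P[x | d with no x preceding] and an assumed support count."""
    # Pool: histories that contain d and have no x before d.
    pool = [h for h in histories if d in h and x not in h[: h.index(d)]]
    # Within the pool, any occurrence of x necessarily comes after d.
    hits = sum(1 for h in pool if x in h)
    confidence = hits / len(pool) if pool else 0.0
    return confidence, len(pool)  # support = pool size (assumption)

def max_confidence_scores(histories, user_history, theta=2):
    """score(x) = max over the user's items d of P[x | d with no x preceding]."""
    taken = set(user_history)
    catalog = {item for h in histories for item in h}
    scores = {}
    for x in catalog - taken:
        best = 0.0
        for d in user_history:
            confidence, support = confidence_and_support(histories, d, x)
            if support >= theta:  # threshold test as assumed above
                best = max(best, confidence)
        scores[x] = best
    return scores

scores = max_confidence_scores(histories, ["A", "G", "E"], theta=2)
for item in sorted(scores, key=scores.get, reverse=True):
    print(item, round(scores[item], 3))
```

On this toy data D and H both score 1.0 (every history with A continues to D, every history with G continues to H), F scores 2/3 and B scores 0.5.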



Algorithm 2: Joint Probabilities

Candidate item: X

score(X) = P[X | U^{¬X}]

Algorithm 2: Joint Probabilities (Contd.)

Current user’s history: U = {D1, D2, …, Dm}

score(X) = P[X | U^{¬X}]

Approximating:

score(X) = P[X | D1^{¬X}, D2^{¬X}, …, Dm^{¬X}]

score(X) ≈ P[X] × Π_{Di in U} P[Di^{¬X} | X]
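The product form can be sketched with simple frequency estimates. The estimators below are assumptions, not taken from the slides: P[X] is the fraction of histories containing X, and P[Di with no X preceding | X] is the fraction of histories containing X in which Di occurs before X. The data are again the four histories of the motivating example.

```python
from math import prod

# Histories of u1..u4 from the motivating example (ratings dropped).
histories = [
    ["A", "B", "D"],
    ["A", "E", "D", "F"],
    ["G", "H", "E", "F"],
    ["B", "G", "H", "E"],
]

def joint_score(histories, user_history, x):
    """score(x) ~ P[x] * product over d in user_history of P[d before x | x]."""
    with_x = [h for h in histories if x in h]
    if not with_x:
        return 0.0
    p_x = len(with_x) / len(histories)

    def p_d_before_x(d):
        # Fraction of histories containing x in which d occurs earlier than x.
        before = sum(1 for h in with_x if d in h and h.index(d) < h.index(x))
        return before / len(with_x)

    return p_x * prod(p_d_before_x(d) for d in user_history)

user = ["A", "G", "E"]
print(joint_score(histories, user, "F"))  # 0.125
print(joint_score(histories, user, "D"))  # 0.0
```

With these plain estimates a single zero factor wipes out the whole product: D scores 0 only because no history has G before D, even though A precedes D in every history that contains D.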


Algorithm 3: Hybrid

Candidate item: X

score(X) ≈ P[X] × Π_{top Di in U} P[Di^{¬X} | X]

The product runs over only the top Di in U rather than over every Di in U.
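A possible reading of the hybrid score is sketched below. The slide does not say how the "top" items are chosen, so the ranking used here (keep the `top_n` largest factors) is purely an assumption, as are the frequency estimators and the parameter name `top_n`.

```python
from math import prod

# Histories of u1..u4 from the motivating example (ratings dropped).
histories = [
    ["A", "B", "D"],
    ["A", "E", "D", "F"],
    ["G", "H", "E", "F"],
    ["B", "G", "H", "E"],
]

def hybrid_score(histories, user_history, x, top_n=2):
    """P[x] times the product of the top_n largest factors P[d before x | x]."""
    with_x = [h for h in histories if x in h]
    if not with_x:
        return 0.0
    p_x = len(with_x) / len(histories)
    factors = sorted(
        (
            sum(1 for h in with_x if d in h and h.index(d) < h.index(x)) / len(with_x)
            for d in user_history
        ),
        reverse=True,
    )
    # Keeping only the strongest factors is the assumed meaning of "top Di in U".
    return p_x * prod(factors[:top_n])

user = ["A", "G", "E"]
print(hybrid_score(histories, user, "D", top_n=2))  # 0.25
print(hybrid_score(histories, user, "H", top_n=1))  # 0.5
```

Under this reading, `top_n` equal to the length of the user's history reproduces the full product of the joint-probabilities score, while smaller values keep one unrelated item in the history from zeroing the score.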

Evaluation: Methodology

Dataset: 7,500 Student transcripts from CourseRank

Evaluation Methodology:

Input: x Hidden: r

Metrics:

precision@k = fraction of top-k recommendations in r

coverage@k = number of users for whom an algorithm generates at least k recommendations

System: CourseRank (an educational social site for Stanford)
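The two metrics can be written down directly from their definitions. This is a small sketch with hypothetical function names; dividing by k even when fewer than k items were recommended is an assumption, since the slide does not cover that case.

```python
def precision_at_k(recommended, hidden, k):
    """Fraction of the top-k recommendations that appear in the hidden set r."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in hidden) / k

def coverage_at_k(recommendations_per_user, k):
    """Number of users for whom at least k recommendations were generated."""
    return sum(1 for recs in recommendations_per_user.values() if len(recs) >= k)

# Toy values, not results from the evaluation.
print(precision_at_k(["D", "F", "H"], {"D", "H"}, k=2))                 # 0.5
print(coverage_at_k({"u1": ["D", "F"], "u2": ["H"], "u3": []}, k=2))    # 1
```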


Evaluation: Algorithms

Popularity

Reranked

Hybrid

Joint Probabilities

Single Item Max Confidence

Collaborative Filtering

Not covered in this talk

Joint Probabilities with Support


Evaluation

Support θ = 30, I = 3 samples, x = 14

Evaluation

Support θ = 30, I = 3 samples, k = 2 recommendations

Evaluation

Support θ = 30, I = 3 samples, k = 10 recommendations

Summary of Contributions

Finer-grained precedence model to leverage collective wisdom

Higher coverage + precision@k

More in paper:
• other algorithms
• goodness / unexpectedness
• optimal thresholds
• user study

