Polyvalent Recommendations

DESCRIPTION

Recent work in recommendations allows some really amazing simplicity of implementation while extending the inputs handled to multiple kinds of interactions against items different from the ones being recommended.

Page 1: Polyvalent Recommendations

1©MapR Technologies - Confidential

Polyvalent Recommendations

Page 2: Polyvalent Recommendations

Multiple Kinds of Behavior for Recommending

Multiple Kinds of Things

Page 3: Polyvalent Recommendations

Contact:
– [email protected]
– @ted_dunning
– @apachemahout
– @[email protected]

Slides and such (available late tonight):
– http://www.slideshare.net/tdunning

Hash tags: #mapr #recommendations

Page 4: Polyvalent Recommendations

Polyvalent recommendation is a new approach to recommendation that is both simpler and much more powerful than traditional approaches. The idea is to combine user, item, and content recommendations into a single query that can be implemented with a very simple architecture.

Page 5: Polyvalent Recommendations

Recommendations

Often known (inaccurately) as collaborative filtering
Actors interact with items
– observe successful interaction
We want to suggest additional successful interactions
Observations inherently very sparse

Page 6: Polyvalent Recommendations

Examples

Customers buying books (Linden et al)
Web visitors rating music (Shardanand and Maes) or movies (Riedl, et al), (Netflix)
Internet radio listeners not skipping songs (Musicmatch)
Internet video watchers watching >30 s

Page 7: Polyvalent Recommendations

Dyadic Structure

Functional
– Interaction: actor -> item*

Relational
– Interaction ⊆ Actors x Items

Matrix
– Rows indexed by actor, columns by item
– Value is count of interactions

Predict missing observations

Page 8: Polyvalent Recommendations

Recommendation Basics

History:

User  Thing
1     3
2     4
3     4
2     3
3     2
1     1
2     1

Page 9: Polyvalent Recommendations

Recommendation Basics

History as matrix:

    t1  t2  t3  t4
u1  1   0   1   0
u2  1   0   1   1
u3  0   1   0   1

(t1, t3) cooccur 2 times, (t1, t4) once, (t2, t4) once
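The step from the history list to the matrix can be sketched in a few lines of Python. The function name `history_to_matrix` is an assumption made for illustration, and user and thing ids are taken to be 1-based integers as in the history slide.

```python
# Interaction history from the "Recommendation Basics" slide: (user, thing) pairs.
history = [(1, 3), (2, 4), (3, 4), (2, 3), (3, 2), (1, 1), (2, 1)]


def history_to_matrix(pairs):
    """Return a users x things matrix of interaction counts.

    Rows are indexed by user and columns by thing; ids are assumed to be
    1-based integers, as in the slide.
    """
    n_users = max(user for user, _ in pairs)
    n_things = max(thing for _, thing in pairs)
    matrix = [[0] * n_things for _ in range(n_users)]
    for user, thing in pairs:
        matrix[user - 1][thing - 1] += 1
    return matrix


A = history_to_matrix(history)
for user_index, row in enumerate(A, start=1):
    print(f"u{user_index}", row)
```

A real history is far too sparse for a dense list of lists; this layout is only meant to mirror the slide.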

Page 10: Polyvalent Recommendations

A Quick Simplification

Users who do h

Also do r

User-centric recommendations

Item-centric recommendations

Page 11: Polyvalent Recommendations

Recommendation Basics

Cooccurrence

    t1  t2  t3  t4
t1  2   0   2   1
t2  0   1   0   1
t3  2   0   2   1
t4  1   1   1   2
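As a rough sketch, the cooccurrence counts are the product of the transposed history matrix with itself (A’A). The helper name `cooccurrence` is an assumption; the matrix `A` is the three-user history from the earlier slide.

```python
# Users x things history matrix from the "History as matrix" slide.
A = [
    [1, 0, 1, 0],  # u1
    [1, 0, 1, 1],  # u2
    [0, 1, 0, 1],  # u3
]


def cooccurrence(a):
    """Compute A'A: entry [i][j] sums, over users, the product of columns i and j.

    With 0/1 entries this is the number of users who interacted with both
    item i and item j; the diagonal holds each item's own user count.
    """
    n_items = len(a[0])
    return [
        [sum(row[i] * row[j] for row in a) for j in range(n_items)]
        for i in range(n_items)
    ]


K = cooccurrence(A)
for name, row in zip(["t1", "t2", "t3", "t4"], K):
    print(name, row)
```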

Page 12: Polyvalent Recommendations

Problems with Raw Cooccurrence

Very popular items co-occur with everything
– Welcome document
– Elevator music

That isn’t interesting
– We want anomalous cooccurrence

Page 13: Polyvalent Recommendations

Recommendation Basics

Cooccurrence

    t1  t2  t3  t4
t1  2   0   2   1
t2  0   1   0   1
t3  2   0   2   1
t4  1   1   1   2

        t3  not t3
t1      2   1
not t1  1   1

Page 14: Polyvalent Recommendations

Root LLR Details

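In R

# Unnormalized entropy of a table of counts; the (k==0) term makes 0*log(0) evaluate to 0
entropy = function(k) { -sum(k*log((k==0)+(k/sum(k)))) }
# Root LLR score of a contingency table k
rootLLr = function(k) { sqrt((entropy(rowSums(k)) + entropy(colSums(k)) - entropy(k))/2) }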

Like sqrt(mutual information * N/2)
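Writing the R definitions out makes the link to mutual information explicit. The symbols below are assumptions introduced for this sketch: N is the total count in the table k, H denotes Shannon entropy of the normalized counts, and I is the mutual information between the row and column variables.

```latex
\begin{aligned}
\mathrm{entropy}(k) &= -\sum_{i,j} k_{ij}\,\log\frac{k_{ij}}{N} \;=\; N\,H\!\left(\tfrac{k}{N}\right),
  \qquad N = \sum_{i,j} k_{ij} \\[4pt]
\mathrm{rootLLr}(k)^2 &= \tfrac{1}{2}\Bigl(\mathrm{entropy}(\mathrm{rowSums}(k))
  + \mathrm{entropy}(\mathrm{colSums}(k)) - \mathrm{entropy}(k)\Bigr) \\[4pt]
 &= \tfrac{N}{2}\,\bigl(H_{\mathrm{row}} + H_{\mathrm{col}} - H_{\mathrm{joint}}\bigr)
  \;=\; \tfrac{N}{2}\, I(\mathrm{row};\mathrm{col})
\end{aligned}
```

The row and column sums each total N, so the same factor of N comes out of all three entropy terms.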

Page 15: Polyvalent Recommendations

Spot the Anomaly

Root LLR is roughly like standard deviations

        A      not A
B       13     1000
not B   1000   100,000
Root LLR: 0.44

        A      not A
B       1      0
not B   0      2
Root LLR: 0.98

        A      not A
B       1      0
not B   0      10,000
Root LLR: 2.26

        A      not A
B       10     0
not B   0      100,000
Root LLR: 7.15
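The four tables can be scored with a Python port of the R functions from the Root LLR Details slide. This is a sketch, not the author's code: the names `entropy` and `root_llr` mirror the R names, and the port keeps the slide's scaling (dividing by 2 before the square root), so the scores should land close to the values quoted on the slide.

```python
import math


def entropy(counts):
    """Unnormalized entropy: -sum(k * log(k / N)), with 0 * log(0) taken as 0."""
    total = sum(counts)
    return -sum(k * math.log(k / total) for k in counts if k > 0)


def root_llr(table):
    """Port of the slide's rootLLr for a table given as a list of rows."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    cells = [k for row in table for k in row]
    value = (entropy(row_sums) + entropy(col_sums) - entropy(cells)) / 2
    # Guard against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(value, 0.0))


tables = [
    [[13, 1000], [1000, 100000]],
    [[1, 0], [0, 2]],
    [[1, 0], [0, 10000]],
    [[10, 0], [0, 100000]],
]
for table in tables:
    print(table, round(root_llr(table), 2))
```

The first table has the largest counts but the smallest score, because 13 cooccurrences is about what chance alone would produce at those frequencies.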

Page 16: Polyvalent Recommendations

Threshold by Score

Cooccurrence

    t1  t2  t3  t4
t1  2   0   2   1
t2  0   1   0   1
t3  2   0   2   1
t4  1   1   1   2

Page 17: Polyvalent Recommendations

Threshold by Score

Significant cooccurrence => Indicators

    t1  t2  t3  t4
t1  1   0   0   1
t2  0   1   0   1
t3  0   0   1   1
t4  1   0   0   1
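One way to turn scores into indicators is to build a 2x2 table for every item pair, score it, and keep the pairs above a threshold. The sketch below rests on several assumptions: the function names, the threshold value, and the check that a pair cooccurs more often than expected are choices of this example (the score alone does not distinguish positive from negative association). With only three users the result will not match the illustrative matrix on the slide.

```python
import math


def entropy(counts):
    """Unnormalized entropy: -sum(k * log(k / N)), with 0 * log(0) taken as 0."""
    total = sum(counts)
    return -sum(k * math.log(k / total) for k in counts if k > 0)


def root_llr(k11, k12, k21, k22):
    """Score a 2x2 table using the same scaling as the slide's R code."""
    rows = [k11 + k12, k21 + k22]
    cols = [k11 + k21, k12 + k22]
    value = (entropy(rows) + entropy(cols) - entropy([k11, k12, k21, k22])) / 2
    return math.sqrt(max(value, 0.0))


def indicators(a, names, threshold):
    """Map each item to the other items whose cooccurrence with it is anomalous."""
    n_users = len(a)
    result = {name: [] for name in names}
    for j, target in enumerate(names):
        for i, candidate in enumerate(names):
            if i == j:
                continue
            k11 = sum(1 for row in a if row[i] and row[j])
            k12 = sum(1 for row in a if row[i] and not row[j])
            k21 = sum(1 for row in a if not row[i] and row[j])
            k22 = n_users - k11 - k12 - k21
            # Assumption of this sketch: keep only positive associations.
            above_expected = k11 * n_users > (k11 + k12) * (k11 + k21)
            if above_expected and root_llr(k11, k12, k21, k22) >= threshold:
                result[target].append(candidate)
    return result


A = [
    [1, 0, 1, 0],  # u1
    [1, 0, 1, 1],  # u2
    [0, 1, 0, 1],  # u3
]
item_indicators = indicators(A, ["t1", "t2", "t3", "t4"], threshold=0.9)
print(item_indicators)
```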

Page 18: Polyvalent Recommendations

Decomposition for Cooccurrence

Can use SVD for cooccurrence

But first one or two singular vectors just encode popularity … ignore those

Vᵀ projects items into concept space, V projects back into item space

Thresholding reconstructed cooccurrence matrix is another way to get indicators

Page 19: Polyvalent Recommendations

What’s right about this?

Page 20: Polyvalent Recommendations

Virtues of Current State of the Art

Lots of well publicized history
– Netflix, Amazon, Overstock

Lots of support
– Mahout, commercial offerings like Myrrix

Lots of existing code
– Mahout, commercial codes

Proven track record
Well socialized solution

Page 21: Polyvalent Recommendations

What’s wrong about this?

Page 22: Polyvalent Recommendations

Cross Occurrence

We don’t have to do co-occurrence
We can do cross-occurrence

Result is cross-recommendation

Page 23: Polyvalent Recommendations

Fundamental Algorithmics

Cooccurrence: K = A’A

A is users x items, K is items x items
Product has general shape of matrix
K tells us “users who interacted with x also interacted with y”

Page 24: Polyvalent Recommendations

Fundamental Algorithmic Structure

Cooccurrence

Matrix approximation by factoring

LLR

Page 25: Polyvalent Recommendations

But Wait ...

Does it have to be that way?

Page 26: Polyvalent Recommendations

But why not ...

Why just dyadic learning?

Why not triadic learning?
Why not cross learning?

Page 27: Polyvalent Recommendations

For example

Users enter queries (A)
– (actor = user, item = query)

Users view videos (B)
– (actor = user, item = video)

A’A gives query recommendation
– “did you mean to ask for”

B’B gives video recommendation
– “you might like these videos”

Page 28: Polyvalent Recommendations

The punch-line

B’A recommends videos in response to a query
– (isn’t that a search engine?)
– (not quite, it doesn’t look at content or meta-data)
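A minimal sketch of the cross-occurrence product B’A follows. The toy data, the query and video names, and the function names are all hypothetical. Raw counts are used for ranking here; a fuller version would presumably filter them for anomalous cross-occurrence in the same way as plain cooccurrence.

```python
# Hypothetical toy data: one row per user in both matrices.
queries = ["flamenco", "guitar lesson"]
videos = ["video_a", "video_b", "video_c"]

A = [  # users x queries
    [1, 0],
    [1, 0],
    [0, 1],
    [1, 1],
]
B = [  # users x videos
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 0],
]


def cross_occurrence(b, a):
    """Compute B'A: entry [v][q] counts users who viewed video v and entered query q."""
    n_videos = len(b[0])
    n_queries = len(a[0])
    return [
        [sum(rb[v] * ra[q] for rb, ra in zip(b, a)) for q in range(n_queries)]
        for v in range(n_videos)
    ]


def videos_for_query(query, bta):
    """Rank videos by their cross-occurrence count with the given query."""
    q = queries.index(query)
    ranked = sorted(range(len(videos)), key=lambda v: (-bta[v][q], videos[v]))
    return [videos[v] for v in ranked if bta[v][q] > 0]


BtA = cross_occurrence(B, A)
print(videos_for_query("flamenco", BtA))
```

Nothing in this ranking looks at video titles or descriptions, only at what users who typed the query went on to watch.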

Page 29: Polyvalent Recommendations

Real-life example

Query: “Paco de Lucia”

Conventional meta-data search results:
– “hombres del paco” times 400
– not much else

Recommendation based search:
– Flamenco guitar and dancers
– Spanish and classical guitar
– Van Halen doing a classical/flamenco riff

Page 30: Polyvalent Recommendations

Real-life example

Page 31: Polyvalent Recommendations

Hypothetical Example

Want a navigational ontology?

Just put labels on a web page with traffic
– This gives A = users x label clicks

Remember viewing history
– This gives B = users x items

Cross recommend
– B’A = label to item mapping

After several users click, results are whatever users think they should be

Page 32: Polyvalent Recommendations

But wait, there’s more!

Page 33: Polyvalent Recommendations

[Figure: users x things matrix]

Page 34: Polyvalent Recommendations

[Figure: users x thing type 1 and users x thing type 2 matrices]

Page 35: Polyvalent Recommendations

Page 36: Polyvalent Recommendations

Summary

Input: Multiple kinds of behavior on one set of things

Output: Recommendations for one kind of behavior with a different set of things

Cross recommendation is a special case

Page 37: Polyvalent Recommendations

Now again, without the scary math

Page 38: Polyvalent Recommendations

Input Data

User transactions
– user id, merchant id
– SIC code, amount
– Descriptions, cuisine, …

Offer transactions
– user id, offer id
– vendor id, merchant id’s
– offers, views, accepts

Page 39: Polyvalent Recommendations

Input Data

User transactions
– user id, merchant id
– SIC code, amount
– Descriptions, cuisine, …

Offer transactions
– user id, offer id
– vendor id, merchant id’s
– offers, views, accepts

Derived user data
– merchant id’s
– anomalous descriptor terms
– offer & vendor id’s

Derived merchant data
– local top40
– SIC code
– vendor code
– amount distribution

Page 40: Polyvalent Recommendations

Cross-recommendation

Per merchant indicators
– merchant id’s
– chain id’s
– SIC codes
– indicator terms from text
– offer vendor id’s

Computed by finding anomalous (indicator => merchant) rates
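Finding anomalous (indicator => merchant) rates could look roughly like the sketch below. Everything here is an assumption made for illustration: the toy users, the indicator and merchant labels, the threshold, the function names, and the check that the pair occurs more often than expected. The score uses the same scaling as the R code on the Root LLR Details slide.

```python
import math


def entropy(counts):
    """Unnormalized entropy: -sum(k * log(k / N)), with 0 * log(0) taken as 0."""
    total = sum(counts)
    return -sum(k * math.log(k / total) for k in counts if k > 0)


def root_llr(k11, k12, k21, k22):
    """Score a 2x2 table using the same scaling as the slide's R code."""
    rows = [k11 + k12, k21 + k22]
    cols = [k11 + k21, k12 + k22]
    value = (entropy(rows) + entropy(cols) - entropy([k11, k12, k21, k22])) / 2
    return math.sqrt(max(value, 0.0))


def merchant_indicators(user_indicators, user_merchants, threshold):
    """For each merchant, list indicators whose users visit it anomalously often."""
    users = sorted(user_indicators)
    all_indicators = sorted(set().union(*user_indicators.values()))
    all_merchants = sorted(set().union(*user_merchants.values()))
    result = {merchant: [] for merchant in all_merchants}
    for merchant in all_merchants:
        for indicator in all_indicators:
            has = [indicator in user_indicators[u] for u in users]
            went = [merchant in user_merchants.get(u, set()) for u in users]
            k11 = sum(1 for h, w in zip(has, went) if h and w)
            k12 = sum(1 for h, w in zip(has, went) if h and not w)
            k21 = sum(1 for h, w in zip(has, went) if not h and w)
            k22 = len(users) - k11 - k12 - k21
            # Assumption of this sketch: keep only positive associations.
            above_expected = k11 * len(users) > (k11 + k12) * (k11 + k21)
            if above_expected and root_llr(k11, k12, k21, k22) >= threshold:
                result[merchant].append(indicator)
    return result


# Hypothetical derived user data and merchant visits.
user_indicators = {
    "u1": {"sic:5812", "term:tapas"},
    "u2": {"sic:5812", "term:tapas"},
    "u3": {"sic:5812"},
    "u4": {"sic:5411"},
    "u5": {"sic:5411"},
    "u6": {"sic:5411", "sic:5812"},
}
user_merchants = {
    "u1": {"m_tapas_bar"},
    "u2": {"m_tapas_bar"},
    "u3": {"m_diner"},
    "u4": {"m_grocer"},
    "u5": {"m_grocer"},
    "u6": {"m_grocer"},
}
per_merchant = merchant_indicators(user_indicators, user_merchants, threshold=1.0)
print(per_merchant)
```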

Page 41: Polyvalent Recommendations

Search-based Recommendations

Sample document
– Merchant Id
– Field for text description
– Phone
– Address
– Location

Page 42: Polyvalent Recommendations

Search-based Recommendations

Sample document
– Merchant Id
– Field for text description
– Phone
– Address
– Location
– Indicator merchant id’s
– Indicator industry (SIC) id’s
– Indicator offers
– Indicator text
– Local top40

Page 43: Polyvalent Recommendations

Search-based Recommendations

Sample document
– Merchant Id
– Field for text description
– Phone
– Address
– Location
– Indicator merchant id’s
– Indicator industry (SIC) id’s
– Indicator offers
– Indicator text
– Local top40

Sample query
– Current location
– Recent merchant descriptions
– Recent merchant id’s
– Recent SIC codes
– Recent accepted offers
– Local top40
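The document and query pairing can be illustrated with a small in-memory stand-in for the search engine. This is only a sketch under stated assumptions: it does not use Solr's API or its scoring, and the field names, weights, ids, and the `score` and `recommend` functions are hypothetical. Each query field carries the user's recent history and is matched against the indicator field of the same name on each merchant document.

```python
def score(document, query, field_weights):
    """Sum, over fields, the weight times the number of query values found in the document."""
    total = 0.0
    for field, weight in field_weights.items():
        matches = set(document.get(field, [])) & set(query.get(field, []))
        total += weight * len(matches)
    return total


def recommend(documents, query, field_weights, limit=3):
    """Return merchant ids ordered by descending score, dropping non-matching documents."""
    scored = [(score(doc, query, field_weights), doc["merchant_id"]) for doc in documents]
    scored = [(s, merchant) for s, merchant in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [merchant for _, merchant in scored[:limit]]


# Hypothetical merchant documents carrying indicator fields.
documents = [
    {"merchant_id": "m1", "indicator_merchant_ids": ["m2", "m3"],
     "indicator_sic_codes": ["5812"], "indicator_offers": ["o9"]},
    {"merchant_id": "m2", "indicator_merchant_ids": ["m4"],
     "indicator_sic_codes": ["5411"], "indicator_offers": []},
    {"merchant_id": "m3", "indicator_merchant_ids": ["m2"],
     "indicator_sic_codes": ["5812", "5411"], "indicator_offers": ["o7"]},
]

# Hypothetical query built from one user's recent history.
query = {
    "indicator_merchant_ids": ["m2"],   # recent merchant ids
    "indicator_sic_codes": ["5812"],    # recent SIC codes
    "indicator_offers": ["o7"],         # recent accepted offers
}
field_weights = {
    "indicator_merchant_ids": 2.0,
    "indicator_sic_codes": 1.0,
    "indicator_offers": 1.0,
}

print(recommend(documents, query, field_weights))
```

The point of the design is that a recommendation becomes an ordinary search: the indicators are stored as document fields, and the user's history is the query.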

Page 44: Polyvalent Recommendations

[Architecture diagram, indexing side. Components: complete history, cooccurrence (Mahout), item meta-data, Solr indexing (several Solr indexers), index shards]

Page 45: Polyvalent Recommendations

[Architecture diagram, serving side. Components: user history, web tier, Solr search (several Solr instances), item meta-data, index shards]

Page 46: Polyvalent Recommendations

Objective Results

At a very large credit card company

History is all transactions, all web interaction

Processing time cut from 20 hours per day to 3

Recommendation engine load time decreased from 8 hours to 3 minutes

Recommendation quality increased visibly

Page 47: Polyvalent Recommendations

Contact:
– [email protected]
– @ted_dunning

Slides and such (available late tonight):
– http://www.slideshare.net/tdunning

Hash tags: #mapr #recommendations

We are hiring!

Page 48: Polyvalent Recommendations

Thank You