61
1 ©MapR Technologies 2013- Confidential Introduction to Mahout And How To Build a Recommender

DFW Big Data talk on Mahout Recommenders

Embed Size (px)

DESCRIPTION

This talk focussed on how to build recommenders using new technology and capabilities from Mahout. The key here is that recommenders can be built much more easily than you might expect.

Citation preview

Page 1: DFW Big Data talk on Mahout Recommenders

1©MapR Technologies 2013- Confidential

Introduction to MahoutAnd How To Build a Recommender

Page 2: DFW Big Data talk on Mahout Recommenders

2©MapR Technologies 2013- Confidential

Me, Us

Ted Dunning, Chief Application Architect, MapRCommitter PMC member, Mahout, Zookeeper, DrillBought the beer at the first HUG

MapRDistributes more open source components for HadoopAdds major technology for performance, HA, industry standard API’s

TonightHash tag - #dfwbd #maprSee also - @ApacheMahout @ApacheDrill

@ted_dunning and @mapR

Page 3: DFW Big Data talk on Mahout Recommenders

3©MapR Technologies 2013- Confidential

Requested Topic For Tonight

What is Mahout? What makes it different? How can big data technology solve impossible problems? How is big data affecting the world?

Page 4: DFW Big Data talk on Mahout Recommenders

4©MapR Technologies 2013- Confidential

Also

What is MapR? What is MapR doing? How does MapR’s technology work? How are customers making use of MapR? How can anyone make use of MapR to solve problems?

Page 5: DFW Big Data talk on Mahout Recommenders

5©MapR Technologies 2013- Confidential

Oh … Also This

Detailed break-down of a live machine learning system running with Mahout on MapR

With code examples

Page 6: DFW Big Data talk on Mahout Recommenders

6©MapR Technologies 2013- Confidential

I may have to summarize

Page 7: DFW Big Data talk on Mahout Recommenders

7©MapR Technologies 2013- Confidential

I may have to summarize

just a bit

Page 8: DFW Big Data talk on Mahout Recommenders

8©MapR Technologies 2013- Confidential

Part 1:5 minutes of math

Page 9: DFW Big Data talk on Mahout Recommenders

9©MapR Technologies 2013- Confidential

Part 2:12 minutes: I want a pony

Page 10: DFW Big Data talk on Mahout Recommenders

10©MapR Technologies 2013- Confidential

Part 3:A working example

Page 11: DFW Big Data talk on Mahout Recommenders

11©MapR Technologies 2013- Confidential

What Does Machine Learning Look Like?

Page 12: DFW Big Data talk on Mahout Recommenders

12©MapR Technologies 2013- Confidential

What Does Machine Learning Look Like?

O(κ k d + k3 d) = O(k2 d log n + k3 d) for small k, high qualityO(κ d log k) or O(d log κ log k) for larger k, looser quality

But tonight we’re going to show you how to keep it simple yet powerful…

Page 13: DFW Big Data talk on Mahout Recommenders

13©MapR Technologies 2013- Confidential

Comparison of Three Main ML Topics

Recommendation: – Involves observation of interactions between people taking action (users)

and items for input data to the recommender model– Goal is to suggest additional appropriate or desirable interactions– Applications include: movie, music or map-based restaurant choices;

suggesting sale items for e-stores or via cash-register receipts

Page 14: DFW Big Data talk on Mahout Recommenders

14©MapR Technologies 2013- Confidential

Page 15: DFW Big Data talk on Mahout Recommenders

15©MapR Technologies 2013- Confidential

Page 16: DFW Big Data talk on Mahout Recommenders

16©MapR Technologies 2013- Confidential

Part 1:A bit of math

(the math of bits)

Page 17: DFW Big Data talk on Mahout Recommenders

17©MapR Technologies 2013- Confidential

Mahout Math

Goals are– basic linear algebra,– and statistical sampling,– and good clustering,– decent speed,– extensibility,– especially for sparse data

But not – totally badass speed– comprehensive set of algorithms– optimization, root finders, quadrature

Page 18: DFW Big Data talk on Mahout Recommenders

18©MapR Technologies 2013- Confidential

Matrices and Vectors

At the core:– DenseVector, RandomAccessSparseVector– DenseMatrix, SparseRowMatrix

Highly composable API

Important ideas: – view*, assign and aggregate– iteration

m.viewDiagonal().assign(v)

Page 19: DFW Big Data talk on Mahout Recommenders

19©MapR Technologies 2013- Confidential

Assign? View?

Why assign?– Copying is the major cost for naïve matrix packages– In-place operations critical to reasonable performance– Many kinds of updates required, so functional style very helpful

Why view?– In-place operations often required for blocks, rows, columns or diagonals– With views, we need #assign + #views methods– Without views, we need #assign x #views methods

Synergies– With both views and assign, many loops become single line

Page 20: DFW Big Data talk on Mahout Recommenders

24©MapR Technologies 2013- Confidential

Examples

double alpha; a.assign(alpha);

a.assign(b, Functions.chain( Functions.plus(beta), Functions.times(alpha));

Page 21: DFW Big Data talk on Mahout Recommenders

26©MapR Technologies 2013- Confidential

More Examples

The trace of a matrix

Set diagonal to zero

Set diagonal to negative of row sums

Page 22: DFW Big Data talk on Mahout Recommenders

27©MapR Technologies 2013- Confidential

Examples

The trace of a matrix

Set diagonal to zero

Set diagonal to negative of row sums

m.viewDiagonal().zSum()

Page 23: DFW Big Data talk on Mahout Recommenders

28©MapR Technologies 2013- Confidential

Examples

The trace of a matrix

Set diagonal to zero

Set diagonal to negative of row sums

m.viewDiagonal().zSum()

m.viewDiagonal().assign(0)

Page 24: DFW Big Data talk on Mahout Recommenders

29©MapR Technologies 2013- Confidential

Examples

The trace of a matrix

Set diagonal to zero

Set diagonal to negative of row sums excluding the diagonal

m.viewDiagonal().zSum()

m.viewDiagonal().assign(0)

Vector diag = m.viewDiagonal().assign(0);diag.assign(m.rowSums().assign(Functions.MINUS));

Page 25: DFW Big Data talk on Mahout Recommenders

32©MapR Technologies 2013- Confidential

Clustering and Such

Streaming k-means and ball k-means– streaming reduces very large data to a cluster sketch– ball k-means is a high quality k-means implementation– the cluster sketch is also usable for other applications– single machine threaded and map-reduce versions available

SVD and friends– stochastic SVD has in-memory, single machine out-of-core and map-reduce

versions– good for reducing very large sparse matrices to tall skinny dense ones

Spectral clustering– based on SVD, allows massive dimensional clustering

Page 26: DFW Big Data talk on Mahout Recommenders

33©MapR Technologies 2013- Confidential

Mahout Math Summary

Matrices, Vectors– views– in-place assignment– aggregations– iterations

Functions– lots built-in– cooperate with sparse vector optimizations

Sampling– abstract samplers– samplers as functions

Other stuff … clustering, SVD

Page 27: DFW Big Data talk on Mahout Recommenders

34©MapR Technologies 2013- Confidential

Part 2:How recommenders work

(I still want a pony)

Page 28: DFW Big Data talk on Mahout Recommenders

35©MapR Technologies 2013- Confidential

Recommendations

Behavior of a crowd helps us understand what individuals will do

Page 29: DFW Big Data talk on Mahout Recommenders

36©MapR Technologies 2013- Confidential

Recommendations

Alice got an apple and a puppy

Charles got a bicycle

Alice

Charles

Page 30: DFW Big Data talk on Mahout Recommenders

37©MapR Technologies 2013- Confidential

Recommendations

Alice got an apple and a puppy

Charles got a bicycle

Bob got an apple

Alice

Bob

Charles

Page 31: DFW Big Data talk on Mahout Recommenders

38©MapR Technologies 2013- Confidential

Recommendations

What else would Bob like??

Alice

Bob

Charles

Page 32: DFW Big Data talk on Mahout Recommenders

39©MapR Technologies 2013- Confidential

Recommendations

What if everybody gets a pony?

Now what does Bob want??

Alice

Bob

Charles

Page 33: DFW Big Data talk on Mahout Recommenders

40©MapR Technologies 2013- Confidential

Log Files

Alice

Bob

Charles

Alice

Bob

Charles

Alice

Page 34: DFW Big Data talk on Mahout Recommenders

41©MapR Technologies 2013- Confidential

Log Files

u1

u3

u2

u1

u3

u2

u1

t1

t2

t3

t4

t3

t3

t1

Page 35: DFW Big Data talk on Mahout Recommenders

42©MapR Technologies 2013- Confidential

Log Files and Dimensions

u1

u3

u2

u1

u3

u2

u1

t1

t2

t3

t4

t3

t3

t1

t1

t2

t3

t4

Things u1 Alice

BobCharles

u3u2

Users

Page 36: DFW Big Data talk on Mahout Recommenders

43©MapR Technologies 2013- Confidential

History Matrix

Alice

Bob

Charles

✔ ✔ ✔

✔ ✔

✔ ✔

Page 37: DFW Big Data talk on Mahout Recommenders

44©MapR Technologies 2013- Confidential

Cooccurrence Matrix

1 2

1 1

1

1

2 1

Page 38: DFW Big Data talk on Mahout Recommenders

45©MapR Technologies 2013- Confidential

Indicator Matrix

Page 39: DFW Big Data talk on Mahout Recommenders

46©MapR Technologies 2013- Confidential

Indicator Matrix

id: t4title: puppydesc: The sweetest little puppy ever.keywords: puppy, dog, pet

indicators: (t1)

Page 40: DFW Big Data talk on Mahout Recommenders

47©MapR Technologies 2013- Confidential

Problems with Raw Cooccurrence

Very popular items co-occur with everything– Welcome document– Elevator music

That isn’t interesting– We want anomalous cooccurrence

Page 41: DFW Big Data talk on Mahout Recommenders

48©MapR Technologies 2013- Confidential

Recommendation Basics

Coocurrence

t3 not t3

t1 2 1

not t1 1 1

Page 42: DFW Big Data talk on Mahout Recommenders

49©MapR Technologies 2013- Confidential

Spot the Anomaly

Root LLR is roughly like standard deviations

A not A

B 13 1000

not B 1000 100,000

A not A

B 1 0

not B 0 2

A not A

B 1 0

not B 0 10,000

A not A

B 10 0

not B 0 100,000

0.44 0.98

2.26 7.15

Page 43: DFW Big Data talk on Mahout Recommenders

50©MapR Technologies 2013- Confidential

A Quick Simplification

Users who do h (a vector of things a user has done)

Also do r

User-centric recommendations(transpose translates back to things)

Item-centric recommendations(change the order of operations)

A translates things into users

Page 44: DFW Big Data talk on Mahout Recommenders

51©MapR Technologies 2013- Confidential

Symmetry Gives Cross Recommentations

Conventional recommendations with off-line learning

Cross recommendations

Page 45: DFW Big Data talk on Mahout Recommenders

52©MapR Technologies 2013- Confidential

For example

Users enter queries (A)– (actor = user, item=query)

Users view videos (B)– (actor = user, item=video)

ATA gives query recommendation– “did you mean to ask for”

BTB gives video recommendation– “you might like these videos”

Page 46: DFW Big Data talk on Mahout Recommenders

53©MapR Technologies 2013- Confidential

The punch-line

BTA recommends videos in response to a query– (isn’t that a search engine?)– (not quite, it doesn’t look at content or meta-data)

Page 47: DFW Big Data talk on Mahout Recommenders

54©MapR Technologies 2013- Confidential

Real-life example

Query: “Paco de Lucia” Conventional meta-data search results:– “hombres del paco” times 400– not much else

Recommendation based search:– Flamenco guitar and dancers– Spanish and classical guitar– Van Halen doing a classical/flamenco riff

Page 48: DFW Big Data talk on Mahout Recommenders

55©MapR Technologies 2013- Confidential

Real-life example

Page 49: DFW Big Data talk on Mahout Recommenders

56©MapR Technologies 2013- Confidential

Hypothetical Example

Want a navigational ontology? Just put labels on a web page with traffic– This gives A = users x label clicks

Remember viewing history– This gives B = users x items

Cross recommend– B’A = label to item mapping

After several users click, results are whatever users think they should be

Page 50: DFW Big Data talk on Mahout Recommenders

57©MapR Technologies 2013- Confidential

Nice. But we can do better?

Page 51: DFW Big Data talk on Mahout Recommenders

58©MapR Technologies 2013- Confidential

users

things

Page 52: DFW Big Data talk on Mahout Recommenders

59©MapR Technologies 2013- Confidential

users

thingtype 1

thingtype 2

Page 53: DFW Big Data talk on Mahout Recommenders

60©MapR Technologies 2013- Confidential

Page 54: DFW Big Data talk on Mahout Recommenders

61©MapR Technologies 2013- Confidential

Part 3:What about that worked example?

Page 55: DFW Big Data talk on Mahout Recommenders

62©MapR Technologies 2013- Confidential

http://bit.ly/18vbbaT

Page 56: DFW Big Data talk on Mahout Recommenders

63©MapR Technologies 2013- Confidential

SolRIndexerSolR

IndexerSolrindexing

Cooccurrence(Mahout)

Item meta-data

Indexshards

Complete history

Analyze with Map-Reduce

Page 57: DFW Big Data talk on Mahout Recommenders

64©MapR Technologies 2013- Confidential

SolRIndexerSolR

IndexerSolrsearchWeb tier

Item meta-data

Indexshards

User history

Deploy with Conventional Search System

Page 58: DFW Big Data talk on Mahout Recommenders

65©MapR Technologies 2013- Confidential

Objective Results

At a very large credit card company

History is all transactions

Development time to minimal viable product about 4 months

General release 2-3 months later

Search-based recs at or equal in quality to other techniques

Page 59: DFW Big Data talk on Mahout Recommenders

66©MapR Technologies 2013- Confidential

Summary

Input: Multiple kinds of behavior on one set of things

Output: Recommendations for one kind of behavior with a different set of things

Cross recommendation is a special case

Page 60: DFW Big Data talk on Mahout Recommenders

67©MapR Technologies 2013- Confidential

Objective Results

At a very large credit card company

History is all transactions

Development time to minimal viable product about 4 months

General release 2-3 months later

Search-based recs at or equal in quality to other techniques

Page 61: DFW Big Data talk on Mahout Recommenders

68©MapR Technologies 2013- Confidential

Me, Us

Ted Dunning, Chief Application Architect, MapRCommitter PMC member, Mahout, Zookeeper, DrillBought the beer at the first HUGtdunning@{apache.org,maprtech.com} [email protected]

MapRDistributes more open source components for HadoopAdds major technology for performance, HA, industry standard API’s

TonightHash tag - #dfwbd #maprSee also - @ApacheMahout @ApacheDrill

@ted_dunning and @mapR