Scalable Collaborative Filtering for Commerce Recommendation Yiqun Hu & Yew Yap Goh {yiqhu, ygoh}@paypal.com September, 2014

Slides for the DataScience.SG Meetup (23 September 2014): an introduction to a scalable collaborative filtering method, built on Hadoop MapReduce, that we developed for a commerce recommendation application.

Page 1: Scalable Collaborative Filtering for Commerce Recommendation

Scalable Collaborative Filtering for Commerce Recommendation

Yiqun Hu & Yew Yap Goh {yiqhu, ygoh}@paypal.com

September, 2014

Page 2: Scalable Collaborative Filtering for Commerce Recommendation

Agenda

• Our Problem: Commerce Recommendation

• Collaborative Filtering (CF) 101

• Mahout’s Solution and Its Problems

• Our Solution

• Summary & Take Away

Page 3: Scalable Collaborative Filtering for Commerce Recommendation

Commerce Recommendation

Page 4: Scalable Collaborative Filtering for Commerce Recommendation

Which One To Choose?

We Accept

Page 7: Scalable Collaborative Filtering for Commerce Recommendation

A Win-Win Solution

Consumer: Make good use of my money

Merchant: Grow my business

Recommendation Engine

Page 8: Scalable Collaborative Filtering for Commerce Recommendation

CF Input - Interaction Matrix

[Figure: Commerce Interaction Matrix (Consumer × Merchant)]

Page 9: Scalable Collaborative Filtering for Commerce Recommendation

CF Input - Implicit Feedback

[Figure: the Commerce Interaction Matrix (Consumer × Merchant) is converted into a Binary Likeness Matrix and a Confidence Matrix]

* Yifan Hu, Yehuda Koren and Chris Volinsky, Collaborative Filtering for Implicit Feedback Datasets, ICDM 2008
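In the cited paper, the two derived matrices are defined from the raw interaction value r: the binary preference is 1 where r > 0 and 0 otherwise, and the confidence is 1 + alpha * r. A minimal sketch of that conversion follows, assuming a small dense NumPy array. The function name is hypothetical, and alpha = 40 is the value reported in the paper's experiments, used here only as a placeholder since the slides do not state the value they used.

```python
import numpy as np

def implicit_feedback(R, alpha=40.0):
    """Return (P, C) for a consumer-by-merchant interaction matrix R.

    P is the binary likeness matrix, C the confidence matrix.
    alpha is an assumed value; the slides do not give one.
    """
    R = np.asarray(R, dtype=float)
    P = (R > 0).astype(float)   # 1 where any interaction was observed
    C = 1.0 + alpha * R         # confidence grows with interaction strength
    return P, C

# Two consumers, three merchants (made-up counts for illustration).
R = np.array([[3, 0, 1],
              [0, 2, 0]])
P, C = implicit_feedback(R, alpha=40.0)
```

Unobserved pairs keep a confidence of 1 rather than 0, so they still contribute weakly to the fit as "probably not liked".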

Page 10: Scalable Collaborative Filtering for Commerce Recommendation

CF Modeling - Matrix Factorization

• U - the models of every consumer

• V - the models of every merchant

• Find the optimal U/V via optimization

[Figure: the Consumer × Merchant matrix approximated by the product of the consumer models U and the merchant models V]
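Once U and V are available, a consumer's affinity for every merchant is the dot product of the two models. The sketch below shows that scoring step under stated assumptions: U and V are dense NumPy arrays with one row per consumer or merchant, and `recommend` is a hypothetical helper, not a function from the talk. It does not filter out merchants the consumer has already used.

```python
import numpy as np

def recommend(U, V, consumer, top_n=2):
    """Return the indices of the top_n merchants for one consumer."""
    scores = V @ U[consumer]            # one score per merchant
    return np.argsort(-scores)[:top_n]  # highest scores first

# Toy factors: 1 consumer, 3 merchants, rank 2 (made-up values).
U = np.array([[1.0, 0.0]])
V = np.array([[0.1, 0.0],
              [0.9, 0.0],
              [0.5, 0.0]])
top = recommend(U, V, consumer=0, top_n=2)
```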

Page 11: Scalable Collaborative Filtering for Commerce Recommendation

Alternating Least Squares (ALS)

• Iteratively update: fix V and update U, then fix U and update V

• Objective terms: Data Fitting + Regularization

* Yifan Hu, Yehuda Koren and Chris Volinsky, Collaborative Filtering for Implicit Feedback Datasets, ICDM 2008
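For reference, the implicit-feedback objective from the cited paper can be written in the slides' U/V notation as below. The symbols u_i, v_j, p_ij, c_ij and λ are assumed names (the slide's own formula did not survive extraction); C^i denotes the diagonal confidence matrix of consumer i and p(i) that consumer's binary preference vector.

```latex
\min_{U,V}\;
\underbrace{\sum_{i,j} c_{ij}\,\bigl(p_{ij} - u_i^{\top} v_j\bigr)^{2}}_{\text{data fitting}}
\;+\;
\underbrace{\lambda\Bigl(\sum_{i} \lVert u_i \rVert^{2} + \sum_{j} \lVert v_j \rVert^{2}\Bigr)}_{\text{regularization}}

% With V fixed, each consumer model has a closed-form least squares solution:
u_i = \bigl(V^{\top} C^{i} V + \lambda I\bigr)^{-1} V^{\top} C^{i}\, p(i)
```

The update for each v_j with U fixed has the same form with the roles of U and V exchanged.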

Page 12: Scalable Collaborative Filtering for Commerce Recommendation

Scalable ALS

• Constant in current iteration

• Only need to consider the nonzero entries (annotated on two terms)

* Yifan Hu, Yehuda Koren and Chris Volinsky, Collaborative Filtering for Implicit Feedback Datasets, ICDM 2008
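The cited paper obtains these two properties by rewriting V^T C V as V^T V + V^T (C - I) V: the first term is the same for every consumer within an iteration, and C - I is nonzero only where an interaction was observed. The sketch below applies that identity to a single consumer update. It assumes dense NumPy factors; `update_user`, `alpha` and `lam` are hypothetical names and values, not taken from the slides.

```python
import numpy as np

def update_user(V, VtV, items, ratings, alpha=40.0, lam=0.1):
    """Solve for one consumer's model with the merchant models V fixed.

    V:       (n_merchants, k) merchant models
    VtV:     precomputed V.T @ V, shared by all consumers this iteration
    items:   indices of merchants with a nonzero interaction
    ratings: interaction values for those merchants
    """
    k = V.shape[1]
    Vu = V[items]                                    # only the nonzero rows
    conf = 1.0 + alpha * np.asarray(ratings, dtype=float)
    # V^T C V = V^T V + V_u^T diag(conf - 1) V_u
    A = VtV + (Vu.T * (conf - 1.0)) @ Vu + lam * np.eye(k)
    # V^T C p: p is 1 exactly on the nonzero entries
    b = Vu.T @ conf
    return np.linalg.solve(A, b)

rng = np.random.default_rng(0)
V = rng.normal(size=(6, 3))
VtV = V.T @ V
u = update_user(V, VtV, items=[0, 3], ratings=[2.0, 1.0])
```

The per-consumer cost then depends on the number of observed interactions and the rank k, not on the total number of merchants.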

Page 13: Scalable Collaborative Filtering for Commerce Recommendation

Open Source Technologies

Page 16: Scalable Collaborative Filtering for Commerce Recommendation

Implementation in Mahout

• Aggregate user ratings: (1534,2323,2) (1534,1128,3) (1534,5678,1) … → (1534, {1128:3, 2323:2, 5678:1, …})

• Initialize item models: initialize matrix V using the average ratings

• Update user models

• Update item models

• Run K iterations
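The aggregation step turns (user, item, rating) triples into one rating map per user. A minimal in-memory sketch of that grouping is shown below; in Mahout this runs as a MapReduce job, so the plain-Python function here is only an illustration, and `aggregate_user_ratings` is a hypothetical name. If a (user, item) pair appears more than once, this sketch keeps the last value, which is an assumption.

```python
from collections import defaultdict

def aggregate_user_ratings(triples):
    """Group (user, item, rating) triples into {user: {item: rating}}."""
    by_user = defaultdict(dict)
    for user, item, rating in triples:
        by_user[user][item] = rating
    return dict(by_user)

# The three triples shown on the slide.
triples = [(1534, 2323, 2), (1534, 1128, 3), (1534, 5678, 1)]
aggregated = aggregate_user_ratings(triples)
```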

Page 17: Scalable Collaborative Filtering for Commerce Recommendation

How To Parallelize?

[Figure: the Input Transaction Matrix is split across Worker 1, Worker 2, Worker 3, …, Worker K-1, Worker K]

Page 20: Scalable Collaborative Filtering for Commerce Recommendation

For Every Worker …

Worker 1

…

Compute

Solve Least Squares Problem

Page 21: Scalable Collaborative Filtering for Commerce Recommendation

Simple Illustration

[Figure: Commerce Interaction Matrix (Consumer × Merchant) processed by Worker 1, Worker 2, Worker 3 and Worker 4]

Page 26: Scalable Collaborative Filtering for Commerce Recommendation

Simple Illustration

Consumer

Merchant

Commerce Interaction Matrix

Worker&1&

Worker&2&

Worker&3&

Worker&4&

Page 27: Scalable Collaborative Filtering for Commerce Recommendation

Anything Wrong?!

• Unnecessary broadcast of all item vectors to every worker;

• The volume of all item vectors can be too large to load into memory.

[Figure: Worker 1, Worker 2, Worker 3, …, Worker K-1, Worker K]
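A rough back-of-the-envelope estimate shows why broadcasting every item vector becomes a problem. The sketch below counts only the raw factor values. The rank of 100 and the 8 bytes per value are assumptions (the slides give neither), and the merchant count is the larger of the two figures quoted in this deck's scalability comparison.

```python
def factor_matrix_bytes(n_vectors, rank, bytes_per_value=8):
    """Raw size of a dense factor matrix, ignoring any object overhead."""
    return n_vectors * rank * bytes_per_value

n_merchants = 4_386_744  # merchant count quoted later in this deck
rank = 100               # hypothetical number of latent factors
size = factor_matrix_bytes(n_merchants, rank)
gib = size / 2**30
print(f"{size:,} bytes, about {gib:.2f} GiB per worker")
```

Under these assumptions every worker would need several GiB just for the item vectors, before counting any runtime overhead.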

Page 28: Scalable Collaborative Filtering for Commerce Recommendation

Scalable ALS

• Constant in current iteration

• Only need to consider the nonzero entries (annotated on two terms)

Page 29: Scalable Collaborative Filtering for Commerce Recommendation

ALS Recap

Page 31: Scalable Collaborative Filtering for Commerce Recommendation

The Solution of Spotify

* Chris Johnson, Erik Bernhardsson, Algorithmic Music Recommendations at Spotify

Page 32: Scalable Collaborative Filtering for Commerce Recommendation

For Every Map Job …

[Figure: JobTracker; Worker 1, Worker 2, Worker 3; block 1, block 2, block 3]

Page 35: Scalable Collaborative Filtering for Commerce Recommendation

Simple Illustration

[Figure: Mapper 1, Mapper 2, Mapper 3, Mapper 4, Mapper 5]

Page 41: Scalable Collaborative Filtering for Commerce Recommendation

Simple Illustration

[Figure: Mapper 1, Mapper 2, Mapper 3, Mapper 4, Mapper 5; Reducer 1, Reducer 2, Reducer 3, Reducer 4]

Page 42: Scalable Collaborative Filtering for Commerce Recommendation

Inside A Mapper

[Figure: Mapper, Memory, File System]

Page 46: Scalable Collaborative Filtering for Commerce Recommendation

Inside A Mapper

[Figure: Mapper, Memory, File System]

Disadvantages:

• Unnecessary file copy to all nodes before the job;

• Each map function call incurs a potential context switch with inefficient I/O load.

Page 47: Scalable Collaborative Filtering for Commerce Recommendation

Our Solution - Shard Partition

(Lee, {1:1, 2:5}) (Kim, {1:4, 2:2}) (Ha, {1:2, 2:4}) (Oh, {1:2, 2:4})

(Lee, {4:2}) (Kim, {4:5}) (Ha, {3:3}) (Oh, {4:5})

(Lee, {5:4}) (Kim, {5:1, 6:2}) (Ha, {6:5}) (Oh, {5:1})

[Figure: Commerce Interaction Matrix (Consumer × Merchant)]

Page 53: Scalable Collaborative Filtering for Commerce Recommendation

Our Solution - Shard Partition

Shard 1: (Lee, {1:1, 2:5}) (Kim, {1:4, 2:2}) (Ha, {1:2, 2:4}) (Oh, {1:2, 2:4})

Shard 2: (Lee, {4:2}) (Kim, {4:5}) (Ha, {3:3}) (Oh, {4:5})

Shard 3: (Lee, {5:4}) (Kim, {5:1, 6:2}) (Ha, {6:5}) (Oh, {5:1})

[Figure: Commerce Interaction Matrix (Consumer × Merchant) split into Shard 1, Shard 2 and Shard 3]
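In the slide's example each shard holds a contiguous range of merchant IDs (1-2, 3-4, 5-6), and every consumer's rating map is split accordingly. A minimal sketch of that partitioning follows, assuming 1-based integer merchant IDs and a fixed shard width. `partition_into_shards` and `shard_width` are hypothetical names, and shards are numbered from 0 here rather than from 1.

```python
def partition_into_shards(user_ratings, shard_width):
    """Split each consumer's {merchant_id: rating} map by merchant ID range.

    Shard s holds merchant IDs s*shard_width + 1 .. (s + 1)*shard_width.
    """
    shards = {}
    for user, ratings in user_ratings.items():
        for merchant, rating in ratings.items():
            s = (merchant - 1) // shard_width
            shards.setdefault(s, {}).setdefault(user, {})[merchant] = rating
    return shards

# The four consumers from the slide, with their shards merged back together.
user_ratings = {
    "Lee": {1: 1, 2: 5, 4: 2, 5: 4},
    "Kim": {1: 4, 2: 2, 4: 5, 5: 1, 6: 2},
    "Ha":  {1: 2, 2: 4, 3: 3, 6: 5},
    "Oh":  {1: 2, 2: 4, 4: 5, 5: 1},
}
shards = partition_into_shards(user_ratings, shard_width=2)
```

With this layout, a job that processes one shard only needs the merchant models for that shard's ID range.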

Page 56: Scalable Collaborative Filtering for Commerce Recommendation

Our Solution - Parallel Shard Processing

Stage 1: Compute the individual contributions of each rating

• MapReduce Job for Shard 1

• MapReduce Job for Shard 2

• MapReduce Job for Shard 3

Stage 2: Aggregate all contributions for every user and update their models in parallel

• Global MapReduce Job for ALS

[Figure: Commerce Interaction Matrix (Consumer × Merchant) split into Shard 1, Shard 2 and Shard 3]
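One way to read the two stages, sketched in plain Python rather than MapReduce: each shard job turns its nonzero ratings into partial sums per consumer, and a global step adds the partial sums and solves one small linear system per consumer. This is an interpretation under stated assumptions, not the authors' code. The function names, `alpha` and `lam` are hypothetical, and the confidence weighting follows the implicit-feedback paper cited earlier in the deck.

```python
import numpy as np

def stage1_contributions(shard_ratings, shard_V, alpha=40.0):
    """Per-shard job: partial (A, b) sums per consumer from nonzero ratings.

    shard_ratings: {consumer: {merchant: rating}} for this shard only
    shard_V:       {merchant: model vector} for this shard's merchants only
    """
    partial = {}
    for user, ratings in shard_ratings.items():
        for merchant, r in ratings.items():
            v = shard_V[merchant]
            c = 1.0 + alpha * r
            A, b = partial.get(user, (0.0, 0.0))
            partial[user] = (A + (c - 1.0) * np.outer(v, v), b + c * v)
    return partial

def stage2_update(partials, VtV, lam=0.1):
    """Global job: sum all shard contributions per consumer, then solve."""
    k = VtV.shape[0]
    totals = {}
    for partial in partials:
        for user, (A, b) in partial.items():
            A0, b0 = totals.get(user, (0.0, 0.0))
            totals[user] = (A0 + A, b0 + b)
    return {user: np.linalg.solve(VtV + A + lam * np.eye(k), b)
            for user, (A, b) in totals.items()}
```

Because V^T V is itself a sum over merchants, it can also be accumulated shard by shard, so no single job needs every merchant model in memory at once.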

Page 57: Scalable Collaborative Filtering for Commerce Recommendation

For Every Map Job …

[Figure: JobTracker; Worker 1, Worker 2, Worker 3; block 1, block 2, block 3]

Page 59: Scalable Collaborative Filtering for Commerce Recommendation

Scalability Comparison

                #Customers   #Merchants   #Transactions   Capacity
Apache Mahout    4,790,651    1,195,890      23,721,103   Fail
Our Method      45,948,109    4,386,744     324,084,408   Success

More than 10x scalability improvement!
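The ratios behind this comparison can be checked directly from the two rows. The short sketch below divides the dataset size that the proposed method handled by the size at which Apache Mahout failed; the variable names are illustrative. It gives roughly 9.6x for customers, 3.7x for merchants and 13.7x for transactions.

```python
# Dataset sizes copied from the comparison table.
mahout = {"customers": 4_790_651, "merchants": 1_195_890, "transactions": 23_721_103}
ours = {"customers": 45_948_109, "merchants": 4_386_744, "transactions": 324_084_408}

ratios = {name: ours[name] / mahout[name] for name in mahout}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```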

Page 60: Scalable Collaborative Filtering for Commerce Recommendation

Summary

• Recommendation Problem

• Collaborative Filtering by Matrix Factorization

• Alternating Least Squares (ALS)

• A Scalable Solution

Page 61: Scalable Collaborative Filtering for Commerce Recommendation

Thank You!

Question & Answer