How to build a recommender system?

Coen Stevens, Lead Recommendation Engineer

Wakoopa use case

Mission: Discover software & games

Mac, Windows, Linux

Software tracker

Your profile

Updates

Software pages

Recommendations

Building a recommender system: Approach and challenges

What do we have?

Usage (implicit) vs. Ratings (explicit)

Usage (implicit):

• Noisy

• Only positive feedback

• Easy to collect

Ratings (explicit):

• Accurate

• Positive and negative feedback

• Hard to collect

Data: what do we use?

• Active users (Tracker activity in the past month): ~9,000

• Actively used software items (in the past month): ~10,000

• We calculate recommendations separately for each OS, each combined with Web applications

Recommender system methods

• Item-based collaborative filtering

• User-based collaborative filtering (we use it only to calculate user similarities, to find people like you)

• Combining both methods

Collaborative recommendations: The user will be recommended items that people with similar tastes and preferences liked (used) in the past

Item-Based Collaborative Filtering: User software usage matrix

220 90 - 180 - 22 -

280 12 42 - 80 - -

175 210 - 210 - 45 -

165 - 35 195 13 25 -

- 100 50 185 - 35 190

- 60 - 65 - - 185

Software items (columns); - marks no recorded usage

User software usage matrix [0, 1]

1 1 0 1 0 1 0

1 1 1 0 1 0 0

1 1 0 1 0 1 0

1 0 1 1 1 1 0

0 1 1 1 0 1 1

0 1 0 1 0 0 1

Software items
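The step from raw usage to the [0, 1] matrix can be sketched as follows. This is a minimal illustration that assumes a cell becomes 1 whenever any usage was recorded; the slides do not state the exact rule, and the small `usage` matrix and the `binarize` helper are hypothetical.

```python
# Hypothetical usage amounts (units unspecified); None means no recorded usage.
usage = [
    [220, 90, None, 180],
    [None, 60, None, 65],
]

def binarize(matrix):
    """Map each cell to 1 if any usage was recorded, otherwise 0."""
    return [[1 if cell else 0 for cell in row] for row in matrix]

binary = binarize(usage)
```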

How do we predict the probability that I would like to use Gmail?

1 1 0 1 0 1 0

1 1 1 0 1 0 0

1 1 ? 1 0 1 0

1 0 1 1 1 1 0

0 1 1 1 0 1 1

0 1 0 1 0 0 1

Software items

Calculate the similarities between Gmail and the other software items.

Cosine Similarity(Firefox, Gmail)
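Item-to-item cosine similarity compares two columns of the binary matrix. The sketch below uses the matrix from the slides; which column is Firefox and which is Gmail is an assumption made only for illustration (columns 0 and 2 here), and the helper names are hypothetical.

```python
import math

# Binary user x item matrix from the slides (rows: users, columns: software items).
R = [
    [1, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 0, 1],
]

def column(matrix, j):
    """Return column j, i.e. one item's usage across all users."""
    return [row[j] for row in matrix]

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Assumption: column 0 stands for Firefox and column 2 for Gmail.
sim = cosine(column(R, 0), column(R, 2))
```

Two users have both items, four users have the first and three have the second, so the value is 2 / (√4 × √3).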

Popularity correction: we put less trust in popular software

Item-item correlation matrix

1 0.1 0.6 0.1 0.1 0.1 0.7

0.2 1 0.8 0.5 0.8 0.1 0.9

0.1 0.6 1 0.5 0.7 0.2 0.3

0.2 0.6 0.4 1 0.8 0.2 0.3

0.5 0.4 0.4 0.4 1 0.1 0.2

0.5 0.5 0.3 0.5 0.3 1 0.3

0.2 0.6 0.3 0.8 0.7 0.7 1

Gmail similarities

K-nearest neighbor approach

• Performance vs quality

• We take only the ‘K’ most similar items (say 4)

• Space complexity: O(m + Kn)

• Computational complexity: O(m + n²)
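Keeping only the K most similar items can be sketched like this. The similarity values are read from the third column of the item-item matrix above, on the assumption that this column belongs to Gmail; the `k_nearest` helper and the tie-breaking rule (lower item index first) are illustrative choices, not something the slides specify.

```python
# Similarities of every item to Gmail (assumed to be item index 2).
gmail_sims = [0.6, 0.8, 1.0, 0.4, 0.4, 0.3, 0.3]
GMAIL = 2

def k_nearest(sims, target, k):
    """Return the k (item_index, similarity) pairs most similar to target, excluding target itself."""
    candidates = [(i, s) for i, s in enumerate(sims) if i != target]
    # Sort by similarity, highest first; break ties by item index for a stable result.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return candidates[:k]

neighbors = k_nearest(gmail_sims, GMAIL, 4)
```

With K = 4 this keeps the items with similarities 0.8, 0.6, 0.4 and 0.4, which are the values that appear in the prediction formula on the later slide.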

Calculate the predicted value for Gmail

[Figure: the Gmail similarities next to the user's usage]

Usage correction: more usage results in a higher score [0, 1]

((0.6 × 0.9) + (0.8 × 0.8) + (0.4 × 0.6)) / (0.6 + 0.8 + 0.4 + 0.4) ≈ 0.65
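The prediction is a similarity-weighted average over the K neighbors. The sketch below assumes the fourth neighbor (similarity 0.4) is not used by this user and therefore contributes 0 to the numerator; the `predict` helper is hypothetical. With the numbers on the slide the usage-weighted score is 1.42 / 2.2 ≈ 0.65, while the unweighted version (usage counted as 1) gives 1.8 / 2.2 ≈ 0.82.

```python
def predict(similarities, usage_weights):
    """Similarity-weighted average of the user's usage of the K neighboring items."""
    total = sum(similarities)
    if total == 0:
        return 0.0
    weighted = sum(s * u for s, u in zip(similarities, usage_weights))
    return weighted / total

gmail_sims = [0.6, 0.8, 0.4, 0.4]   # the K = 4 nearest neighbors of Gmail
user_usage = [0.9, 0.8, 0.6, 0.0]   # usage scores in [0, 1]; 0.0 = not used (assumption)

score = predict(gmail_sims, user_usage)
binary_score = predict(gmail_sims, [1, 1, 1, 0])  # without the usage correction
```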

• User feedback

• Contacts usage

• Commercial vs Free

Calculate all unknown values and show the Top-N recommendations to each user

[Figure: user × software items matrix; known cells hold 1, unknown cells are marked ?]
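Once every unknown cell has a predicted score, picking the Top-N for one user is a sort. A minimal sketch, with hypothetical item names and scores; the `top_n` helper is not from the slides.

```python
def top_n(predicted, already_used, n):
    """Return the n highest-scoring items the user does not use yet."""
    unseen = [(item, s) for item, s in predicted.items() if item not in already_used]
    # Highest score first; ties broken alphabetically for a stable result.
    unseen.sort(key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in unseen[:n]]

# Hypothetical predicted scores for one user.
predicted = {"item_a": 0.65, "item_b": 0.30, "item_c": 0.82, "item_d": 0.10}
used = {"item_d"}

recs = top_n(predicted, used, 2)
```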

Explainability: Why did I get this recommendation?

• Overlap between the item’s (K) neighbors and your usage
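The overlap can be computed directly from the neighbor list. This is a sketch with hypothetical item names; `explain` is an illustrative helper, not the production code.

```python
def explain(item_neighbors, user_items):
    """Return the neighbors of the recommended item that the user already uses."""
    return [item for item in item_neighbors if item in user_items]

# Hypothetical K = 4 neighbors of the recommended item, most similar first.
neighbors = ["item_a", "item_b", "item_c", "item_d"]
used = {"item_a", "item_b", "item_c", "item_f"}

reasons = explain(neighbors, used)
```

The result could back a message such as "recommended because you use item_a, item_b and item_c".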

User-Based Collaborative Filtering

1 1 0 1 0 1 0

1 1 1 0 1 0 0

1 1 0 1 0 1 0

1 1 1 1 1 1 0

0 1 1 1 0 1 1

0 1 0 1 0 0 1

Cosine Similarity(Coen, Menno)

Finding people like you

0.1 0.2 0 0.4 0 0.4 0

0.1 0.2 0.6 0 0.8 0 0

0.1 0.2 0 0.4 0 0.4 0

0.1 0.2 0.6 0.4 0.8 0.4 0

0 0.2 0.6 0.4 0 0.4 0.2

0 0.2 0 0.4 0 0 0.2

Applying inverse user frequency

log(n / n_i), where n_i is the number of users who use item i and n is the total number of users in the database

The fact that you both use TextMate tells you more than the fact that you both use Firefox
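Inverse user frequency can be sketched as below: each item gets the weight log(n / n_i), and user similarity is the cosine between the weighted rows. The natural logarithm is an assumption (the slides only say "log"), and the computed weights are not meant to reproduce the rounded numbers shown on the slide.

```python
import math

# Binary user x item matrix from the slide (rows: users, columns: software items).
R = [
    [1, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 0, 1],
]

def inverse_user_frequency(matrix):
    """Weight per item: log(n / n_i), with n users in total and n_i users of item i."""
    n = len(matrix)
    weights = []
    for j in range(len(matrix[0])):
        n_i = sum(row[j] for row in matrix)
        weights.append(math.log(n / n_i) if n_i else 0.0)
    return weights

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

iuf = inverse_user_frequency(R)
weighted = [[cell * w for cell, w in zip(row, iuf)] for row in R]

sim_0_2 = cosine(weighted[0], weighted[2])  # two users with identical usage
sim_0_1 = cosine(weighted[0], weighted[1])  # two users sharing only common items
```

An item used by everyone (the second column here) gets weight log(1) = 0, so sharing it says nothing about how alike two users are.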

1 0.8 0.6 0.5 0.7 0.2

0.8 1 0.4 0.7 0.5 0.5

0.6 0.4 1 0.4 0.9 0.1

0.5 0.8 0.4 1 0.6 0.4

0.8 0.5 0.9 0.6 1 0.2

0.2 0.5 0.1 0.4 0.2 1

User-user correlation matrix

Performance: measure for success

• Cross-validation: Train-Test split (80-20)

• Precision and recall:
  - precision = size(hit set) / size(total given recs)
  - recall = size(hit set) / size(test set)

• Root mean squared error (RMSE)
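The measures above can be sketched as follows, where the hit set is the intersection of the recommended items and the held-out test items. The function names and the sample data are hypothetical.

```python
import math

def precision_recall(recommended, test_set):
    """precision = |hits| / |recommended|, recall = |hits| / |test set|."""
    hits = set(recommended) & set(test_set)
    precision = len(hits) / len(recommended) if recommended else 0.0
    recall = len(hits) / len(test_set) if test_set else 0.0
    return precision, recall

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual values."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical example: 4 recommendations, 3 held-out items, 2 hits.
precision, recall = precision_recall(["a", "b", "c", "d"], ["b", "d", "e"])
error = rmse([0.5, 1.0], [1.0, 1.0])
```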

Implementation

• Ruby Enterprise Edition (garbage collection)

• MySQL database

• Built our own C libraries

• Amazon EC2:
  - Low cost
  - Flexibility
  - Ease of use

• Open source

Future challenges

• What is the best algorithm for Wakoopa? (or for you)

• Reducing space-time complexity (scalability):
  - Parallelization (Clojure)
  - Distributed computing (Hadoop)

1 evening, 3 speakers, 100 developers

www.recked.org
