Tim Leonard
linkedin.com/in/letimle
@letimle
A song recommendation engine.
Data feed from hypem.com
songprof.fr: Proof-of-concept recommendation engine
hypem.com user data
Hype Machine does not make recommendations (vs. competition), which means lost engagement
Users ♥ songs they like
Blog posts aggregated into play-list
Hype Machine: “The best place to find new music on the web.”
The data:
125,566 Songs
9,000 Users
900k User + Song Interactions to model
An example of 687 songs with 27 Users in common
(Limited to interactions within last year)
Bipartite Graph relationship
Commonalities in behavior -> song recommendations
Collaborative filtering
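As a rough illustration of the bipartite user/song relationship, the sketch below builds the two adjacency maps from like interactions and collects candidate songs from users with overlapping likes. The sample data and the names build_bipartite and candidate_songs are made up for this example; this is not the code behind songprof.fr.

```python
from collections import defaultdict

# Hypothetical (user, song) "like" interactions; not real hypem.com data.
likes = [
    ("u1", "songA"), ("u1", "songB"),
    ("u2", "songA"), ("u2", "songB"), ("u2", "songC"),
    ("u3", "songB"), ("u3", "songC"),
]

def build_bipartite(interactions):
    """Return both adjacency maps of the user-song bipartite graph."""
    songs_by_user = defaultdict(set)
    users_by_song = defaultdict(set)
    for user, song in interactions:
        songs_by_user[user].add(song)
        users_by_song[song].add(user)
    return songs_by_user, users_by_song

def candidate_songs(user, songs_by_user, users_by_song):
    """Songs liked by users who share a like with `user`, minus the user's own likes."""
    own = set(songs_by_user[user])
    neighbours = set()
    for song in own:
        neighbours |= users_by_song[song]
    neighbours.discard(user)
    candidates = set()
    for other in neighbours:
        candidates |= songs_by_user[other]
    return candidates - own

songs_by_user, users_by_song = build_bipartite(likes)
print(sorted(candidate_songs("u1", songs_by_user, users_by_song)))  # ['songC']
```

Here u1 shares likes with u2 and u3, so the only song they liked that u1 has not is songC.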
Method:
1) Generate features based on nearest-neighbor song commonalities between users
2) Use ML to classify the quality of recommendations, based on the features from 1)
Why use a two-stage approach?
Network properties alone are not sufficient for predicting whether a song will be liked.
Optimal network features need to be discovered.
[Figure: bipartite graph of Users and Songs connected by ♥ (likes), with one Song marked "Possible Recommendation"]
(Number of users that both liked a song) / (Total number of possible likes)
1) Generate features based on nearest neighbor song commonalities, between users
Advantages:
- Very efficient
- Features can be used with ML
Disadvantages:
- Suffers from cold start problem (vs. more complex, CPU-expensive al-
Training/test sets: 67k/18k songs
♥ = (Number of users that both liked a song) / (Total number of possible likes)
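The ratio above could be computed along these lines. The slides do not define "total number of possible likes" precisely; this sketch assumes it means the number of users who liked at least one of the two songs (a Jaccard-style ratio). The function names and sample data are hypothetical.

```python
def co_like_score(users_a, users_b):
    """Shared likers divided by possible likers.

    Assumption: "possible likes" = users who liked either song.
    """
    possible = users_a | users_b
    if not possible:
        return 0.0
    return len(users_a & users_b) / len(possible)

def nearest_songs(source, users_by_song, k=2):
    """Return the k songs most similar to `source` as (song, score) pairs."""
    scored = [
        (co_like_score(users_by_song[source], users), song)
        for song, users in users_by_song.items()
        if song != source
    ]
    # Highest score first; ties broken by song name for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(song, score) for score, song in scored[:k]]

# Hypothetical likers per song.
users_by_song = {
    "songA": {"u1", "u2"},
    "songB": {"u1", "u2", "u3"},
    "songC": {"u2", "u3"},
}

for song, score in nearest_songs("songA", users_by_song):
    print(song, round(score, 3))
```

Scores like these would serve as the network features passed to the classifier in step 2.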
Gradient Boosting Machine (GBM) classifier, 6-fold CV.
Did the user like the song, yes/no?
78% raw accuracy (liked yes/no)
64% predictive accuracy (AUC)
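The two figures measure different things: raw accuracy counts correct yes/no calls at a fixed threshold, while AUC measures how well liked songs are ranked above not-liked ones. The sketch below shows both metrics plus a simple 6-fold split in plain Python. The labels and scores are made up, the 0.5 threshold is an assumption, and none of this is the evaluation code used for the results above.

```python
def kfold_indices(n, k=6):
    """Yield (train_idx, test_idx) for k contiguous folds over range(n)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

def accuracy(labels, scores, threshold=0.5):
    """Fraction of yes/no calls that match the label (threshold is assumed)."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return hits / len(labels)

def auc(labels, scores):
    """Chance that a random liked song outscores a random not-liked one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both liked and not-liked examples")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

# Made-up labels (1 = liked) and model scores for six user/song pairs.
labels = [1, 0, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.1, 0.4, 0.3]

print(round(accuracy(labels, scores), 3))  # 0.667
print(auc(labels, scores))                 # 0.875
```

When most pairs are "not liked", accuracy can look high even if the ranking is weak, which is one reason to report both numbers.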
Out of sample validation of model:
2) Use ML to classify the quality of recommendations, based on network features
For subset of data, generate predictions, check against historical ♥ data:
Demo: www.songprof.fr
Songs ordered by likelihood of being liked, given that the source song was liked
Based on network similarity metrics fed into the model
Trained on actual ‘likes’ behavior
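The demo's ordering step might look roughly like this: score each candidate against the source song and sort by predicted probability. The lookup table stands in for a trained model; its values, and the names recommend and fake_model, are invented for illustration.

```python
def recommend(source_song, candidates, predict_like_probability, top_n=3):
    """Rank candidate songs by predicted probability of being liked."""
    scored = [
        (predict_like_probability(source_song, song), song)
        for song in candidates
        if song != source_song
    ]
    # Most likely first; ties broken by song name for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [song for _, song in scored[:top_n]]

# Stand-in for a trained classifier: made-up probabilities per (source, candidate).
fake_probabilities = {
    ("songA", "songB"): 0.81,
    ("songA", "songC"): 0.35,
    ("songA", "songD"): 0.62,
}

def fake_model(source, candidate):
    return fake_probabilities.get((source, candidate), 0.0)

print(recommend("songA", ["songB", "songC", "songD", "songA"], fake_model, top_n=2))
# ['songB', 'songD']
```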
Thanks!