
APPLYING NEURAL NETWORKS TO MOVIE RECOMMENDATION

UTKARSH KAJARIA

PROBLEM MODEL TRAINING RESULTS

Idea: Use deep learning to incorporate information from the metadata available for the MovieLens dataset (such as tags, genres, and average rating) and compare the results with a base model that uses only user ID, item ID, and ratings. SVD serves as the benchmark algorithm.

The Dataset and Benchmark: This dataset contains 5-star rating and tagging activity from MovieLens. It contains 100004 ratings and 1296 user-generated tags across 9125 movies. When evaluated using FunkSVD, a well-known implementation of SVD, we get a mean absolute error (MAE) of 0.7155.
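As a reminder of what the MAE figure measures, here is a minimal sketch of the metric. The function name and the toy ratings are illustrative assumptions and do not come from the MovieLens data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over all rating pairs."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy ratings, made up for illustration (not from the MovieLens data).
true_ratings = [4, 3, 5, 2]
predicted_ratings = [3.5, 3, 4, 3]
mae = mean_absolute_error(true_ratings, predicted_ratings)
```

A lower MAE means predictions sit closer to the observed ratings on average, so the FunkSVD value of 0.7155 is the number the neural models need to beat.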

Feature Construction:
• userid: vectorized embedding of the user ID (similar to word2vec)
• itemId: vectorized embedding of the item ID
• avg_rating: taken from the IMDb website and normalized to lie in [-1, 1]
• tags: 582 unique tags taken as binary asymmetric features
• genres: 20 genres taken as binary asymmetric features
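The two non-embedding feature types could be built roughly as follows. This is a sketch under assumptions: the poster does not say how avg_rating was normalized, so linear min-max scaling from a 1 to 10 scale is assumed here, and the function names and the four-genre vocabulary are hypothetical.

```python
def scale_to_unit_range(value, lo, hi):
    """Linearly map value from [lo, hi] to [-1, 1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def multi_hot(present, vocabulary):
    """Binary vector with 1 where a vocabulary entry is present, else 0."""
    present = set(present)
    return [1 if term in present else 0 for term in vocabulary]


# Hypothetical mini-vocabulary; the poster uses 20 genres and 582 tags.
GENRES = ["Action", "Comedy", "Drama", "Sci-Fi"]

# Assumes averages lie on a 1-10 scale (an assumption, not stated in the source).
avg_feature = scale_to_unit_range(7.75, lo=1.0, hi=10.0)
genre_feature = multi_hot(["Drama", "Action"], GENRES)
```

The same multi_hot helper would apply to tags, only with a longer vocabulary.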

Problem Formulation: We model the problem as a multi-class classification problem with 5 possible output values, for ratings 1 through 5, using different combinations of the above features as input. (Shown in the diagram)
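The classification framing implies a mapping between ratings and class labels in both directions. The sketch below shows one way to do it. Decoding by picking the most probable class is an assumption, since the poster does not say how class outputs were turned back into ratings for the MAE calculation, and all function names are illustrative.

```python
def rating_to_class(rating):
    """Map a rating in 1..5 to a class index in 0..4."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError("expected an integer rating from 1 to 5")
    return rating - 1


def one_hot(class_index, num_classes=5):
    """Target vector for the cross-entropy loss."""
    return [1 if i == class_index else 0 for i in range(num_classes)]


def predicted_rating(probabilities):
    """Pick the most probable class and map it back to a rating (assumed decoding)."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return best + 1


target = one_hot(rating_to_class(4))
rating = predicted_rating([0.05, 0.10, 0.20, 0.50, 0.15])
```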

Training the Network
We train a 5-layer neural network consisting of one input layer, 3 hidden layers and one output layer. The diagram below is a conceptual representation. Our output layer has 5 cells, one per rating class. In addition, we apply dropout between the hidden layers.
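The forward pass of such a network can be sketched in plain Python. This is not the author's implementation: the layer widths, the weight initialization, and the input size are hypothetical, since the poster gives none of them. Only the overall shape (three relu hidden layers, dropout between them, a 5-way softmax output) follows the description above.

```python
import math
import random


def relu(v):
    return [max(0.0, x) for x in v]


def softmax(v):
    shift = max(v)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def dense(v, weights, biases):
    """Fully connected layer; weights has one row per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, biases)]


def dropout(v, rate, rng, training):
    """Inverted dropout: zero units with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(v)
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in v]


def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(features, hidden_layers, output_layer, rate, rng, training=False):
    h = features
    last = len(hidden_layers) - 1
    for i, (w, b) in enumerate(hidden_layers):
        h = relu(dense(h, w, b))
        if i < last:  # dropout sits between hidden layers
            h = dropout(h, rate, rng, training)
    w, b = output_layer
    return softmax(dense(h, w, b))


rng = random.Random(0)
# Hypothetical sizes: an 8-dim input (e.g. concatenated user/item embeddings)
# and 16 units per hidden layer; the poster does not give layer widths.
sizes = [8, 16, 16, 16]
hidden = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
output = init_layer(sizes[-1], 5, rng)
x = [rng.uniform(-1, 1) for _ in range(8)]
probs = forward(x, hidden, output, rate=0.5, rng=rng)
```

Dropout is active only when training=True. At evaluation time every unit is kept, which is why the surviving activations are rescaled during training.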

We train our model for 20, 50 and 100 epochs with cross-entropy as the loss function and Adam as the optimization algorithm.
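For reference, the two training ingredients can be written out for a single example and a single scalar parameter. This is a minimal sketch: the hyperparameter values are the defaults commonly used for Adam and are an assumption, since the poster does not report learning rate or other settings.

```python
import math


def cross_entropy(probabilities, true_class):
    """Negative log-probability assigned to the true class."""
    return -math.log(probabilities[true_class])


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)               # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Softmax output that puts probability 0.5 on the true class (rating 4, index 3).
loss = cross_entropy([0.1, 0.1, 0.2, 0.5, 0.1], true_class=3)
new_param, m, v = adam_step(param=1.0, grad=2.0, m=0.0, v=0.0, t=1)
```

On the first step the bias-corrected update has magnitude close to lr regardless of the gradient's scale, which is one reason Adam is relatively insensitive to how the loss is scaled.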

Results in MAE (Mean Absolute Error) for 20 epochs

ANALYSIS AND CONCLUSIONS
• MAE generally falls as dropout increases, which is consistent with dropout keeping overfitting in check. All four models reach their lowest MAE at a dropout of 0.5.
• Our base model usr_itm (MAE 0.6821 at dropout 0.5) outperforms the FunkSVD benchmark (MAE 0.7155), showing that deep learning methods can give us improvements over traditional matrix factorization algorithms.
• Secondly, the best model, usr_itm_avg, gives an MAE of 0.6674 against 0.6821 for our base model, which shows that avg_rating is a useful factor in predicting movie ratings for individual users.
• Third, the usr_itm_tags model does not improve on the base usr_itm model (0.6880 versus 0.6821 at dropout 0.5), suggesting that tags add little value in predicting user ratings.
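To put the comparison with the benchmark on a common footing, the relative MAE reduction can be computed from the reported figures. The sketch below uses the dropout 0.5 row of the results table and the FunkSVD MAE quoted earlier. The function and variable names are illustrative.

```python
def relative_improvement(baseline_mae, model_mae):
    """Fraction by which model_mae undercuts baseline_mae."""
    return (baseline_mae - model_mae) / baseline_mae


FUNKSVD_MAE = 0.7155

# MAE after 20 epochs at dropout 0.5, as reported in the results table.
mae_at_dropout_05 = {
    "usr_itm": 0.6821,
    "usr_itm_avg": 0.6674,
    "usr_itm_genre": 0.6742,
    "usr_itm_tags": 0.6880,
}

improvement = {
    name: relative_improvement(FUNKSVD_MAE, mae)
    for name, mae in mae_at_dropout_05.items()
}
best_model = min(mae_at_dropout_05, key=mae_at_dropout_05.get)
```

By this measure usr_itm_avg reduces MAE by roughly 6.7% relative to FunkSVD, and the base usr_itm model by roughly 4.7%.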

DATA SCIENCE, COLLEGE OF SCIENCE & ENGINEERING

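[Diagram: input features userid, itemid, avg_rating, genres and tags; model variants user_itm, user_itm_avg, user_itm_genre and user_itm_tags; three hidden layers with relu activations; softmax output layer]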

Dropout   usr_itm (m1)   usr_itm_avg (m2)   usr_itm_genre (m3)   usr_itm_tags (m4)
0.0       0.830          0.801              0.782                0.834
0.1       0.755          0.731              0.725                0.755
0.2       0.726          0.715              0.702                0.718
0.3       0.690          0.699              0.694                0.704
0.4       0.699          0.676              0.685                0.703
0.5       0.6821         0.6674             0.6742               0.6880