Data Fusion for Dealing with the Recommendation Problem
Denis Parra, PUC Chile
Keynote for IFUP
Workshop on Multi-dimensional Information Fusion for User Modeling and Personalization
UMAP 2016, Halifax, Canada
In this talk
• Recommendation of articles with user-controlled fusion
• Fusing data in the music domain
• Fusion for e-marketplaces in virtual worlds
• How to integrate time into collaborative filtering?
7/17/16 D. Parra, IFUP keynote, UMAP 2016
Part 1: Recommendation of Articles with User-Controlled Fusion
Recommendation of Articles
• Problem: a) Traditional user feedback is (was?) difficult to obtain, b) Sparsity
• There are several potential sources of recommendation, but mostly from the items:
  • Content
  • Co-citations, co-authorship
  • Etc.
• Our approach: give users control over what to fuse.
  • Would it work?
  • How much data combination is the optimum?
  • Does visual representation affect the behavior/accuracy?
References
• Verbert, K., Parra, D., Brusilovsky, P., & Duval, E. (2013). Visualizing recommendations to support exploration, transparency and controllability. In Proceedings of the 2013 International Conference on Intelligent User Interfaces (pp. 351-362). ACM.
• Parra, D., Brusilovsky, P., & Trattner, C. (2014). See what you want to see: visual user-driven approach for hybrid recommendation. In Proceedings of the 19th International Conference on Intelligent User Interfaces (pp. 235-240). ACM.
• Verbert, K., Parra, D., & Brusilovsky, P. (2014). The effect of different set-based visualizations on user exploration of recommendations. In Proceedings of the Joint Workshop on Interfaces and Human Decision Making in Recommender Systems (pp. 37-44).
TalkExplorer
• Implemented initially for a user study at ACM Hypertext 2012, for Conference Navigator.
• Main question to address: Do users consider the fusion of several sources of data when choosing relevant items?
Recap – Conference Navigator
[Figure: Conference Navigator screens: Program, Proceedings, Author List, Recommendations.]
TalkExplorer Interface
TalkExplorer - Entities
Entities: Tags, Recommender Agents, Users
TalkExplorer – Central Canvas
• Canvas Area: Intersections of Different Entities
[Figure: cluster map with recommender agents and a user as entities. Labels: "Cluster with intersection of entities"; "Cluster (of talks) associated to only one entity".]
TalkExplorer - Articles
Items: Talks explored by the user
TalkExplorer Studies I & II
• Study I
  • Controlled Experiment: Users were asked to discover relevant talks by exploring the three types of entities: tags, recommender agents and users.
  • Conducted at Hypertext and UMAP 2012 (21 users)
  • Subjects familiar with visualizations and RecSys
• Study II
  • Field Study: Users were left free to explore the interface.
  • Conducted at LAK 2012 and ECTEL 2013 (18 users)
  • Subjects familiar with visualizations, but not much with RecSys
Evaluation: Intersections & Effectiveness
• What do we call an “Intersection”?
• We used #explorations on intersections and their effectiveness, defined as:
[Equation on slide.]
Results of Studies I & II
• Effectiveness increases with intersections of more entities
• Effectiveness wasn't affected in the field study (Study II)
• … but the distribution of explorations was affected
SetFusion
• Main motivation was investigating a simpler way to visualize recommendations from several sources. Would that improve “effectiveness”?
• 3 studies were conducted
  • Field study in CSCW 2013
  • Controlled user study with iConference series
  • Field study in UMAP 2013
SetFusion
SetFusion I
Traditional Ranked List
Papers sorted by Relevance. It combines 3 recommendation approaches.
SetFusion II
Sliders
Allow the user to control the importance of each data source or recommendation method.
Interactive Venn Diagram
Allows the user to inspect and to filter the recommended papers. Actions available:
- Filter item list by clicking on an area
- Highlight a paper by mouse-over on a circle
- Scroll to paper by clicking on a circle
- Indicate bookmarked papers
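The slider idea can be illustrated with a weighted sum of per-method scores. This is a minimal sketch, not SetFusion's actual scoring formula: the method names, the scores and the `fuse` helper are hypothetical, and it assumes each method's scores are already normalized to [0, 1].

```python
from collections import defaultdict


def fuse(method_scores, weights):
    """Rank items by a weighted sum of per-method scores (illustrative only).

    method_scores: {method_name: {item_id: score in [0, 1]}}
    weights: {method_name: slider value in [0, 1]}
    """
    fused = defaultdict(float)
    for method, scores in method_scores.items():
        w = weights.get(method, 0.0)
        for item, score in scores.items():
            fused[item] += w * score
    # Sort by fused score (descending), then by item id for a stable order.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores from three recommendation methods.
method_scores = {
    "content": {"p1": 0.9, "p2": 0.4},
    "popularity": {"p2": 0.8, "p3": 0.7},
    "co_authorship": {"p1": 0.2, "p3": 0.9},
}
# Slider positions chosen by the user.
weights = {"content": 1.0, "popularity": 0.5, "co_authorship": 0.0}
ranking = fuse(method_scores, weights)
```

Moving a slider to 0 removes that method's contribution entirely, which is one simple way to let the user decide what to fuse.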
SetFusion Controlled Study
• 40 users, within-subjects study, simulated iConference attendance
Controlled Study Main Results
• Controlling and fusing sources of relevancy produces more bookmarks:
  • 58.44% of bookmarks after using sliders
  • 28.08% of bookmarks after using Venn diagram
Controlled Study Main Results
• Users prefer articles recommended by a fusion of methods, in both conditions, but the effect is stronger with the visualization
SetFusion – UMAP 2013
• Field Study: let users freely explore the interface
- ~50% (50 users) tried the SetFusion recommender
- 28% (14 users) bookmarked at least one paper
- On average, users explored 14.9 talks and bookmarked 7.36 talks.
TalkExplorer vs. SetFusion
[Figure: Clustermap (TalkExplorer) and Venn diagram (SetFusion).]
TalkExplorer vs. SetFusion
• Comparing distributions of explorations
In Studies I and II with TalkExplorer, we observed an important change in the distribution of explorations.
TalkExplorer vs. SetFusion
• Comparing distributions of explorations
Comparing the field studies:
- In TalkExplorer, 84% of the explorations over intersections were performed over clusters of 1 item
- In SetFusion, this share was only 52%, compared to 48% (18% + 30%) over multiple intersections; the difference was not statistically significant
Take-aways
• We showed that intersections of several contexts of relevance help to discover relevant items.
• The visual paradigm used can have a strong effect on user behavior: we need to keep working on visual representations that promote exploration without increasing the cognitive load on the users.
Part 2: Fusing Data in the Music Domain
References
Parra-Santander, D., & Amatriain, X. (2011). Walk the Talk: Analyzing the relation between implicit and explicit feedback for preference elicitation. Proceedings of UMAP 2011, Girona, Spain.
Parra, D., Karatzoglou, A., Amatriain, X., & Yavuz, I. (2011). Implicit feedback recommendation via implicit-to-explicit ordinal logistic regression mapping. Proceedings of the CARS Workshop, RecSys 2011, Chicago, IL, USA.
Introduction (back in 2011)
• Most recommender system approaches rely on explicit information from the users, but…
• Explicit feedback: scarce (people are not especially eager to rate or to provide personal info)
• Implicit feedback: less scarce, but (Hu et al., 2008):
  • There's no negative feedback … and if you watch a TV program just once or twice?
  • Noisy … but explicit feedback is also noisy (Amatriain et al., 2009)
  • Preference & Confidence … we aim to map the I.F. to preference (our main goal)
  • Lack of evaluation metrics … if we can map I.F. and E.F., we can have a comparable evaluation
Introduction (Today)
• Is it possible to map implicit behavior to explicit preference (ratings)? These data can eventually be fused into a single compact model.
• OUR APPROACH: Study with Last.fm users
  • Part I: Ask users to rate 100 albums (how to sample)
  • Part II: Build a model to map collected implicit feedback and context to explicit feedback
Walk the Talk (2011)
Albums they listened to during the last: 7 days, 3 months, 6 months, year, overall
For each album in the list we obtained: # user plays (in each period), # of global listeners and # of global plays
Walk the Talk - 2
• Requirements: 18 y.o., scrobblings > 5000
Quantization of Data for Sampling
• What items should they rate? Item (album) sampling:
  • Implicit Feedback (IF): playcount for a user on a given album. Changed to scale [1-3], 3 means being more listened to.
  • Global Popularity (GP): global playcount for all users on a given album. Changed to scale [1-3], 3 means being more listened to.
  • Recentness (R): time elapsed since user played a given album. Changed to scale [1-3], 3 means being listened to more recently.
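The mapping to a [1-3] scale can be sketched as follows. The slides do not say how the cut points were chosen, so this example assumes tercile cut points; the `to_three_levels` helper and the sample playcounts are hypothetical.

```python
import numpy as np


def to_three_levels(values):
    """Map raw values to levels 1-3 using tercile cut points (an assumption)."""
    values = np.asarray(values, dtype=float)
    low, high = np.quantile(values, [1 / 3, 2 / 3])
    # Level 1: value <= low; level 2: low < value <= high; level 3: value > high.
    return 1 + (values > low).astype(int) + (values > high).astype(int)


# Hypothetical playcounts of one user over nine albums.
playcounts = [1, 2, 3, 10, 20, 30, 100, 200, 300]
levels = to_three_levels(playcounts)
```

For recentness, where 3 should mean "listened to more recently", the same helper could be applied to the negated elapsed time, for example `to_three_levels(-np.asarray(elapsed_days))`.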
Regression Analysis
• Including Recentness increases R² by more than 10% [1 → 2]
• Including GP increases R², but not much compared to RE + IF [1 → 3]
• Not including GP, but including the interaction between IF and RE, increases the variance of the dependent variable (DV) explained by the regression model [2 → 4]
M1: implicit feedback
M2: implicit feedback & recentness
M3: implicit feedback, recentness, global popularity
M4: interaction of implicit feedback & recentness
Regression Analysis
• We tested the conclusions of the regression analysis by predicting the score, checking RMSE in 10-fold cross-validation.
• Results of the regression analysis are supported.

Model                                                      RMSE1    RMSE2
User average                                               1.5308   1.1051
M1: Implicit feedback                                      1.4206   1.0402
M2: Implicit feedback + recentness                         1.4136   1.034
M3: Implicit feedback + recentness + global popularity     1.4130   1.0338
M4: Interaction of implicit feedback * recentness          1.4127   1.0332
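The evaluation procedure (linear regression scored by RMSE under 10-fold cross-validation) can be sketched with NumPy alone. The data below are synthetic and only illustrate the mechanics; they do not reproduce the study's numbers. The helpers `design` and `cv_rmse` are hypothetical names.

```python
import numpy as np


def design(if_, re, interaction=False):
    """Design matrix with intercept, IF, RE and, optionally, the IF*RE term."""
    cols = [np.ones_like(if_), if_, re]
    if interaction:
        cols.append(if_ * re)
    return np.column_stack(cols)


def cv_rmse(X, y, k=10, seed=0):
    """RMSE of ordinary least squares, pooled over k cross-validation folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    squared_errors = []
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        squared_errors.append((X[test_idx] @ beta - y[test_idx]) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(squared_errors))))


# Synthetic data for illustration: IF and RE on the [1-3] scale,
# and a rating generated with a true IF*RE interaction plus noise.
rng = np.random.default_rng(42)
n = 500
if_ = rng.integers(1, 4, n).astype(float)
re = rng.integers(1, 4, n).astype(float)
rating = 1.0 + 0.5 * if_ * re + rng.normal(0, 0.5, n)

rmse_m2 = cv_rmse(design(if_, re), rating)                    # IF + RE
rmse_m4 = cv_rmse(design(if_, re, interaction=True), rating)  # IF * RE
```

On this synthetic data the model with the interaction term reaches a lower RMSE because the generating process contains that interaction; real data need not behave the same way.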
Part II: Extension of Walk the Talk
• Implicit Feedback Recommendation via Implicit-to-Explicit OLR Mapping (RecSys 2011, CARS Workshop)
  • Consider ratings as ordinal variables
  • Use mixed models to account for non-independence of observations
  • Compare with state-of-the-art implicit feedback algorithm
Recalling the 1st study (5/5)
• Prediction of rating by multiple Linear Regression, evaluated with RMSE.
• Results showed that implicit feedback (play count of the album by a specific user) and recentness (how recently an album was listened to) were important factors; global popularity had a weaker effect.
• Results also showed that listening style (whether the user preferred to listen to single tracks, CDs, or either) was an important factor, while the other factors considered were not.
...but
• Linear Regression didn't account for the nested nature of ratings
• And ratings were treated as continuous, when they are actually ordinal.
[Figure: ratings nested within users, from User 1 to User n.]
So, Ordinal Logistic Regression!
• Actually Mixed-Effects Ordinal Multinomial Logistic Regression
• Mixed-effects: nested nature of ratings
• We obtain a distribution over ratings (ordinal multinomial) for each (USER, ITEM) pair, and we predict the rating using its expected value.
• … And we can compare the inferred ratings with a method that directly uses implicit information (play counts) to recommend (Hu, Koren et al., 2008)
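Predicting a rating from a distribution over ordinal levels can be sketched with a standard cumulative-logit (proportional-odds) form. This is an assumption for illustration and may differ from the exact model on the slides: the cut points are hypothetical, a five-level scale is assumed, and in a mixed-effects setting the linear predictor `eta` would also include a per-user random effect.

```python
import numpy as np


def rating_distribution(eta, thresholds):
    """P(rating = k), k = 1..K, under a cumulative-logit model.

    eta: linear predictor for one (user, item) pair
    thresholds: increasing cut points theta_1 < ... < theta_{K-1}
    """
    # P(rating <= k) = sigmoid(theta_k - eta)
    cdf = 1.0 / (1.0 + np.exp(-(np.asarray(thresholds, dtype=float) - eta)))
    cdf = np.concatenate(([0.0], cdf, [1.0]))
    return np.diff(cdf)


def expected_rating(eta, thresholds):
    """Expected value of the rating distribution, used as the prediction."""
    probs = rating_distribution(eta, thresholds)
    levels = np.arange(1, len(probs) + 1)
    return float(np.dot(levels, probs))


thresholds = [-2.0, -0.5, 0.5, 2.0]  # hypothetical cut points, 5 levels
neutral = expected_rating(0.0, thresholds)
liked = expected_rating(1.5, thresholds)
```

With cut points symmetric around zero, a linear predictor of 0 yields the middle of the scale, and larger values of `eta` shift probability mass toward higher ratings.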
Ordinal Regression for Mapping
• Model
• Predicted value
[Equations on slide.]
Datasets
• D1: users, albums, if, re, gp, ratings, demographics/consumption
• D2: users, albums, if, re, gp, NO RATINGS.
Results
Conclusions (after 5 years)
• Fusion of implicit feedback (scrobbles) and recency can help to make more precise recommendations
• Models like the one by Gurbanov and Ricci presented this year at UMAP offer a more compact way to work with these data:
  “Modeling and Predicting User Actions in Recommender Systems” by Tural Gurbanov, Francesco Ricci, Meinhard Ploner
• Evaluation is still a challenge!
Part 3: Data Fusion for Virtual Worlds
References
Lacic, E., Kowald, D., Eberhard, L., Trattner, C., Parra, D., & Marinho, L. B. (2015). Utilizing online social network and location-based data to recommend products and categories in online marketplaces. In Mining, Modeling, and Recommending 'Things' in Social Media (pp. 96-115). Springer International Publishing.
Trattner, C., Parra, D., Eberhard, L., & Wen, X. (2014, April). Who will trade with whom?: Predicting buyer-seller interactions in online trading platforms through social networks. In Proceedings of the 23rd International Conference on World Wide Web (pp. 387-388). ACM.
Second Life
Dataset (Task: Item recommendation)
[Figure: data sources: Social Network, Marketplace, Virtual World.]
Christoph Trattner, Know-Center Graz, Austria
Recommendation Approaches
• User-based Collaborative Filtering, where [equation on slide]
• Hybrid approaches (combine features)
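User-based collaborative filtering over purchase sets can be sketched as below. This is a generic illustration under stated assumptions, not the formula from the slide: Jaccard similarity over purchased items stands in for whichever similarity feature is used, and the users, items and the `recommend` helper are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, purchases, k=2, top_n=3):
    """Score unseen items by the summed similarity of the k nearest users."""
    owned = purchases[target]
    sims = {
        user: jaccard(owned, items)
        for user, items in purchases.items()
        if user != target
    }
    # k most similar users, ties broken by name for determinism.
    neighbors = sorted(sims, key=lambda u: (-sims[u], u))[:k]
    scores = {}
    for user in neighbors:
        for item in purchases[user] - owned:
            scores[item] = scores.get(item, 0.0) + sims[user]
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


# Hypothetical marketplace purchases.
purchases = {
    "alice": {"hat", "boots"},
    "bob": {"hat", "boots", "cape"},
    "carol": {"hat", "sword"},
    "dave": {"lamp"},
}
recs = recommend("alice", purchases)
```

Replacing `jaccard` with a similarity computed from another data source (for example, social network or location features) changes which neighbors are selected while the rest of the procedure stays the same.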
Similarity Features I
Similarity Features II
Hybrids
Different Task: Predict Buyer-Seller
Predict Buyer-Sellers: AUC Results
Summary
• These studies show that social network data is very important for certain types of recommendations.
• Due to the lack of available cross-service data in the real world, data from Second Life has potential as a proxy to build models for the real world.
Part 4: Fusion of Time into CF
References
Larrain, S., Trattner, C., Parra, D., Graells-Garrido, E., & Nørvåg, K. (2015). Good Times Bad Times: A Study on Recency Effects in Collaborative Filtering for Social Tagging. In Proceedings of the 9th ACM Conference on Recommender Systems (pp. 269-272). ACM.
Time-Aware Collaborative Filtering
• Collaborative Filtering (user- and item-based) considers all transactions equally important
• But transactions which happened too long ago might be less important in shaping the user model…
Two Concepts for Time-Aware CF
[Figure: ratings of the active user, User_1 and User_2 on Item 1 and Item 2; one item was consumed 2 years ago, the other 1 month ago.]
• Items consumed recently might be more important than items consumed a long time ago.
• When and how to incorporate time in user- and item-based collaborative filtering?
When and How in UB-CF
[Figure: user × item rating matrix, User 1 … User n by Item 1 … Item m.]
Step 1: Find similar users. Weight transactions based on recency difference.
When and How in UB-CF
[Figure: user × item rating matrix, User 1 … User n by Item 1 … Item m.]
Step 2: Similar users found. Recommend items with high ratings and consumed recently.
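Step 2 can be sketched as a post-filtering score in which each neighbor's rating is weighted by similarity and by a recency decay. This is an illustrative sketch only: the exponential decay, the 30-day half-life, the data and the `score_candidates` helper are assumptions, not the configuration evaluated in the talk.

```python
import math


def score_candidates(neighbors, now, half_life_days=30.0):
    """Rank items by similarity * rating * recency decay (illustrative).

    neighbors: list of (similarity, [(item, rating, timestamp_in_days), ...])
    now: current time, in the same day units as the timestamps
    """
    rate = math.log(2) / half_life_days
    scores = {}
    for sim, transactions in neighbors:
        for item, rating, t in transactions:
            decay = math.exp(-rate * (now - t))  # 1.0 for "just consumed"
            scores[item] = scores.get(item, 0.0) + sim * rating * decay
    return sorted(scores, key=lambda i: (-scores[i], i))


# Hypothetical neighbors: item_1 is rated highly but was consumed about
# two years ago; item_2 has a lower rating but was consumed a month ago.
neighbors = [
    (0.9, [("item_1", 5, 270.0), ("item_2", 3, 970.0)]),
    (0.6, [("item_1", 4, 300.0)]),
]
ranking = score_candidates(neighbors, now=1000.0)
```

Without the decay term, item_1 would rank first (6.9 versus 2.7); with it, the recently consumed item_2 moves to the top.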
When and How in IB-CF
[Figure: user × item rating matrix, User 1 … User n by Item 1 … Item m, with items annotated "Consumed 1 week ago" and "Consumed 1 year ago".]
Step 1: Find similar items sim(items(user i)). Weight items based on recency.
When and How in IB-CF
[Figure: user × item rating matrix, User 1 … User n by Item 1 … Item m.]
Step 2: Find items similar to Item 1. Weight items based on recency difference.
Decay functions
• Exponential
• Power
• Linear
• Logistic
• BLL
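The families of decay functions listed above can be sketched with common textbook parameterizations. The exact forms and parameter values used in the study are not given on the slide, so everything below is an assumption, including the default parameters and the reading of BLL as the base-level learning equation applied to the ages of past interactions.

```python
import math


def exponential_decay(t, lam=0.01):
    """exp(-lam * t), with t the age of the transaction in days."""
    return math.exp(-lam * t)


def power_decay(t, alpha=0.5):
    """(1 + t)^(-alpha); the +1 keeps the weight finite at t = 0."""
    return (1.0 + t) ** (-alpha)


def linear_decay(t, horizon=365.0):
    """Falls linearly from 1 to 0 at the horizon, then stays at 0."""
    return max(0.0, 1.0 - t / horizon)


def logistic_decay(t, midpoint=50.0, steepness=0.1):
    """Smooth step from ~1 to ~0 around the midpoint."""
    return 1.0 / (1.0 + math.exp(steepness * (t - midpoint)))


def bll(ages, d=0.5):
    """Base-level learning activation: ln(sum_j age_j^(-d)), ages > 0."""
    return math.log(sum(age ** (-d) for age in ages))


ages = [1.0, 30.0, 365.0]
weights = {
    f.__name__: [f(t) for t in ages]
    for f in (exponential_decay, power_decay, linear_decay, logistic_decay)
}
```

The first four functions weight a single transaction by its age, whereas `bll` aggregates all past interactions with an item, so both frequency and recency raise its value.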
Parameters and fitting
[Figure: distribution of days from bookmark; median = 50 days.]
Evaluation: Datasets
Evaluation: Results I
Evaluation: Results II
Summary
• Best results: post-filtering combined with power decay gives the best results.
• Pre- and post-filtering produce a strong effect, but UB-CF is more susceptible than IB-CF to the effect of filtering, especially pre-filtering.
• The hybridization of UB and IB makes the recommendation more robust.
• Future work: fit parameters on a per-user basis rather than a per-dataset basis.
Wrapping up
• Visual approaches for user-controllable data fusion can work, but there's room to find effective visual-interactive combinations.
• In the music domain and other domains, time and recency can work very well for recommendation.
• … but using time requires an adequate modeling of the decay functions.
• Information from virtual worlds could perhaps be used as a proxy to build models and use them for transfer learning.
Promising works at UMAP 2016
• Using Semantic Information: Extend the work of Musto et al. (UMAP 2016) to support better and more explainable models.
• Combine taxonomies with implicit/explicit feedback using compact graphical models (co-authored by G. Guo)
• Extend models with time and other sources of feedback (Gurbanov et al.)
Ideas for Data Fusion
• Combining multimodal information within the same embedding using deep learning has given great results in visual processing + NLP fields:
  • Visual Q&A
  • Automatic Captioning of Pictures
Antol, S., Agrawal, A., Lu, J., Mitchell, M., Batra, D., Lawrence Zitnick, C., & Parikh, D. (2015). Vqa: Visual question answering. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2425-2433).