Comparing Recommendation Algorithms for Social Bookmarking
Toine Bogers
Royal School of Library and Information Science
Copenhagen, Denmark

toinebogers.com/content/slides/201001-comparing-recsys-for-bookm…


About me

•  Ph.D. from Tilburg University
  - "Recommender Systems for Social Bookmarking"
  - Promotor: Prof. dr. Antal van den Bosch
•  Currently @ RSLIS (Copenhagen, DK)
  - Research assistant on retrieval fusion project
•  Research interests
  - Recommender systems
  - Social bookmarking
  - Expert search
  - Information retrieval

Outline

1.  Introduction
2.  Collaborative filtering
3.  Content-based filtering
4.  Recommender systems fusion
5.  Conclusions

Social bookmarking

•  Way of storing, organizing, and managing bookmarks of Web pages, scientific articles, books, etc.
  - All done online
  - Can be made public or kept private
  - Allow users to tag (= label) their items
  - Many different websites available

Social bookmarking

•  Different domains
  - Web pages
  - Scientific articles
  - Books
•  Strong growth in popularity
  - Millions of users, items, and tags
  - For example: Delicious
    - 140,000+ posts/day on average in 2008 (Keller, 2009)
    - 7,000,000+ posts/month in 2008 (Wetzker et al., 2009)

Content overload

•  Problems with this growth
  - Content overload
  - Increasing ambiguity
•  How can we deal with this?
  - Browsing
  - Search
  - (Can become less effective as content increases!)
•  A possible solution
  - Take a more active role: recommendation

Recommendation tasks

Rows give the starting point (user, item, or tag); columns give what is recommended.

          USER              ITEM                   TAG
USER      People like me    Item recommendation    People profiling
ITEM      Item experts      More like this         Tag suggestion
TAG       Domain experts    Personalized search    Depth browsing

Item recommendation

•  Our focus: item recommendation
  - Identify sets of items that are likely to be of interest to a certain user
    - Return a ranked list of items
    - 'Find Good Items' task (Herlocker et al., 2004)
  - Based on different information sources
    - Transaction patterns (usage data, purchase information)
      – Explicit ratings
      – Implicit feedback
    - Metadata
    - Tags

Related work

•  Work on social bookmarking mostly focused on
  - Improving browsing experience
    - Clustering, dealing with ambiguity
  - Incorporating tags in search algorithms
  - Tag recommendation
•  Problems with work on item recommendation
  - Different data sets
  - Different evaluation metrics
  - No comparison of algorithms under controlled conditions
  - Hardly ever publicly available data sets
  - No user-based evaluation

Collecting data

•  Four data sets from two different domains
  - Web bookmarks
    - Delicious
    - BibSonomy
  - Scientific articles
    - CiteULike
    - BibSonomy
•  ~78% of users posted only one type of content (bookmarks or scientific articles)

What did we collect?

•  Usage data
  - User-item-tag triples with timestamps
•  Metadata
  - Varies with the domain
  - Scientific articles
    - Item-intrinsic: TITLE, DESCRIPTION, JOURNAL, AUTHOR, TAGS, URL, etc.
    - Item-extrinsic: CHAPTER, DAY, EDITION, YEAR, INSTITUTION, etc.
  - Web bookmarks
    - TITLE, DESCRIPTION, TAGS, URL

Filtering

•  Why?
  - To reduce noise in our data sets
  - Common procedure in recommender systems research
•  How?
  - ≥ 20 items per user
  - ≥ 2 users per item (no hapax legomena items)
  - No untagged posts
•  Compared to related work
  - Stricter filtering
  - More realistic
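The filtering thresholds (at least 20 items per user, at least 2 users per item, no untagged posts) can be sketched in Python. This is a minimal illustration, not the author's implementation: the `(user, item, tags)` tuple layout, the name `filter_posts`, and the choice to repeat the filter until nothing more is removed are all assumptions. Repetition is used here because dropping sparse users can push an item below the two-user threshold, and vice versa.

```python
from collections import Counter


def filter_posts(posts, min_items_per_user=20, min_users_per_item=2):
    """Filter (user, item, tags) posts; hypothetical helper for illustration.

    Assumes each (user, item) pair occurs at most once in `posts`.
    """
    # Drop untagged posts first.
    kept = [(u, i, t) for u, i, t in posts if t]
    while True:
        items_per_user = Counter(u for u, _, _ in kept)
        users_per_item = Counter(i for _, i, _ in kept)
        remaining = [
            (u, i, t)
            for u, i, t in kept
            if items_per_user[u] >= min_items_per_user
            and users_per_item[i] >= min_users_per_item
        ]
        # Stop once a pass removes nothing (assumed fixed-point behaviour).
        if len(remaining) == len(kept):
            return remaining
        kept = remaining
```

Lower thresholds make the behaviour easy to check on toy data, for example `filter_posts(posts, min_items_per_user=1, min_users_per_item=2)`.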

Data sets

          Bookmarks               Scientific articles
          Delicious   BibSonomy   CiteULike   BibSonomy
#users    1,243       192         1,322       167
#items    152,698     11,165      38,419      12,982
#tags     42,820      13,233      28,312      5,165
#posts    238,070     29,096      84,637      29,720

Experimental setup

•  Backtesting
  - Withhold randomly selected items from test users
  - Use remaining material for training recommender system
  - Success is predicting the user's interest in his/her withheld items
•  Details
  - Overall 90%-10% split on users
  - Withhold 10 randomly selected items of each test user
  - Parameter optimization
    - Used 10-fold cross-validation
    - 90-10 splits
    - 10 withheld items
  - Macro-averaging of evaluation scores
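The backtesting split can be sketched as follows. This is a hedged illustration only: the dictionary layout `user -> list of items`, the function name `backtest_split`, and the fixed random seed are assumptions made for the example; the slides state only the 90%-10% user split and the 10 withheld items per test user.

```python
import random


def backtest_split(user_items, test_fraction=0.1, n_withheld=10, seed=42):
    """Return (training profiles, withheld items per test user)."""
    rng = random.Random(seed)
    users = sorted(user_items)
    rng.shuffle(users)
    n_test = max(1, round(len(users) * test_fraction))
    test_users = set(users[:n_test])

    train, withheld = {}, {}
    for user, items in user_items.items():
        items = sorted(items)
        if user in test_users:
            # Hide a random subset of this test user's items.
            hidden = set(rng.sample(items, min(n_withheld, len(items))))
            withheld[user] = hidden
            train[user] = [i for i in items if i not in hidden]
        else:
            train[user] = items
    return train, withheld
```

The recommender is then trained on `train` and judged on how well it ranks the items in `withheld`.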

Evaluation

•  'Find Good Items' task returns a ranked list
  - Need a metric that takes the ranking of items into account
•  Precision-oriented metric
  - Mean Average Precision (MAP)
    - Average Precision (AP) is the average of precision values at each relevant, retrieved item
    - MAP is AP averaged over all users
    - "single figure measure of quality across recall levels" (Manning, 2009)
•  Tested different metrics
  - All precision-oriented metrics showed the same picture
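AP and MAP as described above can be sketched in a few lines of Python. The function names and the data layout are assumptions for illustration. One further assumption: the sum of precision values is divided by the total number of relevant (withheld) items, so relevant items that are never retrieved contribute zero; the slides do not spell out the normalization.

```python
def average_precision(ranked, relevant):
    """AP of one ranked list against a set of relevant items."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant item
    return total / len(relevant)


def mean_average_precision(rankings, withheld):
    """Macro-average of AP over all test users in `withheld`."""
    users = list(withheld)
    return sum(
        average_precision(rankings.get(u, []), withheld[u]) for u in users
    ) / len(users)
```

For a ranking `["a", "b", "c", "d"]` with relevant items `{"a", "c"}`, the precision values are 1/1 and 2/3, so AP is their mean, 5/6.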

Collaborative filtering

•  Question
  - How can we use the information in the folksonomy to generate better recommendations?
    - Users
    - Items
    - Tags
•  Collaborative filtering (CF)
  - Attempts to automate "word-of-mouth" recommendations
  - Recommend items based on how like-minded users rated those items
  - Similarity (between usage patterns) based on
    - Usage data
    - Tagging data

Collaborative filtering

•  Model-based CF
  - 'Eager' recommendation algorithms
  - Train a predictive model of the recommendation task
  - Quick to apply to generate recommendations
•  Memory-based CF
  - 'Lazy' recommendation algorithms
  - Simply store all patterns in memory
  - Defer prediction effort to when user requests recommendations

Related work

•  Model-based
  - Hybrid PLSA-based approach (Wetzker et al., 2009)
  - Tensor decomposition (Symeonidis et al., 2008)
•  Memory-based
  - Tag-aware fusion (Tso-Sutter et al., 2008)
•  Graph-based
  - FolkRank (Hotho et al., 2006)
  - Random walk (Clements et al., 2008)

Algorithms

•  User-based k-NN algorithm
  - Calculate similarity between the active user and all other users
  - Determine the top k nearest neighbors
    - I.e., the most similar users
  - Unseen items from nearest neighbors are scored by the similarity between the neighbor and the active user
•  Item-based k-NN algorithm
  - Calculate similarity between the active user's items and all other items
  - Determine the top k nearest neighbors
    - I.e., the most similar items for each of the active user's items
  - Unseen neighboring items are scored by the similarity between the neighbor and the active user's item
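Both k-NN variants can be sketched compactly in Python. This is an illustrative sketch under stated assumptions, not the implementation behind the reported results: profiles are stored as sets (`user -> set of items`), similarity is the cosine over unary profiles, and an item's final score is the sum of the similarities of the neighbors that contribute it. The summation rule and all function names are assumptions.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two unary profiles stored as sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def user_based_knn(active, user_items, k=2):
    """Score items unseen by `active` via its k most similar users."""
    sims = {u: cosine(user_items[active], items)
            for u, items in user_items.items() if u != active}
    neighbors = sorted(sims, key=sims.get, reverse=True)[:k]
    scores = defaultdict(float)
    for n in neighbors:
        for item in user_items[n] - user_items[active]:
            scores[item] += sims[n]
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def item_based_knn(active, user_items, k=2):
    """Score unseen items via the k nearest items of each owned item."""
    item_users = defaultdict(set)  # invert profiles: item -> users
    for user, items in user_items.items():
        for item in items:
            item_users[item].add(user)
    seen = user_items[active]
    scores = defaultdict(float)
    for own in seen:
        sims = {i: cosine(item_users[own], users)
                for i, users in item_users.items() if i != own}
        for n in sorted(sims, key=sims.get, reverse=True)[:k]:
            if n not in seen:
                scores[n] += sims[n]
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Swapping `cosine` for a tag-based or metadata-based similarity leaves the rest of either function unchanged.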

Usage data

•  Baseline: CF using usage data
•  Profile vectors
  - User profiles
  - Item profiles
•  No explicit ratings available
  - Only binary information (1 or 0)
  - Or rather: unary!
•  Similarity metric
  - Cosine similarity
•  10-fold cross-validation to optimize k

[Figure: UI matrix of users by items]
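With unary profiles, cosine similarity reduces to a count of shared items. The symbols below are introduced only for this illustration: $\mathbf{u}$ and $\mathbf{v}$ are the 0/1 profile vectors of two users, and $I_u$, $I_v$ are the sets of items they posted.

```latex
\[
\cos(\mathbf{u}, \mathbf{v})
  = \frac{\mathbf{u} \cdot \mathbf{v}}{\lVert \mathbf{u} \rVert \, \lVert \mathbf{v} \rVert}
  = \frac{\lvert I_u \cap I_v \rvert}{\sqrt{\lvert I_u \rvert \, \lvert I_v \rvert}}
\]
```

The same expression applies to item profiles, with the sets of users who posted each item in place of $I_u$ and $I_v$.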

Results (usage data)

                           Bookmarks               Scientific articles
                           BibSonomy   Delicious   BibSonomy   CiteULike
UBCF + usage data          0.0277      0.0046      0.0865      0.0746
IBCF + usage data          0.0244      0.0027      0.0737      0.0887

Tagging data

•  Tags are short topical descriptions of an item (or user)
•  Profile vectors
  - User tag profiles
  - Item tag profiles
•  Similarity metrics
  - Cosine similarity
  - Jaccard overlap
  - Dice's coefficient

[Figure: UT matrix of users by tags and IT matrix of items by tags]
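The three similarity metrics can be sketched over tag profiles as follows. The representation is an assumption made for the example: a tag profile is a `collections.Counter` mapping each tag to how often it was assigned. Jaccard and Dice use only the set of distinct tags, while this cosine variant uses the counts; whether counts or binary tag vectors were used in the experiments is not stated on the slides.

```python
import math
from collections import Counter


def jaccard(a, b):
    """Jaccard overlap between the distinct tags of two profiles."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a, b):
    """Dice's coefficient between the distinct tags of two profiles."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def cosine(a, b):
    """Cosine similarity over tag-count vectors held in Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0
```

For example, the profiles `Counter({"python": 2, "ir": 1})` and `Counter({"python": 1, "web": 1})` share one of three distinct tags, giving a Jaccard overlap of 1/3 and a Dice coefficient of 1/2.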

Results (tagging data)

                           Bookmarks               Scientific articles
                           BibSonomy   Delicious   BibSonomy   CiteULike
UBCF + usage data          0.0277      0.0046      0.0865      0.0746
IBCF + usage data          0.0244      0.0027      0.0737      0.0887
UBCF + tagging data        0.0102      0.0017      0.0459      0.0449
IBCF + tagging data        0.0370      0.0101      0.1100      0.0814

Findings (tagging data)

•  CF with tag overlap
  - User-based CF performs significantly worse
  - Item-based CF performs much better
    - Often statistically significant improvements
  - Except on CiteULike: CF without tags better
•  Similarity metric relatively unimportant
  - Cosine similarity slightly better

Comparison to related work

•  Random walk model (Clements et al., 2008)
  - Create transition matrix based on tripartite folksonomy graph
  - Similar to FolkRank, but no walks of infinite length
  - Walk length n is a parameter
•  Tag-aware fusion (Tso-Sutter et al., 2008)
  - Fusion of algorithms and data representations
  - Usage data and tagging data
    - User-based CF: extend UI matrix with tags as extra items
    - Item-based CF: extend UI matrix with tags as extra users
  - User-based CF and item-based CF
    - Fuse together predictions
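The matrix extension used in tag-aware fusion can be illustrated with set-valued profiles. This sketch covers only the data representation step described on the slide; it is not the Tso-Sutter et al. implementation, and the `(user, item, tags)` post layout, the `("item", …)` / `("tag", …)` / `("user", …)` labels, and the function name are assumptions.

```python
from collections import defaultdict


def extended_profiles(posts):
    """Build tag-extended user and item profiles from (user, item, tags) posts."""
    user_profiles = defaultdict(set)
    item_profiles = defaultdict(set)
    for user, item, tags in posts:
        user_profiles[user].add(("item", item))
        item_profiles[item].add(("user", user))
        for tag in tags:
            # User-based CF: each tag acts as an extra (pseudo-)item.
            user_profiles[user].add(("tag", tag))
            # Item-based CF: each tag acts as an extra (pseudo-)user.
            item_profiles[item].add(("tag", tag))
    return dict(user_profiles), dict(item_profiles)
```

User-based and item-based CF would then be run on these extended profiles and their predictions fused.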

Comparison to related work

[Figure: tag-aware fusion. User-based filtering and item-based filtering, each applied to a matrix of users, items, and tags]

Results

                           Bookmarks               Scientific articles
                           BibSonomy   Delicious   BibSonomy   CiteULike
UBCF + usage data          0.0277      0.0046      0.0865      0.0746
IBCF + usage data          0.0244      0.0027      0.0737      0.0887
UBCF + tagging data        0.0102      0.0017      0.0459      0.0449
IBCF + tagging data        0.0370      0.0101      0.1100      0.0814
UBCF + fused data          0.0303      0.0057      0.0829      0.0739
IBCF + fused data          0.0468      0.0125      0.1280      0.1212
Tag-aware fusion           0.0474      0.0166      0.1297      0.1268
Random walk model          0.0182      0.0003      0.0608      0.0536

Metadata-based recommendation

•  Question
  - How can we use the metadata to generate (better) item recommendations?
•  Content-based filtering
  - Build representations of the content in a system
  - Learn a profile of the user's interests
  - Match content representations against the user's profile

Reminder: what did we collect?

•  Two types of metadata
  - Intrinsic metadata, i.e., directly relating to the content
    - E.g., <TITLE>, <DESCRIPTION>, <JOURNAL>, <AUTHOR>, ...
  - Extrinsic metadata, i.e., administrative information
    - E.g., <PAGES>, <MONTH>, <EDITION>, ...

Related work

•  Common approaches
  - Information retrieval
  - Machine learning
•  Examples
  - TF·IDF weighting (Lang, 1995; Whitman & Lawrence, 2002)
  - Personal information agents (Balabanovic, 1998; Joachims et al., 1997; Chirita et al., 2006)
  - Naive Bayes (Mooney et al., 2000; De Gemmis et al., 2008)
  - Linear regression (Alspector et al., 1997)
•  Nothing applied to social bookmarking so far!

Profile-centric matching

•  Take an IR approach: profile-centric matching
  - Build representations of the content in a system
    - All metadata assigned to an item → item profile
  - Learn a profile of the user's interests
    - Collate all of user's metadata into a user profile
  - Match and rank item profiles to user profiles
    - Language modeling with Jelinek-Mercer smoothing
    - Stop word filtering, no stemming
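Matching a user profile against item profiles with a Jelinek-Mercer smoothed language model can be sketched as below. Several choices here are assumptions for illustration, not details from the slides: the user profile plays the role of the query and each item profile the role of a document, profiles are plain token lists (stop words already removed), `lam` is the weight on the document model, and its default of 0.9 is arbitrary.

```python
import math
from collections import Counter


def jm_log_likelihood(query_terms, doc_terms, coll_counts, coll_len, lam=0.9):
    """Log-probability of the query under a JM-smoothed document model."""
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_coll = coll_counts.get(term, 0) / coll_len
        p = lam * p_doc + (1 - lam) * p_coll
        if p > 0.0:  # skip terms that never occur in the collection
            score += math.log(p)
    return score


def rank_items(user_profile, item_profiles, lam=0.9):
    """Rank item ids by how well their profiles match the user profile."""
    collection = Counter()
    for terms in item_profiles.values():
        collection.update(terms)
    total = sum(collection.values())
    scores = {
        item: jm_log_likelihood(user_profile, terms, collection, total, lam)
        for item, terms in item_profiles.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

Smoothing with the collection model keeps an item from scoring zero just because one profile term is missing from it.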

Profile-centric matching

[Figure: profile-centric matching. Active user profiles are compared with training item profiles through similarity matching; user-item pairs are divided into training pairs and test pairs]

Post-centric matching

•  Problem
  - Big user profile will match nearly anything
  - Sacrificing precision for recall
•  Different level of granularity: post-centric matching
  - Construct metadata representations of each post
  - Match each of the user's posts against all other posts
  - Match, rank, and aggregate all retrieved posts

Post-centric matching

[Figure: post-centric matching. The active user's posts are compared with training posts through similarity matching; user-item pairs are divided into training pairs and test pairs]

Results

                           Bookmarks               Scientific articles
                           BibSonomy   Delicious   BibSonomy   CiteULike
Profile-centric matching   0.0402      0.0014      0.1279      0.0987
Post-centric matching      0.0259      0.0036      0.1190      0.0455

•  Problem with post-centric matching: data sparseness

Hybrid filtering

•  Similarity between users and items based on metadata
  - Plug these similarities into standard k-NN CF approach!
  - User-based CF with metadata-based similarities
    - Textual similarity between user profiles
  - Item-based CF with metadata-based similarities
    - Textual similarity between item profiles

Results

                           Bookmarks               Scientific articles
                           BibSonomy   Delicious   BibSonomy   CiteULike
Profile-centric matching   0.0402      0.0014      0.1279      0.0987
Post-centric matching      0.0259      0.0036      0.1190      0.0455
Hybrid (UBCF + metadata)   0.0218      0.0039      0.0410      0.0608
Hybrid (IBCF + metadata)   0.0399      0.0017      0.1510      0.0746

Results (comparison)

                           Bookmarks               Scientific articles
                           BibSonomy   Delicious   BibSonomy   CiteULike
Profile-centric matching   0.0402      0.0014      0.1279      0.0987
Post-centric matching      0.0259      0.0036      0.1190      0.0455
Hybrid (UBCF + metadata)   0.0218      0.0039      0.0410      0.0608
Hybrid (IBCF + metadata)   0.0399      0.0017      0.1510      0.0746
Best CF run                0.0370      0.0101      0.1100      0.0887

Results (comparison)

                           Bookmarks               Scientific articles
                           BibSonomy   Delicious   BibSonomy   CiteULike
Profile-centric matching   0.0402      0.0014      0.1279      0.0987
Post-centric matching      0.0259      0.0036      0.1190      0.0455
Hybrid (UBCF + metadata)   0.0218      0.0039      0.0410      0.0608
Hybrid (IBCF + metadata)   0.0399      0.0017      0.1510      0.0746
Best CF run                0.0370      0.0101      0.1100      0.0887
Tag-aware fusion           0.0474      0.0166      0.1297      0.1268

Findings

•  Content-based filtering
  - Profile-level matching better than post-level
•  Hybrid filtering
  - Item-based CF with metadata similarities works best
•  No clear winner over all data sets

Data fusion

•  Question
  - Can we improve performance by combining different recommendation algorithms?
  - Tentative answer: yes!
•  Data fusion used in different fields
  - Machine learning
  - Information retrieval
    - Collection fusion
    - Results fusion

Combination taxonomy

•  Burke (2002) defines seven different techniques
  1.  Mixed (all shown together, interleaved)
  2.  Switching (pick one, depending on the situation)
  3.  Feature combination (combine sources for a single algorithm)
  4.  Cascade (output of algorithm 1 is input of algorithm 2)
  5.  Feature augmentation (output alg. 1 is input feature alg. 2)
  6.  Meta-level (model alg. 1 is input for alg. 2)
  7.  Weighted combination (output combination of ≥ 2 alg.)
      - Same as results fusion in IR

Why does data fusion work?

•  Problem
  - Recommendation is too complex
  - Individual solution can never capture this completely
•  Solution
  - Combine different algorithms and data representations
  - Each highlights a different aspect of the task
  - Overlap between the individual runs is evidence of relevance

How do we combine?

•  Score-based fusion
  - Different algorithms have different score distributions
  - Score normalization into [0, 1] range
•  Six standard combination techniques from IR
  - CombMAX (max score per item)
  - CombMIN (min score per item)
  - CombMED (median score per item)
  - CombSUM (sum of scores per item)
  - CombMNZ (sum of scores per item × no. of retrieving runs)
  - CombANZ (sum of scores per item ÷ no. of retrieving runs)
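The six combination techniques can be sketched in Python. Two points are assumptions rather than details from the slides: scores are normalized with min-max scaling (the slides say only that scores are mapped into [0, 1]), and an item's max, min, and median are taken over the runs that actually retrieved it. Each run is a dictionary `item -> score`, and the function names are hypothetical.

```python
from statistics import median


def min_max_normalize(run):
    """Rescale one run's scores into [0, 1] (assumed normalization)."""
    if not run:
        return {}
    lo, hi = min(run.values()), max(run.values())
    if hi == lo:
        return {item: 1.0 for item in run}
    return {item: (score - lo) / (hi - lo) for item, score in run.items()}


def fuse(runs, method="CombSUM"):
    """Fuse several runs into a single dictionary of item scores."""
    normalized = [min_max_normalize(run) for run in runs]
    fused = {}
    for item in set().union(*normalized):
        # Scores from the runs that retrieved this item.
        scores = [run[item] for run in normalized if item in run]
        if method == "CombMAX":
            fused[item] = max(scores)
        elif method == "CombMIN":
            fused[item] = min(scores)
        elif method == "CombMED":
            fused[item] = median(scores)
        elif method == "CombSUM":
            fused[item] = sum(scores)
        elif method == "CombMNZ":
            fused[item] = sum(scores) * len(scores)
        elif method == "CombANZ":
            fused[item] = sum(scores) / len(scores)
        else:
            raise ValueError(f"unknown fusion method: {method}")
    return fused
```

CombMNZ rewards items retrieved by many runs, which matches the idea that overlap between runs is evidence of relevance.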

How do we combine?

•  Unweighted vs. weighted combination
  - "Not all recommendation algorithms are created equal!"
  - Linear weighting of individual runs
  - Weight optimization using random-restart hill climbing
    - Steps of 0.1
    - 100 iterations
    - Using 10-fold cross-validation
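Weighted fusion with random-restart hill climbing can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for the reported results: `evaluate` is a hypothetical callable that returns a validation score (for example MAP) for a weight vector, weights are restricted to a 0.1 grid in [0, 1], "100 iterations" is read as 100 candidate moves per restart, and the number of restarts is arbitrary.

```python
import random


def weighted_combsum(runs, weights):
    """Linearly weighted sum of (already normalized) run scores."""
    fused = {}
    for run, weight in zip(runs, weights):
        for item, score in run.items():
            fused[item] = fused.get(item, 0.0) + weight * score
    return fused


def hill_climb(evaluate, n_runs, step=0.1, iterations=100, restarts=5, seed=0):
    """Random-restart hill climbing over a grid of run weights."""
    rng = random.Random(seed)
    grid = [round(i * step, 10) for i in range(int(round(1 / step)) + 1)]
    best_weights, best_score = None, float("-inf")
    for _ in range(restarts):
        weights = [rng.choice(grid) for _ in range(n_runs)]
        score = evaluate(weights)
        for _ in range(iterations):
            # Move one randomly chosen weight one step up or down.
            i = rng.randrange(n_runs)
            candidate = list(weights)
            moved = candidate[i] + rng.choice((-step, step))
            candidate[i] = round(min(1.0, max(0.0, moved)), 10)
            candidate_score = evaluate(candidate)
            if candidate_score > score:
                weights, score = candidate, candidate_score
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score
```

In practice `evaluate` would fuse the runs with `weighted_combsum` on held-out folds and return the resulting score.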

What do we combine?

•  What aspects of the task can we vary?
  - Algorithms
    - User-based CF
    - Item-based CF
    - Content-based filtering (profile- and post-centric matching)
    - Hybrid filtering (CF with metadata overlap)
  - Data representation
    - Usage data
    - Tags
    - Metadata
  - Number of runs combined
    - Can vary from two to eight


What do we combine?

Run ID     # runs   Description
Fusion A   2        Best UBCF and IBCF runs with usage data
Fusion B   2        Best UBCF and IBCF runs with tagging data
Fusion C   2        Best CF runs with usage and/or tagging data (A + B)
Fusion D   2        Best profile-centric and post-centric matching runs
Fusion E   2        Best UBCF and IBCF runs with metadata similarity
Fusion F   2        Best metadata-based runs (D + E)
Fusion G   2        Best folksonomic and best metadata-based run (C + F)
Fusion H   4        All four best CF runs with usage and/or tagging data (A + B)
Fusion I   4        All four best metadata-based runs (D + E)
Fusion J   8        All eight best runs (A + B + D + E)

Results

           Bookmarks               Scientific articles
Run ID     BibSonomy   Delicious   BibSonomy   CiteULike
Fusion A   0.0362      0.0065      0.1017      0.0949
Fusion B   0.0434      0.0105      0.1196      0.0952
Fusion C   0.0482      0.0115      0.1593      0.1278
Fusion D   0.0388      0.0038      0.1303      0.1008
Fusion E   0.0514      0.0051      0.1596      0.0945
Fusion F   0.0494      0.0056      0.1600      0.1136
Fusion G   0.0539      0.0109      0.1539      0.1556
Fusion H   0.0619      0.0092      0.1671      0.1286
Fusion I   0.0565      0.0065      0.1749      0.1188
Fusion J   0.0695      0.0090      0.1983      0.1531

Comparison

                           Bookmarks               Scientific articles
                           BibSonomy   Delicious   BibSonomy   CiteULike
UBCF + usage               0.0277      0.0046      0.0865      0.0757
UBCF + tags                0.0102      0.0017      0.0459      0.0449
IBCF + usage               0.0244      0.0027      0.0737      0.0887
IBCF + tags                0.0370      0.0101      0.1100      0.0814
Content-based + profile    0.0402      0.0014      0.1279      0.0987
Content-based + post       0.0259      0.0036      0.1190      0.0455
Hybrid (UBCF + metadata)   0.0218      0.0039      0.0410      0.0608
Hybrid (IBCF + metadata)   0.0399      0.0017      0.1510      0.0746
Best fusion run            0.0695      0.0115      0.1983      0.1556
% improvement              +72.9%      +13.9%      +31.3%      +57.6%
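The improvement percentages compare the best fusion run with the best individual run on each data set. A short Python check of that arithmetic, using the scores from the comparison table (the function and variable names are introduced here for illustration):

```python
def relative_improvement(best_single, best_fused):
    """Percentage gain of the best fusion run over the best single run."""
    return 100.0 * (best_fused - best_single) / best_single


# (best individual run, best fusion run) per data set, from the table.
scores = {
    "BibSonomy bookmarks": (0.0402, 0.0695),
    "Delicious": (0.0101, 0.0115),
    "BibSonomy articles": (0.1510, 0.1983),
    "CiteULike": (0.0987, 0.1556),
}

improvements = {
    name: round(relative_improvement(single, fused), 1)
    for name, (single, fused) in scores.items()
}
```

The resulting values are 72.9, 13.9, 31.3, and 57.6 percent, in line with the last row of the table.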

Findings

•  Fusion works! But what works best?
  - Weighted fusion
  - Combining different algorithms
  - Combining different data representations
  - Combining a higher number of runs
  - CombMNZ and CombSUM
•  Additional analyses showed that
  - Improvements mostly a precision-enhancing effect
  - Due to better ranking of documents
•  New question: where is the sweet spot?
  - Performance vs. computation

Overall findings

•  Using tag overlap in item-based CF works well
  - Easy to implement/adapt
•  Metadata-based recommendation often better than CF
  - Not significantly
  - No clear winning algorithm
  - Easiest to implement using existing search engine
•  Recommender fusion is promising
  - Combine runs that cover different aspects
  - Weighted fusion works best
  - Combining more (but different) runs works better

Future work

•  Large-scale comparison of algorithms
•  Online, user-based evaluation of algorithms
•  Exploring other recommendation tasks

Questions?

Metadata findings

•  What did we test in terms of metadata fields?
  - Individual intrinsic fields
  - All intrinsic fields combined
  - All intrinsic fields + all extrinsic fields combined
•  Metadata
  - All intrinsic metadata combined works best
  - Best fields: TAGS, TITLE, AUTHOR, URL, ABSTRACT
  - Extrinsic metadata contributes little