The Automated Travel Agent: Machine Learning for Hotel Cluster Recommendation
Michael Arruza Cruz, Michael Straka, John Pericich
Abstract

Expedia users who prefer the same types of hotels presumably share other, non-hotel commonalities with each other. With this in mind, Kaggle challenged developers to recommend hotels to Expedia users. Armed with a training set containing data about 37 million Expedia users, we set out to do just that. Our machine-learning algorithms ranged from direct applications of material learned in class to multi-part algorithms with novel combinations of recommender-system techniques. Kaggle's benchmark for randomly guessing a user's hotel cluster is 0.02260, and the mean average precision at K = 5 for naïve recommender systems is 0.05949. Our best combination of machine-learning algorithms achieved a score just over 0.30. Our results provide insight into performing multi-class classification on data sets that lack linear structure.
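The scores above use mean average precision at K = 5 (MAP@5), the competition's metric. A minimal sketch of that metric, assuming each user has exactly one correct cluster:

```python
def map_at_k(true_clusters, predictions, k=5):
    """Mean average precision at k when each user has one correct cluster.

    true_clusters: the correct cluster id per user.
    predictions: a ranked list of cluster ids per user.
    A user contributes 1/(rank of the correct cluster), or 0 if the
    correct cluster is absent from the top k.
    """
    total = 0.0
    for truth, ranked in zip(true_clusters, predictions):
        for rank, cluster in enumerate(ranked[:k], start=1):
            if cluster == truth:
                total += 1.0 / rank
                break
    return total / len(true_clusters)

# A correct first guess scores 1.0; a hit at rank 5 scores 0.2.
print(map_at_k([7, 3], [[7, 1, 2, 4, 5], [9, 8, 6, 0, 3]]))  # (1.0 + 0.2) / 2 = 0.6
```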
Data and Features

• 37 million data entries corresponding to user data, each with a total of 23 features covering hotel destination, number of rooms, number of children, length of stay, etc.
• Hotel clusters are anonymized, and only user data is given. This makes typical user-item matrix methods such as Alternating Least Squares impossible to use.
• The data set is skewed toward certain hotel clusters: some clusters are overrepresented while others appear very rarely.
• Likewise, some destinations appear very frequently, while others appear only a handful of times.
• The data is also not linearly separable.
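The cluster skew noted above is easy to quantify with a frequency count; the labels below are synthetic stand-ins for the real hotel_cluster column:

```python
from collections import Counter

# Synthetic labels standing in for the hotel_cluster column.
labels = [0] * 500 + [1] * 300 + [2] * 150 + [3] * 40 + [4] * 10

counts = Counter(labels)
total = sum(counts.values())
# Share of all examples claimed by the two most common clusters.
top2_share = sum(c for _, c in counts.most_common(2)) / total
print(top2_share)  # 0.8
```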
Methodology

Baseline methods:
• Our first attempt implemented a basic multinomial Naïve Bayes classifier that returns a list of the ten most likely clusters for a user. This method served as our baseline moving forward.
• Our second attempt used a support vector machine with an RBF kernel. This underperformed compared to Naïve Bayes. We suspect it is difficult to fit a hyperplane to the data without using parameters that result in overfitting, given the lack of linear separability.
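The Naïve Bayes baseline can be sketched with scikit-learn; the features and cluster counts below are synthetic, and the top-ten ranking mirrors the baseline described above:

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

rng = np.random.default_rng(0)
X_train = rng.integers(0, 10, size=(200, 5))  # stand-in for count-like user features
y_train = rng.integers(0, 20, size=200)       # stand-in for hotel cluster labels

clf = MultinomialNB().fit(X_train, y_train)

# Rank clusters by posterior probability and keep the ten most likely.
probs = clf.predict_proba(X_train[:1])[0]
top10 = clf.classes_[np.argsort(probs)[::-1][:10]]
print(len(top10))  # 10
```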
Gradient Boosting:
• Our first real success came from an ensemble of decision trees minimizing a softmax loss function.
• This is likely due to intelligent learning of the non-linear structure in the data, along with boosting's resistance to overfitting.
• It converged more slowly than the SVM and Naïve Bayes for increasing values of K in MAP@K, implying its rankings are more nuanced.
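A minimal sketch of this ensemble using scikit-learn's gradient boosting, which minimizes multinomial (softmax) deviance for multi-class targets; the data and hyperparameters here are illustrative, not the ones we tuned:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y = rng.integers(0, 6, size=300)  # stand-in for hotel cluster labels

# Shallow trees plus a small learning rate: boosting's usual guard
# against overfitting noisy, non-linear data.
gb = GradientBoostingClassifier(n_estimators=50, max_depth=3,
                                learning_rate=0.1).fit(X, y)

# Top-5 ranked clusters for one user, analogous to MAP@5 scoring.
ranked = np.argsort(gb.predict_proba(X[:1])[0])[::-1][:5]
print(len(ranked))  # 5
```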
Kernelized User Similarity (Most Effective Method):
• First, we cluster the data by destination: training examples sharing the same destination id are grouped together.
• For each new testing example, we retrieve the training group with a matching destination id and create user similarity matrices. The matrices are built with a kernel function; we tried the following three kernels:
K₁(x, y) = Σ_{i=1..m} 1{x_i = y_i}, with each score weighted by exp(−z² / (2τ²)),

where z is the placement of the user in terms of similarity to the test example (first most similar, second, third, etc.) and τ = 60.
K₂(x, y) = ( (1/m) Σ_{i=1..m} 1{x_i = y_i} )^e,

where e = 5.
K₃(x, y) = (1/m′) Σ_{i=1..m′} 1{x_i = y_i} + k Σ_{i=1..m″} x_i y_i,

where x and y are divided into two vectors of sizes m′ and m″, with different features separated into each.
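Reconstructed from the formulas above, a sketch of the three kernels; τ and e are as stated in the text, while the categorical/numeric feature split for the third kernel and the value of its constant k are assumptions:

```python
import math

def kernel1(x, y):
    """K1: count of matching features."""
    return sum(xi == yi for xi, yi in zip(x, y))

def rank_weight(z, tau=60.0):
    """Gaussian decay applied to K1 scores by similarity rank z."""
    return math.exp(-z ** 2 / (2 * tau ** 2))

def kernel2(x, y, e=5):
    """K2: fraction of matching features, raised to the power e."""
    return (sum(xi == yi for xi, yi in zip(x, y)) / len(x)) ** e

def kernel3(x_cat, y_cat, x_num, y_num, k=1.0):
    """K3: match rate on the categorical split plus a scaled dot
    product on the numeric split (the split itself is illustrative)."""
    match = sum(xi == yi for xi, yi in zip(x_cat, y_cat)) / len(x_cat)
    dot = sum(xi * yi for xi, yi in zip(x_num, y_num))
    return match + k * dot

print(kernel2([1, 2, 3, 4], [1, 2, 3, 0]))  # (3/4)**5 = 0.2373046875
```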
• Once the similarity matrix is created, find the 150 users most similar to the test example, and for each hotel cluster represented among those 150 users, sum their similarity scores (determined by the chosen kernel).
• Then recommend the top 10 hotel clusters (by summed similarity score) found among the most similar users.
• Of the three kernels above, the second proved most effective, likely due to the heavily discretized nature of the user features.
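The steps above can be sketched end to end; the users, clusters, and kernel choice below are synthetic and illustrative, with the second kernel standing in for the chosen similarity:

```python
from collections import defaultdict

def kernel2(x, y, e=5):
    """Fraction of matching features, raised to the power e."""
    return (sum(a == b for a, b in zip(x, y)) / len(x)) ** e

def recommend(test_user, train_users, kernel, n_neighbors=150, n_rec=10):
    """train_users: (features, hotel_cluster) pairs sharing the test
    user's destination id. Returns clusters ranked by the summed
    similarity of the neighbors that booked them."""
    scored = sorted(((kernel(test_user, feats), cluster)
                     for feats, cluster in train_users), reverse=True)
    votes = defaultdict(float)
    for sim, cluster in scored[:n_neighbors]:
        votes[cluster] += sim
    return sorted(votes, key=votes.get, reverse=True)[:n_rec]

# Two near-identical users booked cluster 40; one dissimilar user booked 7.
train = [([1, 2, 3], 40), ([1, 2, 0], 40), ([9, 9, 9], 7)]
print(recommend([1, 2, 3], train, kernel2))  # [40, 7]
```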
Results

Algorithm                   Precision   Recall    F1
SVM (RBF kernel)            0.0054      0.0101    0.0010
Naïve Bayes                 0.0759      0.0725    0.0735
Gradient Boosting           0.1239      0.1284    0.1087
User Similarity: Kernel 1   0.1825      0.1822    0.1717
User Similarity: Kernel 3   0.1836      0.1826    0.1716
User Similarity: Kernel 2   0.1856      0.1846    0.1734
Discussion and Future Work

This project provides an excellent case study for applying machine learning algorithms to large data sets lacking obvious structure. It also embodies the challenge of recommending items about which we have no features. In addressing these challenges, we demonstrated that a creative combination of user similarity matrices and Jaccard-style similarity outperforms gradient boosting, a technique currently well known for winning Kaggle competitions. For future work, we recommend using ensemble stacking methods to combine predictions from various algorithms. Further work could also explore tuning hyperparameters for gradient boosting.
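The stacking suggestion can be sketched with scikit-learn's StackingClassifier, here combining two of the model families compared above; the data and meta-learner choice are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import StackingClassifier, GradientBoostingClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = rng.integers(0, 10, size=(200, 5)).astype(float)  # non-negative counts
y = rng.integers(0, 3, size=200)                      # stand-in cluster labels

# The meta-learner combines the base models' out-of-fold predictions.
stack = StackingClassifier(
    estimators=[("nb", MultinomialNB()),
                ("gb", GradientBoostingClassifier(n_estimators=20))],
    final_estimator=LogisticRegression(max_iter=1000),
).fit(X, y)

print(stack.predict(X[:3]).shape)  # (3,)
```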
Figure 1: Frequency of hotel clusters in the data set.
Figure 2: Three-dimensional PCA of the data for the three most popular hotel clusters. While not linearly separable, there does appear to be some non-linear structure to the hotel clusters; the success of our user-similarity methods supports this.
Overall, the best methods were those that used user similarity and kernels to recommend the hotel clusters that other, similar users booked. Gradient Boosting was also effective, but its mean average precision seemed to hit a hard cap at 0.25 regardless of the parameters used. The SVM performed very poorly, as did the other basic machine learning methods we initially tried on the data.
Figure 3: Mean average precision (5 predictions) by algorithm: SVM (RBF kernel), Naïve Bayes, Gradient Boosting, and User Similarity with Kernels 1, 2, and 3 (y-axis from 0 to 0.35).