Multi-view MachinesBokai Cao, Hucheng Zhou, Guoqiang Li and Philip S. Yu
Multi-view Data• WebdocumentsonWikipedia
• URL• Wordsonthepage
2
Multi-view data is prevalent and important in web applications.
• WebimagesonInstagram• Visualinforma<on• Textualtags
• WebsearchonBing• Userprofile• Adinforma<on• Querydescrip<on
Multi-view Data• WebdocumentsonWikipedia
• URL• Wordsonthepage
3
• WebimagesonInstagram• Visualinforma<on• Textualtags
• WebsearchonBing• Userprofile• Adinforma<on• Querydescrip<on
View%1
View%2
View%3
View%1
View%2
View%3
first-ord
er%
interac3
ons
SVM
Multi-view data is prevalent and important in web applications.
SupportVectorMachines(SVMs)
4
View%1
View%2
View%3
View%1
View%2
View%3
first-ord
er%
interac3
ons
SVM
[Vapnik 1995] The linear SVM model (and LR etc.) is limited to the first-order interactions. Polynomial SVMs are likely to degenerate into linear SVMs on sparse datasets.
SupportTensorMachines(STMs)
5
View%1
View%2
View%3
View%1
View%2
View%3 third-order%interac2ons
STM
[Cao et al. 2014] The STM model explores only the highest-order interactions.
Factoriza<onMachines(FMs)
6
View%1
View%2
View%3
View%1
View%2
View%3
second.order%interac2ons
first.ord
er%
interac2
ons
third.order%interac2ons
FM
[Rendle 2010] The FM model uses separate parameters to approximate interactions in different orders.
Mul<-viewMachines(MVMs)
7
View%1
View%2
View%3
View%1
View%2
View%3
global%bias
third3order%interac6ons
second3order%interac6ons
first3order%interac6ons
MVMs include the full-order interactions between multi-view features in a compact model.
MVMs jointly factorize the interaction parameters in different orders to learn a consistent representation under sparsity.
MVMs are a general predictor that can work with different loss functions for a variety of machine learning tasks.
Experiments• Datasets
• MovieLens:users(138,493),movies(27,278)andimplicitfeedback(27,278)
• BingAds:queries(958,426),ads(1,935,510)andimpressions(18)• Baselines
• Linearregression/logis<cregression(LR)• Tensorfactoriza<on(TF)• Mul<-viewFMs(mvFM,mvFM-3d,mvFM-reg)
• Implementa<on• GradientdescentwithAdaGrad• GraphXinSparkwithitera<veforwardandbackwardsteps• Codeavailableath_ps://github.com/cloudml/zen/tree/mvm_opt/
ml/src/main/scala/com/github/cloudml/zen/ml/recommenda<on
8
Experiments
9
Acknowledgements• WSDM Student Scholarship
• SIGIR Student Travel Grant
• We also thank Dinggang Shen for his discussions, Chunhui Zhang from the Bing Ads team for providing the dataset and Bo Zhao for helping with the experiments.
10