Multiple Kernel Learning
Marius Kloft
Technische Universität Berlin (now: Courant / Sloan-Kettering Institute)
Machine Learning
• Aim
▫ Learning the relation between two random quantities from observations
• Kernel-based learning
Multiple Views / Kernels
• Example
▫ Object detection in images
How to combine the views?
Weightings.
(Lanckriet, 2004)
Kernels: shape, space, color
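The weighted combination of views uses the standard parameterization of Lanckriet et al. (2004):

```latex
k(x, x') = \sum_{m=1}^{M} \theta_m\, k_m(x, x'), \qquad \theta_m \ge 0,
```

where each \(k_m\) is the kernel of one view (e.g., shape, space, or color).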
Computation of Weights?
• State of the art
▫ Sparse weights
Kernels / views are completely discarded
▫ But why discard information?
(Lanckriet et al., Bach et al., 2004)
From Vision to Reality?
• State of the art: sparse method
▫ empirically ineffective
(Gehler et al., Noble et al., Cortes et al., Shawe-Taylor et al., etc., NIPS WS 2008, ICML, ICCV, ...)
• New methodology
▫ Learning bounds with rates up to O(M/n)
▫ More efficient and effective in practice
Presentation of Methodology: Non-sparse Multiple Kernel Learning
• Generalized formulation
▫ arbitrary loss
▫ arbitrary norms
e.g., lp-norms
1-norm leads to sparsity
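The lp-norm constraint referred to above takes the standard form (consistent with Kloft et al., JMLR 2011):

```latex
\|\theta\|_p = \Big(\sum_{m=1}^{M} |\theta_m|^p\Big)^{1/p}, \qquad \theta_m \ge 0,
```

with p = 1 recovering the sparsity-inducing constraint and p > 1 yielding non-sparse kernel weights.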
New Methodology
• Computation of weights?
▫ Model: combined kernel
▫ Mathematical program
Convex problem.
(Kloft et al., NIPS 2009, ECML 2009/10, JMLR 2011)
Optimization over weights
Theoretical Analysis
• Theoretical foundations
▫ Active research topic
NIPS workshop 2009
▫ We show:
Theorem (Kloft & Blanchard). The local Rademacher complexity of MKL is bounded by:
• Corollaries (Learning Bounds)
▫ Bound:
depends on kernel
best known rate:
(Kloft & Blanchard, NIPS 2011, JMLR 2012; Kloft, Rückert, Bartlett, ECML 2010)
(Cortes, Mohri, Rostamizadeh, ICML 2010)
Proof (Sketch)
1. Relating the original class with the centered class
2. Bounding the complexity of the centered class
3. Applying the Khintchine–Kahane and Rosenthal inequalities
4. Bounding the complexity of the original class
5. Relating the bound to the truncation of the spectra of the kernels
• Implementation
▫ In C++ (“SHOGUN Toolbox”)
Matlab/Octave/Python/R support
▫ Runtime:
~ 1-2 orders of magnitude faster
Optimization
• Algorithms
1. Newton method
2. sequential, quadratically constrained programming
with level set projections
3. Block-coordinate descent algorithm (sketch)
▫ Alternate:
 solve (P) w.r.t. w
 solve (P) w.r.t. the kernel weights (analytical update)
▫ Until convergence (convergence proved)
(Kloft et al., JMLR 2011)
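The alternating scheme can be sketched as follows. This is a minimal sketch, not the SHOGUN implementation: kernel ridge regression stands in for the SVM base problem so the example stays self-contained, while the weight update uses the closed form from Kloft et al. (JMLR 2011).

```python
import numpy as np

def lp_mkl_bcd(kernels, y, p=2.0, lam=1.0, n_iter=20):
    """Block-coordinate descent sketch for lp-norm MKL.

    Alternates between solving the base problem for the dual
    coefficients alpha (here: kernel ridge regression, a stand-in
    for the SVM in the slides) and an analytical update of the
    kernel weights theta, normalized so that ||theta||_p = 1.
    """
    M, n = len(kernels), len(y)
    theta = np.full(M, M ** (-1.0 / p))            # uniform start, ||theta||_p = 1
    for _ in range(n_iter):
        K = sum(t * Km for t, Km in zip(theta, kernels))
        alpha = np.linalg.solve(K + lam * np.eye(n), y)        # base-learner step
        # squared RKHS norms of the per-kernel components w_m
        norms_sq = np.array([t ** 2 * alpha @ Km @ alpha
                             for t, Km in zip(theta, kernels)])
        norms_sq = np.maximum(norms_sq, 1e-12)
        # analytical weight update: theta_m ∝ ||w_m||^(2/(p+1))
        theta = norms_sq ** (1.0 / (p + 1))
        theta /= np.sum(theta ** p) ** (1.0 / p)   # renormalize to ||theta||_p = 1
    return theta, alpha
```

On data where only one kernel carries signal, the update concentrates weight on it, while p > 1 keeps the remaining weights strictly positive rather than discarding those views.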
▫ Choice of p crucial
▫ Optimality of p depends on true sparsity
SVM fails in the sparse scenarios
1-norm best in the sparsest scenario only
p-norm MKL proves robust in all scenarios
• Bounds
▫ Can minimize bounds w.r.t. p
▫ Bounds well reflect empirical results
• Design
▫ Two 50-dimensional Gaussians
mean μ1 := binary vector;
▫ Zero-mean features irrelevant
50 training examples
One linear kernel per feature
▫ Six scenarios
Vary % of irrelevant features
Variance so that Bayes error constant
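The data design above can be sketched as follows. Assumed details not stated on the slide: labels are ±1, the binary mean vector is 1 on the relevant features and 0 elsewhere, and `sigma` stands in for the per-scenario variance calibration that keeps the Bayes error constant.

```python
import numpy as np

def make_toy(n=50, d=50, frac_irrelevant=0.64, sigma=1.0, seed=0):
    """Two 50-dimensional Gaussians; features whose class means do not
    differ (zero-mean features) are irrelevant. One linear kernel per
    feature, as in the slide's design."""
    rng = np.random.default_rng(seed)
    n_rel = int(round(d * (1.0 - frac_irrelevant)))
    mu = np.zeros(d)
    mu[:n_rel] = 1.0                      # binary mean vector (assumption)
    y = rng.choice([-1.0, 1.0], size=n)
    X = y[:, None] * mu + sigma * rng.normal(size=(n, d))
    # one linear kernel per feature: K_j = x_j x_j^T
    kernels = [np.outer(X[:, j], X[:, j]) for j in range(d)]
    return X, y, kernels
```

Varying `frac_irrelevant` over the six scenarios changes the true sparsity of the optimal kernel weighting.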
• Results
Toy Experiment
[Figure: test error (0–50%) as a function of sparsity (0%, 44%, 64%, 82%, 92%, 98%) for each method]
Applications
▫ Computer vision
 Visual object recognition
▫ Bioinformatics
 Subcellular localization
 Protein fold prediction
 DNA transcription start site detection
 Metabolic network reconstruction
(figure theme: non-sparsity)
• Lesson learned:
▫ Optimality of a method depends on true underlying sparsity of problem
• Applications studied:
• Visual object recognition
▫ Aim: annotation of visual media (e.g., images)
▫ Motivation: content-based image retrieval
Application Domain: Computer Vision
[Example images: aeroplane, bicycle, bird]
• Multiple kernels
▫ based on
 Color histograms
 Shapes (gradients)
 Local features (SIFT words)
 Spatial features
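Histogram features of this kind are commonly compared with a χ²-based kernel; the following is a minimal sketch under that assumption (the slides do not specify the exact kernel functions used per feature type).

```python
import numpy as np

def chi2_kernel(X, Y, gamma=1.0):
    """Exponential chi-square kernel, a common choice for comparing
    histogram features such as color or bag-of-words histograms:
    k(x, y) = exp(-gamma * sum_i (x_i - y_i)^2 / (x_i + y_i))."""
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    d = np.zeros((X.shape[0], Y.shape[0]))
    for i, x in enumerate(X):
        num = (x - Y) ** 2
        den = x + Y
        # bins empty in both histograms contribute 0
        d[i] = np.divide(num, den, out=np.zeros_like(num), where=den > 0).sum(axis=1)
    return np.exp(-gamma * d)
```

Each feature type and spatial tiling then yields one such kernel matrix, which MKL combines with learned weights.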
• Datasets
▫ 1. VOC 2009 challenge
7054 train / 6925 test images
20 object categories
▫ aeroplane, bicycle, …
▫ 2. ImageCLEF2010 Challenge
8000 train / 10000 test images
▫ taken from Flickr
93 concept categories
▫ partylife, architecture, skateboard, …
Application Domain: Computer Vision
• 32 Kernels
▫ 4 types:
▫ varied over color channel combinations and spatial tilings (levels of a spatial pyramid)
▫ one-vs.-rest classifier for each visual concept
                  Bag of Words   Global Histogram
 Gradients        BoW-SIFT       Ho(O)G
 Color histograms BoW-C          HoC
• Challenge results
▫ Employed our approach for ImageCLEF2011 Photo Annotation challenge
achieved the winning entries in 3 categories
Application Domain: Computer Vision
• Preliminary results:
▫ using BoW-S only gives worse results
→ BoW-S alone is not sufficient
          SVM    MKL
 VOC2009  55.85  56.76
 CLEF2010 36.45  37.02
(Binder, Kloft, et al., 2010, 2011)
• Experiment
▫ Disagreement of single-kernel classifiers
▫ BoW-S kernels induce more or less the same predictions
Application Domain: Computer Vision
• Why can MKL help?
▫ some images better captured by certain kernels
▫ different images may have different kernels that capture them well
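The disagreement analysis can be sketched as follows; `disagreement_matrix` is a hypothetical helper, not code from the slides, and assumes each single-kernel classifier's predictions are available as a row of labels.

```python
import numpy as np

def disagreement_matrix(preds):
    """Pairwise disagreement of single-kernel classifiers.

    preds: array of shape (n_classifiers, n_points) holding predicted
    labels. Entry (i, j) is the fraction of test points on which
    classifiers i and j disagree; near-zero off-diagonal entries (as
    observed for the BoW-S kernels) indicate redundant kernels.
    """
    P = np.asarray(preds)
    return np.mean(P[:, None, :] != P[None, :, :], axis=2)
```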
• Detection of
▫ transcription start sites:
• by means of kernels based on:
▫ sequence alignments
▫ distribution of nucleotides
downstream, upstream
▫ folding properties
binding energies and angles
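As one concrete example of a sequence kernel, a k-spectrum kernel compares k-mer count vectors. This is an illustrative stand-in: the slides list alignment-, nucleotide-distribution-, and folding-based kernels without giving formulas.

```python
from collections import Counter

def spectrum_kernel(s, t, k=3):
    """k-spectrum kernel for DNA strings: the inner product of the
    two sequences' k-mer count vectors."""
    cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return sum(cs[m] * ct[m] for m in cs)
```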
• Empirical analysis
▫ detection accuracy (AUC):
▫ higher accuracies than sparse MKL and ARTS (small n)
ARTS = winner of international comparison of 19 models
(Abeel et al., 2009)
(Kloft et al., NIPS 2009, JMLR 2011)
Application Domain: Genetics
• Theoretical analysis
▫ impact of lp-Norm on bound
▫ confirms experimental results:
stronger theoretical guarantees for proposed approach (p>1)
empirical and theoretical results approximately equal for
• Protein Fold Prediction
▫ Prediction of fold class of protein
Fold class related to protein’s function
▫ e.g., important for drug design
▫ Data set and kernels from Y. Ying
27 fold classes
Fixed train and test sets
12 biologically inspired kernels
▫ e.g., hydrophobicity, polarity, van der Waals volume
• Results
▫ Accuracy:
▫ 1-norm MKL and SVM on par
▫ p-norm MKL performs best
6% higher accuracy than baselines
Application Domain: Pharmacology
Further Applications
▫ Computer vision
 Visual object recognition
▫ Bioinformatics
 Subcellular localization
 Protein fold prediction
 DNA transcription / splice site detection
 Metabolic network reconstruction
(figure theme: non-sparsity)
Conclusion: Non-sparse Multiple Kernel Learning
▫ Training with > 100,000 data points and > 1,000 kernels
▫ Sharp learning bounds
▫ Applications
 Visual object recognition: winner of the ImageCLEF 2011 Photo Annotation challenge
 Computational biology: gene detector competitive with the winner of an international comparison
Thank you for your attention
I will be happy to answer any additional questions
References
▫ Abeel, Van de Peer, Saeys (2009). Toward a gold standard for promoter prediction evaluation. Bioinformatics.
▫ Bach (2008). Consistency of the Group Lasso and Multiple Kernel Learning. Journal of Machine Learning Research (JMLR).
▫ Kloft, Brefeld, Laskov, and Sonnenburg (2008). Non-sparse Multiple Kernel Learning. NIPS Workshop on Kernel Learning.
▫ Kloft, Brefeld, Sonnenburg, Laskov, Müller, and Zien (2009). Efficient and Accurate Lp-norm Multiple Kernel Learning. Advances in Neural Information Processing Systems (NIPS 2009).
▫ Kloft, Rückert, and Bartlett (2010). A Unifying View of Multiple Kernel Learning. ECML.
▫ Kloft, Blanchard (2011). The Local Rademacher Complexity of Lp-Norm Multiple Kernel Learning. Advances in Neural Information Processing Systems (NIPS 2011).
▫ Kloft, Brefeld, Sonnenburg, and Zien (2011). Lp-Norm Multiple Kernel Learning. Journal of Machine Learning Research (JMLR), 12(Mar):953-997.
▫ Kloft and Blanchard (2012). On the Convergence Rate of Lp-norm Multiple Kernel Learning. Journal of Machine Learning Research (JMLR), 13(Aug):2465-2502.
▫ Lanckriet, Cristianini, Bartlett, El Ghaoui, Jordan (2004). Learning the Kernel Matrix with Semidefinite Programming. Journal of Machine Learning Research (JMLR).