
PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers


Page 1: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

ICML 2005

François Laviolette and Mario Marchand, Université Laval

Page 2: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

PLAN

The “traditional” PAC-Bayes theorem (for the usual data-independent setting)

The “generalized” PAC-Bayes theorem (for the more general sample compression setting)

Implications and follow-ups

Page 3: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

A result from folklore:
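The equation for this result did not survive the transcript. A plausible statement of the folklore (Occam's razor) bound, assuming a countable hypothesis class H and a data-independent distribution P over H, is: for any δ ∈ (0,1],

\Pr_{S\sim D^m}\!\left( \forall h\in H:\;\; R(h) \;\le\; R_S(h) + \sqrt{\frac{\ln\frac{1}{P(h)} + \ln\frac{1}{\delta}}{2m}} \right) \;\ge\; 1-\delta ,

where R(h) is the true risk of h and R_S(h) its empirical risk on the m training examples.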

Page 4: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

In particular, for Gibbs classifiers:
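The specialization shown on this slide is missing from the transcript; what follows is one natural way it could read. The Gibbs classifier G_Q draws h ~ Q for each example to classify, so

R(G_Q) = \mathbb{E}_{h\sim Q}\,R(h), \qquad R_S(G_Q) = \mathbb{E}_{h\sim Q}\,R_S(h),

and averaging the folklore bound over h ~ Q gives, with probability at least 1−δ,

R(G_Q) \;\le\; R_S(G_Q) + \mathbb{E}_{h\sim Q}\sqrt{\frac{\ln\frac{1}{P(h)} + \ln\frac{1}{\delta}}{2m}} .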

What if we choose P after observing the data?

Page 5: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

The “traditional” PAC-Bayes Theorem
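The theorem statement itself is not in the transcript. The usual form (due to Langford and Seeger), presumably what the slide displayed, reads: for any data-independent prior P over H and any δ ∈ (0,1], with probability at least 1−δ over the draw of S ~ D^m,

\forall Q:\quad \mathrm{kl}\!\left(R_S(G_Q)\,\middle\|\,R(G_Q)\right) \;\le\; \frac{\mathrm{KL}(Q\|P) + \ln\frac{m+1}{\delta}}{m},

where kl(q‖p) = q ln(q/p) + (1−q) ln((1−q)/(1−p)) and KL(Q‖P) is the Kullback–Leibler divergence between the posterior Q and the prior P.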

Page 6: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

The Gibbs and the majority vote

We have a bound for G_Q, but we normally use instead the Bayes classifier B_Q (which is the Q-weighted majority vote classifier).

Consequently R(B_Q) ≤ 2 R(G_Q) (this can be improved with the “de-randomization” technique of Langford and Shawe-Taylor, 2003).

So the PAC-Bayes theorem also gives a bound on the majority vote classifier.
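The factor of 2 follows from a standard argument not spelled out on the slide: whenever B_Q errs on an example (x, y), at least half of the Q-weight of classifiers errs on it, so

\mathbb{E}_{h\sim Q}\,\mathbb{1}[h(x)\neq y] \;\ge\; \tfrac{1}{2}\,\mathbb{1}[B_Q(x)\neq y],

and taking the expectation over (x, y) drawn from the data distribution yields R(G_Q) ≥ R(B_Q)/2, i.e. R(B_Q) ≤ 2 R(G_Q).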

Page 7: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

The sample compression setting

Theorem 1 is valid in the usual data-independent setting, where H is defined without reference to the training data.

Example: H = the set of all linear classifiers h: R^n → {-1, +1}.

In the more general sample compression setting, each classifier is identified by two different sources of information:

The compression set: an (ordered) subset of the training set.

A message string of additional information needed to identify a classifier.

Theorem 1 is not valid in this more general setting.

Page 8: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

To be more precise: in the sample compression setting, there exists a “reconstruction” function R that gives a classifier

h = R(σ, S_i)

when given a compression set S_i and a message string σ.

Recall that S_i is an ordered subset of the training set S, where the order is specified by i = (i_1, i_2, …, i_|i|).
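As a concrete illustration (not taken from the slides), consider the nearest-neighbour classifiers listed among the examples on the next slide: the compression set can be a set of stored prototypes and the message string can be left empty. A minimal Python sketch of such a reconstruction function, with hypothetical names:

    import numpy as np

    def reconstruct_1nn(message, compression_set):
        """Reconstruction function R(sigma, S_i) for a 1-nearest-neighbour
        classifier: the message string is unused and the classifier is fully
        determined by the labelled prototypes in the compression set."""
        xs = np.array([x for x, _ in compression_set], dtype=float)
        ys = np.array([y for _, y in compression_set])

        def classifier(x):
            # Predict the label of the closest prototype in the compression set.
            distances = np.linalg.norm(xs - np.asarray(x, dtype=float), axis=1)
            return ys[np.argmin(distances)]

        return classifier

    # Usage: a compression set of two labelled points in R^2, empty message.
    S_i = [((0.0, 0.0), -1), ((1.0, 1.0), +1)]
    h = reconstruct_1nn(message="", compression_set=S_i)
    print(h((0.9, 0.8)))  # prints 1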

Page 9: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

Examples

Set Covering Machines (SCM) [Marchand and Shawe-Taylor, JMLR 2002]

Decision List Machines (DLM) [Marchand and Sokolova, JMLR 2005]

Support Vector Machines (SVM), where the compression set is the set of support vectors

Nearest neighbour classifiers (NNC)

…

Page 10: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

Priors in the sample compression setting

We will thus use priors defined over the set of all the parameters (i, σ) needed by the reconstruction function R, once a training set S is given.

The priors must be data-independent.

The priors should be written as:
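The formula for the prior is missing from the transcript. Assuming the convention of the authors' related work, the prior is a distribution over the pairs (i, σ), commonly written in the factorized form

P(i, \sigma) \;=\; P_{\mathcal{I}}(i)\,P_{\mathcal{M}}(\sigma \mid i), \qquad \sum_{i}\sum_{\sigma} P(i,\sigma) = 1,

where P_I is a distribution over the possible index vectors i and P_M(· | i) a distribution over the message strings compatible with the compression set S_i; the notation P_I, P_M here is illustrative.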

Page 11: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

The “generalized” PAC-Bayes Theorem

Page 12: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

The bound incorporates Occam's principle of parsimony.

The new PAC-Bayes theorem states that the risk bound for the sample-compressed Gibbs classifier G_Q can be lower than the risk bound for any of the individual classifiers it averages over.

Page 13: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

The PAC-Bayes theorem for bounded compression set size

Page 14: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

Conclusion

The new PAC-Bayes bound:

is valid in the more general sample compression setting;

automatically incorporates Occam's principle of parsimony.

A sample-compressed Gibbs classifier can have a smaller risk bound than any of its members.

Page 15: PAC-Bayes Risk Bounds for Sample-Compressed Gibbs Classifiers

The next steps

Finding derived bounds for particular sample-compressed classifiers such as majority votes of SCMs and DLMs, SVMs, and NNCs.

Developing new learning algorithms based on the theoretical information given by the bound.

A tight risk bound for majority vote classifiers?