
CS 59000 Statistical Machine Learning, Lecture 10

Yuan (Alan) Qi, Purdue CS

Sept. 25, 2008

Outline

• Review of Fisher’s linear discriminant, perceptron, probabilistic generative models

• Probabilistic discriminative models: logistic regression, probit regression

Fisher Linear Discriminant

Within Class and Between Class Scatter Matrices

Generalized eigenvalue problem

Fisher’s Linear Discriminant
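A minimal sketch of Fisher's discriminant for two classes in two dimensions, assuming the standard result that the best projection direction is proportional to S_W^{-1}(m_2 - m_1), where S_W is the within-class scatter matrix. The toy data and function names are illustrative and not taken from the lecture.

```python
def mean(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter(points, m):
    # 2x2 scatter matrix: sum over points of (x - m)(x - m)^T
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class1, class2):
    m1, m2 = mean(class1), mean(class2)
    s1, s2 = scatter(class1, m1), scatter(class2, m2)
    # Within-class scatter S_W = S_1 + S_2
    sw = [[s1[i][j] + s2[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [m2[0] - m1[0], m2[1] - m1[1]]
    # w = S_W^{-1} (m2 - m1), using the closed-form 2x2 inverse
    return [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
            (sw[0][0] * diff[1] - sw[1][0] * diff[0]) / det]

class1 = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class2 = [(6.0, 5.0), (7.0, 8.0), (8.0, 7.0), (7.0, 6.0)]
w = fisher_direction(class1, class2)
proj1 = [w[0] * x + w[1] * y for x, y in class1]
proj2 = [w[0] * x + w[1] * y for x, y in class2]
```

On this toy data the projected classes do not overlap, so a single threshold on the projection separates them.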

Perceptron

Generalized Linear Model

Minimize

where M denotes the set of all misclassified patterns

Stochastic Gradient Descent
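A minimal sketch of stochastic gradient descent on the perceptron criterion, assuming targets t in {-1, +1} and the usual update w <- w + eta * phi_n * t_n for each misclassified pattern. The data set, learning rate, and function name are illustrative assumptions.

```python
def train_perceptron(data, eta=1.0, max_epochs=100):
    # data: list of (phi, t); phi includes a constant bias feature, t is -1 or +1
    w = [0.0] * len(data[0][0])
    for _ in range(max_epochs):
        mistakes = 0
        for phi, t in data:
            a = sum(wi * xi for wi, xi in zip(w, phi))
            if a * t <= 0:  # misclassified, or exactly on the boundary
                w = [wi + eta * t * xi for wi, xi in zip(w, phi)]
                mistakes += 1
        if mistakes == 0:  # every pattern is on the correct side
            break
    return w

data = [([1.0, 2.0, 1.0], 1), ([1.0, 3.0, 2.0], 1),
        ([1.0, -1.0, -2.0], -1), ([1.0, -2.0, -1.0], -1)]
w = train_perceptron(data)
```

The loop stops early only when the data are linearly separable; otherwise it runs for max_epochs and returns the last weight vector.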

Probabilistic Generative Models

Gaussian Class-Conditional Densities

Conditional densities of data:

The posterior distribution for label/class:
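A small one-dimensional check, assuming two Gaussian class-conditional densities with a shared variance: the class posterior computed by Bayes' rule should equal a logistic sigmoid of a linear function of x. The parameter values are illustrative assumptions.

```python
import math

def gauss_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def posterior_bayes(x, mu1, mu2, var, prior1):
    # p(C1 | x) from Bayes' rule
    p1 = gauss_pdf(x, mu1, var) * prior1
    p2 = gauss_pdf(x, mu2, var) * (1.0 - prior1)
    return p1 / (p1 + p2)

def posterior_sigmoid(x, mu1, mu2, var, prior1):
    # Same posterior written as sigmoid(w * x + w0)
    w = (mu1 - mu2) / var
    w0 = (mu2 ** 2 - mu1 ** 2) / (2.0 * var) + math.log(prior1 / (1.0 - prior1))
    return 1.0 / (1.0 + math.exp(-(w * x + w0)))

xs = [-2.0, -0.5, 0.0, 1.0, 2.5]
pairs = [(posterior_bayes(x, 1.0, -1.0, 2.0, 0.3),
          posterior_sigmoid(x, 1.0, -1.0, 2.0, 0.3)) for x in xs]
```

The two values in each pair agree up to floating-point error, because the quadratic terms in x cancel when the variance is shared.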

Maximum Likelihood Estimation

Linked to Fisher’s linear discriminant

Discrete features

Naïve Bayes classification:
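A minimal sketch of naive Bayes with binary features, assuming each feature is an independent Bernoulli variable given the class. The per-class score is then linear in x, and a softmax over the scores gives the posterior. The parameter values are made up for illustration.

```python
import math

def class_score(x, mu, prior):
    # a_k(x) = sum_i [x_i ln mu_ki + (1 - x_i) ln(1 - mu_ki)] + ln p(C_k)
    s = math.log(prior)
    for xi, m in zip(x, mu):
        s += xi * math.log(m) + (1 - xi) * math.log(1.0 - m)
    return s

def posterior(x, mus, priors):
    scores = [class_score(x, mu, p) for mu, p in zip(mus, priors)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

mus = [[0.9, 0.8, 0.1], [0.2, 0.3, 0.7]]  # hypothetical p(x_i = 1 | C_k)
priors = [0.5, 0.5]
post = posterior([1, 1, 0], mus, priors)
```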

Probabilistic Discriminative Models

Instead of modeling the class-conditional densities p(x|C_k) and the priors p(C_k),

model the posterior p(C_k|x) directly

Generative vs. Conditional Models

Discussion

Logistic Regression

Let

Likelihood function

Maximum Likelihood Estimation

Note that
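A minimal sketch of the cross-entropy error for logistic regression and its gradient, assuming targets t in {0, 1} and the standard identity d(sigma)/da = sigma(1 - sigma), which gives the gradient sum_n (y_n - t_n) phi_n. The data and weights are illustrative.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def error_and_grad(w, data):
    # data: list of (phi, t) with t equal to 0 or 1
    err = 0.0
    grad = [0.0] * len(w)
    for phi, t in data:
        y = sigmoid(sum(wi * xi for wi, xi in zip(w, phi)))
        err -= t * math.log(y) + (1 - t) * math.log(1.0 - y)
        for i, xi in enumerate(phi):
            grad[i] += (y - t) * xi
    return err, grad

data = [([1.0, 0.5], 1), ([1.0, -1.5], 0), ([1.0, 2.0], 1), ([1.0, 0.2], 0)]
w = [0.1, -0.3]
err, grad = error_and_grad(w, data)
```

Comparing grad against a finite-difference estimate of the error is a quick way to confirm the formula.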

Newton-Raphson Optimization for Linear Regression

Let H denote Hessian matrix

It converges in one iteration for linear regression.
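A small sketch of why one Newton-Raphson step suffices here, assuming the sum-of-squares error E(w) = 1/2 sum_n (w^T phi_n - t_n)^2. The error is exactly quadratic in w, so w - H^{-1} grad E lands on the least-squares solution from any starting point. The two-parameter data set is illustrative.

```python
def newton_step_linear(w, data):
    # data: list of (phi, t) with two-dimensional phi
    g = [0.0, 0.0]                  # gradient: sum_n (w^T phi_n - t_n) phi_n
    h = [[0.0, 0.0], [0.0, 0.0]]    # Hessian: sum_n phi_n phi_n^T
    for phi, t in data:
        r = w[0] * phi[0] + w[1] * phi[1] - t
        for i in range(2):
            g[i] += r * phi[i]
            for j in range(2):
                h[i][j] += phi[i] * phi[j]
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    step = [(h[1][1] * g[0] - h[0][1] * g[1]) / det,
            (h[0][0] * g[1] - h[1][0] * g[0]) / det]
    return [w[0] - step[0], w[1] - step[1]], g

data = [([1.0, 0.0], 1.0), ([1.0, 1.0], 3.1), ([1.0, 2.0], 4.9), ([1.0, 3.0], 7.2)]
w1, _ = newton_step_linear([10.0, -5.0], data)   # one step from an arbitrary start
w1b, _ = newton_step_linear([0.0, 0.0], data)    # one step from a different start
_, g1 = newton_step_linear(w1, data)             # gradient at the new point
```

Both starting points reach the same weights, and the gradient there is zero up to rounding.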

Newton-Raphson Optimization for Logistic Regression

Gradient and Hessian of the error function:

Newton-Raphson Optimization for Logistic Regression

Iteratively reweighted least squares (IRLS): solving a series of weighted least-squares problems
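A minimal sketch of Newton-Raphson (IRLS) for logistic regression with two parameters, assuming the gradient sum_n (y_n - t_n) phi_n and the Hessian sum_n y_n (1 - y_n) phi_n phi_n^T. The data are illustrative and chosen so that the classes overlap. With linearly separable data the maximum likelihood weights diverge.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def grad_hess(w, data):
    g = [0.0, 0.0]
    h = [[0.0, 0.0], [0.0, 0.0]]
    for phi, t in data:
        y = sigmoid(w[0] * phi[0] + w[1] * phi[1])
        r = y * (1.0 - y)  # weight of this pattern; it changes every iteration
        for i in range(2):
            g[i] += (y - t) * phi[i]
            for j in range(2):
                h[i][j] += r * phi[i] * phi[j]
    return g, h

def irls(data, iters=20):
    w = [0.0, 0.0]
    for _ in range(iters):
        g, h = grad_hess(w, data)
        det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
        # Newton update w <- w - H^{-1} g with the closed-form 2x2 inverse
        w = [w[0] - (h[1][1] * g[0] - h[0][1] * g[1]) / det,
             w[1] - (h[0][0] * g[1] - h[1][0] * g[0]) / det]
    return w

data = [([1.0, -2.0], 0), ([1.0, -1.0], 0), ([1.0, 0.0], 1),
        ([1.0, 1.0], 0), ([1.0, 2.0], 1), ([1.0, 3.0], 1)]
w = irls(data)
g, _ = grad_hess(w, data)
```

Unlike the linear regression case, the weights y_n (1 - y_n) depend on w, so the update has to be repeated until the gradient vanishes.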

Other discriminative models

Generative models <-> Logistic regression

How about other discriminative functions?

Probit Regression

Probit function:
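A minimal sketch of the probit function, assuming the usual definition as the standard normal cumulative distribution function, written here with the error function from Python's math module.

```python
import math

def probit(a):
    # Phi(a) = 1/2 * (1 + erf(a / sqrt(2)))
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

values = [probit(a) for a in (-1.96, 0.0, 1.96)]
```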

Labeling Noise Model

Robust to outliers and labeling errors
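A minimal sketch of a labeling noise model, assuming each target is flipped with probability eps, so that p(t = 1 | x) = (1 - eps) * sigma(a) + eps * (1 - sigma(a)). The symbol eps and the function name are illustrative assumptions.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def noisy_label_prob(a, eps):
    # Probability of observing t = 1 when labels are flipped with probability eps
    s = sigmoid(a)
    return (1.0 - eps) * s + eps * (1.0 - s)

p_far = noisy_label_prob(10.0, eps=0.1)    # point deep inside class 1
p_wrong = noisy_label_prob(-50.0, eps=0.1) # point deep inside class 0
```

The output stays between eps and 1 - eps, so a mislabeled point far from the boundary incurs only a bounded penalty in the log likelihood.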

Generalized Linear Models

Generalized Linear Models

Generalized linear model:

Activation function:

Link function:

Canonical Link Function

If we choose the canonical link function:

Gradient of the error function:

Laplace Approximation for Posterior

Gaussian approximation around mode:
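A one-dimensional sketch of the Laplace approximation, assuming an unnormalized density f(z) = exp(-z^2 / 2) * sigmoid(20 z + 4) chosen only for illustration. Newton's method finds the mode z0 of ln f, and the negative second derivative of ln f at z0 gives the precision of the approximating Gaussian.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def dlogf(z):
    # First derivative of ln f(z)
    return -z + 20.0 * (1.0 - sigmoid(20.0 * z + 4.0))

def d2logf(z):
    # Second derivative of ln f(z); always negative, so ln f is concave
    s = sigmoid(20.0 * z + 4.0)
    return -1.0 - 400.0 * s * (1.0 - s)

def laplace_1d(z=0.0, iters=50):
    for _ in range(iters):
        z -= dlogf(z) / d2logf(z)  # Newton step toward the mode
    precision = -d2logf(z)         # A = -d^2/dz^2 ln f(z) at the mode
    return z, 1.0 / precision      # mean and variance of the Gaussian q(z)

z0, var = laplace_1d()
```

The approximation matches the density only near the mode, so it can be poor for skewed or multimodal distributions.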

Illustration of Laplace Approximation

Evidence Approximation

Bayesian Information Criterion

A further simplification of the Laplace approximation:
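A minimal sketch of the Bayesian information criterion as a rough estimate of the log evidence, assuming the common form ln p(D) ~= ln p(D | theta_MAP) - (1/2) M ln N, with M parameters and N data points. The log-likelihood values below are made up for illustration.

```python
import math

def bic_log_evidence(log_lik_at_map, num_params, num_data):
    # ln p(D) ~= ln p(D | theta_MAP) - (1/2) * M * ln N
    return log_lik_at_map - 0.5 * num_params * math.log(num_data)

simple = bic_log_evidence(-105.0, num_params=2, num_data=100)
complex_ = bic_log_evidence(-103.0, num_params=6, num_data=100)
```

Here the larger model fits slightly better but is penalized more for its extra parameters, so the simpler model scores higher.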

More accurate evidence approximation needed

Bayesian Logistic Regression

