On-line Learning with Passive-Aggressive Algorithms
Joseph Keshet, The Hebrew University
Learning Seminar, 2004
Supervised Learning Components
• Instance space X
• Label space Y
• Mappings from X to Y are called classifiers
• There exists an unknown target classifier f: X → Y
• Goal: produce a hypothesis that is a good approximation of the target f
Outline
• Binary classification
  – Problem setting
  – On-line algorithm
  – Mistake bound analysis
  – Kernels
• Regression
• Novelty detection / “one-class”
• Hierarchical classification
Binary Classification
• Input: examples (x_t, y_t), where x_t ∈ R^n and y_t ∈ {−1, +1}
• Restriction: linear classification functions, f(x) = sign(w · x)
• Goal: find w that attains small error
Online Learning
Initialize: w_1 = (0, …, 0)
For t = 1, 2, …
  Receive vector x_t
  Predict label ŷ_t = sign(w_t · x_t)
  Receive correct label y_t
  Suffer error if ŷ_t ≠ y_t
  Apply update rule to obtain w_{t+1}
Margin & Loss
• Margin: y (w · x)
• The binary (“0-1”) error is a combinatorial quantity and thus difficult to minimize directly
• Define the hinge loss instead: ℓ(w; (x, y)) = max{0, 1 − y (w · x)}
[Figure: hinge loss vs. the binary “0-1” error as a function of the margin]
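The margin and hinge loss above take only a few lines; a minimal sketch in plain Python (helper names are mine):

```python
def margin(w, x, y):
    """Signed margin y * (w . x): positive iff (x, y) is classified correctly."""
    return y * sum(wi * xi for wi, xi in zip(w, x))

def hinge_loss(w, x, y):
    """Hinge loss max(0, 1 - y (w . x)): zero only when the margin is at least 1."""
    return max(0.0, 1.0 - margin(w, x, y))
```

Note that the loss is zero not merely on correct classification, but only once the margin reaches 1, which is what drives the updates below.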
The Update Rule
w_{t+1} = argmin_w ½ ‖w − w_t‖²  s.t.  ℓ(w; (x_t, y_t)) = 0
• Classify the current example correctly
• Keep the new hyperplane close to the last one
The Update Rule
Solving the optimization gives the closed form
w_{t+1} = w_t + τ_t y_t x_t,  where τ_t = ℓ_t / ‖x_t‖² and ℓ_t = ℓ(w_t; (x_t, y_t))
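The closed-form step plugs directly into the online protocol; a sketch of the resulting passive-aggressive loop (plain Python, function names are mine, and the zero-vector input case is ignored):

```python
def pa_update(w, x, y):
    """One passive-aggressive step: w <- w + tau * y * x, tau = loss / ||x||^2."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - y * dot)
    if loss == 0.0:                              # passive: margin already >= 1
        return w
    tau = loss / sum(xi * xi for xi in x)        # aggressive: zero loss on (x, y)
    return [wi + tau * y * xi for wi, xi in zip(w, x)]

def pa_train(examples, dim):
    """Run the online loop over (x, y) pairs, counting prediction mistakes."""
    w, mistakes = [0.0] * dim, 0
    for x, y in examples:
        y_hat = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
        mistakes += (y_hat != y)
        w = pa_update(w, x, y)
    return w, mistakes
```

After each aggressive step the margin on the current example is exactly 1, i.e. its hinge loss is driven to zero, which is the constraint of the optimization above.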
Loss Bound Theorem
• Let (x_1, y_1), …, (x_T, y_T) be a sequence of examples with ‖x_t‖ ≤ R
• Assume there exists u that satisfies ℓ(u; (x_t, y_t)) = 0 for all t
• Then  Σ_t ℓ_t² ≤ R² ‖u‖²
where ℓ_t = ℓ(w_t; (x_t, y_t))
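The bound can be checked numerically: on a separable sequence the cumulative squared hinge loss of the algorithm stays below R² ‖u‖². A small self-check on synthetic data of my own construction (u = (2, −2) attains zero loss on all four examples):

```python
def run_pa(examples):
    """Accumulate the squared hinge losses of the PA algorithm over a sequence."""
    dim = len(examples[0][0])
    w, sq_losses = [0.0] * dim, 0.0
    for x, y in examples:
        dot = sum(wi * xi for wi, xi in zip(w, x))
        loss = max(0.0, 1.0 - y * dot)
        sq_losses += loss ** 2
        if loss > 0.0:
            tau = loss / sum(xi * xi for xi in x)
            w = [wi + tau * y * xi for wi, xi in zip(w, x)]
    return sq_losses

# u = (2, -2) classifies every example below with margin >= 1 (zero hinge loss)
data = [([1.0, 0.0], 1), ([0.0, 1.0], -1), ([0.5, -0.5], 1), ([-1.0, 1.0], -1)]
u = [2.0, -2.0]
R2 = max(sum(xi * xi for xi in x) for x, _ in data)   # R^2
bound = R2 * sum(ui * ui for ui in u)                 # R^2 * ||u||^2
```

On this sequence the algorithm's cumulative squared loss is 2.0, well under the bound of 16.0.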
The Non-Separable Case
Introduce a slack variable ξ ≥ 0:
w_{t+1} = argmin_w ½ ‖w − w_t‖² + C ξ  s.t.  ℓ(w; (x_t, y_t)) ≤ ξ
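With the slack term, the effect on the closed form is simply that the step size is capped at the aggressiveness parameter C (the PA-I variant); a one-line sketch:

```python
def pa1_tau(loss, x, C=1.0):
    """PA-I step size: the separable-case tau, clipped at the aggressiveness C."""
    return min(C, loss / sum(xi * xi for xi in x))
```

When the loss is small the update behaves exactly as before; large losses (likely noise) can move the hyperplane by at most C.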
The Update StoryThe Update Story
Correct classification-
No update
Misclassification Misclassification
The non-separable case
C
Kernels
• Since w_{t+1} = w_t + τ_t y_t x_t, the weight vector is a linear combination of the examples: w_t = Σ_{i<t} τ_i y_i x_i
• Note that predictions depend on the examples only through inner products: w_t · x = Σ_{i<t} τ_i y_i (x_i · x)
• Therefore each inner product can be replaced by a kernel K(x_i, x)
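Since w_t is a linear combination of past examples, the algorithm can be run entirely in the dual, storing coefficients instead of a weight vector; a sketch with an RBF kernel (helper names are mine):

```python
import math

def rbf(a, b, gamma=1.0):
    """Gaussian RBF kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def kernel_pa(examples, kernel=rbf):
    """Dual PA: keep (coefficient, example) pairs; score(x) = sum_i alpha_i K(x_i, x)."""
    support = []  # list of (alpha_i = tau_i * y_i, x_i)
    def score(x):
        return sum(a * kernel(xi, x) for a, xi in support)
    for x, y in examples:
        loss = max(0.0, 1.0 - y * score(x))
        if loss > 0.0:
            tau = loss / kernel(x, x)   # ||phi(x)||^2 = K(x, x)
            support.append((tau * y, x))
    return score
```

Only examples that suffered a positive loss enter the support set, so the prediction cost grows with the number of aggressive rounds, not with the stream length.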
On-line Regression
• Input: examples (x_t, y_t), where x_t ∈ R^n and y_t ∈ R
• Restriction: linear regression functions, f(x) = w · x
• Goal: find w that attains small discrepancy |w · x − y|
On-line Regression
• Define the ε-insensitive loss: ℓ(w; (x, y)) = max{0, |w · x − y| − ε}
• Update rule: w_{t+1} = w_t + sign(y_t − w_t · x_t) τ_t x_t, with τ_t = ℓ_t / ‖x_t‖²
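The regression update mirrors the classification one: it pushes the prediction back to the boundary of the ε-tube around the target. A sketch (function name is mine):

```python
def pa_regress_update(w, x, y, eps=0.1):
    """PA regression step for the eps-insensitive loss max(0, |w.x - y| - eps)."""
    pred = sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, abs(pred - y) - eps)
    if loss == 0.0:                      # prediction already inside the tube
        return w
    tau = loss / sum(xi * xi for xi in x)
    sign = 1.0 if y > pred else -1.0     # move the prediction toward y
    return [wi + sign * tau * xi for wi, xi in zip(w, x)]
```

After an aggressive step the new prediction sits exactly at distance ε from y, so the loss on the current example is driven to zero, just as in the classification case.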
On-line Novelty Detection
• Input: examples x_t ∈ R^n (no labels)
• Restriction: a single center point w ∈ R^n (“uniclass”)
• Goal: find w that is the center of the smallest ball enclosing the examples
On-line Novelty Detection
• Define the loss: ℓ(w; x) = max{0, ‖w − x‖ − ε}
• Update rule: move the center toward x_t by exactly the loss, w_{t+1} = w_t + ℓ_t (x_t − w_t) / ‖x_t − w_t‖
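The one-class update moves the center only when an example falls outside the ε-ball, and then just far enough that the example lands on its surface; a sketch:

```python
import math

def uniclass_update(w, x, eps=1.0):
    """Move center w toward x just enough that x lies on the eps-ball's surface."""
    dist = math.sqrt(sum((wi - xi) ** 2 for wi, xi in zip(w, x)))
    loss = max(0.0, dist - eps)
    if loss == 0.0:          # x is already inside the ball: passive
        return w
    step = loss / dist       # fraction of the way from w to x
    return [wi + step * (xi - wi) for wi, xi in zip(w, x)]
```

After an aggressive step, ‖w_{t+1} − x_t‖ = ε exactly, so the loss on the current example is zero.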
Hierarchical Classification
• Goal: spoken phoneme recognition
[Figure: phonetic hierarchy — PHONEMES splits into Sonorants, Silences, and Obstruents; Sonorants into Nasals (n, m, ng), Liquids (l, y, w, r), and Vowels (Front, Center, Back: iy, ih, ey, eh, ae, aa, ao, er, ay, aw, oy, ow, uh, uw); Obstruents into Plosives (b, g, d, k, p, t), Fricatives (f, v, sh, s, th, dh, zh, z), and Affricates (jh, ch)]
Metric Over Label Tree
• A given hierarchy induces a metric over the set of labels: the tree distance γ(a, b), the number of edges along the path between labels a and b
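The tree distance can be computed from parent pointers by walking both labels up to their lowest common ancestor; a small sketch (the parent map in the test is my own toy fragment of the phoneme tree):

```python
def tree_distance(a, b, parent):
    """Number of edges between labels a and b in the tree given by parent[node]."""
    def path_to_root(v):
        path = [v]
        while v in parent:
            v = parent[v]
            path.append(v)
        return path
    pa, pb = path_to_root(a), path_to_root(b)
    ancestors = set(pa)
    # walk b upward until we hit an ancestor of a (the lowest common ancestor)
    for depth_b, v in enumerate(pb):
        if v in ancestors:
            return pa.index(v) + depth_b
    raise ValueError("labels are not in the same tree")
```

Siblings are at distance 2; labels in distant subtrees pay the full path through their common ancestor.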
Metric Over Labels
• Metric semantics: γ(a, b) is the severity of predicting label “b” instead of the correct label “a”
• Our high-level goal: tolerate minor errors …
  – Sibling errors
  – Under-confident predictions (predicting a parent)
  … but avoid major errors
Hierarchical Classifier
• Assume instances x ∈ R^n and labels that are nodes of a tree
• Associate a prototype W_v ∈ R^n with each label v
• Define the score of label v as the sum Σ_{u ∈ path(v)} W_u · x over the path from the root to v
• Classify by: ŷ = argmax_v Σ_{u ∈ path(v)} W_u · x
[Figure: label tree with prototypes W_0, …, W_10]
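Classification by path sums can be sketched directly (the prototype and parent maps in the test are my own toy tree):

```python
def path_to_root(v, parent):
    """Labels on the path from v up to the root (inclusive)."""
    path = [v]
    while v in parent:
        v = parent[v]
        path.append(v)
    return path

def classify(x, labels, prototypes, parent):
    """Predict the label whose root-to-label path has the largest summed score."""
    def score(v):
        return sum(
            sum(wi * xi for wi, xi in zip(prototypes[u], x))
            for u in path_to_root(v, parent)
        )
    return max(labels, key=score)
```

Because a label's score includes every ancestor's prototype, nearby labels share most of their score, which is what lets the hierarchy bias mistakes toward nearby labels.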
Hierarchical Classifier
• Goal: keep each prototype W_v close to that of its parent
• Define the difference vectors w_v = W_v − W_{parent(v)}
• Then W_v = Σ_{u ∈ path(v)} w_u
• Goal: keep each ‖w_v‖ small
[Figure: label tree with difference vectors w_0, …, w_10]
Online Learning
For t = 1, 2, …
  Receive instance x_t
  Predict label ŷ_t = argmax_v Σ_{u ∈ path(v)} w_u · x_t
  Receive correct label y_t
  Suffer tree-based penalty γ(y_t, ŷ_t)
  Apply update rule to obtain the new set {w_v}
Goal: suffer a small cumulative tree error Σ_t γ(y_t, ŷ_t)
Tree Loss
• The tree error γ(y_t, ŷ_t) is difficult to minimize directly
• Instead, upper-bound it by ℓ_t², where
  ℓ_t = max{0, √γ(y_t, ŷ_t) − (Σ_{u ∈ path(y_t)} w_u · x_t − Σ_{u ∈ path(ŷ_t)} w_u · x_t)}
This ℓ_t is the tree loss: whenever ŷ_t ≠ y_t the score gap is non-positive, so ℓ_t ≥ √γ(y_t, ŷ_t) and ℓ_t² ≥ γ(y_t, ŷ_t)
The Update Rule
• Add τ_t x_t to w_u for every u ∈ path(y_t) \ path(ŷ_t), and subtract τ_t x_t from w_u for every u ∈ path(ŷ_t) \ path(y_t), where τ_t is the step size obtained from the passive-aggressive optimization
• Local update: only nodes along the path from y_t to ŷ_t are updated
[Figure: label tree with the updated nodes highlighted]
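In code, the local update touches only the symmetric difference of the two paths: nodes shared by both paths would receive +τx and −τx and so are left alone. A sketch, taking τ as a given parameter (the path walk and names are mine):

```python
def hieron_update(w, x, y, y_hat, parent, tau):
    """Update difference vectors: +tau*x along path(y), -tau*x along path(y_hat).
    Nodes shared by both paths receive no net change and are skipped."""
    def path_to_root(v):
        path = [v]
        while v in parent:
            v = parent[v]
            path.append(v)
        return path
    shared = set(path_to_root(y)) & set(path_to_root(y_hat))
    for u in path_to_root(y):
        if u not in shared:
            w[u] = [wi + tau * xi for wi, xi in zip(w[u], x)]
    for u in path_to_root(y_hat):
        if u not in shared:
            w[u] = [wi - tau * xi for wi, xi in zip(w[u], x)]
    return w
```

The number of vectors updated equals the tree distance γ(y_t, ŷ_t), so updates stay cheap even in large label trees.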
Loss Bound Theorem
• Let (x_1, y_1), …, (x_T, y_T) be a sequence of examples with ‖x_t‖ ≤ R
• Assume there exists a set of vectors {u_v} that attains zero tree loss on every example
• Then the cumulative tree error satisfies Σ_t γ(y_t, ŷ_t) ≤ Σ_t ℓ_t², which is bounded in terms of R and Σ_v ‖u_v‖²
Extension: Kernels
• Since each w_v is a linear combination of the instances seen so far
• Note that predictions depend on the instances only through inner products
• Therefore each inner product can be replaced by a kernel K(·, ·)
Experiments
• Synthetic data: depth-4 tree, 121 labels; generated using an orthogonal set with Gaussian noise (variance 0.16); 100 train and 50 test instances per label
• Phoneme recognizer: 41 phonemes taken from the TIMIT corpus; MFCC+∆+∆∆ front-end, concatenation of 5 frames, RBF kernel; 2000 train and 500 test vectors per phoneme
Experiments
• Flat: ignore the hierarchy and solve as a single multiclass problem
• Greedy: solve a separate multiclass problem at each node with at least 2 children
[Figure: flat vs. greedy classifier schematics]
Results

                          Averaged tree error   Multiclass error (%)
Synthetic data (tree)            0.05                   5
Synthetic data (flat)            0.11                   8.6
Synthetic data (greedy)          0.52                  34.9
Phonemes (tree)                  1.3                   40.6
Phonemes (flat)                  1.41                  41.8
Phonemes (greedy)                2.48                  58.2
Results
• Difference between the error rates of Hieron and the flat multiclass classifier
[Figure: histograms of (Hieron error − Flat error) as a function of tree distance, for synthetic data and phonemes; Hieron makes fewer gross errors at the price of more minor errors]
Hierarchy vs. Flat
• Similarity between the learned prototypes
[Figure/animation: visualization of the prototype similarities under hierarchical vs. flat training]