
Page 1: Ensemble Learning - University of Washington · 2020. 2. 19. · Ensemble Learning Consider a set of classifiers h 1, ..., h L Idea: construct a classifier H(x)that combines the individual

Ensemble Learning

Robot Image Credit: Viktoriya Sukhanova © 123RF.com

These slides were assembled by Byron Boots based on the slides assembled by Eric Eaton, with grateful acknowledgement of the many others who made their course materials freely available online. Feel free to reuse or adapt these slides for your own academic purposes, provided that you include proper attribution.

Page 2: Ensemble Learning

Ensemble Learning

Consider a set of classifiers h1, ..., hL

Idea: construct a classifier H(x) that combines the individual decisions of h1, ..., hL

• e.g., could have the member classifiers vote, or
• e.g., could use different members for different regions of the instance space

[Based on slide by Léon Bottou]

Page 3: Ensemble Learning

Ensemble Learning

Consider a set of classifiers h1, ..., hL

Idea: construct a classifier H(x) that combines the individual decisions of h1, ..., hL

• e.g., could have the member classifiers vote, or
• e.g., could use different members for different regions of the instance space

Successful ensembles require diversity
• Classifiers should make different mistakes
• Can have different types of base learners

[Based on slide by Léon Bottou]

Page 4: Ensemble Learning

Combining Classifiers: Averaging

• Final hypothesis is a simple vote of the members


[Figure: input x is fed to members h1, h2, …, hL, whose votes combine into H(x)]
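The simple vote can be sketched in a few lines (a toy illustration, not from the slides; the threshold members are hypothetical):

```python
# Plurality vote over member classifiers h_1, ..., h_L for one input x.
from collections import Counter

def majority_vote(members, x):
    """Return the label predicted by the most members."""
    votes = Counter(h(x) for h in members)
    return votes.most_common(1)[0][0]

# Toy members: three threshold classifiers on a scalar input.
members = [lambda x: 1 if x > 0 else -1,
           lambda x: 1 if x > 1 else -1,
           lambda x: 1 if x > -1 else -1]
print(majority_vote(members, 0.5))  # two of three members vote +1
```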

Page 5: Ensemble Learning

Combining Classifiers: Weighted Average

• Coefficients of individual members are trained using a validation set


[Figure: input x feeds h1, h2, …, hL; their outputs, weighted by w1, w2, …, wL, combine into H(x)]
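A weighted average only changes the combination step (sketch; the weights here are fixed by hand, whereas the slide trains them on a validation set):

```python
# Weighted vote: each member's output is scaled by a coefficient w_l.
def weighted_vote(members, weights, x):
    score = sum(w * h(x) for w, h in zip(weights, members))
    return 1 if score >= 0 else -1

members = [lambda x: 1 if x > 0 else -1,
           lambda x: 1 if x > 1 else -1,
           lambda x: 1 if x > -1 else -1]
weights = [0.5, 0.2, 0.3]   # hypothetical validation-tuned weights
print(weighted_vote(members, weights, 0.5))  # 0.5 - 0.2 + 0.3 > 0 → +1
```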

Page 6: Ensemble Learning

Combining Classifiers: Gating

• Coefficients of individual members depend on input

• Train gating function via validation set

[Figure: input x feeds h1, h2, …, hL and a gating function that outputs input-dependent weights w1, w2, …, wL; the weighted outputs combine into H(x)]
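A minimal gating sketch, assuming a softmax over hand-made nearness scores (a real gating function would be trained on validation data, per the slide):

```python
# Gating: the member weights depend on the input x.
import math

def gate(x, centers=(-1.0, 0.0, 1.0)):
    # Hypothetical scores: prefer the expert whose "region" is nearest x.
    scores = [-abs(x - c) for c in centers]
    exps = [math.exp(s) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def gated_predict(members, x):
    score = sum(w * h(x) for w, h in zip(gate(x), members))
    return 1 if score >= 0 else -1

members = [lambda x: -1,                 # expert for the left region
           lambda x: 1 if x > 0 else -1, # expert for the middle
           lambda x: 1]                  # expert for the right region
print(gated_predict(members, 2.0))  # right expert gets the largest weight → +1
```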

Page 7: Ensemble Learning


Combining Classifiers: Stacking

• Predictions of 1st layer used as input to 2nd layer

• Train 2nd layer on validation set

[Figure: 1st layer h1, h2, …, hL takes input x; 2nd layer maps their predictions to H(x)]
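Stacking in sketch form: 1st-layer predictions become the feature vector for a 2nd-layer model. The meta-rule below (sign of the mean) is a stand-in for a 2nd-layer model that would actually be fit on a validation split:

```python
# Build meta-features from member predictions, then apply a 2nd layer.
def stack_features(members, xs):
    """One row of member predictions per input."""
    return [[h(x) for h in members] for x in xs]

def meta_predict(row):
    # Placeholder 2nd layer; in practice this model is trained on
    # validation-set predictions to avoid overfitting the 1st layer.
    return 1 if sum(row) >= 0 else -1

members = [lambda x: 1 if x > 0 else -1,
           lambda x: 1 if x > 1 else -1]
rows = stack_features(members, [0.5, 2.0, -1.0])
print([meta_predict(r) for r in rows])  # [1, 1, -1]
```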

Page 8: Ensemble Learning

How to Achieve Diversity

Cause of the Mistake          Diversification Strategy
Pattern was difficult         We will come back to this…
Overfitting                   Vary the training sets
Some features are noisy       Vary the set of input features

[Based on slide by Léon Bottou]

Page 9: Ensemble Learning

Manipulating the Training Data

Bootstrap replication:
• Given n training examples, construct a new training set by sampling n instances with replacement
• Excludes ~37% of the training instances on average (each example is omitted with probability (1 − 1/n)^n ≈ 1/e)
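The ~37% figure follows from a one-line calculation: a given example is missed on every one of the n draws with probability (1 − 1/n)^n, which approaches 1/e as n grows.

```python
# Probability that a fixed example never appears in a bootstrap
# replicate of size n, compared against the limit 1/e.
import math

n = 1000
p_excluded = (1 - 1 / n) ** n
print(round(p_excluded, 3), round(1 / math.e, 3))  # both ≈ 0.368
```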

Bootstrap aggregating (bagging):
• Create bootstrap replicates of the training set
• Train a classifier (e.g., a decision tree) on each replicate
• Estimate classifier performance using out-of-bootstrap data
• Average the output of all classifiers

Boosting: (in just a minute...)

[Based on slide by Léon Bottou]
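A minimal bagging sketch; the stump base learner and the toy data are my own illustration, and any fit/predict learner would do in its place:

```python
# Bagging: train one model per bootstrap replicate, then vote.
import random
import statistics

def bootstrap(data, rng):
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]

def train_stump(sample):
    # Toy base learner: threshold at the mean input of the sample.
    thr = statistics.mean(x for x, _ in sample)
    return lambda x: 1 if x > thr else -1

def bagged_predict(models, x):
    return 1 if sum(m(x) for m in models) >= 0 else -1

data = [(-2, -1), (-1, -1), (1, 1), (2, 1)]
rng = random.Random(0)
models = [train_stump(bootstrap(data, rng)) for _ in range(25)]
# Every stump's threshold lies in [-2, 2], so the vote is unanimous here:
print(bagged_predict(models, 3.0), bagged_predict(models, -3.0))  # 1 -1
```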

Page 10: Ensemble Learning

Manipulating the Features

Random Forests
• Construct decision trees on bootstrap replicas
  – Restrict the node decisions to a small subset of features, picked randomly for each node
• Do not prune the trees
  – Estimate tree performance on out-of-bootstrap data
• Average the output of all trees (or choose the mode decision)

[Based on slide by Léon Bottou]
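The random-forest twist on bagging, in isolation: at each node only a random subset of k features competes for the split. The per-feature scores below are made up; a real tree would compute an impurity gain for each candidate.

```python
# Node-level feature subsampling, the key extra ingredient of
# random forests on top of bagging.
import random

def best_split_feature(feature_scores, k, rng):
    """Best-scoring feature among k randomly chosen candidates."""
    candidates = rng.sample(range(len(feature_scores)), k)
    return max(candidates, key=lambda f: feature_scores[f])

scores = [0.10, 0.90, 0.30, 0.70]   # hypothetical split gains
rng = random.Random(0)
print(best_split_feature(scores, k=2, rng=rng))
```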

Page 11: Ensemble Learning

Adaptive Boosting (AdaBoost)


Page 12: Ensemble Learning

AdaBoost [Freund & Schapire, 1997]

• A meta-learning algorithm with great theoretical and empirical performance

• Turns a base learner (i.e., a “weak hypothesis”) into a high-performance classifier

• Creates an ensemble of weak hypotheses by repeatedly emphasizing mispredicted instances


Page 13: Ensemble Learning

INPUT: training data X, y = {(x_i, y_i)}_{i=1}^{n}, the number of iterations T

1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:   Train model h_t on X, y with instance weights w_t
4:   Compute the weighted training error rate of h_t:
       ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}
5:   Choose α_t = (1/2) ln((1 − ε_t) / ε_t)
6:   Update all instance weights:
       w_{t+1,i} = w_{t,i} exp(−α_t y_i h_t(x_i))   ∀ i = 1, ..., n
7:   Normalize w_{t+1} to be a distribution:
       w_{t+1,i} = w_{t+1,i} / Σ_{j=1}^{n} w_{t+1,j}   ∀ i = 1, ..., n
8: end for
9: Return the hypothesis
     H(x) = sign( Σ_{t=1}^{T} α_t h_t(x) )

AdaBoost

• Size of point represents the instance’s weight

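The AdaBoost pseudocode translates almost line for line into Python. This sketch uses 1-D decision stumps as the weak learner; the stump and the toy data are my own illustration, not from the slides:

```python
# AdaBoost with threshold-stump weak learners on scalar inputs.
import math

def train_stump(X, y, w):
    """Weighted-error-minimizing threshold stump."""
    best = None
    for thr in sorted(set(X)):
        for sign in (1, -1):
            pred = [sign if x > thr else -sign for x in X]
            err = sum(wi for wi, yi, pi in zip(w, y, pred) if yi != pi)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    err, thr, sign = best
    return (lambda x, t=thr, s=sign: s if x > t else -s), err

def adaboost(X, y, T):
    n = len(X)
    w = [1.0 / n] * n                            # line 1: uniform weights
    ensemble = []
    for _ in range(T):                           # line 2
        h, eps = train_stump(X, y, w)            # lines 3-4
        eps = max(eps, 1e-12)                    # guard against log/div by zero
        alpha = 0.5 * math.log((1 - eps) / eps)  # line 5
        w = [wi * math.exp(-alpha * yi * h(xi))  # line 6: reweight instances
             for wi, xi, yi in zip(w, X, y)]
        z = sum(w)
        w = [wi / z for wi in w]                 # line 7: normalize
        ensemble.append((alpha, h))
    def H(x):                                    # line 9: weighted majority
        return 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
    return H

X = [-3, -1, 1, 3]
y = [-1, -1, 1, 1]
H = adaboost(X, y, T=5)
print([H(x) for x in X])  # [-1, -1, 1, 1]: all training points correct
```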

Page 14: Ensemble Learning

[AdaBoost pseudocode repeated from the previous slide]

AdaBoost

[Figure: AdaBoost example, iteration t = 1]

Page 15: Ensemble Learning

[AdaBoost pseudocode repeated from the previous slide]

AdaBoost

[Figure: AdaBoost example, iteration t = 1, after reweighting the + and − instances]

Page 16: Ensemble Learning

INPUT: training data X, y = {(xi, yi)}ni=1,the number of iterations T

1: Initialize a vector of n uniform weights w1 =⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,the number of iterations T

1: Initialize a vector of n uniform weights w1 =⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,the number of iterations T

1: Initialize a vector of n uniform weights w1 =⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,the number of iterations T

1: Initialize a vector of n uniform weights w1 =⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

AdaBoost

• βt measures the importance of ht

• If εt ≤ 0.5, then βt ≥ 0 (and εt ≤ 0.5 is trivially guaranteed, since a classifier worse than chance can be flipped)
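A quick numeric check of the step-5 formula βt = (1/2) ln((1 − εt)/εt) makes both bullets concrete: βt is exactly 0 at εt = 0.5 (a weak learner no better than chance gets no say) and grows as εt shrinks:

```python
import math

def beta(eps):
    # step 5 of AdaBoost: beta_t = 0.5 * ln((1 - eps_t) / eps_t)
    return 0.5 * math.log((1 - eps) / eps)

print(beta(0.5))   # 0.0 -- chance-level learner gets zero weight
print(beta(0.3))   # ~0.42
print(beta(0.1))   # ~1.10 -- more accurate learners get more weight
```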

Page 17: Ensemble Learning

AdaBoost

• βt measures the importance of ht

• If εt ≤ 0.5, then βt ≥ 0 (and βt grows as εt gets smaller)

Page 18: Ensemble Learning

AdaBoost

• Weights of correct predictions are multiplied by e^(−βt) ≤ 1

• Weights of incorrect predictions are multiplied by e^(βt) ≥ 1
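These two multipliers can be verified directly from the step-6 update: since yi ht(xi) = +1 on correct predictions and −1 on mistakes, the update scales them by e^(−βt) and e^(βt) respectively. The numbers below (four equally weighted points, one misclassified, so εt = 0.25) are an illustrative sketch; note that after renormalization the misclassified points always carry exactly half the total weight:

```python
import numpy as np

eps = 0.25                                  # weighted error of h_t on these weights
beta = 0.5 * np.log((1 - eps) / eps)        # step 5
w = np.array([0.25, 0.25, 0.25, 0.25])      # current distribution w_t
margin = np.array([1, 1, 1, -1])            # y_i * h_t(x_i); last point misclassified

w_new = w * np.exp(-beta * margin)          # step 6: correct shrink, incorrect grow
w_new = w_new / w_new.sum()                 # step 7: renormalize
print(w_new)  # the misclassified point now carries exactly half the total weight
```

This rebalancing is why the next weak learner ht+1 is forced to focus on the examples ht got wrong.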

Page 19: Ensemble Learning

AdaBoost

• Weights of correct predictions are multiplied by e^(−βt) ≤ 1

• Weights of incorrect predictions are multiplied by e^(βt) ≥ 1

Page 20: Ensemble Learning

INPUT: training data X, y = {(xi, yi)}ni=1,the number of iterations T

1: Initialize a vector of n uniform weights w1 =⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,the number of iterations T

1: Initialize a vector of n uniform weights w1 =⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

AdaBoost example, t = 1 (slide 20): [figure: toy dataset of + and – points, resized according to their weights]
Disclaimer: Note that resized points in the illustration above are not necessarily to scale with β_t
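To make the reweighting in the illustration concrete, here is a single round of steps 4–7 on five toy points; the labels and the single mistake are invented for illustration:

```python
import numpy as np

y    = np.array([ 1,  1, -1, -1,  1])   # true labels (±1)
pred = np.array([ 1,  1, -1, -1, -1])   # h_t's outputs: last point misclassified
w    = np.full(5, 0.2)                  # current weights (uniform)

eps  = w[pred != y].sum()               # step 4: weighted error = 0.2
beta = 0.5 * np.log((1 - eps) / eps)    # step 5: 0.5 * ln(4) ≈ 0.693
w    = w * np.exp(-beta * y * pred)     # step 6: mistake grows by e^beta = 2,
                                        #         correct points shrink by 1/2
w    = w / w.sum()                      # step 7: renormalize to a distribution
print(w)  # misclassified point now carries weight 0.5, the rest 0.125 each
```

This is exactly why the misclassified points are drawn larger in the next frame: the next weak learner is forced to pay more attention to them.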


AdaBoost example, t = 2 (slide 21): [figure: toy dataset of + and – points after reweighting]


AdaBoost example, t = 2 (slide 22): [figure: toy dataset of + and – points after reweighting]

Training a Model with Weighted Instances (slide 23)

• For algorithms like logistic regression, we can simply incorporate the weights w into the cost function
  – Essentially, this weighs the cost of misclassification differently for each instance:

    J_reg(θ) = −Σ_{i=1}^n w_i [ y_i log h_θ(x_i) + (1 − y_i) log(1 − h_θ(x_i)) ] + λ ‖θ_{[1:d]}‖₂²

• For algorithms that don't directly support instance weights (e.g., ID3 decision trees), use weighted bootstrap sampling
  – Form the training set by resampling instances with replacement according to w
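Both options above can be sketched in a few lines of NumPy. The function names and the tiny dataset are illustrative, not from the slides; labels are 0/1 as in the logistic cost, and the bias θ_0 is excluded from the penalty as in J_reg:

```python
import numpy as np

def weighted_logistic_cost(theta, X, y, w, lam=0.1):
    """Instance-weighted regularized logistic cost J_reg from the slide:
    each example's log-loss is scaled by its weight w_i, and the bias
    theta[0] is left out of the L2 penalty (theta_[1:d])."""
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))   # h_theta(x_i)
    eps = 1e-12                              # guard against log(0)
    ll = y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps)
    return -(w * ll).sum() + lam * (theta[1:] ** 2).sum()

def weighted_bootstrap(X, y, w, rng):
    """For learners without native instance weights (e.g. ID3): draw n
    instances with replacement, with probability proportional to w."""
    idx = rng.choice(len(y), size=len(y), replace=True, p=w / w.sum())
    return X[idx], y[idx]

X = np.hstack([np.ones((4, 1)), np.arange(4.0).reshape(-1, 1)])  # bias column + 1 feature
y = np.array([0.0, 0.0, 1.0, 1.0])
w = np.array([0.1, 0.1, 0.4, 0.4])                               # weights sum to 1

print(weighted_logistic_cost(np.zeros(2), X, y, w))  # → ln 2 ≈ 0.693 (h = 0.5 everywhere)
Xb, yb = weighted_bootstrap(X, y, w, np.random.default_rng(0))
```

With θ = 0 every prediction is 0.5, so the weighted log-loss reduces to (Σ w_i)·ln 2 = ln 2, which is a quick sanity check on the implementation.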


AdaBoost example, t = 2 (slide 24): [figure: toy dataset of + and – points after reweighting]

+ –

INPUT: training data X, y = {(xi, yi)}ni=1,the number of iterations T

1: Initialize a vector of n uniform weights w1 =⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,the number of iterations T

1: Initialize a vector of n uniform weights w1 =⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,the number of iterations T

1: Initialize a vector of n uniform weights w1 =⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,the number of iterations T

1: Initialize a vector of n uniform weights w1 =⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,the number of iterations T

1: Initialize a vector of n uniform weights w1 =⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,the number of iterations T

1: Initialize a vector of n uniform weights w1 =⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!


AdaBoost

t = 2

• β_t measures the importance of h_t
• If ε_t < 0.5, then β_t > 0  (β_t grows as ε_t gets smaller)

25
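Both bullets follow directly from the formula in step 5, β_t = ½ ln((1 − ε_t)/ε_t), which can be checked numerically (the error rates below are arbitrary examples):

```python
# beta_t = (1/2) * ln((1 - eps)/eps) from step 5:
# zero for a coin-flip learner, and it grows as the error shrinks.
import math

def beta(eps):
    return 0.5 * math.log((1 - eps) / eps)

assert beta(0.5) == 0.0                          # random guessing gets no vote
assert 0 < beta(0.4) < beta(0.1)                 # smaller error, bigger vote
assert beta(0.6) < 0                             # worse than chance: negative vote
```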

Page 26: Ensemble Learning


AdaBoost

t = 2

• Weights of correct predictions are multiplied by e^{−β_t} ≤ 1
• Weights of incorrect predictions are multiplied by e^{β_t} ≥ 1

26
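One round of the update (steps 6–7) makes these multipliers concrete. The weights below are made-up numbers for illustration; a useful consequence worth noticing is that after normalization, the examples h_t got wrong carry exactly half of the total weight, so the next weak learner must do better than chance on them.

```python
# One AdaBoost weight update (steps 6-7) on made-up numbers:
# correct examples shrink by e^{-beta_t}, mistakes grow by e^{beta_t},
# and after normalizing, misclassified examples hold half the mass.
import math

w = [0.1, 0.2, 0.3, 0.4]              # current distribution w_t (sums to 1)
correct = [True, True, False, True]   # did h_t classify example i correctly?

eps = sum(wi for wi, c in zip(w, correct) if not c)    # weighted error = 0.3
beta = 0.5 * math.log((1 - eps) / eps)

w_new = [wi * math.exp(-beta if c else beta) for wi, c in zip(w, correct)]
z = sum(w_new)
w_new = [wi / z for wi in w_new]      # step 7: renormalize

wrong_mass = sum(wi for wi, c in zip(w_new, correct) if not c)
assert abs(wrong_mass - 0.5) < 1e-9   # misclassified set now weighs 1/2
```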

Page 27: Ensemble Learning


AdaBoost

t = 2

• Weights of correct predictions are multiplied by e^{−β_t} ≤ 1
• Weights of incorrect predictions are multiplied by e^{β_t} ≥ 1

27

Page 28: Ensemble Learning


AdaBoost

t = 3

28

Page 29: Ensemble Learning


AdaBoost
t = 3
[Figure: weighted training points (+ / –) and the weak learner chosen at round 3]

29

Page 30


AdaBoost
t = 3
[Figure: weighted training points (+ / –) after round 3]

• β_t measures the importance of h_t
• If ε_t ≤ 0.5, then β_t ≥ 0 (β_t grows as ε_t gets smaller)

30
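The stated relationship between ε_t and β_t can be checked numerically. A small sketch that just evaluates the formula from step 5 of the algorithm:

```python
import math

def beta(eps):
    # Step 5 of AdaBoost: beta_t = (1/2) ln((1 - eps_t) / eps_t)
    return 0.5 * math.log((1 - eps) / eps)

print(beta(0.5))   # 0.0: a coin-flip weak learner gets zero weight
print(beta(0.25))  # ~0.549: better than chance, positive weight
print(beta(0.1))   # ~1.099: beta_t grows as eps_t shrinks
```

At ε_t = 0.5 the vote weight is exactly zero, and it increases monotonically as the weak learner's weighted error drops below chance.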

Page 31


AdaBoost
t = 3
[Figure: reweighted training points (+ / –) after round 3]

• Weights of correct predictions are multiplied by e^{−β_t} ≤ 1
• Weights of incorrect predictions are multiplied by e^{β_t} ≥ 1

31
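The two multipliers above follow directly from the update in step 6, since y_i h_t(x_i) = +1 on a correct prediction and −1 on a mistake. A quick sketch (β_t = 0.8 is an arbitrary illustrative value, not from the slides):

```python
import math

beta_t = 0.8  # any beta_t >= 0; hypothetical value for illustration

# Step 6: w_{t+1,i} = w_{t,i} * exp(-beta_t * y_i * h_t(x_i))
correct_factor = math.exp(-beta_t * (+1))    # y_i h_t(x_i) = +1
incorrect_factor = math.exp(-beta_t * (-1))  # y_i h_t(x_i) = -1

assert correct_factor <= 1.0    # e^{-beta_t} <= 1: downweighted
assert incorrect_factor >= 1.0  # e^{+beta_t} >= 1: upweighted
```

So each round shifts probability mass toward the examples the current weak learner got wrong, which is what forces the next weak learner to focus on them.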

Page 32


AdaBoost
t = 3
[Figure: reweighted training points (+ / –) after round 3]

• Weights of correct predictions are multiplied by e^{−β_t} ≤ 1
• Weights of incorrect predictions are multiplied by e^{β_t} ≥ 1

32

Page 33


AdaBoost
t = 4
[Figure: weighted training points (+ / –) and the weak learner chosen at round 4]

33

Page 34

AdaBoost
t = 4

[Figure: training points (+/–) and the combined decision boundary at round t = 4]

INPUT: training data X, y = {(x_i, y_i)}_{i=1}^n, the number of iterations T
1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:   Train model h_t on X, y with instance weights w_t
4:   Compute the weighted training error rate of h_t:
       ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}
5:   Choose β_t = (1/2) ln((1 − ε_t) / ε_t)
6:   Update all instance weights:
       w_{t+1,i} = w_{t,i} exp(−β_t y_i h_t(x_i))   ∀ i = 1, ..., n
7:   Normalize w_{t+1} to be a distribution:
       w_{t+1,i} = w_{t+1,i} / (Σ_{j=1}^n w_{t+1,j})   ∀ i = 1, ..., n
8: end for
9: Return the hypothesis
       H(x) = sign(Σ_{t=1}^T β_t h_t(x))

34

Page 35: Ensemble Learning

AdaBoost
t = 4

[Figure: training points (+/–) and the combined decision boundary at round t = 4]

INPUT: training data X, y = {(x_i, y_i)}_{i=1}^n, the number of iterations T
1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:   Train model h_t on X, y with instance weights w_t
4:   Compute the weighted training error rate of h_t:
       ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}
5:   Choose β_t = (1/2) ln((1 − ε_t) / ε_t)
6:   Update all instance weights:
       w_{t+1,i} = w_{t,i} exp(−β_t y_i h_t(x_i))   ∀ i = 1, ..., n
7:   Normalize w_{t+1} to be a distribution:
       w_{t+1,i} = w_{t+1,i} / (Σ_{j=1}^n w_{t+1,j})   ∀ i = 1, ..., n
8: end for
9: Return the hypothesis
       H(x) = sign(Σ_{t=1}^T β_t h_t(x))

35
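Steps 4–7 of the algorithm box are easy to follow on made-up numbers. Below is a hypothetical single round (not from the slides): n = 4 points start with uniform weights and h_t misclassifies exactly one of them.

```python
import math

w = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]   # h_t is wrong on the last point

# Step 4: weighted error = total weight on the misclassified points
eps = sum(wi for wi, c in zip(w, correct) if not c)           # 0.25

# Step 5: beta_t = 1/2 ln((1 - eps)/eps) = 1/2 ln 3
beta = 0.5 * math.log((1 - eps) / eps)

# Step 6: shrink correct points by exp(-beta), grow the wrong one by exp(+beta)
w_new = [wi * math.exp(-beta if c else beta) for wi, c in zip(w, correct)]

# Step 7: renormalize to a distribution
total = sum(w_new)
w_new = [wi / total for wi in w_new]

print(round(eps, 2), [round(wi, 3) for wi in w_new])
```

After one round the misclassified point carries half of the total weight (0.5 vs. 1/6 each for the others), so the next weak learner is strongly encouraged to get it right.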

Page 36: Ensemble Learning

AdaBoost
t = T

[Figure: training points (+/–) and the final ensemble's decision boundary at round t = T]

INPUT: training data X, y = {(x_i, y_i)}_{i=1}^n, the number of iterations T
1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:   Train model h_t on X, y with instance weights w_t
4:   Compute the weighted training error rate of h_t:
       ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}
5:   Choose β_t = (1/2) ln((1 − ε_t) / ε_t)
6:   Update all instance weights:
       w_{t+1,i} = w_{t,i} exp(−β_t y_i h_t(x_i))   ∀ i = 1, ..., n
7:   Normalize w_{t+1} to be a distribution:
       w_{t+1,i} = w_{t+1,i} / (Σ_{j=1}^n w_{t+1,j})   ∀ i = 1, ..., n
8: end for
9: Return the hypothesis
       H(x) = sign(Σ_{t=1}^T β_t h_t(x))

36

Page 37: Ensemble Learning

AdaBoost
t = T

[Figure: training points (+/–) and the final ensemble's decision boundary at round t = T]

INPUT: training data X, y = {(x_i, y_i)}_{i=1}^n, the number of iterations T
1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:   Train model h_t on X, y with instance weights w_t
4:   Compute the weighted training error rate of h_t:
       ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}
5:   Choose β_t = (1/2) ln((1 − ε_t) / ε_t)
6:   Update all instance weights:
       w_{t+1,i} = w_{t,i} exp(−β_t y_i h_t(x_i))   ∀ i = 1, ..., n
7:   Normalize w_{t+1} to be a distribution:
       w_{t+1,i} = w_{t+1,i} / (Σ_{j=1}^n w_{t+1,j})   ∀ i = 1, ..., n
8: end for
9: Return the hypothesis
       H(x) = sign(Σ_{t=1}^T β_t h_t(x))

37

Page 38: Ensemble Learning

+

INPUT: training data X, y = {(xi, yi)}ni=1,the number of iterations T

1: Initialize a vector of n uniform weights w1 =⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!


AdaBoost

[Figure: learned decision boundary on the toy dataset after the final iteration, t = T]

• Final model is a weighted combination of members
  – Each member is weighted by its importance
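The importance weight α_t from step 5 grows as a member's weighted error ε_t shrinks. A quick numerical check of the formula (the function name `alpha` is ours):

```python
import math

def alpha(eps):
    # alpha_t = (1/2) ln((1 - eps_t) / eps_t)
    return 0.5 * math.log((1 - eps) / eps)

print(round(alpha(0.5), 3))   # 0.0  -> chance-level member gets zero weight
print(round(alpha(0.25), 3))  # 0.549
print(round(alpha(0.1), 3))   # 1.099 -> more accurate members count more
```

Members with ε_t > 0.5 would receive negative weight, i.e. their votes are flipped.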
