
Active learning: Scenarios and techniques



Page 1: Active learning: Scenarios and techniques

ACTIVE LEARNING: Scenarios and techniques

ABBASSI SABER [email protected]

Page 2: Active learning: Scenarios and techniques

Outline

■ Scenarios

■ Query Strategy Frameworks (techniques)

Page 3: Active learning: Scenarios and techniques

Scenarios

■ Membership Query Synthesis

■ Stream-Based Selective Sampling

■ Pool-Based Sampling

Page 4: Active learning: Scenarios and techniques

Scenarios

Page 5: Active learning: Scenarios and techniques

Scenarios

■ Membership Query Synthesis

Page 6: Active learning: Scenarios and techniques

Scenarios

■ Stream-Based Selective Sampling

For each instance drawn one at a time from the stream, the learner decides: query its label, or discard it? (Yes / No)

Page 7: Active learning: Scenarios and techniques

Scenarios

■ Pool-Based Sampling
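As a rough sketch of the pool-based scenario (the nearest-centroid "model", the data, and the oracle below are all illustrative assumptions, not from the slides), each round the learner trains on the labeled set, scores the whole pool, queries one instance, and moves it from the pool to the labeled set:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a large unlabeled pool U and a tiny labeled seed set L.
pool = rng.normal(size=(100, 2))
labeled_X = np.array([[-2.0, 0.0], [2.0, 0.0]])
labeled_y = np.array([0, 1])

def predict_proba(X, centroids):
    """Stand-in model: softmax over negative distances to class centroids."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    e = np.exp(-d)
    return e / e.sum(axis=1, keepdims=True)

for _ in range(5):  # five query rounds
    centroids = np.array([labeled_X[labeled_y == c].mean(axis=0) for c in (0, 1)])
    proba = predict_proba(pool, centroids)
    idx = int(np.argmin(proba.max(axis=1)))  # least-confident pool instance
    oracle_label = int(pool[idx, 0] > 0)     # stand-in for the human annotator
    labeled_X = np.vstack([labeled_X, pool[idx]])
    labeled_y = np.append(labeled_y, oracle_label)
    pool = np.delete(pool, idx, axis=0)

print(labeled_y.size, pool.shape[0])  # 7 labeled, 95 left in the pool
```

The query rule inside the loop is interchangeable: any of the strategies from the next section can replace the least-confident score.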

Page 8: Active learning: Scenarios and techniques

Query Strategy Frameworks

Page 9: Active learning: Scenarios and techniques

1- Uncertainty Sampling:

■ The active learner queries the instances about which it is least certain how to label.

■ Three variants: least confident, margin sampling, and entropy.

Page 10: Active learning: Scenarios and techniques

1- Uncertainty Sampling:

Least Confident:

■ Select the instance whose most likely prediction the model is least confident about:

$x^*_{LC} = \underset{x}{\operatorname{argmax}}\; 1 - P_\theta(\hat{y} \mid x)$

Where:

$\hat{y} = \underset{y}{\operatorname{argmax}}\; P_\theta(y \mid x)$ (the most likely class label under the model $\theta$)
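A minimal sketch of this rule (the probability matrix below is an invented example), picking the instance whose top predicted probability is lowest:

```python
import numpy as np

def least_confident(proba):
    """x*_LC = argmax_x (1 - P(y_hat | x)): query the instance whose
    top predicted probability is lowest. proba: (n_instances, n_labels)."""
    return int(np.argmin(proba.max(axis=1)))

proba = np.array([
    [0.90, 0.10],   # confident prediction
    [0.55, 0.45],   # least confident top label -> queried
    [0.80, 0.20],
])
print(least_confident(proba))  # -> 1
```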

Page 11: Active learning: Scenarios and techniques

1- Uncertainty Sampling:

Least Confident:

■ The drawback is that least confident considers only information about the most likely label, so the posteriors of all remaining labels are ignored.

Page 12: Active learning: Scenarios and techniques

1- Uncertainty Sampling:

Margin sampling:

■ Aims to correct a shortcoming of the least confident strategy.

■ Incorporates the posterior of the second most likely label.

■ Instances with a large margin are easy, since the model has little doubt differentiating between the two most likely classes; instances with a small margin are ambiguous, hence informative.
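The points above can be sketched as follows (the probabilities are invented for illustration): the queried instance is the one with the smallest gap between its two most likely labels.

```python
import numpy as np

def margin_sampling(proba):
    """x*_M = argmin_x P(y1|x) - P(y2|x): query the instance with the
    smallest margin between its two most likely labels."""
    top2 = np.sort(proba, axis=1)[:, -2:]
    margins = top2[:, 1] - top2[:, 0]
    return int(np.argmin(margins))

proba = np.array([
    [0.50, 0.30, 0.20],   # margin 0.20
    [0.40, 0.38, 0.22],   # margin 0.02 -> queried
    [0.70, 0.20, 0.10],   # margin 0.50
])
print(margin_sampling(proba))  # -> 1
```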

Page 13: Active learning: Scenarios and techniques

1- Uncertainty Sampling:

Margin sampling:

■ Drawback: when there are many labels, the margin still ignores the rest of the output distribution, so we are back to the same problem as least confident.

Page 14: Active learning: Scenarios and techniques

1- Uncertainty Sampling:

Entropy:

■ A more general uncertainty sampling strategy is to select the instance that brings the greatest quantity of information:

$x^*_{H} = \underset{x}{\operatorname{argmax}}\; -\sum_i P_\theta(y_i \mid x)\, \log P_\theta(y_i \mid x)$

■ Select the instance that has, over all labels, the highest entropy.
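A short sketch of the entropy criterion (invented probabilities): the uniform posterior carries maximum entropy and is queried.

```python
import numpy as np

def entropy_sampling(proba, eps=1e-12):
    """x*_H = argmax_x -sum_i P(y_i|x) log P(y_i|x)."""
    H = -(proba * np.log(proba + eps)).sum(axis=1)
    return int(np.argmax(H))

proba = np.array([
    [0.80, 0.10, 0.10],
    [1/3,  1/3,  1/3],    # uniform posterior: maximum entropy -> queried
    [0.60, 0.30, 0.10],
])
print(entropy_sampling(proba))  # -> 1
```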

Page 15: Active learning: Scenarios and techniques

1- Uncertainty Sampling:

■ The most informative instances lie at the center of the triangle (where the posterior label distribution is uniform).

■ The least informative instances lie at the three corners (where one of the classes has extremely high probability).

Page 16: Active learning: Scenarios and techniques

2- Query-By-Committee

Page 17: Active learning: Scenarios and techniques

2- Query-By-Committee

Page 18: Active learning: Scenarios and techniques

2- Query-By-Committee

■ Each member of the committee votes on the labeling of query candidates.

■ The most informative query is considered to be the instance about which they most disagree.

■ Two approaches to calculate the level of disagreement:

– Vote Entropy

– Average Kullback-Leibler (KL) divergence

■ In information and probability theory, this is a measure of the difference between two probability distributions P and Q.

■ It is the expectation of the logarithmic difference between the probabilities P and Q.

Page 19: Active learning: Scenarios and techniques

2- Query-By-Committee

Vote Entropy

$x^*_{VE} = \underset{x}{\operatorname{argmax}}\; -\sum_i \frac{V(y_i)}{C} \log \frac{V(y_i)}{C}$

Where:

$V(y_i)$: number of votes received by the label $y_i$ from among the committee members' predictions

$C$: committee size
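A minimal sketch of vote entropy (the vote counts are an invented example): a unanimous committee scores zero, a split committee scores highest and gets queried.

```python
import numpy as np

def vote_entropy(votes, C):
    """x*_VE = argmax_x -sum_i V(y_i)/C log V(y_i)/C.
    votes: (n_instances, n_labels) vote counts over C committee members."""
    p = votes / C
    terms = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    return -terms.sum(axis=1)

votes = np.array([
    [5, 0, 0],   # unanimous committee: zero disagreement
    [2, 2, 1],   # split committee: highest vote entropy -> queried
    [3, 1, 1],
])
scores = vote_entropy(votes, C=5)
print(int(np.argmax(scores)))  # -> 1
```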

Page 20: Active learning: Scenarios and techniques

2- Query-By-Committee

Average Kullback-Leibler (KL) divergence

$x^*_{KL} = \underset{x}{\operatorname{argmax}}\; \frac{1}{C} \sum_{c=1}^{C} D\big(P_{\theta^{(c)}} \,\|\, P_{\mathcal{C}}\big)$

Where:

$D\big(P_{\theta^{(c)}} \,\|\, P_{\mathcal{C}}\big) = \sum_i P_{\theta^{(c)}}(y_i \mid x) \log \frac{P_{\theta^{(c)}}(y_i \mid x)}{P_{\mathcal{C}}(y_i \mid x)}$

And

$P_{\mathcal{C}}(y_i \mid x) = \frac{1}{C} \sum_{c=1}^{C} P_{\theta^{(c)}}(y_i \mid x)$

$x^*_{KL}$ is the instance with the highest average KL divergence between the committee members and the consensus.

$P_{\mathcal{C}}(y_i \mid x)$: the consensus probability that the label $y_i$ is correct for $x$ (the average of the probabilities from the different committee models)

$P_{\theta^{(c)}}(y_i \mid x)$: the probability that $y_i$ is the correct label for $x$ under committee member $c$
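This can be sketched directly from the definitions above (the two-member committee and its probabilities are invented for illustration):

```python
import numpy as np

def avg_kl_disagreement(member_probs):
    """member_probs: (C, n_instances, n_labels), with member_probs[c] holding
    P_theta_c(y_i | x) for each instance. Returns the per-instance average
    KL divergence of each member from the consensus distribution P_C."""
    consensus = member_probs.mean(axis=0)                        # P_C(y_i | x)
    kl = (member_probs * np.log(member_probs / consensus)).sum(axis=2)
    return kl.mean(axis=0)

member_probs = np.array([
    [[0.9, 0.1], [0.9, 0.1]],   # member 1
    [[0.9, 0.1], [0.1, 0.9]],   # member 2 disagrees on the second instance
])
scores = avg_kl_disagreement(member_probs)
print(int(np.argmax(scores)))  # -> 1 (the disputed instance is queried)
```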

Page 21: Active learning: Scenarios and techniques

3- Expected Model Change

■ Selects the instance that would impart the greatest change to the current model if we knew its label.

■ Example: the Expected Gradient Length (EGL), generally applied to problems where gradient-based training is used:

$x^*_{EGL} = \underset{x}{\operatorname{argmax}}\; \sum_i P_\theta(y_i \mid x)\, \big\| \nabla \ell\big(\mathcal{L} \cup \langle x, y_i \rangle; \theta\big) \big\|$

where $\ell$ is the objective function and $\mathcal{L}$ the current labeled set.

■ Drawback: this technique can be computationally expensive if both the feature space and the label set are very large.
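A rough sketch of EGL under an assumed toy binary logistic model (the weights, candidates, and single-example gradient are all illustrative, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([0.5, -0.25])   # current weights of the assumed toy logistic model

def grad(x, y):
    """Gradient of the logistic log-loss for a single labeled pair (x, y)."""
    return (sigmoid(w @ x) - y) * x

def egl(x):
    """Expected gradient length: sum_i P(y_i|x) * ||grad of loss with <x, y_i>||."""
    p1 = sigmoid(w @ x)
    return (1 - p1) * np.linalg.norm(grad(x, 0)) + p1 * np.linalg.norm(grad(x, 1))

candidates = np.array([[0.1, 0.1], [3.0, -2.0]])
scores = [egl(x) for x in candidates]
print(int(np.argmax(scores)))  # -> 1 (the larger expected model change)
```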

Page 22: Active learning: Scenarios and techniques

4- Expected Error Reduction

■ Measures how much the generalization error is likely to be reduced.

Page 23: Active learning: Scenarios and techniques

4- Expected Error Reduction

■ Measures how much the generalization error is likely to be reduced:

$x^*_{0/1} = \underset{x}{\operatorname{argmin}}\; \sum_i P_\theta(y_i \mid x) \left( \sum_{u=1}^{U} 1 - P_{\theta^{+\langle x, y_i \rangle}}(\hat{y} \mid x^{(u)}) \right)$

■ The problem is that it computes the expected error only for the most likely label, as if the correct labeling were always the most likely one.

Where:

$\theta^{+\langle x, y_i \rangle}$: the new model after it has been re-trained with the tuple $\langle x, y_i \rangle$ added

$\hat{y}$: the most likely label

Page 24: Active learning: Scenarios and techniques

4- Expected Error Reduction

■ Another variant, which is more "credible", is to minimize the expected total number of incorrect predictions over the whole label distribution.

■ Another way to understand it: choose the instance that is expected to reduce the future output entropy over the unlabeled pool the most.

■ Unlike the previous variant, it does not consider only the most likely label.
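The retrain-per-candidate-per-label structure can be sketched as follows, reusing a toy nearest-centroid model as a stand-in (the model and all data are invented for illustration; a real implementation would retrain the actual classifier):

```python
import numpy as np

def proba(X, cents):
    """Stand-in nearest-centroid model: softmax over negative distances."""
    d = np.linalg.norm(X[:, None, :] - cents[None, :, :], axis=2)
    e = np.exp(-d)
    return e / e.sum(axis=1, keepdims=True)

def expected_error(Lx, Ly, pool, x, classes=(0, 1)):
    """Expected pool error after adding <x, y_i>, averaged over P(y_i | x)."""
    cents = np.array([Lx[Ly == c].mean(axis=0) for c in classes])
    p_y = proba(x[None, :], cents)[0]
    err = 0.0
    for yi, p in zip(classes, p_y):
        Lx2 = np.vstack([Lx, x])           # "re-train" with <x, y_i> added
        Ly2 = np.append(Ly, yi)
        cents2 = np.array([Lx2[Ly2 == c].mean(axis=0) for c in classes])
        err += p * (1.0 - proba(pool, cents2).max(axis=1)).sum()
    return err

Lx = np.array([[-2.0, 0.0], [2.0, 0.0]])
Ly = np.array([0, 1])
pool = np.array([[-1.5, 0.2], [0.05, 0.0], [1.8, -0.1]])
scores = [expected_error(Lx, Ly, pool, x) for x in pool]
query = int(np.argmin(scores))  # instance expected to reduce future error most
```

The nested loop over candidates and labels, each requiring a retrained model and a full pool pass, is exactly why this strategy is costly in practice.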

Page 25: Active learning: Scenarios and techniques

5- Variance reduction

■ Instead of looking for the minimal future expected error (expensive, and not available in closed form), look for the minimal future model variance.

Page 26: Active learning: Scenarios and techniques

6- Density-Weighted Methods

■ The idea is that informative instances should not only be those which are uncertain, but also those which are "representative" of the underlying distribution.

(Figure: one instance is the most uncertain but not representative; another is not the most uncertain but more representative.)

Page 27: Active learning: Scenarios and techniques

6- Density-Weighted Methods

■ Query instances as follows:

Page 28: Active learning: Scenarios and techniques

6- Density-Weighted Methods

■ Query instances as follows:

$x^*_{ID} = \underset{x}{\operatorname{argmax}}\; \phi_A(x) \times \left( \frac{1}{U} \sum_{u=1}^{U} \operatorname{sim}\big(x, x^{(u)}\big) \right)^{\beta}$

Where:

$\phi_A(x)$: the informativeness of $x$ according to one of the previous techniques (uncertainty sampling, QBC, …)

$\operatorname{sim}(x, x^{(u)})$: the similarity between $x$ and the unlabeled instance $x^{(u)}$; the averaged similarity term measures the representativeness of $x$, and $\beta$ controls its relative importance.
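A short sketch of information density (cosine similarity is an assumed choice of sim, and the pool, candidates, and base scores are invented): the uncertain outlier loses to the representative instance near the cluster.

```python
import numpy as np

def information_density(phi_A, X, pool, beta=1.0):
    """phi_ID(x) = phi_A(x) * ((1/U) sum_u sim(x, x_u))^beta,
    using cosine similarity against the unlabeled pool."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Pn = pool / np.linalg.norm(pool, axis=1, keepdims=True)
    density = (Xn @ Pn.T).mean(axis=1)   # average similarity to the pool
    return phi_A * density ** beta

pool = np.array([[1.0, 1.0], [0.9, 1.1], [1.2, 0.8], [1.0, 1.2]])
X = np.array([[5.0, -5.0],    # outlier: most uncertain but unrepresentative
              [1.1, 0.9]])    # near the cluster: representative
phi_A = np.array([0.9, 0.6])  # base informativeness (e.g. entropy scores)
scores = information_density(phi_A, X, pool)
print(int(np.argmax(scores)))  # -> 1: the representative instance wins
```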