55
Active Learning Literature Survey Burr Settles Presented by: Lovedeep Gondara Technical report, 2010 Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 1 / 55

Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Active LearningLiterature Survey

Burr SettlesPresented by: Lovedeep Gondara

Technical report, 2010

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 1 / 55

Page 2: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Outline

1 IntroductionWhat is active learning?Example

2 ScenariosMembership query synthesisStream-based selective samplingPool based sampling

3 Query Strategy FrameworksUncertainty samplingQuery-by-committeeExpected model changeExpected Error reductionVariance reductionDensity weighted methods

4 Analysis of Active LearningEmpirical AnalysisTheoretical Analysis

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 2 / 55

Page 3: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Outline

1 IntroductionWhat is active learning?Example

2 ScenariosMembership query synthesisStream-based selective samplingPool based sampling

3 Query Strategy FrameworksUncertainty samplingQuery-by-committeeExpected model changeExpected Error reductionVariance reductionDensity weighted methods

4 Analysis of Active LearningEmpirical AnalysisTheoretical Analysis

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 3 / 55

Page 4: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

What is active learning?Definition

Sub-field of machine learning, based on idea of letting model chooseits own data.

If it can, it should perform better.

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 4 / 55

Page 5: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

What is active learning?Definition

Gathering labels for all data can be challenging.

Active learning circumvents this issue by asking an oracle to labelinstances.

Aims to achieve high accuracy using little data.

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 5 / 55

Page 6: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

What is active learning?Example

Figure: Pool based active learning cycle

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 6 / 55

Page 7: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Outline

1 IntroductionWhat is active learning?Example

2 ScenariosMembership query synthesisStream-based selective samplingPool based sampling

3 Query Strategy FrameworksUncertainty samplingQuery-by-committeeExpected model changeExpected Error reductionVariance reductionDensity weighted methods

4 Analysis of Active LearningEmpirical AnalysisTheoretical Analysis

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 7 / 55

Page 8: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

ExamplePool based

Queries are selected from a pool of unlabelled instances usinguncertainty sampling.

Selects the instance in the pool about which model is least certain.

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 8 / 55

Page 9: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

ExamplePool based

Figure: An illustrative example of pool-based active learning. (a) A toy data setof 400 instances, evenly sampled from two class Gaussians. The instances arerepresented as points in a 2D feature space. (b) A logistic regression modeltrained with 30 labeled instances randomly drawn from the problem domain. Theline represents the decision boundary of the classifier (70% accuracy). (c) Alogistic regression model trained with 30 actively queried instances usinguncertainty sampling (90%).

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 9 / 55

Page 10: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

ExamplePool based

Figure: Learning curves for text classification: baseball vs. hockey. Curves plotclassification accuracy as a function of the number of documents queried for twoselection strategies: uncertainty sampling (active learning) and random sampling(passive learning). We can see that the active learning approach is superior herebecause its learning curve dominates that of random sampling.

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 10 / 55

Page 11: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Outline

1 IntroductionWhat is active learning?Example

2 ScenariosMembership query synthesisStream-based selective samplingPool based sampling

3 Query Strategy FrameworksUncertainty samplingQuery-by-committeeExpected model changeExpected Error reductionVariance reductionDensity weighted methods

4 Analysis of Active LearningEmpirical AnalysisTheoretical Analysis

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 11 / 55

Page 12: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Membership query synthesis

Learner requests labels for any unlabelled instance

Assuming generated queries are de novo

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 12 / 55

Page 13: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Membership query synthesis

Query synthesis can be awkward for human annotators

Examples: Image annotation and NLP

Works better when annotators are non-humans

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 13 / 55

Page 14: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Outline

1 IntroductionWhat is active learning?Example

2 ScenariosMembership query synthesisStream-based selective samplingPool based sampling

3 Query Strategy FrameworksUncertainty samplingQuery-by-committeeExpected model changeExpected Error reductionVariance reductionDensity weighted methods

4 Analysis of Active LearningEmpirical AnalysisTheoretical Analysis

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 14 / 55

Page 15: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Stream-based selective sampling

Assumption: Unlabelled instance comes at no or minimal cost

Sample first, then learner decided whether to ask for label or not

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 15 / 55

Page 16: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Stream-based selective sampling

Uniform distribution: Similar to membership query synthesis

Non uniform or unknown distribution: Still sensible queries

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 16 / 55

Page 17: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Selective samplingWhat to query?

Use some informative measure, such that more informative instancesare more likely to be queried

Region of uncertainty: Only query instances that fall within it

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 17 / 55

Page 18: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Outline

1 IntroductionWhat is active learning?Example

2 ScenariosMembership query synthesisStream-based selective samplingPool based sampling

3 Query Strategy FrameworksUncertainty samplingQuery-by-committeeExpected model changeExpected Error reductionVariance reductionDensity weighted methods

4 Analysis of Active LearningEmpirical AnalysisTheoretical Analysis

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 18 / 55

Page 19: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Pool based sampling

Assumption: Large amount of unlabelled instances are available

Assuming a closed pool, queries are drawn from it

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 19 / 55

Page 20: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Pool based sampling

Instances are queried in a greedy fashion

Evaluate all instances in a pool using some informative measure

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 20 / 55

Page 21: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Outline

1 IntroductionWhat is active learning?Example

2 ScenariosMembership query synthesisStream-based selective samplingPool based sampling

3 Query Strategy FrameworksUncertainty samplingQuery-by-committeeExpected model changeExpected Error reductionVariance reductionDensity weighted methods

4 Analysis of Active LearningEmpirical AnalysisTheoretical Analysis

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 21 / 55

Page 22: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Uncertainty sampling

Query instances that active learner is least certain about

Straightforward for probabilistic models

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 22 / 55

Page 23: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Uncertainty sampling

Only considers information about most probable model

Throws away information about remaining label distribution

Margin sampling aims to correct for this bias

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 23 / 55

Page 24: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Uncertainty samplingMargin sampling

Margin sampling incorporates the posterior of second most likely label

For very large label sets, still ignores much of output distribution

Use entropy for a more general approach

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 24 / 55

Page 25: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Uncertainty samplingEntropy

Entropy is information theoretic measure representing amount ofinformation needed to encode a distribution

For binary classification it reduces to margin and least confidentstrategies

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 25 / 55

Page 26: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Uncertainty sampling

Figure: Heatmaps illustrating the query behavior of common uncertainty measuresin a three-label classification problem. Simplex corners indicate where one labelhas very high probability, with the opposite edge showing the probability range forthe other two classes when that label has very low probability. Simplex centersrepresent a uniform posterior distribution. The most informative query region foreach strategy is shown in dark red, radiating from the centers.

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 26 / 55

Page 27: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Outline

1 IntroductionWhat is active learning?Example

2 ScenariosMembership query synthesisStream-based selective samplingPool based sampling

3 Query Strategy FrameworksUncertainty samplingQuery-by-committeeExpected model changeExpected Error reductionVariance reductionDensity weighted methods

4 Analysis of Active LearningEmpirical AnalysisTheoretical Analysis

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 27 / 55

Page 28: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Query-by-committee

Committee of models trained on same labelled set representingcompeting hypothesis

Each member is allowed to vote on labelling of query candidates.

Instance about which they most disagree is most informative query

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 28 / 55

Page 29: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Query-by-committee

Minimize the version space, i.e. set of hypothesis that are consistentwith current labelled training data

Figure: Version space examples for (a) linear and (b) axis-parallel boxclassifiers. All hypotheses are consistent with the labeled training data in L(as indicated by shaded polygons), but each represents a different model inthe version space.

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 29 / 55

Page 30: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Query-by-committeeDisagreement

Vote entropy (QBC generalization of entropy based uncertaintysampling)

Average Kullback-Leibler divergence

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 30 / 55

Page 31: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Outline

1 IntroductionWhat is active learning?Example

2 ScenariosMembership query synthesisStream-based selective samplingPool based sampling

3 Query Strategy FrameworksUncertainty samplingQuery-by-committeeExpected model changeExpected Error reductionVariance reductionDensity weighted methods

4 Analysis of Active LearningEmpirical AnalysisTheoretical Analysis

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 31 / 55

Page 32: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Expected model change

Select the instance that would impart greatest change to the model ifwe knew its label

Expected gradient length approach for discriminative probabilisticmodels

Learner should query the instance which would result in new traininggradient of largest magnitude

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 32 / 55

Page 33: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Expected model change

Prefers instances that are likely to most influence the model

Computationally expensive if feature space and and label set is large

Non scaled features may cause issues

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 33 / 55

Page 34: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Outline

1 IntroductionWhat is active learning?Example

2 ScenariosMembership query synthesisStream-based selective samplingPool based sampling

3 Query Strategy FrameworksUncertainty samplingQuery-by-committeeExpected model changeExpected Error reductionVariance reductionDensity weighted methods

4 Analysis of Active LearningEmpirical AnalysisTheoretical Analysis

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 34 / 55

Page 35: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Expected Error reduction

Amount of generalization error reduction

Query instance with minimal expected future error

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 35 / 55

Page 36: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Expected Error reduction

Can be very computationally expensive

Applications have only been considered in simple binary classificationtasks

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 36 / 55

Page 37: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Outline

1 IntroductionWhat is active learning?Example

2 ScenariosMembership query synthesisStream-based selective samplingPool based sampling

3 Query Strategy FrameworksUncertainty samplingQuery-by-committeeExpected model changeExpected Error reductionVariance reductionDensity weighted methods

4 Analysis of Active LearningEmpirical AnalysisTheoretical Analysis

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 37 / 55

Page 38: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Variance reduction

Indirectly minimize generalization error by minimizing output variance

Computationally expensive

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 38 / 55

Page 39: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Outline

1 IntroductionWhat is active learning?Example

2 ScenariosMembership query synthesisStream-based selective samplingPool based sampling

3 Query Strategy FrameworksUncertainty samplingQuery-by-committeeExpected model changeExpected Error reductionVariance reductionDensity weighted methods

4 Analysis of Active LearningEmpirical AnalysisTheoretical Analysis

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 39 / 55

Page 40: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Density weighted methods

Informative instances: not only uncertain, but also representative

Inhabit dense regions of input space

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 40 / 55

Page 41: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Outline

1 IntroductionWhat is active learning?Example

2 ScenariosMembership query synthesisStream-based selective samplingPool based sampling

3 Query Strategy FrameworksUncertainty samplingQuery-by-committeeExpected model changeExpected Error reductionVariance reductionDensity weighted methods

4 Analysis of Active LearningEmpirical AnalysisTheoretical Analysis

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 41 / 55

Page 42: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Does Active learning works?

Literature suggests that it does

Companies like Google, IBM, Microsoft use it

All this indicates that Active Learning has matured to the age ofpractical use

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 42 / 55

Page 43: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Does Active learning works?

Training set built in cooperation with an active learner is tied to themodel that was used to generate it.

Labelled instances are a biased distribution

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 43 / 55

Page 44: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Outline

1 IntroductionWhat is active learning?Example

2 ScenariosMembership query synthesisStream-based selective samplingPool based sampling

3 Query Strategy FrameworksUncertainty samplingQuery-by-committeeExpected model changeExpected Error reductionVariance reductionDensity weighted methods

4 Analysis of Active LearningEmpirical AnalysisTheoretical Analysis

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 44 / 55

Page 45: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Does Active learning works?

Remains elusive irrespective of recent advances

Bound on number of queries required to learn a sufficiently accuratemodel?

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 45 / 55

Page 46: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Summary

Active learning is an interesting concept

There is still a lot of ground that need to be covered (Theoreticallyand Empirically)

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 46 / 55

Page 47: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

For Further Reading I

Settles, Burr.Active learning.Morgan & Claypool Publishers, 2012.

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 47 / 55

Page 48: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Uncertainty Sampling

For problems with three or more labels

x∗LC = argmaxx1− Pθ(y |x) (1)

where y = argmaxxPθ(y |x) is the class label with highest posteriorprobability under the model θ.

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 48 / 55

Page 49: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Margin Sampling

x∗M = argminxPθ(y1|x)− Pθ(y2|x) (2)

where y1 and y2 are first and second most probable class levels.

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 49 / 55

Page 50: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Entropy

x∗H = argmaxx −∑i

Pθ(yi |x) logPθ(yi |x) (3)

where yi ranges over all possible class levels.

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 50 / 55

Page 51: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Query by committeeVote entropy

x∗VE = argmaxx −∑i

V (yi )

Clog

V (yi )

C(4)

where yi ranges over all possible class levels and V (yi ) is number of votesa label receives, C being committee size.

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 51 / 55

Page 52: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Query by committeeKL divergence

x∗KL = argmaxx1

C

C∑c=1

D(Pθ(c)||PC ) (5)

where

D(Pθ(c)||PC ) =∑i

Pθ(c)(yi |x) logPθ(c)

PC (yi |x)(6)

θ(c) is a particular model in the committee, and C is committee as awhole, thus

PC (yi |x) =1

C

C∑c=1

Pθ(c)(yi |x) (7)

is the consensus probability that yi is correct label.

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 52 / 55

Page 53: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Expected model change

Let ∆lθ(L) be the gradient of the objective function l with respect to themodel parameters θ.Let ∆lθ(L∪ < x , y >) be the new gradient obtained by adding the trainingtuple < x , y > to L.AS query algorithm does not know the true label in advance, we insteadcalculate the length as an expectation over the possible labellings

x∗EGL = argmaxx∑i

Pθ(yi |x)||∆lθ(L∪ < x , yi >)|| (8)

where ||.|| is Euclidean norm of each resulting gradient vector

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 53 / 55

Page 54: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Expected error reduction

Estimate expected future error of a model trained using L∪ < x , y > onremaining unlabeled instances, query the instance with minimal expectedfuture error, we can minimise 0/1 loss:

x∗0/1 = argminx∑i

Pθ(yi |x)(U∑

u=1

1− Pθ+<x,yi>(y |x (u))) (9)

where θ+<x ,yi> is new model after retraining with < x , yi >.We can also use expected log-loss:

x∗log = argminx∑i

Pθ(yi |x)(−U∑

u=1

∑j

Pθ+<x,yi>(yj |x (u)) logPθ+<x,yi>(yj |x (u)))

(10)

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 54 / 55

Page 55: Active Learning - Literature Survey - GitHub Pages · selection strategies: uncertainty sampling (active learning) and random sampling (passive learning). We can see that the active

Density Weighted Method

We wish to query instances as follows:

x∗ID = argmaxxφA(x)× (1

U

U∑u=1

sim(x , x (u)))β (11)

φA(x) is informativeness of x according to some base query strategy A,such as QBC etc. Second term weights it by its average similarity to allother instances in input distribution. β controls the relative importance ofdensity term.

Burr Settles Presented by: Lovedeep Gondara Active Learning Technical report, 2010 55 / 55