Active Learn

Embed Size (px)

Citation preview

  • 7/31/2019 Active Learn

    1/45

    Ac#veLearning

    Aarti Singh

    Machine Learning 10-601

    Dec 6, 2011

    Slides Courtesy: Burr Settles, Rui Castro, Rob Nowak

    1

  • 7/31/2019 Active Learn

    2/45

    Semi-supervised learning:Design a predictor based on iid unlabeled andfew randomlylabeled examples.

    Learning algorithm

    Assumption:Knowledge of marginal density can simplify predictione.g. similar data points have similar labels

    Learningfromunlabeleddata

  • 7/31/2019 Active Learn

    3/45

    Active learning:Design a predictor based on iid unlabeled and selectivelylabeled examples

    Learning algorithm

    Selectivelabeling

    Assumption: Some unlabeled examples are more informative than others forprediction.

    Learningfromunlabeleddata

  • 7/31/2019 Active Learn

    4/45

    35

    8

    7

    9

    4

    2

    1

    many unlabeled data

    plus a few labeled examples

    knowledge of clusters + a few labels in each is sufficient to

    design a good predictor Semi-supervised learning

    Example:Hand-wri

  • 7/31/2019 Active Learn

    5/45

    Not all examples are created equal

    Labeled examples near boundaries of clusters

    are much more informative Active learning

    Example:Hand-wri

  • 7/31/2019 Active Learn

    6/45

    PassiveLearning

  • 7/31/2019 Active Learn

    7/45

    Semi-supervisedLearning

  • 7/31/2019 Active Learn

    8/45

    Ac#veLearning

  • 7/31/2019 Active Learn

    9/45

    The eyes focus on the interesting and relevant features,and do not sample all the regions in the scene in the sameway.

    Feedbackdrivenlearning

  • 7/31/2019 Active Learn

    10/45

    Feedbackdrivenlearning

  • 7/31/2019 Active Learn

    11/45

    Is the personwearing a hat ?

    Does the personhave blue eyes ?

    Active Learning works very well in simple conditions

    TheTwentyques#onsgame

    Focus on mostinformative questions

  • 7/31/2019 Active Learn

    12/45

    ThoughtExperiment

    supposeyouretheleaderofanEarthconvoysenttocolonizeplanetMars

    people who ate the roundMartian fruits found them tasty!

    people who ate the spikedMartian fruits died!

  • 7/31/2019 Active Learn

    13/45

    Poisonvs.YummyFruits

    problem:theresarangeofspiky-to-roundfruitshapesonMars:

    youneedtolearnthethresholdof

    roundnesswherethefruitsgofrom

    poisonoustosafe.

    andyouneedtodeterminethisrisking

    asfewcolonistslivesaspossible!

  • 7/31/2019 Active Learn

    14/45

    Tes#ngFruitSafety

    thisisjustabinarybisec#onsearch

    YourfirstacFvelearningalgorithm!

  • 7/31/2019 Active Learn

    15/45

    Ac#veLearning

    keyidea:thelearnercanchoosetrainingdataonthefly

    onMars:whetherafruitwaspoisonous/safe

    ingeneral:thetruelabelofsomeinstance goal:reducethetrainingcosts

    onMars:thenumberoflivesatriskingeneral:thenumberofqueries

  • 7/31/2019 Active Learn

    16/45

    Goal: Given a budget of n samples, learn thresholdas accurately as possible

    step function

    Learningachange-point

    Locate a change-point or threshold (poisonous/yummy fruit,

    contamination boundary)

  • 7/31/2019 Active Learn

    17/45

    Sample locations must be chosen before any observationsare made

    PassiveLearning

  • 7/31/2019 Active Learn

    18/45

    Sample locations must be chosen before any observationsare made

    Too many wasted samples. Learning is limited by sampling

    resolution

    PassiveLearning

  • 7/31/2019 Active Learn

    19/45

    Sample locations are chosen based on previous observations

    ActiveLearning

  • 7/31/2019 Active Learn

    20/45

    Sample locations are chosen based on previous observations

    The error decays much faster than in the passivescenario. No wasted samples Exponential improvement!

    Works even when labels are noisy though improvement depends on

    amount of noise

    ActiveLearning

  • 7/31/2019 Active Learn

    21/45

    Prac#calLearningCurves

    text classification:baseballvs. hockey

    active learning

    passive learning

    better

  • 7/31/2019 Active Learn

    22/45

    Probabilis#cBinaryBisec#on

    letstrygeneralizingourbinarysearchmethodusingaprobabilis.cclassifier:

    0.50.0

    1.00.50.5

  • 7/31/2019 Active Learn

    23/45

    UncertaintySampling

    queryinstancesthelearnerismostuncertainabout

    400 instances sampledfrom 2 class Gaussians

    random sampling30 labeled instances

    (accuracy=0.7)

    active learning30 labeled instances

    (accuracy=0.9)

    [Lewis & Gale, SIGIR94]

    Using logistic regression

  • 7/31/2019 Active Learn

    24/45

    GeneralizingtoMul#-ClassProblems

    least confident [Culotta & McCallum, AAAI05]

    smallest-margin [Scheffer et al., CAIDA01]

    entropy [Dagan & Engelson, ICML95]

    note:forbinarytasks,theseareequivalent

  • 7/31/2019 Active Learn

    25/45

    Query-By-Commi

  • 7/31/2019 Active Learn

    26/45

    VersionSpaceExamples

  • 7/31/2019 Active Learn

    27/45

    QBCExample

  • 7/31/2019 Active Learn

    28/45

    QBCExample

  • 7/31/2019 Active Learn

    29/45

    QBCExample

  • 7/31/2019 Active Learn

    30/45

    QBCExample

  • 7/31/2019 Active Learn

    31/45

    QBCGuarantees

    [Freund et al.,97] theoreFcalguaranteesd VCdimensionofcommiKeeclassifiers

    UndersomemildcondiFons,theQBCalgorithmachievesa

    predicFonaccuracyofandw.h.p.

    #unlabeledexamplesgenerated O(d/)

    #labelsqueried O(log2d/)

    Exponen#alimprovement!!

  • 7/31/2019 Active Learn

    32/45

  • 7/31/2019 Active Learn

    33/45

    Batch-basedac#velearning

    wireless sensor networks/mobile sensing

    Active sensing

  • 7/31/2019 Active Learn

    34/45

    Coarse sampling (Low variance, bias limited)

    Refine sampling (Low variance, low bias)

    Batch-basedac#velearning

  • 7/31/2019 Active Learn

    35/45

    Ac#veLearningforTerrainMapping

  • 7/31/2019 Active Learn

    36/45

    Whendoesac#velearningwork?

    Passive = Active

    Active learning is useful if complexity of target function is localized labels of some data points are more informative than others.

    Passive

    Active

    [Castro et al.,05]1-D 2-D

  • 7/31/2019 Active Learn

    37/45

    Ac#vevs.Semi-Supervised

    uncertainty sampling

    query instances the modelis least confident about

    query-by-committee (QBC)

    use ensembles to rapidlyreduce the version space

    Generative modelexpectation-maximization (EM)

    propagate confident labelings

    among unlabeled data

    co-trainingmulti-view learning

    use ensembles with multiple views

    to constrain the version space

    bothtrytoa

  • 7/31/2019 Active Learn

    38/45

    Problem:Outliers

    aninstancemaybeuncertainorcontroversial(forQBC)simplybecauseitsanoutlier

    queryingoutliersisnotlikelytohelpusreduceerroronmoretypicaldata

  • 7/31/2019 Active Learn

    39/45

  • 7/31/2019 Active Learn

    40/45

    Solu#on2:Es#matedErrorReduc#on

    minimizetheriskR(x)ofaquerycandidateexpecteduncertaintyoverUifxisaddedtoL

    [Roy & McCallum, ICML01; Zhu et al., ICML-WS03]

    sum overunlabeled instances uncertainty of u

    after retraining with x

    expectation overpossible labelings of x

  • 7/31/2019 Active Learn

    41/45

    TextClassifica#onExamples

    [Roy & McCallum, ICML01]

  • 7/31/2019 Active Learn

    42/45

    TextClassifica#onExamples

    [Roy & McCallum, ICML01]

  • 7/31/2019 Active Learn

    43/45

    Ac#veLearningScenarios

    Query synthesis: construct desired query/questions

    Stream-based selective sampling: unlabeled data presentedin a stream, decide whether or not to query its label

    Pool-based active learning: given a pool of unlabeled data,

    select one and query its label

  • 7/31/2019 Active Learn

    44/45

    AlternateSe]ngs

    Sofarwefocusedonqueryinglabelsforunlabeleddata.

    Otherquerytypes:

    Ac#vefeatureacquisi#ondecidingwhetherornottoobtainaparFcularfeature,e.g.featuressuchasgeneexpressionsmightbecorrelated.

    Mul#pleInstanceac#velearning-onelabelforabagofinstances,e.g.labelforadocument(bagofinstances)butcanquerypassages(instance)coarse-scalelabelsarecheaper

    Othersengs:

    Cost-sensi#veac#velearningsomelabelsmaybemoreexpensivethan

    others,e.g.collecFngpaFentvitalsvs.complexandexpensivemedicalproceduresfordiagnosis.

    Mul#-taskac#velearningifeachlabelprovidesinformaFonformulFpletasks,whichinstancesshouldbequeriedsoastobemaximallyinformaFveacrossalltasks,e.g.animagecanbelabeledasart/photo,nature/man-madeobjects,containsafaceornot.

  • 7/31/2019 Active Learn

    45/45

    Ac#veLearningSummary

    BinarybisecFon Uncertaintysampling Query-by-commiKee DensityWeighFng EsFmatedErrorReducFon ExtensionsAcFveeatureacquisiFon,MulFple-instanceacFvelearning,Cost-sensiFveacFvelearning,MulF-taskacFvelearning

    Active learning is a powerful tool if complexity of targetfunction is localized labels of some data points are moreinformative than others.