Upload
tranminh
View
216
Download
0
Embed Size (px)
Citation preview
COSC343:ArtificialIntelligence
LechSzymanskiLech Szymanski (Otago)
Lech Szymanski
Dept. of Computer Science, University of Otago
COSC343Lecture5
Lecture5:BayesianReasoning
LechSzymanskiLech Szymanski (Otago)
• Conditionalprobabilityandindependence
• Curseofdimensionality
• Bayes’rule
• NaiveBayesClassifier
Intoday’slecture
COSC343Lecture5
LechSzymanskiLech Szymanski (Otago)
Consideramedicalscenario,with3Booleanvariables
• cavity (doesthepatienthaveacavityornot?)• toothache (doesthepatienthaveatoothacheornot?)• catch (doesthedentist’sprobecatchonthepatient’stooth?)Here’sanexampleprobabilitymodel:thejointprobabilitydistribution
Recap:asimplemedicalexample
COSC343Lecture5
toothache toothachecatch catch catch catch
cavity .108 .012 .072 .008cavity .016 .064 .144 .576
LechSzymanskiLech Szymanski (Otago)
Probabilityofaneventgiven informationaboutanotherevent.
Conditionalprobabilities
COSC343Lecture5
“given”
probability of a given b
“and”
LechSzymanskiLech Szymanski (Otago)
Assumewe’regiventhefollowingjointprobabilitydistribution
andlearnthatapatienthasatoothache
E.g.howtocalculate?
Computingconditionalprobabilities
COSC343Lecture5
toothache toothachecatch catch catch catch
cavity .108 .012 .072 .008cavity .016 .064 .144 .576
LechSzymanskiLech Szymanski (Otago)
Conditionalprobabilitiescanalsobecomputedforwhole
distributions.
Forinstance,theconditionalprobabilitydistribution ofCavity andCatch giventoothache showsallpossiblevaluesofCavity andCatchgiventoothache.
Conditionalprobabilityofwholedistributions
COSC343Lecture5
toothache toothachecatch catch catch catch
cavity .108 .012 .072 .008cavity .016 .064 .144 .576
catch catchcavity .188/.2 .12/.2
cavity .16/.2 .064/.2
LechSzymanskiLech Szymanski (Otago)
Inferencefromfulljointdistributionsisallaboutsummingoverhiddenvariables.
• Ifthereare Booleanvariablesinthetable,ittakestimetocomputeaprobability
• Moreover,thespacecomplexityofthetableitselfis
• Thismeansasincreases,itbecomesinfeasibleeventostorethefulljointdistribution
• Andremember:someonehastobuildthefulljointdistributionfromempiricaldata!
Inpractice,probabilisticreasoningtasksofteninvolvethousandsofrandomvariables.Obviously,amoreefficientreasoningtechniqueisnecessary.
Complexityofreasoningfromafulljointdistribution
COSC343Lecture5 LechSzymanskiLech Szymanski (Otago)
Curseofdimensionality
COSC343Lecture5
When the dimensionality ofproblem increases, thevolume of the spacebecomes so large, you needexponentially more points tosample the space at thesame density. Thisphenomenon is sometimesreferred as the curse ofdimensionality.
https://haifengl.wordpress.com/2016/02/29/there-is-no-big-data-in-machine-learning/
LechSzymanskiLech Szymanski (Otago)
• IfthenitissaidthatAandBareindependent
IfAandBareindependent,thenitfollowsthat:
and
Independence
COSC343Lecture5
Huge reductions in complexity can be made if we know that some variables in our domain are independent of others
LechSzymanskiLech Szymanski (Otago)
SayweaddavariableWeather toourdentistmodel,withfourvalues:sunny,cloudy,rainy,snowy.• Weatherandteethdon’taffecteachother
• Tocomputetheprobabilityofapropositioninvolvingweatherandteeth,wecanjustmultiplytheresultsfoundinthetwodistributions
Given:
itfollowsthat:
Anexampleofindependence
COSC343Lecture5
LechSzymanskiLech Szymanski (Otago)
Ifwewantedtocreateafulljointdistribution,wecoulddosobymultiplyingtheappropriateprobabilitiesfromthetwodistributions
• Insteadof4dimensionsof2x2x2x4=32sampleswehavetwospacesof3and1dimensionswithtotalof2x2x2+4=12samples
Independence
COSC343Lecture5
toothache toothachecatch catch catch catch
cavity .108 .012 .072 .008cavity .016 .064 .144 .576
sunny .6cloudy .029rainy .1snowy .001
LechSzymanskiLech Szymanski (Otago)
Absoluteindependenceispowerful,butrare.
• Canwefactoroutfurther?
• Forinstance,indentistry,mostvariablesarerelated,butnotdirectly.
Whatdodo?
Averyusefulconceptisconditionalindependence• Forinstance,havingacavity
• changestheprobabilityofatoothache
• andchangestheprobabilityofacatch
• butthesetwoeffectsareseparate!
Conditionalindependence
COSC343Lecture5
LechSzymanskiLech Szymanski (Otago)
KnowingthevalueofToothache certainlygivesusinformationaboutthevalueofCatch.• Butonlyindirectly,astheyarebothlinkedtothevariable
Cavity.
IfIalreadyknewthevalueofCavity,thenlearningaboutCatchwon’ttellmeanythingelseaboutthevalueofToothache.
Therefore:
Conditionalindependence
COSC343Lecture5
Conditional independence
Conditional probability
LechSzymanskiLech Szymanski (Otago)
Fromthedefinitionofconditionalprobability
itfollowsthat
andso
Bayes’rule
COSC343Lecture5
Bayes’ Rule
and
LechSzymanskiLech Szymanski (Otago)
Aproblem
COSC343Lecture5
Chills(C)
Runningnose(R)
Headache
(H)Fever(Fr)
Flu
y n mild y n
y y no n y
y n strong y y
n y mild y y
n n no n n
n y strong y y
n y strong n n
y y mild n y
y n mild n ?
Information from previous patients:
How to diagnose a person with the following symptoms?
Compute
and
and see which one is more likely.
effects cause
LechSzymanskiLech Szymanski (Otago)
It’shardtoestimate:
• Youhavetohaveinformationonmanygroupsofpeople– oneforeachpossiblecombinationofsymptoms– andtesteachpersonineachgroupforflu
It’sfareasiertoestimate:
• Youtakeasampleofonlytwogroupsofpeople– thosewithfluandthosewithout,andtestthemforsymptoms
FromBayes’rule
What’smorelikely?
COSC343Lecture5
Larger or smaller?causeeffects
LechSzymanskiLech Szymanski (Otago)
Inverttheproblem
COSC343Lecture5
Bayes
Need probability distributions and
causeeffects
causeeffects
Normalising denominator is the same on both sides, and so plays no part in the comparison
LechSzymanskiLech Szymanski (Otago)
Gatheringpriorprobabilitiesfromtrainingset
COSC343Lecture5
Chills(C)
Runningnose(R)
Headache
(H)Fever(Fr)
Flu
y n mild y n
y y no n y
y n strong y y
n y mild y y
n n no n n
n y strong y y
n y strong n n
y y mild n y
y n
5/8 3/8
LechSzymanskiLech Szymanski (Otago)
• joint probability distribution has 4
dimensions and 24 combinations of symptoms
• has 1 dimension and 2 combinations of symptoms
• has 1 dimension and 2 combinations of symptoms
• has 1 dimension and 2 combinations of symptoms
• has 1 dimension and 3 combinations of symptoms
Gatheringconditionalprobabilitiesfromtrainingset
COSC343Lecture5
Chills(C)
Runningnose(R)
Headache
(H)Fever(Fr)
Flu
y n mild y n
y y no n y
y n strong y y
n y mild y y
n n no n n
n y strong y y
n y strong n n
y y mild n y
• If the symptoms were independent events, then from independence we would have
Let’s just assume the symptoms are independent! Naive Bayes
LechSzymanskiLech Szymanski (Otago)
Gatheringconditionalprobabilitiesfromtrainingset
COSC343Lecture5
Chills(C)
Runningnose(R)
Headache
(H)Fever(Fr)
Flu
y n mild y n
y y no n y
y n strong y y
n y mild y y
n n no n n
n y strong y y
n y strong n n
y y mild n y
y n
5/8 3/8
y n
3/5 2/5
y n
1/3 2/3
y n
4/5 1/5
y n
1/3 2/3
y n
3/5 2/5
y n
1/3 2/3
mild no strong
2/5 1/5 2/5
mild no strong
1/3 1/3 1/3
LechSzymanskiLech Szymanski (Otago)
NaiveBayesClassifier
COSC343Lecture5
C R H Fr Flu
y n mild n ?
And so, for the person with the symptoms:
Probability of flu given symptoms:
- From Bayes’ rule
- From the assumption of independence
Naive Bayes Classifier deems this person more likely not to have a flu!
- From the probability distributions derived from the training data
LechSzymanskiLech Szymanski (Otago)
NaiveBayesClassifier
COSC343Lecture5
y n
nomild
strong
where
,
and
no flu
flu
LechSzymanskiLech Szymanski (Otago)
• Conditionalprobabilitiesmodelchangesinuncertainty
givensomeinformation
• Independencesimplifiescomputation
• Bayes’rule:possibletocomputefrom
• NaiveBayes:assumecauses inputvariablesareindependent,sothat
Summaryandreading
COSC343Lecture5
Readingforthelecture:AIMAChapter13Sections3-5
Readingfornextlecture:AIMAChapter18Section3