Probabilistic Models Part 1: What is probability?
Machine Learning mlvu.github.io
Vrije Universiteit Amsterdam
Probability is an important tool in Machine Learning. We expect that you have been taught probability theory already, but since it’s a subtle concept with complicated foundations, we’ll go over the basics again in this first video.
2
To start, let’s look at the way we use probability informally.
Let’s say you are a concerned parent: you read this headline and you are shocked by it. You turn to your partner and say “that means that the probability that our son is gambling online is 12.5%”. Your partner disagrees: you have a good handle on your son’s behaviour and his spending. He could only be gambling if he had a credit card you don’t know about, and the probability of that is much lower.
Well, then the probability that Josh, his closest friend, gambles online must be 12.5%. If one in eight teenage boys is gambling, they must be hiding somewhere. Your partner disagrees again: probability doesn’t enter into it. Josh is either gambling or he isn’t.
Clearly, we need to look at what we mean when we say that the probability of something is such-and-so. There are two commonly accepted ways of looking at it: objective and subjective probability. We’ll start with objective probability.
image source: https://www.theguardian.com/society/2016/sep/20/one-in-eight-european-teenage-boys-gamble-online-says-survey
objective probability
frequentism: probability is only a property of repeated experiments.
3
In objective probability, “the probability that X is the case” represents an objective truth: whatever a probability is, it must be the same for everybody. You and I may disagree over a probability, but only because one of us is wrong. There is one true probability.
The most common form of objective probability is frequentism. Under the frequentist definition, probability is a property of a (hypothetical) repeated experiment. For instance, take the statement “the probability of rolling 6 with a fair die is one in six.”
The experiment is rolling a die (the singular of dice). The outcome we are discussing is the roll resulting in a 6. If we were to repeat the experiment a large number of times, N, then the proportion of times we observe the discussed outcome is close to 1 in 6. More precisely, as N grows, the proportion converges to 1 in 6.
Under a frequentist interpretation, saying “the probability is one in six” is equivalent to saying “if I roll the die repeatedly, the relative frequency of sixes will converge to 1 in 6 as the number of rolls grows.”
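The frequentist reading can be illustrated with a small simulation. This is only a sketch: the pseudo-random generator stands in for a physical fair die, and the function name and the fixed seed are choices made here for reproducibility, not part of the lecture.

```python
import random

def relative_frequency_of_six(n_rolls, seed=0):
    """Simulate n_rolls of a fair die and return the fraction that came up 6."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
    return sixes / n_rolls

# The relative frequency should drift towards 1/6 (about 0.1667) as N grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_six(n))
```

For small N the proportion can be far from 1 in 6; only the long-run behaviour is what the frequentist definition refers to.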
4
Under objective probability the statement “the probability that Josh is gambling is 12.5%” is indeed nonsense. There is no experiment we can imagine where Josh “turns out” to be a gambler one in 8 times. He either gambles or he doesn’t.
What we can say is that the probability that a teenage boy drawn randomly from the European population gambles online is 12.5%. This is an experiment we can repeat, and at every repetition, we choose a different boy, so we get a different outcome.
We should also note that our statement is not precisely correct. The actual probability is a number we don’t know. This is what happens in practice: the probability of X happening is p. We don’t know p, but we do know that there is some experiment for which the proportion of successful trials (X happens) converges to p with repeated trials. We repeat a large number of trials, check the proportion of times X happened, and use that as an estimate of the true p. That is also what happened in the research behind this article. We don’t know precisely how many teenage boys gamble online, so the researchers found a way to estimate the total proportion.
subjective probability
Bayesianism: probability is an expression of our uncertainty and of our beliefs.
5
The alternative to objectivism is subjectivism. It states that probability expresses our uncertainty. If X is a boolean variable, one that is true or not true, and we are uncertain whether X is true, we can assign a probability to X being true. A probability of 0.5 means we are entirely ambivalent, a probability of 0.75 means we think X is pretty likely, and a probability of 1 means we’re entirely sure.
In this case, different people can have different probabilities for the same thing being true. You and I may disagree and both be right. If you have information I don’t have, your probability may be closer to certainty than mine.
Bayesianism is the main form of subjective probability. It builds on Bayes’ rule, which we will discuss later, to tell us how we should use observations to update our beliefs.
6
Under subjectivism, we can say “the probability that our son is gambling is 12.5%”. We don’t know what he gets up to, so even though there is a definite objective answer, we are uncertain. If we know only this headline, we may well pick 12.5% as our belief that our son is gambling. Of course, as noted before, we have a lot more knowledge about our son than about other teenage boys. We know he goes to bed on time, we know where he gets his money, and he probably doesn’t have a secret credit card. So even though the probability for a random teenage boy would be 12.5%, the probability for our son is actually much lower, because we have extra information.
This is the fundamental difference between the two views: under frequentism, probability is defined as an objective property of the world. The probability of X is the same for all people regardless of what we know or don’t know. Under Bayesianism, probability is an expression of a subjective property: it can change from one person to the next, and if we learn new information, it can change from one moment to the next. If we find out that our son does have a secret credit card, the probability that he is a gambler suddenly jumps dramatically.
Note that Bayesianism, in a sense, encompasses frequentism. We are uncertain about the outcome of some experiment, which we can express as a Bayesian belief. If we understand the experiment properly, then that belief coincides exactly with the probability that a frequentist would assign. Bayesianism just extends the definition to allow for personal beliefs that are not objectively true.
subjectivism vs. objectivism
A disambiguation of the word probability.
Leads to fundamentally different ways of doing statistics.
Is machine learning a probabilistic discipline? If so, is it subjective or objective?
7
So, at heart, subjectivism and objectivism are disambiguations. The word probability is ambiguous, and these allow you to make precise what you mean.
Note that you don’t have to commit to one view or another. You can use the subjective definition one day and the objective definition the next (especially in informal settings).
However, once you start doing statistics, the two definitions lead to fundamentally different approaches (which we’ll see in more detail later). And in the statistical community there are definitely two camps: the frequentists and the Bayesians.
Since machine learning is often seen as another form of statistics, you may ask whether it is usually seen as using subjective or objective probability. I can’t give you a commonly accepted answer; I think opinions differ.
My view is that Machine Learning, while being statistical in nature, is not fundamentally probabilistic. The fundamental principles of machine learning can be defined and explained without recourse to probability theory (and indeed, we have done so for most of the start of the course). The fundamental goal of (offline) machine learning is to minimise test set loss given only a training set, and some hint as to the relation between the two datasets.
Of course, even if machine learning is not fundamentally probabilistic, probability has proven to be a very powerful tool (much like linear algebra and calculus) in helping us solve this problem. The consequence, in my view, is that we can borrow whatever methods are most helpful to us at the time. We’ll use the frequentist methods when we need them, and the Bayesian methods when they prove most helpful. We’ll even, at times, mix the two in a single model.
probability theory
Basic ingredients
• sample space
• event space
• probability function p(…)
• random variable
8
All that was about the interpretation of probabilities.
The mathematical definition of probability, studied in the field of probability theory, is entirely distinct from the question of what the definition of probability is as applied to the real world. Both frequentists and Bayesians use the same mathematical framework to express probability as a number between 0 and 1. The only difference between them is in what this number is taken to express.
We’ll go through the basic ingredients quickly.
sample space
9
Discrete sample spaces:
Ω = {heads, tails}
Ω = {1, 2, 3, 4, 5, 6}
Ω = {(1, 1), (1, 2), …, (6, 6)}
Continuous sample space:
Ω = ℝ
First the sample space. These are the single outcomes or truths that we wish to model. If we flip a coin, our sample space is the set of the two outcomes heads and tails.
We can have discrete sample spaces or continuous ones.
A discrete sample space can also be infinite: consider flipping a coin and counting how many flips it takes to see tails. In this case any number of flips is possible, so the sample space is the natural numbers (although any number larger than 20 will get a vanishingly small probability).
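To put a number on how small those probabilities get, here is a short sketch. It assumes a fair coin (the slide does not say so), in which case the first tails appears on flip n with probability (1/2)^n; the function name is introduced here for illustration.

```python
from fractions import Fraction

def p_first_tails_at(n):
    """Probability that the first tails shows up on flip n (fair coin assumed):
    n - 1 heads in a row, followed by one tails."""
    return Fraction(1, 2) ** n

# Probability of needing MORE than 20 flips: everything not covered by n = 1..20.
p_more_than_20 = 1 - sum(p_first_tails_at(n) for n in range(1, 21))
print(p_more_than_20, float(p_more_than_20))
```

Every natural number still has a non-zero probability, which is why the whole infinite set belongs in the sample space.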
event space
10
E = {{}, {1}, {2}, {3}, …, {1, 2}, …, {1, 2, 3, 4, 5, 6}}
Events are the things that have probability: subsets of the sample space. All even throws, all throws higher than three, etc.
powerset: the set of all subsets
sigma-algebra: for continuous sample spaces.
From the sample space, we construct the event space. Events are those things that can have probabilities. These include the single elements of the sample space (as one-element sets), like the event of rolling a six with a die, but they also include sets of multiple elements of the sample space, like the event of rolling a one or a six and the event of rolling an even number. Even the empty set and the set of all six numbers are events. As we will see, these will get probabilities 0 and 1 respectively.
How the event space is constructed is a technical business. For our purposes, we can simply say that if the sample space is discrete, then the event space is the powerset of the sample space: the set of all possible subsets we can make.
For continuous sample spaces, not every subset can be an event. We need to make sure that our event space is a thing called a “sigma algebra.” We won’t need to worry about this in this course.
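For the discrete case, the event space of a single die roll can be enumerated directly. The sketch below builds the powerset with `itertools`; the helper name `powerset` is a choice made here, not something from the course material.

```python
from itertools import chain, combinations

def powerset(sample_space):
    """Return every subset of sample_space as a frozenset (one per event)."""
    items = list(sample_space)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

omega = {1, 2, 3, 4, 5, 6}
events = powerset(omega)
print(len(events))                       # one event per subset of omega
print(frozenset({2, 4, 6}) in events)    # "rolling an even number" is an event
```

A sample space of n outcomes yields 2^n events, so the event space grows quickly even for small discrete sample spaces.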
random variable, probability function
A way to describe events
D “takes values” 1, 2, 3, 4, 5, 6
p(D = 4), p(D > 3), p(D is even) etc
Random variables in ML:
• features i of instance: Xi
• class of instance: Y
• Model (parameters): M
11
Random variables have a confusing and convoluted definition, so we’ll just give you the intuitive interpretation.
Random variables help us to describe events. We can think of a random variable D as something that takes the values in the sample space, so that we can use it to describe events.
We then assign a probability to each event with a probability function p. This function must satisfy several constraints, but we’ll take those as read for now, and just say that it produces a value between 0 and 1.
In machine learning, it’s common to model features, target labels, and sometimes even model parameters as random variables. If we are referring to a dataset of multiple instances, we model each as a separate random variable with the same distribution.
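The statements p(D = 4), p(D > 3) and p(D is even) from the slide can be computed by treating each as a subset of the sample space. The sketch assumes a fair die, so every outcome gets probability 1/6; the helper `p` taking a condition is an illustrative device, not standard notation.

```python
from fractions import Fraction

OMEGA = [1, 2, 3, 4, 5, 6]
P_OUTCOME = {d: Fraction(1, 6) for d in OMEGA}  # assumption: a fair die

def p(condition):
    """Probability of the event {d in OMEGA : condition(d) is true}."""
    return sum(P_OUTCOME[d] for d in OMEGA if condition(d))

print(p(lambda d: d == 4))      # p(D = 4)
print(p(lambda d: d > 3))       # p(D > 3)
print(p(lambda d: d % 2 == 0))  # p(D is even)
```

Each condition on D picks out one event from the event space, and the probability function sums the probabilities of the outcomes inside it.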
shorthand
p(X = 0): the probability that X takes the value 0. A number between 0 and 1.
p(X = x): the probability that X takes the value x. A function of x.
12
$$p(X = x) = \begin{cases} \frac{1}{4} & \text{if } x = 0 \\ \frac{3}{4} & \text{if } x = 1 \end{cases}$$
p(X), p(x): shorthand for p(X = x)
Interpreting what a statement of a probability function means depends on whether all variables are “filled in.” In the first line, X = 0 refers to a single, well-defined event, so p(X = 0) refers to a single value between 0 and 1. In the second line we have a classical variable x, so the statement “X = x” can refer to different events, depending on what x is. In other words, here “p(X = x)” is a function of x. For example, if x can take values 0 and 1, it may refer to a simple function like the one shown here.
Since we usually know which outcomes belong to which random variables, p(X) and p(x) can both be used as shorthand for p(X = x). Note that in these cases, x stands for some specific value, and X stands for the random variable.
probability vs. probability density
13
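[Figure: left, a discrete distribution over the outcomes 1 to 6; right, a normal distribution on a continuous sample space.]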
On both discrete and continuous sample spaces, the events we describe have probability.
However, when we look at a graph like the one on the right, describing a normal distribution defined on a continuous sample space, it’s important to realize that this function does not express a probability. If I ask you, under this distribution, which has the higher probability, 0 or 1, the answer is that they both have probability 0. They have different probability density, but what has probability in a normal distribution is an interval.
The interval from 0 to 1 has higher probability than the interval from 1 to 2. The point 0 has higher probability density than the point 1.
This is important because probability densities can have values larger than 1 and probabilities can’t.
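These claims can be checked numerically. The sketch below assumes a standard normal distribution (mean 0, standard deviation 1), since the parameters of the plotted distribution are not given; the narrow distribution with standard deviation 0.1 is likewise an illustrative choice.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of a normal distribution at the point x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Probability that a normal random variable is at most x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def interval_probability(a, b, mu=0.0, sigma=1.0):
    """Probability of the interval [a, b]: this is what actually has probability."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

print(normal_pdf(0), normal_pdf(1))                            # densities at two points
print(interval_probability(0, 1), interval_probability(1, 2))  # probabilities of intervals
print(interval_probability(0, 0))                              # a single point has probability 0
print(normal_pdf(0, sigma=0.1))                                # a density can exceed 1
```

The last line shows why the distinction matters: squeezing the same total probability into a narrower range pushes the density above 1, while every interval probability stays between 0 and 1.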
probabilities and concepts
for random variables X and Y
joint probability: p(X, Y)
marginal probability: p(X)
conditional probability: p(X | Y)
(conditional) independence
Bayes’ theorem
14
Now that we have the basic language of probability theory in place, we can look at some of the most important concepts. We will quickly review these five concepts.
Note that we have a single sample space and event space, and the random variables X and Y will help us describe the events that we’re interested in.
running example
Age = {young, teen, old}
Teeth = {healthy, unhealthy, fake}
15
We will use the following running example: we sample a random person from the Dutch population and we check their age and the health of their teeth (binning the results into three categories for each variable).
We want to ask questions like:
• what is the probability of seeing an old person?
• what is the probability that a young person has fake teeth?
• does a person’s age influence the health of their teeth, or is there no relation?
The sample space is the set of the nine different pairs of values we can observe, and the event space is the powerset of that. The random variables Age and Teeth will help us describe these events.
joint probability
p(Age = old & Teeth = healthy)
p(Age, Teeth):
16
             Teeth
Age      h       u       f
y        5/18    3/18    1/18
t        1/18    1/18    2/18
o        1/18    1/18    3/18
The joint distribution is the most important distribution. It tells us the probability of each atomic event: each event that contains a single element in our sample space.
Since we have two random variables in our example, we can specify the joint distribution in a small table. The probabilities of all these events sum to one.
Note that p(Age = old & Teeth = healthy) refers to a single value (1/18), because we have specified the event. p(Age, Teeth) does not refer to a single value, because the variables are not instantiated: it represents a function of two variables (i.e. the whole table).
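The joint table can be written down as a dictionary from (age, teeth) pairs to probabilities, which makes the difference between a single value and the whole function concrete. The numbers are the ones from the slide; the variable names and the use of exact fractions are choices made for this sketch.

```python
from fractions import Fraction

AGES = ["young", "teen", "old"]
TEETH = ["healthy", "unhealthy", "fake"]
# Numerators of the joint table (all over 18): rows are ages, columns are teeth.
COUNTS = [[5, 3, 1],
          [1, 1, 2],
          [1, 1, 3]]

# p(Age, Teeth): the whole table, a function of two variables.
JOINT = {(a, t): Fraction(COUNTS[i][j], 18)
         for i, a in enumerate(AGES) for j, t in enumerate(TEETH)}

print(JOINT[("old", "healthy")])  # p(Age = old & Teeth = healthy): a single value
print(sum(JOINT.values()))        # the atomic events together have probability 1
```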
marginal probability
17
             Teeth
Age      h       u       f       p(Age)
y        5/18    3/18    1/18    9/18
t        1/18    1/18    2/18    4/18
o        1/18    1/18    3/18    5/18   <- p(Age = old)
p(Teeth) 7/18    5/18    6/18
If we want to focus on just one random variable, all we need to do is sum over the rows or columns.
For instance, the probability that Age = old, regardless of the value of Teeth, is the probability of the event {(o, h), (o, u), (o, f)}. Because we can write these sums in the margins of our joint probability table, this process of “getting rid” of a variable is also called marginalizing out (as in “we marginalize out the variable Teeth”). The resulting distribution over the remaining variable(s) is called a marginal distribution.
marginal probability
p(y) = p(y, h) + p(y, u) + p(y, f )
in general, for joint distribution p(X, Y):
p(x) = ∑y in Y p(x, y)
18
This is what marginalizing looks like in symbols: we sum the joint probabilities for all values of one of the random variables, keeping the value of the other fixed.
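In code, marginalizing out a variable is a loop that adds each joint probability to the entry for the variable we keep. The sketch uses the Age/Teeth table from the running example; the function name `marginal` and its `axis` argument are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

AGES = ["young", "teen", "old"]
TEETH = ["healthy", "unhealthy", "fake"]
COUNTS = [[5, 3, 1], [1, 1, 2], [1, 1, 3]]  # numerators over 18, from the slide
JOINT = {(a, t): Fraction(COUNTS[i][j], 18)
         for i, a in enumerate(AGES) for j, t in enumerate(TEETH)}

def marginal(joint, axis):
    """Sum out the other variable; axis=0 keeps Age, axis=1 keeps Teeth."""
    result = defaultdict(Fraction)  # Fraction() is 0, so sums start at zero
    for outcome, prob in joint.items():
        result[outcome[axis]] += prob
    return dict(result)

p_age = marginal(JOINT, 0)
p_teeth = marginal(JOINT, 1)
print(p_age["old"])      # p(Age = old) = p(o, h) + p(o, u) + p(o, f)
print(p_teeth)
```

The results match the margins of the table, with fractions reduced to lowest terms (for example 9/18 is printed as 1/2).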
conditional probability
p(T = f | A = y) = p(f, y) / p(y) = 1/9
19
             Teeth
Age      h       u       f
y        5/18    3/18    1/18
t        1/18    1/18    2/18
o        1/18    1/18    3/18
The conditional probability is the probability over one variable, if the value of another is known.
If we know that somebody is young, we know that the probability of them having false teeth must be much lower.
The conditional probability p(X = x | Y = y) is computed by taking the joint probability of (x, y) and normalising by the sum of the probabilities in the row or column corresponding to the part that’s given in the conditional.
Imagine we’re throwing darts at this table, and the probability of hitting a certain cell is the joint probability indicated in the cell. The conditional probability p(T = f | A = y) is the probability that the dart hits the (y, f) cell, given that it’s hit the y row.
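The dart picture translates into code as: keep only the row we know the dart landed in, then rescale that row so it sums to one. This is a sketch on the running example; the function name is introduced here for illustration.

```python
from fractions import Fraction

AGES = ["young", "teen", "old"]
TEETH = ["healthy", "unhealthy", "fake"]
COUNTS = [[5, 3, 1], [1, 1, 2], [1, 1, 3]]  # numerators over 18, from the slide
JOINT = {(a, t): Fraction(COUNTS[i][j], 18)
         for i, a in enumerate(AGES) for j, t in enumerate(TEETH)}

def teeth_given_age(joint, age):
    """p(Teeth | Age = age): the row for `age`, normalised by its sum."""
    row_total = sum(p for (a, _), p in joint.items() if a == age)  # marginal p(age)
    return {t: p / row_total for (a, t), p in joint.items() if a == age}

given_young = teeth_given_age(JOINT, "young")
print(given_young["fake"])                       # p(T = f | A = y)
print(teeth_given_age(JOINT, "old")["fake"])     # p(T = f | A = o), for comparison
```

Comparing the two printed values shows the effect described above: knowing that someone is young makes fake teeth far less probable than knowing that they are old.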
conditional probability
20
$$p(X = x \mid Y = y) = \frac{p(X = x, Y = y)}{\sum_{x'} p(X = x', Y = y)} = \frac{p(x, y)}{p(y)}$$
Note that the denominator is just the marginal probability p(y).
useful
21
$$p(x, y) = p(x \mid y)\, p(y)$$
If we re-arrange the factors in the definition of the conditional probability, we get this equation, showing a kind of decomposition of the joint probability. This comes up a lot, so it’s useful to make a mental note of it.
continuous
22
[Figure: a bivariate normal distribution over X and Y, shown with its two marginal distributions.]
image source: By IkamusumeFan - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=30432580
Here is what these concepts look like with continuous random variables (a bivariate normal distribution in this case). The joint probability distribution is represented by the point cloud in the middle. Marginalizing out either variable results in a univariate normal (the red and blue distributions).
The conditional distribution corresponds to a vertical or horizontal slice through the joint distribution (and also results in a univariate normal).
independence
X and Y are independent if
p(X, Y) = p(X)p(Y)
which implies p(X|Y) = p(X)
X and Y are conditionally independent given Z if
p(X, Y | Z) = p(X | Z) p(Y | Z)
23
If two variables X and Y are independent, then knowing Y will not change what we know about X.
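The definition gives a direct test for the question raised in the Age/Teeth running example (“does a person’s age influence the health of their teeth?”): compare every cell of the joint table with the product of its two marginals. The sketch below does this with exact fractions; the helper names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

AGES = ["young", "teen", "old"]
TEETH = ["healthy", "unhealthy", "fake"]
COUNTS = [[5, 3, 1], [1, 1, 2], [1, 1, 3]]  # numerators over 18, from the slide
JOINT = {(a, t): Fraction(COUNTS[i][j], 18)
         for i, a in enumerate(AGES) for j, t in enumerate(TEETH)}

def marginal(joint, axis):
    """Sum out the other variable; axis=0 keeps Age, axis=1 keeps Teeth."""
    result = defaultdict(Fraction)
    for outcome, prob in joint.items():
        result[outcome[axis]] += prob
    return dict(result)

def is_independent(joint):
    """True if p(a, t) = p(a) p(t) holds for every cell of the table."""
    p_age, p_teeth = marginal(joint, 0), marginal(joint, 1)
    return all(p == p_age[a] * p_teeth[t] for (a, t), p in joint.items())

print(is_independent(JOINT))
# Example cell: p(young, healthy) versus p(young) * p(healthy)
print(JOINT[("young", "healthy")], marginal(JOINT, 0)["young"] * marginal(JOINT, 1)["healthy"])
```

For this table the check fails, so Age and Teeth are dependent: knowing someone’s age does change what we believe about their teeth.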
Conditional independence means that the two variables are dependent, but their dependence is entirely explained by a third variable Z. If we condition on Z, the variables become independent.
For an example: consider two people a and b who work in different cities in the Netherlands. Define random variables A and B describing whether or not a and b respectively are late for dinner. They live far enough apart that the two events are entirely unrelated, except that when it snows in the Netherlands everything shuts down. Represent the event of snow by the random variable S. Now, if I know that A was late for dinner, there is a small probability that that was caused by snow. Thus the probability that B was late for dinner as well slightly increases. However, if I know that it didn’t snow (I condition on S), knowing that A was late for dinner doesn’t influence the probability of B being late for dinner at all.
conditional independence
A: Alice is home in time for dinner
B: Bob is home in time for dinner
G: a monster attacks the city
p(A, B | G) = p(A | G) p(B | G)
p(A|G, B) = p(A|G)
24
Conditional independence comes up a lot, and it can be tricky to wrap your head around at first, so here’s an example.
Imagine two people who work in different areas of a very big city. In principle, they work so far apart that whether or not they arrive home in time for dinner is completely independent. Knowing whether or not Alice is late for dinner tells you nothing about whether Bob is home in time for dinner. No aspect of their lives (weather, traffic) intersects in a meaningful way, except one.
Very rarely, a large monster attacks the city. In that case, all traffic shuts down and everybody is late for dinner. That means that if we know that Bob is late for dinner, there is a slight chance that it’s because of the monster, which should slightly raise the probability that Alice is late for dinner. However, once we know whether or not the monster has attacked, knowing that Bob is late provides no additional information.
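The monster story can be made concrete with a tiny joint distribution. All numbers below are invented for illustration (a 1% chance of a monster attack, and base lateness rates of 10% for Alice and 20% for Bob when there is no monster); the variables track being late rather than being on time. The joint is built so that Alice and Bob are independent once the monster variable is known.

```python
from fractions import Fraction
from itertools import product

P_MONSTER = Fraction(1, 100)  # hypothetical: monster attacks are rare
P_LATE = {                    # hypothetical p(late | monster attacked?)
    "alice": {True: Fraction(1), False: Fraction(1, 10)},
    "bob":   {True: Fraction(1), False: Fraction(1, 5)},
}

def bernoulli(p_true, value):
    """Probability of `value` for a binary variable that is True with p_true."""
    return p_true if value else 1 - p_true

# Joint over (monster, alice_late, bob_late): p(g) p(a | g) p(b | g).
JOINT = {
    (g, a, b): bernoulli(P_MONSTER, g)
    * bernoulli(P_LATE["alice"][g], a)
    * bernoulli(P_LATE["bob"][g], b)
    for g, a, b in product([True, False], repeat=3)
}

def p(event):
    """Probability of the set of outcomes (g, a, b) for which event is true."""
    return sum(prob for outcome, prob in JOINT.items() if event(*outcome))

def p_given(event, given):
    """Conditional probability p(event | given)."""
    return p(lambda g, a, b: event(g, a, b) and given(g, a, b)) / p(given)

def alice_late(g, a, b):
    return a

def bob_late(g, a, b):
    return b

def no_monster(g, a, b):
    return not g

print(float(p(alice_late)))                   # p(Alice late)
print(float(p_given(alice_late, bob_late)))   # p(Alice late | Bob late): slightly higher
print(p_given(alice_late, no_monster))        # p(Alice late | no monster)
print(p_given(alice_late, lambda g, a, b: b and not g))  # ... and Bob late: unchanged
```

Without information about the monster, learning that Bob is late raises the probability that Alice is late; once we condition on the monster, Bob’s lateness adds nothing.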
Bayes’ rule
the inversion problem:
It’s easy to express the probability of an observable given some hidden cause (assuming we have a model of the world). However, we usually want the opposite.
25
In short, we need a way to “turn around” the conditional probability. If we know p(X | Y), how do we work out p(Y | X)?
26
$$p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{p(X)}$$
Here’s how Bayes’ rule is usually written.
27
$$p(m \mid a) = \frac{p(a \mid m)\, p(m)}{p(a)}$$
Here’s a simple example. Let’s say that we observe that Alice is late for dinner (and we observe nothing else). Does this tell us anything about whether a monster has attacked the city? It doesn’t tell us much; it’s extremely rare that a monster attacks the city, so it’s almost certain that Alice is late for other reasons. Still, if Alice were on time, we’d be almost sure that no monster had attacked the city, since that would almost certainly make her late. So we may not know much, but we know something.
In this case it’s easy for us to work out the probability that Alice is late given the monster attack. This is usually the case when the condition is the cause of the observable. The opposite is usually what we are interested in, since we have the observable and want to reason about its cause. This is where Bayes’ rule comes in.
Say that we know the probability that we observe Alice being late, given that a monster attack happened, p(a | m), is somewhere near 1. Bayes’ rule tells us how to use this to calculate the opposite conditional p(m | a). This is not near 1, because we multiply p(a | m) by the marginal probability of a monster attack p(m), which is really low. We then divide by the probability of Alice being late in general, p(a): the more likely Alice is to be late due to other causes, the lower the probability that it is caused by a monster attack.
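A worked version of this calculation, with made-up numbers: suppose p(m) = 1/1000, p(a | m) = 99/100, and Alice is late on 1 in 10 evenings without a monster. None of these values come from the lecture; they are only there to show the mechanics.

```python
from fractions import Fraction

# Hypothetical inputs (not from the source):
p_m = Fraction(1, 1000)            # prior: probability of a monster attack
p_a_given_m = Fraction(99, 100)    # Alice is almost surely late if a monster attacks
p_a_given_not_m = Fraction(1, 10)  # Alice is sometimes late for other reasons

# Marginal probability that Alice is late, summing over both possibilities.
p_a = p_a_given_m * p_m + p_a_given_not_m * (1 - p_m)

# Bayes' rule: p(m | a) = p(a | m) p(m) / p(a)
p_m_given_a = p_a_given_m * p_m / p_a

print(float(p_a))
print(float(p_m_given_a))
```

Under these assumptions the posterior is roughly ten times the prior but still far from 1: observing Alice being late tells us something about the monster, just not much.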
28
$$p(m \mid a) = \frac{p(a \mid m)\, p(m)}{p(a)} = \frac{p(a \mid m)\, p(m)}{p(a \mid t)\, p(t) + p(a \mid m)\, p(m) + p(a \mid s)\, p(s)}$$
The three terms in the denominator correspond, in order, to Alice’s lateness being caused by traffic, by a monster, and by snowfall.
Say there are three possible reasons for Alice to be late: traffic, monster or snowfall. Then we can see the denominator as a sum marginalizing out the cause for Alice’s lateness. The proportion of this sum given by the middle term is the probability that Alice’s lateness is caused by a monster attack.
Consider the situation where both traffic and snowfall are far more likely than a monster attack, so p(t) and p(s) are much higher than p(m), but neither traffic nor snowfall ever causes Alice to be late, perhaps because she cycles home from work, and has a bike with good snow tires. In that case both the first and last term in the sum become zero, and despite the fact that monster attacks are really rare, we can still conclude that a monster has attacked if we notice that Alice is late for dinner.
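The effect of the denominator can be checked with a small function that applies Bayes’ rule to a set of candidate causes. The sketch treats traffic, monster and snowfall as mutually exclusive and exhaustive, which is what the three-term sum implicitly assumes; all probabilities are invented for illustration.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """p(cause | a) for each cause, given p(cause) and p(a | cause).
    Assumes the causes are mutually exclusive and their priors sum to 1."""
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    evidence = sum(joint.values())  # p(a): the cause marginalized out
    return {c: joint[c] / evidence for c in priors}

# Hypothetical priors: traffic and snowfall are far more likely than a monster.
priors = {"traffic": Fraction(600, 1000),
          "snowfall": Fraction(399, 1000),
          "monster": Fraction(1, 1000)}

# Alice cycles: traffic and snowfall never make her late.
cyclist = {"traffic": Fraction(0), "snowfall": Fraction(0), "monster": Fraction(99, 100)}
# For comparison: an Alice who is regularly delayed by traffic and snow.
driver = {"traffic": Fraction(3, 10), "snowfall": Fraction(1, 2), "monster": Fraction(99, 100)}

print(posterior(priors, cyclist)["monster"])
print(float(posterior(priors, driver)["monster"]))
```

With the cyclist’s likelihoods the monster is the only term left in the sum, so the posterior is exactly 1 despite the tiny prior; with the driver’s likelihoods the other causes dominate and the posterior stays small.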
conditional probability and Bayes’ rule
29
$$p(X \mid Y) = \frac{p(Y, X)}{p(Y)} = \frac{p(Y \mid X)\, p(X)}{p(Y)}$$
First equality: definition of conditional probability. Second equality: see slide 21.
If we start with the definition of conditional probability, then Bayes’ rule follows directly from filling in the equation from slide 20.
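As a sanity check, the sketch below computes p(X | Y) in two ways on a small joint distribution: once directly from the definition of conditional probability, and once through Bayes’ rule. The joint table and the function names are invented for this illustration.

```python
# A made-up joint distribution p(X, Y) over two binary variables,
# indexed as joint[(x, y)].
joint = {
    (0, 0): 0.1, (0, 1): 0.3,
    (1, 0): 0.2, (1, 1): 0.4,
}

def marginal_x(x):
    # p(X = x), summing out Y.
    return sum(p for (xi, _), p in joint.items() if xi == x)

def marginal_y(y):
    # p(Y = y), summing out X.
    return sum(p for (_, yi), p in joint.items() if yi == y)

def cond_x_given_y(x, y):
    # Definition of conditional probability: p(X | Y) = p(X, Y) / p(Y).
    return joint[(x, y)] / marginal_y(y)

def cond_y_given_x(y, x):
    # The same definition with the roles of X and Y swapped.
    return joint[(x, y)] / marginal_x(x)

def bayes_x_given_y(x, y):
    # Bayes' rule: p(X | Y) = p(Y | X) p(X) / p(Y).
    return cond_y_given_x(y, x) * marginal_x(x) / marginal_y(y)

for x in (0, 1):
    for y in (0, 1):
        print(x, y, cond_x_given_y(x, y), bayes_x_given_y(x, y))
```

Both routes give the same value for every combination of x and y, up to floating point rounding.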
Probabilistic Models Part 2: Learning with probability
learning
We understand the machine, p(Data | θ) is known.
But we observe only the Data (and the input) and we want to know θ.
31
Observed data
θ : the parameters
“machine”
Here is an analogy for the way probability is usually applied in statistics and machine learning. We assume some “machine” (which could be any process, the universe, or an actual machine) has generated our data, by a process that is partly deterministic and partly random. The configuration of this machine is determined by its parameters (theta). Theta could be a single number, several numbers or even a complicated data structure.
We know how the machine works, so if we know theta, we know the probability of each dataset. The problem is that we only observe the data.
frequentist learning
Maximum likelihood estimation
The function L(θ) = p(X|θ) is called the likelihood.
We often maximize the logarithm of the likelihood instead.
32
$$ \hat{\theta} = \arg\max_{\theta} \; p(X \mid \theta) $$
In frequentist learning, we are given some data and our job is to guess the true model (out of a model class) that generated the data.
In the frequentist view of the world, the true model is not subject to probability. It doesn’t change if we repeat the experiment, so we shouldn’t apply probability to it. We just try to guess which it is. This is typical of frequentist approaches: we build algorithms that give us a point estimate for our model parameters. That is, they return one point in our model space.
One of the most common criteria is that we should prefer the model (represented by theta) for which the probability of seeing the data that we saw is highest. This is called the maximum likelihood principle. Under the maximum likelihood principle, picking a model becomes an optimization problem.
Often, we make the problem easier by optimizing for the logarithm of the likelihood. This doesn’t move the optimum, but the log-likelihood is an easier function to deal with.
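A minimal sketch of maximum likelihood as an optimization problem, under some assumptions that are not in the lecture: the model class is a coin with unknown probability of heads theta, the data (8 heads and 4 tails) is hypothetical, and the optimization is a plain search over a grid of candidate values.

```python
import math

# Hypothetical data: 8 heads and 4 tails from a coin with unknown
# probability of heads, theta.
heads, tails = 8, 4

def log_likelihood(theta):
    # log p(X | theta) for independent flips.
    return heads * math.log(theta) + tails * math.log(1.0 - theta)

# Model space: a grid of candidate values for theta, excluding 0 and 1,
# where the logarithm is undefined.
candidates = [i / 100 for i in range(1, 100)]

# Maximum likelihood: pick the candidate with the highest log-likelihood.
theta_hat = max(candidates, key=log_likelihood)
print(theta_hat)
```

The selected value lies next to the relative frequency of heads in the data, 8/12. A grid search is only practical for very small model spaces; it is used here because it makes the argmax explicit.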
Bayesian learning
33
$$ p(\theta \mid X) = \frac{p(X \mid \theta)\, p(\theta)}{p(X)} $$
posterior distribution: p(θ | X)
data distribution: p(X | θ)
prior distribution: p(θ)
model evidence: p(X)
In Bayesian learning, we can talk about the probability of the true model parameters taking a particular value. We don’t know the true parameters, but the data gives us some idea, so we express that uncertainty as a probability distribution over the model space.
The three parts of the right-hand side have these names. The prior distribution is a name you’ll hear often; it expresses our beliefs about the model before we’ve seen the data. For instance, if we do spam classification in a Bayesian way, we might have a prior belief about the probability of spam, which we then update by looking at the content of the email (the data). Our beliefs about the parameters after seeing the data are expressed by the posterior distribution.
Note that Bayesian learning does not, in principle, require us to search or optimize anything. If we can work out the function on the right-hand side of this equation, we have everything we need. If we need a good model, we can pick the one to which p(θ|X) assigns the highest probability, or we can sample a model and get a good fit with high probability. We can also study other properties of the distribution: for instance, the variance of this distribution is a good indication of how uncertain we still are about the parameters of the model.
In some cases, like for normal distributions, we can work all of this out analytically. For more complicated models, it’s usually impossible to work out the posterior analytically, and we have to make do with a function that approximates it, or with a number of individual samples from the posterior.
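One simple way to approximate a posterior is to evaluate it on a grid. The sketch below does this for a coin with unknown probability of heads theta. The model, the uniform prior, the data counts and the function names are all assumptions made for this illustration; none of them come from the lecture.

```python
def grid_posterior(heads, tails, grid_size=1001):
    # Candidate values for theta, evenly spaced over [0, 1].
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    # Assumed prior: uniform over the grid.
    prior = [1.0 / grid_size] * grid_size
    # Numerator of Bayes' rule: p(X | theta) * p(theta).
    unnorm = [(t ** heads) * ((1.0 - t) ** tails) * p
              for t, p in zip(thetas, prior)]
    # Model evidence p(X): the sum over all candidates.
    evidence = sum(unnorm)
    return thetas, [u / evidence for u in unnorm]

def mean_and_variance(thetas, posterior):
    mean = sum(t * p for t, p in zip(thetas, posterior))
    variance = sum((t - mean) ** 2 * p for t, p in zip(thetas, posterior))
    return mean, variance

# Same proportion of heads, but ten times as much (hypothetical) data.
mean_few, var_few = mean_and_variance(*grid_posterior(8, 4))
mean_many, var_many = mean_and_variance(*grid_posterior(80, 40))
print(mean_few, var_few)
print(mean_many, var_many)
```

With more data the variance of the posterior shrinks, which matches the idea that the variance indicates how uncertain we still are about the parameters.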
We won’t deal with pure Bayesian learning much in this course, but in machine learning the distinction between frequentist and Bayesian learning is not always adhered to religiously, and concepts from both are sometimes freely combined.
a simple example
p(Heads | Straight) =1/2
p(Tails | Straight) = 1/2
34
p(Heads | Bent) = 4/5
p(Tails | Bent) = 1/5
HTH HHT HHT HTH
To explain both maximum likelihood fitting and Bayesian learning, let’s look at a simple example. We have two coins, a bent one and a straight one. Flipping these coins gives us different probabilities of heads and tails.
We ask a friend to pick a random coin without showing us, and to flip it twelve times. The resulting sequence has more heads than tails, but not such a disparity that you would never expect it from a fair coin. If we had to guess which coin our friend had picked, which should we guess?
imagesource:https://www.magictricks.com/bent.html
a simple example
35
Model space: {Straight, Bent}
Observed data: HTH HHT HHT HTH
This is a simple version of a model selection problem. Our model class consists of two models (the two coins) and our data consists of 12 instances (the results of the coin flips).
Note that picking just one coin is a frequentist approach, and giving each coin a probability is a Bayesian approach.
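To illustrate what “giving each coin a probability” could look like, here is a sketch that applies Bayes’ rule to the two coins. Since the friend picks a coin at random, it assumes a prior of 1/2 for each coin; the lecture does not state the prior explicitly, so treat that value as an assumption.

```python
# Probability of heads for each model, as given on the slide.
heads_prob = {"Straight": 1 / 2, "Bent": 4 / 5}

# The observed sequence of twelve flips.
data = "HTHHHTHHTHTH"

# Assumed prior: the friend is equally likely to pick either coin.
prior = {"Straight": 0.5, "Bent": 0.5}

def likelihood(coin, flips):
    # p(Data | Coin) for independent flips.
    p = heads_prob[coin]
    result = 1.0
    for flip in flips:
        result *= p if flip == "H" else 1.0 - p
    return result

# Numerator of Bayes' rule for each coin, then normalize by the evidence.
unnorm = {coin: likelihood(coin, data) * prior[coin] for coin in prior}
evidence = sum(unnorm.values())
posterior = {coin: unnorm[coin] / evidence for coin in unnorm}
print(posterior)
```

Under this prior the bent coin ends up with a posterior probability of roughly 0.52, so the Bayesian answer keeps both coins in play rather than committing to one of them.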
maximum likelihood
36
$$ \arg\max_{\text{Coin} \in \{\text{Bent}, \text{Straight}\}} \; p(\text{HTHHHTHHTHTH} \mid \text{Coin}) $$
$$ \arg\max_{\text{Model} \in \text{Model Space}} \; p(\text{Data} \mid \text{Model}) $$
Let’s start with maximum likelihood. Here is the optimization objective. We need to work out the probability of the data given the model for each model, and pick the highest one.
which coin?
37
HTHHHTHHTHTH
$$ p(D \mid \text{Bent}) = \frac{4}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} \cdot \frac{4}{5} \cdot \frac{4}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} \cdot \frac{4}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} = \left(\frac{4}{5}\right)^{8} \left(\frac{1}{5}\right)^{4} \approx 0.000268 $$
$$ p(D \mid \text{Straight}) = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \left(\frac{1}{2}\right)^{12} \approx 0.000244 $$
Since the coin flips are independent, the probability over the whole sequence is just the product over the probabilities of the individual flips. There’s not much in it, but the likelihood for the bent coin is slightly higher, so that’s the preferred model under the maximum likelihood criterion.
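The two products can be checked with exact rational arithmetic. The sketch below uses Python’s `fractions` module; the variable and function names are chosen for this illustration.

```python
from fractions import Fraction

# Probability of heads for each coin, as exact fractions.
p_heads = {"Bent": Fraction(4, 5), "Straight": Fraction(1, 2)}

# The observed sequence of twelve flips.
data = "HTHHHTHHTHTH"

def likelihood(coin):
    # Product over the probabilities of the individual, independent flips.
    p = p_heads[coin]
    result = Fraction(1)
    for flip in data:
        result *= p if flip == "H" else 1 - p
    return result

lik = {coin: likelihood(coin) for coin in p_heads}

# Maximum likelihood: the coin with the highest p(Data | Coin).
best = max(lik, key=lik.get)

# How much more likely the data is under the bent coin.
ratio = lik["Bent"] / lik["Straight"]

print({coin: float(value) for coin, value in lik.items()})
print(best, float(ratio))
```

The likelihood ratio comes out at about 1.10, which puts a number on “there’s not much in it”: the data is only about ten percent more likely under the bent coin.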
(LOG) LIKELIHOOD: What we maximise to fit a probability model
LOSS: What we minimise to fit a machine learning model
38
We often take the logarithm of the likelihood. The logarithm is a monotonically increasing function, so the likelihood and the log likelihood have their maxima in the same place, but the log likelihood is often easier to manipulate symbolically (see the first homework exercise).
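The sketch below checks that the likelihood and the log likelihood pick the same coin on the twelve flips from the example. It also shows a practical benefit that the lecture does not mention here and that is added as an observation: for long sequences the plain product underflows to zero in floating point, while the sum of logarithms stays usable. The long sequence is artificial (the original twelve flips repeated 1000 times).

```python
import math

p_heads = {"Bent": 4 / 5, "Straight": 1 / 2}

short_data = "HTHHHTHHTHTH"
long_data = short_data * 1000  # artificial: 12,000 flips

def likelihood(coin, flips):
    # Product of the per-flip probabilities.
    p = p_heads[coin]
    result = 1.0
    for flip in flips:
        result *= p if flip == "H" else 1.0 - p
    return result

def log_likelihood(coin, flips):
    # Sum of the per-flip log probabilities.
    p = p_heads[coin]
    return sum(math.log(p if flip == "H" else 1.0 - p) for flip in flips)

# On the short sequence both criteria select the same model.
best_by_lik = max(p_heads, key=lambda c: likelihood(c, short_data))
best_by_log = max(p_heads, key=lambda c: log_likelihood(c, short_data))
print(best_by_lik, best_by_log)

# On the long sequence the product underflows, the sum of logs does not.
print({c: likelihood(c, long_data) for c in p_heads})
print({c: log_likelihood(c, long_data) for c in p_heads})
```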
The log likelihood of a probability distribution is a lot like the loss functions we’ve already encountered many times.
In fact, if we want to fit a probability distribution inside a deep learning system, we usually take the negative log likelihood, so that we can do gradient descent.
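A minimal sketch of this idea, without a deep learning library: a single parameter w is passed through a sigmoid to give a probability of heads, and plain gradient descent minimizes the negative log likelihood. The data (8 heads, 4 tails), the sigmoid parameterization, the learning rate and the number of steps are all assumptions made for this illustration.

```python
import math

# Hypothetical data: 8 heads and 4 tails.
heads, tails = 8, 4

def sigmoid(w):
    return 1.0 / (1.0 + math.exp(-w))

def nll(w):
    # Negative log likelihood of the data, with theta = sigmoid(w).
    theta = sigmoid(w)
    return -(heads * math.log(theta) + tails * math.log(1.0 - theta))

def nll_grad(w):
    # Derivative of nll with respect to w, using
    # d theta / d w = theta * (1 - theta).
    theta = sigmoid(w)
    return (heads + tails) * theta - heads

# Plain gradient descent on the negative log likelihood.
w = 0.0
learning_rate = 0.1
for _ in range(1000):
    w -= learning_rate * nll_grad(w)

theta_hat = sigmoid(w)
print(theta_hat, nll(w))
```

The resulting probability of heads ends up at the relative frequency 8/12, the same answer that maximizing the likelihood directly would give. Minimizing the negative log likelihood is the same optimization, written as a loss.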
probability density function
39
$$N(x \mid \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{1}{2\sigma^2}(x - \mu)^2\right]$$
We can see how a loss function and a log likelihood are similar when we look at a normal distribution. The probability density function of the normal distribution is this complicated function.
The probability density of our whole data, given some mean and standard deviation, is simply the product of all individual probability densities. This follows from the assumption that the instances are drawn independently from the same distribution.
We take the logarithm of this product, to give us the log likelihood of the data.
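The following sketch spells out these two quantities for the normal distribution: the density of the whole dataset as a product, and the log likelihood as a sum of logs. The data points and parameter values are hypothetical, and the function names are ours.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density N(x | mu, sigma) of the normal distribution."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def data_density(data, mu, sigma):
    """Density of the whole dataset: product over independently drawn instances."""
    return math.prod(normal_pdf(x, mu, sigma) for x in data)

def log_likelihood(data, mu, sigma):
    """Log likelihood: the log of the product, written as a sum of logs."""
    return sum(math.log(normal_pdf(x, mu, sigma)) for x in data)

data = [1.2, 0.7, 2.1, 1.5, 0.9]  # hypothetical observations
mu, sigma = 1.0, 0.5              # hypothetical parameter values

# Both routes give the same number, up to floating point error.
print(math.log(data_density(data, mu, sigma)), log_likelihood(data, mu, sigma))
```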
maximum likelihood for the normal distribution
$$\begin{aligned}
\operatorname*{argmax}_{\theta} \ln p(X \mid \theta) &= \operatorname*{argmax}_{\theta} \ln \prod_{x \in X} p(x \mid \theta) \\
&= \operatorname*{argmax}_{\theta} \sum_{x} \ln p(x \mid \theta) \\
&= \operatorname*{argmax}_{\mu, \sigma} \sum_{x} \ln \left( \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{1}{2\sigma^2}(x - \mu)^2\right] \right) \\
&= \operatorname*{argmax}_{\mu, \sigma} \sum_{x} \left( \ln \frac{1}{\sqrt{2\pi\sigma^2}} - \frac{1}{2\sigma^2}(x - \mu)^2 \right)
\end{aligned}$$
$$p(x \mid \theta) = N(x \mid \mu, \sigma) \quad \text{with} \quad \theta = (\mu, \sigma)$$
We want to find the mean and standard deviation for which the log likelihood is maximal. We will leave the full derivation for later, but here are the first few steps. This should show you how the logarithm simplifies things.
First, we can turn the product into a sum by moving the logarithm inside. This is explained in detail in the first homework.
We fill in the definition of the actual probability density function we're using (line 3). This function is the product of two factors (the division and the exponential), which become terms if we work them out of the logarithm. In the second term the exponential cancels against the logarithm.
We usually use a base-e logarithm, because it will cancel out against the base-e exponential in the probability density for the normal distribution.
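As a quick numerical check of that last step, the sketch below evaluates the log density in two ways: directly, and after the two factors have been worked out of the logarithm. The input values are arbitrary, and the function names are ours.

```python
import math

def log_normal_pdf_direct(x, mu, sigma):
    """ln N(x | mu, sigma): evaluate the density first, then take the logarithm."""
    density = math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)
    return math.log(density)

def log_normal_pdf_simplified(x, mu, sigma):
    """The same quantity with the two factors split into two terms."""
    normaliser_term = math.log(1 / math.sqrt(2 * math.pi * sigma ** 2))
    squared_error_term = -((x - mu) ** 2) / (2 * sigma ** 2)  # ln and exp have cancelled
    return normaliser_term + squared_error_term

x, mu, sigma = 1.7, 1.0, 0.5  # hypothetical values
print(log_normal_pdf_direct(x, mu, sigma), log_normal_pdf_simplified(x, mu, sigma))
```

The simplified form never calls the exponential at all, which is why the base-e logarithm is the convenient choice here.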
[Figure; labels: µ, σ and σ²]
This is enough to show that with the log likelihood we have another “landscape” on top of our model space. If we didn’t want to work out the rest analytically, we could just find the optimum by gradient descent or even random search.
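To make the numerical route concrete, here is a minimal sketch of gradient ascent on the log likelihood of a normal distribution (equivalently, gradient descent on its negative). The data values, the fixed sigma = 1, the learning rate and the function names are all assumptions chosen for illustration; they do not come from the slides.

```python
import math

def log_likelihood(mu, data, sigma=1.0):
    """Log likelihood of the data under a normal distribution N(mu, sigma^2)."""
    const = -0.5 * math.log(2 * math.pi * sigma ** 2)
    return sum(const - (x - mu) ** 2 / (2 * sigma ** 2) for x in data)

def gradient(mu, data, sigma=1.0):
    """Derivative of the log likelihood with respect to mu."""
    return sum(x - mu for x in data) / sigma ** 2

def gradient_ascent(data, mu=0.0, learning_rate=0.01, steps=1000):
    """Climb the log-likelihood landscape by repeatedly stepping along the gradient."""
    for _ in range(steps):
        mu += learning_rate * gradient(mu, data)
    return mu

# Hypothetical observations, only for illustration.
data = [1.2, 0.7, 2.3, 1.9, 1.4]
mu_hat = gradient_ascent(data)
sample_mean = sum(data) / len(data)
print(mu_hat, sample_mean)
```

With these settings the search should end up very close to the mean of the data, which is also what the analytical solution gives.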
\[
\begin{aligned}
\arg\max_{\mu} \sum_x \left[ \ln \frac{1}{\sqrt{2\pi\sigma^2}} - \frac{1}{2\sigma^2}(x - \mu)^2 \right]
&= \arg\max_{\mu} \sum_x -\frac{1}{2\sigma^2}(x - \mu)^2 \\
&= \arg\max_{\mu} -\frac{1}{2\sigma^2} \sum_x (x - \mu)^2 \\
&= \arg\max_{\mu} -\sum_x (x - \mu)^2 \\
&= \arg\min_{\mu} \sum_x (x - \mu)^2
\end{aligned}
\]
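One way to carry out the remaining step is to set the derivative of the sum of squares to zero. This is a sketch added for completeness rather than part of the slide; n is a symbol we introduce here for the number of observations x.

```latex
\[
\frac{\partial}{\partial \mu} \sum_x (x - \mu)^2 = -2 \sum_x (x - \mu) = 0
\quad\Longrightarrow\quad
\mu = \frac{1}{n} \sum_x x
\]
```

Under these assumptions, the maximum likelihood estimate of the mean is simply the average of the data, whatever the value of the variance.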
HTH HHT HHT HTH
Bayesian
p(Heads | Straight) = 1/2
p(Tails | Straight) = 1/2
p(Heads | Bent) = 4/5
p(Tails | Bent) = 1/5
prior?
The Bayesian approach is a little different. We won’t go into the details, but we’ll give a brief outline of how it works on the coin example, just to give you a basic idea.
We first need to establish a prior: what is the probability of each coin in our model space? We said that we’d asked a friend to pick a coin at random. If we assume that he follows our instructions, then we believe each coin is equally likely, so both get a probability of 0.5. If we had two fair coins and one bent one, we could set the prior to 1/3 for bent and 2/3 for fair.
This is an important thing to understand about choosing a prior: it allows us to encode our assumptions about the problem. And as we saw when we discussed the problem of induction, these assumptions are what make learning possible at all.
\[
p(D) = p(D, \text{Straight}) + p(D, \text{Bent})
= p(D \mid \text{Straight})\, p(\text{Straight}) + p(D \mid \text{Bent})\, p(\text{Bent})
\]
\[
p(\text{Straight} \mid D) = \frac{p(D \mid \text{Straight})\, p(\text{Straight})}{p(D)}
= \frac{p(D \mid \text{Straight})\, p(\text{Straight})}{p(D \mid \text{Straight})\, p(\text{Straight}) + p(D \mid \text{Bent})\, p(\text{Bent})}
\]
\[
p(\text{Straight} \mid D) = \frac{p(D \mid \text{Straight})\, p(\text{Straight})}{p(D \mid \text{Straight})\, p(\text{Straight}) + p(D \mid \text{Bent})\, p(\text{Bent})}
\]
\[
p(\text{Bent} \mid D) = \frac{p(D \mid \text{Bent})\, p(\text{Bent})}{p(D \mid \text{Straight})\, p(\text{Straight}) + p(D \mid \text{Bent})\, p(\text{Bent})}
\]
This is one way to think about Bayes’ rule: we’ve observed an outcome, in this case the data, which can have a number of causes. The joint probability of a cause with an outcome is the prior of the cause times the probability of the outcome given the cause. If we sum up all these joint probabilities, then the proportion in that sum for cause X is the probability of the cause given the observation.
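This "proportion of the sum" view translates almost directly into code. The sketch below assumes a finite set of causes stored in dictionaries; the function name `posterior`, the three observed flips (H, H, T) and the 2/3 versus 1/3 prior (the "two fair coins and one bent one" scenario) are illustrative choices, not results from the slides.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a finite set of causes.

    prior:      dict mapping each cause to p(cause)
    likelihood: dict mapping each cause to p(outcome | cause)
    """
    # Joint probability of each cause together with the observed outcome.
    joint = {cause: prior[cause] * likelihood[cause] for cause in prior}
    # Summing the joints gives the probability of the outcome itself.
    evidence = sum(joint.values())
    # Each cause's share of that sum is its posterior probability.
    return {cause: joint[cause] / evidence for cause in joint}

# Hypothetical setting: two fair coins and one bent coin to pick from.
prior = {"Straight": 2 / 3, "Bent": 1 / 3}
# Probability of a hypothetical observation H, H, T under each coin.
likelihood = {"Straight": 0.5 ** 3, "Bent": 0.8 * 0.8 * 0.2}

post = posterior(prior, likelihood)
print(post)
```

In this made-up case the two likelihoods are nearly equal, so the prior dominates and Straight remains the more probable cause.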
\[
p(\text{Straight} \mid D)
= \frac{p(D \mid \text{Straight})\, \frac{1}{2}}{p(D \mid \text{Straight})\, \frac{1}{2} + p(D \mid \text{Bent})\, \frac{1}{2}}
= \frac{p(D \mid \text{Straight})}{p(D \mid \text{Straight}) + p(D \mid \text{Bent})}
\]
If we choose a uniform prior (each model gets the same probability), then the priors cancel out and we are left with just the data probabilities, normalized by their sum. We computed these data probabilities already.
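The same cancellation should hold for any finite number of models, not just two. As a sketch, assume K models M_1, ..., M_K with a uniform prior of 1/K each (the symbols K and M_i are introduced here for illustration):

```latex
\[
p(M_i \mid D)
= \frac{p(D \mid M_i)\, \frac{1}{K}}{\sum_{j=1}^{K} p(D \mid M_j)\, \frac{1}{K}}
= \frac{p(D \mid M_i)}{\sum_{j=1}^{K} p(D \mid M_j)}
\]
```

With K = 2 this reduces to the coin case: each posterior is just that model's likelihood as a fraction of the total.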
Bayesian
p(Straight | D) = 0.48
p(Bent | D) = 0.52
frequentist maximum likelihood estimator
“Bent is the most likely model”
HTHHHTHHTHTH
If we work this out, we get these probabilities for the posterior.
Note the difference with the maximum likelihood case. Even though the differences between the two likelihoods were minimal, we only get one choice for the true model in the frequentist approach. In the Bayesian approach we get a distribution on the model space. It tells us not just that Bent is the more likely model, but also that both models are still quite likely. In this sense, getting a posterior distribution is a much more valuable result than getting a point estimate for your model.
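The contrast can be checked with a short calculation. The sketch below uses the flip sequence and coin probabilities from the slides together with a uniform prior; the function and variable names are our own.

```python
# The observed flips and the two candidate coins, as given on the slides.
data = "HTHHHTHHTHTH"
p_heads = {"Straight": 0.5, "Bent": 0.8}

def likelihood(sequence, p_head):
    """Probability of a sequence of independent flips for a coin with p(Heads) = p_head."""
    result = 1.0
    for flip in sequence:
        result *= p_head if flip == "H" else 1.0 - p_head
    return result

likelihoods = {model: likelihood(data, p) for model, p in p_heads.items()}

# Frequentist answer: a single model, the one with the highest likelihood.
ml_model = max(likelihoods, key=likelihoods.get)

# Bayesian answer with a uniform prior: the priors cancel, so we only normalize.
evidence = sum(likelihoods.values())
post = {model: value / evidence for model, value in likelihoods.items()}

print(ml_model)
print({model: round(p, 2) for model, p in post.items()})
```

The maximum likelihood estimator returns only the name of the winning model, while the posterior also shows how narrow the win is.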
The downside of Bayesian analysis is that as the models get more complex, it gets more and more difficult to accurately approximate the posterior, and trying to do so is what has led to some of the most complicated material in machine learning.
Probabilistic Models Part 3: (Naive) Bayes Classifiers
In this lecture we’ll try to connect this probability business to the abstract task of classification.
classification
X = X1, X2, X3, …: random variable for instance.
Y: random variable for class {pos, neg}
P(Y=pos | X) = 0.1 P(Y=neg | X) = 0.9
Probabilistic classifiers return not just a class for a given instance x (or a ranking), but a probability over all classes.
This can be very useful. We can use the probabilities to extract a ranking (and plot an ROC curve) or we can use the probabilities to assess how certain the classifier is. If we don’t want the probabilities, we can just turn the probabilistic classifier into a regular classifier by picking the class with the highest probability.
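Both conversions are simple to sketch. The example below assumes that a probabilistic classifier has already produced class probabilities for three instances; the instance names, the probability values and the helper names `classify` and `rank` are all hypothetical.

```python
def classify(probs):
    """Turn class probabilities into a single class: pick the most probable one."""
    return max(probs, key=probs.get)

def rank(predictions):
    """Order instances from most to least likely to be positive."""
    return sorted(predictions, key=lambda name: predictions[name]["pos"], reverse=True)

# Hypothetical outputs of a probabilistic classifier for three instances.
predictions = {
    "a": {"pos": 0.1, "neg": 0.9},
    "b": {"pos": 0.7, "neg": 0.3},
    "c": {"pos": 0.4, "neg": 0.6},
}

labels = {name: classify(probs) for name, probs in predictions.items()}
ranking = rank(predictions)
print(labels)
print(ranking)
```

Note that instance c is classified as negative but still ranked above a. The ranking keeps information that the hard labels throw away.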
Note that a probabilistic classifier is also immediately a ranking classifier (if we rank by how likely the positive class is) and a regular classifier (if we pick the class with the highest probability).
two approaches
discriminative classifier: learn a function for p(Y|X) directly
generative classifier: p(Y|X) ∝ p(X|Y) p(Y)
There are two approaches to casting the classification problem in probabilistic terms. A generative classifier focuses on learning a distribution on the feature space given the class, p(X|Y). This distribution is then combined with Bayes’ rule to get the probability over the classes, conditioned on the data.
A discriminative classifier learns the function p(Y|X) directly, with X as input and class probabilities as output. It functions as a kind of regression, mapping x to a vector of class probabilities.
We’ll look at some simple generative classifiers first.
generative classifiers
Bayes optimal classifier: Marginalize over all classifiers in a model class. Provably optimal (given certain assumptions). Usually too expensive to compute.
Bayes classifier: Learn single distribution P(X|Y). Reasonable approach for low-dimensional data.
Naive Bayes classifier: Assume conditionally independent features. Simple, cheap and effective for high-dimensional data.
Here are three approaches, arranged from impractical but entirely correct to highly practical, but based on largely incorrect assumptions.
We won’t discuss the Bayes optimal classifier in this course, but it’s worth knowing that it exists, and that it means something different than a (naive) Bayes classifier.
Bayes classifier
Fit a model for p(X|Y) and for P(Y)
\[
p(\text{pos} \mid x) = \frac{p(x \mid \text{pos})\, p(\text{pos})}{p(x)}
= \frac{p(x \mid \text{pos})\, p(\text{pos})}{p(x \mid \text{pos})\, p(\text{pos}) + p(x \mid \text{neg})\, p(\text{neg})}
\]
For the Bayes classifier, we start with the probability we're interested in, p(Y|X): the probability of the class given the data. We then rewrite this using Bayes' rule. From the final form, we see that if we compute the probability functions p(X|Y), the data given the class, and p(Y), the prior probability of the class, we can compute the probabilities we're interested in: the class probabilities given the data.
So the task becomes to learn functions for those two probabilities.
Bayes classifier
Choose probability distribution M (e.g. MVN)
Fit Mpos to all positive points: p(X=x|pos) = Mpos(x)
Fit Mneg to all negative points: p(X=x|neg) = Mneg(x)
Estimate P(Y) from the class frequencies in the training data, or use domain-specific information.
Compute terms tpos = Mpos(x) p(pos) and tneg = Mneg(x) p(neg)
Compute class probabilities p(pos|x) = tpos / (tpos + tneg) and p(neg|x) = tneg / (tpos + tneg)
So here is the algorithm for a simple Bayes classifier. We choose a model class for P(X|Y), for instance multivariate normal distributions. We fit such a model separately to each class to give us one distribution per class.
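The steps on the slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the course's implementation: it uses a single numeric feature and a univariate normal (`statistics.NormalDist`) in place of an MVN, and the names `fit_bayes_classifier`, `class_probabilities` and the toy data are made up for the example.

```python
from statistics import NormalDist

def fit_bayes_classifier(xs, ys):
    """Fit one 1-D normal per class, plus class priors from class frequencies."""
    models, priors = {}, {}
    for c in set(ys):
        points = [x for x, y in zip(xs, ys) if y == c]
        models[c] = NormalDist.from_samples(points)  # models p(x | c)
        priors[c] = len(points) / len(xs)            # estimates p(c)
    return models, priors

def class_probabilities(x, models, priors):
    """Compute the terms p(x | c) p(c) and normalise them to get p(c | x)."""
    terms = {c: models[c].pdf(x) * priors[c] for c in models}
    total = sum(terms.values())
    return {c: t / total for c, t in terms.items()}

# Hypothetical toy data: negatives cluster around 2, positives around 7.
xs = [1.0, 1.5, 2.0, 2.5, 6.0, 6.5, 7.0, 7.5]
ys = ["neg"] * 4 + ["pos"] * 4

models, priors = fit_bayes_classifier(xs, ys)
probs = class_probabilities(6.8, models, priors)
print(probs)
```

The normalisation in `class_probabilities` is the denominator of Bayes' rule: dividing each term by the sum of all terms avoids computing p(x) separately.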
example for MVNs
source: http://learning.cis.upenn.edu/cis520_fall2009/index.php?n=Lectures.NaiveBayes
Here is an example of what that looks like with 2 features. On the left we have two classes, blue and black. We fit a 2D normal distribution to each. Then, for a new point, we see which distribution assigns the new point the highest probability density.
The red line provides the decision boundary.
Naive Bayes
Assume independence between all features, conditional on the class.
Often used with categorical features.
$$p(X_1, X_2 \mid Y) = p(X_1 \mid Y)\, p(X_2 \mid Y)$$
This works well for small numbers of features, but if we have many features, modelling the dependence between each pair of features gets very expensive.
A crude but very effective solution is Naive Bayes. NB just assumes that all features are independent, conditional on the class.
Note that we do not assume that the features are independent outright: it's perfectly possible for one feature to be dependent on another feature, but they are conditionally independent given the class. Informally, the dependency between the features is “caused” by the class and nothing else. Just like Alice and Bob in the first video: their lateness had only one possible shared cause, the monster, and once we'd isolated that, their lateness was independent.
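To make "very expensive" concrete, the sketch below counts parameters per class, assuming normal class-conditional distributions as in the MVN example. The function names are made up for the illustration; the counts themselves follow from the shape of a mean vector and a symmetric covariance matrix.

```python
def full_mvn_params(d):
    """Parameters of one d-dimensional normal with a full covariance matrix."""
    return d + d * (d + 1) // 2  # d means + entries of a symmetric d x d matrix

def naive_params(d):
    """Parameters when every feature gets its own independent 1-D normal."""
    return 2 * d                 # one mean and one variance per feature

for d in (2, 10, 100):
    print(d, full_mvn_params(d), naive_params(d))
```

The full model grows quadratically in the number of features, while the naive model grows linearly.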
“pill”  “meeting”  class
T       T          spam
T       F          spam
T       T          ham
T       T          ham
F       T          ham
F       T          ham
F       T          ham
F       F          spam
T       F          spam
F       F          spam
F       F          ham
Here is an example dataset, with binary features. Each feature indicates whether a particular word occurs in that instance.
We will build a naive Bayes classifier for this data by simply fitting a Bernoulli distribution to each feature. That is, we will estimate p(“pill” | spam) as the relative frequency with which the “pill” feature was true for spam emails.
X1  X2  class
T   T   spam
T   F   spam
T   T   ham
T   T   ham
F   T   ham
F   T   ham
F   T   ham
F   F   spam
T   F   spam
F   F   spam
F   F   ham
p(X1=T | ham) = 2/6
p(X1=F | ham) = 4/6
Here is what Naive Bayes does: it selects all emails of one class, and then estimates the probability that X1 will be T as the relative frequency of emails in that class for which X1 was T in the training set.
Strictly speaking, we are modelling X1 (given the class ham) as a Bernoulli distribution whose parameter we estimate as 2/6.
p(X1=T | spam) = 3/5
p(X1=F | spam) = 2/5
We do the same for the spam class and for the other feature.
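The estimates on these slides are plain relative frequencies, which the following sketch reproduces from the example table. The tuple layout and the helper name `estimate` are choices made for this illustration.

```python
from fractions import Fraction

# The example dataset as (pill, meeting, label) rows.
data = [
    (True, True, "spam"), (True, False, "spam"),
    (True, True, "ham"), (True, True, "ham"),
    (False, True, "ham"), (False, True, "ham"), (False, True, "ham"),
    (False, False, "spam"), (True, False, "spam"),
    (False, False, "spam"), (False, False, "ham"),
]

def estimate(feature, label):
    """Relative frequency of feature == T among the instances of one class."""
    rows = [r for r in data if r[2] == label]
    return Fraction(sum(r[feature] for r in rows), len(rows))

print(estimate(0, "ham"))   # p(X1=T | ham)
print(estimate(0, "spam"))  # p(X1=T | spam)
print(estimate(1, "ham"))   # p(X2=T | ham)
print(estimate(1, "spam"))  # p(X2=T | spam)
```

Each returned value is the estimated parameter of one Bernoulli distribution; p(X=F | class) is one minus that value.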
$$\begin{aligned} p(Y \mid X_1, \ldots, X_n) &\propto p(X_1, \ldots, X_n \mid Y)\, p(Y) \\ &= p(X_1 \mid Y) \times \ldots \times p(X_n \mid Y)\, p(Y) \end{aligned}$$
This is the naive Bayes assumption, stated as a formula. We simply factor p(X1, …, Xn | Y) into n separate probabilities, one per feature, each conditional on the class.
new instance: “pill” & “meeting”
p(ham | X1=T, X2=T) ∝ p(X1=T, X2=T | ham) p(ham)
= p(X1=T | ham) p(X2=T | ham) p(ham)
= (2/6) × (5/6) × (6/11)
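The computation for the new instance can be carried through to normalised class probabilities, as sketched below. The ham values are the ones on the slide; the spam values p(X2=T | spam) = 1/5 and p(spam) = 5/11 are not stated on the slide and are read off the example table by the same counting. The dictionary layout and the function name `posterior` are assumptions of this sketch.

```python
from fractions import Fraction

# Per-class estimates of p(word = T | class), taken from the example table.
likelihood = {
    "ham":  {"pill": Fraction(2, 6), "meeting": Fraction(5, 6)},
    "spam": {"pill": Fraction(3, 5), "meeting": Fraction(1, 5)},
}
prior = {"ham": Fraction(6, 11), "spam": Fraction(5, 11)}

def posterior(present):
    """Class probabilities for an email; `present` maps each word to T/F."""
    terms = {}
    for c in prior:
        t = prior[c]
        for word, p in likelihood[c].items():
            # Use p for a word that occurs, 1 - p for a word that does not.
            t *= p if present[word] else 1 - p
        terms[c] = t
    total = sum(terms.values())
    return {c: t / total for c, t in terms.items()}

probs = posterior({"pill": True, "meeting": True})
print(probs)
```

The unnormalised ham term is (2/6) × (5/6) × (6/11) = 5/33 and the spam term is (3/5) × (1/5) × (5/11) = 3/55, so after normalising, ham gets probability 25/34.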
smoothing
X1  X2  class
T   T   spam
T   F   spam
T   T   ham
T   T   ham
F   T   ham
F   T   ham
F   T   ham
T   F   spam
T   F   spam
T   F   spam
F   F   ham
p(X1=T | spam) = 5/5
p(X1=F | spam) = 0/5
While Naive Bayes can work surprisingly well (given how strong and incorrect the assumption is), we do run into a problem if, for some feature, a particular value does not occur in the training data for one of the classes. In that case, we estimate the probability as 0.
$$\begin{aligned} p(Y \mid X_1, \ldots, X_n) &\propto p(X_1, \ldots, X_n \mid Y)\, p(Y) \\ &= p(X_1 \mid Y) \times \ldots \times p(X_n \mid Y)\, p(Y) \\ &= 0 \times \ldots \times p(X_n \mid Y)\, p(Y) \\ &= 0 \end{aligned}$$
Since the whole estimate of our probability is just a long product, if one of the factors becomes zero, the whole thing collapses. Even if all the other features gave this class a very high probability, that information is lost.
pseudo-observations (aka Laplace smoothing)
X1  X2  class
T   T   spam
T   F   spam
T   T   ham
T   T   ham
F   T   ham
F   T   ham
F   T   ham
T   F   spam
T   F   spam
T   F   spam
F   F   ham
F   F   spam   (pseudo-observation)
T   T   spam   (pseudo-observation)
To remedy this, we need to apply smoothing. The simplest way to do that is to add pseudo-observations. For each possible value, we add an instance where all the features have that value.
(We should do the same for the class ham.)
unsmoothed
$$p(X_1 = T \mid Y = \text{spam}) = \frac{\text{freq. of } T \text{ in spam data}}{\text{total \# of spam instances}}$$
smoothed
$$p(X_1 = T \mid Y = \text{spam}) = \frac{\text{freq. of } T \text{ in spam} + 1}{\text{total \# of spam instances} + v}$$
This changes our estimates as shown here (i.e. we don't actually need to add the pseudo-observations, we just change our estimator).
Here, v is the number of different values X1 can take.
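The two estimators can be compared directly on the spam counts from the smoothing example, where X1 = T in all 5 spam emails. This is a small sketch; the function names are made up for the illustration.

```python
from fractions import Fraction

def unsmoothed(count, total):
    """Relative frequency: count of the value divided by the class size."""
    return Fraction(count, total)

def smoothed(count, total, v):
    """Add one pseudo-observation for each of the v possible values."""
    return Fraction(count + 1, total + v)

# Spam class in the smoothing example: X1=T occurs 5 times, X1=F never.
print(unsmoothed(5, 5), unsmoothed(0, 5))    # the zero estimate is the problem
print(smoothed(5, 5, 2), smoothed(0, 5, 2))  # X1 is binary, so v = 2
```

The smoothed estimates are 6/7 and 1/7: they still sum to one, and neither is zero.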
In practice, we often reduce the weight of
summary so far
Bayesian vs frequentist learning. Use what works, mix-and-match.
Discriminative classification: learn p(Y|X) directly
Generative classification: learn p(X|Y) and p(Y), apply Bayes' rule. Examples: Bayes classifier, naive Bayes classifier.
Naive Bayes: assumes independent features (conditional on the class).
Laplace smoothing: add pseudo-observations to avoid zero probabilities.
Probabilistic Models Part 4: Logistic Regression
This lecture will be all about how to use the mechanisms of probability to create a classifier.
discriminative classifier
Learn P(Y | X) directly.
In this video we'll look at an example of a discriminative classifier. This is a classifier that learns to map the features directly to class probabilities, without using Bayes' rule to reverse the conditional probability.
Remember that we were still on the lookout for good loss functions for the classification problem. We'll use the language of probability to define one for us.
Least-squares classifier
[Figure: the least-squares classifier, with axis ticks at -1, 0 and 1]
This was our last attempt: the least squares loss.
Our thinking was: the hyperplane classifier checks if wx + b is positive or negative, to decide whether to assign classes blue or red, respectively. Why not just give blue and red some arbitrary positive and negative values, and treat it as a regression problem?
[Figure: the same classifier with class targets 1 and 0]
Here is another option: instead of giving negative and positive arbitrary values, we give them probabilities: the probability of being positive, which is 1 for all blue points and 0 for all red points. (In other words, we move the red points from -1 to 0.)
This doesn't look substantially different to our linear classifier because our function w^T x + b still ranges from negative infinity to positive infinity. It doesn't produce probabilities, except over a very narrow range.
What we need is a way to squeeze that whole range into the range [0,1], so that the model only ever produces valid probabilities.
the logistic sigmoid
$$\sigma(t) = \frac{1}{1 + e^{-t}} = \frac{e^{t}}{1 + e^{t}}$$
$$1 - \sigma(t) = \sigma(-t)$$
For this purpose, we will use the logistic sigmoid. Note that its domain is the entire real number line, and its range is the open interval (0, 1).
An interesting property of the logistic sigmoid is the symmetry given in the second line. Basically, the remainder between σ(t) and 1 is itself a sigmoid running in the other direction. In other words: flipping the sigmoid upside down, 1 - σ(t), gives us the same function as mirroring it left to right, σ(-t). We'll make frequent use of this later.
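Both forms of the sigmoid and the symmetry can be checked numerically. The sketch below is one way to write it in plain Python; picking the form by the sign of t is a choice made here to avoid overflow in `math.exp`, not something the slide prescribes.

```python
import math

def sigmoid(t):
    """Logistic sigmoid, using whichever of the two equivalent forms is safe."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))  # 1 / (1 + e^-t)
    e = math.exp(t)
    return e / (1.0 + e)                   # e^t / (1 + e^t)

# Check the symmetry 1 - sigmoid(t) = sigmoid(-t) at a few points.
for t in (-5.0, -1.0, 0.0, 0.3, 4.0):
    print(t, sigmoid(t), 1.0 - sigmoid(t), sigmoid(-t))
```

At t = 0 both forms give exactly 0.5, the midpoint of the curve.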
source: By Qef (talk) - Created from scratch with gnuplot, Public Domain, https://commons.wikimedia.org/w/index.php?curid=4310325
[Figure: the classifier output p(Pos), ranging from 0 to 1]
$$c(x) = \sigma(w \cdot x + b)$$
This is our new classifier: we compute the linear function as before, but we apply the logistic sigmoid to the result, squeezing it into the interval [0, 1]. This means that we can interpret the output as the probability of the positive class. This may be a very accurate probability, or a very inaccurate one, depending on how we choose w and b, but it’s always a value between 0 and 1.
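As a minimal sketch of this classifier, the snippet below applies the logistic sigmoid to a linear function. The names `sigmoid` and `classify`, and the values of `w`, `b` and `x`, are assumptions made up for illustration; they do not come from the slides.

```python
import math

def sigmoid(t):
    """Logistic sigmoid: squeezes any real number into a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-t))

def classify(x, w, b):
    """Probability of the positive class: sigmoid of the linear function w . x + b."""
    linear = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(linear)

# Hypothetical parameters and data point, chosen only for illustration.
w, b = [2.0, -1.0], 0.5
x = [1.0, 3.0]

p_pos = classify(x, w, b)   # linear output is 2 - 3 + 0.5 = -0.5
p_neg = 1.0 - p_pos         # the two class probabilities sum to one
print(p_pos, p_neg)
```

Whatever values are chosen for `w` and `b`, the output stays between 0 and 1; whether it is a good probability depends entirely on those parameters.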
Now all we need is a loss function that tells us which probabilities match the data.
log loss
x: some data point
q_x: our classifier, q_x(C) = p(C|x)
e.g. q_x(Pos) = 0.1, q_x(Neg) = 0.9
split the data into positives X_P and negatives X_N
Find the classifier q that maximizes the probability of the true classes, i.e. we use the maximum likelihood objective.
Clearly, we want a classifier that assigns high probability to the true label, and low probability to the false one. If we treat the classifier as a model for our data, we can compute the probability of the
log loss
$$p(D) = \prod_{x, C \in D} q_x(C)$$
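To make the product concrete, here is a small sketch that computes p(D) for a toy dataset. The probabilities and labels are hypothetical, and `prob_true_class` is a helper name introduced only for this example. The snippet also computes the sum of negative logarithms, which is the form the log loss takes on the following slide, and checks that it equals the negative logarithm of the product.

```python
import math

# Hypothetical toy data: each entry pairs the classifier's output q_x(Pos)
# with the true class of that data point.
data = [(0.9, "Pos"), (0.8, "Pos"), (0.3, "Neg"), (0.6, "Neg")]

def prob_true_class(q_pos, label):
    """Probability the classifier assigns to the true class of one point."""
    return q_pos if label == "Pos" else 1.0 - q_pos

# p(D): product of the probabilities assigned to the true classes.
likelihood = math.prod(prob_true_class(q, c) for q, c in data)

# Log loss: sum of the negative log-probabilities of the true classes.
log_loss = sum(-math.log(prob_true_class(q, c)) for q, c in data)

print(likelihood, log_loss, -math.log(likelihood))
```

Because the logarithm is monotonic, the classifier that maximizes `likelihood` is the same one that minimizes `log_loss`.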
log loss
$$
\begin{aligned}
\arg\max_q \prod_{C,x} q_x(C) &= \arg\max_q \log \prod_{C,x} q_x(C) \\
&= \arg\min_q -\log \prod_{C,x} q_x(C) \\
&= \arg\min_q \sum_{C,x} -\log q_x(C) \\
&= \arg\min_q -\sum_{x \in X_P} \log q_x(P) - \sum_{x \in X_N} \log q_x(N)
\end{aligned}
$$
Least-squares classifier
In the least-squares case, the loss function could be thought of in terms of the residuals between the prediction and the true values. They pull on the line like rubber bands.
$$c(x) = \sigma(w \cdot x + b)$$
For the cross entropy loss, we can imagine the residuals for logistic regression as the lines drawn here. The cross entropy loss tries to maximise these lines by minimising the negative of their logarithm. You can think of them as little rods pushing the sigmoid towards the red and blue points.
Remember that in the least squares loss we squared the residuals before summing them, to punish outliers. Taking the logarithm has a similar effect. Low probabilities are disproportionately punished.
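One way to see this effect is to tabulate the per-point loss for a few probabilities. The values below are hypothetical, and the natural logarithm is an assumption of this sketch; another base only rescales the numbers.

```python
import math

# Probabilities that a hypothetical classifier assigns to the true class.
probs = [0.9, 0.5, 0.1, 0.01, 0.001]

# Per-point log loss: the negative logarithm of the true-class probability.
penalties = [-math.log(p) for p in probs]

for p, penalty in zip(probs, penalties):
    print(f"q(true class) = {p:<6} loss = {penalty:.2f}")
```

Each tenfold drop in probability adds the same fixed amount to the loss, so the loss grows without bound as the probability of the true class approaches zero.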
working out the gradient
$$
\begin{aligned}
\frac{\partial\, \text{loss}(w, b)}{\partial w_i} &= \frac{\partial \left( -\sum_{x \in X_P} \log q_x(P) - \sum_{x \in X_N} \log q_x(N) \right)}{\partial w_i} \\
&= -\sum_{x \in X_P} \frac{\partial \log q_x(P)}{\partial w_i} - \sum_{x \in X_N} \frac{\partial \log q_x(N)}{\partial w_i}
\end{aligned}
$$
We’ll show you the basics of working out the gradient for logistic regression. The loss breaks apart into separate terms for the positive and negative points. Let’s look at one of the positive terms (the negative terms can be derived in a similar way).
$$
\begin{aligned}
\frac{\partial \log q_x(P)}{\partial w_i} &= \frac{\partial \log \sigma(w \cdot x + b)}{\partial w_i} \\
&= \frac{\partial \log \frac{1}{1 + \exp(-w^T x - b)}}{\partial w_i} = -\frac{\partial \log(1 + \exp(-w^T x - b))}{\partial w_i} \\
&= -\frac{\partial \log(1 + \exp(-w^T x - b))}{\partial (1 + \exp(-w^T x - b))} \, \frac{\partial (1 + \exp(-w^T x - b))}{\partial w_i} \\
&= -\frac{1}{\ln 2} \, \frac{1}{1 + \exp(-w^T x - b)} \, \frac{\partial \exp(-w^T x - b)}{\partial w_i} \\
&= -\frac{1}{1 + \exp(-w^T x - b)} \, \frac{\partial \exp(-w^T x - b)}{\partial (-w^T x - b)} \, \frac{\partial (-w^T x - b)}{\partial w_i} \\
&= -\frac{\exp(-w^T x - b)}{1 + \exp(-w^T x - b)} \cdot (-x_i) = (1 - \sigma(w^T x + b))\, x_i \\
&= q_x(N)\, x_i
\end{aligned}
$$
Let’s work out the derivative for one of the weights.
$$\frac{d \log_b(x)}{dx} = \frac{1}{\ln b} \cdot \frac{1}{x}$$
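A quick numerical check can make the result of this derivation easier to trust. The sketch below compares the closed form (1 - σ(wᵀx + b)) x_i against a central finite-difference estimate, once with the natural logarithm and once with the base-2 logarithm. The helper names and the values of `w`, `b` and `x` are assumptions for illustration only.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def log_q_pos(w, b, x, log=math.log):
    """log q_x(Pos) = log sigmoid(w . x + b), with a configurable logarithm."""
    return log(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b))

def numeric_partial(f, w, i, eps=1e-6):
    """Central finite-difference estimate of the partial derivative df/dw_i."""
    up, down = list(w), list(w)
    up[i] += eps
    down[i] -= eps
    return (f(up) - f(down)) / (2 * eps)

# Hypothetical values, for illustration only.
w, b, x = [0.5, -1.5], 0.2, [2.0, 1.0]
i = 0

# Closed form from the derivation: (1 - sigmoid(w . x + b)) * x_i
analytic = (1.0 - sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)) * x[i]

numeric_ln = numeric_partial(lambda v: log_q_pos(v, b, x), w, i)
numeric_log2 = numeric_partial(lambda v: log_q_pos(v, b, x, log=math.log2), w, i)

print(analytic, numeric_ln, numeric_log2 * math.log(2))
```

With the natural logarithm the estimate matches the closed form directly; with base 2 it matches after multiplying by ln 2, which is the constant factor from the rule above.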
Note that despite the intimidating formulas in the middle, the result is actually very simple. This is one of the properties of the logistic sigmoid: it tends to cancel itself out when the derivative is taken.
Weignoretheconstantmultiplier(1/ln2)inthefourthline,becauseitdoesn’tchangethedirectionofthegradient,onlythemagnitude.Whenweapplygradientdescentwescalethegradientbyaconstantmultiplieranyway,sowecanignoreit.(Anotheroptionistousethenaturallogarithminthede?initionofthecrossentropy).
$$\frac{d \log_b(x)}{dx} = \frac{1}{\ln b} \cdot \frac{1}{x}$$
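A quick way to build trust in a derivation like this is to compare it against finite differences. The sketch below does that for a single instance; the helper names and the example values of w, b and x are made up for illustration. It uses the base-2 logarithm, so the simplified result is multiplied by 1/ln 2 before comparing.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log2_q_pos(w, b, x):
    """log2 of q_x(P) = sigmoid(w.x + b)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return math.log2(sigmoid(z))

def analytic_grad(w, b, x):
    """(1 - sigmoid(w.x + b)) * x_i: the simplified result, without the 1/ln 2 factor."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return [(1.0 - sigmoid(z)) * xi for xi in x]

def numeric_grad(w, b, x, eps=1e-6):
    """Central finite differences of log2 q_x(P) with respect to each w_i."""
    grads = []
    for i in range(len(w)):
        w_hi = list(w)
        w_lo = list(w)
        w_hi[i] += eps
        w_lo[i] -= eps
        grads.append((log2_q_pos(w_hi, b, x) - log2_q_pos(w_lo, b, x)) / (2 * eps))
    return grads

# Hypothetical example values.
w, b, x = [0.5, -1.2], 0.3, [2.0, 1.0]
for a, n in zip(analytic_grad(w, b, x), numeric_grad(w, b, x)):
    # The base-2 gradient equals the simplified one times 1/ln 2.
    print(a / math.log(2), n)
```

The two printed columns should agree to several decimal places, which also shows that dropping 1/ln 2 only rescales the gradient.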
$$
\begin{aligned}
\frac{\partial\, \text{loss}(w, b)}{\partial w_i}
&= \frac{\partial \left( -\sum_{x \in X_P} \log q_x(P) - \sum_{x \in X_N} \log q_x(N) \right)}{\partial w_i} \\
&= -\sum_{x \in X_P} \frac{\partial \log q_x(P)}{\partial w_i} - \sum_{x \in X_N} \frac{\partial \log q_x(N)}{\partial w_i} \\
&= -\sum_{x \in X_P} q_x(N)\, x_i + \sum_{x \in X_N} q_x(P)\, x_i
\end{aligned}
$$
logistic regression
Use the sigmoid function to turn a linear classifier into a discriminative probabilistic classifier.
Use log loss. Maximise the log-likelihood of the data given the model
Derive the gradient and search for good weights. No analytical solution, but the problem is convex.
Regression is a bit of a misnomer, since we’re building a classifier. I suppose the confusing terminology comes from the fact that we’re fitting a (curved) line through the probability values in the data.
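To make the recipe on this slide concrete, here is a minimal sketch of logistic regression trained by plain gradient descent. It is not the course's implementation: the function names, the learning rate, the number of steps and the tiny 1D dataset are all assumptions chosen for illustration. Labels are encoded as 1 (positive) and 0 (negative), so the per-point gradient of the log loss can be written as (q_x(P) - y) x_i.

```python
import math

def sigmoid(z):
    # Numerically stable form of 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(xs, ys, lr=0.1, steps=2000):
    """xs: list of feature lists, ys: list of labels (1 = positive, 0 = negative)."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(steps):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            q_pos = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            # For y = 1 this is -q_x(N) x_i, for y = 0 it is +q_x(P) x_i.
            err = q_pos - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        # Step against the (averaged) gradient.
        w = [wi - lr * gi / len(xs) for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b / len(xs)
    return w, b

# Hypothetical toy data: negatives on the left, positives on the right.
xs = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
predictions = [1 if sigmoid(w[0] * x[0] + b) > 0.5 else 0 for x in xs]
print(w, b, predictions)
```

Because the problem is convex, the starting point (all zeros here) should not matter for where gradient descent ends up, only for how long it takes.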
Here is a 2D dataset that shows a common failure case for the least-squares classifier. The points at the top are so far away from the ideal decision boundary that they will have huge residuals under the least-squares model.
least-squares
Here is what the least-squares regression converges to. Clearly, this is not a satisfying solution for such an easily separable dataset. The blue points at the top are so far from the decision boundary that their residuals dominate the loss.
In the linear models 1 lecture, we fixed one of the parameters to 1, so that we could plot the loss surface. This time, we’re optimizing all three parameters.
Least-squares classifier
Here is a 1D view of a similar situation.
If we want the decision boundary to be between the red and blue classes, the residuals for the far-away blue points become very big.
The logistic model doesn’t have this problem. If the model fits well around the decision boundary, it doesn’t have to worry at all about points that are far away (if they’re on the right side of the boundary).
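The difference can be seen by putting the two per-point losses next to each other. The sketch below assumes that the least-squares classifier regresses the linear output z = w·x + b onto a target of 1 for the positive class (the target value is an assumption, as are the function names), and uses the natural logarithm for the log loss.

```python
import math

def squared_loss(z, target=1.0):
    """Least-squares residual loss for a linear output z = w.x + b."""
    return (z - target) ** 2

def log_loss(z):
    """-log sigmoid(z): the log loss of a positive point with linear output z."""
    return math.log(1.0 + math.exp(-z))

# A positive point that lies further and further on the correct side of the boundary.
for z in [1.0, 5.0, 20.0]:
    print(z, squared_loss(z), log_loss(z))
```

As z grows, the squared loss grows quadratically even though the point is classified correctly, while the log loss shrinks towards zero.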
logistic
And here is the logistic regression classifier.
logistic
And here is the probability function (blue is high probability of positive, red is high probability of negative).
logistic
next lecture: maximum margin classifier
Note that for such well-separable classes, there are many suitable classifiers, and logistic regression has no reason to prefer one over the other (all points are assigned the correct probability very close to 1). We’ll see a solution to this problem next lecture, when we meet our final loss function: the SVM loss.
summary: logistic regression
Use logistic sigmoid to provide class probabilities from a linear classifier
Use -log p(class|features) as a loss function
Points near the decision boundary get more influence than points far away. The opposite is true for the least squares classifier.
Log loss generalises naturally to multiclass classification (more next week).
Probabilistic Models Part 5: Information Theory
Machine Learning mlvu.github.io
Vrije Universiteit Amsterdam
This lecture will be all about how to use the mechanisms of probability to create a classifier.
information theory
aka: what does - log p(x) mean?
Information theory is all about the relation between encoding information and probability theory.
Imagine you’re on holiday, and you’ve brought your travel Monopoly. Unfortunately, the dice have gone missing. You do, however, have a coin with you. Can you use coin flips to simulate the throw of a six-sided die?
heads tails
For a four-sided die, the solution is easy. We flip the coin twice, and assign a number to each possible outcome.
source: http://www.midlamminiatures.co.uk/blackpolydice/D4Black.html
tails
A six-sided die is more tricky. We’ll show the solution for three “sides” (you can just add another coin flip to decide whether it’ll be 1, 2, 3 or 4, 5, 6).
The trick is to assign the fourth outcome to a “reset”. If you throw two heads in a row, you just start again. Theoretically you could be coin flipping forever, but the probability of resetting more than five times is already less than one in one thousand.
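The reset trick is a small instance of rejection sampling, and it is easy to simulate. In the sketch below, the assignment of flip pairs to the values 1, 2 and 3 is arbitrary (only "two heads means reset" comes from the text), and the function names and seed are made up for illustration.

```python
import random

def flip(rng):
    """One fair coin flip: 'H' or 'T'."""
    return 'H' if rng.random() < 0.5 else 'T'

def three_sided_die(rng):
    """Flip twice; map three outcomes to 1, 2, 3 and start over on two heads."""
    outcomes = {'TT': 1, 'TH': 2, 'HT': 3}  # hypothetical assignment
    resets = 0
    while True:
        pair = flip(rng) + flip(rng)
        if pair in outcomes:
            return outcomes[pair], resets
        resets += 1  # 'HH' is the reset outcome

def six_sided_die(rng):
    """One extra flip chooses between {1, 2, 3} and {4, 5, 6}."""
    value, _ = three_sided_die(rng)
    return value if flip(rng) == 'H' else value + 3

rng = random.Random(0)
rolls = [six_sided_die(rng) for _ in range(60000)]
counts = {face: rolls.count(face) for face in range(1, 7)}
print(counts)

# Each reset has probability 1/4, so more than five resets has probability (1/4)**6.
p_more_than_five_resets = 0.25 ** 6
print(p_more_than_five_resets)
```

The counts should come out roughly equal, and the last line confirms that long runs of resets are rare.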
For now let’s stick with trees where each outcome is represented by only one leaf (and accept that the six-sided die cannot be perfectly modelled with a coin). What distributions can we model with a coin in this way, if we require each outcome to be represented by a single leaf?
(figure: two coin-flip trees over the outcomes 1, 2, 3, 4, …, each shown with the distribution it defines)
Here are two examples: an exponentially decaying distribution, and a (roughly) polynomially decaying one.
prefix-free code
(figure: a prefix-free tree with leaves a to f and branches labelled 0 and 1, with an example encoded sequence: 001001001101011000101010001001010001110011101100)
These kinds of trees are called prefix-free trees, because they assign a prefix-free code to the set of outcomes (we just replace heads and tails with zeros and ones). The benefit is that if we want to encode a sequence of these outcomes, we can just stick the codes one after another and we won’t need any delimiters. A decoder will know exactly where each codeword ends and the next begins.
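To see why no delimiters are needed, here is a small encoder and decoder. The codewords below are hypothetical: the exact tree on the slide cannot be recovered from the text, so this is just one prefix-free code over the same six symbols.

```python
# Hypothetical prefix-free code; the codewords on the slide's tree are not recoverable.
CODE = {'a': '0', 'b': '100', 'c': '101', 'd': '110', 'e': '1110', 'f': '1111'}

def is_prefix_free(code):
    """True if no codeword is a prefix of a different codeword."""
    words = list(code.values())
    return not any(u != v and v.startswith(u) for u in words for v in words)

def encode(symbols, code):
    """Concatenate the codewords, with no delimiters in between."""
    return ''.join(code[s] for s in symbols)

def decode(bits, code):
    """Read bits left to right and emit a symbol as soon as a codeword is complete."""
    inverse = {word: symbol for symbol, word in code.items()}
    symbols, current = [], ''
    for bit in bits:
        current += bit
        if current in inverse:
            symbols.append(inverse[current])
            current = ''
    if current:
        raise ValueError('bit string ends in the middle of a codeword')
    return ''.join(symbols)

message = 'abacafed'
bits = encode(message, CODE)
print(bits, decode(bits, CODE))
```

The greedy decoder works precisely because no codeword is the start of another one, so the first match is always the right one.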
codelengths and probabilities
L(x): length of code for x
(figure: the prefix-free tree with leaves a to f and branches labelled 0 and 1)
$$p(x) = \frac{1}{2} \times \ldots \times \frac{1}{2} = \left(\frac{1}{2}\right)^{L(x)} = 2^{-L(x)}$$
$$L(x) = -\log_2 p(x)$$
Every prefix tree defines a probability distribution and a code. What about the other way around? Can we find a tree for any given probability distribution?
We already saw that some distributions (like a six-sided die) cannot be represented exactly. But how close can we get?
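The first direction, from a tree to a distribution, takes only a few lines. The codeword lengths below are hypothetical (one possible complete binary tree over six outcomes); exact fractions are used so that the sum can be checked without rounding error.

```python
from fractions import Fraction

# Hypothetical codeword lengths L(x) for six outcomes on a complete binary tree.
LENGTHS = {'a': 1, 'b': 3, 'c': 3, 'd': 3, 'e': 4, 'f': 4}

def implied_distribution(lengths):
    """p(x) = 2^(-L(x)): the chance that fair coin flips walk to the leaf of x."""
    return {x: Fraction(1, 2 ** length) for x, length in lengths.items()}

p = implied_distribution(LENGTHS)
print(p)
print(sum(p.values()))
```

Every probability produced this way is a power of 1/2, which is why a fair six-sided die, with probabilities of 1/6, cannot be represented exactly.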
arithmetic coding
There exists an algorithm which provides, for any p(x), a prefix-free code L such that
$$\left| -\log_2 p(x) - L(x) \right| \leq 1$$
Thus, if we ignore this minor inaccuracy, or if we allow L(x) to take non-integer values, we may equate codes with probability distributions.
It turns out we can model any distribution in such a way that the biggest difference in codelength is no larger than a bit.
If we handwave this difference, we can equate codes with probability distributions: every code gives us a distribution and every distribution gives us a code. The higher the probability of an outcome, the shorter its codelength.
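One simple way to see that the one-bit bound is achievable is to round each ideal codelength up to the next integer. Note that this is the Shannon code construction, swapped in here because it fits in a few lines; it is not arithmetic coding. The Kraft inequality then guarantees that a prefix-free code with these lengths exists. The function names are made up for illustration.

```python
import math

def shannon_lengths(p):
    """Integer codelengths L(x) = ceil(-log2 p(x)) for outcomes with p(x) > 0."""
    return {x: math.ceil(-math.log2(px)) for x, px in p.items() if px > 0}

def kraft_sum(lengths):
    """Sum of 2^(-L(x)); a prefix-free code with these lengths exists iff this is <= 1."""
    return sum(2.0 ** -length for length in lengths.values())

# The fair six-sided die, which no coin-flip tree represents exactly.
die = {face: 1.0 / 6.0 for face in range(1, 7)}
lengths = shannon_lengths(die)
gaps = {x: abs(-math.log2(die[x]) - lengths[x]) for x in die}
print(lengths, kraft_sum(lengths), gaps)
```

For the die, every face gets a three-bit codeword, less than half a bit more than the ideal length of about 2.58 bits.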
entropy
p(X=x): data source
If we encode X with the ideal code for p, what is our expected codelength?
$$H(p) = E_p L(x) = \sum_{x \in X} p(x) L(x) = -\sum_{x \in X} p(x) \log p(x)$$
The entropy of a distribution is the expected codelength of an element sampled from that distribution.
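(figure: three distributions over the outcomes a, b, c, d. Left: uniform, each outcome at 1/4, H(p) = 2 bits. Middle: a skewed distribution with axis ticks at 1/4 and 1/2, H(p) = 1.75 bits. Right: a single outcome with probability 1, H(p) = 0 bits.)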
The more uniform our distribution is (the more unsure we are) the higher the entropy.
In the middle, we know something about our distribution, for instance that a is very likely, so we can make the codeword for a a little shorter, reducing the expected codelength (the entropy). On the left, we have no such options, so the entropy is maximal (equal to log2 N).
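The three entropy values can be reproduced directly from the definition. The uniform and the certain distribution follow from the slide; the probabilities used for the middle one are an assumption (1/2, 1/4, 1/8, 1/8), chosen because they give exactly 1.75 bits.

```python
import math

def entropy(p):
    """H(p) = -sum p(x) log2 p(x), in bits; terms with p(x) = 0 contribute 0."""
    return -sum(px * math.log2(px) for px in p if px > 0)

uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.5, 0.25, 0.125, 0.125]   # assumed values, chosen to give 1.75 bits
certain = [1.0, 0.0, 0.0, 0.0]
print(entropy(uniform), entropy(skewed), entropy(certain))
```

With N = 4 outcomes the uniform case reaches the maximum of log2 N = 2 bits.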
cross entropy
p(X): source of our data q(X): our model
Cross entropy: expected codelength if we use q, but the data comes from p.
$$H(p, q) = E_p L^q(x) = -\sum_{x \in X} p(x) \log q(x)$$
What if we don’t use the code that corresponds to the source of our data, p, to encode our data, but some other code based on a distribution q? What is our expected codelength then? This is called the cross entropy.
The cross entropy is minimal when p = q (and equal to the entropy). We can conclude two things:
• The code corresponding to p provides the best expected codelength.
• The cross entropy is a good way to quantify the distance between two distributions (because it’s minimal when the two are the same).
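A small numerical experiment illustrates the claim that the cross entropy is smallest when the model matches the source. The distributions below are made-up examples, and the logarithm is taken base 2 so that the results read as codelengths in bits.

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum p(x) log2 q(x): expected codelength under q's code, data from p."""
    return -sum(px * math.log2(qx) for px, qx in zip(p, q) if px > 0)

# Hypothetical source distribution and three candidate models.
p = [0.5, 0.25, 0.125, 0.125]
candidates = {
    'q = p': p,
    'uniform': [0.25, 0.25, 0.25, 0.25],
    'reversed': [0.125, 0.125, 0.25, 0.5],
}
for name, q in candidates.items():
    print(name, cross_entropy(p, q))
```

The first line prints the entropy of p; the other two models need more bits on average to encode data from p.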
Kullback-Leibler divergence
Expected difference in codelength between p and q.
Or, difference in expected codelength.
$$KL(p, q) = H(p, q) - H(p) = -\sum_{x \in X} p(x) \log \frac{q(x)}{p(x)}$$
The cross entropy is a nice measure, but it’s not zero when p and q are equal. Instead, it’s equal to the entropy of p.
To get a measure that is zero when the two are equal, we can just subtract the entropy of p. This is called the Kullback-Leibler (KL) divergence. The KL divergence is zero when our model is perfect.
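The two ways of writing the KL divergence, as a difference of (cross) entropies and as a single sum, can be checked against each other numerically. This is a sketch with made-up example distributions and base-2 logarithms; the function names are illustrative.

```python
import math

def entropy(p):
    """H(p) = -sum p(x) log2 p(x)."""
    return -sum(px * math.log2(px) for px in p if px > 0)

def cross_entropy(p, q):
    """H(p, q) = -sum p(x) log2 q(x)."""
    return -sum(px * math.log2(qx) for px, qx in zip(p, q) if px > 0)

def kl_divergence(p, q):
    """KL(p, q) = -sum p(x) log2 (q(x) / p(x)), in bits."""
    return -sum(px * math.log2(qx / px) for px, qx in zip(p, q) if px > 0)

# Hypothetical source p and model q.
p = [0.5, 0.25, 0.125, 0.125]
q = [0.25, 0.25, 0.25, 0.25]
print(kl_divergence(p, q), cross_entropy(p, q) - entropy(p))
print(kl_divergence(p, p))
```

The first line prints the same number twice, and the second confirms that a perfect model has a divergence of zero.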
104
loss(q) =X
x2X
H(px,qx)
= -X
x2X
(px(P) logqx(P) + px(N) logqx(N))
= -X
x2XP
logqx(P)-X
x2XN
logqx(N)<latexit sha1_base64="rjx4gMJg5812/7NNpYFtL6oAFs8=">AAAInXicfVXdbts2FJa7rfO8dU23y12MmLHC2dxActYkuwjQJfGaizbxgjgOEBkGRdOyYOqnJOVIJfg6e6dd7VVGSbYnidp044Pz/Zj8DiU6EfEYN82/Wk8++fSzp5+3v+h8+dWzr5/vvfjmjoUxRXiMQhLSewcyTLwAj7nHCb6PKIa+Q/DEWZ1n+GSNKfPC4JanEZ760A28hYcgV63Z3t8vbY4TLkjImOx92AenwGaxPxOJ7QXgXoLLXjRL+uDDLNm37c7LU/Bqg4OCYBO84D2gSD3bQWIk94FNQjcT7Bo/FzBF4qoCFw2beu6SN7rPCgepW74CNWLuJTXz2V7XPDDzB+iFtSm6xuYZzV48/dOehyj2ccARgYw9WGbEpwJS7iGCZceOGY4gWkEXP8R8cTIVXhDFHAdIgh8VtogJ4CHIwgZzj2LESaoKiKinHABaQgoRVyPpVK0YDqCPWX++9iJWlGztFgWHap5TkeTzls8qSuFSGC09lFSWJqDPfMiXWpOlvlNt4phguvarzWyZapE1ZoIp8lgWwkglcx1lZ4jdhqMNvkyjJQ6YFDElsixUAKYUL5QwLxnmcSTy3aiDu2KnnMa4n5V57/QC0tUNnveVT6VRXc6ChJBXW47ahkonwI8o9H0YzIUdSVEccbt/IPPsyuiNFMLOgnIccJPBFfSqhKrTVQWHJXCowCo63qELMK5L70rgnfavkxI6qUuduITGGrouoWvN2XkswY8anJTQREPTEppq6McS+lHPGapj8TCYimIW+VDFNfHW+C3FOJCiO5D1vVA17werKsnOgOhaMo97jhfqq1cAfprRxeXt+3dSnJ8MXptHss5wSIy3FPPw6PW5qVHcYjUbjnlyMjjTOCGFgbszuhge/WbpRlFMI7IjHR8f/v6r7pRiQsLHndP52cXgsL4xlUh1UdaxZZr100aRFtUmEdC1gBat20Tf/E2jwGkSFHk28lc6/y2F6X+wwyb3bcyNiqhJsc28UZE2KbYD2CqqkqAhpn/HsdPUdh5lL8IKqTVmVwYkhe8FVpcJxe/VC3KtPoCQh/Qn9VZQ1/eUl/q1+1n1f0SYbImq6nTUzWbV7zG9uBscWIcH5h+/dN+cbe64tvGd8YPRMyzj2HhjXBojY2yg1rC1avFW3P6+PWy/a18V1CetjeZbo/K0J/8AAKMbXg==</latexit>
$$
\begin{aligned}
\text{loss}(q) &= \sum_{x \in X} H(p_x, q_x) \\
&= -\sum_{x \in X} \big( p_x(P) \log q_x(P) + p_x(N) \log q_x(N) \big) \\
&= -\sum_{x \in X_P} \log q_x(P) - \sum_{x \in X_N} \log q_x(N)
\end{aligned}
$$
log loss is cross-entropy loss
This way, we can prove that cross-entropy loss is the same as log loss. And indeed, this loss function is often called cross-entropy loss.
This is not just a curiosity tying information theory to machine learning; it has practical consequences. It tells us what we should do in the case where the dataset actually provides class probabilities instead of class labels. In that case, we should minimize the cross-entropy between the predicted distribution and the one given in the data.
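To make this concrete, here is a minimal sketch of the cross-entropy for a single instance, once with a hard label and once with class probabilities. The numbers and the names `cross_entropy`, `hard` and `soft` are made up for illustration, and we assume base-2 logarithms so that the result is in bits.

```python
import math

def cross_entropy(p, q):
    """Cross-entropy H(p, q) in bits between two distributions over the same classes."""
    # Classes with p[c] == 0 contribute nothing to the sum, so we skip them.
    return -sum(p[c] * math.log2(q[c]) for c in p if p[c] > 0)

# Predicted class distribution q_x for one instance (hypothetical numbers).
q = {"P": 0.8, "N": 0.2}

# Hard label: the data says the class is P, so p_x puts all its mass on P.
hard = {"P": 1.0, "N": 0.0}
hard_loss = cross_entropy(hard, q)  # reduces to -log2 q(P), the log loss

# Soft label: the data provides class probabilities instead of a class label.
soft = {"P": 0.7, "N": 0.3}
soft_loss = cross_entropy(soft, q)
```

With a hard label only one term of the sum survives, which is exactly the log loss. With a soft label both terms contribute, weighted by the probabilities given in the data.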
the Minimum Description Length Principle
A model that allows us to compress the data is a model that has learned something about the data.
The better the compression, the more we’ve learned.
Balance model complexity by storing the model, and then the data given the model.
This leads us to the minimum description length principle, which informally states that compression and learning are strongly related.
[Figure: a sender transmitting data to a receiver]
The best way to think of MDL model selection is in a sender and receiver framework. The sender is going to see some data, and is going to send it to the receiver. Before observing the data, the sender and receiver are allowed to come up with any scheme they like. But afterwards, the data must be sent using the scheme, and in a way that is perfectly decodable by the receiver without further communication.
We usually assume that there is some language to describe a model that the sender chooses. The sender describes the model and then the data given the model.
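As a rough sketch of such a scheme, suppose the data is a sequence of coin flips and the agreed language for models is "a Bernoulli parameter, sent in 7 bits". Everything here is an assumption made for illustration: the data, the 7-bit grid, and the use of ideal code lengths of -log2 p bits per symbol (a real code would need to round these up).

```python
import math

def data_bits(bits, theta):
    """Ideal code length of a binary sequence under a Bernoulli(theta) model."""
    return -sum(math.log2(theta if b == 1 else 1.0 - theta) for b in bits)

# Hypothetical data: 100 coin flips, 90 of which are 1.
data = [1] * 90 + [0] * 10

# Scheme A: no model at all, one bit per flip.
plain = len(data)

# Scheme B: first send theta in 7 bits (a grid 1/128, ..., 127/128),
# then send the data with a code that is optimal for that theta.
PARAM_BITS = 7
grid = [k / 128 for k in range(1, 128)]
theta = min(grid, key=lambda t: data_bits(data, t))
two_part = PARAM_BITS + data_bits(data, theta)
```

The sender pays 7 bits up front to describe the model, but the data then costs far fewer bits than one bit per flip, so the two-part message is shorter overall. A model that has learned that the coin is biased compresses the data.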
We won’t go into the technical details of MDL, but here we see a broad illustration of how MDL can balance over- and underfitting in a regression problem.
In a regression (or classification) problem, we can take the instances and their features as fixed: both the sender and receiver have access to them. The data that we want to send over the wire is the target labels; in this case the numbers. How you encode a continuous value is a technical matter that requires some assumptions. For now we can just discretize the range of outputs, and assume that we are using a code in which bigger numbers cost more bits. The same goes for the parameters of the model: these are also continuous values, but we’ll discretize them somehow. Here we only need to assume that using more parameters in your model takes more bits.
Once we’ve chosen a model, the receiver can reconstruct the data if we send the model parameters and the residual values. On the left, we see that if we pick a linear model we have many large residuals to transmit. On the other hand, our model is described by only two parameters, so we can transmit that part very cheaply. If we make our model a parabola, we require three numbers to transmit it, so that part of our message gets bigger, but because the model fits so much better, the residuals are much smaller, and the overall length of our message gets much smaller.
If we make our model a 15th-order polynomial, we get a slightly tighter fit, but not by much, and the price we pay in storing the 16 numbers required to describe our model means that our message length is bigger than for the parabola. So overall we prefer the model in the middle, according to the minimum description length principle.
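The comparison can be sketched in a few lines of code. This is a toy version under stated assumptions, not the exact setup behind the slide: the data is a made-up noisy parabola, every parameter is assumed to cost 32 bits, residuals are discretized on a grid of 0.01, and integers are charged with a sign bit plus an Elias gamma code (one possible code in which bigger numbers cost more bits). The sketch only counts bits; it does not build the actual message.

```python
import numpy as np

PARAM_BITS = 32    # assumed cost of one discretized parameter
PRECISION = 0.01   # assumed grid on which residuals are discretized

def int_bits(n):
    """Bits for integers under a code where bigger magnitudes cost more:
    one sign bit plus an Elias gamma code for |n| + 1."""
    return 2 * np.floor(np.log2(np.abs(n) + 1)) + 2

def message_bits(x, y, degree):
    """Two-part message length: model parameters first, then the residuals."""
    model = np.polynomial.Polynomial.fit(x, y, degree)
    residuals = np.round((y - model(x)) / PRECISION).astype(int)
    model_cost = PARAM_BITS * (degree + 1)
    data_cost = int_bits(residuals).sum()
    return int(model_cost + data_cost)

# Hypothetical data: a noisy parabola, standing in for the example on the slide.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 40)
y = 2 * x**2 - x + 0.5 + rng.normal(scale=0.1, size=x.size)

lengths = {d: message_bits(x, y, d) for d in (1, 2, 15)}
best_degree = min(lengths, key=lengths.get)
```

Under these assumptions the linear model pays for its large residuals, the 15th-order polynomial pays for its 16 parameters, and the parabola gives the shortest total message.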
$$
\begin{aligned}
\arg\max_M \; p(M)\, p(X \mid M) &= \arg\min_M \; -\log \big( p(M)\, p(X \mid M) \big) \\
&= \arg\min_M \; -\log p(M) - \log p(X \mid M) \\
&= \arg\min_M \; L(M) + L_M(X)
\end{aligned}
$$
$L(M)$: cost of describing the model
$L_M(X)$: cost of describing the data given the model
There are many correspondences between using MDL and using Bayes. In fact, they are often perspectives on the same thing. For instance, if we choose the model that maximizes the posterior probability, we can rewrite the objective by introducing a logarithm, to show that we are also choosing the model that minimizes the total code length.
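A quick numerical check of this correspondence, as a sketch: the three candidate models, their priors and their likelihoods are invented numbers, and we assume code lengths of -log2 p bits.

```python
import math

# Hypothetical candidate models: prior p(M) and likelihood p(X | M) of the observed data.
models = {
    "linear":    {"prior": 0.50, "likelihood": 1e-9},
    "parabola":  {"prior": 0.30, "likelihood": 1e-4},
    "degree-15": {"prior": 0.20, "likelihood": 1.2e-4},
}

def posterior_score(m):
    # Unnormalized posterior p(M) p(X | M); the evidence p(X) is the same for every M.
    return m["prior"] * m["likelihood"]

def code_length(m):
    # L(M) + L_M(X) in bits, with L(M) = -log2 p(M) and L_M(X) = -log2 p(X | M).
    return -math.log2(m["prior"]) - math.log2(m["likelihood"])

map_choice = max(models, key=lambda k: posterior_score(models[k]))
mdl_choice = min(models, key=lambda k: code_length(models[k]))
```

Because the code length is just the negative logarithm of the unnormalized posterior, and the logarithm is monotonic, the two selections always agree, whatever numbers we fill in.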
encoding a simplicity assumption
When we talked about the problem of induction and the no free lunch theorem, we noted that some assumption about the source of our data was necessary to make learning possible at all. Some aspects of our problem we need to assume before we start learning.
You can think of MDL as encoding a simplicity assumption. We prefer simple solutions over complex ones, and we define a simple solution as one that compresses the data well. The assumption we make about the universe is that it generated compressible data for us.