Probabilistic Models Part 1: What is probability?

Machine Learning mlvu.github.io

Vrije Universiteit Amsterdam

Probability is an important tool in Machine Learning. We expect that you have been taught probability theory already, but since it’s a subtle concept, with complicated foundations, we’ll go over the basics again in this first video.

2

To start, let’s look at the way we use probability informally.

Let’s say you are a concerned parent, you read the headline that one in eight European teenage boys gamble online, and you are shocked by it. You turn to your partner, and you say “that means that the probability that our son is gambling online is 12.5%”. Your partner disagrees: you have a good handle on your son’s behaviour and his spending. He could only be gambling if he has a credit card you don’t know about, and the probability of that is much lower.

Well, then the probability that Josh, his closest friend, gambles online must be 12.5%. If one in eight teenage boys is gambling, they must be hiding somewhere. Your partner disagrees again: probability doesn’t enter into it. Josh is either gambling or he isn’t.

Clearly, we need to look at what we mean when we say that the probability of something is such-and-so. There are two commonly accepted ways of looking at it: objective and subjective probability. We’ll start with objective probability.

image source: https://www.theguardian.com/society/2016/sep/20/one-in-eight-european-teenage-boys-gamble-online-says-survey

objective probability

frequentism: probability is only a property of repeated experiments.

3

In objective probability, “the probability that X is the case” represents an objective truth: whatever a probability is, it must be the same for everybody. You and I may disagree over a probability, but only because one of us is wrong. There is one true probability.

The most common form of objective probability is frequentism. Under the frequentist definition, probability is a property of a (hypothetical) repeated experiment. For instance, take the statement “the probability of rolling 6 with a fair die is one in six.”

The experiment is rolling a die (the singular of dice). The outcome we are discussing is the roll resulting in a 6. If we were to repeat the experiment a large number of times, N, then the proportion of times we observe the discussed outcome is close to 1 in 6. More precisely, as N grows, the proportion converges to 1 in 6.

Under a frequentist interpretation, saying “the probability is one in six” is equivalent to saying “if I roll the die repeatedly, the relative frequency of sixes will converge to 1 in 6 as the number of rolls grows.”
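The frequentist reading of the die statement can be tried out with a small simulation. This is only a sketch: the function name relative_frequency_of_six and the fixed seed are our own choices for illustration, and a pseudo-random generator stands in for a real die.

```python
import random

def relative_frequency_of_six(n_rolls, seed=0):
    """Roll a simulated fair die n_rolls times and return the fraction of sixes."""
    rng = random.Random(seed)  # fixed seed (an assumption) so that runs are repeatable
    sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
    return sixes / n_rolls

# The relative frequency should settle near 1/6 as the number of rolls grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_six(n))
```

For small N the proportion can be far from 1/6; the frequentist claim is only about what happens as N grows.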

4

Under objective probability the statement “the probability that Josh is gambling is 12.5%” is indeed nonsense. There is no experiment we can imagine where Josh “turns out” to be a gambler one in 8 times. He either gambles or he doesn’t.

What we can say is that the probability that a teenage boy drawn randomly from the European population gambles online is 12.5%. This is an experiment we can repeat, and at every repetition, we choose a different boy, so we may get a different outcome.

We should also note that our statement is not precisely correct. The actual probability is a number we don’t know. This is what happens in practice: the probability of X happening is p. We don’t know p, but we do know that there is some experiment for which the proportion of successful trials (X happens) converges to p with repeated trials. We repeat a large number of trials, check the proportion of times X happened, and use that as an estimate of the true p. That is also what happened in the research behind this article. We don’t know precisely how many teenage boys gamble online, so the researchers found a way to estimate the total proportion.

subjective probability

Bayesianism: probability is an expression of our uncertainty and of our beliefs.

5

The alternative to objectivism is subjectivism. It states that probability expresses our uncertainty. If X is a boolean variable, one that is true or not true, and we are uncertain whether X is true, we can assign a probability to X being true. A probability of 0.5 means we are entirely ambivalent, a probability of 0.75 means we think X is pretty likely, and a probability of 1 means we’re entirely sure.

In this case, different people can have different probabilities for the same thing being true. You and I may disagree and both be right. If you have information I don’t have, your probability may be closer to certainty than mine.

Bayesianism is the main form of subjective probability. It builds on Bayes’ rule, which we will discuss later, to tell us how we should use observations to update our beliefs.

6

Under subjectivism, we can say “the probability that our son is gambling is 12.5%”. We don’t know what he gets up to, so even though there is a definite objective answer, we are uncertain. If we know only this headline, we may well pick 12.5% as our belief that our son is gambling. Of course, as noted before, we have a lot more knowledge about our son than about other teenage boys. We know he goes to bed on time, we know where he gets his money, and he probably doesn’t have a secret credit card. So even though the probability for a random teenage boy would be 12.5%, the probability for our son is actually much lower, because we have extra information.

This is the fundamental difference between the two views: under frequentism, probability is defined as an objective property of the world. The probability of X is the same for all people regardless of what we know or don’t know. Under Bayesianism, probability is an expression of a subjective property: it can change from one person to the next, and if we learn new information, it can change from one moment to the next. If we find out that our son does have a secret credit card, the probability that he is a gambler suddenly jumps dramatically.

Note that Bayesianism, in a sense, encompasses frequentism. We are uncertain about the outcome of some experiment, which we can express as a Bayesian belief. If we understand the experiment properly, then that belief coincides exactly with the probability that a frequentist would assign. Bayesianism just extends the definition to allow for personal beliefs that are not objectively true.

subjectivism vs. objectivism

A disambiguation of the word probability.

Leads to fundamentally different ways of doing statistics.

Is machine learning a probabilistic discipline? If so, is it subjective or objective?

7

So, at heart subjectivism and objectivism are disambiguations. The word probability is ambiguous, and these allow you to make precise what you mean.

Note that you don’t have to commit to one view or another. You can use the subjective definition one day and the objective definition the next (especially in informal settings).

However, once you start doing statistics, the two definitions lead to fundamentally different approaches (which we’ll see in more detail later). And in the statistical community there are definitely two camps: the frequentists and the Bayesians.

Since machine learning is often seen as another form of statistics, you may ask whether it is usually seen as using subjective or objective probability. I can’t give you a commonly accepted answer; I think opinions differ.

My view is that Machine Learning, while being statistical in nature, is not fundamentally probabilistic. The fundamental principles of machine learning can be defined and explained without recourse to probability theory (and indeed, we have done so for most of the start of the course). The fundamental goal of (offline) machine learning is to minimise test set loss given only a training set, and some hint as to the relation between the two datasets.

Of course, even if machine learning is not fundamentally probabilistic, probability has proven to be a very powerful tool (much like linear algebra and calculus) in helping us solve this problem. The consequence, in my view, is that we can borrow whatever methods are most helpful to us at the time. We’ll use the frequentist methods when we need them, and the Bayesian methods when they prove most helpful. We’ll even, at times, mix the two in a single model.

probability theory

Basic ingredients

• sample space

• event space

• probability function p(…)

• random variable

8

All that was about the interpretation of probabilities.

The mathematical definition of probability, studied in the field of probability theory, is entirely distinct from the question of what the definition of probability is as applied to the real world. Both frequentists and Bayesians use the same mathematical framework to express probability as a number between 0 and 1. The only difference between them is in what this number is taken to express.

We’ll go through the basic ingredients quickly.

sample space

9

discrete sample spaces:

Ω = {heads, tails}

Ω = {1, 2, 3, 4, 5, 6}

Ω = {(1, 1), (1, 2), …, (6, 6)}

continuous sample space:

Ω = ℝ

First the sample space. These are the single outcomes or truths that we wish to model. If we flip a coin, our sample space is the set of the two outcomes heads and tails.

We can have discrete sample spaces or continuous ones.

A discrete sample space can also be infinite: consider flipping a coin and counting how many flips it takes to see tails. In this case any number of flips is possible, so the sample space is the natural numbers (although any number larger than 20 will get an astronomically small probability).
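To put a number on how small those probabilities get, here is a short sketch under the assumption of a fair coin, where the first tails appears on flip n with probability (1/2)^n. The name p_first_tails_at is ours, chosen for illustration.

```python
from fractions import Fraction

def p_first_tails_at(n):
    """Probability that the first tails appears on flip n (n = 1, 2, 3, ...) of a fair coin."""
    # n - 1 heads followed by one tails: (1/2)^(n-1) * (1/2)
    return Fraction(1, 2) ** n

# Needing more than 20 flips means the first 20 flips were all heads.
p_more_than_20 = Fraction(1, 2) ** 20

# The first 20 outcomes plus everything beyond them account for all the probability mass.
total = sum(p_first_tails_at(n) for n in range(1, 21)) + p_more_than_20
print(float(p_more_than_20), total)
```

Every outcome in the infinite sample space has a nonzero probability, but all outcomes above 20 together share about one millionth of the total.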

event space

10

E = {{}, {1}, {2}, {3}, …, {1, 2}, …, {1, 2, 3, 4, 5, 6}}

Events are the things that have probability: subsets of the sample space. All even throws, all throws higher than three, etc.

powerset: the set of all subsets

sigma-algebra: for continuous sample spaces.

From the sample space, we construct the event space. Events are those things that can have probabilities. These include the elements of the sample space, like the event of rolling a six with a die, but they also include sets of multiple elements of the sample space, like the event of rolling a one or a six and the event of rolling an even number. Even the empty set and the set of all six numbers are events. As we will see, these will get probabilities 0 and 1 respectively.

How the event space is constructed is a technical business. For our purposes, we can simply say that if the sample space is discrete, then the event space is the powerset of the sample space: the set of all possible subsets we can make.

For continuous sample spaces, not every subset can be an event. We need to make sure that our event space is a thing called a “sigma algebra.” We won’t need to worry about this in this course.
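For the discrete case, the event space of a single die roll can be built explicitly. The sketch below uses the standard itertools recipe for a powerset; the helper name powerset and the choice of frozenset to represent events are ours.

```python
from itertools import chain, combinations

def powerset(sample_space):
    """Return every subset of sample_space as a frozenset: the event space."""
    items = list(sample_space)
    subsets = chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))
    return {frozenset(s) for s in subsets}

omega = {1, 2, 3, 4, 5, 6}  # sample space of one die roll
events = powerset(omega)

print(len(events))                     # number of events
print(frozenset({2, 4, 6}) in events)  # "rolling an even number" is one of them
```

With six outcomes there are 2^6 = 64 events, including the empty set and the full sample space.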

random variable, probability function

A way to describe events

D “takes values” 1, 2, 3, 4, 5, 6

p(D = 4), p(D > 3), p(D is even) etc

Random variables in ML:

• features i of instance: Xi

• class of instance: Y

• Model (parameters): M

11

Random variables have a confusing and convoluted definition, so we’ll just give you the intuitive interpretation.

Random variables help us to describe events. We can think of a random variable D as something that takes the values in the sample space, so that we can use it to describe events.

We then assign a probability to each event with a probability function p. This function must satisfy several constraints, but we’ll take those as read for now, and just say that it produces a value between 0 and 1.

In machine learning, it’s common to model features, target labels, and sometimes even model parameters as random variables. If we are referring to a dataset of multiple instances, we model each as a separate random variable with the same distribution.
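As a small illustration of how a random variable D describes events and how p assigns them probabilities, here is a sketch for one roll of a die. The uniform (fair die) model is an assumption, and the variable names are ours.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # sample space of one die roll

def p(event):
    """Probability of an event (a subset of omega), assuming a fair die."""
    return Fraction(len(event & omega), len(omega))

# Events described through the random variable D.
p_d_is_4 = p({d for d in omega if d == 4})      # p(D = 4)
p_d_gt_3 = p({d for d in omega if d > 3})       # p(D > 3)
p_d_even = p({d for d in omega if d % 2 == 0})  # p(D is even)

print(p_d_is_4, p_d_gt_3, p_d_even)
```

Each statement about D picks out a subset of the sample space, and p maps that subset to a number between 0 and 1.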

shorthand

p(X = 0): the probability that X takes the value 0. A number between 0 and 1.

p(X = x): the probability that X takes the value x. A function of x.

12

$$p(X = x) = \begin{cases} \frac{1}{4} & \text{if } x = 0 \\ \frac{3}{4} & \text{if } x = 1 \end{cases}$$

p(X), p(x): shorthand for p(X = x)

Interpreting what a statement of a probability function means depends on whether all variables are “filled in.” In the first line, X = 0 refers to a single, well-defined event, so p(X = 0) refers to a single value between 0 and 1. In the second line we have a classical variable x, so the statement “X = x” can refer to different events, depending on what x is. In other words, here “p(X = x)” is a function of x. For example, if x can take values 0 and 1, it may refer to a simple function like the one shown here.

Since we usually know which outcomes belong to which random variables, p(X) and p(x) can both be used as shorthand for p(X = x). Note that in these cases, x stands for some specific value, and X stands for the random variable.

probability vs. probability density

13

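[Figure: a distribution on a discrete sample space (values 1 to 6, left) and a normal distribution on a continuous sample space (right).]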

On both discrete and continuous sample spaces, the events we describe have probability.

However, when we look at a graph like the one on the right, describing a normal distribution defined on a continuous sample space, it’s important to realize that this function does not express a probability. If I ask you, under this distribution, which has the higher probability, 0 or 1, the answer is that they both have probability 0. They have different probability density, but what has probability in a normal distribution is an interval.

The interval from 0 to 1 has higher probability than the interval from 1 to 2. The point 0 has higher probability density than the point 1.

This is important because probability densities can have values larger than 1 and probabilities can’t.
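The difference between density and probability can be checked numerically. The sketch below assumes a normal distribution with mean 0 and standard deviation 1 (the slide does not state the parameters), and uses the error function from the standard library for the cumulative distribution. The function names are ours.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of a normal distribution at the point x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Probability that a normally distributed variable is at most x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def interval_probability(a, b, mu=0.0, sigma=1.0):
    """Probability of the interval from a to b."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

print(normal_pdf(0.0), normal_pdf(1.0))                        # densities at two points
print(interval_probability(0, 1), interval_probability(1, 2))  # probabilities of two intervals
print(interval_probability(1, 1))                              # a single point has probability 0
print(normal_pdf(0.0, sigma=0.1))                              # a narrow normal: density above 1
```

The last line shows why the distinction matters: with a small standard deviation the density at the mean exceeds 1, while every interval probability stays between 0 and 1.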

probabilities and concepts

for random variables X and Y

joint probability: p(X, Y)

marginal probability: p(X)

conditional probability: p(X | Y)

(conditional) independence

Bayes’ theorem

14

Now that we have the basic language of probability theory in place, we can look at some of the most important concepts. We will quickly review these five concepts.

Note that we have a single sample space and event space, and the random variables X and Y will help us describe the events that we’re interested in.

running example

Age = {young, teen, old}

Teeth = {healthy, unhealthy, fake}

15

We will use the following running example: we sample a random person from the Dutch population and we check their age and the health of their teeth (binning the results into three categories for each variable).

We want to ask questions like:

• what is the probability of seeing an old person?

• what is the probability that a young person has fake teeth?

• does a person’s age influence the health of their teeth, or is there no relation?

The sample space is the set of the nine different pairs of values we can observe, and the event space is the powerset of that. The random variables Age and Teeth will help us describe these events.

joint probability

p(Age = old & Teeth = healthy)

p(Age, Teeth):

16

                 Teeth
           h      u      f
Age  y    5/18   3/18   1/18
     t    1/18   1/18   2/18
     o    1/18   1/18   3/18

The joint distribution is the most important distribution. It tells us the probability of each atomic event: each event that contains a single element in our sample space.

Since we have two random variables in our example, we can specify the joint distribution in a small table. The probabilities of all these events sum to one.

Note that p(Age = old & Teeth = healthy) refers to a single value (1/18), because we have specified the event. p(Age, Teeth) does not refer to a single value: because the variables are not instantiated, it represents a function of two variables (i.e. the whole table).
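The joint table of the running example is small enough to write down in code. The dictionary layout below is just one possible representation, chosen for illustration; the probabilities are the ones from the slide.

```python
from fractions import Fraction

ages = ["young", "teen", "old"]
teeth = ["healthy", "unhealthy", "fake"]

# Joint distribution p(Age, Teeth) in eighteenths, one row per age category.
counts = {
    "young": [5, 3, 1],
    "teen":  [1, 1, 2],
    "old":   [1, 1, 3],
}
joint = {(a, t): Fraction(counts[a][j], 18) for a in ages for j, t in enumerate(teeth)}

total = sum(joint.values())             # probabilities of all atomic events together
print(joint[("old", "healthy")], total)
```

Looking up a fully specified event gives a single number, while the dictionary as a whole plays the role of the function p(Age, Teeth).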

marginal probability

17

                 Teeth
           h      u      f     p(Age)
Age  y    5/18   3/18   1/18    9/18
     t    1/18   1/18   2/18    4/18
     o    1/18   1/18   3/18    5/18   <- p(Age = old)
p(Teeth)  7/18   5/18   6/18

If we want to focus on just one random variable, all we need to do is sum over the rows or columns.

For instance, the probability that Age = old, regardless of the value of Teeth, is the probability of the event {(o, h), (o, u), (o, f)}. Because we can write these sums in the margins of our joint probability table, this process of “getting rid” of a variable is also called marginalizing out (as in “we marginalize out the variable Teeth”). The resulting distribution over the remaining variable(s) is called a marginal distribution.

marginal probability

p(y) = p(y, h) + p(y, u) + p(y, f)

in general, for joint distribution p(X, Y):

$$p(x) = \sum_{y \in Y} p(x, y)$$

18

This is what marginalizing looks like in symbols: we sum the joint probabilities for all values of one of the random variables, keeping the value of the other fixed.
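The same sum can be written as a few lines of code over the joint table of the running example. The helper marginal and its axis argument are our own naming, a sketch rather than a standard API.

```python
from fractions import Fraction

# Joint distribution p(Age, Teeth) from the running example, keyed by (age, teeth).
joint = {
    ("y", "h"): Fraction(5, 18), ("y", "u"): Fraction(3, 18), ("y", "f"): Fraction(1, 18),
    ("t", "h"): Fraction(1, 18), ("t", "u"): Fraction(1, 18), ("t", "f"): Fraction(2, 18),
    ("o", "h"): Fraction(1, 18), ("o", "u"): Fraction(1, 18), ("o", "f"): Fraction(3, 18),
}

def marginal(joint, axis):
    """Sum out the other variable; axis=0 keeps Age, axis=1 keeps Teeth."""
    result = {}
    for outcome, prob in joint.items():
        key = outcome[axis]
        result[key] = result.get(key, Fraction(0)) + prob
    return result

p_age = marginal(joint, 0)    # Teeth marginalized out
p_teeth = marginal(joint, 1)  # Age marginalized out
print(p_age)
print(p_teeth)
```

The results match the margins of the table, up to simplification of the fractions (9/18 is printed as 1/2, for instance).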

conditional probability

p(T = f | A = y) = p(y, f) / p(y) = (1/18) / (9/18) = 1/9

19

                 Teeth
           h      u      f
Age  y    5/18   3/18   1/18
     t    1/18   1/18   2/18
     o    1/18   1/18   3/18

The conditional probability is the probability over one variable, if the value of another is known.

If we know that somebody is young, we know that the probability of them having false teeth must be much lower.

The conditional probability p(X = x | Y = y) is computed by taking the joint probability of (x, y) and normalising by the sum of the probabilities in the row or column corresponding to the part that’s given in the conditional.

Imagine we’re throwing darts at this table, and the probability of hitting a certain cell is the joint probability indicated in the cell. The conditional probability p(T = f | A = y) is the probability that the dart hits the (y, f) cell, given that it’s hit the y row.

conditional probability

20

$$p(X = x \mid Y = y) = \frac{p(X = x, Y = y)}{\sum_{x'} p(X = x', Y = y)} = \frac{p(x, y)}{p(y)}$$

Note that the denominator is just the marginal probability p(y).
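Applied to the running example, conditioning on Age = young means taking the young row of the joint table and renormalising it by its sum. A minimal sketch, with a helper name of our own choosing:

```python
from fractions import Fraction

# Joint distribution p(Age, Teeth) from the running example, keyed by (age, teeth).
joint = {
    ("y", "h"): Fraction(5, 18), ("y", "u"): Fraction(3, 18), ("y", "f"): Fraction(1, 18),
    ("t", "h"): Fraction(1, 18), ("t", "u"): Fraction(1, 18), ("t", "f"): Fraction(2, 18),
    ("o", "h"): Fraction(1, 18), ("o", "u"): Fraction(1, 18), ("o", "f"): Fraction(3, 18),
}

def conditional_teeth_given_age(joint, age):
    """Return p(Teeth | Age = age): take the row for `age` and renormalise it."""
    row = {t: prob for (a, t), prob in joint.items() if a == age}
    p_age = sum(row.values())  # the marginal p(Age = age), i.e. the denominator
    return {t: prob / p_age for t, prob in row.items()}

p_teeth_given_young = conditional_teeth_given_age(joint, "y")
print(p_teeth_given_young["f"])  # p(T = f | A = y)
```

The conditional distribution sums to one over the values of Teeth, which is exactly what dividing by the marginal achieves.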

useful

21

$$p(x, y) = p(x \mid y)\, p(y)$$

If we re-arrange the factors in the definition of the conditional probability, we get this equation, showing a kind of decomposition of the joint probability. This comes up a lot, so it’s useful to make a mental note of it.

continuous

22

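[Figure: a bivariate normal distribution over X and Y, shown together with the marginal distributions of X and Y.]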

image source: By IkamusumeFan - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=30432580

Here is what these concepts look like with continuous random variables (a bivariate normal distribution in this case). The joint probability distribution is represented by the point cloud in the middle. Marginalizing out either variable results in a univariate normal (the red and blue distributions).

The conditional distribution corresponds to a vertical or horizontal slice through the joint distribution (and also results in a univariate normal).

independence

X and Y are independent if

p(X, Y) = p(X)p(Y)

which implies p(X|Y) = p(X)

X and Y are conditionally independent given Z if

p(X, Y | Z) = p(X | Z) p(Y | Z)

23

If two variables X and Y are independent, then knowing Y will not change what we know about X.
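We can use this definition to answer the earlier question of whether age and teeth health are related in the running example: check whether every cell of the joint table equals the product of its two marginals. The helper is_independent is our own illustrative name.

```python
from fractions import Fraction

ages = ["y", "t", "o"]
teeth = ["h", "u", "f"]

# Joint distribution p(Age, Teeth) from the running example, keyed by (age, teeth).
joint = {
    ("y", "h"): Fraction(5, 18), ("y", "u"): Fraction(3, 18), ("y", "f"): Fraction(1, 18),
    ("t", "h"): Fraction(1, 18), ("t", "u"): Fraction(1, 18), ("t", "f"): Fraction(2, 18),
    ("o", "h"): Fraction(1, 18), ("o", "u"): Fraction(1, 18), ("o", "f"): Fraction(3, 18),
}

p_age = {a: sum(joint[(a, t)] for t in teeth) for a in ages}
p_teeth = {t: sum(joint[(a, t)] for a in ages) for t in teeth}

def is_independent(joint, p_age, p_teeth):
    """True if p(a, t) = p(a) p(t) holds for every cell of the table."""
    return all(joint[(a, t)] == p_age[a] * p_teeth[t] for a in p_age for t in p_teeth)

# For comparison: a table with the same marginals in which the variables are independent.
independent_joint = {(a, t): p_age[a] * p_teeth[t] for a in ages for t in teeth}

print(is_independent(joint, p_age, p_teeth))
print(is_independent(independent_joint, p_age, p_teeth))
```

In the example table, p(y, h) = 5/18 while p(y) p(h) = 7/36, so Age and Teeth are not independent: knowing somebody’s age does change what we believe about their teeth.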

Conditional independence means that the two variables are dependent, but their dependence is entirely explained by a third variable Z. If we condition on Z, the variables become independent.

For an example: consider two people a and b who work in different cities in the Netherlands. Define random variables A and B describing whether or not a and b respectively are late for dinner. They live far enough apart that the two events are entirely unrelated, except that when it snows in the Netherlands everything shuts down. Represent the event of snow by the random variable S. Now, if I know that a was late for dinner, there is a small probability that that was caused by snow. Thus the probability that b was late for dinner as well slightly increases. However, if I know that it didn’t snow (I condition on S), knowing that a was late for dinner doesn’t influence the probability of b being late for dinner at all.

conditional independence

A: Alice is home in time for dinner

B: Bob is home in time for dinner

G: a monster attacks the city

p(A, B | G) = p(A | G) p(B | G)

p(A|G, B) = p(A|G)

24

Conditional independence comes up a lot, and it can be tricky to wrap your head around at first, so here’s an example.

Imagine two people who work in different areas of a very big city. In principle, they work so far apart that whether or not they arrive home in time for dinner is completely independent. Knowing whether or not Alice is late for dinner tells you nothing about whether Bob is home in time for dinner. No aspect of their lives (weather, traffic) intersects in a meaningful way, except one.

Very rarely, a large monster attacks the city. In that case, all traffic shuts down and everybody is late for dinner. That means that if we know that Bob is late for dinner, there is a slight chance that it’s because of the monster, which should slightly raise the probability that Alice is late for dinner. However, once we know whether or not the monster has attacked, knowing that Bob is late provides no additional information.
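To see the monster example in numbers, we can build a small joint distribution in which Alice and Bob are independent given the monster, and then compare the relevant conditionals. All numbers below (a 1 in 1000 chance of an attack, lateness rates of 1 in 10 and 1 in 5 on normal days) are made up for illustration and do not come from the slides.

```python
from fractions import Fraction
from itertools import product

# Hypothetical numbers, chosen only for illustration.
p_monster = Fraction(1, 1000)
p_late_no_monster = {"alice": Fraction(1, 10), "bob": Fraction(1, 5)}

def p_late_given(person, monster):
    """Everybody is late during an attack; otherwise each has their own lateness rate."""
    return Fraction(1) if monster else p_late_no_monster[person]

# Joint p(A, B, G) built as p(G) p(A | G) p(B | G): conditional independence by construction.
joint = {}
for a_late, b_late, monster in product([True, False], repeat=3):
    p_g = p_monster if monster else 1 - p_monster
    p_a = p_late_given("alice", monster) if a_late else 1 - p_late_given("alice", monster)
    p_b = p_late_given("bob", monster) if b_late else 1 - p_late_given("bob", monster)
    joint[(a_late, b_late, monster)] = p_g * p_a * p_b

def prob(condition):
    """Probability of the event described by condition(a_late, b_late, monster)."""
    return sum(p for outcome, p in joint.items() if condition(*outcome))

p_a = prob(lambda a, b, g: a)
p_a_given_b = prob(lambda a, b, g: a and b) / prob(lambda a, b, g: b)
p_a_given_g0 = prob(lambda a, b, g: a and not g) / prob(lambda a, b, g: not g)
p_a_given_b_g0 = prob(lambda a, b, g: a and b and not g) / prob(lambda a, b, g: b and not g)

print(float(p_a), float(p_a_given_b))            # Bob being late raises Alice's probability slightly
print(float(p_a_given_g0), float(p_a_given_b_g0))  # once we know there was no monster, it changes nothing
```

Without conditioning, learning that Bob is late nudges up the probability that Alice is late. Once the monster variable is known, Bob’s lateness carries no further information about Alice.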

Bayes’ rule

the inversion problem:

It’s easy to express the probability of an observable given some hidden cause (assuming we have a model of the world). However, we usually want the opposite.

25

In short, we need a way to “turn around” the conditional probability. If we know p(X | Y), how do we work out p(Y | X)?

26

$$p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{p(X)}$$

Here’s how Bayes’ rule is usually written.

27

$$p(m \mid a) = \frac{p(a \mid m)\, p(m)}{p(a)}$$

Here’s a simple example. Let’s say that we observe that Alice is late for dinner (and we observe nothing else). Does this tell us anything about whether a monster has attacked the city? It doesn’t tell us much; it’s extremely rare that a monster attacks the city so it’s almost certain that Alice is late for other reasons. Still, if Alice were on time, we’d know that a monster almost certainly hadn’t attacked the city, since an attack would almost certainly make her late. So we may not know much, but we know something.

In this case it’s easy for us to work out the probability that Alice is late given the monster attack. This is usually the case when the conditional is the cause of the observable. The opposite is usually what we are interested in, since we have the observable and want to reason about its cause. This is where Bayes’ rule comes in.

Say that we know the probability that we observe Alice being late, given that a monster attack happened, p(a | m), is somewhere near 1. Bayes’ rule tells us how to use this to calculate the opposite conditional p(m | a). This is not near 1, because we multiply p(a | m) by the marginal probability of a monster attack p(m), which is really low. We then divide by the probability of Alice being late in general p(a): the more likely Alice is to be late due to other causes, the lower the probability that it is caused by a monster attack.
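A quick calculation shows how strongly the low prior pulls the answer down. The three input numbers are hypothetical, picked only to match the verbal description (p(a | m) near 1, p(m) really low, Alice late about one day in ten); they are not from the slides.

```python
from fractions import Fraction

def bayes(likelihood, prior, evidence):
    """Bayes' rule: p(m | a) = p(a | m) p(m) / p(a)."""
    return likelihood * prior / evidence

# Hypothetical numbers, chosen only for illustration.
p_a_given_m = Fraction(99, 100)  # Alice is almost surely late during an attack
p_m = Fraction(1, 1000)          # monster attacks are rare
p_a = Fraction(1, 10)            # Alice is late fairly often for other reasons

p_m_given_a = bayes(p_a_given_m, p_m, p_a)
print(p_m_given_a, float(p_m_given_a))
```

Under these assumptions, seeing Alice late raises the probability of an attack from 0.1% to about 1%: a real update, but still far from certainty.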

28

$$p(m \mid a) = \frac{p(a \mid m)\, p(m)}{p(a)} = \frac{p(a \mid m)\, p(m)}{\underbrace{p(a \mid t)\, p(t)}_{\text{caused by traffic}} + \underbrace{p(a \mid m)\, p(m)}_{\text{caused by monster}} + \underbrace{p(a \mid s)\, p(s)}_{\text{caused by snowfall}}}$$

If there are three possible reasons for Alice to be late (traffic, monster or snowfall), then we can see the denominator as a sum marginalizing out the cause of Alice’s lateness. The proportion of this sum given by the middle term is the probability that Alice’s lateness is caused by a monster attack.

Consider the situation where both traffic and snowfall are far more likely than a monster attack, so p(t) and p(s) are much higher than p(m), but neither traffic nor snowfall ever causes Alice to be late, perhaps because she cycles home from work, and has a bike with good snow tires. In that case both the first and last term in the sum become zero, and despite the fact that monster attacks are really rare, we can still conclude that a monster has attacked if we notice that Alice is late for dinner.
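To make the two scenarios concrete, here is a minimal sketch in Python. All numbers are made up for illustration, and we assume that traffic, monster and snowfall are mutually exclusive and exhaustive causes, so that the sum in the denominator is a valid marginalization. The function name posterior_monster is our own.

```python
def posterior_monster(p_late_given, p_cause):
    """Return p(m | a) using Bayes' rule, with p(a) expanded over the causes.

    p_late_given: dict mapping cause -> p(a | cause)
    p_cause:      dict mapping cause -> p(cause)
    """
    # Denominator p(a): sum over all causes of p(a | cause) * p(cause)
    p_late = sum(p_late_given[c] * p_cause[c] for c in p_cause)
    return p_late_given["monster"] * p_cause["monster"] / p_late


# Hypothetical marginal probabilities of each cause (they sum to one).
p_cause = {"traffic": 0.6, "monster": 0.0001, "snowfall": 0.3999}

# Scenario 1: traffic and snowfall often make Alice late.
usual = posterior_monster(
    {"traffic": 0.5, "monster": 0.99, "snowfall": 0.5}, p_cause
)

# Scenario 2: Alice cycles, so traffic and snowfall never make her late.
cyclist = posterior_monster(
    {"traffic": 0.0, "monster": 0.99, "snowfall": 0.0}, p_cause
)

print("p(m | a), usual case:  ", usual)
print("p(m | a), cyclist case:", cyclist)
```

In the first scenario the posterior stays tiny because the other terms dominate the denominator. In the second, the monster term is the only nonzero term, so it makes up the whole sum.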

conditional probability and Bayes’ rule

29

definition of conditional probability

see slide 21

\[
p(X \mid Y) = \frac{p(Y, X)}{p(Y)} = \frac{p(Y \mid X)\,p(X)}{p(Y)}
\]

If we start with the definition of conditional probability, then Bayes’ rule follows directly from filling in the equation from slide 20.

Probabilistic Models Part 2: Learning with probability


learning

We understand the machine, p(Data | θ) is known.

But we observe only the Data (and the input) and we want to know θ.

31

Observed data

θ : the parameters

“machine”

Here is an analogy for the way probability is usually applied in statistics and machine learning. We assume some “machine” (which could be any process, the universe, or an actual machine) has generated our data, by a process that is partly deterministic and partly random. The configuration of this machine is determined by its parameters (theta). Theta could be a single number, several numbers or even a complicated data structure.

We know how the machine works, so if we know theta, we know the probability of each dataset. The problem is that we only observe the data.

frequentist learning

Maximum likelihood estimation

The function L(θ) = p(X|θ) is called the likelihood.

We often maximize the logarithm of the likelihood instead.

32

\[
\hat{\theta} = \arg\max_{\theta} \; p(X \mid \theta)
\]

In frequentist learning, we are given some data and our job is to guess the true model (out of a model class) that generated the data.

In the frequentist view of the world, the true model is not subject to probability. It doesn’t change if we repeat the experiment, so we shouldn’t apply probability to it. We just try to guess which it is. This is typical of frequentist approaches: we build algorithms that give us a point estimate for our model parameters. That is, they return one point in our model space.

One of the most common criteria is that we should prefer the model (represented by theta) for which the probability of seeing the data that we saw is highest. This is called the maximum likelihood principle. Under the maximum likelihood principle, picking a model becomes an optimization problem.

Often, we make the problem easier by optimizing for the logarithm of the likelihood. This doesn’t move the optimum, but the log-likelihood is an easier function to deal with.
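As a small illustration of the maximum likelihood principle, the sketch below searches a grid of candidate parameters for a sequence of binary outcomes. The data, the grid and the function names are our own assumptions, chosen only to show that the likelihood and the log-likelihood peak at the same parameter value.

```python
import math


def likelihood(theta, flips):
    """p(flips | theta) for independent binary outcomes, theta = p(heads)."""
    return math.prod(theta if f == "H" else 1.0 - theta for f in flips)


def log_likelihood(theta, flips):
    """log p(flips | theta): a sum of logs instead of a product."""
    return sum(math.log(theta if f == "H" else 1.0 - theta) for f in flips)


flips = "HHTHHHTH"  # hypothetical data: 6 heads, 2 tails
grid = [i / 100 for i in range(1, 100)]  # candidate values 0.01 .. 0.99

# Point estimates: one value of theta from the model space.
theta_ml = max(grid, key=lambda t: likelihood(t, flips))
theta_ml_log = max(grid, key=lambda t: log_likelihood(t, flips))

print("argmax of the likelihood:    ", theta_ml)
print("argmax of the log-likelihood:", theta_ml_log)
```

A grid search is only practical for a single parameter. It is used here because it makes the optimization problem visible without any calculus.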

Bayesian learning

33

\[
p(\theta \mid X) = \frac{p(X \mid \theta)\,p(\theta)}{p(X)}
\]

posterior distribution: p(θ | X)

data distribution: p(X | θ)

prior distribution: p(θ)

model evidence: p(X)

In Bayesian learning, we can talk about the probability of the true model parameters taking a particular value. We don’t know the true parameters, but the data gives us some idea, so we express that uncertainty as a probability distribution over the model space.

The three parts of the right-hand side have these names. The prior distribution is a name you’ll hear often; it expresses our beliefs about the model before we’ve seen the data. For instance, if we do spam classification in a Bayesian way, we might have a prior belief about the probability of spam, which we then update by looking at the content of the email (the data). Our beliefs about the parameters after seeing the data are expressed by the posterior distribution.

Note that Bayesian learning does not, in principle, require us to search or optimize anything. If we can work out the function on the right-hand side of this equation, we have everything we need. If we need a good model, we can pick the one to which p(θ|X) assigns the highest probability, or we can sample a model and get a good fit with high probability. We can also study other properties of the distribution: for instance, the variance of this distribution is a good indication of how uncertain we still are about the parameters of the model.

In some cases, like for normal distributions, we can work all of this out analytically. For more complicated models, it’s usually impossible to work out the posterior analytically, and we have to make do with a function that approximates it, or with a number of individual samples from the posterior.

We won’t deal with pure Bayesian learning much in this course, but in machine learning the distinction between frequentist and Bayesian learning is not always adhered to religiously, and concepts from both are sometimes freely combined.
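The following sketch shows what "working out the right-hand side" can look like for a model with a single parameter. It is an approximation on a grid, not an analytical solution, and the data, the uniform prior and the variable names are all assumptions made for illustration.

```python
import math
import random

flips = "HHTHHHTH"  # hypothetical data: 6 heads, 2 tails
grid = [i / 100 for i in range(1, 100)]  # candidate values of theta = p(heads)

prior = [1.0 / len(grid)] * len(grid)  # assumed uniform prior p(theta)
lik = [math.prod(t if f == "H" else 1.0 - t for f in flips) for t in grid]

# Bayes' rule: posterior = data distribution * prior / model evidence
evidence = sum(l * p for l, p in zip(lik, prior))  # p(X), theta summed out
posterior = [l * p / evidence for l, p in zip(lik, prior)]

# Three uses of the posterior mentioned in the text:
theta_map = grid[posterior.index(max(posterior))]  # most probable model
theta_sample = random.choices(grid, weights=posterior, k=1)[0]  # sampled model
mean = sum(t * p for t, p in zip(grid, posterior))
variance = sum((t - mean) ** 2 * p for t, p in zip(grid, posterior))

print("most probable theta:", theta_map)
print("sampled theta:      ", theta_sample)
print("posterior variance: ", variance)
```

Nothing is searched or optimized to obtain the posterior itself. We only evaluate the right-hand side for every candidate and normalize.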

a simple example

p(Heads | Straight) =1/2

p(Tails | Straight) = 1/2

34

p(Heads | Bent) = 4/5

p(Tails | Bent) = 1/5

HTH HHT HHT HTH

To explain both maximum likelihood fitting and Bayesian learning, let’s look at a simple example. We have two coins, a bent one and a straight one. Flipping these coins gives us different probabilities of heads and tails.

We ask a friend to pick a random coin without showing us, and to flip it twelve times. The resulting sequence has more heads than tails, but not such a disparity that you would never expect it from a fair coin. If we had to guess which coin our friend had picked, which should we guess?

image source: https://www.magictricks.com/bent.html

a simple example

p(Heads | Straight) =1/2

p(Tails | Straight) = 1/2

35

p(Heads | Bent) = 4/5

p(Tails | Bent) = 1/5

HTH HHT HHT HTH

Model Space

Observed data

This is a simple version of a model selection problem. Our model class consists of two models (the two coins) and our data consists of 12 instances (the results of the coin flips).

Note that picking just one coin is a frequentist approach, and giving each coin a probability is a Bayesian approach.
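To illustrate the Bayesian side of this remark, here is a minimal sketch that gives each coin a probability after seeing the data. The coin probabilities and the sequence come from the slide. The uniform prior is our assumption, based on the friend picking a coin at random.

```python
import math

p_heads = {"Bent": 4 / 5, "Straight": 1 / 2}  # p(Heads | coin), from the slide
flips = "HTHHHTHHTHTH"  # the observed sequence
prior = {"Bent": 0.5, "Straight": 0.5}  # assumption: both coins equally likely


def likelihood(coin):
    """p(flips | coin), assuming the flips are independent."""
    p = p_heads[coin]
    return math.prod(p if f == "H" else 1.0 - p for f in flips)


# Model evidence p(D): the likelihood summed over both coins, weighted by the prior.
evidence = sum(likelihood(c) * prior[c] for c in p_heads)

# Posterior p(coin | D) for each coin.
posterior = {c: likelihood(c) * prior[c] / evidence for c in p_heads}

for coin, prob in posterior.items():
    print(f"p({coin} | D) = {prob:.3f}")
```

The output is a distribution over the model space instead of a single chosen coin.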


maximum likelihood

36

\[
\arg\max_{\text{Model} \,\in\, \text{Model Space}} \; p(\text{Data} \mid \text{Model})
\]

\[
\arg\max_{\text{Coin} \,\in\, \{\text{Bent}, \text{Straight}\}} \; p(\text{HTHHHTHHTHTH} \mid \text{Coin})
\]

Let’s start with maximum likelihood. Here is the optimization objective. We need to work out the probability of the data given the model for each model, and pick the highest one.

which coin?

37

HTHHHTHHTHTH

\[
p(D \mid \text{Bent}) = \frac{4}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} \cdot \frac{4}{5} \cdot \frac{4}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} \cdot \frac{4}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} \approx 0.000268
\]

\[
p(D \mid \text{Straight}) = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} \approx 0.000244
\]

Since the coin flips are independent, the probability over the whole sequence is just the product over the probabilities of the individual flips. There’s not much in it, but the likelihood for the bent coin is slightly higher, so that’s the preferred model under the maximum likelihood criterion.
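To make the product rule concrete, here is a minimal sketch in Python. The flip sequence and the bent coin's bias of 0.6 are made-up values for illustration, not the ones from the slide, and `sequence_likelihood` is a hypothetical helper name.

```python
def sequence_likelihood(flips, p_heads):
    """Probability of a sequence of independent flips for a coin with P(heads) = p_heads."""
    likelihood = 1.0
    for flip in flips:
        # Independence: multiply in the probability of each individual flip.
        likelihood *= p_heads if flip == "H" else 1.0 - p_heads
    return likelihood


# Hypothetical data: 7 heads and 3 tails.
flips = "HHTHHTHHHT"

fair = sequence_likelihood(flips, 0.5)  # model 1: fair coin
bent = sequence_likelihood(flips, 0.6)  # model 2: assumed bent coin

print(f"likelihood under fair coin: {fair:.6f}")
print(f"likelihood under bent coin: {bent:.6f}")
print("preferred model:", "bent" if bent > fair else "fair")
```

Under these assumed numbers the bent coin gets the higher likelihood, so it would be the model selected by the maximum likelihood criterion.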

(LOG) LIKELIHOOD: What we maximise to fit a probability model

LOSS: What we minimise to fit a machine learning model

38

We often take the logarithm of the likelihood. The logarithm is a monotonic function, so the likelihood and the log likelihood have their maxima in the same place, but the log likelihood is often easier to manipulate symbolically (see the first homework exercise).

The log likelihood of a probability distribution is a lot like the loss functions we’ve already encountered many times.

In fact, if we want to fit a probability distribution inside a deep learning system, we usually take the negative log likelihood, so that we have a quantity to minimise with gradient descent.
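The claim that the likelihood, the log likelihood and the negative log likelihood all point to the same parameter can be checked numerically. The sketch below uses a coarse grid search rather than gradient descent; the flip sequence and the helper names are assumptions made for illustration.

```python
import math


def likelihood(flips, p_heads):
    """Product of the per-flip probabilities."""
    result = 1.0
    for flip in flips:
        result *= p_heads if flip == "H" else 1.0 - p_heads
    return result


def log_likelihood(flips, p_heads):
    """Sum of the per-flip log probabilities."""
    return sum(math.log(p_heads if flip == "H" else 1.0 - p_heads) for flip in flips)


flips = "HHTHHTHHHT"  # hypothetical data: 7 heads, 3 tails
grid = [i / 100 for i in range(1, 100)]  # candidate values for P(heads)

best_by_likelihood = max(grid, key=lambda p: likelihood(flips, p))
best_by_log_likelihood = max(grid, key=lambda p: log_likelihood(flips, p))
best_by_nll = min(grid, key=lambda p: -log_likelihood(flips, p))  # a loss to minimise

print(best_by_likelihood, best_by_log_likelihood, best_by_nll)
```

All three searches select the same grid point, which is why minimising the negative log likelihood as a loss is equivalent to maximising the likelihood.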

probability density function

39

$$N(x \mid \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{1}{2\sigma^2}(x - \mu)^2\right]$$

We can see how a loss function and a log-likelihood are similar when we look at a normal distribution. The likelihood function of the normal distribution is this complicated function.

The probability density of our whole data, given some mean and standard deviation, is simply the product of all individual probability densities. This follows from the assumption that the instances are independently drawn from the same distribution.

We take the logarithm of this product, to give us the log likelihood of some data.
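As a rough illustration, the sketch below computes the density of a small dataset both as a product of densities and as a sum of log densities. The data points and the values of the mean and standard deviation are invented for the example, and the function names are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)


def normal_log_pdf(x, mu, sigma):
    """Logarithm of the density, written out without calling exp."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


data = [1.2, 0.7, 2.1, 1.5, 0.9]  # hypothetical instances
mu, sigma = 1.0, 0.5              # some candidate parameters

# Independence: the density of the whole dataset is the product of the densities.
density_product = math.prod(normal_pdf(x, mu, sigma) for x in data)

# Taking the logarithm turns that product into a sum.
log_likelihood = sum(normal_log_pdf(x, mu, sigma) for x in data)

print(math.log(density_product), log_likelihood)

# With many instances the raw product underflows, while the sum stays finite.
big_data = data * 1000
big_product = math.prod(normal_pdf(x, mu, sigma) for x in big_data)
big_log_likelihood = sum(normal_log_pdf(x, mu, sigma) for x in big_data)
print(big_product, big_log_likelihood)
```

The two quantities on the first printed line agree up to floating-point error. The second part hints at a practical reason, beyond symbolic convenience, for working with the log likelihood.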

maximum likelihood for the normal distribution

40

$$
\begin{aligned}
\arg\max_{\theta} \ln p(X \mid \theta) &= \arg\max_{\theta} \ln \prod_{x \in X} p(x \mid \theta) \\
&= \arg\max_{\theta} \sum_{x} \ln p(x \mid \theta) \\
&= \arg\max_{\mu, \sigma} \sum_{x} \ln \left( \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{1}{2\sigma^2}(x - \mu)^2\right] \right) \\
&= \arg\max_{\mu, \sigma} \sum_{x} \left( \ln \frac{1}{\sqrt{2\pi\sigma^2}} - \frac{1}{2\sigma^2}(x - \mu)^2 \right)
\end{aligned}
$$

where $p(x \mid \theta) = N(x \mid \mu, \sigma)$ with $\theta = (\mu, \sigma)$.

We want to find the mean and standard deviation for which the log likelihood is maximal. We will leave the full derivation for later, but here are the first few steps. This should show you how the logarithm simplifies things.

First, we can turn the product into a sum by moving the logarithm inside. This is explained in detail in the first homework.

We fill in the definition of the actual probability density function we’re using (line 3). This function is the product of two factors (the division and the exponent), which become terms if we work them out of the logarithm. In the second term the exponent cancels against the logarithm.
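Written out for a single instance $x$, the step from a product of two factors to a sum of two terms looks as follows. This is only a sketch of that one step, using the standard identities $\ln(ab) = \ln a + \ln b$ and $\ln \exp(z) = z$; the last equality additionally rewrites the first term with $\ln(1/a) = -\ln a$, which is not shown on the slide.

```latex
\begin{aligned}
\ln\left( \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{1}{2\sigma^2}(x - \mu)^2\right] \right)
  &= \ln \frac{1}{\sqrt{2\pi\sigma^2}} + \ln \exp\left[-\frac{1}{2\sigma^2}(x - \mu)^2\right] \\
  &= \ln \frac{1}{\sqrt{2\pi\sigma^2}} - \frac{1}{2\sigma^2}(x - \mu)^2 \\
  &= -\frac{1}{2} \ln\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}(x - \mu)^2
\end{aligned}
```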

We usually use a base-e logarithm, because it will cancel out against the base-e exponent in the probability density for the normal distribution.
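Before working through the derivation, the maximisation can be explored numerically. The sketch below does a brute-force grid search over the mean and standard deviation and compares the result with the sample mean and the sample standard deviation (with divisor n), which is the well-known closed-form maximiser for the normal distribution. The data, the grid ranges and the function names are assumptions chosen for illustration.

```python
import math


def log_likelihood(data, mu, sigma):
    """Sum over the instances of the log density of the normal distribution."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in data
    )


data = [1.2, 0.7, 2.1, 1.5, 0.9]  # hypothetical instances

# Candidate parameters on a grid with step 0.01 (sigma must be positive).
mus = [i / 100 for i in range(0, 301)]
sigmas = [i / 100 for i in range(1, 301)]

best_mu, best_sigma = max(
    ((mu, sigma) for mu in mus for sigma in sigmas),
    key=lambda params: log_likelihood(data, *params),
)

# Closed-form maximiser: sample mean and standard deviation with divisor n.
sample_mean = sum(data) / len(data)
sample_std = math.sqrt(sum((x - sample_mean) ** 2 for x in data) / len(data))

print("grid search:", best_mu, best_sigma)
print("closed form:", sample_mean, sample_std)
```

The grid search should land within one grid step of the closed-form values. A grid is used here only because it needs no calculus; in practice one would use the closed form or gradient-based optimisation.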

41

$\mu, \sigma$

$\sigma^2$<latexit 
sha1_base64="K/SJbfMATGxVkRzf/AW1h6QuToE=">AAAHyXicfVXLbtNAFDXPlPAsLNlYREgIRZWd0qYsKpU+oBKUlqppkeqAxpMbx8rYns6M07gjr/gItrDhn/gbxnlhewyz8dU951zfOTP2dSnxubCs39eu37h563Zt6U797r37Dx4+Wn58yqOYYejgiETss4s4ED+EjvAFgc+UAQpcAmfucCfDz0bAuB+FJyKh0A2QF/p9HyOhUl3HxdLhvheg9Evr66OGtWJNlqkH9ixoGLN19HX59i+nF+E4gFBggjg/ty0quhIx4WMCad2JOVCEh8iD81j0N7rSD2ksIMSp+Vxh/ZiYIjKzxsyezwALkqgAYearCiYeIIawUO3Xi6U4hCgA3uyNfMqnIR9500AgtfeuHE+8Se8XlNJjiA58PC60JlHAAyQGWpIngVtMQkyAjYJiMmtTNVlijoFhn2cmHClnDmnmNz+Jjmb4IKEDCHkqY0bSvFABwBj0lXASchAxlZPdqEMe8k3BYmhm4SS3uYvY8Bh6TVWnkCi20ycREsWUq7ah3AnhEkdBgMKedGgqHQFjIZ3mSjrxLo8ep1I6mVGuax5ncAH9mEM/pmkR3MuBewosop0F2jc7ZelpDjzV3nqWQ8/KUjfOobGGjnLoSKvsXubgSw0e59CxhiY5NNHQqxx6pfuM1LU4b3Xl9CwmhyoPiT+CdwwgTGWjlZb3wtR5n9tFSXYHZMNOJ3b3oK/+EFMgSDK63D85+JDKnY3WmrWelhkuiWFOsVbX13YsjeJNu5lxrI2N1rbGiRgKvUWh3b31N7ZeiMaMkgWp3V59+1qvlAAh0eWi0s72bmu1vDHlSLEpu21bVvm2MaxZNXPEbNimZq1XRZ+9plLgVgmmflbyhzr/HUPJP9hRVfW5zZUKWqWYe16pSKoU8wOYK4qSsMKmv8ex0JR2TrMPYahGEM1GBiLTurughgmDA/WBHKofIBIRe6m+CuYFvqqlnk4zi/5HROM5UUX1uppsdnmO6cFpa8W2VuxPrxpb27MZt2Q8NZ4ZLwzbaBtbxr5xZHQMbFwY340fxs/a+9pFbVy7mlKvX5tpnhiFVfv2B+Fl1B4=</latexit><latexit 
sha1_base64="K/SJbfMATGxVkRzf/AW1h6QuToE=">AAAHyXicfVXLbtNAFDXPlPAsLNlYREgIRZWd0qYsKpU+oBKUlqppkeqAxpMbx8rYns6M07gjr/gItrDhn/gbxnlhewyz8dU951zfOTP2dSnxubCs39eu37h563Zt6U797r37Dx4+Wn58yqOYYejgiETss4s4ED+EjvAFgc+UAQpcAmfucCfDz0bAuB+FJyKh0A2QF/p9HyOhUl3HxdLhvheg9Evr66OGtWJNlqkH9ixoGLN19HX59i+nF+E4gFBggjg/ty0quhIx4WMCad2JOVCEh8iD81j0N7rSD2ksIMSp+Vxh/ZiYIjKzxsyezwALkqgAYearCiYeIIawUO3Xi6U4hCgA3uyNfMqnIR9500AgtfeuHE+8Se8XlNJjiA58PC60JlHAAyQGWpIngVtMQkyAjYJiMmtTNVlijoFhn2cmHClnDmnmNz+Jjmb4IKEDCHkqY0bSvFABwBj0lXASchAxlZPdqEMe8k3BYmhm4SS3uYvY8Bh6TVWnkCi20ycREsWUq7ah3AnhEkdBgMKedGgqHQFjIZ3mSjrxLo8ep1I6mVGuax5ncAH9mEM/pmkR3MuBewosop0F2jc7ZelpDjzV3nqWQ8/KUjfOobGGjnLoSKvsXubgSw0e59CxhiY5NNHQqxx6pfuM1LU4b3Xl9CwmhyoPiT+CdwwgTGWjlZb3wtR5n9tFSXYHZMNOJ3b3oK/+EFMgSDK63D85+JDKnY3WmrWelhkuiWFOsVbX13YsjeJNu5lxrI2N1rbGiRgKvUWh3b31N7ZeiMaMkgWp3V59+1qvlAAh0eWi0s72bmu1vDHlSLEpu21bVvm2MaxZNXPEbNimZq1XRZ+9plLgVgmmflbyhzr/HUPJP9hRVfW5zZUKWqWYe16pSKoU8wOYK4qSsMKmv8ex0JR2TrMPYahGEM1GBiLTurughgmDA/WBHKofIBIRe6m+CuYFvqqlnk4zi/5HROM5UUX1uppsdnmO6cFpa8W2VuxPrxpb27MZt2Q8NZ4ZLwzbaBtbxr5xZHQMbFwY340fxs/a+9pFbVy7mlKvX5tpnhiFVfv2B+Fl1B4=</latexit>

This is enough to show that with the log likelihood we have another “landscape” on top of our model space. If we didn’t want to work out the rest analytically, we could just find the optimum by gradient descent or even random search.

$$
\begin{aligned}
\operatorname*{argmax}_{\mu} \sum_x \left[ \ln \frac{1}{\sqrt{2\pi\sigma^2}} - \frac{1}{2\sigma^2}(x - \mu)^2 \right]
&= \operatorname*{argmax}_{\mu} \sum_x -\frac{1}{2\sigma^2}(x - \mu)^2 \\
&= \operatorname*{argmax}_{\mu} -\frac{1}{2\sigma^2} \sum_x (x - \mu)^2 \\
&= \operatorname*{argmax}_{\mu} -\sum_x (x - \mu)^2 \\
&= \operatorname*{argmin}_{\mu} \sum_x (x - \mu)^2
\end{aligned}
$$
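As a rough numerical check of this derivation, the sketch below evaluates the Gaussian log-likelihood on a grid of candidate means and compares the best candidate with the sample mean, which is the value that minimizes the sum of squared distances. The data, the fixed sigma of 1, the grid bounds and the function names are illustrative assumptions. The grid search is only a simple stand-in for gradient descent or random search.

```python
import math

def gaussian_log_likelihood(data, mu, sigma=1.0):
    """Sum of ln N(x | mu, sigma^2) over all x in data."""
    const = math.log(1.0 / math.sqrt(2.0 * math.pi * sigma ** 2))
    return sum(const - (x - mu) ** 2 / (2.0 * sigma ** 2) for x in data)

def grid_search_mu(data, lo, hi, steps=10001):
    """Brute-force search for the mu with the highest log-likelihood."""
    best_mu, best_ll = lo, float("-inf")
    for i in range(steps):
        mu = lo + (hi - lo) * i / (steps - 1)
        ll = gaussian_log_likelihood(data, mu)
        if ll > best_ll:
            best_mu, best_ll = mu, ll
    return best_mu

data = [1.2, 0.7, 2.9, 1.8, 2.4]            # hypothetical sample
sample_mean = sum(data) / len(data)          # analytical optimum
mu_hat = grid_search_mu(data, lo=-5.0, hi=5.0)
print(sample_mean, mu_hat)                   # the two should agree up to the grid resolution
```

Because the constant term and the factor 1/(2σ²) do not depend on μ, changing `sigma` shifts and scales the landscape but leaves the location of its peak unchanged.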

HTH HHT HHT HTH

Bayesian

p(Heads | Straight) = 1/2

p(Tails | Straight) = 1/2

p(Heads | Bent) = 4/5

p(Tails | Bent) = 1/5

prior?

The Bayesian approach is a little different. We won’t go into the details, but we’ll give a brief outline of how it works on the coin example, just to give you a basic idea.

We first need to establish a prior: what is the probability of each coin in our model space? We said that we’d asked a friend to pick a coin at random. If we assume that he follows our instructions, then we believe each coin is equally likely, so both get a probability of 0.5. If we had two fair coins and one bent one, we could set the prior to 1/3 for bent and 2/3 for fair.

This is an important thing to understand about choosing a prior: it allows us to encode our assumptions about the problem. And as we saw when we discussed the problem of induction, these assumptions are what make learning possible at all.

$$
\begin{aligned}
p(D) &= p(D, \text{Straight}) + p(D, \text{Bent}) \\
&= p(D \mid \text{Straight})\, p(\text{Straight}) + p(D \mid \text{Bent})\, p(\text{Bent})
\end{aligned}
$$

$$
\begin{aligned}
p(\text{Straight} \mid D) &= \frac{p(D \mid \text{Straight})\, p(\text{Straight})}{p(D)} \\
&= \frac{p(D \mid \text{Straight})\, p(\text{Straight})}{p(D \mid \text{Straight})\, p(\text{Straight}) + p(D \mid \text{Bent})\, p(\text{Bent})}
\end{aligned}
$$

$$
p(\text{Bent} \mid D) = \frac{p(D \mid \text{Bent})\, p(\text{Bent})}{p(D \mid \text{Straight})\, p(\text{Straight}) + p(D \mid \text{Bent})\, p(\text{Bent})}
$$

This is one way to think about Bayes’ rule: we’ve observed an outcome, in this case the data, which can have a number of causes. The joint probability of a cause with an outcome is the prior of the cause times the probability of the outcome given the cause. If we sum up all these joint probabilities, then the proportion in that sum for cause X is the probability of the cause given the observation.

$$
\begin{aligned}
p(\text{Straight} \mid D) &= \frac{p(D \mid \text{Straight})\, \frac{1}{2}}{p(D \mid \text{Straight})\, \frac{1}{2} + p(D \mid \text{Bent})\, \frac{1}{2}} \\
&= \frac{p(D \mid \text{Straight})}{p(D \mid \text{Straight}) + p(D \mid \text{Bent})}
\end{aligned}
$$

If we choose a uniform prior (each model gets the same probability), then the priors cancel out and we are left with just the data probabilities, normalized by their sum. We computed these data probabilities already.

Bayesian

p(Straight | D) = 0.48

p(Bent | D) = 0.52

frequentist maximum likelihood estimator

“Bent is the most likely model”

HTHHHTHHTHTH

If we work this out, we get these probabilities for the posterior.

Note the difference with the maximum likelihood case. Even though the differences between the two likelihoods were minimal, we only get one choice for the true model in the frequentist approach. In the Bayesian approach we get a distribution on the model space. It tells us not just that Bent is the more likely model, but also that both models are still quite likely. In this sense, getting a posterior distribution is a much more valuable result than getting a point estimate for your model.
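The posterior for the coin example can be reproduced in a few lines. The sketch below uses the flip sequence and the two coin models from the slides. The function names are illustrative assumptions, and the second call uses the 2/3 versus 1/3 prior mentioned earlier for the case of two fair coins and one bent one.

```python
def sequence_likelihood(flips, p_heads):
    """Probability of one exact sequence of independent flips."""
    prob = 1.0
    for flip in flips:
        prob *= p_heads if flip == "H" else (1.0 - p_heads)
    return prob

def posterior(flips, p_heads_by_model, prior):
    """Bayes' rule over a finite model space: normalize prior * likelihood."""
    joint = {
        model: prior[model] * sequence_likelihood(flips, p_heads)
        for model, p_heads in p_heads_by_model.items()
    }
    evidence = sum(joint.values())  # p(D), the sum of all joint probabilities
    return {model: j / evidence for model, j in joint.items()}

flips = "HTHHHTHHTHTH"                      # 8 heads, 4 tails
models = {"Straight": 0.5, "Bent": 0.8}     # p(Heads | model)

uniform = posterior(flips, models, {"Straight": 1 / 2, "Bent": 1 / 2})
skewed = posterior(flips, models, {"Straight": 2 / 3, "Bent": 1 / 3})

print({m: round(p, 2) for m, p in uniform.items()})  # {'Straight': 0.48, 'Bent': 0.52}
print({m: round(p, 2) for m, p in skewed.items()})   # {'Straight': 0.65, 'Bent': 0.35}
```

With the uniform prior, Bent comes out slightly ahead. With the prior that favours the fair coin, the same data leaves Straight as the more probable model, which illustrates how the prior encodes our assumptions.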

The downside of Bayesian analysis is that as the models get more complex, it gets more and more difficult to accurately approximate the posterior, and trying to do so is what has led to some of the most complicated material in machine learning.

Probabilistic Models Part 3: (Naive) Bayes Classifiers

Machine Learning mlvu.github.io

Vrije Universiteit Amsterdam

In this lecture we’ll try to connect this probability business to the abstract task of classification.

classification

X = X1, X2, X3, …: random variable for instance.

Y: random variable for class {pos, neg}

P(Y=pos | X) = 0.1 P(Y=neg | X) = 0.9

Probabilistic classifiers are classifiers that return not just a class for a given instance x (or a ranking), but a probability over all classes.

This can be very useful. We can use the probabilities to extract a ranking (and plot an ROC curve), or we can use the probabilities to assess how certain the classifier is. If we don’t want the probabilities, we can just turn it into a regular classifier by picking the class with the highest probability.
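A minimal sketch of these two conversions, assuming the classifier's output for each instance is available as a dictionary of class probabilities. The instance names, the probabilities and the helper functions are hypothetical.

```python
def to_class(probs):
    """Regular classifier: pick the class with the highest probability."""
    return max(probs, key=probs.get)

def rank_by_positive(predictions, pos_label="pos"):
    """Ranking classifier: order instances from most to least likely positive."""
    return sorted(predictions, key=lambda name: predictions[name][pos_label], reverse=True)

# Hypothetical outputs of a probabilistic classifier for three instances.
predictions = {
    "a": {"pos": 0.10, "neg": 0.90},
    "b": {"pos": 0.70, "neg": 0.30},
    "c": {"pos": 0.55, "neg": 0.45},
}

labels = {name: to_class(probs) for name, probs in predictions.items()}
ranking = rank_by_positive(predictions)

print(labels)   # {'a': 'neg', 'b': 'pos', 'c': 'pos'}
print(ranking)  # ['b', 'c', 'a']
```

Instances b and c receive the same label, but the probabilities still show that the classifier is much less certain about c.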

Note that a probabilistic classifier is also immediately a ranking classifier (if we rank by how likely the positive class is) and a regular classifier (if we pick the class with the highest probability).

two approaches

discriminative classifier: learn a function for p(Y|X) directly

generative classifier: p(Y|X) ∝ p(X|Y) p(Y)

There are two approaches to casting the classification problem in probabilistic terms. A generative classifier focuses on learning a distribution on the feature space given the class, p(X|Y). This distribution is then combined with Bayes’ rule to get the probability over the classes, conditioned on the data.

A discriminative classifier learns the function p(Y|X) directly, with X as input and class probabilities as output. It functions as a kind of regression, mapping x to a vector of class probabilities.

We’ll look at some simple generative classifiers first.

generative classifiers

Bayes optimal classifier: Marginalize over all classifiers in a model class. Provably optimal (given certain assumptions). Usually too expensive to compute.

Bayes classifier: Learn a single distribution P(X|Y). Reasonable approach for low-dimensional data.

Naive Bayes classifier: Assume conditionally independent features. Simple, cheap and effective for high-dimensional data.

Here are three approaches, arranged from impractical but entirely correct to highly practical, but based on largely incorrect assumptions.

We won’t discuss the Bayes optimal classifier in this course, but it’s worth knowing that it exists, and that it means something different than a (naive) Bayes classifier.
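To make the middle option concrete, here is a rough sketch of a Bayes classifier on a single numeric feature. It assumes, purely for illustration, that p(x | y) is a Gaussian fitted separately to each class and that p(y) is the class frequency in the training data. The toy data and all function names are hypothetical.

```python
import math

def fit_gaussian(values):
    """Maximum likelihood mean and standard deviation of a 1D sample."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, math.sqrt(var)

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def fit_bayes_classifier(xs, ys):
    """Fit p(x | y) (one Gaussian per class) and p(y) (class frequencies)."""
    model = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        model[label] = (len(members) / len(xs), *fit_gaussian(members))
    return model

def predict_proba(model, x):
    """p(y | x) via Bayes' rule: normalize p(x | y) p(y) over the classes."""
    joint = {
        label: prior * gaussian_pdf(x, mu, sigma)
        for label, (prior, mu, sigma) in model.items()
    }
    total = sum(joint.values())  # p(x)
    return {label: j / total for label, j in joint.items()}

# Hypothetical one-dimensional training set.
xs = [1.0, 1.5, 2.0, 4.0, 4.5, 5.0]
ys = ["neg", "neg", "neg", "pos", "pos", "pos"]

model = fit_bayes_classifier(xs, ys)
probs = predict_proba(model, 4.2)
print(probs)
```

An instance close to the positive training points gets a probability near 1 for the positive class, while a point exactly halfway between the two class means gets 0.5 for each class because the fitted priors and variances are equal.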

Bayes classifier

Fit a model for p(X|Y) and for P(Y)

$$p(\text{pos} \mid x) = \frac{p(x \mid \text{pos})\, p(\text{pos})}{p(x)} = \frac{p(x \mid \text{pos})\, p(\text{pos})}{p(x \mid \text{pos})\, p(\text{pos}) + p(x \mid \text{neg})\, p(\text{neg})}$$

For the Bayes classifier, we start with the probability we’re interested in, p(Y|X): the probability of the class given the data. We then rewrite this using Bayes’ rule. From the final form, we see that if we can compute the probability functions p(X|Y), the data given the class, and p(Y), the prior probability of the class, we can compute the probabilities we’re interested in: the class probabilities given the data.

So the task becomes to learn functions for those two probabilities.
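As a rough illustration of the rewritten form, the sketch below computes p(pos | x) from the two likelihoods and the prior. The function name `posterior` and all the numbers are assumptions made up for this example; they do not come from the slides.

```python
def posterior(likelihood_pos, likelihood_neg, prior_pos):
    """Return p(pos | x) given p(x | pos), p(x | neg) and the prior p(pos)."""
    prior_neg = 1.0 - prior_pos
    t_pos = likelihood_pos * prior_pos   # p(x | pos) p(pos)
    t_neg = likelihood_neg * prior_neg   # p(x | neg) p(neg)
    return t_pos / (t_pos + t_neg)       # the denominator is p(x)

# Hypothetical values, chosen only for illustration.
p_pos = posterior(likelihood_pos=0.2, likelihood_neg=0.05, prior_pos=0.3)
print(p_pos)
```

Note that when both likelihoods are equal, the data tells us nothing and the function simply returns the prior.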

Bayes classifier


Choose probability distribution M (e.g. MVN)

Fit M_pos to all positive points: p(X=x | pos) = M_pos(x)

Fit M_neg to all negative points: p(X=x | neg) = M_neg(x)

Estimate P(Y) from the class frequencies in the training data, or use domain-specific information.

Compute terms t_pos = M_pos(x) p(pos) and t_neg = M_neg(x) p(neg)

Compute class probabilities p(pos | x) = t_pos / (t_pos + t_neg) and p(neg | x) = t_neg / (t_pos + t_neg)

So here is the algorithm for a simple Bayes classifier. We choose a model class for P(X|Y), for instance multivariate normal distributions. We fit such a model separately to each class, to give us one distribution per class.
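The algorithm can be sketched in a few lines. To keep the example short and dependency-free, this sketch assumes a single feature and uses a one-dimensional normal distribution (Python's `statistics.NormalDist`) in place of the multivariate normal; the function names and the training data are hypothetical.

```python
from statistics import NormalDist

def fit_bayes_classifier(xs, ys):
    """Fit one normal distribution per class and estimate the prior p(pos)."""
    pos = [x for x, y in zip(xs, ys) if y == "pos"]
    neg = [x for x, y in zip(xs, ys) if y == "neg"]
    m_pos = NormalDist.from_samples(pos)   # M_pos, fitted to the positive points
    m_neg = NormalDist.from_samples(neg)   # M_neg, fitted to the negative points
    prior_pos = len(pos) / len(xs)         # p(pos) from the class frequencies
    return m_pos, m_neg, prior_pos

def class_probabilities(x, m_pos, m_neg, prior_pos):
    """Return (p(pos | x), p(neg | x)) by normalising the two terms."""
    t_pos = m_pos.pdf(x) * prior_pos
    t_neg = m_neg.pdf(x) * (1.0 - prior_pos)
    total = t_pos + t_neg
    return t_pos / total, t_neg / total

# Hypothetical one-dimensional training data.
xs = [1.0, 1.5, 2.0, 2.5, 5.0, 5.5, 6.0, 6.5]
ys = ["neg"] * 4 + ["pos"] * 4
model = fit_bayes_classifier(xs, ys)
p_pos, p_neg = class_probabilities(5.8, *model)
print(p_pos, p_neg)
```

For data with several features, the same structure would apply with a multivariate density substituted for `NormalDist`.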

example for MVNs

source: http://learning.cis.upenn.edu/cis520_fall2009/index.php?n=Lectures.NaiveBayes

Here is an example of what that looks like with 2 features. On the left we have two classes, blue and black. We fit a 2D normal distribution to each. Then, for a new point, we see which distribution assigns the new point the highest probability density.

The red line shows the decision boundary.

Naive Bayes

Assume independence between all features, conditional on the class.

Often used with categoric features.

$$p(X_1, X_2 \mid Y) = p(X_1 \mid Y)\, p(X_2 \mid Y)$$

This works well for small numbers of features, but if we have many features, modelling the dependence between each pair of features gets very expensive.

A crude, but very effective solution is Naive Bayes. NB just assumes that all features are independent, conditional on the class.

Note that we do not assume that the features are independent outright: it’s perfectly possible for one feature to be dependent on another feature, but they are conditionally independent. Informally, the dependency between the features is “caused” by the class and nothing else. Just like Alice and Bob in the first video: their lateness had only one possible shared cause, the monster, and once we’d isolated that, their lateness was independent.
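The difference between conditional and unconditional independence can be checked numerically. The sketch below builds a small joint distribution using the naive Bayes factorisation and then shows that the two features are still dependent once the class is marginalised out. The class names `a` and `b` and all probabilities are made up for illustration.

```python
from fractions import Fraction as F

# Hypothetical model: two classes, two binary features.
p_y = {"a": F(1, 2), "b": F(1, 2)}
p_x1_true = {"a": F(9, 10), "b": F(1, 10)}   # p(X1 = T | Y)
p_x2_true = {"a": F(9, 10), "b": F(1, 10)}   # p(X2 = T | Y)

# p(X1 = T, X2 = T), using conditional independence within each class.
joint_tt = sum(p_y[y] * p_x1_true[y] * p_x2_true[y] for y in p_y)

# Marginals p(X1 = T) and p(X2 = T).
marg_x1 = sum(p_y[y] * p_x1_true[y] for y in p_y)
marg_x2 = sum(p_y[y] * p_x2_true[y] for y in p_y)

# If the features were independent, these two numbers would be equal.
print(joint_tt, marg_x1 * marg_x2)
dependent_marginally = joint_tt != marg_x1 * marg_x2
```

Here seeing X1 = T makes class `a` more likely, which in turn makes X2 = T more likely: the class is the only channel through which the features influence each other.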

“pill”  “meeting”  class
T       T          spam
T       F          spam
T       T          ham
T       T          ham
F       T          ham
F       T          ham
F       T          ham
F       F          spam
T       F          spam
F       F          spam
F       F          ham

Here is an example dataset, with binary features. Each feature indicates whether a particular word occurs in that instance.

We will build a naive Bayes classifier for this data by simply fitting a Bernoulli distribution to each feature. That is, we will estimate p(“pill” | spam) as the relative frequency with which the “pill” feature was true for spam emails.

X1 = “pill”, X2 = “meeting” (same dataset as above)

p(X1=T | ham) = 2/6

p(X1=F | ham) = 4/6

Here is what Naive Bayes does: it selects all emails of one class, and then estimates the probability that X1 will be T as the relative frequency of emails for which X1 was T in the training set.

Strictly speaking, we are modelling X1 (given the class) as a Bernoulli distribution whose parameter we estimate as 2/6.


p(X1=T | spam) = 3/5

p(X1=F | spam) = 2/5

We do the same for the spam class and for the other feature.
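The estimation step amounts to counting. The sketch below encodes the example dataset as tuples and computes the relative frequencies per class; the tuple layout and the helper name `estimate` are choices made for this illustration.

```python
from fractions import Fraction

# The example dataset as (pill, meeting, label) rows.
data = [
    (True, True, "spam"), (True, False, "spam"), (True, True, "ham"),
    (True, True, "ham"), (False, True, "ham"), (False, True, "ham"),
    (False, True, "ham"), (False, False, "spam"), (True, False, "spam"),
    (False, False, "spam"), (False, False, "ham"),
]

def estimate(feature_index, label):
    """Relative frequency of the feature being T among rows with this label."""
    rows = [row for row in data if row[2] == label]
    hits = sum(1 for row in rows if row[feature_index])
    return Fraction(hits, len(rows))

print(estimate(0, "ham"))    # p(X1=T | ham); Fraction reduces 2/6 to 1/3
print(estimate(0, "spam"))   # p(X1=T | spam)
```

Exact fractions are used here only so that the results can be compared directly with the slide values.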

$$p(Y \mid X_1, \ldots, X_n) \propto p(X_1, \ldots, X_n \mid Y)\, p(Y)$$

$$= p(X_1 \mid Y) \times \ldots \times p(X_n \mid Y)\, p(Y)$$

This is the naive Bayes assumption formulaically. We simply factor p(X1, …, Xn | Y) into n separate probabilities, one for each feature.


new instance: “pill” & “meeting”

p(ham | X1=T, X2=T) ∝p(X1=T, X2=T | ham) p(ham)

= p(X1=T | ham) p(X2=T | ham) p(ham)

= (2/6) × (5/6) × (6/11)
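To turn this unnormalised score into a class probability, we compute the same product for spam and normalise. In the sketch below, the ham values are the ones from the slide; the spam value p(X2=T | spam) = 1/5 is not stated on the slides but counted from the example table, and the dictionary layout and function names are assumptions for illustration.

```python
from fractions import Fraction as F

prior = {"ham": F(6, 11), "spam": F(5, 11)}
p_true = {  # p(feature = T | class), relative frequencies from the table
    "ham": {"pill": F(2, 6), "meeting": F(5, 6)},
    "spam": {"pill": F(3, 5), "meeting": F(1, 5)},
}

def scores(instance):
    """Unnormalised p(class) * product of per-feature probabilities."""
    out = {}
    for c in prior:
        s = prior[c]
        for feature, value in instance.items():
            p = p_true[c][feature]
            s *= p if value else 1 - p
        out[c] = s
    return out

def normalise(s):
    """Divide by the sum of the scores so that the results sum to one."""
    total = sum(s.values())
    return {c: v / total for c, v in s.items()}

unnorm = scores({"pill": True, "meeting": True})
post = normalise(unnorm)
print(unnorm["ham"], post["ham"], post["spam"])
```

Under these estimates, an email containing both words comes out as ham with probability 25/34.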

smoothing

X1  X2  class
T   T   spam
T   F   spam
T   T   ham
T   T   ham
F   T   ham
F   T   ham
F   T   ham
T   F   spam
T   F   spam
T   F   spam
F   F   ham

p(X1=T | spam) = 5/5

p(X1=F | spam) = 0/5

While Naive Bayes can work surprisingly well (given how strong and incorrect the assumption is), we do run into a problem if, for some feature, a particular value does not occur within one of the classes. In that case, we estimate the probability as 0.

$$p(Y \mid X_1, \ldots, X_n) \propto p(X_1, \ldots, X_n \mid Y)\, p(Y)$$

$$= p(X_1 \mid Y) \times \ldots \times p(X_n \mid Y)\, p(Y)$$

$$= 0 \times \ldots \times p(X_n \mid Y)\, p(Y)$$

$$= 0$$

Since the whole estimate of our probability is just a long product, if one of the factors becomes zero, the whole thing collapses. Even if all the other features gave this class a very high probability, that information is lost.
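A tiny numerical sketch of the collapse: the first factor is the unsmoothed estimate p(X1=F | spam) = 0/5 from the table, while the remaining factors are hypothetical values for other features that strongly favour spam.

```python
from math import prod

# First factor: p(X1=F | spam) = 0/5. The others are made-up, high values.
factors = [0 / 5, 0.99, 0.95, 0.9]
prior_spam = 5 / 11

score = prod(factors) * prior_spam
print(score)  # the single zero factor wipes out all the other evidence
```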

pseudo-observations (aka Laplace smoothing)

X1  X2  class
T   T   spam
T   F   spam
T   T   ham
T   T   ham
F   T   ham
F   T   ham
F   T   ham
T   F   spam
T   F   spam
T   F   spam
F   F   ham
F   F   spam   (pseudo-observation)
T   T   spam   (pseudo-observation)

To remedy this, we need to apply smoothing. The simplest way to do that is to add pseudo-observations. For each possible value, we add an instance where all the features have that value.

(We should do the same for the class ham.)

unsmoothed

$$p(X_1 = T \mid Y = \text{spam}) = \frac{\text{freq. of T in spam data}}{\text{total \# of spam instances}}$$

smoothed

$$p(X_1 = T \mid Y = \text{spam}) = \frac{\text{freq. of T in spam} + 1}{\text{total \# of spam instances} + v}$$

This changes our estimates as shown here (i.e. we don’t actually need to add the pseudo-observations, we just change our estimator).

Here, v is the number of different values X1 can take.
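As a minimal sketch of the smoothed estimator, applied to the spam class of the smoothing example (5 spam emails, all with X1 = T, and v = 2 possible values); the function name `smoothed_estimate` is an assumption for this illustration.

```python
from fractions import Fraction

def smoothed_estimate(count, total, num_values):
    """Laplace-smoothed relative frequency: (count + 1) / (total + v)."""
    return Fraction(count + 1, total + num_values)

# Spam class: X1 = T occurs 5 times out of 5, X1 = F never occurs.
p_true = smoothed_estimate(count=5, total=5, num_values=2)
p_false = smoothed_estimate(count=0, total=5, num_values=2)
print(p_true, p_false)
```

These are the same values we would get by counting in the table with the two pseudo-observations added: 6 of 7 spam rows have X1 = T, and 1 of 7 has X1 = F.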

In practice, we often reduce the weight of

summary so far

Bayesian vs frequentist learning. Use what works, mix-and-match.

Discriminative classification: learn p(Y|X) directly

Generative classification: learn p(X|Y) and p(Y), apply Bayes’ rule.

Examples: Bayes classifier, Naive Bayes classifier.

Naive Bayes: assumes independent features (conditional on the class).

Laplace smoothing: add pseudo-observations to avoid zero probabilities.

Probabilistic Models Part 4: Logistic Regression

This lecture will be all about how to use the mechanisms of probability to create a classifier.

discriminative classifier

Learn P(Y | X) directly.

In this video we’ll look at an example of a discriminative classifier. This is a classifier that learns to map the features directly to class probabilities, without using Bayes’ rule to reverse the conditional probability.

Remember that we were still on the lookout for good loss functions for the classification problem. We’ll use the language of probability to define one for us.

Least-squares classifier

This was our last attempt: the least squares loss.

Our thinking was: the hyperplane classifier checks if wx + b is positive or negative, to decide whether to assign classes blue or red, respectively. Why not just give blue and red some arbitrary positive and negative values, and treat it as a regression problem?

Here is another option: instead of giving negative and positive arbitrary values, we give them probabilities: the probability of being positive, which is 1 for all blue points and 0 for all red points. (In other words, we move the red points from -1 to 0.)

This doesn’t look substantially different to our linear classifier, because our function w^T x + b still ranges from negative infinity to positive infinity. It doesn’t produce probabilities, except over a very narrow range.

What we need is a way to squeeze that whole range into the range [0, 1], so that the model only ever produces valid probabilities.

the logistic sigmoid

[Figure: the logistic sigmoid σ(t) plotted against t]

$$\sigma(t) = \frac{1}{1 + e^{-t}} = \frac{e^{t}}{1 + e^{t}}$$

$$1 - \sigma(t) = \sigma(-t)$$

For this purpose, we will use the logistic sigmoid. Note that its domain is the entire real number line, and its range is the open interval (0, 1), so its output is always a valid probability.

An interesting property of the logistic sigmoid is the symmetry given in the second line. Basically, the remainder between σ(t) and 1 is itself a sigmoid running in the other direction. In other words: mirroring the sigmoid in the horizontal line at height 1/2, which gives 1 - σ(t), results in the same function as mirroring it in the vertical axis, which gives σ(-t). We’ll make frequent use of this later.
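Both forms of the sigmoid and the symmetry can be verified numerically. This is a minimal sketch using only the standard library; the test points are arbitrary, and the direct formula is only safe for moderate values of t, since `math.exp` overflows for very large arguments.

```python
import math

def sigmoid(t):
    """Logistic sigmoid 1 / (1 + e^-t)."""
    return 1.0 / (1.0 + math.exp(-t))

points = (-3.0, -0.5, 0.0, 2.0)

# Difference between the two equivalent forms of the sigmoid.
form_gaps = [abs(sigmoid(t) - math.exp(t) / (1.0 + math.exp(t))) for t in points]

# Difference between 1 - sigmoid(t) and sigmoid(-t).
symmetry_gaps = [abs((1.0 - sigmoid(t)) - sigmoid(-t)) for t in points]

print(max(form_gaps), max(symmetry_gaps))  # both should be at rounding-error level
```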

source: By Qef (talk) - Created from scratch with gnuplot, Public Domain, https://commons.wikimedia.org/w/index.php?curid=4310325

[Figure: plot with vertical axis p(Pos), ticks at 0 and 1.]

$c(x) = \sigma(w \cdot x + b)$

This is our new classifier: we compute the linear function as before, but we apply the logistic sigmoid to the result, squeezing it into the interval between 0 and 1. This means that we can interpret the output as the probability of the positive class. This may be a very accurate probability, or a very inaccurate one, depending on how we choose w and b, but it's always a value between 0 and 1.
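As a rough illustration of this classifier, here is a minimal sketch in plain Python. The function name `classify` and the weights, bias and input are hypothetical values chosen only for the example.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def classify(x, w, b):
    """Return the model's probability of the positive class for features x."""
    linear = sum(wi * xi for wi, xi in zip(w, x)) + b  # w . x + b
    return sigmoid(linear)

w = [2.0, -1.0]  # hypothetical weights
b = 0.5          # hypothetical bias
p_pos = classify([1.0, 3.0], w, b)  # linear output: 2 - 3 + 0.5 = -0.5
p_neg = 1.0 - p_pos                 # probability of the negative class
```

Whatever values are chosen for `w` and `b`, `p_pos` stays between 0 and 1; whether it is a good probability depends entirely on those values.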

Now all we need is a loss function that tells us which probabilities match the data.

log loss

x: some data point

$q_x$: our classifier, $q_x(C) = p(C \mid x)$

e.g. $q_x(\text{Pos}) = 0.1$, $q_x(\text{Neg}) = 0.9$

Split the data into positives $X_P$ and negatives $X_N$.

Find the classifier q that maximizes the probability of the true classes, i.e. we use the maximum likelihood objective.

Clearly, we want a classifier that assigns high probability to the true label, and low probability to the false one. If we treat the classifier as a model for our data, we can compute the probability of the data.

$$p(D) = \prod_{x, C \in D} q_x(C)$$

$$
\begin{aligned}
\operatorname*{argmax}_q \prod_{C,x} q_x(C)
&= \operatorname*{argmax}_q \log \prod_{C,x} q_x(C) = \operatorname*{argmin}_q -\log \prod_{C,x} q_x(C) \\
&= \operatorname*{argmin}_q \sum_{C,x} -\log q_x(C) \\
&= \operatorname*{argmin}_q -\sum_{x \in X_P} \log q_x(P) - \sum_{x \in X_N} \log q_x(N)
\end{aligned}
$$
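To make the objective concrete, the sketch below computes both the likelihood and the log loss for a tiny, made-up one-dimensional dataset. The data, the two parameter settings and all function names are assumptions for illustration. The natural logarithm is used here; a different base only rescales the loss and does not change which classifier is best.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def q_pos(x, w, b):
    """q_x(P): probability the classifier assigns to the positive class."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def likelihood(X_pos, X_neg, w, b):
    """Product of the probabilities assigned to the true classes."""
    p = 1.0
    for x in X_pos:
        p *= q_pos(x, w, b)
    for x in X_neg:
        p *= 1.0 - q_pos(x, w, b)
    return p

def log_loss(X_pos, X_neg, w, b):
    """Negative log-likelihood, split over positives and negatives."""
    loss = -sum(math.log(q_pos(x, w, b)) for x in X_pos)
    loss -= sum(math.log(1.0 - q_pos(x, w, b)) for x in X_neg)
    return loss

# Made-up data: positives lie to the right, negatives to the left.
X_pos = [[2.0], [3.0]]
X_neg = [[-1.0], [0.5]]
good = ([1.5], -1.0)   # hypothetical parameters that fit the data
bad = ([-1.5], 1.0)    # hypothetical parameters that point the wrong way

print(likelihood(X_pos, X_neg, *good), log_loss(X_pos, X_neg, *good))
print(likelihood(X_pos, X_neg, *bad), log_loss(X_pos, X_neg, *bad))
```

The parameters with the higher likelihood are also the ones with the lower log loss, which is what the chain of equalities says.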

Least-squares classifier

[Figure: least-squares classifier; vertical axis with ticks at −1, 0 and 1.]

In the least-squares case, the loss function could be thought of in terms of the residuals between the prediction and the true values. They pull on the line like rubber bands.

$c(x) = \sigma(w \cdot x + b)$

[Figure: the sigmoid classifier; vertical axis with ticks at 0 and 1.]

For the cross-entropy loss, we can imagine the residuals for logistic regression as the lines drawn here. The cross-entropy loss tries to maximise the length of these lines by minimising the negative of their logarithm. You can think of them as little rods pushing the sigmoid towards the red and blue points.

Remember that in the least-squares loss we squared the residuals before summing them, to punish outliers. Taking the logarithm has a similar effect. Low probabilities are disproportionately punished.
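A quick way to see this effect is to print the loss term for a few probabilities assigned to the true class. This is a small sketch; the helper name `loss_term` and the chosen probabilities are illustrative, and the natural logarithm is used.

```python
import math

def loss_term(p):
    """Contribution of one instance to the log loss, given the
    probability p that the classifier assigns to its true class."""
    return -math.log(p)

for p in [0.9, 0.5, 0.1, 0.01, 0.001]:
    print(f"q_x(true class) = {p:<6} loss term = {loss_term(p):.3f}")
```

Each halving of the probability adds the same fixed amount to the loss, so the penalty keeps growing without bound as the probability of the true class approaches zero.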

working out the gradient

$$
\frac{\partial\, \text{loss}(w, b)}{\partial w_i}
= \frac{\partial \left( -\sum_{x \in X_P} \log q_x(P) - \sum_{x \in X_N} \log q_x(N) \right)}{\partial w_i}
= -\sum_{x \in X_P} \frac{\partial \log q_x(P)}{\partial w_i} - \sum_{x \in X_N} \frac{\partial \log q_x(N)}{\partial w_i}
$$

We'll show you the basics of working out the gradient for logistic regression. The loss breaks apart into separate terms for the positive and negative points. Let's look at one of the positive terms (the negative can be derived in a similar way).

80

@ logqx(P)

@wi=

@ log�(w · x+ b)

@wi

=@ log 1

1+exp(-wT x-b)

@wi= -

@ log(1+ exp(-wTx- b))

@wi

= -@ log(1+ exp(-wTx- b))

@(1+ exp(-wTx- b))

@(1+ exp(-wTx- b))

@wi

= -1

ln 2

1

(1+ exp(-wTx- b))

@ exp(-wTx- b)

@wi

= -1

(1+ exp(-wTx- b))

@ exp(-wTx- b)

@(-wTx- b)

@(-wTx- b)

@wi

= -exp(-wTx- b)

(1+ exp(-wTx- b))·-xi = (1- �(wTx+ b))xi

= qx(N)xi<latexit sha1_base64="2vcOtw/BEfXstT2FaWDEzowiKeA=">AAALF3icrVbNcttEHFdrKEVQaODIRYOHjkucjOTQJBwyU/JBe6BJyMRJZiLjWa3Xssb6YrWype7sg/ACvEZvDFeOPABXeAV2JdmWtKoLneqSzf4+9r+//bIVuk5EdP3PO3db771/74P7H6offfzgk08fbnx2FQUxhqgPAzfANxaIkOv4qE8c4qKbECPgWS66tqZHAr+eIRw5gX9J0hANPGD7ztiBgPCu4Ubr5pFmjjGA1JyGmukGtvbzMOmYFqTn7DHLewNI50OHMe2gzjUjx/ZARzBMa85MOAqImWwKuSXLTVN91OCR/WswamyaKAk7Wwu3ny7NRNvSFmaaXM1W3auz3uOx5FGU9LZGb6Kxle8bqdXKVoVNOX0Rken6PcZWkf2P8ddmuz6VdzzQ61d4tQhvWen6sf/DJMQG1raSocNNO4bAqjs8U2wuFYIoSjjITw2G9JQVncOHbX1bzz5NbhhFo60U3/lw496v5iiAsYd8Al0QRbeGHpIBBZg40EVMNeMIhQBOgY1uYzLeH1DHD2OCfMi0rzg2jl2NBJo45trIwQgSN+UNALHDHTQ4ATwiwi8DtWoVIR94KOqOZk4Y5c1oZucNAvhNMqBJdtOwBxUltTEIJw5MKqVR4EUeIBOpM0o9q9qJYhfhmVftFGXyImvMBGHoRCKEc57MWShur+gyOC/wSRpOkB8xGmOXlYUcQBijMRdmzQiROKTZbPiVOY0OCI5RVzSzvoNjgKcXaNTlPpWOajljNwCk2mXxafB0fDSHgecBf0TNkG9WghJCze42y7IroxeMUlMEZVnahYAr6GkJPWWsCp6UwBMOVtH+Eh1r/br0qgReSaNel9DrutSKS2gsobMSOpOc+dFZwXMJTkpoIqFpCU0l9GUJfSnnDPi2uO0NaL4W2aLSM9eZoWcYIZ/RNr9Qa3PBfL1vjapE7AHaNlgW9wiN+XubA14q6PT55YsfGD3a7z3Rd1mdYbkxWlD0nd0nR7pEsfNqCo6+v987lDgBBr69NDo+2f3OkI3CGIfukrS3t/P9t7JTilw3mC+djg6Pezv1ifFEqkUZe4au13cbhlJURSJa29CkaO0mejFMo8BqEuR5NvKnMv8ZBulr2EGT+yLmRkXYpFhk3qhImxSLBVgoqhK/IabVciw1tZmH4iCIXwuheDKAm/seI/6YYPSCH5AzfgECEuCv+anAtudwL/7X7IrWOiJIFkTeUlX+shn1d0xuXPW2jZ1t/cdv2k8PizfuvvKF8qXSUQxlT3mqPFfOlb4CW69af7X+bv2j/qK+Un9Tf8+pd+8Ums+Vyqf+8S/5yPuw</latexit>

@ logqx(P)

@wi=

@ log�(w · x+ b)

@wi

=@ log 1

1+exp(-wT x-b)

@wi= -

@ log(1+ exp(-wTx- b))

@wi

= -@ log(1+ exp(-wTx- b))

@(1+ exp(-wTx- b))

@(1+ exp(-wTx- b))

@wi

= -1

ln 2

1

(1+ exp(-wTx- b))

@ exp(-wTx- b)

@wi

= -1

(1+ exp(-wTx- b))

@ exp(-wTx- b)

@(-wTx- b)

@(-wTx- b)

@wi

= -exp(-wTx- b)

(1+ exp(-wTx- b))·-xi = (1- �(wTx+ b))xi

= qx(N)xi<latexit sha1_base64="2vcOtw/BEfXstT2FaWDEzowiKeA=">AAALF3icrVbNcttEHFdrKEVQaODIRYOHjkucjOTQJBwyU/JBe6BJyMRJZiLjWa3Xssb6YrWype7sg/ACvEZvDFeOPABXeAV2JdmWtKoLneqSzf4+9r+//bIVuk5EdP3PO3db771/74P7H6offfzgk08fbnx2FQUxhqgPAzfANxaIkOv4qE8c4qKbECPgWS66tqZHAr+eIRw5gX9J0hANPGD7ztiBgPCu4Ubr5pFmjjGA1JyGmukGtvbzMOmYFqTn7DHLewNI50OHMe2gzjUjx/ZARzBMa85MOAqImWwKuSXLTVN91OCR/WswamyaKAk7Wwu3ny7NRNvSFmaaXM1W3auz3uOx5FGU9LZGb6Kxle8bqdXKVoVNOX0Rken6PcZWkf2P8ddmuz6VdzzQ61d4tQhvWen6sf/DJMQG1raSocNNO4bAqjs8U2wuFYIoSjjITw2G9JQVncOHbX1bzz5NbhhFo60U3/lw496v5iiAsYd8Al0QRbeGHpIBBZg40EVMNeMIhQBOgY1uYzLeH1DHD2OCfMi0rzg2jl2NBJo45trIwQgSN+UNALHDHTQ4ATwiwi8DtWoVIR94KOqOZk4Y5c1oZucNAvhNMqBJdtOwBxUltTEIJw5MKqVR4EUeIBOpM0o9q9qJYhfhmVftFGXyImvMBGHoRCKEc57MWShur+gyOC/wSRpOkB8xGmOXlYUcQBijMRdmzQiROKTZbPiVOY0OCI5RVzSzvoNjgKcXaNTlPpWOajljNwCk2mXxafB0fDSHgecBf0TNkG9WghJCze42y7IroxeMUlMEZVnahYAr6GkJPWWsCp6UwBMOVtH+Eh1r/br0qgReSaNel9DrutSKS2gsobMSOpOc+dFZwXMJTkpoIqFpCU0l9GUJfSnnDPi2uO0NaL4W2aLSM9eZoWcYIZ/RNr9Qa3PBfL1vjapE7AHaNlgW9wiN+XubA14q6PT55YsfGD3a7z3Rd1mdYbkxWlD0nd0nR7pEsfNqCo6+v987lDgBBr69NDo+2f3OkI3CGIfukrS3t/P9t7JTilw3mC+djg6Pezv1ifFEqkUZe4au13cbhlJURSJa29CkaO0mejFMo8BqEuR5NvKnMv8ZBulr2EGT+yLmRkXYpFhk3qhImxSLBVgoqhK/IabVciw1tZmH4iCIXwuheDKAm/seI/6YYPSCH5AzfgECEuCv+anAtudwL/7X7IrWOiJIFkTeUlX+shn1d0xuXPW2jZ1t/cdv2k8PizfuvvKF8qXSUQxlT3mqPFfOlb4CW69af7X+bv2j/qK+Un9Tf8+pd+8Ums+Vyqf+8S/5yPuw</latexit>

@ logqx(P)

@wi=

@ log�(w · x+ b)

@wi

=@ log 1

1+exp(-wT x-b)

@wi= -

@ log(1+ exp(-wTx- b))

@wi

= -@ log(1+ exp(-wTx- b))

@(1+ exp(-wTx- b))

@(1+ exp(-wTx- b))

@wi

= -1

ln 2

1

(1+ exp(-wTx- b))

@ exp(-wTx- b)

@wi

= -1

(1+ exp(-wTx- b))

@ exp(-wTx- b)

@(-wTx- b)

@(-wTx- b)

@wi

= -exp(-wTx- b)

(1+ exp(-wTx- b))·-xi = (1- �(wTx+ b))xi

= qx(N)xi<latexit sha1_base64="2vcOtw/BEfXstT2FaWDEzowiKeA=">AAALF3icrVbNcttEHFdrKEVQaODIRYOHjkucjOTQJBwyU/JBe6BJyMRJZiLjWa3Xssb6YrWype7sg/ACvEZvDFeOPABXeAV2JdmWtKoLneqSzf4+9r+//bIVuk5EdP3PO3db771/74P7H6offfzgk08fbnx2FQUxhqgPAzfANxaIkOv4qE8c4qKbECPgWS66tqZHAr+eIRw5gX9J0hANPGD7ztiBgPCu4Ubr5pFmjjGA1JyGmukGtvbzMOmYFqTn7DHLewNI50OHMe2gzjUjx/ZARzBMa85MOAqImWwKuSXLTVN91OCR/WswamyaKAk7Wwu3ny7NRNvSFmaaXM1W3auz3uOx5FGU9LZGb6Kxle8bqdXKVoVNOX0Rken6PcZWkf2P8ddmuz6VdzzQ61d4tQhvWen6sf/DJMQG1raSocNNO4bAqjs8U2wuFYIoSjjITw2G9JQVncOHbX1bzz5NbhhFo60U3/lw496v5iiAsYd8Al0QRbeGHpIBBZg40EVMNeMIhQBOgY1uYzLeH1DHD2OCfMi0rzg2jl2NBJo45trIwQgSN+UNALHDHTQ4ATwiwi8DtWoVIR94KOqOZk4Y5c1oZucNAvhNMqBJdtOwBxUltTEIJw5MKqVR4EUeIBOpM0o9q9qJYhfhmVftFGXyImvMBGHoRCKEc57MWShur+gyOC/wSRpOkB8xGmOXlYUcQBijMRdmzQiROKTZbPiVOY0OCI5RVzSzvoNjgKcXaNTlPpWOajljNwCk2mXxafB0fDSHgecBf0TNkG9WghJCze42y7IroxeMUlMEZVnahYAr6GkJPWWsCp6UwBMOVtH+Eh1r/br0qgReSaNel9DrutSKS2gsobMSOpOc+dFZwXMJTkpoIqFpCU0l9GUJfSnnDPi2uO0NaL4W2aLSM9eZoWcYIZ/RNr9Qa3PBfL1vjapE7AHaNlgW9wiN+XubA14q6PT55YsfGD3a7z3Rd1mdYbkxWlD0nd0nR7pEsfNqCo6+v987lDgBBr69NDo+2f3OkI3CGIfukrS3t/P9t7JTilw3mC+djg6Pezv1ifFEqkUZe4au13cbhlJURSJa29CkaO0mejFMo8BqEuR5NvKnMv8ZBulr2EGT+yLmRkXYpFhk3qhImxSLBVgoqhK/IabVciw1tZmH4iCIXwuheDKAm/seI/6YYPSCH5AzfgECEuCv+anAtudwL/7X7IrWOiJIFkTeUlX+shn1d0xuXPW2jZ1t/cdv2k8PizfuvvKF8qXSUQxlT3mqPFfOlb4CW69af7X+bv2j/qK+Un9Tf8+pd+8Ums+Vyqf+8S/5yPuw</latexit>

@ logqx(P)

@wi=

@ log�(w · x+ b)

@wi

=@ log 1

1+exp(-wT x-b)

@wi= -

@ log(1+ exp(-wTx- b))

@wi

= -@ log(1+ exp(-wTx- b))

@(1+ exp(-wTx- b))

@(1+ exp(-wTx- b))

@wi

= -1

ln 2

1

(1+ exp(-wTx- b))

@ exp(-wTx- b)

@wi

= -1

(1+ exp(-wTx- b))

@ exp(-wTx- b)

@(-wTx- b)

@(-wTx- b)

@wi

= -exp(-wTx- b)

(1+ exp(-wTx- b))·-xi = (1- �(wTx+ b))xi

= qx(N)xi<latexit sha1_base64="2vcOtw/BEfXstT2FaWDEzowiKeA=">AAALF3icrVbNcttEHFdrKEVQaODIRYOHjkucjOTQJBwyU/JBe6BJyMRJZiLjWa3Xssb6YrWype7sg/ACvEZvDFeOPABXeAV2JdmWtKoLneqSzf4+9r+//bIVuk5EdP3PO3db771/74P7H6offfzgk08fbnx2FQUxhqgPAzfANxaIkOv4qE8c4qKbECPgWS66tqZHAr+eIRw5gX9J0hANPGD7ztiBgPCu4Ubr5pFmjjGA1JyGmukGtvbzMOmYFqTn7DHLewNI50OHMe2gzjUjx/ZARzBMa85MOAqImWwKuSXLTVN91OCR/WswamyaKAk7Wwu3ny7NRNvSFmaaXM1W3auz3uOx5FGU9LZGb6Kxle8bqdXKVoVNOX0Rken6PcZWkf2P8ddmuz6VdzzQ61d4tQhvWen6sf/DJMQG1raSocNNO4bAqjs8U2wuFYIoSjjITw2G9JQVncOHbX1bzz5NbhhFo60U3/lw496v5iiAsYd8Al0QRbeGHpIBBZg40EVMNeMIhQBOgY1uYzLeH1DHD2OCfMi0rzg2jl2NBJo45trIwQgSN+UNALHDHTQ4ATwiwi8DtWoVIR94KOqOZk4Y5c1oZucNAvhNMqBJdtOwBxUltTEIJw5MKqVR4EUeIBOpM0o9q9qJYhfhmVftFGXyImvMBGHoRCKEc57MWShur+gyOC/wSRpOkB8xGmOXlYUcQBijMRdmzQiROKTZbPiVOY0OCI5RVzSzvoNjgKcXaNTlPpWOajljNwCk2mXxafB0fDSHgecBf0TNkG9WghJCze42y7IroxeMUlMEZVnahYAr6GkJPWWsCp6UwBMOVtH+Eh1r/br0qgReSaNel9DrutSKS2gsobMSOpOc+dFZwXMJTkpoIqFpCU0l9GUJfSnnDPi2uO0NaL4W2aLSM9eZoWcYIZ/RNr9Qa3PBfL1vjapE7AHaNlgW9wiN+XubA14q6PT55YsfGD3a7z3Rd1mdYbkxWlD0nd0nR7pEsfNqCo6+v987lDgBBr69NDo+2f3OkI3CGIfukrS3t/P9t7JTilw3mC+djg6Pezv1ifFEqkUZe4au13cbhlJURSJa29CkaO0mejFMo8BqEuR5NvKnMv8ZBulr2EGT+yLmRkXYpFhk3qhImxSLBVgoqhK/IabVciw1tZmH4iCIXwuheDKAm/seI/6YYPSCH5AzfgECEuCv+anAtudwL/7X7IrWOiJIFkTeUlX+shn1d0xuXPW2jZ1t/cdv2k8PizfuvvKF8qXSUQxlT3mqPFfOlb4CW69af7X+bv2j/qK+Un9Tf8+pd+8Ums+Vyqf+8S/5yPuw</latexit>

@ logqx(P)

@wi=

@ log�(w · x+ b)

@wi

=@ log 1

1+exp(-wT x-b)

@wi= -

@ log(1+ exp(-wTx- b))

@wi

= -@ log(1+ exp(-wTx- b))

@(1+ exp(-wTx- b))

@(1+ exp(-wTx- b))

@wi

= -1

ln 2

1

(1+ exp(-wTx- b))

@ exp(-wTx- b)

@wi

= -1

(1+ exp(-wTx- b))

@ exp(-wTx- b)

@(-wTx- b)

@(-wTx- b)

@wi

= -exp(-wTx- b)

(1+ exp(-wTx- b))·-xi = (1- �(wTx+ b))xi

= qx(N)xi<latexit sha1_base64="2vcOtw/BEfXstT2FaWDEzowiKeA=">AAALF3icrVbNcttEHFdrKEVQaODIRYOHjkucjOTQJBwyU/JBe6BJyMRJZiLjWa3Xssb6YrWype7sg/ACvEZvDFeOPABXeAV2JdmWtKoLneqSzf4+9r+//bIVuk5EdP3PO3db771/74P7H6offfzgk08fbnx2FQUxhqgPAzfANxaIkOv4qE8c4qKbECPgWS66tqZHAr+eIRw5gX9J0hANPGD7ztiBgPCu4Ubr5pFmjjGA1JyGmukGtvbzMOmYFqTn7DHLewNI50OHMe2gzjUjx/ZARzBMa85MOAqImWwKuSXLTVN91OCR/WswamyaKAk7Wwu3ny7NRNvSFmaaXM1W3auz3uOx5FGU9LZGb6Kxle8bqdXKVoVNOX0Rken6PcZWkf2P8ddmuz6VdzzQ61d4tQhvWen6sf/DJMQG1raSocNNO4bAqjs8U2wuFYIoSjjITw2G9JQVncOHbX1bzz5NbhhFo60U3/lw496v5iiAsYd8Al0QRbeGHpIBBZg40EVMNeMIhQBOgY1uYzLeH1DHD2OCfMi0rzg2jl2NBJo45trIwQgSN+UNALHDHTQ4ATwiwi8DtWoVIR94KOqOZk4Y5c1oZucNAvhNMqBJdtOwBxUltTEIJw5MKqVR4EUeIBOpM0o9q9qJYhfhmVftFGXyImvMBGHoRCKEc57MWShur+gyOC/wSRpOkB8xGmOXlYUcQBijMRdmzQiROKTZbPiVOY0OCI5RVzSzvoNjgKcXaNTlPpWOajljNwCk2mXxafB0fDSHgecBf0TNkG9WghJCze42y7IroxeMUlMEZVnahYAr6GkJPWWsCp6UwBMOVtH+Eh1r/br0qgReSaNel9DrutSKS2gsobMSOpOc+dFZwXMJTkpoIqFpCU0l9GUJfSnnDPi2uO0NaL4W2aLSM9eZoWcYIZ/RNr9Qa3PBfL1vjapE7AHaNlgW9wiN+XubA14q6PT55YsfGD3a7z3Rd1mdYbkxWlD0nd0nR7pEsfNqCo6+v987lDgBBr69NDo+2f3OkI3CGIfukrS3t/P9t7JTilw3mC+djg6Pezv1ifFEqkUZe4au13cbhlJURSJa29CkaO0mejFMo8BqEuR5NvKnMv8ZBulr2EGT+yLmRkXYpFhk3qhImxSLBVgoqhK/IabVciw1tZmH4iCIXwuheDKAm/seI/6YYPSCH5AzfgECEuCv+anAtudwL/7X7IrWOiJIFkTeUlX+shn1d0xuXPW2jZ1t/cdv2k8PizfuvvKF8qXSUQxlT3mqPFfOlb4CW69af7X+bv2j/qK+Un9Tf8+pd+8Ums+Vyqf+8S/5yPuw</latexit>

@ logqx(P)

@wi=

@ log�(w · x+ b)

@wi

=@ log 1

1+exp(-wT x-b)

@wi= -

@ log(1+ exp(-wTx- b))

@wi

= -@ log(1+ exp(-wTx- b))

@(1+ exp(-wTx- b))

@(1+ exp(-wTx- b))

@wi

= -1

ln 2

1

(1+ exp(-wTx- b))

@ exp(-wTx- b)

@wi

= -1

(1+ exp(-wTx- b))

@ exp(-wTx- b)

@(-wTx- b)

@(-wTx- b)

@wi

= -exp(-wTx- b)

(1+ exp(-wTx- b))·-xi = (1- �(wTx+ b))xi

= qx(N)xi<latexit sha1_base64="2vcOtw/BEfXstT2FaWDEzowiKeA=">AAALF3icrVbNcttEHFdrKEVQaODIRYOHjkucjOTQJBwyU/JBe6BJyMRJZiLjWa3Xssb6YrWype7sg/ACvEZvDFeOPABXeAV2JdmWtKoLneqSzf4+9r+//bIVuk5EdP3PO3db771/74P7H6offfzgk08fbnx2FQUxhqgPAzfANxaIkOv4qE8c4qKbECPgWS66tqZHAr+eIRw5gX9J0hANPGD7ztiBgPCu4Ubr5pFmjjGA1JyGmukGtvbzMOmYFqTn7DHLewNI50OHMe2gzjUjx/ZARzBMa85MOAqImWwKuSXLTVN91OCR/WswamyaKAk7Wwu3ny7NRNvSFmaaXM1W3auz3uOx5FGU9LZGb6Kxle8bqdXKVoVNOX0Rken6PcZWkf2P8ddmuz6VdzzQ61d4tQhvWen6sf/DJMQG1raSocNNO4bAqjs8U2wuFYIoSjjITw2G9JQVncOHbX1bzz5NbhhFo60U3/lw496v5iiAsYd8Al0QRbeGHpIBBZg40EVMNeMIhQBOgY1uYzLeH1DHD2OCfMi0rzg2jl2NBJo45trIwQgSN+UNALHDHTQ4ATwiwi8DtWoVIR94KOqOZk4Y5c1oZucNAvhNMqBJdtOwBxUltTEIJw5MKqVR4EUeIBOpM0o9q9qJYhfhmVftFGXyImvMBGHoRCKEc57MWShur+gyOC/wSRpOkB8xGmOXlYUcQBijMRdmzQiROKTZbPiVOY0OCI5RVzSzvoNjgKcXaNTlPpWOajljNwCk2mXxafB0fDSHgecBf0TNkG9WghJCze42y7IroxeMUlMEZVnahYAr6GkJPWWsCp6UwBMOVtH+Eh1r/br0qgReSaNel9DrutSKS2gsobMSOpOc+dFZwXMJTkpoIqFpCU0l9GUJfSnnDPi2uO0NaL4W2aLSM9eZoWcYIZ/RNr9Qa3PBfL1vjapE7AHaNlgW9wiN+XubA14q6PT55YsfGD3a7z3Rd1mdYbkxWlD0nd0nR7pEsfNqCo6+v987lDgBBr69NDo+2f3OkI3CGIfukrS3t/P9t7JTilw3mC+djg6Pezv1ifFEqkUZe4au13cbhlJURSJa29CkaO0mejFMo8BqEuR5NvKnMv8ZBulr2EGT+yLmRkXYpFhk3qhImxSLBVgoqhK/IabVciw1tZmH4iCIXwuheDKAm/seI/6YYPSCH5AzfgECEuCv+anAtudwL/7X7IrWOiJIFkTeUlX+shn1d0xuXPW2jZ1t/cdv2k8PizfuvvKF8qXSUQxlT3mqPFfOlb4CW69af7X+bv2j/qK+Un9Tf8+pd+8Ums+Vyqf+8S/5yPuw</latexit>

@ logqx(P)

@wi=

@ log�(w · x+ b)

@wi

=@ log 1

1+exp(-wT x-b)

@wi= -

@ log(1+ exp(-wTx- b))

@wi

= -@ log(1+ exp(-wTx- b))

@(1+ exp(-wTx- b))

@(1+ exp(-wTx- b))

@wi

= -1

ln 2

1

(1+ exp(-wTx- b))

@ exp(-wTx- b)

@wi

= -1

(1+ exp(-wTx- b))

@ exp(-wTx- b)

@(-wTx- b)

@(-wTx- b)

@wi

= -exp(-wTx- b)

(1+ exp(-wTx- b))·-xi = (1- �(wTx+ b))xi

= qx(N)xi<latexit sha1_base64="2vcOtw/BEfXstT2FaWDEzowiKeA=">AAALF3icrVbNcttEHFdrKEVQaODIRYOHjkucjOTQJBwyU/JBe6BJyMRJZiLjWa3Xssb6YrWype7sg/ACvEZvDFeOPABXeAV2JdmWtKoLneqSzf4+9r+//bIVuk5EdP3PO3db771/74P7H6offfzgk08fbnx2FQUxhqgPAzfANxaIkOv4qE8c4qKbECPgWS66tqZHAr+eIRw5gX9J0hANPGD7ztiBgPCu4Ubr5pFmjjGA1JyGmukGtvbzMOmYFqTn7DHLewNI50OHMe2gzjUjx/ZARzBMa85MOAqImWwKuSXLTVN91OCR/WswamyaKAk7Wwu3ny7NRNvSFmaaXM1W3auz3uOx5FGU9LZGb6Kxle8bqdXKVoVNOX0Rken6PcZWkf2P8ddmuz6VdzzQ61d4tQhvWen6sf/DJMQG1raSocNNO4bAqjs8U2wuFYIoSjjITw2G9JQVncOHbX1bzz5NbhhFo60U3/lw496v5iiAsYd8Al0QRbeGHpIBBZg40EVMNeMIhQBOgY1uYzLeH1DHD2OCfMi0rzg2jl2NBJo45trIwQgSN+UNALHDHTQ4ATwiwi8DtWoVIR94KOqOZk4Y5c1oZucNAvhNMqBJdtOwBxUltTEIJw5MKqVR4EUeIBOpM0o9q9qJYhfhmVftFGXyImvMBGHoRCKEc57MWShur+gyOC/wSRpOkB8xGmOXlYUcQBijMRdmzQiROKTZbPiVOY0OCI5RVzSzvoNjgKcXaNTlPpWOajljNwCk2mXxafB0fDSHgecBf0TNkG9WghJCze42y7IroxeMUlMEZVnahYAr6GkJPWWsCp6UwBMOVtH+Eh1r/br0qgReSaNel9DrutSKS2gsobMSOpOc+dFZwXMJTkpoIqFpCU0l9GUJfSnnDPi2uO0NaL4W2aLSM9eZoWcYIZ/RNr9Qa3PBfL1vjapE7AHaNlgW9wiN+XubA14q6PT55YsfGD3a7z3Rd1mdYbkxWlD0nd0nR7pEsfNqCo6+v987lDgBBr69NDo+2f3OkI3CGIfukrS3t/P9t7JTilw3mC+djg6Pezv1ifFEqkUZe4au13cbhlJURSJa29CkaO0mejFMo8BqEuR5NvKnMv8ZBulr2EGT+yLmRkXYpFhk3qhImxSLBVgoqhK/IabVciw1tZmH4iCIXwuheDKAm/seI/6YYPSCH5AzfgECEuCv+anAtudwL/7X7IrWOiJIFkTeUlX+shn1d0xuXPW2jZ1t/cdv2k8PizfuvvKF8qXSUQxlT3mqPFfOlb4CW69af7X+bv2j/qK+Un9Tf8+pd+8Ums+Vyqf+8S/5yPuw</latexit>

Let’sworkthederivativeforoneoftheweights.

dlogb(x)/dx=(1/(lnb))(1/x)

81

$$
\begin{aligned}
\frac{\partial \log q_x(P)}{\partial w_i}
&= \frac{\partial \log \sigma(w \cdot x + b)}{\partial w_i} \\
&= \frac{\partial \log \frac{1}{1 + \exp(-w^T x - b)}}{\partial w_i}
 = -\frac{\partial \log\left(1 + \exp(-w^T x - b)\right)}{\partial w_i} \\
&= -\frac{\partial \log\left(1 + \exp(-w^T x - b)\right)}{\partial \left(1 + \exp(-w^T x - b)\right)} \cdot \frac{\partial \left(1 + \exp(-w^T x - b)\right)}{\partial w_i} \\
&= -\frac{1}{\ln 2} \cdot \frac{1}{1 + \exp(-w^T x - b)} \cdot \frac{\partial \exp(-w^T x - b)}{\partial w_i} \\
&= -\frac{1}{1 + \exp(-w^T x - b)} \cdot \frac{\partial \exp(-w^T x - b)}{\partial (-w^T x - b)} \cdot \frac{\partial (-w^T x - b)}{\partial w_i} \\
&= -\frac{\exp(-w^T x - b)}{1 + \exp(-w^T x - b)} \cdot (-x_i) = \left(1 - \sigma(w^T x + b)\right) x_i \\
&= q_x(N)\, x_i
\end{aligned}
$$

Note that despite the intimidating formulas in the middle, the result is actually very simple. This is one of the properties of the logistic sigmoid: it tends to cancel itself out when the derivative is taken.

We ignore the constant multiplier (1/ln 2) in the fourth line, because it doesn’t change the direction of the gradient, only the magnitude. When we apply gradient descent we scale the gradient by a constant multiplier anyway, so we can ignore it. (Another option is to use the natural logarithm in the definition of the cross entropy.)
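A quick way to convince yourself of the result is to compare it with a finite-difference estimate. The sketch below is only an illustration: the values of `w`, `b` and `x` are arbitrary, the function names are our own, and it uses the natural logarithm so that the constant multiplier does not appear at all.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_q_pos(w, b, x):
    """Natural-log probability of the positive class: ln sigma(w.x + b)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return math.log(sigmoid(z))

def analytic_grad(w, b, x, i):
    """Result of the derivation: (1 - sigma(w.x + b)) * x_i = q_x(N) * x_i."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return (1.0 - sigmoid(z)) * x[i]

def numeric_grad(w, b, x, i, eps=1e-6):
    """Central finite difference of ln q_x(P) with respect to w_i."""
    w_hi = list(w)
    w_lo = list(w)
    w_hi[i] += eps
    w_lo[i] -= eps
    return (log_q_pos(w_hi, b, x) - log_q_pos(w_lo, b, x)) / (2 * eps)

# Arbitrary illustrative values, not taken from the lecture.
w, b, x = [0.5, -1.2], 0.3, [2.0, 1.0]
for i in range(len(w)):
    print(i, analytic_grad(w, b, x, i), numeric_grad(w, b, x, i))
```

The two columns should agree up to the accuracy of the finite-difference estimate.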


$$
\begin{aligned}
\frac{\partial\, \text{loss}(w, b)}{\partial w_i}
&= \frac{\partial \left( -\sum_{x \in X_P} \log q_x(P) - \sum_{x \in X_N} \log q_x(N) \right)}{\partial w_i} \\
&= -\sum_{x \in X_P} \frac{\partial \log q_x(P)}{\partial w_i} - \sum_{x \in X_N} \frac{\partial \log q_x(N)}{\partial w_i} \\
&= -\sum_{x \in X_P} q_x(N)\, x_i + \sum_{x \in X_N} q_x(P)\, x_i
\end{aligned}
$$

logistic regression

Use the sigmoid function to turn a linear classifier into a discriminative probabilistic classifier.

Use log loss. Maximise the log-likelihood of the data given the model

Derive the gradient and search for good weights. No analytical solution, but the problem is convex.
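To make the recipe concrete, here is a minimal sketch of gradient descent on the log loss, using the gradient derived earlier. Everything specific in it is an assumption for illustration: the toy dataset, the learning rate, the number of steps and the function names. The bias gradient is the same expression with x_i replaced by 1, and the natural logarithm is used throughout.

```python
import math

def sigmoid(z):
    # Split on the sign of z so that exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def score(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def log_loss(w, b, pos, neg):
    """-sum over positives of ln q_x(P), minus sum over negatives of ln q_x(N)."""
    loss = -sum(math.log(sigmoid(score(w, b, x))) for x in pos)
    loss -= sum(math.log(sigmoid(-score(w, b, x))) for x in neg)  # q_x(N) = sigma(-z)
    return loss

def gradient(w, b, pos, neg):
    """dloss/dw_i = -sum_pos q_x(N) x_i + sum_neg q_x(P) x_i; the bias uses x_i = 1."""
    gw, gb = [0.0] * len(w), 0.0
    for x in pos:
        q_neg = sigmoid(-score(w, b, x))
        gw = [g - q_neg * xi for g, xi in zip(gw, x)]
        gb -= q_neg
    for x in neg:
        q_pos = sigmoid(score(w, b, x))
        gw = [g + q_pos * xi for g, xi in zip(gw, x)]
        gb += q_pos
    return gw, gb

def fit(pos, neg, lr=0.1, steps=200):
    """Plain gradient descent from w = 0, b = 0."""
    w, b = [0.0] * len(pos[0]), 0.0
    for _ in range(steps):
        gw, gb = gradient(w, b, pos, neg)
        w = [wi - lr * gi for wi, gi in zip(w, gw)]
        b -= lr * gb
    return w, b

# Hypothetical, linearly separable toy data.
pos = [[2.0, 1.0], [3.0, 2.5], [2.5, 0.5]]
neg = [[-1.0, 0.0], [-2.0, -1.5], [0.0, -1.0]]
w, b = fit(pos, neg)
print(log_loss([0.0, 0.0], 0.0, pos, neg), log_loss(w, b, pos, neg))
```

Because the problem is convex, a small enough learning rate is all we need here; there are no local minima to get stuck in.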

Regression is a bit of a misnomer, since we’re building a classifier. I suppose the confusing terminology comes from the fact that we’re fitting a (curved) line through the probability values in the data.

Here is a 2D dataset that shows a common failure case for the least-squares classifier. The points at the top are so far away from the ideal decision boundary that they will have huge residuals under the least-squares model.

least-squares

Here is what the least-squares regression converges to. Clearly, this is not a satisfying solution for such an easily separable dataset. The blue points at the top are so far from the decision boundary that their residuals dominate the loss.

In the Linear Models 1 lecture, we fixed one of the parameters to 1, so that we could plot the loss surface. This time, we’re optimizing all three parameters.

Least-squares classifier

Here is a 1D view of a similar situation.

If we want the decision boundary to be between the red and blue classes, the residuals for the far-away blue points become very big.

The logistic model doesn’t have this problem. If the model fits well around the decision boundary, it doesn’t have to worry at all about points that are far away (if they’re on the right side of the boundary).
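The difference is easy to see numerically. The sketch below looks at a single positive point whose linear output z = w·x + b grows as the point moves further onto the correct side of the boundary. It assumes the least-squares classifier uses a target of +1 for the positive class; the lecture’s exact target encoding may differ, and the function names are ours.

```python
import math

def squared_residual(z, target=1.0):
    """Least-squares penalty for a positive point with linear output z."""
    return (z - target) ** 2

def log_loss_positive(z):
    """Logistic penalty for a positive point: -ln sigma(z) = ln(1 + exp(-z))."""
    return math.log1p(math.exp(-z))

# Larger z means further away from the boundary, on the correct side.
for z in [1.0, 5.0, 50.0]:
    print(z, squared_residual(z), log_loss_positive(z))
```

The squared residual keeps growing with the distance, while the log loss shrinks towards zero, so far-away points that are classified correctly barely influence the logistic model.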

logistic

And here is the logistic regression classifier.

logistic

And here is the probability function (blue is high probability of positive, red is high probability of negative).

logistic

next lecture: maximum margin classifier

Note that for such well-separable classes, there are many suitable classifiers, and logistic regression has no reason to prefer one over the other (all points are assigned the correct probability very close to 1). We’ll see a solution to this problem next lecture, when we meet our final loss function: the SVM loss.

summary: logistic regression

Use logistic sigmoid to provide class probabilities from a linear classifier

Use -log p(class|features) as a loss function

Points near the decision boundary get more influence than points far away. The opposite is true for the least squares classifier.

Log loss generalises naturally to multiclass classification (more next week).


Probabilistic Models Part 5: Information Theory


This lecture will be all about how to use the mechanisms of probability to create a classifier.

information theory


aka: what does - log p(x) mean?

Information theory is all about the relation between encoding information and probability theory.

Imagine you’re on holiday, and you’ve brought your travel Monopoly. Unfortunately, the dice have gone missing. You do, however, have a coin with you. Can you use the coin flip to simulate the throw of a six-sided die?

[Figure: tree of coin flips with branches labelled heads and tails.]

For a four-sided die, the solution is easy. We flip the coin twice, and assign a number to each possible outcome.

source: http://www.midlamminiatures.co.uk/blackpolydice/D4Black.html

A six-sided die is more tricky. We’ll show the solution for three “sides” (you can just add another coin flip to decide whether it’ll be 1, 2, 3 or 4, 5, 6).

The trick is to assign the fourth outcome to a “reset”. If you throw two heads in a row, you just start again. Theoretically you could be coin flipping forever, but the probability of resetting more than five times is already less than one in one thousand.
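The reset trick can be sketched in a few lines. The mapping of flip pairs to sides below is one arbitrary choice (only “two heads means reset” comes from the text), and the function names and the seed are our own.

```python
import random

def flip(rng):
    """One fair coin flip: 'H' or 'T'."""
    return rng.choice("HT")

def three_sided(rng):
    """Flip twice; three outcomes map to sides 1..3, two heads means start again."""
    outcomes = {"HT": 1, "TH": 2, "TT": 3}  # "HH" is the reset
    while True:
        pair = flip(rng) + flip(rng)
        if pair in outcomes:
            return outcomes[pair]

def six_sided(rng):
    """One extra flip chooses between the blocks {1, 2, 3} and {4, 5, 6}."""
    offset = 0 if flip(rng) == "H" else 3
    return offset + three_sided(rng)

rng = random.Random(0)
counts = [0] * 6
for _ in range(60000):
    counts[six_sided(rng) - 1] += 1
print(counts)  # each side should come up roughly 10000 times

# Each round resets with probability 1/4, so more than five resets
# in a row has probability (1/4)**6.
print(0.25 ** 6)
```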

For now let’s stick with trees where each outcome is represented by only one leaf (and accept that the six-sided die cannot be perfectly modelled with a coin). What distributions can we model with a coin in this way, if we require each outcome to be represented by only one leaf?

[Figure: two coin-flip trees, each shown with the distribution it defines over the outcomes 1, 2, 3, 4, …]

Here are two examples: an exponentially decaying distribution, and a (roughly) polynomially decaying one.

prefix-free code

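[Figure: a prefix-free code tree with leaves a to f and branches labelled 0 and 1, together with an encoded bit string: 001001001101011000101010001001010001110011101100]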

These kinds of trees are called prefix-free trees, because they assign a prefix-free code to the set of outcomes (we just replace heads and tails with zeros and ones). The benefit is that if we want to encode a sequence of these outcomes, we can just stick the codes one after another and we won’t need any delimiters. A decoder will know exactly where each codeword ends and the next begins.
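To see why no delimiters are needed, here is a small sketch of encoding and decoding with a prefix-free code. The code table is a hypothetical four-symbol example, not the tree from the slide, and the function names are ours.

```python
# Hypothetical prefix-free code: no codeword is a prefix of another.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}

def is_prefix_free(code):
    """True if no codeword is a proper prefix of a different codeword."""
    words = list(code.values())
    return not any(u != v and v.startswith(u) for u in words for v in words)

def encode(symbols, code=CODE):
    """Stick the codewords one after another, without delimiters."""
    return "".join(code[s] for s in symbols)

def decode(bits, code=CODE):
    """Read bits left to right and emit a symbol whenever a codeword completes."""
    inverse = {word: sym for sym, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(out)

message = "abacd"
bits = encode(message)
print(bits, decode(bits))
```

The decoder can commit to a symbol as soon as its buffer matches a codeword, precisely because no longer codeword starts with the same bits.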

codelengths and probabilities

L(x): length of code for x

[Figure: the prefix-free code tree with leaves a to f and branches labelled 0 and 1.]

$$p(x) = \frac{1}{2} \times \ldots \times \frac{1}{2} = \left(\frac{1}{2}\right)^{L(x)} = 2^{-L(x)}$$

$$L(x) = -\log_2 p(x)$$

Every prefix tree defines a probability distribution and a code. What about the other way around? Can we find a tree for any given probability distribution?

We already saw that some distributions (like a six-sided die) cannot be represented exactly. But how close can we get?
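As a small illustration of the first direction, the sketch below reads off the distribution p(x) = 2^(-L(x)) from a code, and shows why a fair six-sided die has no exact tree: its ideal codelength is not a whole number of bits. The code table is again a hypothetical example and the names are ours.

```python
import math

def implied_distribution(code):
    """Each codeword of length L(x) is reached with probability 2 ** -L(x)."""
    return {sym: 2.0 ** -len(word) for sym, word in code.items()}

# Hypothetical prefix-free code whose tree has no unused leaves.
code = {"a": "0", "b": "10", "c": "110", "d": "111"}
p = implied_distribution(code)
print(p, sum(p.values()))

# For a fair six-sided die, -log2(1/6) lies strictly between 2 and 3 bits.
ideal_die_length = -math.log2(1 / 6)
print(ideal_die_length)
```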

arithmetic coding

There exists an algorithm which provides, for any p(x), a prefix-free code L such that

$$\left| -\log_2 p(x) - L(x) \right| \leq 1$$

Thus, if we ignore this minor inaccuracy, or if we allow L(x) to take non-integer values, we may equate codes with probability distributions.

It turns out we can model any distribution in such a way that the biggest difference in codelength is no larger than a bit.

If we handwave this difference, we can equate codes with probability distributions: every code gives us a distribution and every distribution gives us a code. The higher the probability of an outcome, the shorter its codelength.
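The one-bit bound can be checked with a much simpler construction than arithmetic coding. The sketch below does not implement arithmetic coding; it swaps in the rounded-up lengths L(x) = ceil(-log2 p(x)) (often called Shannon code lengths) and checks two things: the lengths are within one bit of -log2 p(x), and they satisfy the Kraft inequality, which guarantees that a prefix-free code with these lengths exists. The function names are ours.

```python
import math

def shannon_lengths(p):
    """Round the ideal codelength -log2 p(x) up to a whole number of bits."""
    return {x: math.ceil(-math.log2(px)) for x, px in p.items() if px > 0}

def kraft_sum(lengths):
    """A prefix-free code with these lengths exists iff this sum is at most 1."""
    return sum(2.0 ** -length for length in lengths.values())

die = {side: 1 / 6 for side in range(1, 7)}
lengths = shannon_lengths(die)
gaps = {x: lengths[x] + math.log2(die[x]) for x in die}  # L(x) - (-log2 p(x))
print(lengths, kraft_sum(lengths), gaps)
```

For the fair die every side gets three bits, which wastes a little under half a bit per throw compared to the ideal codelength.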

entropy

p(X=x): data source

If we encode X with the ideal code for p, what is our expected codelength?

$$
\begin{aligned}
H(p) &= \mathbb{E}_p\, L(x) \\
&= \sum_{x \in X} p(x) L(x) \\
&= -\sum_{x \in X} p(x) \log p(x)
\end{aligned}
$$

The entropy of a distribution is the expected codelength of an element sampled from that distribution.

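[Figure: three bar charts over the outcomes a, b, c, d. Left: uniform, with each bar at 1/4, H(p) = 2 bits. Middle: non-uniform, with the largest bar at 1/2, H(p) = 1.75 bits. Right: a single bar at 1, H(p) = 0 bits.]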

The more uniform our distribution is (the more unsure we are), the higher the entropy.

In the middle, we know something about our distribution, for instance that a is very likely, so we can make the codeword for a a little shorter, reducing the expected codelength (the entropy). On the left, we have no such options, so the entropy is maximal (equal to log2 N).
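The three entropies are easy to reproduce. In the sketch below, the uniform and the certain distribution follow directly from the slide; the middle distribution (1/2, 1/4, 1/8, 1/8) is an assumption on our part, chosen because it gives exactly 1.75 bits. The entropy is computed in bits, with the convention that outcomes with probability zero contribute nothing.

```python
import math

def entropy(p):
    """Expected codelength in bits: sum of p(x) * log2(1 / p(x))."""
    return sum(px * math.log2(1 / px) for px in p if px > 0)

uniform = [1 / 4, 1 / 4, 1 / 4, 1 / 4]
skewed = [1 / 2, 1 / 4, 1 / 8, 1 / 8]   # assumed shape of the middle distribution
certain = [1, 0, 0, 0]

for name, p in [("uniform", uniform), ("skewed", skewed), ("certain", certain)]:
    print(name, entropy(p))

# The uniform case attains the maximum, log2 N, for N = 4 outcomes.
print(math.log2(len(uniform)))
```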

cross entropy

p(X): source of our data

q(X): our model

Cross entropy: expected codelength if we use q, but the data comes from p.

$$
\begin{aligned}
H(p, q) &= \mathbb{E}_p\, L^q(x) \\
&= -\sum_{x \in X} p(x) \log q(x)
\end{aligned}
$$

What if we don’t use the code that corresponds to the source of our data p to encode our data, but some other code based on distribution q? What is our expected codelength then? This is called the cross entropy.

The cross entropy is minimal when p = q (and equal to the entropy). We can conclude two things:

• The code corresponding to p provides the best expected codelength.

• The cross entropy is a good way to quantify the distance between two distributions (because it’s minimal when the two are the same).
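A small numerical check of the claim that the cross entropy is smallest when q equals p. The distributions are made up for illustration, the names are ours, and the sketch assumes q(x) > 0 wherever p(x) > 0.

```python
import math

def cross_entropy(p, q):
    """Expected codelength in bits when data from p is encoded with the code for q."""
    return sum(px * math.log2(1 / qx) for px, qx in zip(p, q) if px > 0)

p = [1 / 2, 1 / 4, 1 / 8, 1 / 8]  # hypothetical data source
candidates = {
    "q = p": p,
    "uniform": [1 / 4, 1 / 4, 1 / 4, 1 / 4],
    "reversed": [1 / 8, 1 / 8, 1 / 4, 1 / 2],
}
for name, q in candidates.items():
    print(name, cross_entropy(p, q))
```

Using the code for q = p costs the entropy of p; every other candidate costs more bits per outcome on average.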

Kullback-Leibler divergence

Expected difference in codelength between p and q.

Or, difference in expected codelength.

$$
\begin{aligned}
KL(p, q) &= H(p, q) - H(p) \\
&= -\sum_{x \in X} p(x) \log \frac{q(x)}{p(x)}
\end{aligned}
$$

The cross entropy is a nice measure, but it’s not zero when p and q are equal. Instead, it’s equal to the entropy of p.

To get a measure that is zero when the two are equal, we can just subtract the entropy of p. This is called the Kullback-Leibler (KL) divergence. The KL divergence is zero when our model is perfect.
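The sketch below checks both readings of the KL divergence on a made-up pair of distributions: computed directly from the ratio q(x)/p(x), and as cross entropy minus entropy. The names and the example values are our own, and we again assume q(x) > 0 wherever p(x) > 0.

```python
import math

def entropy(p):
    return sum(px * math.log2(1 / px) for px in p if px > 0)

def cross_entropy(p, q):
    return sum(px * math.log2(1 / qx) for px, qx in zip(p, q) if px > 0)

def kl_divergence(p, q):
    """-sum of p(x) * log2(q(x) / p(x)), written as sum of p(x) * log2(p(x) / q(x))."""
    return sum(px * math.log2(px / qx) for px, qx in zip(p, q) if px > 0)

p = [1 / 2, 1 / 4, 1 / 8, 1 / 8]   # hypothetical data source
q = [1 / 4, 1 / 4, 1 / 4, 1 / 4]   # hypothetical model

print(kl_divergence(p, q), cross_entropy(p, q) - entropy(p))
print(kl_divergence(p, p))  # zero when the model is perfect
```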

104

\[
\begin{aligned}
\text{loss}(q) &= \sum_{x \in X} H(p_x, q_x) \\
&= -\sum_{x \in X} \bigl( p_x(P) \log q_x(P) + p_x(N) \log q_x(N) \bigr) \\
&= -\sum_{x \in X_P} \log q_x(P) - \sum_{x \in X_N} \log q_x(N)
\end{aligned}
\]

log loss is cross-entropy loss

This way, we can prove that cross-entropy loss is the same as log loss. And indeed this loss function is often called cross-entropy loss.

This is not just a curiosity tying information theory to machine learning; it has practical consequences. It tells us what we should do in the case where the dataset actually provides class probabilities instead of class labels. In that case, we should minimize the cross entropy between the predicted distribution and the one given in the data.
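To make the equivalence concrete, here is a small sketch for a two-class problem with classes P and N. It assumes natural logarithms and uses invented predictions and labels; `cross_entropy`, `log_loss` and `dataset_loss` are illustrative names rather than functions from any library.

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum_c p(c) log q(c), skipping classes with p(c) = 0."""
    return -sum(pc * math.log(qc) for pc, qc in zip(p, q) if pc > 0)

def log_loss(true_class, q):
    """Negative log-probability that the model assigns to the true class."""
    return -math.log(q[true_class])

def dataset_loss(targets, predictions):
    """Sum of per-instance cross entropies between target and predicted distributions."""
    return sum(cross_entropy(p, q) for p, q in zip(targets, predictions))

# Hypothetical model outputs q_x over the classes (P, N) for three instances.
predictions = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]

# Case 1: hard class labels, written as indices and as one-hot distributions.
hard_labels = [0, 1, 0]
one_hot = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
hard_loss = sum(log_loss(c, q) for c, q in zip(hard_labels, predictions))
one_hot_loss = dataset_loss(one_hot, predictions)
print("log loss:                 ", hard_loss)
print("cross entropy (one-hot):  ", one_hot_loss)

# Case 2: the dataset provides class probabilities instead of labels.
soft_targets = [[0.7, 0.3], [0.1, 0.9], [0.5, 0.5]]
soft_loss = dataset_loss(soft_targets, predictions)
print("cross entropy (soft):     ", soft_loss)
```

With one-hot targets only the true class contributes to each term, which is why the two losses coincide. With soft targets the same function still applies, and each class is weighted by the probability given in the data.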

the Minimum Description Length Principle

A model that allows us to compress the data is a model that has learned something about the data.

The better the compression, the more we’ve learned.

Balance model complexity by storing the model, and then the data given the model.

105

This leads us to the minimum description length principle, which informally states that compression and learning are strongly related.

106

[Diagram: sender, data, receiver]

The best way to think of MDL model selection is in a sender and receiver framework. The sender is going to see some data, and is going to send it to the receiver. Before observing the data, the sender and receiver are allowed to come up with any scheme they like. But afterwards, the data must be sent using the scheme, and in a way that is perfectly decodable by the receiver without further communication.

We usually assume that there is some language to describe a model that the sender chooses. The sender describes the model and then the data given the model.

107

We won’t go into the technical details of MDL, but here we see a broad illustration of how MDL can balance over- and underfitting in a regression problem.

In a regression (or classification) problem, we can take the instances and their features as fixed: both the sender and receiver have access to them. The data that we want to send over the wire is the target labels; in this case the numbers. How you encode a continuous value is a technical matter that requires some assumptions. For now we can just discretize the range of outputs, and assume that we are using a code in which bigger numbers cost more bits. The same goes for the parameters of the model: these are also continuous values, but we’ll discretize them somehow. Here we only need to assume that using more parameters in your model takes more bits.

Once we’ve chosen a model, the receiver can reconstruct the data if we send the model parameters and the residual values. On the left, we see that if we pick a linear model we have many large residuals to transmit. On the other hand, our model is described by only two parameters, so we can transmit that part very cheaply. If we make our model a parabola, we require three numbers to transmit it, so that part of our message gets bigger, but because the model fits so much better, the residuals are much smaller, and the overall length of our message gets much smaller.

If we make our model a 15th-order polynomial, we get a slightly tighter fit, but not by much, and the price we pay in storing the 16 numbers required to describe our model means that our message length is bigger than for the parabola. So overall we prefer the model in the middle, according to the minimum description length principle.
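The trade-off between model cost and residual cost can be reproduced on a toy problem. The sketch below is one possible setup, not the scheme used for the slide: it assumes integer targets, polynomial coefficients stored exactly as fractions, and an Elias gamma code so that bigger numbers cost more bits. The data (a parabola plus small hand-picked noise on seven points) is invented, and because seven points are interpolated exactly by a degree-6 polynomial, degree 6 stands in for the 15th-order model.

```python
from fractions import Fraction

def gamma_bits(k):
    """Length in bits of the Elias gamma code for a positive integer k."""
    return 2 * k.bit_length() - 1

def int_bits(n):
    """Bits for a signed integer: map it to a positive integer, then gamma-code it."""
    return gamma_bits(2 * n if n > 0 else 1 - 2 * n)

def rational_bits(r):
    """Bits for a fraction: signed numerator plus positive denominator."""
    return int_bits(r.numerator) + gamma_bits(r.denominator)

def fit_polynomial(xs, ys, degree):
    """Exact least-squares fit via the normal equations, in rational arithmetic."""
    m = degree + 1
    # Augmented matrix [A^T A | A^T y], where A is the Vandermonde matrix of xs.
    rows = [[Fraction(sum(x ** (i + j) for x in xs)) for j in range(m)]
            + [Fraction(sum(y * x ** i for x, y in zip(xs, ys)))]
            for i in range(m)]
    for col in range(m):  # Gauss-Jordan elimination
        pivot = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        for r in range(m):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[m] for row in rows]  # coefficients c_0, c_1, ..., c_degree

def description_length(xs, ys, degree):
    """Two-part code: (bits for the model, bits for the data given the model)."""
    coeffs = fit_polynomial(xs, ys, degree)
    model_bits = sum(rational_bits(c) for c in coeffs)
    # The receiver recomputes the rounded predictions and adds the residuals.
    predictions = [round(sum(c * x ** i for i, c in enumerate(coeffs))) for x in xs]
    data_bits = sum(int_bits(y - p) for y, p in zip(ys, predictions))
    return model_bits, data_bits

# Hypothetical data: y = x^2 plus small integer noise.
xs = [-3, -2, -1, 0, 1, 2, 3]
noise = [0, 1, -1, 0, 1, -1, 0]
ys = [x * x + e for x, e in zip(xs, noise)]

lengths = {d: description_length(xs, ys, d) for d in (1, 2, 6)}
for d, (model_bits, data_bits) in lengths.items():
    print(f"degree {d}: model {model_bits} + data {data_bits} = {model_bits + data_bits} bits")
best = min(lengths, key=lambda d: sum(lengths[d]))
print("preferred degree:", best)
```

On this toy data the line is cheap to describe but leaves large residuals, the degree-6 polynomial has zero residuals but expensive coefficients, and the parabola gives the shortest total message. The exact bit counts depend entirely on the assumed code, so only the ordering is meaningful.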

108

\[
\begin{aligned}
\operatorname*{argmax}_M \; p(M)\, p(X \mid M)
&= \operatorname*{argmin}_M \; -\log \bigl[ p(M)\, p(X \mid M) \bigr] \\
&= \operatorname*{argmin}_M \; -\log p(M) - \log p(X \mid M) \\
&= \operatorname*{argmin}_M \; L(M) + L_M(X)
\end{aligned}
\]

L(M): cost of describing the model

L_M(X): cost of describing the data given the model

There are many correspondences between using MDL and using Bayes. In fact they are often perspectives on the same thing. For instance, if we choose the model that maximizes the posterior probability, we can rewrite, by introducing a logarithm, to show that we are also choosing the model that minimizes the code length.
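A quick numeric check of this correspondence, as a sketch: the three candidate models and their prior and likelihood values below are invented for illustration, and code lengths are taken to be negative base-2 logarithms of probabilities.

```python
import math

# Hypothetical candidates: name -> (prior p(M), likelihood p(X | M)).
models = {
    "line": (0.5, 0.001),
    "parabola": (0.3, 0.02),
    "polynomial": (0.2, 0.021),
}

def posterior_score(name):
    """Unnormalised posterior p(M) p(X | M)."""
    prior, likelihood = models[name]
    return prior * likelihood

def code_length(name):
    """L(M) + L_M(X) in bits, with L(M) = -log2 p(M) and L_M(X) = -log2 p(X | M)."""
    prior, likelihood = models[name]
    return -math.log2(prior) - math.log2(likelihood)

map_choice = max(models, key=posterior_score)
mdl_choice = min(models, key=code_length)

for name in models:
    print(f"{name}: score {posterior_score(name):.4f}, code length {code_length(name):.2f} bits")
print("maximum posterior:  ", map_choice)
print("minimum code length:", mdl_choice)
```

Because the negative logarithm is strictly decreasing, the model with the largest product of prior and likelihood is always the one with the smallest total code length, whatever numbers are plugged in.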

109

encoding a simplicity assumption

When we talked about the problem of induction and the no free lunch theorem, we noted that some assumption about the source of our data was necessary to make learning possible at all. Some aspects of our problem we need to assume before we start learning.

You can think of MDL as encoding a simplicity assumption. We prefer simple solutions over complex ones, and we define a simple solution as one that compresses the data well. The assumption we make about the universe is that it generated compressible data for us.
