Naïve Bayes - SNS Courseware

Naïve Bayes

•  Naïve Bayes is a probabilistic classification method based on Bayes’ theorem (or Bayes’ law) with a few tweaks.

•  Bayes’ theorem gives the relationship between the probabilities of two events and their conditional probabilities.

•  Bayes’ law is named after the English mathematician Thomas Bayes.

•  A naïve Bayes classifier assumes that the presence or absence of a particular feature of a class is unrelated to the presence or absence of other features.

CS8091/Unit2/Naivebayes 2

Applications of Naïve Bayes Classifier

•  Naïve Bayes classifiers are easy to implement and can execute efficiently even without prior knowledge of the data. They are among the most popular algorithms for classifying text documents.

•  Spam filtering is a classic use case of naïve Bayes text classification.

•  Bayesian spam filtering has become a popular mechanism to distinguish spam e-mail from legitimate e-mail.

•  Naïve Bayes classifiers can also be used for fraud detection.

•  In the domain of auto insurance, for example, based on a training set with attributes such as driver’s rating, vehicle age, vehicle price, historical claims by the policy holder, police report status, and claim genuineness, naïve Bayes can provide probability-based classification of whether a new claim is genuine.


Naïve Bayes Theorem

•  The conditional probability of event C occurring, given that event A has already occurred, is denoted as P(C|A), which can be found using the formula

$$P(C \mid A) = \frac{P(A \cap C)}{P(A)}$$

•  The following formula can be obtained with some minor algebra and substitution of the conditional probability:

$$P(C \mid A) = \frac{P(A \mid C)\, P(C)}{P(A)}$$

•  where C is the class label and A is the observed attributes.

•  The second formula is the most common form of Bayes’ theorem.
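The equivalence between the definition of conditional probability, P(C|A) = P(A ∩ C) / P(A), and the common form of Bayes’ theorem, P(C|A) = P(A|C) P(C) / P(A), can be checked numerically. The sketch below is only an illustration: the joint probabilities are made-up values chosen so that they sum to 1, and the variable names are not from the slides.

```python
from fractions import Fraction

# Hypothetical joint distribution over (C, A); values are invented for illustration.
joint = {
    (True, True): Fraction(3, 10),    # P(C and A)
    (True, False): Fraction(1, 10),   # P(C and not A)
    (False, True): Fraction(1, 5),    # P(not C and A)
    (False, False): Fraction(2, 5),   # P(not C and not A)
}

# Marginal probabilities obtained by summing the joint table.
p_a = joint[(True, True)] + joint[(False, True)]   # P(A)
p_c = joint[(True, True)] + joint[(True, False)]   # P(C)

# Definition of conditional probability: P(C|A) = P(A and C) / P(A)
p_c_given_a_definition = joint[(True, True)] / p_a

# Bayes' theorem: P(C|A) = P(A|C) * P(C) / P(A)
p_a_given_c = joint[(True, True)] / p_c
p_c_given_a_bayes = p_a_given_c * p_c / p_a

print(p_c_given_a_definition, p_c_given_a_bayes)  # 3/5 3/5
```

Exact fractions are used so that the two routes can be compared with `==` rather than with a floating-point tolerance.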


•  Mathematically, Bayes’ theorem gives the relationship between the probabilities of C and A, P(C) and P(A), and the conditional probabilities of C given A and A given C, namely P(C|A) and P(A|C).

Example

•  John flies frequently and likes to upgrade his seat to first class. He has determined that if he checks in for his flight at least two hours early, the probability that he will get an upgrade is 0.75; otherwise, the probability that he will get an upgrade is 0.35. With his busy schedule, he checks in at least two hours before his flight only 40% of the time. Suppose John did not receive an upgrade on his most recent attempt. What is the probability that he did not arrive two hours early?


•  Let C = {John arrived at least two hours early}, and A = {John received an upgrade}, then ¬C = {John did not arrive two hours early}, and ¬A = {John did not receive an upgrade}.

•  John checked in at least two hours early only 40% of the time, or P(C) = 0.4. Therefore, $P(\neg C) = 1 - P(C) = 0.6$.

•  The probability that John received an upgrade given that he checked in early is 0.75, or $P(A \mid C) = 0.75$.

•  The probability that John received an upgrade given that he did not arrive two hours early is 0.35, or $P(A \mid \neg C) = 0.35$.

•  Therefore, $P(\neg A \mid \neg C) = 1 - P(A \mid \neg C) = 0.65$.


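•  The probability that John received an upgrade, P(A), can be computed as shown:

$$P(A) = P(A \cap C) + P(A \cap \neg C) = P(C)\,P(A \mid C) + P(\neg C)\,P(A \mid \neg C) = 0.4 \times 0.75 + 0.6 \times 0.35 = 0.51$$

•  Thus, the probability that John did not receive an upgrade is $P(\neg A) = 1 - P(A) = 0.49$.

•  Using Bayes’ theorem, the probability that John did not arrive two hours early given that he did not receive his upgrade is

$$P(\neg C \mid \neg A) = \frac{P(\neg A \mid \neg C)\,P(\neg C)}{P(\neg A)} = \frac{0.65 \times 0.6}{0.49} \approx 0.796$$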
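The arithmetic of John’s example can be reproduced in a few lines. This is a minimal sketch that uses only the three numbers given in the problem statement (0.4, 0.75, and 0.35); the variable names are chosen here for readability and are not part of the original slides.

```python
# Values stated in the problem.
p_c = 0.4               # P(C): John checks in at least two hours early
p_a_given_c = 0.75      # P(A|C): upgrade given early check-in
p_a_given_not_c = 0.35  # P(A|not C): upgrade given late check-in

# Complements.
p_not_c = 1 - p_c                          # P(not C) = 0.6
p_not_a_given_not_c = 1 - p_a_given_not_c  # P(not A | not C) = 0.65

# Law of total probability for P(A), then its complement.
p_a = p_c * p_a_given_c + p_not_c * p_a_given_not_c  # 0.51
p_not_a = 1 - p_a                                    # 0.49

# Bayes' theorem applied to the events "not C" and "not A".
p_not_c_given_not_a = p_not_a_given_not_c * p_not_c / p_not_a

print(round(p_not_c_given_not_a, 3))  # 0.796
```

So, given that John missed the upgrade, there is roughly an 80% chance that he did not check in two hours early.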


Example 2

•  Assume that a patient named Mary took a lab test for a certain disease and the result came back positive. The test returns a positive result in 95% of the cases in which the disease is actually present, and it returns a positive result in 6% of the cases in which the disease is not present. Furthermore, 1% of the entire population has this disease. What is the probability that Mary actually has the disease, given that the test is positive?

•  Let C = {having the disease} and A = {testing positive}. The goal is to solve the probability of having the disease, given that Mary has a positive test result, P(C|A).

•  From the problem description, $P(C) = 0.01$, $P(\neg C) = 0.99$, $P(A \mid C) = 0.95$, and $P(A \mid \neg C) = 0.06$.


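•  Bayes’ theorem defines $P(C \mid A) = P(A \mid C)\,P(C) / P(A)$.

•  The probability of testing positive, that is P(A), needs to be computed first. That computation is shown below:

$$P(A) = P(C)\,P(A \mid C) + P(\neg C)\,P(A \mid \neg C) = 0.01 \times 0.95 + 0.99 \times 0.06 = 0.0689$$

•  According to Bayes’ theorem, the probability of having the disease, given that Mary has a positive test result, is

$$P(C \mid A) = \frac{P(A \mid C)\,P(C)}{P(A)} = \frac{0.95 \times 0.01}{0.0689} \approx 0.1379$$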
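One way to sanity-check Mary’s result is to restate the percentages as head counts in an imagined population. The sketch below assumes a hypothetical cohort of 100,000 people (a size picked here only so the counts come out as whole numbers) and applies the 1%, 95%, and 6% rates from the problem statement.

```python
from fractions import Fraction

population = 100_000                   # hypothetical cohort size (assumption)
prevalence = Fraction(1, 100)          # 1% of the population has the disease
positive_if_diseased = Fraction(95, 100)  # test is positive in 95% of diseased cases
positive_if_healthy = Fraction(6, 100)    # test is positive in 6% of healthy cases

diseased = population * prevalence                  # 1000 people
healthy = population - diseased                     # 99000 people
true_positives = diseased * positive_if_diseased    # 950 people
false_positives = healthy * positive_if_healthy     # 5940 people

# Among everyone who tests positive, the share who actually has the disease.
p_disease_given_positive = true_positives / (true_positives + false_positives)

print(round(float(p_disease_given_positive), 4))  # 0.1379
```

Because healthy people vastly outnumber diseased people, the 5,940 false positives swamp the 950 true positives, which is why a positive test implies only about a 14% chance of disease.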


•  A more general form of Bayes’ theorem assigns a classified label to an object with multiple attributes $A = \{a_1, a_2, \ldots, a_m\}$ such that the label corresponds to the largest value of $P(C_i \mid A)$.

•  The probability that a set of attribute values A should be labeled with a classification label Ci equals the probability that the set of variables $a_1, a_2, \ldots, a_m$ given Ci is true, times the probability of Ci, divided by the probability of $a_1, a_2, \ldots, a_m$.

•  Mathematically, this is

$$P(C_i \mid A) = \frac{P(a_1, a_2, \ldots, a_m \mid C_i)\, P(C_i)}{P(a_1, a_2, \ldots, a_m)}$$
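Picking the label with the largest posterior can be sketched as a small counting-based classifier. The code below is an illustration under stated assumptions, not the course’s implementation: the training rows, attribute values, and the function names `train` and `classify` are hypothetical; the joint likelihood is replaced by a product of per-attribute probabilities, which is the naïve independence assumption described earlier; and the denominator is dropped because it is the same for every label and does not change which label scores highest. No smoothing is applied, so an attribute value never seen with a label drives that label’s score to zero.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Count label frequencies and per-attribute value frequencies for each label."""
    label_counts = Counter(labels)
    # feature_counts[label][j][value] = rows with this label whose j-th attribute is value
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            feature_counts[label][j][value] += 1
    return label_counts, feature_counts

def classify(row, label_counts, feature_counts):
    """Return the label with the largest score P(Ci) * product of P(aj | Ci)."""
    total = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        score = count / total  # estimate of P(Ci)
        for j, value in enumerate(row):
            score *= feature_counts[label][j][value] / count  # estimate of P(aj | Ci)
        scores[label] = score
    return max(scores, key=scores.get), scores

# Hypothetical toy data: (driver rating, vehicle age) -> claim label.
rows = [
    ("good", "new"),
    ("good", "old"),
    ("poor", "new"),
    ("poor", "old"),
    ("poor", "old"),
]
labels = ["genuine", "genuine", "genuine", "fraud", "fraud"]

model = train(rows, labels)
label, scores = classify(("poor", "old"), *model)
print(label)  # fraud
```

For the query `("poor", "old")`, the score for "fraud" is 0.4 × 1 × 1 = 0.4, while the score for "genuine" is 0.6 × 1/3 × 1/3 ≈ 0.067, so "fraud" is selected.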
