Naïve Bayes Algorithm

The Theory

P(Y) = .40    P(N) = .60
P(M) = .55    P(M & Y) = .10    P(M & N) = .45
P(F) = .45    P(F & Y) = .30    P(F & N) = .15

On a campus where 55% of the students are male, 40% of all students answer yes to a question, with the breakdown shown above. What is the probability that a male student says yes? What is the probability that a female student says yes?

P(Y|M) = P(M & Y)/P(M) = .10/.55 = .18
P(Y|F) = P(F & Y)/P(F) = .30/.45 = .67

Female students are more likely to say yes than male students: .67/.18 = 3.72 times as likely.

P(A|B) = P(A & B)/P(B)    (conditional probability, the basis of Bayes' Theorem)

More Theory

What if A and B in P(A|B) = P(A & B)/P(B) are independent? That is, P(B|A) = P(B) and P(A|B) = P(A): the gender has nothing to do with the yes or no vote. Then

P(A|B) = P(A & B)/P(B) = P(A)
=> P(A & B) = P(A) P(B)

More Theory: Independent

P(Y) = .40    P(N) = .60
P(M) = .55    P(M & Y) = .22    P(M & N) = .33
P(F) = .45    P(F & Y) = .18    P(F & N) = .27

When the vote and gender are independent,
P(Y & M) = P(M & Y) = P(M) * P(Y) = .55 * .40 = .22

What Should the Training Data Look Like?

Gender and vote in one table:

V#    Gender  Vote
112   M       Y
789   M       N
332   F       N

The same data split into two tables keyed by V#:

V#    Vote        V#    Gender
112   Y           112   M
789   N           789   M
332   N           332   F

Training Data Sample

ID  Name         Party  Permanent  Food    Nuclear  Abortions  Welfare  Estate Tax
                        Tax Cuts   Stamps  Waste    Overseas   Renewal  Repeal
1   Abercrombie  D      N          Y       N        Y          Y        Y
2   Ackerman     D      N          Y       N        Y          Y        N
3   Aderholt     R      Y          N       Y        N          N        Y
4   Akin         R      Y          N       Y        N          N        Y
5   Allen, T.    D      N          Y       Y        Y          Y        N
6   Andrews      D      N          Y       Y        Y          Y        N
7   Armey        R      Y          N       Y        N          N        Y
8   Baca         D      N          Y       N        Y          Y        N
9   Bachus, S.   R      Y          N       Y        N          N        Y
10  Baird        D      N          Y       Y        Y          Y        N

The Result

The Question
Given a set of voting records, what is the party of the member?

The Answer
P(D) = 0.2 * 0.57 * 0.94 * 0.89 * 0.49 = [...]
P(R) = 0.98 * 0.03 * 0.83 * [...] * 0.51 = [...]

Normalize the two so that they sum to 1:
P(D) = [...]/([...] + [...]) = 79%
P(R) = [...]/([...] + [...]) = 21%

The No Evidence Problem

If any P(e) = 0, then the whole product, and so the prediction, will be 0. To avoid this, we add a nonzero amount, for example 1, to each possible count/output.
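The classification and the add-one fix described above can be sketched in Python. This is a minimal illustration rather than the method used to produce the 79%/21% result: it trains only on the ten sample rows, and the names `TRAINING`, `train`, `predict`, and the `alpha` parameter are hypothetical. The test record `"NYNYYN"` is made up for the example.

```python
from collections import Counter, defaultdict

# Ten-row training sample: (party, six Y/N votes in column order).
TRAINING = [
    ("D", "NYNYYY"), ("D", "NYNYYN"), ("R", "YNYNNY"), ("R", "YNYNNY"),
    ("D", "NYYYYN"), ("D", "NYYYYN"), ("R", "YNYNNY"), ("D", "NYNYYN"),
    ("R", "YNYNNY"), ("D", "NYYYYN"),
]


def train(rows):
    """Count members per party and votes per (party, issue)."""
    class_counts = Counter(party for party, _ in rows)
    vote_counts = defaultdict(Counter)  # (party, issue index) -> vote counts
    for party, votes in rows:
        for i, vote in enumerate(votes):
            vote_counts[(party, i)][vote] += 1
    return class_counts, vote_counts


def predict(votes, class_counts, vote_counts, alpha=1):
    """Return normalized party scores for one voting record.

    alpha is the amount added to every count (alpha=1 is add-one
    smoothing; alpha=0 disables it). Each vote has two outcomes,
    Y or N, so the denominator grows by 2 * alpha.
    """
    total = sum(class_counts.values())
    scores = {}
    for party, n in class_counts.items():
        score = n / total  # prior P(party)
        for i, vote in enumerate(votes):
            score *= (vote_counts[(party, i)][vote] + alpha) / (n + 2 * alpha)
        scores[party] = score
    norm = sum(scores.values())
    return {party: s / norm for party, s in scores.items()}


class_counts, vote_counts = train(TRAINING)
smoothed = predict("NYNYYN", class_counts, vote_counts, alpha=1)
unsmoothed = predict("NYNYYN", class_counts, vote_counts, alpha=0)
print(smoothed)
print(unsmoothed)
```

Without smoothing, the R score collapses to 0 because no R member in the sample voted N on the first issue. With `alpha=1` the R score stays small but nonzero.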
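The campus example and the independence condition P(A & B) = P(A) P(B) can be checked numerically. The sketch below assumes the two joint tables from the theory slides. The function names (`marginal`, `conditional_yes`, `is_independent`) and the tolerance `tol` are hypothetical choices made for illustration.

```python
# Joint probabilities P(gender & vote) from the two theory slides.
CAMPUS = {("M", "Y"): 0.10, ("M", "N"): 0.45,
          ("F", "Y"): 0.30, ("F", "N"): 0.15}
INDEPENDENT = {("M", "Y"): 0.22, ("M", "N"): 0.33,
               ("F", "Y"): 0.18, ("F", "N"): 0.27}


def marginal(joint, position, value):
    """Sum the joint table over the other variable.

    position 0 selects gender, position 1 selects vote.
    """
    return sum(p for key, p in joint.items() if key[position] == value)


def conditional_yes(joint, gender):
    """P(Y | gender) = P(gender & Y) / P(gender)."""
    return joint[(gender, "Y")] / marginal(joint, 0, gender)


def is_independent(joint, tol=1e-9):
    """True if P(g & v) equals P(g) * P(v) for every cell, within tol."""
    return all(
        abs(p - marginal(joint, 0, g) * marginal(joint, 1, v)) <= tol
        for (g, v), p in joint.items()
    )


p_yes_male = conditional_yes(CAMPUS, "M")
p_yes_female = conditional_yes(CAMPUS, "F")
print(round(p_yes_male, 2), round(p_yes_female, 2))
print(is_independent(CAMPUS), is_independent(INDEPENDENT))
```

The tolerance is there because the products are computed in floating point, so exact equality would be too strict a test.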