Bayesian Statistics: Asking the “Right” Questions
Michael L. Raymer, Ph.D.
8/29/03 M. Raymer – WSU, FBS 2
Statistical Games
“The defendant’s DNA is consistent with the evidentiary sample, and the defendant’s DNA type occurs with a frequency of one in 10,000,000,000.”
“Only about 0.1% of wife batterers actually murder their wives. Therefore, evidence of abuse and battering should not be admissible in a murder trial.”
The Question
• “Given the evidentiary DNA type and the defendant’s DNA type, what is the probability that the evidence sample contains the defendant’s DNA?”
• Information available:
  ◦ How common is each allele in a particular population?
  ◦ CPI, RMP, etc.
An Example Problem
• Suppose the rate of breast cancer is 1%
• Mammograms detect breast cancer in 80% of cases where it is present
• 10% of the time, mammograms will indicate breast cancer in a healthy patient
• If a woman has a positive mammogram result, what is the probability that she has breast cancer?
Results
• 75% -- 3
• 50% -- 1
• 25% -- 2
• <10% -- a lot
Determining Probabilities
• Counting all possible outcomes
• If you flip a coin 4 times, what is the probability that you will get heads exactly twice?
    TTTT THTT HTTT HHTT
    TTTH THTH HTTH HHTH
    TTHT THHT HTHT HHHT
    TTHH THHH HTHH HHHH
• P(2 heads) = 6/16 = 0.375
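The counting argument above can be checked by brute force. The following is a minimal sketch, not part of the original slides; the function name `prob_exact_heads` is an assumption made for illustration. It lists every outcome of the flips and counts those with the requested number of heads.

```python
from fractions import Fraction
from itertools import product


def prob_exact_heads(n_flips: int, n_heads: int) -> Fraction:
    """Probability of exactly n_heads heads in n_flips fair coin flips, by enumeration."""
    # Every sequence of H/T is equally likely for a fair coin.
    outcomes = list(product("HT", repeat=n_flips))
    favorable = [o for o in outcomes if o.count("H") == n_heads]
    return Fraction(len(favorable), len(outcomes))


if __name__ == "__main__":
    p = prob_exact_heads(4, 2)
    print(p, float(p))  # 3/8 0.375
```

Enumeration only works for small problems, since the number of outcomes doubles with each extra flip.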
Statistical Preliminaries
• Frequency and Probability
  ◦ We can estimate probabilities by counting frequencies: P(heads) = 0.5
  ◦ The law of large numbers: the more samples we take, the closer the observed frequency tends to get to the true probability (here 0.5).
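A small simulation can illustrate the law of large numbers. This is a sketch under stated assumptions: the function name `heads_frequency` and the fixed seed are choices made here for reproducibility, not something from the slides.

```python
import random


def heads_frequency(n_tosses: int, seed: int = 0) -> float:
    """Fraction of heads observed in n_tosses simulated fair coin tosses."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same result
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


if __name__ == "__main__":
    # The observed frequency should drift toward 0.5 as the sample grows.
    for n in (10, 100, 10_000, 1_000_000):
        print(n, heads_frequency(n))
```

Small samples can land far from 0.5; large samples rarely do.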
Distributions
• Counting frequencies gives us distributions
[Figure: Binomial Distribution (Discrete). P(N) versus N = # heads in 20 tosses]
[Figure: Gaussian Distribution (Continuous). P(N) versus N]
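The binomial panel of the figure (number of heads in 20 tosses) can be recomputed directly. The sketch below assumes a fair coin; the function name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(n: int, k: int, p: float = 0.5) -> float:
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


if __name__ == "__main__":
    # Distribution of the number of heads in 20 tosses; the peak at N = 10 is about 0.176.
    for k in range(21):
        print(k, round(binomial_pmf(20, k), 4))
```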
Density Estimation
• Parametric
  ◦ Assume a distribution (e.g., Gaussian).
  ◦ Estimate the parameters (μ, σ).
• Non-parametric
  ◦ Histogram sampling
  ◦ Bin size is critical
  ◦ Gaussian smoothing can help
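Both approaches can be sketched in a few lines. The sample values below are made up for illustration, and the function names `fit_gaussian` and `histogram_density` are assumptions, not the author's method. Running the histogram with several bin widths shows why bin size matters.

```python
import statistics
from collections import Counter

# Hypothetical fruit diameters in inches (invented for this sketch).
SAMPLES = [2.7, 2.9, 3.0, 3.1, 3.3, 3.4, 3.6, 4.0]


def fit_gaussian(samples):
    """Parametric estimate: sample mean and sample standard deviation."""
    return statistics.mean(samples), statistics.stdev(samples)


def histogram_density(samples, bin_width):
    """Non-parametric estimate: fraction of samples per bin, divided by the bin width."""
    counts = Counter(int(x // bin_width) for x in samples)
    n = len(samples)
    # Keys are the left edges of the non-empty bins.
    return {index * bin_width: count / (n * bin_width)
            for index, count in sorted(counts.items())}


if __name__ == "__main__":
    mu, sigma = fit_gaussian(SAMPLES)
    print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
    for width in (0.25, 0.5, 1.0):
        print(width, histogram_density(SAMPLES, width))
```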
Combining Probabilities
• Non-overlapping outcomes:
    $P(E_1 \text{ or } E_2) = P(E_1) + P(E_2)$
• Possible overlap:
    $P(E_1 \text{ or } E_2) = P(E_1) + P(E_2) - P(E_1 \text{ and } E_2)$
• Independent events (the Product Rule):
    $P(E_1 \text{ and } E_2) = P(E_1)\,P(E_2)$
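The three rules can be verified on a small sample space. The sketch below uses two fair dice, an example chosen here rather than taken from the slides; the event names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
SPACE = list(product(range(1, 7), repeat=2))


def prob(event) -> Fraction:
    """Probability of an event, given as a predicate over outcomes."""
    return Fraction(sum(1 for o in SPACE if event(o)), len(SPACE))


def first_is_one(o):
    return o[0] == 1


def first_is_even(o):
    return o[0] % 2 == 0


def second_is_six(o):
    return o[1] == 6


if __name__ == "__main__":
    # Non-overlapping outcomes: the first die cannot be both 1 and even.
    print(prob(lambda o: first_is_one(o) or first_is_even(o)))
    # Possible overlap: subtract the doubly counted outcomes.
    print(prob(first_is_even) + prob(second_is_six)
          - prob(lambda o: first_is_even(o) and second_is_six(o)))
    # Independent events: the product rule.
    print(prob(first_is_even) * prob(second_is_six))
```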
Product Rule Example
• P(Engine > 200 H.P.) = 0.2
• P(Color = red) = 0.3
• Assuming independence:
    P(Red & Fast) = 0.2 × 0.3 = 0.06
• 1/4 × 1/10 × 1/6 × 1/8 × 1/5 = 1/9,600 ≈ 1/10,000
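The chained product on this slide can be checked with exact fractions. This is a minimal sketch; the helper name `product_rule` is an assumption for illustration.

```python
from fractions import Fraction
from math import prod


def product_rule(probabilities):
    """Joint probability of independent events: the product of their probabilities."""
    return prod(probabilities, start=Fraction(1))


if __name__ == "__main__":
    factors = [Fraction(1, 4), Fraction(1, 10), Fraction(1, 6), Fraction(1, 8), Fraction(1, 5)]
    print(product_rule(factors))  # 1/9600, roughly 1 in 10,000
```

The result is only valid if the events really are independent.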
Statistical Decision Making
• One variable:
    A ring was found at the scene of the crime. The ring is size 11. The defendant’s ring size is also 11. If a random ring were left at the crime scene, what is the probability that it would have been size 11?
[Figure: histogram of frequency (0 to 120) versus ring size (5 to 13)]
Multiple Variables
• Assume independence:
    The ring is size 11, and also made of platinum.
    $P(\text{size 11}) = 34/382 \approx 0.09$
    $P(\text{platinum}) = 2/382 \approx 0.005$
    $P(\text{size 11 and platinum}) \approx 0.09 \times 0.005 = 0.00045$
• Note what happens to significant digits!
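The remark about significant digits can be made concrete by comparing the product of the rounded values with the product of the unrounded fractions. The sketch below uses the counts from the slide (34 and 2 out of 382); the function name is hypothetical.

```python
def exact_and_rounded(count_a: int, count_b: int, total: int):
    """Joint probability under independence, computed exactly and from rounded marginals."""
    exact = (count_a / total) * (count_b / total)
    # Round the marginals the way the slide does: 0.09 and 0.005.
    rounded = round(count_a / total, 2) * round(count_b / total, 3)
    return exact, rounded


if __name__ == "__main__":
    exact, rounded = exact_and_rounded(34, 2, 382)
    print(f"exact   = {exact:.6f}")
    print(f"rounded = {rounded:.6f}")
    print(f"relative error = {(exact - rounded) / exact:.1%}")
```

Each factor carries only one significant digit, so the product cannot be trusted beyond that, and the error grows as more factors are multiplied.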
Which Question?
• If a fruit has a diameter of 4”, how likely is it to be an apple?
[Figure: diagram with regions labeled “Apples” and “4” Fruit”]
“Inverting” the question
Given an apple, what is the probability that it will have a diameter of 4”?
Given a 4” diameter fruit, what is the probability that it is an apple?
Forensic DNA Evidence
• Given alleles (17, 17), (19, 21), (14, 15.1), what is the probability that a DNA sample belongs to Bob?
  ◦ Find all (17, 17), (19, 21), (14, 15.1) individuals; how many of them are Bob?
  ◦ How common are 17, 19, 21, 14, and 15.1 in “the population”?
Conditional Probabilities
• For related events, we can express probability conditionally:
    $P(\text{rain} \mid \text{cloudy}) = 0.5$
    $P(\text{rain} \mid \text{sunny}) = 0.01$
• Statistical Independence:
    $P(E_1 \mid E_2) = P(E_1)$
Bayesian Decision Making
• Terminology
  ◦ We have an object, and we want to decide if it belongs to a class
      Is this fruit a type of apple?
      Does this DNA come from a Caucasian American?
      Is this car a sports car?
  ◦ We measure features of the object (evidence):
      Size, weight, color
      Alleles at various loci
Bayesian Notation
• Feature/Evidence Vector:
    $x_1 = (\text{red},\ 1.6\ \text{oz},\ 2.9'')$
    $x_2 = (\text{yellow},\ 0.39\ \text{oz},\ 3.3'')$
• Classes & Posterior Probability:
    $P(\text{apple} \mid x_2) = 0.40$
    $P(\text{pear} \mid x_2) = 0.15$
A Simple Example
• You are given a fruit with a diameter of 4” – is it a pear or an apple?
• To begin, we need to know the distributions of diameters for pears and apples.
Maximum Likelihood
[Figure: Class-Conditional Distributions. Curves $p(x \mid \text{apple})$ and $p(x \mid \text{pear})$ plotted against x = diameter (1” to 6”)]
A Key Problem
• We based this decision on $P(x \mid \text{pear})$ (class conditional)
• What we really want to use is $P(\text{pear} \mid x)$ (posterior probability)
• What if we found the fruit in a pear orchard?
• We need to know the prior probability of finding an apple or a pear!
Prior Probabilities
• Prior probability + Evidence → Posterior probability
• Without evidence, what is the “prior probability” that a fruit is an apple?
• What is the prior probability that a DNA sample comes from the defendant?
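The heart of it all
• Bayes Rule
    $$P(\text{class} \mid \text{evidence}) = \frac{P(\text{evidence} \mid \text{class})\,P(\text{class})}{\sum_{\text{all classes}} P(\text{evidence} \mid \text{class})\,P(\text{class})}$$
    $$P(\text{apple} \mid d = 4'') = \frac{p(d = 4'' \mid \text{apple})\,P(\text{apple})}{p(d = 4'' \mid \text{apple})\,P(\text{apple}) + p(d = 4'' \mid \text{pear})\,P(\text{pear})}$$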
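Bayes rule translates almost line for line into code. The following is a minimal sketch, assuming the classes are listed exhaustively; the function name `posterior` and the coin example are illustrative and not taken from the slides.

```python
def posterior(priors: dict, likelihoods: dict) -> dict:
    """Bayes rule: P(class | evidence) for every class.

    priors[c] is P(c); likelihoods[c] is P(evidence | c).
    """
    # Numerator of Bayes rule for each class.
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    # Denominator: the same quantity summed over all classes.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("evidence has zero probability under every class")
    return {c: j / evidence for c, j in joint.items()}


if __name__ == "__main__":
    # Hypothetical example: a coin is either fair or two-headed, and it lands heads.
    print(posterior({"fair": 0.5, "two-headed": 0.5},
                    {"fair": 0.5, "two-headed": 1.0}))
```

Because every class shares the same denominator, the posteriors always sum to 1.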
Bayes Rule
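    $$P(\omega_j \mid x) = \frac{p(x \mid \omega_j)\,P(\omega_j)}{\sum_{k=1}^{c} p(x \mid \omega_k)\,P(\omega_k)}$$
or
    $$P(\omega_j \mid x) = \frac{p(x \mid \omega_j)\,P(\omega_j)}{p(x)}$$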
Example Revisited
• Is it an ordinary apple or an uncommon pear?
    $P(d = 4'' \mid \text{apple}) = 0.4$
    $P(d = 4'' \mid \text{pear}) = 0.05$
    $P(\text{apple}) = 0.1$
    $P(\text{pear}) = 0.9$
Bayes Rule Example
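    $$P(\text{apple} \mid d = 4'') = \frac{p(d = 4'' \mid \text{apple})\,P(\text{apple})}{p(d = 4'' \mid \text{apple})\,P(\text{apple}) + p(d = 4'' \mid \text{pear})\,P(\text{pear})} = \frac{0.4 \times 0.1}{0.4 \times 0.1 + 0.05 \times 0.9} = \frac{0.04}{0.085} \approx 0.47$$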
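Bayes Rule Example
    $$P(\text{pear} \mid d = 4'') = \frac{p(d = 4'' \mid \text{pear})\,P(\text{pear})}{p(d = 4'' \mid \text{apple})\,P(\text{apple}) + p(d = 4'' \mid \text{pear})\,P(\text{pear})} = \frac{0.05 \times 0.9}{0.4 \times 0.1 + 0.05 \times 0.9} = \frac{0.045}{0.085} \approx 0.53$$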
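The two posteriors for the 4” fruit can be reproduced numerically. This sketch uses the priors and class-conditional values from the example (0.1 and 0.9, 0.4 and 0.05); the variable and function names are illustrative.

```python
# Values from the fruit example: priors and class-conditional probabilities for d = 4".
PRIOR = {"apple": 0.1, "pear": 0.9}
LIKELIHOOD_4IN = {"apple": 0.4, "pear": 0.05}


def fruit_posteriors(prior=PRIOR, likelihood=LIKELIHOOD_4IN):
    """Posterior probability of each fruit class given the 4-inch diameter."""
    numerators = {fruit: likelihood[fruit] * prior[fruit] for fruit in prior}
    total = sum(numerators.values())  # 0.04 + 0.045 = 0.085
    return {fruit: value / total for fruit, value in numerators.items()}


if __name__ == "__main__":
    for fruit, p in fruit_posteriors().items():
        print(f"P({fruit} | d = 4 in) = {p:.2f}")
```

Although a 4” diameter is eight times more likely for an apple than for a pear, the strong prior for pears tips the decision to pear.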
Posing the question
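1. What are the classes?
2. What is the evidence?
3. What is the prior probability?
4. What is the class-conditional probability?
    $$P(\text{class} \mid \text{evidence}) = \frac{P(\text{evidence} \mid \text{class})\,P(\text{class})}{\sum_{\text{all classes}} P(\text{evidence} \mid \text{class})\,P(\text{class})}$$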
An Example Problem
• Suppose the rate of breast cancer is 1%
• Mammograms detect breast cancer in 80% of cases where it is present
• 10% of the time, mammograms will indicate breast cancer in a healthy patient
• If a woman has a positive mammogram result, what is the probability that she has breast cancer?
Practice Problem Revisited
• Classes: healthy, cancer
• Evidence: positive mammogram (pos), negative mammogram (neg)
    $P(\text{cancer}) = 0.01$
    $P(\text{pos} \mid \text{cancer}) = 0.8$
    $P(\text{pos} \mid \text{healthy}) = 0.1$
• If a woman has a positive mammogram result, what is the probability that she has breast cancer? $P(\text{cancer} \mid \text{pos}) = ?$
A Counting Argument
• Suppose we have 1000 women
  ◦ 10 will have breast cancer
      8 of these will have a positive mammogram
  ◦ 990 will not have breast cancer
      99 of these will have a positive mammogram
• Of the 107 women with a positive mammogram, 8 have breast cancer
    8/107 ≈ 0.075 = 7.5%
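The counting argument can be written as a short calculation. This is a sketch using the rates from the problem statement (1% prevalence, 80% detection, 10% false positives); the function and parameter names are assumptions for illustration.

```python
def positive_counts(population, prevalence, detection_rate, false_positive_rate):
    """Expected numbers of true and false positive test results in a population."""
    sick = population * prevalence
    healthy = population - sick
    true_positives = sick * detection_rate
    false_positives = healthy * false_positive_rate
    return true_positives, false_positives


if __name__ == "__main__":
    tp, fp = positive_counts(1000, 0.01, 0.8, 0.1)
    # Share of positive mammograms that come from women who have cancer.
    print(f"{tp:.0f} of {tp + fp:.0f} positives: {tp / (tp + fp):.1%}")
```

The false positives outnumber the true positives because healthy women vastly outnumber women with cancer.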
Solution
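    $$P(\text{cancer} \mid \text{pos}) = \frac{p(\text{pos} \mid \text{cancer})\,P(\text{cancer})}{p(\text{pos} \mid \text{cancer})\,P(\text{cancer}) + p(\text{pos} \mid \text{healthy})\,P(\text{healthy})} = \frac{0.8 \times 0.01}{0.8 \times 0.01 + 0.1 \times 0.99} = \frac{0.008}{0.107} \approx 0.075 = 7.5\%$$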
An Example Problem
• Suppose the chance of a randomly chosen person being guilty is 0.001
• When a person is guilty, a DNA sample will match that individual 99% of the time.
• For an innocent individual, a DNA sample will exhibit a false match 0.01% of the time (probability 0.0001).
• If a DNA test demonstrates a match, what is the probability of guilt?
Solution
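    $$P(\text{guilt} \mid \text{pos}) = \frac{p(\text{pos} \mid \text{guilt})\,P(\text{guilt})}{p(\text{pos} \mid \text{guilt})\,P(\text{guilt}) + p(\text{pos} \mid \text{innocent})\,P(\text{innocent})} = \frac{0.99 \times 0.001}{0.99 \times 0.001 + 0.0001 \times 0.999} = \frac{0.00099}{0.00099 + 0.0000999} \approx 0.908$$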
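The DNA calculation can be checked numerically, and the same function shows how strongly the answer depends on the assumed prior. This is a sketch: the function name and the second prior (one in a million) are illustrative assumptions, while the match rates come from the problem statement.

```python
def posterior_guilt(prior_guilt: float,
                    p_match_given_guilty: float = 0.99,
                    p_match_given_innocent: float = 0.0001) -> float:
    """P(guilt | DNA match) from Bayes rule with two classes, guilty and innocent."""
    guilty_term = p_match_given_guilty * prior_guilt
    innocent_term = p_match_given_innocent * (1.0 - prior_guilt)
    return guilty_term / (guilty_term + innocent_term)


if __name__ == "__main__":
    print(f"prior 0.001:    {posterior_guilt(0.001):.3f}")
    # Hypothetical smaller prior, e.g. one suspect among a million people.
    print(f"prior 0.000001: {posterior_guilt(0.000001):.4f}")
```

With a prior of one in a million, the same match leaves the probability of guilt near 1%.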
Marginal Distributions
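[Figure: marginal class-conditional distributions $P(x_1 \mid \text{apple})$, $P(x_1 \mid \text{pear})$, $P(x_2 \mid \text{apple})$, and $P(x_2 \mid \text{pear})$]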
Combining Marginals
• Assuming independent features:
    $$P(x \mid \omega_j) = P(x_1 \mid \omega_j)\,P(x_2 \mid \omega_j) \cdots P(x_d \mid \omega_j)$$
• If we assume independence and use Bayes rule, we have a Naïve Bayes decision maker (classifier).
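A Naïve Bayes classifier for discrete features fits in a few lines. In this sketch the per-feature probability tables and the priors are invented for illustration; only the structure (product of marginals, then Bayes rule) follows the slide.

```python
from math import prod

# Hypothetical priors and per-feature class-conditional tables (made-up numbers).
PRIORS = {"apple": 0.5, "pear": 0.5}
FEATURE_TABLES = {
    "apple": {"color": {"red": 0.7, "yellow": 0.3}, "size": {"large": 0.6, "small": 0.4}},
    "pear": {"color": {"red": 0.1, "yellow": 0.9}, "size": {"large": 0.2, "small": 0.8}},
}


def naive_bayes_posterior(observation: dict, priors=PRIORS, tables=FEATURE_TABLES) -> dict:
    """Posterior over classes, treating the features as independent given the class."""
    joint = {}
    for cls, prior in priors.items():
        # Independence assumption: multiply the marginal probability of each feature value.
        marginals = (tables[cls][name][value] for name, value in observation.items())
        joint[cls] = prior * prod(marginals)
    total = sum(joint.values())
    return {cls: value / total for cls, value in joint.items()}


if __name__ == "__main__":
    post = naive_bayes_posterior({"color": "red", "size": "large"})
    print(post, "->", max(post, key=post.get))
```

Picking the class with the largest posterior is the decision step.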
Bayes Decision Rule
• Provably optimum when the features (evidence) follow Gaussian distributions, and are independent.
    Predict class $\omega_i$ such that $P(\omega_i \mid x) \ge P(\omega_j \mid x)$ for all $j$
Forensic DNA
• Classes: DNA from defendant, DNA not from defendant
• Evidence: Allele matches at various loci
  ◦ Assumption of independence
• Prior Probabilities?
  ◦ Assumed equal (0.5)
  ◦ What is the true prior probability that an evidence sample came from a particular individual?
The Importance of Priors
    P(x | ω)    P(ω)     P(x | ω) P(ω), proportional to P(ω | x)
    0.1         0.5      0.05
    0.1         0.25     0.025
    0.1         0.1      0.01
    0.1         0.01     0.001
    0.1         0.001    0.0001
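The table can be regenerated by holding the class-conditional probability at 0.1 and sweeping the prior. This is a minimal sketch; the function name is an assumption, and the product shown is the numerator of Bayes rule rather than a normalized posterior.

```python
def bayes_numerator(likelihood: float, prior: float) -> float:
    """P(x | class) * P(class): the unnormalized posterior for one class."""
    return likelihood * prior


if __name__ == "__main__":
    likelihood = 0.1  # the same evidence strength in every row
    for prior in (0.5, 0.25, 0.1, 0.01, 0.001):
        print(f"{likelihood:<6} {prior:<6} {bayes_numerator(likelihood, prior):.4f}")
```

The evidence is identical in every row, so the prior alone changes the result by a factor of 500.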
Likelihood Ratios
• When deciding between two possibilities, we don’t need the exact probabilities. We only need to know which one is greater.
• The denominator for all the classes is always equal.
  ◦ Can be eliminated
  ◦ Useful when there are many possible classes
Likelihood Ratio Example
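    $$P(\text{pear} \mid d = 4'') = \frac{p(d = 4'' \mid \text{pear})\,P(\text{pear})}{p(d = 4'' \mid \text{apple})\,P(\text{apple}) + p(d = 4'' \mid \text{pear})\,P(\text{pear})}$$
    $$P(\text{apple} \mid d = 4'') = \frac{p(d = 4'' \mid \text{apple})\,P(\text{apple})}{p(d = 4'' \mid \text{apple})\,P(\text{apple}) + p(d = 4'' \mid \text{pear})\,P(\text{pear})}$$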
Likelihood Ratio Example
    $$\frac{p(d = 4'' \mid \text{pear})\,P(\text{pear})}{p(d = 4'' \mid \text{apple})\,P(\text{apple})}$$
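Because the denominator is shared, the decision only needs the ratio of the two numerators. The sketch below reuses the numbers from the fruit example (priors 0.1 and 0.9, class-conditional values 0.4 and 0.05); the function names are illustrative.

```python
def numerator_ratio(likelihood_a, prior_a, likelihood_b, prior_b) -> float:
    """Ratio of the Bayes numerators for class A over class B; the denominator cancels."""
    return (likelihood_a * prior_a) / (likelihood_b * prior_b)


def decide(ratio: float, class_a: str, class_b: str) -> str:
    """Choose class A when the ratio exceeds 1, otherwise class B."""
    return class_a if ratio > 1 else class_b


if __name__ == "__main__":
    # Pear over apple for a 4-inch fruit.
    ratio = numerator_ratio(0.05, 0.9, 0.4, 0.1)
    print(f"ratio = {ratio:.3f} -> {decide(ratio, 'pear', 'apple')}")
```

A ratio of 1.125 favors pear, in agreement with the posteriors 0.53 and 0.47.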
From alleles to identity:
• It is relatively easy to find the allele frequencies in the population
  ◦ Marginal probability distributions
• Independence assumption
  ◦ Class conditional probabilities
• Equal prior probabilities
  ◦ Bayesian posterior probability estimate
Thank you.
A Key Advantage
• The oldest citation:
T. Bayes. “An essay towards solving a problem in the doctrine of chances.” Phil. Trans. Roy. Soc., 53, 1763.