Bayesian Statistics: Asking the “Right” Questions

Michael L. Raymer, Ph.D.

8/29/03 M. Raymer – WSU, FBS 2

Statistical Games

“The defendant’s DNA is consistent with the evidentiary sample, and the defendant’s DNA type occurs with a frequency of one in 10,000,000,000.”

“Only about 0.1% of wife batterers actually murder their wives. Therefore, evidence of abuse and battering should not be admissible in a murder trial.”


The Question

• “Given the evidentiary DNA type and the defendant’s DNA type, what is the probability that the evidence sample contains the defendant’s DNA?”

• Information available:
  - How common is each allele in a particular population?
  - CPI (combined probability of inclusion), RMP (random match probability), etc.

An Example Problem

• Suppose the rate of breast cancer is 1%
• Mammograms detect breast cancer in 80% of cases where it is present
• 10% of the time, mammograms will indicate breast cancer in a healthy patient
• If a woman has a positive mammogram result, what is the probability that she has breast cancer?

Results

• 75% -- 3
• 50% -- 1
• 25% -- 2
• <10% -- a lot

Determining Probabilities

• Counting all possible outcomes
• If you flip a coin 4 times, what is the probability that you will get heads exactly twice?

TTTT THTT HTTT HHTT
TTTH THTH HTTH HHTH
TTHT THHT HTHT HHHT
TTHH THHH HTHH HHHH

• P(2 heads) = 6/16 = 0.375
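The counting argument can be reproduced by brute force. The sketch below enumerates every sequence of flips and counts those with the requested number of heads; the function name `prob_exact_heads` is illustrative and does not come from the slides.

```python
from itertools import product


def prob_exact_heads(n_flips: int, n_heads: int) -> float:
    """Probability of exactly n_heads heads in n_flips fair-coin flips, by enumeration."""
    # Every sequence of H/T of the given length is equally likely for a fair coin.
    outcomes = list(product("HT", repeat=n_flips))
    favorable = [o for o in outcomes if o.count("H") == n_heads]
    return len(favorable) / len(outcomes)


print(prob_exact_heads(4, 2))  # 0.375
```

Enumeration only scales to small numbers of flips, since the number of sequences doubles with each flip.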


Statistical Preliminaries

• Frequency and Probability

We can guess at probabilities by counting frequencies: P(heads) = 0.5

The law of large numbers: the more samples we take, the closer the observed frequency tends to get to 0.5.
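A quick simulation illustrates the law of large numbers. This is a sketch that assumes a fair coin and uses a fixed seed so that runs are repeatable; `heads_frequency` is an illustrative name.

```python
import random


def heads_frequency(n_flips: int, seed: int = 0) -> float:
    """Observed fraction of heads in n_flips simulated fair-coin flips."""
    rng = random.Random(seed)  # fixed seed: an assumption made for repeatability
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


# The observed frequency should drift toward 0.5 as the sample grows.
for n in (10, 100, 10_000, 100_000):
    print(n, heads_frequency(n))
```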


Distributions

• Counting frequencies gives us distributions

[Figure: Binomial Distribution (Discrete): P(N) versus N = # Heads (20 Tosses)]

[Figure: Gaussian Distribution (Continuous): P(N) versus N]
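The binomial distribution for the number of heads in 20 tosses can be computed directly rather than estimated by counting. The sketch below assumes a fair coin; `binomial_pmf` is an illustrative helper name.

```python
from math import comb


def binomial_pmf(n_heads: int, n_tosses: int = 20, p_heads: float = 0.5) -> float:
    """P(N = n_heads) for n_tosses independent tosses with head probability p_heads."""
    return comb(n_tosses, n_heads) * p_heads**n_heads * (1 - p_heads) ** (n_tosses - n_heads)


pmf = [binomial_pmf(k) for k in range(21)]
peak = max(range(21), key=lambda k: pmf[k])
print(peak, round(pmf[peak], 3))  # 10 0.176
```

The peak at 10 heads, with probability just under 0.18, matches the range of the vertical axis in the binomial plot.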


Density Estimation

• Parametric
  - Assume a distribution (e.g., Gaussian).
  - Estimate the parameters (μ, σ).
• Non-parametric
  - Histogram sampling
  - Bin size is critical
  - Gaussian smoothing can help
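The two approaches can be contrasted on synthetic data. This is a minimal sketch: the samples are made up (drawn from a Gaussian with mean 3.0 and standard deviation 0.5), and `histogram_density` and the bin width of 0.25 are illustrative choices, not values from the slides.

```python
import random
import statistics
from collections import Counter

rng = random.Random(1)
samples = [rng.gauss(3.0, 0.5) for _ in range(1000)]  # hypothetical measurements

# Parametric: assume a Gaussian and estimate its two parameters.
mu = statistics.fmean(samples)
sigma = statistics.stdev(samples)


def histogram_density(data, bin_width):
    """Non-parametric estimate: map each bin's left edge to its estimated density."""
    counts = Counter(int(x // bin_width) for x in data)
    n = len(data)
    return {b * bin_width: c / (n * bin_width) for b, c in sorted(counts.items())}


density = histogram_density(samples, bin_width=0.25)
print(round(mu, 2), round(sigma, 2), len(density))
```

Changing `bin_width` shows why bin size is critical: very narrow bins give a noisy estimate, very wide bins hide the shape.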


Combining Probabilities

• Non-overlapping outcomes: $P(E_1 \text{ or } E_2) = P(E_1) + P(E_2)$
• Possible overlap: $P(E_1 \text{ or } E_2) = P(E_1) + P(E_2) - P(E_1 \text{ and } E_2)$
• Independent events (the Product Rule): $P(E_1 \text{ and } E_2) = P(E_1)\,P(E_2)$

Product Rule Example

• P(Engine > 200 H.P.) = 0.2
• P(Color = red) = 0.3
• Assuming independence: P(Red & Fast) = 0.2 × 0.3 = 0.06
• 1/4 × 1/10 × 1/6 × 1/8 × 1/5 = 1/9,600 ≈ 1/10,000
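The product rule is just a running multiplication, which the sketch below checks for both examples. Exact fractions are used for the second one to show that the product is 1/9,600, which the slide rounds to about 1 in 10,000. The helper name `joint_probability` is illustrative.

```python
from fractions import Fraction
from math import prod


def joint_probability(probabilities):
    """Product rule: valid only if the events are independent."""
    return prod(probabilities)


red_and_fast = joint_probability([0.2, 0.3])
combined = joint_probability([Fraction(1, n) for n in (4, 10, 6, 8, 5)])

print(round(red_and_fast, 2))  # 0.06
print(combined)                # 1/9600
```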


Statistical Decision Making

• One variable:

A ring was found at the scene of the crime. The ring is size 11. The defendant’s ring size is also 11. If a random ring were left at the crime scene, what is the probability that it would have been size 11?

[Figure: histogram of Frequency versus Ring Size, for sizes 5 through 13]

Multiple Variables

• Assume independence:

Note what happens to significant digits!

The ring is size 11, and also made of platinum.

$P(\text{size 11}) = 34/382 \approx 0.09$

$P(\text{platinum}) = 2/382 \approx 0.005$

$P(\text{size 11 and platinum}) \approx 0.09 \times 0.005 = 0.00045$
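The warning about significant digits can be made concrete by comparing the product of the rounded values with the product of the unrounded ones. This sketch uses the counts from the ring example (34 size-11 rings and 2 platinum rings out of 382) and assumes independence, as the slide does; the variable names are illustrative.

```python
n_rings = 382

p_size_exact = 34 / n_rings      # about 0.089
p_platinum_exact = 2 / n_rings   # about 0.0052

# Product of the unrounded probabilities.
joint_exact = p_size_exact * p_platinum_exact

# Product of the values after rounding each one to a single significant digit.
joint_rounded = round(p_size_exact, 2) * round(p_platinum_exact, 3)

relative_error = abs(joint_rounded - joint_exact) / joint_exact
print(f"{joint_exact:.6f} {joint_rounded:.6f} {relative_error:.1%}")
```

Each factor is rounded only slightly, but the rounding errors compound when the factors are multiplied, and they compound further with every additional feature.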


Which Question?

• If a fruit has a diameter of 4”, how likely is it to be an apple?

[Figure: diagram with regions labeled “Apples” and “4” Fruit”]

“Inverting” the question

Given an apple, what is the probability that it will have a diameter of 4”?

Given a 4” diameter fruit, what is the probability that it is an apple?


Forensic DNA Evidence

• Given alleles (17, 17), (19, 21), (14, 15.1), what is the probability that a DNA sample belongs to Bob?

Among all individuals with alleles (17, 17), (19, 21), (14, 15.1), how many are Bob?

How common are 17, 19, 21, 14, and 15.1 in “the population”?


Conditional Probabilities

• For related events, we can express probability conditionally:

$P(\text{rain} \mid \text{cloudy}) = 0.5$

$P(\text{rain} \mid \text{sunny}) = 0.01$

• Statistical Independence:

$P(E_1 \mid E_2) = P(E_1)$

Bayesian Decision Making

• Terminology
  - We have an object, and we want to decide if it belongs to a class
    - Is this fruit a type of apple?
    - Does this DNA come from a Caucasian American?
    - Is this car a sports car?
  - We measure features of the object (evidence):
    - Size, weight, color
    - Alleles at various loci

Bayesian Notation

• Feature/Evidence Vector:

$\mathbf{x}_1 = (\text{red},\ 1.6\ \text{oz},\ 2.9'')$

$\mathbf{x}_2 = (\text{yellow},\ .39\ \text{oz},\ 3.3'')$

• Classes & Posterior Probability:

$P(\text{apple} \mid \mathbf{x}_2) = 0.40$

$P(\text{pear} \mid \mathbf{x}_2) = 0.15$

A Simple Example

• You are given a fruit with a diameter of 4”. Is it a pear or an apple?

• To begin, we need to know the distributions of diameters for pears and apples.

Maximum Likelihood

[Figure: Class-Conditional Distributions: P(x | apple) and P(x | pear) plotted against diameter x, from 1” to 6”]
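A maximum-likelihood decision simply picks the class whose class-conditional distribution gives the observed diameter the highest density. The sketch below assumes, purely for illustration, Gaussian class-conditional distributions with made-up means and spreads; none of these parameters come from the slides.

```python
from statistics import NormalDist

# Hypothetical class-conditional diameter distributions, in inches.
class_conditionals = {
    "apple": NormalDist(mu=3.0, sigma=0.5),
    "pear": NormalDist(mu=2.5, sigma=0.4),
}


def max_likelihood_class(diameter: float) -> str:
    """Return the class under which the observed diameter is most likely."""
    return max(class_conditionals, key=lambda c: class_conditionals[c].pdf(diameter))


print(max_likelihood_class(4.0))  # apple
print(max_likelihood_class(2.0))  # pear
```

Note that this rule ignores how common each class is, which is the gap the prior probability fills.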


A Key Problem

• We based this decision on $P(x \mid \text{pear})$ (class conditional)
• What we really want to use is $P(\text{pear} \mid x)$ (posterior probability)
• What if we found the fruit in a pear orchard?
• We need to know the prior probability of finding an apple or a pear!

Prior Probabilities

• Prior probability + Evidence → Posterior Probability

• Without evidence, what is the “prior probability” that a fruit is an apple?

• What is the prior probability that a DNA sample comes from the defendant?


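The heart of it all

• Bayes Rule

$$P(\text{class} \mid \text{evidence}) = \frac{P(\text{evidence} \mid \text{class})\,P(\text{class})}{\sum_{\text{all classes}} P(\text{evidence} \mid \text{class})\,P(\text{class})}$$

$$P(\text{apple} \mid d = 4'') = \frac{p(d = 4'' \mid \text{apple})\,P(\text{apple})}{p(d = 4'' \mid \text{apple})\,P(\text{apple}) + p(d = 4'' \mid \text{pear})\,P(\text{pear})}$$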

Bayes Rule

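$$P(\omega_j \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid \omega_j)\,P(\omega_j)}{\sum_{k=1}^{c} p(\mathbf{x} \mid \omega_k)\,P(\omega_k)}$$

or

$$P(\omega_j \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid \omega_j)\,P(\omega_j)}{p(\mathbf{x})}$$

Here $\omega_j$ is the $j$-th of $c$ classes and $\mathbf{x}$ is the evidence.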

Example Revisited

• Is it an ordinary apple or an uncommon pear?

$P(d = 4'' \mid \text{apple}) = 0.4$

$P(d = 4'' \mid \text{pear}) = 0.05$

$P(\text{apple}) = 0.1$

$P(\text{pear}) = 0.9$

Bayes Rule Example

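$$P(\text{apple} \mid d = 4'') = \frac{p(d = 4'' \mid \text{apple})\,P(\text{apple})}{p(d = 4'' \mid \text{apple})\,P(\text{apple}) + p(d = 4'' \mid \text{pear})\,P(\text{pear})} = \frac{0.4 \times 0.1}{0.4 \times 0.1 + 0.05 \times 0.9} = \frac{0.04}{0.085} \approx 0.47$$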

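Bayes Rule Example

$$P(\text{pear} \mid d = 4'') = \frac{p(d = 4'' \mid \text{pear})\,P(\text{pear})}{p(d = 4'' \mid \text{apple})\,P(\text{apple}) + p(d = 4'' \mid \text{pear})\,P(\text{pear})} = \frac{0.05 \times 0.9}{0.4 \times 0.1 + 0.05 \times 0.9} = \frac{0.045}{0.085} \approx 0.53$$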
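Both posteriors for the 4” fruit can be checked numerically. The sketch below applies Bayes rule to the class-conditional values (0.4 for apple, 0.05 for pear) and priors (0.1 for apple, 0.9 for pear) from the example; the function name `posteriors` is illustrative.

```python
def posteriors(likelihoods, priors):
    """Bayes rule: normalize likelihood * prior over all classes."""
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    evidence = sum(joint.values())  # the shared denominator
    return {c: j / evidence for c, j in joint.items()}


result = posteriors(
    likelihoods={"apple": 0.4, "pear": 0.05},
    priors={"apple": 0.1, "pear": 0.9},
)
print({c: round(p, 2) for c, p in result.items()})  # {'apple': 0.47, 'pear': 0.53}
```

Although a 4” diameter is eight times more likely for an apple, the strong prior in favor of pears makes pear the slightly more probable class.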


Posing the question

1. What are the classes?
2. What is the evidence?
3. What is the prior probability?
4. What is the class-conditional probability?

$$P(\text{class} \mid \text{evidence}) = \frac{P(\text{evidence} \mid \text{class})\,P(\text{class})}{\sum_{\text{all classes}} P(\text{evidence} \mid \text{class})\,P(\text{class})}$$

An Example Problem

• Suppose the rate of breast cancer is 1%
• Mammograms detect breast cancer in 80% of cases where it is present
• 10% of the time, mammograms will indicate breast cancer in a healthy patient
• If a woman has a positive mammogram result, what is the probability that she has breast cancer?

Practice Problem Revisited

• Classes: healthy, cancer
• Evidence: positive mammogram (pos), negative mammogram (neg)

$P(\text{cancer}) = 0.01$

$P(\text{pos} \mid \text{cancer}) = 0.8$

$P(\text{pos} \mid \text{healthy}) = 0.1$

• If a woman has a positive mammogram result, what is the probability that she has breast cancer? $P(\text{cancer} \mid \text{pos}) = ?$

A Counting Argument

• Suppose we have 1000 women
  - 10 will have breast cancer
    - 8 of these will have a positive mammogram
  - 990 will not have breast cancer
    - 99 of these will have a positive mammogram
  - Of the 107 women with a positive mammogram, 8 have breast cancer: 8/107 ≈ 0.075 = 7.5%
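The counting argument translates directly into code. This sketch takes a population size and the three rates from the problem statement and returns the expected counts; `natural_frequency_posterior` and its parameter names are illustrative.

```python
def natural_frequency_posterior(population, base_rate, sensitivity, false_positive_rate):
    """Count expected true and false positives, then take the true-positive share."""
    sick = population * base_rate
    healthy = population - sick
    true_pos = sick * sensitivity
    false_pos = healthy * false_positive_rate
    return true_pos, false_pos, true_pos / (true_pos + false_pos)


true_pos, false_pos, share = natural_frequency_posterior(1000, 0.01, 0.8, 0.1)
print(round(true_pos), round(false_pos), round(share, 3))  # 8 99 0.075
```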


Solution

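$$P(\text{cancer} \mid \text{pos}) = \frac{p(\text{pos} \mid \text{cancer})\,P(\text{cancer})}{p(\text{pos} \mid \text{cancer})\,P(\text{cancer}) + p(\text{pos} \mid \text{healthy})\,P(\text{healthy})} = \frac{0.8 \times 0.01}{0.8 \times 0.01 + 0.1 \times 0.99} = \frac{0.008}{0.107} \approx 0.075 = 7.5\%$$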

An Example Problem

• Suppose the chance of a randomly chosen person being guilty is 0.001.

• When a person is guilty, a DNA sample will match that individual 99% of the time.

• 0.0001 of the time, a DNA sample will exhibit a false match for an innocent individual.

• If a DNA test demonstrates a match, what is the probability of guilt?


Solution

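$$P(\text{guilt} \mid \text{pos}) = \frac{p(\text{pos} \mid \text{guilt})\,P(\text{guilt})}{p(\text{pos} \mid \text{guilt})\,P(\text{guilt}) + p(\text{pos} \mid \text{innocent})\,P(\text{innocent})} = \frac{0.99 \times 0.001}{0.99 \times 0.001 + 0.0001 \times 0.999} = \frac{0.00099}{0.00099 + 0.0000999} \approx 0.908$$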
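The same answer, roughly 0.91, can be reached through the odds form of Bayes rule, which is an equivalent rearrangement rather than the form shown on the slide: posterior odds equal prior odds times the ratio of the two match probabilities. The function and parameter names below are illustrative.

```python
def posterior_from_odds(prior, p_match_given_guilty, p_match_given_innocent):
    """Two-class Bayes rule written as: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1 - prior)
    likelihood_ratio = p_match_given_guilty / p_match_given_innocent
    posterior_odds = prior_odds * likelihood_ratio
    # Convert odds back to a probability.
    return posterior_odds / (1 + posterior_odds)


p_guilt = posterior_from_odds(0.001, 0.99, 0.0001)
print(round(p_guilt, 2))  # 0.91
```

Writing the rule this way isolates the prior in a single factor, which makes it easy to see how strongly the conclusion depends on it.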


Marginal Distributions

[Figure: marginal class-conditional distributions, P(x_1 | apple) and P(x_1 | pear) for the first feature, P(x_2 | apple) and P(x_2 | pear) for the second]

Combining Marginals

• Assuming independent features:

$$P(\mathbf{x} \mid \omega_j) = P(x_1 \mid \omega_j)\,P(x_2 \mid \omega_j) \cdots P(x_d \mid \omega_j)$$

• If we assume independence and use Bayes rule, we have a Naïve Bayes decision maker (classifier).
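A Naïve Bayes classifier multiplies the per-feature marginals for each class, weights the product by the prior, and normalizes. The sketch below uses two discrete features with made-up probabilities and equal priors; every number and name in it is hypothetical.

```python
from math import prod

# Hypothetical marginals P(feature value | class), one entry per feature.
marginals = {
    "apple": {"color=red": 0.6, "diameter=3in": 0.5},
    "pear": {"color=red": 0.1, "diameter=3in": 0.3},
}
priors = {"apple": 0.5, "pear": 0.5}  # hypothetical equal priors


def naive_bayes_posteriors(observed):
    """Posterior for each class, assuming the features are independent given the class."""
    joint = {c: priors[c] * prod(marginals[c][f] for f in observed) for c in priors}
    total = sum(joint.values())
    return {c: j / total for c, j in joint.items()}


post = naive_bayes_posteriors(["color=red", "diameter=3in"])
prediction = max(post, key=post.get)
print(prediction, round(post[prediction], 3))  # apple 0.909
```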


Bayes Decision Rule

• Provably optimum when the features (evidence) follow Gaussian distributions, and are independent.

Predict class $\omega_i$ such that $P(\omega_i \mid \mathbf{x}) \ge P(\omega_j \mid \mathbf{x})$ for all $j$.

Forensic DNA

• Classes: DNA from defendant, DNA not from defendant

• Evidence: Allele matches at various loci
  - Assumption of independence
• Prior Probabilities?
  - Assumed equal (0.5)
  - What is the true prior probability that an evidence sample came from a particular individual?

The Importance of Priors

P(x | ω)    P(ω)     P(x | ω) P(ω), proportional to P(ω | x)
0.1         0.5      0.05
0.1         0.25     0.025
0.1         0.1      0.01
0.1         0.01     0.001
0.1         0.001    0.0001
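The table can be regenerated by holding the class-conditional probability at 0.1 and shrinking the prior. This sketch computes the unnormalized product of likelihood and prior for each row; `unnormalized_posterior` is an illustrative name.

```python
def unnormalized_posterior(likelihood: float, prior: float) -> float:
    """Numerator of Bayes rule: class-conditional probability times prior."""
    return likelihood * prior


likelihood = 0.1
rows = [
    (prior, round(unnormalized_posterior(likelihood, prior), 6))
    for prior in (0.5, 0.25, 0.1, 0.01, 0.001)
]
for prior, score in rows:
    print(likelihood, prior, score)
```

With identical evidence, the score falls in direct proportion to the prior, which is why assuming an equal prior of 0.5 is such a consequential choice.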


Likelihood Ratios

• When deciding between two possibilities, we don’t need the exact probabilities. We only need to know which one is greater.

• The denominator for all the classes is always equal.
  - Can be eliminated
  - Useful when there are many possible classes

Likelihood Ratio Example

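$$P(\text{apple} \mid d = 4'') = \frac{p(d = 4'' \mid \text{apple})\,P(\text{apple})}{p(d = 4'' \mid \text{apple})\,P(\text{apple}) + p(d = 4'' \mid \text{pear})\,P(\text{pear})}$$

$$P(\text{pear} \mid d = 4'') = \frac{p(d = 4'' \mid \text{pear})\,P(\text{pear})}{p(d = 4'' \mid \text{apple})\,P(\text{apple}) + p(d = 4'' \mid \text{pear})\,P(\text{pear})}$$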

Likelihood Ratio Example

$$\frac{p(d = 4'' \mid \text{pear})\,P(\text{pear})}{p(d = 4'' \mid \text{apple})\,P(\text{apple})}$$
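Because both posteriors share the same denominator, comparing the two numerators is enough to decide. The sketch below forms the ratio of the pear numerator to the apple numerator using the values from the fruit example and picks pear when the ratio exceeds 1; `decide_by_ratio` is an illustrative name.

```python
def decide_by_ratio(likelihoods, priors, class_a, class_b):
    """Compare likelihood * prior for two classes without computing the denominator."""
    score_a = likelihoods[class_a] * priors[class_a]
    score_b = likelihoods[class_b] * priors[class_b]
    ratio = score_a / score_b
    return (class_a if ratio > 1 else class_b), ratio


winner, ratio = decide_by_ratio(
    likelihoods={"apple": 0.4, "pear": 0.05},
    priors={"apple": 0.1, "pear": 0.9},
    class_a="pear",
    class_b="apple",
)
print(winner, round(ratio, 3))  # pear 1.125
```

The ratio 0.045 / 0.04 = 1.125 agrees with the full calculation, where pear came out at about 0.53 and apple at about 0.47.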


From alleles to identity:

• It is relatively easy to find the allele frequencies in the population
  - Marginal probability distributions
• Independence assumption
  - Class conditional probabilities
• Equal prior probabilities
  - Bayesian posterior probability estimate

Thank you.

A Key Advantage

• The oldest citation:

T. Bayes. “An essay towards solving a problem in the doctrine of chances.” Phil. Trans. Roy. Soc., 53, 1763.
