
Page 1: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]


Francisco José Vázquez Polo [www.personales.ulpgc.es/fjvpolo.dmc]

Miguel Ángel Negrín Hernández [www.personales.ulpgc.es/mnegrin.dmc]

{fjvpolo or mnegrin}@dmc.ulpgc.es

Course on Bayesian Methods in Environmental Valuation

Basics (continued):

Models for proportions and means

Page 2: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Contents

1. Introduction to Bayesian Analysis

2. Bayesian inference. Conjugate priors
   2.1 Analysis of proportions
   2.2 Analysis of count data

3. Software: WinBUGS

Course on Bayesian Methods in Environmental Valuation

Page 3: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Thomas Bayes (1702 - 1761)

Introduction to Bayesian Analysis

Page 4: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

His theory of probability was set out in the paper “An Essay towards Solving a Problem in the Doctrine of Chances” (Philosophical Transactions of the Royal Society of London, 1763), published posthumously. The paper was communicated by Richard Price, a friend of Bayes.

This paper introduced the concept of inverse probability. Consider a set of hypotheses $H_1, \ldots, H_k$ with prior probabilities $P(H_i)$, $i = 1, \ldots, k$, where $\sum_i P(H_i) = 1$, and the likelihood of the data $A$ under each hypothesis, $P(A \mid H_i)$, $i = 1, \ldots, k$.

Introduction to Bayesian Analysis

Page 5: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Bayes’ Theorem:

$$P(H_i \mid A) = \frac{P(A \mid H_i)\, P(H_i)}{\sum_j P(A \mid H_j)\, P(H_j)}$$

The posterior probability of Hi given A is proportional to the product of the prior probability of Hi and the likelihood of A when Hi is true.

Introduction to Bayesian Analysis

Page 6: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Introduction to Bayesian Analysis

Page 7: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

What does probability mean?

The frequency definition of the probability of an event:

The probability of an event is the proportion of the time it would occur in a long sequence of observations (i.e. as the number of trials tends to infinity).

Example: when we say that the probability of getting a head on a toss of a fair coin is 0.5, we mean that we would expect to get a head half the time if we flipped the coin a huge number of times under exactly the same conditions.

Requires a sequence of repeatable experiments.

No frequency interpretation is possible for probabilities of many kinds of events.
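A quick simulation illustrating the long-run-frequency idea (our own sketch, not part of the slides): the observed proportion of heads settles towards 0.5 as the number of tosses grows.

```python
# Simulate fair-coin tosses; the proportion of heads approaches 0.5.
import random

random.seed(1)
for n in (100, 10_000, 1_000_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(f"{n:>9} tosses: proportion of heads = {heads / n:.4f}")
```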

Page 8: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Probability as degree of belief

The subjective definition of probability is

The probability of an event is a number between 0 and 1 that measures a particular person’s subjective opinion as to how likely that event is to occur (or to have occurred).

Applies whenever the person in question has an opinion about the event

-If we count ignorance as an opinion.

Different people may have different subjective probabilities regarding the same event.

The same person’s subjective probability may change as more information comes in.

Page 9: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Properties of probabilities

These properties apply to probability whichever definition is being used.

-Probabilities must not be negative. If A is any event, then

P(A) ≥ 0

-All possible outcomes together must have probability 1.

If S is the sample space in a probability model then

P(S) = 1

Page 10: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Example: Do you have a rare disease?

Suppose your friend is diagnosed with a rare disease that has no obvious symptoms.

You wish to determine how likely it is that you, too, have the disease.

That is, you are uncertain about your true disease status.

Your friend’s doctor has told him that the proportion of people in the general population who have the disease is 0.001. The disease is not contagious.

A blood test exists to detect the disease, but it sometimes gives incorrect results: it returns the wrong answer with probability 0.05, whether or not you have the disease.

Page 11: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Prior distribution

Before we see any data, we have some idea about what values the parameters might take

Experts, experience, previous studies, and so on.

Example: when modelling human heights, we know before seeing any data that very few people are 3 m tall.

The prior describes our subjective uncertainty about the parameters before we see the data

Page 12: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Prior Terminology

Uninformative prior

-Uniform, as wide as possible

-Sometimes called flat priors

-Problem: often difficult to define

Informative Prior

-Not uniform

-Assume we have some prior knowledge

Conjugate Prior

-Prior and posterior belong to the same family of distributions

-Often makes the maths easier (a short sketch follows below)
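As a quick sketch of why conjugacy is convenient (a standard textbook example, not taken from these slides): a Beta prior for a proportion θ combined with a binomial likelihood gives a Beta posterior, so updating reduces to adding the observed successes and failures to the prior parameters.

$$\theta \sim \mathrm{Beta}(a, b), \qquad x \mid \theta \sim \mathrm{Binomial}(n, \theta)$$

$$p(\theta \mid x) \propto \theta^{x}(1-\theta)^{n-x} \cdot \theta^{a-1}(1-\theta)^{b-1} \;\Rightarrow\; \theta \mid x \sim \mathrm{Beta}(a + x,\; b + n - x)$$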

Page 13: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Noninformative or reference priors

Useful when we want inference to be unaffected by information apart from the current data.

In many scientific contexts, we would not bother to carry out an experiment unless we thought it was going to increase our knowledge significantly

- i.e. we expect and want the likelihood to dominate the prior

Page 14: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Informative priors

Elicitation is the process of extracting expert knowledge about some unknown quantity of interest, or the probability of some future event, which can then be used to supplement any numerical data that we may have.

If the expert in question does not have a statistical background, as is often the case, translating their beliefs into a statistical form suitable for use in our analyses can be a challenging task.

Prior elicitation is an important yet under-researched component of Bayesian statistics.

Page 15: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Example (continuation)

Two possible events:

1.You have the disease

2.You don’t have the disease

Before taking any blood test, you think your chance of having the disease is similar to that of a randomly selected person in the population. So you assign the following prior probabilities to the two events:

Prob (Have disease) = 0.001

Prob(Don’t have disease) = 0.999

Page 16: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Data

You decide to take the blood test.

- The new information that you obtain to learn about the different models is called data.

- The different possible data results are called observations.

- The data in this example is the result of the blood test.

The two possible observations are:

- A “positive” blood test (+) suggests you have the disease.

- A “negative” blood test (-) suggests you don’t have the disease.

Page 17: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Likelihood

The probabilities of the two possible test results are different depending on whether you have the disease or not.

These probabilities are called likelihoods – the probabilities of the different data outcomes conditional on each possible model.

P(+ | have disease) = 0.95
P(+ | don’t have disease) = 0.05
P(- | have disease) = 0.05
P(- | don’t have disease) = 0.95

Page 18: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Bayesian Inference

As P(X) is a constant, all we need to estimate P(θ | X) are P(θ) and P(X | θ).

Bayes’ rule becomes:

$$P(\theta \mid X) \propto P(\theta)\, P(X \mid \theta)$$

P(θ | X) is called the posterior distribution

Product of the prior and the likelihood

We can ignore the constant of proportionality

Page 19: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Posterior distribution

The posterior distribution contains all the current information about the unknown parameter

All Bayesian inference is based on the posterior distribution:

-Estimation: estimating values of unknown parameters that can never be observed or known

-Testing

-Prediction: estimating the values of potentially observable but currently unobserved quantities.

Page 20: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Using Bayes’ rule to update probabilities

Bayes’ rule is the formula for updating your probabilities about the models given the data.

Enables you to compute posterior probabilities given the observed data

Bayes’ rule:

P(event | data) ∝ P(event) × P(data | event)

Posterior ∝ prior × likelihood

Page 21: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Bayes’ rule applied to the example

You take the blood test and the result is positive (+). This is the data or observation.

$$P(\text{have disease} \mid +) = \frac{P(+ \mid \text{have disease})\, P(\text{have disease})}{P(+ \mid \text{have disease})\, P(\text{have disease}) + P(+ \mid \text{don't have disease})\, P(\text{don't have disease})}$$

$$= \frac{0.95 \times 0.001}{0.95 \times 0.001 + 0.05 \times 0.999} \approx 0.019$$

P(have disease | +) = 0.019

P(don’t have disease | +) = 0.981
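A minimal sketch of this calculation in Python (the helper name `update` is ours, not from the course; the probabilities are the slide’s):

```python
def update(prior: float, p_data_given_h: float, p_data_given_not_h: float) -> float:
    """One application of Bayes' rule for a binary hypothesis."""
    numerator = p_data_given_h * prior
    return numerator / (numerator + p_data_given_not_h * (1.0 - prior))

# Positive test: P(+ | disease) = 0.95, P(+ | no disease) = 0.05.
p = update(prior=0.001, p_data_given_h=0.95, p_data_given_not_h=0.05)
print(f"P(have disease | +) = {p:.3f}")  # 0.019
```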

Page 22: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Learning

The Bayesian approach is often talked about as a learning process

As we get more data, we add them to our store of information by multiplying the new likelihood by our current posterior distribution, which serves as the prior for the next update.

It has been argued that this can form the basis of a philosophy of science

Page 23: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

What have you learned from the blood test?

The probability of your having the disease has increased by a factor of 19.

But the actual probability is still small.

You decide to obtain more information by taking the blood test again.

Page 24: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Updating the probabilities again

We assume that the blood tests are independent.

The posterior probabilities after the first test will become your prior probabilities with respect to the second test.

Suppose that the second test is also positive.

The new posterior probabilities are:

P(have disease | +,+) = 0.269
P(don’t have disease | +,+) = 0.731

Page 25: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

What if the second test had been negative?

Suppose that the second test is negative.

The new posterior probabilities are:

P(have disease | +,-) = ?
P(don’t have disease | +,-) = ?

P(have disease | +,-) = 0.001
P(don’t have disease | +,-) = 0.999
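Continuing the sketch from page 21 (same hypothetical `update` helper), sequential updating reproduces both cases: each posterior becomes the prior for the next test.

```python
p1 = update(0.001, 0.95, 0.05)                 # after the first positive test, ~0.019
print(f"+,+ : {update(p1, 0.95, 0.05):.3f}")   # ~0.265 (the slides' 0.269 uses the rounded prior 0.019)
print(f"+,- : {update(p1, 0.05, 0.95):.3f}")   # 0.001: a negative test exactly undoes the positive one
```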

Page 26: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

References

Zellner, A. (1971). An Introduction to Bayesian Inference in Econometrics. John Wiley & Sons.

Chen, M., Shao, Q. and Ibrahim, J. (2000). Monte Carlo Methods in Bayesian Computation. Springer-Verlag, New York.

Leonard, T. and Hsu, J.S. (1999). Bayesian Methods: An Analysis for Statisticians and Interdisciplinary Researchers. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press.

O’Hagan, A. (1994). Bayesian Inference. Kendall’s Advanced Theory of Statistics (vol. 2B). Edward Arnold.

O’Hagan, A. (2003). A Primer on Bayesian Statistics in Health Economics and Outcomes Research. Centre for Bayesian Statistics in Health Economics.

Lee, P. (1993). Bayesian Statistics: An Introduction. Oxford University Press, Oxford, UK.

Introduction to Bayesian Analysis.

Page 27: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Estimation

•Point estimates (mean, mode, median)

•Measures of spread

•Bayesian intervals

Bayesian Inference

Page 28: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

The posterior variance

The posterior variance is one summary of the spread of the posterior distribution

The larger the posterior variance, the more uncertainty we still have about the parameter
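For instance, continuing the Beta-binomial sketch from earlier (standard results, not from the slides), the posterior mean and variance have closed forms, and the variance shrinks as the sample size n grows:

$$\mathrm{E}[\theta \mid x] = \frac{a + x}{a + b + n}, \qquad \mathrm{Var}[\theta \mid x] = \frac{(a + x)(b + n - x)}{(a + b + n)^2\,(a + b + n + 1)}$$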

Bayesian Inference

Page 29: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Precisely what information does a p-value provide?

Recall the definition of a p-value: the probability of observing a test statistic as extreme as or more extreme than the observed value, assuming that the null hypothesis is true.

What is the correct way to interpret a confidence interval?

Does a 95% confidence interval provide a 95% probability region for the true parameter value? If not, what is the correct interpretation?

“A range of values, which is likely, with a specified degree of certainty, to contain the true population value of a variable drawn from the study sample”

Bayesian Inference

Page 30: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Frequentist approach

Parameters are considered "fixed but unknown"

We cannot assign a probability distribution to them.

Bayesian approach

Parameters are considered random and unknown

They are random because they are unknown

Bayesian Inference

Page 31: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Bayesian intervals

Called “posterior intervals” or “credible sets”

Recall that the posterior distribution represents our updated subjective probability distribution for the unknown parameter.

Thus, for us, the interpretation of the 95% credible set is that the probability is 0.95 that the true θ is in the interval.

Contrast this with the interpretation of a frequentist confidence interval.
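A minimal sketch of an equal-tailed 95% credible interval, assuming the Beta posterior from the conjugacy sketch earlier (SciPy is assumed available; the counts are illustrative, not from the slides):

```python
from scipy.stats import beta

# Beta(1, 1) flat prior, x = 7 successes in n = 20 trials -> Beta(8, 14) posterior.
a_post, b_post = 1 + 7, 1 + 20 - 7
lo, hi = beta.ppf([0.025, 0.975], a_post, b_post)
print(f"95% credible interval for theta: ({lo:.3f}, {hi:.3f})")
```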

Bayesian Inference

Page 32: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Frequentist approach

H0 vs. H1 (two hypotheses)

α: Type I error (probability of rejecting the hypothesis when the hypothesis is true)

1%, 5% or 10%

β: Type II error (probability of accepting the hypothesis when the hypothesis is false)

p-value

(accept if p-value > 0.05; reject if p-value < 0.05)

Hypothesis testing

Page 33: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Hypothesis testing

Bayesian approach

H0, H1, H2, etc. (several hypotheses)

We can estimate the probability of each event from the posterior distribution of the parameters, f(θ|X).

Prob(H0) = Prob(θ ≤ θ0)
Prob(H1) = Prob(θ > θ0)

$$H_0: \theta \le \theta_0 \qquad \text{vs.} \qquad H_1: \theta > \theta_0$$
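A minimal sketch, reusing the illustrative Beta(8, 14) posterior from the credible-interval example (θ0 = 0.5 is an arbitrary threshold, not from the slides):

```python
from scipy.stats import beta

a_post, b_post = 8, 14
theta0 = 0.5
p_h0 = beta.cdf(theta0, a_post, b_post)  # Prob(theta <= theta0 | data)
print(f"Prob(H0) = {p_h0:.3f}, Prob(H1) = {1 - p_h0:.3f}")
```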

Page 34: Francisco José Vázquez Polo  [personales.ulpgc.es/fjvpolo.dmc]

Prediction

In many situations, interest focuses on predicting values of a future sample from the same population.

-i.e. on estimating values of potentially observable but not yet observed quantities

Example: we can be interested in the result of the next blood test.

The posterior predictive distribution is defined as

$$p(x^* \mid x) = \int p(x^* \mid \theta)\, p(\theta \mid x)\, d\theta$$
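For the binary disease example the integral reduces to a sum over the two disease states; a minimal sketch with the slides’ numbers (variable names are ours):

```python
# P(next test + | data) = P(+ | disease) P(disease | data)
#                       + P(+ | no disease) P(no disease | data)
p_disease = 0.269  # posterior after two positive tests (slide value)
p_next_positive = 0.95 * p_disease + 0.05 * (1 - p_disease)
print(f"P(next test positive | +,+) = {p_next_positive:.3f}")  # 0.292
```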