
Page 1: Topic Discussion:  Bayesian Analysis of Informative Hypotheses

Topic Discussion: Bayesian Analysis of Informative Hypotheses

Quantitative Methods Forum

20 January 2014

Page 2: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Outline

Articles:

1. A gentle introduction to Bayesian analysis: Applications to developmental research

2. Moving beyond traditional null hypothesis testing: Evaluating expectations directly

3. A prior predictive loss function for the evaluation of inequality constrained hypotheses

Note: The author, Rens van de Schoot, was awarded the APA Division 5 dissertation award in 2013.

Page 3: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


A gentle introduction to Bayesian analysis: Applications to developmental research.

Page 4: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Probability

• Frequentist Paradigm
  – R. A. Fisher, Jerzy Neyman, Egon Pearson
  – Long-run frequency

• Subjective Probability Paradigm
  – Bayes' theorem
  – Probability as the subjective experience of uncertainty

Page 5: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Page 6: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Ingredients of Bayesian Statistics

• Prior distribution
  – Encompasses background knowledge on the parameters tested
  – Parameters of the prior distribution are called hyperparameters

• Likelihood function
  – Information in the data

• Posterior inference
  – Combination of prior and likelihood via Bayes' theorem
  – Reflects updated knowledge, balancing prior and observed information
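To make the three ingredients concrete, here is a minimal Python sketch using a conjugate beta-binomial model; the hyperparameters and data are invented for illustration and are not from the article.

from scipy import stats

# Prior: Beta(a, b) on a proportion theta; a and b are the hyperparameters.
a, b = 2.0, 2.0                       # mild prior belief centred on 0.5 (invented)

# Likelihood: binomial data, 13 successes in 20 trials (invented).
successes, n = 13, 20

# Posterior: conjugacy gives Beta(a + successes, b + failures),
# i.e., the prior updated by the information in the data.
posterior = stats.beta(a + successes, b + (n - successes))

print("Posterior mean:", posterior.mean())
print("95% credible interval:", posterior.interval(0.95))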

Page 7: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Defining Prior Knowledge

• Lack of information
  – Still important to quantify ignorance
  – Noninformative: uniform distribution
    • Akin to a Frequentist analysis

• Considerable information
  – Meta-analyses, previous studies

• Sensitivity analyses may be conducted to quantify the effect of different prior specifications.

• Priors reflect knowledge about model parameters before observing the data.

Page 8: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Effect of Priors

Horse and donkey analogy

Benefits of priors:
  – Incorporate findings from previous studies
  – Smaller Bayesian credible intervals (cf. confidence intervals)
    • Credible intervals are also known as posterior probability intervals (PPI).
    • A PPI gives the probability that a certain parameter lies within the interval.

The more prior information available, the smaller the credible intervals.

When priors are misspecified, posterior results are affected.
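The interval-shrinking effect can be illustrated with a small Python sketch (not from the article) using a conjugate normal model with known data variance; all numbers are invented.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
y = rng.normal(loc=0.3, scale=1.0, size=25)    # hypothetical data
sigma2 = 1.0                                   # known data variance (assumed)

def posterior(prior_mean, prior_var):
    # Conjugate normal-normal update for the mean: precisions add.
    n = len(y)
    post_var = 1.0 / (1.0 / prior_var + n / sigma2)
    post_mean = post_var * (prior_mean / prior_var + y.sum() / sigma2)
    return stats.norm(post_mean, np.sqrt(post_var))

for prior_var in [100.0, 1.0, 0.1]:            # vague -> informative prior
    lo, hi = posterior(0.3, prior_var).interval(0.95)
    print(f"prior var {prior_var:>5}: 95% PPI width = {hi - lo:.3f}")

Running this shows the 95% PPI narrowing as the prior becomes more informative, which is the point the slide makes.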

Page 9: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Page 10: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Empirical Example

Theory of dynamic interactionism:
• Individuals are believed to develop through a dynamic and reciprocal transaction between personality and environment.

3 studies:
• Neyer & Asendorpf (2001)
• Sturaro et al. (2010)
• Asendorpf & van Aken (2003)

Note: Neyer & Asendorpf studied young adults; Sturaro et al. and Asendorpf & van Aken studied adolescents.

Page 11: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Analytic Strategy

• Prior Specification
  – Used frequentist estimates from one study to inform the priors for the next

• Assessment of convergence
  – Gelman-Rubin criterion and other 'tuning' variables:
    • Cutoff value
    • Minimum number of iterations
    • Start values
    • Examination of trace plots

• Model fit assessed with posterior predictive checking
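As an illustration of the convergence check, here is a minimal sketch of the Gelman-Rubin statistic (R-hat); the chains below are simulated placeholders rather than output from the reported models.

import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for chains of shape (m, n)."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)            # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()      # within-chain variance
    var_hat = (n - 1) / n * W + B / n          # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(7)
chains = rng.normal(size=(4, 1000))            # four well-mixed toy chains
print("R-hat:", gelman_rubin(chains))          # values near 1.0 indicate convergence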

Page 12: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Results 1

Page 13: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Results 2

Page 14: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Observations

• Point estimates do not differ between Frequentist and Bayesian approaches.

• Credible intervals are smaller than confidence intervals.

• Using prior knowledge in the analyses led to more certainty about outcomes of the analyses; i.e., more confidence (precision) in conclusions.

Page 15: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Theoretical Advantages of Bayesian Approach

• Interpretation
  – More intuitive because the focus is on predictive accuracy
  – The Bayesian framework eliminates contradictions in traditional NHST

• Offers a more direct expression of uncertainty

• Updating knowledge
  – Incorporate prior information into estimates instead of conducting NHST repeatedly.

NHST = Null Hypothesis Significance Testing

Page 16: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Practical Advantages of Bayesian Approach

• Smaller sample sizes required for Bayesian estimation compared to Frequentist approaches.

• With a small sample size, Bayesian methods produce a more gradually increasing confidence in the coefficients than Frequentist approaches, guarding against overstated precision.

• Bayesian methods can handle parameters with non-normal distributions better than Frequentist approaches.

• Protection against overinterpreting unlikely results.

• Elimination of inadmissible parameter estimates (e.g., negative variances).

Page 17: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Limitations

• Influence of prior specification.

• Prior distribution specification.
  – Assumption that every parameter has a distribution.

• Computational time

DIAGNOSTICS?

Page 18: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Comments?

Page 19: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Moving beyond traditional null hypothesis testing: Evaluating expectations directly.

Page 20: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


What is “wrong” with the traditional H0?

Example: Shape of the earth

H0: The shape of the earth is a flat disk

H1: The shape of the earth is not a flat disk

Evidence is gathered against H0.

Conclusion: the earth is not a flat disk. Rejecting H0, however, does not tell us what shape the earth is (e.g., that it is a sphere).

→ Modification of testable hypotheses.

Page 21: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


HA: The shape of the earth is a flat disk

HB: The shape of the earth is a sphere

HA is no longer the complement of HB.

Instead, HA and HB are competing models regarding the shape of the earth.

Testing of such competing hypotheses will result in a more informative conclusion.

Page 22: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


What does this example teach us?

The evaluation of informative hypotheses presupposes that prior information is available.

Prior knowledge is available in the form of specific expectations of the ordering of statistical parameters.

Example: Mean comparisons

HI1: μ3 < μ1 < μ5 < μ2 < μ4

HI2: μ3 < {μ1, μ5, μ2} < μ4, where "," denotes no ordering

versus the traditional setup:

H0: μ1 = μ2 = μ3 = μ4 = μ5

Hu: μ1, μ2, μ3, μ4, μ5
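For concreteness, here is a hypothetical Python sketch of how support for such ordering constraints can be read off posterior draws; the draws below are simulated stand-ins, not estimates from real data.

import numpy as np

rng = np.random.default_rng(3)

# Stand-in posterior draws for five group means (in practice these would
# come from an MCMC sampler fitted to the actual data).
draws = rng.normal(loc=[1.0, 3.0, 0.0, 4.0, 2.0], scale=0.5,
                   size=(10_000, 5))
m1, m2, m3, m4, m5 = draws.T

# Proportion of draws satisfying HI1: mu3 < mu1 < mu5 < mu2 < mu4.
fit_HI1 = np.mean((m3 < m1) & (m1 < m5) & (m5 < m2) & (m2 < m4))

# HI2: mu3 < {mu1, mu5, mu2} < mu4, with no ordering among the middle three.
middle = draws[:, [0, 4, 1]]                  # columns mu1, mu5, mu2
fit_HI2 = np.mean((m3[:, None] < middle).all(axis=1)
                  & (middle < m4[:, None]).all(axis=1))

print(f"Proportion of draws satisfying HI1: {fit_HI1:.3f}")
print(f"Proportion of draws satisfying HI2: {fit_HI2:.3f}")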

Page 23: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Evaluating Informative Hypotheses

• Hypothesis Testing Approaches
  – F-bar test for ANOVA (Silvapulle et al., 2002; Silvapulle & Sen, 2004)
  – Constraints on variance terms in SEM (Stoel et al., 2006; Gonzalez & Griffin, 2001)

• Model Selection Approaches
  – Evaluate competing models for model fit and model complexity (a toy AIC/BIC computation appears after this list).
    • Akaike Information Criterion (AIC; Akaike, 1973)
    • Bayesian Information Criterion (BIC; Schwarz, 1978)
    • Deviance Information Criterion (DIC; Spiegelhalter et al., 2002)
  – These cannot deal with inequality constraints. Criteria that can:
    • Paired Comparison Information Criterion (PCIC; Dayton, 1998, 2003)
    • Order-Restricted Information Criterion (ORIC; Anraku, 1999; Kuiper et al., in press)
    • Bayes factor
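As a toy illustration of comparing competing models by fit and complexity, here is a hypothetical sketch computing AIC and BIC for two unconstrained candidate models on invented data; as the slide notes, neither criterion can encode the inequality constraint itself.

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
y = np.concatenate([rng.normal(0.0, 1.0, 50), rng.normal(0.6, 1.0, 50)])
group = np.repeat([0, 1], 50)

def gaussian_loglik(y, mu, sigma):
    return stats.norm(mu, sigma).logpdf(y).sum()

# Model A: one common mean (2 free parameters: mu, sigma).
llA = gaussian_loglik(y, y.mean(), y.std())

# Model B: separate group means (3 free parameters: mu1, mu2, sigma).
resid = y - np.where(group == 0, y[group == 0].mean(), y[group == 1].mean())
llB = gaussian_loglik(resid, 0.0, resid.std())

n = len(y)
for name, ll, k in [("common mean", llA, 2), ("two means", llB, 3)]:
    aic = 2 * k - 2 * ll                      # AIC = 2k - 2 log L-hat
    bic = k * np.log(n) - 2 * ll              # BIC = k log n - 2 log L-hat
    print(f"{name}: AIC = {aic:.1f}, BIC = {bic:.1f}")   # smaller is better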

Page 24: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Comments?

Page 25: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


A prior predictive loss function for the evaluation of inequality constrained hypotheses.

Page 26: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Inequality Constrained Hypotheses

Example 1

General linear model with two group means:

yi = μ1 di1 + μ2 di2 + εi,   εi ~ N(0, σ²)

dig takes the value 0 or 1 to indicate group membership.

H0: μ1, μ2 (unconstrained hypothesis)

H1: μ1 < μ2 (inequality constraint imposed)
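A hypothetical simulation of this setup (invented parameter values; a flat prior and known σ are assumed so that the posterior has closed form):

import numpy as np

rng = np.random.default_rng(11)

# Generate data from y_i = mu1*d_i1 + mu2*d_i2 + eps_i, eps_i ~ N(0, sigma^2).
mu1, mu2, sigma = 1.0, 1.5, 1.0
n_per_group = 40
d1 = np.repeat([1, 0], n_per_group)           # group-1 indicator
d2 = 1 - d1                                   # group-2 indicator
y = mu1 * d1 + mu2 * d2 + rng.normal(0, sigma, 2 * n_per_group)

# With a flat prior and known sigma, each group mean has a normal posterior
# centred on the sample mean; draw from it and check H1: mu1 < mu2.
post1 = rng.normal(y[d1 == 1].mean(), sigma / np.sqrt(n_per_group), 10_000)
post2 = rng.normal(y[d2 == 1].mean(), sigma / np.sqrt(n_per_group), 10_000)
print("P(mu1 < mu2 | y) ≈", np.mean(post1 < post2))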

Page 27: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Deviance Information Criterion

y: data
θ: unknown parameter
p(·): likelihood
C: constant

D(θ) = −2 log p(y|θ) + C

Taking the expectation:

D̄ = E[D(θ)]

A measure of how well the model fits the data.

Effective number of parameters (θ̄ is the expectation of θ):

pD = D̄ − D(θ̄)

DIC = D(θ̄) + 2pD
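To make these quantities concrete, here is a minimal sketch that computes D̄, pD, and DIC from posterior draws for a normal mean with known variance; the data are invented, and a real analysis would use MCMC output.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
y = rng.normal(0.4, 1.0, size=30)                 # hypothetical data
sigma = 1.0                                       # known, for simplicity

# Posterior draws for the mean under a flat prior (normal posterior).
theta = rng.normal(y.mean(), sigma / np.sqrt(len(y)), size=20_000)

def deviance(th):
    # D(theta) = -2 log p(y | theta), with the constant C omitted.
    return -2.0 * stats.norm(th, sigma).logpdf(y[:, None]).sum(axis=0)

D_bar = deviance(theta).mean()                    # posterior mean deviance
D_at_mean = deviance(np.array([theta.mean()]))[0] # deviance at theta-bar
p_D = D_bar - D_at_mean                           # effective no. of parameters
DIC = D_at_mean + 2.0 * p_D
print(f"pD = {p_D:.2f}, DIC = {DIC:.2f}")         # smaller is better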

Page 28: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


More on DIC

DIC = D(θ̄) + 2pD = model fit + penalty for model complexity

• Smaller is better.
• Only valid when the posterior distribution approximates a multivariate normal.
• Assumes that the specified parametric family of pdfs that generates future observations encompasses the true model. (Can be violated.)
• The data y are used both to construct the posterior distribution AND to evaluate the estimated model → DIC selects overfitting models.
• Solution: the Bayesian predictive information criterion.

Page 29: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Bayesian Predictive Information Criterion

• Developed by Ando (2007) to avoid overfitting problems associated with DIC.

… looks like the posterior DIC presented in van de Schoot et al. (2012).

BPIC = −2 E_θ[log p(y|θ)] + 2pD

Page 30: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Posterior (Predictive?) DIC

Are these the same?

Note:

BPIC = −2 E_θ[log p(y|θ)] + 2pD

−2 log f(y|θ̄_y) = D(θ̄)

2pD = 2[−2 E_θ{log f(y|θ)} + 2 log f(y|θ̄_y)]

pD = D̄ − D(θ̄)

postDIC = E_g(y) E_f(θ|y)[−2 log f(x|θ)], where g is the data-generating distribution and x denotes future data

Page 31: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Performance of postDIC

H0: μ1 , μ2

H1: μ1 < μ2

H2: μ1 = μ2

Data were generated to be consistent in direction with H1 (cases 1 to 4) or reversed (cases 5 to 7).

postDIC does not distinguish H0 from H1.

Recall: Smaller is better.

Page 32: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Prior DIC

• Specification of the prior distribution has more importance for prDIC than postDIC.

What’s the intuitive difference between a posterior predictive vs. prior predictive approach?

prDIC = E_h E_f[−2 log f(x|θ)]

(the same predictive loss as postDIC, but with the expectations based on the prior h rather than the posterior)

−2 log f(y|θ) + C

E_h(θ)[−2 log f(y|θ)]

Page 33: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Performance of prDIC

H0: μ1, μ2   H1: μ1 < μ2   H2: μ1 = μ2

Data were generated to be consistent in direction with H1 (cases 1 to 4) or reversed (cases 5 to 7).

prDIC distinguishes H0 from H1 when the data are in agreement with H1, but it chooses a badly fitting model for cases 5 to 7.

Recall: Smaller is better.

Page 34: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Prior (Predictive?) Information Criterion

Omits −2 log f(y|θ) + C from prDIC.

The new loss function, E_h(θ)[−2 log f(y|θ)], accounts for agreement between h(θ) and y.

It quantifies how well replicated data x fit a certain hypothesis, and how well the hypothesis fits the data y.

Page 35: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Performance of PIC

H0: μ1, μ2   H1: μ1 < μ2   H2: μ1 = μ2

Data were generated to be consistent in direction with H1 (cases 1 to 4) or reversed (cases 5 to 7).

PIC selects the hypothesis that is most consistent with the data, outperforming postDIC and prDIC. (?)

Recall: Smaller is better.

Page 36: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Paper Conclusions

• The (posterior?) DIC performs poorly when evaluating inequality constrained hypotheses.

• The prior DIC can be useful for model selection when the population from which the data are generated is in agreement with the constrained hypotheses.

• The PIC, which is related to the marginal likelihood, is better able to select the best of a set of inequality constrained hypotheses.

Page 37: Topic Discussion:  Bayesian Analysis of Informative Hypotheses


Comments?