Spotting pseudoreplication

1. Inspect spatial (temporal) layout of the experiment

2. Examine degrees of freedom in analysis

Degrees of freedom (df)

= number of independent terms used to estimate the parameter

= total number of data points - number of parameters estimated from the data

Example: Variance

If we have 3 data points with a mean value of 10, what’s the df for the variance estimate?

Independent term method:

Can the first data point be any number? Yes, say 8

Can the second data point be any number? Yes, say 12

Can the third data point be any number? No, as the mean is fixed! (It must be 10.)

Variance is $\sum (y - \bar{y})^2 / (n - 1)$

Therefore 2 independent terms (df = 2)

Subtraction method:

Total number of data points? 3

Number of estimates from the data? 1 (the mean)

df = 3 - 1 = 2
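The two counting methods in the variance example can be checked with a few lines of Python. This is only a sketch: the function names `forced_last_point` and `df_subtraction` are made up for illustration and do not come from the slides.

```python
def forced_last_point(free_points, mean, n):
    """Value the final data point must take once the mean is fixed."""
    # The n points must sum to n * mean, so the last one is not free.
    return n * mean - sum(free_points)


def df_subtraction(n_points, n_estimates):
    """df = number of data points minus parameters estimated from them."""
    return n_points - n_estimates


# Independent term method: 8 and 12 are chosen freely, the third is forced.
free_points = [8, 12]
third = forced_last_point(free_points, mean=10, n=3)
df_independent = len(free_points)

# Subtraction method: 3 data points, 1 estimate (the mean).
df_subtracted = df_subtraction(3, 1)

print(third, df_independent, df_subtracted)
```

Both methods give the same count because fixing the mean removes exactly one freely chosen value.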

Example: Linear regression

Y = mx + b

Two parameters (slope m and intercept b) are estimated simultaneously from the data

Therefore df = n - 2
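As a rough illustration of the regression case, the sketch below fits a line by ordinary least squares and counts the residual df. The data values and the helper name `fit_line` are invented for the example.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope m and intercept b for Y = mx + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

m, b = fit_line(xs, ys)

# Two parameters (m and b) were estimated from the data.
residual_df = len(xs) - 2

print(m, b, residual_df)
```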

Example: Analysis of variance (ANOVA)

A     B     C
a1    b1    c1
a2    b2    c2
a3    b3    c3
a4    b4    c4

What is n for each level? n = 4

How many df for each variance estimate? df = 3 for A, df = 3 for B, df = 3 for C

What’s the within-treatment df for an ANOVA?

Within-treatment df = 3 + 3 + 3 = 9

If an ANOVA has k levels and n data points per level, what’s a simple formula for within-treatment df?

df = k(n-1)
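The within-treatment count can be verified both ways, by summing the df of each level and by the formula k(n-1). A minimal sketch, using the slide's placeholder labels rather than real measurements:

```python
# Placeholder observations for the three levels A, B and C (labels only;
# the df depends on how many observations there are, not on their values).
levels = {
    "A": ["a1", "a2", "a3", "a4"],
    "B": ["b1", "b2", "b3", "b4"],
    "C": ["c1", "c2", "c3", "c4"],
}

# Each level estimates its own mean, so it contributes n - 1 df.
df_per_level = {name: len(obs) - 1 for name, obs in levels.items()}
within_df_summed = sum(df_per_level.values())

# Same result from the formula df = k(n - 1) for a balanced design.
k = len(levels)
n = len(levels["A"])
within_df_formula = k * (n - 1)

print(df_per_level, within_df_summed, within_df_formula)
```

The formula assumes a balanced design (the same n at every level); the summed version also works when n differs between levels.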

Spotting pseudoreplication

An experiment has 10 fertilized and 10 unfertilized plots, with 5 plants per plot.

The researcher reports df=98 for the ANOVA (within-treatment MS).

Is there pseudoreplication?

Yes! As k=2 and n=10, df = 2(10-1) = 18

What mistake did the researcher make?

Assumed n=50: 2(50-1) = 98. The researcher treated each plant as an independent replicate, but the treatment was applied to plots, so the plot is the experimental unit.
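The df check in the fertilizer example is easy to script when reviewing a paper. The sketch below is one possible way to do it; the function and variable names are assumptions made for illustration.

```python
def within_treatment_df(k, n):
    """Within-treatment df for k levels with n replicates per level."""
    return k * (n - 1)


treatments = 2            # fertilized, unfertilized
plots_per_treatment = 10  # true replicates: the treatment is applied to plots
plants_per_plot = 5       # subsamples within each plot
reported_df = 98

# Correct count: plots are the experimental units.
correct_df = within_treatment_df(treatments, plots_per_treatment)

# Pseudoreplicated count: every plant treated as an independent replicate.
pseudo_df = within_treatment_df(treatments, plots_per_treatment * plants_per_plot)

if reported_df == correct_df:
    verdict = "df consistent with the design"
elif reported_df == pseudo_df:
    verdict = "pseudoreplication: subsamples counted as replicates"
else:
    verdict = "df matches neither count; check the design"

print(correct_df, pseudo_df, verdict)
```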

Why is pseudoreplication a problem?

Hint: think about what we use df for!

How prevalent?

Hurlbert (1984): 48% of papers

Heffner et al. (1996): 12 to 14% of papers

Statistics review

Basic concepts:

• Variability measures

• Distributions

• Hypotheses

• Types of error

Common analyses

• T-tests

• One-way ANOVA

• Two-way ANOVA

• Randomized block

Variance

Ecological rule # 1: Everything varies

…but how much does it vary?

Variance

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

The numerator, $\sum (x_i - \bar{x})^2$, is the sum of squares; $\bar{x}$ is the mean.

What is the variance of 4, 3, 3, 2?

What are the units?
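The variance question can be worked through step by step and cross-checked against Python's standard `statistics.variance`, which also divides by n - 1. A short sketch:

```python
import statistics

data = [4, 3, 3, 2]

n = len(data)
mean = sum(data) / n

# Sum of squares: squared deviations from the mean, added up.
sum_of_squares = sum((x - mean) ** 2 for x in data)

# Sample variance: sum of squares divided by the degrees of freedom.
variance = sum_of_squares / (n - 1)

# Cross-check against the standard library's sample variance.
library_variance = statistics.variance(data)

print(mean, sum_of_squares, variance, library_variance)
```

Because the deviations are squared, the variance is in squared units of the original measurement (for example cm² if the data are in cm).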

Variance variants

1. Standard deviation (s, or SD)

= $\sqrt{s^2}$ (square root of the variance)

Advantage: same units as the data

2. Standard error (S.E.)

= $s / \sqrt{n}$

Advantage: indicates the precision of the estimated mean
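Both variants follow directly from the variance. The sketch below computes them for an illustrative data set (the values 4, 3, 3, 2 are reused here purely as an example).

```python
import math
import statistics

data = [4, 3, 3, 2]  # illustrative values only
n = len(data)

# Standard deviation: square root of the sample variance, in the data's units.
sd = statistics.stdev(data)

# Standard error of the mean: SD divided by the square root of n.
se = sd / math.sqrt(n)

print(sd, se)
```

The SD describes the spread of the data, while the SE shrinks as n grows and so describes how precisely the mean is estimated.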

How to report

We observed 29.7 (± 5.3) grizzly bears per month (mean ± S.E.).

A mean (± SD) of 29.7 (± 7.4) grizzly bears were seen per month.

[Figure: error bars extending +1 SE or SD above and -1 SE or SD below the mean]

Distributions

Normal

• Quantitative data

Poisson

• Count (frequency) data

Normal distribution

[Figure: normal curve, x-axis from 0 to 16, with the mean marked]

About 68% of data within 1 SD of mean

About 95% of data within 2 SD of mean
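The coverage figures for the normal distribution can be confirmed with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). A brief sketch using the standard normal, mean 0 and SD 1:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of the distribution within 1 SD and within 2 SD of the mean.
within_1_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_2_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(round(within_1_sd, 4), round(within_2_sd, 4))
```

The same proportions hold for any normal distribution, since rescaling the mean and SD does not change them.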

Poisson distribution

[Figure: Poisson distribution, x-axis from 0 to 18, with the mean marked]

Mostly, nothing happens (lots of zeros)

• Frequency data

• Lots of zero (or minimum value) data

• Variance increases with the mean

What do you do with Poisson data?

1. Correct for correlation between mean and variance by log-transforming y (but log(0) is undefined!)

2. Use non-parametric statistics (but low power)

3. Use a “generalized linear model” specifying a Poisson distribution
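The link between mean and variance in Poisson data can be seen from the probability mass function itself. The sketch below is an illustration only: the helper names and the chosen rates are assumptions, and the infinite sums are truncated at k = 100, which is ample for these small rates.

```python
import math


def poisson_pmf(k, rate):
    """P(X = k) for a Poisson distribution with the given mean rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def poisson_mean_and_variance(rate, k_max=100):
    """Mean and variance computed from the (truncated) pmf."""
    probs = [poisson_pmf(k, rate) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    variance = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, variance


for rate in (0.5, 2.0, 4.0):
    mean, variance = poisson_mean_and_variance(rate)
    # Probability of a zero count: large when the mean is small.
    p_zero = poisson_pmf(0, rate)
    print(rate, round(mean, 6), round(variance, 6), round(p_zero, 3))
```

For a Poisson distribution the variance equals the mean, which is why methods that assume a constant variance are a poor fit for raw count data.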

Hypotheses

• Null (Ho): no effect of our experimental treatment, “status quo”

• Alternative (Ha): there is an effect

Whose null hypothesis?

Conditions very strict for rejecting Ho, whereas accepting Ho is easy (just a matter of not finding grounds to reject it).

A criminal trial? Exotic plant species? WTO?

Hypotheses

Null (Ho) and alternative (Ha):

always mutually exclusive

So if Ha is treatment > control…

Types of error

            Reject Ho           Accept Ho
Ho true     Type 1 error        Correct decision
Ho false    Correct decision    Type 2 error

• Usually ensure only a 5% chance of type 1 error (i.e. alpha = 0.05)

• Ability to minimize type 2 error: called power
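A small simulation can make the 5% type 1 error rate concrete. The sketch below is an assumption-laden illustration, not a method from the slides: it uses a two-sided z-test with known SD (chosen only because it needs nothing beyond the standard library), draws every sample from a population where Ho is true, and counts how often Ho is wrongly rejected. The function name and its defaults are hypothetical.

```python
import math
import random
from statistics import NormalDist


def simulated_type1_rate(n_trials=5000, sample_size=20, alpha=0.05, seed=1):
    """Fraction of trials in which a true Ho (mean = 0) is rejected."""
    rng = random.Random(seed)
    # Two-sided critical value for the chosen alpha.
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_trials):
        # Ho is true by construction: population mean 0, known SD 1.
        sample = [rng.gauss(0, 1) for _ in range(sample_size)]
        sample_mean = sum(sample) / sample_size
        z = sample_mean * math.sqrt(sample_size)  # (mean - 0) / (1 / sqrt(n))
        if abs(z) > z_critical:
            rejections += 1
    return rejections / n_trials


rate = simulated_type1_rate()
print(rate)
```

The rejection rate should come out close to alpha, because every rejection in this simulation is by definition a type 1 error.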