Design of Experiments and Data Analysis



Let’s Work an Example

• Data obtained from MS Thesis

• Studied the “bioavailability” of metals in sediment cores

• We’ll analyze chromium data


Pt. Mugu Marsh


Analytical Techniques

• Sediment samples were taken with cores

• Sliced into 1 cm slices

• Sediment in each slice was extracted using a strong acid

• Extracts were analyzed using an Inductively Coupled Plasma Mass Spectrometer (ICP-MS)

• Calibrations were also conducted

• Surface areas (SA) and organic carbon (OC) contents of sediment in each slice were also measured


Core processing

1-cm slices

Organic Carbon

Surface Areas

Tessier Extractions


Objectives

• To determine if there is a correlation between sediment surface area and organic carbon content

• To determine if there is a relationship between concentration of a specific metal and sediment SA and/or OC

• To determine if there is a relationship between or among metal concentrations


Example of Results

[Figure: depth profiles (Depth, cm, from 0 to 10) of Organic Carbon (%) and Surface Area (m2/g) for cores CC01, CC02, CC03, LM01, and LM02]


Example of Results

[Figure: Organic Carbon Content (%) vs. Surface Area (m2/g) with best-fit line; Slope = 0.39, R2 = 0.7]


Data File

• Create a folder entitled “REU” in the C:\My Documents folder

• Create a folder entitled “2006” in this REU folder

• Create a folder entitled “Data Analysis Workshop” in this 2006 folder

• Download Excel File REU_dataanalysis_data.xls from instructional1.calstatela.edu/ckhachi into the Data Analysis Workshop folder

• Open the file


Data File Structure

• There should be 2 worksheets in the workbook:
  – Data: raw SA, OC, and metals concentration data
  – Calibration Curves: ICP-MS calibration data (relating raw metals concentrations to known calibration concentrations)

• Data for the cores are separated by yellow bands


Data File Structure

• Data Columns include:
  – ID: Random sample ID
  – Ave Depth: Ave depth of each slice
  – Solid Mass: Mass of sediments in each slice
  – Raw ICP-MS data for each of five metals

• Calibration Columns include:
  – Conc: Concentration of standards in parts per billion (ppb)
  – ICP-MS responses for the 5 metals


Let’s Start with Calibration Curves

• Most instruments over reasonable ranges have linear responses (i.e., calibration curves are straight lines)

• We need to “model” the data – regression analysis to determine the best-fit line that relates ICP-MS response to concentrations

• We will then use these calibration equations to calculate concentrations for our samples

• Note: because we know that calibrations are usually linear, we will choose a linear regression model. If you don’t know the relationship between 2 variables, it sometimes helps to start with plots
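The best-fit line described above can be sketched in a few lines of Python. This is only an illustration of ordinary least squares, not the workshop's Excel procedure. The function name `fit_line` and the standards and responses below are hypothetical values chosen to lie exactly on a line; they are not the thesis data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of squared deviations in x, and sum of cross-products of deviations
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = y_mean - m * x_mean
    return m, b


# Hypothetical calibration standards (ppb) and made-up, perfectly linear responses
conc = [0.0, 1.0, 5.0, 10.0, 20.0, 50.0]
response = [100.0 + 250.0 * c for c in conc]

slope, intercept = fit_line(conc, response)
print(f"response = {slope:.2f} * conc + {intercept:.2f}")
```

With real calibration data the points will scatter around the line, which is why the regression statistics discussed later matter.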


Calibration Curve for Cr

• Linear response

• We know slope and intercept

• R2 value provided

• Best-fit line drawn (looks good to me)

• Not enough statistical information provided to be able to conduct proper error analysis

[Figure: Excel chart of the Cr calibration data (Series1) with linear trendline; y = 259.07x + 1787.3, R2 = 0.999]


Regression Analysis for Cr

Rename Worksheet “Cr Analysis”


Assumptions

• On average, errors are neither consistently positive nor consistently negative.
  – Linear Model: $y_i = m x_i + b + e_i$, where $e_i$ is the error associated with each observation
  – Line goes through the middle of data

• Variance of error terms is the same across all observations

• Data are independent of each other

• Error terms are normally distributed (not that important)


Residual Plot

[Figure: Residuals Plot; residual vs. observation number (1 to 7), with gridlines spaced at SE est = 161.9482]

Look at data and linear fit carefully; points lie above the line for smaller values of concentration. If you delete the last point, you get a very different result


Regression Statistics

• Multiple R (or just r) is the correlation:
  – +1: perfectly positively correlated (as x goes up, so does y)
  – 0: not correlated
  – -1: perfectly negatively correlated (as x goes up, y goes down)

$$ r = \frac{1}{n-1} \sum_{i=1}^{n} \frac{(x_i - \bar{x})(y_i - \bar{y})}{\sigma(x)\,\sigma(y)} $$
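As a rough sketch, the correlation coefficient above can be computed directly from its definition. The function name `pearson_r` is an illustrative choice, and the code assumes that σ(x) and σ(y) are sample standard deviations (divisor n - 1), which is what makes the 1/(n - 1) prefactor consistent.

```python
import math


def pearson_r(x, y):
    """Correlation coefficient r, using sample (n - 1) standard deviations."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sample standard deviations of x and y
    s_x = math.sqrt(sum((xi - x_mean) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - y_mean) ** 2 for yi in y) / (n - 1))
    # Sample covariance of x and y
    cov = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / (n - 1)
    return cov / (s_x * s_y)


# Made-up data: y rises with x, so r should be close to +1
print(pearson_r([1, 2, 3, 4], [2, 4, 5, 8]))
```

Excel's "Multiple R" is reported as a non-negative number, so for a line with negative slope the sign has to be read from the slope itself.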


Regression Statistics

• R Square (R2): coefficient of determination
  – Between 0 and 1
    • 0: no linear relationship
    • 1: perfect linear relationship (+ or -)
  – Square of the r value
  – Theoretically, as the number of explanatory variables in the model increases, R2 → 1 (the denominator, SStotal, is fixed while SSresidual can only shrink)

• Adjusted R Square: fixes this problem…is probably a better measure of how strong the linear relationship is (R2 more common)

• Use 2 or 3 significant figures to report these #s
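For reference, a small sketch of the usual adjusted R2 formula, $1 - (1 - R^2)(n - 1)/(n - p - 1)$, where n is the number of data points and p the number of explanatory variables. The function name is illustrative. The example plugs in the rounded R2 = 0.999 from the Cr calibration chart and assumes n = 7 observations with one explanatory variable, so the result is only approximate.

```python
def adjusted_r_squared(r2, n_points, n_predictors=1):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_points - 1) / (n_points - n_predictors - 1)


# Rounded R2 from the Cr calibration chart; n = 7 is assumed from the 7 observations
adj = adjusted_r_squared(0.999, n_points=7, n_predictors=1)
print(f"adjusted R2 = {adj:.4f}")
```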


Regression Statistics

• Standard Error: a measure of the amount of error in the prediction of y for an individual x.

• Observations: # of data points


ANOVA

• ANalysis Of VAriance (sometimes called an F test)

• df: degrees of freedom

• SS: sum of squares
  – R2 = 1 - SSresidual/SStotal

• MS: Mean squares = SS/df

• F = MSregression/MSresidual; the larger F is, the stronger the case for rejecting the null hypothesis (no correlation)

• Not very useful for single treatment
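The ANOVA quantities listed above can be sketched for a single-variable linear fit as follows. The function name `regression_anova` and the four data points are hypothetical, chosen only so the arithmetic is easy to check by hand; they are not the calibration data.

```python
def regression_anova(x, y):
    """ANOVA quantities for a least-squares line y = m*x + b."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = y_mean - m * x_mean

    # Sums of squares: total, residual (about the line), and regression
    ss_total = sum((yi - y_mean) ** 2 for yi in y)
    ss_resid = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_reg = ss_total - ss_resid

    # Degrees of freedom: 1 for the slope, n - 2 for the residuals
    df_reg, df_resid = 1, n - 2
    ms_reg = ss_reg / df_reg
    ms_resid = ss_resid / df_resid

    return {
        "ss_regression": ss_reg,
        "ss_residual": ss_resid,
        "ss_total": ss_total,
        "r2": 1.0 - ss_resid / ss_total,
        "f": ms_reg / ms_resid,
    }


# Made-up data for illustration only
table = regression_anova([1, 2, 3, 4], [2, 4, 5, 8])
print(table)
```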


Correlation results

• Linear Calibration: y = mx + b
  – Slope (m) = 259.0709
  – Intercept (b) = 1787.2679

• Standard Error: used for hypothesis testing and confidence band formation


Correlation results

• Confidence intervals
  – Intercept
    • Lower: 1787.2679 – 70.2724(2.571) = 1606.597
    • Upper: 1787.2679 + 70.2724(2.571) = 1967.938
    • 2.571 comes from a standard two-tailed t-test table with df = 5 and probability = 0.05
  – Slope
    • Lower: 259.0709 – 3.6280(2.571) = 249.74
    • Upper: 259.0709 + 3.6280(2.571) = 268.40

• t stat = Coefficient/Standard Error
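The interval arithmetic above is easy to reproduce. In this sketch the coefficients, standard errors, and the critical value 2.571 are taken from the slides; the function name `confidence_interval` is an illustrative choice, and the t value is hard-coded rather than looked up from a statistics library.

```python
T_CRIT = 2.571  # two-tailed t value for df = 5, probability = 0.05 (from the slide)


def confidence_interval(coef, std_err, t_crit=T_CRIT):
    """Return (lower, upper) = coef -/+ t_crit * std_err."""
    half_width = t_crit * std_err
    return coef - half_width, coef + half_width


slope_lo, slope_hi = confidence_interval(259.0709, 3.6280)
icpt_lo, icpt_hi = confidence_interval(1787.2679, 70.2724)

# t stat = Coefficient / Standard Error
t_slope = 259.0709 / 3.6280
t_icpt = 1787.2679 / 70.2724

print(f"slope:     {slope_lo:.2f} to {slope_hi:.2f}, t = {t_slope:.1f}")
print(f"intercept: {icpt_lo:.2f} to {icpt_hi:.2f}, t = {t_icpt:.1f}")
```

Neither interval includes zero, which is consistent with the large t statistics.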


Correlation results

• P-value: probability of obtaining a result at least as extreme as the one observed if the null hypothesis (Ho), in this case no correlation, is in fact true
  – p > 0.10: null hypothesis may be OK
  – 0.05 < p < 0.10: slight evidence against null hypothesis
  – p < 0.05: moderate evidence against null hypothesis
  – p < 0.01: strong evidence against null hypothesis


Correlation results

• Consult statistical tables again:
  – For df = 5 and t stat = 25.4, p < 0.000005
  – For df = 5 and t stat = 71.4, p < 0.0000001

• Very, very strong evidence that Ho is false: the calibration curves are linear!

• Linear Model:

$$ \text{Response} = (259.07 \pm 3.63) \times \text{Concentration} + (1787.27 \pm 70.27) $$


Using Calibration Equations

• Now we have an equation that relates the response of our equipment to concentrations

• Let’s use this equation to determine concentrations in our samples


Raw Data Excel Sheet


Measurement Errors

• Add 2 columns to the right of the Cr data

• Assume instrument has a 3% error (in reality, you need to run each sample 3 times to get the proper error)


Propagation of Errors

• Let us assume that X is dependent upon the experimental variables p, q, and r, which fluctuate in a random and independent way.

• Addition or Subtraction: X = p + q - r:

$$ s_x = \sqrt{s_p^2 + s_q^2 + s_r^2} $$

• Where “s” is the standard deviation or error for each of the variables
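A minimal sketch of the addition/subtraction rule, assuming independent random errors as stated above. The helper name `error_add_sub` and the example errors are made up for illustration.

```python
import math


def error_add_sub(*errors):
    """Error in a sum or difference: square root of the sum of squared errors."""
    return math.sqrt(sum(s ** 2 for s in errors))


# Hypothetical errors in p and q for X = p + q (or X = p - q)
print(error_add_sub(3.0, 4.0))
```

Note that the errors add in quadrature whether the terms themselves are added or subtracted.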


Propagation of Errors (cont’d)

• Multiplication or Division: X = p * (q/r)

$$ \frac{s_x}{X} = \sqrt{\left(\frac{s_p}{p}\right)^2 + \left(\frac{s_q}{q}\right)^2 + \left(\frac{s_r}{r}\right)^2} $$

• Other equations exist for logs, etc.

• Round results of +/- to the number of decimal places of the component number with the fewest number of decimal places

• Round results of x/÷ to the number of significant digits of the component number with the fewest significant digits.
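The multiplication/division rule can be sketched the same way: relative errors add in quadrature, and the result is scaled back by X. The helper name `error_mul_div` and the values of p, q, and r below are hypothetical.

```python
import math


def error_mul_div(result, *value_error_pairs):
    """Absolute error in a product/quotient from (value, error) pairs.

    Relative errors s/value are combined in quadrature, then multiplied
    by the magnitude of the computed result X.
    """
    rel = math.sqrt(sum((s / v) ** 2 for v, s in value_error_pairs))
    return abs(result) * rel


# Hypothetical example: X = p * (q / r), each variable with a 1% error
p, s_p = 10.0, 0.1
q, s_q = 20.0, 0.2
r, s_r = 5.0, 0.05
X = p * (q / r)
s_X = error_mul_div(X, (p, s_p), (q, s_q), (r, s_r))
print(f"X = {X:.1f} +/- {s_X:.1f}")
```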


Let’s use the Calibration Eqn

• Response: detector output

• Concentration: what we are looking for in the column labeled “Cr Conc (ppb)”

$$ \text{Response} = (259.07 \pm 3.63) \times \text{Concentration} + (1787.27 \pm 70.27) $$


Let’s use the Calibration Eqn

• Let’s look at the first line:

$$ (8156.35 \pm 244.69) = (259.07 \pm 3.63) \times \text{Conc} + (1787.27 \pm 70.27) $$

• Rearrange to solve for Conc:

$$ \text{Conc} = \frac{(8156.35 \pm 244.69) - (1787.27 \pm 70.27)}{(259.07 \pm 3.63)} $$

• Let’s look at the numerator


Let’s use the Calibration Eqn

• Num = 8156.35 - 1787.27 = 6369.08

• Error in Num:
  – Recall for +/-: $s_x = \sqrt{s_p^2 + s_q^2 + s_r^2}$
  – Error in Num = $\sqrt{244.69^2 + 70.27^2} = 254.58$

• So now:

$$ \text{Conc} = \frac{(8156.35 \pm 244.69) - (1787.27 \pm 70.27)}{(259.07 \pm 3.63)} = \frac{6369.08 \pm 254.58}{259.07 \pm 3.63} $$


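Let’s use the Calibration Eqn

• Conc = $\frac{6369.08 \pm 254.58}{259.07 \pm 3.63}$, so Conc = $\frac{6369.08}{259.07} = 24.58$

• Recall, for x/÷:

$$ \frac{s_x}{X} = \sqrt{\left(\frac{s_p}{p}\right)^2 + \left(\frac{s_q}{q}\right)^2 + \left(\frac{s_r}{r}\right)^2} \quad \text{or} \quad s_x = X \sqrt{\left(\frac{s_p}{p}\right)^2 + \left(\frac{s_q}{q}\right)^2 + \left(\frac{s_r}{r}\right)^2} $$

• So, ErrConc = $24.58 \sqrt{\left(\frac{254.58}{6369.08}\right)^2 + \left(\frac{3.63}{259.07}\right)^2} = 1.04$

• Final result: Conc = 24.58 ± 1.04 ppb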
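The whole calculation for one sample can be sketched as a single function, which is roughly what the two added spreadsheet columns do for every row. The calibration constants and the 3% instrument error come from the slides; the function and constant names are illustrative, and the sketch assumes the response differs from the intercept (otherwise the relative error of the numerator is undefined).

```python
import math

# Calibration line and standard errors from the regression output
SLOPE, SLOPE_ERR = 259.07, 3.63
INTERCEPT, INTERCEPT_ERR = 1787.27, 70.27
REL_INSTRUMENT_ERR = 0.03  # assumed 3% instrument error


def response_to_conc(response):
    """Convert an ICP-MS response to (concentration, error) in ppb."""
    resp_err = REL_INSTRUMENT_ERR * response

    # Numerator: response - intercept (subtraction rule for the error)
    num = response - INTERCEPT
    num_err = math.sqrt(resp_err ** 2 + INTERCEPT_ERR ** 2)

    # Quotient: numerator / slope (division rule for the error)
    conc = num / SLOPE
    conc_err = abs(conc) * math.sqrt((num_err / num) ** 2 + (SLOPE_ERR / SLOPE) ** 2)
    return conc, conc_err


conc, conc_err = response_to_conc(8156.35)
print(f"Cr Conc = {conc:.2f} +/- {conc_err:.2f} ppb")
```

For the first data line (response 8156.35) this reproduces the hand calculation, 24.58 ± 1.04 ppb.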


Final Results

• Use error bars in the plots

[Figure: Depth (cm) vs. Chromium Concentration (ppb), plotted with error bars]


Plotting Error Bars

• Error bars can be:
  – 1-3 standard deviation(s)
  – Standard error
  – etc…

• Just be clear in your figure caption what your error bar represents


Next Presentation

• A little about design of experiments

• A little more about errors, hypothesis testing, etc…