Statistics Who Spilled Math All Over My Biology?!

Statistics

Who Spilled Math All Over My Biology?!

Practical Applications• Population studies:

– Collecting data in the field on a specific population is time consuming and difficult• What is the mean length of le Doge

tails in Beijing? • Much le Doge! Too Many To

Measure!

– A sample group, rather than the whole population, can be examined and the data applied to the larger population

– This is known as a data set• How can we know if the data set

really represents the larger population?– Statistical analysis

The Bell Curve• Data sets of significant size

should show a normal distribution when plotted out – A Bell Curve

• Next, use the data set to calculate:– Standard Deviation– Standard Error– t-Test

• These values can allow one le Doge data set to be applied to other le Doge groups

Standard Deviation• Measures how scattered a data

set is around its mean– Must use a data set with normal

distribution • S= standard deviation• Ʃ= “sum of”• X= value from date set• Ẍ= mean from data set• n= total number of data points• Now we need some le Doge

data

Standard Doge-viation• We get the following data set of le Doge tails (cm):

• First we find the mean (Ẍ) of the data:– Sum of date points/ # of data points– 10.9

• Second, we apply the equation to all the data points

10. 2 11.0 13.5 9.8 11.3 12.3 10.0

8.0 9.9 9.6 9.7 11.6 12.5 11.0

7.9 13.9 12.7 11.5 10.8 11.3 10.4

Standard Doge-viation

• Ẍ = 10.9• Ʃ (x-Ẍ)2 = 49.0• n= 21• S = sq root (49.0/(21-1))

= sq root (2.45)• S= 1.57

X (X-Ẍ)2

10. 2 0.4911.0 0.0113.5 6.769.8 1.21

11.3 0.1612.3 1.9610.0 0.818.0 8.419.9 19.6 1.699.7 1.44

11.6 0.4912.5 2.5611.0 0.017.9 9

13.9 912.7 3.2411.5 0.3610.8 0.0111.3 0.1610.4 0.25

What Does This All Mean?• Mean le Doge tails (Ẍ = 10.9

cm)• 10.9 cm is the height of

our le Doge bell curve• 95% of le Doge tail lengths

fall between the upper and lower limit from the mean (10.9 cm)– Lower limit= Ẍ - (2 x S)– Upper limit= Ẍ + (2 x S)– S= 1.57

• 95% of all le Doge tail in the data set are within: 10.9 ± 3.14 cm

Working the Numbers• Now that we mastered the

date set of one le Doge group, we can apply our findings to rest of the group

• This will save time and energy since we wont need to measure all the le Doge tails of the next group

• The data from group 1 can apply to group 2 as long as they are similar in type

Standard Error (SM)• The estimated standard deviation

of a whole population based on the mean and standard deviation of one date set– Our data set covered le Doge tails of

group A, but we want data on Group B as well

• Because these are normally distributed data sets, we can sure that 95% of the means (Ẍ) of other groups will be ± (2xSM)

• S= standard deviation • n= number of data points

StanDoge Error (SM)• Standard deviation for le Doge tails in

group A; S= 1.57 cm• n= 21• SM= 1.57/4.58 = 0.34 cm• So we can be 95% certain that the

mean le Doge tails in group B is Ẍ (B) = Ẍ (A) ± (2 x SM)– Ẍ (B) = 10.9 ± 0.68 cm

• How would using a sample size of 100 in group A effect our prediction for group B?– Decrease SM range; 10.9 ± 0.26 cm– data is more accurate

Comparing Multiple Data Sets• Using standard error saves time,

however it only works with populations under the same circumstances– Le Doge groups A and B were le Doges

found in Beijing. The data may not apply to le Doges in France.

– The more variables that are not accounted for, the less certain the data becomes

• Paris le Doge tails study was done:– n= 50– Ẍ= 9.5 cm– S= 2.03

• Is there a significant difference between these two groups? How can we tell?

t-Tests• Determines the significance in

differences between means of multiple data sets

• Ẍ1= mean of data set 1

• Ẍ2= mean of data set 2

• S1= standard deviation of set 1

• S2= standard deviation of set 2

• n1= # of data points in set 1

• n2= #of data points in set 2

• t= 3.14• What does this mean?!

Le Doge Tail Lengths (cm)

Beijing Paris

Ẍ 10.9 9.5

S 1.57 2.03

n 21 50

t-Value Table• To understand significant

difference between data points you need 3 things:1) t-Test value2) Degree of freedom from data sets

df= (n1-1) + (n2-1) = 69

3) t-Value Table• Use the df to find the t-value under

0.05– If the t-Test value larger than the t-

value on the chart, you “fail to reject” there is a significant difference between the data sets

– If is it smaller, the two data sets are not significantly different

t-test = 3.14; df= 69T-value= 2.000 3.14 > 2.000 So Pairs le Doge tail lengths and Beijing le Doge tail lengths are significantly different

Time for much practice.

Many homework.

Wow. Such confusion.

Documents

Statistics Who Spilled Math All Over My Biology?!