Basic Statistics for Research

8/2/2019 Basic Statistics for Research

STATISTICS

A science that deals with collecting, organizing, analyzing, and interpreting pertinent data.

Imagine this situation:

You are in a class with just four other students, and the five of you took a 5-point pop quiz. Today your instructor is walking around the room, handing back the quizzes. She stops at your desk and hands you your paper. Written in bold black ink on the front is 3/5. How do you react?

Are you happy with your score of 3 or disappointed?

How do you decide?

You might calculate your percentage correct, realize it is 60%, and be appalled.

But it is more likely that when deciding how to react to your performance, you will want additional information.

What additional information would you like?

If you are like most students, you will immediately ask your neighbors, "Whad'ja get?" and then ask the instructor, "How did the class do?"

In other words, the additional information you want is how your quiz score compares to other students' scores. You therefore understand the importance of comparing your score to the class distribution of scores.

3 Common Measures of Central Tendency

Mean of Ungrouped Data

To compute the mean of ungrouped data, simply add the given observations and divide the sum by the number of observations.

\[ \bar{X} = \frac{\sum X_i}{n} \]

where:
    $\sum X_i$ = sum of all observations
    $n$ = total number of observations


Example:

Mean of Grouped Data

Example:

C.I.      Freq.    Xi      FiXi
76-79       2      77.5    155
80-83       5      81.5    407.5
84-87       5      85.5    427.5
88-91      11      89.5    984.5
92-95       4      93.5    374
96-99       3      97.5    292.5

n = 30             Sum of FiXi = 2,641
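The slide carrying the grouped-mean formula did not survive extraction, but the FiXi column implies the usual computation: multiply each class midpoint Xi by its frequency, add the products, and divide by n. The following is a minimal Python sketch of that arithmetic, assuming the class limits and frequencies from the table; the function name `grouped_mean` is illustrative and does not come from the slides.

```python
# Class limits and frequencies copied from the example table: (lower, upper, frequency).
classes = [(76, 79, 2), (80, 83, 5), (84, 87, 5), (88, 91, 11), (92, 95, 4), (96, 99, 3)]

def grouped_mean(rows):
    """Return (mean, sum of f*X, n), where X is the midpoint of each class."""
    n = sum(freq for _, _, freq in rows)
    total = sum(freq * (lower + upper) / 2 for lower, upper, freq in rows)
    return total / n, total, n

mean, total, n = grouped_mean(classes)
print(n, total, round(mean, 2))  # 30 2641.0 88.03
```

Under these assumptions the grouped mean works out to 2,641 / 30, or about 88.03.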


Median of Ungrouped Data

To get the median of ungrouped data, arrange the given observations according to magnitude, then identify the middle value.

Example

Note:

In case of an even number of observations, we expect two middle values. Simply get the average of the two middle observations by adding them and dividing the sum by 2.
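The odd and even cases described here can be sketched in a few lines of Python. The example values on the original slide were lost, so the two score lists below are hypothetical and serve only to exercise both branches.

```python
def median_ungrouped(values):
    """Sort the observations and return the middle value (or the mean of the two middle values)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: one middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average the two middle values

odd_scores = [3, 5, 2, 4, 1]   # hypothetical quiz scores, n = 5
even_scores = [3, 5, 2, 4]     # hypothetical quiz scores, n = 4
print(median_ungrouped(odd_scores))   # 3
print(median_ungrouped(even_scores))  # 3.5
```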


Example:

C.I.    C.B.    Freq.    F

Solution:

Identify the median class. The median class is located using n/2. Since n = 30, n/2 = 15.

Locate the computed n/2 in the F column.

Identify the values of the variables needed in the formula.

LMe = 87.5
n/2 = 15
cfb = 12
fMe = 11
c = 4
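The slide with the grouped-median formula itself is missing from this transcript. The sketch below assumes the conventional interpolation formula, Me = LMe + ((n/2 - cfb) / fMe) * c, which is consistent with the variable names listed above; treat it as an illustration rather than the presenter's exact worked solution.

```python
def grouped_median(lower_boundary, n, cum_freq_before, class_freq, class_width):
    """Interpolate the median inside the median class (assumed textbook formula)."""
    return lower_boundary + ((n / 2 - cum_freq_before) / class_freq) * class_width

# Values taken from the slide: LMe = 87.5, n = 30, cfb = 12, fMe = 11, c = 4.
median = grouped_median(87.5, 30, 12, 11, 4)
print(round(median, 2))  # 88.59
```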


Mode of Ungrouped Data

To get the mode of ungrouped data, identify the observation/s having the highest frequency of occurrence.


Example:

Mode of Grouped Data

Example:

C.I.      C.B.          Freq.
76-79     75.5-79.5       2
80-83     79.5-83.5       5
84-87     83.5-87.5       5
88-91     87.5-91.5      11
92-95     91.5-95.5       4
96-99     95.5-99.5       3

n = 30

Solution:

Identify the modal class. The modal class is the class having the highest frequency. Since the highest frequency is 11, 88-91 is the modal class.
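The formula slide for the grouped mode was not recovered. One common textbook choice is Mo = L + (d1 / (d1 + d2)) * c, where L is the lower boundary of the modal class, d1 and d2 are the differences between the modal frequency and the frequencies of the classes before and after it, and c is the class width. The sketch below assumes that formula; the original slides may have used a different variant.

```python
def grouped_mode(lower_boundary, f_modal, f_before, f_after, class_width):
    """Assumed formula: Mo = L + d1 / (d1 + d2) * c."""
    d1 = f_modal - f_before   # modal frequency minus frequency of the preceding class
    d2 = f_modal - f_after    # modal frequency minus frequency of the following class
    return lower_boundary + d1 / (d1 + d2) * class_width

# Modal class 88-91: boundary 87.5, frequency 11; neighbours have frequencies 5 and 4; width 4.
mode = grouped_mode(87.5, 11, 5, 4, 4)
print(round(mode, 2))  # 89.35
```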


Scores of 5 Boys and 5 Girls in Mathematics

Boys                 Girls
Frederick    70      Grace       82
Russel       95      Irish       80
Murphy       60      Abigail     83
Jerome       80      Sherry      81
Tom         100      Kristine    79
Mean:        81      Mean:       81

[Figure: the boys' scores and the girls' scores plotted on separate scales running from 60 to 100]

Measures of Variability or Dispersion

RANGE: The difference between the highest and the lowest observation.

R = H - L

Boys: R = 100 - 60 = 40
Girls: R = 83 - 79 = 4

Therefore the girls are more homogeneous than the boys in their math ability.

Mean Deviation:

The average of the absolute deviations of the observations from the mean.

\[ MD = \frac{\sum |X_i - \bar{X}|}{n} \]

BOYS         Xi     |Xi - Mean|
Frederick    70     11
Russel       95     14
Murphy       60     21
Jerome       80      1
Tom         100     19

Mean: 81     Sum of Xi = 405     Sum of |Xi - Mean| = 66

M.D. = 66 / 5 = 13.2

GIRLS        Xi     |Xi - Mean|
Grace        82      1
Irish        80      1
Abigail      83      2
Sherry       81      0
Kristine     79      2

Mean: 81     Sum of Xi = 405     Sum of |Xi - Mean| = 6

M.D. = 6 / 5 = 1.2

MD (boys) = 13.2
MD (girls) = 1.2

Based on the computed Mean Deviation, the girls are more homogeneous than the boys.

VARIANCE:

The average of the squared deviations from the mean.

Population Variance

\[ \sigma^2 = \frac{\sum (X_i - \bar{X})^2}{n} \]

Sample Variance

\[ s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} \]

BOYS         Xi     Xi - Mean    (Xi - Mean)^2
Frederick    70     -11          121
Russel       95      14          196
Murphy       60     -21          441
Jerome       80      -1            1
Tom         100      19          361

Mean: 81     Sum of Xi = 405     Sum of (Xi - Mean)^2 = 1,120

$\sigma^2$ = 1,120 / 5 = 224          $s^2$ = 1,120 / 4 = 280

GIRLS        Xi     Xi - Mean    (Xi - Mean)^2
Grace        82       1            1
Irish        80      -1            1
Abigail      83       2            4
Sherry       81       0            0
Kristine     79      -2            4

Mean: 81     Sum of Xi = 405     Sum of (Xi - Mean)^2 = 10

$\sigma^2$ = 10 / 5 = 2          $s^2$ = 10 / 4 = 2.5

BOYS:  $\sigma^2$ = 1,120 / 5 = 224     $s^2$ = 1,120 / 4 = 280
GIRLS: $\sigma^2$ = 10 / 5 = 2          $s^2$ = 10 / 4 = 2.5

The values of the variance also reveal that the scores of the boys are more spread out than those of the girls.

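STANDARD DEVIATION:

The square root of the variance.

BOYS
$\sigma^2 = 224$, so $\sigma = \sqrt{224} = 14.97$
$s^2 = 280$, so $s = \sqrt{280} = 16.73$

GIRLS
$\sigma^2 = 2$, so $\sigma = \sqrt{2} = 1.41$
$s^2 = 2.5$, so $s = \sqrt{2.5} = 1.58$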
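The range, mean deviation, variance, and standard deviation for the two groups can be checked with Python's standard `statistics` module. This is a small sketch using the boys' and girls' scores from the slides; the helper name `summarize` is an illustrative choice, not something from the presentation.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

boys = [70, 95, 60, 80, 100]
girls = [82, 80, 83, 81, 79]

def summarize(scores):
    """Collect the dispersion measures discussed in the slides for one group."""
    m = mean(scores)
    return {
        "range": max(scores) - min(scores),
        "mean_deviation": sum(abs(x - m) for x in scores) / len(scores),
        "population_variance": pvariance(scores),   # divides by n
        "sample_variance": variance(scores),        # divides by n - 1
        "population_sd": round(pstdev(scores), 2),
        "sample_sd": round(stdev(scores), 2),
    }

boys_summary = summarize(boys)
girls_summary = summarize(girls)
print(boys_summary)
print(girls_summary)
```

Every measure comes out smaller for the girls, which matches the conclusion that their scores are more homogeneous.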


Let us pause for a BREAK

HYPOTHESIS TESTING

Inferential Statistics: a formalized body of techniques used to make conclusions about populations based on the results of the study on the samples.

Two areas of Inferential Statistics:

Estimation
    Point Estimation
    Interval Estimation

Hypothesis Testing


Research Problem: How effective is Minoxidil in treating male pattern baldness?

Specific Objectives:

1. To estimate the population proportion of patients who will show new hair growth after being treated with Minoxidil.

2. To determine whether treatment using Minoxidil is better than the existing treatment that is known to stimulate hair growth among 40% of patients with male pattern baldness.

Hypothesis Testing is the process of making an inference or generalization about a population by using data gathered from a sample of the population.

It is an area of statistical inference in which one evaluates a conjecture about some characteristic of the parent population based upon the information contained in the random sample.

Usually the conjecture concerns one of the unknown parameters of the population.

Kinds of Hypothesis:

Scientific Hypothesis: a suggested explanation or solution to a phenomenon.

Statistical Hypothesis:

It is a guess or prediction made by a researcher regarding the possible outcome of the study.

It is a claim or a statement about an unknown parameter.

Examples of Scientific Hypothesis:

When Darwin hypothesized that man evolved from the apes, he was making a scientific hypothesis.

Similarly, when Copernicus hypothesized that the earth and the planets in the solar system revolved around the sun in concentric circles with the sun as the center, he was making a scientific hypothesis.

Examples of Statistical Hypothesis:

1. The correlation between X and Y (in the population) is equal to zero: $\rho_{XY} = 0$

2. There is no significant difference in the mean of the two groups: $\mu_A = \mu_B$

3. The mean IQ of the population is 100: $\mu = 100$

Two Types of Statistical Hypothesis

Null hypothesis (Ho): It is the hypothesis to be tested, which one hopes to reject. It shows equality or no significant difference, effect, or relationship between variables.

- denoted by Ho
- the statement being tested
- it represents what the experimenter doubts to be true
- must contain the condition of equality and must be written with the symbol $=$, $\le$, or $\ge$

For the mean, the null hypothesis will be stated in one of these three possible forms:

Ho: $\mu =$ some value
Ho: $\mu \le$ some value
Ho: $\mu \ge$ some value

Note: the value of $\mu$ can be obtained from previous studies or from knowledge of the population.

Alternative hypothesis (Ha): It generally represents the idea which the researcher wants to prove.

- denoted by Ha
- is the statement that must be true if the null hypothesis is false
- the operational statement of the theory that the experimenter believes to be true and wishes to prove
- is sometimes referred to as the research hypothesis

For the mean, the alternative hypothesis will be stated in only one of three possible forms:

Ha: $\mu \ne$ some value
Ha: $\mu >$ some value
Ha: $\mu <$ some value

Note:

Ha is the opposite of Ho. For example, if Ho is given as $\mu = 37.0$, then it follows that the alternative hypothesis is given by Ha: $\mu \ne 37.0$.

Note About Using $\le$ or $\ge$ in Ho

Even though we sometimes express Ho with the symbol $\le$ or $\ge$, as in Ho: $\mu \le 37.0$ or Ho: $\mu \ge 37.0$, we conduct the test by assuming that $\mu = 37.0$ is true.

We must have a single fixed value for $\mu$ so that we can work with a single distribution having a specific mean.

Note About Stating Your Own Hypotheses

If you are conducting a research study and you want to use a hypothesis test to support your claim, the claim must be stated in such a way that it becomes the alternative hypothesis, so it cannot contain the condition of equality.

Example in Stating Your Hypothesis

If you believe that your brand of refrigerator lasts longer than the mean of 14 years for other brands, state the claim that $\mu > 14$, where $\mu$ is the mean life of your refrigerators.

Ho: $\mu = 14$ vs. Ha: $\mu > 14$

In this context of trying to support the goal of the research, the alternative hypothesis is sometimes referred to as the research hypothesis.

Also in this context, the null hypothesis is assumed true for the purpose of conducting the hypothesis test, but it is hoped that the conclusion will be rejection of the null hypothesis so that the research hypothesis is supported.

Research Problem:

Comparative performance in Mathematics of the first-born and the last-born children.

H0: There is no significant difference in the performance in mathematics between the first-born and last-born children.

Ha: There is a significant difference in the performance in mathematics between the first-born and last-born children.

Research Problem:

Effectiveness of an Instructional Strategy

H0: There is no significant effect of the modified worked example strategy on the problem solving ability of students in physics.

Ha: The modified worked example strategy will have a significant effect on the problem solving ability of students in physics.

Ha: Students exposed to the modified worked examples are better problem solvers than those exposed to conventional worked examples.

Research Problem:

Relationship between emotional intelligence of students and their level of math anxiety

H0: There is no significant relationship between students' emotional intelligence and their level of math anxiety.

Ha: There is a significant relationship between students' emotional intelligence and their level of math anxiety.

REMARK:

If the null hypothesis is rejected, the alternative hypothesis is accepted, and vice versa. Rejecting the null hypothesis means the sample provides enough evidence to regard it as false at the chosen level of significance, while accepting the null hypothesis does not mean it is true; it simply means that we do not have enough evidence to reject it.

Types of Hypothesis Testing

1. Two-tailed test: It is a non-directional test with the region of rejection lying on both tails of the normal curve. It is used when the alternative hypothesis uses words such as not equal to, significantly different, etc.

[Figure: normal curve with the acceptance region in the middle and a rejection region in each tail]

Example: A teacher wants to know if there is a significant difference in the performance in Statistics between his morning and afternoon classes.

H0: There is no significant difference in the performance in Statistics between his morning and afternoon classes.

Ha: There is a significant difference in the performance in Statistics between his morning and afternoon classes.

2. One-tailed test: It is a directional test with the region of rejection lying on either the left or the right tail of the normal curve.

Right directional test. The region of rejection is on the right tail. It is used when the alternative hypothesis uses comparatives such as greater than, higher than, better than, superior to, exceeds, etc.

[Figure: normal curve with the acceptance region on the left and the region of rejection in the right tail]

Example:

Research Problem: Performance in Mathematics of the First-born and Last-born Children

H0: The first-born children perform equally well in mathematics as the last-born children.

Ha: The first-born children perform better in mathematics than the last-born children.

Left directional test. The region of rejection is on the left tail. It is used when the alternative hypothesis uses comparatives such as less than, smaller than, inferior to, lower than, below, etc.

[Figure: normal curve with the rejection region in the left tail and the acceptance region on the right]

Example:

Research problem: It is known that in the school canteen, the average waiting time for a customer to receive and pay for his order is 20 minutes. Additional personnel have been added and now the management wants to know if the average waiting time has been reduced.

H0: The average waiting time has not been reduced, or the average waiting time is equal to 20 minutes.

Ha: The average waiting time has been reduced, or the average waiting time is less than 20 minutes.

What is a Test of Significance?

A test of significance is a problem of deciding between the null and the alternative hypotheses on the basis of the information contained in a random sample.

The goal will be to reject Ho in favor of Ha, because the alternative is the hypothesis that the researcher believes to be true. If we are successful in rejecting Ho, we then declare the results to be significant.

Two Types of Error in Hypothesis Testing:

TYPE I ERROR

The mistake of rejecting the null hypothesis when it is true.

It is not a miscalculation or a procedural misstep; it is an actual error that can occur when a rare event happens by chance.

The probability of rejecting the null hypothesis when it is true is called the significance level ($\alpha$).

The value of $\alpha$ is typically predetermined, and the very common choices are $\alpha = 0.05$ and $\alpha = 0.01$.

Examples of Type I Error

1. The mistake of rejecting the null hypothesis that the mean body temperature is 37.0 when that mean is really 37.0.

2. BFA did not allow the release of an effective medicine.

TYPE II ERROR

The mistake of failing to reject the null hypothesis when it is false.

The symbol $\beta$ (beta) is used to represent the probability of a Type II error.

Examples of Type II Errors

1. The mistake of failing to reject the null hypothesis ($\mu = 37.0$) when it is actually false (that is, the mean is not 37.0).

2. BFA allowed the release of an ineffective drug.

A typical example of testing a statistical hypothesis is summarized in the following table.

               Accept H0            Reject H0
H0 is true     Correct Decision     Type I Error
H0 is false    Type II Error        Correct Decision

Controlling Type I and Type II Errors

o The experimenter is free to determine $\alpha$. If the test leads to the rejection of Ho, the researcher can then conclude that there is sufficient evidence supporting Ha at the $\alpha$ level of significance.

o Usually, $\beta$ is unknown because it is hard to calculate. The common solution to this difficulty is to withhold judgment if the test leads to the failure to reject Ho.

o $\alpha$ and $\beta$ are inversely related. For a fixed sample size n, as $\alpha$ decreases $\beta$ increases.

o In almost all statistical tests, both $\alpha$ and $\beta$ can be reduced by increasing the sample size.

o Because of the inverse relationship of $\alpha$ and $\beta$, setting a very small $\alpha$ should also be avoided if the researcher cannot afford a very large risk of committing a Type II error.

The choice of $\alpha$ usually depends on the consequences associated with making a Type I error.

Common Choices of $\alpha$     Consequences of Type I Error
0.01 or smaller              Very serious
0.05                         Moderately serious
0.10                         Not too serious

Level of Confidence

a.) 0.05 level: 95% confidence; the risk of rejecting a true null hypothesis is only 5%. If the null hypothesis is true and different sets of samples are taken from the same population, about 95% of them would lead to the correct decision of not rejecting it.

b.) 0.01 level: 99% confidence; the risk of rejecting a true null hypothesis is only 1%.

Note:

A test is said to be significant if the null hypothesis is rejected at the 0.05 level of significance and is considered highly significant if the null hypothesis is rejected at the 0.01 level of significance.

Steps in Testing the Hypothesis

1. State the null and alternative hypotheses.

2. Decide on a level of significance, $\alpha$.

3. Determine the testing procedure and methods of analysis (responsibility of the statistician).

4. Decide on the type of data collected and choose an appropriate test statistic and testing procedure.

5. State the decision rule.

6. Collect the data and compute the value of the test statistic using the sample data.

7. If the decision rule is based on the region of rejection: Check if the test statistic falls in the region of rejection. If yes, reject Ho.

   If the decision rule is based on the p-value: Determine the p-value. If the p-value is less than or equal to $\alpha$, reject Ho.

8. Interpret results.

The Test Statistic: a statistic computed from the sample data that is especially sensitive to the differences between Ho and Ha.

1. The test statistic should tend to take on certain values when Ho is true and different values when Ha is true.

2. The decision to reject Ho depends on the value of the test statistic.

3. A decision rule based on the value of the test statistic: Reject Ho if the computed value of the test statistic falls in the region of rejection.

Critical Value/s

The value or values that separate the critical region from the values of the test statistic that would not lead to rejection of the null hypothesis.

It depends on the nature of the null hypothesis, the relevant sampling distribution, and the level of significance.

Level of significance ($\alpha$): the smaller $\alpha$ is, the smaller the region of rejection.

Test Concerning Means

A. Test for one sample mean

a. When $\sigma$ is known and $n \ge 30$:

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

b. When $\sigma$ is unknown and $n < 30$:

\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]

Example 1.

The production manager of a large manufacturing company estimates that the mean age of his workers is 22.8 years. The treasurer of the firm needs a more accurate employee mean age figure in order to estimate the cost of an annuity benefit program being considered for employees. The treasurer takes a random sample of 70 employees and finds that the mean age of the sampled employees is 26.2 years with a standard deviation of 4.6 years. At the 0.05 level of significance, test the hypothesis that the mean age of the employees is not equal to 22.8 years.

Solution:

The steps include:

Null Hypothesis: H0: $\mu = 22.8$ years
Alternative Hypothesis: Ha: $\mu \ne 22.8$ years

Level of Significance: $\alpha = 0.05$

Test Statistic (two-tailed test; n = 70):

\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

Critical Region: Reject the null hypothesis if z < -1.96 or z > 1.96; otherwise accept it.

Note: z is used since our sample size n = 70 is quite large.

Compute:

\[ Z_c = \frac{26.2 - 22.8}{4.6 / \sqrt{70}} = 6.184 \]

Decision: Since zc = 6.184 exceeds 1.96, the null hypothesis must be rejected. In other words, the difference between $\bar{x} = 26.2$ and $\mu = 22.8$ years is too large to attribute to chance, so we can say that the difference is significant. Hence, the mean age of the employees is not 22.8 years.
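The computation in Example 1 can be reproduced with Python's standard library. This is a sketch only: as in the slide, the sample standard deviation 4.6 is used in place of $\sigma$, and the function name `one_sample_z` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, hypothesized_mean, sd, n):
    """z = (x_bar - mu) / (sd / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sd / sqrt(n))

alpha = 0.05
z = one_sample_z(26.2, 22.8, 4.6, 70)
z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
reject_h0 = abs(z) > z_critical

print(round(z, 3), round(z_critical, 2), reject_h0)  # 6.184 1.96 True
```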


Example 2.

A random sample of 20 drinks from a soft-drink machine has an average content of 21.9 deciliters, with a standard deviation of 1.42 deciliters. At the .05 level of significance, test the hypothesis that $\mu = 22.2$ deciliters against the alternative that $\mu < 22.2$, and assume that the distribution of the soft drink contents is normal.

Solution:

Null Hypothesis: H0: $\mu = 22.2$ deciliters

Alternative Hypothesis: H1: $\mu < 22.2$ deciliters

Level of Significance: $\alpha = 0.05$

Test Statistic (with df = n - 1):

\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]

Note: Student's t statistic can be used since our sample size n = 20 is small and the soft drink content was assumed to be normally distributed.

Criterion: Reject the null hypothesis if the computed t < -1.729 (the tabular value of t at 20 - 1 = 19 degrees of freedom and $\alpha = 0.05$); otherwise, accept it.

Compute:

\[ t_c = \frac{21.9 - 22.2}{1.42 / \sqrt{20}} = -0.945 \]

Decision: Since the computed t = -0.945 is greater than -1.729, we accept H0. Conclude that the mean content of the soft drinks is equal to 22.2 deciliters. In other words, though there is a numerical difference of 0.3, this difference can be attributed to chance.
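The same check for Example 2 can be sketched as follows. The standard library has no Student's t quantile function, so the critical value -1.729 is copied from the slide as a constant rather than computed; the function name is illustrative.

```python
from math import sqrt

def one_sample_t(sample_mean, hypothesized_mean, s, n):
    """t = (x_bar - mu) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (sample_mean - hypothesized_mean) / (s / sqrt(n))

T_CRITICAL_LEFT = -1.729   # tabular t for df = 19, alpha = 0.05, as quoted in the slide

t = one_sample_t(21.9, 22.2, 1.42, 20)
reject_h0 = t < T_CRITICAL_LEFT   # left-tailed test

print(round(t, 3), reject_h0)  # -0.945 False
```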


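Test of Differences of Two Means

a. When $n_1 \ge 30$ or $n_2 \ge 30$:

\[ z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \]

b. When $n_1 < 30$ and $n_2 < 30$:

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \]

where

\[ s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \]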

Example 3.

An instructor wishes to determine which of the two methods of teaching, A or B, is more effective in teaching certain concepts in Physics. In a class of 36 students, he used method A and in the other class of 40 students, method B. He gave the same final examination for both classes and garnered the following results:

Method A              Method B
Mean: 78              Mean: 70
s1 = 4                s2 = 6

Is the instructor correct in assuming that method A is more effective than method B? Use the 0.01 level of significance.

Solution:

Null Hypothesis: H0: $\mu_A = \mu_B$
Alternative Hypothesis: H1: $\mu_A > \mu_B$

Level of Significance: $\alpha = 0.01$

Test Statistic:

\[ Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \]

Critical Region: Reject the null hypothesis if Zc > 2.326; otherwise state that the difference between the two sample means is not significant.

Compute:

\[ Z_c = \frac{78 - 70}{\sqrt{\frac{4^2}{36} + \frac{6^2}{40}}} = 6.899 \]

Decision: Since Zc = 6.899 is greater than 2.326, the null hypothesis must be rejected. Conclude that the instructor's claim is correct: method A is more effective than method B.
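A short Python sketch of the two-sample z computation for Example 3, using only the standard library. The sample standard deviations stand in for $\sigma_1$ and $\sigma_2$, as they do in the slide's arithmetic; the function name `two_sample_z` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, mean2, sd1, sd2, n1, n2):
    """z = (x1_bar - x2_bar) / sqrt(sd1^2 / n1 + sd2^2 / n2)."""
    return (mean1 - mean2) / sqrt(sd1**2 / n1 + sd2**2 / n2)

alpha = 0.01
z = two_sample_z(78, 70, 4, 6, 36, 40)
z_critical = NormalDist().inv_cdf(1 - alpha)   # right-tailed critical value
reject_h0 = z > z_critical

print(round(z, 2), round(z_critical, 3), reject_h0)  # 6.9 2.326 True
```

The unrounded statistic is roughly 6.8995, which the slide reports as 6.899; either way it is far beyond the critical value.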


Solution:

Null Hypothesis: H0: $\mu_1 = \mu_2$
Alternative Hypothesis: H1: $\mu_1 > \mu_2$

Level of Significance: $\alpha = 0.05$

Test Statistic (with df = $n_1 + n_2 - 2$):

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \]

Criterion: Reject the null hypothesis if tc > 1.68 for 24 + 20 - 2 = 42 degrees of freedom; otherwise state that the difference between the two sample means is not significant.

Compute:

\[ s_p = \sqrt{\frac{(24 - 1)(6.9)^2 + (20 - 1)(5.5)^2}{24 + 20 - 2}} = 6.31 \]

\[ t_c = \frac{163.5 - 160.3}{6.31 \sqrt{\frac{1}{24} + \frac{1}{20}}} = 1.675 \]

Decision: Since tc = 1.675 does not exceed the critical value 1.684, the null hypothesis must be accepted; in other words, we conclude that the female students of the first section are not taller than those of the other class. The difference in the mean heights is not significant.
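The pooled-variance t computation can be sketched in Python as below. The inputs (n1 = 24, n2 = 20, s1 = 6.9, s2 = 5.5, means 163.5 and 160.3) are read off the slide's arithmetic because the problem statement itself was not recovered, and the critical value 1.684 is copied from the slide rather than computed. Function names are illustrative.

```python
from math import sqrt

def pooled_sd(s1, s2, n1, n2):
    """s_p = sqrt(((n1 - 1) s1^2 + (n2 - 1) s2^2) / (n1 + n2 - 2))."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """t = (x1_bar - x2_bar) / (s_p * sqrt(1/n1 + 1/n2))."""
    sp = pooled_sd(s1, s2, n1, n2)
    return (mean1 - mean2) / (sp * sqrt(1 / n1 + 1 / n2))

T_CRITICAL = 1.684   # tabular value quoted in the slide's decision step

sp = pooled_sd(6.9, 5.5, 24, 20)
t = pooled_t(163.5, 160.3, 6.9, 5.5, 24, 20)
reject_h0 = t > T_CRITICAL   # right-tailed test

print(round(sp, 2), round(t, 2), reject_h0)
```

Carrying the unrounded pooled standard deviation gives a statistic of about 1.676, slightly above the slide's 1.675, which uses the rounded value 6.31. The decision is the same in both cases.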


CORRELATION

Definition:

Correlation is a method used to measure the strength of relationship between two variables that tend to vary together in a consistent way. The nature and degree of relationship is indicated by a coefficient, designated by the letter r.

By direct causal relations, we mean that if X and Y are correlated, then X is partly the cause of Y or Y is partly the cause of X.

Examples of correlation:

There is correlation between
- income and savings
- the extent of fatigue and performance in a speed test

There is no correlation between
- weight and IQ, or
- shoe size and mathematical ability

The Scatter Diagram

One can usually and roughly estimate if a relationship exists between two variables by constructing a scatter diagram. This is done by plotting the point corresponding to each observation on a rectangular coordinate system.

Scatter Plot Examples

[Figure: scatter plots of y against x illustrating linear relationships, curvilinear relationships, strong relationships, weak relationships, and no relationship]

Examples:

1. Consider the following marks of five students in English and Mathematics. Notice that for each student, there correspond two scores (paired observations).

Student    English (X)    Mathematics (Y)
A          55             69
B          64             85
C          96             99
D          44             52
E          83             89

2. The following data are the life spans of nine husbands and wives randomly selected from a certain community. Draw a scatter diagram and decide whether a relationship exists between their ages.

Couple    Age of Husband (X)    Age of Wife (Y)
1         65                    90
2         72                    95
3         68                    45
4         71                    51
5         75                    50
6         67                    62
7         76                    45
8         73                    63
9         71                    83

Types of Correlation

1. A positive correlation exists when high values in one variable are associated with high values in the second variable. This is also true when low values in one variable are associated with low values in the other. Thus, there is a direct relationship between positively correlated variables. Also, in a positive correlation, the points on the scatter diagram closely follow a straight line rising to the right.

Examples:

problem solving ability and reading comprehension
income and savings
income and expenses

2. A negative correlation exists when high values in one variable are associated with low values in the second variable, and vice versa. Here, points on the scatter diagram closely follow a straight line falling to the right.

Example:

pressure and volume (at constant temperature)

3. A zero correlation exists when scores in one variable tend to be associated with neither systematically high nor systematically low scores in the other variable. The points on the scatter diagram are spread in a random manner when this relationship exists.

Examples:

sex and IQ
athletic ability and mental ability
shoe size and mathematical performance

Note:

Correlational descriptions are descriptive and they may not be sufficient to explain the relationship between two variables.

Correlation coefficient (r) is a numerical measure of the linear relationship between two variables. Its values range from -1 to +1.

Correlation Coefficient

The population correlation coefficient $\rho$ (rho) measures the strength of the association between the variables.

The sample correlation coefficient r is an estimate of $\rho$ and is used to measure the strength of the linear relationship in the sample observations.

Features of $\rho$ and r

- Unit free
- Range between -1 and 1
- The closer to -1, the stronger the negative linear relationship
- The closer to 1, the stronger the positive linear relationship
- The closer to 0, the weaker the linear relationship

[Figure: Examples of Approximate r Values; scatter plots of y against x for r = -1, r = -.6, r = 0, r = +.3, and r = +1]

Correlational Tests:

1. Pearson Product Moment Correlation
It measures the degree of relationship between two variables measured on at least an interval scale.

2. Spearman's Rank Correlation Coefficient
It is the measure of the correlation between two ordinal variables.

3. Phi Coefficient
The phi coefficient determines the degree of relationship between two variables which are both nominal dichotomous, like sex (male-female) and marital status (married-unmarried).

4. Point Biserial
It measures the correlation between an interval variable and a nominal dichotomous variable.

Interpretation of the Correlation Coefficient

Once the value of r is found significant, the degree of relationship between the two quantitative variables can be interpreted using the following rule of thumb:

r-value      Verbal Description
0.00-0.29    Little or weak positive (negative) correlation
0.30-0.49    Low positive (negative) correlation
0.50-0.69    Moderate positive (negative) correlation
0.70-0.89    High positive (negative) correlation
0.90-1.00    Very high or strong positive (negative) correlation

Test of significance for r

When r is calculated on the basis of sample data, we may get a strong positive or negative correlation purely by chance, even though there is actually no linear relationship whatever between the two variables in the population from which the sample came. The value we obtain for r is only an estimate of a corresponding parameter, the population correlation coefficient ($\rho$). What r measures for a sample, $\rho$ measures for a population.

1. t-distribution with n - 2 degrees of freedom

\[ t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} \]

This is used to test the significance of r arising from Pearson, Spearman, and Point Biserial.

Note: Reject the null hypothesis of no correlation at the $\alpha$ level of significance if the computed value of t exceeds the critical value $t_{\alpha}$ for a one-tailed test or $t_{\alpha/2}$ for a two-tailed test; otherwise we accept the null hypothesis.


2. The inference about the phi coefficient uses

\[ Z = r\sqrt{n - 1} \]

NOTE:

The coefficient of determination, the square of the coefficient of correlation, $r^2$, is the proportion of the total variation in the dependent variable (y) that can be attributed to the relationship with the independent variable (x).

Calculating the Correlation Coefficient

Sample correlation coefficient:

\[ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}} \]

or the algebraic equivalent:

\[ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} \]

where:
r = Sample correlation coefficient
n = Sample size
x = Value of the independent variable
y = Value of the dependent variable

Sample Calculation

Tree Height (y)    Trunk Diameter (x)    xy      y^2      x^2
35                 8                     280     1225     64
49                 9                     441     2401     81
27                 7                     189     729      49
33                 6                     198     1089     36
60                 13                    780     3600     169
21                 7                     147     441      49
45                 11                    495     2025     121
51                 12                    612     2601     144

Sum = 321          Sum = 73              3142    14111    713

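Sample Calculation (continued)

\[ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} = \frac{8(3142) - (73)(321)}{\sqrt{\left[8(713) - (73)^2\right]\left[8(14111) - (321)^2\right]}} = 0.886 \]

[Figure: scatter plot of Tree Height, y (0 to 70) against Trunk Diameter, x (0 to 14)]

r = 0.886: high positive linear association between x and y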
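The sample calculation can be verified with a few lines of Python that apply the algebraic form of the formula to the tree data. The function name `pearson_r` is illustrative; only the data come from the slides.

```python
from math import sqrt

height = [35, 49, 27, 33, 60, 21, 45, 51]    # tree height, y
diameter = [8, 9, 7, 6, 13, 7, 11, 12]       # trunk diameter, x

def pearson_r(x, y):
    """r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    return (n * sum_xy - sum_x * sum_y) / sqrt(
        (n * sum_xx - sum_x**2) * (n * sum_yy - sum_y**2)
    )

r = pearson_r(diameter, height)
print(round(r, 3))       # 0.886
print(round(r**2, 3))    # coefficient of determination
```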


Significance Test for Correlation

Hypotheses

H0: $\rho = 0$ (no correlation)

HA: $\rho \ne 0$ (correlation exists)

Test statistic

\[ t = \frac{r}{\sqrt{\frac{1 - r^2}{n - 2}}} \]

(with n - 2 degrees of freedom)

Solution

Is there evidence of a linear relationship between tree height and trunk diameter at the 0.05 level of significance?

H0: $\rho = 0$ (no correlation)

H1: $\rho \ne 0$ (correlation exists)

$\alpha = 0.05$, df = 8 - 2 = 6

\[ t = \frac{r}{\sqrt{\frac{1 - r^2}{n - 2}}} = \frac{0.886}{\sqrt{\frac{1 - 0.886^2}{8 - 2}}} = 4.68 \]

Critical values: with $\alpha/2 = .025$ in each tail and d.f. = 8 - 2 = 6, $\pm t_{\alpha/2} = \pm 2.4469$. Reject H0 if t < -2.4469 or t > 2.4469; otherwise do not reject H0.

Decision: Reject H0, since t = 4.68 is greater than 2.4469.

Conclusion: There is evidence of a linear relationship at the 5% level of significance.
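The test statistic for the tree example can be checked in Python as follows. The critical value 2.4469 is copied from the slide (the standard library has no t quantile function), and the function name is illustrative.

```python
from math import sqrt

def t_for_correlation(r, n):
    """t = r / sqrt((1 - r^2) / (n - 2)), with n - 2 degrees of freedom."""
    return r / sqrt((1 - r**2) / (n - 2))

T_CRITICAL = 2.4469   # two-tailed, alpha = 0.05, df = 6, as quoted in the slide

t = t_for_correlation(0.886, 8)
reject_h0 = abs(t) > T_CRITICAL

print(round(t, 2), reject_h0)  # 4.68 True
```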



Thank You for Listening!
