13/01/2016
1
ANOVA – Analysis of Variance
Simona Guglielmi, PhD Department of Social and Political Sciences
University of Milan
Objectives
• To review descriptive statistics (measures of central tendency and of dispersion)
• To learn how to test for significant differences in means across two or more groups (one-way ANOVA)
References
• David Knoke, George W. Bohrnstedt, Alisa Potter Mee, Statistics for Social Data Analysis, 2002, 4th Edition (Chapter 4)
• https://statistics.laerd.com/statistical-guides/one-way-anova-statistical-guide.php
• https://en.wikipedia.org/wiki/F-test
Level of measurement
Numeric variable (ratio/interval scale): a variable that can take discrete or continuous values along a range (e.g. age, salary, level of trust in people on a 1-to-10 point scale).
Ordinal variable (ranking): allows rank ordering (1st, 2nd, 3rd, etc.) by which data can be sorted, but the distances between values have no meaning (e.g. educational level: 0 = less than H.S.; 1 = some H.S.; 2 = H.S. degree; 3 = some college; 4 = college degree; 5 = post college; or: 1 = often, 2 = sometimes, 3 = never).
Nominal variable (classification): numbers may be used to represent the categories, but the numbers have no numerical value or relationship (e.g. gender: 1 = male, 2 = female; religion: 1 = Catholic, 2 = Muslim, 3 = other).
Measure of central tendency
Mean. The arithmetic average, the sum divided by the number of
cases. (NUMERIC VARIABLES)
Median. The value above and below which half of the cases fall, the
50th percentile. If there is an even number of cases, the median is the
average of the two middle cases when they are sorted in ascending or
descending order. The median is a measure of central tendency not
sensitive to outlying values (unlike the mean, which can be affected
by a few extremely high or low values). – ORDINAL AND NUMERIC
VARIABLES
Mode. The most frequently occurring value. If several values share
the greatest frequency of occurrence, each of them is a mode. The
Frequencies procedure reports only the smallest of such multiple
modes. (NOMINAL, ORDINAL, NUMERIC VARIABLES)
Measures of dispersion
• DISPERSION. Statistics that measure the amount of variation in the data (NUMERIC VARIABLES)
Variance. The arithmetic mean of the squares of the deviations of all values in a set of numbers from their arithmetic mean. Variance and its square root (the standard deviation) are of fundamental importance as a measure of dispersion.
Std. deviation. A measure of dispersion around the mean. In a normal distribution, 68% of cases fall within one standard deviation of the mean and 95% of cases fall within two standard deviations. For example, if the mean age is 45, with a standard deviation of 10, 95% of the cases would be between 25 and 65 in a normal distribution.
Minimum. The smallest value of a numeric variable.
Maximum. The largest value of a numeric variable.
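As a minimal check, all the measures above can be computed with Python's standard statistics module. The ages below are made up purely for illustration; note how the single outlier (95) pulls the mean above the median.

```python
import statistics

# Hypothetical ages for nine respondents (made-up data).
ages = [25, 30, 30, 35, 40, 45, 50, 55, 95]

mean = statistics.mean(ages)      # arithmetic average; pulled upward by the outlier 95
median = statistics.median(ages)  # 50th percentile; robust to the outlier
mode = statistics.mode(ages)      # most frequently occurring value
var = statistics.pvariance(ages)  # mean of squared deviations from the mean
sd = statistics.pstdev(ages)      # square root of the variance

print(mean, median, mode, var, sd)  # 45 40 30 400 20.0
```

The mean (45) exceeds the median (40) precisely because the mean is sensitive to the extreme value, as described above.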
Frequencies statistics (example,
numeric variable)
To sum up
• Look at the output:

Statistics – trstprl Trust in country's parliament
N Valid           3287
N Missing           48
Mean              4.36
Median            5.00
Mode                 5
Std. Deviation   2.312
Minimum              0
Maximum             10
The mean (4.36) is below 5, the midpoint of the 0-10 trust scale. So, on average, the level of trust in the country's parliament is low. The mean is smaller than the median, but close to it.
The standard deviation is 2.312: the data points are spread out over a fairly large range of values.
48 missing cases: what are they? Look at the frequencies table (48 people answered "don't know").
The ANOVA approach
• Analysis of variance (ANOVA) is a hypothesis test for comparing the means of a numeric variable (dependent variable) across two or more groups (independent variable)
• In other terms, it is a way to examine the impact of a
group classification on a continuous dependent variable
• Independent variables explain or predict a response or
an outcome, which is the dependent variable under
study.
The logic of ANOVA
Examples:
• In a clinical trial to evaluate a new medication, investigators might compare an experimental medication to a placebo and to a standard treatment (i.e., a medication currently being used).
• In social and political research, researchers might be interested in comparing attitudes (e.g. nationalism) or behaviors (e.g. attendance at religious services) across groups (by age, gender, country, …)
Some research questions (examples):
• Do adults living in different regions of the country vary in terms of how often they attend religious services?
• Are people voting for right-wing parties more likely to be nationalist than people voting for left-wing parties?
The one-way ANOVA
To run an ANOVA:
• The independent variable is a nominal/ordinal variable with a set of discrete categories (e.g. age group, gender, …)
• The dependent variable is a numeric measure
Consider an example with
• 4 independent groups and 1 numeric outcome measure.
• The independent groups might be defined by a particular characteristic of the participants such as country (e.g., Italy, France, Germany, Spain ) or by the investigator (e.g., randomizing participants to one of four competing treatments, call them A, B, C and D).
Null and Research Hypotheses
The hypotheses of interest in an ANOVA are as follows:
• H0: μ1 = μ2 = μ3 = ... = μk (means are equal)
where k = the number of independent comparison groups.
If the k group means all equal one another, then they also all equal the population grand mean.
This means that the grouping has no effect on the dependent variable.
• H1: Means are not all equal (research hypothesis).
The null hypothesis (2)
• The null hypothesis in ANOVA is always that there is no difference
in means.
• The research (or alternative) hypothesis is always that the means are
not all equal
• The research hypothesis captures any difference in means
and includes, for example, the situation where all four means are
unequal, where one is different from the other three, where two are
different, and so on
• In ANOVA we are testing for a difference in means (H0:
means are all equal versus H1: means are not all equal) by
evaluating variability in the data
The ANOVA Model
• The general model for an ANOVA with one independent variable decomposes an observed score Y into 3 components:
Yij = μ + aj + eij
where
Yij = the score of the ith observation in the jth group (e.g. level of attendance at religious services for case i in group j)
μ = the grand mean, common to all cases in the population
aj = the effect of group j, common to every case in that group
eij = the error score, unique to the ith case in the jth group
The error term or residual (eij = Yij - μ - aj) is the part of an observed score that cannot be attributed to either the common component or the group component.
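The decomposition above can be checked numerically. This is a minimal sketch with two made-up groups of scores (the values are hypothetical, chosen only for illustration):

```python
# Two made-up groups of scores (e.g. hypothetical attendance levels).
groups = {"A": [2.0, 4.0, 6.0], "B": [6.0, 8.0, 10.0]}

all_scores = [y for ys in groups.values() for y in ys]
grand_mean = sum(all_scores) / len(all_scores)   # mu, common to all cases

for j, ys in groups.items():
    group_mean = sum(ys) / len(ys)
    a_j = group_mean - grand_mean                # effect of group j
    for y in ys:
        e_ij = y - grand_mean - a_j              # residual, unique to this case
        # Every score reassembles exactly from the three components:
        assert abs(y - (grand_mean + a_j + e_ij)) < 1e-12
```

Here the grand mean is 6, group A's effect is -2 and group B's effect is +2; whatever a score cannot be attributed to those two components goes into the residual.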
Source of variability
• Total variation is decomposed into two
components:
• Between-group variation: variability that is
due to differences among groups, also called
explained variation.
• Within-group variation: variability within each of the groups; this is unexplained variation.
ANOVA Tables: Sum of Squares
• Total sum of squares: obtained by subtracting the overall mean from each score in the distribution, squaring these deviations, and summing them.
• Between sum of squares: obtained by subtracting the grand mean from each group mean, squaring this difference for every individual in the group, and summing across all individuals. It summarizes the effect of the independent classification variable under study.
• Within sum of squares: obtained by subtracting each subgroup's mean from each observed score in that subgroup, squaring, and summing. It reflects unmeasured factors.
• SS total = SS between + SS within
• See example https://en.wikipedia.org/wiki/F-test
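The identity SS total = SS between + SS within can be verified on a small made-up example (three hypothetical groups of three scores each):

```python
# Three made-up groups; group means are 5, 8 and 11, grand mean is 8.
groups = [[3.0, 5.0, 7.0], [6.0, 8.0, 10.0], [9.0, 11.0, 13.0]]

all_y = [y for g in groups for y in g]
grand = sum(all_y) / len(all_y)

# Total SS: squared deviations of every score from the grand mean.
ss_total = sum((y - grand) ** 2 for y in all_y)
# Between SS: squared deviation of each group mean from the grand mean,
# counted once per case in the group.
ss_between = sum(len(g) * ((sum(g) / len(g)) - grand) ** 2 for g in groups)
# Within SS: squared deviations of each score from its own group mean.
ss_within = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

assert abs(ss_total - (ss_between + ss_within)) < 1e-9   # 78 = 54 + 24
```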
ANOVA Tables: Mean Squares
• Mean squares are estimates of variance, computed by dividing each sum of squares by its degrees of freedom (df). The df of MS between is the number of levels of the independent variable minus 1 (k-1); the df of MS within is the number of cases minus the number of levels of the independent variable (n-k).
• If a significant group effect exists, the between-group variance (mean square between) will be larger than the within-group variance (mean square within): MS between > MS within.
• If no group effect exists, the mean squares corresponding to SS between and SS within have the same expected value (MS between ≈ MS within).
• The F test is computed by taking the ratio of the "between" variance to the "residual or error" (within) variance: MS between / MS within.
The formula of the F test
The one-way ANOVA F-test statistic is the ratio of the explained to the unexplained variance (F = MS between / MS within).
The "explained variance" (between-group variability) is
  Σᵢ nᵢ (Ȳᵢ − Ȳ)² / (K − 1)
where Ȳᵢ denotes the sample mean in the ith group, nᵢ is the number of observations in the ith group, Ȳ denotes the overall mean of the data, and K denotes the number of groups.
The "unexplained variance"/"within-group variability" is
  Σᵢ Σⱼ (Yᵢⱼ − Ȳᵢ)² / (N − K)
where Yᵢⱼ is the jth observation in the ith of the K groups and N is the overall sample size.
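As a sketch of the computation (the three groups are made up, and SciPy is assumed to be available), the F statistic computed by hand from the mean squares can be checked against scipy.stats.f_oneway:

```python
from scipy.stats import f_oneway

# Made-up data: three groups of three observations each.
groups = [[3.0, 5.0, 7.0], [6.0, 8.0, 10.0], [9.0, 11.0, 13.0]]
k = len(groups)                         # number of groups K
n = sum(len(g) for g in groups)         # overall sample size N
grand = sum(y for g in groups for y in g) / n

# MS between: between sum of squares divided by K-1.
ms_between = sum(len(g) * ((sum(g) / len(g)) - grand) ** 2 for g in groups) / (k - 1)
# MS within: within sum of squares divided by N-K.
ms_within = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g) / (n - k)
f_manual = ms_between / ms_within       # F = MS between / MS within

f_scipy, p_value = f_oneway(*groups)
assert abs(f_manual - f_scipy) < 1e-9   # the two computations agree
```

With these numbers, MS between = 54/2 = 27 and MS within = 24/6 = 4, so F = 6.75.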
The F test
• The F-test in one-way analysis of variance is used to assess
whether the expected values of a quantitative variable within
several pre-defined groups differ from each other.
• The F test is computed by taking the ratio of the "between" variance to the "residual or error" (within) variance:
• F = MS between / MS within
• The numerator captures between variability (i.e., differences among the sample means) and the denominator contains an estimate of the variability in the outcome.
The F test – interpreting output
• We use the F test to test H0: μ1 = μ2 = ... = μk
• The statistic will be large if the between-group variability is large
relative to the within-group variability, which is unlikely to
happen if the population means of the groups all have the same
value.
• This F-statistic follows the F-distribution with K−1, N −K
degrees of freedom under the null hypothesis.
• Note that when there are only two groups for the one-way
ANOVA F-test, F=t2 where t is the Student's t statistic.
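The F = t² relationship for two groups can be illustrated directly (SciPy assumed available; the two groups below are made up):

```python
from scipy.stats import f_oneway, ttest_ind

# Two made-up groups of equal size.
a = [4.0, 5.0, 6.0, 7.0]
b = [6.0, 7.0, 8.0, 9.0]

f_stat, p_f = f_oneway(a, b)      # one-way ANOVA with two groups
t_stat, p_t = ttest_ind(a, b)     # classic pooled-variance Student's t test

assert abs(f_stat - t_stat ** 2) < 1e-9   # F equals t squared
assert abs(p_f - p_t) < 1e-9              # and the p values coincide
```

With only two groups, the two tests are therefore interchangeable.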
The decision rule – F test
• If the null hypothesis is true, the between variation (numerator) will not exceed the residual or error variation (denominator), and the F statistic will be small.
• If the null hypothesis is false, then the F statistic will be large.
The decision rule is:
• Reject H0 if F > critical value. The appropriate critical value can be found in a table of probabilities for the F distribution (based on the df).
• Equivalently, reject H0 if the p value < significance level α (e.g. 0.05 or 0.01).
• If the p value (denoted by "Sig.") associated with the F test is 0.028, this means that if the means are equal in the population, we only have a 2.8% chance of finding the differences that we observe in our sample. The null hypothesis is usually rejected if p < 0.05.
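Instead of a printed table, the critical value and the p value can be obtained from the F distribution itself (SciPy assumed available; the df and the observed F below are illustrative):

```python
from scipy.stats import f

alpha = 0.05
df_between, df_within = 2, 6   # K-1 and N-K, e.g. 3 groups of 3 cases each

# Critical value: the point the F statistic must exceed at level alpha.
critical = f.ppf(1 - alpha, df_between, df_within)

f_observed = 6.75              # a hypothetical observed F statistic
# p value: probability of an F at least this large if H0 is true.
p_value = f.sf(f_observed, df_between, df_within)

# The two forms of the decision rule agree:
assert (f_observed > critical) == (p_value < alpha)
```

For df (2, 6) the 5% critical value is about 5.14, so an observed F of 6.75 leads to rejecting H0.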
The correlation ratio: Eta squared
• If the ANOVA allows us to reject the null hypothesis, the next question is: how strong is the relationship between the variables?
• The strength of the relationship can be assessed by computing the correlation ratio, or eta-squared (η2):
• η2 = SS between / SS total
• It always ranges from 0 to 1.
• η2 = 0.05 means that 5% of the variance of the DV (e.g. level of attendance at religious services) can be explained statistically by the grouping (e.g. region).
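Eta squared follows directly from the sums of squares; a minimal sketch with made-up groups:

```python
# Made-up data: three groups with means 5, 8 and 11.
groups = [[3.0, 5.0, 7.0], [6.0, 8.0, 10.0], [9.0, 11.0, 13.0]]

all_y = [y for g in groups for y in g]
grand = sum(all_y) / len(all_y)

ss_total = sum((y - grand) ** 2 for y in all_y)
ss_between = sum(len(g) * ((sum(g) / len(g)) - grand) ** 2 for g in groups)

# Share of the DV's variance explained by the grouping.
eta_sq = ss_between / ss_total
```

Here SS between = 54 and SS total = 78, so η2 ≈ 0.69: about 69% of the variance in the outcome is accounted for by group membership in this toy example.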
ANOVA summary table
• The results of a one-way ANOVA are commonly presented in an ANOVA summary table
ANOVA summary table Column Description
Unlabeled (Source of variance)
The first column describes each row of the ANOVA summary table. The first row corresponds to the between-groups estimate of variance, which forms the numerator of the F ratio. The second row corresponds to the within-groups estimate of variance (the estimate of error), which forms the denominator of the F ratio. The final row describes the total variability in the data.
Sum of Squares The Sum of squares column gives the sum of squares for each of the estimates of variance. The sum of squares corresponds to the numerator of the variance ratio.
df
The degrees of freedom for the between-groups estimate of variance is given by the number of levels of the IV - 1. (k-1).The degrees of freedom for the within-groups estimate of variance is calculated by subtracting one from the number of people in each condition / category and summing across the conditions / categories (n-k)
Mean Square
The fourth column gives the estimates of variance (the mean squares.) Each mean square is calculated by dividing the sum of square by its degrees of freedom. MSBetween-groups = SSBetween-groups / dfBetween-groups MSWithin-groups = SSWithin-groups / dfWithin-groups
F The fifth column gives the F ratio. It is calculated by dividing mean square between-groups by mean square within-groups. F = MSBetween-groups / MSWithin-groups
Sig. The final column gives the significance of the F ratio. This is the p value. If the p value is less than or equal to your α level, then you can reject H0 that all the means are equal.
Testing hypotheses using one-way ANOVA
H1: Italian people have less trust in political parties than French people
H2: People interested in politics have more trust in political parties than people not interested in politics
▫ Dependent variable: level of trust in political parties (measured on a 1-10 point scale)
▫ Independent variables: country (2 categories); level of interest in politics (4 categories), or level of interest in politics recoded (2 categories)
ANOVA one way– Interpreting Output
If the significance value of the F test in the ANOVA table is more than 0.05, you cannot reject the hypothesis that means are equal across countries. If the significance value of the F test is less than 0.05, you must reject the hypothesis that means are equal across countries.
Here the differences are very small and not statistically significant, so nationality does not affect the level of trust in political parties (H1 is falsified): the null hypothesis is retained.
The mean decreases across the four groups: the average level of trust of people "very interested" is 4.1, while that of people "not at all interested" is 2.5. The differences are statistically significant (the sig. of the F test is less than 0.05): the research hypothesis is confirmed!
ANOVA one way– Interpreting Output
• A farmer wants to know whether the weight of parsley plants is influenced by using a fertilizer.
• He selects 90 plants and randomly divides them into three groups of 30 plants each.
• He applies a biological fertilizer to the first group, a chemical fertilizer to the second group and no fertilizer at all to the third group.
• After a month he weighs all plants.
• Can we conclude from these data that fertilizer affects weight?
ANOVA one way– Interpreting Output
http://www.spss-tutorials.com/spss-one-way-anova/
ANOVA one way– Interpreting Output
"N" in the first column refers to the number of cases used for calculating the descriptive statistics. The mean weights are the core of our output. After all, our main research question is whether these differ for different fertilizers. On average, parsley plants weigh some 51 grams if no fertilizer was used. Biological fertilizer results in an average weight of some 54 grams, whereas chemical fertilizer does best with a mean weight of 57 grams.
ANOVA one way– Interpreting Output
• The degrees of freedom (df) and F statistic are not immediately interesting, but we'll need them later on for reporting our results correctly.
• The p value (denoted by "Sig.") is .028. This means that if the population mean weights are exactly equal, we only have a 2.8% chance of finding the differences that we observe in our sample.
• The null hypothesis is usually rejected if p < .05, so we conclude that the mean weights of the three groups of plants are not equal.
• The weights of parsley plants are affected by the fertilizer - if any - that's used.
ANOVA one way is an omnibus test
• At this point, it is important to realize that the one-way
ANOVA is an omnibus test statistic and cannot tell
you which specific groups were significantly different
from each other, only that at least two groups were.
• To determine which specific groups differed from each
other, you need to use a post hoc test.
• Because post hoc tests are run to confirm where the differences occurred between groups, they should only be run when you have shown an overall significant difference in group means (i.e., a significant one-way ANOVA result).
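A sketch of this workflow (SciPy assumed available, version 1.8+ for tukey_hsd; the three groups are made up): run the omnibus F test first, then a Tukey HSD post hoc test only if the overall result is significant.

```python
from scipy.stats import f_oneway, tukey_hsd   # tukey_hsd needs SciPy >= 1.8

# Made-up group scores; group c is clearly separated from a and b.
a = [50.0, 51.0, 52.0]
b = [53.0, 54.0, 55.0]
c = [60.0, 61.0, 62.0]

f_stat, p_overall = f_oneway(a, b, c)         # omnibus test: are ALL means equal?
if p_overall < 0.05:
    # Post hoc pairwise comparisons, run only after a significant omnibus test.
    result = tukey_hsd(a, b, c)
    print(result)                             # pairwise differences with adjusted p values
```

The omnibus F only says that at least two means differ; the Tukey output then identifies which specific pairs are responsible.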
Advantages and disadvantages of the F test
• The advantage of the ANOVA F test is that we do not need to pre-specify which treatments/groups are to be compared, and we do not need to adjust for making multiple comparisons.
• The disadvantage of the ANOVA F test is that if we reject the null hypothesis, we do not know which treatments can be said to be significantly different from the others.
Two (or more) factor ANOVA
• The ANOVA tests described here are called one-factor/one
way ANOVAs. There is one grouping factor and we wish to
compare the means across the different categories of this factor.
• There are situations where it may be of interest to compare
means of a continuous outcome across two or more
factors. For example, suppose a clinical trial is designed to
compare five different treatments for joint pain in patients with
osteoarthritis. Investigators might also hypothesize that there
are differences in the outcome by sex.
• This is an example of a two-factor ANOVA where the
factors are treatment (with 5 levels) and sex (with 2 levels).