delilah-logan
Multivariate Analysis
One-way ANOVA
Tests the difference in the means of 2 or more nominal groups
  E.g., High vs. Medium vs. Low exposure
ANOVA can also be used with more than one IV
  Two-way ANOVA, Three-way ANOVA, etc.
ANOVA
_______-way ANOVA
  The number refers to the number of IVs
Tests whether there are differences in the means of IV groups, e.g.:
  Experimental vs. control group
  Women vs. Men
  High vs. Medium vs. Low exposure
Logic of ANOVA
Variance is partitioned into:
  1. Systematic variance: the result of the influence of the IVs
  2. Error variance: the result of unknown factors
The variation in scores is partitioned into two parts by calculating the “sum of squares”:
  1. Between-groups variation (systematic)
  2. Within-groups variation (error)
SS total = SS between + SS within
Significant and Non-significant Differences
Significant: Between > Within
Non-significant: Within > Between
Partitioning the Variance Comparisons
Total variation = score – grand mean
Between variation = group mean – grand mean
Within variation = score – group mean
Each deviation is taken, then squared, then summed across cases
  Hence the term “sum of squares” (SS)
One-way ANOVA example
Total SS (deviation from grand mean)
          Group A   Group B   Group C
             49        56        54
             52        57        52
             52        57        56
             53        60        50
             49        60        53
Mean =       51        58        53
Grand mean = 54
One-way ANOVA example
Total SS (deviation from grand mean)
Each group shows the deviation, then the squared deviation.
Group A       Group B       Group C
 -5   25        2    4        0    0
 -2    4        3    9       -2    4
 -2    4        3    9        2    4
 -1    1        6   36       -4   16
 -5   25        6   36       -1    1
Sum of squares = 59 + 94 + 25 = 178
One-way ANOVA example
Between SS (group mean – grand mean)
                                     A     B     C
Group means                         51    58    53
Group deviation from grand mean     -3     4    -1
Squared deviation                    9    16     1
n(squared deviation)                45    80     5
Between SS = 45 + 80 + 5 = 130
Grand mean = 54
One-way ANOVA example
Within SS (score – group mean)
                               A     B     C
Group means                   51    58    53
Deviation from group means    -2    -2     1
                               1    -1    -1
                               1    -1     3
                               2     2    -3
                              -2     2     0
Squared deviations             4     4     1
                               1     1     1
                               1     1     9
                               4     4     9
                               4     4     0
Within SS = 14 + 14 + 20 = 48
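The three sums of squares in the worked example can be recomputed from the raw scores. The following is a minimal Python sketch; the function and variable names are illustrative and do not come from the slides.

```python
# Scores from the one-way ANOVA example (five cases per group).
groups = {
    "A": [49, 52, 52, 53, 49],
    "B": [56, 57, 57, 60, 60],
    "C": [54, 52, 56, 50, 53],
}


def mean(values):
    return sum(values) / len(values)


def sums_of_squares(groups):
    """Return (SS total, SS between, SS within) for a dict of score lists."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)
    # Total: squared deviation of every score from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Between: squared deviation of each group mean from the grand mean,
    # weighted by the number of cases in that group.
    ss_between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    # Within: squared deviation of every score from its own group mean.
    ss_within = sum(
        (x - mean(scores)) ** 2
        for scores in groups.values()
        for x in scores
    )
    return ss_total, ss_between, ss_within


ss_total, ss_between, ss_within = sums_of_squares(groups)
print(ss_total, ss_between, ss_within)  # 178.0 130.0 48.0
```

The printed values match the slide totals, and SS between plus SS within equals SS total.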
The F equation for ANOVA
F = [Between groups sum of squares / (k - 1)] / [Within groups sum of squares / (N - k)]
N = total number of subjects
k = number of groups
Numerator = Mean square between groups
Denominator = Mean square within groups
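As a quick check of the formula, the sketch below plugs in the sums of squares from the worked example (130 between, 48 within, k = 3, N = 15). The function name `f_ratio` is an illustrative choice, not something defined in the slides.

```python
def f_ratio(ss_between, ss_within, n_groups, n_total):
    """F = mean square between groups / mean square within groups."""
    ms_between = ss_between / (n_groups - 1)      # numerator, df = k - 1
    ms_within = ss_within / (n_total - n_groups)  # denominator, df = N - k
    return ms_between / ms_within


# Worked example: SS between = 130, SS within = 48, k = 3 groups, N = 15.
f_observed = f_ratio(130, 48, n_groups=3, n_total=15)
print(f_observed)  # 16.25
```

The mean squares are 65 and 4, so the ratio is 16.25 on (2, 12) degrees of freedom.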
F-table page 1
95% POINTS FOR THE F DISTRIBUTION
Columns: numerator degrees of freedom. Rows: denominator degrees of freedom.
          1     2     3     4     5     6     7     8     9    10
   1    161   199   216   225   230   234   237   239   241   242
   2   18.5  19.0  19.2  19.2  19.3  19.3  19.4  19.4  19.4  19.4
   3   10.1  9.55  9.28  9.12  9.01  8.94  8.89  8.85  8.81  8.79
   4   7.71  6.94  6.59  6.39  6.26  6.16  6.09  6.04  6.00  5.96
   5   6.61  5.79  5.41  5.19  5.05  4.95  4.88  4.82  4.77  4.74
   6   5.99  5.14  4.76  4.53  4.39  4.28  4.21  4.15  4.10  4.06
   7   5.59  4.74  4.35  4.12  3.97  3.87  3.79  3.73  3.68  3.64
   8   5.32  4.46  4.07  3.84  3.69  3.58  3.50  3.44  3.39  3.35
   9   5.12  4.26  3.86  3.63  3.48  3.37  3.29  3.23  3.18  3.14
  10   4.96  4.10  3.71  3.48  3.33  3.22  3.14  3.07  3.02  2.98
  11   4.84  3.98  3.59  3.36  3.20  3.09  3.01  2.95  2.90  2.85
  12   4.75  3.89  3.49  3.26  3.11  3.00  2.91  2.85  2.80  2.75
  13   4.67  3.81  3.41  3.18  3.03  2.92  2.83  2.77  2.71  2.67
  14   4.60  3.74  3.34  3.11  2.96  2.85  2.76  2.70  2.65  2.60
  15   4.54  3.68  3.29  3.06  2.90  2.79  2.71  2.64  2.59  2.54
  16   4.49  3.63  3.24  3.01  2.85  2.74  2.66  2.59  2.54  2.49
  17   4.45  3.59  3.20  2.96  2.81  2.70  2.61  2.55  2.49  2.45
  18   4.41  3.55  3.16  2.93  2.77  2.66  2.58  2.51  2.46  2.41
  19   4.38  3.52  3.13  2.90  2.74  2.63  2.54  2.48  2.42  2.38
  20   4.35  3.49  3.10  2.87  2.71  2.60  2.51  2.45  2.39  2.35
  21   4.32  3.47  3.07  2.84  2.68  2.57  2.49  2.42  2.37  2.32
  22   4.30  3.44  3.05  2.82  2.66  2.55  2.46  2.40  2.34  2.30
  23   4.28  3.42  3.03  2.80  2.64  2.53  2.44  2.37  2.32  2.27
  24   4.26  3.40  3.01  2.78  2.62  2.51  2.42  2.36  2.30  2.25
  25   4.24  3.39  2.99  2.76  2.60  2.49  2.40  2.34  2.28  2.24
  26   4.23  3.37  2.98  2.74  2.59  2.47  2.39  2.32  2.27  2.22
  27   4.21  3.35  2.96  2.73  2.57  2.46  2.37  2.31  2.25  2.20
  28   4.20  3.34  2.95  2.71  2.56  2.45  2.36  2.29  2.24  2.19
  29   4.18  3.33  2.93  2.70  2.55  2.43  2.35  2.28  2.22  2.18
  30   4.17  3.32  2.92  2.69  2.53  2.42  2.33  2.27  2.21  2.16
  35   4.12  3.27  2.87  2.64  2.49  2.37  2.29  2.22  2.16  2.11
  40   4.08  3.23  2.84  2.61  2.45  2.34  2.25  2.18  2.12  2.08
  50   4.03  3.18  2.79  2.56  2.40  2.29  2.20  2.13  2.07  2.03
  60   4.00  3.15  2.76  2.53  2.37  2.25  2.17  2.10  2.04  1.99
  70   3.98  3.13  2.74  2.50  2.35  2.23  2.14  2.07  2.02  1.97
  80   3.96  3.11  2.72  2.49  2.33  2.21  2.13  2.06  2.00  1.95
 100   3.94  3.09  2.70  2.46  2.31  2.19  2.10  2.03  1.97  1.93
 150   3.90  3.06  2.66  2.43  2.27  2.16  2.07  2.00  1.94  1.89
 300   3.87  3.03  2.63  2.40  2.24  2.13  2.04  1.97  1.91  1.86
1000   3.85  3.00  2.61  2.38  2.22  2.11  2.02  1.95  1.89  1.84
Significance of F
F-critical is 3.89 (2, 12 df)
F observed 16.25 > F critical 3.89
Groups are significantly different
Follow-up t-tests could then be run to determine which groups are significantly different from which other groups
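The slides do not say which t-test to use for the follow-up comparisons. As one possible sketch, the code below computes an ordinary pooled-variance two-sample t statistic for each pair of groups in the worked example. The choice of test, the lack of any multiple-comparison adjustment, and the name `pooled_t` are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt

# Scores from the one-way ANOVA example.
groups = {
    "A": [49, 52, 52, 53, 49],
    "B": [56, 57, 57, 60, 60],
    "C": [54, 52, 56, 50, 53],
}


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    df = len(a) + len(b) - 2
    pooled_var = (ss_a + ss_b) / df
    std_err = sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (mean_a - mean_b) / std_err, df


# Compare every pair of groups (assumption: no multiple-comparison correction).
for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
    t, df = pooled_t(a, b)
    print(f"{name_a} vs {name_b}: t = {t:.2f}, df = {df}")
```

With 8 degrees of freedom the two-tailed .05 critical value of t is about 2.31, so under these assumptions A vs. B and B vs. C exceed it while A vs. C does not.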
Computer Printout Example
Descriptives
GAVE 'THE FINGER' TO SOMEONE WHILE DRIVI
                                                95% Confidence Interval for Mean
        N      Mean     Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
1.00    1462   1.7148   1.28915          .03372       1.6486        1.7809        1.00      7.00
2.00    1858   1.3660   .93491           .02169       1.3234        1.4085        1.00      7.00
Total   3320   1.5196   1.11830          .01941       1.4815        1.5576        1.00      7.00

ANOVA
GAVE 'THE FINGER' TO SOMEONE WHILE DRIVI
                 Sum of Squares   df     Mean Square   F        Sig.
Between Groups   99.536           1      99.536        81.522   .000
Within Groups    4051.191         3318   1.221
Total            4150.727         3319
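The Mean Square and F columns of the ANOVA printout follow from the Sum of Squares and df columns. A short Python check, using only numbers copied from the printout (the variable names are illustrative):

```python
# Sums of squares and degrees of freedom copied from the ANOVA printout.
ss_between, df_between = 99.536, 1
ss_within, df_within = 4051.191, 3318

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F = mean square between groups / mean square within groups.
f_value = ms_between / ms_within

print(round(ms_between, 3), round(ms_within, 3), round(f_value, 3))
```

Rounded to three decimals, the results agree with the printed Mean Square and F values.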
Two-way ANOVA
ANOVA compares:
  Between- and within-groups variance
Adds a second IV to one-way ANOVA
  2 IVs and 1 DV
Analyzes significance of:
  Main effects of each IV
  Interaction effect of the IVs
Graphs of potential outcomes
No main effects or interactions
Main effects of color only
Main effects for motion only
Main effects for color and motion
Interactions
Graphs
[Figures: six line graphs of AROUSAL (vertical axis) for Color vs. B&W (horizontal axis), with separate lines for Motion (x) and Still (*). Panel titles:]
  No main effects or interactions
  Main effects for color only
  Main effects for motion only
  Main effects for color and motion
  Transverse interaction
  Interaction: color only makes a difference for motion
Partitioning the variance for Two-way ANOVA
Total variation =
  Main effect variable 1 +
  Main effect variable 2 +
  Interaction +
  Residual (within)
Summary Table for Two-way ANOVA
Source          SS    df    MS    F
Main effect 1
Main effect 2
Interaction
Within
Total
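To show how the rows of the summary table are filled in, here is a minimal Python sketch for a balanced 2 x 2 design. The arousal scores are invented for illustration and are not data from the slides. The function name `two_way_anova` and the shortcut of computing the interaction as the cell variation minus both main effects (valid for balanced designs) are likewise assumptions of this sketch.

```python
# Hypothetical arousal scores for a balanced 2 x 2 design (3 cases per cell).
# These numbers are invented for illustration; they are not from the slides.
cells = {
    ("Color", "Motion"): [7, 8, 9],
    ("Color", "Still"): [4, 5, 6],
    ("B&W", "Motion"): [5, 6, 7],
    ("B&W", "Still"): [2, 3, 4],
}


def mean(values):
    return sum(values) / len(values)


def two_way_anova(cells):
    """Summary table for a balanced two-way design.

    Returns {source: (SS, df, MS, F)}; unused entries are None.
    """
    levels_1 = sorted({a for a, _ in cells})
    levels_2 = sorted({b for _, b in cells})
    scores = [x for v in cells.values() for x in v]
    grand = mean(scores)
    n_cell = len(next(iter(cells.values())))  # same size in every cell

    def ss_main(levels, position):
        # Squared deviation of each level mean from the grand mean,
        # weighted by the number of cases at that level.
        total = 0.0
        for level in levels:
            level_scores = [
                x for key, v in cells.items() if key[position] == level for x in v
            ]
            total += len(level_scores) * (mean(level_scores) - grand) ** 2
        return total

    ss_1 = ss_main(levels_1, 0)
    ss_2 = ss_main(levels_2, 1)
    ss_cells = sum(n_cell * (mean(v) - grand) ** 2 for v in cells.values())
    ss_inter = ss_cells - ss_1 - ss_2  # balanced-design shortcut
    ss_within = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)
    ss_total = sum((x - grand) ** 2 for x in scores)

    df_1, df_2 = len(levels_1) - 1, len(levels_2) - 1
    df_inter = df_1 * df_2
    df_within = len(scores) - len(cells)
    ms_within = ss_within / df_within

    def row(ss, df):
        ms = ss / df
        return ss, df, ms, ms / ms_within

    return {
        "Main effect 1": row(ss_1, df_1),
        "Main effect 2": row(ss_2, df_2),
        "Interaction": row(ss_inter, df_inter),
        "Within": (ss_within, df_within, ms_within, None),
        "Total": (ss_total, len(scores) - 1, None, None),
    }


for source, (ss, df, ms, f) in two_way_anova(cells).items():
    print(source, ss, df, ms, f)
```

With these made-up scores both main effects are non-zero and the interaction sum of squares is zero, which corresponds to the "main effects for color and motion" outcome.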
Printout Example
Tests of Between-Subjects Effects
Dependent Variable: MARIJUANA USE SHOULD BE LEGALIZED
Source            Type III Sum of Squares   df     Mean Square   F          Sig.
Corrected Model   74.465a                   7      10.638        3.392      .001
Intercept         5889.077                  1      5889.077      1877.565   .000
SEX               13.191                    1      13.191        4.205      .040
RACE2             19.048                    3      6.349         2.024      .108
SEX * RACE2       .560                      3      .187          .060       .981
Error             10366.297                 3305   3.137
Total             31942.000                 3313
Corrected Total   10440.762                 3312
a. R Squared = .007 (Adjusted R Squared = .005)
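The F ratios and the R Squared footnote can be re-derived from the sums of squares and degrees of freedom in the printout. The sketch below does so in Python. It uses the usual definitions (F as a mean square over the error mean square, R squared as model SS over corrected total SS), which the slides do not spell out, and the results agree with the printout only to within rounding of the printed inputs.

```python
# Rows copied from the printout: source -> (Type III sum of squares, df).
rows = {
    "Corrected Model": (74.465, 7),
    "SEX": (13.191, 1),
    "RACE2": (19.048, 3),
    "SEX * RACE2": (0.560, 3),
}
ss_error, df_error = 10366.297, 3305
ss_corrected_total, df_corrected_total = 10440.762, 3312

ms_error = ss_error / df_error

# F for each source is its mean square divided by the error mean square.
f_values = {name: (ss / df) / ms_error for name, (ss, df) in rows.items()}

# R squared is the share of the corrected total explained by the model;
# the adjusted version compares mean squares instead of sums of squares.
r_squared = rows["Corrected Model"][0] / ss_corrected_total
adj_r_squared = 1 - ms_error / (ss_corrected_total / df_corrected_total)

for name, f in f_values.items():
    print(f"{name}: F = {f:.3f}")
print(f"R squared = {r_squared:.3f}, adjusted = {adj_r_squared:.3f}")
```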
Printout plot
[Figure: Estimated Marginal Means of MARIJUANA USE SHOULD BE LEGALIZED. Horizontal axis: SEX OF RESPONDENT (1.00, 2.00). Vertical axis: Estimated Marginal Means (2.2 to 3.2). Separate lines for RACE OF RESPONDENT(W: 1.00, 2.00, 3.00, 4.00.]
Scatter Plot of Price and Attendance
Price is the average seat price for a single regular season game in today’s dollars
Attendance is total annual attendance and is in millions of people per annum.
[Figure: scatter plot of Attendance (vertical axis, 0.5 to 2.5) against Price (horizontal axis, 2 to 6).]
Is there a relation there?
Let’s use linear regression to find out; that is, let’s fit a straight line to the data.
But aren’t there lots of straight lines that could fit?
  Yes!
Desirable Properties
We would like the “closest” line, that is, the one that minimizes the error
  The idea here is that there is actually a relation, but there is also noise. We would like the noise (i.e., the deviation from the postulated straight line) to be as small as possible.
We would like the error (or noise) to be unrelated to the independent variable (in this case, price).
  If it were related, it would not be noise.
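Both properties are delivered by ordinary least squares: it picks the line with the smallest sum of squared errors, and the resulting residuals are uncorrelated with x. The sketch below illustrates this in Python. The (price, attendance) pairs are invented for illustration because the points in the scatter plot are not given in the text, and `fit_line` is a hypothetical helper name.

```python
# Hypothetical (price, attendance) pairs, invented for illustration only;
# the slide's actual data points are not given in the text.
prices = [2.0, 3.0, 4.0, 5.0, 6.0]
attendance = [2.3, 2.0, 1.6, 1.1, 1.0]


def fit_line(x, y):
    """Ordinary least squares for one predictor: return (intercept, slope)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = fit_line(prices, attendance)
residuals = [y - (b0 + b1 * x) for x, y in zip(prices, attendance)]

# Least squares forces the residuals to sum to zero and to be
# uncorrelated with x (both sums are zero up to rounding error).
print(b0, b1)
print(sum(residuals), sum(x * e for x, e in zip(prices, residuals)))
```

For these made-up points the fitted intercept is about 3.0 and the slope about -0.35, so attendance falls as price rises.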
Simple Regression
The simple linear regression MODEL is:
  y = β0 + β1x + ε
The model describes how y is related to x
β0 and β1 are called parameters of the model.
ε is a random variable called the error term.
Simple Regression
Graph of the regression equation is a straight line.
β0 is the population y-intercept of the regression line.
β1 is the population slope of the regression line.
E(y) is the expected value of y for a given x value
Simple Regression
[Figure: regression line of E(y) against x with intercept β0; slope β1 is positive.]
Simple Regression
[Figure: regression line of E(y) against x with intercept β0; slope β1 is 0.]
Types of Regression Models
Regression Models
  1 Explanatory Variable: Simple
    Linear
    Non-Linear
  2+ Explanatory Variables: Multiple
    Linear
    Non-Linear
Regression Modeling Steps
1. Hypothesize Deterministic Components
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation
Linear Multiple Regression Model
1. Relationship between 1 dependent & 2 or more independent variables is a linear function
  Yi = β0 + β1X1i + β2X2i + ... + βkXki + εi
  Yi: dependent (response) variable
  X1i, ..., Xki: independent (explanatory) variables
  β1, ..., βk: population slopes
  β0: population Y-intercept
  εi: random error
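One common way to estimate the intercept and slopes of this model is least squares through the normal equations. The slides do not prescribe an estimation method, so the following Python sketch is only an illustration under that assumption. The data are generated from known coefficients with no error term, so the fit can be checked against them. The helper names `solve` and `fit_multiple` are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple(xs, y):
    """Least squares estimates [b0, b1, ..., bk] via the normal equations."""
    design = [[1.0] + list(row) for row in xs]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data built from Y = 1 + 2*X1 - 3*X2 with no error term,
# so the fit should recover those coefficients (up to rounding error).
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5), (6, 2)]
y = [1 + 2 * x1 - 3 * x2 for x1, x2 in xs]
print(fit_multiple(xs, y))
```

With real data the error term is not zero, and the estimates would only approximate the population intercept and slopes.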
[Figure: response plane E(Y) = β0 + β1X1i + β2X2i drawn over the X1 and X2 axes, with intercept β0 on the Y axis. The observed Y at the point (X1i, X2i) is Yi = β0 + β1X1i + β2X2i + εi, at a vertical distance εi from the plane.]
Multiple Regression Model
Multivariate model