
Page 1: Multiple regression presentation

Multiple Regression and Correlation

Dr. Carlo Magno

Page 2: Multiple regression presentation

Y = a + bX

Bivariate correlation

Y = a + b1X1 + b2X2 + ... + bnXn

Multiple correlation
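A minimal sketch of fitting a multiple regression of this form in Python, assuming the statsmodels package is available; the data and variable names below are hypothetical:

import numpy as np
import statsmodels.api as sm

# Hypothetical data: criterion y and two predictors x1, x2
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
y = 2.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(scale=0.5, size=50)

X = sm.add_constant(np.column_stack([x1, x2]))  # adds the intercept term a
model = sm.OLS(y, X).fit()
print(model.params)    # a, b1, b2
print(model.rsquared)  # R^2, the squared multiple correlation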

Page 3: Multiple regression presentation

Multiple Regression: the association between a criterion variable and two or more predictor variables (Aron & Aron, 2003).

Multiple correlation coefficient = R

Using two or more variables to predict a criterion variable.

Page 4: Multiple regression presentation

Onwuegbuzie, A. J., Bailey, P., & Daley, C. E. (2000). Cognitive, affective, personality, and demographic predictors of foreign-language achievement. The Journal of Educational Research, 94, 3-15.

Criterion: Foreign Language Achievement

Predictors:

Cognitive: Academic Achievement, Study Habits, Expectation

Affective: Perception, Anxiety

Personality: Cooperativeness, Competitiveness

Demographic: Gender, Age

Page 5: Multiple regression presentation

Espin, C., Shin, J., Deno, S. L., Skare, S., Robinson, S., & Brenner, B. (2000). Identifying indicators of written expression proficiency for middle school students. The Journal of Special Education, 34, 140-153.

Criterion: Written Expression Proficiency

Predictors:

Words written

Words correct

Characters

Sentences

Characters per word

Words per sentence

Correct word sequences

Incorrect word sequences

Correct minus incorrect word sequences

Mean length of correct word sequences

Page 6: Multiple regression presentation

Results

Regression coefficient (b) / Beta weight (β): the distinct contribution of a variable, excluding any overlap with other predictor variables.

Unstandardized regression coefficient (b): expressed in the original units of the variables.

Standardized regression coefficient (β): obtained by converting the variables (independent and dependent) to z-scores before doing the regression; indicates which independent variable has the most effect on the dependent variable.
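As a hedged illustration of the difference, one way to obtain standardized coefficients is to z-score all variables before the regression; the sketch below uses NumPy only, and the data and variable names are made up:

import numpy as np

def zscore(v):
    return (v - v.mean()) / v.std(ddof=1)

rng = np.random.default_rng(1)
x1 = rng.normal(10, 2, size=100)   # predictors on their own scales
x2 = rng.normal(50, 10, size=100)
y = 3 + 0.8 * x1 + 0.1 * x2 + rng.normal(size=100)

# Unstandardized b: regress raw y on raw predictors (with an intercept column)
X = np.column_stack([np.ones(100), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]

# Standardized beta: regress z-scored y on z-scored predictors (intercept is ~0)
Z = np.column_stack([zscore(x1), zscore(x2)])
beta = np.linalg.lstsq(Z, zscore(y), rcond=None)[0]

print(b[1:], beta)  # betas are comparable across predictors; raw b values are not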

Page 7: Multiple regression presentation

Results

Multiple correlation coefficient (R): the correlation between the criterion variable and all the predictor variables taken together.

Squared Correlation Coefficient (R2) –The percent of variance in the dependent variable explained collectively by all of the independent variables.

R2adj (adjusted R2): assesses the goodness of fit of a regression equation, i.e. how well the predictors (regressors), taken together, explain the variation in the dependent variable, with a penalty for the number of predictors.

R2adj = 1 - (1 - R2)(N - 1)/(N - n - 1), where N is the number of cases and n is the number of predictors.
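A tiny numeric check of this formula, written as a Python sketch; the R2 value, case count, and predictor count are made up:

def adjusted_r2(r2, n_cases, n_predictors):
    # R2adj = 1 - (1 - R2)(N - 1) / (N - n - 1)
    return 1 - (1 - r2) * (n_cases - 1) / (n_cases - n_predictors - 1)

print(adjusted_r2(0.40, 50, 3))  # ~0.361, slightly below R2 = 0.40 as expected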

Page 8: Multiple regression presentation

R2adj

Values above 75% are considered very good; 50-75% good; 25-50% fair; below 25% poor and perhaps unacceptable. R2adj values above 90% are rare in psychological data.

Page 9: Multiple regression presentation

Residual: the deviation of an observed value from its predicted value on the regression line.

t-tests - used to assess the significance of individual b coefficients.

F test: used to test the significance of R, with

F = (R2/k) / [(1 - R2)/(n - k - 1)],

where k is the number of predictors and n is the number of cases.
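As a hedged sketch, the same F statistic and its p-value can be computed directly, assuming SciPy is available; the R2, sample size, and predictor count fed in are hypothetical:

from scipy import stats

def r2_f_test(r2, n, k):
    # F = (R2/k) / ((1 - R2)/(n - k - 1)), with df1 = k and df2 = n - k - 1
    f = (r2 / k) / ((1 - r2) / (n - k - 1))
    p = stats.f.sf(f, k, n - k - 1)
    return f, p

print(r2_f_test(0.40, 50, 3))  # roughly F(3, 46) = 10.2, p < .001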

Page 10: Multiple regression presentation

Considerations in using multiple regression:

The units (usually people) observed should be a random sample from some well defined population.

The dependent variable should be measured on an interval, continuous scale.

The independent variables should be measured on interval scales.

Page 11: Multiple regression presentation

Considerations in using multiple regression:

The distributions of all the variables should be normal.

The relationships between the dependent variable and the independent variable should be linear.

Although the independent variables can be correlated, there must be no perfect (or near-perfect) correlations among them, a situation called multicollinearity.
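One common screen for multicollinearity is the variance inflation factor (VIF); below is a hedged sketch using statsmodels, with deliberately near-collinear made-up data (a VIF around 10 or more is a widely used warning sign, not a fixed rule):

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
x1 = rng.normal(size=80)
x2 = 0.9 * x1 + rng.normal(scale=0.2, size=80)  # deliberately near-collinear with x1
x3 = rng.normal(size=80)

X = sm.add_constant(np.column_stack([x1, x2, x3]))
for i in range(1, X.shape[1]):                  # skip the constant column
    print(f"VIF for predictor {i}: {variance_inflation_factor(X, i):.1f}")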

Page 12: Multiple regression presentation

Considerations in using multiple regression:

There must be no interactions (in the ANOVA sense) between independent variables.

A rule of thumb for testing b coefficients is to have N >= 104 + m, where m is the number of independent variables; for example, with m = 3 predictors this calls for at least 107 cases.

Page 13: Multiple regression presentation

Reporting regression results: "The data were analyzed by multiple regression, using as regressors age, income and gender. The regression was a rather poor fit (R2adj = 40%), but the overall relationship was significant (F3,12 = 4.32, p < 0.05). With other variables held constant, depression scores were negatively related to age and income, decreasing by 0.16 for every extra year of age, and by 0.09 for every extra pound per week income. Women tended to have higher scores than men, by 3.3 units. Only the effect of income was significant (t12 = 3.18, p < 0.01)."

Page 14: Multiple regression presentation

Partial Correlation

In its squared form, it is the percent of variance in the dependent variable uniquely attributable to the given independent variable when the other variables in the equation are controlled.
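A minimal sketch of one way to compute a partial correlation, by residualizing both the dependent variable and the predictor of interest on the control variable and then correlating the residuals; the data and variable names are hypothetical:

import numpy as np

def residualize(v, control):
    X = np.column_stack([np.ones(len(control)), control])
    coef = np.linalg.lstsq(X, v, rcond=None)[0]
    return v - X @ coef

rng = np.random.default_rng(3)
x2 = rng.normal(size=200)                      # control variable
x1 = 0.6 * x2 + rng.normal(size=200)           # predictor of interest
y = 0.5 * x1 + 0.7 * x2 + rng.normal(size=200)

r_partial = np.corrcoef(residualize(y, x2), residualize(x1, x2))[0, 1]
print(r_partial, r_partial**2)  # squared value: variance in y uniquely attributable to x1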

Page 15: Multiple regression presentation

Stepwise Regression

y = β0 + β1x1 + β2x2 + ... + β14x14

Stepwise regression chooses a subset of the independent variables which "best" explains the dependent variable.

Page 16: Multiple regression presentation

Stepwise Regression

1) Forward Selection: start by choosing the independent variable which explains the most variation in the dependent variable.

Choose a second variable which explains the most residual variation, and then recalculate regression coefficients.

Continue until no variables "significantly" explain residual variation.

Page 17: Multiple regression presentation

Stepwise Regression

2) Backward Selection: start with all the variables in the model, and drop the least "significant", one at a time, until you are left with only "significant" variables.

3) Mixture of the two: perform a forward selection, but drop variables which are no longer "significant" after the introduction of new variables.
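A hedged sketch of forward and backward selection using scikit-learn's SequentialFeatureSelector; note that it picks variables by cross-validated score rather than by the significance tests described above, so it illustrates the idea rather than the classical stepwise procedure, and the dataset is synthetic:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Synthetic data: 15 candidate predictors, of which 4 are truly informative
X, y = make_regression(n_samples=200, n_features=15, n_informative=4, random_state=0)

forward = SequentialFeatureSelector(LinearRegression(), n_features_to_select=4,
                                    direction="forward").fit(X, y)
backward = SequentialFeatureSelector(LinearRegression(), n_features_to_select=4,
                                     direction="backward").fit(X, y)

print(np.where(forward.get_support())[0])   # indices chosen by forward selection
print(np.where(backward.get_support())[0])  # indices chosen by backward elimination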

Page 18: Multiple regression presentation

Hierarchical Regression

The researcher determines the order of entry of the variables.

F-tests are used to assess the significance of each added variable (or set of variables) to the explanation reflected in R-square.

An alternative to comparing betas for purposes of assessing the importance of the independent variables.
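A minimal sketch of this idea with statsmodels and hypothetical data: block 1 enters x1, block 2 adds x2, and a nested-model F test assesses whether the added variable significantly improves R2:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x1 = rng.normal(size=120)
x2 = rng.normal(size=120)
y = 0.4 * x1 + 0.3 * x2 + rng.normal(size=120)

block1 = sm.OLS(y, sm.add_constant(x1)).fit()                            # block 1: x1 only
block2 = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()     # block 2: add x2

f, p, df_diff = block2.compare_f_test(block1)   # F test for the R2 change
print(block2.rsquared - block1.rsquared, f, p)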

Page 19: Multiple regression presentation

Categorical Regression

Used when there is a combination of nominal, ordinal, and interval-level independent variables.
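One common way to handle a mix of nominal and interval predictors in ordinary regression is dummy coding; the sketch below uses a statsmodels formula with hypothetical data (note this is plain dummy-coded OLS, not the optimal-scaling procedure some packages call categorical regression):

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one nominal predictor (group) and one interval predictor (hours)
df = pd.DataFrame({
    "score": [12, 15, 14, 20, 22, 19, 9, 11, 10],
    "group": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "hours": [2, 3, 2, 5, 6, 5, 1, 2, 1],
})

# C(group) dummy-codes the nominal variable; hours enters as an interval predictor
model = smf.ols("score ~ C(group) + hours", data=df).fit()
print(model.params)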