Regression presentation

Regression SPSS Report 4

By Maryam Bolouri & Tahereh Soleimani

The main types of research questions that multiple regression can be used to address are:

1. How well is a set of variables able to predict a particular outcome?

2. Which variable in a set of variables is the best predictor of an outcome?

3. Is a particular predictor variable still able to predict an outcome when the effects of another variable are controlled for?

To recap: it explores the relationship between one continuous dependent variable and a number of independent variables or predictors (usually continuous).

Multiple regression is based on correlation. It allows a more sophisticated exploration of the interrelationship among a set of variables. Multiple regression will provide you with information about the model as a whole (all subscales) and the relative contribution of each of the variables that make up the model (individual subscales).

Homework 5: Interested in investigating the relationship between anxiety, motivation and writing performance, a researcher conducted a study with 50 learners. Anxiety and motivation were measured on separate questionnaires with 20 questions, each on a 5-point Likert scale. The index for writing (out of 25) was the average of two raters' scores for an essay written under timed conditions.

Predictors: anxiety and motivation

Dependent variable: writing performance

Relationship between them (predictive power)

Research questions and research hypothesis:

How well can anxiety and motivation levels predict writing performance? How much of the variance in writing performance can be explained by scores on the anxiety and motivation scales?

Which variable is the best predictor of writing performance?

H0: Anxiety and motivation have no significant predictive relationship with the dependent variable of the study (writing performance).

Step one: checking the assumptions (sample size)

It is recommended that ‘for social science research, about 15 participants per predictor are needed for a reliable equation’.

In this study there are 2 independent variables, so the sample should contain at least 30 participants. The sample here is 50, which is acceptable.
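The rule of thumb above is simple arithmetic, sketched here for convenience. The function name `minimum_sample_size` is made up for illustration; the figure of 15 participants per predictor and the sample of 50 come from the text.

```python
def minimum_sample_size(n_predictors, per_predictor=15):
    """Rule-of-thumb sample size: about 15 participants per predictor."""
    return n_predictors * per_predictor

required = minimum_sample_size(2)   # anxiety and motivation
actual = 50                         # learners in the study
sample_is_adequate = actual >= required

print(f"required: {required}, actual: {actual}, adequate: {sample_is_adequate}")
```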

Step one: checking the assumptions (multicollinearity)

Multicollinearity exists when the independent variables are highly correlated (r=.9 and above). Singularity occurs when one independent variable is actually a combination of other independent variables.

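The correlation between the independent variables should be smaller than 0.7. In this study anxiety and motivation correlate at r = .227, well below that limit.

The correlation between each independent variable and the dependent variable should preferably be larger than 0.3. Anxiety meets this guideline (r = .305), whereas motivation does not (r = -.168).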

Correlation and coefficient tables:

Correlations

                                          writingperformance   anxiety   motivation
Pearson Correlation   writingperformance        1.000            .305       -.168
                      anxiety                    .305           1.000        .227
                      motivation                -.168            .227       1.000
Sig. (1-tailed)       writingperformance          .              .016        .121
                      anxiety                    .016             .          .057
                      motivation                 .121            .057         .
N                     writingperformance          50              50          50
                      anxiety                     50              50          50
                      motivation                  50              50          50
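As a rough sketch, the two correlation guidelines from this step can be checked directly against the values in the Correlations table. The variable names are illustrative, and the cut-offs (.7 and .3) are the ones stated in the slides, not something SPSS tests for you.

```python
# Pearson correlations copied from the Correlations table
r_writing_anxiety = 0.305
r_writing_motivation = -0.168
r_anxiety_motivation = 0.227

# Guideline 1: predictors should correlate with each other below .7
predictors_not_collinear = abs(r_anxiety_motivation) < 0.7

# Guideline 2: each predictor should correlate with the DV above .3
relates_to_dv = {
    "anxiety": abs(r_writing_anxiety) > 0.3,
    "motivation": abs(r_writing_motivation) > 0.3,
}

print("predictors below .7:", predictors_not_collinear)
for name, ok in relates_to_dv.items():
    print(f"{name} correlates above .3 with writing performance: {ok}")
```

On these figures the predictors are not collinear, but only anxiety clears the .3 guideline.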

Step one: checking the assumptions (outliers, normality, linearity, homoscedasticity)

Multiple regression is very sensitive to outliers (very high or very low scores). Tabachnick and Fidell (2007, p. 128) define outliers as those with standardised residual values above about 3.3 (or less than –3.3).

Residuals are the differences between the obtained and the predicted dependent variable (DV) scores. The residuals scatterplots allow you to check:

• normality: the residuals should be normally distributed about the predicted DV scores

• linearity: the residuals should have a straight-line relationship with predicted DV scores

• homoscedasticity: the variance of the residuals about predicted DV scores should be the same for all predicted scores.
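The outlier criterion above can be sketched as a small filter. The residual values below are hypothetical and are NOT from this study (the raw data are not part of the slides); only the cut-off of 3.3 comes from the text.

```python
def flag_outliers(standardised_residuals, cutoff=3.3):
    """Return (case index, residual) pairs whose absolute value exceeds the cutoff."""
    return [
        (index, z)
        for index, z in enumerate(standardised_residuals)
        if abs(z) > cutoff
    ]

# Hypothetical standardised residuals, for illustration only
example_residuals = [0.4, -1.2, 3.6, 2.9, -3.4]
flagged = flag_outliers(example_residuals)

print(flagged)
```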

Step one: checking the assumptions (scatterplot and normal probability plot)

There were no outliers.

Step two: evaluating the model

R square tells us how much of the variance in the dependent variable is explained by the model. In this study R square is 0.152, so the model explains about 15 per cent of the variance in writing performance. This is not a respectable result.

The Adjusted R square statistic 'corrects' this value to provide a better estimate of the true population value. As the sample in this study is fairly small, it is better to report the Adjusted R square (.116) at the interpretation stage.

To assess the statistical significance of the result, it is necessary to look in the table labelled ANOVA. This tests the null hypothesis that multiple R in the population equals 0. The model in this example reaches statistical significance (Sig. = .021; this really means p<.05).

Model summary table and ANOVA

ANOVA (b)

Model            Sum of Squares   df   Mean Square     F      Sig.
1   Regression       146.967       2      73.483      4.225   .021 (a)
    Residual         817.513      47      17.394
    Total            964.480      49

a. Predictors: (Constant), motivation, anxiety

b. Dependent Variable: writing performance

Model Summary (b)

Model     R        R Square   Adjusted R Square   Std. Error of the Estimate
1       .390 (a)     .152          .116                   4.17060

a. Predictors: (Constant), motivation, anxiety

b. Dependent Variable: writing performance
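The figures in the Model Summary can be re-derived from the ANOVA sums of squares, which is a useful sanity check on the output. This is a minimal sketch using the standard formulas; the variable names are illustrative, and the p-value itself is not recomputed because that needs an F distribution, which the standard library does not provide.

```python
import math

# Values copied from the ANOVA table
ss_regression = 146.967
ss_residual = 817.513
n_cases = 50
n_predictors = 2

df_regression = n_predictors
df_residual = n_cases - n_predictors - 1          # 47
ss_total = ss_regression + ss_residual            # 964.480

r_squared = ss_regression / ss_total
adjusted_r_squared = 1 - (1 - r_squared) * (n_cases - 1) / df_residual
f_statistic = (ss_regression / df_regression) / (ss_residual / df_residual)
std_error_estimate = math.sqrt(ss_residual / df_residual)

print(f"R square          = {r_squared:.3f}")
print(f"Adjusted R square = {adjusted_r_squared:.3f}")
print(f"F({df_regression}, {df_residual})         = {f_statistic:.3f}")
print(f"Std. error        = {std_error_estimate:.5f}")
```

The results agree with the SPSS output to the reported precision (R square .152, Adjusted R square .116, F = 4.225, standard error 4.17060).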

Step three: evaluating each of the independent variables

We are interested in comparing the contribution of each independent variable; therefore we use the standardized coefficients (Beta values).

In this case the largest beta coefficient is 0.36, which is for Anxiety. This means that this variable makes the strongest unique contribution to explaining the dependent variable, when the variance explained by all other variables in the model is controlled for. The Beta value for Motivation was slightly lower (–0.25), indicating that it made less of a unique contribution.

In this case, only anxiety made a unique, statistically significant contribution to the prediction of writing performance (Sig. = .012). The contribution of motivation was not statistically significant (Sig. = .076).
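With only two predictors, the Beta values can be recovered from the three Pearson correlations alone. The sketch below uses the standard two-predictor formula; it is not how SPSS computes them internally, and small differences from the output are due to the correlations being rounded to three decimals.

```python
# Pearson correlations from the Correlations table
r_writing_anxiety = 0.305
r_writing_motivation = -0.168
r_anxiety_motivation = 0.227

denominator = 1 - r_anxiety_motivation ** 2

# Standardized coefficients for a two-predictor model
beta_anxiety = (
    r_writing_anxiety - r_writing_motivation * r_anxiety_motivation
) / denominator
beta_motivation = (
    r_writing_motivation - r_writing_anxiety * r_anxiety_motivation
) / denominator

# R square is the sum of each beta times its zero-order correlation
r_squared = (
    beta_anxiety * r_writing_anxiety + beta_motivation * r_writing_motivation
)

print(f"Beta anxiety    = {beta_anxiety:.3f}")
print(f"Beta motivation = {beta_motivation:.3f}")
print(f"R square        = {r_squared:.3f}")
```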

Step three: evaluating each of the independent variables

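In this example, Anxiety has a part correlation coefficient of .35. If we square this (multiply it by itself) we get .12, indicating that Anxiety uniquely explains 12 per cent of the variance in writing performance.

For Motivation the value is -.24, which squared gives us .06, indicating a unique contribution of 6 per cent to the explanation of variance in writing performance.

Taken together, the unique contributions of the two predictors add up to 12 + 6 = 18 per cent, whereas the total R square is 15 per cent. The squared part correlations do not have to sum to R square when the predictors are correlated with each other. Here anxiety and motivation correlate positively (.227) but relate to writing performance in opposite directions, so each predictor contributes more once the other is controlled for, and the sum exceeds R square. This is not in itself a sign that the assumptions were violated.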
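The arithmetic for the unique contributions can be checked by squaring the values in the Part column of the coefficient table. The sketch also shows a standard identity for a two-predictor model: R square equals the squared zero-order correlation of one predictor plus the squared part correlation of the other. Variable names are illustrative.

```python
# Part and zero-order correlations from the Coefficients table
part_anxiety = 0.352
part_motivation = -0.244
zero_order_anxiety = 0.305
reported_r_squared = 0.152

unique_anxiety = part_anxiety ** 2          # about .12
unique_motivation = part_motivation ** 2    # about .06
sum_of_unique = unique_anxiety + unique_motivation

# Anxiety entered first, then what motivation adds on top of it
r_squared_rebuilt = zero_order_anxiety ** 2 + unique_motivation

print(f"unique anxiety    = {unique_anxiety:.2f}")
print(f"unique motivation = {unique_motivation:.2f}")
print(f"sum of unique     = {sum_of_unique:.2f}")
print(f"R square rebuilt  = {r_squared_rebuilt:.3f}")
```

The sum of the two squared part correlations is larger than the reported R square, while the rebuilt value agrees with it up to rounding.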

Coefficient table

Coefficients (a)

               Unstandardized Coefficients   Standardized Coefficients                    95% Confidence Interval for B
Model          B          Std. Error         Beta                         t        Sig.   Lower Bound   Upper Bound
1 (Constant)   19.158     2.326                                           8.236    .000   14.478        23.837
  anxiety        .056      .021               .362                        2.623    .012     .013          .098
  motivation    -.058      .032              -.250                       -1.815    .076    -.122          .006

               Correlations                       Collinearity Statistics
Model          Zero-order   Partial   Part        Tolerance   VIF
1 anxiety        .305        .357     .352          .949      1.054
  motivation    -.168       -.256    -.244          .949      1.054

a. Dependent Variable: writingperformance
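The collinearity statistics in the table can be approximated by hand. With two predictors, Tolerance is 1 minus the squared correlation between them, and VIF is its reciprocal. This is a sketch under that two-predictor assumption; with more predictors, Tolerance comes from regressing each predictor on all the others.

```python
# Correlation between the two predictors, from the Correlations table
r_anxiety_motivation = 0.227

# Two-predictor case: the same value applies to both variables
tolerance = 1 - r_anxiety_motivation ** 2
vif = 1 / tolerance

print(f"Tolerance = {tolerance:.3f}")
print(f"VIF       = {vif:.3f}")
```

Both values are close to 1, which is consistent with the low correlation between anxiety and motivation. The small difference from the reported Tolerance of .949 comes from the correlation being rounded to three decimals.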

How to report and present the outcome

Our model, which includes anxiety and motivation, explains 15 per cent of the variance in writing performance (Question 1).

Of these two variables, anxiety makes the largest unique contribution (beta = .36), and the contribution of motivation was not statistically significant (Question 2).

Thank you

You are all great.
