
Regression Analysis presentation by Al Arizmendez and Cathryn Lottier

DESCRIPTION

We present an overview of regression analysis and its theoretical construct, then provide a graphic representation before performing multiple regression analysis step by step using SPSS (audio files accompany the tutorial).

1. Regression Analysis
CMGT 587A, University of Southern California
Al Arizmendez / Cathryn Lottier

2. What is Regression Analysis?
The regression method, more commonly referred to as regression analysis, is the assessment of the relationship between a dependent variable and one or more independent variables.
It involves techniques for measuring or analyzing multiple variables and their relationships.
The technique is used to analyze variables with at least one dependent variable (often y) and one or more independent variables (often x) in order to understand a phenomenon, make predictions, and/or test hypotheses.

3. Assumptions Underlying the Method
The validity of regression analysis depends on four assumptions:
  Linearity: the relationship between the dependent and independent variables is linear (a straight line describes it).
  Independence: the errors are independent, with no serial correlation (a random value of Y is assumed to be independent of any other value of Y).
  Constant variance: the data values are scattered around the regression line to the same extent at every level of the predictors.
  Normality: the random variable of interest is normally distributed.

4. When can you use Regression Analysis?
Regression analysis is used to make predictions, so virtually anyone can use it.
Some reasons you may want to use regression analysis:
  To model a phenomenon and understand it better in order to make decisions.
  To model a phenomenon and understand it better in order to predict values for it in other places or times (later in these slides, we use an example that forecasts album sales).
  To test a hypothesis, keeping in mind that regression analysis produces an estimate, not an exact description of the data (we show an example of this later in the slides with our test of life expectancy vs. literacy rates).

5. Diving a Little Deeper
Multiple linear regression analysis begins by positing the general form of the relationship in the following model:
  $Y_i = \beta_0 + \beta_1 X_{i1} + \varepsilon_i$
More simply put:
  $\text{Outcome}_i = (b_0 + b_1 x_i) + \text{error}_i$
Here $Y$ is the dependent variable, $\beta_0$ is the intercept, $\beta_1$ is the slope, and $X_{i1}$ is the independent variable.
The $\varepsilon_i$ is the residual term (a.k.a. the random error term), which expresses the composite of all the other individual differences that aren't explicitly identified in the model. It is a reminder that the model will never be perfect.

6. What does that really mean?
The equation means that the outcome can be predicted from a model plus some error associated with that prediction ($\varepsilon_i$).
The outcome variable is represented as $y_i$, which is predicted using a predictor variable ($x_i$) and a parameter ($b_1$) associated with the predictor variable.
  $b_1$ is the slope of the line: the direction and strength of the relationship or effect.
  $b_0$ tells us the value of the outcome when the predictor is 0 (the intercept).
The betas tell us the shape of the model and what it looks like.

7. Explanation of R Squared
$R^2$ allows one to assess how well the model fits.
If you square the differences between each observed value and the mean of the outcome, the sum of all the squared differences is known as the total sum of squares (SST).
If an optimal model is fitted to the data, the differences between the observed data points and the values predicted by the regression line can be squared and summed, which is referred to as the sum of squared residuals (SSR).
The difference between SST and SSR is the model sum of squares (SSM).
$R^2$ is determined by dividing the model sum of squares by the total sum of squares, $R^2 = SSM / SST$, and describes how well the regression line fits.
An $R^2$ near 1 indicates that the regression line fits the data well, while an $R^2$ closer to 0 indicates that it does not fit the data very well.

8. Example of Regression Analysis
Regression analysis can be used to forecast the trend of album sales (shown on the y-axis) in relation to the advertising budget (shown on the x-axis).

9. Adding Another Variable to the Equation
Now we take it one step further and add the amount of radio play to the equation.
This turns into multiple regression analysis: with more predictors, the regression line becomes a regression plane (a 3D model).
It looks more complicated, but the principles remain the same as in linear regression.

10. Explanation of Multiple Regression Analysis
Often referred to as OLS (ordinary least squares) regression, multiple regression can establish whether a set of independent variables explains a proportion of the variance in a dependent variable at a significant level, through a significance test of $R^2$ (Garson, 2012, p. 10).
It can also determine the relative predictive importance of the independent variables by comparing regression weights, also known as beta weights.

11. Multiple Regression Analysis
While the formula for linear regression analysis looks like this:
  $Y_i = (\beta_0 + \beta_1 X_{i1}) + \varepsilon_i$
Multiple regression analysis looks more like this:
  $Y_i = (\beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_n X_{in}) + \varepsilon_i$
The principles are the same as in linear regression; there are just more predictors!

12. Talking About the Betas
The betas describe the relationship between a particular predictor and the outcome.
The betas also define the shape of the plane. In this instance:
  $b_0$ represents where the plane hits the y-axis (the value of the outcome when both predictors are zero).
  $b_1$ represents the slope of the side associated with radio play.
  $b_2$ represents the slope of the side associated with advertising budget.
This can go on for multiple dimensions, with each of the predictors defining the shape.

13. Simple Linear Regression w/ SPSS
Dependent variable: life expectancy of females.
Independent variable: literacy of the country, in percent.

14. Simple Regression w/ SPSS: Open the Data Set

15. Simple Regression w/ SPSS: Scatter
It's always a good idea to do a scatter plot: Graphs > Legacy Dialogs > Scatter/Dot > Define

16. Simple Regression w/ SPSS: Scatter
Add the dependent variable (life expectancy) on the y-axis.
Add the independent variable (literacy) on the x-axis.

17. Simple Regression: Scatter done, we're not
There is a strong uphill pattern: life expectancy increases with literacy rate. We still need to fit a regression line.

18. Simple Regression w/ SPSS: Scatter and Regression Line

19. Simple Regression: Plotted, now run it
Analyze > Regression > Linear

20. Simple Regression w/ SPSS: Output

21. Simple Regression w/ SPSS: Scatter Coefficients

22. Multiple Linear Regression w/ SPSS
Analyze > Regression > Linear

23. Multiple Linear Regression w/ SPSS
Top half of output: notice the multiple variables entered and the single dependent variable (female life expectancy).

24. Multiple Linear Regression w/ SPSS
Bottom half of output:

25. Multiple Linear Regression w/ SPSS
Literacy is one variable, but it is the specific combination of variables that multiple linear regression tests for that makes MLR so powerful.
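
For readers without SPSS, the sums of squares behind R squared and the step from one predictor to two can be sketched in Python with NumPy. This is a minimal illustration, not the presenters' procedure: the numbers in `adverts`, `airplay`, and `sales` are made up for the example, and the helper `fit_ols` is a hypothetical name introduced here. It assumes the definitions given in the slides (SST from deviations around the mean, SSR from residuals, SSM as their difference).

```python
import numpy as np

# Hypothetical illustration data (NOT taken from the presentation):
# advertising budget, radio plays, and album sales for six albums.
adverts = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
airplay = np.array([5.0, 3.0, 8.0, 6.0, 10.0, 9.0])
sales = np.array([25.0, 33.0, 52.0, 58.0, 79.0, 85.0])


def fit_ols(predictors, outcome):
    """Fit outcome = b0 + b1*x1 + ... + bn*xn by ordinary least squares.

    Returns the coefficients (intercept first) and the sums of squares
    used to compute R squared.
    """
    predictors = np.asarray(predictors, dtype=float)
    if predictors.ndim == 1:
        # A single predictor: turn it into a one-column matrix.
        predictors = predictors.reshape(-1, 1)

    # Design matrix: a column of ones for the intercept b0, then the predictors.
    design = np.column_stack([np.ones(len(outcome)), predictors])
    coefs, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    predicted = design @ coefs

    sst = float(np.sum((outcome - outcome.mean()) ** 2))  # total sum of squares
    ssr = float(np.sum((outcome - predicted) ** 2))       # sum of squared residuals
    ssm = sst - ssr                                       # model sum of squares
    return {"coefs": coefs, "sst": sst, "ssr": ssr, "ssm": ssm, "r2": ssm / sst}


# Simple regression: sales predicted from advertising budget only (a line).
simple = fit_ols(adverts, sales)

# Multiple regression: add radio play as a second predictor (a plane).
multiple = fit_ols(np.column_stack([adverts, airplay]), sales)

print("simple   b0, b1:", simple["coefs"], "R^2:", simple["r2"])
print("multiple b0, b1, b2:", multiple["coefs"], "R^2:", multiple["r2"])
```

With an intercept in the model, adding a predictor cannot lower R squared on the same data, which is one reason the significance test of R squared matters more than its raw size when comparing models.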