Chap 13-1 Introduction to Linear Regression and Correlation Analysis

  • Introduction to Linear Regression and Correlation Analysis

    Business Statistics: A Decision-Making Approach, 6e 2005 Prentice-Hall, Inc.

  • Chapter Goals

    After completing this chapter, you should be able to:

      • Calculate and interpret the simple correlation between two variables
      • Determine whether the correlation is significant
      • Calculate and interpret the simple linear regression equation for a set of data
      • Understand the assumptions behind regression analysis
      • Determine whether a regression model is significant

  • Chapter Goals (continued)

    After completing this chapter, you should be able to:

      • Calculate and interpret confidence intervals for the regression coefficients
      • Recognize regression analysis applications for purposes of prediction and description
      • Recognize some potential problems if regression analysis is used incorrectly
      • Recognize nonlinear relationships between two variables

  • Scatter Plots and Correlation

      • A scatter plot (or scatter diagram) is used to show the relationship between two variables
      • Correlation analysis is used to measure the strength of the association (linear relationship) between two variables
      • It is concerned only with the strength of the relationship
      • No causal effect is implied

  • Scatter Plot Examples

    [Figures: scatter plots of y against x illustrating linear and curvilinear relationships, strong and weak relationships, and no relationship.]

  • Correlation Coefficient

      • The population correlation coefficient ρ (rho) measures the strength of the association between the variables
      • The sample correlation coefficient r is an estimate of ρ and is used to measure the strength of the linear relationship in the sample observations

  • Features of ρ and r

      • Unit free
      • Range between -1 and 1
      • The closer to -1, the stronger the negative linear relationship
      • The closer to 1, the stronger the positive linear relationship
      • The closer to 0, the weaker the linear relationship

  • Examples of Approximate r Values

    [Figures: scatter plots of y against x with r = -1, r = -.6, r = 0, r = +.3, and r = +1.]

  • Calculating the Correlation Coefficient

    Sample correlation coefficient:

    $$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}} $$

    or the algebraic equivalent:

    $$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} $$

    where:

      • r = Sample correlation coefficient
      • n = Sample size
      • x = Value of the independent variable
      • y = Value of the dependent variable

  • Calculation Example

    Tree Height (y)   Trunk Diameter (x)   xy      y^2      x^2
    35                8                    280     1225     64
    49                9                    441     2401     81
    27                7                    189     729      49
    33                6                    198     1089     36
    60                13                   780     3600     169
    21                7                    147     441      49
    45                11                   495     2025     121
    51                12                   612     2601     144
    Σy = 321          Σx = 73              Σxy = 3142   Σy^2 = 14111   Σx^2 = 713

  • Calculation Example (continued)

    Substituting the column sums for the n = 8 trees into the algebraic form:

    $$ r = \frac{8(3142) - (73)(321)}{\sqrt{\left[8(713) - (73)^2\right]\left[8(14111) - (321)^2\right]}} = 0.886 $$

    r = 0.886 indicates a relatively strong positive linear association between x and y.

    [Figure: scatter plot of Tree Height, y, against Trunk Diameter, x.]
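The calculation above can be checked with a few lines of code. This is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the original slides.

```python
import math

# Tree height (y) and trunk diameter (x) from the calculation example.
y = [35, 49, 27, 33, 60, 21, 45, 51]
x = [8, 9, 7, 6, 13, 7, 11, 12]
n = len(x)

# Column sums used by the algebraic form of the formula.
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

# Algebraic form of the sample correlation coefficient.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 3))  # 0.886
```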

  • Excel Output

    Excel correlation output (Tools / Data Analysis / Correlation):

                      Tree Height   Trunk Diameter
    Tree Height       1
    Trunk Diameter    0.886231      1

    The off-diagonal entry is the correlation between Tree Height and Trunk Diameter.


  • Significance Test for Correlation

    Hypotheses:

      • H0: ρ = 0 (no correlation)
      • HA: ρ ≠ 0 (correlation exists)

    Test statistic (with n - 2 degrees of freedom):

    $$ t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} $$

  • Example: Produce Stores

    Is there evidence of a linear relationship between tree height and trunk diameter at the .05 level of significance?

      • H0: ρ = 0 (no correlation)
      • H1: ρ ≠ 0 (correlation exists)
      • α = .05, d.f. = 8 - 2 = 6

  • Example: Test Solution

      • Test statistic: t = 4.68, with d.f. = 8 - 2 = 6
      • Critical values: ±t(α/2) = ±2.4469, with α/2 = .025 in each tail
      • Decision: Reject H0, since 4.68 > 2.4469
      • Conclusion: There is evidence of a linear relationship at the 5% level of significance
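As a rough check of the test above, the statistic t = r / sqrt((1 - r^2) / (n - 2)) can be evaluated directly. The sketch below assumes the rounded value r = 0.886 and takes the critical value 2.4469 from the slide rather than computing it from a t distribution; the variable names are illustrative.

```python
import math

r = 0.886        # sample correlation, rounded as in the example
n = 8            # number of trees
t_crit = 2.4469  # two-tailed critical value quoted in the example (alpha = .05, 6 d.f.)

# Test statistic for H0: rho = 0, with n - 2 degrees of freedom.
t_stat = r / math.sqrt((1 - r ** 2) / (n - 2))
reject_h0 = abs(t_stat) > t_crit

print(round(t_stat, 2), reject_h0)  # 4.68 True
```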

  • Introduction to Regression Analysis

    Regression analysis is used to:

      • Predict the value of a dependent variable based on the value of at least one independent variable
      • Explain the impact of changes in an independent variable on the dependent variable

    Dependent variable: the variable we wish to explain
    Independent variable: the variable used to explain the dependent variable

  • Simple Linear Regression Model

      • Only one independent variable, x
      • Relationship between x and y is described by a linear function
      • Changes in y are assumed to be caused by changes in x

  • Types of Regression Models

    [Figures: four scatter plots labeled Positive Linear Relationship, Negative Linear Relationship, Relationship NOT Linear, and No Relationship.]

  • Population Linear Regression

    The population regression model:

    $$ y = \beta_0 + \beta_1 x + \varepsilon $$

    where:

      • y = Dependent variable
      • x = Independent variable
      • β0 = Population y intercept
      • β1 = Population slope coefficient
      • ε = Random error term, or residual

    β0 + β1x is the linear component and ε is the random error component.

  • Linear Regression Assumptions

      • Error values (ε) are statistically independent
      • Error values are normally distributed for any given value of x
      • The probability distribution of the errors is normal
      • The probability distribution of the errors has constant variance
      • The underlying relationship between the x variable and the y variable is linear

  • Population Linear Regression (continued)

    [Figure: population regression line with slope β1 and intercept β0, showing for a given xi the observed value of y, the predicted value of y, and the random error εi between them.]

  • Estimated Regression Model

    The sample regression line provides an estimate of the population regression line:

    $$ \hat{y}_i = b_0 + b_1 x $$

    where:

      • ŷi = Estimated (or predicted) y value
      • b0 = Estimate of the regression intercept
      • b1 = Estimate of the regression slope
      • x = Independent variable

    The individual random error terms ei have a mean of zero.

  • Least Squares Criterion

    b0 and b1 are the values that minimize the sum of the squared residuals:

    $$ \sum e^2 = \sum (y - \hat{y})^2 = \sum \left(y - (b_0 + b_1 x)\right)^2 $$

  • The Least Squares Equation

    The formulas for b1 and b0 are:

    $$ b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} $$

    algebraic equivalent:

    $$ b_1 = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}} $$

    and

    $$ b_0 = \bar{y} - b_1 \bar{x} $$

  • Interpretation of the Slope and the Intercept

      • b0 is the estimated average value of y when the value of x is zero
      • b1 is the estimated change in the average value of y as a result of a one-unit change in x

  • Finding the Least Squares Equation

      • The coefficients b0 and b1 will usually be found using computer software, such as Excel or Minitab
      • Other regression measures will also be computed as part of computer-based regression analysis

  • Simple Linear Regression Example

      • A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet)
      • A random sample of 10 houses is selected
      • Dependent variable (y) = house price in $1000s
      • Independent variable (x) = square feet

  • Sample Data for House Price Model

    House Price in $1000s (y)   Square Feet (x)
    245                         1400
    312                         1600
    279                         1700
    308                         1875
    199                         1100
    219                         1550
    405                         2350
    324                         2450
    319                         1425
    255                         1700

  • Regression Using Excel

    Tools / Data Analysis / Regression

  • Excel Output

    The regression equation is:

    house price = 98.24833 + 0.10977 (square feet)

    Regression Statistics
    Multiple R           0.76211
    R Square             0.58082
    Adjusted R Square    0.52842
    Standard Error       41.33032
    Observations         10

    ANOVA
                  df   SS           MS           F         Significance F
    Regression    1    18934.9348   18934.9348   11.0848   0.01039
    Residual      8    13665.5652   1708.1957
    Total         9    32600.5000

                  Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
    Intercept     98.24833       58.03348         1.69296   0.12892   -35.57720   232.07386
    Square Feet   0.10977        0.03297          3.32938   0.01039   0.03374     0.18580
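The coefficients in the Excel output can be reproduced from the least squares formulas. The following is a minimal sketch in plain Python, assuming the ten sample houses listed earlier; the variable names are illustrative.

```python
# Sample data for the house price model.
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]           # y, in $1000s
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # x

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(price) / n

# Sums of cross-products and squares about the means.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, price))
sxx = sum((x - x_bar) ** 2 for x in sqft)

b1 = sxy / sxx           # estimated slope
b0 = y_bar - b1 * x_bar  # estimated intercept

print(round(b0, 5), round(b1, 5))  # 98.24833 0.10977
```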

  • Graphical Presentation

    House price model: scatter plot and regression line, with slope = 0.10977 and intercept = 98.248.

    [Figure: House Price ($1000s) plotted against Square Feet, with the fitted regression line.]

  • Interpretation of the Intercept, b0

      • b0 is the estimated average value of y when the value of x is zero (if x = 0 is in the range of observed x values)
      • Here, no houses had 0 square feet, so b0 = 98.24833 just indicates that, for houses within the range of sizes observed, $98,248.33 is the portion of the house price not explained by square feet

  • Interpretation of the Slope Coefficient, b1

      • b1 measures the estimated change in the average value of y as a result of a one-unit change in x
      • Here, b1 = .10977 tells us that the average value of a house increases by .10977($1000) = $109.77 for each additional square foot of size

  • Least Squares Regression Properties

      • The sum of the residuals from the least squares regression line is 0: $\sum (y - \hat{y}) = 0$
      • The sum of the squared residuals is a minimum: $\sum (y - \hat{y})^2$ is minimized
      • The simple regression line always passes through the mean of the y variable and the mean of the x variable
      • The least squares coefficients are unbiased estimates of β0 and β1
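The first and third properties can be verified numerically on the house price data. This is an illustrative sketch, not part of the original slides: it refits the line and checks that the residuals sum to zero (up to floating-point rounding) and that the fitted line passes through the point of means.

```python
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]           # y, in $1000s
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # x

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(price) / n

# Least squares slope and intercept.
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, price))
      / sum((x - x_bar) ** 2 for x in sqft))
b0 = y_bar - b1 * x_bar

# Residual = observed y minus predicted y.
residuals = [y - (b0 + b1 * x) for x, y in zip(sqft, price)]
residual_sum = sum(residuals)

# Prediction at the mean of x should equal the mean of y.
y_hat_at_mean = b0 + b1 * x_bar

print(f"sum of residuals: {residual_sum:.2e}")
print(f"y-hat at x-bar: {y_hat_at_mean:.4f}, y-bar: {y_bar:.4f}")
```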

  • Explained and Unexplained Variation

    Total variation is made up of two parts:

    $$ SST = SSE + SSR $$

      • Total sum of squares: $SST = \sum (y - \bar{y})^2$
      • Sum of squares error: $SSE = \sum (y - \hat{y})^2$
      • Sum of squares regression: $SSR = \sum (\hat{y} - \bar{y})^2$

    where:

      • ȳ = Average value of the dependent variable
      • y = Observed values of the dependent variable
      • ŷ = Estimated value of y for the given x value

  • Explained and Unexplained Variation (continued)

      • SST = total sum of squares: measures the variation of the yi values around their mean ȳ
      • SSE = error sum of squares: variation attributable to factors other than the relationship between x and y
      • SSR = regression sum of squares: explained variation attributable to the relationship between x and y

  • Explained and Unexplained Variation (continued)

    [Figure: for an observation (xi, yi), the distances SST = Σ(yi - ȳ)², SSE = Σ(yi - ŷi)², and SSR = Σ(ŷi - ȳ)² shown relative to the regression line and the mean ȳ.]

  • Coefficient of Determination, R²

      • The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable
      • The coefficient of determination is also called R-squared and is denoted as R²

    $$ R^2 = \frac{SSR}{SST} \quad \text{where} \quad 0 \le R^2 \le 1 $$

  • Coefficient of Determination, R² (continued)

    Coefficient of determination:

    $$ R^2 = \frac{SSR}{SST} = \frac{\text{sum of squares explained by regression}}{\text{total sum of squares}} $$

    Note: In the single independent variable case, the coefficient of determination is

    $$ R^2 = r^2 $$

    where:

      • R² = Coefficient of determination
      • r = Simple correlation coefficient

  • Examples of Approximate R² Values

      • R² = 1: Perfect linear relationship between x and y. 100% of the variation in y is explained by variation in x
      • 0 < R² < 1: Weaker linear relationship between x and y. Some but not all of the variation in y is explained by variation in x
      • R² = 0: No linear relationship between x and y. The value of y does not depend on x (none of the variation in y is explained by variation in x)

    [Figures: scatter plots of y against x for each case.]

  • Excel Output

    From the Excel regression output for the house price model, R Square = 0.58082:

    $$ R^2 = \frac{SSR}{SST} = \frac{18934.9348}{32600.5000} = 0.58082 $$

    58.08% of the variation in house prices is explained by variation in square feet.
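The sums of squares behind R² can be computed directly from the sample. The sketch below is illustrative (variable names are not from the slides) and checks that SST = SSE + SSR for the fitted house price line.

```python
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]           # y, in $1000s
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # x

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(price) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, price))
      / sum((x - x_bar) ** 2 for x in sqft))
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in sqft]

sst = sum((y - y_bar) ** 2 for y in price)               # total variation
sse = sum((y - f) ** 2 for y, f in zip(price, fitted))   # unexplained variation
ssr = sum((f - y_bar) ** 2 for f in fitted)              # explained variation

r_squared = ssr / sst
print(round(sst, 4), round(ssr, 4), round(sse, 4), round(r_squared, 5))
```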

  • Standard Error of Estimate

    The standard deviation of the variation of observations around the regression line is estimated by

    $$ s_\varepsilon = \sqrt{\frac{SSE}{n - k - 1}} $$

    where:

      • SSE = Sum of squares error
      • n = Sample size
      • k = Number of independent variables in the model

  • The Standard Deviation of the Regression Slope

    The standard error of the regression slope coefficient (b1) is estimated by

    $$ s_{b_1} = \frac{s_\varepsilon}{\sqrt{\sum (x - \bar{x})^2}} = \frac{s_\varepsilon}{\sqrt{\sum x^2 - \dfrac{(\sum x)^2}{n}}} $$

    where:

      • s_b1 = Estimate of the standard error of the least squares slope
      • s_ε = Sample standard error of the estimate

  • Excel Output

    In the Excel regression output for the house price model:

      • s_ε = 41.33032 (Standard Error, under Regression Statistics)
      • s_b1 = 0.03297 (Standard Error of the Square Feet coefficient)
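Both standard errors follow from SSE and the spread of the x values. This is a minimal sketch under the same house price data, with k = 1 independent variable; the names `s_e` and `s_b1` are illustrative.

```python
import math

price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]           # y, in $1000s
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # x

n = len(sqft)
k = 1  # number of independent variables
x_bar = sum(sqft) / n
y_bar = sum(price) / n
sxx = sum((x - x_bar) ** 2 for x in sqft)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, price)) / sxx
b0 = y_bar - b1 * x_bar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(sqft, price))

s_e = math.sqrt(sse / (n - k - 1))  # standard error of the estimate
s_b1 = s_e / math.sqrt(sxx)         # standard error of the slope

print(round(s_e, 5), round(s_b1, 5))  # 41.33032 0.03297
```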

  • Comparing Standard Errors

      • s_ε measures the variation of observed y values from the regression line
      • s_b1 measures the variation in the slope of regression lines from different possible samples

    [Figures: pairs of plots contrasting small and large values of each standard error.]

  • Inference about the Slope: t Test

    t test for a population slope: is there a linear relationship between x and y?

    Null and alternative hypotheses:

      • H0: β1 = 0 (no linear relationship)
      • H1: β1 ≠ 0 (linear relationship does exist)

    Test statistic:

    $$ t = \frac{b_1 - \beta_1}{s_{b_1}}, \qquad \text{d.f.} = n - 2 $$

    where:

      • b1 = Sample regression slope coefficient
      • β1 = Hypothesized slope
      • s_b1 = Estimator of the standard error of the slope

  • Inference about the Slope: t Test (continued)

    Estimated regression equation, fitted to the ten houses in the sample data:

    house price = 98.25 + 0.1098 (sq. ft.)

    The slope of this model is 0.1098. Does square footage of the house affect its sales price?

  • Inferences about the Slope: t Test Example

      • H0: β1 = 0
      • HA: β1 ≠ 0
      • From Excel output: b1 = 0.10977, s_b1 = 0.03297
      • Test statistic: t = 3.329, with d.f. = 10 - 2 = 8
      • Critical values: ±t(α/2) = ±2.3060, with α/2 = .025 in each tail
      • Decision: Reject H0, since 3.329 > 2.3060
      • Conclusion: There is sufficient evidence that square footage affects house price
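The decision above amounts to one division and one comparison. The sketch below assumes the rounded values from the Excel output and takes the critical value 2.3060 from the slide rather than from a t-distribution routine; the names are illustrative.

```python
b1 = 0.10977      # sample slope from the Excel output
s_b1 = 0.03297    # standard error of the slope from the Excel output
beta1_h0 = 0.0    # hypothesized slope under H0
t_crit = 2.3060   # two-tailed critical value quoted in the example (alpha = .05, 8 d.f.)

# t statistic for the slope and the two-tailed decision rule.
t_stat = (b1 - beta1_h0) / s_b1
reject_h0 = abs(t_stat) > t_crit

print(round(t_stat, 3), reject_h0)  # 3.329 True
```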

  • Regression Analysis for Description

    Confidence interval estimate of the slope:

    $$ b_1 \pm t_{\alpha/2}\, s_{b_1}, \qquad \text{d.f.} = n - 2 $$

    Excel printout for house prices:

                  Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
    Intercept     98.24833       58.03348         1.69296   0.12892   -35.57720   232.07386
    Square Feet   0.10977        0.03297          3.32938   0.01039   0.03374     0.18580

    At the 95% level of confidence, the confidence interval for the slope is (0.0337, 0.1858).

  • Regression Analysis for Description (continued)

      • Since the units of the house price variable are $1000s, we are 95% confident that the average impact on sales price is between $33.74 and $185.80 per square foot of house size
      • This 95% confidence interval (Lower 95% = 0.03374, Upper 95% = 0.18580 for Square Feet) does not include 0
      • Conclusion: There is a significant relationship between house price and square feet at the .05 level of significance
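The interval endpoints can be reproduced from the slope, its standard error, and the critical t value. This sketch uses the rounded figures from the Excel output and the critical value 2.3060 quoted earlier for 8 degrees of freedom; the variable names are illustrative.

```python
b1 = 0.10977     # sample slope (house price in $1000s per square foot)
s_b1 = 0.03297   # standard error of the slope
t_crit = 2.3060  # t value for 95% confidence with 8 d.f., as quoted in the slides

margin = t_crit * s_b1
lower, upper = b1 - margin, b1 + margin

# Convert from $1000s to dollars per square foot.
lower_dollars, upper_dollars = lower * 1000, upper * 1000

print(round(lower, 4), round(upper, 4))                  # 0.0337 0.1858
print(round(lower_dollars, 2), round(upper_dollars, 2))  # 33.74 185.8
```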

  • Confidence Interval for the Average y, Given x

    Confidence interval estimate for the mean of y given a particular xp:

    $$ \hat{y} \pm t_{\alpha/2}\, s_\varepsilon \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum (x - \bar{x})^2}} $$

    The size of the interval varies according to the distance of xp from the mean, x̄.

  • Confidence Interval for an Individual y, Given x

    Confidence interval estimate for an individual value of y given a particular xp:

    $$ \hat{y} \pm t_{\alpha/2}\, s_\varepsilon \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum (x - \bar{x})^2}} $$

    The extra term (the leading 1 under the square root) adds to the interval width to reflect the added uncertainty for an individual case.

  • Interval Estimates for Different Values of x

    [Figure: the line ŷ = b0 + b1x with two bands around it at a given xp: the narrower confidence interval for the mean of y and the wider prediction interval for an individual y. Both bands are narrowest at x̄.]

  • Example: House Prices

    Estimated regression equation, fitted to the ten houses in the sample data:

    house price = 98.25 + 0.1098 (sq. ft.)

    Predict the price for a house with 2000 square feet.

  • Example: House Prices (continued)

    Predict the price for a house with 2000 square feet:

    house price = 98.25 + 0.1098 (2000) = 317.85

    The predicted price for a house with 2000 square feet is 317.85 ($1,000s) = $317,850.

  • Estimation of Mean Values: Example

    Confidence interval estimate for E(y)|xp: find the 95% confidence interval for the average price of 2,000 square-foot houses.

      • Predicted price ŷ = 317.85 ($1,000s)
      • The confidence interval endpoints are 280.66 to 354.90, or from $280,660 to $354,900

  • Estimation of Individual Values: Example

    Prediction interval estimate for y|xp: find the 95% prediction interval for an individual house with 2,000 square feet.

      • Predicted price ŷ = 317.85 ($1,000s)
      • The prediction interval endpoints are 215.50 to 420.07, or from $215,500 to $420,070
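Both intervals can be reproduced from the sample data. The sketch below is illustrative: it uses unrounded coefficients, so the point prediction comes out near 317.78 rather than the 317.85 obtained from the rounded equation, while the interval endpoints agree with those quoted above. The critical value 2.3060 is taken from the slides, not computed.

```python
import math

price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]           # y, in $1000s
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # x
x_p = 2000       # house size to predict for
t_crit = 2.3060  # t value for 95% confidence with 8 d.f., as quoted in the slides

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(price) / n
sxx = sum((x - x_bar) ** 2 for x in sqft)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, price)) / sxx
b0 = y_bar - b1 * x_bar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(sqft, price))
s_e = math.sqrt(sse / (n - 2))  # standard error of the estimate

y_hat = b0 + b1 * x_p
h = 1 / n + (x_p - x_bar) ** 2 / sxx  # distance term shared by both intervals

ci_half = t_crit * s_e * math.sqrt(h)      # mean of y at x_p
pi_half = t_crit * s_e * math.sqrt(1 + h)  # individual y at x_p

ci = (y_hat - ci_half, y_hat + ci_half)
pi = (y_hat - pi_half, y_hat + pi_half)
print(f"y_hat = {y_hat:.2f}")
print(f"95% CI for mean: {ci[0]:.2f} to {ci[1]:.2f}")
print(f"95% PI for individual: {pi[0]:.2f} to {pi[1]:.2f}")
```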

  • Finding Confidence and Prediction Intervals: PHStat

      • In Excel, use PHStat | regression | simple linear regression
      • Check the "confidence and prediction interval for X=" box and enter the x-value and confidence level desired

  • Finding Confidence and Prediction Intervals: PHStat (continued)

    [Screenshot: PHStat output showing the input values, the confidence interval estimate for E(y)|xp, and the prediction interval estimate for y|xp.]

  • Residual Analysis

    Purposes:

      • Examine for linearity assumption
      • Examine for constant variance for all levels of x
      • Evaluate normal distribution assumption

    Graphical analysis of residuals:

      • Can plot residuals vs. x
      • Can create histogram of residuals to check for normality

  • Residual Analysis for Linearity

    [Figures: y vs. x and residuals vs. x for a relationship that is not linear and for one that is linear.]

  • Residual Analysis for Constant Variance

    [Figures: y vs. x and residuals vs. x for non-constant variance and for constant variance.]

  • Excel Output

    RESIDUAL OUTPUT

    Observation   Predicted House Price   Residuals
    1             251.92316               -6.923162
    2             273.87671               38.12329
    3             284.85348               -5.853484
    4             304.06284               3.937162
    5             218.99284               -19.99284
    6             268.38832               -49.38832
    7             356.20251               48.79749
    8             367.17929               -43.17929
    9             254.6674                64.33264
    10            284.85348               -29.85348

    [Figure: House Price Model Residual Plot, residuals against Square Feet.]


    PHStat output: Confidence Interval Estimate

    Data
    X Value                            2000
    Confidence Level                   95%

    Intermediate Calculations
    Sample Size                        10
    Degrees of Freedom                 8
    t Value                            2.3060056265
    Sample Mean                        1715
    Sum of Squared Difference          1571500
    Standard Error of the Estimate     41.3303236503
    h Statistic                        0.151686287
    Average Predicted Y (YHat)         317.7838052816

    For Average Predicted Y (YHat)
    Interval Half Width                37.1195180764
    Confidence Interval Lower Limit    280.6642872052
    Confidence Interval Upper Limit    354.9033233579

    For Individual Response Y
    Interval Half Width                102.2813064456
    Prediction Interval Lower Limit    215.502498836
    Prediction Interval Upper Limit    420.0651117272


  • Chapter Summary

      • Introduced correlation analysis
      • Discussed correlation to measure the strength of a linear association
      • Introduced simple linear regression analysis
      • Calculated the coefficients for the simple linear regression equation
      • Described measures of variation (R² and s_ε)
      • Addressed assumptions of regression and correlation

  • Chapter Summary (continued)

      • Described inference about the slope
      • Addressed estimation of mean values and prediction of individual values
      • Discussed residual analysis