Multiple Regression and Model Building (cont’d) + GIS
11.220 Lecture 21, 3 May 2006, R. Ryznar
Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .991 (a)   .982       .977                46.801
a. Predictors: (Constant), SizeSquared, HomeSize
b. Dependent Variable: EnergyUse
ANOVA (b)
Model 1      Sum of Squares   df   Mean Square   F         Sig.
Regression   831069.5         2    415534.773    189.710   .0001 (a)
Residual     15332.554        7    2190.365
Total        846402.1         9
a. Predictors: (Constant), SizeSquared, HomeSize
b. Dependent Variable: EnergyUse
Coefficients (a)
Model 1       B                Std. Error     Beta     t        Sig.
(Constant)    -1216.1438870    242.80636850            -5.009   .00155
HomeSize      2.39893018       .24583560      4.049    9.758    .00003
SizeSquared   -.00045004       .00005908      -3.161   -7.618   .00012
B, Std. Error: unstandardized coefficients. Beta: standardized coefficients.
a. Dependent Variable: EnergyUse
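$s^2 = \dfrac{SSE}{n-(k+1)}$ (sometimes called MSE)
$F = \dfrac{R^2/k}{(1-R^2)/[n-(k+1)]}$
$R^2 = \dfrac{SSR}{SST} = 1 - \dfrac{SSE}{SST}$
Adjusted $R^2 = 1 - \dfrac{SSE/[n-(k+1)]}{SST/(n-1)}$
$k$ = number of X variables
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$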
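As a quick check, the formulas above can be applied to the sums of squares in the ANOVA table for the quadratic model. This is a minimal sketch: the variable names are illustrative, and n = 10 is taken from the total df of 9.

```python
# Sums of squares copied from the ANOVA table (quadratic model).
sse = 15332.554   # residual (error) sum of squares
sst = 846402.1    # total sum of squares
n = 10            # number of observations (total df = n - 1 = 9)
k = 2             # number of X variables: HomeSize, SizeSquared

df_error = n - (k + 1)

s2 = sse / df_error                              # MSE
r2 = 1 - sse / sst                               # R squared
f_stat = (r2 / k) / ((1 - r2) / df_error)        # overall F
adj_r2 = 1 - (sse / df_error) / (sst / (n - 1))  # adjusted R squared

print(f"s^2 (MSE)    = {s2:.3f}")
print(f"s            = {s2 ** 0.5:.3f}")
print(f"R^2          = {r2:.3f}")
print(f"F            = {f_stat:.2f}")
print(f"Adjusted R^2 = {adj_r2:.3f}")
```

The results should agree with the Model Summary and ANOVA tables up to rounding in the printed sums of squares.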
Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .912 (a)   .832       .811                133.438
a. Predictors: (Constant), HomeSize
b. Dependent Variable: EnergyUse

ANOVA (b)
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   703957.2         1    703957.183    39.536   .000 (a)
Residual     142444.9         8    17805.615
Total        846402.1         9
a. Predictors: (Constant), HomeSize
b. Dependent Variable: EnergyUse
$y = \beta_0 + \beta_1 x + \varepsilon$
Coefficients (a)
Model 1      B         Std. Error   Beta   t       Sig.
(Constant)   578.928   166.968             3.467   .008
HomeSize     .540      .086         .912   6.288   .000
B, Std. Error: unstandardized coefficients. Beta: standardized coefficients.
a. Dependent Variable: EnergyUse
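One common way to compare the straight-line model with the quadratic model is a nested-model (partial) F test on the drop in SSE. The slides do not show this test; the sketch below is an added illustration that uses only the SSE values from the two ANOVA tables. With a single added term, the partial F should equal the square of that term's t statistic.

```python
# SSE values copied from the two ANOVA tables.
sse_reduced = 142444.9   # EnergyUse ~ HomeSize
sse_full = 15332.554     # EnergyUse ~ HomeSize + SizeSquared
n = 10                   # observations
k_full = 2               # X variables in the full model
extra_terms = 1          # terms added to the reduced model (SizeSquared)

df_full = n - (k_full + 1)

# Drop in SSE per added term, relative to the full model's MSE.
partial_f = ((sse_reduced - sse_full) / extra_terms) / (sse_full / df_full)

# t statistic for SizeSquared from the Coefficients table.
t_size_squared = -7.618

print(f"partial F = {partial_f:.2f}")
print(f"t^2       = {t_size_squared ** 2:.2f}")
```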
Correlation with Y (survival time), r
x1   .346
x2   .593
x3   .665
x4   .726
X variables             SSE      R2
X1 (Blood Clotting)     3.4961   .120
X2 (Prognostic Ind.)    2.5763   .352
X3 (Enzyme Func.)       2.2153   .442
X4 (Liver Func.)        1.8776   .527
X1, X2                  2.2325   .438
X1, X3                  1.4072   .646
X1, X4                  1.8758   .528
X2, X3                  0.7430   .813
X2, X4                  1.3922   .650
X3, X4                  1.2453   .687
X1, X2, X3              0.1099   .972
X1, X2, X4              1.3905   .650
X1, X3, X4              1.1156   .719
X2, X3, X4              0.4652   .883
X1, X2, X3, X4          0.1098   .972
Correlations among the X variables
      x1     x2      x3      x4
x1    1      .090    -.150   .502
x2           1       -.024   .369
x3                   1       .416
x4                           1
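The all-subsets table above can be scanned by subset size to see where adding a variable stops paying off. The sketch below simply picks the highest R2 at each size; the dictionary layout and names are illustrative, and the R2 values are copied from the table.

```python
# R2 for every subset of X1..X4, copied from the all-subsets table.
r2_by_subset = {
    ("X1",): .120, ("X2",): .352, ("X3",): .442, ("X4",): .527,
    ("X1", "X2"): .438, ("X1", "X3"): .646, ("X1", "X4"): .528,
    ("X2", "X3"): .813, ("X2", "X4"): .650, ("X3", "X4"): .687,
    ("X1", "X2", "X3"): .972, ("X1", "X2", "X4"): .650,
    ("X1", "X3", "X4"): .719, ("X2", "X3", "X4"): .883,
    ("X1", "X2", "X3", "X4"): .972,
}

# Keep the subset with the largest R2 for each number of predictors.
best = {}
for subset, r2 in r2_by_subset.items():
    size = len(subset)
    if size not in best or r2 > best[size][1]:
        best[size] = (subset, r2)

for size in sorted(best):
    subset, r2 = best[size]
    print(f"{size} variable(s): {', '.join(subset):<16} R2 = {r2:.3f}")
```

Note that X4 has the highest correlation with Y on its own, yet adding it to X1, X2, X3 leaves R2 at .972, which is consistent with its correlations with the other predictors.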
Standardized coefficients used to establish a common metric for comparison
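$\text{income} = \alpha + \beta_1(\text{years of education}) + \beta_2(\text{I.Q.}) + \varepsilon$
$\text{income} = \alpha + 2(\text{years of education}) + 1(\text{I.Q.}) + \varepsilon$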
Can you say that years of education is more important than I.Q.?
Of course, you cannot, because they are not measured with the same metric. One way to solve this problem of comparing beta coefficients is to use standardized coefficients.
Standardized coefficients are calculated in a regression equation using the z-scores of the dependent (Y) and independent (X) variables.
Interpreting the standardized coefficients
An increase of one standard deviation in x1 changes y by the standardized coefficient associated with x1, measured in standard deviations of y, holding the other predictors constant.
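The link between the two kinds of coefficient can be sketched for the single-predictor case: fitting the regression on z-scores gives the same slope as rescaling the unstandardized slope by s_x / s_y. The data below are hypothetical and the helper names are illustrative. With several predictors the same rescaling, b_j * s_xj / s_y, applies to each coefficient.

```python
from statistics import mean, stdev


def slope(x, y):
    """Least-squares slope of y on x (single predictor)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def z_scores(values):
    """Convert values to z-scores using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical data, for illustration only.
x = [1200, 1500, 1800, 2100, 2600]
y = [1150, 1400, 1600, 1750, 1900]

b = slope(x, y)                                 # unstandardized slope
beta_from_z = slope(z_scores(x), z_scores(y))   # slope fitted on z-scores
beta_from_b = b * stdev(x) / stdev(y)           # rescaled unstandardized slope

print(f"b = {b:.4f}")
print(f"beta from z-scores  = {beta_from_z:.4f}")
print(f"beta from rescaling = {beta_from_b:.4f}")
```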
Coefficients (a)
Model 1       B                Std. Error     Beta     t        Sig.
(Constant)    -1216.1438870    242.80636850            -5.009   .00155
HomeSize      2.39893018       .24583560      4.049    9.758    .00003
SizeSquared   -.00045004       .00005908      -3.161   -7.618   .00012
B, Std. Error: unstandardized coefficients. Beta: standardized coefficients.
a. Dependent Variable: EnergyUse
Descriptive Statistics
                     N    Mean      Std. Deviation
EnergyUse            10   1594.70   306.667
HomeSize             10   1880.00   517.623
SizeSquared          10   3775540   2153984.105
Valid N (listwise)   10
Every increase of 1 s.d. in X1 (HomeSize) increases Y by 4.049 s.d., i.e., 4.049 * 306.667 = 1241.69. Using the unstandardized coefficient instead gives 2.39893018 * 517.623 = 1241.74. The two results should be equal; the small difference is rounding error.
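The same conversion can be run in the other direction: multiplying each unstandardized coefficient by s_x / s_y should reproduce the Beta column. A minimal sketch using the values from the Coefficients and Descriptive Statistics tables (the dictionary names are illustrative):

```python
# Standard deviations from the Descriptive Statistics table.
sd = {
    "EnergyUse": 306.667,
    "HomeSize": 517.623,
    "SizeSquared": 2153984.105,
}

# Unstandardized coefficients (B) from the Coefficients table.
b = {
    "HomeSize": 2.39893018,
    "SizeSquared": -0.00045004,
}

# Standardized coefficient = B * (s.d. of X) / (s.d. of Y).
beta = {name: coef * sd[name] / sd["EnergyUse"] for name, coef in b.items()}

for name, value in beta.items():
    print(f"{name:<12} Beta = {value:.3f}")
```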
Dummy variables
educofyrsOTHERHISPANCAUCASASIAMERIncome 12*95.*2.2*7.0*5.2*9.141.5 +−+−+++=
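Dummy variables turn a categorical variable into a set of 0/1 columns, with one category left out as the reference group so that each coefficient is read as a difference from that group. The sketch below is illustrative only: the category labels follow the variable names on the slide, while "BASELINE" is a hypothetical name for the omitted reference category and `dummy_code` is a hypothetical helper.

```python
def dummy_code(values, categories, reference):
    """Return one dict of 0/1 indicators per observation.

    The reference category gets no column, so an observation in the
    reference group has 0 in every dummy.
    """
    dummy_names = [c for c in categories if c != reference]
    return [{name: int(v == name) for name in dummy_names} for v in values]


# "BASELINE" is a hypothetical label for the omitted reference group.
categories = ["AMER", "ASI", "CAUCAS", "HISPAN", "OTHER", "BASELINE"]
observed = ["CAUCAS", "BASELINE", "HISPAN"]  # hypothetical observations

rows = dummy_code(observed, categories, reference="BASELINE")
for label, row in zip(observed, rows):
    print(label, row)
```

Including a dummy for every category alongside the constant would make the columns perfectly collinear, which is why one category is dropped.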
Multicollinearity
Data for 67 Florida Counties
• fem = Percentage of households headed by a female
• inc = Median income
• hs = Percentage of residents over 25 years old with at least a high school diploma
• urb = Percentage of residents living in an urban environment
• cr = Number of crimes per capita
• unemrt = Unemployment rate
[Figure: scatterplot matrix of fem, inc, un, hs, urb, cr, unemrt]
Correlations (N = 67 for every pair)
                              fem       inc       un        hs        urb       cr        unemrt
fem     Pearson Correlation   1         -.561**   -.055     -.511**   -.435**   -.143     -.055
        Sig. (2-tailed)                 .000      .661      .000      .000      .248      .661
inc     Pearson Correlation   -.561**   1         -.119     .793**    .730**    .432**    -.119
        Sig. (2-tailed)       .000                .337      .000      .000      .000      .337
un      Pearson Correlation   -.055     -.119     1         -.250*    -.053     -.001     1.000**
        Sig. (2-tailed)       .661      .337                .041      .670      .996      .000
hs      Pearson Correlation   -.511**   .793**    -.250*    1         .791**    .468**    -.250*
        Sig. (2-tailed)       .000      .000      .041                .000      .000      .041
urb     Pearson Correlation   -.435**   .730**    -.053     .791**    1         .678**    -.053
        Sig. (2-tailed)       .000      .000      .670      .000                .000      .670
cr      Pearson Correlation   -.143     .432**    -.001     .468**    .678**    1         -.001
        Sig. (2-tailed)       .248      .000      .996      .000      .000                .996
unemrt  Pearson Correlation   -.055     -.119     1.000**   -.250*    -.053     -.001     1
        Sig. (2-tailed)       .661      .337      .000      .041      .670      .996
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
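A correlation matrix like this one can be screened for pairs of variables that overlap heavily. The sketch below flags every pair with |r| at or above a cutoff; the 0.7 cutoff is an assumption chosen for illustration, not a rule from the lecture, and the matrix values are copied from the Correlations table.

```python
names = ["fem", "inc", "un", "hs", "urb", "cr", "unemrt"]

# Pearson correlations copied from the Correlations table.
r = [
    [1.000, -.561, -.055, -.511, -.435, -.143, -.055],
    [-.561, 1.000, -.119,  .793,  .730,  .432, -.119],
    [-.055, -.119, 1.000, -.250, -.053, -.001, 1.000],
    [-.511,  .793, -.250, 1.000,  .791,  .468, -.250],
    [-.435,  .730, -.053,  .791, 1.000,  .678, -.053],
    [-.143,  .432, -.001,  .468,  .678, 1.000, -.001],
    [-.055, -.119, 1.000, -.250, -.053, -.001, 1.000],
]

threshold = 0.7  # assumed cutoff, for illustration only

# Scan the upper triangle so each pair is reported once.
flagged = [
    (names[i], names[j], r[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(r[i][j]) >= threshold
]

for a, b, value in flagged:
    print(f"{a:<4} and {b:<7} r = {value:.3f}")
```

The pair un and unemrt has r = 1.000, so the two columns carry identical information and cannot both be used as predictors.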
Detecting Multicollinearity with the Variance Inflation Factor (VIF)
Coefficients (a)
Model 1      B           Std. Error   Beta    t       Sig.   Tolerance   VIF
(Constant)   .024        .042                 .579    .565
fem          .002        .001         .172    1.516   .135   .646        1.547
inc          1.450E-08   .000         .002    .015    .988   .313        3.191
hs           .000        .001         -.090   -.482   .632   .237        4.217
unemrt       .000        .001         .030    .304    .762   .842        1.188
urb          .001        .000         .824    5.172   .000   .328        3.049
B, Std. Error: unstandardized coefficients. Beta: standardized coefficients. Tolerance, VIF: collinearity statistics.
a. Dependent Variable: cr
Tolerance: the proportion of each predictor's variance that is not related to the other predictors.
VIF = 1/Tolerance. If Tolerance = 1, then VIF = 1. As VIF becomes larger, greater overlap exists among the predictors.
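The relationship VIF = 1/Tolerance can be checked against the collinearity statistics in the Coefficients table. Tolerance for predictor j is commonly computed as 1 minus the R2 from regressing that predictor on the other predictors. A minimal sketch using the printed values (the dictionary names are illustrative):

```python
# Tolerance and VIF columns copied from the Coefficients table.
tolerance = {"fem": .646, "inc": .313, "hs": .237, "unemrt": .842, "urb": .328}
reported_vif = {"fem": 1.547, "inc": 3.191, "hs": 4.217, "unemrt": 1.188, "urb": 3.049}

# VIF is the reciprocal of tolerance.
vif = {name: 1 / tol for name, tol in tolerance.items()}

for name in tolerance:
    print(f"{name:<7} 1/Tolerance = {vif[name]:.3f}   reported VIF = {reported_vif[name]:.3f}")
```

Small differences in the third decimal place come from the tolerance values being rounded in the table.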
[Figure: Z scores for crime per capita]
[Figure: Z scores for % living in urbanized area]
A positive and significant z-score indicates spatial clustering of high values.
A negative and significant z-score indicates spatial clustering of low values.
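The reading rule above can be written as a small classifier. This is a sketch under an assumption: significance is judged at the two-tailed 0.05 level, so the critical value is 1.96. The function name and labels are illustrative and are not taken from any GIS package.

```python
def classify_z(z, critical=1.96):
    """Label a z-score using a two-tailed cutoff (1.96 assumes the 0.05 level)."""
    if z >= critical:
        return "cluster of high values"
    if z <= -critical:
        return "cluster of low values"
    return "not significant"


# Hypothetical z-scores, for illustration only.
for z in (2.8, -2.3, 0.4):
    print(f"z = {z:+.1f}: {classify_z(z)}")
```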
Final Paper data in GIS
ma_eqv.dbf         MA ‘Kind of Community’ (KOC) data for all cities/towns in MA
ma_eqv_intro.txt   A brief explanation of the MA Department of Revenue’s Kind-of-Community classification of MA cities and towns
GIS Spatial Data Set (formatted as ArcGIS shapefiles and located in the ‘gis’ sub-directory):
ma_towns00     Town boundaries for MA cities and towns
majmhda1       Major roads for MA (see ‘class’ for road type distinctions)
maj_pop1       Major MA lakes and ponds (for better cartography)
p525_ma        Boundaries for MA PUMA regions
majmhdcl.avl   Pre-configured classification and symbols for MA major roads