
Air pollution is the introduction of chemicals and biological materials into the atmosphere that causes damage to the natural environment. We focused


Air pollution is the introduction of chemicals and biological materials into the atmosphere that causes damage to the natural environment.

We focus on sulfur dioxide (SO2) as a major contributor to air pollution.

Sulfur dioxide is:
• A highly reactive gas
• A cause of acid rain
• A precursor to respiratory and cardiovascular problems

Air pollution is an ongoing problem worldwide, now more than ever.

We conduct a cross-sectional study of sulfur dioxide levels and related factors for 41 US cities, using means over the years 1969-1971. By running several regressions, we attempt to determine the likely causes of air pollution.


City            SO2   Temperature   Man   Population   Wind   Rain    RainDays
Phoenix         10    70.3          213   582          6      7.05    36
Little Rock     13    61            91    132          8.2    48.52   100
San Francisco   12    56.7          453   716          8.7    20.66   67
Denver          17    51.9          454   515          9      12.95   86
Hartford        56    49.1          412   158          9      43.37   127
Wilmington      36    54            80    80           9      40.25   114
Washington      29    57.3          434   757          9.3    38.89   111
Jacksonville    14    68.4          136   529          8.8    54.47   116
…               …     …             …     …            …      …       …

The data are means over the years 1969-1971.


1. City: City

2. SO2: Sulfur dioxide content of air in micrograms per cubic meter

3. Temperature: Average annual temperature in degrees Fahrenheit

4. Man: Number of manufacturing enterprises employing 20 or more workers

5. Population: Population size in thousands from the 1970 census

6. Wind: Average annual wind speed in miles per hour

7. Rain: Average annual precipitation in inches

8. RainDays: Average number of days with precipitation per year


Histogram of sulfur levels:

Because the Jarque-Bera statistic is high and the distribution is positively skewed, we conclude that sulfur levels are not normally distributed.


We ran a number of bi-variate regressions to find out which independent variables significantly explain SO2 levels, both including and excluding dummy variables.

Next we ran a multi-variate regression to see if the variables that we found to be significant are significant in explaining SO2 levels when combined.

We then tested for multicollinearity and lastly investigated an interesting problem.
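A bivariate regression of the kind described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the workflow used for this presentation: it fits SO2 on temperature by ordinary least squares using only the eight cities listed in the data table, so its estimates will differ from the 41-city results. The function name `simple_ols` is hypothetical.

```python
import math

# Only the eight cities shown in the data table; the other 33 are elided
# in the source, so these estimates will NOT match the 41-city output.
so2 = [10, 13, 12, 17, 56, 36, 29, 14]
temperature = [70.3, 61, 56.7, 51.9, 49.1, 54, 57.3, 68.4]


def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ssr = sum(e ** 2 for e in residuals)          # sum of squared residuals
    sst = sum((yi - mean_y) ** 2 for yi in y)     # total sum of squares
    se_slope = math.sqrt(ssr / (n - 2) / sxx)     # standard error of the slope
    return {
        "intercept": intercept,
        "slope": slope,
        "se_slope": se_slope,
        "t_stat": slope / se_slope,
        "r_squared": 1 - ssr / sst,
    }


fit = simple_ols(temperature, so2)
print(fit)
```

The same function can be pointed at any other column of the table (Man, Population, Wind, Rain, RainDays) to mirror the remaining bivariate regressions.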


Dependent Variable: SO2
Method: Least Squares
Date: 11/22/09 Time: 14:03
Sample: 1 41
Included observations: 41

Variable      Coefficient   Std. Error   t-Statistic   Prob.
TEMPERATURE   -1.408133     0.468595     -3.005012     0.0046
C             108.5711      26.34371     4.121328      0.0002

R-squared            0.188009    Mean dependent var      30.04878
Adjusted R-squared   0.167189    S.D. dependent var      23.47227
S.E. of regression   21.42044    Akaike info criterion   9.014119
Sum squared resid    17894.58    Schwarz criterion       9.097708
Log likelihood       -182.7894   F-statistic             9.030097
Durbin-Watson stat   1.848386    Prob(F-statistic)       0.004624

Temperature significantly explains SO2 levels, as shown by the large t-statistic (-3.01) and the low p-value (0.0046). The coefficient on temperature is negative, meaning that SO2 levels decrease as temperature increases.
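The summary statistics in this output are tied together by a few standard identities, which makes them easy to sanity-check. The sketch below assumes the usual definitions: the t-statistic is the coefficient divided by its standard error, adjusted R-squared penalizes for the number of regressors, and in a regression with a single regressor the F-statistic equals the squared t-statistic. The variable names are illustrative; the numbers are copied from the temperature regression.

```python
# Values copied from the SO2-on-temperature output.
n = 41                 # included observations
k = 1                  # regressors, not counting the constant
coef = -1.408133       # coefficient on TEMPERATURE
std_err = 0.468595     # its standard error
r_squared = 0.188009

# t-statistic: coefficient divided by its standard error.
t_stat = coef / std_err

# Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1).
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# With one regressor, the F-statistic is the square of the t-statistic.
f_stat = t_stat ** 2

print(round(t_stat, 4), round(adj_r_squared, 4), round(f_stat, 3))
```

The three results agree with the reported t-statistic, adjusted R-squared and F-statistic to the precision shown in the output.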


Dependent Variable: SO2
Method: Least Squares
Date: 11/22/09 Time: 14:22
Sample: 1 41
Included observations: 41

Variable   Coefficient   Std. Error   t-Statistic   Prob.
MAN        0.026859      0.005099     5.267788      0.0000
C          17.61057      3.691587     4.770462      0.0000

R-squared            0.415727    Mean dependent var      30.04878
Adjusted R-squared   0.400745    S.D. dependent var      23.47227
S.E. of regression   18.17025    Akaike info criterion   8.684999
Sum squared resid    12876.16    Schwarz criterion       8.768588
Log likelihood       -176.0425   F-statistic             27.74959
Durbin-Watson stat   1.721399    Prob(F-statistic)       0.000005

The number of manufacturing enterprises significantly explains SO2 levels, as shown by the high t-statistic (5.27) and the low p-value. The positive coefficient on MAN means that SO2 levels rise as the number of manufacturing enterprises increases.


Dependent Variable: SO2
Method: Least Squares
Date: 11/22/09 Time: 14:16
Sample: 1 41
Included observations: 41

Variable     Coefficient   Std. Error   t-Statistic   Prob.
POPULATION   0.020014      0.005644     3.546111      0.0010
C            17.86832      4.713844     3.790604      0.0005

R-squared            0.243818    Mean dependent var      30.04878
Adjusted R-squared   0.224429    S.D. dependent var      23.47227
S.E. of regression   20.67121    Akaike info criterion   8.942912
Sum squared resid    16664.66    Schwarz criterion       9.026500
Log likelihood       -181.3297   F-statistic             12.57490
Durbin-Watson stat   1.791243    Prob(F-statistic)       0.001035

Population significantly explains SO2 levels, as shown by the high t-statistic (3.55) and the low p-value (0.0010). The coefficient on population is positive, meaning that SO2 levels rise as population increases.


Dependent Variable: SO2
Method: Least Squares
Date: 11/22/09 Time: 14:07
Sample: 1 41
Included observations: 41

Variable   Coefficient   Std. Error   t-Statistic   Prob.
WIND       1.555741      2.619045     0.594011      0.5559
C          15.35652      25.00859     0.614050      0.5427

R-squared            0.008966    Mean dependent var      30.04878
Adjusted R-squared   -0.016445   S.D. dependent var      23.47227
S.E. of regression   23.66448    Akaike info criterion   9.213378
Sum squared resid    21840.30    Schwarz criterion       9.296967
Log likelihood       -186.8743   F-statistic             0.352849
Durbin-Watson stat   1.818109    Prob(F-statistic)       0.555935

Wind does not significantly explain SO2 levels, as can be seen from the low t-statistic (0.59) and the low R-squared (0.009). It thus makes sense to take the wind variable out of our regression model.


Dependent Variable: SO2
Method: Least Squares
Date: 11/22/09 Time: 14:14
Sample: 1 41
Included observations: 41

Variable   Coefficient   Std. Error   t-Statistic   Prob.
RAIN       0.108262      0.318822     0.339569      0.7360
C          26.06809      12.29492     2.120233      0.0404

R-squared            0.002948    Mean dependent var      30.04878
Adjusted R-squared   -0.022618   S.D. dependent var      23.47227
S.E. of regression   23.73623    Akaike info criterion   9.219433
Sum squared resid    21972.94    Schwarz criterion       9.303022
Log likelihood       -186.9984   F-statistic             0.115307
Durbin-Watson stat   1.820565    Prob(F-statistic)       0.736003

Rain does not significantly explain SO2 levels, as shown by the low t-statistic (0.34) and the low R-squared (0.003). We thus remove the rain variable from our regression model.


Dependent Variable: SO2
Method: Least Squares
Date: 11/22/09 Time: 14:09
Sample: 1 41
Included observations: 41

Variable    Coefficient   Std. Error   t-Statistic   Prob.
RAINYDAYS   0.327260      0.131760     2.483761      0.0174
C           -7.226963     15.39914     -0.469310     0.6415

R-squared            0.136577    Mean dependent var      30.04878
Adjusted R-squared   0.114438    S.D. dependent var      23.47227
S.E. of regression   22.08842    Akaike info criterion   9.075534
Sum squared resid    19028.03    Schwarz criterion       9.159123
Log likelihood       -184.0485   F-statistic             6.169068
Durbin-Watson stat   1.970233    Prob(F-statistic)       0.017404

RainyDays significantly explains SO2 levels, as shown by the high t-statistic (2.48) and the low p-value (0.0174). The coefficient on RainyDays is positive, meaning that SO2 levels increase as the number of rainy days increases.


Dependent Variable: SO2
Method: Least Squares
Date: 12/03/09 Time: 00:56
Sample: 1 41
Included observations: 41

Variable      Coefficient   Std. Error   t-Statistic   Prob.
TEMPERATURE   -0.417243     0.391666     -1.065304     0.2938
RAINYDAYS     0.127634      0.100713     1.267308      0.2132
POPULATION    -0.043929     0.015398     -2.852937     0.0071
MAN           0.068179      0.016111     4.231909      0.0002
C             33.93991      27.91632     1.215773      0.2320

R-squared            0.629094    Mean dependent var      30.04878
Adjusted R-squared   0.587882    S.D. dependent var      23.47227
S.E. of regression   15.06835    Akaike info criterion   8.376920
Sum squared resid    8173.989    Schwarz criterion       8.585892
Log likelihood       -166.7269   F-statistic             15.26491
Durbin-Watson stat   1.543633    Prob(F-statistic)       0.000000


Dependent Variable: RAINYDAYS
Method: Least Squares
Date: 11/22/09 Time: 14:35
Sample: 1 41
Included observations: 41

Variable      Coefficient   Std. Error   t-Statistic   Prob.
TEMPERATURE   -1.577840     0.530112     -2.976427     0.0050
C             201.8882      29.80212     6.774289      0.0000

R-squared            0.185108    Mean dependent var      113.9024
Adjusted R-squared   0.164214    S.D. dependent var      26.50642
S.E. of regression   24.23253    Akaike info criterion   9.260819
Sum squared resid    22901.40    Schwarz criterion       9.344408
Log likelihood       -187.8468   F-statistic             8.859119
Durbin-Watson stat   1.233606    Prob(F-statistic)       0.004989

Some multicollinearity exists because the two variables are significantly correlated: the t-statistic is large in absolute value (-2.98, p = 0.0050), although the R-squared of 0.185 is modest. RainyDays and Temperature are negatively correlated: as temperature goes up, the number of rainy days goes down.
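One common way to gauge how strong this collinearity is uses the R-squared of the auxiliary regression above. The sketch below is an illustration only: the variance inflation factor (VIF) is a standard diagnostic that this presentation does not itself report, and the variable names are hypothetical. With a single regressor, the correlation between the two variables is the square root of R-squared, carrying the sign of the slope.

```python
import math

# From the RAINYDAYS-on-TEMPERATURE output.
r_squared_aux = 0.185108
slope = -1.577840

# Simple correlation between the two variables (sign taken from the slope).
correlation = math.copysign(math.sqrt(r_squared_aux), slope)

# Variance inflation factor: how much the variance of a coefficient is
# inflated by the linear relationship between the two regressors.
vif = 1 / (1 - r_squared_aux)

print(round(correlation, 3), round(vif, 3))
```

A VIF near 1 indicates little variance inflation; a common rule of thumb treats values above roughly 5 to 10 as a sign of serious multicollinearity.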


Dependent Variable: SO2
Method: Least Squares
Date: 11/22/09 Time: 14:44
Sample: 1 41
Included observations: 41

Variable      Coefficient   Std. Error   t-Statistic   Prob.
TEMPERATURE   -1.094340     0.512399     -2.135717     0.0392
RAINYDAYS     0.198875      0.139720     1.423383      0.1628
C             68.42054      38.36511     1.783405      0.0825

R-squared            0.229110    Mean dependent var      30.04878
Adjusted R-squared   0.188537    S.D. dependent var      23.47227
S.E. of regression   21.14411    Akaike info criterion   9.010956
Sum squared resid    16988.80    Schwarz criterion       9.136339
Log likelihood       -181.7246   F-statistic             5.646841
Durbin-Watson stat   1.934916    Prob(F-statistic)       0.007126

Since multicollinearity exists, we cannot rely on the individual t-statistics in a regression that uses these two variables together as independent variables. We can, however, continue to use the F-statistic to determine whether the two variables jointly have a significant impact on SO2 levels. The F-statistic (5.65, p = 0.0071) shows that they do, but we cannot tell which variable individually drives the SO2 level.
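The F-statistic relied on here can be recovered from the R-squared of the regression. The sketch below assumes the usual overall-significance formula F = (R²/k) / ((1 − R²)/(n − k − 1)) with k = 2 regressors, and uses the closed-form upper-tail probability (1 + 2F/m)^(−m/2) that holds when the numerator has exactly two degrees of freedom and the denominator has m. The variable names are illustrative.

```python
# From the SO2-on-TEMPERATURE-and-RAINYDAYS output.
n = 41                # included observations
k = 2                 # regressors, not counting the constant
r_squared = 0.229110

# Overall F-statistic for joint significance of the k regressors.
df_denominator = n - k - 1
f_stat = (r_squared / k) / ((1 - r_squared) / df_denominator)

# Upper-tail probability of an F(2, m) distribution has a closed form:
# P(F > f) = (1 + 2 * f / m) ** (-m / 2). Valid only for 2 numerator df.
p_value = (1 + 2 * f_stat / df_denominator) ** (-df_denominator / 2)

print(round(f_stat, 4), round(p_value, 6))
```

Both values agree with the F-statistic and Prob(F-statistic) reported in the output, which is why the joint test remains usable even when the individual t-statistics are hard to interpret.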


Box plot indicating the two outliers: Providence (94) and Chicago (110)

Smallest = 8 (Wichita)
Q1 = 12.5
Median = 26 (Richmond)
Q3 = 35.5
Largest = 110 (Chicago)
IQR = 23
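The box-plot summary can be turned into explicit outlier cutoffs. The sketch below assumes the conventional 1.5 × IQR rule for the fences; the presentation does not state which rule produced the box plot, so treat the multiplier as an assumption. The variable names are illustrative.

```python
# Quartiles reported with the box plot.
q1, q3 = 12.5, 35.5
iqr = q3 - q1

# Conventional fences: 1.5 * IQR beyond the quartiles (assumed rule).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The two cities flagged in the box plot, with their SO2 levels.
flagged = {"Providence": 94, "Chicago": 110}
outliers = {
    city: level
    for city, level in flagged.items()
    if level < lower_fence or level > upper_fence
}

print(iqr, lower_fence, upper_fence, outliers)
```

Under this rule the upper fence is 70, so both Providence and Chicago fall outside it.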


Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 14:26
Sample: 1 41
Included observations: 41

Variable      Coefficient   Std. Error   t-Statistic   Prob.
TEMPERATURE   -1.046409     0.342685     -3.053562     0.0042
C2            61.31696      15.76031     3.890593      0.0004
C1            77.94481      15.73461     4.953716      0.0000
C             85.00347      19.36350     4.389881      0.0001

R-squared            0.600423    Mean dependent var      30.04878
Adjusted R-squared   0.568025    S.D. dependent var      23.47227
S.E. of regression   15.42711    Akaike info criterion   8.402598
Sum squared resid    8805.845    Schwarz criterion       8.569776
Log likelihood       -168.2533   F-statistic             18.53262
Durbin-Watson stat   1.893874    Prob(F-statistic)       0.000000

Temperature still significantly explains SO2 levels, as shown by the large t-statistic (-3.05) and the low p-value (0.0042).


Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 14:30
Sample: 1 41
Included observations: 41

Variable   Coefficient   Std. Error   t-Statistic   Prob.
MAN        0.025836      0.007284     3.546764      0.0011
C2         68.91495      15.10630     4.562001      0.0001
C1         7.380472      26.27515     0.280892      0.7804
C          16.22323      3.724041     4.356351      0.0001

R-squared            0.626658    Mean dependent var      30.04878
Adjusted R-squared   0.596387    S.D. dependent var      23.47227
S.E. of regression   14.91206    Akaike info criterion   8.334685
Sum squared resid    8227.670    Schwarz criterion       8.501863
Log likelihood       -166.8610   F-statistic             20.70163
Durbin-Watson stat   1.877703    Prob(F-statistic)       0.000000

The number of manufacturing enterprises still significantly explains SO2 levels, as shown by the high t-statistic (3.55) and the low p-value (0.0011).


Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 14:31
Sample: 1 41
Included observations: 41

Variable     Coefficient   Std. Error   t-Statistic   Prob.
POPULATION   0.012183      0.007103     1.715259      0.0947
C2           72.14691      17.02945     4.236597      0.0001
C1           49.28269      26.15993     1.883900      0.0675
C            19.67230      4.719600     4.168214      0.0002

R-squared            0.536577    Mean dependent var      30.04878
Adjusted R-squared   0.499002    S.D. dependent var      23.47227
S.E. of regression   16.61396    Akaike info criterion   8.550832
Sum squared resid    10212.88    Schwarz criterion       8.718010
Log likelihood       -171.2921   F-statistic             14.28020
Durbin-Watson stat   1.799458    Prob(F-statistic)       0.000002

Population no longer significantly explains SO2 levels at the 5% level: the t-statistic is low (1.72) and the p-value is high (0.0947).


[Figure: scatter plot of POPULATION vs. SO2; SO2 axis from 0 to 120, POPULATION axis from 0 to 3500]


Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 14:28
Sample: 1 41
Included observations: 41

Variable   Coefficient   Std. Error   t-Statistic   Prob.
WIND       -0.393012     1.937653     -0.202829     0.8404
C2         68.11667      17.62874     3.863956      0.0004
C1         84.03807      17.58138     4.779946      0.0000
C          30.04926      18.40261     1.632881      0.1110

R-squared            0.500282    Mean dependent var      30.04878
Adjusted R-squared   0.459765    S.D. dependent var      23.47227
S.E. of regression   17.25229    Akaike info criterion   8.626234
Sum squared resid    11012.73    Schwarz criterion       8.793412
Log likelihood       -172.8378   F-statistic             12.34727
Durbin-Watson stat   1.712120    Prob(F-statistic)       0.000010

Wind still does not significantly explain SO2 levels, as can be seen from the low t-statistic (-0.20) and the high p-value (0.8404).


Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 14:30
Sample: 1 41
Included observations: 41

Variable   Coefficient   Std. Error   t-Statistic   Prob.
RAIN       0.070950      0.232440     0.305241      0.7619
C2         67.21003      17.51681     3.836887      0.0005
C1         83.79963      17.46754     4.797449      0.0000
C          23.75684      8.960694     2.651228      0.0117

R-squared            0.500983    Mean dependent var      30.04878
Adjusted R-squared   0.460522    S.D. dependent var      23.47227
S.E. of regression   17.24018    Akaike info criterion   8.624830
Sum squared resid    10997.28    Schwarz criterion       8.792008
Log likelihood       -172.8090   F-statistic             12.38194
Durbin-Watson stat   1.716316    Prob(F-statistic)       0.000009

Rain still does not significantly explain SO2 levels, as shown by the low t-statistic (0.31) and the high p-value (0.7619).


Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 14:29
Sample: 1 41
Included observations: 41

Variable    Coefficient   Std. Error   t-Statistic   Prob.
RAINYDAYS   0.278414      0.092644     3.005193      0.0047
C2          64.41428      15.71003     4.100200      0.0002
C1          81.24952      15.69349     5.177276      0.0000
C           -5.215998     10.79510     -0.483182     0.6318

R-squared            0.597879    Mean dependent var      30.04878
Adjusted R-squared   0.565274    S.D. dependent var      23.47227
S.E. of regression   15.47614    Akaike info criterion   8.408944
Sum squared resid    8861.907    Schwarz criterion       8.576122
Log likelihood       -168.3834   F-statistic             18.33736
Durbin-Watson stat   1.897244    Prob(F-statistic)       0.000000

RainyDays still significantly explains SO2 levels, as shown by the high t-statistic (3.01) and the low p-value (0.0047).


Dependent Variable: TEMPERATURE
Method: Least Squares
Date: 12/02/09 Time: 15:38
Sample: 1 41
Included observations: 41

Variable    Coefficient   Std. Error   t-Statistic   Prob.
RAINYDAYS   -0.114168     0.040132     -2.844797     0.0072
C2          -4.720416     6.805350     -0.693633     0.4922
C1          -4.462919     6.798183     -0.656487     0.5156
C           68.99137      4.676276     14.75349      0.0000

R-squared            0.204186    Mean dependent var      55.76341
Adjusted R-squared   0.139660    S.D. dependent var      7.227716
S.E. of regression   6.704032    Akaike info criterion   6.735763
Sum squared resid    1662.930    Schwarz criterion       6.902941
Log likelihood       -134.0831   F-statistic             3.164421
Durbin-Watson stat   1.108636    Prob(F-statistic)       0.035732

Multicollinearity still exists because the two variables remain significantly correlated (t = -2.84, p = 0.0072).


Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 15:30
Sample: 1 41
Included observations: 41

Variable      Coefficient   Std. Error   t-Statistic   Prob.
TEMPERATURE   -0.741891     0.364337     -2.036277     0.0491
RAINYDAYS     0.193714      0.098186     1.972930      0.0562
C2            60.91225      15.17959     4.012773      0.0003
C1            77.93852      15.15345     5.143285      0.0000
C             45.96809      27.18869     1.690706      0.0995

R-squared            0.639411    Mean dependent var      30.04878
Adjusted R-squared   0.599346    S.D. dependent var      23.47227
S.E. of regression   14.85731    Akaike info criterion   8.348710
Sum squared resid    7946.626    Schwarz criterion       8.557682
Log likelihood       -166.1486   F-statistic             15.95916
Durbin-Watson stat   1.971757    Prob(F-statistic)       0.000000

According to the F-statistic, temperature and rainy days are jointly significantly related to SO2 levels. However, since multicollinearity exists, we cannot rely on the individual t-statistics and therefore do not know how significant each variable is.


Our final model includes the two dummy variables. This regression model has a significant F-statistic and a small p-value.


[Figure: histogram of residuals; horizontal axis from -30 to 30, vertical axis from 0 to 9]

Series: Residuals
Sample 1 41
Observations 41

Mean          5.20e-16
Median        -1.262938
Maximum       26.55544
Minimum       -29.56690
Std. Dev.     12.37240
Skewness      0.169384
Kurtosis      2.795795

Jarque-Bera   0.267292
Probability   0.874900

Histogram of residuals from the model with dummy variables:

The residuals have a low Jarque-Bera statistic (0.27) with a high probability (0.87) and are only slightly positively skewed, so we cannot reject the hypothesis that they are normally distributed.
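The Jarque-Bera statistic and its probability can be reproduced from the reported skewness and kurtosis. The sketch below assumes the standard formula JB = n/6 · (S² + (K − 3)²/4) and a chi-squared reference distribution with two degrees of freedom, for which the upper-tail probability is exp(−JB/2). The function name `jarque_bera` is illustrative.

```python
import math


def jarque_bera(n, skewness, kurtosis):
    """Return the Jarque-Bera statistic and its chi-squared(2) p-value."""
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # For a chi-squared distribution with 2 degrees of freedom,
    # P(X > x) = exp(-x / 2).
    p_value = math.exp(-jb / 2)
    return jb, p_value


# Values copied from the residual summary statistics.
jb, p_value = jarque_bera(n=41, skewness=0.169384, kurtosis=2.795795)
print(round(jb, 4), round(p_value, 4))
```

The result agrees with the reported Jarque-Bera value and probability; a small statistic with a large probability means normality is not rejected.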


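According to the figure above, there is an indication of heteroskedasticity. In a cross-sectional analysis such as this one, heteroskedasticity leaves the coefficient estimates of our final regression unbiased, but the standard errors, and therefore the t-statistics, should be interpreted with caution.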


From our regression model, we find that temperature, rainy days and manufacturing all have a significant effect on SO2 levels, explaining 72% of the sulfur levels.

Of the three variables, however, manufacturing enterprises is the most significant explanatory variable.

Economic Impact:
• Given that SO2 is a threat to human wellbeing and the environment, lowering the SO2 levels can reduce future costs.
• SO2 pollution is preventable as it stems from human activity.
• Lower SO2 levels could be achieved by future restrictions on the number of manufacturing enterprises or on the emission levels of SO2 they release.