srilakshmi-shunmugaraj
Chapter 20 - problem 38, 40
Chapter 20
Curved Patterns
Question No. 38: Cellular Phones in Africa
(a) The scatterplot of the two types of subscribers suggests a possible linear trend in the number of
landlines. The plot of landline subscribers seems more linear than that of mobile
subscribers.
(b)
[Figure: Mobile & Landline Subscribers Vs Year. Axes: Landline subscribers (000) and Mobile subscribers (000) against Year, 1990 to 2015. Series: Mobile Subscribers (Sub-Sahara, 000) and Land Line Subscribers (Sub-Sahara, 000), each with a linear trend line.]
[Figure: Land Line Subscribers Vs Year. Land Line Subscribers (Sub-Sahara, 000) against Year, 1990 to 2015, with linear trend line y = 374.51x - 744620, R² = 0.987.]
The linear fit to the number of landline subscribers has a high r² = 0.987, but the line does not
track the data well at the extremes or in the middle of the range.
(c) The regression equation is: number of landline subscribers (in 1000s) = 374.51 (year) - 744620
The slope implies that the number of landline subscribers grows by about 374,510 per year on
average. The negative intercept is a large, unrealistic extrapolation to year 0.
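As a quick check of the interpretation in (c), the fitted line can be evaluated directly. This is a minimal sketch in Python; the function name predict_landlines is an illustrative choice, and only the coefficients 374.51 and -744620 come from the fit reported here.

```python
# Coefficients of the fitted line reported in (c); units are thousands of subscribers.
SLOPE = 374.51
INTERCEPT = -744620.0


def predict_landlines(year: float) -> float:
    """Fitted number of landline subscribers (in thousands) for a given year."""
    return SLOPE * year + INTERCEPT


# The difference between consecutive years equals the slope,
# i.e. about 374.51 thousand additional subscribers per year.
annual_change = predict_landlines(2001) - predict_landlines(2000)

# Evaluating at year 0 just returns the intercept, a large negative count,
# which is why the intercept has no practical meaning here.
value_at_year_zero = predict_landlines(0)
```

Because the years in the data sit far from year 0, the intercept only positions the line and should not be read as a subscriber count.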
(d)
The residual plot does not show random scatter. The residuals indicate a poor fit that
deviates from linearity: the linear equation under-predicts at the edges of the
plot and over-predicts in the middle of the plot.
(e)
[Figure: Residual Plot. Residuals against Year, 1994 to 2012.]
[Figure: Log Subscribers Vs Year. log e landline subscribers (000) against Year, 1990 to 2015, with trend line y = 0.0751x - 141.8, R² = 0.9723.]
The log trend line still shows the bending pattern seen in the original plot. The residuals from this curve seem to
be random. Even so, its r² (0.9723) is lower than that of the linear fit (0.987), so the curve ‘Estimated log e (Number of Subscribers) = b0 + b1 Year’ is not a better
summary of the growth in the use of landlines than the ‘Number of Subscribers = b0 + b1
Year’ model.
(f)
The regression equation for the log is
log e (number of mobile subscribers) = 0.5819 (year) - 1156
Taking the inverse log of each slope: exp(0.5819) = 1.789 for mobile and exp(0.0751) = 1.078 for landlines.
[Figure: Log Subscribers Vs Year. log e mobile subscribers (000) against Year, 1994 to 2012, with trend line y = 0.5819x - 1156, R² = 0.9788.]
[Figure: Log land line and log mobile subscribers against Year, 1994 to 2012, each with a linear trend line. Log land line: y = 0.0751x - 141.8, R² = 0.9723. Log mobile: y = 0.5819x - 1156, R² = 0.9788.]
This implies a high annual rate of growth for mobile subscribers: their number is multiplied by
about 1.789 each year (roughly 78.9% annual growth), whereas the number of landline subscribers is
multiplied by about 1.078 each year (roughly 7.8% annual growth).
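The step from a slope on the log e scale to an annual growth rate can be sketched as follows. The function names are illustrative; the two slopes are the ones reported for the log mobile and log landline fits.

```python
import math


def annual_growth_factor(log_slope: float) -> float:
    """Multiplicative change per year implied by a slope on the natural-log scale."""
    return math.exp(log_slope)


def annual_growth_percent(log_slope: float) -> float:
    """The same change expressed as a percentage increase per year."""
    return (annual_growth_factor(log_slope) - 1.0) * 100.0


mobile_factor = annual_growth_factor(0.5819)    # slope of the log mobile fit
landline_factor = annual_growth_factor(0.0751)  # slope of the log landline fit
```

The factors are unitless multipliers per year, so they apply to the subscriber counts whatever units (such as thousands) the counts are recorded in.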
Question Number 40: CO2
(a) The three prominent outliers are the People’s Republic of China, the US and Japan.
(b) The plot after removing the outliers
[Figure: CO2 (million tons) Vs GDP (billion dollars). GDP axis from $0 to $12,000; trend line y = 0.5094x + 55.537, R² = 0.5553.]
[Figure: CO2 Vs GDP. CO2 (million tons) against GDP (billion dollars), GDP axis from $0 to $2,500; linear trend line y = 0.4587x + 39.846, R² = 0.4041.]
The plot shows that countries with low GDP have lower levels of CO2 emissions. The
pattern in the plot is exponential.
The regression line that summarizes the variation is:
CO2 (in millions of tons) = 0.4587 × GDP (in billions of dollars) + 39.846
(c)
A linear pattern is apparent in the scatterplot of Log CO2 against Log GDP.
(d) The fitted equation for the plot is: Log CO2 = 0.879 (Log GDP) + 0.2104
[Figure: Log CO2 Vs Log GDP, with trend line y = 0.879x + 0.2104, R² = 0.8043.]
[Figure: Residual Plot. Residuals against Log GDP.]
(e) The fit seems appropriate, as no pattern is found in the residuals. The
variation across log GDP also seems to be equal.
Fitted equation: Log CO2 = 0.879 (Log GDP) + 0.2104
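A log-log fit can be read back on the original scale as a power law. The sketch below assumes the logs in (d) are natural logs, which the text does not state explicitly; the function name predict_co2 is illustrative.

```python
import math

# Coefficients of the log-log fit reported in (d).
SLOPE = 0.879
INTERCEPT = 0.2104


def predict_co2(gdp: float) -> float:
    """Back-transform the log-log fit: CO2 = exp(intercept) * GDP ** slope.

    Assumes natural logs, GDP in billions of dollars, CO2 in millions of tons.
    """
    return math.exp(INTERCEPT) * gdp ** SLOPE


# Doubling GDP multiplies the fitted CO2 by 2 ** SLOPE, whatever the starting GDP.
doubling_ratio = predict_co2(2000.0) / predict_co2(1000.0)
```

Since the slope is below 1, the fitted CO2 rises less than proportionally with GDP: doubling GDP multiplies the fitted CO2 by about 1.84 rather than 2.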
(f)
Yes. The y-intercept changes from 0.2104 to 0.0914, so the fitted regression line changes, while the slope (0.879) and R² (0.8043) stay the same.
[Figure: Log 10 CO2 Vs Log 10 GDP, with trend line y = 0.879x + 0.0914, R² = 0.8043.]
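The change in intercept in (f) follows from the change of base. The sketch below assumes the fit in (d) used natural logs; the text does not say so explicitly, but the assumption is consistent with the two reported intercepts. The function name to_base10_fit is illustrative.

```python
import math


def to_base10_fit(slope_ln: float, intercept_ln: float) -> tuple[float, float]:
    """Convert a fit ln(y) = intercept + slope * ln(x) to base-10 logs.

    Dividing both sides by ln(10) gives
    log10(y) = intercept / ln(10) + slope * log10(x),
    so the slope is unchanged and only the intercept is rescaled.
    """
    return slope_ln, intercept_ln / math.log(10)


slope10, intercept10 = to_base10_fit(0.879, 0.2104)
```

Under that assumption the rescaled intercept is 0.2104 / ln(10), which rounds to the 0.0914 reported for the base-10 fit, and the slope remains 0.879.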