Chap003 (1).pptx

8/14/2019 Chap003 (1).pptx

1/51

Business Statistics: Communicating with Numbers

By Sanjiv Jaggia and Alison Kelly

McGraw-Hill/Irwin Copyright 2013 by The McGraw-Hill Companies, Inc. All rights reserved.


Chapter 3 Learning Objectives (LOs)

LO 3.1: Calculate and interpret the arithmetic mean, the median, and the mode.

LO 3.2: Calculate and interpret percentiles and a box plot.

LO 3.3: Calculate and interpret a geometric mean return and an average growth rate.

LO 3.4: Calculate and interpret the range, the mean absolute deviation, the variance, the standard deviation, and the coefficient of variation.


LO 3.5: Explain mean-variance analysis and the Sharpe ratio.

LO 3.6: Apply Chebyshev's Theorem and the empirical rule.

LO 3.7: Calculate the mean and the variance for grouped data.

LO 3.8: Calculate and interpret the covariance and the correlation coefficient.


Investment Decision

As an investment counselor at a large bank, Rebecca Johnson was asked by an inexperienced investor to explain the differences between two top-performing mutual funds:

Vanguard's Precious Metals and Mining fund (Metals)

Fidelity's Strategic Income Fund (Income)

The investor has collected sample returns for these two funds for years 2000 through 2009. These data are presented in the next slide.


Rebecca would like to

1. Determine the typical return of the mutual funds.

2. Evaluate the investment risk of the mutual funds.


3.1 Measures of Central Location

Example: Investment Decision

Use the data in the introductory case to calculate and interpret the mean return of the Metals fund and the mean return of the Income fund.

LO 3.1


The mean is sensitive to outliers.

Consider the salaries of employees at Acetech.

This mean does not reflect the typical salary!
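The effect of an outlier on the mean can be sketched in a few lines of Python. The salary table from the slide is not reproduced in this transcript, so the figures below are hypothetical values chosen only to illustrate the point.

```python
from statistics import mean

# Hypothetical salaries (NOT the slide's table): six ordinary salaries
# and one very large one that acts as an outlier.
salaries = [40_000, 40_000, 65_000, 90_000, 145_000, 150_000, 550_000]

mean_all = mean(salaries)                    # pulled upward by 550,000
mean_without_outlier = mean(salaries[:-1])   # mean of the other six salaries

print(round(mean_all), round(mean_without_outlier))
```

With these assumed numbers the mean is larger than six of the seven salaries, which is why it is a poor summary of the typical salary.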


The median is another measure of central location that is not affected by outliers.

When the data are arranged in ascending order, the median is

the middle value if the number of observations is odd, or

the average of the two middle values if the number of observations is even.
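The odd/even rule above could be written as a short Python function. This is a minimal sketch; the function name and the sample values are illustrative assumptions.

```python
def median(values):
    """Median by the odd/even rule: middle value, or mean of the two middle values."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    data = sorted(values)          # arrange in ascending order first
    n = len(data)
    mid = n // 2
    if n % 2 == 1:                 # odd count: the single middle value
        return data[mid]
    return (data[mid - 1] + data[mid]) / 2   # even count: average the two middle values

# Illustrative inputs (assumed, not the slide's tables)
odd_example = median([40_000, 40_000, 65_000, 90_000, 145_000, 150_000, 550_000])
even_example = median([8.09, 33.35, 34.30, 43.79])
print(odd_example, even_example)
```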


Consider the sorted salaries of employees at Acetech (odd number).

Median = 90,000 (3 values below, 3 values above)

Consider the sorted data from the Metals fund of the introductory case study (even number).

Median = (33.35 + 34.30) / 2 = 33.83%.


The mode is another measure of central location.

The most frequently occurring value in a data set

Used to summarize qualitative data

A data set can have no mode, one mode (unimodal), or many modes (multimodal).

Consider the salaries of employees at Acetech.

The mode is $40,000 since this value appears most often.
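A small Python sketch of the mode follows. It returns a list so that it can represent no mode, one mode, or several modes. The function name and all sample data are assumptions for illustration.

```python
from collections import Counter

def modes(values):
    """Return all most-frequent values; an empty list means there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:                   # every value occurs once: no mode
        return []
    return sorted(v for v, c in counts.items() if c == top)

# Hypothetical salary list in which 40,000 appears twice
salary_modes = modes([40_000, 40_000, 65_000, 90_000, 145_000, 150_000, 550_000])
print(salary_modes)
print(modes(["red", "blue", "red"]))   # also works for qualitative data
```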


3.2 Percentiles and Box Plots

In general, the pth percentile divides a data set into two parts:

Approximately p percent of the observations have values less than the pth percentile;

Approximately (100 − p) percent of the observations have values greater than the pth percentile.

LO 3.2 Calculate and interpret percentiles and a box plot.



Calculating the pth percentile:

First arrange the data in ascending order.

Locate the position, $L_p$, of the pth percentile by using the formula:

$$L_p = (n + 1)\frac{p}{100}$$

We use this position to find the percentile as shown next.



Consider the sorted data from the introductory case.

For the 25th percentile, we locate the position:

$$L_{25} = (n + 1)\frac{p}{100} = (10 + 1)\frac{25}{100} = 2.75$$

Similarly, for the 75th percentile, we first find:

$$L_{75} = (n + 1)\frac{p}{100} = (10 + 1)\frac{75}{100} = 8.25$$



Calculating the pth percentile

Once you find $L_p$, observe whether or not it is an integer.

If $L_p$ is an integer, then the $L_p$th observation in the sorted data set is the pth percentile.

If $L_p$ is not an integer, then interpolate between two corresponding observations to approximate the pth percentile.



Neither $L_{25} = 2.75$ nor $L_{75} = 8.25$ is an integer, thus

The 25th percentile is located 75% of the distance between the second and third observations, and it is

$$-7.34 + 0.75\,(8.09 - (-7.34)) = 4.23$$

The 75th percentile is located 25% of the distance between the eighth and ninth observations, and it is

$$43.79 + 0.25\,(59.45 - 43.79) = 47.71$$
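The position-and-interpolate procedure can be sketched in Python as follows. The data slide is not part of this transcript, so the ten sorted returns below are assumed inputs: the values in positions 2, 3, 5, 6, 8 and 9 are the ones quoted on the slides, and the remaining four are assumptions.

```python
import math

def percentile(sorted_data, p):
    """pth percentile using the position L_p = (n + 1) * p / 100 with linear interpolation."""
    n = len(sorted_data)
    position = (n + 1) * p / 100
    if position < 1 or position > n:
        raise ValueError("position falls outside the data; percentile not defined by this rule")
    lower = math.floor(position)
    fraction = position - lower
    if fraction == 0:                       # integer position: take that observation
        return sorted_data[lower - 1]
    below = sorted_data[lower - 1]          # observations are 1-indexed in the rule
    above = sorted_data[lower]
    return below + fraction * (above - below)

# Assumed sorted returns in percent (see the note above)
metals_sorted = [-56.02, -7.34, 8.09, 18.33, 33.35, 34.30, 36.13, 43.79, 59.45, 76.46]

q1 = percentile(metals_sorted, 25)
q2 = percentile(metals_sorted, 50)
q3 = percentile(metals_sorted, 75)
print(round(q1, 2), round(q2, 2), round(q3, 2))
```

Note that many software packages use a different position rule, so their quartiles may differ slightly from the ones computed this way.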


A box plot allows you to:

Graphically display the distribution of a data set.

Compare two or more distributions.

Identify outliers in a data set.

[Figure: box plot diagram with the box, the whiskers, and the outliers (marked *) labeled]



The box plot displays 5 summary values:

S = smallest value

L = largest value

Q1 = first quartile = 25th percentile

Q2 = median = second quartile = 50th percentile

Q3 = third quartile = 75th percentile


Using the results obtained from the Metals fund data, we can label the box plot with the 5 summary values:

[Figure: box plot of the Metals fund returns labeled with the smallest value, first quartile, second quartile, third quartile, and largest value]

Note that IQR = Q3 − Q1 = 47.71 − 4.23 = 43.48



Detecting outliers

Calculate IQR = 43.48

Calculate 1.5 × IQR, or 1.5 × 43.48 = 65.22

There are outliers if

Q1 − S > 65.22, or if

L − Q3 > 65.22

There are no outliers in this data set.
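The 1.5 × IQR check can be sketched as a small Python function. The quartiles 4.23 and 47.71 come from the slides; the smallest and largest values used below are assumptions, since the data table is not included in this transcript.

```python
def has_outliers(smallest, q1, q3, largest, factor=1.5):
    """True if either whisker is longer than factor * IQR."""
    iqr = q3 - q1
    limit = factor * iqr
    return (q1 - smallest) > limit or (largest - q3) > limit

q1, q3 = 4.23, 47.71                 # quartiles quoted on the slides
smallest, largest = -56.02, 76.46    # ASSUMED extremes of the Metals fund data

limit = 1.5 * (q3 - q1)
metals_has_outliers = has_outliers(smallest, q1, q3, largest)
print(round(limit, 2), metals_has_outliers)
```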


3.3 The Geometric Mean

Remember that the arithmetic mean is an additive average measurement.

Ignores the effects of compounding.

The geometric mean is a multiplicative average that incorporates compounding. It is used to measure:

Average investment returns over several years,

Average growth rates.

LO 3.3 Calculate and interpret a geometric mean return and an average growth rate.



For multiperiod returns $R_1, R_2, \ldots, R_n$, the geometric mean return $G_R$ is calculated as:

$$G_R = \sqrt[n]{(1 + R_1)(1 + R_2)\cdots(1 + R_n)} - 1$$

where n is the number of multiperiod returns.
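A minimal Python sketch of the geometric mean return is given here. Returns are assumed to be expressed as decimals greater than −1, and the two-period example is hypothetical.

```python
import math

def geometric_mean_return(returns):
    """nth root of the product of (1 + R_i), minus 1; returns are decimals (0.10 = 10%)."""
    n = len(returns)
    compounded = math.prod(1 + r for r in returns)
    return compounded ** (1 / n) - 1

# Hypothetical returns: +10% in year 1, -10% in year 2
returns = [0.10, -0.10]
arithmetic = sum(returns) / len(returns)
geometric = geometric_mean_return(returns)
print(arithmetic, round(geometric, 6))
```

In this assumed example the arithmetic mean is zero, while the geometric mean is slightly negative because a 10% gain followed by a 10% loss leaves the investment below its starting value.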



Using the data from the Metals and Income funds, we can calculate the geometric mean returns:


Computing an average growth rate

For growth rates $g_1, g_2, \ldots, g_n$, the average growth rate $G_g$ is calculated as

$$G_g = \sqrt[n]{(1 + g_1)(1 + g_2)\cdots(1 + g_n)} - 1$$

where n is the number of multiperiod growth rates.

For observations $x_1, x_2, \ldots, x_n$, the average growth rate $G_g$ is calculated as

$$G_g = \sqrt[n-1]{\frac{x_n}{x_1}} - 1$$

where n − 1 is the number of distinct growth rates.
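The two routes to the average growth rate can be compared in a short Python sketch. The sales figures are hypothetical (they are not the Adidas data), and the function names are illustrative.

```python
import math

def growth_rates(observations):
    """Period-over-period growth rates g_i = x_i / x_(i-1) - 1."""
    return [later / earlier - 1 for earlier, later in zip(observations, observations[1:])]

def average_growth_rate(observations):
    """Simplified formula: (x_n / x_1) ** (1 / (n - 1)) - 1."""
    n = len(observations)
    return (observations[-1] / observations[0]) ** (1 / (n - 1)) - 1

# Hypothetical sales that grow 10% every year
sales = [100.0, 110.0, 121.0, 133.1]

rates = growth_rates(sales)
from_rates = math.prod(1 + g for g in rates) ** (1 / len(rates)) - 1
simplified = average_growth_rate(sales)
print(round(from_rates, 4), round(simplified, 4))
```

Both routes agree because the product of the (1 + g) terms telescopes to the ratio of the last observation to the first.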


For example, consider the sales for Adidas (in millions of euros) for the years 2005 through 2009:

Annual growth rates are:

The average growth rate using the simplified formula is:


3.4 Measures of Dispersion

Measures of dispersion gauge the variability of a data set.

Measures of dispersion include:

Range

Mean Absolute Deviation (MAD)

Variance and Standard Deviation

Coefficient of Variation (CV)

LO 3.4 Calculate and interpret the range, the mean absolute deviation, the variance, the standard deviation, and the coefficient of variation.



Range

It is the simplest measure.

It focuses on extreme values.

Range = Maximum Value − Minimum Value

Calculate the range using the data from the Metals and Income funds.



Mean Absolute Deviation (MAD)

MAD is an average of the absolute difference of each observation from the mean.

$$\text{Sample MAD} = \frac{\sum |x_i - \bar{x}|}{n}$$

$$\text{Population MAD} = \frac{\sum |x_i - \mu|}{N}$$



Calculate MAD using the data from the Metals fund.
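The range and the sample MAD could be computed along these lines in Python. The five returns are hypothetical round numbers (not the Metals fund data), chosen so the arithmetic is easy to follow by hand.

```python
def value_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)

def sample_mad(values):
    """Average absolute deviation of each observation from the sample mean."""
    n = len(values)
    x_bar = sum(values) / n
    return sum(abs(x - x_bar) for x in values) / n

# Hypothetical returns in percent; the mean is 10
returns = [-10.0, 0.0, 10.0, 20.0, 30.0]

r = value_range(returns)     # 30 - (-10)
mad = sample_mad(returns)    # (20 + 10 + 0 + 10 + 20) / 5
print(r, mad)
```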



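Variance and standard deviation

For a given sample,

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} \quad \text{and} \quad s = \sqrt{s^2}$$

For a given population,

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} \quad \text{and} \quad \sigma = \sqrt{\sigma^2}$$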



Calculate the variance and the standard deviation using the data from the Metals fund.
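A minimal Python sketch of the sample and population formulas follows. It uses hypothetical data rather than the Metals fund returns, which are not included in this transcript.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

def population_variance(values):
    """Sum of squared deviations from the population mean, divided by N."""
    big_n = len(values)
    mu = sum(values) / big_n
    return sum((x - mu) ** 2 for x in values) / big_n

# Hypothetical returns in percent; squared deviations sum to 1000
returns = [-10.0, 0.0, 10.0, 20.0, 30.0]

s2 = sample_variance(returns)            # 1000 / 4
sigma2 = population_variance(returns)    # 1000 / 5
s = math.sqrt(s2)
print(s2, sigma2, round(s, 2))
```

Python's standard `statistics` module offers `variance`, `pvariance`, `stdev`, and `pstdev` for the same quantities.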


Calculate the coefficient of variation (CV) using the data from the Metals fund and the Income fund.

$$\text{Metals fund: } CV = \frac{s}{\bar{x}} = \frac{37.13\%}{24.65\%} = 1.51$$

$$\text{Income fund: } CV = \frac{s}{\bar{x}} = \frac{11.07\%}{8.51\%} = 1.30$$
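As a quick check, the coefficient of variation can be computed in Python from the means and standard deviations reported in the synopsis of the investment decision. The function name is an illustrative choice.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = standard deviation / mean (a unit-free measure of relative dispersion)."""
    return std_dev / mean

# Summary statistics quoted in the slides (in percent)
cv_metals = coefficient_of_variation(37.13, 24.65)
cv_income = coefficient_of_variation(11.07, 8.51)
print(round(cv_metals, 2), round(cv_income, 2))
```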


Synopsis of Investment Decision

Mean and median returns for the Metals fund are 24.65% and 33.83%, respectively.

Mean and median returns for the Income fund are 8.51% and 7.34%, respectively.

The standard deviations for the Metals fund and the Income fund are 37.13% and 11.07%, respectively.

The coefficients of variation for the Metals fund and the Income fund are 1.51 and 1.30, respectively.


3.5 Mean-Variance Analysis and the Sharpe Ratio

Mean-variance analysis:

The performance of an asset is measured by its rate of return.

The rate of return may be evaluated in terms of its reward (mean) and risk (variance).

Higher average returns are often associated with higher risk.

The Sharpe ratio uses the mean and variance to evaluate risk.

LO 3.5 Explain mean-variance analysis and the Sharpe Ratio.


Sharpe Ratio

Measures the extra reward per unit of risk.

For an investment I, the Sharpe ratio is computed as:

$$\text{Sharpe Ratio} = \frac{\bar{x}_I - \bar{R}_f}{s_I}$$

where $\bar{x}_I$ is the mean return for the investment, $\bar{R}_f$ is the mean return for a risk-free asset, and $s_I$ is the standard deviation for the investment.


Sharpe Ratio Example

Compute the Sharpe ratios for the Metals and Income funds given the risk-free return of 4%.

Since 0.56 > 0.41, the Metals fund offers more reward per unit of risk as compared to the Income fund.
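The two ratios can be reproduced with a short Python sketch that uses the means and standard deviations reported earlier in the deck and the 4% risk-free return. The function and variable names are illustrative.

```python
def sharpe_ratio(mean_return, risk_free_return, std_dev):
    """Extra reward (mean return above the risk-free return) per unit of risk."""
    return (mean_return - risk_free_return) / std_dev

RISK_FREE = 4.0   # percent, as given in the example

sharpe_metals = sharpe_ratio(24.65, RISK_FREE, 37.13)
sharpe_income = sharpe_ratio(8.51, RISK_FREE, 11.07)
print(round(sharpe_metals, 2), round(sharpe_income, 2))
```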


3.6 Chebyshev's Theorem and the Empirical Rule

Chebyshev's Theorem

For any data set, the proportion of observations that lie within k standard deviations from the mean is at least $1 - 1/k^2$, where k is any number greater than 1.

Consider a large lecture class with 280 students. The mean score on an exam is 74 with a standard deviation of 8. At least how many students scored within 58 and 90?

With k = 2, we have $1 - 1/2^2 = 0.75$. At least 75% of 280 or 210 students scored within 58 and 90.
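The lecture-class calculation can be sketched in Python as follows. The function name is an illustrative choice; the numbers are those of the example.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k ** 2

mean_score, std_dev, class_size = 74, 8, 280
upper = 90

k = (upper - mean_score) / std_dev          # 90 is k standard deviations above the mean
bound = chebyshev_lower_bound(k)
min_students = bound * class_size
print(k, bound, min_students)
```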

LO 3.6 Apply Chebyshev's Theorem and the empirical rule.


The Empirical Rule:

Approximately 68% of all observations fall in the interval $\bar{x} \pm s$.

Approximately 95% of all observations fall in the interval $\bar{x} \pm 2s$.

Almost all observations fall in the interval $\bar{x} \pm 3s$.


Reconsider the example of the lecture class with 280 students with a mean score of 74 and a standard deviation of 8. Assume that the distribution is symmetric and bell-shaped. Approximately how many students scored within 58 and 90?

The score 58 is two standard deviations below the mean while the score 90 is two standard deviations above the mean.

Therefore about 95% of 280 students, or 0.95(280) = 266 students, scored within 58 and 90.
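For comparison with the Chebyshev bound, the empirical-rule estimate can be sketched in Python. The lookup table holds only the 68% and 95% figures stated on the slide, and the names are illustrative assumptions.

```python
# Approximate proportions for a symmetric, bell-shaped distribution
EMPIRICAL_RULE = {1: 0.68, 2: 0.95}

def interval(mean, std_dev, k):
    """The interval mean ± k standard deviations."""
    return (mean - k * std_dev, mean + k * std_dev)

mean_score, std_dev, class_size = 74, 8, 280

low, high = interval(mean_score, std_dev, 2)
approx_students = EMPIRICAL_RULE[2] * class_size
print(low, high, round(approx_students))
```

The estimate of about 266 students is larger than Chebyshev's guarantee of at least 210 because the empirical rule relies on the extra assumption that the distribution is bell-shaped.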


3.7 Summarizing Grouped Data

When data are grouped or aggregated, we use these formulas:

$$\text{Mean: } \bar{x} = \frac{\sum m_i f_i}{n}$$

$$\text{Variance: } s^2 = \frac{\sum (m_i - \bar{x})^2 f_i}{n - 1}$$

$$\text{Standard Deviation: } s = \sqrt{s^2}$$

where $m_i$ and $f_i$ are the midpoint and frequency of the ith class, respectively.

LO 3.7 Calculate the mean and the variance for grouped data.


Calculate the sample variance and the standard deviation.

First calculate the sum of the weighted squared differences from the mean.

Dividing this sum by (n − 1) = 36 − 1 = 35 yields a variance of 10,635 ($)².

The square root of the variance yields a standard deviation of $103.13.
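The grouped-data formulas can be sketched in Python as shown here. The frequency table for this example is not part of this transcript, so the class midpoints and frequencies below are assumptions, chosen so that n = 36 and the results agree with the variance and standard deviation quoted above.

```python
import math

def grouped_mean(midpoints, frequencies):
    """Sum of midpoint * frequency, divided by the total number of observations."""
    n = sum(frequencies)
    return sum(m * f for m, f in zip(midpoints, frequencies)) / n

def grouped_sample_variance(midpoints, frequencies):
    """Frequency-weighted squared deviations from the grouped mean, divided by n - 1."""
    n = sum(frequencies)
    x_bar = grouped_mean(midpoints, frequencies)
    weighted = sum((m - x_bar) ** 2 * f for m, f in zip(midpoints, frequencies))
    return weighted / (n - 1)

# ASSUMED class midpoints and frequencies (not taken from this transcript)
midpoints = [350, 450, 550, 650, 750]
frequencies = [4, 11, 14, 5, 2]

n = sum(frequencies)
x_bar = grouped_mean(midpoints, frequencies)
variance = grouped_sample_variance(midpoints, frequencies)
std_dev = math.sqrt(variance)
print(n, round(x_bar, 2), round(variance), round(std_dev, 2))
```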



Weighted Mean

Let $w_1, w_2, \ldots, w_n$ denote the weights of the sample observations $x_1, x_2, \ldots, x_n$ such that $w_1 + w_2 + \cdots + w_n = 1$, then

$$\bar{x} = \sum w_i x_i$$



A student scores 60 on Exam 1, 70 on Exam 2, and 80 on Exam 3. What is the student's average score for the course if Exams 1, 2, and 3 are worth 25%, 25%, and 50% of the grade, respectively?

Define $w_1 = 0.25$, $w_2 = 0.25$, and $w_3 = 0.50$.

$$\bar{x} = \sum w_i x_i = 0.25(60) + 0.25(70) + 0.5(80) = 72.50$$

The unweighted mean is only 70 as it does not incorporate the higher weight given to the score on Exam 3.
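The exam example can be reproduced with a few lines of Python. The function name is an illustrative choice, and the check that the weights sum to one mirrors the condition in the definition of the weighted mean.

```python
import math

def weighted_mean(values, weights):
    """Sum of w_i * x_i, for weights that sum to 1."""
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(w * x for w, x in zip(weights, values))

scores = [60, 70, 80]
weights = [0.25, 0.25, 0.50]

weighted = weighted_mean(scores, weights)
unweighted = sum(scores) / len(scores)
print(weighted, unweighted)
```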


3.8 Covariance and Correlation

The covariance ($s_{xy}$ or $\sigma_{xy}$) describes the direction of the linear relationship between two variables, x and y.

The correlation coefficient ($r_{xy}$ or $\rho_{xy}$) describes both the direction and strength of the relationship between x and y.

LO 3.8 Calculate and interpret the covariance and the correlation coefficient.


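The sample covariance $s_{xy}$ is computed as

$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

The population covariance $\sigma_{xy}$ is computed as

$$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$$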



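The sample correlation $r_{xy}$ is computed as

$$r_{xy} = \frac{s_{xy}}{s_x s_y}$$

The population correlation $\rho_{xy}$ is computed as

$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$

Note, $-1 \le r_{xy} \le +1$ or $-1 \le \rho_{xy} \le +1$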
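The sample covariance and correlation could be computed along these lines in Python. The paired data are hypothetical small integers (not the fund returns) so that each step can be checked by hand.

```python
import math

def sample_covariance(x, y):
    """Sum of cross-products of deviations from the means, divided by n - 1."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)

def sample_std(values):
    """Sample standard deviation (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def sample_correlation(x, y):
    """Covariance scaled by both standard deviations; always between -1 and +1."""
    return sample_covariance(x, y) / (sample_std(x) * sample_std(y))

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

s_xy = sample_covariance(x, y)     # cross-products sum to 6, so 6 / 4
r_xy = sample_correlation(x, y)
print(s_xy, round(r_xy, 4))
```

A positive covariance only indicates the direction of the linear relationship; the correlation, being unit-free, also indicates its strength.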



Let's calculate the covariance and the correlation coefficient for the Metals (x) and Income (y) funds.

Positive Relationship

Also recall: $\bar{x} = 24.65$, $s_x = 37.13$, $\bar{y} = 8.51$, $s_y = 11.07$



We use the following table for the calculations.

Covariance:

Correlation:


Some Excel Commands (LOs 3.1, 3.4, and 3.8)

Measures of central location and dispersion: select the relevant data, and choose Data > Data Analysis > Descriptive Statistics.

Covariance: For sample covariance, choose Formulas > Insert Function > COVARIANCE.S. For population covariance, use COVARIANCE.P.

Correlation: For sample correlation or population correlation, choose Formulas > Insert Function > CORREL.
