Statistics Lecture Notes Dr. Halil İbrahim CEBECİ Chapter 02 Graphical and Tabular Descriptive Techniques



Definitions

A variable is some characteristic of a population or sample, e.g. student grades. Variables are typically denoted with a capital letter: X, Y, Z, …

The values of a variable are the possible observations it can take, e.g. student marks (0..100).

Data are the observed values of a variable, e.g. student marks: {67, 74, 71, 83, 93, 55, 48}.

Statistics Lecture Notes – Chapter 02


Interval data

Real numbers, such as heights, weights, and prices. Interval data are also referred to as quantitative or numerical.

Arithmetic operations can be performed on interval data, so it is meaningful to talk about 2*Height, or Price + $1, and so on.


Nominal Data

The values of nominal data are categories, e.g. responses to questions about marital status, coded as:

Single = 1, Married = 2, Divorced = 3, Widowed = 4

Because the numbers are arbitrary, arithmetic operations don't make any sense (e.g. does Widowed ÷ 2 = Married?!).

Nominal data are also called qualitative or categorical.


Ordinal Data

Ordinal data appear to be categorical in nature, but their values have an order, a ranking, to them.

E.g. college course rating system: poor = 1, fair = 2, good = 3, very good = 4, excellent = 5

While it is still not meaningful to do arithmetic on these data (e.g. does 2*fair = very good?!), we can say things like:

excellent > poor or fair < very good

That is, order is maintained no matter what numeric values are assigned to each category.
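
The point about order surviving any recoding can be checked with a short sketch. The two coding schemes below are illustrative assumptions (any strictly increasing codes would do): the ordering comparisons agree under both, while an arithmetic statement such as 2*fair = very good holds under one coding and fails under the other.

```python
# Two hypothetical numeric codings of the same ordinal scale.
# Any strictly increasing assignment preserves the ranking.
coding_a = {"poor": 1, "fair": 2, "good": 3, "very good": 4, "excellent": 5}
coding_b = {"poor": 3, "fair": 7, "good": 20, "very good": 21, "excellent": 50}


def same_order(c1, c2):
    """Return True if both codings rank every pair of categories the same way."""
    cats = list(c1)
    return all((c1[a] < c1[b]) == (c2[a] < c2[b]) for a in cats for b in cats)


order_preserved = same_order(coding_a, coding_b)

# Arithmetic depends on the arbitrary codes, so it carries no meaning.
arith_a = 2 * coding_a["fair"] == coding_a["very good"]  # 2*2 == 4
arith_b = 2 * coding_b["fair"] == coding_b["very good"]  # 2*7 == 21

print(order_preserved, arith_a, arith_b)
```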


Time Series Data: Ordered data values observed over time

Cross Section Data: Data values observed at a fixed point in time


Ex2.1 - For each of the following examples of data, determine whether the data are quantitative, qualitative, or ranked.

a. the month of the highest sales for each firm in a sample (qualitative)

b. the department in which each of a sample of university professors teaches (qualitative)

c. the weekly closing price of gold throughout a year (quantitative)

d. the size of soft drink (large, medium, or small) ordered by a sample of customers in a restaurant (ranked)

e. the number of barrels of crude oil imported monthly by the United States (quantitative)


Tabular and Graphical Techniques for Nominal Data

Categorical Variables

• Frequency distribution
• Bar chart
• Pie chart
• Pareto diagram

Numerical Variables

• Line chart
• Frequency distribution
• Histogram and ogive
• Stem-and-leaf display
• Scatter plot


The only allowable calculation on nominal data is to count the frequency of each value of the variable.

We can summarize the data in a table, called a frequency distribution, that presents the categories and their counts.

A relative frequency distribution lists the categories and the proportion with which each occurs.
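
As a minimal sketch of both tables, the snippet below counts a short list of category codes and divides each count by the sample size. The list `codes` is a small illustrative sample, not a full data set, and the variable names are assumptions.

```python
from collections import Counter

# Small illustrative sample of nominal codes (1..5).
codes = [1, 1, 2, 4, 1, 4, 2, 4, 5, 2, 5, 4, 1, 1, 4, 2, 3]

# Frequency distribution: how often each category occurs.
frequency = Counter(codes)

# Relative frequency distribution: each count divided by the sample size.
n = len(codes)
relative = {category: count / n for category, count in frequency.items()}

for category in sorted(frequency):
    print(category, frequency[category], round(relative[category], 3))
```

Counting is the only calculation used here, which is consistent with the restriction on nominal data stated above.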


Ex2.2 – The student placement office at a university conducted a survey of last year's business school graduates to determine the general areas in which the graduates found jobs. The placement office intended to use the resulting information to help decide where to concentrate its efforts in attracting companies to campus to conduct job interviews. The areas of employment are:

1. Accounting
2. Finance
3. General Management
4. Marketing/Sales
5. Other


1 1 2 4 1 4 2 4 5 2 5 4 1 1 4 2 3

4 5 1 4 1 3 2 4 3 1 2 5 4 2 3 3 2

5 4 1 4 1 4 5 5 1 4 2 4 2 2 5 2 5

1 5 3 4 1 4 1 2 1 3 4 2 4 5 5 1 2

2 1 4 3 3 1 4 1 1 1 1 2 4 1 4 3 2

2 4 1 1 2 4 4 4 5 4 5 1 1 3 2 1 3

3 1 5 3 1 3 2 1 1 1 5 3 2 3 4 2 5

1 3 1 1 1 4 2 4 4 2 1 4 4 5 5 2 1

4 4 2 5 3 2 4 1 1 4 3 2 4 2 3 1 1

1 2 1 1 4 1 4 3 4 4 2 3 1 4 5 3 3

1 4 1 2 4 1 4 5 2 2 2 5 4 4 4 1 4

4 1 4 4 1 2 4 2 2 3 2 1 4 4 3 4 1

3 4 5 3 3 1 5 1 4 2 2 1 5 5 4 1 1

1 4 3 2 2 1 1 4 2 3 1 3 3 2 2 3

4 2 2 1 4 2 3 1 1 5 1 1 2 1 1 1


Area                 Frequency   Relative Frequency (%)
Accounting           73          28.9
Finance              52          20.6
General Management   36          14.2
Marketing/Sales      64          25.3
Other                28          11.1
Total                253         100

It is all the same information (based on the same data), just presented differently.


Graphical Techniques for Interval Data

There are several graphical methods that are used when the data are interval (i.e. numeric, non-categorical).

The most important of these graphical methods is the histogram.

The histogram is not only a powerful graphical technique used to summarize interval data, but it is also used to help explain probabilities.

1. Collect the data.
2. Create a frequency distribution for the data.
3. Draw the histogram.

Ex2.3 – Draw a histogram for the long-distance telephone bills data.


42.19 39.21 75.71 8.37 … 114.67 15.30

38.45 48.54 88.62 7.18 … 27.57 75.49

29.23 93.31 99.50 11.07 … 64.78 68.69

89.35 104.88 85.00 1.47 … 45.81 35.00

… … … … … … …

74.01 93.57 23.31 9.01 … 3.03 41.38

56.01 0 11.05 84.77 … 9.16 45.77


Class Limits   Frequency
0 to 15        71
15 to 30       37
30 to 45       13
45 to 60       9
60 to 75       10
75 to 90       18
90 to 105      28
105 to 120     14
Total          200

Histogram
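
Step 2, building the frequency distribution, can be sketched as follows. The list `bills` holds only a few values copied from the visible part of the Ex2.3 excerpt, not the full 200 observations, so the counts will not match the table. The boundary rule (a bill equal to an upper class limit falls in that class, and 0 falls in the first class) is an assumption, since the slides do not state one.

```python
import math

# A few bills copied from the visible part of the Ex2.3 excerpt (not all 200).
bills = [42.19, 39.21, 75.71, 8.37, 114.67, 15.30,
         56.01, 0, 11.05, 84.77, 9.16, 45.77]

WIDTH = 15      # class width used in the table
N_CLASSES = 8   # 0 to 15, 15 to 30, ..., 105 to 120


def class_index(x):
    """Index of the class containing x, assuming upper limits are inclusive."""
    if x == 0:
        return 0
    return math.ceil(x / WIDTH) - 1


counts = [0] * N_CLASSES
for bill in bills:
    counts[class_index(bill)] += 1

for i, count in enumerate(counts):
    print(f"{i * WIDTH} to {(i + 1) * WIDTH}: {count}")
```

The histogram is then a bar for each class whose height is the class count.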


Skewness

Measure of the degree of asymmetry of a frequency distribution

[Figure: distributions labelled skewed to left, skewed to right, and symmetric or unskewed]


Kurtosis

Measure of the flatness or peakedness of a frequency distribution

[Figure: distributions labelled relatively peaked, relatively flat, and normal]
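
The slides give no formula for skewness or kurtosis. As a sketch only, the code below uses one common moment-based definition, which is an assumption here (textbooks also use variants with small-sample corrections). Under this definition a symmetric distribution has skewness 0, a long right tail gives a positive value, a long left tail a negative value, and a normal distribution has kurtosis 3. The sample lists are hypothetical.

```python
def central_moment(data, k):
    """k-th central moment: the average of (x - mean) ** k."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)


def skewness(data):
    """Moment-based skewness: m3 / m2 ** 1.5."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def kurtosis(data):
    """Moment-based kurtosis: m4 / m2 ** 2 (equals 3 for a normal distribution)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


# Hypothetical samples chosen to show each shape.
symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 1, 2, 2, 3, 10]
left_tail = [-10, -3, -2, -2, -1, -1]

print(skewness(symmetric), skewness(right_tail), skewness(left_tail))
print(kurtosis(symmetric))
```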


Stem-and-Leaf Display (Stemplot)

A stemplot (or stem-and-leaf display) is a device for presenting quantitative data in a graphical format, similar to a histogram, to assist in visualizing the shape of a distribution.

Ex2.4 - The weights in pounds of a group of workers are as follows. Construct a stem-and-leaf display for these data.

173 165 171 175 188

183 177 160 151 169

162 179 145 171 175

168 158 186 182 162

154 180 164 166 157


The first step in constructing a stem-and-leaf display is to decide how to split each observation (weight) into two parts: a stem and a leaf. Here the stem is the first two digits and the leaf is the last digit. Thus, the first two weights (reading down the first column of the data) are split into a stem and a leaf as follows:

Weight   Stem   Leaf
173      17     3
183      18     3

Next, we consider each observation in turn and place its leaf in the same row as its stem, to the right of the vertical line. The resulting stem-and-leaf display shown below has grouped the 25 weights into five categories.

Stem | Leaf
14   | 5
15   | 4 8 1 7
16   | 2 8 5 0 4 6 9 2
17   | 3 7 9 1 5 1 5
18   | 3 0 6 2 8
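
The construction can be sketched in a few lines. The weights are those of Ex2.4, listed here column by column, which is the reading order that reproduces the leaf order of the display. The variable names are assumptions.

```python
from collections import defaultdict

# Weights from Ex2.4, read down the columns of the data table.
weights = [173, 183, 162, 168, 154,
           165, 177, 179, 158, 180,
           171, 160, 145, 186, 164,
           175, 151, 171, 182, 166,
           188, 169, 175, 162, 157]

stems = defaultdict(list)
for w in weights:
    stem, leaf = divmod(w, 10)  # stem = leading digits, leaf = last digit
    stems[stem].append(leaf)

for stem in sorted(stems):
    print(stem, "|", " ".join(str(leaf) for leaf in stems[stem]))
```

Sorting the leaves within each row (for example with `sorted(stems[stem])`) would give an ordered stem-and-leaf display, a variant the slides do not use.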


Ogive

An ogive is a graph of a cumulative relative frequency distribution.

We create an ogive in three steps:

1. Calculate relative frequencies.
2. Calculate cumulative relative frequencies by adding the current class' relative frequency to the previous class' cumulative relative frequency. (For the first class, the cumulative relative frequency is just its relative frequency.)
3. Draw the ogive.


Class Limits   Relative Frequency   Cumulative Rel. Freq.
0 to 15        71/200 = 0.355       0.355
15 to 30       37/200 = 0.185       0.540
30 to 45       13/200 = 0.065       0.605
45 to 60       9/200 = 0.045        0.650
60 to 75       10/200 = 0.050       0.700
75 to 90       18/200 = 0.090       0.790
90 to 105      28/200 = 0.140       0.930
105 to 120     14/200 = 0.070       1.000
Total          200/200 = 1.000

Ex2.5 – Draw an ogive for the data given in Ex2.3.

Ogive for Telephone Bills

What telephone bill value is at the 50th percentile?
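
A sketch of the calculation behind the ogive, using the class frequencies from the telephone bills table. Reading a percentile off the ogive is approximated here by linear interpolation within the class where the cumulative proportion is reached, on the assumption that the ogive is drawn with straight segments from (0, 0) through each upper class limit. The function name is hypothetical.

```python
from itertools import accumulate

upper_limits = [15, 30, 45, 60, 75, 90, 105, 120]
frequencies = [71, 37, 13, 9, 10, 18, 28, 14]
n = sum(frequencies)

# Cumulative relative frequency at each upper class limit.
cumulative = [c / n for c in accumulate(frequencies)]


def percentile_from_ogive(p):
    """Bill value at which the ogive reaches proportion p (linear interpolation)."""
    lower, prev_cum = 0, 0.0
    for upper, cum in zip(upper_limits, cumulative):
        if p <= cum:
            return lower + (p - prev_cum) / (cum - prev_cum) * (upper - lower)
        lower, prev_cum = upper, cum
    return upper_limits[-1]


median_bill = percentile_from_ogive(0.50)
print([round(c, 3) for c in cumulative])
print(round(median_bill, 2))
```

Under these assumptions the 50th percentile falls in the 15 to 30 class, at roughly 26.76.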


Describing the Relationship Between Two Variables

Tabular Method:

A contingency table (also called a cross-classification table) is used to describe the relationship between two nominal variables.

Ex2.5 – To help plan advertising campaigns, the advertising managers of the newspapers need to know which segments of the newspaper market are reading their papers. Discuss the relationship between newspaper and occupation using the contingency table given below.

                        Occupation
Newspaper   Blue Collar   White Collar   Professional   Total
G&M         27            29             33             89
Post        18            43             51             112
Star        38            21             22             81
Sun         37            15             20             72
Total       120           108            126            354
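
One way to discuss the relationship is to convert each row to percentages, so that newspapers with different readership sizes can be compared. The sketch below does that. The G&M white-collar count is taken as 29, the value consistent with the row total of 89 and the column total of 108. The variable names are assumptions.

```python
occupations = ["Blue Collar", "White Collar", "Professional"]

# Cell counts by newspaper, in the column order given in `occupations`.
counts = {
    "G&M":  [27, 29, 33],
    "Post": [18, 43, 51],
    "Star": [38, 21, 22],
    "Sun":  [37, 15, 20],
}

row_totals = {paper: sum(row) for paper, row in counts.items()}
column_totals = [sum(col) for col in zip(*counts.values())]

# Row percentages: the occupation mix of each newspaper's readers.
row_percent = {
    paper: [round(100 * c / row_totals[paper], 1) for c in row]
    for paper, row in counts.items()
}

for paper, percents in row_percent.items():
    print(paper, dict(zip(occupations, percents)))
```

If the rows show clearly different occupation mixes, the two variables are related. Similar rows would suggest no relationship.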


Graphic Method (Scatter Diagram):

We create a scatter diagram in three steps:

1) Collect the data.
2) Determine the independent variable (X – house size) and the dependent variable (Y – selling price).
3) Use Excel to create a "scatter diagram".

Ex2.6 - A real estate agent wanted to know to what extent the selling price of a home is related to its size. Use a graphical technique to describe the relationship between size and price.

Size 23 16 26 20 22 14 33 28 23 20 27 18

Price 315 229 355 261 234 216 308 306 289 204 265 195
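
The slides draw the diagram in Excel. As a numerical companion, and not part of the original example, the sketch below computes the sample correlation coefficient for the size and price data. Its sign indicates the direction of the relationship, and a value near +1 or -1 indicates that the points lie close to a straight line. The function name is an assumption.

```python
import math

size = [23, 16, 26, 20, 22, 14, 33, 28, 23, 20, 27, 18]
price = [315, 229, 355, 261, 234, 216, 308, 306, 289, 204, 265, 195]


def pearson_r(x, y):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(size, price)
print(round(r, 3))
```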

Scatter Diagram

It appears that there is in fact a relationship: the greater the house size, the greater the selling price.

Linearity and direction are the two concepts we are interested in when reading a scatter diagram.


Describing Time Series Data

Observations measured at the same point in time are called cross-sectional data. Observations measured at successive points in time are called time-series data.

Time-series data are graphed on a line chart, which plots the value of the variable on the vertical axis against the time periods on the horizontal axis.

Exercises

Q2.1 - Identify the type of data observed for each of the following variables.

a. the number of students in a statistics class
b. the student evaluations of the professor (1 = poor, 5 = excellent)
c. the political preferences of voters
d. the states in the United States of America
e. the size of a condominium (in square feet)

Q2.2 - The salaries (in hundreds of dollars) of a sample of 40 government employees are as follows:

208 160 175 334 228 211 179 354
265 215 191 239 298 226 220 260
173 263 226 165 252 422 284 232
225 348 290 180 300 200 245 204
256 281 230 275 158 224 315 217

a. Construct a stem-and-leaf display for these data. (When leaves consist of two digits, they should be separated from one another by commas.)
b. Construct a frequency distribution for these data.
c. Construct a relative frequency histogram for the data.
d. Construct an ogive for the data.

Q2.3 - The number of men and women who have received an M.B.A. degree from a particular university in each of five years is shown below:

Year   Men   Women
1988   74    12
1989   85    20
1990   90    32
1991   112   48
1992   128   67

a. Use a component bar chart to depict these data.
b. Use a line chart to depict these data.

Q2.4 - The manager of a large furniture store wanted to know if the number of ads influenced the number of customers. During the past eight months, she kept track of both figures, which are shown below. Construct a scatter diagram for these data, and describe the relationship between the number of ads and the number of customers.

Month   Number of Ads (x)   Number of Customers (y)
1       5                   528
2       12                  876
3       8                   653
4       6                   571
5       4                   556
6       15                  1058
7       10                  963
8       7                   719


