Presenting Data in Tables & Charts Organizing Numerical Data


Presenting Data in Tables & Charts

Organizing Numerical Data


Data with 20 or more observations should be organized

The Ordered Array: arranges raw data in order from the smallest observation to the largest observation.

Raw Data Arranged in an Ordered Array

5. Auto Cost($)     5. Auto Cost($)
(raw)               (ordered)
18000               1000
18000               2000
21000               2300
1000                5000
24000               6000
15000               9000
(blank)             10000
(blank)             12000
30000               12000
22000               14000
12000               15000
30000               16000
17000               17000
2300                18000
27000               18000
5000                18000
46000               20000
26500               21000

The Ordered Array makes it easy to identify:

• extreme values

• typical values

• range where the majority of values are concentrated
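As a minimal sketch of the idea, the snippet below sorts the auto costs that appear in the raw column of the slide above (blank responses left out) and reads the extremes and the range straight off the ordered array. The variable names are illustrative, not part of the original deck.

```python
# Raw auto costs ($) as read from the slide; blank responses are omitted.
auto_costs = [18000, 18000, 21000, 1000, 24000, 15000, 30000, 22000,
              12000, 30000, 17000, 2300, 27000, 5000, 46000, 26500]

# The ordered array: smallest observation first, largest last.
ordered_array = sorted(auto_costs)

# Extreme values sit at the two ends of the ordered array.
smallest, largest = ordered_array[0], ordered_array[-1]
data_range = largest - smallest

print(ordered_array)
print("smallest:", smallest, "largest:", largest, "range:", data_range)
```

Scanning the printed list also shows where values bunch together, which is the third benefit named above.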

Stem and Leaf Display:

shows where raw data clusters over a range of observations.

EXAMPLE:

The following data represent the weekly salary checks earned by a sample of eight secretaries:

$555 $490 $648 $832 $710 $590 $576 $623

First, put the values in ascending order. Then use the 100s digit as the stem and the 10s digit as the leaf. Either ignore the units digit, or round each value to the nearest 10 first and then use the 10s digit as the leaf.

$555 $490 $648 $832 $710 $590 $576 $623

4 | 9
5 | 5 7 9
6 | 2 4
7 | 1
8 | 3
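The display above can be reproduced with a short sketch. It assumes the "ignore the units column" option (truncation rather than rounding); the function name `stem_and_leaf` and its `stem_unit` parameter are illustrative choices, not something defined in the deck.

```python
from collections import defaultdict

def stem_and_leaf(values, stem_unit=100):
    """Group values by stem; the leaf is the next digit, with lower digits ignored."""
    leaf_unit = stem_unit // 10
    display = defaultdict(list)
    for value in sorted(values):            # ascending order first
        stem = value // stem_unit           # e.g. 555 -> 5
        leaf = (value % stem_unit) // leaf_unit   # e.g. 555 -> 5 (units dropped)
        display[stem].append(leaf)
    return dict(display)

salaries = [555, 490, 648, 832, 710, 590, 576, 623]
display = stem_and_leaf(salaries)
for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

Note that this sketch only prints stems that have at least one leaf; a fuller version would also list empty stems so gaps in the data stay visible.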

To further illustrate how we can organize data to present, analyze, and interpret findings, we will study data from a previous QBA questionnaire:

1) USD students’ auto costs

2) USD students’ maximum auto speeds

Raw Data from student questionnaire (partial)

Columns:
1. Age
2. Gender (0=M, 1=F)
3. Live on Campus (0=Off, 1=On)
4. Study / Week (hrs)
5. Auto Cost ($)
6. Alch bev / wk (#)
7. Sodas / wk (#)
8. Hrs. Paid / wk (hrs)
9. No. units this sem (#)
10. TV / video game / wk (hrs)

1     2    3    4      5         6     7      8      9      10
19    1    0    15     18000     4     5      13.5   14     2
18    1    1    33     18000     0     6.5    14     16     4
24    0    1    12     21000     10    3      4      23     4
20    1    1    12.5   1000      1     12.5   12     16.5   10
19    1    0    12.5   24000     2     0      20     17     5
20    0    0    10     15000     50    1      0      13     3
19    1    1    5      (blank)   21    0      18     16     14
19    1    1    20     (blank)   0     2      0      17     2
19    0    1    15     30000     2     13     0      16     20
21    0    1    10     22000     20    0      5      16     15

RAW ARRAY

5. Auto Cost($)     5. Auto Cost($)
(raw)               (ordered)
18000               1000
18000               2000
21000               2300
1000                5000
24000               6000
15000               9000
(blank)             10000
(blank)             12000
30000               12000

Stem & Leaf Auto Costs

Stem unit: 10000

0 | 1 2 2 5 6 9
1 | 0 2 2 4 5 6 7 8 8 8
2 | 0 1 2 4 5 7 7
3 | 0 0
4 | 5 6
5 |
6 | 0 6

Stem & Leaf MPH

Stem unit: 10

 7 | 0
 8 | 0
 9 | 0 0 5
10 | 0 0 0 0 0 0 0 5 5 8
11 |
12 | 0 0 0
13 | 0 0 5
14 | 0 5
15 | 0 0 0 2 7
16 | 0 0
17 |
18 | 0

And just for fun, let’s look at GPA

GPA

Stem unit: 1

2 | 4 5 5 6 8 8 9
3 | 0 0 0 0 1 1 2 2 3 3 3 3 3 4 4 4 5 5 6 7 8 8 9
4 | 0

How Else Can We Organize our Data?


Numerical Data

• Frequency Distribution

• Relative Frequency Distribution

• Percentage Frequency Distribution

• Cumulative Frequency Distribution

Frequency Distribution

Freq. Dist. MPH (22. fast car (mph)), intervals in steps of 10

interval   Frequency   Percentage
70         1           3.2%
80         1           3.2%
90         2           6.5%
100        8           25.8%
110        3           9.7%
120        3           9.7%
130        2           6.5%
140        2           6.5%
150        4           12.9%
160        4           12.9%
170        0           0.0%
180        1           3.2%

Freq. Dist. MPH (22. fast car (mph)), intervals in steps of 20

interval   Frequency   Percentage
60         0           0.0%
80         2           6.5%
100        10          32.3%
120        6           19.4%
140        4           12.9%
160        8           25.8%
180        1           3.2%

Frequency Distribution

Frequency Distribution for Numerical Data (5. Auto Cost($))

Interval   Frequency   Percentage
10000      7           24.14%
20000      10          34.48%
30000      8           27.59%
40000      0           0.00%
50000      2           6.90%
60000      1           3.45%
70000      1           3.45%

Selecting the Number of Classes

• There is no “correct” number of classes (K) to use in a frequency distribution.

• However, the frequency distribution should have at least 5 classes but no more than 20.

Caution!

• If you have too “FEW” classes (K), a large portion of your data lies in one class.

• However, if there are a number of empty classes, or too many classes with a frequency of 1 or 2, this may indicate too “MANY” classes (K).

Approximate Number of Classes in Frequency Distribution

# Observations       # Classes
Less than 50         5 – 7
50 – 200             7 – 9
200 – 500            9 – 10
500 – 1,000          10 – 11
1,000 – 5,000        11 – 13
5,000 – 50,000       13 – 17
More than 50,000     17 – 20
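The table can be turned into a small lookup, sketched below. The function name `suggested_classes` is hypothetical. The table lists 200, 500, 1,000 and 5,000 in two rows each, so this sketch assumes a count that falls exactly on one of those shared boundaries belongs to the lower row; the deck does not say which row is intended.

```python
def suggested_classes(n):
    """Return the (low, high) range of class counts the table suggests for n observations."""
    if n < 50:                       # "Less than 50"
        return (5, 7)
    brackets = [
        (200, (7, 9)),
        (500, (9, 10)),
        (1000, (10, 11)),
        (5000, (11, 13)),
        (50000, (13, 17)),
    ]
    for upper, classes in brackets:
        if n <= upper:               # assumption: shared boundaries go to the lower row
            return classes
    return (17, 20)                  # "More than 50,000"

# The MPH data in this deck has 31 observations, the auto costs 29.
print(suggested_classes(31), suggested_classes(29))
```

Both samples used in this deck fall in the first row, so 5 to 7 classes would be the suggested starting point for them.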

What do you gain by organizing your data in a Frequency Distribution?

Hint! From pages of raw data


Answer

• Reduce large numbers of data points to a workable number of classes and frequencies.

• Study the frequency distribution and learn a great deal about the shape of the data set.


Constructing a Frequency Distribution

• Gather the sample data

• Arrange data in an Ordered Array

• Select the number of classes to be used

• Determine class width: range / # of classes

• Determine the class limits for each class so that the distribution is easy to interpret

• Count the number of data values in each class (the raw frequencies)

• Determine the Relative Frequencies
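The steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the deck's own procedure: the function name `frequency_distribution` is hypothetical, classes start at the smallest observation, the width is range / # of classes rounded up, and the largest observation is placed in the last class. In practice the class limits would usually be adjusted to round numbers so the table is easier to interpret.

```python
import math

def frequency_distribution(data, num_classes):
    """Return (lower limit, upper limit, raw frequency, relative frequency) per class."""
    ordered = sorted(data)                              # ordered array
    low, high = ordered[0], ordered[-1]
    width = math.ceil((high - low) / num_classes)       # class width = range / # of classes
    counts = [0] * num_classes
    for value in ordered:
        # The maximum would land one class too far, so cap the index at the last class.
        index = min(int((value - low) // width), num_classes - 1)
        counts[index] += 1
    n = len(ordered)
    rows = []
    for i, count in enumerate(counts):
        lower = low + i * width
        rows.append((lower, lower + width, count, count / n))
    return rows

# Auto costs ($) as read from the raw column earlier in the deck (blanks omitted).
auto_costs = [18000, 18000, 21000, 1000, 24000, 15000, 30000, 22000,
              12000, 30000, 17000, 2300, 27000, 5000, 46000, 26500]

rows = frequency_distribution(auto_costs, 5)
for lower, upper, count, relative in rows:
    print(f"{lower:>6} to {upper:>6}: {count:2d}  ({relative:.3f})")
```

The sketch assumes the data are not all identical; a zero range would give a zero class width.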

Relative Frequency = (Raw frequency count in each class) / (Total number of observations (n))

Relative Frequency is essential for comparing two data sets, especially when they contain different numbers of observations.

To Convert Relative Frequency to Percent Frequency:

Multiply Relative Frequency × 100

Example

15. A doctor's office staff has studied the waiting times for patients who arrive at the office with a request for emergency service. The following data were collected over a one-month period (the waiting times are in minutes).

2 5 10 12 4 4 5 17 11 8 9 8 12 21 6 8 7 13 18 3

Use classes of 0 - 4, 5 - 9, and so on.

a. Show the frequency distribution.

b. Show the relative frequency distribution.

c. Show the cumulative frequency distribution.

d. Show the relative cumulative frequency distribution.
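One way to work parts a through d is sketched below, using the classes 0 - 4, 5 - 9, and so on. Integer division by the class width picks the class for each waiting time. The variable names are illustrative.

```python
waiting_times = [2, 5, 10, 12, 4, 4, 5, 17, 11, 8, 9, 8, 12, 21, 6, 8, 7, 13, 18, 3]
class_width = 5
n = len(waiting_times)
num_classes = max(waiting_times) // class_width + 1   # classes 0-4 up to 20-24

# a. Frequency distribution: count the values in each class.
frequency = [0] * num_classes
for minutes in waiting_times:
    frequency[minutes // class_width] += 1

# b. Relative frequency = raw frequency / total number of observations.
relative = [f / n for f in frequency]

# c. Cumulative frequency: running total of the raw frequencies.
cumulative = []
running = 0
for f in frequency:
    running += f
    cumulative.append(running)

# d. Relative cumulative frequency.
relative_cumulative = [c / n for c in cumulative]

print("class     freq  rel   cum  rel cum")
for i in range(num_classes):
    lower = i * class_width
    print(f"{lower:2d} - {lower + class_width - 1:2d}   {frequency[i]:2d}    "
          f"{relative[i]:.2f}  {cumulative[i]:2d}   {relative_cumulative[i]:.2f}")
```

Multiplying any relative frequency by 100 gives the corresponding percent frequency.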

How Else Can We Organize our Data?


Graphic Techniques to Describe Numerical Data

1) Histogram (continuous data)

2) Polygon

3) Ogive

4) Scattergram


Histogram

• Uni-modal

• Bi-modal

• Skewed:

i) right or positively skewed

ii) left or negatively skewed

Histogram Auto Costs

[Figure: Histogram of Auto Cost. x-axis: Midpoints ($); y-axis: Frequency.]

Histogram MPH

[Figure: Histogram. x-axis: Midpoints (mph); y-axis: Frequency.]
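Since the chart itself did not survive, a rough text version can be rebuilt from the MPH frequency distribution given earlier in the deck (the table with intervals in steps of 20). The sketch below is only an approximation of the original figure; `text_histogram` is a hypothetical helper.

```python
# Interval -> frequency, from the MPH frequency distribution (steps of 20).
mph_frequency = {60: 0, 80: 2, 100: 10, 120: 6, 140: 4, 160: 8, 180: 1}

def text_histogram(freq):
    """Return one line per interval, with one '#' per observation."""
    return [f"{interval:>4} | {'#' * count} {count}" for interval, count in freq.items()]

lines = text_histogram(mph_frequency)
print("\n".join(lines))

# The interval with the highest frequency (the modal class).
modal_interval = max(mph_frequency, key=mph_frequency.get)
print("modal interval:", modal_interval)
```

The bars peak at 100 and again at 160, which is the kind of two-humped shape the deck calls bi-modal.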

Negative or Left Skewed

Positive or Right Skewed

Quiz: Would incomes of employees in large firms tend to be positively or negatively skewed? Why?

Quiz: Do exam grades tend to be positively or negatively skewed? Why?

A Scatter Diagram

Graphs bivariate data to examine whether a relationship exists between two numerical variables.

Is there a relationship between the price of a USD student's auto and the maximum MPH that student has driven?


[Figure: Scatter Diagram Speed vs. Cost. x-axis: ($), 0 to 80000; y-axis: (MPH), 0 to 200.]

Is there a relationship between the number of alcoholic beverages consumed per week and the number of hours studied per week?

[Figure: Scattergram: Weekly Alcohol Amount vs Hours Studied. x-axis: Hours, 0 to 35; y-axis: No. of Drinks, 0 to 50.]

[Figure: $Wk Entertainment vs #Alcoholic Bev/wk. x-axis: Alcohol Bev/wk, 0 to 60; y-axis: $Wk Entertainment, 0 to 200.]

[Figure: GPA vs mph. x-axis: mph, 50 to 190; y-axis: GPA, 0 to 4.]

[Figure: MPH vs Alcoholic Beverages/wk. x-axis: MPH, 50 to 190; y-axis: Alcoholic Beverages/wk, 0 to 60.]

Alch bev / wk (#)   mph
0                   70
0                   80
4                   90
21                  95
8                   100
4                   100
15                  100
1                   100
7                   100
0                   100
2                   100
6                   105
12                  105
3                   108
50                  120
0                   120
2                   120
0                   130
10                  130
2                   135

[Figure: MPH vs Alcoholic Beverages/wk. x-axis: MPH, 50 to 190; y-axis: Alcoholic Beverages/wk, 0 to 60.]

[Figure: Scatter Diagram: Entertainment Vs. Cash On Hand. x-axis: Entertainment $ per Week, 0 to 200; y-axis: Typical $ On-Hand, 0 to 120.]

[Figure: Hours Paid Vs. Cash on Hand ($). x-axis: Hours per week work, 0 to 50; y-axis: Cash on Hand $, 0 to 120.]

Tables & Charts for Categorical Data

1) Summary Table: similar to Frequency Distribution.

2) Contingency Table for Crosstabulation of Bivariate Categorical Data.

3) Bar Chart: graphical representation of frequency of occurrence.

4) Pie Chart: graphical emphasis of proportion.

5) Pareto Diagram

6) Side-by-Side Bar Charts: for bivariate categorical data.

Presentation of Categorical Data

Summary Table

Coke Classic    Sprite          Pepsi-Cola
Diet Coke       Pepsi-Cola      Coke Classic
Pepsi-Cola      Coke Classic    Dr. Pepper
Diet Coke       Coke Classic    Pepsi-Cola
Coke Classic    Coke Classic    Sprite
Coke Classic    Pepsi-Cola
Dr. Pepper      Coke Classic
Diet Coke       Sprite
Pepsi-Cola      Dr. Pepper
Pepsi-Cola      Pepsi-Cola
Coke Classic    Diet Coke
Dr. Pepper      Pepsi-Cola
Sprite          Coke Classic
Coke Classic    Coke Classic
Diet Coke       Coke Classic
Coke Classic    Pepsi-Cola
Coke Classic    Dr. Pepper
Sprite          Coke Classic
Coke Classic    Diet Coke
Diet Coke       Pepsi-Cola
Coke Classic    Pepsi-Cola
Diet Coke       Pepsi-Cola
Coke Classic

Show:

a. Freq distribution

b. Relative Freq

c. Percent Freq

d. Bar graph

e. Pareto diagram

f. Pie chart.

Soft drink      Freq.   Relative Freq
Coke Classic    19      .38 or 38%
Diet Coke       8       .16 or 16%
Dr. Pepper      5       .10 or 10%
Pepsi-Cola      13      .26 or 26%
Sprite          5       .10 or 10%
Total           50      1.00 or 100%

Contingency Table for Crosstabulation of Bivariate Categorical Data

CROSSTABULATION OF QUALITY RATING AND MEAL PRICE FOR 300 LOS ANGELES RESTAURANTS

                  Meal Price
Quality Rating    $10 – 19   $20 – 29   $30 – 39   $40 – 49   Total
Good              42         40         2          0          84
Very Good         34         64         46         6          150
Excellent         2          14         28         22         66
Total             78         118        76         28         300

ROW PERCENTAGES FOR EACH QUALITY RATING CATEGORY

                  Meal Price
Quality Rating    $10 – 19   $20 – 29   $30 – 39   $40 – 49   Total
Good              50.0       47.6       2.4        0.0        100
Very Good         22.7       42.7       30.6       4.0        100
Excellent         3.0        21.2       42.4       33.4       100
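The row percentages follow from the counts by dividing each cell by its row total. The sketch below recomputes them; the variable names are illustrative. Plain rounding to one decimal gives 30.7 and 33.3 where the slide shows 30.6 and 33.4, presumably because the published figures were adjusted so that each row sums to exactly 100.

```python
price_ranges = ["$10-19", "$20-29", "$30-39", "$40-49"]
counts = {
    "Good":      [42, 40, 2, 0],
    "Very Good": [34, 64, 46, 6],
    "Excellent": [2, 14, 28, 22],
}

# Row percentage = 100 * cell count / row total, rounded to one decimal.
row_percentages = {}
for rating, row in counts.items():
    row_total = sum(row)
    row_percentages[rating] = [round(100 * cell / row_total, 1) for cell in row]

# Column totals, for checking against the "Total" row of the crosstabulation.
column_totals = [sum(column) for column in zip(*counts.values())]

for rating, percentages in row_percentages.items():
    print(f"{rating:<10}", percentages)
print("column totals:", column_totals)
```

Reading across a row shows how meal prices are distributed within one quality rating, which is what makes the ratings comparable despite their different row totals.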

Gender vs. Number of Alcoholic Drinks per Week

Crosstabs Gender vs Alcohol/wk

Count of Gender (0=M, 1=F)

                      Alcohol/wk
Gender (0=M, 1=F)     0     1     2     4     5     Grand Total
0                     2     9     4     0     1     16
1                     2     7     1     1     0     11
Grand Total           4     16    5     1     1     27

Contingency Table for Gender vs. Auto Costs

Crosstabs of Gender vs Auto Cost

Count of Gender (0=M, 1=F)

                      Auto Cost
Gender (0=M, 1=F)     1     2     3     5     6     7     Grand Total
0                     5     4     4     2     1     0     16
1                     2     6     2     0     0     1     11
Grand Total           7     10    6     2     1     1     27

Contingency Table of Gender vs. MPH

Crosstabs Gender vs mph

Count of Gender (0=M, 1=F)

                      mph
Gender (0=M, 1=F)     70-90   91-110   111-130   131-150   5     Grand Total
0                     0       3        3         3         7     16
1                     2       6        1         1         1     11
Grand Total           2       9        4         4         8     27

Contingency Table of Live on/off Campus by Gender

                          On (1) Off (0) Campus
Gender                    0     1     Totals
0 (Male)                  9     9     18
1 (Female)                5     10    15
Totals                    14    19    33

Gender     Auto      Alch bev    mph    Auto    Alcohol/wk   mph
0=M, 1=F   Cost($)   / wk (#)           Cost
0          15000     50          120    2       5            3
1          18000     0           70     2       0            1
0          21000     10          145    3       1            4
0          9000      1           160    1       1            5
0          26500     6           150    3       1            5
0          2000      4           100    1       1            2
1          18000     6           105    2       1            2
1          20000     15          100    2       2            2
1          27000     36          150    3       4            5
1          24000     2           135    3       1            4
0          12000     15          152    2       2            5
0          16000     6           160    2       1            5
0          30000     2           150    3       1            5
1          1000      1           100    1       1            2
0          5000      12          105    1       2            2
1          18000     4           90     2       1            1
0          46000     7           100    5       1            2
0          6000      12          180    1       2            5
0          25000     0           120    3       0            3
0          60000     6           157    6       1            5
0          17000     20          140    2       2            4
1          66000     2           120    7       1            3
1          12000     0           100    2       0            2
0          45000     0           130    5       0            3
1          2300      3           108    1       1            2
0          10000     10          130    1       1            4
1          14000     2           100    2       1            2

The following data are for 30 observations on two qualitative variables, X and Y. The categories for X are A, B, and C; the categories for Y are 1 and 2.

a. Develop a crosstabulation for the data with x in the rows and y in the columns.

b. Compute the row percentages.

c. Compute the column percentages.

d. What is the relationship, if any, between x and y?

Obs.  x  y      Obs.  x  y      Obs.  x  y
1     A  1      11    A  1      21    C  2
2     B  1      12    B  1      22    B  1
3     B  1      13    C  2      23    C  2
4     C  2      14    C  2      24    A  1
5     B  1      15    C  2      25    B  1
6     C  2      16    B  2      26    C  2
7     B  1      17    C  1      27    C  2
8     C  2      18    B  1      28    A  1
9     A  1      19    C  1      29    B  1
10    B  1      20    B  1      30    B  2
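Parts a through c can be worked with a short sketch. The 30 observations are typed in as two strings, one character per observation in order; the variable names are illustrative.

```python
from collections import Counter

# One character per observation, in order (Obs. 1 to 30).
x = "ABBCBCBCAB" "ABCCCBCBCB" "CBCABCCABB"
y = "1112121211" "1122221111" "2121122112"

# a. Crosstabulation: count each (x, y) pair; x in the rows, y in the columns.
pair_counts = Counter(zip(x, y))
table = {row: [pair_counts[(row, col)] for col in "12"] for row in "ABC"}

# b. Row percentages: each cell divided by its row total.
row_pct = {row: [round(100 * cell / sum(cells), 1) for cell in cells]
           for row, cells in table.items()}

# c. Column percentages: each cell divided by its column total.
col_totals = [sum(table[row][j] for row in "ABC") for j in range(2)]
col_pct = {row: [round(100 * table[row][j] / col_totals[j], 1) for j in range(2)]
           for row in "ABC"}

print("x   y=1  y=2")
for row in "ABC":
    print(row, " ", table[row][0], "  ", table[row][1])
print("row %:", row_pct)
print("column %:", col_pct)
```

The row percentages are the natural starting point for part d, since they show how y is distributed within each category of x.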

Side-by-side Bar Charts

[Figure: Crosstab Gender vs Auto $. Side-by-side bars showing Count of Gender (0=M, 1=F) for each Auto Cost category (1 to 7); series: Male, Female.]


Pareto Diagram

Separates the “vital few” from the “trivial many”.
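A Pareto diagram orders the categories from most to least frequent and tracks the cumulative percentage. As a sketch, the snippet below applies that ordering to the soft drink frequencies from the summary table earlier in the deck; the variable names are illustrative.

```python
soft_drink_counts = {
    "Coke Classic": 19,
    "Diet Coke": 8,
    "Dr. Pepper": 5,
    "Pepsi-Cola": 13,
    "Sprite": 5,
}
total = sum(soft_drink_counts.values())

# Sort categories by frequency, largest first (ties keep their original order).
ordered = sorted(soft_drink_counts.items(), key=lambda item: item[1], reverse=True)

# Accumulate the percentage of all purchases covered so far.
pareto_rows = []
running = 0
for name, count in ordered:
    running += count
    pareto_rows.append((name, count, 100 * running / total))

for name, count, cumulative_pct in pareto_rows:
    print(f"{name:<13} {count:2d}  {cumulative_pct:5.1f}%")
```

Here the first two brands already account for 64% of the purchases, which is the "vital few" reading of the diagram.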
