Frequency Distributions To accompany Hawkes Lesson 2.1 Original content by D.R.S. 1


Frequency Distributions

To accompany Hawkes Lesson 2.1
Original content by D.R.S.


Is your data Qualitative or Quantitative?

• Qualitative: it’s a category
  – Blood type
  – Model of car
  – Favorite fast food restaurant

• Quantitative: it’s a numerical measurement
  – Heart rate, beats per minute
  – Fuel efficiency, miles per gallon
  – Dollars spent on meal
  – My pain, on a scale from 1 to 10


Frequency Distribution for Categorical Data

Category            Frequency            Relative Frequency
(list the           (put the counts      (this category is what
categories here     of how many in       percent of the total
in this column)     this column)         sample size?)

(What order? Highest frequency down to lowest? Lowest to highest? Alphabetical? It’s your design decision.)


Categorical Frequency Distributions are the fuel for the “Family Feud”

(Photograph borrowed from some web site somewhere; I failed to record the exact source.)


Categorical (or, Qualitative) Frequency Distribution example

• “What state did you visit most recently?”

State visited (the category)    How many (the frequency)
Alabama                          71
California                       18
Florida                         138
New York                          7
South Carolina                   48
Tennessee                        27
Texas                            53
Other states                     70
TOTAL                           432
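A relative frequency column for this table comes from dividing each count by the total. The following is a minimal sketch using the counts above; the variable names and the choice to list states from highest to lowest frequency are illustrative assumptions, not part of the lesson.

```python
# Counts copied from the "state visited" table.
state_counts = {
    "Alabama": 71,
    "California": 18,
    "Florida": 138,
    "New York": 7,
    "South Carolina": 48,
    "Tennessee": 27,
    "Texas": 53,
    "Other states": 70,
}

total = sum(state_counts.values())  # the sample size

# One possible ordering: highest frequency first (a design decision).
ordered = sorted(state_counts.items(), key=lambda item: item[1], reverse=True)

for state, count in ordered:
    relative = count / total  # this category's share of the sample
    print(f"{state:<15} {count:>4} {relative:6.1%}")
print(f"{'TOTAL':<15} {total:>4}")
```

Sorting alphabetically instead would only require changing the `key` passed to `sorted`.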


Things we do with Categorical Frequency Distributions

• Sometimes we just leave them as tables of words and numbers for reference and interpretation.

• We draw pictures of them (future lessons).
  – Bar graphs
  – Pie charts
  – Cutesy repeated-icon variation of the bar graph


A famous categorical frequency distribution we will revisit later

Draw this 5-card poker hand                                 Frequency
Royal Flush                                                         4
Straight Flush (not including Royal Flush)                         36
Four of a Kind                                                    624
Full House                                                      3,744
Flush (not including Royal Flush or Straight Flush)             5,108
Straight (not including Royal Flush or Straight Flush)         10,200
Three of a Kind                                                54,912
Two Pair                                                      123,552
One Pair                                                    1,098,240
Something that’s not special at all                         1,302,540
Total                                                       2,598,960
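The ten frequencies should account for every possible 5-card hand, so their sum should equal the number of ways to choose 5 cards from 52. The sketch below checks that; the dictionary name is an illustrative choice.

```python
import math

# Frequencies copied from the poker hand table.
hand_counts = {
    "Royal Flush": 4,
    "Straight Flush": 36,
    "Four of a Kind": 624,
    "Full House": 3_744,
    "Flush": 5_108,
    "Straight": 10_200,
    "Three of a Kind": 54_912,
    "Two Pair": 123_552,
    "One Pair": 1_098_240,
    "Nothing special": 1_302_540,
}

total_hands = sum(hand_counts.values())

# Number of distinct 5-card hands from a 52-card deck (order ignored).
possible_hands = math.comb(52, 5)

print(total_hands, possible_hands)  # both are 2598960
```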


Quantitative Frequency Distribution (data are number measurements)

Classes: Each class is a low-to-high range of values. The low and high values are called the “Class Limits”.

Frequency: The frequency column gives a count of how many data values fit in the class.


Quantitative Frequency Distribution (data are number measurements)

Placement Test Score    How many applicants
0-9                     19
10-19                   38
20-29                   52
30-39                   71
40-49                   50
50 and above            28


About the Quantitative Frequency Distribution

• Instead of individual test score values, we GROUPED data into CLASSES

• Other names for “classes”: “bins”, “buckets”
• Each class is a low-to-high range of data values
• Each data value falls into exactly one class
• May be one or two “open-ended” classes
  – Like our “50 and above”


About the classes

• CLASS LIMITS are 10-19, 20-29, etc.
• Classes do not overlap!
• Classes are usually the same width.
• CLASS MIDPOINTS are like 14.5, 24.5, etc. (High plus low, divided by 2)


Class LIMITS vs. Class BOUNDARIES

• CLASS LIMITS are 10-19, 20-29, etc.
• CLASS BOUNDARIES split the “gap” between class limits: 9.5-19.5, 19.5-29.5, etc.
• “9.5-19.5” means 9.5 ≤ x < 19.5 (note ≤ vs. < )
  – All values between 9.5 and 19.5
  – Including the lower endpoint of 9.5
  – But excluding the upper endpoint of 19.5


A Cumulative Frequency column

Placement Test Score    How many applicants    Cumulative frequency
0-9                     19                      19
10-19                   38                      57
20-29                   52                     109
30-39                   71                     180
40-49                   50                     230
50 and above            28                     258

Running totals: 19; 19 + 38 = 57; 57 + 52 = 109; 109 + 71 = 180; 180 + 50 = 230; 230 + 28 = 258
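The cumulative column is a running total of the frequency column. As a small sketch (the list names are illustrative), Python's standard `itertools.accumulate` produces the same running totals:

```python
from itertools import accumulate

# Classes and frequencies from the placement test table.
classes = ["0-9", "10-19", "20-29", "30-39", "40-49", "50 and above"]
frequencies = [19, 38, 52, 71, 50, 28]

# Each entry is the sum of all frequencies up to and including that class.
cumulative = list(accumulate(frequencies))

for label, freq, cum in zip(classes, frequencies, cumulative):
    print(f"{label:<13} {freq:>3} {cum:>4}")
```

The last cumulative value always equals the total number of data values, here 258.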


A Relative Frequency column

Placement Test Score    How many applicants    Relative frequency
0-9                     19                      7.4%
10-19                   38                     14.7%
20-29                   52                     20.2%
30-39                   71                     27.5%
40-49                   50                     19.4%
50 and above            28                     10.9%
TOTAL                   258

Should total exactly 100%, but rounding might throw it off a wee bit.
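To see the rounding effect concretely, the sketch below recomputes the relative frequencies to one decimal place and adds up the rounded values. The names are illustrative, and rounding to one decimal is an assumption taken from how the table is displayed.

```python
# Frequencies from the placement test table.
frequencies = {
    "0-9": 19,
    "10-19": 38,
    "20-29": 52,
    "30-39": 71,
    "40-49": 50,
    "50 and above": 28,
}

total = sum(frequencies.values())  # 258 applicants

# Relative frequency as a percent, rounded to one decimal place.
percents = {label: round(100 * count / total, 1)
            for label, count in frequencies.items()}

# The unrounded shares sum to exactly 100%; the rounded ones may not.
rounded_sum = round(sum(percents.values()), 1)

for label, pct in percents.items():
    print(f"{label:<13} {pct:5.1f}%")
print(f"Sum of rounded percents: {rounded_sum}%")  # 100.1%
```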


Constructing a Frequency Distribution

1. How many classes should we have?
2. What class width should we use?
3. Find the class limits.
4. Sort your data, find the frequency of each class.

Adapted from textbook page 46 © HLS


Example of Construction

Using runners’ times from the Bunny Hop 5K in Cordele, March 31, 2012 – original data downloaded from a link at rungeorgia.com

Click link to pdf


1. How many classes?

• Between 5 classes and 20 classes is good
• How many data values do you have?
• One textbook suggests: if you have < 125 data values, use the square root of the number of data values
• The Bunny Hop race had 103 finishers.
• By that rule, we would have 10 or 11 classes.
• Let’s agree on 10 classes for this example.
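The square root rule can be checked in a couple of lines. This is only a sketch of the rule of thumb quoted above; taking the whole numbers on either side of the square root is an assumption about how "10 or 11" was reached.

```python
import math

n = 103  # number of finishers in the Bunny Hop race

root = math.sqrt(n)  # a little over 10

# The two whole-number class counts on either side of the square root.
candidates = (math.floor(root), math.ceil(root))

print(f"sqrt({n}) = {root:.2f}, so consider {candidates[0]} or {candidates[1]} classes")
```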


2. Choose a Class Width

The “range” is the highest data value minus the lowest data value.

Divide the range by the number of classes.
Then bump up to the next integer.
That’s just a starting point.


2. Choose a Class Width

High 66.8000 – Low 20.0167 = Range 46.7833
Divide the range by the number of classes: 46.7833 ÷ 10 = 4.67833
Then bump up to the next integer: class width is 5.
That’s just a starting point.

We like it; it sounds good. Nice “round” kind of a number for our readers.
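The class width calculation can be sketched as follows, using the high and low times quoted above. `Decimal` is used here so the subtraction is exact at four decimal places; the variable names are illustrative.

```python
import math
from decimal import Decimal

high = Decimal("66.8000")  # slowest time, in minutes
low = Decimal("20.0167")   # fastest time, in minutes
num_classes = 10

data_range = high - low               # highest value minus lowest value
raw_width = data_range / num_classes  # starting point before rounding

# Round up to a whole number. Note that math.ceil leaves an exact
# integer unchanged, so a further bump would be a manual decision.
class_width = math.ceil(raw_width)

print(f"range = {data_range}, raw width = {raw_width}, class width = {class_width}")
```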


3. Find the Class Limits

Start at what value for the first class?
• The lowest value is 20.0167
• Let’s start our first class at 20.0000
• Same number of decimal places as the data

The first class has a lower class limit of 20.0000
The lower limit of the next class is 25.0000
• Take the lower limit of 20.0000 from previous class
• + class width of 5 = 25.0000 lower limit for next class


3. Find the Class Limits - Lower

The first class has lower class limit = 20.0000
The next class has lower class limit = 25.0000
Etc. for the rest of the 10 classes:
• 30.0000, 35.0000, 40.0000, 45.0000 minutes, and
• 50.0000, 55.0000, 60.0000, 65.0000 minutes


3. Find the class limits - Upper

• The first class has lower class limit 20.0000
• The second class has lower class limit 25.0000
  – So the first class has upper class limit 24.9999
• The first class’s class limits: 20.0000 – 24.9999
• Then next comes 25.0000 – 29.9999
• Then 30.0000 – 34.9999, etc.
• All the way up through 65.0000-69.9999
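The lower and upper limits follow a simple pattern: each lower limit is the previous one plus the class width, and each upper limit sits one smallest unit (0.0001 at this precision) below the next lower limit. A minimal sketch, with illustrative names and `Decimal` used to keep four-decimal values exact:

```python
from decimal import Decimal

first_lower = Decimal("20.0000")  # chosen starting point for the first class
class_width = Decimal("5")
unit = Decimal("0.0001")          # smallest step at the data's precision
num_classes = 10

class_limits = []
for i in range(num_classes):
    lower = first_lower + i * class_width
    upper = lower + class_width - unit  # just below the next lower limit
    class_limits.append((lower, upper))

for lower, upper in class_limits:
    print(f"{lower} - {upper}")
```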


4. Count the frequency of each class

Time (minutes)       Frequency
20.0000-24.9999       9
25.0000-29.9999      26
30.0000-34.9999      23
35.0000-39.9999      14
40.0000-44.9999       7
45.0000-49.9999      11
50.0000-54.9999      10
55.0000-59.9999       0
60.0000-64.9999       2
65.0000-69.9999       1

If tallying unsorted data by hand, hash marks are useful.
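Tallying can also be done in code. The sketch below is only an illustration of the counting step: the list of times is hypothetical and is NOT the race data (only the lowest and highest values, 20.0167 and 66.8000, appear in this lesson), so its output does not reproduce the table above.

```python
import math

first_lower = 20.0  # lower limit of the first class
class_width = 5
num_classes = 10

# Hypothetical times in minutes, made up for illustration only.
sample_times = [20.0167, 24.9999, 25.0000, 27.35, 31.5, 31.75, 44.2, 66.8000]

frequencies = [0] * num_classes
for t in sample_times:
    # Whole number of class widths above the first lower limit
    # gives the position of the class this value falls into.
    index = math.floor((t - first_lower) / class_width)
    frequencies[index] += 1

for i, count in enumerate(frequencies):
    lower = first_lower + i * class_width
    print(f"{lower:.4f}-{lower + class_width - 0.0001:.4f} {count:>3}")
```

Because every value lands in exactly one index, the frequencies always add up to the number of data values.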


Class Limits and Class Boundaries

Class Limits         Class Boundaries
20.0000-24.9999      19.99995 – 24.99995
25.0000-29.9999      24.99995 – 29.99995
30.0000-34.9999      29.99995 – 34.99995
Etc.                 Etc.
55.0000-59.9999      54.99995 – 59.99995
60.0000-64.9999      59.99995 – 64.99995
65.0000-69.9999      64.99995 – 69.99995


Class Limits and Class Boundaries

• What to do with the gap between the class limits of adjacent classes?

• Limits 25.0000-29.9999 and 30.0000-34.9999
• There’s a gap between 29.99990 and 30.00000
• Midway between them is 29.99995
• Class Boundaries extend to that midpoint
• 24.99995 – 29.99995 and 29.99995 – 34.99995


Class Boundaries

• Example: Class Limits 25.0000 – 29.9999
• Class Boundaries 24.99995 – 29.99995
• This means 24.99995 ≤ x < 29.99995
• Note: including the lower boundary (≤)
• But not including the upper boundary (<)
• Because classes must never overlap
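As a rough sketch of this rule, the boundaries can be computed by moving each limit half of the gap outward, and class membership can be tested with a "≤ on the left, < on the right" comparison. The function names `class_boundaries` and `in_class` are hypothetical helpers written for this example.

```python
from decimal import Decimal


def class_boundaries(lower_limit, upper_limit, unit):
    """Extend the class limits by half of the gap between adjacent classes."""
    half_gap = unit / 2
    return lower_limit - half_gap, upper_limit + half_gap


def in_class(x, lower_boundary, upper_boundary):
    """Lower boundary is included, upper boundary is excluded."""
    return lower_boundary <= x < upper_boundary


unit = Decimal("0.0001")  # gap between 29.9999 and 30.0000
low_b, high_b = class_boundaries(Decimal("25.0000"), Decimal("29.9999"), unit)

print(low_b, high_b)
print(in_class(Decimal("24.99995"), low_b, high_b))  # lower boundary: included
print(in_class(Decimal("29.99995"), low_b, high_b))  # upper boundary: excluded
```

The half-open comparison is what guarantees that a value sitting exactly on a boundary belongs to only one class.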


Class Midpoints
(Upper Limit + Lower Limit) ÷ 2

Class Limits         Class Midpoints
20.0000-24.9999      22.49995
25.0000-29.9999      27.49995
30.0000-34.9999      32.49995
Etc.                 Etc.
55.0000-59.9999      57.49995
60.0000-64.9999      62.49995
65.0000-69.9999      67.49995

Adjacent midpoints are one class width apart.
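The midpoint formula and the "one class width apart" observation can both be checked with a short sketch. Only the first three classes are listed here to keep it brief; the names are illustrative.

```python
from decimal import Decimal

# First three classes from the race-time distribution.
class_limits = [
    (Decimal("20.0000"), Decimal("24.9999")),
    (Decimal("25.0000"), Decimal("29.9999")),
    (Decimal("30.0000"), Decimal("34.9999")),
]

# Midpoint = (upper limit + lower limit) / 2
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

# Distance between each midpoint and the next one.
gaps = [later - earlier for earlier, later in zip(midpoints, midpoints[1:])]

print(midpoints)
print(gaps)  # each gap equals the class width of 5
```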


Class Limits, Boundaries, and Midpoints for the Placement Test

• It’s easier with whole numbers as class limits

Class Limits    Frequency    Class Boundaries    Class Midpoint
0-9             19           -0.5 – 9.5          4.5
10-19           38           9.5 – 19.5          14.5
20-29           52           19.5 – 29.5         24.5
30-39           71           29.5 – 39.5         34.5
40-49           50           39.5 – 49.5         44.5
50 +            28           49.5 and up         None? Or 54.5?


Excel Tools

• Link: The Excel FREQUENCY function.
• Link: The Excel COUNTIF function.
  – Need to add info about COUNTIFS function.
• Excel’s “Histogram” tool also generates frequency distributions (discussed in the Histogram lesson)