dale-harrell
Categorical Data!
• Frequency Table
– Records the totals (counts or percentages of observations) for each category. If percentages are shown, it is a relative frequency table.
• Distribution
– Shows the possible values of the variable and the relative frequency of each value.
Types of Graphs
• Picture – Remember…
– Ex: Stem & Leaf
• Bar Chart – counts for comparison
Segmented Bar Chart
Contingency Table
• Displays counts and sometimes percentages of individuals falling into named categories on two or more variables. Shows how individuals are distributed along each variable contingent on the value of another variable.
– Marginal distribution: distribution of either variable by itself (last row or column).
– Conditional distribution: distribution of one variable for those individuals satisfying a condition on another variable. Considers a smaller group isolated from the whole.
Independence???
• Independence…think impact!
– Variables are independent (one variable does not influence the value of the other variable) if the conditional distribution of one variable is the same for each category of the other.
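The definitions above can be sketched in a few lines of Python. This is only an illustration: the counts and the category names ("yes", "no", "group A", "group B") are hypothetical and chosen so that the two conditional distributions come out equal, which is what independence would look like.

```python
# Hypothetical counts (not from the slides): rows are one variable's
# categories, columns are the other variable's categories.
table = {
    "yes": {"group A": 30, "group B": 60},
    "no":  {"group A": 20, "group B": 40},
}

def marginal_rows(table):
    """Marginal distribution of the row variable: row totals / grand total."""
    grand = sum(sum(row.values()) for row in table.values())
    return {r: sum(row.values()) / grand for r, row in table.items()}

def conditional_on_column(table, col):
    """Distribution of the row variable among individuals in one column."""
    col_total = sum(row[col] for row in table.values())
    return {r: row[col] / col_total for r, row in table.items()}

marginal = marginal_rows(table)
cond_a = conditional_on_column(table, "group A")
cond_b = conditional_on_column(table, "group B")

print("marginal:           ", marginal)
print("given group A:      ", cond_a)
print("given group B:      ", cond_b)
```

Because both conditional distributions match each other (and the marginal), knowing the group tells us nothing about the yes/no variable in this made-up table.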
Contingency Tables
• A contingency table allows us to look at two categorical variables together.
• It shows how individuals are distributed along each variable, contingent on the value of the other variable.
– Example: we can examine the class of ticket and whether a person survived the Titanic:
Contingency Tables (cont.)
• The margins of the table, both on the right and on the bottom, give totals and the frequency distributions for each of the variables.
• Each frequency distribution is called a marginal distribution of its respective variable.
– The marginal distribution of Survival is:
Contingency Tables (cont.)
• Each cell of the table gives the count for a combination of values of the two variables.
– For example, the second cell in the crew column tells us that 673 crew members died when the Titanic sank.
Conditional Distributions
• A conditional distribution shows the distribution of one variable for just the individuals who satisfy some condition on another variable.
– The following is the conditional distribution of ticket Class, conditional on having survived:
Conditional Distributions (cont.)
– The following is the conditional distribution of ticket Class, conditional on having perished:
Python Eggs
                  Cold   Neutral   Hot
Number of eggs      27        56   104
Number hatched      16        38    75
1) What percent of cold eggs hatched?
2) Neutral eggs hatched represent what percent of the total?
3) Are temperature and eggs hatched independent?
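One way to work the three questions is sketched below, using the counts from the Python Eggs table. The variable names are illustrative, and question 2 is read here as "percent of all eggs", which is an assumption; if "total" means total hatched eggs, the denominator would change.

```python
# Counts copied from the Python Eggs table.
eggs = {"Cold": 27, "Neutral": 56, "Hot": 104}
hatched = {"Cold": 16, "Neutral": 38, "Hot": 75}

total_eggs = sum(eggs.values())

# 1) Percent of cold eggs that hatched (conditional on temperature = Cold).
pct_cold_hatched = 100 * hatched["Cold"] / eggs["Cold"]

# 2) Hatched neutral eggs as a percent of ALL eggs
#    (assumed meaning of "the total").
pct_neutral_of_total = 100 * hatched["Neutral"] / total_eggs

# 3) Hatch rate within each temperature; under independence these
#    conditional percentages would be roughly the same.
hatch_rate = {temp: 100 * hatched[temp] / eggs[temp] for temp in eggs}

print(f"Cold eggs hatched: {pct_cold_hatched:.1f}%")
print(f"Neutral hatched, share of all eggs: {pct_neutral_of_total:.1f}%")
for temp, rate in hatch_rate.items():
    print(f"Hatch rate, {temp}: {rate:.1f}%")
```

The hatch rates differ across the three temperatures, so in this sample the two variables do not look independent.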
Smoking by Students & Parents
                        Neither Parent Smokes   One Parent Smokes   Both Parents Smoke
Student Does Not Smoke                   1168                1823                 1380
Student Smokes                            188                 416                  400
                        Neither Parent Smokes   One Parent Smokes   Both Parents Smoke   Total
Student Does Not Smoke                   1168                1823                 1380    4371
Student Smokes                            188                 416                  400    1004
Total                                    1356                2239                 1780    5375
The Marginal Distribution
• What percent of students have no parent who smokes?
– 1356/5375
• Students who smoke and whose parents both smoke make up what percent of the total?
– 400/5375
• Among students with one parent who smokes, what percent smoke?
– 416/2239
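As a quick arithmetic check, the three fractions can be evaluated directly. The counts come from the smoking table with totals; the variable names are only illustrative.

```python
# Counts taken from the Smoking by Students & Parents table (with totals).
grand_total = 5375          # all students
neither_parent_total = 1356 # column total: neither parent smokes
both_parents_student_smokes = 400
one_parent_total = 2239     # column total: one parent smokes
one_parent_student_smokes = 416

# Marginal percentage: uses a column total over the grand total.
pct_no_parent_smokes = 100 * neither_parent_total / grand_total

# Joint percentage: a single cell over the grand total.
pct_both_and_student = 100 * both_parents_student_smokes / grand_total

# Conditional percentage: a cell over its own column total.
pct_student_given_one_parent = 100 * one_parent_student_smokes / one_parent_total

print(f"No parent smokes: {pct_no_parent_smokes:.1f}%")
print(f"Both parents and student smoke: {pct_both_and_student:.1f}%")
print(f"Student smokes, given one parent smokes: {pct_student_given_one_parent:.1f}%")
```

The only difference between the three answers is the choice of denominator: grand total for marginal and joint percentages, a column total for the conditional one.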
Is student smoking preference independent of parent choice?
                        Neither Parent Smokes   One Parent Smokes   Both Parents Smoke   Total
Student Does Not Smoke           1168 (86.1%)        1823 (81.4%)         1380 (77.5%)    4371
Student Smokes                    188 (13.9%)         416 (18.6%)          400 (22.5%)    1004
Total                                    1356                2239                 1780    5375
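The percentages in this table are conditional distributions of student smoking within each parent category. A minimal sketch of how they could be recomputed from the counts follows; the dictionary layout is an illustrative choice, not part of the slides.

```python
# Counts from the table: for each parent category,
# (students who do not smoke, students who smoke).
counts = {
    "Neither Parent Smokes": (1168, 188),
    "One Parent Smokes": (1823, 416),
    "Both Parents Smoke": (1380, 400),
}

# Conditional distribution of student smoking given the parent category:
# divide each cell by its own column total.
conditional = {}
for parents, (does_not_smoke, smokes) in counts.items():
    column_total = does_not_smoke + smokes
    conditional[parents] = {
        "Student Does Not Smoke": round(100 * does_not_smoke / column_total, 1),
        "Student Smokes": round(100 * smokes / column_total, 1),
    }

for parents, dist in conditional.items():
    print(parents, dist)

# Under independence the "Student Smokes" percentage would be about the
# same in every column; here it rises as more parents smoke.
smoking_rates = [dist["Student Smokes"] for dist in conditional.values()]
```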
                                 Year
Response of families             1991   2001
Both work full time               142    131
1 full time, 1 part time          274    244
1 full time, 1 work from home     152    173
1 full time, 1 homemaker          396    416
No Opinion                         51     51
Is there evidence of a change in working patterns in families???
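One way to approach the question is to compare the conditional distribution of responses within each year. The sketch below does only that arithmetic on the counts from the table; it is not a formal test, and the names are illustrative.

```python
# Counts from the table: response -> (1991 count, 2001 count).
responses = {
    "Both work full time": (142, 131),
    "1 full time, 1 part time": (274, 244),
    "1 full time, 1 work from home": (152, 173),
    "1 full time, 1 homemaker": (396, 416),
    "No Opinion": (51, 51),
}

# Column totals for each year.
totals = [sum(pair[i] for pair in responses.values()) for i in (0, 1)]

# Conditional distribution of responses within each year, in percent.
distribution = {
    label: tuple(100 * count / total for count, total in zip(pair, totals))
    for label, pair in responses.items()
}

print(f"{'Response':<32}{'1991':>8}{'2001':>8}")
for label, (pct_1991, pct_2001) in distribution.items():
    print(f"{label:<32}{pct_1991:>7.1f}%{pct_2001:>7.1f}%")
```

If the two columns of percentages are close, the sample shows little change in working patterns; larger gaps between them point toward a change.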
Conditional distribution by hospital size
How I get to school
Simpson’s Paradox
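• Occurs when averages “pooled” together AND averages taken across different groups appear to be contradictory: a comparison that holds within every group can reverse when the groups are combined.
Example: Baseball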
Player   Overall          vs LHP          vs RHP
A        33/103 (.320)    28/81 (.346)    5/22 (.227)
B        45/151 (.298)    12/32 (.375)    33/119 (.277)
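The reversal in the baseball table can be checked with a short sketch. The hits and at-bats are taken from the table; the data layout and names are illustrative.

```python
# (hits, at-bats) for each player against left- and right-handed pitchers.
records = {
    "A": {"LHP": (28, 81), "RHP": (5, 22)},
    "B": {"LHP": (12, 32), "RHP": (33, 119)},
}

def average(hits, at_bats):
    """Batting average: hits divided by at-bats."""
    return hits / at_bats

# Averages within each group (vs LHP, vs RHP).
by_split = {
    player: {split: average(*pair) for split, pair in splits.items()}
    for player, splits in records.items()
}

# Pooled averages: add hits and at-bats across groups BEFORE dividing.
overall = {
    player: average(
        sum(hits for hits, _ in splits.values()),
        sum(at_bats for _, at_bats in splits.values()),
    )
    for player, splits in records.items()
}

a_better_overall = overall["A"] > overall["B"]
b_better_in_every_split = all(
    by_split["B"][split] > by_split["A"][split] for split in ("LHP", "RHP")
)

print(f"Overall: A {overall['A']:.3f}, B {overall['B']:.3f}")
for split in ("LHP", "RHP"):
    print(f"vs {split}: A {by_split['A'][split]:.3f}, B {by_split['B'][split]:.3f}")
print("A better overall:", a_better_overall)
print("B better in every split:", b_better_in_every_split)
```

Player B has the higher average against both kinds of pitcher, yet Player A has the higher pooled average, because most of A's at-bats came against left-handed pitchers, where both players hit better.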