linda-hicks
Ogive, Stem and Leaf plot & Crosstabulation
Ogive
An ogive is a graph of a cumulative distribution.
The data values are shown on the horizontal axis.
Shown on the vertical axis are the:
• cumulative frequencies, or
• cumulative relative frequencies, or
• cumulative percent frequencies
Ogive
The frequency (one of the above) of each class is plotted as a point.
The plotted points are connected by straight lines.
Example of an Ogive
[Figure: Ogive with Cumulative Percent Frequencies. Horizontal axis: Parts Cost ($), 50 to 110. Vertical axis: Cumulative Percent Frequency, 20 to 100. Marked point: (89.5, 76).]
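The construction described above (one point per class, joined by straight lines) can be sketched in a few lines of Python. The class frequencies below are illustrative assumptions, not data from the slides; they were chosen only so that the cumulative percent frequency at 89.5 comes out to 76, matching the point marked on the figure. The function name ogive_points is hypothetical.

```python
def ogive_points(upper_limits, frequencies):
    """Return (x, cumulative percent frequency) pairs for an ogive."""
    total = sum(frequencies)
    running = 0
    points = []
    for x, f in zip(upper_limits, frequencies):
        running += f                      # cumulative frequency so far
        points.append((x, 100 * running / total))
    return points

# ASSUMED class upper limits and frequencies (not given in the slides).
upper_limits = [59.5, 69.5, 79.5, 89.5, 99.5, 109.5]
frequencies = [2, 13, 16, 7, 7, 5]

points = ogive_points(upper_limits, frequencies)
for x, pct in points:
    print(f"{x:6.1f}  {pct:5.1f}%")
```

Plotting these pairs and connecting them with straight lines gives the ogive; using running itself, or running / total, on the vertical axis gives the cumulative frequency or cumulative relative frequency version.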
Stem and Leaf Plots
1. Sort data ***
2. Round data (if necessary)
3. Create TWO new columns (stem and leaf)
4. Put “stem” in one column and “leaves” in another.
5. Format the leaves column to be left-aligned.
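The steps above can be sketched in Python as well. This is a minimal sketch under assumptions: the data values are made up for illustration, values are non-negative, the stem is taken to be everything but the last digit, and the function name stem_and_leaf is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Build stem-and-leaf rows: sort, round, split into stem and leaf."""
    groups = defaultdict(list)
    # Round first, then sort, so the leaves within each stem are in order.
    for v in sorted(round(x) for x in values):
        stem, leaf = divmod(v, 10)        # stem = tens and above, leaf = units
        groups[stem].append(leaf)
    width = max(len(str(s)) for s in groups)
    # Right-justify the stems so the leaves column is left-aligned.
    return [
        f"{stem:>{width}} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(groups.items())
    ]

# ASSUMED sample data, for illustration only.
data = [52, 68.4, 71, 75, 62, 91, 79.6, 88, 104, 65]
lines = stem_and_leaf(data)
print("\n".join(lines))
```

Stems that have no leaves are simply omitted in this sketch; a fuller version might print them as empty rows so that gaps in the data stay visible.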
What we have done
Summary of variables
Qualitative:
Numeric: frequency, relative frequency, percentage frequency, cumulative frequency, cumulative relative frequency, cumulative percentage frequency
Graphical: bar (column) chart, pie chart
What we have done II
Quantitative:
Numeric: frequency, relative frequency, percentage frequency, cumulative frequency, cumulative relative frequency, cumulative percentage frequency
Graphical: histogram, stem and leaf, ogive, boxplot
Another thing of interest to statisticians
Relationship between variables
Variables: quantitative, qualitative
Relationship between variables
Qualitative vs. qualitative: Crosstabulation
Qualitative vs. quantitative: ANOVA etc.
Quantitative vs. quantitative: Regression etc.
Example of Crosstab
Sum of count factor b
factor a 1 2 3 4 5 Grand Total
1 10 20 36 32 51 149
2 69 87 52 32 12 252
3 14 62 32 53 83 244
4 69 91 92 20 25 297
Grand Total 162 260 212 137 171 942
What does a crosstab tell us?
Cross Tabs: a tabular summary of data for two variables
Marginal Distributions/Probabilities: totals/probabilities in the margins of the cross tabulation.
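As a quick check of how the margins arise, the row totals, column totals and grand total of the factor a by factor b example can be recomputed from the cell counts. This is a plain-Python sketch; the variable names are assumptions made for illustration.

```python
# Cell counts from the example crosstab: rows are factor a (1-4),
# columns are factor b (1-5).
counts = [
    [10, 20, 36, 32, 51],
    [69, 87, 52, 32, 12],
    [14, 62, 32, 53, 83],
    [69, 91, 92, 20, 25],
]

row_totals = [sum(row) for row in counts]          # margin for factor a
col_totals = [sum(col) for col in zip(*counts)]    # margin for factor b
grand_total = sum(row_totals)

# Marginal probabilities: each total divided by the grand total.
marginal_a = [t / grand_total for t in row_totals]
marginal_b = [t / grand_total for t in col_totals]

print(row_totals, col_totals, grand_total)
```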
An example that makes more sense
Sum of Count      Win
Ginobli Played    N     Y     Total
N                 16    22    38
Y                 12    32    44
Total             28    54    82
Marginal Distributions
Ginobli’s game play distribution: Played: 44; Missed: 38
Spurs’ season breakdown: Win: 54; Lose: 28
Marginal Probabilities
Ginobli’s chance of playing: 44/82
Spurs’ winning percentage: 54/82
Row (column) total / grand total
Some other Probabilities
Conditional Probability: cell count / row (column) total
E.g. Spurs’ winning percentage when Ginobli played: 32/44
Joint Probability: cell count / grand total
E.g. The percentage of games that Spurs won and Ginobli played: 32/82
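The marginal, conditional and joint probabilities for the Spurs example can be computed directly from the four cell counts. This is a small sketch; the dictionary layout and variable names are assumptions chosen for illustration.

```python
# Cell counts keyed by (Ginobli played, Spurs won), taken from the crosstab.
counts = {
    ("N", "N"): 16, ("N", "Y"): 22,
    ("Y", "N"): 12, ("Y", "Y"): 32,
}

grand_total = sum(counts.values())
played_total = counts[("Y", "N")] + counts[("Y", "Y")]   # row total
win_total = counts[("N", "Y")] + counts[("Y", "Y")]      # column total

p_played = played_total / grand_total                # marginal
p_win = win_total / grand_total                      # marginal
p_win_given_played = counts[("Y", "Y")] / played_total   # conditional
p_win_and_played = counts[("Y", "Y")] / grand_total      # joint

print(f"P(played)       = {p_played:.3f}")
print(f"P(win)          = {p_win:.3f}")
print(f"P(win | played) = {p_win_given_played:.3f}")
print(f"P(win & played) = {p_win_and_played:.3f}")
```

The only difference between the conditional and the joint probability is the denominator: the row (or column) total for the conditional, the grand total for the joint.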
Crosstab
Example cont.
Components of the table
         Column 1         Column 2         Column 3         Total
Row 1    Cell count       Cell count       Cell count       Row 1 total
Row 2    Cell count       Cell count       Cell count       Row 2 total
Row 3    Cell count       Cell count       Cell count       Row 3 total
Total    Column 1 total   Column 2 total   Column 3 total   Grand Total
Probabilities From Crosstab
Marginal, joint and conditional
Marginal probability: row (column) total / grand total
Joint probability: cell count / grand total
Conditional probability: cell count / row (column) total
What is the percentage of all patients who received a CHEAP positive test result? Is this a joint, marginal, or conditional percentage?
Marginal: 37.0%
Out of all the patients given the CHEAP test, what is the percentage of false negatives? Is this a joint, marginal, or conditional percentage?
Joint: 2% (this is where CHEAP is negative, but Actual SFI is positive)
What is the percentage of subjects diagnosed as positive by BOTH tests? Is this a joint, marginal, or conditional percentage?
Joint: 30%.
What is the percentage of correct diagnosis?
(30 + 61)/100 = 91%. That is the correct diagnosis of positive AND negative.
If someone gets the test result and it is “positive”, what is the chance that this person really has the disease?
30/37 = 81% (conditional)
That means there is still a 19% chance that this person does not have the disease.
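The answers above can be reproduced from a 2x2 table of counts. The table itself does not appear in this text, so the counts below are a reconstruction from the stated answers, assuming 100 patients: 30 positive on both, 2 false negatives, 61 negative on both, and 7 false positives (inferred as 37 - 30). Treat the layout and names as assumptions.

```python
# RECONSTRUCTED counts keyed by (CHEAP result, actual SFI status).
# The 7 false positives are inferred from 37% CHEAP-positive minus 30% both-positive.
table = {
    ("pos", "pos"): 30, ("pos", "neg"): 7,
    ("neg", "pos"): 2,  ("neg", "neg"): 61,
}

n = sum(table.values())
cheap_positive = table[("pos", "pos")] + table[("pos", "neg")]

marginal_cheap_positive = 100 * cheap_positive / n           # marginal
joint_false_negative = 100 * table[("neg", "pos")] / n       # joint
joint_both_positive = 100 * table[("pos", "pos")] / n        # joint
correct_diagnosis = 100 * (table[("pos", "pos")] + table[("neg", "neg")]) / n
# Conditional: actually positive, given a positive CHEAP result.
p_disease_given_positive = 100 * table[("pos", "pos")] / cheap_positive

print(marginal_cheap_positive, joint_false_negative, joint_both_positive)
print(correct_diagnosis, round(p_disease_given_positive))
```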
Check this one out!
Homicide convictions in the state of Florida between 1976 and 1980. Did the convicted person get a death sentence? Is there a racial bias?
Convicted person   YES   NO    Total   (% YES)
White              39    308   347     11.2%
Black              32    345   377     8.5%
Total              71    653   724     9.8%
The other side of the story ii.
Table for those cases involving white victims
Convicted person   YES   NO    Total   (% YES)
White              39    279   318     12.3%
Black              29    121   150     19.3%
Total              68    400   468     14.5%
The other side of the story i.
Table for those cases involving black victims
Convicted person   YES   NO    Total   (% YES)
White              0     29    29      0%
Black              3     224   227     1.3%
Total              3     253   256     1.2%
This is what we call Simpson’s Paradox in statistics
Simpson’s paradox refers to the reversal in the direction of an X versus Y relationship when controlling for a third variable Z.
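The reversal can be checked numerically with the Florida counts from the tables above. This is a short sketch; the data structure and names are assumptions for illustration. Here X is the race of the convicted person, Y is the death sentence, and Z is the race of the victim.

```python
# (death sentence YES, NO) by victim race and race of convicted person.
cases = {
    "white victim": {"White": (39, 279), "Black": (29, 121)},
    "black victim": {"White": (0, 29),   "Black": (3, 224)},
}

def pct_yes(yes, no):
    """Percentage of convictions that ended in a death sentence."""
    return 100 * yes / (yes + no)

# Rates within each victim group (controlling for Z).
within = {
    victim: {race: pct_yes(*cells) for race, cells in rows.items()}
    for victim, rows in cases.items()
}

# Rates after pooling the two victim groups (ignoring Z).
overall = {}
for race in ("White", "Black"):
    yes = sum(cases[victim][race][0] for victim in cases)
    no = sum(cases[victim][race][1] for victim in cases)
    overall[race] = pct_yes(yes, no)

for victim, rates in within.items():
    print(victim, {race: round(r, 1) for race, r in rates.items()})
print("overall", {race: round(r, 1) for race, r in overall.items()})
```

In the pooled table the White rate is higher, yet within each victim group the Black rate is higher, which is exactly the reversal the definition describes.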
Another Example
Numbers of flights on time and delayed for two airlines at five airports in June 1991.
Airline                 On Time   Delayed   Delay %
Alaska Airline          3274      501       13.3%
American West Airline   6438      787       10.9%
Another Example (contd)
                 Alaska Airline                 American West Airline
Airport          On Time   Delayed   Delay %    On Time   Delayed   Delay %
L.A.             497       62        11.1%      694       117       14.4%
Phoenix          221       12        5.2%       4840      415       7.9%
San Diego        212       20        8.6%       383       65        14.5%
San Francisco    503       102       16.9%      320       129       28.7%
Seattle          1841      305       14.2%      201       61        23.3%
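The same kind of check works for the airline data: compute the delay percentage at each airport and then for all five airports pooled. This is a sketch using the per-airport counts listed above; the names are assumptions for illustration.

```python
# (on time, delayed) per airport and airline, June 1991.
flights = {
    "L.A.":          {"Alaska": (497, 62),   "American West": (694, 117)},
    "Phoenix":       {"Alaska": (221, 12),   "American West": (4840, 415)},
    "San Diego":     {"Alaska": (212, 20),   "American West": (383, 65)},
    "San Francisco": {"Alaska": (503, 102),  "American West": (320, 129)},
    "Seattle":       {"Alaska": (1841, 305), "American West": (201, 61)},
}

def delay_pct(on_time, delayed):
    """Delayed flights as a percentage of all flights."""
    return 100 * delayed / (on_time + delayed)

per_airport = {
    airport: {airline: delay_pct(*cells) for airline, cells in row.items()}
    for airport, row in flights.items()
}

overall = {}
for airline in ("Alaska", "American West"):
    on_time = sum(flights[a][airline][0] for a in flights)
    delayed = sum(flights[a][airline][1] for a in flights)
    overall[airline] = delay_pct(on_time, delayed)

for airport, rates in per_airport.items():
    print(f"{airport:14s}", {k: round(v, 1) for k, v in rates.items()})
print("All airports  ", {k: round(v, 1) for k, v in overall.items()})
```

Alaska has the lower delay percentage at every one of the five airports but the higher percentage overall, because most of its flights are at Seattle, where delays are common, while most American West flights are at Phoenix, where delays are rare.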