Ogive, Stem and Leaf plot & Crosstabulation

Ogive

An ogive is a graph of a cumulative distribution.

The data values are shown on the horizontal axis.

Shown on the vertical axis are the:

• cumulative frequencies, or

• cumulative relative frequencies, or

• cumulative percent frequencies

Ogive

The frequency (one of the above) of each class is plotted as a point.

The plotted points are connected by straight lines.

[Figure: Example of an Ogive. Ogive with cumulative percent frequencies. Horizontal axis: Parts Cost ($), 50 to 110. Vertical axis: Cumulative Percent Frequency, 20 to 100. Labeled point: (89.5, 76).]
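
The construction described above (one point per class, joined by straight lines) can be sketched in a few lines of Python. The class frequencies below are illustrative assumptions, not data from the slides; they were chosen so that the labeled point (89.5, 76) appears. The function name ogive_points is hypothetical.

```python
from itertools import accumulate

def ogive_points(lower_boundary, upper_boundaries, frequencies):
    """Return (x, cumulative percent) points for an ogive."""
    total = sum(frequencies)
    # Running totals: cumulative frequency at the end of each class.
    cumulative = accumulate(frequencies)
    # The ogive starts at 0% at the lower boundary of the first class.
    points = [(lower_boundary, 0.0)]
    for x, count in zip(upper_boundaries, cumulative):
        points.append((x, 100.0 * count / total))
    return points

# Assumed frequencies for cost classes 50-59, 60-69, ..., 100-109.
frequencies = [2, 13, 16, 7, 7, 5]
upper_boundaries = [59.5, 69.5, 79.5, 89.5, 99.5, 109.5]
points = ogive_points(49.5, upper_boundaries, frequencies)

for x, pct in points:
    print(f"{x:6.1f}  {pct:5.1f}%")
```

Plotting these points in order and connecting them with straight lines gives the ogive; any plotting tool will do.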

Stem and Leaf Plots

1. Sort data

2. Round data (if necessary)

3. Create TWO new columns (stem and leaf)

4. Put “stem” in one column and “leaves” in another.

5. Format the leaves column to be left-aligned.
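
The same steps can be sketched in Python instead of spreadsheet columns. This is a minimal sketch assuming non-negative values, with the tens as the stem and the units digit as the leaf; the sample scores and the function names are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens) to its sorted list of leaves (units)."""
    ordered = sorted(values)                     # step 1: sort
    rounded = [int(round(v)) for v in ordered]   # step 2: round to whole numbers
    plot = defaultdict(list)
    for v in rounded:                            # steps 3-4: split stem and leaf
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

def format_plot(plot):
    """Render the plot with left-aligned leaves (step 5)."""
    lines = []
    for stem in sorted(plot):
        leaves = " ".join(str(leaf) for leaf in plot[stem])
        lines.append(f"{stem:>3} | {leaves}")
    return "\n".join(lines)

# Hypothetical sample data.
scores = [91.2, 72, 68, 85, 97, 64, 75.6, 72, 81, 69]
plot = stem_and_leaf(scores)
print(format_plot(plot))
```

Rounding after sorting keeps the order intact, because rounding never reverses the order of two values.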

What we have done

Summary of variables

Qualitative:

Numeric: frequency, relative frequency, percentage frequency, cumulative frequency, cumulative relative frequency, cumulative percentage frequency

Graphical: Bar (column) chart, pie chart

What we have done II

Quantitative:

Numeric: frequency, relative frequency, percentage frequency, cumulative frequency, cumulative relative frequency, cumulative percentage frequency

Graphical: histogram, stem and leaf, ogive, boxplot

Another thing of interest to statisticians

Relationship between variables

Variables: quantitative, qualitative

Relationship between variables

Qualitative vs. qualitative: Crosstabulation

Qualitative vs. quantitative: ANOVA etc.

Quantitative vs. quantitative: Regression etc.

Example of Crosstab

Sum of count (rows: factor a; columns: factor b)

factor a         1     2     3     4     5   Grand Total
1               10    20    36    32    51           149
2               69    87    52    32    12           252
3               14    62    32    53    83           244
4               69    91    92    20    25           297
Grand Total    162   260   212   137   171           942

What does a crosstab tell us?

Cross Tabs: a tabular summary of data for two variables

Marginal Distributions/Probabilities: totals/probabilities in the margins of the cross tabulation.
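
As a rough illustration, the margins of the factor a by factor b example can be recomputed from the cell counts alone. The helper name margins is hypothetical.

```python
def margins(table):
    """Return (row totals, column totals, grand total) for a table of counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return row_totals, column_totals, grand_total

# Cell counts: rows are factor a = 1..4, columns are factor b = 1..5.
counts = [
    [10, 20, 36, 32, 51],
    [69, 87, 52, 32, 12],
    [14, 62, 32, 53, 83],
    [69, 91, 92, 20, 25],
]
row_totals, column_totals, grand_total = margins(counts)

print("row totals:   ", row_totals)
print("column totals:", column_totals)
print("grand total:  ", grand_total)

# Marginal probabilities: each total divided by the grand total.
row_probabilities = [total / grand_total for total in row_totals]
```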

An example that makes more sense

Sum of Count (rows: Ginobili played; columns: win)

Ginobili Played    Win: N    Win: Y    Total
N                      16        22       38
Y                      12        32       44
Total                  28        54       82

Marginal Distributions

Ginobili’s game play distribution

Played: 44; Missed: 38

Spurs’ season breakdown

Win: 54; Lose: 28

Marginal Probabilities

Ginobili’s chance of playing: 44/82

Spurs’ winning percentage: 54/82

In general: row (column) total / grand total

Some other Probabilities

Conditional probability: cell count / row (column) total

E.g. the Spurs’ winning percentage when Ginobili played: 32/44

Joint probability: cell count / grand total

E.g. the percentage of games that the Spurs won and Ginobili played: 32/82
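
The three kinds of probability can be checked against the played/win table with exact fractions. This is a small sketch; the variable names are hypothetical, and the counts are the ones from the table above.

```python
from fractions import Fraction

# Cell counts keyed by (played, won).
counts = {("N", "N"): 16, ("N", "Y"): 22, ("Y", "N"): 12, ("Y", "Y"): 32}

grand_total = sum(counts.values())
played_total = counts[("Y", "N")] + counts[("Y", "Y")]   # row total
wins_total = counts[("N", "Y")] + counts[("Y", "Y")]     # column total

# Marginal: row (column) total / grand total.
p_played = Fraction(played_total, grand_total)
p_win = Fraction(wins_total, grand_total)

# Conditional: cell count / row total.
p_win_given_played = Fraction(counts[("Y", "Y")], played_total)

# Joint: cell count / grand total.
p_win_and_played = Fraction(counts[("Y", "Y")], grand_total)

print(f"P(played)       = {float(p_played):.3f}")
print(f"P(win)          = {float(p_win):.3f}")
print(f"P(win | played) = {float(p_win_given_played):.3f}")
print(f"P(win, played)  = {float(p_win_and_played):.3f}")
```

Note that the joint probability equals the conditional probability times the marginal: (32/44) x (44/82) = 32/82.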

Crosstab

Example cont.

Components of the table

         Column 1          Column 2          Column 3          Total
Row 1    Cell count        Cell count        Cell count        Row 1 total
Row 2    Cell count        Cell count        Cell count        Row 2 total
Row 3    Cell count        Cell count        Cell count        Row 3 total
Total    Column 1 total    Column 2 total    Column 3 total    Grand total

Probabilities From Crosstab

Marginal, joint and conditional

Marginal probability: row (column) total / grand total

Joint probability: cell count / grand total

Conditional probability: cell count / row (column) total

What is the percentage of all patients who received a CHEAP positive test result? Is this a joint, marginal, or conditional percentage?

Marginal: 37.0%

Out of all the patients given the CHEAP test, what is the percentage of false negatives? Is this a joint, marginal, or conditional percentage?

Joint: 2% (this is where CHEAP is negative, but actual SFI is positive)

What is the percentage of subjects diagnosed as positive by BOTH tests? Is this a joint, marginal, or conditional percentage?

Joint: 30%.

What is the percentage of correct diagnosis?

(30 + 61)/100 = 91%. That is the correct diagnosis of positive AND negative.

If someone gets the test result and it is “positive”, what is the chance that this person really has the disease?

30/37 = 81% (conditional)

That means there is still a 19% chance that this person does not have the disease.
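
The answers above can be reproduced from the four cell percentages. Only three cells are stated in the answers (30, 2 and 61); the fourth, CHEAP positive with actual negative, is assumed here to be 7, derived from the 37% marginal minus the 30% joint. The variable names are hypothetical.

```python
# Percent of all patients, keyed by (CHEAP result, actual SFI status).
cells = {
    ("pos", "pos"): 30,
    ("pos", "neg"): 7,    # assumption: derived as 37 - 30
    ("neg", "pos"): 2,    # false negatives
    ("neg", "neg"): 61,
}

total = sum(cells.values())

# Marginal: all CHEAP-positive results.
cheap_positive = cells[("pos", "pos")] + cells[("pos", "neg")]

# Joint cells where both tests agree, summed: correct diagnoses.
correct = cells[("pos", "pos")] + cells[("neg", "neg")]

# Conditional: actually positive, given a CHEAP-positive result.
p_disease_given_positive = cells[("pos", "pos")] / cheap_positive

print(f"CHEAP positive (marginal): {cheap_positive}%")
print(f"Correct diagnosis:         {correct}%")
print(f"P(disease | CHEAP pos):    {p_disease_given_positive:.0%}")
```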

Check this one out! Homicide convictions in the state of Florida between 1976 and 1980. Did the convicted person get a death sentence? Is there a racial bias?

Convicted person    YES     NO    Total    % YES
White                39    308      347    11.2%
Black                32    345      377     8.5%
Total                71    653      724     9.8%

The other side of the story ii.

Table for those cases involving white victims

Convicted person    YES     NO    Total    % YES
White                39    279      318    12.3%
Black                29    121      150    19.3%
Total                68    400      468    14.5%

The other side of the story i.

Table for those cases involving black victims

Convicted person    YES     NO    Total    % YES
White                 0     29       29     0%
Black                 3    224      227     1.3%
Total                 3    253      256     1.2%

This is what we call Simpson’s Paradox in statistics

Simpson’s paradox refers to the reversal in the direction of an X versus Y relationship when controlling for a third variable Z.
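
The reversal can be checked directly on the Florida tables. In this sketch X is the race of the convicted person, Y is the death sentence, and Z is the victim's race; the function and variable names are hypothetical.

```python
def death_rate(yes, no):
    """Percent of convictions that ended in a death sentence."""
    return 100.0 * yes / (yes + no)

# (death sentence YES, NO) by victim's race, then by race of the convicted person.
by_victim = {
    "white victims": {"White": (39, 279), "Black": (29, 121)},
    "black victims": {"White": (0, 29), "Black": (3, 224)},
}

# Rates within each stratum (controlling for Z).
stratum_rates = {
    victim: {race: death_rate(*cell) for race, cell in rows.items()}
    for victim, rows in by_victim.items()
}

# Counts aggregated over victims (ignoring Z).
combined = {}
for rows in by_victim.values():
    for race, (yes, no) in rows.items():
        prev_yes, prev_no = combined.get(race, (0, 0))
        combined[race] = (prev_yes + yes, prev_no + no)
overall_rates = {race: death_rate(*cell) for race, cell in combined.items()}

for victim, rates in stratum_rates.items():
    print(victim, {race: round(r, 1) for race, r in rates.items()})
print("combined", {race: round(r, 1) for race, r in overall_rates.items()})
```

Within each victim group the rate is higher for Black convicted persons, yet the combined rate is higher for White convicted persons.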

Another Example

Numbers of flights on time and delayed for two airlines at five airports in June 1991.

      Alaska Airlines                America West Airlines
On Time  Delayed  Delay %      On Time  Delayed  Delay %
   3274      501    13.3%         6438      787    10.9%

Another Example (contd)

                      Alaska Airlines                America West Airlines
                On Time  Delayed  Delay %      On Time  Delayed  Delay %
L.A.                497       62    11.1%          694      117    14.4%
Phoenix             221       12     5.2%         4840      415     7.9%
San Diego           212       20     8.6%          383       65    14.5%
San Francisco       503      102    16.9%          320      129    28.7%
Seattle            1841      305    14.2%          201       61    23.3%
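
A short sketch that recomputes the delay percentages per airport and overall from the flight counts, where delay % is delayed / (on time + delayed). The dictionary layout and names are hypothetical.

```python
# (on time, delayed) per airport and airline, June 1991.
flights = {
    "L.A.":          {"Alaska": (497, 62),   "America West": (694, 117)},
    "Phoenix":       {"Alaska": (221, 12),   "America West": (4840, 415)},
    "San Diego":     {"Alaska": (212, 20),   "America West": (383, 65)},
    "San Francisco": {"Alaska": (503, 102),  "America West": (320, 129)},
    "Seattle":       {"Alaska": (1841, 305), "America West": (201, 61)},
}

def delay_pct(on_time, delayed):
    """Delayed flights as a percent of all flights."""
    return 100.0 * delayed / (on_time + delayed)

totals = {"Alaska": [0, 0], "America West": [0, 0]}
per_airport = {}
for airport, airlines in flights.items():
    per_airport[airport] = {name: delay_pct(*cell) for name, cell in airlines.items()}
    for name, (on_time, delayed) in airlines.items():
        totals[name][0] += on_time
        totals[name][1] += delayed

overall = {name: delay_pct(*cell) for name, cell in totals.items()}

for airport, rates in per_airport.items():
    print(f"{airport:14s} Alaska {rates['Alaska']:5.1f}%   America West {rates['America West']:5.1f}%")
print(f"{'Overall':14s} Alaska {overall['Alaska']:5.1f}%   America West {overall['America West']:5.1f}%")
```

Alaska has the lower delay percentage at every airport but the higher one overall, because most of its flights are at Seattle, where delays are common, while most America West flights are at Phoenix, where they are rare.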
