Chapter 2 Organizing the Data

Page 1: Chapter 2 Organizing the Data

Chapter 2 Organizing the Data

Page 2: Chapter 2 Organizing the Data

Frequency Distributions of Nominal Data

• Formulas and statistical techniques used by social researchers to:
  • Organize raw data
  • Test hypotheses
• Raw data is often difficult to synthesize
• Most common types of distributions are:
  • Frequency
  • Percentage
  • Combination

Page 3: Chapter 2 Organizing the Data

Nominal Data and Distributions

Responses of Young Boys to Removal of Toy

Response of Child          f
Cry                       25
Express Anger             15
Withdraw                   5
Play with another toy      5
                      N = 50

Frequency distribution of nominal data consists of two columns:

• Left column has characteristics (e.g., Response of Child)

• Right column has frequency (f)

Page 4: Chapter 2 Organizing the Data

Comparing Distributions

• Comparisons clarify and add information

Response to Removal of Toy by Gender of Child

                           Gender of Child
Response of Child          Male    Female
Cry                          25        14
Express Anger                15         1
Withdraw                      5         2
Play with another toy         5         8
Total                        50        25

Page 5: Chapter 2 Organizing the Data

Proportions and Percentages

• Proportion – compares the number of cases in a given category with the total size of the distribution

$$P = \frac{f}{N}$$

Formula for proportion

• Most prefer percentages to show relative size.
• Percentage – the frequency per 100 cases

$$\% = (100)\,\frac{f}{N}$$

Formula for percentage
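The two formulas can be checked against the earlier table of boys' responses to removal of a toy (N = 50). The following is a minimal sketch; the function names `proportion` and `percentage` are illustrative choices, not names used in the slides.

```python
def proportion(f: int, n: int) -> float:
    """Proportion P = f / N for one category."""
    if n <= 0:
        raise ValueError("N must be positive")
    return f / n


def percentage(f: int, n: int) -> float:
    """Percentage = (100) f / N for one category."""
    if n <= 0:
        raise ValueError("N must be positive")
    return 100 * f / n


# Frequencies from the table "Responses of Young Boys to Removal of Toy".
responses = {
    "Cry": 25,
    "Express Anger": 15,
    "Withdraw": 5,
    "Play with another toy": 5,
}
n_total = sum(responses.values())  # N = 50

for label, f in responses.items():
    p = proportion(f, n_total)
    pct = percentage(f, n_total)
    print(f"{label:<22} f={f:>2}  P={p:.2f}  %={pct:.1f}")
```

For example, 25 of the 50 boys cried, so P = .50 and the percentage is 50.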

Page 6: Chapter 2 Organizing the Data

Illustration: Gender of Students Majoring in CJ (f)

           Criminal Justice Majors
Gender     College A    College B
Male             879          119
Female           473           64
Total          1,352          183

Page 7: Chapter 2 Organizing the Data

Illustration: Gender of Students Majoring in CJ (f and %)

              Criminal Justice Majors
            College A        College B
Gender       f      %         f      %
Male       879     65       119     65
Female     473     35        64     35
Total    1,352    100       183    100

Page 8: Chapter 2 Organizing the Data

Rates

• Rates usually preferred by social researchers

• Rate – comparison between actual and potential cases

• Base terms in rates may vary

$$\text{Rate} = (1{,}000)\,\frac{f\ \text{actual cases}}{f\ \text{potential cases}}$$
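As a rough sketch of the rate formula, the function below divides actual cases by potential cases and multiplies by a base term. The counts used here are hypothetical and do not come from the slides; the `base` parameter is an assumption that reflects the point that base terms may vary.

```python
def rate(actual_cases: int, potential_cases: int, base: int = 1_000) -> float:
    """Rate = (base) * f actual cases / f potential cases."""
    if potential_cases <= 0:
        raise ValueError("potential_cases must be positive")
    return base * actual_cases / potential_cases


# Hypothetical counts, for illustration only.
actual = 500       # cases in which the event occurred
potential = 4_000  # cases in which the event could have occurred

print(rate(actual, potential))                # rate per 1,000 potential cases
print(rate(actual, potential, base=100_000))  # same comparison per 100,000
```

Changing the base term rescales the number without changing the underlying comparison.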

Page 9: Chapter 2 Organizing the Data

Rate of Change

• Compare the same population at two points in time
• Rate of Change =

$$(100)\,\frac{\text{time 2}f - \text{time 1}f}{\text{time 1}f}$$

Year    Theft Rate¹    % Change
2005        120.3
2006        127.4         5.9%
2007        116.8        -8.3%
2008        107.4        -8.0%
2009         98.7        -8.1%
2010         94.6        -4.2%

¹Source: National Crime Victimization Survey
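The % Change column can be reproduced from the theft rates with the rate-of-change formula. This is a small sketch; the function and variable names are illustrative, and each year is compared with the year before it.

```python
def rate_of_change(time1_f: float, time2_f: float) -> float:
    """Rate of change = (100) * (time 2 f - time 1 f) / time 1 f."""
    if time1_f == 0:
        raise ValueError("time 1 f must be nonzero")
    return 100 * (time2_f - time1_f) / time1_f


# Theft rates from the table (National Crime Victimization Survey).
theft_rates = {
    2005: 120.3,
    2006: 127.4,
    2007: 116.8,
    2008: 107.4,
    2009: 98.7,
    2010: 94.6,
}

years = sorted(theft_rates)
changes = {
    later: round(rate_of_change(theft_rates[earlier], theft_rates[later]), 1)
    for earlier, later in zip(years, years[1:])
}

for year, change in changes.items():
    print(f"{year}: {change:+.1f}%")
```

The first year has no % Change value because there is no earlier time point to compare it with.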

Page 10: Chapter 2 Organizing the Data

Ordinal/Interval Data and Distributions

Attitudes Toward Televised Trials (Incorrect)

                         f
Slightly Favorable       9
Somewhat Unfavorable     7
Strongly Favorable      10
Slightly Unfavorable     6
Strongly Unfavorable    12
Somewhat Favorable      21
Total                   65

Attitudes Toward Televised Trials (Correct)

                         f
Strongly Favorable      10
Somewhat Favorable      21
Slightly Favorable       9
Slightly Unfavorable     6
Somewhat Unfavorable     7
Strongly Unfavorable    12
Total                   65

Page 11: Chapter 2 Organizing the Data

Frequency Distribution of Final-Examination Grades for 71 Students

Grade  f    Grade  f    Grade  f    Grade  f
   99  0       85  2       71  4       57  0
   98  1       84  1       70  9       56  1
   97  0       83  0       69  3       55  0
   96  1       82  3       68  5       54  1
   95  1       81  1       67  1       53  0
   94  0       80  2       66  3       52  1
   93  0       79  8       65  0       51  1
   92  1       78  1       64  1       50  1
   91  1       77  0       63  2     N = 71
   90  0       76  2       62  0
   89  1       75  1       61  0
   88  0       74  1       60  2
   87  1       73  1       59  3
   86  0       72  2       58  1

Page 12: Chapter 2 Organizing the Data

Grouped Frequency Distributions of Interval Data

Grouped Frequency Distribution of Final-Examination Grades for 71 Students

Class Interval     f        %
95-99              3     4.23
90-94              2     2.82
85-89              4     5.63
80-84              7     9.86
75-79             12    16.90
70-74             17    23.94
65-69             12    16.90
60-64              5     7.04
55-59              5     7.04
50-54              4     5.63
Total             71      100
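The grouped distribution can be rebuilt from the ungrouped grade frequencies by assigning each grade to a class interval of width 5 starting at 50. The sketch below assumes that interval scheme (it matches the intervals in the table); grades with a frequency of 0 are simply left out of the input, and the helper name `class_interval` is illustrative.

```python
from collections import Counter

# Nonzero frequencies from the ungrouped final-examination table (N = 71).
grade_freq = {
    98: 1, 96: 1, 95: 1, 92: 1, 91: 1, 89: 1, 87: 1, 85: 2, 84: 1,
    82: 3, 81: 1, 80: 2, 79: 8, 78: 1, 76: 2, 75: 1, 74: 1, 73: 1,
    72: 2, 71: 4, 70: 9, 69: 3, 68: 5, 67: 1, 66: 3, 64: 1, 63: 2,
    60: 2, 59: 3, 58: 1, 56: 1, 54: 1, 52: 1, 51: 1, 50: 1,
}

WIDTH = 5    # size of each class interval
LOWEST = 50  # lower limit of the lowest interval


def class_interval(grade: int) -> tuple[int, int]:
    """Return the (lower, upper) limits of the interval containing grade."""
    lower = LOWEST + ((grade - LOWEST) // WIDTH) * WIDTH
    return lower, lower + WIDTH - 1


grouped = Counter()
for grade, f in grade_freq.items():
    grouped[class_interval(grade)] += f

n = sum(grouped.values())
for (lower, upper), f in sorted(grouped.items(), reverse=True):
    print(f"{lower}-{upper}  {f:>2}  {100 * f / n:5.2f}")
```

Grouping trades detail for readability: the 50 separate score values collapse into 10 intervals, while N stays at 71.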

Page 13: Chapter 2 Organizing the Data

Flexible Class Intervals

Income Category            f        %
$100,000 and above    16,886     21.9
$75,000-$99,999       10,471     13.5
$50,000-$74,999       15,754     20.3
$40,000-$49,999        7,488      9.7
$30,000-$39,999        7,996     10.3
$20,000-$29,999        8,169     10.6
$15,000-$19,999        3,709      4.8
$10,000-$14,999        2,890      3.7
$5,000-$9,999          2,024      2.6
Under $5,000           2,031      2.6
                  N = 77,688

Page 14: Chapter 2 Organizing the Data

Cumulative Distributions

• Cumulative frequencies involve the total number of cases having a given score or a score that is lower
• Cumulative frequency shown as cf
• cf obtained by the sum of frequencies in that category plus all lower category frequencies
• Cumulative percentage – percentage of cases having a given score or a lower score

$$c\% = (100)\,\frac{cf}{N}$$

Page 16: Chapter 2 Organizing the Data

Grouped Frequency Distributions of Interval Data

Grouped Frequency Distribution of Final-Examination Grades for 71 Students

Class Interval     f    cf        %       c%
95-99              3    71     4.23      100
90-94              2    68     2.82    95.76
85-89              4    66     5.63    92.94
80-84              7    62     9.86    87.31
75-79             12    55    16.90    77.45
70-74             17    43    23.94    60.55
65-69             12    26    16.90    36.61
60-64              5    14     7.04    19.71
55-59              5     9     7.04    12.67
50-54              4     4     5.63     5.63
Total             71            100
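The cf column is a running sum of f, starting from the lowest class interval, and c% follows from the cumulative percentage formula. The sketch below uses the frequencies of the grouped grade distribution; variable names are illustrative. Values computed directly as (100)cf/N can differ in the last digit from a table that adds up already-rounded percentages.

```python
from itertools import accumulate

# Class intervals and frequencies, listed from lowest to highest.
intervals = ["50-54", "55-59", "60-64", "65-69", "70-74",
             "75-79", "80-84", "85-89", "90-94", "95-99"]
freqs = [4, 5, 5, 12, 17, 12, 7, 4, 2, 3]

n = sum(freqs)  # N = 71

# cf: frequency of the category plus all lower category frequencies.
cum_freqs = list(accumulate(freqs))

# c% = (100) * cf / N
cum_pcts = [100 * cf / n for cf in cum_freqs]

# Print from the highest interval down, as in the table.
for interval, f, cf, cpct in reversed(list(zip(intervals, freqs, cum_freqs, cum_pcts))):
    print(f"{interval}  f={f:>2}  cf={cf:>2}  c%={cpct:6.2f}")
```

The highest interval always has cf = N and c% = 100, which is a quick check that no category was dropped.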

Page 17: Chapter 2 Organizing the Data

Frequency Distribution of Seat Belt Use

Use of Seat Belts       f       %
All the time          499    50.1
Most of the time      176    17.7
Some of the time      124    12.4
Seldom                 83     8.3
Never                 115    11.5
Total                 997     100

Cross-tab

Cross-Tabulation of Seat Belt Use by Gender

                      Gender of Respondents
Use of Seat Belts     Male    Female    Total
All the time           144       355      499
Most of the time        66       110      176
Some of the time        58        66      124
Seldom                  39        44       83
Never                   60        55      115
Total                  367       630      997

Page 18: Chapter 2 Organizing the Data

What Type to Choose?

• There are three sets of percentages
  • Total
  • Row
  • Column
• All are correct, mathematically speaking
• Total percentages may be misleading
• Row and column percentages come down to which is more relevant to the purpose of the analysis

Page 19: Chapter 2 Organizing the Data

Cross-tab Formulas

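$$\text{total}\% = (100)\,\frac{f}{N_{\text{total}}}$$

Formula for total percents

$$\text{row}\% = (100)\,\frac{f}{N_{\text{row}}}$$

Formula for row percents

$$\text{col}\% = (100)\,\frac{f}{N_{\text{column}}}$$

Formula for column percents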
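The three formulas differ only in the denominator. The sketch below applies them to the seat belt cross-tabulation by gender; the dictionary layout and function names are illustrative assumptions, not part of the slides.

```python
# Cell frequencies from the cross-tabulation of seat belt use by gender.
counts = {
    "All the time":     {"Male": 144, "Female": 355},
    "Most of the time": {"Male": 66,  "Female": 110},
    "Some of the time": {"Male": 58,  "Female": 66},
    "Seldom":           {"Male": 39,  "Female": 44},
    "Never":            {"Male": 60,  "Female": 55},
}

# Marginal totals: one per row, one per column, and the grand total.
row_totals = {use: sum(by_gender.values()) for use, by_gender in counts.items()}
col_totals = {
    gender: sum(by_gender[gender] for by_gender in counts.values())
    for gender in ("Male", "Female")
}
n_total = sum(row_totals.values())


def total_pct(use: str, gender: str) -> float:
    """total% = (100) f / N_total"""
    return 100 * counts[use][gender] / n_total


def row_pct(use: str, gender: str) -> float:
    """row% = (100) f / N_row"""
    return 100 * counts[use][gender] / row_totals[use]


def col_pct(use: str, gender: str) -> float:
    """col% = (100) f / N_column"""
    return 100 * counts[use][gender] / col_totals[gender]


cell = ("All the time", "Male")
print(f"total% = {total_pct(*cell):.1f}")
print(f"row%   = {row_pct(*cell):.1f}")
print(f"col%   = {col_pct(*cell):.1f}")
```

The same cell (144 males who use seat belts all the time) yields three different percentages, which is why the choice depends on the purpose of the analysis: the column percent describes males, while the row percent describes people who always wear a seat belt.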

Page 20: Chapter 2 Organizing the Data

Cross Tabulations – Victim-Offender Relationship by Gender of Victim for Homicides in US for 2005 (With Row %)

                      Victim-Offender Relationship
Gender    Intimate  Intimate %    Family  Family %     Other  Other %     Total  Total %
Male           617                 1,310              11,235             13,161
Female       1,470                   639               1,421              3,531
Total        2,087                 1,949              12,656             16,692

Page 21: Chapter 2 Organizing the Data

Cross Tabulations – Victim-Offender Relationship by Gender of Victim for Homicides in US for 2005 (With Row %)

                      Victim-Offender Relationship
Gender    Intimate  Intimate %    Family  Family %     Other  Other %     Total  Total %
Male           617        4.7%     1,310     10.0%    11,235    85.4%    13,161     100%
Female       1,470       41.6%       639     18.1%     1,421    40.2%     3,531     100%
Total        2,087       12.5%     1,949     11.7%    12,656    75.8%    16,692     100%

Page 22: Chapter 2 Organizing the Data

Cross Tabulations – Victim-Offender Relationship by Gender of Victim for Homicides in US for 2005 (With Column %)

Victim-Offender Relationship      Male    Female     Total
Intimate                           617     1,470     2,087
Family                           1,310       639     1,949
Acquaintance                     7,237       998     8,235
Stranger                         3,998       423     4,421
Total                           13,161     3,531    16,692

Page 23: Chapter 2 Organizing the Data

Cross Tabulations – Victim-Offender Relationship by Gender of Victim for Homicides in US for 2005 (With Column %)

Victim-Offender Relationship      Male    Female     Total
Intimate                           617     1,470     2,087
                                  4.7%     41.6%     12.5%
Family                           1,310       639     1,949
                                 10.0%     18.1%     11.7%
Acquaintance                     7,237       998     8,235
                                 55.0%     28.3%     49.3%
Stranger                         3,998       423     4,421
                                 30.4%     12.0%     26.5%
Total                           13,161     3,531    16,692
                                  100%      100%      100%

Page 24: Chapter 2 Organizing the Data

Graphic Presentations

• Graphs are useful tools to emphasize certain aspects of data.

• Many prefer graphs to tables.
• Types of graphs include:
  • Pie charts, bar graphs, frequency polygons, line charts, and maps

Page 25: Chapter 2 Organizing the Data

Pie Chart of Marital Status

Slices: Single 22.6%, Married 61.2%, Widowed 7.3%, Divorced 8.9%

Source: Bureau of the Census

Page 26: Chapter 2 Organizing the Data

Exploded Pie Chart

Figure 2.3 Pie Chart of Marital Status

Slices: Single 22.6%, Married 61.2%, Widowed 7.3%, Divorced 8.9%

Source: Bureau of the Census

Page 27: Chapter 2 Organizing the Data

Bar Graph

Bar Graph of Seat Belt Use (with percents)

Horizontal axis: Seat belt use (Never, Seldom, Sometimes, Most times, All times). Vertical axis: Percent (0 to 60).

Page 28: Chapter 2 Organizing the Data

Histogram of Distribution of Children in Little Rock Community Survey

Page 29: Chapter 2 Organizing the Data

Frequency Polygon Example

Frequency Polygon for Distribution of Student Examination Grades

Horizontal axis: Midpoint (52, 57, 62, 67, 72, 77, 82, 87, 92, 97). Vertical axis: Frequency (0 to 20).
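The midpoints on the horizontal axis (52 through 97) are consistent with taking the middle of each class interval in the grouped grade distribution. Assuming that is how they were obtained, the points of the polygon can be sketched as (midpoint, frequency) pairs; the names below are illustrative.

```python
def midpoint(lower: int, upper: int) -> float:
    """Middle of a class interval: (lower limit + upper limit) / 2."""
    return (lower + upper) / 2


# Class intervals and frequencies of the grouped grade distribution,
# listed from lowest to highest.
class_intervals = [(50, 54), (55, 59), (60, 64), (65, 69), (70, 74),
                   (75, 79), (80, 84), (85, 89), (90, 94), (95, 99)]
freqs = [4, 5, 5, 12, 17, 12, 7, 4, 2, 3]

# One (x, y) point per interval: x is the midpoint, y is the frequency.
polygon_points = [
    (midpoint(lower, upper), f)
    for (lower, upper), f in zip(class_intervals, freqs)
]

for x, y in polygon_points:
    print(f"midpoint {x:.0f}: frequency {y}")
```

Connecting these points with straight lines gives the frequency polygon; the peak falls at midpoint 72, the 70-74 interval.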

Page 30: Chapter 2 Organizing the Data

Number of Adolescents (< 18 y/o) Using for the First Time by Month

Lines: Marijuana, Alcohol, Hallucinogens. Horizontal axis: month (January through December). Vertical axis: number of adolescents (0 to 14,000).

Page 31: Chapter 2 Organizing the Data
Page 32: Chapter 2 Organizing the Data

Shape of a Distribution

• Kurtosis
  • Leptokurtic
  • Platykurtic
  • Mesokurtic
• Skewness
  • Negative
  • Positive
• Normal Curve

Page 33: Chapter 2 Organizing the Data

Kurtosis

Leptokurtic Platykurtic Mesokurtic

Some Variation in Kurtosis among Symmetrical Distributions

Page 34: Chapter 2 Organizing the Data

Skewness

Negatively skewed, Positively skewed, Symmetrical (Normal)

Three Distributions Representing Direction of Skewness

Page 35: Chapter 2 Organizing the Data

Summary

• Organizing raw data is critical.
• Data can be summarized using frequency distributions.
• Comparisons of groups are possible through proportions, percentages and rates.
• Cross-tabs allow two-dimensional (and more) analysis.
• Graphic presentations:
  • help to emphasize findings
  • make data more accessible to consumers of research
  • help researchers identify trends