Cross Tabulation Andrew Martin PS 372 University of Kentucky


Page 1: Crosstabs

Cross Tabulation

Andrew Martin

PS 372

University of Kentucky

Page 2: Crosstabs

What is a cross-tabulation?

A cross-tabulation displays the joint frequencies and relative frequencies of two categorical (nominal or ordinal) variables.

The distribution is listed for each combination of categories that exists between two variables.

Each case is then placed in the cell of the table that represents the combination of values that corresponds to its score on the variables.

Page 3: Crosstabs

What is a cross-tabulation?

Ex: Party identification and gender (NES data)

Assuming party identification has three categories (Democrat, Independent, Republican) and gender has two (male and female), the table would have 6 cells.

Party ID   Male     Female
Dem.       Cell 1   Cell 2
Ind.       Cell 3   Cell 4
Rep.       Cell 5   Cell 6

Page 4: Crosstabs

What is a cross-tabulation?

If we use the seven-point party identification scale, the cross-tabulation gets bigger.

Party ID            Male      Female
Strong Democrat     Cell 1    Cell 2
Weak Democrat       Cell 3    Cell 4
Ind.-Democrat       Cell 5    Cell 6
Independent         Cell 7    Cell 8
Ind.-Republican     Cell 9    Cell 10
Weak Republican     Cell 11   Cell 12
Strong Republican   Cell 13   Cell 14

Page 5: Crosstabs
Page 6: Crosstabs

How to construct a cross-tab

(1) Separate the cases into groups based on their values for the independent variable.

(2) For each group on the independent variable, compute the frequencies or percentages falling in each level of the dependent variable.

(3) Decide whether the frequency or percentage distributions differ from group to group, and if so, by how much.
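
The three steps above can be sketched in a few lines of Python. This is only an illustration: the function name `crosstab_column_percentages` and the twenty cases are hypothetical, not NES data, and the independent variable (gender) is assumed to define the columns.

```python
from collections import Counter

def crosstab_column_percentages(cases):
    """cases: list of (independent_value, dependent_value) tuples.

    Returns {(independent_value, dependent_value): percent of that group}.
    """
    # Step 1: group the cases by their value on the independent variable.
    group_totals = Counter(iv for iv, _ in cases)
    # Step 2: count the dependent-variable values within each group
    # and convert the counts to percentages of the group total.
    cell_counts = Counter(cases)
    return {
        (iv, dv): 100.0 * n / group_totals[iv]
        for (iv, dv), n in cell_counts.items()
    }

# Hypothetical cases, invented for illustration only.
cases = (
    [("Male", "Dem.")] * 4 + [("Male", "Ind.")] * 2 + [("Male", "Rep.")] * 4
    + [("Female", "Dem.")] * 6 + [("Female", "Ind.")] * 2 + [("Female", "Rep.")] * 2
)

pct = crosstab_column_percentages(cases)
for dv in ("Dem.", "Ind.", "Rep."):
    print(f"{dv:<5} Male {pct[('Male', dv)]:5.1f}%   Female {pct[('Female', dv)]:5.1f}%")

# Step 3: compare the groups, here as a percentage-point gap.
gap = pct[("Female", "Dem.")] - pct[("Male", "Dem.")]
print(f"Female minus male, percent Democrat: {gap:.1f} points")
```

Each group's percentages sum to 100, which is a quick check that the percentages were computed within the independent variable's categories.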

Page 7: Crosstabs

Remember, with cross-tabs

1. The column percentages are more important than the row percentages.

2. The column percentages should add to 100.

Page 8: Crosstabs
Page 9: Crosstabs

Research Questions

We can use cross-tabs to investigate the following research questions:

Is there a relationship between gender and partisanship?

Are women more likely to be Democrats than men?

If so, does this mean women are more liberal than men?

Page 10: Crosstabs

Cross-tabs and two-variable relationships

We assume the relationship is such that:

Gender --> Partisanship

In other words, gender is the independent variable that explains variation in partisanship, which is the dependent variable.

Page 11: Crosstabs

Strength of Relationship

Refers to how different the observed values of the dependent variable are across the categories of the independent variable.

If, within each category of the independent variable, every case fell into a single category of the dependent variable (a different one for each group), there would be a perfect relationship. This almost never occurs.

If the dependent variable is equally distributed for different categories of the independent variable, there is no relationship.
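
A small sketch may make the two extremes concrete. The helper `largest_gap` is a hypothetical, informal yardstick (the largest percentage-point difference between groups), not one of the standard coefficients, and both tables are invented.

```python
def column_percentages(table):
    """table: {group: {category: count}} -> {group: {category: percent}}."""
    return {
        group: {cat: 100.0 * n / sum(counts.values()) for cat, n in counts.items()}
        for group, counts in table.items()
    }

def largest_gap(table):
    """Largest percentage-point difference between groups in any one category."""
    pct = column_percentages(table)
    categories = {cat for counts in table.values() for cat in counts}
    return max(
        max(p.get(cat, 0.0) for p in pct.values())
        - min(p.get(cat, 0.0) for p in pct.values())
        for cat in categories
    )

# Hypothetical tables, invented for illustration only.
no_relationship = {"Male": {"Dem.": 5, "Rep.": 5}, "Female": {"Dem.": 10, "Rep.": 10}}
perfect = {"Male": {"Dem.": 0, "Rep.": 10}, "Female": {"Dem.": 20, "Rep.": 0}}

print(largest_gap(no_relationship))  # 0.0: identical distributions in both groups
print(largest_gap(perfect))          # 100.0: each group falls in a single category
```

Real tables fall between these extremes, which is why the percentages have to be read group by group.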

Page 12: Crosstabs
Page 13: Crosstabs

Insert 12-4 and 12-5

Page 14: Crosstabs

Direction of the relationship

The direction of the relationship shows which values of the independent variable are associated with values of the dependent variable.

If higher values of the independent variable are associated with higher values of the dependent variable, the relationship is positive.

If lower values of the independent variable are associated with higher values of the dependent variable, the relationship is negative.

Page 15: Crosstabs

Another research question

Suppose you were asked to predict how Americans would respond to a question about making gun control laws more stringent.

In absence of any information about their attitudes about gun control, what would you use to predict attitudes about pending gun control legislation?

Potential answers: Ideology, partisanship, related survey questions about gun rights and restrictions.

Page 16: Crosstabs

Table 12-6

Figure 12-3

Page 17: Crosstabs
Page 18: Crosstabs

Cross-tab limitations

Sometimes it is practicable to examine the relationships of two variables by just looking at the tables.

In some instances analysis involves many tables or tables with many cells.

In those instances, it may be more useful to summarize the information using coefficients for ordinal data.

Page 19: Crosstabs

Calculating Coefficients for Ordinal Data

We're not going to learn how to calculate each statistic, but there are some basic concepts we should review.

In particular, we need to know how to identify concordant pairs, discordant pairs and tied pairs.

Page 20: Crosstabs

Concordant, Discordant and Tied Pairs

In a concordant pair, one case is higher than another case for both variables.

In a discordant pair, one case is lower on one of the variables but higher on the other.

In a tied pair both cases have the same value on one or both variables.
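
These definitions reduce to comparing the signs of two differences. A minimal sketch, assuming each case is an (x, y) tuple of ordinal scores; the function name `classify_pair` and the example scores are hypothetical.

```python
def classify_pair(case_a, case_b):
    """Classify two cases, each an (x, y) tuple of ordinal scores."""
    dx = case_a[0] - case_b[0]  # difference on variable X
    dy = case_a[1] - case_b[1]  # difference on variable Y
    if dx == 0 or dy == 0:
        return "tied"           # same value on one or both variables
    # Same sign on both differences: concordant; opposite signs: discordant.
    return "concordant" if (dx > 0) == (dy > 0) else "discordant"

print(classify_pair((3, 3), (1, 1)))  # concordant
print(classify_pair((3, 1), (2, 2)))  # discordant
print(classify_pair((1, 1), (1, 1)))  # tied
```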

Page 21: Crosstabs

Insert Table 12-7

Page 22: Crosstabs

Insert Table 12-7 (rows and columns each labeled 3, 2, 1)

Page 23: Crosstabs

Values by Name and Variable

Name      Variable X   Variable Y
Alex      3            3
Carl      3            1
Dawn      2            3
Ernesto   2            2
Fay       2            1
Gus       1            3
Hera      1            2
Ike       1            1
Jasmine   1            1

Page 24: Crosstabs

Determining Pair Type

Alex and Ike

Variable X = 3 (Alex) – 1 (Ike) = 2
Variable Y = 3 (Alex) – 1 (Ike) = 2

Both differences are positive; Alex and Ike are concordant.

Page 25: Crosstabs

Determining Pair Type

Carl and Ernesto

Variable X = 3 (Carl) – 2 (Ernesto) = 1
Variable Y = 1 (Carl) – 2 (Ernesto) = -1

One difference is positive and the other negative; Carl and Ernesto are discordant.

Page 26: Crosstabs

Determining Pair Type

Ike and Jasmine

Variable X = 1 (Ike) – 1 (Jasmine) = 0
Variable Y = 1 (Ike) – 1 (Jasmine) = 0

Both numbers are 0; this constitutes a tied pair.
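
The same comparison can be repeated for every pair among the nine cases listed under "Values by Name and Variable". The sketch below assumes that list is complete as shown; the function and variable names are illustrative.

```python
from collections import Counter
from itertools import combinations

# (X, Y) scores copied from the "Values by Name and Variable" slide.
scores = {
    "Alex": (3, 3), "Carl": (3, 1), "Dawn": (2, 3),
    "Ernesto": (2, 2), "Fay": (2, 1), "Gus": (1, 3),
    "Hera": (1, 2), "Ike": (1, 1), "Jasmine": (1, 1),
}

def pair_type(a, b):
    """Return 'concordant', 'discordant', or 'tied' for two (x, y) cases."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    if dx == 0 or dy == 0:
        return "tied"
    return "concordant" if (dx > 0) == (dy > 0) else "discordant"

# Nine cases give 9 * 8 / 2 = 36 distinct pairs.
counts = Counter(
    pair_type(scores[first], scores[second])
    for first, second in combinations(scores, 2)
)

print(pair_type(scores["Alex"], scores["Ike"]))      # concordant
print(pair_type(scores["Carl"], scores["Ernesto"]))  # discordant
print(pair_type(scores["Ike"], scores["Jasmine"]))   # tied
print(dict(counts))
```

For these nine cases the tally is 10 concordant, 7 discordant, and 19 tied pairs.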

Page 27: Crosstabs

Ordinal Coefficients

There are four commonly used coefficients of association for ordinal data:

Kendall's tau-b

Kendall's tau-c

Somers' d

Goodman and Kruskal's gamma

Page 28: Crosstabs

Ordinal Coefficients

Each is calculated somewhat differently (see JRM p. 442), but the intuition is that they measure the probability of concordant pairs minus the probability of discordant pairs.

Measure = p_concordance – p_discordance

where p = probability

The measures treat tied pairs somewhat differently.
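
As one illustration of the intuition, gamma ignores tied pairs entirely and is computed as (C - D) / (C + D), where C and D are the numbers of concordant and discordant pairs. The sketch below applies that formula to the nine cases from the "Values by Name and Variable" slide; the function name `gamma` is an assumption, not something from JRM.

```python
from itertools import combinations

def gamma(cases):
    """Goodman and Kruskal's gamma for a list of (x, y) ordinal scores."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(cases, 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 or dy == 0:
            continue  # gamma drops every tied pair
        if (dx > 0) == (dy > 0):
            concordant += 1
        else:
            discordant += 1
    # Share of untied pairs that are concordant minus share that are discordant.
    return (concordant - discordant) / (concordant + discordant)

# Scores for Alex, Carl, Dawn, Ernesto, Fay, Gus, Hera, Ike, Jasmine.
cases = [(3, 3), (3, 1), (2, 3), (2, 2), (2, 1), (1, 3), (1, 2), (1, 1), (1, 1)]
g = gamma(cases)
print(round(g, 3))  # (10 - 7) / (10 + 7), about 0.176
```

Because the tied pairs are dropped, the two probabilities here are taken over the 17 untied pairs only.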

Page 29: Crosstabs

Ordinal Coefficient Properties

1. Theoretically, all vary between -1 and 1.

2. In practice, a value of -1 or 1 is unlikely. A measure of .4 or greater in absolute value (that is, at or below -.4, or at or above .4) indicates an association strong enough to investigate further.

3. Since 0 means no correlation, values of -.1 to .1 suggest a weak relationship.

4. All ordinal measures of correlation will have the same sign in a given table.

Page 30: Crosstabs

Ordinal Coefficient Properties

5. The absolute value of gamma (γ) will always be greater than or equal to the absolute value of any of the other measures.

6. The relationships among tau-b, tau-c and Somers' d are harder to generalize because they are affected differently by the structure of the table (that is, the number of rows and columns).

7. Somers' d is an asymmetric measure because its value depends on which variable is considered dependent.
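
Properties 5 and 7 can be checked on a small example. The sketch below uses the usual pair-count formulas: gamma = (C - D) / (C + D), tau-b = (C - D) / sqrt((C + D + Tx)(C + D + Ty)), and Somers' d = (C - D) / (C + D + T), where T counts pairs tied only on the variable treated as dependent. The five cases are hypothetical, and tau-c is omitted.

```python
from itertools import combinations
from math import sqrt

def pair_counts(cases):
    """Return (C, D, Tx, Ty) for a list of (x, y) ordinal scores."""
    c = d = tx = ty = 0
    for (x1, y1), (x2, y2) in combinations(cases, 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue      # tied on both variables: not used by these formulas
        elif dx == 0:
            tx += 1       # tied on X only
        elif dy == 0:
            ty += 1       # tied on Y only
        elif (dx > 0) == (dy > 0):
            c += 1        # concordant
        else:
            d += 1        # discordant
    return c, d, tx, ty

# Hypothetical (x, y) scores, invented for illustration only.
cases = [(1, 1), (1, 2), (2, 2), (3, 2), (2, 1)]
c, d, tx, ty = pair_counts(cases)

gamma = (c - d) / (c + d)
tau_b = (c - d) / sqrt((c + d + tx) * (c + d + ty))
d_yx = (c - d) / (c + d + ty)  # Somers' d with Y as the dependent variable
d_xy = (c - d) / (c + d + tx)  # Somers' d with X as the dependent variable

print(round(gamma, 3), round(tau_b, 3), round(d_yx, 3), round(d_xy, 3))
# 0.5 0.289 0.25 0.333
```

Gamma is the largest in absolute value because the other denominators add tied pairs, and the two Somers' d values differ because each counts ties on a different variable.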

Page 31: Crosstabs

Ordinal Coefficient Properties

8. A single measure by itself cannot assess how strongly one variable is related to another. After the statistical software calculates the measures, you should scrutinize the tables. Do not be lazy with analysis and interpretation.

9. These coefficients measure a particular type of association, namely correlation, whether linear or monotonic.