5/22/2018 Measures of Association
Measures of Association
G. David Garson
Overview
Association refers to a wide variety of coefficients which measure strength of
relationship, defined in various ways. In common usage, "association" refers to
measures of strength of relationship in which at least one of the variables is a
dichotomy, nominal, or ordinal.
Correlation, which is a type of association used when both variables are interval, is
discussed separately.
Reliability, which is a type of association used to establish the consistency of a
measure or to assess inter-rater similarity on a variable, is also discussed
separately.
Key Concepts and Terms
o Significance versus association. Measures of significance test the null
hypothesis that the strength of an observed relationship is not different
from what would be expected due to the chance of random sampling.
Significance coefficients reflect not only strength of relationship but also
sample size and sometimes other parameters. Therefore it is possible to
have a relationship which displays strong association but is not significant
(ex., all males are Republicans and all females are Democrats, but the
sample size is only 8) or a relationship which displays an extremely weak
association but is very significant (ex., 50.1% of males are Republicans
compared to 50.0% of females, but sample size is 15,000 and the
significance level is .001). Because significance and association are not at
all equivalent, researchers ordinarily must report both significance and
association when discussing their findings. Note also that significance is
relevant only when one has a random sample, whereas association is
always relevant to research inferences.
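The dependence of significance on sample size can be sketched in a few lines. The example below is an illustration under stated assumptions, not part of the original notes: the two tables are hypothetical, phi is used as the measure of association, and the p-value is the uncorrected Pearson chi-square with one degree of freedom (no continuity correction).

```python
import math

def phi_and_chi2(table):
    """Return (phi, chi-square, p-value) for a 2-by-2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    # Phi: strength of association in a 2-by-2 table.
    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    # For a 2-by-2 table the Pearson chi-square equals n * phi squared.
    chi2 = n * phi ** 2
    # Upper-tail probability of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return phi, chi2, p_value

# Hypothetical counts: identical proportions, different sample sizes.
small = [[6, 4], [4, 6]]          # n = 20
large = [[600, 400], [400, 600]]  # n = 2000

phi_s, chi2_s, p_s = phi_and_chi2(small)
phi_l, chi2_l, p_l = phi_and_chi2(large)
print(f"n=20:   phi={phi_s:.3f} chi2={chi2_s:.2f} p={p_s:.3g}")
print(f"n=2000: phi={phi_l:.3f} chi2={chi2_l:.2f} p={p_l:.3g}")
```

Both tables have the same phi, but only the larger sample is significant at conventional levels, which is why the two kinds of coefficient are reported side by side.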
o Coefficients of association. Most coefficients of association vary from 0
(indicating no relationship) to 1 (indicating perfect relationship) or -1
(indicating perfect negative relationship). As discussed below, however,
there are various types of "perfect relationship" and various types of "no
relationship." Which definitions the researcher selects may strongly affect
the conclusions to which he or she comes. When particular coefficients are
discussed later in this section, their definitions of perfect and null
relationships are cited and this is one important criterion used by
researchers in selecting among possible measures of association. If you
wish to skip the rather long discussion below, just keep in mind that most
but not all coefficients of association define "perfect relationship" as strict
monotonicity and define a "null relationship" as statistical independence.
Types of perfect relationship. There are four definitions of
"perfect" linear relationship in association, plus the definition of
perfect curvilinear relationships. The linear definitions are those
dealing with strict monotonic, ordered monotonic, predictive
monotonic, and weak monotonic relationships. All relationships
which are perfect by strict monotonicity are also perfect by the
others. Likewise, perfect ordered and predictive monotonic
relationships will also be perfect by the criterion of weak
monotonicity. One cannot have perfect ordered monotonicity and
perfect predictive monotonicity at the same time. None of the
definitions based on monotonicity are appropriate for curvilinear or
discontinuous relationships.
1. The concept of pairs. Strength of linear relationship is
defined in terms of degree of monotonicity, which is based
on counting various types of pairs in a relationship shown in
a table. A pair is two cases, each of which is in a different
cell in the table representing the joint distribution of two
variables. Let x be an independent variable with three
values and let y be a dependent variable with two values, with
a, b, ..., f being the cell counts in the resulting table, illustrated
below:

            x
  y     1   2   3
  1     a   b   c
  2     d   e   f
2. The four types of pairs, how they are counted, and their
symbols are shown in the table below.

  Type of Pair    Number of Pairs               Symbol
  Concordant      a(e+f) + b(f)                 P
  Discordant      c(d+e) + b(d)                 Q
  Tied on x       ad + be + cf                  Xo
  Tied on y       a(b+c) + bc + d(e+f) + ef     Yo
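The pair counts can be checked mechanically. The following is a minimal sketch, assuming rows are ordered y values and columns are ordered x values; the function name `count_pairs` is hypothetical, and `Yo` is used for ties on y by analogy with `Xo`.

```python
def count_pairs(table):
    """Count pair types in a table (rows = ordered y, columns = ordered x).

    Returns (P, Q, Xo, Yo): concordant, discordant, tied on x, tied on y.
    Pairs of cases within the same cell are not counted.
    """
    n_rows, n_cols = len(table), len(table[0])
    P = Q = Xo = Yo = 0
    # Row-major list of cells, so each unordered pair of cells is visited once.
    cells = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    for index, (i, j) in enumerate(cells):
        for k, l in cells[index + 1:]:
            product = table[i][j] * table[k][l]
            if k == i:        # same row: same y value
                Yo += product
            elif l == j:      # same column: same x value
                Xo += product
            elif l > j:       # higher on y and higher on x
                P += product
            else:             # higher on y but lower on x
                Q += product
    return P, Q, Xo, Yo

# Illustrative cell counts a..f = 1..6 for the 2-by-3 layout above.
a, b, c, d, e, f = 1, 2, 3, 4, 5, 6
P, Q, Xo, Yo = count_pairs([[a, b, c], [d, e, f]])
print(P, Q, Xo, Yo)
```

For these counts the loop agrees with the closed-form expressions in the table, for example P = a(e+f) + b(f).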
... requires that discordant pairs (Q), ties on x (Xo), and ties on
y ... For perfect negative monotonicity, concordant pairs (P), ties on x (Xo),
and ties on y ...
... relationships) or remains the same, and (2) when every x
value corresponds to just one y value. Examples of perfect
predictive monotonic association are below:

            x
  y
       15    0    0
        0   15   15
        0    0    0

            x
  y
       15   15    0    0
        0    0    0    0
        0    0   15   15
Note: this form of association is called "predictive" because
the dependent variable can be predicted uniquely from
knowing the value of the independent variable, given that
each independent x value corresponds uniquely to one
dependent y value.
6. Weak monotonic. Perfect positive weak monotonicity
exists when discordant pairs (Q) are zero. Perfect negative
weak monotonicity exists when concordant pairs (P) are
zero. Perfect weak monotonicity exists when, as x
increases, y also increases (or decreases for perfect negative
relationships) or remains the same. In 2-by-2 tables this
corresponds to having a zero cell in the table. Examples of
perfect weak monotonic association are below:

            x
  y
       15    0    0
       15    0    0
       15   15   15

            x
  y
       15    0    0    0
       15   15    0    0
        0    0   15   15
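The weak monotonicity criterion can be sketched as a check on concordant and discordant pair counts. This is an illustration only: the function names are hypothetical, rows are taken as ordered y values and columns as ordered x values, and the third table is simply the first with its rows reversed.

```python
def concordant_discordant(table):
    """Return (P, Q) for a table with rows = ordered y, columns = ordered x."""
    n_rows, n_cols = len(table), len(table[0])
    P = Q = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):   # second case is higher on y
                for l in range(n_cols):
                    if l > j:                # and higher on x: concordant
                        P += table[i][j] * table[k][l]
                    elif l < j:              # but lower on x: discordant
                        Q += table[i][j] * table[k][l]
    return P, Q

def weak_monotonicity(table):
    """Classify a table by the weak monotonic definition of perfection."""
    P, Q = concordant_discordant(table)
    if Q == 0 and P > 0:
        return "perfect positive"
    if P == 0 and Q > 0:
        return "perfect negative"
    return "not perfect"

first = [[15, 0, 0], [15, 0, 0], [15, 15, 15]]
second = [[15, 0, 0, 0], [15, 15, 0, 0], [0, 0, 15, 15]]
reversed_first = first[::-1]   # rows flipped, so the direction reverses

print(weak_monotonicity(first), weak_monotonicity(second),
      weak_monotonicity(reversed_first))
```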
7. Curvilinear. Curvilinear association is perfect when every
x value of the independent variable corresponds to only one y value
of the dependent variable. The reverse need not be true, nor
need the relationship be continuous. Most investigations of
curvilinear relationships involve the use of curve-fitting
software, however, which usually does require that the distribution
be continuous. Some applications also require that the curve
be describable as a mathematical function.
Curvilinear association is asymmetric in that its definition
depends on which variable is independent and which is
dependent. Thus for hypotheses in which y is the
independent variable, curvilinear association is perfect
when every y value corresponds to only one x value. Note
curvilinear association is never applicable to nominal
variables.
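The asymmetry of the curvilinear definition can be illustrated with a small check. This is a sketch under assumptions: the U-shaped table is hypothetical, columns are taken as values of the independent variable, and null columns are ignored.

```python
def perfect_curvilinear(table):
    """True when every column (independent value) has cases in at most one row."""
    for column in zip(*table):
        occupied_rows = sum(1 for count in column if count > 0)
        if occupied_rows > 1:
            return False
    return True

def transpose(table):
    """Swap the roles of the row and column variables."""
    return [list(column) for column in zip(*table)]

# Hypothetical U-shaped relationship: rows are y, columns are x.
u_shape = [[15, 0, 15],
           [0, 15, 0]]

x_predicts_y = perfect_curvilinear(u_shape)             # x is independent
y_predicts_x = perfect_curvilinear(transpose(u_shape))  # y is independent
print(x_predicts_y, y_predicts_x)
```

Each x value maps to one y value, so the relationship is perfect with x as the independent variable, but the first y value maps to two x values, so it is not perfect in the other direction.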
Types of null relationship. There are four ways to define "no
relationship" between two variables. The leading definition,
independence, is a symmetric criterion making no assumption
about the direction of causation, whereas accord is asymmetric.
Both independence and accord are nominal criteria, making no
assumption about the level of data. Balance is an ordinal criterion,
except for dichotomies, and assumes the values of the two variables
are ordered. Cleavage is a sufficient condition for independence
and balance, but is a more stringent definition such that
independence or balance do not imply cleavage.
1. Independence. By far the most common definition of null
relationship is based on the laws of probability. Two
variables are independent when their joint distribution is as
would be predicted on the basis of the number of cases in
their individual categories. The expected value for any joint
category, calculated as in chi-square, is the product of the
number of cases in their separate categories divided by n,
the sample size. For instance, if in a sample of 100 there are
50 men and 40 Republicans, the expected number of male
Republicans is 50*40/100 = 20. If every joint category
(each of the cells in a table) is the expected value, then there
is a null relationship as defined by the criterion of
independence. Note independence makes no assumption
about which is the independent and which is the dependent
variable (it is symmetric). When a relationship is
independent, chi-square will be zero and thus chi-square
may be viewed as a test of independence.
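The expected-value arithmetic can be sketched as follows. The 50 men and 40 Republicans in a sample of 100 come from the text; the split of the remaining cells and the second, dependent table are assumptions made for illustration, and the functions assume no row or column total is zero.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(table):
    """Pearson chi-square: sum of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

# Rows: men, women. Columns: Republican, not Republican.
independent = [[20, 30], [20, 30]]   # 50 men, 40 Republicans, n = 100
dependent = [[30, 20], [10, 40]]     # same marginals, hypothetical counts

male_republicans_expected = expected_counts(independent)[0][0]
print(male_republicans_expected, chi_square(independent), chi_square(dependent))
```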
2. Accord. By this criterion, two variables have a null
relationship if the largest-count categories of the
independent variable all have the same value on the
dependent variable. For instance, let the independent be
low, medium, and high education and let the dependent be
unsatisfactory, satisfactory, and meritorious performance
evaluations. There might be a tendency to have more
meritorious ratings as one moved from low to medium to
high education. However, it might be true at the same time
that most low-education, most medium-education, and most
high-education employees all received satisfactory ratings.
By the criterion of accord there would be a null relationship,
whereas by the criterion of independence there would be a
relationship. Accord is the second most common definition
of strength of relationship and is an asymmetric definition.
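The accord criterion can be sketched as a comparison of modal categories. Both tables below are hypothetical, since the text gives no counts for this example; rows are the dependent ratings and columns are the independent education levels, and ties for the largest count are resolved in favor of the first row.

```python
def modal_rows(table):
    """For each column, return the index of the row with the largest count."""
    return [max(range(len(column)), key=lambda i: column[i])
            for column in zip(*table)]

def null_by_accord(table):
    """True when every independent category has the same modal dependent value."""
    return len(set(modal_rows(table))) == 1

# Rows: unsatisfactory, satisfactory, meritorious.
# Columns: low, medium, high education. Counts are hypothetical.
same_mode = [[30, 20, 10],
             [50, 50, 50],
             [20, 30, 40]]
different_modes = [[50, 20, 10],
                   [30, 50, 30],
                   [20, 30, 60]]

print(null_by_accord(same_mode), null_by_accord(different_modes))
```

In the first table meritorious ratings rise with education, yet satisfactory is the modal rating at every level, so the relationship is null by accord.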
3. Balance. When the value categories of both variables are
ordered and crosstabulated, by this criterion a null
relationship is said to exist when the number of cases on the
right-sloping diagonal(s) is equal to the number of cases on
the left-sloping diagonal(s). Consider the following table:

  Degree/Rating     < BA    BA    > BA    Row total
  Unsatisfactory      8      4      7        19
  Satisfactory        4      4      3        11
  Meritorious         3      3      2         8
  Column Total       15     11     12        38

  left diagonals = 28, right diagonals = 28

In this table there is a tendency for those with less than a
BA degree or more than a BA degree to receive low
performance ratings, and for those with exactly a BA to do
proportionately best. However, since the count on the right-
and left-sloping diagonals is 28 in each case, by balance
there is a null relationship.
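The text does not spell out how the diagonal counts are formed. One reading that reproduces the stated count of 28 in each direction is to sum every cell lying on a diagonal of at least two cells, which in a 3-by-3 table leaves out the two corner cells of the opposite slope. The sketch below implements that reading as an assumption, not as the author's stated procedure; the cell value 7 is the one implied by the row and column totals.

```python
def diagonal_counts(table):
    """Sum cells on diagonals of length >= 2, separately for each slope.

    Returns (down_right, down_left): totals for diagonals running from
    top-left to bottom-right and from top-right to bottom-left.
    """
    n_rows, n_cols = len(table), len(table[0])

    def total(diagonal_id):
        groups = {}
        for i in range(n_rows):
            for j in range(n_cols):
                groups.setdefault(diagonal_id(i, j), []).append(table[i][j])
        # Single-cell "diagonals" (the far corners) are left out.
        return sum(sum(cells) for cells in groups.values() if len(cells) >= 2)

    down_right = total(lambda i, j: j - i)
    down_left = total(lambda i, j: i + j)
    return down_right, down_left

ratings_by_degree = [[8, 4, 7],
                     [4, 4, 3],
                     [3, 3, 2]]
down_right, down_left = diagonal_counts(ratings_by_degree)
print(down_right, down_left)
```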
4. Cleavage. By this criterion, a null relationship exists when
the number of cases associated with each category of the
independent variable is split evenly among the dependent
variable categories. Consider the following table:

  Degree/Rating     < BA    BA    > BA    Row total
  Unsatisfactory      3      5      8        16
  Satisfactory        3      5      8        16
  Meritorious         3      5      8        16
  Column Total        9     15     24        48

  left diagonals = 37, right diagonals = 37

When a null relationship exists by cleavage, as above, there
will also be a null relationship by balance and
independence. Since there are equal numbers of cases in
each dependent category for each independent category,
accord cannot be computed but it also approaches null for
tables with perfect cleavage. However, note that the reverse
is not true: tables with a null relationship by any of the other
criteria need not have a null relationship by the cleavage
criterion.
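The one-way implication from cleavage to independence can be checked with a short sketch. The function names are hypothetical, rows are the dependent categories and columns the independent categories, and the second table is a hypothetical example of independence without cleavage.

```python
def null_by_cleavage(table):
    """True when each column's cases are split evenly among the rows."""
    return all(len(set(column)) == 1 for column in zip(*table))

def null_by_independence(table):
    """True when every cell equals row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    # Integer comparison avoids floating-point rounding.
    return all(table[i][j] * n == row_totals[i] * col_totals[j]
               for i in range(len(table))
               for j in range(len(table[0])))

cleavage_table = [[3, 5, 8],
                  [3, 5, 8],
                  [3, 5, 8]]
independent_only = [[10, 20],
                    [20, 40]]   # hypothetical counts

print(null_by_cleavage(cleavage_table), null_by_independence(cleavage_table))
print(null_by_cleavage(independent_only), null_by_independence(independent_only))
```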
o Association with Control Variables (The Elaboration Model). In
crosstabulation, for an original table of X and ... For instance, for a table of
religious affiliation with party vote, one may seek to control for gender.
This is done by computing measures of association for the original table
(X = religion and ...
Suppression occurs when the control variable has a positive effect
on the dependent through one path and a negative effect through
another path. For three-variable models, suppression may occur
when there is an odd number of negative arrows. In such situations,
the control variable acts in one direction by way of the independent
and in the opposite direction in terms of direct effect on the
dependent, thereby masking some of the correlation which would
exist in the absence of the control. When suppression occurs,
subtable association will be higher than the original bivariate
association.
o Specific Measures of Association. With the exception of eta, when data
are mixed by data level, the researcher uses a measure of association for
the lower data level. Thus, for nominal-by-ordinal association one would
use a measure for nominal-level association.
Dichotomous Association (2-by-2 tables): Percent Difference, Yule's Q, Yule's Y, Risk
Nominal Association: Phi, Contingency Coefficient, Tschuprow's T, Cramer's V, Lambda, Uncertainty Coefficient
Nominal-by-Interval Nonlinear Association: Eta
Ordinal Association: Gamma, Kendall's tau-b and tau-c, Somers' d
Association for Inter-rater Agreement (rows and columns are the same variable): Kappa
Assumptions
Assumptions are discussed in the sections for each particular
measure of association. Measures of association may assume
nominal, ordinal, or interval levels of measurement; symmetry or
asymmetry of causal direction; square versus any shape table; and
alternative definitions of "perfect relationship" and "null
relationship" as described above.
Strict monotonicity and the assumption of equal marginals.
Measures of association which define perfect relationship in terms
of strict monotonicity can reach 1.0 only when the two variables
have the same marginal distribution, ignoring null rows and null
columns. One such measure, tau b, is used to illustrate this in the
four tables below:
  TABLE A          Male    Female    Row Totals
  Republican        15       10         25
  Democrat           5       20         25
  Column Totals     20       30       n = 50
  tau b = .408

  TABLE B          Male    Female    Row Totals
  Republican        20        5         25
  Democrat           0       25         25
  Column Totals     20       30       n = 50
  tau b = .816

  TABLE C          Male    Female    Row Totals
  Republican        25        5         30
  Democrat           5       15         20
  Column Totals     30       20       n = 50
  tau b = .583

  TABLE D          Male    Female    Row Totals
  Republican        30        0         30
  Democrat           0       20         20
  Column Totals     30       20       n = 50
  tau b = 1.0
Table A illustrates a hypothetical relationship between gender and
political party, shown to have a level of association by tau b of
.408. Table B represents the strongest possible relationship between
gender and party if one is forced to keep the marginal totals the
same as in Table A. Even though Table B is as strong as possible
keeping the same total number of men and women, and
Republicans and Democrats, its association is less than 1.0 (it is
.816). Table C illustrates a relationship between the same two
variables, but where gender and party have equal marginals, with a
tau b strength of .583. Table D represents the strongest possible
relationship between gender and party, keeping the marginal totals
the same as in Table C, and its strength is a perfect 1.0, reflecting
strict monotonicity. That is, a monotonic measure of association
like tau b can reach 1.0 only when the marginal distributions of the
two variables are the same, as they are in Tables C and D.
In the 2-by-2 case, ordered and predictive monotonic measures of
association exhibit the same behavior, although in larger tables they
can reach 1.0 even when row and column marginals are not the
same.
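The four tau b values can be reproduced from the pair counts. The sketch below uses the usual pair-count form of Kendall's tau b, (P - Q) / sqrt((P + Q + Xo)(P + Q + Yo)), specialized to 2-by-2 tables; the function name is hypothetical and the tables are entered as rows Republican, Democrat and columns Male, Female.

```python
import math

def tau_b_2x2(table):
    """Kendall's tau b for a 2-by-2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    P = a * d              # concordant pairs
    Q = b * c              # discordant pairs
    Xo = a * c + b * d     # tied on the column variable only
    Yo = a * b + c * d     # tied on the row variable only
    return (P - Q) / math.sqrt((P + Q + Xo) * (P + Q + Yo))

tables = {
    "A": [[15, 10], [5, 20]],
    "B": [[20, 5], [0, 25]],
    "C": [[25, 5], [5, 15]],
    "D": [[30, 0], [0, 20]],
}
results = {name: round(tau_b_2x2(t), 3) for name, t in tables.items()}
print(results)
```

Table B has no discordant pairs, yet the remaining ties on each variable keep tau b below 1.0; only Table D, with equal marginals and no ties, reaches it.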
Monotonicity and table size. In a nonsquare table with no null
rows and no null columns, there will always be ties on the variable
with the smaller number of classes. When the row variable has
fewer classes there will be ties on the row variable (y), and thus
such a table cannot have perfect association by strict or ordered
monotonicity, but may be perfect by predictive or weak
monotonicity. When the column variable has fewer classes there
will be ties on the column variable (x), and thus such a table cannot
have perfect association by strict or predictive monotonicity, but
may be perfect by ordered or weak monotonicity.
Frequently Asked Questions
Where does one find these measures of association in SPSS?
Most are found in the SPSS CROSSTABS module. From
the menu, select Statistics, Summarize, Crosstabs. In the
"Crosstabs" dialog box, click the "Statistics" button, then in
the "Crosstabs: Statistics" dialog box, check the measures
you want. SPSS does not offer