FRANK D. FINCHAM Florida State University
RON ROGGE University of Rochester*

Understanding Relationship Quality: Theoretical Challenges and New Tools for Assessment

Relationship quality is studied in a variety of disciplines, yet widely accepted practices promulgate a lack of conceptual clarity. We build on a conceptually simple and theoretically advantageous view of relationship quality and suggest a shift to conceptualizing it as two distinct yet related dimensions: positive and negative evaluations of relationships. We introduce item response theory as a powerful tool for measure development, demonstrating how relationship quality can be optimally pursued in the context of modern test theory, thus leading to better theory development. Recognizing the limitations of self-reported relationship quality, we extend this two-dimensional conceptualization further by drawing on developments in the derivation of implicit measures. After briefly introducing such measures, we illustrate their application to assessment of relationship quality.

Key Words: marital satisfaction, relationship quality.

Family Institute, 225 Sandels Building, Florida State University, Tallahassee, FL 32306-1491 (ffi[email protected]).

*Department of Clinical and Social Psychology, University of Rochester, RC Box 270266, Rochester, NY 14627-0266 ([email protected]).

Journal of Family Theory & Review 2 (December 2010): 227–242. DOI:10.1111/j.1756-2589.2010.00059.x

Since its inception, marital research has been strongly motivated by the desire to understand and remediate family problems. Because subjective global evaluation of the relationship (relationship quality) is viewed as the final common pathway that leads to relationship breakdown (Jacobson, 1985), it has been the dominant construct studied in the literature on relationships such as marriage. Not surprisingly, it has gained the attention of researchers from a variety of disciplines, including psychology, sociology, family studies, and communication.

Unfortunately, the wealth of empirical attention given to the study of marriage is inversely proportional to conceptual analysis of the central construct studied. As Glenn (1990) pointed out in his decade review of research on marriage, most studies are justified on practical grounds, “with elements of theory being brought in on an incidental, ad hoc basis” (p. 818). The enemy of scientific progress, conceptual confusion, has resulted and the literature is littered with a large number of terms, such as satisfaction, adjustment, success, happiness, companionship, or some synonym reflective of the quality of the relationship. These terms tend to be used interchangeably, leading some scholars to even call for the elimination of such terms as marital satisfaction and marital adjustment from the literature (e.g., Trost, 1985). Rather than heeding such a call (and thereby trying to avoid the problem), we confront the conceptual challenge directly in this article. This article focuses on a partner’s subjective evaluation of a romantic relationship, and hence we prefer the term relationship quality. However, because such judgments have also been referred to as relationship satisfaction (Funk & Rogge, 2007), we use the terms relationship quality and relationship satisfaction interchangeably.




    In the first section, we provide a brief synopsis of current concerns regarding the construct of relationship quality, including the psychometric limitations of current measures. This serves as a springboard for introducing item response theory and demonstrating how it can advance measurement of relationship quality.¹ We then suggest a fundamental shift in how relationship quality is conceptualized, arguing that it might be more appropriately represented by two distinct but related dimensions: positive and negative relationship qualities. Given the limitations of self-report measures, in the final section, we describe how implicit measures might be used to evaluate positive and negative dimensions of relationship quality, offering insights into aspects of relationship functioning that subjects might be unaware of or unwilling to report. The article concludes by summarizing the critique offered and by highlighting avenues for future research.

    WILL THE REAL RELATIONSHIP SATISFACTION, ADJUSTMENT, SUCCESS, HAPPINESS, AND QUALITY PLEASE STAND UP?

    As indicated by the title of this section, numerous terms have been used to refer to what we earlier identified as the final common pathway to relationship breakdown. Terminological heterogeneity combined with disciplinary diversity means that relevant material is scattered across a variety of disparate sources, which makes it extremely difficult to access the picture of marriage painted by scientific research (Fincham, 1998, p. 543).

    Nonetheless, there is increasing recognition of two major approaches to the central construct studied by marital researchers. They focus on the relationship and on intrapersonal processes, respectively. The relationship or interpersonal approach typically looks at patterns of interaction such as companionship, conflict, and communication and tends to favor use of such terms as adjustment. In contrast, the intrapersonal approach focuses on individual judgments of spouses, namely their subjective evaluation of the marriage. This approach tends to use such terms as marital satisfaction and marital happiness. Both approaches are valuable ones, but problems arise because of another feature of research on relationship quality, to which we now turn.

    ¹ Although most of our observations are motivated by characteristics of marital research, we use the term relationship quality wherever appropriate because it is more inclusive.

    Relationship quality and relationship satisfaction (and similar constructs denoted by various synonyms) have been almost exclusively assessed using self-report. Ironically, even behaviorally oriented psychologists who rejected the utility of self-report when they began to study marriage systematically in the 1970s used self-reported satisfaction as a criterion variable in their studies (see Bradbury & Fincham, 1987). Indeed, a primary goal was to account for variability in such reports of relationship quality using coded observations of marital interactions. The problem is that the most widely used measures of relationship quality, the Marital Adjustment Test (MAT) (Locke & Wallace, 1959) and the Dyadic Adjustment Scale (DAS) (Spanier, 1976), include items that assess both interaction patterns (interpersonal processes) and subjective evaluations of the marriage (intrapersonal processes), thereby ignoring the conceptual distinction just outlined. Historically, it has been the case, then, that researchers study the same thing regardless of conceptual distinctions they might make when introducing their research.

    It is also questionable whether spouses are the best, or even good, reporters of relationship properties. Indeed, research using the Spouse Observation Checklist shows that, on average, spouses agree only about 50% of the time on the occurrence of a behavior, and as a result, the epistemological status of such reports has changed so that they are now viewed not as objective reports of behavior (as originally thought) but as subjective perceptions (for a review, see Bradbury & Fincham, 1987). This viewpoint is consistent with the construct of sentiment override (Weiss, 1980) whereby spouses are posited to respond to questionnaire items about the partner and marriage, not so much in terms of the items’ manifest content but in terms of their sentiment toward the partner. Self-report seems better suited to the second major approach to marital quality, which focuses on how spouses evaluate their marriage.²

    ² Clearly, some properties of the relationship can be obtained only from spouses (e.g., frequency of intercourse), but others may be beyond the awareness of all but the most psychologically sophisticated (e.g., the pattern of interaction during conflict).


    In light of the foregoing observations, it is perhaps little wonder that, early on, factor analytic approaches gave rise to the conclusion that “different operations designed to measure marital satisfaction converge and form one dimension” (Gottman, 1979, p. 5), a viewpoint supported by subsequent work that shows that standard measures of relationship satisfaction intercorrelate highly (e.g., Heyman, Sayers, & Bellack, 1994). But what are we to make of relationship quality assessed in this manner? Dahlstrom (1969) describes three levels at which responses to self-report inventories can be interpreted: (a) veridical descriptions of behavior (e.g., responses regarding frequency of disagreement reflect the actual rate of disagreement between spouses); (b) potential reflections of attitudes (e.g., frequently reported disagreement may reflect high rates of disagreement but may also reflect the view that the partner is unreasonable, that the spouse feels undervalued, or some other attitude); and (c) behavioral signs, the meaning of which can be determined only by actuarial data (e.g., rated disagreement may reflect time spent together, respondents’ self-esteem, frequency of intercourse, or a host of other variables). Few measures of relationship quality specify the level at which responses are to be interpreted.

    The summation of various dimensions of relationships in omnibus measures of relationship quality (e.g., interaction, happiness) also precludes meaningful study of the interplay of such dimensions (e.g., interaction may influence satisfaction, and vice versa). It has also given rise to another troubling feature of the literature on relationship quality, namely that our knowledge of the determinants and correlates of relationship quality includes (an unknown number of) spurious findings. This is because of overlapping item content in measures of quality and measures of constructs examined in relation to it. The often documented association between self-reported communication (e.g., Marital Communication Inventory; “Do the two of you argue a lot over money?” “Do you and your spouse engage in outside activities together?”) and relationship quality (e.g., DAS; “Indicate the extent of agreement or disagreement between you and your partner on: handling family finances”; “Do you and your mate engage in outside interests together?”) is a particularly egregious example of this problem. The resulting tautological association hinders theory construction and affects the credibility of research findings. Funk and Rogge (2007) offered evidence to support the cross-contamination of communication and relationship quality measures, identifying 13 items from the MAT and DAS that correlated more strongly with the communication factor (extracted from a principal components analysis of 176 satisfaction and communication items in a sample of 5,315 respondents) than they did with the satisfaction factor. Fincham and Bradbury (1987) discuss the dilemma caused by overlapping item content at some length, showing that exclusion of the items common to both measures does not provide a satisfactory solution to this problem, as they usually reflect overlap in the definition of the constructs.

    Perhaps the most important issue confronting the field is that the link between conceptual distinctions and measurements is not as strong as it might be, thereby hindering theory development. But this is hardly surprising when, in the broader marital literature, the association between theories and research tends to be loose and imprecise and, in some cases, constitutes only a metaphorical connection (Fincham & Beach, 1999, p. 55). As might be expected, this has an impact at the level of assessment. It is therefore important for assumptions underlying the measurement of relationship quality to be made explicit and questioned. For example, Spanier (1976) eliminated items from his influential measure (the DAS) that were positively skewed, thereby assuming that items reflective of relationship quality approximate a normal distribution. But as Norton (1983) pointed out, such items may be less critical indicators or even irrelevant to relationship quality if relationship quality inherently involves skewed data because spouses tend to report happy marriages. Moreover, if the outcome predicted by relationship quality is itself skewed (e.g., aggression), then a skewed predictor may be best (Heyman et al., 1994).

    Some time ago, a leading scholar concluded that the “psychometric foundation is reasonably solid and need not be redone” (Gottman & Levenson, 1984, p. 71). The basis for such a conclusion appears to be that different measures of relationship quality intercorrelate highly, thus suggesting that differences in item content across measures are relatively unimportant (e.g., Funk & Rogge, 2007; Heyman et al., 1994).

    Such conclusions are quite reasonable for some research purposes. For instance, they suffice if the goal is to select happy or satisfied versus unhappy or dissatisfied spouses, as is often done in clinical research on marriage. Here the exact content of the measure used to select groups is less important than its ability to identify correctly the groups of interest. In fact, it was precisely this type of criterion keying (identifying items optimally distinguishing between distressed and nondistressed couples) that was used to create the two most widely used and cited measures of relationship satisfaction: the MAT and DAS scales. From an actuarial point of view, such a taxometric approach has merit, as it can effectively create a tool to identify risk groups. However, to the extent that one’s goal is to develop theory for advancing understanding of relationship quality as a continuous construct or to devise conceptually sound measures of relationship quality across all ranges of functioning, the above approach is less appropriate.

    In light of the foregoing observations, we argue that current conceptions and operationalizations of relationship quality are inadequate. Much of the conceptual confusion regarding relationship quality appears to be based on the assumption that constructs related at the empirical level are equivalent at the conceptual level. This can lead to a problem that is demonstrated by considering the example of height and weight. Those two dimensions correlate to about the same degree as many measures of relationship quality, yet much is gained by keeping height and weight separate. Imagine designing a door frame having only a composite measure of the “bigness” of users and not their height! Keeping empirical and conceptual levels of analyses separate has the advantage of forcing the researcher to articulate the nature of the construct and the domain of observables to which it relates before developing measures of the construct. Such practices are likely to facilitate theoretical development and the construction of more easily interpreted measures of relationship quality.

    Mindful of the conceptual problems encountered in using omnibus measures of relationship quality, we build on previous work that has defined relationship quality as subjective, global evaluation of the relationship (e.g., Fincham & Bradbury, 1987; Norton, 1983; Schumm, Nichols, Schectman, & Grigsby, 1983). The strength of this approach is its conceptual simplicity, as it avoids the problem of interpretation and allows for unambiguous specification of the construct’s nomological network. Because it has a clear-cut interpretation, the approach allows the antecedents, correlates, and consequences of relationship quality to be examined in a straightforward manner.

    The remainder of the article builds on this prior work in three ways. First, with one exception, relationship satisfaction indices have been developed using classical test theory and therefore fail to take advantage of recent developments in psychometrics. This is important because, as Funk and Rogge (2007) have noted, extant scales of relationship quality have problematic levels of noise in measurement, reducing their power to reveal meaningful results and thereby hindering theory development. The next section therefore introduces a more recent development in psychometrics, item response theory (IRT), also known as latent trait theory, and illustrates its application to the assessment of relationship quality. After doing this, we suggest a fundamental shift in how relationship quality is conceptualized, arguing that it might be more appropriately represented by two distinct but related dimensions: positive and negative relationship qualities. This mirrors robust findings in the affect literature exemplified by scales like the Positive Affect Negative Affect Schedule (PANAS) (Watson, Clark, & Tellegen, 1988) and the Mood and Anxiety Symptom Questionnaire (MASQ) (Watson et al., 1995), suggesting that the experience of positive and negative affect or distress and vitality are substantively distinct phenomena. We argue that individuals in romantic relationships might similarly experience both positive and negative feelings toward their relationships somewhat independently, and therefore constraining the assessment of relationship quality to a single dimension could obscure results and oversimplify theories. In the fourth section, we turn to consider the limitations of self-report measures. Although the bulk of relationship research has made use of self-report scales to assess relationship quality, that method of assessment is, by definition, limited by subjects’ own awareness of and insight into their true level of relationship quality. Self-report measures have also been shown to be affected by a number of common biases, such as impression management and motivated distortion (e.g., Stone et al., 2000). Thus, in the fourth section, we examine how positive and negative relationship evaluations might be explored using implicit measures, thus offering the possibility of accessing information on relationship quality of which subjects might not be fully aware or willing to fully report. Such information has the potential to enrich theory.

    PLACING RELATIONSHIP QUALITY IN THE CONTEXT OF MODERN TEST THEORY: ENTER ITEM RESPONSE THEORY

    To explain how item response theory (Hambleton, Swaminathan, & Rogers, 1991) can shed light on the assessment of relationship quality, and thereby help promote theory development, we first need to provide a brief description of IRT. The field of standardized testing has long used IRT to craft nonidentical but equivalent forms of tests evaluating academic ability and competency. Although its use is limited primarily by the exceedingly large sample sizes required, and to a lesser extent by the complexity of the calculations involved, IRT offers several important advantages over classical test theory approaches.

    One advantage is that when an item is evaluated with IRT in a sufficiently large and diverse sample, the results obtained can be expected to replicate almost identically in all future samples. This provides insight into how that item will perform across a range of situations and clarifies exactly how much information it will provide for assessing the construct of interest (θ in IRT). This also means that a score for a measure developed using IRT should have an identical meaning across samples. A second advantage offered by IRT is that it assumes that the utility of individual items varies by levels of the construct being assessed (θ). Put simply, this means that some items might be highly effective at assessing low levels (e.g., 2 SD to 1 SD below the population mean) of a construct like relationship satisfaction. Conversely, other items might offer little information for people in that range but could offer large amounts of information for assessing higher levels (e.g., +1 SD to +2 SD) of that same construct. Although the possibility of such differences has long been recognized in the measurement literature, classical test theory techniques like factor loadings, item-to-total correlations, squared multiple correlations, and Cronbach’s alpha coefficients are simply unable to reveal such differences or to quantify them as clearly as IRT. Finally, IRT is able to synthesize the results into information profiles for each item (called item information curves, or IICs) that reveal how much information an item provides for assessing the construct of interest at various levels of that construct. In this framework, the standard error of measurement is the inverse square root of the information curve. Thus, by quantifying the information provided by each item at various levels of θ, IRT also quantifies the noise in measurement at various levels of θ. As a result, IRT offers a powerful technique for evaluating the precision of measurement afforded by individual items as well as sets of items (or scales).
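    The relation between item information and measurement error can be made concrete with a minimal sketch. It assumes the simplest dichotomous case, a two-parameter logistic (2PL) item, rather than the Likert-type items used in relationship quality scales; the function names and the three item parameter pairs are hypothetical and chosen only for illustration.

```python
import numpy as np

def item_information(theta, a, b):
    """Information of a 2PL item with discrimination a and location b."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))  # probability of endorsing the item
    return a**2 * p * (1.0 - p)

def standard_error(theta, items):
    """Standard error of measurement: 1 / sqrt(summed item information)."""
    total = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 / np.sqrt(total)

# Hypothetical items: (discrimination, location on the theta scale)
low_item = (2.0, -1.5)   # informative for low levels of the construct
high_item = (2.0, 1.5)   # informative for high levels of the construct
weak_item = (0.4, 0.0)   # weakly related to the construct everywhere

theta = np.linspace(-3, 3, 7)
for name, (a, b) in [("low", low_item), ("high", high_item), ("weak", weak_item)]:
    print(name, np.round(item_information(theta, a, b), 3))
print("SEM", np.round(standard_error(theta, [low_item, high_item, weak_item]), 3))
```

    Under these assumptions, each item's information peaks at its own location and falls away on either side, and the weakly discriminating item contributes little at any level of θ, which is the pattern the item information curves are meant to expose.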

    To explain how IRT accomplishes all of this, it is necessary to first explain the basic mechanics of IRT analyses. An IRT analysis begins by estimating latent scores on the construct of interest for each individual in a sample (θ scores). These would be the equivalent of GRE or SAT scores. Then, on the basis of those estimates, IRT estimates a set of parameters for each item in the analysis (α and β coefficients). When evaluating each item, IRT is simply looking to see whether higher θ scores are associated with respondents’ selection of higher response choices on that item. To the degree that higher θ scores are tightly linked to higher response choices so that there are very sharp boundaries between response choices, an item is considered highly informative. If, by contrast, responses on the item seem to be relatively unrelated to θ scores, then the item would be deemed to offer little information for assessing θ. After estimating item parameters for all of the items (and therefore shedding light on which items offer the most information), IRT starts another iterative cycle by reestimating θ scores for each individual in the sample on the basis of the new item parameters, giving greater weight to the items offering greater amounts of information. With those new θ scores, IRT then reestimates the item parameters. This iterative process stops when both the θ scores and the item parameters stabilize. The appendix describes this process in more detail for dichotomous and Likert scale items.
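    The alternating cycle described above can be sketched in a few lines. This is a simplified illustration, not the estimation routine used in the analyses discussed in this article: it assumes dichotomous items under a two-parameter logistic model, uses plain gradient steps on the joint likelihood, and runs on simulated responses. The function name `fit_2pl`, the step size, the cycle count, and all "true" parameter values are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_2pl(responses, n_cycles=500, step=0.05):
    """Alternate between item-parameter and person-score updates (2PL sketch)."""
    n_people, n_items = responses.shape
    totals = responses.sum(axis=1)
    theta = (totals - totals.mean()) / totals.std()  # initial scores from sum scores
    alpha = np.ones(n_items)                         # item discrimination
    beta = np.zeros(n_items)                         # item location
    for _ in range(n_cycles):
        # Step 1: update the item parameters, holding the theta scores fixed.
        centered = theta[:, None] - beta
        resid = responses - sigmoid(alpha * centered)
        new_alpha = alpha + step * (resid * centered).mean(axis=0)
        beta = beta - step * (alpha * resid).mean(axis=0)
        alpha = np.clip(new_alpha, 0.05, 4.0)
        # Step 2: update the theta scores, holding the item parameters fixed.
        # Residuals are weighted by alpha, so more discriminating items count more.
        resid = responses - sigmoid(alpha * (theta[:, None] - beta))
        theta = theta + step * (resid * alpha).mean(axis=1)
        theta = (theta - theta.mean()) / theta.std()  # fix the scale of theta
    return theta, alpha, beta

# Simulated data only: 500 respondents, 8 items of varying quality.
rng = np.random.default_rng(0)
true_theta = rng.normal(size=500)
true_alpha = np.array([0.3, 0.3, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0])
true_beta = np.array([0.0, 0.5, -1.0, 0.0, 1.0, -1.0, 0.0, 1.0])
prob = sigmoid(true_alpha * (true_theta[:, None] - true_beta))
responses = (rng.random(prob.shape) < prob).astype(float)

theta_hat, alpha_hat, beta_hat = fit_2pl(responses)
print("correlation with simulated theta:", np.corrcoef(theta_hat, true_theta)[0, 1])
print("estimated discriminations:", alpha_hat.round(2))
```

    The sketch runs a fixed number of cycles instead of checking for stabilization, and dedicated IRT software relies on more refined estimators; the point is only to show the two alternating steps and the heavier weighting of informative items.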

    Application to Relationship Quality

    In terms of assessing relationship quality, IRT offers a number of exciting possibilities. First and foremost, IRT offers the chance to quantify the precision of measurement (lack of noise) offered by current relationship quality scales. Although 40 years of converging data offer strong evidence that measures like the DAS and MAT are indeed measuring relationship quality, very little attention has been given to determining how precisely or accurately they assess that construct. This would be equivalent to doing 40 years of research studying fever medications using the same one or two brands of thermometers without knowing whether they were accurate to 0.1 degrees or 10 degrees. As long as the thermometers were indeed measuring temperature, researchers should still get converging results. However, if researchers were using thermometers that were accurate to only 10 degrees, it would take considerably larger sample sizes to discover reliable patterns of change over time and such extreme noise in measurement would likely obscure significant and meaningful results in smaller samples. An important casualty of such circumstances is likely to be theory development. In many ways, couples researchers find themselves in precisely this position, and although research using scales like the MAT and DAS has undoubtedly been fruitful, if those scales were to have notably low levels of precision (high measurement noise), then the countless significant results that excessive noise was likely to have masked would outweigh the information gained by using those measures.
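    The cost of a noisy thermometer can be illustrated with two standard results from classical psychometrics and power analysis: the attenuation formula for correlations and the Fisher z approximation to the sample size needed to detect a correlation. The sketch below is illustrative only; the function names and the reliability values are hypothetical and are not estimates for any published scale.

```python
import math
from statistics import NormalDist

def observed_correlation(true_r, rel_x, rel_y=1.0):
    """Attenuation formula: measurement noise shrinks the observable correlation."""
    return true_r * math.sqrt(rel_x * rel_y)

def required_n(r, alpha=0.05, power=0.80):
    """Approximate n to detect correlation r (two-sided test, Fisher z method)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / math.atanh(r)) ** 2 + 3)

true_r = 0.30  # hypothetical true association with relationship quality
for reliability in (0.95, 0.80, 0.60):  # hypothetical scale reliabilities
    r_obs = observed_correlation(true_r, reliability)
    print(f"reliability={reliability:.2f}  observed r={r_obs:.3f}  n needed={required_n(r_obs)}")
```

    Under these assumptions, lowering the reliability of the measure leaves the underlying association untouched but raises the sample size needed to detect it, which is the sense in which imprecise scales can mask real effects in small samples.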

    To address this issue, IRT analyses were applied to the items of existing relationship quality scales (e.g., MAT, DAS, Norton’s [1983] Quality of Marriage Index [QMI], Hendrick’s [1988] Relationship Assessment Scale [RAS], Schumm et al.’s [1983] Kansas Marital Satisfaction Scale [KMS], and a 15-item Semantic Differential [SMD] [Karney & Bradbury, 1997]) in a sample of 5,315 online respondents (Funk & Rogge, 2007). Although the IICs suggested that a number of items from existing scales offered high levels of information for assessing relationship quality, many of the items provided notably low levels of information, indicating that responses to those items were relatively unrelated to relationship quality and would therefore primarily contribute error variance or noise to the scales that used them. This was borne out in the test information curves, as the 32-item DAS seemed to offer little more information than the 6-item QMI, and the 15-item MAT offered no more information than a 4-item version of the DAS. Thus, the IRT analyses suggested that the two most widely used and cited measures of relationship quality, the MAT and DAS, had markedly high levels of measurement noise.

    In addition to offering a powerful tool for evaluating current relationship quality scales, IRT also offers the possibility of developing psychometrically optimized scales. Given the high levels of noise in the MAT and DAS, Funk and Rogge (2007) created the Couples Satisfaction Index (CSI) by using a combination of exploratory factor analyses and IRT analyses on a pool of 140 items to identify the unidimensional, nonredundant sets of 32, 16, and 4 items offering the greatest information (lowest noise) for assessing relationship quality. The CSI scales offered identical patterns of correlation with anchor scales from the nomological net to those obtained with the MAT and DAS and demonstrated appropriately high levels of correlation with scales like the MAT and DAS (all correlations greater than .87), which suggests that they were still assessing the same construct of relationship quality. In fact, all of the measures of relationship quality demonstrated exceedingly high levels of correlation with one another and comparable patterns of correlation with anchor scales. Thus, at a correlational level (in a large and diverse sample), the measures seemed completely interchangeable. However, Funk and Rogge (2007) were able to demonstrate that the increased information offered by the CSI scales translated into increased precision (decreased noise) and markedly higher levels of power for detecting group differences than the MAT and DAS.

    The psychometric analyses generating the CSI scales also shed light on the theoretical underpinnings of the construct of relationship quality. As both the MAT and DAS were constructed primarily using criterion keying (selecting items optimally separating distressed from nondistressed couples), the resultant scales had markedly heterogeneous item content, which leads to theoretical uncertainty in the boundaries of the construct. In contrast, by using statistical techniques more appropriate to evaluating and developing measures of continuous constructs, the IRT analyses presented in Funk and Rogge (2007) identified a more homogeneous set of items for the CSI scales. Specifically, items identified as most informative by the IRT analyses also happened to be the items most prototypical of the global evaluative dimension (e.g., the top four items included “Please indicate the degree of happiness, all things considered, of your relationship,” “I have a warm and comfortable relationship with my partner,” “How rewarding is your relationship with your partner?” and “In general, how satisfied are you with your relationship?”). This suggests that respondents’ methods of responding to relationship quality items align most directly with the more focused theoretical definition of this construct discussed earlier. Thus, the development of the CSI scales offers relationship researchers a much better set of thermometers for evaluating relationship quality. Moreover, the reduced measurement error associated with those scales offers the possibility of having greater power to detect theoretically meaningful results, particularly in small samples. The findings offer a direct challenge to the earlier cited assertion that for measures of relationship quality the “psychometric foundation is reasonably solid and need not be redone” (Gottman & Levenson, 1984, p. 71). Instead, for the past 40 years, measurement noise has been a serious problem lurking underneath the seemingly robust and convergent findings of studies using the DAS and MAT. This lack of precision provides another reason that helps account for the relative lack of theoretical development in the marital literature.

    BROADENING OUR HORIZONS: A TWO-DIMENSIONAL CONCEPTUALIZATION OF RELATIONSHIP QUALITY

    The bulk of couple research has assumed that relationship quality represents a single bipolar dimension ranging from extreme dissatisfaction to extreme satisfaction. However, Fincham and Linfield (1997) challenged this assumption, arguing that individuals might be able to simultaneously hold both negative and positive sentiments toward romantic partners, just as individuals can experience both positive and negative affect at the same time. Fincham and Linfield went on to hypothesize that assessing the two dimensions independently of each other would provide additional information on current relationship functioning that could not be obtained from unidimensional measures like the MAT and DAS. To test this, Fincham and Linfield developed two 3-item scales, one for each dimension: the Positive Marital Quality (PMQ) and the Negative Marital Quality (NMQ) scales. To enhance the distinction between the two dimensions, the beginning of each item asked respondents to consider only the dimension they were evaluating (e.g., "Considering only the positive qualities of your spouse, and ignoring the negative ones, evaluate how positive these qualities are"). Using a sample of 123 married couples, the authors demonstrated that PMQ and NMQ each accounted for unique variance in self-reports of conflict behavior and attributions even after controlling for MAT scores. Thus, their results suggested that new and useful information was gained by disentangling the assessment of positive sentiment toward a relationship from negative sentiment toward that same relationship. Mattson, Paldino, and Johnson (2007) have also shown the utility of assessing positive and negative quality separately using the PMQ and NMQ among engaged couples.

    Extending these results to the evaluation of treatment effects over time, Rogge et al. (2010) examined linear change in relationship quality over 3 years using the MAT, PMQ, and NMQ in a sample of 174 couples who had received either no treatment (NoTx, n = 44), the Prevention and Relationship Enhancement Program (PREP) (Jacobson & Margolin, 1979; n = 45), the Compassionate and Accepting Relationships Through Empathy program (CARE) (Rogge, Cobb, Johnson, Lawrence, & Bradbury, 2002; n = 52), or an intervention designed to increase couples' awareness of their own relationship behaviors without teaching them any specific skills (AWARENESS; n = 33). When using the MAT, both husbands and wives in all four groups demonstrated drops in quality over the 3 years following the interventions, and couples in the treatment groups failed to demonstrate significant differences from couples in the NoTx group. When using the NMQ to model change in negative relationship qualities over time, the analyses also failed to identify any differences between the couples receiving treatment and those in the NoTx group. However, when using the PMQ scores to model linear change in positive relationship qualities over time, couples in the NoTx group demonstrated significantly sharper declines in positives than did couples in all three active treatment groups (PREP, CARE, and AWARENESS). The results suggest that positive relationship evaluations can change over time independently of negative evaluations and that using only a global measure of quality like the MAT might have obscured meaningful treatment results.

    Extending this work further, Rogge and Fincham (2010) developed optimized measures of positive and negative relationship quality using a combination of exploratory factor analyses and IRT in a sample of more than 1,600 college students. The authors asked respondents to rate their relationships on separate sets of 20 positive (e.g., enjoyable, pleasant, alive) and 20 negative (e.g., bad, empty, lifeless) adjectives, giving similar instructions to those that Fincham and Linfield (1997) used (e.g., "Considering only the positive qualities of your relationship and ignoring the negative ones, evaluate your relationship on the following qualities"). Factor analyses supported two dimensions of evaluation that were moderately correlated with one another. The IRT analyses were used to identify the items most effective for assessing positive qualities (PRQ) and the items most effective for assessing negative qualities (NRQ). Hierarchical regression analyses showed that the PRQ-4 and NRQ-4 offered unique information beyond a four-item measure of global relationship quality (CSI-4) for understanding self-reported positive interactions, negative interactions, satisfaction with sacrifice, vengefulness toward partner, hostile conflict behavior, and disagreement tolerance. Furthermore, the PRQ-4 and NRQ-4 displayed distinct patterns of validity within those regressions, with the NRQ-4 being more strongly related to outcomes such as vengefulness and hostile conflict behavior and the PRQ-4 being more strongly related to satisfaction with sacrifice and disagreement tolerance.
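    The logic of a hierarchical regression of this kind can be sketched as follows: fit the outcome on the global measure alone, then add the positive and negative scores and examine the gain in explained variance. The data below are invented for illustration, the variable names are hypothetical, and the small least-squares routine stands in for whatever statistical software would be used in practice.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(predictors, y):
    """R^2 from an ordinary least squares fit of y on the predictor columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Invented scores for ten respondents (not real data).
global_quality = [10.0, 14.0, 12.0, 18.0, 16.0, 9.0, 15.0, 20.0, 11.0, 17.0]
positive = [12.0, 15.0, 18.0, 20.0, 14.0, 10.0, 19.0, 21.0, 9.0, 16.0]
negative = [8.0, 5.0, 9.0, 3.0, 4.0, 10.0, 7.0, 2.0, 6.0, 5.0]
hostile_conflict = [7.0, 5.0, 8.0, 2.0, 4.0, 9.0, 6.0, 1.0, 6.0, 4.0]

step1_r2 = r_squared([global_quality], hostile_conflict)                      # Step 1: global measure only
step2_r2 = r_squared([global_quality, positive, negative], hostile_conflict)  # Step 2: add both dimensions
delta_r2 = step2_r2 - step1_r2                                                # incremental variance explained
print(f"Step 1 R^2 = {step1_r2:.3f}, Step 2 R^2 = {step2_r2:.3f}, delta = {delta_r2:.3f}")
```

    A nonzero gain at Step 2 is what "unique information beyond" the global measure refers to; a full analysis would also test whether that gain is statistically significant.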

    The results continue to suggest that positive and negative relationship qualities represent two distinct (albeit related) dimensions of relationship quality, each with its own unique information to contribute in attempts to understand relationship functioning and relationship behavior. Unfortunately, such distinctions have been obscured by the widespread assumption in the existing literature that positive and negative relationship evaluations are simply opposite points on a single dimension and can therefore be assessed with a single scale. As with the concerns raised by the noise in measurement of the MAT and DAS, only time will reveal how forcing negative and positive evaluations onto a single scale might have obscured many potentially interesting and informative results.
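    A toy example shows how a single scale can hide such distinctions. Here two hypothetical respondents, one high on both dimensions and one low on both, receive the same score once the dimensions are collapsed. The 1 to 7 response scale, the midpoint cut, and the labels are assumptions made for the illustration and are not drawn from any of the measures discussed.

```python
from dataclasses import dataclass


@dataclass
class Evaluation:
    positive: float  # positive-quality score on a hypothetical 1-7 scale
    negative: float  # negative-quality score on the same hypothetical scale


def bipolar_score(e: Evaluation) -> float:
    """Collapse both dimensions onto one positive-minus-negative scale."""
    return e.positive - e.negative


def profile(e: Evaluation, midpoint: float = 4.0) -> str:
    """Describe a respondent on each dimension separately (the cut point is an assumption)."""
    pos = "high" if e.positive >= midpoint else "low"
    neg = "high" if e.negative >= midpoint else "low"
    return f"{pos} positive / {neg} negative"


high_on_both = Evaluation(positive=6.0, negative=6.0)
low_on_both = Evaluation(positive=2.0, negative=2.0)

for label, person in (("high on both", high_on_both), ("low on both", low_on_both)):
    print(label, "| bipolar score:", bipolar_score(person), "| profile:", profile(person))
```

    Both respondents land at the same point on the collapsed scale, yet their two-dimensional profiles differ.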

    From a theoretical perspective, this bidimensional conceptualization has important implications. For example, Fincham and Linfield (1997) have already shown it can be used to identify two groups of spouses (those high on both dimensions vs. those low on both dimensions) who behave differently but are indistinguishable on unidimensional measures of marital quality (scoring in the midrange). It also has the potential to yield a richer picture of paths toward relationship distress. For instance, a decrease in positive evaluation that precedes an increase in negative evaluation may be quite different from one in which both processes occur in tandem or one in which the negative increases first and is followed by a decrease in the positive. Finally, it is possible that relationship processes have distinct impacts on positive versus negative evaluations of relationships. Indeed, links between relationship processes and positive or negative relationship evaluations might vary by the relative strength of those positive and negative evaluations. In fact, such theoretical distinctions might help explain seemingly conflicting results, such as findings that hostile conflict behavior can be associated with poorer relationship quality over time (e.g., Rogge & Bradbury, 1999) or with improved quality over time (e.g., Gottman & Krokoff, 1989) in assessments of relationship quality with a unidimensional construct. Only by expanding the theoretical conceptualization to a two-dimensional model of relationship evaluations would it be possible to test such possibilities. Regardless of the outcome, there are problems with self-report that need to be addressed, an issue to which we now turn.

    THE LIMITS OF SELF-REPORT: ENTER IMPLICIT MEASURES

    The limitations of self-report have been extensively documented (e.g., Stone et al., 2000). Such limitations include impression management, motivated distortion, and the limits of self-awareness. The first has been widely recognized in marital research and has given rise to the development of a measure of social desirability that is specific to marriage, the Marital Conventionalization Scale (Edmonds, 1967). This scale contains items that describe the marriage in an impossibly positive light, portraying the marriage as perfect and meeting the respondent's every need. It correlates strongly (in the .50–.73 range) with numerous measures of marital adjustment and satisfaction (Fowers & Applegate, 1996). Edmonds (1967) argued that the social desirability bias in responses to assessment of marital satisfaction was unconscious and unintended and therefore involved "fooling oneself rather than fooling others" (p. 682). Although researchers have tried to control for this contaminant in the assessment of marital quality using Edmonds's (1967) scale, increased concern about what it actually measures (Fowers & Applegate, 1996) renders such efforts moot and stresses the need for an alternative approach to this problem.

    Motivated distortion and the limits of self-awareness in self-report have received relatively less attention in marital and family research. The failure to come to terms with nonconscious processes and instead assume that spouses have access to whatever we ask them about is an important limitation of the marital literature. This assumption has been thoroughly repudiated in the literature on social cognition and may account for the dramatic expansion of implicit measures in behavioral and social science in recent years (see Wittenbrink & Schwarz, 2007). Implicit measures aim to assess attitudes (or constructs) that respondents may not be willing to report directly or of which they may not even be aware. Such measures provide an index of the construct of interest without directly asking for verbal reports and are therefore likely to be free of social desirability biases.

    The two major implicit measures used in research are priming methods and the Implicit Association Test (IAT) (Greenwald, McGhee, & Schwartz, 1998). Priming methods focus on automatic activation of evaluation associated with the primed stimulus. This produces a processing advantage for evaluatively congruent targets and a disadvantage for evaluatively incongruent targets, as the response suggested by the prime must be inhibited (for a review, see Fazio, 2001). By measuring response latency to targets following priming, we gain information about the evaluation of the primed object. In a similar vein, the IAT, which involves sorting words, assumes that "if two concepts are highly associated, the IAT's sorting tasks will be easier when the two associated concepts share the same response than when they require different responses" (Greenwald & Nosek, 2001, p. 85). We turn to consider such sorting tasks.

    WORD-SORTING ASSOCIATION TASKS ASSESSING IMPLICIT ATTITUDES

    The IAT and its derivative, the Go/No-Go Association Task (GNAT) (Nosek & Banaji, 2001), have been used to assess constructs thought to be heavily influenced by self-report biases, such as racial stereotypes, self-esteem, and psychopathology (see De Houwer, 2002; Greenwald & Farnham, 2000; Mitchell, Nosek, & Banaji, 2003). In the IAT, participants are presented with four types of stimuli, one at a time in random order, and are asked to categorize those stimuli into a left- or right-hand response. For example, respondents might be asked to respond with the left hand for "good" words and the right hand for "bad" words while simultaneously being asked to sort words from the two opposing target categories into right-hand and left-hand categories. By alternately pairing stimuli from each target category with "good" and "bad" response options across different blocks of trials and then looking for relative differences in speed of performance, it is possible to quantify implicit attitudes. For example, Greenwald et al. (1998) measured implicit prejudice against Black people by subtracting relative differences in response latencies from a stereotype-compatible condition (grouping Black names with "bad" words and White names with "good" words) from relative differences in response latencies from a stereotype-incompatible condition (Black with "good" and White with "bad"). In this example, larger discrepancies in performance between the two conditions (e.g., notably faster performance on the stereotype-compatible trials and poorer performance on the stereotype-incompatible trials) would suggest stronger implicit stereotypes.
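    In its simplest form, the comparison between conditions reduces to a difference in mean response latency. The sketch below uses invented latencies in milliseconds and a hypothetical function name; published IAT scoring procedures typically add steps such as handling of error trials and standardization that are omitted here.

```python
from statistics import mean


def iat_effect(compatible_ms, incompatible_ms):
    """Mean latency in the incompatible block minus mean latency in the compatible block."""
    return mean(incompatible_ms) - mean(compatible_ms)


# Invented response latencies (milliseconds) for one respondent.
compatible_block = [612, 655, 701, 640, 688]
incompatible_block = [790, 845, 910, 802, 873]

effect = iat_effect(compatible_block, incompatible_block)
print(f"IAT effect: {effect:.1f} ms (larger values indicate a stronger implicit association)")
```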

    Although the IAT in its original form offers a powerful methodology for assessing implicit attitudes, it is somewhat limited in that it assesses those implicit attitudes as a contrast between two opposing target categories (e.g., Black vs. White, flowers vs. insects). This makes the implicit attitudes assessed specific to the contrasting categories used, and contradictory implicit attitudes can be obtained for the same target category by changing the contrasting category used in the task (e.g., Mitchell et al., 2003). Thus, different results could be expected when assessing implicit attitudes toward romantic partners if those attitudes are assessed using an IAT that contrasts partners with selves, partners with strangers, or partners with friends. In contrast, the GNAT does not require a contrasting category to assess implicit sentiment toward a target.

    The GNAT is a highly similar word-sorting task in which respondents are presented with a mixture of stimuli, one at a time in random order, and are asked to sort them by pressing a key when target words appear (a "go" response) and to refrain from pressing the key when distracter words appear (a "no-go" response). By alternately pairing good or bad words with words specific to the category of interest (e.g., romantic partner) in separate blocks of trials and measuring how quickly and accurately a subject responds, it is possible to assess the subject's implicit attitude toward that category. In a series of six studies examining the validity of the GNAT as an alternative to the more traditional IAT, Nosek and Banaji (2001) demonstrated that the GNAT yielded differences in performance comparable to those found with the IAT when contrasting a category with generally positive implicit associations (fruit) with a category with generally negative implicit associations (bugs). The authors were also able to demonstrate that the GNAT could be used to identify such positive or negative implicit attitudes toward a single target category without the use of a contrasting category, offering a significant practical advantage over the IAT. Nosek and Banaji (2001) further demonstrated that the GNAT could be set to use either response speed (response latencies) as the primary index of performance or response accuracy (during a much faster and more difficult task) as the primary index of performance, achieving comparable results with both paradigms. Recently, the GNAT showed predictive validity in measuring fear of spiders (Teachman, 2007).

    ASSESSMENTS OF IMPLICIT ATTITUDES IN THE RELATIONSHIP LITERATURE

    Although not absent from marital research, studies making use of word-sorting tasks to assess implicit attitudes are a rarity. Implicit measures were first introduced to this field in a study that sought to use response latency or reaction time to show that time taken to make evaluative judgments of the partner and the marriage moderated the relation between relationship quality and expected partner behavior (Fincham, Garnier, Gano-Phillips, & Osborne, 1995). In one task, respondents were asked to sort a set of 48 words including four partner stimuli (e.g., "your partner," "your spouse," partner's first name) into good and bad categories, and response times for the partner stimuli were considered an implicit measure of relationship satisfaction accessibility. In a second task, response latencies were measured for 7 self-report Likert items assessing relationship quality embedded within a larger set of 22 items. The results indicated that compared to slow responders, fast responders on either task showed significantly higher correlations of self-reported marital quality and expected partner behavior in an upcoming interaction. This suggested that knowing a person's accessibility to their relationship quality at an implicit level might be associated with greater insight into their relationship behaviors and dynamics. Even though this innovative study drew significant attention at the time (e.g., Baucom, 1995; Beach, Etherton, & Whitaker, 1995), the tasks used confounded the implicit assessment of underlying sentiment (through response latencies) with the conscious (self-reported) assessment of relationship quality, thereby potentially obscuring the unique information that an implicit assessment of attitudes toward a romantic partner might provide. The findings were also somewhat constrained by the fact that the entire assessment relied on a small set (four or seven) of critical trials. This stands in contrast to the 140 critical trials typically afforded by IAT or GNAT paradigms.

    Implicit assessments have also appeared in the literature examining the inclusion of others in the self. Aron, Aron, Tudor, and Nelson (1991) asked respondents to rate themselves and their partners on a list of 90 stimuli. They then measured reaction times as the respondents sorted those 90 stimuli into "me" or "not me" categories. They argued that individuals who had incorporated their romantic partner into their own self-image should have shown delays on the subset of stimuli for which they rated themselves as different from their partners (and correspondingly faster times on the stimuli for which they rated themselves as similar to their partners). Thus, the authors used performance on those two types of stimuli in the me and not-me task as an implicit assessment of cognitive interdependence or closeness. The results supported their hypotheses, and the authors were able to demonstrate that higher levels of implicit closeness were correlated with higher levels of self-reported closeness. Extending those results, Aron and Fraley (1999) demonstrated that implicit closeness as assessed with the me and not-me task helped predict changes in self-reported closeness over 3 months as well as relationship breakup over that same interval. Unfortunately, the me and not-me task was also limited by the small number of stimuli that fell into the two critical categories for individual respondents, with some respondents having no stimuli or one stimulus falling into the self-different-from-partner categories.

    More recently, the IAT was used to measure implicit attitudes toward romantic partners in two published studies. Zayas and Shoda (2005) used the name that respondents used to refer to their romantic partner as the category label (e.g., "John") and the contrasting category label (e.g., "Not-John") for the two target categories to be alternately paired with "good" and "bad" responses. They then used a set of unique descriptive words (e.g., nickname, hair color, city of birth) as partner stimuli and a set of words unassociated with each respondent's partner as not-partner stimuli to be mixed in with the "good" and "bad" stimuli. Their results demonstrated that, at a cross-sectional level, more positive implicit attitudes as assessed by the partner-IAT were associated with higher levels of secure attachment (a healthy internal working model of romantic relationships) and lower levels of attachment avoidance (discomfort with emotional closeness and intimacy). Banse and Kowalick (2007) used a similar type of partner IAT to examine its association with concurrent relationship quality and well-being. They recruited women who were living in a shelter, women who were hospitalized because of pregnancy complications, women who had recently fallen in love, and female students. Their results showed that abused women demonstrated more negative implicit attitudes toward partners and that positive implicit attitudes were associated with higher levels of secure attachment across all women in the sample. Unexpectedly, implicit attitudes were not significantly associated with self-reported relationship quality.

    Extending this work, Lee, Rogge, and Reis (in press) sought to develop an implicit measure of attitudes toward romantic partners independent of explicit (self-report) evaluations and without the need for a contrasting category to that of partner. The authors also sought to validate this implicit measure over a much longer time frame, during which greater amounts of relationship change could be expected to occur. Finally, extending the groundbreaking work with self-report measures of relationship quality, the authors sought to disentangle positive and negative implicit attitudes toward a romantic partner, examining them as separate dimensions with potentially separate predictive validities. Thus, Lee, Rogge, and Reis (in press) examined the unique validity of a partner-focused GNAT to predict relationship dissolution over 12 months across two studies. They asked respondents to provide three stimuli specific to their romantic partners (e.g., name, nickname, distinguishing characteristic). The respondents then completed a GNAT in which the partner stimuli were alternately paired with "good" and "bad" words as the target stimuli across two separate blocks of 70 trials. Given some of the inherent limitations of using reaction time data (e.g., extreme skew, high noise-to-signal ratios), the task was set up as a rapid task using accuracy as the primary measure of performance. Higher levels of performance on the trials in which partner stimuli were paired with good stimuli as targets (partner-good trials) were hypothesized to reflect stronger positive implicit attitudes toward a romantic partner, and higher levels of performance on the partner-bad trials were hypothesized to reflect stronger negative implicit attitudes toward a partner. After completing the partner-GNAT, respondents completed a battery of self-report questionnaires assessing relationship quality, hostile conflict behavior, and neuroticism. The respondents were then contacted four to six times over the following 12 months to assess relationship stability.
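    One common way to turn go/no-go accuracy into a performance index is the signal-detection sensitivity measure d', computed separately for each block. The sketch below assumes that index; the text above specifies only that accuracy was the primary measure, so the use of d', the correction for extreme rates, and the trial counts are all illustrative assumptions.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    # Adding 0.5 to each cell (a log-linear correction) keeps rates of 0 or 1 finite.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Invented counts for two 70-trial blocks (35 target and 35 distracter trials each).
partner_good = d_prime(hits=30, misses=5, false_alarms=6, correct_rejections=29)
partner_bad = d_prime(hits=24, misses=11, false_alarms=10, correct_rejections=25)

print(f"partner-good d' = {partner_good:.2f}, partner-bad d' = {partner_bad:.2f}")
```

    Keeping the two block scores separate, rather than subtracting one from the other, preserves the distinction between positive and negative implicit evaluations.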

    Interestingly, in both samples, performance on the partner-good and partner-bad trials demonstrated positive correlations with each other (r = .46), despite being hypothesized to reflect positive and negative implicit attitudes, respectively. This likely represents shared method variance due to the common mechanics of the word-sorting task (e.g., general levels of ability, effort expended on the task, ability to sustain attention, comfort with using computers). To control for this, the partner-good and partner-bad performance indices were entered pairwise in all subsequent analyses to ensure that their shared variance would be dropped from the models. Discrete-time hazard modeling in a hierarchical linear modeling (HLM) (Raudenbush & Bryk, 2002) framework demonstrated that lower levels of performance on partner-good trials were associated with significantly higher risk for breakup over the following 12 months in both samples, even after controlling for self-reported relationship quality, hostile conflict, and neuroticism. There was also partial support to suggest that performance on the partner-bad trials had unique predictive validity. In one of the samples, performance on the partner-bad trials demonstrated a significant main effect such that a higher level of performance on partner-bad trials was associated with higher risk of breakup. In the other sample, performance on partner-good and partner-bad trials interacted such that it was specifically the individuals with below-average partner-good performance and above-average partner-bad performance who were at greatest risk of breaking up over the following 12 months. Taken as a set, the findings of Lee, Rogge, and Reis (in press) suggest that implicit assessments of positive and negative attitudes toward a romantic partner do offer insight into the functioning of those romantic relationships that cannot be obtained through traditional self-report scales. The results also lend additional support to conceptualizing relationship quality as two distinct dimensions, which suggests that individuals can have relatively distinct positive and negative implicit evaluations of romantic partners, each with unique information to contribute in understanding relationship stability over time.
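    For readers unfamiliar with discrete-time hazard models, the core idea is that each respondent contributes one record per follow-up interval in which he or she is still at risk, and the hazard is the conditional probability of breakup in an interval given that the relationship was intact at its start. The sketch below shows only this data structure and the empirical hazard; it uses invented follow-up data and omits the predictors and multilevel framework described above.

```python
def person_period(records, n_periods):
    """Expand (last_period_observed, broke_up) pairs into one row per person per period at risk."""
    rows = []
    for person_id, (last_period, broke_up) in enumerate(records):
        for period in range(1, min(last_period, n_periods) + 1):
            # The event indicator is 1 only in the period where a breakup occurred.
            event = int(broke_up and period == last_period)
            rows.append((person_id, period, event))
    return rows


def hazard_by_period(rows):
    """Empirical hazard: breakups in a period divided by the number still at risk in it."""
    at_risk, events = {}, {}
    for _, period, event in rows:
        at_risk[period] = at_risk.get(period, 0) + 1
        events[period] = events.get(period, 0) + event
    return {p: events[p] / at_risk[p] for p in sorted(at_risk)}


# Invented data: (last follow-up wave observed, whether a breakup was reported at that wave).
couples = [(1, True), (2, False), (3, True), (4, False), (4, False), (2, True)]
rows = person_period(couples, n_periods=4)
hazards = hazard_by_period(rows)

for period, h in hazards.items():
    print(f"wave {period}: hazard = {h:.2f}")
```

    In a full analysis, the event indicator in these person-period rows would be regressed on predictors such as the partner-good and partner-bad indices, typically with a logistic link.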

    Although the implicit measures described are promising, marital and family scholars have not gravitated toward them. This could be because they require specific equipment and use paradigms that are not commonly found in the family literature. It is therefore worth noting that paper-and-pencil (low-tech) implicit measures have been developed but have been used sparingly (see Vargas, Sekaquaptewa, & von Hippel, 2007) relative to their high-tech counterparts. We now report on a low-tech approach to assessing relationship quality implicitly.

    New Wine, Old Wineskin

    As Fazio and Olson (2003, p. 303) noted, modern implicit measures that assess constructs without directly asking about them are, in this regard, no different from earlier proposals regarding projective methods (see Proshansky, 1943). This is not the context to reiterate the pros and cons of projective techniques. Instead, we describe a stunningly simple way in which we try to get at relationship quality using what cannot be considered anything but a projective method. Specifically, we have been asking study participants to do the following: "Please draw a picture with (a) a house, (b) a tree, (c) a car, and (d) two people who represent you and your partner. You may draw them in any way you like, but you must include the above items. Please label the figure that represents you as 'me' and the one that represents your partner as 'partner.'" We then used the distance between the necks of each person in the drawing (measured in millimeters) as an index of relationship quality.
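    If drawings were scanned rather than measured by hand, the neck-distance index could be computed from two marked points. The sketch below assumes pixel coordinates for each neck and a known scan resolution; the function name, coordinates, and resolution are hypothetical, and the text above states only that the distance was measured in millimeters.

```python
import math


def neck_distance_mm(me_xy, partner_xy, pixels_per_mm):
    """Euclidean distance between two neck points, converted from pixels to millimeters."""
    return math.dist(me_xy, partner_xy) / pixels_per_mm


# Hypothetical neck coordinates (pixels) on a scan with 12 pixels per millimeter.
distance = neck_distance_mm(me_xy=(410, 655), partner_xy=(770, 640), pixels_per_mm=12.0)
print(f"neck distance: {distance:.1f} mm")
```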

    The results obtained have been quite extraordinary. Across two samples, there was good evidence of convergent validity. For example, neck distance correlated significantly with Funk and Rogge's (2007) CSI scores, with the likability of the partner and commitment to him or her. Importantly, the neck-distance measure predicted a number of relevant variables 4 weeks later over and beyond the CSI and initial level of the variable predicted. The variables thus predicted included expression of appreciation, commitment to partner, likability of partner, intimacy, mattering, perceived relationship maintenance efforts of the partner, and perceived commitment of partner. Finally, the neck measure may also foretell extradyadic sexual behavior and how safe a person feels in the relationship (p < .06 in each case).

    The results are preliminary but deserve mention because they remind us that there is potentially much to gain from exploring some very old and well-known methods as we seek implicit measures of relationship quality.

    Coda

    Although implicit measures have been widely used, enthusiasm for them has not been matched by theoretical development. It is therefore important to note that we are not advocating use of such measures for their own sake. Rather, we strongly agree with Fazio and Olson (2003) that, "when their application, use, and interpretation is guided by relevant theory and past literature, implicit measures have the potential to serve as useful methodological tools for testing hypotheses" (p. 320).

    CONCLUSION

    We have traversed a great deal of territory in this article. Rather than attempt to offer an exhaustive analysis of each topic addressed, our goal has been to pique the reader's interest and provide some pivotal citations for further reading on the topic.

    It appears that the marital literature is at a crossroads in regard to its most frequently studied construct, relationship quality. The weight of inertia has promulgated use of measures that lack conceptual clarity and can even be questioned on psychometric grounds. This continuing practice has stunted theory development. After documenting that case, we offered a conceptually simple and theoretically advantageous view of relationship quality as evaluation of the relationship. We then illustrated how this conceptualization can be pursued in the context of modern test theory, providing, en passant, a brief introduction to IRT. Next, we offered an expansion of the unidimensional view of relationship quality, suggesting that positive and negative evaluations might be conceptually distinct and should be assessed separately. We then presented psychometric data supporting such a theoretical view and outlined some of the most important theoretical implications, including the ability to identify different groups of spouses who tend to fall near the midpoint of unidimensional relationship quality scales but behave quite differently from each other. Recognizing the limitations of self-reported relationship quality, we drew on related attitude research and developments to derive implicit attitude measures. Again, after offering a brief introduction to such measures, we offered examples of their application to the assessment of relationship quality, further supporting a more complex two-dimensional conceptualization of relationship quality. Thus, in this article, we strove to demonstrate how theory could shape psychometric inquiries and how psychometrics could, in turn, help refine theoretical development.

    Clearly, there is a choice to be made. Either we allow inertia to condemn us to a future that is much like the past, or we break out of our comfort zone and pursue new approaches to conceptualizing and measuring relationship quality in the dominant epistemology of research on this topic. As evidenced by the other articles in this special issue, the construct of relationship quality has come under attack on fundamental philosophical grounds. We would argue that this construct is still fundamentally sound, as people seem to inherently experience relationships on globally positive and negative dimensions, both at a conscious level and at an implicit level, of which they might not be fully aware. However, if this construct is to withstand the theoretical and philosophical criticisms levied against it (and to remain a useful tool for advancing our knowledge of relationships), then it is necessary for couples researchers to adopt new methods of conceptualizing and measuring relationship quality. It is not easy or comfortable to explore literatures in other disciplines, especially when it involves mastering new and sometimes complex methods. However, failure to do so brings with it a high cost and one that will potentially lead our field to collapse under the weight of its own conceptual confusion. Such a future is simply too ghastly to contemplate.

    NOTE

    This article was made possible by grant 90FE0022/01 from the Department of Health and Human Services Administration for Children and Families awarded to the first author. The authors thank Sesen Negash and Natalie Sentore for their comments on an earlier draft of this article.

    REFERENCES

    Aron, A., Aron, E. N., Tudor, M., & Nelson, G.(1991). Close relationships as including other in theself. Journal of Personality and Social Psychology,60, 241 253.

    Aron, A., & Fraley, B. (1999). Relationship close-ness as including other in the self: Cognitiveunderpinnings and measures. Social Cognition,17, 140 160.

    Banse, R., & Kowalick, C. (2007). Implicit attitudestowards romantic partners predict well-beingin stressful life conditions: Evidence from theantenatal maternity ward. International Journal ofPsychology, 42, 149 157.

    Baucom, D. H. (1995). A new look at sentiment over-rideLets not get carried away yet: Comment onFincham et al. (1995). Journal of Family Psychol-ogy, 9, 15 18.

    Beach, S. R., Etherton, J., & Whitaker, D. (1995).Cognitive accessibility and sentiment over-ride Starting a revolution: Comment on Finchamet al. (1995). Journal of Family Psychology, 9,19 23.

    Bradbury, T. N., & Fincham, F. D. (1987). Theassessment of affect in marriage. In K. D. OLeary(Ed.), Assessment of marital discord: An inte-gration for research and clinical practice(pp. 59 108). Hillsdale, NJ: Erlbaum.

    Dahlstrom, W. G. (1969). Recurrent issues in the development of the MMPI. In J. M. Butcher (Ed.), Research developments and clinical applications (pp. 1–40). New York: McGraw-Hill.

    De Houwer, J. (2002). The Implicit Association Test as a tool for studying dysfunctional associations in psychopathology: Strengths and limitations. Journal of Behavior Therapy and Experimental Psychiatry, 33, 115–133.

    Edmonds, V. H. (1967). Marital conventionalization: Definition and measurement. Journal of Marriage and the Family, 24, 349–354.

    Fazio, R. H. (2001). On the automatic activation of associated evaluations: An overview. Cognition and Emotion, 15, 115–141.

    Fazio, R. H., & Olson, M. A. (2003). Implicit measures in social cognition research: Their meaning and use. Annual Review of Psychology, 54, 297–327.

    Fincham, F. D. (1998). Child development and marital relations. Child Development, 69, 543–574.

    Fincham, F. D., & Beach, S. R. (1999). Marital conflict: Implications for working with couples. Annual Review of Psychology, 50, 47–77.

    Fincham, F. D., & Bradbury, T. N. (1987). The assessment of marital quality: A reevaluation. Journal of Marriage and the Family, 49, 797–809.

    Fincham, F. D., Garnier, P. C., Gano-Phillips, S., & Osborne, L. N. (1995). Preinteraction expectations, marital satisfaction, and accessibility: A new look at sentiment override. Journal of Family Psychology, 9, 3–14.

    Fincham, F. D., & Linfield, K. J. (1997). A new look at marital quality: Can spouses feel positive and negative about their marriage? Journal of Family Psychology, 11, 489–502.

    Fowers, B. J., & Applegate, B. (1996). Marital satisfaction and conventionalization examined dyadically. Current Psychology, 15, 197–214.

    Funk, J. L., & Rogge, R. D. (2007). Testing the ruler with item response theory: Increasing precision of measurement for relationship satisfaction with the Couples Satisfaction Index. Journal of Family Psychology, 21, 572–583.

    Glenn, N. D. (1990). Quantitative research on marital quality in the 1980s: A critical review. Journal of Marriage and the Family, 52, 818–831.

    Gottman, J. M. (1979). Marital interaction: Experimental investigations. New York: Academic Press.

    Gottman, J. M., & Krokoff, L. J. (1989). Marital interaction and satisfaction: A longitudinal view. Journal of Consulting and Clinical Psychology, 57, 47–52.

    Gottman, J. M., & Levenson, R. W. (1984). Why marriages fail: Affective and physiological patterns in marital interaction. In J. C. Masters & K. Yarkin-Levin (Eds.), Boundary areas in social and developmental psychology (pp. 67–106). New York: Academic Press.

    Greenwald, A. G., & Farnham, S. D. (2000). Using the Implicit Association Test to measure self-esteem and self-concept. Journal of Personality and Social Psychology, 79, 1022–1038.

    Greenwald, A. G., McGhee, D. E., & Schwartz, J. L. K. (1998). Measuring individual differences in implicit cognition: The Implicit Association Test. Journal of Personality and Social Psychology, 74, 1464–1480.

    Greenwald, A. G., & Nosek, B. A. (2001). Health of the Implicit Association Test at age 3. Zeitschrift für Experimentelle Psychologie, 48, 85–93.

    Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, CA: Sage.

    Hendrick, S. S. (1988). A generic measure of relationship satisfaction. Journal of Marriage and the Family, 50, 93–98.

    Heyman, R. E., Sayers, S. L., & Bellack, A. S. (1994). Global marital satisfaction versus marital adjustment: An empirical comparison of three measures. Journal of Family Psychology, 8, 432–446.

    Jacobson, N. S. (1985). The role of observation measures in marital therapy outcome research. Behavioral Assessment, 7, 287–308.

    Jacobson, N. S., & Margolin, G. (1979). Marital therapy: Strategies based on social learning and behavior exchange principles. New York: Brunner/Mazel.

    Karney, B. R., & Bradbury, T. N. (1997). Neuroticism, marital interaction, and the trajectory of marital satisfaction. Journal of Personality and Social Psychology, 72, 1075–1092.

    Lee, S., Rogge, R. D., & Reis, H. T. (in press). Assessing the seeds of relationship decay: Using implicit evaluations to detect the early stages of disillusionment. Psychological Science.

    Locke, H. J., & Wallace, K. M. (1959). Short marital adjustment prediction tests: Their reliability and validity. Marriage and Family Living, 21, 251–255.

    Mattson, R. E., Paldino, D., & Johnson, M. D. (2007). The increased construct validity and clinical utility of assessing relationship quality using separate positive and negative dimensions. Psychological Assessment, 19, 146–151.

    Mitchell, J. P., Nosek, B. A., & Banaji, M. R. (2003). Contextual variations in implicit evaluation. Journal of Experimental Psychology: General, 132, 455–469.

    Norton, R. (1983). Measuring marital quality: A critical look at the dependent variable. Journal of Marriage and the Family, 45, 141–151.

    Nosek, B. A., & Banaji, M. R. (2001). The go/no-go association task. Social Cognition, 19, 625–664.

    Proshansky, H. M. (1943). A projective method for the study of attitudes. Journal of Abnormal and Social Psychology, 38, 393–395.

    Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods. Thousand Oaks, CA: Sage.

    Rogge, R. D., & Bradbury, T. N. (1999). Till violence does us part: The differing roles of communication and aggression in predicting adverse marital outcomes. Journal of Consulting and Clinical Psychology, 67, 340–351.

    Rogge, R. D., Cobb, R. J., Johnson, M. D., Lawrence, E. E., & Bradbury, T. N. (2002). The CARE program: A preventive approach to marital intervention. In A. S. Gurman & N. S. Jacobson (Eds.), Clinical handbook of couple therapy (3rd ed., pp. 420–435). New York: Guilford Press.

    Rogge, R. D., Cobb, R. J., Lawrence, E. E., Johnson, M. D., Story, L. B., Rothman, A. D., et al. (2010). Teaching skills vs. raising awareness: The effects of the PREP, CARE and AWARENESS programs on 3-year trajectories of marital functioning. Unpublished manuscript, University of Rochester, Rochester, NY.

    Rogge, R. D., & Fincham, F. D. (2010). Disentangling positive and negative feelings toward relationships: Development and validation of the Positive-Negative Relationship Quality (PNRQ) scale. Unpublished manuscript, University of Rochester, Rochester, NY.

    Samejima, F. (1997). Graded response model. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 85–100). New York: Springer.

    Schumm, W. A., Nichols, C. W., Schectman, K. L., & Grinsby, C. C. (1983). Characteristics of responses to the Kansas Marital Satisfaction Scale by a sample of 84 married mothers. Psychological Reports, 53, 567–572.

    Spanier, G. B. (1976). Measuring dyadic adjustment: New scales for assessing the quality of marriage and similar dyads. Journal of Marriage and the Family, 38, 15–28.

    Stone, A. A., Turkan, J. S., Bachrach, C. A., Jobe, J. B., Kurtzman, H. S., & Cain, V. S. (2000). The science of self-report: Implications for research and practice. Mahwah, NJ: Erlbaum.

    Teachman, B. A. (2007). Evaluating implicit spider fear associations using the go/no-go association task. Journal of Behavior Therapy and Experimental Psychiatry, 38, 157–167.

    Trost, J. E. (1985). Abandon adjustment! Journal of Marriage and the Family, 47, 1072–1073.

    Vargas, P. T., Sekaquaptewa, D., & von Hippel, W. (2007). Armed only with paper and pencil: Low-tech measures of implicit attitudes. In B. Wittenbrink & N. Schwarz (Eds.), Implicit measures of attitudes (pp. 103–124). New York: Guilford Press.

    Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief measures of positive and negative affect: The PANAS scale. Journal of Personality and Social Psychology, 54, 1063–1070.

    Watson, D., Clark, L. A., Weber, K., Assenheimer, J. S., Strauss, M. E., & McCormick, R. A. (1995). Testing a tripartite model: II. Exploring the symptom structure of anxiety and depression in student, adult, and patient samples. Journal of Abnormal Psychology, 104, 15–25.

    Weiss, R. L. (1980). Strategic behavioral marital therapy: Toward a model for assessment and intervention. In J. P. Vincent (Ed.), Advances in family intervention, assessment and theory (Vol. 1, pp. 229–271). Greenwich, CT: JAI Press.

    Wittenbrink, B., & Schwarz, N. (2007). Implicit measures of attitudes. New York: Guilford Press.

    Zayas, V., & Shoda, Y. (2005). Do automatic reactions elicited by thoughts of romantic partner, mother, and self relate to adult romantic attachment? Personality and Social Psychology Bulletin, 31, 1011–1025.

    APPENDIX

    For a simple dichotomous question (true or false), the two-parameter logistic model would take the form: $P_i(\theta) = 1/[1 + \exp(-\alpha_i(\theta - \beta_i))]$. In this model, $P_i(\theta)$ indicates the probability that an individual with trait level $\theta$ will endorse item $i$ as true. Thus, IRT would estimate two item parameters for the dichotomous item ($\alpha_i$ and $\beta_i$): the discrimination coefficient ($\alpha$) estimates the relative amount of information an item contributes, and the difficulty coefficient ($\beta$) estimates the region of the latent trait where the item is most informative. The characteristics of a response curve can then be synthesized with the following equation to estimate the overall amount of information the item provides: $I_i(\theta) = [P_i'(\theta)]^2 / [P_i(\theta)(1 - P_i(\theta))]$, where $P_i'(\theta)$ is the first derivative of $P_i(\theta)$ with respect to $\theta$. The $I_i(\theta)$ function creates an item information curve (IIC), revealing the relative amount of information the item contributes at various points along the continuum of the underlying trait ($\theta$). As mentioned earlier, the standard error of an item is simply $SE(\theta) = 1/\sqrt{I(\theta)}$, or basically the inverse of the square root of the information provided, so these estimates of information also serve as estimates of the precision of measurement (the lack of error in measurement). Information curves of individual items can be summed to create test information curves (TICs), which estimate the information (and precision) a set of items provides to the assessment of the underlying trait. Because IRT information curves are based on precise estimates of each respondent's latent score, with a sufficiently large and diverse sample, the estimated information curves become sample independent, accurately estimating how sets of items will perform in any number of new samples. An IRT analysis of the existing measures of relationship quality would therefore allow researchers to create TICs for those measures, placing them on the same ruler to determine which scales measure up to the task of assessing relationship quality.

    To apply IRT analyses to items with Likert response scales, one can use the Graded Response Model (GRM) (Samejima, 1997). In the GRM, an item with $m$ response options is considered a set of $m - 1$ dichotomous thresholds. Thus, for an item with four response choices, GRM would model response curves (operating characteristic curves, or OCCs) for the three dichotomous decisions respondents face (1 vs. 2, 2 vs. 3, and 3 vs. 4). The three OCCs are assumed to have the same discrimination ($\alpha_i$) parameter but three distinct difficulty ($\beta_{ij}$) parameters. Therefore, GRM affords the possibility of very precisely examining the quality of information provided by items with Likert response scales across a range of possible relationship quality values (typically from $-3$ to $3$ SD around the population mean).
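
    A four-option item of this kind can be sketched as follows. The Python code below is a minimal illustration, not the authors' implementation: it assumes the usual cumulative formulation of the graded response model, in which each threshold curve gives the probability of responding above that threshold and category probabilities are differences between adjacent curves. The function names and the parameter values (`alpha`, `betas`) are hypothetical.

```python
import math


def occ(theta, alpha, beta):
    """Operating characteristic curve: P(response above this threshold)."""
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))


def grm_category_probs(theta, alpha, betas):
    """Category probabilities for one Likert item under a graded response model.

    betas holds the m - 1 threshold difficulties in ascending order;
    all thresholds share the same discrimination alpha.
    """
    # Cumulative curves padded with 1 (at or above the lowest option)
    # and 0 (above the highest option).
    cumulative = [1.0] + [occ(theta, alpha, b) for b in betas] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(betas) + 1)]


# Hypothetical parameters for a four-option item (three thresholds).
alpha = 1.8
betas = [-1.5, 0.0, 1.2]

for theta in (-3.0, 0.0, 3.0):
    probs = grm_category_probs(theta, alpha, betas)
    print(theta, [round(p, 3) for p in probs])
```

    At each trait level the four category probabilities sum to 1, and the most likely response option shifts from the lowest to the highest as the trait level moves from -3 to 3 SD.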