
Psychometric Properties of the ORS and SRS


ISSN-L 1015-5759 • ISSN-Print 1015-5759 • ISSN-Online 2151-2426

Official Organ of the European Association of Psychological Assessment

European Journal of Psychological Assessment

www.hogrefe.com/journals/ejpa

Edited by Matthias Ziegler

Abstracted/Indexed in: Current Contents/Social & Behavioral Sciences, Social Sciences Citation Index (SSCI), Social Scisearch, PsycINFO, Psychological Abstracts, PSYNDEX, ERIH, Scopus


Your article has appeared in a journal published by Hogrefe Publishing. This e-offprint is provided exclusively for the personal use of the authors. It may not be posted on a personal or institutional website or to an institutional or disciplinary repository.

If you wish to post the article to your personal or institutional website or to archive it in an institutional or disciplinary repository, please use either a pre-print or a post-print of your manuscript in accordance with the publication release for your article and our “Online Rights for Journal Articles” (www.hogrefe.com/journals).


Original Article

Measuring Feedback From Clients: The Psychometric Properties of the Dutch Outcome Rating Scale and Session Rating Scale

Pauline Janse,1 Liesbeth Boezen-Hilberdink,2 Maarten K. van Dijk,1 Marc J. P. M. Verbraak,1,3 and Giel J. M. Hutschemaekers3

1 HSK Group, Arnhem, The Netherlands; 2 Diaconessenhuis, Zorgcombinatie Noorderboog, Meppel, The Netherlands; 3 Behavioral Science Institute, Radboud University Nijmegen, The Netherlands

Abstract. Treatment results can be improved by obtaining feedback from clients concerning their progress during therapy and the quality of the therapeutic relationship. This feedback can be rated using short instruments such as the Outcome Rating Scale (ORS) and the Session Rating Scale (SRS), which are being increasingly used in many countries. This study investigates the validity and reliability of the Dutch ORS and SRS in a large sample of subjects (N = 587) drawn from the clients of an outpatient mental healthcare organization. The results are compared to those of previous Dutch and American studies. While both the ORS and the SRS exhibited adequate test-retest reliability and internal consistency, their concurrent validity was limited (more for the SRS than for the ORS). New standards are proposed for the Dutch ORS and SRS. The scores obtained with these standards are interpreted differently than those obtained using American standards. The clinical implications of the limited validity of the ORS and the SRS are discussed, as is the use of different standards in conjunction with these instruments.

Keywords: Outcome Rating Scale (ORS), Session Rating Scale (SRS), client feedback, validity, reliability

A promising approach to make therapy more effective is to explicitly ask clients for feedback on how they view their progress during treatment and to discuss potential improvements with those who are making insufficient progress (e.g., Lambert & Shimokawa, 2011). Research shows that, based on their clinical intuition alone, therapists do not always correctly predict which clients will drop out or deteriorate during therapy (Hannan et al., 2005). Another finding was that the clients’ assessment of the quality of the therapeutic relationship can differ greatly from that of their therapists (Hafkenscheid, Duncan, & Miller, 2010; Horvath & Bedi, 2002).

With that in mind, Scott Miller and Barry Duncan (2004) developed a system to provide such client-directed feedback. They made the system as user-friendly as possible for therapists and clients, in terms of feasibility and practicality. Their feedback system consists of two short questionnaires: the Outcome Rating Scale (ORS) and the Session Rating Scale (SRS). The ORS covers three areas of client functioning: individual (personal well-being), interpersonal (family, close relationships), and social (work, school, friendships). It was developed as a short alternative to the Outcome Questionnaire (Lambert et al., 1996). The SRS measures the therapeutic alliance and reflects Bordin’s (1979) definition of alliance: the relationship between client and therapist and consensus about goals and approach or method. Miller and Duncan added a fourth item to each instrument, which involved global assessments of daily functioning (for the ORS) and of the treatment session (for the SRS). The outcomes are discussed during the sessions. If the scores do not show improvement, or do not reach the designated cut-off scores, the possible reasons are discussed with the client. As such, these instruments enhance engagement and participation of both client and therapist in treatment.

A number of studies have shown that use of the ORS and SRS during treatment improves outcome (Miller, Duncan, Brown, Sorrell, & Chalk, 2006; Reese, Norsworthy, & Rowlands, 2009). Miller and colleagues (2006) reported an increase in the overall effect size of treatment from .39 in the 6-month baseline period (before the feedback system was implemented) to .79 when feedback was provided by means of the ORS and SRS. In addition, two studies on couples therapy showed that feedback enabled four times more clients to achieve clinically significant change relative to conventional treatment (Anker, Duncan, & Sparks, 2009; Reese, Toland, Slone, & Norsworthy, 2010).

The ORS and the SRS are now widely used in the Netherlands (Beljouw & Verhaak, 2010). Until now, however, the psychometric properties of the Dutch versions of the ORS and SRS have not been sufficiently verified. Only two previous studies have examined psychometric aspects of the Dutch ORS and SRS (Beljouw & Verhaak, 2010; Hafkenscheid et al., 2010). The study conducted by Hafkenscheid and colleagues (2010) provided the first data on reliability, but their sample was unrepresentative (in general, little progress was made during treatments). Moreover, most of the patients in question were treated by the first author, which limits the potential for generalization to other settings and other therapists. The study by Beljouw and Verhaak (2010) focused solely on the convergent validity of the ORS.

© 2013 Hogrefe Publishing. European Journal of Psychological Assessment 2013. DOI: 10.1027/1015-5759/a000172

Author’s personal copy (e-offprint)

The Dutch ORS and SRS were generally interpreted using the American standards. The data generated by Hafkenscheid et al. (2010) suggest that the population of the Netherlands differs from that of the US, which renders use of the American standards problematic. For example, Hafkenscheid et al. (2010) found lower scores for the SRS than was the case in the American studies. In addition, the global mean for the SRS (32.4) was found to be well below the cut-off score derived from the American studies (36 points). Additional data from different patient populations are needed to produce reliable Dutch standards and to determine the extent of any differences between these standards and their American equivalents.

The purpose of the current study is to examine the psychometric properties and standards of the Dutch versions of the ORS and SRS in a large sample of outpatients. Furthermore, the Reliable Change Index (RCI; Jacobson & Truax, 1991) was calculated to determine whether a change in a given individual’s ORS score was clinically significant. The results of this study are compared to those of previous American and Dutch studies.

Method

Participants

Clinical Sample

The clinical sample consisted of 587 consecutive clients who had been referred by their physician to one of the five participating branches of HSK in the period from 2009 to the end of 2010. HSK is a Dutch organization providing outpatient mental healthcare. It operates throughout the Netherlands, and provides cognitive and behavioral therapies for common mental disorders. The age of the clients in this sample ranged from 18 to 71 years, with a mean of 41 (SD = 11.1). They presented with diverse psychological disorders (Table 1). Of the total sample, 543 clients received treatment after intake. The average course of treatment consisted of 16 sessions (SD = 8.7).

Nonclinical Sample

It is important to determine the cut-off point on the ORS that distinguishes the functional population from the dysfunctional population. To this end, Jacobson and Truax (1991) recommend using their formula c, which takes scores from both clinical and nonclinical samples (representing the scores of a functional population) into account. Accordingly, a nonclinical sample was also included in this study. These individuals filled in the ORS and SCL-90 once only, for the purpose of comparison. This nonclinical sample consisted of the partners of the clients included in the study. They received the questionnaires (including information about the study) and an informed consent form from their partners, the clients. Partners who were undergoing psychological treatment were excluded. The final nonclinical sample consisted of 116 volunteers. Fifty-six percent (n = 65) of these participants were female, and the average age was 41 years (SD = 11.0).

Procedure

The participants signed an informed consent form at intake. Clients were asked to fill in the ORS and SRS during each treatment session. The Outcome Questionnaire (OQ-45; Lambert et al., 1996) and an alliance questionnaire (WAV-12; Stinckens, Ulburghs, & Claes, 2009) were completed at the start of treatment, once every fifth session, and at the end of the treatment. In order to eliminate any possibility of feedback effects affecting therapy, the therapists were not allowed to see the completed questionnaires. The Symptom Checklist (SCL-90-R; Arrindell & Ettema, 2003) was administered at intake and at the end of treatment.

Measurements

The Outcome Rating Scale (ORS) and the Session Rating Scale (SRS)

The ORS and SRS each consist of four items, which are answered using 10-cm visual analog scales (VAS) ranging from negative (left) to positive (right).

The ORS measures three areas of client functioning: individual, interpersonal, and social, as well as measuring the client’s overall view of their personal well-being.

The SRS measures the relationship between the client and the therapist, consensus about goals and methods, and the client’s overall view (at the end of a session) concerning the quality of the therapeutic relationship.

Table 1. Characteristics of the clinical sample

                            N      %
Sex
  Male                     281   47.9
  Female                   306   52.1
Diagnosis
  Adjustment disorder      164   28.0
  Work-related distress    163   27.8
  Mood disorders           122   20.9
  Anxiety disorders        102   17.3
  Other                     13    5.6

2 P. Janse et al.: Psychometric Properties of the Dutch Outcome Rating Scale and Session Rating Scale


The marks made by clients on each of the four lines are measured to the nearest millimeter to derive the score. These are then combined to obtain a total score. The total scores range from 0 to 40 on both measures. High scores on the ORS reflect a good level of well-being and functioning, while high scores on the SRS reflect a good therapeutic relationship. The most recent version of the Dutch translation of the ORS and SRS was used (translation by Asmus, Crouzen, & van Oenen, 2004).
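As a rough illustration of the scoring rule just described, the sketch below turns four mark positions into a total score. The function name total_score and the convention of expressing each item in centimeters (millimeters divided by 10) are assumptions made for this example; the text only states that marks are measured to the nearest millimeter and that totals range from 0 to 40.

```python
def total_score(marks_mm):
    """Combine four 10-cm visual analog items into a 0-40 total.

    marks_mm: distance of each mark from the left end of its line, in mm.
    Assumption: each item counts in cm, so four 100-mm lines give 0-40.
    """
    if len(marks_mm) != 4:
        raise ValueError("the ORS and SRS each have exactly four items")
    for mark in marks_mm:
        if not 0 <= mark <= 100:
            raise ValueError("each mark must lie on the 100-mm line")
    # Round every mark to the nearest millimeter, sum, then convert to cm.
    return sum(round(mark) for mark in marks_mm) / 10


# Hypothetical client with marks at 36, 55, 39, and 40 mm.
print(total_score([36, 55, 39, 40]))
```

Summing the rounded millimeter values before dividing keeps the arithmetic exact for whole-millimeter inputs.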

Instruments Used to Validate the ORS and SRS

The concurrent validity of the ORS was tested using the Outcome Questionnaire (OQ-45; Lambert et al., 1996) and the Symptom Checklist (SCL-90; Arrindell & Ettema, 2003).

The OQ-45, which consists of 45 items, measures three domains of functioning: symptom distress (SD), interpersonal relations (IR), and social role performance (SR). The Dutch version of the OQ-45 demonstrated adequate overall reliability (De Jong et al., 2007), although the reliability of its Social Role subscale was inadequate. Its construct validity proved to be adequate. In the current study, the OQ-45’s internal consistency (alpha) was .88 for the SD domain, .80 for the IR domain, .60 for the SR domain, and .82 for the OQ-45 total score (N = 483).

The Symptom Checklist Revised (SCL-90-R; Derogatis, 1994) measures a broad range of psychological problems and symptoms of psychopathology. The 90 items included in the Dutch SCL-90-R are categorized into eight subscales. A client’s overall score on the SCL-90-R reflects his general psychological and psychosomatic well-being. The Dutch SCL-90-R has shown good psychometric properties (Arrindell & Ettema, 2003). Alpha values for the SCL-90-R in this study ranged from .59 to .90 for the subscales and .77 for the total score (N = 541).

The ORS was expected to have a reasonably strong relationship with the OQ-45, and a moderately strong relationship with the SCL-90. The weaker expected relationship with the SCL-90 is due to the slightly different concepts measured by the two instruments: the SCL-90 focuses on symptoms of psychological problems, whereas the ORS also measures an individual’s well-being in relationships and at work. The “Individual” subscale of the ORS and the total score on the ORS were expected to show the strongest relationship to the SCL-90 total score.

The concurrent validity of the SRS was determined by comparing it to the Dutch version of the Working Alliance Inventory, Short Form (WAV-12; Stinckens et al., 2009). The WAV-12 is based on Bordin’s (1979) definition of the therapeutic relationship. It consists of 12 items and measures three domains of the therapeutic relationship, namely “Goal,” “Task,” and “Bond.” As the SRS and the WAV-12 are both based on Bordin’s theory, their total scores should show a strong correlation with one another. As the subscales of the measures are slightly different, they may show weaker correlations. The internal consistency (alpha) of the WAV-12 in this study was .84 for the Task domain, .82 for the Goal domain, .80 for the Bond domain, and .87 for the total score (N = 285).

Statistical Analysis

First, the normality of the scores was checked. Next, the internal consistencies of the ORS and SRS were calculated using Cronbach’s alpha. Test-retest reliability and the concurrent validity of the ORS were calculated using bivariate correlations. The predictive validity of the SRS for treatment outcome was determined by linear regression analysis, using the difference between the total pretreatment and posttreatment SCL-90 scores as a measure of outcome.
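For readers less familiar with Cronbach’s alpha, a minimal sketch of the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), is given here. The analyses in this study were run in SPSS; the function name cronbach_alpha and the five respondents below are hypothetical and serve only to show the computation.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    items = list(zip(*rows))                       # one tuple of scores per item
    item_variances = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variances / total_variance)


# Hypothetical four-item responses (one row per respondent), not study data.
responses = [
    [3, 4, 3, 4],
    [5, 5, 6, 5],
    [7, 8, 7, 7],
    [2, 3, 2, 3],
    [6, 6, 7, 6],
]
print(round(cronbach_alpha(responses), 2))
```

Because the items in this toy data set rise and fall together, alpha comes out close to 1; items that vary independently would pull it toward 0.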

Independent t-tests (two-tailed, p < .05) were used to measure differences between males and females in the scores obtained using these measures, and between the clinical and nonclinical groups.

The standards to be used in conjunction with the ORS were determined on the basis of cut-off scores and the RCI. The ORS cut-off score used to distinguish between the functional and dysfunctional populations was calculated using Jacobson and Truax’s (1991) formula c:

$$c = \frac{S_0 M_1 + S_1 M_0}{S_0 + S_1} \qquad (1)$$

M1 = the mean of the pretreatment clinical group, M0 = the mean of the nonclinical sample, and S1 and S0 = the standard deviations of the clinical and nonclinical samples, respectively.
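As a worked check of formula c, the short sketch below plugs in the rounded ORS total-score means and standard deviations reported in Table 2 (clinical: M = 17.0, SD = 7.2; nonclinical: M = 29.6, SD = 6.0). The function name cutoff_c is an assumption introduced for illustration.

```python
def cutoff_c(m_clinical, s_clinical, m_nonclinical, s_nonclinical):
    """Jacobson and Truax's formula c: each mean is weighted by the other group's SD."""
    numerator = s_nonclinical * m_clinical + s_clinical * m_nonclinical
    return numerator / (s_nonclinical + s_clinical)


# Rounded ORS total-score values from Table 2.
c = cutoff_c(m_clinical=17.0, s_clinical=7.2, m_nonclinical=29.6, s_nonclinical=6.0)
print(round(c, 2))  # 23.87
```

Rounded to whole points, this agrees with the cut-off score of 24 reported in the Results.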

The RCI of the ORS was calculated by multiplying s_diff (the standard error of the difference between two test scores) by the z value of the requisite significance level (1.96, p < .05). All statistical analyses were performed using SPSS version 17.0 (SPSS, Chicago, IL).
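The RCI computation can be sketched as follows, using Jacobson and Truax’s (1991) definitions: the standard error of measurement is SE = SD × sqrt(1 − r), and s_diff = sqrt(2 × SE²). The article does not state which standard deviation and reliability coefficient entered the calculation, so the inputs below (SD = 7.2, r = .80) are hypothetical, as is the function name.

```python
import math


def reliable_change_index(sd, reliability, z=1.96):
    """Smallest score change that exceeds measurement error at the given z."""
    se = sd * math.sqrt(1 - reliability)   # standard error of measurement
    s_diff = math.sqrt(2 * se ** 2)        # standard error of a difference score
    return z * s_diff


# Hypothetical inputs chosen for illustration only.
print(round(reliable_change_index(sd=7.2, reliability=0.80), 1))
```

The index grows with the spread of scores and shrinks as reliability rises, so a sample with more variable scores or a less reliable measure yields a larger RCI.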

Results

Outcome Rating Scale

Normative Data

Table 2 shows the mean scores and standard deviations of the clinical and nonclinical samples for the ORS total scores obtained at intake. The total score for the ORS was lower than that of a clinical group reported by Miller et al. (2003; M = 19.6, SD = 8.7). The clinical group’s average total score for the OQ-45 was 70.5 (SD = 22.2), while their average total SCL-90 score was 180.7 (SD = 47.3), indicating a high level of distress. The nonclinical group had an average total score for the SCL-90 of 111.0 (SD = 21.8), reflecting a good level of well-being.

Table 2. Means and standard deviations on the ORS total scores of the clinical and nonclinical samples

                   Nonclinical (n = 116)    Clinical (n = 524)
                      M       SD               M       SD
ORS individual       7.3     1.8              3.6     2.1
ORS relational       7.4     1.7              5.5     2.4
ORS social           7.5     1.6              3.9     2.4
ORS overall          7.5     1.6              4.0     2.0
ORS total           29.6     6.0             17.0     7.2

At intake, no significant differences were found between males and females in terms of the ORS total score (t(522) = 0.58, p > .05), the OQ-45 total score (t(481) = −4.49, p > .05), or the SCL-90-R total score (t(545) = −1.10, p > .05).

Cut-Off Scores and Reliable Change

The ORS cut-off score between the nonclinical and clinical ranges was 24, one point lower than the American cut-off score. At 9 points, the RCI (which is defined as the minimum amount of change in outcome required to indicate genuine change, rather than mere error) exceeded the American RCI of 5 (Miller & Duncan, 2004). This suggests that, during treatment, Dutch clients need to achieve a greater degree of change on the ORS for such change to be considered reliable.

Psychometric Properties of the Outcome Rating Scale

Reliability

In the clinical sample, internal consistency was determined at intake and at the first, third, and fifth sessions. The alpha values of the total score varied from .82 to .96. The nonclinical sample had an alpha value of .94 (N = 116). The relationships between the subscales were strong and in line with the results found both in American studies (Miller et al., 2003) and in available Dutch data (Hafkenscheid et al., 2010).

The test-retest reliability of the ORS was established by computing correlations between five measurement points in the clinical sample (Table 3). The decrease in N from intake (587 clients at intake and 323 at the first and second measurement points) is due to missing data or to clients who received no further treatment. The correlation between the ORS total scores at subsequent measurement points was adequate and slightly higher than that found in the studies by Miller and colleagues (2003; r ranging from .49 to .66) and by Hafkenscheid et al. (2010; r ranging between .16 and .63).

Criterion Validity

As an outcome measure, the ORS must be able to distinguish between clinical and nonclinical groups. The difference between the mean scores of these groups (Table 2) was significant (t(636) = −17.4, p < .05), indicating that the ORS can indeed effectively distinguish between dysfunctional and functional clients at group level.
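The group comparison can be approximated from summary statistics alone. The sketch below computes a pooled-variance independent t statistic from the rounded ORS total-score values in Table 2; because those values are rounded and the table’s sample sizes differ slightly from the degrees of freedom reported above, the result is close to, but not identical with, the reported statistic. The function name pooled_t is an assumption.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t statistic with pooled variance, from summary data."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Clinical group first, nonclinical group second (rounded values from Table 2).
t, df = pooled_t(17.0, 7.2, 524, 29.6, 6.0, 116)
print(round(t, 1), df)
```

The negative sign simply reflects that the clinical group, entered first, has the lower mean.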

Concurrent Validity

In the clinical sample, correlations between the ORS and OQ-45 were calculated (Table 4) at intake.

The reported correlations between ORS and OQ-45 subscales and total scales were negative, as a good level of well-being is indicated by high scores on the ORS but by low scores on the OQ-45 (and the SCL-90). Overall these correlations were moderately strong (Cohen, 1988), although they were slightly lower than those found in the study by Miller and colleagues (2003). However, the correlation found for the ORS and OQ-45 total scores was in line with their findings (r ranged from −.53 to −.69). The ORS, as a general measure of treatment outcome, still appeared to be reasonably valid.

Concurrent validity was also tested by calculating the correlation between the ORS and SCL-90 total and subscale scores at intake. In the clinical sample, correlations ranged from r = −.09 to −.56 (n = 481). The strongest relationships were between the ORS Overall scale and ORS total score and the SCL-90 Depression scale (r = −.54 and −.56, respectively), and between the ORS total score and the SCL-90 total score (r = −.50). In the nonclinical sample (n = 111) the correlations were stronger (r ranged from −.19 to −.70). Here too, the strongest relationships were between the ORS Overall scale and the SCL-90 Depression scale (r = −.70) and between the ORS and SCL-90 total scores (r = −.66).

Table 3. Test-retest reliability of the ORS between five administrations

         1st–2nd       2nd–3rd       3rd–4th       4th–5th
         n     r       n     r       n     r       n     r
ORS     323   .64     341   .57     339   .69     334   .63

Note. All correlations are significant at p < .01 level.

Table 4. Correlations between the ORS and OQ-45 subscales and total scales in the clinical sample at intake

                     OQ-45 SD     OQ-45 IR     OQ-45 SR     OQ-45 Total
                     (n = 493)    (n = 482)    (n = 492)    (n = 455)
ORS individual         −.53         −.40         −.30         −.52
ORS interpersonal      −.36         −.54         −.19         −.45
ORS social role        −.46         −.36         −.46         −.50
ORS overall            −.55         −.45         −.34         −.56
ORS total              −.58         −.54         −.40         −.62

Notes. All correlations are significant at p < .01 level. OQ-45 SD = Symptom Distress; OQ-45 IR = Interpersonal Relation; OQ-45 SR = Social Role.

Sensitivity to Change

The ORS is used as an instrument to track progress, so it must be capable of measuring changes in clients’ well-being during treatment. Of the total sample, 172 clients filled in the ORS both at intake and at the end of their treatment. The mean ORS total score was 16.9 (SE = .57) at intake and 29.2 (SE = .58) posttreatment. The posttreatment well-being of clients was significantly better than it was before treatment commenced (t(171) = −17.72, p < .05, r = .81).

The Psychometric Properties of the Session Rating Scale
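An effect size r is commonly derived from a t statistic as r = sqrt(t² / (t² + df)). Whether the reported r was obtained in exactly this way is an assumption; applied to the rounded t value above, the conversion gives a value of about .80, in the neighborhood of the reported figure. The function name is hypothetical.

```python
import math


def effect_size_r(t, df):
    """Convert a t statistic and its degrees of freedom to the effect size r."""
    return math.sqrt(t ** 2 / (t ** 2 + df))


# Rounded t statistic and degrees of freedom from the sensitivity-to-change test.
print(round(effect_size_r(-17.72, 171), 2))
```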

Normative Data

Table 5 shows the mean SRS total scores and standard deviations of the clinical sample at four measurement points. The maximum mean achieved on the SRS leveled off at 34 after 15 sessions. As with the ORS, the decline in N from the intake value is due to missing data or to clients who did not receive further treatment.

Reliability

Alpha values ranged from .85 to .95 during the first five sessions. The test-retest reliability (as measured by Spearman’s rho) was slightly less than that reported by Duncan et al. (2003; an overall r of .64), but still moderately strong (Table 6). Test-retest reliability was assessed between treatment sessions. The SRS scores changed between sessions (in general they seemed to improve over time, see Table 5), and the correlations would probably be stronger if this had not been the case. Thus, when taking this into account, the test-retest reliability of the SRS can be considered adequate.

Concurrent Validity

The relationship between the SRS and the WAV-12 (as measured by Spearman’s rho) was assessed at the beginning of treatment (Table 7). The correlations were moderately strong and significant (p < .01), but lower than expected, indicating that the SRS and the WAV-12 may be measuring slightly different aspects of the therapeutic relationship.

Predictive Validity

There is evidence that the quality of the therapeutic relationship influences treatment outcome (e.g., Martin, Garske, & Davis, 2000). A linear regression analysis was therefore carried out to test the predictive value of the SRS on treatment outcome (as measured by the difference between SCL-90 total score at intake and SCL-90 posttreatment score). The SRS total score for sessions two and three did indeed predict outcome (p < .05), the SRS score for session two being the strongest predictor (b1 = −.14, p < .05). Nonetheless, the SRS had only very limited influence (R² = .02).

Discussion

The aim of this study was to examine the psychometric properties of the Dutch ORS and SRS, and to compare the results with those obtained in American studies and other Dutch studies.

The results demonstrate that the ORS and SRS have strong internal consistency, reflecting a strong cohesion of the items concerned. This is in line with the findings of other studies. Furthermore, the ORS and SRS exhibited adequate test-retest reliability, comparable to that found in the American studies and another Dutch study (Duncan et al., 2003; Hafkenscheid et al., 2010; Miller et al., 2003).

Table 5. Means and standard deviations on the SRS at sessions 1, 5, 10, and 15

         Session 1           Session 5           Session 10          Session 15
         n     M     SD      n     M     SD      n     M     SD      n     M     SD
SRS     349   30.1   6.1    321   32.0   4.7    208   32.6   4.7    121   33.6   4.4

Table 6. Test-retest reliability of the SRS between five administrations

         1st–2nd       2nd–3rd       3rd–4th       4th–5th
         n     rs      n     rs      n     rs      n     rs
SRS     317   .48     313   .72     315   .61     296   .59


The moderately strong correlations with other outcome measures (concurrent validity) are somewhat lower than expected. They are also lower than the correlations found in other studies (Miller et al., 2003; Campbell & Hemsley, 2009). In particular, a stronger relationship was expected between the ORS and OQ-45, as the former is based on the latter. The difference in scaling (VAS and Likert scales) could be a factor here. The strongest relationships found were those between the ORS total and OQ-45 and SCL-90 total scores.

The concurrent validity of the SRS, too, is not as high as was expected, especially with regard to the subscales of the SRS. This may indicate the SRS is measuring a somewhat different construct than the WAV-12. Given the high internal consistency involved, it follows that it would be better to use the total scores of the ORS and SRS as general outcome and alliance scores, rather than interpreting the individual items of these measures.

This study was subject to a number of limitations. For instance, the ORS and SRS are Visual Analog Scales (VAS), which clients could interpret subjectively. However, various studies have shown VAS to be reliable and valid measures, comparable to Likert scales (for an overview, see Hasson & Arnetz, 2005). Another limitation of this study was the method used to determine test-retest reliability. The average interval between measurements was 1 week, during which time the effect of treatment or external factors might be expected to produce a change in the ORS, in particular. Duncan et al. (2003) have stated that instruments which are sensitive to change can produce lower test-retest correlations. Accordingly, the correlation should not be interpreted too strictly. In order to determine test-retest correlations more accurately, future studies should use shorter intervals between measurements. Furthermore, the participants in this study included a relatively high percentage of males, so any future studies should include checks to determine whether the scores obtained are representative of the Dutch outpatient population as a whole.

One important aim of this study was to establish Dutch standards for the ORS/SRS. Based on the data obtained in this study, the clinical cut-off score of the ORS for Dutch patients attending outpatient clinics in connection with common mental disorders can be set at 24. This is one point lower than the American cut-off score. The present study gave an RCI for the ORS of 9 points, which differs from the American RCI of 5 (Miller & Duncan, 2004) but is more in line with the RCI of 8 found by Hafkenscheid et al. (2010). This means that, relative to American clients, Dutch clients need to achieve more change on the ORS in order to achieve reliable change. This has implications for the way in which the feedback system is used during therapy, as the standards underpin decisions on whether to change the approach or interventions used in the course of treatment. For example, if a Dutch client exhibits a positive change of 5 points on the ORS, this falls short of the Dutch RCI of 9 points and might result in the adoption of a different approach to treatment or even a change of therapist. In the same situation, the American interpretation would be that reliable change has been achieved and that no change of therapist or approach is necessary (given that there is a good therapeutic relationship).
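The difference in interpretation can be made concrete with a small sketch that applies both RCI values to the same score change. Treating a change equal to the RCI as reliable (the use of >= rather than >) is an assumption of this example, as are the function and constant names; the RCI values of 9 and 5 are those discussed above.

```python
DUTCH_RCI = 9      # RCI for the ORS found in the present study
AMERICAN_RCI = 5   # American RCI (Miller & Duncan, 2004)


def is_reliable_change(pre_score, post_score, rci):
    """True if the absolute change in ORS total score reaches the RCI.

    Assumption: a change exactly equal to the RCI counts as reliable.
    """
    return abs(post_score - pre_score) >= rci


# Hypothetical client who moves from 17 to 22 points on the ORS.
pre, post = 17, 22
print(is_reliable_change(pre, post, DUTCH_RCI))     # False
print(is_reliable_change(pre, post, AMERICAN_RCI))  # True
```

The same 5-point gain is therefore read as insufficient under the Dutch standard and as reliable change under the American one.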

The average scores on the SRS were lower than the American cut-off score of 36, and never exceeded 34 points during treatment. American data show that only 24% of cases fall below the cut-off score of 36 (Miller & Duncan, 2004), yet the present study found that 73% of cases fall below the American cut-off score at session 5. This suggests that a different cut-off score might apply to the Dutch SRS. The low mean scores on the SRS may be due to cultural differences or to the design of the study. Unlike the therapists in the American studies, the therapists in this study did not see the scores. It may be that, when the SRS is discussed during the session, this results in more socially desirable answers, which in turn lead to higher scores. Before determining a cut-off score for the Dutch SRS, this possibility needs to be investigated further in the context of an effect study (in which scores are discussed during treatment). A study of this kind is already underway.

The predictive validity of the quality of the therapeutic relationship, as measured by the SRS, was very limited. Although the SRS scores at sessions two and three were found to predict treatment outcome, this relationship was relatively weak, suggesting that the therapeutic relationship has only a marginal effect in this regard. However, further research is needed to determine whether the predictive validity of the SRS improves when it is actively used during treatment. As the treatments given in this study were very structured (the therapists used treatment manuals), the quality of the therapeutic relationships in question may be less relevant (e.g., Martin et al., 2000) than when less rigidly structured treatments are used.

In conclusion, this study has shown that while both the ORS and SRS demonstrate adequate reliability, their validity is limited. This finding is in line with those of previous studies. Accordingly, while the ORS and SRS can be very useful feedback instruments, it is advisable to supplement them (at intervals of several sessions) with better validated instruments, to corroborate progress. This study has also revealed a difference between Dutch and American standards for the ORS and SRS, which can have major implications for the way in which the feedback system is used. Accordingly, further research is needed on how standards differ from one country to another, as little is known of the standards used in countries other than the United States.

Table 7. Correlations (rs) between the SRS and the WAV-12 subscales and total scales at the beginning of treatment

                     WAV-12 Bond    WAV-12 Goal    WAV-12 Task    WAV-12 Total
                     (n = 235)      (n = 252)      (n = 248)      (n = 234)
SRS relationship        .32            .36            .37            .37
SRS goal                .38            .41            .40            .43
SRS approach            .31            .41            .46            .43
SRS overall             .37            .40            .45            .44
SRS total               .39            .43            .45            .46

Note. All correlations are significant at p < .01 level (2-tailed).

In using the ORS and the SRS, the main aims are to help therapists prevent dropout and to make therapy more efficient, by means of frequent feedback from clients. By repeatedly measuring the client’s progress and satisfaction with treatment, the therapist stays alert and the treatment maintains the right focus. The ORS and SRS are clinical track-and-trace tools that enhance treatment engagement and participation. Treatment outcome, however, needs to be corroborated by other, more valid measures.

References

Anker, M., Duncan, B. L., & Sparks, J. A. (2009). Using client feedback to improve couple outcomes: A randomized clinical trial in a naturalistic setting. Journal of Consulting and Clinical Psychology, 77, 693–704.

Arrindell, W. A., & Ettema, J. H. M. (2003). SCL-90, Handleiding bij een multidimensionele psychopathologie indicator [SCL-90, Manual for a multidimensional indicator of psychopathology]. Lisse, The Netherlands: Swets & Zeitlinger.

Asmus, F., Crouzen, M., & van Oenen, F. J. (2004). Outcome Rating Scale. Retrieved from http://scottdmiller.com/purchase-individual-or-group-licenses

Campbell, A., & Hemsley, S. (2009). Outcome Rating Scale and Session Rating Scale in psychological practice. Clinical utility of ultra-brief measures. Clinical Psychologist, 13, 1–9.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

Beljouw van, I. M. J., & Verhaak, P. F. M. (2010). Geschikte uitkomstmaten voor routinematige registratie door eerstelijnspsychologen [Appropriate outcome measures for routine registration by primary care psychologists]. Utrecht, The Netherlands: Nivel.

Bordin, E. S. (1979). The generalizability of the psychoanalytic concept of the working alliance. Psychotherapy: Theory, Research & Practice, 16, 252–260.

De Jong, K., Nugter, M. A., Polak, M. G., Wagenborg, J. E. A., Spinhoven, Ph., & Heiser, W. J. (2007). The outcome questionnaire (OQ-45) in a Dutch population: A cross-cultural validation. Clinical Psychology & Psychotherapy, 14, 288–301.

Derogatis, L. R. (1994). Symptom Checklist 90–R: Administration, scoring, and procedures manual (3rd ed.). Minneapolis, MN: National Computer Systems.

Duncan, B. L., Miller, S. D., Sparks, J. A., Claud, D. A., Reynolds, L. R., Brown, J., & Johnson, L. D. (2003). The session rating scale: Preliminary psychometric properties of a “working” alliance measure. Journal of Brief Therapy, 3, 3–12.

Hafkenscheid, A., Duncan, B. L., & Miller, S. D. (2010). The Outcome and Session Rating Scales: A cross-cultural examination of the psychometric properties of the Dutch translation. Journal of Brief Therapy, 7, 1–12.

Hannan, C., Lambert, M. J., Harmon, C., Nielsen, S. L., Smart, D. W., Shimokawa, K., & Sutton, S. (2005). A lab test and algorithms for identifying clients at risk for treatment failure. Journal of Clinical Psychology, 61, 155–163.

Hasson, D., & Arnetz, B. B. (2005). Validation and findings comparing VAS vs. Likert scales for psychosocial measurements. International Electronic Journal of Health Education, 8, 178–192.

Horvath, A. O., & Bedi, R. P. (2002). The alliance. In J. Norcross (Ed.), Psychotherapy relationships that work: Therapist contributions and responsiveness to patients (pp. 37–70). New York, NY: Oxford University Press.

Jacobson, N. S., & Truax, P. (1991). Clinical significance: A statistical approach to defining meaningful change in psychotherapy research. Journal of Consulting and Clinical Psychology, 59, 12–19.

Lambert, M. J., Hansen, N. B., Umphress, V. J., Lunnen, K., Okiishi, J., Burlingame, G., Huefner, J. C., & Reisinger, C. W. (1996). Administration and scoring manual for the Outcome Questionnaire (OQ 45.2). Wilmington, DE: American Professional Credentialing Services.

Lambert, M. J., & Shimokawa, K. (2011). Collecting client feedback. Psychotherapy, 48, 72–79.

Martin, D. J., Garske, J. P., & Davis, M. K. (2000). Relation of the therapeutic alliance with outcome and other variables: A meta-analytic review. Journal of Consulting and Clinical Psychology, 68, 438–450.

Miller, S. D., Duncan, B. L., Brown, J., Sparks, J., & Claud, D. (2003). The outcome rating scale: A preliminary study of the reliability, validity, and feasibility of a brief visual analogue measure. Journal of Brief Therapy, 2, 91–100.

Miller, S. D., & Duncan, B. L. (2004). The outcome and session rating scale. Administration and scoring manual. Chicago, IL: Institute for the Study of Therapeutic Change.

Miller, S. D., Duncan, B. L., Brown, J., Sorrell, R., & Chalk, M. B. (2006). Using formal client feedback to improve retention and outcome: Making ongoing, real time assessment feasible. Journal of Brief Therapy, 5, 5–22.

Reese, R. J., Norsworthy, L. A., & Rowlands, S. R. (2009). Does a continuous feedback system improve psychotherapy outcome? Psychotherapy: Theory, Research, Practice, Training, 46, 418–431.

Reese, R. J., Toland, M. D., Slone, N. C., & Norsworthy, L. A. (2010). Effect of client feedback on couple psychotherapy outcomes. Psychotherapy: Theory, Research, Practice, Training, 47, 616–630.

Stinckens, N., Ulburghs, A., & Claes, L. (2009). De werkalliantievragenlijst als sleutelelement in therapiegebeuren. Meting met behulp van de WAV-12, de Nederlandstalige verkorte versie van de Working Alliance Inventory [The working alliance questionnaire as a key element in therapy. Measurement using the WAV-12, the Dutch shortened version of the Working Alliance Inventory]. Tijdschrift voor Klinische Psychologie, 39, 44–60.

Date of acceptance: April 22, 2013
Published online: August 23, 2013

Pauline Janse

Department HSK Utrecht
HSK Group
3522 KE Utrecht
The Netherlands
Tel. +31 62 808-8475
E-mail [email protected]
