

ORIGINAL ARTICLE

Validation of a Short Form of the Western Ontario and McMaster Universities Osteoarthritis Index Function Subscale in Hip and Knee Osteoarthritis

GABRIEL BARON,1 FLORENCE TUBACH,1 PHILIPPE RAVAUD,1 ISABELLE LOGEART,2 AND MAXIME DOUGADOS3

Arthritis & Rheumatism (Arthritis Care & Research), Vol. 57, No. 4, May 15, 2007, pp 633–638. DOI 10.1002/art.22685. © 2007, American College of Rheumatology.

Objective. A short version of the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) function scale has recently been developed to enhance the applicability of the scale in routine practice and clinical research for patients with hip and knee osteoarthritis. The goal of the present study was to validate this short form.

Methods. We conducted a prospective 4-week cohort study of 1,036 outpatients. Performance on the WOMAC function long form (LF) and short form (SF) was compared. Agreement between responses on the 2 forms was examined according to a Bland-Altman plot. Responsiveness to change (by standardized response mean [SRM]), reproducibility (intraclass correlation coefficient [ICC]), and internal consistency (Cronbach’s alpha) were computed for both forms. Construct validity was assessed based on functional impairment as measured on a numerical rating scale.

Results. At baseline, 24% of patients who completed the WOMAC LF had missing data for at least 1 item as compared with only 6% of patients who completed the WOMAC SF. The mean WOMAC SF score was greater than the mean WOMAC LF score (mean ± SD difference −4.3 ± 4.8 on a 0–100 scale). SRMs were 0.61 and 0.73, ICCs were 0.76 and 0.68, and Cronbach’s alphas were 0.93 and 0.85 for the WOMAC LF and SF, respectively. The 2 forms had comparable correlation with functional impairment.

Conclusion. The WOMAC function short form has a low rate of missing data and is a responsive, reproducible, and valid measure. The mean SF score was 4 points higher than the mean LF score.

KEY WORDS. Western Ontario and McMaster Universities Osteoarthritis Index function; Short form; Osteoarthritis; Applicability; Performance.

Supported by an unrestricted grant from Merck Sharp & Dohme Chibret Laboratories, Paris, France.

1Gabriel Baron, MSc, Florence Tubach, MD, Philippe Ravaud, MD, PhD: AP-HP, Hôpital Bichat, Département d’Epidémiologie, Biostatistique et Recherche Clinique, INSERM, U738, Paris, and Université Paris 7 Denis Diderot, UFR de Médecine, U738, Paris, France; 2Isabelle Logeart, MD: Merck Sharp & Dohme Chibret Laboratories, Paris, France; 3Maxime Dougados, MD: Paris-Descartes University, Assistance Publique Hôpitaux de Paris, Cochin Hospital, Paris, France.

Dr. Dougados has received consulting fees and/or honoraria (less than $10,000) from Merck Sharp & Dohme-Chibret Laboratories.

Address correspondence to Gabriel Baron, MSc, Département d’Epidémiologie Biostatistique et Recherche Clinique, INSERM U738, Groupe Hospitalier Bichat-Claude Bernard, 46 rue Henri Huchard, 75018 Paris, France. E-mail: [email protected].

Submitted for publication March 21, 2006; accepted in revised form September 28, 2006.

INTRODUCTION

The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) function subscale is widely used in clinical trials of hip and knee osteoarthritis (OA) (1). It is a valid, reliable, and responsive measure of function in hip and knee OA (2,3). This index is self-administered, and the function subscale includes 17 items addressing the degree of difficulty in accomplishing 17 activities of daily life. The original WOMAC function subscale has been shortened recently to increase its applicability in epidemiologic studies, clinical trials, and daily clinical practice. Most procedures to shorten a scale rely on statistical approaches (4). The procedure used to derive the WOMAC short form was based mainly on the perspective of patients and rheumatologists (5). The WOMAC function short form (SF) consists of 8 of the 17 items of the long form (LF) and has psychometric properties equivalent to those of the LF. However, this SF has not been validated in an independent sample. The main objective of this study was to investigate the psychometric performance of the WOMAC SF in a large independent sample of patients with hip and knee OA.

PATIENTS AND METHODS

Study population. We conducted a prospective cohort study of 4 weeks’ duration involving outpatients with hip or knee OA as defined by the American College of Rheumatology (6,7) included by 308 private rheumatologists in France. Each rheumatologist had to include 4 patients, 3 with knee OA and 1 with hip OA. To be included in the study, patients had to be receiving treatment with a nonsteroidal antiinflammatory drug (NSAID). Inclusion began with the onset of treatment or with a switch from one NSAID to another. Patients were excluded if they had a prosthesis on the assessed joint or if they had been treated with intraarticular injection in the 4 weeks before the study began. All patients initially visited the rheumatologist in charge of their case, and NSAID therapy was prescribed (the drug and its dose were chosen by the physician). A final visit to the same rheumatologist was scheduled 4 weeks later.

The study protocol was approved by the ethics committee of Cochin Hospital (Paris, France). The study was conducted in compliance with the Good Clinical Practice and the Declaration of Helsinki principles. In accordance with French national law, all patients gave their written informed consent.

Measurements. At the baseline visit, demographic and disease data (particularly disease duration) were collected. At baseline and at final visit, all patients were asked to assess their pain on movement, global assessment of disease activity, and functional impairment on an 11-point numerical rating scale (NRS), the score ranging from 0 to 10 (high score indicates a high level of symptoms).

At baseline and at final visit, half of the patients (the LF group) completed the original French Canadian version of the WOMAC function subscale (17 items, 5-point Likert scale version where 0 = none, 1 = slight, 2 = moderate, 3 = severe, and 4 = extreme), with the total score ranging from 0 to 68 and a high score indicating a high degree of functional impairment (8). In the LF group at baseline, the WOMAC SF (8 items, total score ranging from 0 to 32) was extracted from the LF. The other half of the patients (the SF group) completed the SF but not the LF (5). Items of the LF and SF are shown in Table 1. At the final visit only, all patients (LF and SF groups) assessed their response to therapy over 4 weeks in terms of functional impairment on a 3-point Likert scale (where 1 = improved function, 2 = no change, and 3 = worsened function).

Statistical analysis. Scores on the WOMAC LF, WOMAC SF, and NRS (pain, global assessment, and functional impairment) were linearly transformed to 0–100 scores by multiplying scores by 100 and dividing by the theoretical maximum of each score. Correlations and agreement between the 2 forms were investigated in the LF group, with the scores of the SF extracted from the LF. Correlation between the WOMAC LF and SF scores at baseline was evaluated by Spearman’s correlation coefficient. The same method was applied to the change in score between baseline and final visits. Agreement between the WOMAC LF and SF was examined by a Bland-Altman approach (9). The difference between the WOMAC LF and SF scores was plotted against the average of the 2 scores, and limits of agreement involving the mean difference and the SD of the difference were computed and plotted. By definition, if differences are normally distributed, 95% of individual differences are within 2 SDs of the mean difference (i.e., within the limits of agreement). The Bland-Altman plot is useful in searching for any systematic bias, assessing random error, and revealing whether the difference between scores depends on the level of scores.
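The score transformation and the limits of agreement described above can be illustrated with a minimal Python sketch. The function names, the sign convention (LF minus SF), and the multiplier `k = 2` are assumptions made for illustration; the study's own computations were done in SAS and are not reproduced here.

```python
from statistics import mean, stdev


def rescale_0_100(raw_score, theoretical_max):
    """Linearly map a raw score onto 0-100 (e.g. max 68 for the LF, 32 for the SF)."""
    return raw_score * 100.0 / theoretical_max


def bland_altman(lf_scores, sf_scores, k=2.0):
    """Summarise agreement between paired LF and SF scores.

    Differences are taken as LF minus SF (an assumed convention).
    k is the SD multiplier for the limits of agreement (assumed to be 2).
    """
    if len(lf_scores) != len(sf_scores) or len(lf_scores) < 2:
        raise ValueError("need two paired lists of equal length (at least 2)")
    diffs = [lf - sf for lf, sf in zip(lf_scores, sf_scores)]
    averages = [(lf + sf) / 2.0 for lf, sf in zip(lf_scores, sf_scores)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD of the paired differences
    return {
        "mean_diff": mean_diff,
        "sd_diff": sd_diff,
        "lower": mean_diff - k * sd_diff,
        "upper": mean_diff + k * sd_diff,
        # (average, difference) pairs are the points of the Bland-Altman plot
        "points": list(zip(averages, diffs)),
    }
```

A raw LF total of 34 out of 68 and a raw SF total of 16 out of 32 both rescale to 50, which is what makes the two forms comparable on a common axis.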

We computed the internal consistency of the LF in the LF group and the SF in the SF group. Internal consistency was assessed by Cronbach’s alpha to examine the degree to which the items in a scale measure the same concept (10). A value ≥0.70 is generally regarded as acceptable (11).
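As a rough illustration of the internal consistency statistic, the sketch below computes Cronbach's alpha from complete item responses using the standard formula alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the input layout (one list of item scores per respondent, no missing values) are assumptions for this example.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for complete data.

    responses: list of respondents, each a list of k item scores.
    """
    if len(responses) < 2:
        raise ValueError("need at least 2 respondents")
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("need at least 2 items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of the total (summed) score
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)
```

Alpha reaches 1 when all items rank respondents identically, which is why a very high value for a long scale can hint at redundant items.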

Table 1. Items of the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) function long form (LF) and short form (SF): missing data per item

Proportion of missing data at baseline, %

| Item | WOMAC LF* | WOMAC SF† |
|---|---|---|
| Descending stairs | 3.3 | 2.7 |
| Ascending stairs | 3.7 | 2.9 |
| Rising from sitting | 3.5 | 3.3 |
| Standing | 3.9 | |
| Bending to floor | 4.9 | |
| Walking on flat surface | 3.5 | 3.0 |
| Getting in/out of a car | 3.7 | 3.0 |
| Shopping | 5.5 | 4.0 |
| Putting on socks | 4.5 | 2.9 |
| Rising from bed | 4.1 | |
| Taking off socks | 4.1 | |
| Lying in bed | 4.7 | |
| Getting in/out of bath | 9.8 | |
| Sitting | 3.9 | |
| Getting on/off toilet | 4.1 | 2.5 |
| Heavy domestic duties | 9.6 | |
| Light domestic duties | 8.4 | |

* 2.5% of the patients completed none of the items of the WOMAC LF.
† 2.3% of the patients completed none of the items of the WOMAC SF.

The responsiveness to change and the reproducibility of each tool were assessed in the subsample in which the tool was used (i.e., LF group for the WOMAC LF, SF group for the WOMAC SF). The responsiveness to change measured over 4 weeks was computed using the standardized response mean (SRM), defined as the ratio of the mean change (pretreatment minus posttreatment) to the SD of the change scores. Calculation of 95% confidence intervals (95% CIs) involved use of a bootstrap method. SRM can be considered large (>0.80), moderate (0.5–0.8), or small (<0.5) (12). As recommended (13), the SRM was also calculated for the 3 groups of patients classified according to the direction of change in function (improved function, no change, worsened function), which was considered an external indicator of change. Intraclass correlation coefficients (ICCs) indicating reproducibility were computed in the subgroup of patients who reported no change in function between baseline visit and visit at week 4 (14). This computation relies on the strong assumption that no change in the outcomes is equivalent to no perceived change in functional status of the patient measured between baseline and final visit. However, this method reduces the influence of information recall associated with shorter periods of retest (15). ICC values range from 0 (totally unreproducible) to 1 (perfectly reproducible); an ICC >0.75 is considered excellent reproducibility (16).
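The SRM definition above (mean change divided by the SD of the change scores, with change taken as pretreatment minus posttreatment) can be sketched in Python as follows. The text states only that "a bootstrap method" was used for the confidence intervals; the percentile bootstrap, the number of resamples, and the function names below are assumptions for illustration, not the authors' procedure.

```python
import random
from statistics import mean, stdev


def srm(baseline, final):
    """Standardized response mean: mean change / SD of change (pre minus post)."""
    changes = [b - f for b, f in zip(baseline, final)]
    return mean(changes) / stdev(changes)


def srm_bootstrap_ci(baseline, final, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the SRM (assumed variant of the bootstrap)."""
    pairs = list(zip(baseline, final))
    if len({b - f for b, f in pairs}) < 2:
        raise ValueError("change scores must not all be identical")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    while len(estimates) < n_boot:
        # Resample patients (paired observations) with replacement
        sample = [rng.choice(pairs) for _ in pairs]
        changes = [b - f for b, f in sample]
        sd = stdev(changes)
        if sd == 0:
            continue  # skip degenerate resamples where the SRM is undefined
        estimates.append(mean(changes) / sd)
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

With this sign convention a positive SRM indicates improvement (a lower score at follow-up), which matches the positive values reported for patients with improved function and the negative values for those who worsened.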

Convergent validity was assessed by examining the correlation of the WOMAC LF and SF with the measure of functional impairment (on an NRS) by Spearman’s correlation coefficient in the LF group. We also assessed the correlation of the WOMAC LF and SF scores with that of other clinical measures of OA severity such as patient global assessment and pain (on an NRS). These variables were expected to correlate less with the WOMAC LF than the WOMAC SF. All statistical analyses involved use of SAS software, version 9.1 (SAS Institute, Cary, NC).
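For readers unfamiliar with Spearman's coefficient, the following Python sketch shows one common way to compute it: rank both variables (tied values share the average rank) and take the Pearson correlation of the ranks. The helper names are hypothetical, and this is not the SAS procedure used in the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2.0 + 1.0
        for j in range(pos, end + 1):
            ranks[order[j]] = shared_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired lists of equal length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("a variable with no variation has no rank correlation")
    return cov / (sxx * syy) ** 0.5
```

Because only ranks enter the calculation, the result is unchanged by the 0–100 rescaling of the scores.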

RESULTS

In all, 1,036 patients were enrolled in the study. The validation process was based on 878 patients (84.7%): 661 (75.3%) with knee OA and 217 (24.7%) with hip OA. Characteristics of these 878 patients are described in Table 2. Among the 878 patients, 821 (93.5%) completed the visit at week 4. Excluded patients (n = 158 [15.3%]) did not differ from included patients (Table 2). Reasons for exclusion were missing data for OA location (hip or knee) and incomplete WOMAC data at baseline (i.e., missing data for at least 1 item). The proportion of patients with incomplete WOMAC data at baseline was 23.9% for the LF group (with missing values per item varying between 3.3% and 9.8%) and 6.1% for the SF group (with missing values per item varying between 2.7% and 4.0%) (Table 1).

Table 2. Characteristics of the patients*

| | LF group (n = 512): Knee OA (n = 291) | LF group: Hip OA (n = 98) | LF group: Patients excluded from the analysis (n = 123)† | SF group (n = 524): Knee OA (n = 370) | SF group: Hip OA (n = 119) | SF group: Patients excluded from the analysis (n = 35)‡ |
|---|---|---|---|---|---|---|
| Age, years | 66.5 ± 10.7 | 64.8 ± 12.3 | 68.7 ± 10.6 | 66.7 ± 11.1 | 64.7 ± 11.9 | 69.5 ± 10.5 |
| Female sex, no. (%) | 211 (72.5) | 63 (64.3) | 76 (61.8) | 262 (70.8) | 72 (60.5) | 24 (68.6) |
| Body mass index, kg/m2 | 28.1 ± 4.6 | 25.9 ± 4.0 | 28.0 ± 5.3 | 27.9 ± 4.4 | 26.4 ± 4.1 | 27.8 ± 3.2 |
| Disease duration, years | 3.8 ± 4.2 | 3.1 ± 4.3 | 5.6 ± 8.2 | 4.3 ± 5.6 | 2.5 ± 3.5 | 2.4 ± 3.4 |
| Kellgren/Lawrence grade II, no. (%) | 64 (22.0) | 23 (23.7) | 22 (18.0) | 105 (28.6) | 27 (22.9) | 10 (31.3) |
| Kellgren/Lawrence grade III, no. (%) | 141 (48.5) | 41 (42.3) | 50 (41.0) | 139 (37.9) | 63 (53.4) | 13 (40.6) |
| Kellgren/Lawrence grade IV, no. (%) | 86 (29.6) | 33 (34.0) | 50 (41.0) | 123 (33.5) | 28 (23.7) | 9 (28.1) |
| NSAID intake during the previous 4 weeks, no. (%) | 104 (36.0) | 37 (37.8) | 48 (39.3) | 136 (36.8) | 45 (37.8) | 16 (45.7) |
| Analgesic intake during the previous 4 weeks, no. (%) | 173 (60.1) | 54 (55.7) | 71 (58.7) | 209 (56.6) | 69 (59.0) | 19 (54.3) |
| Symptomatic slow-acting drug intake during the previous 4 weeks, no. (%)§ | 109 (38.8) | 32 (33.7) | 39 (32.8) | 111 (31.1) | 37 (33.6) | 6 (18.8) |
| Pain (0–100 NRS), baseline | 63.3 ± 17.9 | 60.8 ± 20.3 | 60.5 ± 19.0 | 61.8 ± 18.7 | 59.3 ± 18.9 | 68.1 ± 20.6 |
| Pain (0–100 NRS), week 0 to week 4 | −21.0 ± 25.8 | −15.8 ± 20.6 | −12.7 ± 24.8 | −22.5 ± 24.5 | −19.0 ± 24.7 | −22.3 ± 25.6 |
| Global assessment (0–100 NRS), baseline | 62.6 ± 17.3 | 60.1 ± 18.7 | 60.8 ± 18.6 | 59.2 ± 17.6 | 57.8 ± 17.7 | 61.0 ± 20.4 |
| Global assessment (0–100 NRS), week 0 to week 4 | −19.3 ± 24.3 | −15.1 ± 24.1 | −13.4 ± 22.1 | −19.1 ± 21.8 | −16.9 ± 23.7 | −14.7 ± 27.0 |
| Functional impairment (0–100 NRS), baseline | 61.4 ± 18.6 | 58.4 ± 21.1 | 59.5 ± 18.3 | 57.8 ± 19.7 | 55.6 ± 19.7 | 58.3 ± 25.0 |
| Functional impairment (0–100 NRS), week 0 to week 4 | −18.9 ± 23.0 | −15.8 ± 22.4 | −13.0 ± 21.0 | −17.6 ± 24.1 | −15.7 ± 24.4 | −14.5 ± 22.2 |
| WOMAC function LF (0–100), baseline | 46.0 ± 16.4 | 47.0 ± 16.3 | — | — | — | — |
| WOMAC function LF (0–100), week 0 to week 4 | −10.9 ± 16.8 | −7.4 ± 15.2 | — | — | — | — |
| WOMAC function SF (0–100), baseline | 50.6 ± 15.5 | 50.4 ± 16.2 | — | 49.8 ± 16.6 | 48.6 ± 14.4 | — |
| WOMAC function SF (0–100), week 0 to week 4 | −11.2 ± 18.1 | −8.3 ± 15.6 | — | −12.5 ± 16.5 | −11.3 ± 17.0 | — |

* Values are the mean ± SD unless otherwise indicated. WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index; LF = long form; SF = short form; OA = osteoarthritis; NSAID = nonsteroidal antiinflammatory drug; NRS = numerical rating scale.
† Patients in the LF group with missing data at baseline for the WOMAC function LF (n = 122) or without arthritis location (n = 1).
‡ Patients in the SF group with missing data at baseline for the WOMAC function SF (n = 32) or without arthritis location (n = 3).
§ Other than NSAID.

Mean ± SD WOMAC LF and SF scores were 46.3 ± 16.4 (median 47.1) and 49.5 ± 16.1 (median 50.0), respectively. The overall distribution of WOMAC LF and SF scores was similar at baseline (Figure 1), and the WOMAC LF and SF scores were highly correlated (r = 0.95, P < 0.0001). The WOMAC LF and SF were also strongly correlated in change in score measured between baseline and week 4 (r = 0.95, P < 0.0001). Agreement between the WOMAC LF and SF evaluated by the Bland-Altman approach is presented in Figure 2. The variability was random and uniform throughout the range of values. The mean WOMAC SF score was greater than the mean WOMAC LF score (mean ± SD difference −4.3 ± 4.8). The limits of agreement showed that 95% of the differences between the 2 scales can be expected to lie between −13.7 and 5.2 points (2 SDs of the mean difference). Cronbach’s alpha coefficients at baseline demonstrated high internal consistency for the WOMAC LF and SF (0.93 and 0.85, respectively).

The 4-week responsiveness to change of the LF was lower than that of the SF (SRM 0.61; 95% CI 0.51, 0.71 versus SRM 0.73; 95% CI 0.64, 0.85). When considering the direction of change in function, the SRM was similar for the LF and SF for patients who reported no change or worsened function, but not for patients who reported improved function (Table 3). ICCs were 0.76 (95% CI 0.66, 0.83) for the WOMAC LF and 0.68 (95% CI 0.57, 0.77) for the WOMAC SF. Both the WOMAC SF and the WOMAC LF correlated with each of the following 3 criteria: functional impairment as measured on an NRS (which indicates similar convergent validity) and pain and global assessment as measured on an NRS (correlations with pain and global assessment were less than with functional impairment) (Table 4).

DISCUSSION

In this study, we compared the performance of the complete WOMAC function LF with the WOMAC SF in a large sample of patients with OA (hip and knee) and with a broad range of disease severity. The LF and SF have equivalent properties, except for rate of missing data, which was lower with the WOMAC SF.

Authors of the original version of the WOMAC recommend imputing data for ≥1 missing items by using the mean of the completed items. The major disadvantage of this technique is the underestimation of the variance. Shortening the WOMAC function subscale from 17 to 8 items greatly influences the level of missing data. When considering baseline data, only 6% of the patients who completed the WOMAC SF had incomplete data, as compared with 24% of the patients who completed the WOMAC LF. This simple result could strengthen the data quality and appeal of the SF. Although no consensual method of handling missing data exists (17,18), this concern should not be neglected. When comparing data from the WOMAC SF extracted from the WOMAC LF in the LF group and that from the WOMAC SF in the SF group, the rate of missing data for each item was higher for the former data. This result could mean that supplementary items affect the global data quality.
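The imputation rule mentioned above, replacing each missing item with the mean of that patient's completed items, can be sketched in Python as below. The function name and the use of `None` for a missing answer are assumptions for this illustration, and the sketch deliberately omits any cap on the number of missing items that may be imputed.

```python
def impute_with_completed_mean(item_scores):
    """Replace missing items (None) with the mean of the completed items.

    item_scores: one patient's answers, e.g. 8 values for the SF or 17 for the LF.
    """
    completed = [score for score in item_scores if score is not None]
    if not completed:
        raise ValueError("cannot impute when no item was completed")
    fill_value = sum(completed) / len(completed)
    return [fill_value if score is None else score for score in item_scores]
```

Every imputed item sits exactly at the patient's own mean, so imputed answers add no within-patient spread, which is one way to see why this approach tends to underestimate variance.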

Figure 1. Distribution of baseline Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) scores (on a 0–100 numerical rating scale) for the long form and short form.

Figure 2. Bland-Altman plot of difference between scores on the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) long form (LF) and short form (SF) by mean WOMAC LF and SF (d) in patients who completed the full WOMAC function LF (n = 389). The 3 horizontal lines indicate mean individual difference (d) ± 2 SD. The mean WOMAC SF score was greater than the mean WOMAC LF score (mean difference −4.3). The limits of agreement show that 95% of the differences between the 2 scales can be expected to lie between −13.7 and 5.2 points. The agreement is not dependent on the level of score.

All of the characteristics of the WOMAC SF were expected to be similar to those of the WOMAC LF. The validity of the WOMAC SF was demonstrated not only by its similar correlation with the WOMAC LF, but also with the Bland-Altman plot. However, we found that the WOMAC SF overestimated the WOMAC LF by 4 points on average (on a 0–100 scale). In randomized controlled trials, because the same instrument (WOMAC LF or SF) is used in all groups, this overestimation will equally affect the 2 groups and will not affect the difference between groups. This concern should be taken into account when performing meta-analyses of pooled data from trials using either the WOMAC LF or SF. Ninety-five percent of the difference between the 2 scales was between −13.7 and 5.2 points (on a 0–100 scale). Our Cronbach’s alpha values of the LF and SF are identical to those observed for both forms in the study from which the WOMAC SF was derived (5). The WOMAC SF is an internally consistent instrument for patients. The high coefficient alpha value of the WOMAC LF may reflect redundancy among items. We found sensitivity to change values of the 2 instruments to be less similar than those in the study from which the WOMAC SF was derived. One possible explanation is that in the present study the WOMAC SF was not directly derived from the WOMAC LF. When considering perceived change in function as an external indicator, the SRM was higher for the WOMAC SF in the subgroup of patients who reported improvement. This finding reinforces the usefulness of this scale because what we are looking for in trials is improvement. The reduction of the scale length from 17 to 8 items did not seriously affect the excellent reproducibility of the LF. In our study, we evaluated reproducibility in patients who reported no change in functional impairment at 4 weeks. To provide a more exact measure, reproducibility should be evaluated in an independent test–retest study with a reduced interval of time (e.g., 2 days). Finally, this study provides some evidence of convergent validity of the 2 scales.

A limitation of this study is the use of the data provided by the LF to validate some of the properties of the SF in the LF group. This use of data may constitute a framing bias, and our results could lead to an overestimation of the similarity between the 2 forms (19). This problem is inherent to many validation studies. In the present study, whenever possible, we performed analysis on separate samples (Cronbach’s alpha, SRM, ICC).

In conclusion, our study indicates that the WOMAC SF, which was initially developed with participation of a large panel of patients and rheumatologists, is a good alternative to the WOMAC LF for the assessment of function in patients with hip and knee OA. The WOMAC SF retained properties of the parent WOMAC LF (responsiveness, reproducibility, and validity). Because of its simplicity, ease of application, and low rate of missing data, the SF may be of interest in routine practice and in clinical research.

AUTHOR CONTRIBUTIONS

Dr. Baron had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study design. Tubach, Ravaud, Logeart, Dougados.
Acquisition of data. Logeart.
Analysis and interpretation of data. Baron, Tubach, Ravaud, Logeart, Dougados.
Manuscript preparation. Baron, Tubach, Ravaud, Logeart, Dougados.
Statistical analysis. Baron, Tubach, Ravaud, Dougados.

Table 3. Responsiveness to change of the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) function long form (LF) and short form (SF) according to the patient’s assessment of perceived change in function after 4 weeks*

| Scale | Improved function: SRM | Improved function: 95% CI | No change: SRM | No change: 95% CI | Worsened function: SRM | Worsened function: 95% CI |
|---|---|---|---|---|---|---|
| WOMAC LF† | 0.92 | 0.77, 1.08 | 0.20 | 0.00, 0.39 | −0.65 | −1.36, −0.20 |
| WOMAC SF‡ | 1.09 | 0.95, 1.25 | 0.18 | 0.00, 0.36 | −0.63 | −1.35, −0.15 |

* SRM = standardized response mean; 95% CI = 95% confidence interval.
† Improved function: n = 203, no change: n = 97, worsened function: n = 17.
‡ Improved function: n = 295, no change: n = 119, worsened function: n = 13.

Table 4. Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) function long form (LF) and short form (SF) Spearman’s correlations with scores on a numerical rating scale of functional impairment, pain, and global assessment by patients who completed the full WOMAC function LF (n = 389)

| | WOMAC LF | WOMAC SF | Functional impairment* | Pain* | Global assessment* |
|---|---|---|---|---|---|
| WOMAC LF | 1.00 | 0.95 | 0.64 | 0.52 | 0.53 |
| WOMAC SF | | 1.00 | 0.65 | 0.54 | 0.56 |
| Functional impairment* | | | 1.00 | 0.72 | 0.74 |
| Pain* | | | | 1.00 | 0.79 |
| Global assessment* | | | | | 1.00 |

* Measured on a 0–10 numerical rating scale.


ROLE OF THE STUDY SPONSOR

Merck Sharp & Dohme-Chibret Laboratories participated in the study design, data collection, agreement to submit the manuscript, and approval of the content of the submitted manuscript. They did not participate in the data analysis or in the writing of the manuscript.

REFERENCES

1. Bellamy N, Buchanan WW, Goldsmith CH, Campbell J, Stitt LW. Validation study of WOMAC: a health status instrument for measuring clinically important patient relevant outcomes to antirheumatic drug therapy in patients with osteoarthritis of the hip or knee. J Rheumatol 1988;15:1833–40.

2. Bellamy N, Kean WF, Buchanan WW, Gerecz-Simon E, Campbell J. Double blind randomized controlled trial of sodium meclofenamate (Meclomen) and diclofenac sodium (Voltaren): post validation reapplication of the WOMAC Osteoarthritis Index. J Rheumatol 1992;19:153–9.

3. McConnell S, Kolopack P, Davis AM. The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC): a review of its utility and measurement properties. Arthritis Rheum 2001;45:453–61.

4. Whitehouse SL, Lingard EA, Katz JN, Learmonth ID. Development and testing of a reduced WOMAC function scale. J Bone Joint Surg Br 2003;85:706–11.

5. Tubach F, Baron G, Falissard B, Logeart I, Dougados M, Bellamy N, et al. Using patients’ and rheumatologists’ opinions to specify a short form of the WOMAC function subscale. Ann Rheum Dis 2005;64:75–9.

6. Altman R, Alarcon G, Appelrouth D, Bloch D, Borenstein D, Brandt K, et al. The American College of Rheumatology criteria for the classification and reporting of osteoarthritis of the hip. Arthritis Rheum 1991;34:505–14.

7. Altman R, Asch E, Bloch D, Bole G, Borenstein D, Brandt K, et al. Development of criteria for the classification and reporting of osteoarthritis: classification of osteoarthritis of the knee. Arthritis Rheum 1986;29:1039–49.

8. Choquette D, Bellamy N, Raynauld JP. A French-Canadian version of the WOMAC osteoarthritis index [abstract]. Arthritis Rheum 1994;37:S226.

9. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307–10.

10. Bland JM, Altman DG. Cronbach’s alpha. BMJ 1997;314:572.

11. Nunnally JC. Psychometric theory. New York: McGraw-Hill; 1978.

12. Husted JA, Cook RJ, Farewell VT, Gladman DD. Methods for assessing responsiveness: a critical review and recommendations [review]. J Clin Epidemiol 2000;53:459–68.

13. Fortin PR, Stucki G, Katz JN. Measuring relevant change: an emerging challenge in rheumatologic clinical trials. Arthritis Rheum 1995;38:1027–30.

14. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull 1979;86:420–8.

15. Streiner DL, Norman GR. Health measurement scales: a practical guide to their development and use. 2nd ed. New York: Oxford Medical Publications; 1995.

16. Marx RG, Menezes A, Horovitz L, Jones EC, Warren RF. A comparison of two time intervals for test-retest reliability of health status instruments. J Clin Epidemiol 2003;56:730–5.

17. Committee for Proprietary Medicinal Products. ICH topic E 9: statistical principles for clinical trials (CPMP/ICH/363/96). URL: http://www.emea.europa.eu/pdfs/human/ich/036396en.pdf.

18. Committee for Proprietary Medicinal Products. Points to consider on missing data (CPMP/EWP/1776/99). URL: http://www.emea.europa.eu/pdfs/human/ewp/177699EN.pdf.

19. Haavardsholm EA, Kvien TK, Uhlig T, Smedstad LM, Guillemin F. A comparison of agreement and sensitivity to change between AIMS2 and a short form of AIMS2 (AIMS2-SF) in more than 1,000 rheumatoid arthritis patients. J Rheumatol 2000;27:2810–6.
