

Neurourology and Urodynamics 13:323-331 (1994)

Statistical Evaluation of the Function of the 1992 International Continence Society Scientific Committee

R. van Mastrigt and J.W. Downie

Department of Urology-Urodynamics, Erasmus University Rotterdam, Rotterdam, the Netherlands (R.v.M.); Department of Pharmacology, Dalhousie University, Nova Scotia, Canada (J.W.D.)

Papers submitted to the International Continence Society are read by the six members of the scientific committee, who assign three scores to each paper: one for its originality, one for its scientific value, and one for its academic or clinical interest. Following discussion at the scientific committee meeting, a program is made from the abstracts with the highest scores. In this study statistical properties of the assigned score values for the 1992 ICS meeting are discussed. It is concluded that the three scores do not measure independent properties of the abstracts, that different abstract types (for instance "clinical" and "basic" papers) are scored differently by the different committee members, that there is a significant consensus among the committee members, that the scientific committee meeting has a relatively small effect on the scores, and that different abstract types are not equally represented in the final program.

Key words: International Continence Society, abstracts, abstract scores, scores, scientific meeting

INTRODUCTION

Abstracts submitted for ICS meetings are collected by that specific year's scientific chairman, and anonymous versions of them are sent, in the order in which they are received, to the members of the scientific committee. Apart from the chairman, this committee consists of the chairman of the previous year, the chairman of the next year, one local representative, and two ICS representatives, one scientific and one clinical, who are alternately elected by the ICS for a 2-year period. All six committee members read all abstracts and assign three scores to each abstract (except to those they (co)authored): one for its originality, one for its scientific value, and one for its academic or clinical interest. Each score ranges from 0 to 3. The chairman collects all scores in a database. After the deadline for abstract submission has passed, the committee meets and discusses all (still anonymous) abstracts, with special attention to those abstracts that show wide discrepancies in scoring. At this stage committee members can change the assigned scores. All scores are added, and finally a program is made up from the abstracts with the highest total scores. The cut-off point for presentations is not predefined but derived from the distribution of scores and the number of available slots in the program.

Received for publication October 4, 1993; accepted January 18, 1994.

Address reprint requests to Dr. ir. R. van Mastrigt, Department of Urology-Urodynamics, Room EE1630, Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, the Netherlands.

© 1994 Wiley-Liss, Inc.

At the 1992 scientific committee meeting some questions emerged regarding the scoring system that might be answered by statistical processing of the assigned scores. In this article the following aspects will be discussed:

1. Are the scores normally distributed, and/or should they be normally distributed?

2. Are the three scores (for originality, scientific value, and clinical or academic interest) mutually independent?

3. Are the scores dependent on the category of the abstract and the background of the committee members? (For instance, do "basic science" papers score differently from "clinical" papers, and are there differences between the scores from basic scientists and clinicians?)

4. Does the scoring change as a function of time, i.e., are abstracts submitted, and therefore read, later scored differently from earlier submitted abstracts?

5. How are the scores affected by the discussion at the scientific committee meeting; does this increase the degree of consensus?

6. How are the abstract categories distributed over the program categories (for instance, are "clinical" papers rejected more often than "basic" papers)?

METHODS

Following the ICS meeting, both the scores assigned before and those assigned after the scientific committee meeting were transferred to an IBM-compatible PC and processed using the statistical package SPSS. Apart from the three scores from each committee member and the abstract number (a number given when the abstract was received, which therefore represents the historical order in which abstracts were read by the committee members), an abstract type number was assigned. Abstracts involving isolated tissue or using experimental animals were called "basic." Urodynamic studies were called "clinical" unless the thrust was mathematical or the study was intended to compare methodologies or explore new methodologies or technologies, in which case they were considered "basic urodynamics." All other abstracts involving patients or patient-related issues were labelled "clinical," except for survey-based studies, which were considered a separate category ("survey"). Numbers between 1 and 6 were assigned randomly to represent the committee members anonymously. If not explicitly stated otherwise, all results shown refer to the score values after the scientific committee meeting, i.e., after the committee members had had an opportunity to change their scores as a result of the discussions in the meeting.
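The selection step described in the Introduction (add all scores per abstract, then fill the available slots from the top) can be sketched as follows. This is a minimal illustration only: the nested-dictionary layout, the function names, and the tie-break on abstract number are assumptions, and the paper does not say how abstracts that a member (co)authored, and therefore did not score, were handled. Here the available scores are simply summed.

```python
# Hypothetical layout (not from the paper):
# scores[abstract_number][member] = (originality, scientific_value, interest)

def total_score(member_scores):
    """Sum the three scores (each 0 to 3) over all members who scored."""
    return sum(sum(triple) for triple in member_scores.values())


def select_program(scores, slots):
    """Return the abstract numbers with the highest totals, plus all totals.

    Ties are broken by abstract number; this is an assumption made only
    so that the result is deterministic.
    """
    totals = {abstract: total_score(ms) for abstract, ms in scores.items()}
    ranked = sorted(totals, key=lambda a: (-totals[a], a))
    return ranked[:slots], totals


# Toy data: three abstracts scored by two members.
example = {
    1: {1: (3, 3, 3), 2: (2, 2, 2)},
    2: {1: (1, 1, 1), 2: (0, 1, 2)},
    3: {1: (2, 3, 2), 2: (3, 3, 3)},
}
accepted, totals = select_program(example, slots=2)
print(accepted, totals)
```

In the procedure described in the paper the cut-off is not fixed in advance, so `slots` stands in for whatever number of presentations the program can hold.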

RESULTS

In 1992, 324 abstracts were received and scored. Table I shows, for each of the six members of the scientific committee, the distribution of the scores for each of the three categories: originality, scientific value, and interest. Also the mean, standard deviation, and skewness of the distributions are shown, as well as the significance of the difference from a normal distribution according to the Kolmogorov-Smirnov goodness-of-fit test. It can be seen that none of the scores from any member was normally distributed. The test compares the tested distribution with a normal distribution with the same mean and standard deviation. The fact that for all scores, except the originality score from member 4, the mean was not 1.5 (the midpoint of the score range) therefore did not contribute to the significance of the test. There is no uniform pattern in the abnormality of the score distributions. Some distributions are strongly skewed to the right; see for instance the originality score of member 6: in addition to the fact that the mean of this distribution is larger than 1.5, there are many more papers that received a score above the mean than papers that had a score below the mean. Other distributions are strongly skewed to the left, for instance the scientific value score of member 5: in spite of the low mean value of 1.0, many more papers scored below the mean than above.

TABLE I. The Distribution of Scores for Each of the Six Members of the Scientific Committee*

Member  Category  Mean  SD    Skewness
1       Ori       1.6   .73    .13
        Sci       1.7   ?     -.28
        Int       1.6   .62    ?
2       Ori       1.8   .70    .24
        Sci       1.9   .68    .07
        Int       1.7   ?      .03
3       Ori       1.3   .71   -.15
        Sci       1.3   .60    .27
        Int       1.3   .60    .16
4       Ori       1.5   .78    .04
        Sci       1.4   .74   -.14
        Int       1.4   .73    ?
5       Ori       1.3   .90    .02
        Sci       1.0   .87    .36
        Int       1.1   .80   -.02
6       Ori       1.8   .68   -.32
        Sci       1.6   .58    .05
        Int       1.6   ?      .17

[The columns giving the number of abstracts per score value (0 to 3) and the Kolmogorov-Smirnov significance are not legible in the source scan; "?" marks an illegible entry.]

*Ori, originality; Sci, scientific value; Int, interest.
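The normality check behind Table I can be sketched in a few lines. The authors used SPSS; the code below is only an illustration of the idea (compare the empirical distribution with a normal distribution that has the same mean and standard deviation). The function names and the example counts are hypothetical, and the p-value uses the plain asymptotic Kolmogorov formula, which is only approximate when the mean and standard deviation are estimated from the same data.

```python
import math
from statistics import NormalDist, fmean, stdev


def skewness(xs):
    """Moment-based skewness: third central moment over SD cubed."""
    m = fmean(xs)
    m2 = fmean([(x - m) ** 2 for x in xs])
    m3 = fmean([(x - m) ** 3 for x in xs])
    return m3 / m2 ** 1.5


def ks_statistic(xs):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    n = len(xs)
    ref = NormalDist(fmean(xs), stdev(xs))
    d, seen = 0.0, 0
    for value in sorted(set(xs)):
        count = xs.count(value)
        f = ref.cdf(value)
        # The empirical CDF jumps at each observed value: check both sides.
        d = max(d, abs(f - seen / n), abs((seen + count) / n - f))
        seen += count
    return d


def ks_pvalue(d, n, terms=100):
    """Asymptotic Kolmogorov p-value (no correction for fitted parameters)."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        return 1.0  # the series is numerically 1 in this range
    total = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2 * total))


# Hypothetical scores of one member for 324 abstracts (not the paper's data).
scores = [0] * 13 + [1] * 134 + [2] * 140 + [3] * 37
d = ks_statistic(scores)
print(round(skewness(scores), 2), round(d, 3), ks_pvalue(d, len(scores)))
```

A score that can take only four values is far from continuous, so a large Kolmogorov-Smirnov distance, and hence a significant departure from normality, is to be expected for data of this kind.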

Table II shows the degree to which the three scores of each committee member were correlated. In fact, all scores were significantly correlated for all members. The smallest correlation was found between the originality and interest score of member 6 (0.10) and the highest between originality and interest of member 5 (0.63).

Table III gives correlations among the scores of different committee members. With a few exceptions the scores were significantly correlated. Without exception the interest scores were least correlated, i.e., the least agreement among committee members existed on the interest score. As stated in "Methods," the values in the table reflect the score values following the scientific committee meeting. A similar table based on the values before the meeting looked almost identical; none of the displayed values differed more than ±0.01 between the two tables.
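The two kinds of correlation discussed here (between the three scores of one member, and between the same score of two different members) both reduce to a Pearson coefficient on paired score lists. The sketch below is illustrative only; the function names and the dictionary layout are assumptions, and the comparison helper simply mirrors the before/after-meeting check described in the text.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long score lists.

    Raises ZeroDivisionError if either list is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(scores_by_member):
    """Correlate one score type between every pair of members."""
    members = sorted(scores_by_member)
    return {(i, j): pearson(scores_by_member[i], scores_by_member[j])
            for i, j in combinations(members, 2)}


def max_shift(before, after):
    """Largest absolute change in any pairwise correlation."""
    return max(abs(after[pair] - before[pair]) for pair in before)


# Toy data: one score type, six members, five abstracts each.
toy = {
    1: [0, 1, 2, 3, 2],
    2: [1, 1, 2, 3, 3],
    3: [0, 2, 2, 3, 1],
    4: [3, 2, 1, 0, 1],
    5: [1, 0, 2, 2, 3],
    6: [2, 1, 1, 3, 2],
}
table = pairwise_correlations(toy)
print(len(table))  # six members give 15 member combinations
```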


TABLE II. The Correlations Between the Three Scores for Each of the Six Members of the Scientific Committee, and the Associated Significances (Pearson Correlation)

Member  Originality/       Scientific value/  Originality/  Significance
        Scientific value   Interest           Interest
1       .48                .59                .51           .000
2       .29                .29                .24           .000
3       .38                .26                .23           .000
4       .32                .47                .42           .000
5       .51                .55                .63           .000
6       .36                .18                .10           .044

[Only the last significance column is legible in the source scan.]

TABLE III. Pearson Correlations Between the Scores of the Different Committee Members*

[Table body not recoverable from the source scan. It lists, for each of the 15 member pairs, the correlations of the originality, scientific value, and interest scores (45 coefficients, of which 6 are marked (-)).]

*All correlations are significant at the 1% level except those indicated with (-).

Ori, originality; Sci, scientific value; Int, interest.

Table IV gives a breakdown of mean score values for the different abstract types. The significance of Pearson's chi square indicates whether scores were equally distributed over the abstract types. Only committee member 4 had scores that were independent of the abstract types; for all other members there was a significant correlation between the abstract scores and the type of abstract. In the two sets of rows marked "Basic scientists" and "Clinicians," the average scores of the two and four committee members, respectively, with this background are given. Except for the interest scores of the basic scientists, all the scores were significantly dependent on the abstract type. Moreover, with two exceptions (the originality score for "basic" abstracts and the interest score for "basic urodynamics" abstracts), the score values of basic scientists and clinicians were significantly different (Wilcoxon matched pairs test, P = 0.05).
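The chi-square test of independence used for Table IV works on a contingency table of counts, for example score value (0 to 3) against abstract type. The following is a rough sketch of the statistic itself, not the SPSS procedure the authors ran; the function name and the example counts are made up for illustration.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows of observed counts. Every row and column
    total must be nonzero, otherwise an expected count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical counts: rows are score values 0..3, columns are the four
# abstract types (basic, basic urodynamics, clinical, survey).
counts = [
    [2, 1, 30, 3],
    [15, 4, 100, 8],
    [30, 8, 90, 7],
    [10, 4, 11, 1],
]
stat, dof = chi_square(counts)
# 21.67 is the tabulated 1% critical value for 9 degrees of freedom.
print(round(stat, 2), dof, stat > 21.67)
```

With four score values and four abstract types the test has 9 degrees of freedom; a statistic above the critical value means the score distribution differs between abstract types.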

TABLE IV. The Mean Scores of All Members for the Different Abstract Types, and the Significance of Pearson's Chi Square

                                Abstract type
Member            Category(a)  Basic     Basic urodynamics  Clinical   Survey    Significance
                               (N = 57)  (N = 17)           (N = 231)  (N = 19)
1                 Ori          1.7       2.3                1.5        1.6       .00000
                  Sci          2.0       2.1                1.5        1.7       .00000
                  Int          1.5       2.1                1.5        1.7       .0002
2                 Ori          2.1       2.3                1.8        1.4       .00000
                  Sci          2.1       2.3                1.8        1.6       ?
                  Int          1.5       1.5                1.8        1.6       .05
3                 Ori          1.7       1.7                1.1        1.4       ?
                  Sci          1.9       1.3                1.2        1.4       .00000
                  Int          1.2       1.2                1.3        1.7       .006
4                 Ori          1.6       1.8                1.3        1.3       .12
                  Sci          1.5       1.4                1.3        1.4       .25
                  Int          1.3       1.4                1.4        1.6       .36
5                 Ori          2.0       1.4                1.2         .7       .00000
                  Sci          1.8       1.1                 .8         .8       .00000
                  Int          1.7       1.1                1.0         .6       .00000
6                 Ori          2.1       1.9                1.8        1.9       .003
                  Sci          1.9       1.9                1.5        1.6       ?
                  Int          1.2       1.3                1.7        1.9       .00000
Basic             Ori          1.8       1.5                1.3        1.1       .0001
scientists(b)     Sci          1.6       1.3                1.0        1.1       .00000
                  Int          1.5       1.2                1.1        1.1       .069
Clinicians(b)     Ori          1.?       2.1                1.5        1.5       .0001
                  Sci          2.0       2.0                1.5        1.5       .00000
                  Int          1.4       1.5                1.?        1.7       ?

["?" marks an entry or digit that is illegible in the source scan.]

(a) Ori, originality; Sci, scientific value; Int, interest.
(b) Data in Basic scientists and Clinicians show averages of the two and four committee members, respectively, with this background.

Table V shows the rank correlations between the scores and the abstract numbers. As abstract numbers were assigned in historical order, this correlation is a measure of the change over time in the scoring of the committee members. Only two correlations were significant; both were positive and for the originality score, implying that both these committee members tended to score later abstracts higher for originality. Table VI gives the distribution of score changes following the committee meeting; 3.2% of the scores were changed in the meeting. The changes affected 22% of the abstracts, as shown in Table VII. Finally, Table VIII shows the distribution of the abstract categories over the program categories. Different abstract categories were significantly differently represented in the program categories according to Pearson's chi square.
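A rank correlation between a score and the abstract number, as reported in Table V, can be computed as a Pearson correlation on ranks. The paper does not say which rank coefficient SPSS produced; the sketch below assumes Spearman's coefficient with average ranks for ties, and all names and data are illustrative.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[pos]]):
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; fails if either list is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation (assumed here; see lead-in)."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example: abstract numbers in reading order and the
# originality scores one member gave them.
abstract_numbers = [1, 2, 3, 4, 5, 6, 7, 8]
originality = [1, 0, 1, 2, 1, 2, 3, 2]
print(round(spearman(abstract_numbers, originality), 3))
```

A positive value means that later-read abstracts tended to receive higher scores. Because scores take only four values, ties are frequent, which is why the tie handling in `average_ranks` matters.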

DISCUSSION

The data show that none of the scores of any member of the scientific committee was normally distributed. It can be wondered if a normal distribution was to be expected. Such a distribution arises from random variation of a variable around a mean. In the analysed scores there are two sources of variation. One is the intrinsic variation, i.e., the abstract property that the scores attempt to quantify is different for each abstract. The other source of variation is the committee members' estimation of the property. It is not unlikely that this last variation can be described as a random fluctuation of the scores, but it is not likely that the intrinsic variation is random. This would imply for instance that the likelihood that authors would submit a very original abstract is the same as the likelihood that they would submit an unoriginal abstract. Probably the latter is much easier and therefore will happen more frequently. The data in Table I reflect the combined effect of the two sources of variation, intrinsic and committee-member variation. As there are large differences between the resulting distributions (approximately one half is skewed to the left and the other half is skewed to the right), this must be ascribed to committee member variation. It is probably not possible to draw conclusions on the intrinsic variation, or the intrinsic distribution of the scores.

In spite of the considerable committee member variation there is abundant common ground, which can be seen in Table III. Most of the scores of most of the members are significantly positively correlated. These significant correlations between the members' scores are not caused by the discussion at the scientific meeting. This discussion resulted in a relatively small number of changes, in 3.2% of the scores only, affecting however a considerable number of abstracts, i.e., 22%. The changes were equally distributed over the abstract categories (Table VII), i.e., committee members changed their score values for "clinical" papers as often as for "basic" papers etc., relative to the number of papers in each category. About half of the changes in scores were changes in the originality score; in seven abstracts (0.12%) this score was changed by 3 points, i.e., from maximally original to maximally unoriginal. The originality score is obviously the most dependent on specific knowledge or background and therefore the most sensitive to discussion. The least agreement existed on the interest score. Table III shows that in 6 out of the 15 possible member combinations there was no significant agreement on this score. With one marginal exception the correlation of the interest score was the lowest in all member combinations. This probably reflects that the interest score is the most subjective of the three scores.

For each member there was a significant correlation between the three scores with, depending on the significance level, one possible exception (the originality and interest score of member 6; see Table II). These high correlations signify that the three scores are not independent: they do not quantify clearly different properties of the abstracts, and none of the committee members was able to independently and differently score these properties. It follows that if it is thought desirable to attempt to independently measure different aspects of the abstracts, it is necessary to use other scores that quantify more clearly defined, separate aspects of the abstracts. The alternative is to use only one overall score for each abstract.

TABLE V. The Rank Correlations of the Scores of All Members With the Abstract Number

Member  Category(a)  Rank correlation  Significance
1       Ori           .1838            *
        Sci           .0?83
        Int          -.0165
2       Ori           .0387
        Sci           .0141
        Int          -.1343
3       Ori           .1155
        Sci           ?
        Int          -.0078
4       Ori           .2260            *
        Sci           .0471
        Int           .097?
5       Ori           .0790
        Sci          -.0279
        Int           ?
6       Ori           .0736
        Sci           ?
        Int           .0470

["?" marks an entry or digit that is illegible in the source scan.]

(a) Ori, originality; Sci, scientific value; Int, interest.
*Significant at the 1% level.

TABLE VI. The Distribution of the Changes in Scores Following the Scientific Meeting

                Number of changes in scores with magnitude
                -3     -2     -1     +1     +2     Total
Total            7     26    104     37     11      185
Percentage(b)   0.12   0.45   1.7    0.63   0.19    3.2

[The rows giving the changes per member and score category (Ori, Sci, Int) are not recoverable from the source scan.]

(a) Ori, originality; Sci, scientific value; Int, interest.
(b) Percentage gives the number of times a change of certain magnitude occurred as a percentage of the total number of scores, which was 324 (abstracts) × 3 (scores) × 6 (members) = 5,832.

TABLE VII. The Number of Abstracts in the Four Abstract Categories for Which One or More of the Scores Was Changed at the Scientific Committee Meeting*

[Table body missing from the source.]

TABLE VIII. The Number and Percentages (in parentheses) of Abstracts in Each of the Four Abstract Categories in the Seven Program Categories*

Program categories (columns): Podium, Formal poster, Informal poster, Video, Withdrawn, Read by title, Reject

[Table body missing from the source.]
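The percentages in Table VI follow directly from the counts of changes and the total number of scores (324 abstracts × 3 scores × 6 members). The short check below recomputes them; the counts are taken as read from the Total row of Table VI in the scan, and the variable names are arbitrary.

```python
ABSTRACTS, SCORES_PER_ABSTRACT, MEMBERS = 324, 3, 6
total_scores = ABSTRACTS * SCORES_PER_ABSTRACT * MEMBERS  # 5832

# Number of score changes by magnitude, as read from Table VI.
changes = {-3: 7, -2: 26, -1: 104, +1: 37, +2: 11}

# Each count expressed as a percentage of all 5,832 assigned scores.
percent = {size: round(100 * count / total_scores, 2)
           for size, count in changes.items()}
all_changes = sum(changes.values())
overall = round(100 * all_changes / total_scores, 2)

print(percent)
print(all_changes, overall)
```

The recomputed overall figure is about 3.2% of the scores, in line with the value quoted in the text.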

Five of the six members of the scientific committee scored different types of abstracts differently. For these five members, not only the interest score but also the appreciation of originality and scientific value was clearly different for the four abstract categories. The fact that the three scores are not truly independent also may play a role here. When the committee members were grouped as "basic scientists" vs. "clinicians," 10 of the 12 score values in the four abstract categories were significantly different between these groups. This difference in scoring justifies the composition of a committee composed of members with different backgrounds. It should be noted that one committee member systematically managed to avoid this background-based bias.

In the final program not all abstract categories were equally represented. "Clinical" papers and "surveys" were more often excluded from presentation (read by title) than "basic" and "basic urodynamics" papers. On the other hand, three times more of these papers were submitted than "basic" and "basic urodynamics" papers, and two times more clinical papers were accepted to the program than basic papers. Compared to "clinical" papers, "basic" papers were less frequently presented on the podium, more frequently as a poster, and less frequently as read by title. "Basic urodynamics" papers were more often presented on the podium than "clinical" papers and less frequently as "read by title." This distribution of the abstracts over the program is only partly determined by the number of accepted abstracts within topic groupings. The decisions about podium versus formal poster presentation are a major component of the scientific committee's deliberations at its meeting. Although these decisions are made with the committee still "blinded" to the identity of the authors, they are influenced by the members' perceptions of what sort of material is best presented orally or by static display.

A final surprising finding was that with two exceptions the scorings did not significantly change over time, and that the two exceptions were positive and for the same score. Two committee members tended to appreciate abstracts as more and more original while reading 324 of them.


In conclusion, it can be stated that in the data from the 1992 Scientific Committee of the ICS:

1. The abstract scores were not distributed normally.

2. The three scores that were supposed to measure independent properties of the abstracts were not independent. This situation may be remedied by changing the score definitions, scoring other aspects of the abstracts, or by using a single overall score.

3. Different types of abstracts (e.g., "basic" vs. "clinical") were generally scored differently by different members or by the basic scientists versus the clinicians. In 1992 one member managed to avoid this bias.

4. With two (positive!) exceptions scores did not change over time, as the committee members read more and more abstracts.

5. Of the score values, 3.2% were changed at the scientific committee meeting, affecting 22% of the abstracts, irrespective of the abstract category. The degree of consensus between the committee members did not change significantly as a result of the discussion.

6. Different abstract categories were distributed differently over the program categories. Compared to "clinical" papers, "basic" papers were less frequently presented on the podium, more frequently as a poster, and less frequently as read by title. "Basic urodynamics" papers were more often presented on the podium than "clinical" papers and less frequently as "read by title."

ACKNOWLEDGMENTS

The authors are grateful to the other committee members, W. Artibani, L.D. Cardozo, J.B. Gajewski, and K. Höfner, for their kind permission to use the score data for this paper.