
Item Reliability of the Milani-Comparetti Motor Development Screening Test

The purpose of this study was to determine the level of interobserver and test-retest reliability of the Milani-Comparetti Motor Development Screening Test. Sixty healthy children, aged 1 through 16 months, were videotaped during administration of the Milani-Comparetti test. Four pediatric physical therapists independently viewed each videotape and scored the responses. Interobserver reliability was determined by calculation of percentage of agreement and the G statistic between a primary observer and each therapist. Forty-three children were retested within one week by the initial tester to examine test-retest reliability. Test-retest reliability was determined by percentage of agreement of items between the two test sessions and using the Kappa statistic. Interobserver percentage of agreement for the individual items on the Milani-Comparetti test ranged from 79% to 98%. The G statistic was significant for all items, indicating the high percentage-of-agreement values were not due merely to chance agreement. Test-retest agreement ranged from 80% to 100%. Using Kappa statistic guidelines, excellent test-retest reliability (K > .75) was found for 82% of the test items, with good reliability of the remaining items. Acceptable interobserver and test-retest reliability was found for all items on the Milani-Comparetti test. Use of the Milani-Comparetti test as a clinical screening tool for prediction or follow-up of motor development in children at risk for developmental delays requires further evaluation. [Stuberg WA, White PJ, Miedaner JA, et al: Item reliability of the Milani-Comparetti Motor Development Screening Test. Phys Ther 69:328-335, 1989]

Key Words: Child development; Motor skills; Pediatrics, development; Tests and measurements, general.

Wayne A Stuberg Penni J White James A Miedaner Pam R Dehne

The Milani-Comparetti Motor Development Screening Test was first published in 1967 by Milani-Comparetti and Gidoni as a mechanism for identifying infants with abnormal movement patterns.¹ A testing manual was published by the C Louis Meyer Children's Rehabilitation Institute in 1977² to provide information on test administration and revised in 1987³ to include new data on test standardization. The Milani-Comparetti test is a screening tool developed to systematically examine the integration of primitive reflexes and the emergence of volitional movement against gravity. The test assesses spontaneous motor behavior and evoked responses including righting, protective and equilibrium reactions, and select primitive reflexes.¹,⁴ Specifically, the Milani-Comparetti test is used to 1) identify motor dysfunction in infants up to 2 years of age, 2) establish the basis for an early intervention program, 3) aid in research on motor development, and 4) teach the skillful observation of motor development in handicapped and nonhandicapped children.³

W Stuberg, MS, PT, is Assistant Professor and Director of Physical Therapy, C Louis Meyer Children's Rehabilitation Institute, University of Nebraska Medical Center, 42nd & Dewey Ave, Omaha, NE 68105 (USA).

P White, MS, PT, is Staff Physical Therapist, C Louis Meyer Children's Rehabilitation Institute.

J Miedaner, MS, PT, is Assistant Professor, Division of Physical Therapy Education, and Chief Therapist, Vince Mosley Clinic, Medical University of South Carolina, Charleston, SC 29425.

P Dehne, MS, PT, is Assistant Instructor and Staff Physical Therapist, C Louis Meyer Children's Rehabilitation Institute.

This article was adapted from a presentation at the Combined Sections Meeting of the American Physical Therapy Association, Atlanta, GA, February 12-15, 1987.

This research was supported in part by Project 405, Division of Maternal and Child Health Services, US Department of Health and Human Services, awarded to the C Louis Meyer Children's Rehabilitation Institute, University of Nebraska Medical Center.

This article was submitted December 14, 1987; was with the authors for revision for 27 weeks; and was accepted December 5, 1988.

26/328 Physical Therapy/Volume 69, Number 5/May 1989

Although the Milani-Comparetti test has not been examined with the rigor applied to commonly used tests of infant development, its application as an outcome measure has been documented for premature children with intracranial hemorrhage⁵ and for children with cerebral palsy after implantation of cerebellar pacemakers.⁶ A major criticism of using the Milani-Comparetti test for research and clinical applications has been the lack of data establishing the test's reliability.⁷,⁸

VanderLinden⁹ reported intertester reliability for two experienced physical therapists at 85% for data collected as part of Campbell and Wilhelm's Baby Research and Training Section study.¹⁰ Specific information describing the methodology to establish reliability was not published.

Estimates of interobserver reliability provide information on the accuracy of agreement among examiners on the scoring of items presented during an observation period. Estimates of test-retest reliability, also commonly referred to as intraobserver reliability, provide information on the stability of the scoring of items over repeated tests by the same examiner. The purpose of this study was to determine the level of interobserver and test-retest reliability for items on the Milani-Comparetti test. We hypothesized that an acceptable level of reliability (percentage of agreement greater than 80%) would be obtained using the standardized protocol for administration and scoring.

Method

Subjects

A total of 60 infants, aged 1 through 16 months, participated in the study. The sample represented a distribution of 15 children in each of four age groupings: 1) 1- to 4-month age group (X̄ = 2.1 months, s = 1.1), 2) 5- to 8-month age group (X̄ = 6.2 months, s = 1.1), 3) 9- to 12-month age group (X̄ = 10.6 months, s = 1.1), and 4) 13- to 16-month age group (X̄ = 14.5 months, s = 1.2). Each child was tested within three days of a one-month anniversary. The children did not have a history of prematurity (>2 weeks), significant birth history, neuromuscular impairment, medical complications, or illness at the time of testing. Table 1 lists the demographic information concerning the sample population and the families. The children were recruited for participation in the study through "well baby" follow-up clinic appointments at a university hospital, from an Omaha (Neb) area day-care center, and from siblings of C Louis Meyer Children's Rehabilitation Institute staff members.

Informed consent following university guidelines was obtained prior to videotaping the children. Test results were provided to the parent(s).

Setting

The infants were tested in a quiet examination room at the C Louis Meyer Children's Rehabilitation Institute or at an Omaha area day-care center. The session was videotaped with the infants' parent(s) present. Two examiners were present during the videotape session for approximately one third of the sample. One examiner videotaped the session while the other tested the child. The remaining sessions were videotaped by a technician with only the examining therapist present. Only one examiner was present for the retest session, which was not videotaped. The therapist who tested the child initially retested the child.

Examiners

Four pediatric physical therapists participated in the study. One therapist (WAS) who had over seven years' experience using and teaching the Milani-Comparetti test served as a standard observer for the study. The other therapists had at least three years of pediatric experience. Training in administration and scoring of the Milani-Comparetti test for the therapists participating in the study was provided by the standard observer using the criteria published in the testing manual.³ All therapists administered, viewed, and scored 10 demonstration videotapes to clarify test scoring criteria. Feedback was provided to each therapist during the training period to ensure consistency in administration of the test items.

Table 1. Demographic Characteristics of Subject Population and Families (N = 60)

Characteristics of Sample                                  N     %
Race
  White                                                   54    90
  Black                                                    5     8
  Hispanic                                                 1     2
Sex
  Male                                                    33    55
  Female                                                  27    45

Characteristics of Head of Household                       N     %
Occupation
  Professional, executive, or administrative              25    42
  Manager, proprietor, or official                        14    24
  Sales or clerical worker                                 7    11
  Craftsman, foreman, or skilled laborer                  11    18
  Laborer, farmer, or unemployed                           3     5
Education
  4 or more years of college                              21    35
  1-3 years of college                                    17    27
  High school or general equivalency diploma              16    28
  Less than high school or general equivalency diploma     6    10

Instrumentation

The Milani-Comparetti test consists of 27 items that examine spontaneous motor behaviors and evoked responses.¹ Spontaneous motor behaviors include active movement items such as locomotion and postural control items such as sitting or standing. Evoked responses include 1) tilting (equilibrium) reactions, 2) parachute (protective extension) reactions, 3) righting reactions, and 4) primitive reflexes. Approximately two thirds of the items require direct handling, dependent on the child's age at testing. No specialized equipment is required, except a cushion or tiltboard to assess equilibrium reactions. The authors have written specific behavioral criteria for each item (Figure).³

The score form used for the study was a modification of the original form published by Milani-Comparetti.¹ The revised score form was published in the 1977 edition of the Milani-Comparetti test manual² and represents a reordering of the test items from the original score form. Twenty-one of the test items are scored as present, absent, or incomplete, with 6 of the items including subtests that are each scored as present or absent. For example, the item locomotion includes automatic stepping, rolling, crawling, and walking, which the examiner scores using a dichotomous scale of present or absent. A complicated scoring system for the Milani-Comparetti test has been developed, but was not used in this study because we believe the system has limited application in routine clinical use.⁷

BODY LYING SUPINE

Procedures: Lay the child supine on a cushion and observe his posture. Grasp the child's hands as if beginning to pull him up to sitting and observe the lifting of the head into flexion.

Criteria: Normally, a child at 1 to 1 1/2 months will demonstrate a flexed posture and by 3 to 4 months the hands will be at midline. The 5 to 6-month-old may be seen playing with his feet. The 7-month-old will flex the head forward off the table in anticipation of being pulled up into a sitting position.

Figure. Test administration criteria for item "body lying supine." (Reprinted by permission.³)

Procedure

Sixty children were videotaped during administration of the Milani-Comparetti test. The Denver Developmental Screening Test (DDST) was also administered during the session to determine whether the child was developing within normal limits and to establish reliability data on the DDST for the testers as part of a further study. The item order on the score form was followed as closely as possible. An attempt was made to administer and score all appropriate items for the child's chronological age. Items greater than two to three months above the child's age were not administered unless advanced development was observed. In general, 19 items were administered to the 1- to 4-month-olds, 24 items to the 5- to 8-month-olds, and all 27 items to the 9- to 16-month-olds.

Forty-three of the 60 children were retested within three to five days by the same tester. Only 43 of the children could be rescheduled to participate in retesting. Test-retest data were collected by two therapists (WAS, PRD). The second session was not videotaped. The Milani-Comparetti test score form results of the first session were not available to the examiner prior to the second session. Test-retest reliability was determined by comparing the therapist's score form results from the first and second session. The score form completed by the therapist during the first session was not used in the determination of interobserver reliability. The DDST was not administered during the second session.

The videotapes were viewed and independently scored by each therapist and the standard observer. The results of the three observers were compared with the results of the standard observer to determine interobserver reliability. The criteria for scoring each item are outlined in the test manual.³

Data Analysis

Data were analyzed individually for each of the 27 items of the Milani-Comparetti test to examine interobserver and test-retest reliability. Table 2 lists the month ranges used to calculate reliability for the individual items. The decision to use the month ranges rather than the entire age range was made to eliminate inflation of the reliability level that would be introduced by scoring items that are obviously present or absent. For example, a backward protective extension response cannot be expected prior to age 6 months, and inclusion of the 1- to 6-month-old children to calculate agreement would incorrectly inflate the results. In general, a range of ±3 months from the expected date of integration of a primitive reflex or acquisition of a motor milestone was used to set the ranges. The entire range for items such as locomotion was used because the item includes numerous subtest tasks such as rolling, crawling, and walking.

Table 2. Month Ranges for Calculation of Item Reliability

Item                            Month Range
Primitive reflexes
  Hand grasp                    1-7
  Foot grasp                    6-12
  Moro                          1-7
  Asymmetrical tonic neck       1-7
  Symmetrical tonic neck        2-9
Righting reactions
  Head righting                 1-5
  Landau                        1-6
  Body derotation               1-7
  Body rotation                 6-12
Protective reactions
  Downward                      1-7
  Sideways                      3-9
  Forward                       4-10
  Backward                      6-12
Equilibrium reactions
  Prone                         2-8
  Supine                        4-10
  Sitting                       5-11
  All fours                     5-16
  Standing                      9-16
Postural control
  Body held vertical            1-7
  Body lying prone              1-6
  Body lying supine             1-8
  Body pulled to sitting        2-9
  Sitting                       2-11
  All fours                     1-16
  Standing                      1-16
Active movement
  Standing up from supine       6-16
  Locomotion                    1-16

Percentage of agreement and the G statistic were used to obtain an estimate of interobserver reliability. Percentage of agreement (PA) was calculated by comparing the score of the standard observer with that of the other observers using the formula

$$\mathrm{PA} = \frac{\text{number of agreements}}{\text{total paired observations}} \times 100$$

Because percentage of agreement may provide an overestimate of agreement if chance agreement is high, the G statistic was calculated to provide an indication of true versus chance agreement. For example, if two observers score a behavior with a three-point scoring system, they will agree by chance alone one third of the time. The G statistic accounts for the chance agreement in observations between observers and tests the hypothesis that agreement is greater than chance alone (ie, true agreement).
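As a minimal sketch of the two ideas above, the following Python computes the percentage of agreement from paired scores and enumerates the chance agreement for a three-point scale. The function names and the sample scores are hypothetical, and the enumeration assumes that both observers choose each score independently and with equal probability, which is the simplification behind the one-third figure.

```python
from fractions import Fraction
from itertools import product

# The three scores used for most Milani-Comparetti items.
SCORES = ("present", "incomplete", "absent")


def percent_agreement(standard, observer):
    """PA = number of agreements / total paired observations x 100."""
    if len(standard) != len(observer) or not standard:
        raise ValueError("need two non-empty score lists of equal length")
    agreements = sum(1 for s, o in zip(standard, observer) if s == o)
    return 100.0 * agreements / len(standard)


def uniform_chance_agreement(categories):
    """Chance agreement when both raters pick every category equally often."""
    outcomes = list(product(categories, repeat=2))
    matches = sum(1 for a, b in outcomes if a == b)
    return Fraction(matches, len(outcomes))


# Hypothetical scores for one item across eight children (not study data).
standard_scores = ["present", "present", "absent", "incomplete",
                   "absent", "present", "incomplete", "absent"]
observer_scores = ["present", "present", "absent", "absent",
                   "absent", "present", "incomplete", "absent"]

pa = percent_agreement(standard_scores, observer_scores)
chance = uniform_chance_agreement(SCORES)
print(f"PA = {pa:.1f}%, chance agreement = {chance}")
```

With real data the observers rarely use the three scores equally often, so the chance level can differ from one third; that is one reason a chance-corrected statistic is reported alongside PA.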

The G statistic is an appropriate tool when comparing joint agreement of two or more observers with a standard observer.¹¹ A computer program was written to generate the appropriate standard error and the G statistic. The G statistic, which is a ratio of the difference between observed and expected agreements to the standard error, was then compared with a standard normal distribution to test the hypothesis of random agreement. G statistic values equal to or greater than 2.56 are significant beyond the .01 level for a two-tailed test.
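The article does not print the standard error used by the computer program, so the sketch below only illustrates the general form described above: the difference between observed and expected agreement divided by a standard error, referred to the standard normal distribution. It assumes a chance agreement of 1/k for a k-category scale and a binomial standard error under the null hypothesis. Both are assumptions made here for illustration, not the authors' program, and the values it produces should not be expected to reproduce the G statistics reported in the article.

```python
import math


def agreement_z(n_agreements, n_pairs, n_categories):
    """Ratio of (observed - expected agreement) to a standard error.

    Assumptions (not from the article): expected agreement is 1/k and the
    standard error is the binomial one under the null hypothesis.
    """
    if n_pairs <= 0 or n_categories < 2:
        raise ValueError("need at least one pair and two categories")
    observed = n_agreements / n_pairs
    expected = 1.0 / n_categories
    std_error = math.sqrt(expected * (1.0 - expected) / n_pairs)
    return (observed - expected) / std_error


def two_tailed_p(z):
    """Two-tailed probability of |Z| >= |z| under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical item: 27 agreements in 30 paired observations, 3-point scale.
z = agreement_z(27, 30, 3)
p = two_tailed_p(z)
print(f"z = {z:.2f}, two-tailed p = {p:.2g}")
```

The design choice to compute the probability with `math.erfc` rather than a lookup table keeps the sketch within the standard library.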

The use of a standard observer and the G statistic for data analysis was chosen because the three observers were not familiar with the Milani-Comparetti test prior to initiation of the study. Percentages of agreement and Kappa values across all testers could have been calculated; however, it was not the intent of the authors to demonstrate association between the testers but rather agreement between the testers and an accepted standard. We assumed, although it was not within the scope of the study, that the standard used in the study would be representative of any individual who had experience in using the Milani-Comparetti test.

Percentages of agreement and Kappa values were used to obtain an estimate of test-retest reliability. Percentages of agreement were calculated as previously described. Kappa values were calculated using the formula published by Fleiss.¹²

Kappa is an estimate of agreement beyond chance alone and ranges from +1 to −1. The hypothesis of agreement significantly different from chance was tested by calculation of the variance of Kappa and standard error to obtain a z score. A probability level of .01 was used for determination of significance, which corresponds to a z score of 2.56 or greater for a two-tailed test. In general, Kappa values greater than .75 represent excellent agreement beyond chance, values between .40 and .75 represent fair to good agreement, and values below .40 represent poor agreement beyond chance.¹³
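The article cites Fleiss for the Kappa formula and its variance without reproducing them. As a hedged illustration, the sketch below uses the familiar two-rating form, kappa = (p_o − p_e) / (1 − p_e), with the expected agreement p_e taken from the marginal proportions of the two sessions, and applies the interpretation bands quoted above. The function names and the sample scores are hypothetical, and the variance and z-score step is not shown.

```python
from collections import Counter


def kappa(first, second):
    """Kappa for two sets of categorical scores on the same children."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty score lists of equal length")
    n = len(first)
    observed = sum(1 for a, b in zip(first, second) if a == b) / n
    first_counts = Counter(first)
    second_counts = Counter(second)
    # Expected agreement from the marginal proportions of each session.
    expected = sum(
        (first_counts[c] / n) * (second_counts[c] / n)
        for c in set(first_counts) | set(second_counts)
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when all scores fall in one category")
    return (observed - expected) / (1.0 - expected)


def interpret(k):
    """Bands quoted in the text: > .75 excellent, .40 to .75 fair to good, < .40 poor."""
    if k > 0.75:
        return "excellent"
    if k >= 0.40:
        return "fair to good"
    return "poor"


# Hypothetical test and retest scores for one item across ten children.
test_scores = ["present"] * 4 + ["incomplete"] * 3 + ["absent"] * 3
retest_scores = ["present"] * 4 + ["incomplete"] * 2 + ["absent"] * 4

k = kappa(test_scores, retest_scores)
print(f"kappa = {k:.2f} ({interpret(k)})")
```

The explicit error for a single-category item mirrors the caution that chance-corrected statistics are not informative when scores cluster on one response.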

Adequate variability on all items of the Milani-Comparetti test was available to allow calculation of both the G and Kappa statistics. Neither the G nor the Kappa statistic is a valid predictor of agreement when scores are grouped around one response for an item.¹⁴ For an item to have good agreement beyond chance, both a high percentage of agreement and a high Kappa (test-retest) or G (interobserver) statistic should be obtained.

Although the Milani-Comparetti test does not provide the examiner with a total or summary score, a mean percentage of agreement for interobserver reliability was calculated to examine between-tester differences for scores for the four age ranges studied. Mean percentage of agreement was calculated by modifying the percentage-of-agreement formula to include all individual items on the test for the children representing the particular age range.
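The text does not spell out how the percentage-of-agreement formula was modified. One plausible reading, offered here only as an assumption, is that agreement was computed across all items scored for each child and then summarized as a mean and standard deviation over the children in an age range. The sketch below follows that reading; the data layout, function names, and scores are hypothetical.

```python
from statistics import mean, stdev


def child_percent_agreement(standard, observer):
    """PA across all items scored for one child (dicts keyed by item name)."""
    shared = [item for item in standard if item in observer]
    if not shared:
        raise ValueError("no items scored by both raters")
    agreements = sum(1 for item in shared if standard[item] == observer[item])
    return 100.0 * agreements / len(shared)


def summarize_age_range(children):
    """Mean and sample standard deviation of per-child PA for one age range."""
    values = [child_percent_agreement(std, obs) for std, obs in children]
    return mean(values), stdev(values)


# Hypothetical data: three children, four items each, (standard, tester) pairs.
children_1_to_4_months = [
    ({"hand grasp": "present", "moro": "present", "landau": "absent", "sitting": "absent"},
     {"hand grasp": "present", "moro": "present", "landau": "absent", "sitting": "absent"}),
    ({"hand grasp": "present", "moro": "incomplete", "landau": "absent", "sitting": "absent"},
     {"hand grasp": "present", "moro": "present", "landau": "absent", "sitting": "absent"}),
    ({"hand grasp": "incomplete", "moro": "present", "landau": "absent", "sitting": "absent"},
     {"hand grasp": "present", "moro": "present", "landau": "incomplete", "sitting": "absent"}),
]

average, spread = summarize_age_range(children_1_to_4_months)
print(f"mean PA = {average:.1f}%, s = {spread:.1f}")
```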

Results

Interobserver reliability results are listed in Table 3. High percentages of agreement and significant G statistics were obtained for all items. The most consistently high level of agreement was noted for active movement and postural control items. Overall, equilibrium reactions demonstrated lower levels of agreement with standing equilibrium being the lowest scored item at 79%.

Table 3. Interobserver Reliability Scores

Item                          N     Percentage of Agreementᵃ   G Statisticᵇ
Primitive reflexes
  Hand grasp                  28    98                         11.13
  Foot grasp                  26    88                          7.17
  Moro                        25    91                          9.63
  Asymmetrical tonic neck     24    91                          7.80
  Symmetrical tonic neck       9    89                          2.81
Righting reactions
  Head righting               18    94                          8.01
  Landau                      20    92                          8.78
  Body derotation             19    95                          8.57
  Body rotation               26    89                          8.98
Protective reactions
  Downward                    20    90                          9.13
  Sideways                    16    88                          7.28
  Forward                     23    91                          8.84
  Backward                    24    92                          8.90
Equilibrium reactions
  Prone                       17    94                          8.19
  Supine                      16    90                          7.89
  Sitting                     22    86                          7.76
  All fours                   12    83                          3.76
  Standing                    19    79                          6.67
Postural control
  Body held vertical          29    97                          9.45
  Body lying prone            26    96                          8.07
  Body lying supine           18    98                          8.09
  Body pulled to sitting      35    88                         11.75
  Sitting                     43    89                         18.43
  All fours                   56    86                         23.13
  Standing                    60    86                         25.13
Active movement
  Standing up from supine     38    96                         13.78
  Locomotion                  59    97                         23.97

ᵃ Agreement of the three observers compared with the standard observer.
ᵇ All G-statistic values are significant at p = .01, two-tailed test.

The mean percentage-of-agreement results of each tester by age range are reported in Table 4. The lowest reliability was obtained for the 5- to 8-month age range by tester 3. In general, the 1- to 4-month and 5- to 8-month age ranges demonstrated lower levels of agreement than the 9- to 12-month and 13- to 16-month age ranges.

Test-retest reliability results are listed in Table 5. High percentages of agreement and significant Kappa values were obtained for all items. Eighty-two percent (20 out of 27) of the items demonstrated excellent test-retest reliability using the Kappa statistic guidelines. Although significant, only fair to good Kappa values were found for the items head righting, landau, forward protective, supine equilibrium, and body lying supine. The lower Kappa values for the items were consistent with percentages of agreement in the 80% to 85% range as compared with higher percentages of agreement and Kappa levels for other test items.

Discussion

The results of this study compare favorably with the results of similar studies examining the reliability of commonly used infant screening tools. Mean percentage of agreement between four examiners during the standardization of the DDST was 90% with a range of 85% to 95%.¹⁵ Test-retest percentage of agreement for the DDST ranged from 90% to 100% over a one-week interval between testing sessions. Haley et al reported percentages of agreement and Kappa statistics on the items of the Movement Assessment of Infants (MAI).¹³ Using the Kappa statistic guidelines reported in this article, only 2% of the MAI items demonstrated excellent interobserver reliability, with 58% in the good range. Ten percent of the MAI items demonstrated excellent test-retest results, with 42% in the fair to good range. Eighty-five percent (23 out of 27) of the items for test-retest reliability in this study were in the excellent range (K > .75), and the remaining items were in the good range. Interobserver agreement levels were significantly high and comparable to either the MAI¹³ or the DDST¹⁵ results.

An acceptable level of accuracy (interobserver agreement) was established for all items on the Milani-Comparetti test. Consistently high levels of agreement were found for items in the subsections of righting or protective reactions, primitive reflexes, and active movement (Tab. 3). In general, lower levels of agreement were found for the equilibrium reactions and for items that involved a more complex coding system because many tasks were grouped into one item.

An example of greater complexity in scoring is the item standing, which includes five criteria to be scored: supporting reactions, astasia, weight bearing, standing posture, and independent standing. Other items with multiple subtests that demonstrated lower levels of agreement were all fours, body pulled to sitting, and sitting. An inverse relationship between the complexity of scoring and level of agreement has been documented by Mash and McElwee, who studied the ability of trained observers in scoring audiotapes using four- and eight-category scoring systems.¹⁶ Only a three-category system of scoring criteria is used for 20 of the 27 items on the Milani-Comparetti test (ie, the response is present, in transition, or absent). The items with the less complex scoring system demonstrated consistently higher levels of agreement with the exception of the equilibrium reactions.

Table 4. Mean Percentage of Agreement of Testers to Standard Observer by Age Range (N = 60)

                           Age Range (mo)
          1-4           5-8           9-12          13-16
Tester    X̄     s       X̄     s       X̄     s       X̄     s
1         89    8.0     89    5.1     90    4.4     93    4.5
2         91    5.2     93    3.9     95    3.7     95    4.0
3         89    9.5     88    7.3     94    4.6     94    4.0

Table 5. Test-Retest Reliability Scores

Item                          N     Percentage of Agreement   Kappaᵃ
Primitive reflexes
  Hand grasp                  20     90                        .84
  Foot grasp                  19     95                        .90
  Moro                        19    100                       1.00
  Asymmetrical tonic neck     19     90                        .83
  Symmetrical tonic neck      19     95                        .91
Righting reactions
  Head righting               13     85                        .66
  Landau                      17     82                        .65
  Body derotation             19     95                        .92
  Body rotation               19     95                        .92
Protective reactions
  Downward                    19     95                        .92
  Sideways                    19     95                        .92
  Forward                     19     84                        .74
  Backward                    19    100                       1.00
Equilibrium reactions
  Prone                       17     94                        .91
  Supine                      19     84                        .74
  Sitting                     20     90                        .81
  All fours                   31     94                        .91
  Standing                    23     91                        .85
Postural control
  Body held vertical          19     95                        .85
  Body lying prone            17     88                        .75
  Body lying supine           19     80                        .66
  Body pulled to sitting      23    100                       1.00
  Sitting                     30     97                        .96
  All fours                   43     91                        .89
  Standing                    43     88                        .85
Active movement
  Standing up from supine     30    100                       1.00
  Locomotion                  43    100                       1.00

ᵃ All Kappa values are significant at p < .01, two-tailed test.

The lowest interobserver percentage of agreement was found for the item standing equilibrium at 79%. Werner and Bayley reported the lowest levels of test-observer and test-retest agreement for motor scale items on the Bayley Revised Scale of Mental and Motor Development requiring assistance from the examiner.¹⁷ Although items that required assistance did not uniformly demonstrate low levels of agreement, the equilibrium reactions were the most difficult to score accurately. A subtle relationship between the stimulus intensity and the equilibrium response was noted in scoring the videotapes, which made scoring difficult. Additional study is needed to delineate the gradations of a developing response and the appropriate stimulus intensity to elicit the response.

The lower level of interobserver agreement for the children in the 1- to 4-month and 5- to 8-month ranges as compared with the 9- to 12-month and 13- to 16-month ranges reflects the greater number of items in transition during those ranges (Tab. 4). Transition of all primitive reflex items, with the exception of foot grasp, from presence to absence takes place within the first eight months. A majority of items examining postural control and acquisition of motor milestones are also scored during the first eight months in comparison with the 8- to 12-month and 9- to 16-month ranges.

The stability estimates (test-retest agreement) were acceptable and corresponded to the levels for the accuracy estimates. Although error is introduced with bias of knowledge of the first session results, the error was minimized by allowing a three- to five-day period between tests. We felt that delaying the retest session for a longer period would introduce inconsistency in results because of maturational changes of the child. The high test-retest scores have important clinical significance because the method used to collect the data was similar to clinical measurement rather than to scoring from videotapes.

A limitation of the study is the pediatric population that was used. No children with developmental delays were included to assess the observers' ability to score the items on children with abnormal development. Further studies to establish the reliability levels using populations of developmentally delayed children are needed.

Clinical Implications

The Milani-Comparetti test's ability to discriminate changes in motor development is decreased after 12 months of age and is extremely limited past 16 to 18 months of age. Only one item, locomotion, includes a subtest that is scored past 16 months of age. The majority of items are scored within the first 12 months of age, which would empirically be the age range of the Milani-Comparetti test's greatest clinical value.

Items scored by observation of the child's movements rather than through handling the child were found to be the most reliable in this study. Minimizing handling should help to obtain more accurate test results.

Although acceptable levels of accuracy and stability were obtained, the generalizability of the results of this study to the clinical setting is in question. Further study is needed to establish the constancy of the motor behaviors and the accuracy with which they can be observed in children at risk for developmental delays.

Conclusion

The hypothesis that acceptable levels of interobserver and test-retest reliability for the Milani-Comparetti test would be obtained was supported by the results of this study. A standardized protocol for test administration has been developed and indications for training provided.3

The results suggest that, with proper training, the Milani-Comparetti test can be used clinically or in research applications as a reliable tool to assess motor development in the first year and one-half for children not suspected of having motor delays. The level of reliability in screening developmentally delayed infants, specific training requirements for varying degrees of examiner experience, and test validity are variables yet to be determined.

Acknowledgment

We thank Kashinath Patil, PhD, for his statistical assistance and development of the computer program for the G statistic.

References

1 Milani-Comparetti A, Gidoni AE: Routine developmental examination in normal and retarded children. Dev Med Child Neurol 9:631-638, 1967
2 Trembath JT, Kliewer D, Bruce W: The Milani-Comparetti Motor Development Screening Test. Media Resource Center, C Louis Meyer Children's Rehabilitation Institute, University of Nebraska Medical Center, Omaha, NE, 1977
3 Stuberg WA, Dehne PR, Miedaner JA, et al: Milani-Comparetti Motor Development Screening Test: Test Manual, 1987 ed. Media Resource Center, C Louis Meyer Children's Rehabilitation Institute, University of Nebraska Medical Center, Omaha, NE, 1987
4 Milani-Comparetti A, Gidoni AE: Pattern analysis of motor development and its disorders. Dev Med Child Neurol 9:625-630, 1967
5 Bozynski ME, Nelson MN, Rosati-Skertich C, et al: Two-year longitudinal follow-up of premature infants weighing ≤ 1,200 grams at birth: Sequelae of intracranial hemorrhage. J Dev Behav Pediatr 5:346-352, 1984
6 Penn RD: The neurosurgical treatment of cerebral palsy. Pediatr Ann 8(10):72-78, 1979
7 Ellison PH, Browning CA, Larson B, et al: Development of a scoring system for the Milani-Comparetti and Gidoni method of assessing neurologic abnormality in infancy. Phys Ther 63:1414-1423, 1983
8 Price MM: Critique of the Milani-Comparetti Motor Development Screening Test. Physical and Occupational Therapy in Pediatrics 1(1):59-68, 1980
9 VanderLinden D: Ability of the Milani-Comparetti developmental examination to predict motor outcome. Physical and Occupational Therapy in Pediatrics 5(1):27-38, 1985
10 Campbell SK, Wilhelm IJ: Development from birth to 3 years of age of 15 children at high risk for central nervous system dysfunction. Phys Ther 65:463-469, 1985
11 Light RJ: Measures of response agreement for qualitative data: Some generalizations and alternatives. Psychol Bull 76:365-377, 1971
12 Fleiss JL: Statistical Methods for Rates and Proportions, ed 2. New York, NY, John Wiley & Sons Inc, 1981
13 Haley SM, Harris SR, Tada WL, et al: Item reliability of the Movement Assessment of Infants. Physical and Occupational Therapy in Pediatrics 6(1):21-39, 1986
14 Plewis I, Bax M: The uses and abuses of reliability measures in developmental medicine. Dev Med Child Neurol 24:388-390, 1980
15 Frankenburg WK, Dodds JB, Fandal AW: Denver Developmental Screening Test. Denver, CO, LADOCA Project and Publishing Foundation Inc, 1973
16 Mash EJ, McElwee JD: Situational effects on observer accuracy: Behavioral predictability, prior experience, and complexity of coding categories. Child Dev 45:367-377, 1974
17 Werner EE, Bayley N: The reliability of Bayley's Revised Scale of Mental and Motor Development during the first year of life. Child Dev 37:39-50, 1966
