Download pdf - Comparing Parent Ratings of Referred Preschoolers on the

Western Kentucky UniversityTopSCHOLAR®

Masters Theses & Specialist Projects Graduate School

5-2008

Comparing Parent Ratings of ReferredPreschoolers on the Child Behavior Checklist andBehavior Assessment System for Children - SecondEditionJennifer L. BourWestern Kentucky University, [email protected]

Follow this and additional works at: http://digitalcommons.wku.edu/theses

Part of the Developmental Psychology Commons, Educational Assessment, Evaluation, andResearch Commons, and the Educational Psychology Commons

This Thesis is brought to you for free and open access by TopSCHOLAR®. It has been accepted for inclusion in Masters Theses & Specialist Projects byan authorized administrator of TopSCHOLAR®. For more information, please contact [email protected].

Recommended CitationBour, Jennifer L., "Comparing Parent Ratings of Referred Preschoolers on the Child Behavior Checklist and Behavior AssessmentSystem for Children - Second Edition" (2008). Masters Theses & Specialist Projects. Paper 9.http://digitalcommons.wku.edu/theses/9

http://digitalcommons.wku.edu?utm_source=digitalcommons.wku.edu%2Ftheses%2F9&utm_medium=PDF&utm_campaign=PDFCoverPages

http://digitalcommons.wku.edu/theses?utm_source=digitalcommons.wku.edu%2Ftheses%2F9&utm_medium=PDF&utm_campaign=PDFCoverPages

http://digitalcommons.wku.edu/Graduate?utm_source=digitalcommons.wku.edu%2Ftheses%2F9&utm_medium=PDF&utm_campaign=PDFCoverPages

http://digitalcommons.wku.edu/theses?utm_source=digitalcommons.wku.edu%2Ftheses%2F9&utm_medium=PDF&utm_campaign=PDFCoverPages

http://network.bepress.com/hgg/discipline/410?utm_source=digitalcommons.wku.edu%2Ftheses%2F9&utm_medium=PDF&utm_campaign=PDFCoverPages




COMPARING PARENT RATINGS OF REFERRED PRESCHOOLERS ON THE CHILD BEHAVIOR CHECKLIST AND BEHAVIOR ASSESSMENT SYSTEM FOR

CHILDREN – SECOND EDITION

A Thesis Presented to

The Faculty of the Department of Psychology Western Kentucky University

Bowling Green, KY

In Partial Fulfillment Of the Requirements of the Degree

Specialist in Education

By Jennifer L. Bour

May 2008

COMPARING PARENT RATINGS OF REFERRED PRESCHOOLERS ON THE CHILD BEHAVIOR CHECKLIST AND BEHAVIOR ASSESSMENT SYSTEM

FOR CHILDREN – SECOND EDITION

Date Recommended Director of Thesis

Dean, Graduate Studies and Research Date

Acknowledgements

I would first like to thank my thesis director, Dr. Carl Myers. I greatly appreciate

all your hard work and patience while working on this project. You have provided me

with so much constructive feedback and guidance during my undergraduate and graduate

years, and for that I am truly thankful! I would also like to thank my thesis committee

members, Dr. Elizabeth Jones and Dr. Melissa Hakman, for taking the time to provide me

with ideas and helpful suggestions to help better my thesis. I also thank the Regional

Child Development Clinics for allowing me to collect my research data. Without their

efforts, this research project would have been very difficult. Most importantly, I would

like to thank my friends and parents who have been a wonderful support system to help

keep me going through graduate school and with this project. I would especially like to

thank my sister, Julie, who also went to graduate school ten years ago for School

Psychology. She has given me such wonderful guidance and support since she has been

through similar experiences. She has been my rock throughout this entire experience.

i

Table of Contents

Pages

List of Tables …………………………………………………………………………. iii

Abstract ……………………………………………………………………………….. iv

Introduction …………………………………………………………………………… 3

Literature Review ……………………………………………………………………… 5

Method ……………………………………………………………………………….. 21

Results ……………………………………………………………………………….. 28

Discussion …………………………………………………………………………… 42

References …………………………………………………………………………… 49

Appendix: Human Subjects Review Board Approval ………………………………. 53

ii

List of Tables

Pages

Table 1: General Features of the BASC-2 PRS-P and CBCL/1.5-5…………………. 15

Table 2: Pairs of BASC-2 and CBCL Scales Used for Comparisons………………… 27

Table 3: Correlations Between Corresponding Scales on the BASC-2 PRS-P

CBCL/1.5-5 (n=95)………………………………………………………… 29

Table 4: Correlations Between Corresponding Scales for the Current Study and

for a Non-referred Sample from the BASC-2 Manual……………………… 30

Table 5: Mean T Scores and t-test Results for Comparable BASC-2 and

CBCL Scales (n=95)………………………………………………………… 32

Table 6: Classification Consistency of Ratings Between the BASC-2 PRS-P

and CBCL/1.5-5……………………………………………………………. 36

Table 7: Percentage of Ratings Less Than or Greater Than One Standard

Deviation Between Standard Scores on the Corresponding BASC-2

and CBCL Scales ………………………………………………………….. 41

iii

COMPARING PARENT RATINGS OF REFERRED PRESCHOOLERS ON THE CHILD BEHAVIOR CHECKLIST AND BEHAVIOR ASSESSMENT SYSTEM FOR

CHILDREN – SECOND EDITION

Jennifer Lynn Bour May 2008 54 pages

Directed by: Dr. Carl Myers, Dr. Elizabeth Jones, and Dr. Melissa Hakman

Department of Psychology Western Kentucky University

It is essential for school psychologists assessing children to use instruments that

are reliable and valid. The focus of the current study is to determine whether or not the

parent preschool versions of two popular behavior rating instruments, the Behavior

Assessment System for Children – Second Edition (BASC-2; Reynolds & Kamphaus,

2004) and Child Behavior Checklist (CBCL; Achenbach & Rescorla, 2000), are

consistent measures of similarly-named behavioral constructs in preschool-aged children.

Parents of 95 preschoolers referred to a nonprofit child evaluation clinic because of

behavioral or developmental concerns completed both the BASC-2 and CBCL during an

initial evaluation session. The findings revealed that while significantly high correlations

occurred, the mean standard scores were significantly different for nine of the 15

construct comparisons. For six of the 15 comparisons, the mean difference was greater

than one standard deviation. Additionally, the CBCL usually resulted in higher ratings

than the BASC-2. Such findings imply that the two instruments are not equivalent, and

results from these two behavior rating scales need to be interpreted with caution. The

determination of whether or not a construct is considered clinically significant may

depend solely on the instrument completed by the parent.

iv

Introduction

Behavior rating scales may be one of the most widely used assessment procedures

in evaluating social-emotional problems in children. Because evaluations of preschoolers

have become a relatively new area of assessment for psychologists (Skovgaard,

Houmann, Landorph, & Christiansen, 2004), it is important to evaluate the validity of

these types of instruments for use with young children.

Assessing children from the preschool population may present a number of

challenges for professionals. Due to their young age and limited independence, it is

difficult to obtain accurate information from the child directly. Consequently, parents

must play a prominent role in the assessment process. A common way to obtain input

from parents, as well as from other informants (e.g., teachers), is through the use of

behavior rating scales. These instruments are often used by professionals because they

are considered to be time efficient and easily administered (Elliot & Busse, 1993).

Broad-band behavior rating scales measure a variety of constructs, such as

internalizing and externalizing behaviors in children. Externalizing behaviors are most

often and easily observed by anyone because they are overt behaviors (e.g., hitting,

kicking, etc.). In contrast, internalizing behaviors (e.g., anxious or depressive behaviors)

are more difficult to observe by professionals and parents. Parents or teachers may not be

aware of children’s internalizing behaviors until they are asked to rate the specific

behaviors on a behavior rating scale. Behavior rating scales are advantageous in this

area because they help identify internalizing types of problems that children may be

experiencing based on the reports of parents and teachers.

3

4

Although there are a variety of methods available for professionals to assess

social-emotional behaviors in children, behavior rating scales are commonly used and are

efficient for assessing problematic behaviors (Konold, Walthall, & Pianta, 2004). Two

common behavior rating scales that are often used in evaluating social-emotional

behaviors in school-aged children are the Behavior Assessment System for Children –

Second Edition (BASC-2, Reynolds & Kamphaus, 2004) and the Child Behavior

Checklist (CBCL, Achenbach & Rescorla, 2001). Both scales appear to be known for

their sound psychometric properties, but most research examining the two instruments

have studied the forms developed for school-aged children. Little research has been

conducted on the preschool versions of these instruments.

The preschool versions of the BASC-2 and CBCL assess a number of similarly

named constructs. It is important for professionals to know if the two instruments are

consistent measures of those constructs. This thesis will discuss the assessment of

preschoolers, advantages and disadvantages of behavior rating scales, and research

conducted on the BASC-2 and CBCL. The preschool versions of the CBCL and BASC-2

behavior rating scales were evaluated for the current research project by having parents

of referred preschoolers complete both instruments. Scores from the BASC-2 were

compared to equivalent scales on the CBCL in a variety of ways.

Literature Review

Preschool Assessment

Psychologists often work with children of varying ages; however, most school

psychologists are more familiar with various assessment instruments for the school-age

population than with the preschool population. The assessment of preschool-age children

is considered to be a newer area of assessment in the field of psychology (Skovgaard et

al., 2004). With this in mind, it is important for school psychologists to expand their

knowledge of assessment practices and measures for a younger population of children.

However, there appears to be a limited amount of research and information available

pertaining to the psychological assessment of preschool-age children.

There are a number of issues that professionals need to be aware of and take into

consideration when evaluating preschool children. For instance, using instruments that

are considered to be psychometrically sound for preschool children is essential. Both

early identification of problems and beneficial intervention practices for preschool-age

children are dependent upon having reliable assessment procedures for screening and

identification purposes (Lidz, 2003; Merrell, Blade, Lund, & Kempf, 2003). However,

given that preschool children’s development is highly variable, many preschool

instruments are often less accurate and reliable than instruments utilized with school-age

children.

Additionally, it is imperative that no single assessment instrument be used in

isolation when making evaluation decisions. Rather, it is considered best practice to use

a multimethod assessment approach to gain a full understanding of the child and his or

her needs (Lidz, 2003). In conducting a multimethod assessment, school psychologists

5

6

obtain information from a variety of sources. Valuable information about the child may

be discovered by completing observations in several different environments (e.g., school

and home), as well as in conducting interviews with parents and other professionals, such

as teachers or daycare workers (Sattler, 2001). However, the most popular forms of

obtaining assessment information on young children are through behavioral rating scales

and parental interviews. When possible, research supports the importance of obtaining

information from both parents using questionnaires and interviews, rather than from only

one parent (Bingham, Loukas, Fitzgerald, & Zucker, 2003; Skovgaard et al., 2004).

Parents are one of the main sources for referrals, and they provide valuable perceptions

about their child’s behaviors in the home environment.

Challenges in preschool assessment. A number of challenges may arise when

working with preschool-age children. Depending on the age and characteristics of the

child, several factors may affect the testing process. For instance, when school

psychologists administer standardized tests, preschool children may have a difficult time

with standardized directions and procedures due to variability in acculturation. Young

children have short attention spans, difficulty staying seated for long periods of time, and

difficulty understanding and following verbal instructions, which inhibit optimal

performance on standardized tests. With all of these pitfalls in mind, school

psychologists need to make an effort to alter the testing environment, as well as their own

mannerisms, to make it more suitable for younger children being tested (Sattler, 2001).

In addition, the preschool-age child may have limited independence, which leads to a

more prominent, supportive role for the parents in the assessment process. Behavior

rating scales can help ease the challenges in the assessment process with such a young

7

population of children because they do not rely on the children’s cooperation and

comprehension, but more on how raters perceive their behaviors. Behavior rating scales

will be discussed in further detail later in this literature review.

Preschool assessment methods. There are a number of assessment methods

available for children that measure a variety of behaviors; however, there are some

limitations for what may be used with younger children. For instance, self-report

inventories and client interviews are common forms of assessment, but these procedures

may only be beneficial when working with older children because preschool children are

severely limited in their reading ability and in the amount of information they are able to

give about their own development (Lidz, 2003).

Standardized testing is a popular form of assessment for young children. There

are many standardized tests available for preschool children that allow for a number of

comparisons to be made. School psychologists need to be aware of exactly what they are

intending to measure before selecting a standardized test because certain instruments

measure different constructs. For instance, the Bayley Scales of Infant Development-

Second Edition is not considered to be an intelligence test but more for measuring

developmental milestones in very young children, whereas the Cognitive Abilities Scale-

Second Edition may be more appropriate for measuring a young child’s intellectual

functioning (Lidz, 2003). Additionally, standardized assessments also have their pitfalls.

As mentioned previously, preschool-age children may exhibit many behaviors that are

not conductive to optimal test performance on standardized measures. Young children

may also struggle in being able to solely focus their attention to the task at hand.

8

Psychologists need to take these disadvantages into consideration when evaluating

preschool children since these issues may affect their results.

Third party behavior rating scales are another popular form of assessment. A

person familiar with the child rates the child’s behaviors. These ratings yield scores that

allow the comparison of that child’s behavior ratings to many other children in the

normative sample. Because behavior rating scales are the focus of this thesis, they will

be discussed more in-depth in the next section.

Behavior Rating Scales

Behavior rating scales are common assessment tools used to measure the

behaviors of children. Rating scales are beneficial for the preschool population due to

their ease of administration and their reliance on reports from individuals closely

involved with the child. As with any assessment instrument, there are limitations to using

behavior rating scales, which are detailed later in this review. Behavior rating scales

involve one or more individuals (e.g., teacher, parent) who rate the child’s behavior in a

particular setting (e.g., school or home environment). The raters should be individuals

who have regular contact and interaction with the identified child.

Behavior rating scales are most often used to measure children’s social-emotional

functioning, which is crucial during the preschool years. Unresolved social-emotional

difficulties often continue throughout childhood (Lavigne et al., 2001). Behavior rating

scales assist in the early identification of these problems, leading to early interventions

that help improve children’s level of functioning (Mereydith, 2001). Typically, behavior

rating scales use a Likert scale format. By using a Likert scale, raters are required to

indicate the extent of their agreement to a given statement based on their perceptions of

9

the child’s behavior. As an example, for a statement like, “Is shy around other children,”

the rater is able to choose from a selection of ratings based on their level of agreement

(e.g., Never, Sometimes, Often, or Almost Always). Compiled ratings on groups of

statements that are thought to comprise certain constructs can then be compared to a

normative sample to determine how typical or atypical a child’s social-emotional

functioning is at that point in time.

Constructs of behavior rating scales. Scales can be narrow- or broad-band

focused. There are a variety of narrow-band behavior rating scales available that measure

specific types of psychological and behavioral constructs. For instance, the Childhood

Autism Rating Scale (Schopler, Reichler, & Ro, 1988) is specifically designed to assess

characteristics of autism, while the Social Skills Rating System (Gresham & Elliott, 1990)

is used to measure a child’s social skills. In addition, there are behavior rating scales that

assess a broad range of constructs, called broad-band scales. This type of behavior rating

scale typically evaluates a range of internalizing and externalizing types of behaviors in

children. As described by Merrell et al. (2003), externalizing behaviors are considered to

be overt or excessive behaviors. Aggression, hyperactivity, or delinquent behaviors

would be considered externalizing behaviors. In contrast, children with internalizing

types of behaviors, such as depression or anxiety problems, can be easily overlooked by

teachers and other professionals. Internalizing behavioral symptoms are not easily

observed. Because these children do not overtly demonstrate behavioral problems like

children with externalizing problems do, they are not as commonly referred for problems

within classrooms. Behavior rating scales are considered to be especially beneficial for

assessing internalizing types of problems, in the sense that they help identify behaviors

10

that are not easily observed by professionals (Merrell et al., 2003). Often, professionals

and parents may overlook internalizing behaviors when observing children. Using a

behavior rating scale helps raters focus on types of behaviors that may not normally be

considered by an individual observing a child.

Purposes of behavior rating scales. By assessing a range of psychological and

behavioral constructs, broadband behavior rating scales may be used for a number of

purposes and have many benefits. A common use for a behavior rating scale is as a

screening instrument (Wrobel & Lachar, 1998). By using a behavior rating scale as a

screener, professionals can get a quick “estimate” or gain some insight about a child’s

behaviors. These scales may aid in the determination if a child is at-risk for developing

behavioral problems, which may then lead to preventative types of early interventions

(Merrell, 2003). As a screener, they may also aid in decision making. After

administering a behavior rating scale, a professional may decide whether a more

extensive evaluation is needed.

Behavior rating scales are often used as part of a comprehensive evaluation.

Behavior rating scales provide standardized ratings of a child’s behaviors that contribute

to a broader picture of a child’s level of functioning. Knowing whether or not ratings on

certain behavioral constructs (e.g., hyperactivity, anxiety, depression) are considered

clinically significant is especially important when specific behavioral diagnoses are being

considered. Merrell (2003) suggests using the “aggregation principle” when utilizing

behavior rating scales. This principle recommends obtaining ratings from a variety of

individuals from different settings to provide multiple perspectives on a child’s problem

behaviors. The compiled data is analyzed for consistencies across raters and situations.

11

Additionally, rating scales may be used by those who implement interventions for

children (Elliot & Busse, 1993). Teachers, or other professionals, may be able to use the

information found from a behavior rating scale in tailoring interventions. For instance, if

the results yield that a student is distant, avoids others in class, and appears to be highly

anxious, the teacher may be able to implement an intervention to work on those

difficulties. Another purpose for using behavior rating scales is to monitor the effects of

interventions (Merrell, 2003). Monitoring a student’s progress during a behavioral

intervention is important to ensure that the intervention is benefiting the student or if

modifications need to be made. Also, using a behavior rating scale to re-evaluate the

child’s behavior after an intervention has been completed is useful because the

information obtained demonstrates if progress has been accomplished and maintained.

Advantages of behavior rating scales. There are a number of advantages for

using behavior rating scales, as reported by Elliot and Busse (1993) and Merrell (2003).

Behavior rating scales are considered to be efficient, in the sense that they do not require

a significant amount of time or effort to administer. Conducting observations or

interviews to assess a broad range of internalizing and externalizing behaviors would take

an extensive amount of time. In addition, the rating scales are simple to administer and

score. Another advantage is that the use of rating scales allows for more involvement

from educators and parents with assessment and interventions, because they allow input

from multiple sources. They are considered reliable and valid when used to assess the

behavior of school-age children (Merrell, 2003).

Behavior rating scales are considered advantageous when assessing children

because young children are not able to provide much information about their own

12

development and behaviors. Additionally, the rating scales may aid in distinguishing

between “normal” and “abnormal” levels of behavior in young children. For instance,

assessing for Attention-Deficit/Hyperactivity Disorder (ADHD) in young children is

considered to be a difficult task, due to the fact that it is hard to distinguish between what

may be considered normal and abnormal levels of hyperactivity, impulsivity, and

inattentiveness because such behaviors are often seen in preschoolers. Instead of relying

solely on personal opinions, psychologists rely on behavior rating scales to provide

normative data on the frequency of problem behavior (Sciutto & Terjesen, 2000).

Limitations of behavior rating scales. Behavior rating scales are instruments that

are often used to assess behavioral symptoms in children of varying ages. Drotar, Stein,

and Perrin (1995) noted several shortcomings. One shortcoming is the generalization and

use with cultural groups on which the instruments are not normed. For instance, the

CBCL has been translated into many different languages and used in a number of

cultures, which can invalidate the test. Not every item on the scale may be interpreted in

a similar matter from culture to culture. Additionally, the way that children’s parents

may interpret their child’s symptoms may vary among cultures. For example, parents of

one culture may have a different idea of when to be concerned for certain behavioral

problems than parents from another culture. Differences in levels of concern for certain

behaviors may be seen within any culture as well.

Another limitation is that behavior rating scales do not provide any information

on a problem’s etiology, although, they do provide insight on a student’s current level of

functioning (Elliot & Busse, 1993). Despite standard scores developed from normative

samples, rating scales do not necessarily provide an objective measure of a student’s

13

problem behaviors; rather, the behavior is measured on how the rater perceives the

behavior (Konold et al., 2004). As a result, the ratings may be at-risk for being

influenced by a number of things, such as the rater having a poor memory or lack of

motivation to complete the instrument accurately. Another limitation is a risk of

intentional or unintentional bias in the ratings. For instance, a rater (e.g., teacher) may

have a preconceived notion of the child, thereby making the ratings biased in a positive or

negative sense (Elliot & Busse, 1993). A parent or teacher could intentionally try to

make a child more or less likely to qualify for a classification label or services by

exaggerating or minimizing their ratings.

In addition to rater bias, there are other notable issues of reliability and validity

with behavior rating scales. For instance, interrater reliability is more of an issue with

rating scales than it is with cognitive assessments (Elliot & Busse, 1993). This especially

becomes an issue when school psychologists obtain ratings from several individuals (e.g.,

teacher and parents). One rater may perceive a child’s behaviors differently than another

person rating the same type of behavior. Some behaviors, however, may be situationally

specific. A child may actually be behaving differently in different situations (e.g., school

and home), which would lead to the discrepancy in ratings on behavior rating scales. It is

difficult to know whether differences in ratings are due to differences in the rater’s

perceptions or the child’s actual behaviors. Another issue is that many of the behavior

rating scales have similarly named constructs (e.g., aggression, hyperactivity). As a

result, it would be easy to conclude that different instruments are measuring the same

type of behavior. However, the instruments may actually be measuring different aspects

of the similarly-named constructs and, therefore, one should not conclude that similarly

14

labeled scales on different behavior rating instruments are measuring the “same”

behavior (Elliot & Busse, 1993; Frick & Kamphaus, 2001).

Summary. It is evident that behavior rating scales have a number of advantages

and disadvantages. By using behavior rating scales, as well as other assessments, school

psychologists help aid in the early identification of social-emotional or behavioral

problems in young children. Early identification of problematic behaviors is an essential

step before being able to implement early intervention strategies. By intervening at a

young age, school psychologists may help prevent future negative outcomes for children

(Merrell, Caldarella, Streeter, Boelter, & Gentry, 2001).

The BASC-2 and CBCL

There are a variety of instruments available for assessing children’s social and

emotional behaviors. Two popular instruments are the Behavior Assessment System for

Children: Second Edition (BASC-2, Reynolds & Kamphaus, 2004) and the Child

Behavior Checklist (CBCL, Achenbach & Rescorla, 2001). School-age versions of both

assessment instruments are widely used by school psychologists because they exhibit

good psychometric properties with the school-age population (Vaughn, Riccio, Hynd, &

Hall, 1997). Both instruments were designed for aiding professionals in the evaluation

and diagnosis of certain emotional and behavioral disorders in children. Both also

contain internalizing and externalizing composite scales that evaluate a variety of

internalizing and externalizing types of behaviors in young children. Table 1 provides an

overview of general aspects of the two instruments. Because a comparison of the BASC-

2 and CBCL is the focus of this thesis, research comparing the two instruments will now

be reviewed.

15

Table 1

General Features of the BASC-2 PRS-P and CBCL/1.5-5 Dimension BASC-2 CBCL Age Range 2-5 years 1½ - 5 years Norms 1,200 children 700 non-referred children

Number of Items 134 99

Rating System 4-point scale (Never, 3-point scale (Not True, Sometimes, Often, Somewhat True, Very Almost Always) True) Scales 8 “clinical” scales, 7 “syndrome” scales, 5 4 “adaptive” scales, “DSM-Oriented” scales, and 4 composite scores and 3 composite scores

Few studies compare the current version of the BASC (i.e., BASC-2) with the

CBCL and even fewer studies have compared the preschool versions. Vaughn et al.

(1997) compared the original versions of the BASC (Reynolds & Kamphaus, 1998) and

CBCL (Achenbach, 1991) to determine their effectiveness in discriminating between the

different ADHD subtypes in school-age children. The sample consisted of 73 children

with ages ranging from 6.7 to 11.9 years. These children were referred to a

neuropsychology clinic that was university-based. The Parent Rating Scale (PRS) and

Teacher Rating Scale (TRS) were used for the BASC, and the Parent Report Form (PRF)

and Teacher Report Form (TRF) were used for the CBCL. Results indicated that both

scales demonstrated the ability to correctly identify children who have ADHD; however,

the BASC PRS and TRS were best for identifying the Inattentive subtype of ADHD.

16

Another study also evaluated the original BASC and CBCL’s utility in being able

to correctly identify children with or without ADHD, and the subtypes inattentive and

combined (Ostrander, Weinfurt, Yarnold, & August, 1998). The sample utilized for this

research consisted of 301 children between the ages of 6 and 11 years. In this study,

though, only the parent versions of both instruments were utilized. Overall, the results

differed from the findings of Vaughn et al. (1997). It was found that the BASC was better

able to discriminate between children with or without ADHD than the CBCL scale.

However, when distinguishing between the inattentive and combined subtypes, the CBCL

was a better predictor for students who may have the inattentive subtype, whereas the

BASC was better at predicting the combined subtype of ADHD.

Doyle, Ostrander, Skare, Crosby, and August (1997) compared the original

school-aged versions of BASC and CBCL. The main focus of this study was to determine

if the BASC-PRS was considered to be a comparable instrument to the CBCL. The

sample consisted of 156 children between 6 and 11 years who were considered to be at-

risk for Conduct Disorder. It was determined that the following scales evidenced the

strongest relationships: the Aggression scale on both instruments, the BASC Conduct

Problems and CBCL Delinquency scales, and the BASC Depression and Anxiety scales

with the CBCL Anxiety/Depression scale. Overall, however, it was supported that the

school-age version of the original BASC-PRS was comparable to the previous versions of

the CBCL.

An unpublished comparison of the preschool versions of the original BASC with

the current CBCL was conducted by Sidebottom (2005), who completed her study with

referred preschool-age children. The participants included 50 parents of preschoolers

17

between the ages of 30 and 71 months, with a mean age of 43.9 months. The participants

were referred to a nonprofit child development clinic. This research was interested in a

comparison of 13 similarly-named scales on both assessment instruments. The results

revealed that 9 of the 13 comparisons did have significantly different means. The

Anxiety scale of both instruments was the only scale that yielded a significantly different

mean score. Additionally, 11 out of the 13 comparisons resulted in the CBCL having

higher mean scores than the BASC. It was found that all of the scales were significantly

and positively correlated and 9 of the 13 comparisons had correlations that were

considered to be at least at a moderately strong level (> .50). The comparisons between

the somatization, aggression, atypicality, and withdrawn scales had correlations less than

.50. Sidebottom hypothesized that all corresponding scales would have differences

between the standard scores less than one standard deviation (< 10 points); however, this

hypothesis was refuted because 18 to 60 percent of the participants’ standard scores did

not fall within one standard deviation of each other.

Apparently, only one study has compared the preschool versions of the latest

editions of the BASC-2 and CBCL instruments. Research comparing the parent version of

the CBCL for ages 1.5-5 and the BASC-2 Preschool Rating Scale-Parent (PRS-P) was

reported in the BASC-2 manual (Reynolds & Kamphaus, 2004). The comparison used

parent ratings of 53 children that were 2 to 5 years old. The results indicated that the

similarly-named scales on both instruments tended to have high correlations. For

example, the Behavioral Symptoms Index (BSI) from the BASC-2 and the CBCL/1.5-5

Total Problems score had a correlation coefficient of .78. The ratings were of typical

18

children without behavioral problems, which is not representative of the population of

preschool children referred for evaluations.

Purpose

The assessment and identification of social and emotional problems in preschool

children is becoming more prevalent and is considered a newer area of assessment for

psychologists (Skovgaard et al., 2004). There are a variety of methods that professionals

use to evaluate preschool-aged children; however, the method given the most

consideration in this thesis is the use of behavior rating scales. Behavior rating scales are

considered to be beneficial when assessing behavioral problems because they assist in the

early identification of social and emotional problems in young children. Additionally,

behavior rating scales are advantageous in comparison to other assessment methods for

social and emotional behaviors because they can help identify behaviors that are most

often difficult to observe (e.g., internalizing behaviors). However, researchers have noted

that little information is known about the identification of behavioral problems and the

use of rating scales with the early childhood population (Sciutto & Terjesen, 2000).

The most recent editions of the BASC-2 PRS-P and CBCL/1.5-5 have extended

their age ranges to include even younger preschoolers; however, there appears to be little

evidence to support the use of either scale for preschoolers as young as 18 months of age.

Research studies reported in the BASC-2 and CBCL manuals, as well as from reviewed

studies, indicated that school-age versions of both instruments demonstrate reliable and

valid psychometric properties. However, due to the lack of research for the more recent

preschool versions, it is important both behavior rating scales are evaluated to ensure that

they are reliable and valid measures for assessing social and emotional problem behaviors

19

in preschool children. The purpose of the current study was to determine if the BASC-2

PRS-P and the CBCL/1.5-5 were consistent social-emotional measures of preschoolers’

problematic behaviors. There appears to be no published research available about the

BASC-2 PRS-P and the CBCL/1.5-5 with a referred population of preschool children. For

the current study, a comparison between parent ratings of referred preschool children was

made between the similarly-named scales from each instrument, which helped determine

if the BASC-2 PRS-P and CBCL/1.5-5 reliably and consistently measure corresponding

behavioral constructs.

Hypotheses

1. Correlations between the standard scores on the corresponding constructs on

the BASC-2 PRS-P and the CBCL/1.5.5 would be positive, significant, and at a

moderately strong level, defined as a correlation greater than .50 (Cohen,

1988).

2. Mean scores on each of the corresponding scales on the BASC-2 PRS-P and

the CBCL/1.5-5 would not be significantly different from one another.

Because both instruments have corresponding scales that measure similar

constructs, one would assume that the mean standard scores would all be

consistent with both instruments.

3. Corresponding scales from the BASC-2 PRS-P and CBCL/1.5-5 would provide

consistent classification outcomes. Scores ≥ 1.5 standard deviations above the

mean are considered clinically significant. Thus, scores from corresponding

scales from both instruments were evaluated to determine the overall

percentage of classification agreement (i.e., both scores < 1.5 standard

20

deviations above the mean plus both scores ≥ 1.5 standard deviations above

the mean). Percentages of 80 or above were considered as acceptable

classification consistency (Lidz, 2003).

4. Differences between the standard scores on the corresponding constructs

would be less than one standard deviation. Both instruments use T scores,

which have a mean of 50 and a standard deviation of 10 points. As previously

stated, it was assumed that both instruments would have similar scores on

corresponding constructs; therefore, it was expected that the standard scores

would fall within one standard deviation of each other.

Method

Participants

The participants for this study were 95 parents or guardians of preschoolers who

were referred to a nonprofit child development clinic for a behavioral developmental

evaluation. The children ranged in age from 24 to 70 months, with a mean age of 35.1

months (SD = 10.9 months). Only one parent or guardian of each child was asked to

participate in this study. Typically, the instruments were completed by the children’s

mothers (81%), while 9.5% were completed by female guardians and another 9.5%

completed by the children’s fathers. The majority of the preschool children were

Caucasian (84%) while 12% were African American, 3% were Hispanic, and 1% was

Asian. Additionally, 85.3% of the children were boys, whereas 14.7% were girls. The

participants’ education level was also ascertained, and 61% had a high school education

or less, while 39% had at least some college education. The United States Census Bureau

(U.S. Census Bureau, 2008) indicates the obtained demographic data were roughly

comparable to the general population in Kentucky where 90.2% are Caucasian, 7.5%

African American, 2.0% Hispanic, and 1.0% Asian. The Census Bureau also noted 74.1%

of Kentuckians to have a high school degree or less and 17.1% to have a college degree.

Instruments

Behavior Assessment System for Children, 2nd Edition (BASC-2). The BASC-2

(Reynolds & Kamphaus, 2004) has five different forms or assessment modalities. There

are two behavior rating scales (teacher and parent) which are intended to be used for

children ages 2 to 21 years. A Self-Report of Personality is also part of the BASC-2 that

is intended to be utilized with children between the ages of 8 and 25 years. A fourth

21

22

component of the BASC-2 is a structured developmental history form. The

developmental history is most often completed by a professional, in which detailed

background information is obtained from a child’s parent or guardian; however, it may

also be given to the parent or guardian and completed in the form of a questionnaire.

Finally, the BASC-2 includes a form for observing student behaviors in the classroom

setting. This direct observation form is used to observe both positive and negative types

of behaviors in a classroom (Reynolds & Kamphaus, 2004).

For the purposes of this study, only the behavior rating scale designed to be

completed by parents of preschool children, called the Parent Rating Scale – Preschool

(PRS-P) was described and examined. The original BASC was developed in 1998 with

an age range of 4 to 18 years; however, the most recent version extended its age range.

The BASC-2 PRS-P was normed with a sample of 1,200 preschool children and is

intended for use with children 2 years of age up through 5 years, 11 months. The BASC-

2 uses T scores, which have a mean of 50 and a standard deviation of 10. The BASC-2

PRS-P contains 134 behaviors and the parent or guardian rates the occurrence of each

item using a 4-point rating system: Never, Sometimes, Often or Almost Always. These

behavioral items comprise eight “clinical” scales, which evaluate maladaptive behaviors

(i.e., Hyperactivity, Aggression, Anxiety, Depression, Somatization, Atypicality,

Withdrawal, and Attention Problems). Additionally, the results from the PRS-P provide

four “adaptive” scales (i.e., Adaptability, Social Skills, Activities of Daily Living, and

Functional Communication), as well as four composite scores (i.e., Externalizing

Problems, Internalizing Problems, Adaptive Skills, and the Behavioral Symptoms Index).

23

Reynolds and Kamphaus (2004) reported that the BASC-2 has satisfactory

reliability estimates for the preschool population. More specifically, for test-retest

reliability, a sample of 87 preschool children with a median age of 4 years, 6 months was

used to obtain test-retest data. Each child in the sample was rated twice by the same

parent/guardian with 9 to 70 days between each rating. The children included in the

sample were from the general population or had a previous clinical diagnosis. As noted

in the BASC-2 manual, test-retest reliabilities were adjusted for restriction of range

(Reynolds & Kamphaus, 2004). The reliability coefficients for test-retest reliability

ranged from the low .80s to the lower .90s. Moreover, when evaluating interrater

reliability, a sample of 40 preschoolers with a median age of 3 years, 11 months was

assessed by two different parents/guardians. The time that lapsed between ratings was

between 0 to 70 days. Interrater reliability coefficients were also adjusted for restriction

of range. The estimates for interrater reliability are somewhat lower than the test-retest

coefficients. For interrater reliability, the median reliability coefficient yielded for the

preschool children was .74.

The BASC-2 manual (Reynolds & Kamphaus, 2004) contains evidence of

adequate internal consistency coefficients for the PRS-P for the general norm sample of

preschoolers. More specifically, combined internal consistency coefficients for the

clinical and adaptive scales for ages 2 to 3 years ranged from .77 to .88, and for ages 4 to

5 years, coefficients ranged from .70 to .87. Combined internal-consistency coefficients

for the composite scales for ages 2 to 3 years ranged from .85 to .93, and for ages 4 to 5,

coefficicents ranged from .87 to .93.

24

Child Behavior Checklist (CBCL/1.5-5). The Achenbach System of Empirically

Based Assessment (ASEBA) is an assessment approach often used by professionals to

evaluate a variety of behaviors (i.e., adaptive, social/emotional, maladaptive behaviors,

etc.) in people ranging in age from 1½ to over 90 years (Rescorla, 2005). A widely used

behavior rating scale, which is a component of ASEBA, is the CBCL. Like the BASC-2,

the CBCL also contains a variety of forms for parents/guardians and teachers to complete;

however, the primary focus for this research is the parent form used for children 1½

through 5 years of age.

The CBCL was revised and updated in the year 2000. The original preschool

CBCL version was for children between the ages of 2 and 4. There was also a version of

the CBCL that was intended for children between the ages of 4 and 18 years. Currently,

the most recent revision of the CBCL has extended the preschool version’s age range to 1

½ to 5 years of age (CBCL/1.5-5, Achenbach & Rescorla, 2000). Additionally,

empirically-based scales, as well as DSM-oriented scales, were developed for the new

version of the CBCL/1.5-5 (Rescorla, 2005). The CBCL yields T-scores, which have a

mean of 50, with a standard deviation of 10. Also, only the composite scores on the

CBCL go down to a T score of 50.

The parent version of the CBCL/1.5.-5 contains 99 different problems or

behaviors, with item 100 allowing the respondent to write about any specific problems

that were not previously mentioned in the other items. The parent is required to rate the

occurrence of each item using a 3-point rating scale: 0 (Not True), 1 (Somewhat or

Sometimes True), and 2 (Very True or Often True). The specific behavioral items

comprise seven different “syndrome” scales (i.e., Emotionally Reactive,

25

Anxious/Depressed, Somatic Complaints, Withdrawn, Sleep Problems, Attention

Problems, Aggressive Behavior), as well as five “DSM-Oriented” scales (i.e., Affective

Problems, Anxiety Problems, Pervasive Developmental Problems, Attention

Deficit/Hyperactivity Problems, Oppositional Defiant Problems). The CBCL/1.5-5 also

provides scores for the composite areas of Internalizing Problems, Externalizing

Problems, and Total Problems.

The CBCL/1.5-5 was normed using a sample of 700 non-referred preschool

children. The manual (Achenbach & Rescorla, 2000) reports test-retest reliability over an

8 day period, which resulted in the majority of the scale correlations being in the .80s and

.90s. Additionally, internal-consistency coefficients ranged from .66 to .92 for the

syndrome scales, .63 to .86 for the DSM-oriented scales, and .89 to .95 for the composite

scales (Achenbach & Rescorla, 2000). The validity of the CBCL/1.5-5 was demonstrated

in the manual by providing evidence of the instrument’s ability to accurately distinguish

between referred and non-referred children.

Scale Comparisons

For the purposes of this study, fifteen corresponding scales and composites from

the BASC-2 and the CBCL/1.5-5 were chosen for comparison. The comparisons were

made on the basis of the scales’ similarity in construct names and/or similarity in the

behavioral symptoms measured. For instance, the CBCL contains an Aggressive

Behavior scale, whereas the BASC-2 has an Aggression scale. It is important to note that

four of the BASC-2 scales (i.e., Hyperactivity, Attention Problems, Anxiety, and

Depression) and two of the CBCL scales (i.e., Anxious/Depressed and Attention

Deficit/Hyperactivity Problems) were used in multiple comparisons due to the overlap in

26

construct names from both instruments. The BASC-2 PRS-P and CBCL/1.5-5 scales for

the fifteen comparisons are listed in Table 2.

Procedure

Staff from the nonprofit child development clinic located client files where both

the BASC-2 PRS-P and CBCL/1.5-5 were completed by parents/guardians of referred

preschool-aged children. The staff then copied the score sheets after deleting the child’s

name from the score sheet. The basic information (chronological age, gender, ethnicity,

etc.) was written on a demographic form and stapled to the score sheets. In this manner,

test data were given to the investigator without knowledge or access to the children’s or

parents’ names. All of the rating scales were scored using the computer scoring software

sold by the tests’ publishers. Because the CBCL uses gender-specific norms, gender-

specific norms were also used when scoring the BASC-2 protocols to enhance

comparability.

27

Table 2

Pairs of BASC-2 and CBCL Scales Used for Comparisons

BASC-2 PRS-P CBCL/1.5-5

Hyperactivity ADHD Problems

Hyperactivity Attention Problems

Attention Problems Attention Problems

Attention Problems ADHD Problems

Aggression Aggressive Behavior

Anxiety Anxious/Depressed

Anxiety Anxiety Problems

Depression Anxious/Depressed

Depression Affective Problems

Somatization Somatic Complaints

Atypicality Pervasive Developmental Problems

Withdrawal Withdrawn

Externalizing Externalizing

Internalizing Internalizing

Behavioral Symptoms Index Total Problems

28

Results

Strength of Correlations – Hypothesis 1

It was hypothesized that the correlations between the standard scores on the

corresponding constructs of the BASC-2 PRS-P and the CBCL/1.5-5 would be positive,

significant, and at a moderately strong level (> .50). To evaluate this hypothesis, paired-

sample correlations between T scores were conducted on the corresponding constructs of

both the BASC-2 and CBCL to determine the strength of the relationships between

instruments. Correlations between the T scores of all scales on the CBCL/1.5-5 and on

the BASC-2 PRS-P are provided in Table 3. Correlations for the 15 comparisons are in

bold type in the table and are also listed separately in Table 4. A more stringent p value

of .001 was used for significance for this particular data analysis to control for Type I

error. Results revealed that all 15 comparisons were positively and significantly

correlated. Fourteen of the 15 comparisons were considered to be correlated at a

moderately strong level of r > .50. The comparison that did not achieve a correlation of

greater than .50 was the Somatization scale on the BASC-2 and the Somatic Complaints

scale on the CBCL. The strongest correlations were achieved amongst two of the three

composite scales: Behavior Symptoms Index -Total Problems (.90) and Externalizing

(.90). The third composite scale comparison, Internalizing, only received a correlation of

.63. Other scales with very strong correlations included the Aggression/Aggressive

Behavior (.86) and Attention Deficit Hyperactivity (.79). Additionally, the current

study’s correlations were compared with correlations between the BASC-2 and CBCL

found in the BASC-2 manual, which used a nonreferred sample. This comparison is

29

Table 3

Correlations Between Corresponding Scales on the BASC-2 PRS-P and CBCL/1.5-5 (n = 95)

BASC-2 Scales

CBCL Scales 1 2 3 4 5 6 7 8 9 10 11

Emotionally Reactive .69* .53* .64* .47* .74* .28 .61* .28 .67* .70* .80*

Anxious/Depressed .50* .31 .55* .61* .59* .35* .53* .22 .67* .56* .64*

Somatic Complaints .29 .16 .31 .41* .33* .44* .40* .26 .49* .32 .40*

Withdrawn .28 .42* .17 .05 .25 .05 .71* .61* .17 .23 .52*

Sleep Problems .41* .07 .40* .27 .45* .50* .29 .10 .53* .44* .42*

Attention Problems .75* .66* .54* .13 .52* .29 .46* .08 .44* .68* .67*

Aggressive Behavior .81* .58* .86* .29 .75* .31 .54* .24 .63* .89* .87*

Affective Problems .55* .39* .52* .14 .61* .28 .61* .17 .49* .57* .66*

Anxiety Problems .39* .22 .44* .59* .40* .40* .50* .37* .58* .44* .54*

Pervasive Dev. Prob. .38* .44* .30 .23 .32 .17 .73* .69* .32 .36* .63*

Attention Deficit/Hyp. .79* .69* .59* .15 .58* .20 .42* .10 .44* .73* .70*

Oppositional Defiant .81* .56* .75* .27 .73* .28 .49* .15 .60* .83* .80*

Internalizing .59* .49* .54* .45* .62* .35* .74* .48* .63* .60* .79*

Externalizing .86* .62* .84* .28 .75* .34* .56* .20 .64* .90* .88*

Total Problems .78* .59* .73* .39* .74* .40* .73* .35* .69* .80* .90* Note. Numbered BASC-2 scales are: 1=Hyperactivity, 2=Attention, 3=Aggression, 4=Anxiety, 5=Depression, 6=Somatization, 7=Atypicality, 8=Withdrawal, 9=Internalizing, 10=Externalizing, 11=Behavior Symptoms Index. *p < .001.

30

Table 4 Correlations Between Corresponding Scales for the Current Study and for a Non-

referred Sample from the BASC-2 Manual BASC-2 Scale—CBCL Scale BASC-2 Manual Current Results

Hyperactivity—ADHD Problems .79 .79

Hyperactivity—Attention Problems .78 .75

Attention Problems—Attention Problems .65 .66

Attention Problems—ADHD Problems .59 .69

Aggression—Aggressive Behavior .67 .86

Anxiety—Anxious/Depressed .32 .61

Anxiety—Anxiety Problems .39 .59

Depression—Anxious/Depressed .47 .59

Depression—Affective Problems .54 .61

Somatization—Somatic Complaints .56 .44

Atypicality—Pervasive Dev. Problems .42 .73

Withdrawal—Withdrawn .42 .61

Externalizing—Externalizing .83 .90

Internalizing—Internalizing .68 .63

Behavioral Symptoms Index—Total Problems .78 .90

31

provided in Table 4. A visual inspection of the results indicate a few differences between

correlations within the BASC-2 manual and the current results. For several of the

corresponding scales, the results found within the BASC-2 manual had much lower

correlations than the current results. For instance, the biggest numerical difference was

found for the comparison of the Atypicality and Pervasive Developmental Problems

scales, in which the BASC-2 manual correlation was .42 and the current results achieved a

correlation of .73. The Anxiety and Anxious/Depressed comparison also had a

substantial discrepancy between correlations, in which the manual reported a correlation

of .32 and the current results achieved a correlation of .61. Eleven of the correlations

from the current study were higher than those found within the BASC-2 manual. A

possible reason for these discrepancies may be due to the smaller sample size (n = 53)

used in the manual. However, the most likely reason is that the sample from the BASC-2

manual consisted of non-referred children. Thus, there was a restriction in the range of

scores they obtained, resulting in lower correlation coefficients.

Consistency of Mean Scores – Hypothesis 2

It was hypothesized that the mean scores on each of the corresponding scales on

the BASC-2 PRS-P and the CBCL/1.5-5 would not be significantly different from one

another. To evaluate this hypothesis, paired sample t-tests were used to compare the

mean scores on each of the corresponding scales. Mean scores for the corresponding

scales are listed in Table 5. A more stringent p value of .01 was used as the criterion for

significance for this data analysis to control for Type I errors. Significantly different

mean scores were found in 9 of the 15 comparisons. Six of the 9 significantly different

32

Table 5

Mean T-Scores and t-test Results for Comparable BASC-2 and CBCL Scales (n = 95) BASC-2 Scale—CBCL Scale BASC-2 CBCL t values

Hyperactivity—ADHD Problems 64.6 64.0 0.7

Hyperactivity—Attention Problems 64.6 64.8 -0.2

Attention Problems—Attention Problems 64.4 64.8 -0.5

Attention Problems—ADHD Problems 64.4 64.0 0.5

Aggression—Aggressive Behavior 58.0 69.1 -13.4**

Anxiety—Anxious/Depressed 46.5 57.9 -13.3**

Anxiety—Anxiety Problems 46.5 59.7 -14.2**

Depression—Anxious/Depressed 59.7 57.9 1.4

Depression—Affective Problems 59.7 63.2 -2.7*

Somatization—Somatic Complaints 48.7 59.0 -10.1**

Atypicality—Pervasive Developmental Problems 68.9 71.7 -2.7*

Withdrawal—Withdrawn 57.8 68.8 -10.4**

Externalizing—Externalizing 62.4 67.8 -8.3**

Internalizing—Internalizing 52.2 63.5 -11.0**

Behavioral Symptoms Index—Total Problems 66.6 67.4 -1.3 *p < .01. **p < .001.

33

mean scores were greater than one standard deviation apart, indicating fairly substantial

differences in results.

The corresponding scales that were found to be the most similar were the multiple

combinations of the BASC-2 Hyperactivity and Attention Problems scales with the CBCL

ADHD and Attention Problems scales. Additionally, the Depression-Anxious/Depressed

and the Behavioral Symptoms Index-Total Problems scales from both instruments did not

result in significantly different means. The scales that resulted in the largest difference

between means were the BASC-2 Aggression and Anxiety scales with the CBCL

Aggressive Behavior, Anxious/Depressed, and Anxiety Problems scales.

Another interesting finding is that the mean scores for the CBCL were higher than

the BASC-2 for 12 out of 15 comparisons. The CBCL truncates the “syndrome” and

“DSM-oriented” scales at 50, which means that an individual may not receive a score

lower than 50, except for the overall composite scales. The BASC-2 does not truncate its

scores; therefore, this may account for the consistently higher mean scores on the CBCL.

To test this possibility, all T scores, with the exception of composite scores, were re-

coded on the BASC-2. Any scores on the BASC-2 that were below 50 were re-coded to

equal 50. Upon re-analysis of the data, CBCL mean scores still remained higher than the

BASC-2 truncated mean scores for 11 out of 15 CBCL scales. Additionally, after

conducting paired sample t-tests with the re-coded data, it was found that the scale

comparisons that had significant differences at the p < .001 value listed in Table 5 still

remained significantly different at the p < .001 level. Differences emerged where the p

value was at < .01 in Table 5. Results now revealed that the Depression-Affective

Problems and the Atypicality-Pervasive Developmental Problems comparisons were no

34

longer significantly different. Interestingly, before the truncation, the Depression-

Anxious/Depressed comparison was not found to be significantly different; however,

after re-coding the BASC-2 data, this comparison now became significantly different at

the p < .001 level because the mean for the BASC-2 Depression scale was significantly

higher than the mean for the CBCL Anxious/Depressed scale. This post-hoc analysis

would suggest that neither of the BASC-2’s Depression and Anxiety scales are measuring

the same behaviors as the Anxious/Depressed Scale from the CBCL. Overall, it appears

the higher scores achieved on the CBCL/1.5-5 are not due to the truncation of scores, but

that the CBCL/1.5-5 actually results in consistently higher T scores than the BASC-2

PRS-P.

Consistency of Ratings – Hypothesis 3

It was hypothesized that the ratings on the corresponding scales from the BASC-2

PRS-P and CBCL/1.5-5 would be consistent. In addition to comparing mean ratings on

the two instruments, consistency of results was evaluated by examining classification

outcomes on an individual level. Specifically, standard scores for each scale comparison

were categorized as clinically significant or not clinically significant based on T scores

greater than or less than a criterion score of 65 using cross-tabulation tables. A T score of

65 was chosen because it is 1.5 standard deviations above the mean and is the criterion

for “Borderline Significant” on the CBCL. The use of a T score of 70 (“Clinically

Significant”) was not used because it was thought a score two standard deviations above

the mean would be too stringent of a criterion score. Data analyses involved calculating

the percentage of ratings less than 65 on both instruments, the percentage of ratings at or

above 65 on both instruments, and the percentage of ratings where a scale on one

35

instrument had a score greater than or equal to 65 but there was a corresponding scale

with a score less than 65. A total percentage of overall consistency between ratings on the

BASC-2 PRS-P and CBCL/1.5-5 was then calculated. Overall consistency consisted of

adding the two percentages in which both instruments consistently measured the same

construct as either above or below the criterion score of 65. These results are in Table 6.

On average, it was found that the BASC-2 ratings were above the criterion score while the

CBCL ratings on comparable scales were below the criterion score only 5.8% of the time.

In contrast, the CBCL ratings were above the criterion score while the BASC-2 ratings

were below the criterion score 19.2% of the time. Thus, these results indicate that the

likelihood of obtaining a significantly high score on the CBCL but not on the BASC-2 is

3.3 times more likely than obtaining a significantly high score on the BASC-2 but not on

the CBCL.

Using the criterion of acceptable classification consistency of 80 percent or above

(Lidz, 2003), only 4 of the 15 comparisons met that criterion level. The Behavioral

Symptoms Index-Total Problems (87%) comparison resulted in the highest consistency

percentage. Similar results were also found with the Hyperactivity-ADHD Problems

(85%), Externalizing-Externalizing (84%), and the Anxiety-Anxious/Depressed (81%)

comparisons. These results suggest that there was a higher level of agreement in

classification ratings on those scales between the BASC-2 and CBCL. Most comparisons

(n = 8) resulted in overall classification consistencies between 70 and 80%. There were

three comparisons with overall classification consistencies below 70%: Withdrawal-

Withdrawn (57%), Internalizing-Internalizing (60%), and Somatization-Somatic

Complaints (69%).

36

Table 6

Classification Consistency of Ratings Between the BASC-2 PRS and CBCL/1.5-5 BASC-2 Scales CBCL Scales ADHD Problems

Hyperactivity T < 65 T ≥ 65 Consistency T < 65 43% 4% 85% T ≥ 65 11% 42% Attention Problems

Hyperactivity T < 65 T ≥ 65 Consistency T < 65 34% 14% 77% T ≥ 65 9% 43% Attention Problems

Attention Problems T < 65 T ≥ 65 Consistency T < 65 33% 14% 76% T ≥ 65 11% 43% ADHD Problems

Attention Problems T < 65 T ≥ 65 Consistency T < 65 38% 8% 76% T ≥ 65 16% 38%

37

Table 6 (continued). BASC-2 Scales CBCL Scales

Aggressive Behavior

Aggression T < 65 T ≥ 65 Consistency T < 65 46% 29% 71% T ≥ 65 0% 25% Anxious/Depressed

Anxiety T < 65 T ≥ 65 Consistency T < 65 76% 18% 81% T ≥ 65 1% 5% Anxiety Problems

Anxiety T < 65 T ≥ 65 Consistency T < 65 73% 21% 78% T ≥ 65 1% 5% Anxious/Depressed

Depression T < 65 T ≥ 65 Consistency T < 65 61% 6% 78% T ≥ 65 16% 17% Affective Problems

Depression T < 65 T ≥ 65 Consistency T < 65 50% 18% 73% T ≥ 65 9% 23%

38

Table 6 (continued). BASC-2 Scales CBCL Scales Somatic Complaints

Somatization T < 65 T ≥ 65 Consistency T < 65 64% 29% 69% T ≥ 65 2% 5% Pervasive Developmental Problems

Atypicality T < 65 T ≥ 65 Consistency T < 65 19% 23% 74% T ≥ 65 3% 55% Withdrawn

Withdrawal T < 65 T ≥ 65 Consistency T < 65 34% 40% 57% T ≥ 65 3% 23% Externalizing

Externalizing T < 65 T ≥ 65 Consistency T < 65 44% 16% 84% T ≥ 65 0% 40% Internalizing

Internalizing T < 65 T ≥ 65 Consistency T < 65 47% 39% 60% T ≥ 65 1% 13%

39

Table 6 (continued). BASC-2 Scales CBCL Scales Total Problems

Behavioral Symptoms Index T < 65 T ≥ 65 Consistency T < 65 34% 9% 87% T ≥ 65 4% 53%

Note. Consistency refers to the percentage of agreement where scores from corresponding

scales were both either above or below the cutoff score (T = 65).

40

Standard Score Differences – Hypothesis 4

As yet another way to measure the consistency of ratings on an individual basis, it

was hypothesized that the differences between the standard scores on the corresponding

constructs of the BASC-2 and CBCL will be less than one standard deviation (< 10 points)

apart. To evaluate this hypothesis, the percentage of participants who scored less than,

and greater than, one standard deviation between standard scores on corresponding scales

from both instruments was determined. Results are presented in Table 7. It was found

that all scale comparisons resulted in having standard scores greater than one standard

deviation apart. Six scale comparisons had more than half of the standard scores greater

than one standard deviation apart: Withdrawal-Withdrawn (54%), Somatization-Somatic

Complaints (58%), Aggression-Aggressive Behavior (61%), Internalizing-Internalizing

(65%), Anxiety-Anxious/Depressed (66%), and Anxiety-Anxiety Problems (68%). Only

three of the scale comparisons had less than 20% of their score comparisons greater than

one standard deviation apart: Attention Problems-Attention Problems (16%), Attention

Problems-ADHD Problems (14%), and the overall composite comparison of Behavior

Symptoms Index-Total Problems (9%). Overall, these results provide additional

evidence that the CBCL/1.5-5 and the BASC-2 PRS-P do not consistently measure most

similarly-named constructs.

41

Table 7

Percentage of Ratings Less Than or Greater Than One Standard Deviation Between Standard Scores on the Corresponding BASC-2 and CBCL Scales Percent BASC-2 Scale—CBCL Scale < 1 SD ≥ 1 SD

Hyperactivity—ADHD Problems 75 25

Hyperactivity—Attention Problems 73 27

Attention Problems—Attention Problems 84 16

Attention Problems—ADHD Problems 86 14

Aggression—Aggressive Behavior 39 61

Anxiety—Anxious/Depressed 34 66

Anxiety—Anxiety Problems 32 68

Depression—Anxious/Depressed 54 46

Depression—Affective Problems 51 49

Somatization—Somatic Complaints 42 58

Atypicality—Pervasive Developmental Problems 65 35

Withdrawal—Withdrawn 46 54

Externalizing—Externalizing 76 24

Internalizing—Internalizing 35 65

Behavioral Symptoms Index—Total Problems 91 9 Note. ADHD = Attention Deficit Hyperactivity Disorder.

Discussion

The BASC-2 and CBCL are common instruments utilized by school psychologists

to evaluate social-emotional behaviors in school-age children. However, most available

research examines the forms intended for school-age children. Both instruments have

recent revisions, which expanded their age range to include young children. Yet the

appropriateness of assessing children as young as 18 months with behavior rating scales

has received little attention in the literature. The current study examines the consistency

of measurement between corresponding scales on the parent preschool forms of the

BASC-2 PRS-P and the CBCL/1.5-5 with a group of clinically-referred preschool

children. The results from the current study provided a number of cautions about the use

of the two popular preschool behavior rating scales.

Parents or guardians of 95 referred preschool children from a non-profit child

development clinic were the participants for this study. The parents completed the

preschool forms of the BASC-2 and CBCL at the same point in time. By using this

procedure, temporal, source and setting variance were controlled, thus leaving instrument

variance as the only explanation for differences in results (Merrell, 2003). A total of 15

corresponding scales from the BASC-2 and CBCL were chosen for comparison. To

evaluate these scales, a number of analyses were conducted, including correlations

between standard scores and a comparison of mean scores on either scale. Furthermore,

the percentage of classification consistency between ratings was also evaluated to

determine the level of agreement between the two instruments. The final analysis

involved evaluating the size of the differences between standard scores on the

corresponding scales for each individual.

42

43

The current research examined the strength of correlations between 15

corresponding scales of the BASC-2 and CBCL preschool instruments. It was found that

14 of the 15 comparisons were significantly correlated at a moderately strong level (i.e., r

> .50). The Somatization-Somatic Complaints correlation was less than .50, which may

suggest that the scales are measuring somewhat different constructs. The lack of a

moderately strong correlation for the somatization scales was consistent with

Sidebottom’s (2005) findings comparing the first version of the BASC PRS-P with the

current version of the CBCL/1.5-5.

Test manuals typically provide correlations between instruments as evidence of

construct validity. High correlations, however, do not necessarily mean equivalent

results and the results from this research provide a good example of that caution. For

instance, the aggression scales for both instruments had a very strong correlation (.86),

yet were also found to have significantly different mean scores. Additional data analyses

suggested the high correlations mask the stark differences between rating scales. An

interesting and important finding is that 9 of the 15 comparisons were found to have

significantly different mean scores. The differences in mean scores suggest that the

instruments provide different results for many of the similarly-named scales. This is an

important finding for psychologists, in the sense that the BASC-2 and CBCL scales may

not provide the same information for many of the corresponding scales when evaluating

preschool children.

On average, the BASC-2 and CBCL are consistent measures of attention and

hyperactive behaviors in preschool children. Both instruments achieved strong and

significant correlations for the corresponding scales, in addition to having small

44

differences between mean scores and a high classification consistency. These findings

indicate that both the BASC-2 and CBCL are indeed measuring similar constructs for the

attention and hyperactivity scales. This finding is important for psychologists since

ADHD behaviors are a common referral concern. The BASC-2 and CBCL are also

consistent measures of two composite indices: Externalizing behaviors and overall

problem behaviors. If the overall problem behavior score is used for special education

eligibility purposes, this is an important finding. Both measures seem to provide

equivalent overall ratings.

Another interesting finding is that the Internalizing composites of both

instruments achieved a low level of classification consistency, which would suggest a

low level of agreement between ratings. Thus, Merrell et al.’s (2003) contention that

behavioral rating scales are especially beneficial for assessing internalizing types of

problems is not supported by these results. Additionally, the research revealed that the

BASC-2 PRS-P Anxiety and Depression scales and CBCL/1.5-5 Anxious/Depressed scale

did not achieve acceptable and consistent results across most comparisons in this

research. As a result, it is difficult to make conclusions about what constructs the

CBCL/1.5-5 Anxious/Depressed scale is measuring. It may prove to be beneficial if the

CBCL’s Anxious/Depressed scale was separated into two different scales, as in the

BASC-2 PRS-P, for measuring anxious and depressed behaviors. Separation of the

Anxious/Depressed scale into two scales would also be consistent with a

recommendation made by Frick and Kamphaus (2001). Frick and Kamphaus criticized

the Anxious/Depressed scale on the school-age version of the CBCL because they

45

thought the combination of anxious and depressed behaviors into one scale did “not

match current conceptualizations of childhood psychopathology” (p. 194).

Another important finding from the current research is that the CBCL consistently

resulted in higher scores than the BASC-2. More specifically, 12 of the 15 comparisons

resulted in higher mean scores on the CBCL. Re-coding of BASC-2 data indicated that

the discrepancy between ratings was not due to the CBCL’s truncation of scores at 50.

Unfortunately, it is unknown which instrument provides more “accurate” scores. The

CBCL may provide inflated scores or the BASC-2 may underestimate the severity of

behavioral problems. For instance, the BASC-2 Aggression scale had a mean score (58.0)

in the average range of functioning, whereas the CBCL Aggressive Behavior scale had a

mean score (69.1) at a significantly high level. A psychologist may have a very different

interpretation of a child’s aggressive behaviors depending on which instrument was

administered. These differences in scores between the two instruments have important

implications for psychologists. Psychologists must be cautious in their score

interpretation. A high score on the CBCL may not be that significant or an average score

on the BASC-2 may not reveal significant concerns that truly exist in the home

environment. This finding further supports the idea of using a multi-method assessment

approach and to not solely rely on the results of behavior rating scales since they may

under or over-estimate behavioral problems in children (Lidz, 2003).

Strengths and Limitations

A possible limitation of the current research is the overall representativeness of

the obtained sample. The sample only consisted of “referred” children and their parents

or guardians, which may not represent the population as a whole. However, the

46

participants were a sample of “real” referred children and the participants completed the

scales under the conditions of an actual evaluation. Such children and circumstances

replicate exactly how behavior rating scales will be used by practitioners. Having parents

of nonreferred children complete both instruments, as done by Reynolds and Kamphaus

(2004), provides questionable validity evidence. Additionally, the three validity

indicators on the BASC-2 PRS-P protocols indicated a high level of acceptability on the

current results. More specifically, the F Index, which measures an overly negative

response pattern, achieved “acceptable” ratings for 93% of the participants. The

Response Pattern and Consistency indicators were rated “acceptable” for 98% of the

ratings. A large number of parents who completed the behavior rating scales were

mothers; therefore, the fathers may be considered underrepresented. There does not

appear to be research examining fathers’ consistency of ratings. Another possible

limitation may be that it was unknown how well the parents/guardians of the children

actually understood the questions on either behavior rating scale. If a parent/guardian

had difficulty comprehending what an item was asking, this could alter the obtained

results. Parents’ and guardians’ reading and comprehension abilities were not assessed.

Despite the weaknesses, strengths of the study should not be underscored. The

relatively large sample utilized for the current research may be considered a strength of

the study. Additionally, the fact that time, rater, and setting variables were controlled is

another important strength of this research. It is clear that any differences found between

the two instruments could only be attributed to the instruments themselves.

47

Future Research

A possible area for future research would be to further evaluate the reliability and

validity of using the BASC-2 PRS-P and CBCL/1.5-5 for children who have specific

diagnoses, like ADHD. In the current study, both instruments were found to be

comparable when measuring ADHD-related behaviors. Such research may prove to be

beneficial in further evaluating these instruments with preschoolers with ADHD or other

diagnoses. Additionally, it would also be useful to design studies that could evaluate the

appropriateness (e.g., social validity) of using behavior rating scales with such a young

population of children. The most recent revisions of both the BASC-2 and CBCL

extended their age range downward so that younger children could be assessed. Parents’

and professionals’ views on the appropriateness of measuring certain constructs (e.g.,

anxiety, depression, ADHD) on children as young as 18 months of age should be

ascertained. It is unclear what research support is available for measuring such constructs

in children so young.

Additionally, it may prove to be beneficial to conduct a more thorough analysis of

the individual items on the instruments’ scales to evaluate how each instrument defines

similarly-named constructs. This analysis may help clarify what behavioral constructs

are actually being evaluated. Such clarification would be especially beneficial for the

Anxious/Depressed scale on the CBCL/1.5-5. Finally, comparisons of the BASC-2 or

CBCL with other instruments is another area for future research. For instance, the

Clinical Assessment of Behavior (CAB; Bracken & Keith, 2004) measures similar

constructs as the BASC-2 and CBCL in children as young as two years of age. It would

48

be interesting to see how the CAB compares in its measurement of behavioral constructs

in referred preschoolers.

Summary

The current research has provided psychologists with a variety of information that

may be considered beneficial, in which both consistencies and inconsistencies were

identified in each instrument. However, it remains unclear which instrument provides a

more accurate measure of behavioral constructs in preschool-aged children. This

research has identified evidence that suggests that either instrument is an acceptable

measure when evaluating ADHD-related behaviors, externalizing behaviors, or an overall

level of problem behaviors. Additionally, it was found that the CBCL/1.5-5 typically

provides more elevated scores than the BASC-2 PRS-P, which could influence how a

psychologist interprets scores from either instrument when evaluating preschool-aged

children.

References

Achenbach, T. M. (1991). Manual for the Child Behavior Checklist/4-18 and 1991

Profile. Burlington, VT: University of Vermont Department of Psychiatry.

Achenbach, T. M., & Rescorla L. A. (2000). Manual for the ASEBA Preschool Forms &

Profiles. Burlington, VT: University of Vermont, Research Center for Children,

Youth, & Families.

Achenbach, T. M., & Rescorla, L. A. (2001). Manual for the ASEBA School-Age Forms

& Profiles. Burlington, VT: University of Vermont, Research Center for Children,

Youth, & Families.

Bingham, C. R., Loukas, A., Fitzgerald, H. E., & Zucker, R. A. (2003). Parental ratings

of son's behavior problems in high-risk families: Convergent validity, internal

structure, and interparent agreement. Journal of Personality Assessment, 80(3),

237-251.

Bracken, B. A., & Keith, L. K. (2004). Clinical Assessment of Behavior. Lutz, FL:

Psychological Assessment Resources, Inc.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.).

Hillsdale, NJ: Lawrence Erlbaum Associates.

Doyle, A., Ostrander, R., Skare, S., Crosby, R. D., & August, G. J. (1997). Convergent

and criterion-related validity of the Behavior Assessment System for Children –

Parent Rating Scale. Journal of Clinical Psychology, 26(3), 276-284.

Drotar, D., Stein, R. E. K., & Perrin, E. C. (1995). Methodological issues in using the

Child Behavior Checklist and its related instruments in clinical child psychology

research. Journal of Clinical Child Psychology, 24(2), 184-192.

49

50

Elliot, S. N., & Busse, R. T. (1993). Behavior rating scales: Issues of use and

development. School Psychology Review, 22(2), 313-321.

Frick, P. J., & Kamphaus, R. W. (2001). Standardized rating scales in the assessment of

children’s behavior and emotional problems (pp. 190-204). In C. E. Walker & M.

C. Roberts (Eds.), Handbook of clinical child psychology (3rd ed.). New York:

Wiley.

Gresham, F. M., & Elliott, S. N. (1990). Social Skills Rating System. Circle Pines, MN:

AGS Publishing.

Konold, T. R., Walthall, J. C., & Pianta, R. C. (2004). The behavior of child behavior

ratings: Measurement structure of the Child Behavior Checklist across time,

informants, and child gender. Behavioral Disorders, 29(4), 372-383.

Lavigne, J. V., Cicchetti, C., Gibbons, R. D., Binns, H. J., Larsen, L., & DeVito, C.

(2001). Oppositional Defiant Disorder with onset in preschool years: Longitudinal

stability and pathways to other disorders. Journal of the American Academy of

Child and Adolescent Psychiatry, 40, 1393-1400.

Lidz, C. S. (2003). Early childhood assessment. Hoboken, NJ: John Wiley & Sons, Inc.

Merrell, K. W. (2003). Behavioral, social, and emotional assessment of children and

adolescents (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates, Publishers.

Merrell, K. W., Blade, R. L., Lund, J., & Kempf, K. K. G. (2003). Convergent and

discriminant construct validity of the internalizing symptoms scale for children

with the BASC-SRP-C. Psychology in the Schools, 40(2), 139-144.

Merrell, K. W., Caldarella, P., Streeter, A. L., Boelter, E. W., & Gentry, A. (2001).

Convergent validity of the Home and Community Social Behavior Scales:

51

Comparisons with five behavior rating scales. Psychology in the Schools, 38, 313-

325.

Mereydith, S. P. (2001). Temporal stability and convergent validity of the Behavior

Assessment System for Children. Journal of School Psychology, 39(3), 253-265.

Ostrander, R. Weinfurt, K. P., Yarnold, P. R., & August, G. J. (1998). Diagnosing

Attention Deficit Disorders with the Behavioral Assessment System for Children

and the Child Behavior Checklist: Test and construct validity analyses using

optimal discriminant classification trees. Journal of Consulting and Clinical

Psychology, 66(4), 660-672.

Rescorla, L. A. (2005). Assessment of young children using the Achenbach System of

Empirically Based Assessment (ASEBA). Mental Retardation and Developmental

Disabilities Research Reviews, 11, 226-237.

Reynolds, C. R., & Kamphaus, R. W. (1998). Behavior Assessment System for Children.

Circle Pines, MN: AGS Publishing.

Reynolds, C. R., & Kamphaus, R. W. (2004). Behavior Assessment System for Children

(2nd ed.). Circle Pines, MN: AGS Publishing.

Sattler, J. M. (2001). Assessment of children: Cognitive applications (4th ed.). La Mesa,

CA: Jerome M. Sattler, Publisher, Inc.

Schopler, E., Reichler, R. J., & Ro, B. (1988). Childhood Autism Rating Scale. Circle

Pines, MN: AGS Publishing.

Sciutto, M. J., & Terjesen, M. D. (2000, August). A psychometric review of measures of

ADHD in early childhood. Paper presented at the Annual Meeting of the

American Psychological Association, Washington, D.C.

52

Sidebottom, K. (2005). A comparison of the BASC and CBCL with referred preschoolers.

Unpublished specialist thesis, Western Kentucky University, Bowling Green.

Skovgaard, A. M., Houmann, T., Landorph, S. L., & Christiansen, E. (2004).

Assessment and classification of psychopathology in epidemiological research of

children 0-3 years of age: A review of literature. European Child and Adolescent

Psychiatry, 13(6), 337-346.

U.S. Census Bureau. (2008). Kentucky quickfacts from the U.S. Census Bureau. U.S.

Census Bureau. Retrieved April 3, 2008, from

http://quickfacts.census.gov/qfd/states/21000.html

Vaughn, M. L., Riccio, C. A., Hynd, G. W., & Hall, J. (1997). Diagnosing ADHD

(predominantly inattentive and combined type subtypes): Discriminant validity of

the Behavior Assessment System for Children and the Achenbach Parent and

Teacher Rating Scales. Journal of Clinical Child Psychology, 26(4), 349-357.

Wrobel, N. H., & Lachar, D. (1998). Validity of self and parent-report scales in screening

students for behavioral and emotional problems in elementary school. Psychology

in the Schools, 35(1), 17-27.

Appendix

Human Subjects Review Board Approval

53

54

WESTERN KENTUCKY UNIVERSITY Human Subjects Review Board Office of Sponsored Programs 301 Potter Hall In future correspondence please refer to HS07-056, October 6, 2006 Jennifer Bour c/o Dr. Carl Myers Department of Psychology, WKU Dear Jennifer: Your revision to your research project, “Comparing Parent Ratings of Preschoolers on the CBCL and BASC-2,” was reviewed by the HSRB and it has been determined that risks to subjects are: (1) minimized and reasonable; and that (2) research procedures are consistent with a sound research design and do not expose the subjects to unnecessary risk. Reviewers determined that: (1) benefits to subjects are considered along with the importance of the topic and that outcomes are reasonable; (2) selection of subjects is equitable; and (3) the purposes of the research and the research setting is amenable to subjects’ welfare and producing desired outcomes; that indications of coercion or prejudice are absent, and that participation is clearly voluntary. 1. In addition, the IRB found that you need to orient participants as follows: (1) signed informed consent is not required as data is being retrieved from a secondary source; (2) Provision is made for collecting, using and storing data in a manner that protects the safety and privacy of the subjects and the confidentiality of the data. (3) Appropriate safeguards are included to protect the rights and welfare of the subjects. This project is therefore approved at the Expedited Review Level until July 31, 2007. 2. Please note that the institution is not responsible for any actions regarding this protocol before approval. If you expand the project at a later date to use other instruments please re-apply. Copies of your request for human subjects review, your application, and this approval, are maintained in the Office of Sponsored Programs at the above address. Please report any changes to this approved protocol to this office. A Continuing Review protocol will be sent to you in the future to determine the status of the project. Sincerely, Sean Rubino, M.P.A. Compliance Manager Office of Sponsored Programs Western Kentucky University