188
NOMINATION AND IDENTIFICATION OF TRADITIONALLY UNDERREPRESENTED STUDENTS FOR GIFTED PROGRAMS: INSIGHTS FROM A POPULATION DATASET by MATTHEW T. MCBEE (Under the Direction of Thomas P. Hébert) ABSTRACT A set of studies were performed using a large population dataset obtained from the Georgia Department of Education. The studies focused on uncovering the causes of the underrepresentation of Black and Hispanic students as well as students from low socioeconomic status backgrounds in gifted and talented education programs. The first study examined the performance of the referral sources used in Georgia and concluded that automatic and teacher referrals have the best performance. The study also uncovered evidence that the majority of underrepresentation takes place at the nomination stage of the gifted assessment process. The second study quantified the impact of various individual- and school-level variables on the probability that a student will be identified. It found evidence that race and socioeconomic status make large, independent contributions to the probability of identification. The final study used a sample dataset to introduce hierarchical linear modeling and multilevel structural equation modeling. INDEX WORDS: Gifted, talented, Georgia, identification, underrepresentation, assessment, minority, Black, Hispanic, Asian, African-American, socioeconomic, context, nomination, teacher, ability, multilevel, structural equation modeling

NOMINATION AND IDENTIFICATION OF TRADITIONALLY

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

NOMINATION AND IDENTIFICATION OF TRADITIONALLY UNDERREPRESENTED

STUDENTS FOR GIFTED PROGRAMS: INSIGHTS FROM A POPULATION DATASET

by

MATTHEW T. MCBEE

(Under the Direction of Thomas P. Hébert)

ABSTRACT

A set of studies were performed using a large population dataset obtained from the

Georgia Department of Education. The studies focused on uncovering the causes of the

underrepresentation of Black and Hispanic students as well as students from low socioeconomic

status backgrounds in gifted and talented education programs. The first study examined the

performance of the referral sources used in Georgia and concluded that automatic and teacher

referrals have the best performance. The study also uncovered evidence that the majority of

underrepresentation takes place at the nomination stage of the gifted assessment process. The

second study quantified the impact of various individual- and school-level variables on the

probability that a student will be identified. It found evidence that race and socioeconomic status

make large, independent contributions to the probability of identification. The final study used a

sample dataset to introduce hierarchical linear modeling and multilevel structural equation

modeling.

INDEX WORDS: Gifted, talented, Georgia, identification, underrepresentation, assessment,

minority, Black, Hispanic, Asian, African-American, socioeconomic, context, nomination, teacher, ability, multilevel, structural equation modeling

Page 2: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

NOMINATION AND IDENTIFICATION OF TRADITIONALLY UNDERREPRESENTED

STUDENTS FOR GIFTED PROGRAMS: INSIGHTS FROM A POPULATION DATASET

by

MATTHEW T. MCBEE

B.S., Tennessee Technological University, 2002

M.Ed., University of Georgia, 2004

A Dissertation Submitted to the Graduate Faculty of The University of Georgia in Partial

Fulfillment of the Requirements for the Degree

DOCTOR OF PHILOSOPHY

ATHENS, GEORGIA

2006

Page 3: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

© 2006

Matthew T. McBee

All Rights Reserved

Page 4: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

NOMINATION AND IDENTIFICATION OF TRADITIONALLY UNDERREPRESENTED

STUDENTS FOR GIFTED PROGRAMS: INSIGHTS FROM A POPULATION DATASET

by

MATTHEW T. MCBEE

Major Professor: Thomas P. Hébert

Committee: Deborah Bandalos Martha Carr Bonnie Cramond

Electronic Version Approved: Maureen Grasso Dean of the Graduate School The University of Georgia May 2006

Page 5: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

iv

DEDICATION

This work is dedicated to my parents, Dennis and Mary Kay McBee, whose lifetimes of

hard work made it possible for me to pursue my educational and professional goals.

Page 6: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

v

ACKNOWLEDGEMENTS

Thanks to my soon-to-be wife, Kristin Pierce, for the generous and sometimes firm

support she gave me during the process of writing this dissertation. She also helped with some

rather intractable page numbering issues in the word processing software. Thanks to my major

professor, mentor, and lifelong friend, Dr. Thomas Hébert, who not only contributed a great deal

to the quality of this work through his editing but also provided me with encouragement when it

was sorely needed. Both Dr. Deborah Bandalos and Dr. Linda Muthén were extremely helpful

and prompt with their advice when I had questions related to statistics or the use of the MPlus

program. Scott Fowler provided editing support to me during the final stages of the writing

process.

Page 7: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

vi

TABLE OF CONTENTS

Page

ACKNOWLEDGEMENTS........................................................................................................v

LIST OF TABLES................................................................................................................... viii

LIST OF FIGURES....................................................................................................................x

CHAPTER

1 Introduction and Literature Review............................................................................1

2 A Descriptive Analysis of Referral Sources for Gifted Identification Screening by

Race and Socioeconomic Status..........................................................................48

3 Examining the Probability of Identification of Students for Gifted Programs in

Georgia Elementary Schools: A Multilevel Structural Equation Modeling Study.71

4 Multilevel Analysis in Gifted Education................................................................120

5 Summary and Future Directions ............................................................................167

APPENDICES........................................................................................................................157

A MPlus code for regression analysis........................................................................171

B MPlus code for regression accounting for clustering..............................................172

C MPlus code for random intercept hierarchical linear model....................................173

D MPlus code for random slope and intercept hierarchical linear model ....................174

E MPlus code for single-level structural equation model ...........................................175

F MPlus code for SEM accounting for clustering ......................................................176

G MPlus code for ML-SEM with random intercept model.........................................177

Page 8: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

vii

H MPlus code for ML-SEM with random slope and intercept models........................178

Page 9: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

viii

LIST OF TABLES

Page

Table 2.1: Identified elementary students by race and SES........................................................57

Table 2.2: Overall comparison of referral sources.....................................................................58

Table 2.3: Comparison of referral sources by SES.....................................................................59

Table 2.4: Comparison of referral sources by race.....................................................................61

Table 2.5: Comparison of referral sources by race and SES.......................................................62

Table 3.1: Variable descriptions................................................................................................82

Table 3.2: Descriptive statistics.................................................................................................83

Table 3.3: Variable intercorrelations.........................................................................................84

Table 3.4: Model fit information for each analysis step.............................................................92

Table 3.5: Model summary for within-schools component of random slope model ....................99

Table 3.6: Model summary for between-schools component of random slope model ...............103

Table 3.7: Model summary for “Lunch” slope portion of random slope model ........................107

Table 3.8: Model summary for “Asian” slope portion of random slope model .........................109

Table 3.9: Model-implied probabilities of identification..........................................................111

Table 4.1: Variable descriptions for sample dataset .................................................................140

Table 4.2: Regression and HLM model results........................................................................142

Table 4.3: SEM and ML-SEM model results...........................................................................148

Page 10: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

ix

LIST OF FIGURES

Page

Figure 1.1: Relationship between student numerical intelligence and sensitivity to school mean

numerical intelligence per SES-group .......................................................................32

Figure 3.1: Within-schools path model......................................................................................88

Figure 3.2: Between-schools structural model ...........................................................................90

Figure 3.3: Random slope model for both “ lunch” and “race (Asian)” to “probability of being

identified as gifted” ...................................................................................................93

Figure 3.4: Path values and standard errors for within-schools portion of random slope model..98

Figure 3.5: Path values for between-schools (intercept) component of random slope model ....102

Figure 3.6: Path values for slope portion of random slope model 1A (from “lunch” to

“probability of being identified gifted”)...................................................................106

Figure 3.7: Path values for slope portion of random slope model 1B (from “Asian” to

“probability of being identified gifted”)...................................................................108

Figure 4.1: Coefficients in the random intercept model ...........................................................133

Figure 4.2: Coefficients in the random slope model.................................................................136

Figure 4.3: Example structural equation model .......................................................................138

Figure 4.4: Regression model specified in a SEM context .......................................................138

Figure 4.5: Results for single-level SEM .................................................................................147

Figure 4.6: Results for ML-SEM random intercept model .......................................................151

Figure 4.7: Results for ML-SEM random slope and intercept models......................................153

Page 11: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

1

CHAPTER 1

INTRODUCTION AND LITERATURE REVIEW

Page 12: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

2

The numerical underrepresentation of African-American, Hispanic, and Native students

in gifted education programs has been frequently cited in the gifted education literature (Ford,

1998; Reid, Romanoff, Algozzine, & Udall, 2000; Sarouphim, 1999; Scott, Perou, Urbano,

Hogan, & et al., 1992). Though many publications have addressed this problem, relatively few

have attempted to quantify the severity of the issue. Ford (1998) and Brown (1997) both cited

information gathered by the Office of Civil Rights indicating that Black and Hispanic students

were underrepresented by 41% and 42%, respectively. Most publications in gifted education that

address the issue simply begin by stating that underrepresentation has been and continues to be a

problem. Many proposed explanations for the underrepresentation issue have been cited in the

literature and will be examined later in this chapter.

Factors that Reduce Students’ Probability of Identification for Gifted Programs

Disadvantaged Ethnic Group Membership

The most critical issue related to understanding the disparity in gifted program enrollment

across racial groups is: to what extent does this disparity represent actual differences of

developed or potential capability across groups, and to what extent does it indicate the presence

of some serious flaws in the methods by which we screen, identify, and serve gifted students?

Most scholars who have examined the issue have agreed with Frasier’s belief that “There is no

logical reason to expect that the number of minority students in gifted programs would not be

proportional to their representation in the general population” (1997, p. 498). Though some

early psychologists studying human intelligence believed that non-White populations had lower

IQs by virtue of inferior genetics (e.g., “Spearman’s hypothesis” as described by Naglieri and

Jensen, 1987), this belief has been widely and rightly dismissed as a racist and shameful legacy.

For these reasons, most scholars within gifted education have assumed that the practices and

Page 13: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

3

procedures commonly utilized by schools to find and serve students from underrepresented

groups have failed, and that there are a great number of undiscovered gifted children out there.

As Torrance (1977) observed, “there is a great deal of giftedness among the culturally different

and the waste or underuse of those resources is tragic” (p. 3).

Ford (1998) pointed out that there are at least three classes of contributing factors to the

underrepresentation issue: recruitment and identification problems, personnel training problems,

and retention problems. Currently, the first of these has received the most attention in the

literature.

Most gifted programs rely at least to some degree on standardized measures of ability or

achievement during the assessment process. It has been widely noted that minority and

economically disadvantaged youth significantly under perform on these tests relative to their

relatively advantaged peers (e.g., Ford, Harris, Tyson, & Trotman, 2002; Entwisle & Alexander,

1992; Maker, 1996; Naglieri & Jensen, 1987). The performance gap between Black and White

students tends to be about one standard deviation. In Mills and Tissot’s (1995) study of two

identification instruments, they found that mean scores on the School and College Ability Test

(SCAT), the more traditional of the two instruments studies, were 21.03 for Black students and

28.42 for White students on the verbal subscale, with a pooled standard deviation of about 8.4

points. About the same gap was observed for the math subscale scores, where Black students

had a mean of 18.9 compared with 25.54 for White students, with a pooled standard deviation of

about eight points. More shocking still is that the scores of students receiving free or reduced

price lunch were 9.85 for verbal and 11.29 math, which barely exceeded the scores of students in

special education. Again, these results are fairly typical of similar studies.

Page 14: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

4

If the population test score gap between Black and White students is about one standard

deviation, and the most traditional cutoff score of two standard deviations above the mean on a

measure of mental ability is used to determine gifted program placement, it can easily be shown

from examining the normal curve that the cutoff score would allow about 3% of White children

to qualify (at two standard deviations above their group mean) while the same cutoff would

allow only .13% of Black children to qualify, because they would need to score about three

standard deviations above their group mean to meet the same criteria. This statistical

examination is extremely oversimplified, however, it does illustrate just how severely group

mean differences can affect the proportion of members found in the tails. Ford (1998) rightly

pointed out that many school districts continue to use such outdated definitions of giftedness, and

that the common cutoffs on mental tests are arbitrary.

A great deal of literature has examined this test score differential. Many critics have

argued that minority students tend to do poorly on such tests because the tests themselves are

flawed by being biased in favor of students from the dominant culture (Ford et al., 2002). In

other words, standardized tests might unfairly penalize minority students by assigning them

lower scores for the same level of underlying ability or achievement. The exact nature of this

bias is unknown. Ford argued that verbally loaded tests tend to penalize minority students.

There is some support for this in the literature. For example, Mills and Tissot’s (1995)

previously-mentioned study compared the scores of 347 students from a wide variety of ethnic

backgrounds on the School and College Ability Test (SCAT) to their scores on Raven’s

Advanced Progressive Matrices (APM). The performance gap between verbal and math

subscale scores on the SCAT for Black students was about the same as the gap for White

students. Hispanic students performed better relative to the White group on the math items than

Page 15: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

5

on the verbal items. However, because of the brief description of the math items provided by the

author, one cannot assume that they were not also verbally loaded. When the students were

compared on the APM, the magnitude of the score discrepancy was significantly reduced to

about half a standard deviation, as opposed to a full standard deviation on the SCAT. On the

basis of this and other studies (e.g., Shaunessy, Karnes, & Cobb, 2004), many scholars in gifted

education have advocated for using nonverbal instruments such as Raven’s Advanced

Progressive Matrices and the Naglieri Non-Verbal Assessment test for use with minority, low

SES, or ESL students. Other scholars, such as Pyryt (1996), have cautioned against abandoning

traditional IQ tests too quickly, pointing out that detailed statistical analysis has not revealed

evidence that specific items are biased against minority students. He noted that culturally-loaded

or biased items should be missed more frequently by students from backgrounds that would not

allow them access to this cultural knowledge. When examining the test results of students from

various backgrounds, this pattern is not observed. Minority students and White students tend to

do poorly on the same items; only the former incorrectly answers those items more often.

Another contributing factor to the test score differential could be that minority students

face psychological, social, or cultural barriers not experienced by students from the majority

culture. The literature on this subject is limited, and most of it has focused on issues affecting

Black children. Fordham and Ogbu (1986) argued that Black students in particular experience

intense social pressure to underachieve and disengage from school because they understand

education to be the domain of the White middle class. Therefore, to strive for school

achievement is to strive for entrance into this culture and thus betray the African American

culture. Evidence supporting this claim comes primarily from qualitative case studies. Fordham

(1988) went on to categorize students who achieve academically at the price of perceived

Page 16: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

6

betrayal of their cultures to have experienced a Pyrrhric victory. Valenzuela’s (1999)

ethnographic study of a predominantly Hispanic high school in Texas found that a similar

process existed for second-generation Hispanic immigrants. Another promising area of

investigation deals with stereotype threat (Steele, 1997), which is the finding that performance

on cognitive tasks is significantly depressed whenever individuals fear that their poor

performance might reinforce a negative stereotype about a group to which they belong.

Stereotype threat was originally envisioned to apply to Black-White differences in test

performance. Subsequent research has shown that it operates in female math performance

(Schmader, Johns, & Barquissau, 2004), memory performance tasks in older adults (Chasteen,

Bhattacharyya, Horhota, Tam, & Hasher, 2005), and social cue decoding in men (Koenig &

Eagly, 2005). Yopyk and Prentice (2005) found that student athletes performed more poorly on

a math test when primed with their athlete identity than when primed with their student identity.

Though this finding has been widely replicated, it is poorly understood (see a recent

review by Smith, 2004). For example, Marx and Goff (2005) found that experimenter race has a

salient effect in studies of stereotype threat involving race. Black experimenters were unable to

replicate the performance detriment due to stereotype threat that White experimenters created.

Student-Level Socioeconomic Status (SES)

Students from low socioeconomic status backgrounds are another group that has been

widely described as being underrepresented in gifted education programs. Socioeconomic status

itself is a broad concept whose definition is subject to some debate. The literature examining the

effects of SES on educational outcomes can usually be classified into two major categories with

respect to the operationalization of SES. The first set of studies operationalizes SES at the

individual level as a dichotomous variable indicating whether or not a student is eligible for the

Page 17: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

7

federal free lunch program, or at the school level as the percentage of the student body eligible

for such assistance (e.g., Brosnan, 1983; Entwisle & Alexander, 1992; Ryan & French, 1976). In

this case, eligibility for free or reduced-price lunch is a proxy for annual family income. The

second set of studies operationalizes SES at the individual level as a composite variable usually

including family income, parental educational attainment, and occupational status (e.g., Portes &

MacLeod, 1996; Rumberger, 1995). A third category of studies make use of a single indicator

variable that is not related to the free lunch program, such as occupational prestige alone (Quay,

1989). These have generally been based on Hollingshead and Redlich’s (1958) occupational

scale.

Literature on the representation of low-SES students in gifted education is less

voluminous than research on race, probably because the dearth of poor children in gifted

programs in not nearly as visible as the lack of Black and Hispanic children. There is no federal

office similar to the Office of Civil Rights for the poor, and issues of class do not have the same

urgency and history as issues of race in this country. Previous studies of race and gifted program

admittance are seriously flawed because race and SES are very highly related. Studies that have

examined race without controlling for SES either statistically or experimentally are confounded

and thus very difficult to interpret.

Descriptive data are even less accessible on the numbers describing the degree to which

low SES students are underrepresented in gifted programs. There are, however, innumerable

studies examining the impact of SES on school achievement. The impact of SES on school

achievement has been shown to be very powerful in almost every study including it as a

predictor (Steinberg, Blinde, & Chan, 1984). To examine this issue, the following discussion has

been organized around common variables chosen as outcomes in the literature.

Page 18: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

8

School readiness. Mills (1983) examined school readiness as a function of parental

socioeconomic status in a sample of 49 predominantly middle class kindergartners. School

readiness was assessed via the Test of Basic Experiences, General Concepts, Level K. The study

examined the explanatory power of three measures of SES, which included annual family

income, mother’s level of education, and father’s level of education. Results indicated that the

father’s level of education was significantly related to school readiness, explaining 10.2% of the

variance. The other two measures of SES were not significantly related to school readiness.

This is a somewhat surprising finding and may be due to range restriction on the SES variable in

the sample. Nonetheless, the results are clear. Parental SES may have some impact on school

readiness. Other studies, such as Entwisle and Alexander (1992), Garibaldi (1997), and West

(1985) have also confirmed a small but significant reduction in school readiness for low SES

students as compared to other students – a gap that grows larger during each successive year of

schooling.

Cognitive development. Numerous studies have confirmed that low SES students do not

perform as well as other students on tests of mental ability. Ryan and French (1976) performed a

study of the impact of SES on measured intelligence and school achievement in 209 elementary

school students selected from schools serving homogeneously low, middle, and high SES

populations. Their results provided evidence that low SES students lag behind their more

advantaged agemates in both verbal and nonverbal IQ. Specifically, the mean verbal IQ for the

low SES group was 95.6, compared to 104.8 for the middle SES group and 109.6 in the high SES

group, as measured by the Lorge-Thorndike Intelligence Tests. The mean nonverbal IQ for the

low SES group was 94.8, compared to 108.2 and 113.9 for the middle and high SES groups,

Page 19: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

9

respectively. It is important to note that the gap between the low and middle SES groups was

roughly twice the size as the gap between the middle and high groups.

Perhaps more interesting is Quay’s (1989) study of SES in Piagetian task performance.

This study utilized a sample of 144 first, second, and third grade children. These children were

then classified as being from low, middle, or high SES backgrounds based on parental

occupation. The study hypothesized that one cause of the observed difference in Piagetian task

performance between SES groups (Overton, Wagner, & Dolinsky, 1971) is that low SES

students experienced less congruence between their home and school environments (Laosa,

1983) and therefore would have less skill with school-like materials and tasks. Therefore,

stimulus material was included as an independent variable. In this case, the stimulus material

could be either cardboard cutouts (school-like) or food. The three Piagetian tasks examined were

classification, conservation of substance, and conservation of number.

In the classification task, the performance of the third grade low SES group was inferior

to the performance of the first grade high SES group, indicating the presence of a significant

developmental delay in the low SES group. The gap between the low and middle SES groups

was much higher than the gap between the middle and high SES groups. As predicted, low SES

children of all ages performed better on tasks involving food as a stimulus material. Similar

patterns were found for the classification of substance task. A major ceiling effect for the high

SES group was found in the classification of numbers task. The results indicated that experience

with school materials may explain a fraction of the performance differential across SES groups

and reinforced previous findings that low SES children may experience slower cognitive

development.

Page 20: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

10

Domain specific achievement tests. A number of studies of SES have achievement test

scores as an outcome. One obvious advantage of this approach is that it creates a common scale

of measurement across teachers and schools. Among these studies are Entwisle and Alexander

(1992), Mills and Tissot (1995), Portes and MacLeod (1996), Ryan and French (1976), and

Tyler-Wood and Carri (1993). Entwisle and Alexander’s (1992) study examined the scores of

Black and White students on the first, second, and third grade versions of the math section of the

California Achievement Test. They found that by third grade, the mean math achievement score

for the low SES group was about one half of a standard deviation lover than that of the rest of the

sample. Furthermore, even though the racial differences were the focus of the study, the authors

concluded that racial effects were minimal when SES was controlled.

Portes and MacLeod (1996) defined achievement as reading and math scores for an

Stanford Achievement Test in their study of advantaged and disadvantaged ethnic communities

in California and Florida. They found that parental SES had a strong effect on achievement. A

one standard deviation change in parental SES would result in a 10 percentile gain in math and

an 11 percentile gain in reading, controlling for the other factors.

Ryan and French’s (1976) study found that low, middle, and high SES had Iowa Test of

Basic Skills (ITBS) composite raw scores of 28.4, 34.8, and 40.4, respectively. This corresponds

to a .85 standard deviation difference between the low and middle SES groups and a 1.51

standard deviation difference between the low and high SES groups.

Returning to Mills and Tissot’s (1995) study, the performances of the group of students

receiving free or reduced price lunch on the SCAT were three and two standard deviations below

the scores of the White students in verbal and math achievement respectively. As mentioned

previously, the scores of the group receiving free or reduced-price lunch (FRL) barely exceeded

Page 21: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

11

those of students enrolled in special education. The gap was reduced to a little more than half a

standard deviation when comparing scores on Ravens’ APM. This pattern was also noted by

Tyler-Wood and Carri (1993), who compared the scores of low and average SES groups of

students nominated for gifted programs on a variety of measures, including the CogAT, the

OLSAT, the Stanford-Binet 4, the Slosson Intelligence Test – Revised, and the Matrix Analogies

Test. They found that the gap between SES groups was much bigger on verbal tasks, such as

verbal section of the CogAT and the verbal section of the Stanford-Binet as compared to the

Matrix Analogies Test.

The results of these studies and others have established that low SES is consistent with

lowered performance on standardized achievement measures. Unfortunately, the results are not

reported in such a way to allow the calculation of effect sizes across studies, so the size of the

effect cannot be directly computed.

Global measures of academic performance. The use of teacher-assigned grades as

outcomes is less common in the literature on SES and school achievement. This is probably due

to the variation in grading practices, procedures, and standards across teachers and classes that

greatly reduces the reliability of grades as a measure. Ryan and French’s (1976) study examined

GPAs for third, fourth, and fifth grade in addition to achievement test and IQ scores. The

students attending low SES schools had consistently lower GPAs than the students attending

middle or high SES schools. However, the reported standard deviations for the mean GPAs are

quite high in comparison with the mean differences across schools, so it is probable that some of

the comparisons would not have differed significantly if significance testing had been performed.

Page 22: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

12

School Socioeconomic Status

A number of studies have examined the impact of the SES composition of schools on

student achievement (Everson & Millsap, 2004; Griffith, 1996; Kennedy, 1992; Maggi,

Hertzman, Kohen, & D'Angiulli, 2004; Opdenakker & Van Damme, 2001; Raudenbush & Bryk,

1986; Taylor & Harris, 2003, West, 1985). Most of these studies relied on multilevel analysis

schemes. Taylor and Harris’s (2003) study of race segregation, Griffith’s (1996) study of

parental empowerment, Maggi et. al.’s (2004) study of neighborhood SES composition on high

achieving children, and West’s (1985) study of school-level factors in reading and math

achievement relied on ordinary regression analyses conducted at the school level. Though the

studies have been conducted in the United States, Canada, and Belgium, and have used various

operational definitions of SES and achievement, the results have been remarkably consistent

across studies. Most studies have found that students from all backgrounds do better in schools

with high SES student bodies.

Everson and Millsap (2004) fitted a set of multilevel structural equation models with

latent means to a large data set examining the impact of individual and school level predictors of

SAT performance. The results indicated that school SES had very powerful effects on both SAT

math and SAT verbal scores. Not only did SES exert large direct effects on SAT scores, it also

exerted strong indirect effects through school achievement (grades) and extracurricular activities.

A one standard deviation change in SES would be expected to increase SAT math scores by 60

points and SAT verbal scores by 54.6 points directly. The change in SES would cause

extracurricular activities to increase by .88 standard deviations and would also increase school

achievement by .46 standard deviations. These changes in achievement and extracurricular

Page 23: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

13

activities would then contribute an additional 22.4 points to SAT math and an additional 31.6

points to SAT verbal.

Griffith (1996) examined the impact of parental involvement on academic performance.

His sample included 41 elementary schools, and all variables were measured at the school level.

Academic achievement was operationalized as aggregate scores on the criterion-referenced test

(CRT) while parent involvement was operationalized as the score on a 30-item survey of parental

participation in school activities. The data were analyzed via a flat regression model, with

school racial composition and SES entered at step one as covariates. Interestingly, neither race

nor SES had a significant relationship with academic performance. These results are not typical

for studies of this type, and may reflect problems with biased parameter estimates and low power

resulting from the data aggregation.

Kennedy (1992) analyzed the performances of Black and White male third graders on a

shortened form of the Educational Development Series (EDS) tests used in Louisiana.

Achievement was operationalized as a composite of reading, mathematics, and language

sections. Separate hierarchical linear models (HLMs) were fitted to each group of students.

Results indicated that the school SES was the strongest predictor of achievement at the school

level for both the Black and White children. However, the effect was approximately twice as

strong for White students. The authors concluded that all students are affected by the

composition of their classrooms, but White students appeared to be most sensitive to this

composition

A similar study was conducted by Taylor and Harris in 2003. They examined the effects

of relative integration and segregation on Black and White students’ Stanford 9 scores in third,

fifth, and eigth grades via simple bivariate correlations. The percentage of students within

Page 24: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

14

schools receiving free or reduced price lunch served as a proxy for SES. The achievement of

Black students in eighth grade was negatively correlated with the percentage of the enrollment

that is Black (r = -.688) and the percentage of the enrollment receiving free or reduced-price

lunch (r = -.821), while it was positively correlated with the percentage of students that are White

(r = .698). White students’ achievement was not significantly affected by the Black enrollment

or the overall percentage of students receiving free or reduced-price lunch. It was, however,

negatively correlated with the percentage of White students receiving free or reduced-price

lunch. Taken together, this study and the previous study seem only to create confusion regarding

the relative impact of student SES background on the achievement of Black and White students.

However, both of them clearly indicated that SES of the student body was related to school

achievement.

West (1985) examined achievement at the school level, defining achievement as the

number of students within a school at grade level above in reading and math on the New Jersey

Minimum Basic Skills test. The use of the stepwise method for determining the order of entry

for her predictors resulted in SES being entered first in her analysis of math achievement and

second (behind percentage of the school population that is Black) in the reading achievement

analysis. This complicates the interpretation in the reading achievement case due to collinearity

between race and SES. Her results indicated that each percentage point of increase in the low

SES school population resulted in a .35 percent reduction in the number of students within that

school who were on grade level or above for math.

Setefania Maggi and colleagues (2004) studied the effect of organization-level SES on

the proportion of high achieving students within school. This study also used separate flat

regression models for predicting reading and math achievement, measured in fourth grade and

Page 25: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

15

seventh grade. Neighborhood SES was entered in the first step of all four analyses. It was

highly significant in all of them, with R squared values ranging from .358 to .442. Moreover, the

magnitude of the predictor increased from fourth to seventh grade in all of the analyses. The

authors reiterated Hertzman, McLean, Kohen, Dunn, and Evans’ (2002) findings that a teacher in

a classroom serving thirty low SES students can expect to encounter ten with developmental

delays and still more with specific learning disabilities, whereas a teacher serving thirty students

in a high SES neighborhood can expect only three or four students to have similar issues. Maggi

et. al. proposed that it is this uneven distribution of learning difficulties across SES groups that

might explain their findings. As they stated,

“The learning experiences of the highly competent children may be compromised by the

less stimulating academic climate created by a high proportion of children who face

learning difficulties and by the lack of attention from a teacher who is focused on

children who require additional support” (p. 110).

Opdenakker and Van Damme’s (2001) study is perhaps the most interesting,

comprehensive, and methodologically sound of all the research reported here. The study was

conducted in Belgium, so its generalization to American schooling may not be warranted. They

studied the impact of school SES composition and math ability on individual students’ math

achievement using a series of three level hierarchical linear models. Their study is situated in the

context of school effectiveness research, which has generally concluded that schools do have the

power to affect the learning of their students. The authors pointed out that student composition,

particularly with respect to ability, has been ignored in the previous work on school effectiveness

and may considerably “muddy the waters.” The simple bivariate correlation between the father’s

educational attainment and math achievement was r = .66. In their model of school achievement

Page 26: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

16

with student background characteristics entered, father’s educational attainment (the SES proxy)

was highly significant, though slightly less powerful than student’s numerical intelligence. The

standardized beta value of .338 indicated that math achievement could be expected to go up by

about a third of a standard deviation for each standard deviation increase in father’s education,

even after controlling for the student’s mathematical ability. Further results from this study will

be discussed later in this chapter.

Portes and MacLeod’s (1996) study of the factors affecting the academic performances of

students in four immigrant communities provided evidence reinforcing the findings of other

studies regarding individual and aggregate SES. They extended the efforts of previous work by

also testing for the effect of a cross-level SES interaction. Portes and MacLeod used average

school SES as a predictor of the slope coefficient relating individual SES to mathematics

achievement. This predictor was positive and significant such that the slope relating individual

SES to math achievement was steeper in high SES schools than the same relationship in a low

SES school. Students from high SES backgrounds had higher math achievement when they were

situated within high SES schools. Students from low SES backgrounds were doubly penalized

by attending high SES schools and performed better when they attended low SES schools.

Based on the findings from the reviewed studies, there is extensive evidence supporting

the strong role of the SES in school achievement, at the individual and aggregate levels as well

as an interaction between these across levels. These results provide a strong rationale for

considering these three effects of SES in future examinations of academic performance.

Proposed Mechanisms

Reminded of Rumberger’s (1995) call to focus attention on the processes by which SES

affects educational outcomes over the traditional examination of SES as a structural variable, we

Page 27: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

17

turn now to an examination of proposed mechanisms through which race and poverty may

hamper achievement.

Home and School Environments

Peer influences. Due to such factors as “White flight” and other types of racial and class

stratification, families tend to live in areas populated by other families with similar

characteristics. We have already examined a number of studies showing that poverty has a

strong association with poor educational outcomes, both at the individual and aggregate levels.

Furthermore, schools tend to be relatively homogeneous with respect to their socioeconomic

makeup due to the aforementioned stratification. Students surrounded by peers who are

performing poorly in school are more likely to accept this condition as normative and perform

poorly themselves (Bennett, 1995). A possible explanation for this might lie in an extension of

social comparison theory (Festinger, 1954), which argues that when objective information is

lacking or untrustworthy, people judge their performances and abilities against those of their

peers. Marsh and Parker (1984) referred to this as the frame of reference model. This process

could cause children who outperform many of their classmates to conclude that they are doing

“well enough”, when in fact their performance does not compare favorably with students in more

advantaged schools.

Motivation and academic disidentification as cultural phenomena. As discussed in the

introduction, race and SES are confounded variables. Therefore, a great deal of the research

examining the impact of SES on achievement does so from the perspective of examining the

performances of White students versus those of Black or Hispanic students. Fordham and Ogbu

(1986) proposed that Black students might resist achieving high levels of school success due to

negative social sanctioning from their peers—by being accused of “acting White.” Fordham

Page 28: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

18

(1988) argued that Black students must adopt a raceless identity in order to be successful in

schools. She went on to characterize this as a meaningless victory where the rewards may not be

worth the sacrifice. Most Black students are unwilling to sacrifice their racial and cultural

identities in order to achieve in schools, and more importantly, they should not have to. Many

Black students may perceive school achievement and being Black as incompatible identity

structures. A similar scenario may occur among Hispanic students. Valenzuela’s (1999)

ethnography of a primarily Hispanic high school provided evidence that first-generation

Hispanic immigrants exhibited higher levels of school achievement than Hispanic students born

in the United States, in spite of the increased language difficulties that often accompany recent

immigration. She argued that those students born in the United States had assimilated beliefs

that Hispanic students cannot do well in school without denying their home culture.

Summer setback. Entwisle and Alexander (1992) examined math achievement in a

sample of elementary school students. As described earlier in this review, they found that high

and low SES students entered first grade with only a small achievement differential. This gap

grew to about half a standard deviation by the time the students reached third grade. To better

understand the nature of this development, data were examined from multiple testing dates. The

findings were quite striking. Both groups of students made progress during the school year. In

fact, low SES children made larger gains than high SES children during the school year.

However, during the summers, low SES children lost ground while high SES children continued

to gain. The cumulative effect of these “summer setbacks” was responsible for the growing gap

between the SES groups. The authors concluded that schools may be doing a better job with

educating poor children than they are typically credited for, and that the culprit may lie in the

home environments of economically disadvantaged children.

Page 29: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

19

Maternal communication. Low and high SES families tend to utilize different child

rearing strategies. Middle- and upper-class mothers spend more time talking to their children,

help children understand the causes and consequences of events, and provide more active

scaffolding for their children’s attempts at problem-solving. Research examining the

communication between mothers and children in problem solving situations has shown that low

SES mothers tend to use brief and highly directive statements when assisting their children in

problem solving (Hess & McDevitt, 1984). High SES mothers, on the other hand, tend to take a

less directive approach, issuing leading questions that help children solve the problems on their

own. This style of teaching rather than telling has been shown to result in higher achievement

test scores.

Lack of resources. Low-SES mothers may lack access to quality prenatal health care and

nutrition. The risk of premature delivery is higher for poor mothers, which increases the risk for

the child to have cognitive deficits or learning disabilities. Low-SES mothers are more likely to

use drugs (both legal and illegal) during pregnancy (McLoyd, 1998). Poor families also tend to

experience higher levels of emotional, physical, and financial stress, leading to more conflicts

between parents and children (Duncan & Brooks-Gunn, 2000). The educational attainment of

parents in poor homes is obviously much lower than in middle- and upper-class homes, the

homes contain fewer books and other educational materials, and less disposable income is

available for learning aids like computers, library trips, and museum visitation.

Psychological Issues

Self-concept. Though it is reasonable to assume that low SES children would have low

global self-concepts, two classic studies have suggested that low SES children may actually have

higher global self-concepts (Soares & Soares, 1969; Trowbridge, 1972). Soares and Soares

Page 30: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

20

(1969) compared the self-perceptions of 229 low SES and 285 high SES students in fourth and

eighth grade. The low SES students reported more positive self-perceptions than the high SES

students. Trowbridge’s (1972) study compared the scores of 3,798 children from low, middle,

and high SES backgrounds on the Coopersmith self-esteem inventory and found that low SES

children outscored middle SES children on general self-concept, school-academic self concept,

and social self concept. However, Wylie (1979) pointed out that these classic studies confused

racial group membership with SES (a common flaw in SES research), and that the research up to

that point had failed to yield replicable findings. More recent research has focused on the

academic self-concept rather than global self-concept. Marsh (1984) employed a path analytic

methodology and found that general and academic self-concept were only modestly correlated (r

= .20), and that academic ability had a positive effect on academic self-concept, while the

average ability of one’s classmates had a negative effect on academic self-concept. The author

concluded that his results support the frame-of-reference theory, which argues that individuals

judge their abilities relative to those of their peers instead of against some absolute criteria.

However, high SES students had higher academic self-concepts than low SES students

regardless of the academic context. Marsh, Relich, & Smith (1983) and Shavelson and Bolus

(1982) found evidence that academic self-concept was moderately correlated with measured

academic achievement.

Relative impact of race and SES. One of the questions that has barely been addressed in

the gifted education literature is the relative importance of race and SES in gifted program

identification. Portes and MacLeod (1996) mentioned that race dropped out of their models

when SES was controlled. To address this question, McBee (in press) performed a study using

publicly available school-level data for almost 15,000 schools in Georgia, Louisiana, Texas,

Page 31: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

21

Arkansas, and Florida. The study began by examining the bivariate correlations between the

proportion of students within schools that were White, Black, and Hispanic with the proportion

receiving free or reduced-price lunch and the proportion identified gifted. In agreement with

previous research, negative correlations (approximate r = -.23) were found between the number

of Black students within schools and the number of gifted students across the three states with

sizable Black populations. More sizable negative correlations were found between the number

of students receiving free or reduced-price lunch and the number of gifted students within

schools (approximate r = -.30). However, when the relationship between the percentages of

Black or Hispanic students and the number of gifted students was examined with the proportion

of students receiving free or reduced-price lunch (FRL) was partialed out, all the correlations

dropped to nonsignificance. This is especially impressive given the huge sample size and ample

power of the study. When the opposite relationship was examined, (i.e., the relationship between

the proportion of students receiving FRL and the number of gifted students with the racial

composition of the schools controlled) the correlations remained significant. This analysis

provided evidence that race is not the underlying cause for underrepresentation. However, these

results measured at the group level cannot be assumed to apply to the individual level due to the

ecological fallacy (Robinson, 1950).

Realistic Expectations for Gifted Program Representation

Perhaps the largest question confronting scholars who study the underrepresentation issue

is the following: Just what is meant by proper representation? Most scholars who have examined

this issue have insisted, overtly or explicitly, that gifted programs will not be just until their

enrollments mirror those of the larger society. Whether or not this is a reasonable belief has

scarcely been addressed in the literature, though it is a topic that needs attention. Indeed, many

Page 32: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

22

scholars who insist that only equal representation is fair are self-contradictory to some extent.

These scholars have rejected previous notions regarding the genetic heritability of intelligence,

which would cast the underrepresentation issue as a reflection of the inferior genetic

endowments of certain racial groups, and have taken the position that giftedness is a

developmental quality – and rightly so. However, to argue that only equal representation is fair

from this position is to ignore voluminous amounts of research describing the impact of

environmental stimulation on development, such as Quay’s (1989) study of the role of SES in

Piagetian conservation tasks, McLoyd’s (1998) work on the effects of economic depravation, and

Entwisle and Alexander’s (1992) study of the “summer setback” phenomenon reflecting the

effects of non-stimulating home environments. It is not racist or classist to believe that though

children from all racial, cultural, and socioeconomic backgrounds have the potential to be gifted,

children from these groups grow up in very different environments which go on to drive or

hamper the development and expression of this trait we call giftedness.

Proposed Methods to Identify Traditionally Underrepresented Students

In recent years, a number of scholars and researchers in gifted education have attempted

to find methods of identifying gifted students that would lessen or eliminate the

underrepresentation of students from minority or low-SES backgrounds. These methods have

included the creation of more culturally inclusive descriptions of the traits of giftedness (Frasier

& Passow, 1994), non-verbal psychometric assessment of ability (Naglieri & Ford, 2003; Tyler-

Wood & Carri, 1993), dynamic assessment (Van-Tassel Baska et al., 2002), creativity testing

(Torrance, 1977), and alternative assessments based on Gardner’s theory of multiple

intelligences (Sarouphim, 1999).

Page 33: NOMINATION AND IDENTIFICATION OF TRADITIONALLY

23

Traits, Attitudes, and Behaviors. In most school districts, students must be nominated for

screening before they are officially evaluated for gifted program qualification. The most

common source of these nominations is the classroom teacher (Gagné, 1994, Siegle, 2001), and

most teachers in the United States are from White, middle-class backgrounds. A common

concern expressed in the literature is that teachers may not recognize the signs of giftedness

when they are expressed by a student from a culture with which the teacher is not familiar.

Frasier and Passow (1994) addressed this concern by compiling a list of ten traits, attitudes, and

behaviors (TABs) that are universal indicators of giftedness, which could be used by teachers to

improve the quality of their nominations. The authors note that the expression of these TABs

will vary across environmental and cultural backgrounds, and that gifted students rarely express

all of these to the same degree. The traits are 1) motivation, which may be manifested as unusual

persistence, 2) intense interests that are advanced and consuming, 3) advanced communication

skills that may manifest themselves verbally, physically, artistically, or symbolically, 4) high

problem-solving ability which may be reflected in the spontaneous creation of effective and

creative strategies, 5) extensive memory indicated by a large knowledge base and rapid

acquisition of new information, 6) persistent inquiry and curiosity, 7) insight into deeper

meanings, evidenced by the ability to integrate knowledge across disciplines, 8) reasoning, the

ability to think logically and critically, 9) creativity, evidenced by the production of many new

and original ideas, and 10) a keen sense of humor that may be expressed gently or aggressively.

Of these ten characteristics, problem-solving ability, memory, and reasoning are qualities that are

assessed on modern tests of cognitive ability such as the WAIS-III and the Stanford-Binet 5.

Intense interests, insight, humor, motivation, and communication skills are qualities that, if

noticed by knowledgeable adults, are theoretically likely to result in a nomination for further


screening. If one assumes that these tests are relatively unbiased against poor students, one must

conclude that an impoverished background depresses at least those three abilities because, as

discussed previously in this review, students from low-SES backgrounds consistently perform

more poorly on these tests than their more advantaged peers.

Nonverbal tasks. Tyler-Wood and Carri (1993) compared the scores of low and average

SES groups of students nominated for gifted programs on a variety of measures, including the

CogAT, the OLSAT, the Stanford-Binet 4, the Slosson Intelligence Test – Revised, and the

Matrix Analogies Test. They found that the gap between SES groups was much larger on verbal

tasks, such as the verbal section of the CogAT and the verbal section of the Stanford-Binet, than on nonverbal tasks. Verbal

tasks are usually considered to reflect crystallized abilities, which are largely determined by an

individual’s prior knowledge and experiences (VanTassel-Baska, Johnson, & Avery, 2002). This

evidence led Tyler-Wood and Carri to call verbal tasks the low-SES gifted student’s “albatross,”

and caused scholars in the field to seek out identification instruments that are primarily non-

verbal. The Naglieri Nonverbal Ability Test (NNAT) was designed for this purpose, and some

evidence suggests that it identifies similar proportions of White, Black, and Hispanic students at

the upper end of the performance range (Naglieri & Ford, 2003). However, Lohman (2005)

criticized Naglieri and Ford’s work, arguing that their results were based on a highly

nonrepresentative sample of Black and Hispanic children who were from relatively affluent

backgrounds.

Dynamic assessment. Dynamic assessment is a new form of measuring learning ability

that is less dependent on prior knowledge and experiences than traditional forms of testing

(Babaeva, 1999; Bolig & Day, 1993; VanTassel-Baska et al., 2002). These tests are designed to

measure learning speed through a test-train-test format. Examiners present examinees with a


pretest, give examinees targeted instruction based on their performance on the pretest, then

administer a posttest (Kirschenbaum, 1998). Previous research on dynamic assessment has

demonstrated that traditionally measured intelligence is strongly correlated with learning speed

(Ferretti & Butterfield, 1992), and that dynamic assessments more accurately predicted later

school success than the WISC-R for ESOL students (Luther & Wyatt, 1989). Lidz and Macrine

(2001) found that dynamic assessments were effective at identifying culturally diverse learners.

Dynamic assessment appears to be a promising method for identifying students from

underrepresented groups and is worthy of future study.
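
The test-train-test logic lends itself to a simple illustration; the normalized gain index below is only one way to express "learning speed" and is not the scoring rule of any published dynamic assessment instrument.

```python
def normalized_gain(pretest, posttest, max_score):
    """Share of the available room for improvement realized after the
    targeted instruction phase; larger values suggest faster learning."""
    if pretest >= max_score:
        return 0.0  # no room to improve, so the index is uninformative
    return (posttest - pretest) / (max_score - pretest)

# Hypothetical examinee: 10/20 on the pretest, 16/20 on the posttest.
print(normalized_gain(pretest=10, posttest=16, max_score=20))  # 0.6
```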

Creativity. Creativity may be one reliably measurable ability that is not depressed in low-

SES populations. Cicerelli (1966) found that the only significant difference between the

creativity scores of low SES and high SES students was that the low SES students had higher

scores in nonverbal elaboration. Similarly, Rogers (1968) found that low SES children were

better in figural fluency. He also noted that the low SES children were more spontaneous and

less conforming, traits which are often thought to be indicative of a creative style. Kaltsounis

(1974) found that low SES Black students outperformed middle class White students on fluency

and originality on the 1966 version of the TTCT figural. However, these findings are not

universal. Forman (1979) found that high-SES children scored higher on a measure of creativity

than their low-SES counterparts. However, when ability and achievement test differences were

partialed out, these differences became non-significant. Haley (1984) compared samples of

middle- and working-class Black children and found that the middle-class students were more

creative in verbal fluency while the working-class students were more creative in kinetic fluency.

Torrance (1977) compiled a list of creative strengths that he believed to be exhibited by

Black students at advanced levels based on his previous research. These included such skills as


the ability to express emotions freely, the ability to improvise with materials, expressive and

dramatic speech, responsiveness to the concrete, enjoyment of movement, rich imagery in

language, originality, social facility in small groups, and problem-centeredness. Torrance argued

that searching for evidence of the creative strengths in Black students would help educators

discover and identify them as gifted.

Assessments based on MI theory. A number of researchers are currently investigating

assessments based on Gardner’s (1993) theory of multiple intelligences. Perhaps the most

thoroughly developed and researched assessment scheme of this type is the DISCOVER project

(Maker, Nielson, & Rogers, 1994; Sarouphim, 2002). DISCOVER, which stands for

Discovering Individual Strengths and Capabilities through Observation while Allowing for

Varied Ethnic Responses, is grounded in Gardner’s (1993) theory of multiple intelligences and is

based on notions of “intelligence-fair” testing and authentic assessment as alternatives to

traditional pen-and-paper tests. The DISCOVER assessment asks students to participate in five

tasks designed to require linguistic, logical-mathematical, spatial, and interpersonal

intelligences. A panel of judges observes the students in these activities and rates the

performance in each domain on a four-point scale. In general, students receiving the maximum

score in two or more categories are considered to be gifted.
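
The decision rule can be expressed in a few lines; the sketch below is a minimal illustration assuming each domain has been reduced to a single 1-4 rating, and the function name and example ratings are hypothetical rather than part of the published DISCOVER materials.

```python
# Sketch of the DISCOVER decision rule described above: a student is
# considered gifted when he or she earns the maximum rating in at least
# two of the assessed domains.

def discover_gifted(ratings, max_rating=4, required=2):
    """Return True if at least `required` ratings equal the maximum."""
    return sum(1 for r in ratings if r == max_rating) >= required

# Hypothetical ratings across five activities.
print(discover_gifted([2, 3, 4, 4, 1]))  # True: two maximum ratings
print(discover_gifted([3, 3, 4, 2, 2]))  # False: only one maximum rating
```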

Sarouphim (2002) studied the DISCOVER protocol as applied to a sample of Hispanic

and Native American high school students, most of whom were from low-SES backgrounds.

The average correlation between scores on the five activities was quite small (r = .278). She

concluded that 29.3% of the sample would be classified as gifted according to this approach.

This finding is consistent with previous research on DISCOVER, which identified 22.9% of

students in the sample as gifted (Sarouphim, 2001). Though this and other non-traditional


approaches are clearly able to identify low-SES and minority students for inclusion in gifted

programs, they are not without problems. First, the identification of almost 30% of students in

the sample as gifted is extremely liberal. It is possible that traditional instruments would yield

similar results with lowered cutoffs. Second, the low correlations between tasks indicate that

these students would require highly individualized programming because they could not be

assumed to have common strengths or skills. Third, the study begs the question of differential

identification across SES levels because it did not investigate the DISCOVER model with a

high-SES sample. Based on identification patterns from more traditional instruments, it is seems

likely that most or all students from advantaged backgrounds might be identified as gifted via

this approach.

School Correlates that May Increase the Probability of Identification

Counseling availability. Grantham and Ford (2003) laid out a theoretical rationale

describing how the availability of counseling services within schools could be related to the

identification and retention of minority students in gifted programs. They argued that

multicultural counseling specifically addressing issues of race could help support minority

students through the interpersonal challenges that can discourage their participation. One of

these issues is outright racism as described in Harmon’s (2002) case study of a group of inner-

city gifted youth who were bussed to a predominantly White school for gifted services. The

students reported numerous instances of racist insults and remarks from other students in the

school as well as inferior treatment from teachers. At the conclusion of the article, none of the

students were willing to participate in more gifted education if it meant being bussed to another

school. This outcome is quite disturbing for advocates of gifted education.


A major issue that multicultural counseling may address is Fordham and Ogbu’s (1986)

contention that when intelligent African American students enroll in advanced academic

programs and put obvious effort into academic achievement, they may receive negative social

sanctioning from their peers, who may accuse them of abandoning their own culture and “acting

White.” Students are then placed in an uncomfortable situation in which they must choose social

acceptance or academic effort. Students choosing to achieve may be viewed as cultural traitors

(Fordham, 1988). Similar conflicts have been reported for students of the majority culture, such

as Gross’s (1989) description of the “forced choice dilemma” that may force gifted students to

choose between actualizing their abilities or experiencing intimacy and solidarity with their

peers. However, “acting White” appears to carry with it an accusation of deeper and more

profound betrayal and thus is probably more potent. Cordeiro’s (1991) ethnography of at-risk

Hispanic students attending inner city high schools discussed the importance of positive role

models and significant others that assisted in the development and maintenance of an achieving

identity. This finding was repeated in Hébert and Reis’s (1999) ethnography of high achieving

students in an inner-city high school.

Grantham and Ford (2003) proposed a series of ways that counseling personnel can

support gifted Black students. They advocated for the creation of mentoring programs for Black

students in which successful mentors can serve as discussion partners on such issues as social

injustice, motivation, and persistence. Counselors can give talks on anger management which

may be especially useful for students in the stages of racial identity development that are

accompanied by a great deal of anger. Counselors could provide conflict resolution training to

students to help them cope with the social negotiation aspect of being Black and gifted. Another


significant role for counselors would be to provide support via prescribed fictional readings, a

strategy discussed extensively by Ford and Harris (1999).

Given such difficulties, it is perhaps easy to see how counseling might support students in

negotiating a middle way between such extremes. Indeed, there is some indirect evidence to

support this link. Wilson (1986) reviewed and synthesized the literature on the effects of

counseling interventions on underachievers. For the purpose of her review, only studies

conceptualizing underachievement as a disparity between earned grades and achievement test

scores were considered. The review concluded that counseling was sometimes effective in

bringing about long-term change in behavior. Counseling interventions that were most effective

were long term (defined as lasting six months to one year), group-based, and directive in nature.

The case for considering the availability of counseling services within schools in any

model of gifted identification is tentative. Existing literature provides a rationale for how this

availability might be advantageous to underrepresented populations, but empirical studies in this

area are needed.

Orderly learning environment. Few research studies have examined the association

between school characteristics and achievement while controlling for the composition variables

that frequently confound these studies. One example of such a study is Opdenakker and Van

Damme’s (2001) study of mathematics achievement, which was previously introduced. These

researchers examined the effect of school composition and school process variables on

mathematics achievement. One such process variable included in their study was what they

called “orderliness,” a composite of classroom management effectiveness and time spent on

learning. The orderliness variable was moderately correlated with math achievement (r = .48).

In the multilevel analysis, orderliness itself was not entered due to high multicollinearity with


composition variables. However, an interaction term of orderliness and student numerical ability

was created and entered into the model. This interaction was found to have a significant positive

effect on the regression slope of numerical ability on math achievement. All the slopes were

positive, but the slopes became more positive in schools with high orderliness. This indicated

that students were able to actualize their math abilities more effectively in schools with high

orderliness.
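
In multilevel notation, this finding corresponds to a cross-level interaction in which a school-level variable moderates an individual-level slope. A schematic two-level model of this kind (the notation follows Raudenbush & Bryk, 2002; the predictor names are only placeholders for the variables Opdenakker and Van Damme analyzed) is

Level 1: $Y_{ij} = \beta_{0j} + \beta_{1j}\,\mathrm{Ability}_{ij} + r_{ij}$

Level 2: $\beta_{0j} = \gamma_{00} + u_{0j}$, $\qquad \beta_{1j} = \gamma_{10} + \gamma_{11}\,\mathrm{Orderliness}_{j} + u_{1j}$,

so that a positive $\gamma_{11}$ corresponds to the reported result: the ability-achievement slope is steeper in schools with higher orderliness.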

Instructional quality. West (1985) examined the effects of composition and process

variables on the educational effectiveness of urban schools. She analyzed school level data using

a standard regression model, entering the compositional variables (SES and race variables) first

to serve as covariates, then adding in school process variables. Separate analyses were

conducted for reading and math achievement. Teaching experience, a school level variable

operationalized as a composite of years of teaching experience per teacher, percentage of the

faculty with tenure, and amount of professional preparation, was entered into the model as a

predictor. The results indicated that teacher experience was significantly (p = .012) and

positively associated with reading achievement, even after controlling for student background

characteristics and other school process variables. Considering that the study had very low

power due to its small sample size of only 26 schools, the effect of teacher experience would

have to be quite large in order to reach statistical significance.

Miller, Ellsworth, and Howell (1986) studied schools that deviate from the traditional

SES – achievement relationship. Unfortunately, this study used a very unusual methodology

rendering the results rather untrustworthy. The authors began with a set of 73 Kansas elementary

schools. They sorted these into two rank ordered lists: one for the percentage of students within

schools receiving free or reduced price lunch and another for low school achievement, defined as


the proportion of students within the school below grade level on the comprehension subtest of

the Iowa Test of Basic Skills. They then subtracted each school’s SES rank from its

achievement rank. A positive difference score indicated that the school’s achievement was better

than expected given its SES composition, while a negative difference indicated the opposite.

The 22 schools with the most positive difference score were compared against the 22 schools

with the most negative difference score on a teacher knowledge of reading instruction scale,

years of teaching experience, 27 separate variables (one for each item of a scale measuring teacher

attitudes toward reading instruction), and another 38 variables. The data were analyzed via a

series of 67 unadjusted t-tests. The results indicated that the two groups of schools did not differ

significantly on either teacher knowledge of reading or teaching experience.
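
The mechanics of this rank-difference procedure (though not its merits) can be made concrete; the sketch below uses three invented schools and one consistent reading of the ranking directions described above.

```python
# One consistent reading of the rank-difference procedure described above.
# Rank 1 = highest poverty (for SES) and rank 1 = lowest achievement, so
# subtracting the SES rank from the achievement rank yields positive scores
# for schools achieving better than their poverty level would predict.

pct_frl   = {"A": 0.80, "B": 0.20, "C": 0.55}   # percent free/reduced lunch
pct_below = {"A": 0.30, "B": 0.40, "C": 0.25}   # percent below grade level

def ranks(values, reverse):
    ordered = sorted(values, key=values.get, reverse=reverse)
    return {school: i + 1 for i, school in enumerate(ordered)}

ses_rank = ranks(pct_frl, reverse=True)    # rank 1 = most poverty
ach_rank = ranks(pct_below, reverse=True)  # rank 1 = lowest achievement

difference = {s: ach_rank[s] - ses_rank[s] for s in pct_frl}
print(difference)  # {'A': 1, 'B': -2, 'C': 1}; positive = overperforming
```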

Student body characteristics. Opdenakker and Van Damme’s (2001) three-level HLM

study of math achievement reached some interesting conclusions regarding the aggregate effects

of student ability on achievement. The first is that school level math ability was a significant,

powerful, and positive contributor toward individual achievement. Furthermore, a cross-level

interaction was detected between individual ability and school mean ability, such that students within schools

with higher average numerical ability gained more achievement from each unit increase in their

ability. The authors stated that “We found that all students benefit from belonging to a school

with a high ability composition, but the more able students benefited the most” (p. 423). Perhaps

these results can be understood in light of Kulik and Kulik’s (1992) meta-analysis of the

effectiveness of various grouping options for gifted students, which found in part that grouping

was effective when it allowed students to receive advanced instruction. Opdenakker and Van

Damme (2001) proposed some mechanisms through which school ability could affect individual

achievement. They argued that bright and motivated students may exert peer pressure on other


students to achieve, that the curriculum may be taught more demandingly and with higher

standards to groups with higher academic readiness, that teachers likely expect more from

classes that they consider to be highly able, and that bright students may benefit from enhanced

academic self-concept. Stated more informally, if teachers “teach to the average,” the level of

teaching goes up for everyone when the average student ability is increased.

The findings became even more interesting when Opdenakker and Van Damme (2001)

performed a similar analysis, this time replacing student ability with a student ability by

student SES interaction, which was in turn interacted with school mean ability as before (see Figure 1.1).

[Figure 1.1: Relationship between student numerical intelligence and sensitivity to school mean numerical intelligence per SES group. The plot shows sensitivity to school ability (y-axis) against numerical intelligence (x-axis) separately for low-, medium-, and high-SES groups; reproduced from Opdenakker & Van Damme, 2001, p. 432.]

The results indicated that high ability students across SES strata are most sensitive to the ability

composition of their schools, and that this effect is strongest for high ability students from low


SES backgrounds. Indeed, these bright and economically disadvantaged students were about

twice as sensitive to school composition as students of similar ability from more advantaged

backgrounds. These results support and extend previous findings that students from poor

backgrounds are the most sensitive to school effects (e.g., Entwisle & Alexander, 1992).

Critique of the Literature

The studies summarized in this report suffer from some common problems that weaken

their value and ultimately, their trustworthiness to the research community. Understanding these

issues is important so that future studies in the area may address or avoid them. In some cases,

researchers simply made common mistakes in their analysis, such as not accounting for

confounding variables, relying upon univariate statistics when multivariate statistics were more

appropriate, categorizing continuous variables, and poor reporting practices. In other studies, the

most appropriate statistical methods or computer programs were not available at the time of

publication.

Ignoring clustered data. Most research in education is guilty of ignoring the clustered

nature of the data. Studies in education most often examine students who are clustered within

classrooms which are clustered within schools. Ordinary statistical procedures based on the

general linear model, such as analysis of variance and regression, assume that the units of

analysis are independent – that they cannot affect each other. When students are clustered within

classrooms, the students within the class are more similar to other students in the same class than

to students in other classes. Clustered data often violates the assumption of independence and

thus results in biased parameter estimates (Raudenbush & Bryk, 2002). Aggregation bias is

another common problem resulting from ignoring the clustered aspect of data. Aggregation bias

results when a predictor exerts different types of influence at different levels. An example


discussed earlier was that of SES, which exerts certain effects at the individual level. However,

organizational units such as classrooms are composed of individuals with their own SES.

Therefore, organizational units have SES properties of their own that may exert influence of a

different magnitude or direction than the effects of SES at the individual level. Flat regression

models ignore the higher-level effects of predictors (Raudenbush & Bryk, 2002). Statistical

techniques, such as hierarchical linear modeling and multilevel structural equation modeling, that

are capable of dealing with this type of clustered data are relatively new in education.
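
The degree of clustering is conventionally summarized by the intraclass correlation from a two-level unconditional model (Raudenbush & Bryk, 2002); the formulation below is standard and is included here only for reference:

$Y_{ij} = \gamma_{00} + u_{0j} + r_{ij}$, with $u_{0j} \sim N(0, \tau_{00})$ and $r_{ij} \sim N(0, \sigma^{2})$,

$\rho = \tau_{00} / (\tau_{00} + \sigma^{2})$,

where $\rho$ is the proportion of outcome variance lying between schools. Values of $\rho$ appreciably greater than zero indicate that treating students as independent observations is untenable.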

Confounding issues. When two predictors are highly correlated, research examining the

impact of one of these predictors must make an effort to control for the other. Otherwise, the

individual variable in question cannot be isolated. An example that affects many of the studies

examined herein is the strong relationship that exists between SES and race. Ethnic and cultural

groups in the United States continue to lack equal access to sources of economic power, so

substantial differences in average SES persist across racial lines. Studies that include one of

these factors without controlling for the other are very difficult to interpret.

Categorizing continuous variables. It is very common for researchers in education and

other areas of psychology to inappropriately categorize continuous variables using median splits

or similar procedures. This is most frequently performed in order to force data to fit into an

ANOVA framework when a regression framework would be more appropriate. A great deal of

the research examined in this review divided participants into low, middle, and high SES groups

based on splitting up continuous information on family income, parental education, or parental

occupation status (e.g., Kaltsounis, 1974; Quay, 1989; Ryan & French, 1976). When continuous

data are artificially categorized, statistical power is reduced. Such treatments may also mask

nonlinear relationships.
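
A small simulation makes the cost of dichotomization visible; this is a minimal sketch with invented data rather than a re-analysis of any study reviewed here (statistics.correlation requires Python 3.10 or later).

```python
# Sketch: the predictor-outcome correlation shrinks after a median split.
import random
import statistics

random.seed(1)
n = 2000
x = [random.gauss(0, 1) for _ in range(n)]             # continuous predictor
y = [0.4 * xi + random.gauss(0, 1) for xi in x]        # outcome, true slope 0.4

median_x = statistics.median(x)
x_split = [1.0 if xi > median_x else 0.0 for xi in x]  # dichotomized predictor

print(round(statistics.correlation(x, y), 3))        # close to the population value of about .37
print(round(statistics.correlation(x_split, y), 3))  # noticeably smaller after the split
```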


Inappropriate reliance on univariate analysis. Several of the empirical research articles

in this review had access to multiple outcome variables, such as achievement test scores in

multiple subjects. When a researcher is interested in multiple outcomes and has reason to

believe that the outcomes are theoretically related, it is generally more appropriate to utilize a

multivariate analysis that treats the outcomes as a structured composite (Huberty, 1994). This

allows for increased statistical power as well as follow-up descriptive discriminant analysis

techniques to explore constructs within the outcome composite that may be differentially

affected by the predictors.

Ignoring the structure of predictors. The common regression model assumes that each

predictor causes the outcome directly, while being correlated with the other predictors. In other

words, the model assumes that the predictors themselves are not structured. Reality is usually

more complicated. Predictors may exert causative effects on each other as well as on the

outcome variable of interest. Therefore, a given predictor of interest may exert a direct effect on

an outcome while exerting other indirect effects on the outcome through the other predictors.

Path analysis allows the researcher to posit models of how predictors influence each other as

well as the outcome. Multiple path models can be compared against a data set, allowing the

researcher to compare their relative fit and discard theoretical models that do not account for

relations between variables. Marsh (1984) and Marsh and Parker (1984) were the only articles

reviewed to use path analysis, and they did not report model fit statistics.
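
To make the idea of structured predictors concrete, consider a predictor $X_1$ that influences an outcome $Y$ both directly and indirectly through a second predictor $X_2$. A generic path model of this situation (not a reproduction of Marsh's models) can be written as

$X_2 = a X_1 + e_1$, $\qquad Y = c' X_1 + b X_2 + e_2$,

so the total effect of $X_1$ on $Y$ is $c' + ab$: the direct effect $c'$ plus the indirect effect $ab$ transmitted through $X_2$. Competing path models imply different covariance structures, which is what allows their relative fit to be compared.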

Inappropriate use of stepwise procedures. Stepwise procedures are popular in multiple

regression and predictive discriminant analysis contexts for automating the selection of

predictors and ordering their entry into the model. The issue of ordering predictors is

particularly crucial because once a predictor is entered into a model its effect is controlled. In


other words, all of the covariance that that predictor shares with another predictor as well as the

outcome is attributed to the variable that is entered first. Unfortunately, stepwise procedures do

not perform either of their intended functions very well. Thompson (1995) showed that stepwise

procedures do not result in the selection of the best subset of predictors and often fail to yield

replicable results due to their capitalization on sampling error for determining variable entry

order. This was illustrated quite clearly in West’s (1985) analyses. In the first analysis, of

reading achievement, the percentage of the school population that was Black was entered first,

followed by SES. In the second analysis, this time of math achievement, SES was entered first.

The percentage of the school population that is Black did not account for any additional variance

once SES was in the model. Of course, this raises the question of whether the percentage of

students that are Black would have made it into the first model if SES had been entered first.
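
A short simulation illustrates why entry order matters; the data and variable names below are invented and only echo the SES example discussed above.

```python
# Sketch: the incremental R^2 credited to a predictor depends on whether it
# enters the model before or after a correlated covariate.
import numpy as np

rng = np.random.default_rng(0)
n = 5000
ses = rng.normal(size=n)
other = 0.7 * ses + rng.normal(scale=0.7, size=n)   # covariate correlated with SES
y = 0.5 * ses + 0.2 * other + rng.normal(size=n)

def r_squared(y, predictors):
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

print(round(r_squared(y, [ses]), 3))                                 # SES entered first
print(round(r_squared(y, [ses, other]) - r_squared(y, [other]), 3))  # SES entered second: much smaller,
                                                                     # the shared variance went to the covariate
```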

Directions for Future Research

It is apparent from reading this review that much of the research examining the impact of

SES on educational outcomes is quite dated. Much of the relatively recent work is still using the

National Education Longitudinal Study of 1988 dataset, which is rapidly approaching

obsolescence, or even older datasets such as the High School and Beyond data of the early

1980s.

Future research on the effects of SES on school achievement should be conducted with

up-to-date datasets using analysis methods that account for the multilevel nature of the data.

This would address the problems posed by aggregation bias. Using structural equation modeling

procedures would allow for examination of the structure of the predictors and would begin to

integrate the structure and process approaches that have historically been separate in SES


research. Care must be taken to examine the effects of socioeconomic status independent of

many of the factors that have confounded previous work, such as race.

Conclusion

This essay has reviewed a variety of literature related to gifted identification and school

achievement. We have seen overwhelming evidence that socioeconomic status has powerful

effects at both the individual and aggregate level, which must be considered in future research

within gifted education. We have seen that many scholars in gifted education are concerned with

the underrepresentation issue, but that the true effects of race are difficult to disentangle from

those of class.


References

Babaeva, J. D. (1999). A dynamic approach to giftedness: Theory and practice. High Ability

Studies, 10(1), 51-68.

Bennett, C. I. (1995). Comprehensive multicultural education: Theory and practice. (4th ed.).

Boston: Allyn & Bacon.

Bolig, E. E., & Day, J. D. (1993). Dynamic assessment and giftedness: The promise of assessing

training responsiveness. Roeper Review, 16(2), 110-113.

Brosnan, F. L. (1983). Overrepresentation of low-socioeconomic minority students in special

education programs in California. Learning Disability Quarterly, 6(4), 517-525.

Brown, C. N. (1997). Gifted identification as a constitutional issue. Roeper Review, 19(3), 157-

167.

Chasteen, A., Bhattacharyya, S., Horhota, M., Tam, R., & Hasher, L. (2005). How feelings of

stereotype threat influence older adults’ memory performance. Experimental Aging

Research, 31(3), 235-260.

Cicerelli, V. G. (1966). Religious affiliation, socio-economic status, and creativity. Journal of

Experimental Education, 35, 90-93.

Cordeiro, P. (1991). An ethnography of high achieving at-risk Hispanic youths at two urban high

schools: Implications for administrators. Connecticut. (ERIC Document Reproduction

Service No. ED 330 088).

Duncan, G. J., & Brooks-Gunn, J. (2000). Family poverty, welfare reform, and child

development. Child Development, 71, 188-196.


Entwisle, D. R., & Alexander, K. L. (1992). Summer setback: Race, poverty, school

composition, and mathematics achievement in the first two years of school. American

Sociological Review, 57(1), 72-84.

Everson, H. T., & Millsap, R. E. (2004). Beyond individual differences: Exploring school effects

on SAT scores. Educational Psychologist, 39(3), 157-172.

Festinger, L. (1954). A theory of social comparison processes. Human Relations, 7, 117-140.

Ford, D. Y. (1998). The underrepresentation of minority students in gifted education: Problems

and promises in recruitment and retention. Journal of Special Education, 32(1), 4-14.

Ford, D. Y. & Harris, J. J. III. (1999). Multicultural gifted education. New York: Teachers

College Press.

Ford, D. Y., Harris, J. J., III, Tyson, C. A., & Trotman, M. F. (2002). Beyond deficit thinking:

Providing access for gifted African American students. Roeper Review, 24(2), 52-58.

Fordham, S. (1988). Racelessness as a factor in Black students' school success: Pragmatic

strategy or Pyrrhic victory? Harvard Educational Review, 58(1), 54-84.

Fordham, S., & Ogbu, J. (1986). Black students' school success: The burden of "acting White."

The Urban Review, 18, 176-206.

Forman, S. (1979). Effects of socioeconomic status on creativity in elementary school children.

Creative Child and Adult Quarterly, 4(2), 87-92.

Frasier, M. M. (1997). Gifted minority students: Reframing approaches to their identification and

education. In N. Colangelo & G. Davis (Eds.), The handbook of gifted education (2nd

ed., pp. 498-515). Needham Heights, MA: Allyn & Bacon.

Frasier, M. M., & Passow, A. H. (1994). Toward a new paradigm for identifying talent potential.

(No. 94112): The National Research Center on the Gifted and Talented.


Gagné, F. (1994). Are teachers really poor talent detectors? Comments on Pegnato and

Birch's (1959) study of the effectiveness and efficiency of various identification

techniques. Gifted Child Quarterly, 38(3), 124-126.

Gardner, H. (1993). Frames of mind: The theory of multiple intelligences. NY: Basic Books.

Garibaldi, A. M. (1997). Four decades of progress and decline: An assessment of African

American educational attainment. Journal of Negro Education, 66(2), 105-120.

Grantham, T. C., & Ford, D. Y. (2003). Beyond self-concept and self-esteem: Racial identity and

gifted African American students. High School Journal, 87(1), 18-29.

Griffith, J. (1996). Relation of parental involvement, empowerment, and school traits to student

academic performance. Journal of Educational Research, 90(1), 33-41.

Gross, M. U. M. (1989). The pursuit of excellence or the search for intimacy? The forced-choice

dilemma of gifted youth. Roeper Review, 11(4), 189-194.

Haley, G. (1984). Creative response styles: The effects of socioeconomic status and problem-

solving training. Journal of Creative Behavior, 18(1), 25-40.

Harmon, D. (2002). They won't teach me: The voices of gifted African American inner-city

students. Roeper Review, 24(2), 68-75.

Hébert, T., & Reis, S. (1999). Culturally diverse high-achieving students in an urban high school.

Urban Education, 34(4), 428-457.

Hertzman, C., McLean, S., Kohen, D., Dunn, J., & Evans, T. (2002). Early development in

Vancouver: Report of the community asset mapping project (CAMP). Ottawa, Ontario:

Canadian Institute for Health Information.

Hess, R., & McDevitt, T. (1984). Some cognitive consequences of maternal intervention

techniques: A longitudinal study. Child Development, 55, 1902-1912.


Hollingshead, A. B., & Redlich, F. C. (1958). Social class and mental illness. New York: Wiley.

Kaltsounis, B. (1974). Race, socioeconomic status, and creativity. Psychological Reports, 35,

164-166.

Kennedy, E. (1992). A multilevel study of elementary male Black students and White students.

Journal of Educational Research, 86(2), 105-110.

Kirschenbaum, R. (1998). Dynamic assessment and its use with underserved gifted and talented

populations. Gifted Child Quarterly, 42(3), 140-147.

Koenig, A. & Eagly, A. (2005). Stereotype threat in men on a test of social sensitivity. Sex Roles,

52(7-8), 489-496.

Kulik, J. A., & Kulik, C.-L. C. (1992). Meta-analytic findings on grouping programs. Gifted Child

Quarterly, 36(2), 73-77.

Laosa, L. M. (1983). School, occupation, culture, and family: The impact of parental schooling

on the parent - child relationship. In I. E. Sigel & L. M. Laosa (Eds.), Changing families

(pp. 79-135). New York: Plenum.

Lidz, C. & Macrine, S. (2001). An alternative approach to the identification of gifted culturally

and linguistically diverse learners: The contribution of dynamic assessment. School

Psychology International, 22(1), 74-96.

Lohman, D. F. (2005). Review of Naglieri and Ford (2003): Does the Naglieri Nonverbal Ability

Test identify equal proportions of high-scoring White, Black, and Hispanic students?

Gifted Child Quarterly, 49, 19-28.

Luther, M., & Wyatt, F. (1989). A comparison of Feuerstein's method of LPAD assessment with

conventional IQ testing on disadvantaged New York high school students. International

Journal of Dynamic Assessment and Instruction, 1(1), 49-64.


Maggi, S., Hertzman, C., Kohen, D., & D'Angiulli, A. (2004). Effects of neighborhood

socioeconomic characteristics and class composition on highly competent children.

Journal of Educational Research, 98(2), 109-114.

Maker, C. J. (1996). Identification of gifted minority students: A national problem, needed

changes and a promising solution. Gifted Child Quarterly, 40(1), 41-50.

Maker, C. J., Nielson, A. B., & Rogers, J. A. (1994). Giftedness, diversity, and problem-solving.

Teaching Exceptional Children, 27, 4-19.

Marsh, H. W. (1984). Self-concept, social comparison, and ability grouping: A reply to Kulik

and Kulik. American Educational Research Journal, 21(4), 799-806.

Marsh, H. W., & Parker, J. W. (1984). Determinants of student self-concept: Is it better to be a

relatively large fish in a small pond even if you don't learn to swim as well? Journal of

Personality & Social Psychology, 47(1), 213-231.

Marsh, H. W., Relich, J. D., & Smith, I. D. (1983). Self-concept: The construct validity of

interpretations based on the SDQ. Journal of Personality & Social Psychology, 45, 173-

187.

Marx, D. & Goff, P. (2005). The effect of experimenter race on target’s test performance and

subjective experience. British Journal of Social Psychology, 44(4), 645-657.

McBee, M. (In press). Minority representation in gifted programs: A school level analysis of race

and socioeconomics. Roeper Review.

McLoyd, V. C. (1998). Socioeconomic disadvantage and child development. American Psychologist, 53,

185-204.


Miller, J. W., Ellsworth, R., & Howell, J. (1986). Public elementary schools which deviate from

the traditional SES-achievement relationship. Educational Research Quarterly, 10(3), 31-

50.

Mills, B. C. (1983). The effects of socioeconomic status on young children's readiness for

school. Early Child Development & Care, 11(3), 267-273.

Mills, C. J., & Tissot, S. L. (1995). Identifying academic potential in students from under-

represented populations: Is using the Raven's Progressive Matrices a good idea? Gifted

Child Quarterly, 39(4), 209-217.

Naglieri, J. A., & Ford, D. Y. (2003). Addressing underrepresentation of gifted minority children

using the Naglieri Nonverbal Ability Test (NNAT). Gifted Child Quarterly, 47(2), 155-

160.

Naglieri, J., & Jensen, A. R. (1987). Comparison of Black - White differences on the WISC-R

and the K-ABC: Spearman's hypothesis. Intelligence, 11, 21-43.

Opdenakker, M.-C., & Van Damme, J. (2001). Relationship between school composition and

characteristics of school process and their effect on mathematics achievement. British

Educational Research Journal, 27(4), 407-432.

Overton, W. F., Wagner, J., & Dolinsky, H. (1971). Social class differences and task variables in

the development of multiplicative classification. Child Development, 42, 1951-1958.

Portes, A., & MacLeod, D. (1996). Educational progress of children of immigrants: The roles of

class, ethnicity, and school context. Sociology of Education, 69(4), 255-275.

Pyryt, M. (1996). IQ: Easy to bash, hard to replace. Roeper Review, 18, 255-258.

Quay, L. C. (1989). Interactions of stimulus materials, age, and SES in the assessment of

cognitive abilities. Journal of Applied Developmental Psychology, 10(3), 401-409.


Raudenbush, S., & Bryk, A. (1986). A hierarchical model for studying school effects. Sociology

of Education, 59, 1-17.

Raudenbush, S., & Bryk, A. (2002). Hierarchical linear models: Applications and data analysis

methods. (2nd ed.). Thousand Oaks, CA: Sage.

Reid, C., Romanoff, B., Algozzine, B., & Udall, A. (2000). An evaluation of alternative

screening procedures. Journal for the Education of the Gifted, 23(4), 379-396.

Robinson, W. S. (1950). Ecological correlations and the behavior of individuals. American

Sociological Review, 15, 351-357.

Rogers, D. W. (1968). Visual expression: A creative advantage of the disadvantaged. The

Elementary School Journal, 68, 394-399.

Rumberger, R. W. (1995). Dropping out of middle school: A multilevel analysis of students and

schools. American Educational Research Journal, 32(3), 583-625.

Ryan, J. J., & French, J. R. (1976). Long-term grade predictions for intelligence and achievement

tests in schools of differing socio-economic levels. Educational & Psychological

Measurement, 36(2), 553-559.

Sarouphim, K. M. (1999). DISCOVER: A promising alternative assessment for the identification

of gifted minorities. Gifted Child Quarterly, 43(4), 244-251.

Sarouphim, K. (2001). Concurrent validity, gender differences, and identification of minority

students. Gifted Child Quarterly, 45, 130-138.

Sarouphim, K. (2002). DISCOVER in high school: Identifying gifted Hispanic and Native

American students. Journal of Secondary Gifted Education, 14(1), 30-38.


Schmader, T., Johns, M., & Barquissau, M. (2004). The costs of accepting gender differences:

The role of stereotype endorsement in women's experience in the math domain. Sex

Roles, 50(11), 835-850.

Scott, M. S., Perou, R., Urbano, R. C., Hogan, A., et al. (1992). The identification of

giftedness: A comparison of White, Hispanic and Black families. Gifted Child Quarterly,

36(3), 131-139.

Siegle, D. (2001, April 18-21). Teacher bias in identifying gifted and talented students. Paper

presented at the Annual meeting of the Council for Exceptional Children, Kansas City,

MO.

Shaunessy, E., Karnes, F. A., & Cobb, Y. (2004). Assessing potentially gifted students from

lower socioeconomic status with nonverbal measures of intelligence. Perceptual & Motor

Skills, 98(3), 1129-1138.

Shavelson, R. J., & Bolus, R. (1982). Self-concept: The interplay of theory and methods. Journal

of Educational Psychology, 74, 3-17.

Smith, J. L. (2004). Understanding the process of stereotype threat: A review of mediational

variables and new performance goal directions. Educational Psychology Review, 16(3),

177-206.

Soares, A. T., & Soares, L. M. (1969). Self-perceptions of culturally disadvantaged children.

American Educational Research Journal, 6, 31-45.

Spearman, C. (1904). General intelligence: Objectively determined and measured. American

Journal of Psychology, 15, 201-293.

Steele, C. M. (1997). A threat in the air: How stereotypes shape intellectual identity and

performance. American Psychologist, 52(6), 613-629.


Steinberg, L., Blinde, P., & Chan, K. (1984). Dropping out among language minority youth.

Review of Educational Research, 54(1), 113-132.

Taylor, S. A., & Harris, K. C. (2003). School integration and the achievement test scores of

Black and White students in Savannah, Georgia. North American Journal of Psychology,

5(2), 301-309.

Thompson, B. (1995). Stepwise regression and stepwise discriminant analysis need not apply

here: A guidelines editorial. Educational & Psychological Measurement, 55(4), 525-534.

Torrance, E. P. (1977). Discovery and nurturance of giftedness in the culturally different.

Reston, VA: The Council for Exceptional Children.

Trowbridge, N. (1972). Self-concept and socio-economic status in elementary school children.

American Educational Research Journal, 9, 525-537.

Tyler-Wood, T., & Carri, L. (1993). Verbal measures of cognitive ability: The gifted low SES

student's albatross. Roeper Review, 16(2), 102-106.

Valenzuela, A. (1999). Subtractive schooling: U.S.-Mexican youth and the politics of

caring. Albany, NY: SUNY Press.

VanTassel-Baska, J., Johnson, D., & Avery, L. D. (2002). Using performance tasks in the

identification of economically disadvantaged and minority gifted learners: Findings from

project STAR. Gifted Child Quarterly, 46(2), 110-123.

West, C. A. (1985). Effects of school climate and school social structure on student academic

achievement in selected urban elementary schools. Journal of Negro Education, 54(3),

451-461.


Wilson, N. S. (1986). Counselor interventions with low-achieving and underachieving

elementary, middle, and high school students: A review of the literature. Journal of

Counseling & Development, 64(10), 628-634.

Wylie, R. C. (1979). The self-concept: Theory and research on selected topics. (Vol. 2). Lincoln:

University of Nebraska Press.

Yopyk, D. & Prentice, D. (2005). Am I an athlete or a student? Identity salience and stereotype

threat in student-athletes. Basic and Applied Social Psychology, 27(4), 329-336.


CHAPTER 2

A DESCRIPTIVE ANALYSIS OF REFERRAL SOURCES FOR GIFTED IDENTIFICATION

SCREENING BY RACE AND SOCIOECONOMIC STATUS1

1 McBee, M. Accepted by the Journal for Secondary Gifted Education, 2/16/2006.


Abstract

A dataset containing demographic, gifted nomination status, and gifted identification

status for all elementary school students in the state of Georgia (N = 705,074) was examined.

The results indicated that automatic and teacher referrals were much more valuable than other

referral sources. Asian and White students were much more likely to be nominated than Black

or Hispanic students. Students receiving free or reduced-price lunches were much less likely to

be nominated than students paying for their own lunches. The results suggest that inequalities in

nomination, rather than assessment, may be the primary source of the underrepresentation of

minority and low-SES students in gifted programs.


Despite the vital role of the referral as the “gatekeeper” process through which students

become eligible for official evaluation for entry into gifted programs, it remains poorly

understood. An examination of the gifted education literature reveals a paucity of research in

this area. This is especially troubling and indeed surprising given the field’s well-documented

struggle to identify and serve students from minority or low SES families (e.g., Ford, 1998;

Frasier, Garcia, & Passow, 1995). A relatively large amount of work has examined possible

methods of fairly assessing students who are traditionally underrepresented in programs for the

gifted, including assessment schemes based on dynamic assessment (Kirschenbaum, 2004), non-

verbal ability tests (Naglieri & Ford, 2003), Gardner’s (1983) theory of multiple intelligences

(Sarouphim, 1999), compensatory policies such as lowering IQ cutoff requirements for students

from underrepresented groups (Hunsaker, 1994), and performance-based assessments

(VanTassel-Baska, Johnson, & Avery, 2002). These procedures may hold great promise for

identifying and serving students from these groups. However, most school districts require that a

student be referred or nominated before being formally assessed for gifted program placement.

Students that do not receive a referral will be unable to enter the program no matter which formal

assessment procedure is used. The referral process is an obvious potential source of unfairness

in the entrance process. It is essential that reliable information be made available so that current

practices can be evaluated and perhaps modified.

For the remainder of this paper, the terms “referral” and “nomination” will be used

interchangeably to describe the process of designating a student as potentially gifted. Once a

student has received a nomination or referral, he or she is legally required to undergo official


testing for gifted program placement, assuming that the student’s parents consent. The testing

process will be referred to as “evaluation” or “screening” throughout the remainder of the paper.

Teacher Nominations

The classic study on teacher nominations was conducted by Pegnato and Birch in 1959.

In this study, a variety of screening methods were compared on the basis of “effectiveness”, the

percentage of gifted children nominated by the screening method, and “efficiency”, the

percentage of nominated students that would later be confirmed as gifted through individual

testing. “Giftedness” was operationalized as an IQ score of 136 or greater on the Stanford-Binet.

Therefore, effectiveness was sensitive to false negatives while efficiency was sensitive to false

positives. Pegnato and Birch concluded that teacher judgment was a poor method of screening

students for individual testing. Teacher judgment was just 45% effective, meaning that teachers

only nominated 45% of the students who actually had IQs of 136 or greater, and was only 27%

efficient. Their study, widely acclaimed in the gifted education community, formed the basis of

a widespread belief that teachers are poor judges of student potential. Their method of assessing

screening techniques via effectiveness and efficiency ratings was utilized in much of the later

research on teacher nominations (e.g., Gear, 1976; Waters & Clausen, 1983).

Gagné (1994) re-examined Pegnato and Birch’s (1959) study. He severely criticized the

use of effectiveness and efficiency measures in assessing the quality of a nomination scheme,

pointing out that the two are non-independent because they both depend upon the number of students

nominated. In fact, the two indices are negatively correlated. A screening method that

nominates more students will, all things being equal, be more effective since it will necessarily

catch more gifted students while simultaneously becoming less efficient. Gagné argued for the

use of the phi coefficient in judging the effectiveness of a nomination scheme. The phi


coefficient is a correlation coefficient used with categorical data whose interpretation is

equivalent to that of Pearson’s r (Agresti, 1996). To use the phi coefficient, a 2x2 cross-

classification table is created with nomination status (yes/no) on one dimension and gifted status

(yes/no) on the other. Counts for each of the four cells are placed in the table. The

number of counts on the diagonal, or correctly classified cases, is compared to the total number

of counts. Using this type of procedure to assess the quality of teacher nominations as a

screening strategy does not suffer from the drawbacks of Pegnato and Birch’s system. Gagné’s

analysis of the original data found that teacher judgment had a phi coefficient of .29, which

compared quite favorably with the other methods analyzed in the study. Thus, the belief that

teachers are generally poor at detecting academically gifted students is based partly on a classic

study with flawed methodology.
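
Both sets of indices can be computed from the same 2x2 table; the sketch below uses invented counts (not Pegnato and Birch's data) to show how effectiveness, efficiency, and the phi coefficient are obtained from the four cells.

```python
import math

# Hypothetical 2x2 cross-classification of nomination status by gifted status.
a = 30   # nominated and gifted
b = 70   # nominated, not gifted
c = 20   # not nominated, but gifted
d = 600  # not nominated, not gifted

effectiveness = a / (a + c)   # share of gifted students who were nominated
efficiency = a / (a + b)      # share of nominated students confirmed gifted
phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

print(round(effectiveness, 2), round(efficiency, 2), round(phi, 2))  # 0.6 0.3 0.36
```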

Another concern with respect to research on the efficacy of teacher nominations is the

criterion variable. Just what should the criterion be? Previous definitions of giftedness that

relied on an IQ score higher than a specific threshold were quite simple to test via the cross-

classification approach outlined above, as they simply asked the teacher to predict which

students would exceed the target IQ. Indeed, Renzulli and Delcourt (1986) criticized this

teacher-predicts-IQ approach, arguing that teachers’ imperfect ability to predict IQ suggests that

they are valuable sources of information about aspects of student ability that differ from what is

measured by psychometric testing. Current multidimensional definitions of giftedness (see

Feldman, 2003) that define it as some combination of academic ability, creativity, motivation,

achievement, leadership, or artistic talent make the selection of an appropriate criterion variable

quite difficult. Renzulli and Delcourt suggested that the ultimate criterion for evaluating the


usefulness of teacher recommendations should be performance in the enriched academic

program or even later life accomplishment.

Ultimately, insufficient research has been conducted on teacher nominations to make

possible a sound judgment regarding their value. But even if teachers are effective at nominating

students from middle-class majority-culture backgrounds, as some more contemporary research

suggests, a significant question remains regarding their ability to detect students with high

academic potential who come from other backgrounds, especially those backgrounds that are

underrepresented in programs for gifted students. A reading of the research literature on this

topic reveals that it has been a frequent source of concern. Nonetheless, only a small number of

studies have empirically examined this issue.

Hunsaker, Finley, and Frank’s (1997) study is one of the few that has addressed Renzulli

and Delcourt’s (1986) criticism. Teachers were trained to recognize the characteristics of

giftedness as they manifest in students from traditionally underrepresented backgrounds. The

researchers examined canonical correlations between teacher ratings of low-income and minority

students on the TABs Summary Form (Frasier et al., 1995) and the Scales for Rating the Behavioral

Characteristics of Superior Students (Renzulli, Smith, White, Callahan, & Hartman, 1976), on one

hand, and subsequent student performance in the gifted program, on the other, as assessed by

the Scale for Rating Students’ Participation in the Local Gifted Education Program (Renzulli &

Westberg, 1991). The results indicated that the teacher ratings of the students’ characteristics

were moderately correlated with specific aspects of the students’ subsequent performances in

gifted education classes. However, the correlations of the “overall success” subscale of the

student performance measure with the two canonical variables representing teacher evaluations of

gifted characteristics were quite low (.178 and .220).


There is some evidence suggesting that teachers evaluate Hispanic students less favorably

than White students. Masten, Plata, Wenglar, and Thedford’s (1999) study found that fifth-

grade teachers rated Hispanic students less favorably on the Scales for Rating the Behavioral

Characteristics of Superior Students (SRBCSS) and that their ratings of students were associated

with the students’ level of acculturation and ethnic identification. A similar study, conducted by

Plata and Masten (1998), concluded that nominated Hispanic and Caucasian students had similar

scores on the SRBCSS, but non-nominated Hispanic students did have lower scores than their

Caucasian counterparts. However, since these studies did not control for socioeconomic status

or any other potentially lurking variables, they must be interpreted with caution.

Method

Data Sources

A population dataset was obtained from the Georgia Department of Education via special

request. This dataset included records from all public school students enrolled during the 2004

academic year. The relevant variables from this dataset that were used in this analysis included the

student’s race, whether or not the student received free or reduced price lunch, whether the

student had been nominated for participation in the gifted program, the source of the nomination,

and whether or not the student had been identified. The overall N for the dataset was 1,820,635.

Of these, all students in grades 1 through 5 were selected. This yielded an N of 705,074, the

population of Georgia elementary school students during the 2004 academic year.

The nomination sources reported in the data were as follows: automatic referrals, which

occur when a student scores in the 90th percentile or higher on a standardized test,

teacher referrals, parent referrals, self referrals, peer referrals, and other referral sources, which

are referrals communicated to the school by anyone other than the student’s teacher, parent, self,


or peer. Examples of other referrals would include referrals by a community member, minister,

or relative without custody of the child.

Georgia follows a multiple-criteria assessment procedure. Once students have been

nominated, they are evaluated or screened for gifted program placement. Data must be collected

in four areas: mental ability, achievement, motivation, and creativity. Mental ability is generally

determined via psychometric assessment, achievement is generally determined by standardized

test scores, creativity is generally determined by the Torrance Test of Creative Thinking –

Figural, and motivation is generally determined by grades. However, a variety of other forms of

evidence are admissible, including projects or performances that are evaluated by a panel of

judges. To be identified as gifted, students must either provide evidence of superior performance in

any three of these four domains or provide evidence of superior mental ability and achievement.

Evidence for superiority in at least one of the four areas must be provided by a standardized test.
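
The identification rule summarized above can be written as a simple predicate; the sketch below assumes each domain has already been reduced to a yes/no judgment of superior evidence and does not model the requirement that at least one area be documented by a standardized test.

```python
# Sketch of the multiple-criteria decision described above.

def eligible(ability, achievement, creativity, motivation):
    """Superior evidence in any three of the four domains, or the
    ability-plus-achievement route."""
    domains = [ability, achievement, creativity, motivation]
    return sum(domains) >= 3 or (ability and achievement)

print(eligible(ability=True, achievement=True, creativity=False, motivation=False))  # True
print(eligible(ability=False, achievement=True, creativity=True, motivation=False))  # False
```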

Research Questions

This study addresses the following research questions: 1) How do the referral sources

compare in terms of overall quality, as indicated by the phi coefficient, as well as by the number

of students referred, the proportion of referred students who are successfully identified, and the

proportion of identified students located via the referral method? 2) How do the referral sources

compare in terms of equity across racial and socioeconomic groups? 3) Does the

underrepresentation problem occur primarily at the nomination stage or the testing stage of the

gifted identification process?

Prior to conducting the analysis, the data were screened and prepared. The data were

originally collected from each school by individual teachers who were responsible for reporting

the referral sources. Some teachers apparently misreported students with automatic referrals as


not having been referred at all. This resulted in some students being reported as having been

identified as gifted without being referred, which should be impossible. After consultation with personnel at the Georgia Department of Education, gifted students coded as not having been referred were recoded as automatic referrals. Furthermore, there were a small number of

students with missing data on whether or not they received free or reduced price lunch. Those

cases were excluded from the relevant analyses.
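As an illustration, the recoding and exclusion steps described above could be carried out as follows; the file and column names are hypothetical, since the actual variable names in the Georgia Department of Education file are not reproduced here.

    import pandas as pd

    # Hypothetical file and column names for the population dataset.
    df = pd.read_csv("georgia_2004_students.csv")

    # Students flagged as identified gifted but with no recorded referral
    # were recoded as automatic referrals.
    mask = (df["identified"] == 1) & (df["referral_source"] == "none")
    df.loc[mask, "referral_source"] = "automatic"

    # Cases with missing free/reduced-price lunch status were excluded
    # from the SES-related analyses.
    df_ses = df.dropna(subset=["frl_status"])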

Results

The overall composition of Georgia elementary schools by race, SES, and gifted status is

described in Table 2.1. From this table, it is quite obvious that students from different racial

backgrounds are not equally represented in gifted programs. Furthermore, student SES as

indexed by whether or not the student received free or reduced-price lunch (FRL) is strongly

related to the proportion of students that participate in gifted programs. Hereafter, the term “low

SES” refers to students who are receiving lunch aid while “high SES” refers to students who are

not receiving lunch aid.

In the initial analysis, referral sources for the overall student population were examined.

The results of this analysis are presented in Table 2.2. Almost 10 percent of students had been

referred, and 80.3 percent of those students were subsequently identified. Automatic referrals

had the highest validity as indicated by the phi coefficient. Automatic referrals were also the

most common referral source and had the highest accuracy. Teacher referrals accounted for the majority of the remaining referrals and also had an acceptable phi coefficient and high accuracy. Parent and other referrals had similar occurrence frequencies, accuracies,

and phi coefficients. Self and peer referrals were very rare, had the lowest phi coefficients, and

were the least accurate.


Table 2.1: Identified elementary school students by race and SES
______________________________________________________________________________
Race        SES        N Students    N Gifted    Percentage Gifted
______________________________________________________________________________
Overall     Overall      705,074       55,856          7.9
            Low          348,529       10,126          2.9
            High         354,364       45,560         12.9
Asian       Overall       17,587        3,215         18.3
            Low            5,611          530          9.4
            High          11,963        2,684         22.4
Black       Overall      275,821        8,695          3.2
            Low          191,193        4,146          2.2
            High          83,376        4,504          5.4
Hispanic    Overall       59,398        1,389          2.3
            Low           45,057          783          1.7
            High          14,309          606          4.2
Native      Overall          984          101         10.3
            Low              436           23          5.3
            High             546           78         14.3
White       Overall      333,569       41,005         12.3
            Low           97,527        4,267          4.4
            High         235,183       36,615         15.6
______________________________________________________________________________

It is important to note that, in general, the automatic referral process happens first. As

each student can only receive one nomination, this advantages automatic referrals over the other

referral sources. Many students receiving automatic referrals would no doubt have received

referrals from other sources. Furthermore, because one of the four assessment categories that

nominated students must satisfy to gain program entry is satisfied by superior achievement test

scores, students receiving automatic referrals have already fulfilled one of the three criteria for

gaining entrance into the program.


Table 2.2: Overall comparison of referral sources (N = 705,074)
______________________________________________________________________________
Source         Percentage    Success    Percentage    Phi
               referred      rate       identified
______________________________________________________________________________
All sources        9.9         80.3        100.0
Automatic          5.2         86.3         57.1      .682
Teacher            4.0         74.9         37.7      .505
Parent             0.4         59.2          3.0      .120
Self               0.01        44.2          .03      .010
Peer              <0.01        46.2          .01      .006
Other              0.3         77.4          2.2      .123
______________________________________________________________________________

For the next analysis, the data file was split by FRL status before being analyzed. Results

are presented in Table 2.3. The overall relationship between student SES and gifted program

nominations is very clear. Students who did not receive financial assistance were over three

times more likely to be referred than students receiving FRL. The overall accuracy of referrals

was also higher for the students who paid for their own lunches.

Paid lunch students received over four times as many automatic referrals as FRL students

and over three times as many teacher referrals. The accuracy of all referral sources except peer

referrals was higher for the paid lunch students. This is also reflected in the phi coefficients for

each source. Interestingly, teacher referrals had nearly identical phi coefficients for both groups.

The value of the phi coefficient is dependent upon both the accuracy of the referral source as

well as the proportion of identified students that were referred via that source. Though the

accuracy of teacher referrals is somewhat lower for low-SES students, a larger proportion of identified low-SES students were located via teacher nominations, resulting in the slightly higher phi coefficient for that group.
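For a single referral source, all four quality indices reported in the tables can be computed from the 2 x 2 cross-tabulation of referral status against identification status. The sketch below is illustrative only; it shows how phi combines the accuracy of a source with the proportion of identified students that the source locates.

    import numpy as np

    def referral_quality(referred, identified):
        # referred, identified: 0/1 arrays indicating whether each student was
        # referred by this source and whether the student was ultimately identified.
        referred = np.asarray(referred)
        identified = np.asarray(identified)
        a = float(np.sum((referred == 1) & (identified == 1)))  # referred and identified
        b = float(np.sum((referred == 1) & (identified == 0)))  # referred, not identified
        c = float(np.sum((referred == 0) & (identified == 1)))  # identified via another source
        d = float(np.sum((referred == 0) & (identified == 0)))  # neither
        n = a + b + c + d
        pct_referred = 100 * (a + b) / n
        success_rate = 100 * a / (a + b)      # accuracy ("efficiency")
        pct_identified = 100 * a / (a + c)    # share of identified students found by this source
        phi = (a * d - b * c) / np.sqrt((a + b) * (c + d) * (a + c) * (b + d))
        return pct_referred, success_rate, pct_identified, phi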


Though parent and other referrals were rare in both groups, they were more frequent and

more accurate in the high SES group. Proportionally, high SES students received over four

times as many parent referrals and over twenty-four times as many other referrals.

Table 2.3: Comparison of referral sources by SES
______________________________________________________________________________
Source         Percentage    Success    Percentage    Phi
               referred      rate       identified
______________________________________________________________________________
Free or reduced lunch (n = 348,529)
All sources        4.15        70.06       100.00
Automatic          1.93        79.27        52.66      .638
Teacher            1.95        62.84        42.16      .503
Parent              .13        53.60         2.10      .094
Self               <.01        27.27          .03      .008
Peer               <.01        50.00         <.01      .007
Other               .13        67.39         3.06      .139

Paid lunch (n = 354,364)
All sources       15.49        83.00       100.00
Automatic          8.49        87.82        54.81      .682
Teacher            6.01        78.68        38.80      .497
Parent              .66        61.76         3.17      .119
Self               <.01        50.00          .04      .011
Peer               <.01        45.45          .01      .005
Other              3.21        81.46         2.03      .116
______________________________________________________________________________

For analysis three, the data file was split by student race. The results of this analysis can

be found in Table 2.4. Very pronounced differences in nomination frequency are evident across races: about 23 percent of Asian students received a nomination, while only about 3 percent of Hispanic students did. Furthermore, automatic referrals remain the

nomination method with the highest phi coefficients and the highest accuracies, except for


Native students where teacher nominations are the most accurate. Again, this is probably due to

automatic referrals coming first in the referral timeline.

Teacher nominations showed evidence of better performance for Asian, White, and

Native students than for Hispanic and Black students. Furthermore, the quality of teacher

nominations for Black students is especially poor in terms of the phi coefficient and accuracy.

Self and peer referrals continue to be rare and of poor quality. There were no peer referrals for

Asian, Hispanic, and Native students, so phi could not be calculated for these groups. The

proportionality of parent nominations varied across racial groups as well. Asian, Native, and

White students had much higher rates of parent nomination than Black and Hispanic students.

In the final analysis, the data file was split by race and SES. The results of this analysis

may be found in Table 2.5. A few patterns deserve mentioning. Automatic referrals performed

well in all groups. In general, automatic referrals performed better for high SES students than for low SES students in terms of both phi and accuracy; the exceptions were Asian students, for whom automatic referrals had a higher phi coefficient among low SES students, and Native students, for whom automatic referrals were more accurate among low SES students.

Phi coefficients for teacher nominations were higher in low SES students than in high

SES students. However, the accuracy of teacher nominations was higher for high SES students.

The larger phi coefficient values in low SES students result from the fact that a larger proportion

of low SES students are identified via teacher nominations. Though parent nominations were

rare, they were much more frequent in high SES groups.


Table 2.4: Comparison of referral sources by race
______________________________________________________________________________
Race / Source     Percentage    Success    Percentage    Phi
                  referred      rate       identified
______________________________________________________________________________
Asian (n = 17,587)
All sources          23.02        79.42       100.00
Automatic            12.00        82.51        45.75      .614
Teacher               9.69        77.65        41.16      .503
Parent                 .88        55.19         2.64      .090
Self                   .01         0.00         0.00     -.004
Peer                  0.00         -NA-         0.00      -NA-
Other                  .44        84.62         2.05      .115

Black (n = 275,821)
All sources           4.58        68.88       100.00
Automatic             2.30        82.25        60.05      .695
Teacher               1.96        56.47        35.10      .431
Parent                 .17        40.66         2.25      .090
Self                  <.01        66.67         <.01      .021
Peer                  <.01        25.00         <.01      .005
Other                  .14        58.24         2.52      .116

Hispanic (n = 59,398)
All sources           3.34        70.08       100.00
Automatic             1.81        76.14        59.04      .664
Teacher               1.36        63.99        37.22      .479
Parent                 .08        50.00         1.73      .090
Self                  <.01         0.00         0.00     -.001
Peer                  0.00         -NA-         0.00      -NA-
Other                  .08        58.33         2.02      .105

Native (n = 984)
All sources          12.30        83.47       100.00
Automatic             6.00        83.05        48.51      .606
Teacher               4.78        87.23        40.59      .568
Parent                 .71        71.43         4.95      .171
Self                   .10       100.00          .01      .094
Peer                  0.00         -NA-         0.00      -NA-
Other                  .71        71.43         4.95      .171

White (n = 333,569)
All sources          14.65        83.90       100.00
Automatic             7.89        88.06        56.53      .675
Teacher               5.83        80.32        38.10      .516
Parent                 .61        64.47         3.19      .124
Self                   .01        40.00         <.01      .008
Peer                  <.01        55.56         <.01      .007
Other                  .31        84.91         2.14      .123
______________________________________________________________________________


Table 2.5: Comparison of referral sources by race and SES
______________________________________________________________________________
Group / Source    Percentage    Success    Percentage    Phi
                  referred      rate       identified
______________________________________________________________________________
Asian (low SES, n = 5,611)
All sources          12.53        75.39       100.00
Automatic             6.65        77.75        54.72      .623
Teacher               5.01        74.38        39.48      .510
Parent                 .43        50.00         2.26      .091
Self                  0.00         -NA-         0.00      -NA-
Peer                  0.00         -NA-         0.00      -NA-
Other                  .44        76.00         3.58      .152

Asian (high SES, n = 11,963)
All sources          27.95        80.26       100.00
Automatic            14.51        83.53        54.02      .603
Teacher              11.90        78.23        41.51      .445
Parent                1.09        56.15         2.72      .085
Self                  <.01         0.00         0.00     -.005
Peer                  0.00         -NA-         0.00      -NA-
Other                  .44        88.68         1.75      .106

Black (low SES, n = 191,193)
All sources           3.44        63.03       100.00
Automatic             1.54        78.97        56.08      .659
Teacher               1.69        50.62        38.48      .436
Parent                 .10        35.71         1.57      .071
Self                  <.01        60.00          .07      .020
Peer                  <.01        50.00          .02      .011
Other                  .11        54.50         2.77      .119

Black (high SES, n = 83,376)
All sources           7.19        75.17       100.00
Automatic             4.02        85.12        63.39      .722
Teacher               2.60        65.11        31.33      .431
Parent                 .36        43.67         2.91      .102
Self                   .07        75.00          .07      .021
Peer                  <.01         0.00         0.00     -.001
Other                  .20        63.03         2.31      .114

Hispanic (low SES, n = 45,057)
All sources           2.61        66.53       100.00
Automatic             1.32        72.51        54.92      .625
Teacher               1.17        62.00        41.89      .503
Parent                 .04        36.84          .89      .055
Self                  <.01         0.00         0.00     -.001
Peer                  0.00         -NA-         0.00      -NA-
Other                  .07        51.43         2.30      .106

Hispanic (high SES, n = 14,309)
All sources           5.63        75.28       100.00
Automatic             3.38        80.58        64.36      .709
Teacher               1.95        67.74        31.19      .445
Parent                 .20        58.62         2.81      .122
Self                  0.00         -NA-         0.00      -NA-
Peer                  0.00         -NA-         0.00      -NA-
Other                  .09        76.92         1.65      .109

Native (low SES, n = 436)
All sources           6.19        85.19       100.00
Automatic             2.06        88.89        34.78      .543
Teacher               3.67        81.25        56.52      .663
Parent                0.00         -NA-         0.00      -NA-
Self                  0.00         -NA-         0.00      -NA-
Peer                  0.00         -NA-         0.00      -NA-
Other                  .46       100.00         8.70      .165

Native (high SES, n = 546)
All sources          17.22        82.98       100.00
Automatic             9.16        82.00        52.56      .614
Teacher               5.68        90.32        35.90      .533
Parent                1.28        71.43         6.41      .186
Self                   .18       100.00         1.28      .105
Peer                  0.00         -NA-         0.00      -NA-
Other                  .92        78.38         3.85      .126

White (low SES, n = 97,527)
All sources           5.55        78.89       100.00
Automatic             2.57        82.04        48.28      .617
Teacher               2.57        77.20        45.39      .579
Parent                 .22        58.33         2.95      .124
Self                  <.01         0.00         0.00     -.001
Peer                  0.00         -NA-         0.00      -NA-
Other                  .17        85.21         3.37      .165

White (high SES, n = 235,183)
All sources          18.42        84.50       100.00
Automatic            10.08        88.69        57.42      .675
Teacher               7.19        80.76        37.30      .501
Parent                 .77        65.16         3.22      .120
Self                   .01        46.15          .03      .009
Peer                  <.01        55.56          .01      .007
Other                  .37        84.86         2.00      .116
______________________________________________________________________________


Discussion

Considering the results presented in this paper, a few substantive and methodological conclusions can be drawn. The first is that automatic referrals and teacher

referrals are far superior to the other referral sources. The other referral sources are used far less

often and are generally much less accurate. The peer- and self-referral options are so

infrequently used that they have almost no impact on gifted program enrollments.

Do these results provide evidence that the referral process is biased against economically

disadvantaged, Black, and Hispanic students? This is a complex issue. On the basis of numbers

alone, it is obvious that students from these traditionally underrepresented backgrounds are also under-nominated. The probability of nomination varies strongly across race and class

background. Furthermore, the accuracy of nomination sources also varies across backgrounds.

In general, nominations for low SES students are less accurate than nominations for high SES

students. Furthermore, nominations are less accurate for Black and Hispanic students than for

Asian, Native, and White students.

There are at least two plausible explanations for this pattern, depending on one’s beliefs

regarding the distribution of ability across race and class lines. If one adopts the position that

ability is evenly distributed across these lines, then these results can only indicate severe bias in

the nomination and testing procedure. Many readers will undoubtedly adopt this explanation for

the results presented in this paper. The low rate of automatic referrals could indicate bias in

standardized tests; the low rate of teacher nominations could indicate racism, classism, or

cultural ignorance on the part of teachers; and the low rate of parent nominations could indicate

that these students’ parents are alienated from and distrustful of school culture.


Interpreting these results in this light would lead to the conclusion that the nomination

process, rather than the screening process, is the primary cause of differential representation in

gifted programs. While it is certainly true that nominated students from “advantaged” groups

have a higher probability of successfully passing the screening process than “disadvantaged”

students, the effect of these differing “pass rates” is far smaller than the effect of the differing

nomination rates on the resulting gifted program enrollment. For example, 4.58 percent of Black students received a nomination, and 68.9 percent of these successfully passed the screening process, whereas 14.65 percent of White students received a nomination and 83.9 percent of these passed the screening. The pass rate for Black students is thus 82% of the pass rate for White students, whereas the nomination rate for Black students is only 31% of the nomination rate for White students. Equalizing the pass rates for Black and White students

would do little to restore proportional representation of these students in gifted programs if the

nomination rate remained unchanged.
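A small worked check of this claim, using the figures reported above, illustrates the point.

    # Identification rate = nomination rate x screening pass rate.
    black_identified = 0.0458 * 0.689    # = 0.0316, about 3.2% of Black students
    white_identified = 0.1465 * 0.839    # = 0.1229, about 12.3% of White students

    # If Black students kept their observed nomination rate but passed the
    # screening at the White students' rate, the identification rate would
    # rise only to about 3.8%, still far below 12.3%.
    black_equal_pass_rate = 0.0458 * 0.839    # = 0.0384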

Alternatively, if one believes that ability is not evenly distributed, then one can interpret

these results in a different light. The low rate of automatic referrals for certain groups reflects

lesser ability. When students from these groups are nominated, they are able to pass through the

screening less frequently. Teachers nominate fewer students from these groups because there are

simply fewer students from these groups that evidence advanced potential. Furthermore, the low

accuracy of teacher nomination for these students could reflect effort on the part of teachers to

address the long-standing inequality of gifted program enrollments by nominating students that

show even questionable potential to pass the screening process.

The true nature of ability distribution is currently unknown and is, perhaps, unknowable.

To answer this question definitively would require that most or all of the stakeholders agree upon


the nature, dimensionality, and meaning of ability, as well as the creation of instruments that

would be accepted by all as trustworthy, valid, and unbiased. This does not appear likely in the

near term. Therefore, the correct interpretation of these results is currently unknowable. Though

the previous discussion presented two possibilities to explain the observed results, it is also quite

possible that both are true. Ability may not be precisely evenly distributed across backgrounds,

but our current methods for identifying gifted students may also be overlooking students

hailing from traditionally underrepresented backgrounds.

From a policy perspective, the results of this study indicate that more attention needs to

be devoted to the issue of student nominations for gifted programs. Georgia has very strong state

policies in support of gifted education, perhaps the strongest of any state in the nation. Georgia

is among the four states described by the Davidson Institute as having very strong policies and

funding for gifted education (Davidson Institute, 2006). Of these four states, Georgia has the

highest amount of gifted education funding per identified student (although funding levels were

not provided for Iowa and Florida). Georgia’s multiple criteria assessment procedure was

designed in part to help address the underrepresentation problem. The multitude of referral sources considered speaks to the state's commitment to casting a wide net in search of talented

students. In spite of this commitment, Georgia continues to struggle with the

underrepresentation of minority and low-SES students in its gifted programs. It is unclear how

Georgia’s already flexible nomination policies could be improved without massively increasing

costs. One obvious issue is that the self and peer referrals are so infrequently used. Students

should be reminded that they may nominate themselves or other students for gifted program

assessment. Mandatory assessment of all students for gifted program placement would be

optimal but very expensive to implement.


From a methodological point of view, this study has important limitations. The most

pressing problem is that automatic referrals happen earlier than other referrals. This advantages

automatic referrals and inflates the quality indices associated with that referral source.

Therefore, the quality of referral sources cannot be directly compared. Indeed, the only real way

to compare referral sources would be to allow an individual student to receive multiple

nominations, so that a student could be nominated automatically as well as by her teacher and

peer. Recording only one referral source per student creates a “winner-take-all” system that

obscures the true value of each referral technique.

As teacher nominations have received the brunt of the attention in the literature, they

deserve further commentary. Though the quality of teacher nominations did fluctuate across

different student backgrounds, the overall quality was quite high. The average phi

coefficient of .505 is almost twice as high as the phi value computed by Gagne’s (1994)

reanalysis of Pegnato and Birch’s (1959) classic study.

Gagne’s (1994) argument for the use of the phi coefficient was sound, but phi should not

completely supplant the other quality indices, especially the accuracy index (referred to as

“efficiency” in Pegnato & Birch, 1959). Although “other” referral sources are comparatively

rare and thus receive a low phi coefficient, they exhibit good accuracy.

The biggest strength of this study was the extremely large N. Because all of these

students came from the same state and fell under uniform policies mandated by the state

regarding gifted education, fluctuation due to policy shifts was minimized. However, the

applicability of these results to other states with differing gifted education policy is unknown.


References

Agresti, A. (1996). An introduction to categorical data analysis. New York, NY: John Wiley and

Sons.

Davidson Institute. (2006, March 2). Genius denied: How to stop wasting our brightest young

minds – Gifted education policies. Retrieved March 2, 2006 from

http://www.geniusdenied.com/Policies/StatePolicy.aspx?NavID=6_0

Feldman, D. H. (2003). A developmental, evolutionary perspective on giftedness. In J. Borland

(Ed.), Rethinking Gifted Education (pp. 9-33). New York, NY: Teachers College Press.

Ford, D. Y. (1998). The underrepresentation of minority students in gifted education: Problems

and promises in recruitment and retention. Journal of Special Education, 32(1), 4-14.

Frasier, M. M., Garcia, J. H., & Passow, A. H. (1995). A review of assessment issues in gifted

education and their implications for identifying gifted minority students (No. RM95204). The National Research Center on the Gifted and Talented.

Frasier, M. M., Hunsaker, S. L., Lee, J., Finley, V. S., Garcia, J. H., Martin, D., et al. (1995). An

exploratory study of the effectiveness of the Staff Development Model and the Research-

based Assessment Plan in improving the identification of gifted economically

disadvantaged students. Storrs, CT: University of Connecticut, National Research Center

on the Gifted and Talented.

Gagne, F. (1994). Are teachers really poor talent detectors? Comments on Pegnato and Birch's

(1959) study of the effectiveness and efficiency of various identification techniques.

Gifted Child Quarterly, 38(3), 124-126.

Gardner, H. (1983). Frames of mind: The theory of multiple intelligences. New York, NY: Basic

Books.


Gear, G. H. (1976). Accuracy of teacher judgment in identifying intellectually gifted children: A

review of the literature. Gifted Child Quarterly, 20, 478-489.

Hunsaker, S. L. (1994). Adjustments to traditional procedures for identifying underserved

students: Successes and failures. Exceptional Children, 61(1), 72-76.

Hunsaker, S. L., Finley, V. S., & Frank, E. L. (1997). An analysis of teacher nominations and

student performance in gifted programs. Gifted Child Quarterly, 41(2), 19-24.

Kirschenbaum, R. J. (2004). Dynamic assessment and its use with underserved gifted and talented populations. In A. Y. Baldwin & S. M. Reiss (Eds.), Culturally diverse and

underserved populations of gifted students (pp. 49-62). Thousand Oaks, CA: Corwin

Press, Inc.

Masten, W. G., Plata, M., Wenglar, K., & Thedford, J. (1999). Acculturation and teacher ratings

of Hispanic and Anglo-American students. Roeper Review, 22(1), 64-65.

Naglieri, J. A., & Ford, D. Y. (2003). Addressing underrepresentation of gifted minority children

using the Naglieri Nonverbal Ability Test (NNAT). Gifted Child Quarterly, 47(2), 155-

160.

Pegnato, C. W., & Birch, J. W. (1959). Locating gifted children in junior high schools: A

comparison of methods. Exceptional Children, 48, 300-304.

Plata, M., & Masten, W. G. (1998). Teacher ratings of Hispanic and Anglo students on a

behavior rating scale. Roeper Review, 21(2), 139-144.

Renzulli, J. S., & Delcourt, M. (1986). The legacy and logic of research on the identification of

gifted persons. Gifted Child Quarterly, 30, 20-33.

Renzulli, J. S., & Westberg, K. (1991). Scale for rating students' participation in the local gifted

education program. Storrs, CT: University of Connecticut.


Renzulli, J. S., Smith, L. H., White, A. J., Callahan, C. M., & Hartman, R. K. (1976). Scales for

rating the behavioral characteristics of superior students. Mansfield Center, CT:

Creative Learning Press.

Sarouphim, K. M. (1999). DISCOVER: A promising alternative assessment for the identification

of gifted minorities. Gifted Child Quarterly, 43(4), 244-251.

VanTassel-Baska, J., Johnson, D., & Avery, L. D. (2002). Using performance tasks in the

identification of economically disadvantaged and minority gifted learners: Findings from

Project STAR. Gifted Child Quarterly, 46(2), 110-123.

Waters, T. J., & Clausen, S. (1983). Effectiveness of parent versus teacher nomination of gifted

children. Southern Psychologist, 1(4), 189-191.


CHAPTER 3

EXAMINING THE PROBABILITY OF IDENTIFICATION OF STUDENTS FOR GIFTED PROGRAMS IN GEORGIA ELEMENTARY SCHOOLS: A MULTILEVEL STRUCTURAL

EQUATION MODELING STUDY 2

2 McBee, M. To be submitted to Journal of Educational Psychology.


Abstract

This study applied multilevel structural equation modeling to a large-scale (n = 273,311) dataset collected by the Georgia Department of Education in order to model the probability that a student would be identified for participation in a gifted program. The model

examined individual- and organization-level factors that influence the probability that an

individual would be identified for participation in an advanced educational program. The

probability of being identified as gifted depended strongly on student race and socioeconomic

status. The mean probability of identification varied across schools. The model succeeded in

explaining 23 percent of the school-level variance in the probability of gifted identification and

70 percent of the variance in the school academic environment. The negative impact of having a

low-SES background on the probability of identification varied across schools as well. The

model explained 19 percent of this variance. The positive impact of being Asian varied across

schools as well. The model explained 90 percent of this variance. The impact of being Black or

Hispanic did not vary across schools.


The numerical underrepresentation of African-American, Hispanic, and Native American

students in gifted programs has been frequently cited in the gifted education literature (Ford,

1998; Reid, Romanoff, Algozzine, & Udall, 2000; Sarouphim, 1999; Scott, Perou, Urbano, & Hogan, 1992). The topic of underrepresentation is of critical importance to the field of gifted

education. It forces us to consider the possibility that a great number of students who are in need

of advanced educational opportunities are being denied this opportunity on the basis of racism or

classism. Programs that discriminate against minority students, either in fact or in appearance,

are in danger of elimination, as gifted programs in many states subsist on thin margins of

political will and public support.

The biggest issue related to understanding the disparity in gifted program enrollment

across racial groups is: to what extent does this disparity represent actual differences of

developed or potential capability across groups, and to what extent does it indicate the presence

of serious flaws in the methods by which we screen, identify, and serve gifted students? Most

scholars who have examined the issue have agreed with Frasier’s (1997) belief that “There is no

logical reason to expect that the number of minority students in gifted programs would not be

proportional to their representation in the general population” (p. 498). In spite of the critical

importance of this issue, it remains poorly understood. Though the underrepresentation is widely

noted, decried, and lamented in the literature, and a number of methods to increase the

representation of these groups have been proposed, very few published studies have adequately

addressed the complexity of the issue.


Individual-Level Factors in Underrepresentation

Almost all of the gifted education literature addressing the underrepresentation of

students in gifted programs has focused on two individual characteristics: being a member of a

minority group or having a low socioeconomic status (SES) background.

Underrepresentation of Ethnic Minority Students

Nomination problems. Some literature has focused on the role of nominations in the

underrepresentation problem. Most nominations come from teachers (Hunsaker, Finley, &

Frank, 1997). It is now widely believed that though the underlying dimensions of giftedness are

universal (e.g., Frasier & Passow, 1994, listed ten universal talents, abilities, and behaviors of

gifted children), the expression of these qualities may be heavily influenced by a child’s cultural

and economic background as well as the child’s immediate context (Peterson, 1999). Since

teachers in most schools are of White middle-class backgrounds, they may not consistently

recognize the signs of giftedness expressed in students of diverse cultural backgrounds. Thus,

part of the underrepresentation issue may be caused by unfair nomination procedures. There is

some support for this hypothesis in the research literature (e.g., Masten, Plata, Wenglar, &

Thedford, 1999; Plata & Masten, 1998).

Identification problems. Most gifted programs rely, at least to some degree, on

standardized measures of ability or achievement during the assessment process. It has widely

been noted that minority youth significantly underperform on these tests relative to their peers

(e.g., Entwisle & Alexander, 1992; Maker, 1996; Naglieri & Jensen, 1987). The performance

gap between Black and White students’ test scores tends to be about one standard deviation.

Many critics have argued that minority students tend to do poorly on such tests because the tests

themselves are flawed by being biased in favor of students from the dominant culture (Ford,


Harris, Tyson, & Trotman, 2002). In other words, standardized tests might unfairly penalize

minority students by assigning them lower scores for the same level of underlying ability or

achievement. The exact nature of this bias is unknown. However, empirical studies that have

performed differential item functioning (DIF) analyses to search for biased items have generally

failed to detect them (Jensen, 1980). Even more recent DIF analyses using sophisticated three-

parameter item response theory (IRT) models have failed to detect item bias (Gordon, 1987).

Because these models are sensitive to differences in difficulty, discrimination, and guessability,

they should detect any imaginable bias. However, critics of standardized testing continue to

argue against the use of these tests with minority students.

Mills and Tissot (1995) compared the scores of students from a wide variety of racial

backgrounds on several tests of mental ability. Black students scored about one standard

deviation below White students on the traditional tests, but the gap fell to about half a standard

deviation for Raven's Advanced Progressive Matrices, a completely nonverbal test of mental ability. They concluded that verbally loaded tests penalize minority test takers.

Gifted program persistence. Another challenge to equal representation is the issue of

students who choose not to participate in the gifted program due to cultural insensitivity

(Harmon, 2002) or peer pressure (Ford, 1998; Fordham & Ogbu, 1986). This evidence was

primarily gathered via case studies. Worrell, Szarko, and Gabelko (2001) conducted a larger

quantitative study of the issue and were unable to find evidence that race or socioeconomic status

played a role in student program dropout, though this study only examined dropout in a summer

enrichment program.


Low Socioeconomic Status

Students from low socioeconomic backgrounds are another group that has been widely

described as being underrepresented in gifted education programs. Literature in this area is less

voluminous than research on race. Descriptive data describing the degree to which low SES

students are underrepresented in gifted programs is hard to find. The issue is so widespread,

however, that it is taken as a truism.

There are, however, innumerable studies examining the impact of SES on school

achievement. Since school achievement is one dimension that is commonly assessed in gifted

program entrance, it is logical to assume that lower achievement may also lower the probability

of entrance into a gifted program. Low SES has consistently been found to powerfully reduce

student achievement. Ryan and French (1976) found large achievement differences among students from low, middle, and high SES groups, as measured by the Iowa Test of Basic

Skills. Portes and MacLeod (1996) found that individual SES had a strong effect on Stanford

Achievement Test scores, even after controlling for a number of other individual variables.

Tyler-Wood and Carri (1993) found that the gap between low SES and average SES

students’ test scores was highest on the verbal subscales on many popular tests of mental ability.

These findings mirror those of Mills and Tissot (1995) with respect to the Ravens test. The

similarity of these findings may be caused by confounding between race and SES. Many

scholars in gifted education now advocate for using nonverbal instruments with minority, low

SES, or English as a Second Language (ESL) students (Naglieri & Ford, 2003).

One of the questions that has not been addressed in the gifted education literature is the

relative importance of race and SES in gifted program identification. McBee (in press) and

Portes and MacLeod (1996) found that race dropped out of their models when SES was


controlled. Many previous studies of race and gifted program admittance are seriously flawed

because race and SES are very highly related. Studies that have examined race without

controlling for SES either statistically or experimentally are confounded and thus impossible to

interpret.

Though there are no clear reasons why test scores should differ across racial lines, there are perhaps good reasons why students from impoverished backgrounds would underperform.

Mills (1983) found a relatively small reduction in school readiness for Kindergarteners from low

SES backgrounds. Entwisle and Alexander (1992) and West (1985) found similar deficits in

initial readiness that became larger differences in achievement with each passing year. Entwisle

and Alexander attributed this to a “summer setback” phenomenon caused by relatively

unstimulating home environments. Quay (1989) found that cognitive development, as indicated

by performance on Piagetian conservation tasks, was significantly delayed for low SES students.

Students from low SES backgrounds may receive less cognitive scaffolding from their mothers

(Hess & McDevitt, 1984). By definition, low SES families lack resources. They are more likely

to live in substandard housing, have poor medical care, lack healthy foods, experience more

stress, live in high crime areas, and experience increased inter-family conflict (Duncan &

Brooks-Gunn, 2000).

School Factors that May Affect the Probability of Identification

School Socioeconomic Status

The results of a number of studies examining the impact of the SES composition of

schools on student achievement have consistently found that the socioeconomic composition of

the school exerts potent effects on educational outcome over and above the influence of


individual SES (Everson & Millsap, 2004; Kennedy, 1992; Maggi, Hertzman, Kohen, &

D'Angiulli, 2004; Opdenakker & Van Damme, 2001; and Taylor & Harris, 2003).

Everson and Millsap’s (2004) multilevel structural equation modeling study of SAT

performance found that school SES had very powerful direct and indirect effects on both SAT

math and SAT verbal scores. Kennedy (1992) analyzed the performances of Black and White

male third graders on a standardized achievement test. Results indicated that the school SES was

the strongest predictor of achievement at the school level for both the Black and White children.

However, the effect was approximately twice as strong for White students.

Maggi and colleagues (2004) studied the effect of organization-level SES on student

achievement. They found that neighborhood SES was strongly correlated with the proportion of

students within schools that were high achievers. The strength of this relationship increased

from fourth to seventh grade. Portes and MacLeod (1996) found a cross-level SES interaction by

using average school SES as a predictor of the slope coefficient relating individual SES to

mathematics achievement. This predictor was positive and significant such that the slope

relating individual SES to math achievement was steeper in high SES schools than the same

relationship in a low SES school. Students from high SES backgrounds had higher math

achievement when they were situated within high SES schools. Students from low SES

backgrounds were doubly penalized by attending high SES schools and performed better when

they attended low SES schools.

School racial composition. Taylor and Harris (2003) examined the effects of relative

integration and segregation on Black and White students’ Stanford 9 achievement test scores in

third, fifth, and eighth grades. The achievement of Black students in eighth grade was strongly

negatively correlated with the percentage of the enrollment that is Black and even more strongly


with the percentage of the enrollment receiving free or reduced-price lunch (FRL), while it was

positively correlated with the percentage of students that are White. White students’

achievement was not significantly affected by the Black enrollment or the overall percentage of

students receiving free or reduced-price lunch. It was, however, negatively correlated with the

percentage of White students receiving subsidized lunch.

Purpose of the Study

In spite of years of research on and attention to the underrepresentation of poor and

minority students in gifted programs, the problem remains poorly understood. The current study

has addressed several shortcomings of the previous literature examining the underrepresentation

of minority and low SES students in gifted programs. Previous work in gifted education has

focused almost exclusively on the individual-level effects of race and class on identification

outcome. These studies have typically confounded race with class and are thus uninterpretable.

Research Questions

This study addressed the following general research questions:

1. How are student race, socioeconomic status, days absent, and migrant status related to the

probability of being identified as gifted in Georgia elementary schools?

2. Does the general probability of gifted identification vary across schools? If so, do

school composition, academic quality, behavioral environment, and teacher

characteristics explain any of this variance in the probability of gifted identification?

3. Do the probabilities of identification vary across schools specifically for Black, Hispanic,

and Asian students, as well as students receiving free- or reduced-price lunch? If so, do

school composition, academic quality, behavioral environment, and teacher

characteristics explain any of this variance in the probability of gifted identification for

these students?


Method

Sample and Data Sources

A large dataset collected by the Georgia Department of Education was analyzed in this

study. It contains student-level data on every public school student in Georgia during the 2004

academic year (N = 1,780,591), as well as school-level data on behavioral incidents, teacher

ethnicity, training, and experience, and academic composition.

Data details and preparation. The individual-level data consisted of student

ethnicity, lunch assistance status, grade, retention status, migrant status, and whether or not the

student had been identified as gifted either in a previous year or during the 2004 year. All of

these variables were categorical in nature. This data was aggregated for each school ID and

combined into a two-level dataset.

A school-level variable representing the academic composition of the students was

created by performing a principal components analysis of 15 variables representing the

percentage of students scoring as “advanced” on the CRCT in the subjects of math, English, and

reading for grades one through five. Initially this factor was specified with a measurement model in the between-schools portion of the SEM software. Once the within- and between-schools models

were combined into a multilevel SEM, the measurement model could no longer be estimated due

to computational limitations. Therefore, the factor was extracted from the 15 variables in the

SPSS software. A single factor was extracted on the basis of the scree plot and Velicer’s

minimum average partial test. The factor explained 78.4% of the variance in the 15 variables.

Its internal consistency reliability was 0.98. The factor loadings ranged from 0.812 to 0.923. The

“academic environment” variable consisted of the standardized factor scores from this principal

component.
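A sketch of this step is given below, assuming hypothetical column names for the 15 CRCT variables; it standardizes the variables and extracts the first principal component, whose standardized scores serve as the academic composite.

    import pandas as pd
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    # Hypothetical column names: percentage of students scoring "advanced" on
    # the CRCT in math, English/language arts, and reading for grades 1-5.
    crct_cols = [f"pct_adv_{subj}_gr{g}"
                 for subj in ("math", "ela", "read") for g in range(1, 6)]
    schools = pd.read_csv("school_level.csv").dropna(subset=crct_cols)

    # Extract the first principal component from the standardized variables.
    z = StandardScaler().fit_transform(schools[crct_cols])
    pca = PCA(n_components=1)
    scores = pca.fit_transform(z)[:, 0]
    print(pca.explained_variance_ratio_)   # reported as 78.4% in the text

    # Standardized component scores serve as the academic composite variable.
    schools["academic_composite"] = (scores - scores.mean()) / scores.std()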


Four variables describing the teaching staff for each school were included. Two variables

described the percentage of each school’s teaching staff that were Black or Hispanic. Variables

indicating the percentage of the school’s teaching staff that possessed advanced degrees

(Master’s, EdS, or PhD) and the average number of years of teaching experience for teachers at

each school were also included.

The data contained a set of 14 variables describing the incidence of severe behavior

problems within each school. These variables included the number of incidents of aggravated

battery, aggravated child molestation, aggravated sexual battery, aggravated sodomy, armed

robbery, arson, kidnapping, murder, rape, voluntary manslaughter, nonfelony drug possession,

felony drug possession, felony weapons possession, and terroristic threats. These counts were

added together for each school into a total number of incidents, which was then divided by the

total number of students in the school to yield an incident-to-student ratio.
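The computation of this ratio can be sketched as follows, again assuming hypothetical column names for the incident counts and enrollment.

    import pandas as pd

    # Hypothetical column names for the 14 severe-incident counts.
    incident_cols = [
        "aggravated_battery", "aggravated_child_molestation",
        "aggravated_sexual_battery", "aggravated_sodomy", "armed_robbery",
        "arson", "kidnapping", "murder", "rape", "voluntary_manslaughter",
        "nonfelony_drug_possession", "felony_drug_possession",
        "felony_weapons_possession", "terroristic_threats",
    ]

    schools = pd.read_csv("school_level.csv")

    # Sum the incident counts for each school and divide by enrollment to
    # obtain the incident-to-student ratio used as a school-level predictor.
    schools["incident_student_ratio"] = (
        schools[incident_cols].sum(axis=1) / schools["n_students"]
    )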

From this overall dataset, all records of elementary school students were selected. This

resulted in a population (N = 686,375) of all Georgia elementary school students. From this

population, 50% of cases were randomly sampled in order to reduce the computational demands

of the estimation procedure. This resulted in a final sample of 341,634 students in 1,260 schools.

Cluster sizes ranged from 1 to 964, with only 16 clusters having sizes of 30 or smaller.
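The 50 percent subsample could be drawn as shown below; the file name and random seed are illustrative.

    import pandas as pd

    students = pd.read_csv("georgia_2004_students.csv")   # hypothetical file
    elementary = students[students["grade"].between(1, 5)]

    # Draw a 50% simple random sample to reduce the computational burden
    # of the multilevel estimation.
    analysis_sample = elementary.sample(frac=0.5, random_state=1)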

The data were reported to the Georgia DOE by individual personnel within each school.

The accuracy of the information reported by each school could not be verified. However,

schools are held accountable for the accuracy of the data they report to the state and federal

governments. The reporting system automatically generates error codes when out-of-range or

inconsistent data are reported. Variable descriptions may be found in Table 3.1. Descriptive

statistics may be found in Table 3.2, and variable intercorrelations may be found in Table 3.3.


Table 3.1: Variable descriptions
______________________________________________________________________________
Variable name          Description                                 Measurement scale
______________________________________________________________________________
Individual-Level Variables
Black                  Race variable (dummy)                       Dichotomous
Hispanic               Race variable (dummy)                       Dichotomous
Asian                  Race variable (dummy)                       Dichotomous
Lunch                  SES proxy (0=paid, 1=reduced, 2=free)       Trichotomous
Retained               Was student retained during current         Dichotomous
                       year? (0=no, 1=yes)
Migrant                Did student change schools during           Dichotomous
                       current year? (0=no, 1=yes)

School-level Variables
Lunch                  Mean of within-school lunch variable        Continuous
% Students Black       Percentage of enrollment that is Black      Continuous
% Students Hispanic    Percentage of enrollment that is Hispanic   Continuous
% Students Asian       Percentage of enrollment that is Asian      Continuous
% Teachers Adv.        Percentage of teachers with advanced        Continuous
                       degrees (beyond Bachelor's)
Years Tch. Exp.        Average number of years of teaching         Continuous
                       experience
% Teachers Black       Percentage of teachers that are Black       Continuous
% Teachers Hispanic    Percentage of teachers that are Hispanic    Continuous
Incid Student Ratio    Ratio of severe behavioral incidents to     Continuous
                       number of students
% GiftPrev             Percentage of students previously           Continuous
                       identified gifted
% Students Retain      Percentage of students retained             Continuous
N Students / 100       Number of students enrolled divided         Continuous
                       by 100
Migrant                Percentage of students changing schools     Continuous
Academic Comp.         The first principal component extracted     Continuous
                       from 15 variables representing the
                       percentages of students classified as
                       "advanced" on the CRCT in math, language
                       arts, and reading in grades 1-5
Academic Env.          A latent variable defined by Academic       Continuous
                       Comp. adjusting for its unreliability
______________________________________________________________________________


Table 3.2: Descriptive statistics

Individual-level categorical variables (n = 273,311)
______________________________________________________________________________
Variable     Category    %        Category    %        Category    %
______________________________________________________________________________
Asian        Yes          2.9     No          97.1
Black        Yes         41.8     No          58.2
Hispanic     Yes          9.0     No          91.0
Retained     Yes          2.5     No          97.5
Migrant      Yes           .01    No          99.99
Lunch        Paid        50.1     Reduced      8.5     Free        41.1
______________________________________________________________________________

Individual-level continuous variables (n = 980)
______________________________________________________________________________
Variable     Mean     SD      Minimum    Maximum
______________________________________________________________________________
Grade        2.01     1.42    1.00       5.00
______________________________________________________________________________

School-level variables (n = 1262)
______________________________________________________________________________
Variable                     Mean      SD       Minimum    Maximum
______________________________________________________________________________
Lunch                         0.888     0.48     0.00        2.00
% Students Black             41.80     32.50     0.00      100.00
% Students Hispanic           9.24     13.10     0.00       94.68
% Students Asian              2.85      4.40     0.00       30.94
% Teachers Adv               46.71     11.68     3.13       90.31
Avg. Teach Exp               12.12      2.60     0.00       22.18
% Teachers Black             21.23     25.03     0.00      100.00
% Teachers Hispanic           0.71      1.57     0.00       14.61
Incident Student Ratio        0.04      0.18     0.00        4.03
% Students Gifted (prev)      4.05      4.26     0.00       78.68
% Students Retain             2.73      2.24     0.00       27.27
NStudents / 100               8.14      2.90     0.41       23.57
% Students Migrant            0.64      2.42     0.00       22.55
Academic Composite            0.02      1.01    -2.44        3.57
______________________________________________________________________________
Note: When possible, the reported values are those computed after the listwise
deletion of cases with missing variables.


Table 3.3: Variable Intercorrelations
______________________________________________________________________________
                  Acad. Comp   Black      Hispanic   Asian      Grade
______________________________________________________________________________
Acad. Comp           1.018
Black               -0.378      0.243
Hispanic            -0.093     -0.265      0.081
Asian                0.114     -0.145     -0.054      0.028
Grade               -0.001      0.021     -0.032     -0.005      2.015
Migrant             -0.047     -0.067      0.248     -0.012     -0.009
Lunch (sch)         -0.782      0.417      0.105     -0.083     -0.003
% Asian              0.431     -0.171      0.111      0.266     -0.006
% Hispanic          -0.207     -0.133      0.448      0.067     -0.019
% Black             -0.572      0.659     -0.094     -0.069      0.009
% Retained          -0.327      0.163      0.000     -0.039     -0.006
% Migrant           -0.149     -0.086      0.162     -0.028     -0.008
% Gifted (prev)      0.672     -0.273     -0.036      0.121      0.005
Avg Tch Exp          0.089     -0.069     -0.082     -0.038      0.010
% Adv Tch            0.141     -0.147     -0.007     -0.016      0.003
Incid. Ratio        -0.098      0.085     -0.011     -0.005      0.006
NStud / 100          0.141     -0.073      0.073      0.082     -0.004
% Tch Hisp          -0.048      0.067      0.115      0.025     -0.004
% Tch Black         -0.426      0.559     -0.081     -0.057      0.006
Retained            -0.058      0.046      0.025     -0.014     -0.044
Lunch               -0.400      0.329      0.173     -0.063      0.001
Gifted               0.115     -0.086     -0.035      0.043     -0.006
______________________________________________________________________________
                  Migrant    Lunch (sch)  % Asian    % Hispanic  % Black
______________________________________________________________________________
Migrant              0.006
Lunch (sch)          0.065      0.266
% Asian             -0.032     -0.314     19.562
% Hispanic           0.112      0.237      0.251    170.232
% Black             -0.041      0.631     -0.261     -0.208    1055.591
% Retained           0.013      0.378     -0.149      0.002       0.244
% Migrant            0.311      0.208     -0.105      0.352      -0.130
% Gifted (prev)     -0.037     -0.582      0.457     -0.080      -0.413
Avg Tch Exp          0.027     -0.018     -0.147     -0.184      -0.104
% Adv Tch            0.011     -0.088     -0.062     -0.016      -0.223
Incid. Ratio        -0.007      0.112     -0.021     -0.027       0.129
NStud / 100         -0.029     -0.192      0.306      0.161      -0.111
% Tch Hisp           0.002      0.123      0.090      0.262       0.100
% Tch Black         -0.041      0.515     -0.215     -0.180       0.850
Retained             0.013      0.062     -0.024      0.000       0.046
Lunch                0.075      0.505     -0.163      0.115       0.323
Gifted              -0.012     -0.087      0.060     -0.013      -0.064
______________________________________________________________________________
                  % Retained  % Migrant  % Gifted   Avg Tch Exp  % Adv Tch
______________________________________________________________________________
% Retained           5.046
% Migrant            0.046      5.922
% Gifted (prev)     -0.305     -0.121     18.191
Avg Tch Exp          0.088      0.090      0.023      6.730
% Adv Tch           -0.039      0.036      0.084      0.529     136.276
Incid. Ratio         0.022     -0.024     -0.107      0.009       0.076
NStud / 100         -0.033     -0.093      0.136     -0.301      -0.205
% Tch Hisp          -0.005     -0.003      0.014     -0.146      -0.058
% Tch Black          0.112     -0.132     -0.327     -0.138      -0.207
Retained             0.144      0.005     -0.049      0.010      -0.012
Lunch                0.192      0.104     -0.298     -0.006      -0.046
Gifted              -0.039     -0.017      0.084      0.003       0.011
______________________________________________________________________________
                  Incid. Ratio  NStud / 100  % Tch Hisp  % Tch Black
______________________________________________________________________________
Incid. Ratio         0.032
NStud / 100         -0.050      8.404
% Tch Hisp           0.082      0.002      2.437
% Tch Black          0.086     -0.150      0.164    628.122
Retained             0.004     -0.005      0.000      0.024
Lunch                0.056     -0.100      0.059      0.262
Gifted              -0.020      0.013     -0.003     -0.049
______________________________________________________________________________
                  Retained   Lunch      Gifted
______________________________________________________________________________
Retained             0.025
Lunch                0.082      0.903
Gifted              -0.029     -0.106      0.031
______________________________________________________________________________
Note: Variances are displayed on the diagonal.

Analysis

Multilevel structural equation modeling (SEM; Kaplan, 2000) was used as the primary

means of data analysis in this study. A multilevel approach was necessary for this study because

of the obvious nesting of students within schools and also because the relationships between

variables measured at the individual and school levels of analysis comprise the major research

questions. Because the researcher hypothesized a causal structure among the independent

variables and also envisioned the school academic composition as a latent variable, multilevel

SEM was selected over hierarchical linear modeling (McCoach, 2003).

Because the race variables were dummy coded, a separate variable for White students was omitted. This designated White students as the reference group, so the model intercepts could be interpreted as means for the White group. Similarly, to avoid estimation problems due to high

correlations between variables, the school-level student composition variables did not include a

variable referring to White students. Four of the school-level composition variables, the

percentages of the school enrollments that are Black, Hispanic, and Asian, as well as the school

mean for the “lunch” variable, were grand-mean centered. Additionally, the “grade” variable

was transformed by subtracting one from the actual grade so that the intercept probability of


identification estimated in the models would correspond to first grade students rather than

kindergarteners. The intercepts for the between-schools model and the average values for the

random slopes may thus be interpreted as their expected values in a compositionally average

school for White students in first grade.
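These coding decisions can be sketched as follows; the file and column names are hypothetical.

    import pandas as pd

    students = pd.read_csv("student_level.csv")   # hypothetical file names
    schools = pd.read_csv("school_level.csv")

    # Dummy-code race with White as the omitted reference category.
    for race in ("Black", "Hispanic", "Asian"):
        students[race.lower()] = (students["race"] == race).astype(int)

    # Grand-mean center the school composition variables.
    for col in ("pct_black", "pct_hispanic", "pct_asian", "lunch_mean"):
        schools[col + "_c"] = schools[col] - schools[col].mean()

    # Shift grade so that the intercept corresponds to first graders.
    students["grade0"] = students["grade"] - 1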

Steps in model building

The analysis proceeded in seven steps and was conducted in MPlus for Windows version

3.13. Data preparation and principal components analysis was conducted in SPSS for Windows

version 12.0. The analysis was highly exploratory in nature. I was not interested in formally

testing several causal models against one another. Rather, I sought a reasonable model that

would a) exhibit acceptably good fit, b) remain simple enough for easy interpretability, c)

include only endogenous variables of substantive interest, d) make theoretical sense, and e) be

possible to estimate on a powerful personal computer.

First, I attempted to create a viable single-level within-schools model, ignoring the

hierarchical structure of the data. Several models were considered during this phase. The final

model selected, described in Figure 3.1, met all the criteria. Because the endogenous variables in the model were categorical, the WLSMV estimator was utilized. Model fit was poor according to the exact-fit test, χ²(6) = 788.44, p < .001, but excellent according to the approximate fit indices, CFI = .99, RMSEA = .02.

In the second step, I examined an identical model to that examined in step 1, except this

time the effect of clustering was examined using the TYPE = COMPLEX command in the

MPlus software. The model fit was substantially improved, χ²(6) = 228.23, p < .001, CFI = .96, RMSEA = .01. The lower CFI value was caused by a much lower χ² value for the baseline model as compared to the first step.


Figure 3.1: Within-schools path model. (The figure displays Race (Black), Race (Hispanic), Race (Asian), Lunch, Migrant, Retained, and Grade as predictors of the probability of gifted identification, with two random slopes labeled Slope 1A and Slope 1B.)

In step three, a viable single-level between-schools model was created. Again, several

models were considered in this phase of the analysis. The final model selected, which is

described in Figure 3.2, also met the previously mentioned criteria. Since all the endogenous variables in this model were continuous but nonnormally distributed, maximum likelihood estimation with robust standard errors was selected. The model fit was excellent, χ²(4) = 2.99, p = .55, CFI = 1.00, RMSEA = .000. Though the measurement component of the “school

academic environment” latent variable could easily be estimated in the SEM software at this

stage of the analysis, attempting to do so in later stages was impossible due to computational

limits. Therefore, it was treated as a factor defined by the saved factor scores of the principal


component extracted from the 15 academic quality variables. Its error variance was fixed and

was calculated by multiplying the variance in the factor scores by its alpha reliability.

In step four, a multilevel structural equation model was created. The within portion of

the model was identical to that used in steps 1 and 2. The between portion of the model was

unconditional. This model can be thought of as an unconditional intercept model and is similar

to the random ANCOVA model frequently seen in hierarchical linear modeling (HLM) studies.

The purpose of this model was to examine the amount of level-2 variance in “the probability of

gifted identification,” the primary variable of interest. This is especially critical for multilevel

structural equation models with categorical endogenous variables because a within-level variance

is not estimated, which makes the calculation of the intra-class correlation coefficient (ICC)

impossible. This model was used to determine if “the probability of gifted identification” varied

randomly across schools; if it did not, a multilevel treatment would be unjustified. The estimated

variance from this step was compared against the residual variance computed in the next step to

determine the amount of level-2 variance explained by the multilevel model. Full-information

maximum likelihood estimation was selected. Due to missing data, the sample size for this

model fell to n = 296,311 and the number of clusters fell to 1,057. Missing data was handled via

listwise deletion because the MPlus software cannot handle missing data in multilevel models.

Numerical integration was required since the endogenous variables were categorical. The

default standard trapezoidal integration was selected, with 15 integration points per dimension.


Figure 3.2: Between-schools structural model (path diagram relating the school composition variables (% Students Black, % Students Hispanic, % Students Asian, Lunch, % Teachers Adv. Degree, Avg. Years Teacher Exp., % Teachers Black, % Teachers Hispanic, N Students/100, Incident-Student Ratio, % Students Gifted (prev.), % Students Retained, % Students Migrant) and the Academic Environment factor, defined by the Academic Composite with a fixed loading of 1.0, to the probability of gifted identification)

In this case, the only available types of model fit information are the loglikelihood value, Akaike’s information criterion (AIC), and the Bayesian information criterion (BIC). The usual fit indices such as chi-square, CFI, RMSEA, and SRMR cannot be computed. For this model, the


loglikelihood value was -316,134.1 and the sample-size adjusted BIC was 632,447.2 with 19 free

parameters. The results indicated that the logit of the probability of gifted identification (mean =

-3.25, corresponding with a probability of .037) did vary significantly across schools with a

variance of 1.205 and a standard error of .071. The results confirmed the need for a multilevel

approach for modeling this data.
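To make these estimates more concrete, the sketch below converts the reported mean logit back into a probability and uses the reported between-school variance to compute a 95% plausible-value range for school-level identification probabilities. The range is my illustration from the reported numbers, not output from the original MPlus run.

    import math

    def inv_logit(z):
        return 1.0 / (1.0 + math.exp(-z))

    mean_logit = -3.25      # reported mean logit of gifted identification
    between_var = 1.205     # reported between-school variance of the logit
    sd = math.sqrt(between_var)

    print(round(inv_logit(mean_logit), 3))   # about 0.037, matching the text

    low, high = mean_logit - 1.96 * sd, mean_logit + 1.96 * sd
    print(round(inv_logit(low), 3), round(inv_logit(high), 3))   # roughly 0.004 to 0.25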

In step 5, the within- and between-schools models from steps 1 and 3 were combined into

a single multilevel structural equation model. The within-schools portion of this model may be

found in Figure 3.1 and the between-schools portion of this model may be found in Figure 3.2.

This model was similar to the intercept-as-outcome model commonly encountered in the HLM literature. The loglikelihood value was –292,096.5 and the sample-size adjusted BIC was

584,585.2 with 42 free parameters. Due to missing data, the sample size dropped to 273,311

students in 980 clusters. The cluster sizes now ranged from 22 to 964.

In step 6, a set of four unconditional random slope models were estimated. The purpose

of these models was to determine whether the path values from “Black,” “Hispanic,” “Asian,” or “Lunch” to “GRefCur” (the gifted identification variable) in the within-schools model varied randomly across schools. The results

of these models indicated that the slopes for “Lunch” and “Asian” did vary significantly across

schools while the slopes for “Black” and “Hispanic” did not. The “lunch” slope had a variance

of 0.058 and a standard error of 0.013. The “Asian” slope had a variance of 0.358 and a standard

error of 0.128.

Since the variances of the unconditional slopes for “Black” and “Hispanic” were non-

significant, they were fixed in all further models. The results of step 6 indicated that the slopes

from “Lunch” and “Asian” to “the probability of gifted identification” in the within-schools

model (marked 1A and 1B, respectively, in Figure 3.1) should be treated as random.


In step 7, two separate random slope models were estimated, one for “Lunch” and one for

“Asian.” The structure of these models was identical except for the outcome variable of the

slope model. The within-schools portion of the model is described in Figure 3.1, the between-

schools portion in Figure 3.2, and the random slope models may be found in Figure 3.3. Note

that the components of this model that are redundant with the between-schools model, such as

the structure of the academics factor, have been omitted for clarity. Table 3.4 describes model fit

for the analysis steps.

Table 3.4: Model fit information for each analysis step

______________________________________________________________________________
Step  Model                          χ²(df)        RMSEA   CFI    Free         Log-          N-adjusted
                                                                  parameters   likelihood    BIC
______________________________________________________________________________
1     Within                         788.44 (6)    .02     .99
2     Within (complex)               228.23 (6)    .01     .96
3     Between                        2.99 (4)      .00     1.00
4     Unconditional intercept                                     19           -316134.1     623447.2
5     ML Intercept                                                 42           -292096.5     584585.2
6     ML Search for Random Slopes    -NA-          -NA-    -NA-
7a    ML Lunch Slope                                               57           -292061.1     585654.6
7b    ML Asian Slope                                               57           -292082.6     584697.6
______________________________________________________________________________


Figure 3.3: Random slope model for both “lunch” and “race (Asian)” to “probability of being identified as gifted” (path diagram: the school-level variables from Figure 3.2 predicting the random slope)

The optimal approach to examining multiple random slopes in multilevel analysis is to

consider all of them simultaneously in the same model. However, due to computational

limitations, this could not be done. I tried to reduce the computational demands of the procedure


by requesting MPlus’s Monte Carlo integration option with 875 integration points, the maximum number that could be supported. Monte Carlo integration randomly selects a user-specified number of integration points. I attempted to analyze the model with two random slopes and Monte Carlo integration five times, each time specifying a different seed for the random number generator,

in order to determine if the results would be stable from run to run. Unfortunately, the results

from this approach were not stable, so it was abandoned. Therefore, only one random slope

could be estimated at a time. It is possible that the values of some parameters as well as their

standard errors would have changed if both slopes could have been estimated together.

The total sample size for both models was n = 273,311 with 980 clusters which ranged in

size from 22 to 964. For the “Lunch” slope model, the loglikelihood value was –292,061.05 while the sample-size adjusted BIC was 584,654.48 with 57 free parameters. For the “Asian” slope model, the loglikelihood value was –292,082.6 and the sample-size adjusted BIC was 584,697.56

with 57 free parameters.

Interpreting model parameters

The majority of the endogenous (outcome) variables in this model were dichotomous.

For this reason, the parameters could not be interpreted in the usual way, that is, the expected

change in the outcome variable given a unit change in the explanatory variable, controlling for

the other variables in the model. When modeling dichotomous outcomes, the event of interest is

the probability that the individual will score “1” on the outcome variable, which is called a

“success.” Because probabilities are necessarily bounded at 0 and 1, they must be transformed to

an unbounded scale before they can be conveniently mathematically modeled.


One solution to this problem is the logit, which is defined as the natural log of the odds:

z = \ln\left(\frac{p}{1 - p}\right)

where z is the parameter expressed in logits and p is the probability of success. The parameters

are estimated as logits in many statistical models of dichotomous processes, including the models

presented in this paper. For example, the estimated intercept for the “probability of gifted

identification” in the final model in this study was -3.351. To transform this parameter back into

a probability, one must solve for p. The equation for transforming logits back into probabilities

is:

p = \frac{1}{1 + e^{-z}}

where z is the parameter value expressed in logits, p is the probability, and e is the base of

natural logarithms. Following this equation, it is easy to see that the estimated mean logit of -

3.351 corresponds with a probability of .034. Therefore, this is the probability of identification

for a “reference student” in Georgia during the 2004 academic year. However, the interpretation of

this parameter is complicated by the multilevel aspect of the model. The value of –3.351 is the

intercept for the probability of gifted identification and is therefore the expected value when all

of the school-level variables have values of zero. This is obviously unrealistic, as few schools

have teachers with an average of zero years of experience, a student body size of zero, no

students receiving lunch aid, and so on. When allowing the school-level values to take their

mean values, the value of the logit for the parameter becomes –2.882, corresponding with an

estimated probability of gifted identification of .053 for a “reference student.”
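A minimal sketch of these two transformations, using the reported intercepts (illustrative code only, not the original analysis syntax):

    import math

    def logit(p):
        # natural log of the odds
        return math.log(p / (1.0 - p))

    def inv_logit(z):
        # transform a logit back into a probability
        return 1.0 / (1.0 + math.exp(-z))

    print(round(inv_logit(-3.351), 3))   # about 0.034: intercept with school variables at zero
    print(round(inv_logit(-2.882), 3))   # about 0.053: intercept at the school-level means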


Referring to the values reported for the within-schools model in Figure 3.4, we note that

the parameter value from “Black” to “probability of gifted identification” was -.934. To

calculate the effect that this variable has on the outcome probability, we first must know the

measurement scale of the explanatory variable. In this case, Black is a dummy variable coded 0

if the student is not Black and 1 if the student is Black. First the logits are summed, then the

resulting logit is converted back into a probability. The first step is identical to the process of

calculating predicted values from a regression equation.

z = z_0 + z_{\text{Black}}x_{\text{Black}} = -2.882 + (-0.934)(1) = -3.816

p = \frac{1}{1 + e^{3.816}} = .022

Therefore, we see that being Black has an enormous negative impact on the probability of gifted

identification. Black students have less than half the probability of identification of a

comparable White student.

As a final example, the probability of gifted identification for a Hispanic student in third

grade who received free lunch and who was not retained or migrant will be calculated. Recall

that the lunch variable is scored from zero to two, with zero representing paid lunch, one

representing reduced-price lunch, and two representing free lunch.

z = z_0 + z_{\text{lunch}}x_{\text{lunch}} + z_{\text{Hispanic}}x_{\text{Hispanic}} + z_{\text{grade}}x_{\text{grade}} = -2.882 + (-0.645)(2) + (-1.076)(1) + (-0.020)(3) = -5.308

p = \frac{1}{1 + e^{5.308}} = .005

These results indicate that a third-grade Hispanic student receiving free lunch would have almost

no chance of being identified for participation in a gifted program.
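Both worked examples can be reproduced from the reported path values, as in the sketch below; this is an illustration that uses the intercept evaluated at the school-level means (-2.882), and small differences from the reported probabilities reflect rounding.

    import math

    def inv_logit(z):
        return 1.0 / (1.0 + math.exp(-z))

    base = -2.882   # reference student: White first grader, paid lunch, not retained or migrant

    # Black first grader, paid lunch
    z_black = base + (-0.934) * 1
    print(round(inv_logit(z_black), 3))   # about 0.022

    # Hispanic third grader receiving free lunch (lunch coded 2; grade term as in the text, -0.020 * 3)
    z_hispanic = base + (-0.645) * 2 + (-1.076) * 1 + (-0.020) * 3
    print(round(inv_logit(z_hispanic), 3))   # about 0.005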

Finally, note that the magnitude of the path coefficients cannot be used to directly

compare the impact of the various explanatory variables since many of these variables were


measured on different scales. Any variable whose name begins with “%” is a percentage ranging

from 0 to 100. The “lunch” variable is the average of the individual lunch variables for each

school and can range from zero to two. The academic environment factor had a mean of .456

and a standard deviation of .978. The incident-student ratio variable was a measure of the

number of severe behavioral incidents that occurred within each school divided by the total

number of students attending the school. Its mean value was .034 with a maximum of 4.03.

Model results

The path values and standard errors for the within-schools portion of the model may be

found in Figure 3.4, for the between-schools portion of the model in Figure 3.5, and for “Lunch”

and “Asian” slopes, respectively, in Figures 3.6 and 3.7. Note that the estimated values of the

between-schools model were slightly different in the “Asian” and “Lunch” models. When these

values for significant paths diverged by more than 1%, both sets of values were reported in the

figure.

Within-schools model. The results of the within-school model are described in Figure

3.4. Race had strong direct effects, such that Black or Hispanic students had a reduced

probability of gifted identification, even after controlling for socioeconomic status, while Asian

students had an increased probability of identification. Furthermore, race had an

extraordinarily strong effect on the probability of receiving free or reduced-price lunch, which in

turn had a strong effect on the probability of retention. Grade had weak negative effects on both

the probability of retention and on the probability of gifted identification. Table 3.5 summarizes

the direct, indirect, and total effects of the within-schools model. Indirect effects for categorical

mediator variables were calculated based on a formula given by Winship and Mare (1983, p. 85).

The products of the path coefficients were weighted by the odds ratio of the intercept of the


intervening variable. Unfortunately, because no level-1 variance is estimated in multilevel

models with categorical outcomes, it is not possible to calculate the amount of variance in the

probability of gifted identification explained.
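The sketch below illustrates one reading of that weighting for the Black-to-Gifted indirect effect through lunch status, using the reported path values and the reduced-price lunch intercept; it is illustrative rather than the original computation, and small differences from Table 3.5 reflect rounding.

    import math

    lunch_intercept = -0.936            # reduced-price lunch intercept (Table 3.5)
    weight = math.exp(lunch_intercept)  # odds implied by the mediator's intercept

    a = 1.712    # Black -> Lunch path
    b = -0.645   # Lunch -> Gifted path
    print(round(a * b * weight, 3))     # about -0.43; Table 3.5 reports -0.429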

Figure 3.4: Path values and standard errors for within-schools portion of random slope model (path diagram; estimates also appear in Table 3.5)

Between-schools model. The between-schools results are summarized in Figure 3.5. The model

explained 70 percent of the variance in the school academic environment through the school

composition. Student body SES had a very powerful impact on the academic environment, as

did the percentage of students within the school that had been previously identified as gifted.


Table 3.5: Model summary for within-schools component of random slope model
______________________________________________________________________________
Intercepts
______________________________________________________________________________
Variable                Value     Standard Error   T-Value    Probability
Lunch (Reduced)         -0.936    .001             785.46*    .281
Lunch (Free)            -1.361    .001             1011.94*   .204
Retained                -3.931    .020             201.28*    .019
Gifted identification   -3.351    .304             11.04*     .034

Direct Effects
______________________________________________________________________________
From        To         Parameter Value   Standard Error   T-Value
Black       Lunch      1.712             .002             854.95*
Hispanic    Lunch      2.064             .006             355.30*
Asian       Lunch      0.179             .009             19.40*
Lunch       Retained   0.563             .010             55.25*
Grade       Retained   -0.020            .005             -32.39*
Lunch       Gifted     -0.645            .177             -3.65*
Black       Gifted     -0.934            .037             -25.51*
Hispanic    Gifted     -1.076            .068             -15.79*
Asian       Gifted     1.161             .512             2.27*
Migrant     Gifted     0.073             .303             -0.24
Grade       Gifted     -0.020            .005             -9.67*
Retained    Gifted     -3.810            .618             -6.16*

Indirect Effects
______________________________________________________________________________
From        To         Through            Value
Black       Gifted     Lunch              -0.429
Hispanic    Gifted     Lunch              -0.518
Asian       Gifted     Lunch              -0.045
Lunch       Gifted     Retained           -0.044
Grade       Gifted     Retained           0.001
Black       Gifted     Lunch, Retained    -0.028
Hispanic    Gifted     Lunch, Retained    -0.033
Asian       Gifted     Lunch, Retained    -0.003
Black       Retained   Lunch              0.375
Hispanic    Retained   Lunch              0.452
Asian       Retained   Lunch              0.039

Total Effects
______________________________________________________________________________
From        To         Value
Black       Gifted     -1.391
Hispanic    Gifted     -1.627
Asian       Gifted     1.113
Lunch       Gifted     -0.689
Grade       Gifted     -0.019
______________________________________________________________________________
* Significant at or beyond p = .05

The racial composition of the student body also impacted the academic environment, with weak

negative effects for the percentages of the student body that were Black or Hispanic and a

stronger positive effect for the percentage of the student body that was Asian. It is quite

interesting that the variables measuring teacher education and experience did not have significant

effects on the school academic environment, nor did the ratio of severe behavioral incidents per

student or the percentage of students that had been retained.

The model explained 23 percent of the between-schools variance in the probability of

gifted identification. Only five paths had significant direct effects. The school SES composition

had a strong direct effect, such that students in schools with more low-SES students had a higher

probability of gifted identification. However, school SES also exerted a strong negative indirect

effect through school academic environment, overpowering the positive direct effect. The total


effect of school SES on the mean probability of identification was negative. Furthermore, the

percentages of the student body that were Black and Hispanic also exerted small positive effects.

The school academic environment had a large positive effect. Finally, the incident-to-student

ratio variable had a strong negative effect on the probability of gifted identification. Table 3.6

summarizes the direct, indirect, and total effects in the between-schools model. Values reported

here are from the “Asian” random slope model. In cases where significant path values or

standard errors were different in the “Lunch” slope model, the value is italicized and reported

beneath the value from the Asian slope model.

Random slope model for “Lunch.” The results of the random slope model for the

“Lunch” slope (1A) are reported in Figure 3.6. The purpose of this model was to attempt to

explain the variance across schools in the probability that a student receiving free or reduced-

price lunch would be identified as gifted. The mean of this outcome was –.645 with a standard

deviation of .241. This parameter should be interpreted as a logit. The odds of a student

receiving reduced-price lunch being identified as gifted would be, on average, only 52% of the

odds of a comparable student who did not receive aid. The odds of identification for a student

receiving free lunch would be only 27.5% of the odds for a comparable student not receiving aid.

Only two explanatory variables in this model had significant effects. Nonetheless, the model

explained 19 percent of the variance in the parameter. The school academic environment had a

weak negative effect on the probability that a student receiving free or reduced-price lunch

would be identified. The percentage of students previously identified as gifted had a positive

effect on the probability of gifted identification. Table 3.7 summarizes the direct, indirect, and

total effects in this slope model.
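The odds ratios quoted above follow from exponentiating the slope values; a quick illustrative sketch using the reported estimates:

    import math

    slope_reduced = -0.645           # slope mean for reduced-price lunch
    slope_free = 2 * slope_reduced   # lunch is coded 0/1/2, so free lunch doubles the logit

    print(round(math.exp(slope_reduced), 3))   # about 0.52: odds ratio for reduced-price lunch
    print(round(math.exp(slope_free), 3))      # about 0.28: odds ratio for free lunch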


Figure 3.5: Path values for between-schools (intercept) component of random slope model (path diagram; estimates also appear in Table 3.6)


Table 3.6: Model summary for between-schools component of random slope model
______________________________________________________________________________
Intercepts
______________________________________________________________________________
Variable                Value     Standard Error   T-Value    Probability
Academics               0.456     .114             3.99*
Gifted identification   -3.351    .304             11.04*     .034

Direct Effects
______________________________________________________________________________
From             To             Parameter Value   Standard Error   T-Value
Lunch            Gifted         .310              .146             2.13*
% Black          Gifted         .009              .003             3.53*
% Hispanic       Gifted         .012              .004             2.69*
% Asian          Gifted         .012              .017             0.71
% Tch Adv        Gifted         -.005             .004             -1.47
Academic Env     Gifted         .562              .069             0.07
Avg Tch Exp      Gifted         -.007             .017             -0.42
% Tch Black      Gifted         -.002             .003             -0.58
% Tch Hisp.      Gifted         .005              .026             0.18
Incident Ratio   Gifted         -1.407            .399             -3.53*
% Gifted         Gifted         .006              .012             0.55
% Retained       Gifted         -.013             .019             -0.71
NStudents/100    Gifted         .010              .017             0.57
% Migrant        Gifted         -.002             .016             -0.15
Lunch            Academic Env   -.830             .054             -15.51*
% Black          Academic Env   -.005             .001             -6.92*
% Hispanic       Academic Env   -.013             .002             -5.98*
% Asian          Academic Env   .045              .006             7.33*
% Tch Adv        Academic Env   .003              .002             1.95
Avg Tch Exp      Academic Env   .013              .008             1.59
Incident Ratio   Academic Env   -.003             .127             -0.02
% Gifted         Academic Env   .056              .005             12.36*
% Retained       Academic Env   -.014             .008             -1.74

Indirect Effects
______________________________________________________________________________
From             To       Through         Value
Black            Gifted   Academic Env    -.002
Hispanic         Gifted   Academic Env    -.007
Asian            Gifted   Academic Env    .025
Lunch            Gifted   Academic Env    -.466
% Gifted (prev)  Gifted   Academic Env    .031

Total Effects
______________________________________________________________________________
From             To       Value
Black            Gifted   .007
Hispanic         Gifted   .005
Lunch            Gifted   -.156
______________________________________________________________________________
* Significant at or beyond p = .05

Random slope model for “Asian.” The results of the random slope model for the “Asian”

slope (1B) are reported in Figure 3.7. Recall that it was necessary to model this slope separately

due to computational limitations. The mean of this outcome was 1.161 and its standard deviation was .598. Therefore, Asian students have, on average, roughly 3.2 times the odds of gifted identification of comparable White students. The model explained 90 percent of the between-schools

variance in the probability of an Asian student being identified as gifted. Three variables had

statistically significant effects on the outcome. The school socioeconomic composition had a

strong effect such that Asian students in schools with large numbers of students receiving

subsidized lunch had a higher probability of being identified. The percentage of teachers in the


school that were Black had a weak negative effect. Finally, the size of the school student body

had a negative effect on the probability of gifted identification for Asian students. Table 3.8

summarizes the direct effects in this model.

Discussion

The results of this study have demonstrated that, in spite of relatively recent additions to

Georgia law changing the identification procedure for gifted programs to make it easier to

identify traditionally underrepresented students, serious underrepresentation persists in Georgia.

Though a previous study conducted by the author (using school-level data only) suggested that

the disparities in gifted program participation across racial groups might result only from

socioeconomic differences (McBee, in press), the results of the current study demonstrate that

this hopeful scenario is simply untrue. Even when socioeconomic status is controlled, race has a

huge impact on the probability of identification. Hispanic students remain the group with the

lowest probability of identification, though their probability is only slightly lower than the

probability for Black students. Asian students have a higher probability of identification than

students from any other group. Not only does being Black or Hispanic exert a large negative

effect on the probability of identification directly, it also exerts large indirect effects through

increasing the probability that the student will also be economically disadvantaged and therefore receive the penalties associated with socioeconomic deprivation.


Figure 3.6: Path values for slope portion of random slope model 1A, from “lunch” to “probability of being identified as gifted” (path diagram; estimates also appear in Table 3.7)


Table 3.7: Model summary for “Lunch” slope portion of random slope model
______________________________________________________________________________
Intercepts
______________________________________________________________________________
Variable              Value     Standard Error   T-Value   Odds Ratio
Lunch Slope (Red)     -0.645    .177             -3.65*    .52
Lunch Slope (Free)    -1.290    .177             -7.29*    .28

Direct Effects
______________________________________________________________________________
From             To         Parameter Value   Standard Error   T-Value
Lunch            Slope 1A   -0.069            .093             -0.75
% Black          Slope 1A   0.000             .001             0.10
% Hispanic       Slope 1A   0.002             .002             0.75
% Asian          Slope 1A   0.008             .007             1.18
% Tch Adv        Slope 1A   0.000             .002             -0.15
Academic Env     Slope 1A   -0.099            .043             -2.28*
Avg Tch Exp      Slope 1A   -0.002            .011             -0.15
% Tch Black      Slope 1A   0.003             .002             1.49
% Tch Hisp.      Slope 1A   0.013             .014             0.93
Incident Ratio   Slope 1A   -0.352            .277             -1.27
% Gifted         Slope 1A   0.017             .008             2.25*
% Retained       Slope 1A   -0.001            .010             -0.14
NStudents/100    Slope 1A   0.000             .008             0.01
% Migrant        Slope 1A   -0.006            .010             -0.14

Indirect Effects
______________________________________________________________________________
From             To         Through         Value
% Gifted (prev)  Slope 1A   Academic Env    -0.006

Total Effects
______________________________________________________________________________
From             To         Value
% Gifted (prev)  Slope 1A   .011
______________________________________________________________________________
* Significant at or beyond p = .05

Figure 3.7: Path values for slope portion of random slope model 1B, from “Asian” to “probability of being identified as gifted” (path diagram; estimates also appear in Table 3.8)


Table 3.8: Model summary for “Asian” slope portion of random slope model
______________________________________________________________________________
Intercepts
______________________________________________________________________________
Variable       Value    Standard Error   T-Value   Odds Ratio
Asian Slope    1.161    .512             2.27      3.19

Direct Effects
______________________________________________________________________________
From             To         Parameter Value   Standard Error   T-Value
Lunch            Slope 1B   0.947             .412             2.30*
% Black          Slope 1B   -0.005            .005             -0.90
% Hispanic       Slope 1B   -0.010            .007             -1.53
% Asian          Slope 1B   0.000             .012             0.01
% Tch Adv        Slope 1B   -0.009            .006             -1.43
Academic Env     Slope 1B   0.062             .133             0.47
Avg Tch Exp      Slope 1B   -0.016            .028             -0.56
% Tch Black      Slope 1B   -0.014            .006             -2.38*
% Tch Hisp.      Slope 1B   -0.035            .036             -0.98
Incident Ratio   Slope 1B   0.196             .976             0.20
% Gifted         Slope 1B   -0.002            .018             -0.10
% Retained       Slope 1B   -0.014            .044             -0.32
NStudents/100    Slope 1B   -0.043            .021             -2.01*
% Migrant        Slope 1B   0.012             .053             0.23
______________________________________________________________________________
* Significant at or beyond p = .05

Based on the mean values for the school composition, we can calculate the average probability

of identification for a White first grader who does not receive free or reduced-price lunch, was

not retained and not migrant (in other words, a “reference student”) at 0.053. The same


probability for an equivalent Black student drops to 0.022. The same Black student who receives

free lunch would have an average probability of identification of 0.005, and 59.3 percent of the

Black students in Georgia receive free lunch. The situation for Hispanic students is even more

dire. A Hispanic first grader with no other special characteristics would have a probability of

identification of 0.019. A Hispanic first grader receiving free lunch would have an average probability of identification of .005, with 66.7 percent of Hispanic students in Georgia receiving free

lunch. Table 3.9 describes model-implied probabilities for identification of students of various

backgrounds.
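The probabilities reported in Table 3.9 can be reproduced from the within-schools estimates with a short sketch such as the one below (illustrative only; small rounding differences from the table are possible):

    import math

    def inv_logit(z):
        return 1.0 / (1.0 + math.exp(-z))

    base = -2.882   # reference White first grader with paid lunch, at the school-level means
    race_effect = {"White": 0.0, "Black": -0.934, "Hispanic": -1.076, "Asian": 1.161}
    lunch_code = {"Paid": 0, "Reduced": 1, "Free": 2}   # multiplied by the lunch path of -0.645

    for race, r in race_effect.items():
        for lunch, code in lunch_code.items():
            p = inv_logit(base + r + (-0.645) * code)
            print(f"{race:8s} {lunch:8s} {p:.3f}")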

At the between-schools level, the race and socioeconomic composition actually had small

positive direct effects on the probability of gifted identification. However, the positive direct

effect of the SES composition is overshadowed by its larger indirect effect transmitted through

the school academic environment.

One striking finding was that fully seventy percent of the variance in the school academic

environment factor was explained via the school composition. It was surprising that the teacher variables did not exert a significant influence on the academic environment, particularly given the current emphasis on teacher and school accountability. It is also somewhat surprising that the percentage of students that had been retained did not exert a significant negative effect on the school academic environment.

Consider that the variables that made up the academic environment composite were the

percentages of students scoring as “advanced” on the Criterion Referenced Competency Test

(CRCT). The results suggest that a certain population of students is likely to do well on the

CRCT under any circumstance and is therefore less sensitive to factors such as teacher quality.


Table 3.9: Model-implied probabilities of identification
______________________________________________________________________________
Student     FRL        Probability of Gifted Identification
White       Paid       5.3%
White       Reduced    2.9%
White       Free       1.5%
Black       Paid       2.2%
Black       Reduced    1.1%
Black       Free       0.6%
Hispanic    Paid       1.9%
Hispanic    Reduced    1.0%
Hispanic    Free       0.5%
Asian       Paid       15.1%
Asian       Reduced    8.6%
Asian       Free       4.7%
______________________________________________________________________________
Note: All probabilities assume first-grade students in a compositionally average school who were not retained and not migrant.

Perhaps this is related to the results of Reis, Westberg, Kulikowich, and Purcell’s (1998) study

which found that fifty percent of the curriculum could be eliminated for gifted students without

negatively impacting their achievement test scores. Perhaps the impact of teacher training and

experience would have been more powerful if the academic composite had been defined by

students scoring as “proficient” or “advanced” rather than just “advanced” on the CRCT.

Another surprising finding was that the random slope coefficients for Black and Hispanic

in the within-schools model did not vary randomly across schools. When planning the study,


these slopes along with the Lunch slope were of primary interest because they represent the

students who are known to be underrepresented in gifted programs. It was hoped that this study

would help identify the types of schools that would be most effective at identifying Black and

Hispanic students. Unfortunately, it seems that all Georgia schools are equally ineffective at

identifying these two types of students.

Turning to the random slope model for Lunch, we noted that two paths had significant

effects. The path from academic environment was weakly negative, while the path from the

percentage of students previously identified as gifted was positive. Therefore, the positive direct

effect of the “previous gifted” variable was somewhat weakened by its indirect effect through the

school academic environment but remained positive. The positive direct effect was hypothesized

because it seemed likely that schools that were more effective at identifying gifted students in

general would also be better at identifying low-SES gifted students. The negative effect of the

academic environment, however, is contrary to what was expected. Perhaps schools with fewer students excelling on standardized achievement tests create more of an opportunity for traditionally underrepresented students to stand out.

The Asian slope model was initially considered relatively uninteresting compared to the

other potential random slopes in the model and was only examined at all for reasons of

symmetry. This is because Asian students are known to be substantially overrepresented in

gifted programs (Kitano & DiJiosia, 2002). The results of the within-schools component of the

model confirm this finding. In fact, the degree of overrepresentation of Asian students is

actually larger than the degree of underrepresentation for Black or Hispanic students, though not

larger than the degree of underrepresentation of students receiving free lunch.


The results of this model are difficult to understand, especially in light of the small

amount of research that this issue has received. The SES composition of the school exerted a

strong effect on the probability of identification for Asian students, so that schools with more

students receiving aid identified more Asian students. School size exerted a weak negative

effect, as did the percentage of teachers in the school that were Black. This last variable was included because it was hypothesized that diverse teachers would possess more multicultural

awareness and might therefore be more likely to nominate non-White students for evaluation for

gifted program entry. This did not appear to be the case.

It is possible that the effects in the Asian slope model actually represent composition

effects. There are relatively few Asian students in Georgia in general, with 12.6 percent of

schools having no Asian students and 50.5 percent of schools having less than one percent of

their student bodies composed of Asian students. Asian students would obviously have a small

chance of identification in schools that have zero or very few Asian students. This probably also

explains the relatively high standard error for the Asian slope parameter.

This study had several important limitations. First, on the methodological level,

computational limits became a major problem. The estimation of final slope models presented in

this paper took between six and twelve hours apiece on a powerful personal computer. The

models required a total of only 225 integration points. To estimate a model with both slopes

considered simultaneously would have required a total of 3,375 integration points. The MPlus

software is not currently equipped to make use of some recent advancements in computer

hardware such as multiple processors, though the upcoming version of the software may have

this feature.


Second, missing data was a major problem. Listwise deletion resulted in the removal of

68,323 cases. There is some evidence that these cases were not missing at random. Upon

examining the descriptive statistics in Table 3.2, one will note that there is a large discrepancy

between the individual- and school-level versions of the “migrant” variable. This is because the

individual version was computed after listwise deletion, whereas the school version was

aggregated from the individual version before listwise deletion took place. The discrepancy

between the two indicates that the majority of migrant students had missing data and were

therefore removed from the analysis. This is easy to explain, since it may take some time for a migrant student’s records to be transferred to a new school. Nevertheless,

missing data that is not missing at random can bias the parameter estimates in any model (Enders

& Bandalos, 2001).

Finally, the dearth of model fit information for the multilevel models in this study is a

serious shortcoming as well. The loglikelihood and BIC values reported are predominantly useful

for comparing the fits of competing models, not for making absolute judgments on whether or

not the model fits the data. The actual degree to which the models presented herein conform to

or diverge from the data remains unknown.

This study was the first to examine the issue of underrepresentation through categorical

data analysis tools, enabling the study to examine the actual probabilities of identification for

various student types. It is the first study of this type to use multilevel modeling techniques, and

it uses a dataset of unparalleled size. In spite of its limitations, it significantly extends our

knowledge of this issue.


References

Duncan, G. J., & Brooks-Gunn, J. (2000). Family poverty, welfare reform, and child

development. Child Development, 71, 188-196.

Enders, C. & Bandalos, D. (2001). The relative performance of full information maximum

likelihood estimation for missing data in structural equation models. Structural Equation

Modeling, 8(3), 430-457.

Entwisle, D. R., & Alexander, K. L. (1992). Summer setback: Race, poverty, school

composition, and mathematics achievement in the first two years of school. American

Sociological Review, 57(1), 72-84.

Everson, H. T., & Millsap, R. E. (2004). Beyond individual differences: Exploring school effects

on SAT scores. Educational Psychologist, 39(3), 157-172.

Ford, D. Y. (1998). The underrepresentation of minority students in gifted education: Problems

and promises in recruitment and retention. Journal of Special Education, 32(1), 4-14.

Ford, D. Y., Harris, J. J., III, Tyson, C. A., & Trotman, M. F. (2002). Beyond deficit thinking:

Providing access for gifted African American students. Roeper Review, 24(2), 52-58.

Fordham, S., & Ogbu, J. (1986). Black students' school success: The burden of "acting White."

The Urban Review, 18, 176-206.

Frasier, M. M. (1997). Gifted minority students: Reframing approaches to their identification and

education. In N. Colangelo & G. Davis (Eds.), The handbook of gifted education (2nd

ed., pp. 498-515). Needham Heights, MA: Allyn & Bacon.

Frasier, M. M., & Passow, A. H. (1994). Toward a new paradigm for identifying talent potential.

(No. 94112): The National Research Center on the Gifted and Talented.


Gordon, R. A. (1987). Jensen's contributions concerning test bias: A contextual view. In S.

Modgil & C. Modgil (Eds.), Arthur Jensen -- consensus and controversy (pp. 77-154).

Philadelphia, PA: Falmer.

Grantham, T. C., & Ford, D. Y. (2003). Beyond self-concept and self-esteem: Racial identity and

gifted African American students. High School Journal, 87(1), 18-29.

Harmon, D. (2002). They won't teach me: The voices of gifted African American inner-city

students. Roeper Review, 24(2), 68-75.

Hess, R., & McDevitt, T. (1984). Some cognitive consequences of maternal intervention

techniques: A longitudinal study. Child Development, 55, 1902-1912.

Hunsaker, S. L., Finley, V. S., & Frank, E. L. (1997). An analysis of teacher nominations and

student performance in gifted programs. Gifted Child Quarterly, 41(2), 19-24.

Jensen, A. R. (1980). Bias in mental testing. New York: Free Press.

Kaplan, D. (2000). Structural equation modeling: Foundations and extensions. Thousand Oaks, CA: Sage.

Kennedy, E. (1992). A multilevel study of elementary male Black students and White students.

Journal of Educational Research, 86(2), 105-110.

Kitano, M. K., & DiJiosia, M. (2002). Are Asian and Pacific Americans overrepresented in

programs for the gifted? Roeper Review, 24(2), 76-81.

Kline, R. B. (2005). Principles and practice of structural equation modeling (2nd ed.). New

York: Guilford.

Maggi, S., Hertzman, C., Kohen, D., & D'Angiulli, A. (2004). Effects of neighborhood

socioeconomic characteristics and class composition on highly competent children.

Journal of Educational Research, 98(2), 109-114.


Maker, C. J. (1996). Identification of gifted minority students: A national problem, needed

changes and a promising solution. Gifted Child Quarterly, 40(1), 41-50.

Masten, W. G., Plata, M., Wenglar, K., & Thedford, J. (1999). Acculturation and teacher ratings

of Hispanic and Anglo-American students. Roeper Review, 22(1), 64-65.

McBee, M. (In press). Minority representation in gifted programs: A school level analysis of race

and socioeconomics. Roeper Review.

McCoach, D. B. (2003). SEM isn't just the schoolwide enrichment model anymore: Structural

equation modeling (SEM) in gifted education. Journal for the Education of the Gifted,

27(1), 36-61.

Mills, B. C. (1983). The effects of socioeconomic status on young children's readiness for

school. Early Child Development & Care, 11(3), 267-273.

Mills, C. J., & Tissot, S. L. (1995). Identifying academic potential in students from under-

represented populations: Is using the Ravens Progressive Matrices a good idea? Gifted

Child Quarterly, 39(4), 209-217.

Naglieri, J. A., & Ford, D. Y. (2003). Addressing underrepresentation of gifted minority children

using the Naglieri Nonverbal Ability Test (NNAT). Gifted Child Quarterly, 47(2), 155-

160.

Naglieri, J., & Jensen, A. R. (1987). Comparison of Black-White differences on the WISC-R

and the K-ABC: Spearman's hypothesis. Intelligence, 11, 21-43.

Opdenakker, M.-C., & Van Damme, J. (2001). Relationship between school composition and

characteristics of school process and their effect on mathematics achievement. British

Educational Research Journal, 27(4), 407-432.


Peterson, J. S. (1999). Gifted--through whose cultural lens? An application of the postpositivistic

mode of inquiry. Journal for the Education of the Gifted, 22(4), 354-383.

Plata, M., & Masten, W. G. (1998). Teacher ratings of Hispanic and Anglo students on a

behavior rating scale. Roeper Review, 21(2), 139-144.

Portes, A., & MacLeod, D. (1996). Educational progress of children of immigrants: The roles of

class, ethnicity, and school context. Sociology of Education, 69(4), 255-275.

Quay, L. C. (1989). Interactions of stimulus materials, age, and SES in the assessment of

cognitive abilities. Journal of Applied Developmental Psychology, 10(3), 401-409.

Reid, C., Romanoff, B., Algozzine, B., & Udall, A. (2000). An evaluation of alternative

screening procedures. Journal for the Education of the Gifted, 23(4), 379-396.

Ryan, J. J., & French, J. R. (1976). Long-term grade predictions for intelligence and achievement

tests in schools of differing socio-economic levels. Educational & Psychological

Measurement, 36(2), 553-559.

Sarouphim, K. M. (1999). Discover: A promising alternative assessment for the identification of

gifted minorities. Gifted Child Quarterly, 43(4), 244-251.

Scott, M. S., Perou, R., Urbano, R. C., & Hogan, A. (1992). The identification of giftedness: A

comparison of White, Hispanic and Black families. Gifted Child Quarterly, 36(3), 131-

139.

Taylor, S. A., & Harris, K. C. (2003). School integration and the achievement test scores of black

and white students in Savannah, Georgia. North American Journal of Psychology, 5(2),

301-309.

Tyler-Wood, T., & Carri, L. (1993). Verbal measures of cognitive ability: The gifted low SES

student's albatross. Roeper Review, 16(2), 102-106.


West, C. A. (1985). Effects of school climate and school social structure on student academic

achievement in selected urban elementary schools. Journal of Negro Education, 54(3),

451-461.

Wilson, N. S. (1986). Counselor interventions with low-achieving and underachieving

elementary, middle, and high school students: A review of the literature. Journal of

Counseling & Development, 64(10), 628-634.

Winship, C. & Mare, R. D. (1983). Structural equations and path analysis for discrete data. The

American Journal of Sociology, 89(1), 54-110.

Worrell, F. C., Szarko, J. E., & Gabelko, N. H. (2001). Multi-year persistence of nontraditional

students in an academic talent development program. Journal of Secondary Gifted

Education, 12(2), 80-89.


CHAPTER 4

MULTILEVEL ANALYSIS IN GIFTED EDUCATION


Abstract

Multilevel data analysis techniques have become very important in educational research.

This paper introduces two varieties of multilevel analysis, hierarchical linear models and

multilevel structural equation models, to the gifted education community. Readers are assumed

to have knowledge of multiple regression techniques. The rationale and purpose of multilevel analysis are explained. A sample dataset is analyzed according to a variety of multilevel analysis

strategies. Example code for conducting multilevel analyses via the MPlus software package is

included.


Multilevel analysis is a type of data analysis that is currently receiving a great deal of

interest and enthusiasm in many social science disciplines. The past several years have witnessed an increase in multilevel analysis course offerings in graduate programs in education, as well as in intensive workshops. A number of fairly accessible resources in multilevel

analysis are now available, and both the number and the ease-of-use of multilevel analysis

software packages have increased. Though the number of publications using multilevel

modeling in education has been steadily increasing, the use of such techniques in gifted

education research is currently very limited. This article was written to introduce two varieties of multilevel analysis,

hierarchical linear modeling (Schreiber & Griffin, 2004) and multilevel structural equation

modeling (Heck, 2001), in a straightforward manner. While some equations must be presented

for understanding multilevel analysis, I have attempted to minimize the mathematical treatment

in favor of conceptual clarity. Understanding this article will require the reader to be familiar

with statistical analysis methods such as analysis of variance and multiple linear regression.

Readers who possess an understanding of structural equation modeling will find the section on

multilevel SEM quite easy to understand. Readers who are unfamiliar with SEM are referred to

McCoach’s (2003) beginner-friendly introduction.

What is Multilevel Analysis?

It will be useful to examine two concepts common to multilevel analysis before delving

deeper into this discussion. These two concepts are data clustering and levels of analysis.

Multilevel analysis approaches are appropriate and useful when data is clustered or has a

hierarchical structure. Clustered data occurs when research participants are grouped in some

way (Hox, 2002). In educational research, students are often situated within classrooms.


Students who are in classrooms together are likely to be more similar than students in different

classrooms for a variety of reasons. If one student gets sick, he could cause several other

classmates to get sick. If one student asks an insightful question, all the other students in that

class benefit from hearing the answer. In practice, the vast majority of data collected in

education is clustered. Higher levels of clustering are possible as well. Students attending the

same school may be more similar than students attending different schools due to differences in

funding, school leadership, and location. Students attending schools in the same district may be

more similar than students across districts due to differences in policy. When ordinary statistical

methods are applied to clustered data, the results may be incorrect or misleading (Raudenbush &

Bryk, 2002).

In traditional data analysis, the researcher is interested in measuring variables, forming

hypotheses, testing models, and making inferences concerned with a single level of analysis.

This level of analysis is usually individual students. For example, we might be interested in the

impact of a student’s verbal ability on performance in the gifted education classroom. Less

commonly, we may choose schools as our level of analysis. In that case, our research questions

would concern how the properties of schools affect school-based outcomes. We might ask how a

school’s per-pupil funding affects the school’s achievement test scores. Frequently, but not

always, the information examined at the school level is actually composed of aggregated

individual-level data. Schools themselves do not have achievement test scores, but the students

attending the schools do. The mean of these scores may be taken to represent the overall level of

academic achievement of students attending the school. Multilevel analysis differs from

traditional data analysis because it addresses research questions targeted at different levels of

analysis simultaneously (Hox, 2002). For example, multilevel analysis would allow us to extend


the previous example of the effect of verbal ability on gifted education class performance by also

examining the impact of the verbal abilities of the class on the student’s performance. In fact,

we can envision the variance in student performance as having two parts: one part due to the

student’s individual ability, and one part due to the effect of the ability of the other students by whom he is surrounded. Most gifted education professionals believe that grouping gifted

students together is beneficial. Multilevel analysis allows us to directly address research

questions involving the effects of grouping while simultaneously considering (or adjusting for) a

student’s individual ability. Data clustering generally poses statistical problems, while choosing

a level of analysis is primarily a conceptual issue.

Statistical Issues in Applying Unilevel Statistical Methods to Clustered Data

In practice, almost every dataset analyzed in educational research is composed of

clustered data because our research designs usually involve the collection of data from a few

hundred students distributed across several classrooms. Our datasets are commonly analyzed

with t-tests, analysis of variance (ANOVA), or multiple linear regression techniques. We are

aware that the accuracy of the results obtained from these techniques is predicated on a number

of assumptions, which may or may not apply to our data. In order to facilitate further discussion,

the basic assumptions of multiple linear regression will be enumerated. Of the three basic

statistical analysis techniques described above, regression is the most flexible. The t-test can be

conceptualized as a special case of ANOVA, and ANOVA can be thought of as a special case of

regression (Pedhazur, 1997). The major assumptions of regression are as follows:

1) The underlying relationship between the predictor variables and the outcome is linear.

2) The residuals (i.e., the prediction errors, Ŷ - Y) are assumed to be normally distributed.

3) Residuals have a constant variance (homoscedasticity).


4) Observations are independent (Raudenbush & Bryk, 2002).

Speaking pragmatically, we often proceed with our analysis even when we have evidence that

one or more of these assumptions has been violated. In gifted education research, for example, the scores of

our participants on measures of intellectual ability, creativity, or academic achievement are often

extremely skewed. Indeed, most theoretical conceptions of giftedness require that the students

we label “gifted” fall at the top end of one or more assessments. When examining published

gifted education research that has employed linear regression, attempts to confirm whether or not

these assumptions have been violated appear to be quite rare.

Clustered data violates the assumption of independent observations (Curran, 2003). The

assumption of independence means that the participants in the study do not affect one another,

and that variables are only related to each other in the ways specified in the statistical model that

was used. When this assumption is violated, incorrect estimates and standard errors of the model

parameters may result. In regression research, a t-test is associated with each independent

variable in the model. The magnitude of the independent variable’s effect is divided by its

standard error to produce the value of the t-test, which is used to determine the likelihood that the

effect of that independent variable is statistically significant (i.e., unlikely to have occurred by

chance). Because units in clustered datasets are usually more alike than they would be if the

units were independent, standard errors of the model parameters may be depressed. Depressing

the standard errors causes the overall t-ratio to become larger than it should be, increasing the

probability that the researcher will reject the null hypothesis when it is correct. In other words,

using traditional statistical approaches with clustered data may increase the risk of Type 1 error.
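A small simulation makes the point concrete. In the sketch below (my illustration, not taken from the original article), a classroom-level predictor that is truly unrelated to the outcome is tested with an ordinary regression that ignores the clustering; the null hypothesis is rejected far more often than the nominal five percent of the time.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n_sims, n_classes, n_students = 2000, 30, 20
    rejections = 0

    for _ in range(n_sims):
        x_class = rng.normal(size=n_classes)              # classroom-level predictor, unrelated to y
        u_class = rng.normal(scale=0.5, size=n_classes)   # shared classroom effect (the clustering)
        x = np.repeat(x_class, n_students)                # every student in a class shares its x value
        y = np.repeat(u_class, n_students) + rng.normal(size=n_classes * n_students)
        result = stats.linregress(x, y)                   # ordinary regression ignoring the clustering
        if result.pvalue < 0.05:
            rejections += 1

    print(rejections / n_sims)   # roughly 0.3-0.4 with these settings, far above the nominal 0.05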

The situation may become even more severe when researchers attempt to mix variables

measured at different levels of analysis in a unilevel statistical model. This practice was


common before the widespread adoption of multilevel statistical methods. Imagine a research

study wherein a researcher entered both students’ individual abilities as well as the average

ability of the students’ classmates as independent variables into a standard regression model. In

this case, each student will have a unique ability score, but all students in a classroom will have

identical classroom ability scores. The standard errors of both parameters are likely to be

affected by the data clustering, but the standard errors for the classroom ability variable will be

extremely depressed. Scullen (1997) pointed out that when ordinary Pearson correlation

coefficients are computed between variables measured at different levels of analysis, the

magnitude of the correlation coefficient may be inflated by as much as 70%.

One solution that does not violate the assumptions of traditional statistics is simply to avoid designs in which clustering is a problem by limiting the level of analysis to schools or classrooms instead of individuals. However, we are generally most interested in what

happens to students as individuals. Furthermore, to only consider classroom-level aggregate

indicators of student learning is to throw away a large amount of data, which reduces the

statistical power or sensitivity of the analysis. Finally, this approach is not without potential

hazards. There is some risk in attempting to draw conclusions about individuals based on

aggregated data obtained from a higher level of analysis. The ecological fallacy occurs when

group-level relationships are erroneously presumed to apply to the individuals comprising those

groups (Robinson, 1950). Variables can exert very different effects through different

mechanisms depending on which level of analysis is used. Examining only one level of analysis

may lead the researcher to draw incorrect conclusions under these circumstances. This is known

as aggregation bias (James, 1982; Walker & Catrambone, 1993). For instance, several studies of

the big-fish-little-pond effect (BFLPE; Marsh & Parker, 1984) have shown that though


individual student ability has a positive relationship with academic self-concept, the aggregate

ability of the classmates has a negative impact on self-concept (Marsh, 1987; Bachman &

O'Malley, 1986). This is understood through “frame of reference theory” which posits that

individuals make use of both objective data and social comparisons when making self-

judgments. Students that are high in ability will have received some objective evidence to this

end, which results in the high positive correlation between ability and academic self-concept.

Students that are situated in contexts where the average ability of their classmates is also high

will be comparing themselves against a tough standard and will therefore have a lower academic

self-concept than the same student would if surrounded by peers of lower ability. Studies

focusing exclusively on either the individual or aggregate levels of analysis would fail to detect

this interesting relationship. Studies focusing only on the school level would find the relationship between ability and self-concept to be weaker than the individual-level relationship actually is

because the social comparison aspect would be unaccounted for in the model to the extent that

bright students are grouped together. The same would be true if the study were conducted

exclusively at the individual level. The positive contribution of individual ability would be

masked by the negative effect of the context.

Introduction to Multilevel Analysis

Multilevel analysis refers to a family of data analysis techniques that are appropriate to

use with clustered data. There are two major variants of multilevel analysis: hierarchical linear

modeling and multilevel structural equation modeling. These two techniques were developed

separately but have much in common. To make things more confusing, a variety of names are

used in the literature for each technique. Hierarchical linear modeling is also known as

multilevel regression or simply multilevel modeling (Raudenbush & Bryk, 2002). Structural


equation modeling is also known as covariance structure analysis or causal modeling (McCoach,

2003). Structural equation models that do not include latent variables are called path models.

Conceptually speaking, hierarchical linear modeling may be thought of as a special case of

multilevel structural equation modeling. For the remainder of the paper, hierarchical linear

models will be abbreviated as HLMs, and multilevel structural equation models will be

abbreviated as ML-SEMs.

There are a number of types of multilevel models to address different research questions.

Perhaps the simplest type of multilevel analysis is very similar to ordinary regression, except that the analysis correctly handles clustered data. Other types of multilevel models explicitly

address research questions directed at multiple levels of analysis simultaneously. These models

will be addressed in more detail later in this paper. Furthermore, most types of multilevel

modeling require several steps. Finally, variants of multilevel models are able to correctly

handle many types of categorical or non-normal data.

A variety of software packages are available for conducting multilevel analyses. These

include SAS, LISREL, HLM, MLWin, MPlus, and several others. For consistency, Mplus

version 3.2 will be used for all analyses presented herein. Mplus is one of the most flexible

statistical modeling software packages currently available.

The basic idea in HLMs is that the standard regression equation is extended to correctly

handle clustered data as well as research questions that address multiple levels of analysis (Hox,

2002). The basic regression equation is:

(1) Yi = B0 + B1(Xi) + ei

The Y is the individual’s score on the outcome variable while X is the individual’s score on the

explanatory variable. The inclusion of the subscript i indicates that each individual in the dataset


receives a separate value of the parameter. The B0 and B1 terms are the intercept and slope

parameters, respectively, and are the same for all individuals in the dataset. The e term is the

error or residual term. The value of B0, the intercept, is interpreted as the mean or expected

value of Y for an individual with an X of zero. As an example, let us imagine that we have

constructed a regression equation explaining test performance in terms of IQ score. Our

regression equation is:

(2) TESTi = B0 + B1(IQi) + ei, where B0 = 20.2 and B1 = .60

Note the value of the intercept is 20.2, which is the expected test score for a student with an IQ

of zero. Since no students in our dataset have IQs of zero, the value of this intercept term is not

very meaningful. We can make the intercept meaningful by centering our X variable, or

rescaling it. If the average IQ score in our dataset is 100, we can subtract this mean from all the

IQ scores in our dataset to center the scores about the mean. Now a score of zero on the X

variable represents an IQ of 100. The value of the slope coefficient B1 is unaffected by the

centering, but the value of B0 is now the expected test score of an individual with an IQ of 100,

or in our example, 80.2. The B1 value of .60 represents the relationship between IQ score and

test performance. All regression programs automatically perform a significance test of the slope

coefficients to determine if the relationships they represent are significantly different from zero.

For the purposes of our example, imagine that our slope is highly significant.
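To make the interpretation concrete, consider a hypothetical student with an IQ of 110, ten points above the mean. Using the centered form of equation 2, the predicted test score is 80.2 + .60(110 - 100) = 86.2. The uncentered form yields the same prediction; centering changes only the meaning of the intercept, not the predicted values.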

Now imagine that our data were collected from two schools of different quality. In the

first school, the mean test score for a student with an IQ of 100 was 70.2. In the second school,

the mean test score for an IQ of 100 was 90.2. In other words, a student with an average IQ is

likely to score much higher on the test if he attends the second school. The standard regression

model presented above estimates a single B0, which is assumed to apply to all the students in the dataset.


In multilevel modeling terms, the coefficients in a standard regression equation are fixed; that is,

a single coefficient is estimated for all units in the study. Clearly, this is inappropriate in our

dataset. Our regression equation is now likely to substantially overpredict the test scores for

students attending the first school while underpredicting test scores for students attending the

second school. This prediction error is captured by the residual, e. Residuals in the standard

regression model are assumed to be normally distributed, but our residuals will be bimodally

distributed. What we need is the ability to estimate a separate B0 for each school. This is

precisely what multilevel modeling allows. Instead of a single intercept coefficient being

estimated that is presumed to apply equally to all schools, separate coefficients are estimated for

each school. We can augment our original regression equation with an additional equation that

predicts the value of B0, the intercept term in the original equation. This scenario can be

formalized as follows:

(3) Level 1: TESTij = B0j + B1(IQij) + eij

Level 2: B0j = γ00 + µ0j

Note that several parameters now have two subscripts. The subscript i indicates that the

parameter will have a unique value for each individual in the dataset. The subscript j indicates

that the parameter will have a unique value for each school in the dataset. We can refer to the

individual members of our dataset by labeling them the ith student in the jth school. In general,

it is fair to say that subscripts become much more important as well as much more confusing in

multilevel models. Our level two equation explains the value of the intercept in the level-1

model in terms of a level-2 variable.

Let us further explore the level-2 equation. The γ00 term represents the expected value of

the level-1 intercept B0j. It is the mean of all the school intercepts, or the mean of all the school


test score means, which is 80.2 in our example. The term µ0j is the level-2 error term. In our contrived example with only two schools, µ01 (the level-2 residual for the first school) would be –10.0 while µ02 would be 10.0. In more realistic models, many clusters or level-2 units (schools, in this example) would be sampled. In this case, we would expect each school's mean test score to be normally distributed about γ00. The inclusion of the level-2 residual µ0j is what allows each school to take on a unique intercept B0j. In multilevel modeling terminology, we say that the intercept is random, or allowed to randomly vary about γ00. Without it, the intercept is fixed at γ00. We are interested in the variance in µ0j for another reason. If there is very little variance in µ0j, this means that all schools have very similar intercepts. In this case, the results of a

multilevel analysis will be very similar to the results of a traditional unilevel analysis. The added

complexity of a multilevel analysis may not be justified under such conditions. Finally, because

no explanatory variables were included in the level-2 model, we say that it is unconditional.

Another concept that is frequently encountered in multilevel analysis is the intra-class

correlation coefficient (ICC; Raudenbush & Bryk, 2002). The intraclass correlation coefficient is calculated as the amount of variance in a variable that lies at level 2 (i.e., between clusters) divided by the total variance in the variable at both level 1 and level 2. The ICC can range from zero to one and indicates the proportion of variance in a variable that occurs across clusters. For this reason,

the ICC is often called the cluster effect. It is automatically calculated by most multilevel

modeling software packages. When the ICC is small, the results of a multilevel analysis will not

be very different from the results of a unilevel analysis. When the ICC is large, the parameter

values and standard errors estimated for a multilevel analysis are likely to be very different from

the values estimated by a unilevel analysis. The ICC is commonly used to determine whether the

added complexity and difficulty of a multilevel analysis is justified.
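In symbols, if τ00 denotes the variance of the level-2 residuals µ0j and σ2 denotes the variance of the level-1 residuals eij (the notation used by Raudenbush & Bryk, 2002), the ICC is calculated as

ICC = τ00 / (τ00 + σ2).

For example, with hypothetical variance estimates of τ00 = 40 and σ2 = 10, the ICC would be 40 / (40 + 10) = .80, indicating that 80% of the variance in the outcome lies between clusters.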


We have now examined a basic HLM. This example showed how a simple single-level

regression model may be extended to correctly model clustered data.3 This is a useful

application of multilevel modeling. However, the real power of multilevel modeling lies in its ability to address research questions that concern more than one level of analysis. The model we

examined earlier is able to address a research question focused only at the individual level. We

can easily extend the previous model to address true multilevel research questions in a

straightforward way.

Suppose we returned to our original dataset, which contained three variables: the school

ID code, the student’s IQ, and the student’s test score. From our data, we could easily calculate

another variable – the mean IQ score for each school. We determined that an individual

student’s IQ is positively related to that individual’s performance on the test. We also

determined that the average test score in the second school is much higher than the average test

score in the first school. We might naturally be curious about the difference between these

schools. Perhaps the average IQ of students in the second school is higher than the average IQ of

students in the first school. If this is true, it might explain some of the difference in the two

schools' test performances. We can address this research question by entering our new

SCHOOL_IQ variable into our level-2 model as follows:

(4) Level 1: TESTij = B0j + B1(IQij) + eij

Level 2: B0j = γ00 + γ01(SCHOOL_IQj) + µ0j

Adding the explanatory variable to the level-2 model changed the interpretation of γ00. Now,

instead of representing the overall grand-mean test score, it represents the expected mean test

3 Strictly speaking, to produce completely unbiased estimates, the level-1 explanatory variables should be group-mean centered rather than grand-mean centered. See Raudenbush & Bryk (2002), pp. 135-139. Group-mean centering means that the mean score for the variable within each level-2 unit is subtracted from all the scores of the individuals within that unit. The centering of level-1 variables is a complex issue that is beyond the scope of this manuscript.


score for a school with a mean IQ of zero. Again, this is unrealistic. We may center the mean

school IQ scores to facilitate the interpretation of γ00. After centering, γ00 would be interpreted as

the expected mean school test score for schools with an average school mean IQ.

The value of γ01 is interpreted exactly like the value of a slope coefficient in a standard

regression. It may be statistically tested to determine if the relationship it represents is unlikely

to result from chance. If it is significant, then we have explained some of the differences in

mean test scores between schools in terms of the ability composition of the two schools’

students. By adding a variable to the level-2 model, we have made it conditional. The variance

in the residual �0j will be decreased when a significant explanatory variable is added to the model

as compared to the variance in �0j as calculated in the unconditional model. By estimating the

unconditional model and the conditional model in separate steps and recording the value of �0j

from each step, the percentage of level-2 variance explained may be easily calculated. Figure 4.1

provides a graphic demonstration of the meanings of the various coefficients in the model.

Figure 4.1: Coefficients in the random intercept model


One more extension to HLMs needs to be introduced. The previous model included one

level-1 equation and one level-2 equation. The level-2 equation allowed the intercept B0j to vary

across schools. In multilevel modeling terms, we call this equation a random intercept equation.

Careful readers might wonder why a single value of the IQ slope in the level-1 model is presumed to apply to all schools. After all, we have already seen that it is

inappropriate to assume that a single intercept coefficient applies to all schools within our

dataset. Why would we assume that a single slope coefficient is applicable to all schools? The

relationship between IQ and test scores might vary across schools. We can formalize this

thinking as follows:

(5) Level 1: TESTij = B0j + B1j(IQij) + eij

Level 2 (intercept): B0j = γ00 + γ01(SCHOOL_IQj) + µ0j

Level 2 (slope): B1j = γ10 + µ1j

Note that we have added a subscript to the level-1 slope coefficient for IQ, denoting that we are

now estimating a separate slope coefficient for each school. We have also added an additional

equation describing how the level-1 slope coefficient varies across schools. This equation is

commonly called a random slope model. γ10 is the grand mean of the slope coefficients. Because of the inclusion of the residual µ1j, the value of B1j is allowed to vary randomly. Without it, our model would require that all schools have a slope equal to γ10. The level-2 slope model presented above is unconditional because we have not entered any explanatory variables. Practically speaking, we would use this model in order to examine the significance of the variance of µ1j. If the variance of µ1j is not statistically significant, then we could conclude that a single slope coefficient does indeed apply to all the schools. If it is significant, we would proceed with our analysis by

adding explanatory variables to the level-2 slope model to try to explain why different schools


have different slopes. We could enter SCHOOL_IQ as a predictor in our random slope model

just as we entered it into our random intercept model. Entering an explanatory variable into the

random slope model would change it from an unconditional model to a conditional model. The

model could be formalized as:

(6) Level 1: TESTij = B0j + B1j(IQij) + eij

Level 2 (intercept): B0j = γ00 + γ01(SCHOOL_IQj) + µ0j

Level 2 (slope): B1j = γ10 + γ11(SCHOOL_IQj) + µ1j

The coefficient γ11 describes the influence of the school's average IQ score on the relationship between individual IQ and test scores within that school. In other words, γ11 can be interpreted as a cross-level interaction. Imagine that the value of γ11 was estimated at .25, and that the value is statistically significant. Also recall that the SCHOOL_IQ variable was grand-mean centered to facilitate the interpretation of γ00 in the intercept model. If our overall school grand mean IQ is 100, then γ10 is the expected value of B1j for a school with an average IQ of 100. If the second school has a higher average IQ, then the strength of the relationship between individual IQ and test scores within that school will be greater than the relationship between IQ and test scores in the first school. Figure 4.2 illustrates the meaning of the coefficients in the random slope model. The dotted lines represent γ10, the overall mean slope for schools with an average IQ of 100. Note that the line for B12, the slope for the second school, is significantly steeper than the slope B11 for the first school. The difference between the schools' slopes B1j and the grand mean slope γ10 is represented by γ11.
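As a hypothetical illustration of the conditional slope model in equation 6, suppose that γ10 = .60 and, as above, γ11 = .25. For a school whose grand-mean-centered SCHOOL_IQ is 4 (i.e., a school mean IQ of 104), and ignoring the residual µ1j, the expected within-school slope would be B1j = .60 + .25(4) = 1.60. The relationship between individual IQ and test scores would therefore be considerably stronger in that school than in a school of average ability.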


Figure 4.2: Coefficients in the random slope model

Multilevel Structural Equation Modeling

Multilevel structural equation modeling (ML-SEM) is quite easy to understand once

HLM is understood. Though there are some significant differences in the underlying equations

and estimation methods, the two techniques are conceptually quite similar. Researchers might

choose ML-SEM over HLM for the same reasons that researchers choose SEM over standard

regression models. These reasons might include:

1) SEM is frequently used to test a theoretical model or a set of theoretical models against a

dataset. Numerous fit indices are calculated by SEM programs that indicate how well the

proposed model fits the data.

2) SEM allows researchers to propose a structure among explanatory variables, such that the

explanatory variables are able to influence one another as well as the outcome variable.

This allows the effect of one variable on another to be decomposed into a direct effect

and an indirect effect. This decomposition is often of theoretical interest.


3) SEM allows researchers to incorporate latent variables directly into the model. Latent

variables that are “captured” by multiple indicator variables are not subject to

measurement error as observed variables are. A researcher studying hope, for instance,

could administer several different scales to measure aspects of hope and then combine

these scales into a single latent variable. Each individual scale would include a certain

amount of measurement error, but when multiple measures are incorporated into a single

latent variable, the latent variable is free from measurement error because the unique

variance in each scale, which includes the measurement error, is not included in the

latent variable (McCoach, 2003).

Structural equation models are frequently described through graphic representations. In

SEMs, observed variables are contained in boxes, latent variables are contained in circles, causal

paths (i.e., X causes Y) are described by straight, single-headed arrows, and non-causal

correlations are described by curved, double-headed arrows. A complicating factor in SEMs is

that, in order to be able to successfully estimate a model, the number of parameters to estimate

must not exceed the number of unique pieces of information present in the variance-covariance

matrix of the dataset. In order to test the fit of a SEM, the number of unique pieces of

information must exceed the number of parameters to be estimated. Models that require the

estimation of exactly the same number of parameters as there are unique elements in the

variance-covariance matrix are called saturated models. All of the measures of model fit for

saturated models will indicate perfect fit. This does not mean that saturated models are

theoretically correct; it simply means that they cannot be tested. Regression models are a

special case of saturated structural equation models. Figure 4.3 illustrates a sample SEM. In this

example, SES is a latent variable with three indicators: annual income, father’s education, and


mother’s education. Socioeconomic status has a direct causal influence on ability and

motivation. Ability and motivation, in turn, affect the student’s test score. Socioeconomic status

does not directly affect test score; instead, it exerts an indirect effect through ability and

motivation. Figure 4.4 shows how a regression model is specified in a SEM context.

Figure 4.3: Example structural equation model

Figure 4.4: Regression model specified in a SEM context


Multilevel structural equation models are extensions of ordinary SEMs in the same way

that HLMs are extensions of the ordinary regression model. Hierarchical linear models begin

with a level-1 equation and then add level-2 equations that describe how the coefficients in the

level-1 equation vary across clusters. Multilevel SEMs begin with a level-1 model with

parameters that might vary across clusters. Level-2 models are added that attempt to explain the

variation in the level-1 model parameters.4 Generally these level-2 models will be presented in

separate graphic figures. In ML-SEM terminology, level-1 models are usually labeled within-

cluster models while level-2 models are usually labeled between-cluster models. Parameters in

within-cluster ML-SEMs that will vary across clusters (and therefore serve as outcomes in level-

2 models) are usually indicated by a small circle, a diamond, or another symbol. Similar to

HLMs, level-2 SEMs that indicate how a variable mean varies across clusters are referred to as

intercept models while level-2 SEMs that indicate how a path coefficient varies across clusters

are referred to as slope models.

Data Analysis Examples

Introducing the Dataset

To facilitate the discussion of multilevel modeling, a sample dataset will be analyzed

using standard regression, SEM, HLM, and ML-SEM approaches. The dataset was originally

based on NELS data but was substantially modified by the author. It includes four student-level

variables and five school-level variables. Variable descriptions may be found in Table 4.1.

4 Ordinary SEMs account only for the variances and covariances between variables. Variable means are generally not considered. However, SEMs may be easily extended to examine variable means. For the remainder of the discussion in this paper, it is presumed that SEMs with a meanstructure are being estimated. Without a meanstructure, there would be no intercept term that could vary across clusters.


Table 4.1: Variable descriptions for sample dataset

______________________________________________________________________________
Variable Name   Description
______________________________________________________________________________
Individual-level variables
  StudentID     Identification code for each student
  SchoolID      Identification code for each school
  SES           Standardized measure of student's SES background
  Motivation    Student motivation scale score
  Ability       Student IQ score
  Test          Student test score

School-level variables
  Sch_HWTime    Mean number of hours each week spent on homework
  Sch_Unsafe    Proportion of students within school that feel unsafe during school day
  Sch_SES       School mean SES*
  Sch_Mot       School mean motivation*
  Sch_Ability   School mean IQ*
______________________________________________________________________________
* Variable created by calculating each school's mean of an individual-level variable

Analysis Via Standard Regression

The individual-level variables from the dataset were entered into a standard regression

model with TEST as the outcome and SES, MOTIVATION, and ABILITY as explanatory

variables. The MPlus code for this analysis may be found in Appendix A. Each line of MPlus

code must end with a semicolon. The code listed under the DATA command tells MPlus the

location of the data file to be analyzed. The data file is in tab-delimited format without variable

labels. The code under the VARIABLE heading names the variables in the data file, indicates

which variables should be used, and performs any centering or transformations that might be

necessary. Note that we have grand-mean centered the explanatory variables in this analysis.


This allows the intercept to be interpreted as the expected test score for a student with an average

SES, ability, and motivation score. The MODEL heading contains the code describing the statistical model to be analyzed. The next line, TEST on ABILITY SES MOTIVATION, defines the regression model. The keyword "on" is used in MPlus to mean

“regressed on.” The line TYPE is MEANSTRUCTURE under the ANALYSIS heading causes

MPlus to calculate an intercept for the TEST variable. Finally, the STANDARDIZED option

under the OUTPUT heading is used to request standardized regression coefficients (betas) in

addition to the unstandardized coefficients. The model results may be found in the first column

of Table 4.2. All three variables are significantly related to the outcome. The three variables

explain about 53% of the variance in test scores.
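For readers without access to the appendices, the following sketch illustrates the general shape of such a specification. The file name and the shortened variable names are hypothetical, and the sketch assumes a data file containing only the individual-level variables; the actual syntax appears in Appendix A.

  DATA:      FILE IS sample.dat;                ! tab-delimited file without variable labels
  VARIABLE:  NAMES ARE stuid schid ses motiv ability test;
             USEVARIABLES ARE ses motiv ability test;
             CENTERING IS GRANDMEAN (ses motiv ability);
  ANALYSIS:  TYPE IS MEANSTRUCTURE;             ! request an intercept for the outcome
  MODEL:     test ON ability ses motiv;         ! "on" means "regressed on"
  OUTPUT:    STANDARDIZED;                      ! also print standardized coefficients (betas)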

Analysis Via Regression Accounting for Clustering

The next analysis examines the same regression model but correctly accounts for the

clustering of students within schools. MPlus code for this analysis may be found in Appendix B.

Note that the group-mean centering instead of grand-mean centering was requested for the

variables. The CLUSTER keyword is used to designate that variable that labels the clusters.

The WITHIN command is used to label the variables that should exist in the level-1 model only.

In general, all level-1 variables except the outcome should be declared as WITHIN. Under the

MODEL heading, we now have labels for the within (level-1) and between (level-2) portions of

the model. The regression model declared under the %WITHIN% heading is identical to the

model analyzed in the first regression. Under the %BETWEEN% label, only the outcome

variable is listed. In MPlus language, a variable name mentioned alone is used to refer to that

variable’s variance. By including the variance of TEST in the between part of the model, we are

asking MPlus to allow the intercept of TEST to vary randomly across schools.
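A minimal sketch of a comparable specification, again with hypothetical file and variable names, might look like the following; the actual syntax appears in Appendix B.

  DATA:      FILE IS sample.dat;
  VARIABLE:  NAMES ARE stuid schid ses motiv ability test;
             USEVARIABLES ARE ses motiv ability test;
             CLUSTER IS schid;                      ! identifies the level-2 units (schools)
             WITHIN ARE ses motiv ability;          ! level-1 explanatory variables only
             CENTERING IS GROUPMEAN (ses motiv ability);
  ANALYSIS:  TYPE IS TWOLEVEL;
  MODEL:     %WITHIN%
             test ON ability ses motiv;             ! level-1 regression
             %BETWEEN%
             test;                                  ! random intercept variance for test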


Table 4.2: Regression and HLM model results
______________________________________________________________________________
Model       Coefficient      Regression   Regression      HLM1        HLM2
                                          w/ Clustering   (int)       (int+slope)
______________________________________________________________________________
Level-1     SES                7.884*       0.985*         0.985*      1.040*
            SE(SES)             .408         .335           .334        .304
            Ability             .394*       0.248*         0.248*      0.443*
            SE(Ability)         .024         .016           .016        .072
            Motivation          .690*       0.443*         0.443*      0.248*
            SE(Motivation)      .137         .077           .076        .014
            R2                  .528*        .300*          .300*       .312*
______________________________________________________________________________
Level-2     Sch_HWTime                                     0.637       0.635
(intercept) SE(HWTime)                                      .356        .362
            Sch_Unsafe                                     8.486       8.481
            SE(Unsafe)                                     8.702       8.816
            Sch_SES                                        0.945       0.947
            SE(SES)                                        2.244       2.255
            Sch_Mot                                        4.208*      4.206*
            SE(Mot)                                        1.131       1.128
            Sch_Ability                                    0.949*      0.950*
            SE(Ability)                                     .193        .194
            R2                                              .834*       .834*
______________________________________________________________________________
Level-2     Sch_HWTime                                                 0.008
(slope)     SE(HWTime)                                                  .010
            Sch_Unsafe                                                 0.211
            SE(Unsafe)                                                  .159
            Sch_SES                                                    0.050
            SE(SES)                                                     .045
            Sch_Mot                                                    0.014
            SE(Mot)                                                     .021
            Sch_Ability                                               -0.003
            SE(Ability)                                                 .004
            R2                                                          .500
______________________________________________________________________________
* Parameter is significant, p < .05


The model we have specified is identical to the unconditional random intercept model previously

introduced in equation 3. The results of this model may be found in the second column of Table

4.2. Finally, the TWOLEVEL option was added under the ANALYSIS heading to request a

multilevel analysis. Notice the large discrepancies between the estimated parameter values and

standard errors for the two models. Additionally, the percentage of variance explained in TEST

has fallen to 30%. No variance at level-2 has been explained because no explanatory variables

were introduced into the level-2 model. The intra-class correlation coefficient (ICC) is .894,

indicating that nearly all of the variance in test scores is between clusters. This is the cause of

the large differences between the values estimated for the two models.

Analysis Via Random Intercept Hierarchical Linear Model

In the next analysis, we attempt to explain the variance in TEST across schools by

introducing variables into the level-2 intercept model. The MPlus code for this analysis may be

found in Appendix C. The school-level variables have been added to the USEVARIABLES line.

A line requesting that these variables be grand-mean centered was added, as well as a line

declaring that these variables operate on the between level. Under the model command, the

TEST variable was regressed on the school-level variables. The results of this analysis may be

found in the third column of Table 4.2. Note that the parameters for level-1 are nearly

unchanged as compared with the previous analysis. However, we now have estimates for the

effects of the school-level variables on the intercept. Only two variables, SCH_MOT and

SCH_ABILITY, were significantly related to the intercept. These two variables, however,

explained 83% of the between-schools variance in test scores.
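In sketch form, and continuing with the hypothetical shortened variable names used above, the additions amount to adding the school-level variables to the USEVARIABLES list, declaring them as BETWEEN variables, and regressing TEST on them in the between part of the model. Only the VARIABLE and MODEL commands are shown; the DATA and ANALYSIS commands are unchanged from the previous sketch, and the actual syntax appears in Appendix C.

  VARIABLE:  NAMES ARE stuid schid ses motiv ability test
                 sch_hw sch_uns sch_ses sch_mot sch_abl;
             USEVARIABLES ARE ses motiv ability test
                 sch_hw sch_uns sch_ses sch_mot sch_abl;
             CLUSTER IS schid;
             WITHIN ARE ses motiv ability;
             BETWEEN ARE sch_hw sch_uns sch_ses sch_mot sch_abl;
  MODEL:     %WITHIN%
             test ON ability ses motiv;
             %BETWEEN%
             test ON sch_hw sch_uns sch_ses sch_mot sch_abl;   ! conditional intercept model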


Analysis Via Random Intercept and Random Slope Hierarchical Linear Model

Finally, we examine a model to determine whether the strength of the relationship

between individual ability and test score varies across schools. Substantively speaking, this

analysis could indicate whether or not some schools are better than others at allowing students

with high ability to master more course material, and if so, to determine what characteristics of

the schools are related to this. Only a few additional lines of code are needed to run this

analysis. MPlus code for this analysis may be found in Appendix D. The first thing we must do

is to define the slope parameter in the level-1 model that we will allow to vary across schools. In

our case, this is the slope relating individual IQ to test performance. We do so by entering the

following line:

SLOPE | TEST on ABILITY;

The | symbol is used to label random slopes in MPlus. The slope that is allowed to randomly

vary is specified after the | symbol while a name for this slope is provided before the | symbol.

The name will be used in the level-2 model to identify this slope coefficient. We enter

explanatory variables for this slope in the level-2 model by regressing SLOPE on school-level

variables. Also, it is common to observe some level of correlation between random slopes and

random intercepts. To allow this correlation to be estimated, the command SLOPE WITH TEST

was entered into the level-2 model. In order to estimate the random slope model, the RANDOM

command had to appended to the analysis type. MPlus cannot calculate standardized coefficients

in random slope models, so the request for standardized coefficients was removed from the code.

Before running the conditional random slope model, an unconditional random slope

model was estimated. This was done in the same way that the unconditional intercept model was

requested, by listing the SLOPE label in the level-2 model specification without entering any


explanatory variables for it. By listing SLOPE by itself, we are requesting that MPlus estimate

the variance in SLOPE across clusters. The results indicated that the variance in SLOPE was

0.004, which is very small but statistically significant. In reality, most analysts would probably

not continue by estimating a conditional random slope model at this point due to the extremely

small amount of variance in SLOPE. I have chosen to continue with the analysis for illustrative

purposes.

The results of this model are presented in the fourth column of Table 4.2. Note that

allowing the ABILITY slope to randomly vary across clusters caused the level-1 coefficients to

change. If the unconditional random slope model had provided convincing evidence that random

slopes were needed, we would treat these estimates as being more trustworthy than the estimates

produced in the random intercept model. This is not the case for this particular analysis. The

estimated coefficients for the intercept model are nearly unchanged. None of the variables

entered into the random slope model were significant. Together, they explained 50% of the

(tiny) variance in the slope coefficient, but this amount was not large enough to emerge as being

statistically significant.

This concludes the comparison of regression and HLM approaches to multilevel analysis.

In real life, researchers would want to examine the other two level-1 slopes (SES and

MOTIVATION) for evidence of random variance across schools. This could be done by

specifying multiple random slopes in the MPlus code as follows:

%WITHIN%
S_Ability | Test on Ability;
S_Motivation | Test on Motivation;
S_SES | Test on SES;

%BETWEEN%
S_Ability;
S_Motivation;
S_SES;


Analysis Via Structural Equation Modeling

If we wanted to test a particular theory about how the variables in the model are related

or if we were interested in estimating indirect as well as direct effects, structural equation

modeling would be a better way to approach the analysis of our dataset. In the following

discussion, the same dataset will be analyzed via ordinary SEM, SEM accounting for clustering,

ML-SEM via an intercept model, and ML-SEM via slope and intercept models. The SEM

results will include path values and standard errors that are conceptually identical to the B values

in ordinary regression. The SEM results will also include several measures of model fit. The

model fit information that will be presented here includes the exact-fit test (chi square),

standardized root mean square residual (SRMR), root mean square error of approximation (RMSEA), and the comparative fit index (CFI).

The theoretical within-schools (level-1) model hypothesizes that student SES causes both

ABILITY and MOTIVATION, and that ABILITY and MOTIVATION cause the TEST score.

Note that SES is hypothesized to exert an indirect effect on TEST through ABILITY and

MOTIVATION, but not a direct effect. The path model with the estimated coefficients and

standard errors is described graphically in Figure 4.5. The coefficients, standard errors, and

model fit information are presented in the first column of Table 4.3.

Though a full discussion of the derivation and interpretation of the various model-fit

statistics in SEMs is beyond the scope of this article, suffice it to say that the model fit statistics

for this model indicate that the fit is quite poor. All of the path coefficients in the model are

statistically significant.


Figure 4.5: Results for single-level SEM

To calculate the indirect effect of SES on TEST, the intervening coefficients are

multiplied together. In this model, there are two possible indirect paths from SES to TEST. One

of these is through ABILITY. The indirect effect through ABILITY is equal to 8.119 * 0.532 =

4.319. The indirect effect through MOTIVATION is 1.552 * 1.653 = 2.565. The total indirect

effect of SES on TEST is 4.319 + 2.565 = 6.884.

The MPlus code to run this analysis is available in Appendix E. The only differences in

the code between the single-level regression model and the single-level path model are the

commands that specify the model. In the regression model, a single command regressed all the

explanatory variables directly on TEST. In the SEM, three such statements are needed.
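In sketch form, the MODEL command for the path model contains three regression statements rather than one (hypothetical shortened variable names):

  MODEL:     ability ON ses;          ! SES predicts ability
             motiv ON ses;            ! SES predicts motivation
             test ON ability motiv;   ! ability and motivation predict the test score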


Table 4.3: SEM and ML-SEM model results

______________________________________________________________________________
Model     From       To        SEM             SEM             ML-SEM1         ML-SEM2
                                               w/ Clustering   (int)           (int+slope)
______________________________________________________________________________
Within    SES        Ability   8.119 (.379)*   8.119 (.368)*   8.119 (.368)*   8.119 (.368)*
          SES        Mot.      1.552 (.066)*   1.552 (.065)*   1.552 (.065)*   1.552 (.065)*
          Mot.       Test      1.653 (.127)*   0.534 (.053)*   0.512 (.053)*   0.514 (.057)*
          Ability    Test      0.532 (.023)*   0.261 (.011)*   0.255 (.011)*   0.253 (.014)*
          R2 Ability           0.230*          0.230*          0.230*          -NA-
          R2 Motivation        0.266*          0.266*          0.266*          -NA-
          R2 Test              0.519*          0.353*          0.342*          -NA-
          χ2 (df)              514.353 (2)     511.647 (2)     279.567 (8)     -NA-
          SRMR(within)         0.107           0.097           0.045           -NA-
          SRMR(between)                        0.000           0.032           -NA-
          RMSEA                0.408           0.407           0.149           -NA-
          CFI                  0.768           0.688           0.918           -NA-
______________________________________________________________________________
Level-2   S_SES      S_Abl                                     10.044 (.619)*  10.044 (.619)*
(int)     S_SES      S_Mot                                     1.028 (.156)*   1.028 (.153)*
          S_Abl      S_Mot                                     0.068 (.013)*   0.068 (.013)*
          S_Unsafe   S_Mot                                     0.524 (.499)    0.524 (.506)
          S_Mot      S_HW                                      1.495 (.205)*   1.495 (.205)*
          S_Abl      Test                                      0.676 (.164)*   0.685 (.182)*
          S_HW       Test                                      0.659 (.299)*   0.595 (.332)
          S_Mot      Test                                      3.987 (1.269)*  3.996 (.959)*
          S_SES with S_Unsafe                                  -0.029 (.006)*  -0.029 (.006)*
          R2 Sch_Ability                                       0.733*          -NA-
          R2 Sch_Motivation                                    0.851*          -NA-
          R2 Sch_HWTime                                        0.506*          -NA-
          R2 Sch_Test                                          0.764*          -NA-
______________________________________________________________________________
Level-2   S_Abl      Slope                                                     -0.003 (.004)
(slope)   S_HW       Slope                                                     0.006 (.009)
          S_Mot      Slope                                                     0.032 (.022)
          R2 Slope                                                             0.500
______________________________________________________________________________
* Parameter is significant, p < .05


Analysis via Structural Equation Modeling accounting for Clustering

In the next analysis, we analyze a SEM that accounts for the clustering in the TEST

variable. This is done through an unconditional random intercept model. The results of this

model are presented in the second column of Table 4.3. The path coefficients and standard

errors from SES to ABILITY and MOTIVATION are nearly unchanged. However, the values

and standard errors for the paths leading to TEST are quite different. Also, the percentage of

variance explained in TEST has fallen from 51.9% to 34.2%. The model-fit information is also

nearly unchanged compared to the single-level path model. The SRMR, RMSEA, and chi-

square statistics indicate a slight improvement in model fit, though the CFI indicates a slightly

worse model fit.

The MPlus syntax to analyze this model may be found in Appendix F. We have now

requested a TWOLEVEL analysis type, have divided the model specification into WITHIN and

BETWEEN components, indicated the cluster ID variable, and labeled the variables that exist

only on the WITHIN level. We have requested that SES, ABILITY, and MOTIVATION be

grand-mean centered. Group-mean centering would be more appropriate to produce completely

unbiased level-1 coefficients and standard errors. Unfortunately, this model would not run when

group-mean centering was requested.

Note that the ABILITY and MOTIVATION variables are serving simultaneously as

explanatory variables and outcome variables, and as such, their intercepts may also vary across

schools. The model could easily be extended to allow the intercepts of these variables to vary

across clusters as well.
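A rough sketch of this specification, paralleling Appendix F but with the hypothetical shortened names used earlier, might be (DATA and VARIABLE commands as in the two-level regression sketch):

  ANALYSIS:  TYPE IS TWOLEVEL;
  MODEL:     %WITHIN%
             ability ON ses;
             motiv ON ses;
             test ON ability motiv;
             %BETWEEN%
             test;                    ! only the intercept of test varies across schools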


Analysis via Multilevel Structural Equation Modeling with a Random Intercept Model

In the next model, we will extend the unconditional intercept model from the previous

step by producing another SEM to attempt to explain the relationships between the school-level

variables. The new model is graphically depicted in Figure 4.6.

The intercept model hypothesizes that SCH_SES and SCH_UNSAFE cause SCH_MOT.

SCH_SES and SCH_UNSAFE are related, but one does not cause the other. SCH_SES causes

SCH_ABILITY. SCH_MOT causes SCH_HWTIME. SCH_ABILITY, SCH_MOT, and

SCH_HWTIME cause TEST. The model parameters are described in the third column of Table

4.3. Adding the intercept model greatly improved the model fit. Interestingly, the fit of the

within-schools model was improved according to the SRMR as compared with the previous

model. This dataset has a large ICC of .725, meaning that most of the variance in TEST happens

between schools. Until now, our models had assumed that all of this variance could be explained

by individual-level variables. This is impossible, and the model fit statistics reflected this.

Indeed, the model fit statistics now indicate reasonably good model fit. All of the path

coefficients in the model are significant with the exception of the path from SCH_UNSAFE to

SCH_MOT, indicating that the school motivation is not affected by the proportion of students in

the school that feel unsafe.

The MPlus syntax for estimating this model may be found in Appendix G. The only

addition to the code for estimating this model is the addition of the commands under the

MODEL heading for specifying the intercept model. The variable lists in the USEVARIABLES

and BETWEEN statements were revised to include the school-level variables.
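In sketch form, the between-schools additions translate the hypothesized paths directly into ON and WITH statements (hypothetical shortened names; the actual syntax appears in Appendix G):

  MODEL:     %WITHIN%
             ability ON ses;
             motiv ON ses;
             test ON ability motiv;
             %BETWEEN%
             sch_abl ON sch_ses;                   ! school SES predicts school ability
             sch_mot ON sch_ses sch_uns;           ! school SES and unsafety predict school motivation
             sch_hw ON sch_mot;                    ! school motivation predicts homework time
             test ON sch_abl sch_mot sch_hw;       ! predictors of the school-level test intercept
             sch_ses WITH sch_uns;                 ! non-causal correlation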


Figure 4.6: Results for ML-SEM random intercept model


Analysis via Multilevel Structural Equation Modeling with Random Slope and Intercept Models

For the final analysis, the model was extended to include a random slope in the level-1

model. The path coefficient from ABILITY to TEST was allowed to vary across clusters. This

model is described graphically in Figure 4.7.

The random slope is denoted by a diamond in the within-schools model. As with the

corresponding HLM with a random slope, an unconditional model was estimated first to

determine whether or not the slope coefficient actually varies across schools. As before, the

variance across schools was .004, which was statistically significant but too small to be of

substantive interest. As before, a researcher examining this dataset would probably not proceed

with a conditional random slope model.

The model coefficients and standard errors may be found in the fourth column of Table

4.3. The coefficients for the within-schools and the between-schools intercept model are largely

unchanged as compared with the previous analysis. In the intercept model, the path from

SCH_HWTIME to TEST has become nonsignificant due to the slight increase in the standard

error of that parameter. None of the variables included in the slope model were significant, nor

was the percentage of variance explained. Note that most types of model fit information are not

available when examining random slope models. The syntax for conducting this analysis may be

found in Appendix H. The TECH8 option included under the OUTPUT heading causes the

model optimization history to be printed to the screen during the model estimation. Random

slope models are frequently very computationally intensive and may take some time to estimate

even on powerful computers. Some models that the author has examined have taken more than

eight hours to estimate. The availability of real-time optimization history can assure researchers

that the computer program is running and not frozen.
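A sketch of the key additions for this final model, with the same hypothetical shortened names (the actual syntax appears in Appendix H), might be:

  ANALYSIS:  TYPE IS TWOLEVEL RANDOM;
  MODEL:     %WITHIN%
             slope | test ON ability;              ! random within-school slope for ability
             ability ON ses;
             motiv ON ses;
             test ON motiv;
             %BETWEEN%
             sch_abl ON sch_ses;
             sch_mot ON sch_ses sch_uns;
             sch_hw ON sch_mot;
             test ON sch_abl sch_mot sch_hw;       ! between-schools intercept model
             slope ON sch_abl sch_mot sch_hw;      ! between-schools slope model
             slope WITH test;
  OUTPUT:    TECH8;                                ! print the optimization history to the screen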


Figure 4.7: Results for ML-SEM random slope and intercept models


Discussion

The purpose of this article was to introduce multilevel modeling to the gifted education

community. It is hoped that this article will serve as an approachable introduction for

researchers who would like to begin working with multilevel analysis. At the very least, the use

of rudimentary multilevel analysis techniques can ensure that results generated from clustered

data are correct. As multilevel analysis becomes more mainstream in our field and others, it is

hoped that more sophisticated theories will emerge that are explicitly multilevel in nature. The

result can only be more sophisticated research and theory in our field that can provide more

complete, nuanced, and realistic explanations of educational phenomena.


References

Bachman, J. G., & O'Malley, P. M. (1986). Self-concepts, self-esteem, and educational

experiences: The frog pond revisited (again). Journal of Personality & Social

Psychology, 50(1), 35-46.

Curran, P. J. (2003). Have multilevel models been structural equation models all along?

Multivariate Behavioral Research, 38(4), 529-569.

Heck, R. H. (2001). Multilevel modeling with SEM. In G. A. Marcoulides & R. E. Schumacker

(Eds.), New developments and techniques in structural equation modeling (pp. 89-127).

Mahwah, NJ: Lawrence Erlbaum Associates, Publishers.

Hox, J. (2002). Multilevel analysis: Techniques and applications. Mahwah, NJ: Lawrence Erlbaum Associates.

James, L. R. (1982). Aggregation bias in estimates of perceptual agreement. Journal of Applied

Psychology, 67(2), 219-229.

Marsh, H. W. (1987). The big-fish-little-pond effect on academic self-concept. Journal of

Educational Psychology, 79(3), 280-295.

Marsh, H. W., & Parker, J. W. (1984). Determinants of student self-concept: Is it better to be a

relatively large fish in a small pond even if you don't learn to swim as well? Journal of

Personality & Social Psychology, 47(1), 213-231.

McCoach, D. B. (2003). SEM isn't just the schoolwide enrichment model anymore: Structural

equation modeling (SEM) in gifted education. Journal for the Education of the Gifted,

27(1), 36-61.

Pedhazur, E. J. (1997). Multiple regression in behavioral research: Explanation and prediction

(3rd ed.). New York: Harcourt Brace.


Raudenbush, S., & Bryk, A. (2002). Hierarchical linear models: Applications and data analysis

methods (2nd ed.). Thousand Oaks, CA: Sage.

Robinson, W. S. (1950). Ecological correlations and the behavior of individuals. American

Sociological Review, 15, 351-357.

Schreiber, J. B., & Griffin, B. W. (2004). Review of multilevel modeling and multilevel studies

in The Journal of Educational Research (1992-2002). Journal of Educational Research,

98(1), 24-33.

Scullen, S. E. (1997). When ratings from one source have been averaged, but ratings from

another source have not: Problems and solutions. Journal of Applied Psychology, 82(6),

880-888.

Stapleton, L. M., & Hancock, G. R. (2000). Using multilevel structural equation modeling with

faculty data.

Walker, N., & Catrambone, R. (1993). Aggregation bias and the use of regression in evaluating

models of human performance. Human Factors, 35(3), 397-411.


CHAPTER 5

SUMMARY AND FUTURE DIRECTIONS


This dissertation has focused on the analysis of a very large (N = 705,074) dataset

collected in 2004 by the Georgia Department of Education containing information on every

elementary school student enrolled in Georgia public schools. The purpose of this study was to

uncover new information about the underrepresentation of Black, Hispanic, and low-SES

students from all racial backgrounds in Georgia gifted and talented education programs. In spite

of years of research into this issue on the part of the gifted education community, a number of

questions remain regarding underrepresentation. It is my opinion that this lack of information

has severely impeded efforts to address the underrepresentation issue.

This chapter will take a somewhat different approach from the previous chapters.

Chapter one was written as a stand-alone literature review that attempted to describe the current

state of the field with respect to research on underrepresentation. Chapters two, three, and four

were written as stand-alone journal articles. Chapter two focused on the nomination stage of the

gifted identification process. It used a descriptive approach to examine the nomination rates for

various groups of students with respect to race and socioeconomic status. It also examined the

effectiveness of different nomination sources. Chapter three used multilevel structural equation

modeling to examine individual- and school-level variables that influence a student’s probability

of being identified for participation in the gifted education program. Chapter four was an

introduction to multilevel modeling methodologies written for an audience familiar with the

standard regression model. Whereas the previous chapters were written to stand alone, and often

with a particular journal’s audience in mind, this chapter will attempt to summarize the findings

from all the previous chapters and suggest directions for future research.

Summarizing the Nominations Study

The nominations study compared the performance of automatic (test-score based),

teacher, parent, peer, self, and other referrals and addressed two major research questions:


1. How do the various nomination sources compare with respect to overall quality as

indexed by overall accuracy, overall usefulness, and the phi correlation coefficient?

2. How do the nomination sources compare with respect to differential occurrence

across race and SES groups?

First, the results confirmed that underrepresentation remains a serious issue in Georgia.

The percentage of students identified exhibited extreme variation across racial groups, with

18.3% of Asian students being identified as gifted and only 2.3% of Hispanic students being

identified. White students, which were used as the baseline group in the study, were identified at

a rate of 7.9%, while 3.2% of Black students were identified.

With respect to the first research question, the study found that automatic referrals had

the best performance for all groups except for the high-SES Native American group, where

teacher referrals slightly outperformed automatic referrals. Overall, automatic referrals were

responsible for 57.1% of students identified for gifted program placement. Automatic referrals

were also the most accurate, with 86.3% of students receiving an automatic referral being

subsequently identified. Teacher referrals were the second highest quality referral source.

Teacher referrals were the referral source for 37.7% of gifted students overall. Teacher

referrals were also quite accurate, with 74.9% of referred students successfully passing the

testing stage to become identified as gifted. The remaining referral sources, which included

referrals from parents, peers, other adults, and the students themselves, were both rarely used and

comparatively inaccurate. Slightly less than 95% of gifted students were referred either

automatically or by the classroom teacher. However, it should be noted that an issue with the

dataset prohibits strong comparisons of quality between the referral sources. This is because in

Georgia schools, only a single nomination source may be listed for each student, and automatic


referrals always occur before other nomination sources are considered. Therefore, the nature of

the data collection process unfairly advantages automatic referrals relative to the other referral

sources.

With respect to the second research question, the study found that nomination rates

exhibited extreme variation across racial and SES groups. These disparities in nomination rates

mirrored the overall pattern of disparities in gifted program enrollments, as may be seen in Table

1. In fact, the study uncovered evidence that the referral stage carries much more responsibility

for the disparities in program enrollments than the subsequent testing stage. Although

differences in the testing stage pass rates are evident across racial and SES groups in Table 2.4,

these differences are much smaller than the differences in nomination rates across groups.

Equalizing the pass rates for the testing stage would have very little effect on the

underrepresentation of Black, Hispanic, and low-SES students, but equalizing the nomination

rates (while leaving the pass rates unchanged) would nearly eliminate the underrepresentation

problem.
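
The logic behind this claim can be made explicit with a simple decomposition: a group's identification rate is the product of its nomination rate and its testing-stage pass rate. The sketch below (Python) uses hypothetical nomination and pass rates, not the values reported in chapter two, chosen only so that the nomination gap is much larger than the pass-rate gap; under those conditions, equalizing pass rates barely changes the identification gap, while equalizing nomination rates nearly closes it.

# Illustrative decomposition with hypothetical rates (not values from chapter two).
# identification rate = nomination rate x testing-stage pass rate
groups = {
    "reference":        {"nom": 0.12, "pass": 0.80},
    "underrepresented": {"nom": 0.04, "pass": 0.70},
}

def gap(nom_u, pass_u):
    # Gap in identification rates between the reference and underrepresented group.
    ref = groups["reference"]["nom"] * groups["reference"]["pass"]
    return ref - nom_u * pass_u

baseline   = gap(groups["underrepresented"]["nom"], groups["underrepresented"]["pass"])
equal_pass = gap(groups["underrepresented"]["nom"], groups["reference"]["pass"])
equal_nom  = gap(groups["reference"]["nom"],        groups["underrepresented"]["pass"])

print(f"baseline gap:               {baseline:.3f}")    # 0.068
print(f"gap with equal pass rates:  {equal_pass:.3f}")  # 0.064 (little change)
print(f"gap with equal nominations: {equal_nom:.3f}")   # 0.012 (nearly closed)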

Summarizing the Identification Study

The identification study involved the creation of a multilevel structural equation model to

explain the probability of gifted identification in terms of a number of individual- and school-

level variables. Due to the complexity of the models proposed, the full results of the study will

not be replicated here, as this would require pages of figures and tables. However, some of the

more interesting aspects of the findings will be described.

One interesting aspect of the study is that it allowed the impact of race and SES on the

probability of gifted identification to be examined separately and together. In other words, the

model allowed the direct effect of race on the outcome, controlling for student SES, to be


examined. It also allowed the direct effect of SES on the outcome to be examined, independent

of race. In the proposed model, race could affect the probability of gifted identification in two

ways: directly, and indirectly through the SES variable. No previous study of underrepresentation published in the gifted education literature has used this approach.

Because the outcome in question is dichotomous, model parameters are expressed in logits. The

conversion of logits to odds ratios is quite straightforward and must be done in order to properly

interpret the model parameters. The odds ratios are expressed relative to White students, with an

odds ratio of 1.0 indicating that a group has the same odds of being identified as a White

“reference student.” The direct effect of being Black on the probability of gifted identification

corresponded with an odds ratio of 0.39. Therefore, being Black lowers the odds of

identification to just 39% of the odds of identification for a White student. Being Hispanic

lowers the odds of identification to 34% of those for a White student (an odds ratio of 0.34). Being Asian corresponds to an odds ratio of 3.19, indicating that Asian students have over three times the odds of being identified as gifted in a given year compared with a corresponding White student.
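
For readers less familiar with this conversion: an odds ratio is simply the exponential of the logit-scale coefficient, OR = exp(b). The sketch below (Python) illustrates the conversion; the logit value shown is inferred from the reported odds ratio of 0.39 rather than quoted from the model output.

from math import exp, log

# Logit-scale coefficient implied by the reported odds ratio of 0.39.
logit_black = log(0.39)        # approximately -0.94
odds_ratio = exp(logit_black)  # converts back to 0.39

print(f"logit = {logit_black:.2f}, odds ratio = {odds_ratio:.2f}")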

The only socioeconomic status information available in this study was whether the

student received free lunch, reduced-price lunch, or did not receive lunch aid. This is obviously

only a rough indicator of socioeconomic status, which is believed to be far more complex than a

simple index of annual income (Yang & Gustafsson, 2004). Nevertheless, this SES variable had

a powerful effect on the probability of gifted identification. Relative to a student not receiving

aid, students receiving reduced-price lunch had only 0.52 times the odds of gifted identification, and students receiving free lunch had only 0.28 times the odds. The SES variable was

also highly related to race. Black students had 5.54 times the odds of receiving lunch aid as

White students, while Hispanic students had 7.88 times the odds of receiving lunch aid.
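
If, as in a standard logistic specification without race-by-SES interaction terms, these effects combine additively on the logit scale, then the corresponding odds ratios multiply. For example, a Black student receiving free lunch would face combined odds of roughly 0.39 × 0.28 ≈ 0.11 relative to a White student not receiving lunch aid. This calculation is offered only as an illustration of how the reported coefficients compound; it assumes no interaction between the race and SES effects and is not a quantity reported by the model.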


The baseline probability of being identified as gifted was not the same across schools.

This was modeled via a separate structural equation model for the intercept. The study identified several school characteristics that explained part of this variability. Only three variables had

significant direct effects on the probability of identification at the school level. The incident-to-

student ratio, a variable describing the number of severe behavioral problems in the school

divided by the number of students in the school, had a significant negative effect on the school’s

average probability of identification. Surprisingly, the school lunch variable had a significant

positive direct effect. It was expected that the number of students in the school receiving lunch

aid would have a direct negative effect on the school’s mean probability of identification. The

percentage of the student body that is Black exerted a very weak positive direct effect as well.

Several variables exerted indirect effects through the school academic environment, a

composite variable representing the school ITBS achievement test scores. The academic

environment itself exerted a strong positive direct effect on the mean probability of gifted

identification. The percentage of students within the school that had been previously identified

as gifted did not exert a direct effect but did exert a positive indirect effect through the school

academic environment. The number of students receiving lunch aid exerted a negative effect on

the school academic environment and therefore exerted a negative indirect effect on the mean

probability of identification. The magnitude of this indirect effect was larger than the positive

direct effect. The percentages of students that were Black and Hispanic also exerted negative

effects on the school academic environment and thus had weak negative indirect effects on the

mean probability of identification. Finally, the percentage of the student body that was Asian exerted a positive effect on the school academic environment and a positive indirect

effect on the school mean probability of identification.
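
In path-analytic terms, each of these indirect effects is the product of its component path coefficients, and a variable's total effect is the sum of its direct and indirect effects. For the lunch-aid variable, for example, the indirect effect equals (effect of lunch aid on the academic environment) × (effect of the academic environment on the mean probability of identification), and its total effect is that product added to its positive direct effect. This restates the standard decomposition underlying the model rather than reproducing particular estimates.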


Students of all types had the highest probability of being identified as gifted in schools

that had few serious behavioral incidents and high achievement test scores. Schools that had

high achievement test scores tended to have a large number of previously identified gifted

students and did not have a high incidence of student poverty. These schools also tended to have

smaller Black and Hispanic populations and larger Asian populations, though the effects of

student body race were quite modest. Interestingly, none of the following had any effect on either the school academic environment (as operationalized) or the mean probability of gifted identification: the racial composition of the school's teachers, the percentage of teachers holding advanced degrees, the average number of years of teacher experience, the school size, the percentage of students who had been retained, and the percentage of students classified as migrant.

The study investigated whether the amount of “disadvantage” of being Black, Hispanic,

or receiving lunch aid varied across schools, as well as whether the amount of “advantage” of

being Asian varied across schools. The impact of being Black or Hispanic did not vary across

schools. The impact of receiving free or reduced-price lunch did vary across schools. These

research questions were addressed through structural equation models of the slope coefficients

from the individual-level model. Only two variables affected the impact of receiving free or

reduced-price lunch: the percentage of students in the school that had been previously identified

as gifted and the school academic environment. The more students had been previously

identified, the less receiving FRL negatively affected the probability of identification. Better

school academic environments led to more disadvantage for receiving FRL.

The degree of “advantage” for being Asian depended on three variables. First, the more

students in the school were receiving lunch aid, the more advantage Asian students experienced.


Second, higher percentages of Black teachers resulted in slightly less advantage for Asian

students. Finally, Asian students were less advantaged in schools with large student bodies.

The Emerging Picture of Underrepresentation

Though many unanswered questions remain about the underrepresentation phenomenon,

the studies described above have yielded some important findings that may be useful both for

informing practical attempts at addressing the problem and for future research.

The gifted identification process in most schools is a two-stage process. Logically

speaking, students must pass through both stages to gain access to gifted education services.

Nearly all of the current research with the goal of improving minority student representation in

gifted programs is based on the assumption that minority students may not be able to score

highly on psychometric ability and achievement measures. Gifted education scholars have

expended considerable effort in making modifications to traditional assessment schemes to

identify more minority students. Examples include assessment schemes based on

dynamic assessment (Kirschenbaum, 2004), non-verbal ability tests (Naglieri & Ford, 2003),

performance-based assessments (VanTassel-Baska, Johnson, & Avery, 2002), and assessments

based on Gardner’s (1983) theory of multiple intelligences (Sarouphim, 1999). The nomination

stage has inexplicably been overlooked in the literature, even though students do not get a chance

to be assessed if they do not receive a nomination. The results of the nomination study provided

strong evidence that inequalities at the nomination stage are the primary cause of

underrepresentation. These results are being corroborated by an agent-based simulation study of the gifted identification process, currently in progress, which began with the assumption that Black and White students have the same initial ability distributions (McBee, in progress). The simulation found that only a slight reduction in nomination validity, combined with as


few as 6 points of systematic downward bias at the nomination stage, could account for the

underrepresentation of Black students in gifted programs, assuming that the testing stage

performs equally well for both groups.
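
The logic of that simulation can be conveyed with a deliberately simplified sketch (Python). Every parameter below, including the ability distribution, the nomination validity, the size of the downward bias, and the two cutoffs, is a hypothetical placeholder rather than a setting from McBee (in progress); the sketch only shows how a modest nomination-stage bias can produce large identification disparities even when the testing stage treats both groups identically.

import numpy as np

rng = np.random.default_rng(1)
n = 200_000  # students per group (hypothetical)

# Both groups share the same true ability distribution.
true_ability = rng.normal(100, 15, size=(2, n))

# Stage 1: nomination. Nomination scores are imperfectly related to ability
# (validity < 1), and group 1 receives a hypothetical 6-point downward bias.
validity = 0.7
noise_sd = 15 * np.sqrt(1 - validity ** 2)   # keeps nomination-score SD near 15
bias = np.array([[0.0], [-6.0]])
nomination = 100 + validity * (true_ability - 100) \
             + rng.normal(0, noise_sd, size=(2, n)) + bias
nominated = nomination >= 115                # hypothetical nomination cutoff

# Stage 2: testing. The test is unbiased and identical for both groups.
test_score = true_ability + rng.normal(0, 5, size=(2, n))
identified = nominated & (test_score >= 120) # hypothetical identification cutoff

for g, label in enumerate(["unbiased nominations", "6-point downward bias"]):
    print(f"{label:22s}  nominated: {nominated[g].mean():.3f}  "
          f"identified: {identified[g].mean():.3f}")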

It appears that race and social class do make independent contributions to the

underrepresentation issue. Even after controlling for student socioeconomic status via the

student’s free or reduced-price lunch status, a three-level categorical variable, race had an

enormous impact on the probability of identification. Hispanic students experienced the largest

racial penalty followed closely by Black students. Asian students experienced a large racial

advantage. Race also had an enormous impact on the probability of receiving lunch aid, such

that most Black and Hispanic students face both racial and socioeconomic disadvantages with

respect to gifted program identification. The true independent effects of race and social class,

however, cannot be fully explored by this dataset. Socioeconomic status is generally

conceptualized as a composite of annual family income, parental education, and parental

occupation (Yang & Gustafsson, 2004). Socioeconomic status also has a historical component,

such that a family with highly educated adults who have a sizable income does not transition

from high to low SES if the income is suddenly reduced. The SES data available in this study

only considered one roughly categorized dimension of SES, family income. It is possible and

indeed likely that Black and White students or Hispanic and White students treated as

socioeconomically equal by this study (i.e., students in the same category of the FRL variable)

are actually quite different in terms of family income as well as SES as a whole. This would

especially apply to the “paid” lunch category, which only means that the student’s reported family income exceeded 1.85 times the federal poverty line for a given family size (US

Department of Agriculture, 2003). Obviously, great differences in income (and subsequent


opportunity) may exist within this category, differences that may be very meaningful in

terms of conferring additional educational support and opportunity. It is possible that some of

the variance that truly is caused by SES differences was attributed to race differences in the

study. Therefore, though this study has provided evidence that race and social class have

independent effects on the likelihood of gifted identification, the limitations of the data used do

not permit us to reach a final verdict.

Schools are not identical with respect to the success with which they identify gifted

students. A given student’s probability of being identified varies quite strongly across schools.

Seventeen percent of Georgia schools identified no gifted students during the year, while one

school identified 29% of its students as gifted. The mean percentage of students identified was 3.0%, with a standard deviation of 3.3 percentage points. Schools that identify large proportions of their students

each year have some characteristics in common. First, they are safe. Schools with lower

incident-to-student ratios identified more students. Second, the students attending the school

score highly on standardized achievement tests. Third, the schools serve relatively few economically disadvantaged students. These results should not be surprising. Schools wherein students have

the best chance of being identified are healthy, successful schools. This, of course, largely

depends on the school composition. The study found that 70% of the variance in school

standardized achievement test scores could be explained by the school composition. The

interpretation of these results is muddied because there was no individual-level measure of

ability or achievement available in the dataset. Therefore, ability, potentially the most important factor in whether or not a student was identified, was uncontrolled. Some variables in the model,

particularly the SES variables, at both the individual and school levels, may be carrying some


variance that rightly belongs to student ability, given the high correlation between

SES and ability found in previous studies.

One way to interpret these results would be to argue that schools that successfully

identify large numbers of students do so precisely because they are filled with identifiable

students (i.e., students whose true ability, motivation, achievement, and creativity exceed the

requirements for gifted participation), whereas schools that do not identify many students simply

do not have many students that are identifiable. This approach views the efficacy of the gifted

identification process as essentially the same across school settings, with the different probabilities of identification reflecting the true abilities of the students attending the

schools.

Another approach argues that there are identifiable students within all schools (though

perhaps not in the same numbers), and that some schools, by virtue of their leadership, location,

funding, and priorities, have committed more time and effort to identifying students and have

developed more effective strategies for doing so. This increase in time and effort means that

students who might be overlooked in other schools would be more likely to be identified. This

study is unable to distinguish between these causes. If individual-level ability data had been

available and included in the individual-level model, ability would have been controlled in the

school-level models as well. Without such data, conclusively disentangling these competing

explanations is not possible. However, the finding that the impact of receiving lunch aid or of

being Asian varied across schools lends some support to the hypothesis that, regardless of

differences in true student ability characteristics, some schools are more effective than others at

identifying gifted students.


Directions for Future Research

The studies included in this dissertation have significantly extended our knowledge of

the gifted identification process. The primary weakness of these studies stems from limitations of the dataset itself. Though the dataset was of unprecedented size and complexity, it was not

collected for the purposes of this study and therefore omitted some important variables that could

have addressed some of the unanswered questions described above. Future research could

resolve some of these questions by collecting a better dataset – a project that would surely

involve a large investment of time and financial resources.

It is likely that states other than Georgia collect large datasets of this type. This type of

work should be replicated in other states to determine the national characteristics of the

underrepresentation issue. Because Georgia operates under uniform rules and policies with

respect to gifted education, with comparatively little difference across districts, this study was

not able to address concrete policy issues. If similar data were collected from enough states, the

impact of various policy decisions could be assessed. The results could then be used to

empirically determine the most effective policies for identifying traditionally underrepresented

students.

Unfortunately, the main question that underlies work in this area remains largely

unanswered. That question is: to what extent are differential gifted program enrollments across

groups caused by differences in underlying ability, and to what extent are they caused by

inequalities in the assessment process? I am beginning to investigate this problem via an agent-

based simulation study that is currently in progress. A natural next step for this line of research

would be the pursuit of grant funding. Funding would allow researchers to assess all students

rather than relying on a nomination stage to select out a small group of students. The results of


this study provide evidence that this step alone could greatly equalize gifted program

enrollments.


References

Gardner, H. (1983). Frames of mind: The theory of multiple intelligences. New York: Basic

Books.

Kirschenbaum, R. J. (2004). Dynamic assessment and its use with underserved gifted and

talented populations. In A. Y. Baldwin & S. M. Reis (Eds.), Culturally diverse and

underserved populations of gifted students (pp. 49-62). Thousand Oaks, CA: Corwin

Press, Inc.

McBee, M. (in progress). Insights from an agent-based simulation of the gifted identification

process.

Naglieri, J. A., & Ford, D. Y. (2003). Addressing underrepresentation of gifted minority children

using the Naglieri Nonverbal Ability Test (NNAT). Gifted Child Quarterly, 47(2), 155-

160.

Sarouphim, K. M. (1999). DISCOVER: A promising alternative assessment for the identification of gifted minorities. Gifted Child Quarterly, 43(4), 244-251.

VanTassel-Baska, J., Johnson, D., & Avery, L. D. (2002). Using performance tasks in the identification of economically disadvantaged and minority gifted learners: Findings from Project STAR. Gifted Child Quarterly, 46(2), 110-123.

United States Department of Agriculture Food and Nutrition Service. (2003, March 13). Child

nutrition programs: Income eligibility guidelines. Federal Register, 68(9). Retrieved

from http://www.fns.usda.gov/cnd/Governance/notices/iegs/IEGs03-04.pdf

Yang, Y., & Gustafsson, J. E. (2004). Measuring socioeconomic status at individual and

collective levels. Educational Research & Evaluation, 10(3), 259-288.


Appendix A: MPlus code for regression analysis

Title: Example single-level regression
Data: File is "c:\SEM data\sample dataset.dat";
Variable: Names are StudentID SchoolID SES Motivation Ability Test
    Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
  Usevariables are SES Motivation Ability Test;
  Centering = grandmean(SES Ability Motivation);
Model: Test on SES Ability Motivation;
Analysis: Type is meanstructure;
Output: Standardized;


Appendix B: MPlus code for regression accounting for clustering

Title: Example regression accounting for clustering
Data: File is "c:\SEM data\sample dataset.dat";
Variable: Names are StudentID SchoolID SES Motivation Ability Test
    Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
  Usevariables are SES Motivation Ability Test;
  Centering = groupmean(SES Motivation Ability);
  Cluster is SchoolID;
  Within are Ability SES Motivation;
Model: %within%
  Test on Ability SES Motivation;
  %between%
  Test;
Analysis: Type is meanstructure twolevel;
Output: Standardized;


Appendix C: MPlus code for random intercept hierarchical linear model

Title: Example HLM intercept model
Data: File is "c:\SEM data\sample dataset.dat";
Variable: Names are StudentID SchoolID SES Motivation Ability Test
    Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
  Usevariables are SES Motivation Ability Test Sch_HWTime Sch_Unsafe
    Sch_SES Sch_Mot Sch_Ability;
  Centering = groupmean(SES Motivation Ability);
  Centering = grandmean(Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability);
  Cluster is SchoolID;
  Within are Ability SES Motivation;
  Between are Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
Model: %within%
  Test on Ability SES Motivation;
  %between%
  Test on Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
Analysis: Type is meanstructure twolevel;
Output: Standardized;


Appendix D: MPlus code for random slope and intercept hierarchical linear model

Title: Example HLM slope and intercept model
Data: File is "c:\SEM data\sample dataset.dat";
Variable: Names are StudentID SchoolID SES Motivation Ability Test
    Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
  Usevariables are SES Motivation Ability Test Sch_HWTime Sch_Unsafe
    Sch_SES Sch_Mot Sch_Ability;
  Centering = groupmean(SES Motivation Ability);
  Centering = grandmean(Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability);
  Cluster is SchoolID;
  Within are Ability SES Motivation;
  Between are Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
Model: %within%
  Test on SES Motivation;
  Slope | Test on Ability;
  %between%
  Test on Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
  Slope on Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
  Test with Slope;
Analysis: Type is meanstructure twolevel random;


Appendix E: MPlus code for single-level structural equation model

Title: Example SEM
Data: File is "c:\SEM data\sample dataset.dat";
Variable: Names are StudentID SchoolID SES Motivation Ability Test
    Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
  Usevariables are SES Motivation Ability Test;
  Centering = grandmean(SES Ability Motivation);
Model: Motivation on SES;
  Ability on SES;
  Test on Motivation Ability;
Analysis: Type is meanstructure;
Output: Standardized;


Appendix F: MPlus code for SEM accounting for clustering

Title: Example SEM with clustering
Data: File is "c:\SEM data\sample dataset.dat";
Variable: Names are StudentID SchoolID SES Motivation Ability Test
    Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
  Usevariables are SES Motivation Ability Test;
  Cluster is SchoolID;
  Centering = grandmean(SES Ability Motivation);
  Within are SES Motivation Ability;
Model: %within%
  Motivation on SES;
  Ability on SES;
  Test on Motivation Ability;
  %between%
  Test;
Analysis: Type is meanstructure twolevel;
Output: Standardized;


Appendix G: MPlus code for ML-SEM with random intercept model

Title: Example ML-SEM intercept
Data: File is "c:\SEM data\sample dataset.dat";
Variable: Names are StudentID SchoolID SES Motivation Ability Test
    Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
  Usevariables are SES Motivation Ability Test Sch_HWTime Sch_Unsafe
    Sch_SES Sch_Mot Sch_Ability;
  Cluster is SchoolID;
  Within are SES Motivation Ability;
  Between are Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
  Centering = grandmean(Ability Motivation SES Sch_HWTime Sch_Unsafe
    Sch_SES Sch_Mot Sch_Ability);
Model: %within%
  Motivation on SES;
  Ability on SES;
  Test on Motivation Ability;
  %between%
  Sch_Ability on Sch_SES;
  Sch_Unsafe with Sch_SES;
  Sch_Mot on Sch_SES Sch_Ability Sch_Unsafe;
  Sch_HWTime on Sch_Mot;
  Test on Sch_Ability Sch_HWTime Sch_Mot;
Analysis: Type is twolevel meanstructure;
Output: Standardized;


Appendix H: MPlus code for ML-SEM with random slope and intercept models

Title: Example ML-SEM intercept + slope
Data: File is "c:\SEM data\sample dataset.dat";
Variable: Names are StudentID SchoolID SES Motivation Ability Test
    Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
  Usevariables are SES Motivation Ability Test Sch_HWTime Sch_Unsafe
    Sch_SES Sch_Mot Sch_Ability;
  Cluster is SchoolID;
  Within are SES Motivation Ability;
  Between are Sch_HWTime Sch_Unsafe Sch_SES Sch_Mot Sch_Ability;
  Centering = grandmean(Ability Motivation SES Sch_HWTime Sch_Unsafe
    Sch_SES Sch_Mot Sch_Ability);
Model: %within%
  Motivation on SES;
  Ability on SES;
  Test on Motivation;
  Slope | Test on Ability;
  %between%
  Sch_Ability on Sch_SES;
  Sch_Unsafe with Sch_SES;
  Sch_Mot on Sch_SES Sch_Ability Sch_Unsafe;
  Sch_HWTime on Sch_Mot;
  Test Slope on Sch_Ability Sch_HWTime Sch_Mot;
Analysis: Type is twolevel random meanstructure;
  Algorithm = integration;
Output: Tech8;