
English Proficiency and Differences Among Racial and Ethnic Groups

in Mean SAT® and GRE® Scores: A Longitudinal Analysis

Maria Pennock-Roman

GRE Board Report No. 86-09cP

June 1999

This report presents the findings of a research project funded and carried out under the auspices of the Graduate Record Examinations Board

Educational Testing Service, Princeton, NJ 08541

Researchers are encouraged to express freely their professional judgment. Therefore, points of view or opinions stated in Graduate Record Examinations Board Reports do not necessarily represent official Graduate Record Examinations Board position or policy.

The Graduate Record Examinations Board and Educational Testing Service are dedicated to the principle of equal opportunity, and their programs, services, and employment policies are guided by that principle.

EDUCATIONAL TESTING SERVICE, ETS, the ETS logo, GRADUATE RECORD EXAMINATIONS, and GRE are registered trademarks of Educational Testing Service.

The modernized ETS logo is a trademark of Educational Testing Service. College Board and SAT are registered trademarks of the College Entrance Examination Board.

Educational Testing Service Princeton, New Jersey 08541

Copyright © 1999 by Educational Testing Service. All rights reserved.

Acknowledgments

This study was made possible by the support of the Graduate Record Examinations (GRE) Board. The idea of using merged records from the Scholastic Assessment Test (SAT) and GRE General Test programs for a longitudinal study was inspired by the study conducted by William Angoff and Gene Johnson that used a data set similar to the one used here. In the present study, Nancy Robertson merged the test records from the GRE and the SAT programs for individual examinees. Ka-Ling Chan and Mike Narcowich completed the data analyses for the project. Drafts of the precis, proposal, and final report were reviewed by William Angoff, Henry R. Smith III, Ron Flaugher, Howard Wainer, Tony Clewell, Ken Wilson, and Grant Henning, with additional comments by Don Powers. Also, important references for the literature review were suggested by Jane Faggen and Peggy Thome. I am very appreciative of their contributions.

Note: The author is now at The Pennsylvania State University at University Park

Abstract

Although limited proficiency in the language of a test is known to depress aptitude test scores, the changes that occur as proficiency rises over time have been less well studied. The objective here is to contrast longitudinal changes in test performance for persons who indicated that English was (EBL) or was not (ENBL) their best language. Analyses were based on a sample of U.S. citizens and permanent residents (N=65,987 EBL and N=1,592 ENBL); each individual had taken both the Scholastic Assessment Test (SAT)¹ and the Graduate Record Examinations (GRE) General Test some years later. The distance in verbal mean scores between the EBL non-Hispanic White group and the ENBL groups (such as non-Hispanic White, Asian American, Black, Puerto Rican, and Other Hispanic) narrowed by 0.21 to 0.48 standard deviation units from the taking of the SAT to the taking of the GRE. These findings have implications for the interpretation of test scores in longitudinal studies of linguistic minorities and for selective admissions of students with limited proficiency in English.

KEY WORDS:

English Proficiency; English (second language); Longitudinal Studies; Graduate Record Examinations; Scholastic Aptitude Test; Racial Differences

¹ At the time the participants in this study took the SAT, it was called the Scholastic Aptitude Test.

Results

Standardized Mean Differences Between Groups on the GRE and the SAT Tests

Group Differences in Regressions

English Proficiency and Differences Among Racial and Ethnic Groups in Mean SAT® and GRE® Scores: A Longitudinal Analysis

Introduction

Objective

It is well known that limited proficiency in the language of a test decreases aptitude test scores, but the magnitude of improvements over time as proficiency rises has been less studied. For persons limited in English proficiency who take selective admissions tests for U.S. institutions, one typically finds mean verbal scores substantially below norms. In contrast, mean quantitative scores for nonnative speakers of English are closer to norms or sometimes higher than those of native speakers. In this paper, the objective is to contrast longitudinal changes in the verbal and quantitative aptitude test performance of native speakers of English with those of nonnative speakers of English. The main question is: By how much can we expect the gap in mean verbal scores between groups of native and nonnative speakers to close during the college years when instruction occurs in the language of the test?

Background

Inasmuch as graduate level admissions tests serve such a crucial role in access to graduate education and to subsequent college faculty careers, the factors that influence performance on these tests for minority students need to be examined carefully. Growth or maintenance in the skills measured by verbal and quantitative tests during the undergraduate college years is of particular interest, yet has received comparatively little attention because of the scarcity of longitudinal data on test performance for undergraduates. National longitudinal follow-up studies of high school seniors are very limited in the sample sizes available to study the transition to graduate school for minority students (see, for example, the very small sample sizes in Hilton, 1990, and Hilton & Schrader, 1987, for the National Longitudinal Study of 1972, High School Seniors of 1972, and the High School and Beyond study of 1980 and 1982 seniors).

Two recent papers (Angoff & Johnson, 1988; Pennock-Roman, 1992b) are among the few that have examined changes in student performance from the senior year of high school past the last year of college for large samples of minority and majority students. Both studies, which have partially overlapping samples, found a very high degree of relationship (correlations in the mid .80s) between corresponding verbal and quantitative tests on the Scholastic Assessment Test (SAT)¹ and the Graduate Record Examinations (GRE) General Test for individuals tested 4 to 9 years apart. These values are in agreement with an extensive body of research demonstrating high correlations in repeated testing of individuals over time after the age of 3 (see reviews by Anastasi, 1988, pp. 337-338; Bloom, 1964; and Loehlin, Lindzey, & Spuhler, 1975, Chapter 6; and classic studies by Bradway, Thompson, & Cravens, 1958; Charles & James, 1964; Honzik, MacFarlane, & Allen, 1948; Owens, 1966; Tuddenham, Blumenkrantz, & Wilkin, 1968).

Angoff and Johnson found that curricular effects were small in relation to the large amount of variation attributable to initial student differences, a finding that is also consistent with similar, earlier investigations (e.g., Astin, 1968). Nevertheless, experiential effects were evident since quantitative skills tended to increase or be maintained at a higher rate for students majoring in fields involving mathematics or the physical sciences, in contrast to those majoring in less quantitative fields. Curricular effects for verbal skills were also in the expected direction but much smaller.

When Pennock-Roman (1992b) compared means among groups by race/ethnicity on the GRE and SAT, the longitudinal findings differed markedly from the results of cross-sectional comparisons of race/ethnic groups using the published mean scores for the SAT and GRE populations (Pennock-Roman, 1992). Unlike the cross-sectional findings, the distance between non-Hispanic White students and minority groups in standardized mean scores tended to be the same or to be smaller in GRE scores than in SAT scores.

The size of the relative gain in verbal scores for a particular group as compared with non-Hispanic White students was associated with the proportion of nonnative speakers of English in the group. In particular, the Puerto Rican group, which contained the highest proportion of nonnative speakers of English (55%), showed the greatest relative gains in verbal scores over time. Other groups that had few nonnative speakers of English tended to show no relative gain that was larger than predicted by regression effects.

The link with language background seems highly plausible for a variety of reasons. Angelis, Swinton, and Cowell (1979) and Sinnott (1980) found substantially lower completion rates for nonnative speakers as compared with native speakers on verbal sections of the GRE and the Graduate Management Admissions Test (GMAT). Research dating as far back as the 1930s leads us to expect that when undergraduate students residing in the U.S. who report that English is not their best language are exposed to instruction primarily in English, their skills in reading and vocabulary would increase relatively quickly. Working with elementary-school, Spanish-speaking children in the Southwest (grades 3 to 8), Sanchez (1932b) found large average gains in reading and intelligence scores with repeated testing over a 16-month period. Wilson (1987b) found sizable gains in English proficiency for adult and adolescent nonnative speakers who repeated the Test of English as a Foreign Language (TOEFL) at least once, with the interval between test and retest averaging about 12 months. These gains were larger for students who took their last TOEFL in the United States, and who could be assumed to have spent more time in the United States.

Furthermore, gains like those found with TOEFL are likely to generalize fairly well to the verbal GRE, the verbal GMAT, and other similar verbal tests because the correlations between these tests are sizable -- roughly .7 between the GRE verbal and TOEFL and between the GMAT verbal and TOEFL (Powers, 1980; Wilson, 1982, 1985). One would expect that nonnative speakers participating in English classrooms would increase their processing speed with practice and that their verbal test scores would rise. A lesser gain would be expected in the mathematical subtests as compared with verbal subtests.

Rationale

Although past research has established that increases in English proficiency are associated with gains in the verbal test performance in English of children who are nonnative speakers, the size of the gain cannot be predicted a priori in young adults. Further analyses are necessary to gauge how much improvement in aptitude scores can be found for those who are nonnative speakers relative to native speakers of English during the college years. The size of such gains observed under existing educational conditions provides a benchmark for longitudinal studies of linguistic minorities in which particular educational programs are evaluated for this age group.

As a means of measuring gains in test scores, the present design using available data has advantages and disadvantages when compared with a research study in which data are collected for a specific purpose. Because this design uses scores obtained primarily for use in actual undergraduate and graduate admissions, not for answering a particular research question, it has two notable advantages. First, the samples are very large. And second, the tests were administered under realistic conditions, ensuring that examinees were highly motivated to perform at their best. In contrast, the degree of effort expended by volunteers in experimental research studies for a test that will not determine grades or access to future schooling is unknowable. Also, it is not feasible to recruit very large samples for experimental, longitudinal studies.

One small disadvantage of this design is that, unlike measures in a test-retest design, GRE and SAT tests cannot be compared directly to gauge improvement because they are not parallel tests of the same difficulty. Despite some similarities, their normative populations were entirely distinct and the resulting scales were in different units.

We can compensate for this problem by comparing the test performance of each group on each test with a common reference group, such as the group of non-Hispanic White students for whom English is their best language, and who also took both tests. When there is a narrowing or expanding gap in means between the non-Hispanic White group and another group from one test to another, the gains or losses are relative to the gains or losses experienced by the contrast group in the underlying skills. Relative gains will be evident for nonnative speakers of English if they increase their reading and vocabulary skills in English at a rate faster than the normal growth experienced by students already proficient in English.

It is likely that any changes measured relative to the contrast group will underestimate the absolute gain that we would find with a test-retest design. Native speakers in the contrast group are probably increasing their verbal skills during the undergraduate years. Previous longitudinal studies tend to show slow but steady increases over time in verbal skills in their native tongue for adults. Wilson (1988) studied test-retest results for 3,614 adults re-taking the GRE after five or more years and found an average gain of 40 points on the verbal test (.35 standard deviation units) and 17 points on the quantitative test. Thus, the measure of relative gain to be used here will provide an estimate of growth in verbal skills for the nonnative speakers that is most likely a conservative one -- a lower bound. Although a lower bound is less ideal than a direct measure of gain, it is useful, nevertheless, especially when based on large samples tested under realistic conditions. Furthermore, the size of the relative improvement in test performance has direct application to the evaluation of the SAT and GRE scores of nonnative speakers in education.
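The lower-bound argument can be written out as a short sketch. The symbols below are introduced here for illustration and do not appear in the original report; the decomposition also assumes, hypothetically, that the gains of both groups could be expressed on one common standardized scale.

```latex
% g_abs: absolute gain of the nonnative-speaker (ENBL) group, in standard deviation units
% g_ref: gain of the native-speaker reference group over the same period
% g_rel: relative gain, i.e., the narrowing of the standardized gap between the groups
\[
  g_{\mathrm{rel}} = g_{\mathrm{abs}} - g_{\mathrm{ref}},
  \qquad
  g_{\mathrm{ref}} \ge 0 \;\Longrightarrow\; g_{\mathrm{abs}} \ge g_{\mathrm{rel}} .
\]
```

Under these assumptions, any growth in the reference group makes the observed relative gain smaller than the absolute gain, which is why the relative gain serves as a conservative estimate.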

There are two questions: (1) How large is the comparative improvement in test scores (as measured in English) during the college years for the nonnative speakers? (2) Does the relationship between the SAT and GRE vary depending on English proficiency if one examines the same persons longitudinally? Gains or losses in analytical aptitude as measured by the GRE analytical test are not considered because this test has no analogue in the SAT. For the analytical test, only mean distances are reported.

There are two expectations with regard to regressions of GRE on SAT tests. One expectation is that the prediction of later test performance will be less accurate for nonnative speakers as compared with native speakers because of intervening changes in English proficiency that are probably variable across individuals. Another expectation is that the pattern of relative weights in the regressions will be different in the two language groups. For example, it is possible that the mathematical subtest may have a higher slope coefficient in the regression equation to predict GRE verbal among nonnative speakers than among native speakers. For native speakers, verbal and quantitative tests are correlated about .6; although these tests assess distinct skills, they also tap common reasoning skills, and they are both influenced by quality of schooling and experience with taking tests. In nonnative speakers, the mathematical test may be a more accurate measure of underlying skills and educational experiences that affect performance on both tests because it is less dependent on English proficiency. Thus, the mathematical test could improve the prediction of future gains in verbal performance for nonnative speakers to a greater extent than is found with native speakers.

Method

Data Source

The data set involves GRE records from all five administrations in 1984-85 for which matching SAT test records were found in “End of Year” files from 1979-80 and 1980-81. Records were matched primarily by students’ social security numbers. These particular years were chosen for SAT files because they include scores from four to six years earlier, when the bulk of college seniors taking the 1984-85 GRE were seniors in high school. These tapes also include records dating back further than the year designated by the name. Specifically, the records that were matched included nine persons who took the SAT in 1974-75 (i.e., about 10 years before their GRE), 904 who took the exam in 1976-77, 9,610 who took it in 1978, and the bulk (more than 58,000) who took the test in the designated years (1979-81).

This data set includes all of the cases studied by Angoff and Johnson (1988), but is approximately 40,000 cases larger because it includes more than one year of SAT data and because some of their restrictions for selection were lifted. Unlike Angoff and Johnson, I included students for whom English was not their best language because the changes in test scores for students who reported having limited proficiency in English were a major focus of the study. These data are essentially the same as in the Pennock-Roman (1992b) study, but the analyses presented here are different. (The previous study focused on comparisons with cross-sectional data and did not separate groups by language background.)

Measure of English proficiency. The measure of English proficiency is the students’ response to the question on the SAT Student Descriptive Questionnaire (SATSDQ), “Is English your best language?” From this point forward, the notation EBL will be used to refer to the group that reported English was their best language, whereas the notation ENBL will be used to refer to the group that reported English was not their best language. For persons missing answers to the question on language on the SATSDQ, the classification was made on the basis of their answer to the GRE Background Information Questionnaire (GREBIQ), when available.

Of course there are within-group variations in English proficiency among members of the EBL and of the ENBL categories. The EBL group can be expected to include some persons who are speakers of more than one language, and the ENBL group may have considerable fluctuations in proficiency depending on native language and amount of instruction that they have had in the English language.

Sample sizes. In the matched data set, using the joint GRE and SAT questionnaire responses to classify race/ethnicity, the distribution of race/ethnicity among U.S. citizens and permanent residents taken together was as follows: 55,882 non-Hispanic White (55,386 EBL, 496 ENBL), 2,553 Asian American (2,042 EBL, 511 ENBL), 3,052 Black (2,986 EBL, 66 ENBL), 592 Mexican American (569 EBL, 23 ENBL), 508 Puerto Rican (244 EBL, 264 ENBL), 689 Other Hispanic (582 EBL, 107 ENBL), 77 American Indian (all EBL), 1,347 Other (1,253 EBL, 94 ENBL), and 720 examinees for whom an answer to the question on race/ethnicity was missing (689 EBL, 31 ENBL).²

For the number of variables being considered, a minimum of 50 cases per group is desirable for each regression. Thus, there are a sufficient number in all of the EBL and most of the ENBL classifications to permit separate regression analyses within EBL and within ENBL groups. The exceptions are the ENBL Mexican American and the ENBL Missing groups, which had very small sample sizes.

Regression Analyses

Multiple regression analyses were run treating EBL and ENBL groups separately but excluding the very small ENBL Mexican American and Missing groups. Since all of the American Indian students said that English was their best language, there were only six ENBL groups and nine EBL groups in all. Tests of differences in regressions were carried out by contrasting the EBL non-Hispanic White group against each of the eight other EBL groups and against each of the six ENBL groups.

GRE scores were predicted from both SAT verbal and mathematics scores. Most of the prediction of the verbal GRE test is achieved from SAT verbal, but the bivariate prediction is more accurate, particularly for the ENBL groups. Analogously, GRE quantitative is predicted mostly from SAT mathematical, with a small contribution from SAT verbal.
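As an illustration of the two-predictor regressions described above, the following minimal sketch fits a GRE score on SAT verbal and SAT mathematical scores by ordinary least squares within a single group, and reports R-squared and the standard error of estimate. The function name, argument names, and the use of n - 3 residual degrees of freedom are assumptions made for this sketch; the report does not describe the software actually used.

```python
import math

def fit_gre_on_sat(sat_verbal, sat_math, gre):
    """Ordinary least squares fit of a GRE score on SAT verbal and SAT math.

    Hypothetical helper for illustration only; expects three equal-length
    lists of scores for the members of a single group.
    """
    n = len(gre)
    mean_v = sum(sat_verbal) / n
    mean_m = sum(sat_math) / n
    mean_g = sum(gre) / n

    # Centered sums of squares and cross-products.
    s_vv = sum((v - mean_v) ** 2 for v in sat_verbal)
    s_mm = sum((m - mean_m) ** 2 for m in sat_math)
    s_vm = sum((v - mean_v) * (m - mean_m) for v, m in zip(sat_verbal, sat_math))
    s_vg = sum((v - mean_v) * (g - mean_g) for v, g in zip(sat_verbal, gre))
    s_mg = sum((m - mean_m) * (g - mean_g) for m, g in zip(sat_math, gre))

    # Solve the 2 x 2 normal equations for the two slopes.
    det = s_vv * s_mm - s_vm ** 2
    b_verbal = (s_mm * s_vg - s_vm * s_mg) / det
    b_math = (s_vv * s_mg - s_vm * s_vg) / det
    intercept = mean_g - b_verbal * mean_v - b_math * mean_m

    # Residual and total sums of squares.
    sse = sum(
        (g - (intercept + b_verbal * v + b_math * m)) ** 2
        for v, m, g in zip(sat_verbal, sat_math, gre)
    )
    sst = sum((g - mean_g) ** 2 for g in gre)

    return {
        "intercept": intercept,
        "b_verbal": b_verbal,
        "b_math": b_math,
        "r_squared": 1.0 - sse / sst,
        # Assumes n - 3 residual degrees of freedom (two slopes plus intercept).
        "see": math.sqrt(sse / (n - 3)),
    }
```

Running the function once per group would yield the kind of within-group intercepts, slopes, R-squared values, and standard errors of estimate that the report tabulates.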

Results

Standardized Mean Differences Between Groups on the GRE and the SAT Tests

Tables 1 and 2 show the mean differences between each minority group and the EBL non-Hispanic White group on the SAT and GRE tests. Each table has three indexes. RDIFF is the raw scale-score difference between the means for the minority group and the non-Hispanic White EBL group on the given test. These differences were standardized in two ways. For the ZDIFF index, the raw scale-score difference was divided by the corresponding standard deviation of the total sample of examinees on that test. The d index, or effect size (Cohen, 1988, pp. 19-44), divides the raw scale-score difference by a pooled estimate based on the two groups’ standard deviations. Both indexes tend to be close in value, although the d index tends to be somewhat larger because its denominators (the pooled within-group standard deviations) are usually smaller than the denominator for the ZDIFF index (the total standard deviation). The d index is reported because it is more commonly used in research on pair-wise group differences. In the present situation, where there are more than two groups and all have taken the same test, the ZDIFF indexes are more comparable across groups because they all have the same denominator.

² These sample sizes are slightly smaller than in the prior report (Pennock-Roman, 1992b) because persons who did not indicate their language preference were excluded. Also, there are minor differences in the percentage distribution of persons who are EBL versus ENBL, because the present analyses used the SATSDQ classification primarily whereas the former report used the GREBIQ. The percentage differences depending on the classification are minor (less than 4 percentage points), and reflect mostly differences in missing values.
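The RDIFF, ZDIFF, and d indexes described in this section can be sketched as follows. The function name and arguments are hypothetical, the sign convention (focal group minus reference group) is an assumption, and the degrees-of-freedom-weighted pooling of the two variances is one common choice that the report does not confirm.

```python
import math
from statistics import mean, variance

def mean_difference_indexes(focal, reference, total_sd):
    """Return (RDIFF, ZDIFF, d) for one focal group on one test.

    focal, reference: lists of scale scores for the two groups.
    total_sd: standard deviation of the total sample on the same test.
    """
    # RDIFF: raw difference in means (assumed sign: focal minus reference).
    rdiff = mean(focal) - mean(reference)

    # ZDIFF: raw difference divided by the total-sample standard deviation.
    zdiff = rdiff / total_sd

    # d: raw difference divided by a pooled within-group standard deviation.
    # Assumption: degrees-of-freedom-weighted pooling of the two variances.
    n_f, n_r = len(focal), len(reference)
    pooled_var = (
        (n_f - 1) * variance(focal) + (n_r - 1) * variance(reference)
    ) / (n_f + n_r - 2)
    d = rdiff / math.sqrt(pooled_var)

    return rdiff, zdiff, d
```

Because ZDIFF uses one denominator per test while d uses a different pooled denominator for each pair of groups, the two indexes diverge most when a focal group's spread differs from that of the total sample.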

Insert Tables 1 and 2 about here.

Table 1 contrasts all other ENBL groups against the non-Hispanic White EBL group. Table 2 contrasts the EBL non-Hispanic White group against all other EBL groups. In the last two columns of each table, the differences between corresponding GRE and SAT indexes are shown. As can be seen, the differences in the last two columns of Table 1 tend to be much larger in a positive direction than the corresponding results for EBL groups of Table 2. That is, there was a much more substantial closing of the gap in mean differences over time for the ENBL groups, particularly on the verbal test.

ENBL groups contrasted to EBL non-Hispanic White group. Among groups having 50 or more members, the largest relative improvement occurred for the ENBL Asian American and Puerto Rican groups on the verbal tests. It can be seen from the d indexes in Table 1 that on the SAT verbal, the Asian American group was about 1.8 standard deviation units below the EBL non-Hispanic White group; in contrast, the GRE verbal mean for Asian Americans was below the non-Hispanic White mean only by 1.4 standard deviation units. Thus the net relative improvement on the verbal test was 0.4 standard deviation units. Similarly, on SAT verbal, the Puerto Rican ENBL group mean was 1.7 standard deviation units below the non-Hispanic White group mean, whereas on the GRE verbal, the Puerto Rican group mean was only 1.3 standard deviation units below the non-Hispanic White group mean. The gap was closed by 0.4 standard deviation units. On the quantitative tests, the relative improvement was 0.1 standard deviation units for the Asian American group and 0.2 standard deviation units for the Puerto Rican group.
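The arithmetic behind these relative improvements can be sketched with the rounded verbal d indexes quoted in the paragraph above. The function name is hypothetical, and the negative signs are an assumed convention (focal group minus reference group) chosen so that a narrowing gap yields a positive result.

```python
def relative_improvement(sat_index, gre_index):
    """Change in a standardized gap from the SAT to the GRE.

    Both arguments are standardized mean differences (focal minus reference),
    so a group scoring below the reference group has negative values and a
    positive result means the gap narrowed.
    """
    return gre_index - sat_index

# Rounded verbal d indexes quoted in the text (signs assumed negative).
verbal_gaps = {
    "ENBL Asian American": (-1.8, -1.4),
    "ENBL Puerto Rican": (-1.7, -1.3),
}

improvements = {
    group: round(relative_improvement(sat_d, gre_d), 1)
    for group, (sat_d, gre_d) in verbal_gaps.items()
}
```

Both groups come out at 0.4 standard deviation units, matching the values stated in the text.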

The relative improvements in verbal and quantitative scores for the other ENBL groups were smaller but still notable. Disregarding the very small Mexican American group, the improvement ranged from about 0.2 to 0.3 standard deviation units on the verbal test, and from 0.1 to 0.3 standard deviation units on the quantitative test. The ENBL Mexican American and Missing groups had fewer than 35 persons each; therefore, those means can be expected to have large sampling error variance. While the Mexican American group had atypical values compared with the other groups, the Missing group had values similar to the Puerto Rican and Asian American groups.

EBL groups contrasted to EBL non-Hispanic White group. Results for the EBL groups in Table 2 show very little change among group differences in GRE scores compared with SAT scores. For the Asian American, Black, Puerto Rican, American Indian, Other, and Missing groups, the d indexes for GRE means were essentially equal to corresponding values for SAT d indexes. The near-zero improvements were close to the results one would expect if these groups had the same regression of GRE scores on SAT scores that is found for the non-Hispanic White group. A small narrowing of the group differences can be expected because the relationship between SAT and GRE scores is strong, but less than perfect, and there is regression to the mean.
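A simplified, single-predictor sketch shows why a small narrowing is expected even when groups share the same regression. The symbols are introduced here for illustration only: z denotes a score standardized within the reference population, r the SAT-GRE correlation, and delta a group's mean SAT distance from the reference group. The value r = .85 echoes the mid-.80s correlations cited in the Background, the value delta = -1.0 is hypothetical, and the report's own regressions use two predictors rather than one.

```latex
% Requires amsmath for \text.
\[
  E\!\left[z_{\mathrm{GRE}} \mid z_{\mathrm{SAT}}\right] = r\, z_{\mathrm{SAT}}
  \quad\Longrightarrow\quad
  \bar{z}_{\mathrm{GRE}} = r\,\delta ,
  \qquad
  \text{expected narrowing} = (r - 1)\,\delta .
\]
% Example: r = .85 and \delta = -1.0 give \bar{z}_{GRE} = -0.85,
% that is, an expected narrowing of 0.15 standard deviation units.
```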

Slightly larger relative improvements (about 0.1 standard deviation units) were evident on the verbal and quantitative tests for the Mexican American and Other Hispanic groups. These latter groups are likely to contain individuals who may also speak a language other than English at home, even if they judge themselves to be more fluent in English.

Group Differences in Regressions

Group differences in regressions are discussed in two ways: statistically significant contrasts, and substantive differences in the predictions. Given the large sample sizes, most differences in the regressions between the EBL non-Hispanic White and other groups are, in fact, statistically significant, but it is not clear whether they are nontrivial. Unfortunately, the usual indexes for effect size in the difference between regressions are not useful here because the non-Hispanic White group is so much larger than the others. Any contribution of group membership to the overall prediction when groups are pooled is bound to be tiny because each focal group constitutes a small fraction of the whole.

In order to evaluate which results are nontrivial, the predicted GRE scores based on the EBL non-Hispanic White group equation are compared with scores predicted by the equation for that particular group. These analyses also aid us in picturing the differences in two dimensions; because the regressions involve three variables, they cannot be rendered visually in a two-dimensional graph. The predictions are compared at three points: at the means for SAT verbal and SAT mathematical, at one standard deviation unit below the means, and at one standard deviation unit above the means. Within-group regressions are shown in Table 3, and the differences in predictions are shown in Table 4 and Table 6 for GRE verbal and in Table 5 and Table 7 for GRE quantitative.
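The comparison of predictions at three points can be sketched as follows. The class and function names are hypothetical, any coefficients supplied are illustrative rather than taken from Table 3, and the sign convention (reference-group prediction minus own-group prediction, so that a negative value indicates underprediction by the reference equation) is an assumption.

```python
from typing import NamedTuple

class Equation(NamedTuple):
    """Within-group regression of a GRE score on SAT verbal and SAT math."""
    intercept: float
    b_verbal: float
    b_math: float

    def predict(self, sat_verbal, sat_math):
        return self.intercept + self.b_verbal * sat_verbal + self.b_math * sat_math

def prediction_differences(reference, own, mean_v, sd_v, mean_m, sd_m):
    """Compare two equations at -1 SD, the means, and +1 SD on both SAT tests.

    Returns a list of (k, reference_prediction, own_prediction, difference)
    tuples, where k is the offset in SD units and difference is the
    reference-group prediction minus the own-group prediction.
    """
    rows = []
    for k in (-1, 0, 1):
        sat_v = mean_v + k * sd_v
        sat_m = mean_m + k * sd_m
        ref_pred = reference.predict(sat_v, sat_m)
        own_pred = own.predict(sat_v, sat_m)
        rows.append((k, ref_pred, own_pred, ref_pred - own_pred))
    return rows
```

Evaluating both equations at the same SAT values reduces a three-variable comparison to three numbers per group, which is what makes the differences easy to tabulate.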

Insert Table 3 about here.

Statistical contrasts: Accuracy of prediction. From Table 3, it can be seen that there is a trend toward slightly greater accuracy of prediction for the EBL groups on both GRE tests than for ENBL groups. For predicting GRE verbal score, the values of R-squared range from .7103 to .7604 for the EBL groups in contrast to .4768 to .6673 for the ENBL groups. This pattern of lesser accuracy for ENBL groups is also seen with the standard errors of estimate (S.E.E.), which are somewhat larger for ENBL groups (53.08 to 71.47) than for EBL groups (53.64 to 58.88). For three out of six groups (ENBL non-Hispanic White, ENBL Asian American, and ENBL Other) the F ratios of the squared S.E.E.s (each ENBL group contrasted with the EBL non-Hispanic White group) are significant at the .05 level or better. In terms of predicting GRE quantitative, the R-squared values range from .6767 to .7670 for EBL groups in contrast to a range of .5747 to .7112 for ENBL groups. Similarly, the S.E.E.s tend to be larger for the ENBL groups (ranging from 59.78 to 78.06) than for the EBL groups (ranging from 60.33 to 71.45). For three groups -- ENBL Black, ENBL Puerto Rican, and ENBL Other -- the S.E.E.s are significantly larger than the EBL non-Hispanic White group S.E.E. at the .05 level or better.
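The F ratio of squared standard errors of estimate mentioned above can be written as a brief sketch. The symbols are introduced here for illustration; placing the ENBL group in the numerator and using n - 3 residual degrees of freedom (two predictors plus an intercept) are assumptions, since the report does not print the formula.

```latex
% s_g:   standard error of estimate for ENBL group g
% s_ref: standard error of estimate for the EBL non-Hispanic White group
% n_g, n_ref: the corresponding group sample sizes
\[
  F = \frac{s_{g}^{2}}{s_{\mathrm{ref}}^{2}},
  \qquad
  \mathrm{df} = \left(n_{g} - 3,\; n_{\mathrm{ref}} - 3\right).
\]
```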

ENBL group differences in regression coefficients: Statistical contrasts, GRE verbal. In the regression of GRE verbal on SAT scores, intercepts are significantly higher for the ENBL groups than for the EBL non-Hispanic White group, except for the ENBL Black group. For every group, the interaction term of group membership with SAT verbal has an incremental prediction that is significant when added to the equation that includes group membership as a main effect. This means that the slope coefficients for SAT verbal are significantly different for the ENBL groups, and Table 3 shows that the slope coefficients for SAT verbal are smaller for ENBL groups than for the EBL non-Hispanic White group. However, only the ENBL non-Hispanic White group has a significantly higher slope coefficient for SAT mathematical. The result for the ENBL Black group is in the same direction but does not reach statistical significance because of the small sample size.
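One common way to set up such contrasts, offered here as an assumption because the report does not print the model, is a pooled regression for the focal group and the reference group that includes a group indicator and its products with the SAT scores. The symbols below are introduced for illustration only.

```latex
% G = 1 for members of the focal (ENBL) group, 0 for the EBL non-Hispanic White group
% b_3: difference in intercepts; b_4, b_5: differences in the SAT verbal and SAT mathematical slopes
\[
  \widehat{\mathrm{GRE}}_{V}
    = b_0 + b_1\,\mathrm{SAT}_{V} + b_2\,\mathrm{SAT}_{M}
    + b_3\,G + b_4\,\bigl(G \times \mathrm{SAT}_{V}\bigr)
    + b_5\,\bigl(G \times \mathrm{SAT}_{M}\bigr)
\]
```

In this notation, the findings described above correspond to a positive b_3 (higher intercepts for most ENBL groups) and a negative b_4 (smaller SAT verbal slopes for ENBL groups), with each interaction term tested by its incremental contribution once the group main effect is already in the equation.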

Differences in predicted GRE verbal scores: ENBL groups and the EBL non-Hispanic White group. The combined effect of these differences in the regression of GRE verbal scores on SAT scores is best seen in Table 4. The differences in predicted values tend to be negative (i.e., the EBL non-Hispanic White group equation underpredicts GRE scores of ENBL groups) at the point that is one standard deviation unit below the SAT means and at the point that coincides with the SAT means for these groups. Specifically, for the Puerto Rican group, the GRE verbal value predicted by the non-Hispanic White equation at one standard deviation unit below the SAT means is 294 whereas the predicted GRE verbal value based on the Puerto Rican ENBL equation is 324, or 30 points higher. These patterns indicate that future performance on the GRE verbal test of ENBL persons with low scores on the SAT tends to be higher than that of EBL non-Hispanic White persons with equivalent SAT scores.

At one standard deviation above the SAT means, differences between the predictions are not so large. In fact, for the ENBL Black group there is a fairly large positive difference (20 points), indicating overprediction for the Black ENBL group at one standard deviation unit above the SAT means when the EBL non-Hispanic White equation is used.


ENBL group differences in regression coefficients: Statistical contrasts, GRE quantitative. Within-group regression coefficients for the regression of GRE quantitative score on SAT scores can be seen in Table 3, and the differences in predicted GRE quantitative values can be seen in Table 5. Findings for the quantitative test in terms of differences in intercepts are in the same direction as the verbal test results, except that they are not as consistent across groups. For most groups, the intercept is higher for the ENBL groups than for the EBL non-Hispanic White group, with the exceptions of the ENBL Black group and the Puerto Rican group. For the Black group, the intercept is significantly lower, whereas the Puerto Rican group has an intercept that is essentially equal to the EBL non-Hispanic White group’s coefficient. Unlike the regression of GRE verbal score, the slope coefficient for the main predictor (in this case SAT mathematical) tends to be the same across groups. There is only one group (the Asian American group) for which the slope coefficient for SAT mathematical score is significantly lower. For the other independent variable (SAT verbal) the slope coefficient is significantly lower for only one group -- the Other ENBL group. Whereas the SAT verbal score regression coefficient for the non-Hispanic White EBL group is essentially equal to zero, for the ENBL Other group, SAT verbal score has a modest negative weight, which indicates that this variable is acting as a suppressor variable. The result for the ENBL Black group is in the same direction, but is not statistically significant because of the small sample size for this group.

Differences in predicted GRE quantitative scores: ENBL groups and the EBL non-Hispanic White group. The net effect of these differences in regressions is reflected in the differences between predicted GRE quantitative values using the own-group equation and predicted values using the EBL non-Hispanic White group equation (Table 5). For the ENBL non-Hispanic White, Asian American, Other Hispanic, and Other groups, the reference group equation under-predicts future quantitative scores on the GRE based on SAT values because of the intercept differences. In contrast, the quantitative scores for the Black ENBL group are overpredicted by the reference group equation. Predictions from the ENBL Puerto Rican group’s equation are essentially identical to predictions from the EBL non-Hispanic White group’s equation.
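The comparison of own-group and reference-group predictions can be reproduced from the coefficients reported in Table 3. The following is a minimal sketch, not the analysis code used in the study. The function and variable names are illustrative, and the sign convention (reference-group prediction minus own-group prediction, so that positive values indicate overprediction by the reference equation) is inferred from the tabled differences.

```python
# Illustrative sketch only: names are hypothetical; coefficients are the
# rounded GRE quantitative values reported in Table 3.

def predict_gre(intercept, b_sat_verbal, b_sat_math, sat_verbal, sat_math):
    """Linear prediction of a GRE score from SAT verbal and SAT mathematical."""
    return intercept + b_sat_verbal * sat_verbal + b_sat_math * sat_math

# (intercept, SAT verbal weight, SAT mathematical weight)
REFERENCE_EBL_WHITE = (41.77, 0.0049, 0.9592)
ENBL_BLACK = (6.40, -0.1616, 1.1371)

# SAT means of the ENBL Black group.
sat_verbal_mean, sat_math_mean = 320.00, 373.48

own = predict_gre(*ENBL_BLACK, sat_verbal_mean, sat_math_mean)
ref = predict_gre(*REFERENCE_EBL_WHITE, sat_verbal_mean, sat_math_mean)

# Positive difference: the reference equation overpredicts for this group.
difference = ref - own
print(f"own-group: {own:.2f}  reference: {ref:.2f}  difference: {difference:.2f}")
```

With the rounded coefficients, the difference at the group's SAT means comes out at about 22 points, in line with the overprediction reported for the Black ENBL group.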

EBL group differences in regression coefficients: Statistical contrasts, GRE verbal. Individual regression equations of GRE verbal success on SAT scores for the EBL groups can be seen in Table 3. Several groups have slightly higher intercepts, but the differences are significant only for the EBL Asian American and EBL Other groups. Other groups have slightly smaller intercepts than the reference group, with the largest differences occurring for the EBL Black group (significant) and the American Indian group (not significant because of the small sample size). The SAT verbal regression weight for the EBL minority groups tends to be smaller than it is for the EBL non-Hispanic White group, and these differences are significant for the Black, Mexican American, Puerto Rican, Other Hispanic, and Missing groups. For most groups, the regression weight for SAT mathematical is not significantly different from that of the EBL non-Hispanic White group. The exceptions are the EBL Asian American group (lower SAT mathematical weight) and the EBL Black and Missing groups (higher SAT mathematical weight). There is a trend toward higher SAT mathematical weights also among the EBL Puerto Rican and American Indian groups, but the results are not statistically significant.


Differences in predicted GRE verbal scores: EBL groups and the EBL non-Hispanic White group. Differences in the predicted values of the GRE verbal test shown in Table 6 indicate that for five out of seven EBL minority groups, the regressions are not far from the EBL non-Hispanic White group equation, despite statistically significant differences in the regression coefficients. Among EBL groups, the Black and American Indian groups are the only ones showing substantial differences (i.e., larger than 10 points) between the non-Hispanic White and their own group’s regression of GRE verbal success on SAT scores. The GRE verbal scores of the Black group are over-predicted by the non-Hispanic White equation by roughly 21 points at one standard deviation unit above the SAT means and by roughly 10 points at the SAT means. That is, the prediction based on the non-Hispanic White equation is higher than the prediction from the Black EBL group equation at these points, although they yield similar predictions at the lower end of the SAT scales. The American Indian group is also overpredicted by the non-Hispanic White group by 15 points at one standard deviation unit below the SAT means and by 7 points at the SAT means.

EBL group differences in regression coefficients: Statistical contrasts, GRE quantitative. Looking at the individual regression results of GRE quantitative on SAT scores for each group in Table 3, we can see that the regression equations are close to the reference group’s equation for four out of eight EBL minority groups. Four show differences from the reference group equation that are statistically significant. In contrast to the EBL non-Hispanic White group, the EBL Asian American group equation has a lower intercept and a larger, but negative slope coefficient for SAT verbal score. The EBL Black and Puerto Rican group equations have lower intercepts and higher, positive slope coefficients for SAT verbal, as compared with the reference group equation. Also, the American Indian group equation has a significantly higher slope coefficient for SAT verbal, and a marginally significant lower slope coefficient for SAT mathematical than the reference group equation.



Differences in predicted GRE quantitative scores: EBL groups and the EBL non-Hispanic White group. These differences are evident in the predicted values of GRE quantitative score (Table 7), which show that the only EBL groups having substantial differences between the EBL non-Hispanic White and their own group’s regression are the Asian American, Black, and Puerto Rican groups. Whereas the Asian American group’s quantitative scores are underpredicted, the Black and Puerto Rican groups’ scores are over-predicted by the non-Hispanic White group equation. The largest differences tend to occur at the low end of the SAT score distribution. The American Indian group shows no large differences in predictions, probably because the slope coefficient differences for SAT verbal score and SAT mathematical score are in opposite directions and mostly cancel each other out.

Distribution of Relative Improvement in Verbal Scores

Tables 1 and 2 show the average relative improvement for groups in GRE verbal and quantitative scores, but they provide little information on how the size of the relative improvement is distributed within each group. In particular, it is important to consider how often the relative verbal improvement exceeds small to large fractions of a standard deviation. This distribution is shown in Table 8 for the ENBL groups and the EBL non-Hispanic White group that serves as a contrast group. Z-scores are calculated using the total sample means and SDs.


Table 8 shows the cumulative percentage of the sample for which the difference in verbal z-scores (GRE minus SAT) exceeds a certain value. For example, for the EBL non-Hispanic White group about 26% of the persons have z-score differences exceeding 0.3 standard deviation units, 12% have differences exceeding 0.6 units, about 4% have differences exceeding 0.9 units, and about 1% have differences exceeding 1.2 units. In contrast, for the Total ENBL group (see last column on the right), almost half (49%) have z-score differences exceeding 0.3 standard deviation units, about 31% have differences exceeding 0.6 units, 16% have differences exceeding 0.9 units, and 6% have differences exceeding 1.2 units. An even higher percentage of large z-score differences is seen for the Asian American and Other groups, who, in about 9% of cases, had z-score differences larger than 1.2 standard deviation units.
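The tabulation described above can be illustrated with a short sketch. It assumes each examinee's verbal z-score difference (GRE minus SAT) has already been computed from the total sample means and SDs. The function name, the default thresholds as a parameter, and the ten example values are hypothetical and are not taken from the study data.

```python
# Illustrative sketch only: the example values below are made up.

def percent_exceeding(z_differences, thresholds=(0.3, 0.6, 0.9, 1.2)):
    """Percentage of examinees whose z-score difference exceeds each threshold."""
    n = len(z_differences)
    return {
        t: 100.0 * sum(1 for z in z_differences if z > t) / n
        for t in thresholds
    }

# Hypothetical verbal z-score differences (GRE minus SAT) for ten examinees.
example = [-0.4, -0.1, 0.0, 0.2, 0.35, 0.5, 0.7, 0.95, 1.0, 1.3]

table = percent_exceeding(example)
for threshold, pct in table.items():
    print(f"difference > {threshold:.1f} SD units: {pct:.0f}%")
```

Because each percentage counts everyone above a threshold, the values can only stay level or decrease as the threshold rises, which is the pattern seen in every column of Table 8.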

Discussion

Answers to the Main Questions

How much larger is the improvement in test scores (as measured in English) during the college years for the nonnative speakers? If we classify students according to their self-reported language preferences, the ENBL groups show much larger relative improvements in verbal scores (about 0.3 to 0.5 standard deviation units) than EBL groups (less than 0.1 standard deviation units). Furthermore, much higher percentages of ENBL individuals improve by one or more standard deviation units in their verbal scores (see Table 8).


As expected, differences in the relative improvement in quantitative scores are more subtle, ranging from averages of 0.1 to 0.3 standard deviation units for ENBL groups as compared with averages of less than 0.1 standard deviation units for the EBL groups.
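The relative improvement figures follow from the definitions in the note to Table 1: each raw-scale-score difference from the reference group (RDIFF) is divided by the total sample SD for that test (ZDIFF), and the SAT value is subtracted from the GRE value. The sketch below works through the verbal scores of the ENBL non-Hispanic White group. The function and variable names are illustrative and do not come from the original analysis.

```python
# Illustrative sketch only: names are hypothetical; inputs are Table 1 values.

TOTAL_SAMPLE_SD = {"gre_verbal": 110.73, "sat_verbal": 106.90}

def zdiff(rdiff, total_sd):
    """Raw-scale-score difference expressed in total-sample SD units."""
    return rdiff / total_sd

# ENBL non-Hispanic White group versus the EBL non-Hispanic White group.
gre_verbal_zdiff = zdiff(-61.90, TOTAL_SAMPLE_SD["gre_verbal"])
sat_verbal_zdiff = zdiff(-87.34, TOTAL_SAMPLE_SD["sat_verbal"])

# Relative improvement: how much the standardized gap narrowed from SAT to GRE.
relative_improvement = gre_verbal_zdiff - sat_verbal_zdiff
print(f"GRE ZDIFF {gre_verbal_zdiff:.2f}, SAT ZDIFF {sat_verbal_zdiff:.2f}, "
      f"relative improvement {relative_improvement:.2f}")
```

The gap narrows from about 0.82 to about 0.56 standard deviation units, a relative improvement of roughly 0.26 units for this group.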

Does the relationship between the SAT and GRE vary depending on English proficiency if one examines the same persons longitudinally? Yes. As expected, for every group of ENBL students, GRE verbal performance is slightly less predictable from SAT scores than for EBL non-Hispanic White students, although the differences are not statistically significant in three out of six groups. For GRE quantitative, the performance of ENBL students as compared with EBL non-Hispanic White students is also slightly less predictable in four groups (results are statistically significant in three groups). For ENBL groups, accuracy of prediction is usually higher and closer to the EBL non-Hispanic White group’s value in the prediction of GRE quantitative score than in the prediction of GRE verbal score. Because there is a lower degree of relationship between SAT verbal score and GRE verbal score for ENBL groups, it follows that SAT verbal has a lower slope coefficient in the equations for ENBL groups than in the EBL non-Hispanic White group (significant differences in every case). However, the same pattern is not seen when SAT mathematical score is used to predict GRE quantitative score -- there is only one group (Asian-American ENBL) for which the slope coefficient is significantly lower than for the EBL non-Hispanic White group. Owing to intercept and slope coefficient differences, the GRE verbal and GRE quantitative values tend to be higher for most ENBL groups than those for non-Hispanic White EBL persons with equivalent SAT scores, particularly at the group’s SAT means and below the group’s SAT means.

As expected, there are some differences between ENBL groups and the EBL non-Hispanic White group in the pattern of regression weights for the secondary predictor variable, although the hypothesized pattern is not often present. The expectation that SAT mathematical would have a higher slope coefficient for ENBL groups than for the EBL non-Hispanic White group in the prediction of GRE verbal is confirmed for two groups (Black and Asian-American). In the prediction of GRE quantitative score there is evidence that SAT verbal score acts as a suppressor variable for two groups (ENBL Other and ENBL Black).

There are also differences among EBL groups in the regression of GRE quantitative score on SAT scores, in the same direction as those found in the earlier report when the analyses were restricted to U.S. citizens (Pennock-Roman 1992b). These differences may be related to the subtle curricular effects found by Angoff and Johnson (1988) associated with college major and quality of undergraduate institutions. As detailed in Pennock-Roman (1992b), college major and gender were distributed differently across groups. For example, the Black group in particular had a high proportion (66%) of females and a low frequency of majors in quantitative fields.

Why Growth is Underestimated Here

As explained in the introduction, if one used a test-retest design with exactly parallel tests, one would be able to measure increases or decreases in cognitive skills more directly. The measure used here is a relative one, dependent on changes in the relative distance from the EBL non-Hispanic White group at two points in time with non-parallel, although highly related tests. Because the EBL non-Hispanic White group is also likely to be increasing in verbal and quantitative skills (e.g., see Wilson, 1988) this relative measure probably underestimates the true, absolute gain in skills by the other groups and should be considered a lower bound.
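The lower-bound argument can be stated compactly. As a sketch only, assume hypothetically that both tests could be placed on a common standardized scale, which the present data do not permit. The symbols are introduced here for illustration: \Delta_{g} is the absolute gain of a given group, \Delta_{\text{ref}} is the absolute gain of the EBL non-Hispanic White reference group, and \Delta_{\text{rel}} is the relative measure used in this study.

```latex
\Delta_{\text{rel}} = \Delta_{g} - \Delta_{\text{ref}}
\quad\Longrightarrow\quad
\Delta_{g} = \Delta_{\text{rel}} + \Delta_{\text{ref}} \;\ge\; \Delta_{\text{rel}}
\quad\text{whenever } \Delta_{\text{ref}} \ge 0 .
```

Under this assumption, any positive gain by the reference group adds directly to the relative measure, so the relative measure cannot exceed the absolute gain.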


Implications for Longitudinal Studies of Linguistic Minorities

Since the 1930s researchers (e.g., Sanchez, 1932a, 1932b, 1934) have cautioned psychologists and educators that the aptitudes of bilingual children are underestimated when they are tested in their weaker language. The consensus is that for these individuals, test scores are restricted to measuring aptitude in the context of environments using the language of the test. The higher the fluency in the language of the test, the closer the test performance in the nonnative language approaches aptitude scores as measured in the native tongue (Alderman, 1982). Slow processing speed in the language of the test may be the crucial aspect that hinders the performance of nonnative speakers (Angelis, 1979; Sinnott, 1980).

The present research extends our understanding of this issue by quantifying the greater relative gain to be found for nonnative speakers. These results imply that the temporal limit to the accuracy of that measurement is shorter for nonnative speakers. Test scores in a nonnative language reflect the test-taker’s abilities in the context of that language confined to a fairly short interval around that point in time. Substantial improvement can be expected four to six years later, at a rate that exceeds improvement in test scores for native speakers.

Implications for Admissions Committees

The size of the future gain in test performance has implications for the interpretation of the SAT and GRE scores of nonnative speakers by admissions committees. It should be noted that, despite higher gains, the means on the GRE verbal test for the ENBL groups were still quite low and not very competitive in comparison to scores of the EBL non-Hispanic White group. Nevertheless, the results here suggest that the abilities of nonnative speakers of English in future years are somewhat underestimated, and that their later academic achievement in an English environment is probably less predictable.

As shown here and in prior research, the underestimation of abilities occurs less for quantitative tests than it occurs with verbal tests. Hence, greater weight should be placed on the quantitative test (SAT mathematical score and GRE quantitative score) in the evaluation of ENBL candidates. However, the value of placing an emphasis on quantitative tests needs to be examined by students’ background in mathematics and gender. Perhaps this practice would be less useful in the evaluation of candidates who have had less training in quantitative subjects.

Despite the high degree of relationship between SAT and GRE scores for all groups, there were many individuals (including some native speakers of English) who obtained a standard score in one test (relative to the total sample mean) that was substantially different from their standard score on the other test. Although there is a group trend toward stability in test scores, a particular individual can change a great deal over time in test performance. The results here show an average relative increase -- and Wilson’s (1988) results show an absolute gain -- in both verbal and quantitative scores over a period of several years. These findings support the recommendation that candidates for admission report test scores that are fairly recent.

Implications for Counseling

Owing to the absence of data on the amount of training students received in English as a second language (ESL), it was not possible to relate variations in the improvement in test scores to the type of ESL instruction students received. Nevertheless, the results imply that nonnative speakers of English attending college should be counseled to improve their English skills and their reading speed in English as much as possible before taking selective admissions tests, for two reasons. First, it may improve their performance on tests for admission to graduate school. Second, it is possible that such training before taking selective admissions tests may improve the predictive validity of aptitude measures used for nonnative speakers. Wilson (1989) found that the correlations between academic achievement and test performance were higher for portions of the GRE verbal test that were less speeded. Reading speed was not investigated in the present study. Nevertheless, past research suggests that raising reading speed in nonnative speakers may increase the validity of timed tests for this group.

Conclusions

All groups for whom English was not their best language had an average gain exceeding 0.2 standard deviation units on the verbal test, and the largest gains (0.4 - 0.5 standard deviation units) were found for ENBL Asian American and ENBL Puerto Rican examinees. SAT scores were poorer predictors of later GRE scores for ENBL students than for EBL students. Scores on GRE verbal were higher than expected from SAT verbal scores for all ENBL groups, especially for individuals with SAT scores at the lower end. For the majority of ENBL groups, SAT mathematical scores also underpredicted GRE quantitative scores some years later.

It is not possible to know from this data set what the effect of special ESL instruction (apart from the usual classroom instruction in English) can be on the test-performance of nonnative speakers. However, the results suggest that nonnative speakers should be made aware of the importance of improving their English skills as much as possible before taking selective admissions tests. Although reading speed was not specifically studied here, other research on test-taking speed and bilingualism suggests that verbal processing speed is a crucial variable; thus, counselors should also explain the importance of increasing reading speed in English for students aspiring to enter graduate school.

Past research on language proficiency and test performance has led to the long-standing recommendation that college and graduate school admissions committees take a candidate’s proficiency in English into account when interpreting SAT and GRE scores. The present results underscore the temporal limits to the accuracy of prediction for these students. Whereas test scores may indicate the person’s level of language skill at that particular time in an academic environment in English, the individual’s later performance may be underestimated, particularly for those students with low levels of English proficiency who receive continued classroom instruction in English. The present study details how large the expected change in test performance can be for such students over a period of four to six years and under existing circumstances in college settings. Although the average nonnative speaker continues to have low verbal scores (i.e., GRE verbal scores more than one standard deviation unit below the GRE population mean), a fairly large percentage of ENBL individuals experienced increases in their scores of more than one standard deviation unit.


References

Alderman, D. L. (1982). Language proficiency as a moderator variable in testing academic aptitude. Journal of Educational Psychology, 74, 580-587.

Angelis, P. J., Swinton, S. S., & Cowell, W. R. (1979). The performance of nonnative speakers of English on TOEFL and verbal aptitude tests (TOEFL Research Report No. 3, Educational Testing Service Research Report No. RR-79-7). Princeton, NJ: Educational Testing Service.

Anastasi, A. (1988). Psychological Testing (6th ed.). New York: MacMillan.

Angoff, W. G., & Johnson, E. G. (1988). A study of the differential impact of curriculum on aptitude test scores (Educational Testing Service Research Report No. RR-88-46). Princeton, NJ: Educational Testing Service.

Astin, A. W. (1968). Undergraduate achievement and institutional excellence. Science, 161, 661-668.

Bradway, K. P., Thompson, C. W., & Cravens, R. B. (1958). Preschool IQs after twenty-five years. Journal of Educational Psychology, 49, 278-281.

Bloom, B. S. (1964). Stability and change in human characteristics. New York: Wiley.

Charles, D. C., & James, S. T. (1964). Stability of average intelligence. Journal of Genetic Psychology, 105, 105-111.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

Hilton, T. L. (1990, April). Talent flow in the 1970s and 1980s. Paper presented at the Annual Meeting of the American Educational Research Association, Boston, MA.

Hilton, T. L., & Schrader, W. B. (1987). Pathways to graduate school: An empirical study based on national longitudinal data (Educational Testing Service Research Report No. RR-87-41 and GRE Board Research Report No. 82-21R). Princeton, NJ: Educational Testing Service.

Honzik, M. P., MacFarlane, J. W., & Allen, L. (1948). The stability of mental test performance between two and eighteen years. Journal of Experimental Education, 17, 309-324.

Loehlin, J. C., Lindzey, G., & Spuhler, J. N. (1975). Race differences in intelligence. San Francisco: Freeman.

Owens, W. A. (1966). Age and mental abilities: A second adult follow-up. Journal of Educational Psychology, 57, 311-325.

Pennock-Roman, M. (1992a). Differences among racial and ethnic groups in mean scores on the GRE and SAT: Cross-sectional comparisons (GRE Research Report No. 86-9a). Princeton, NJ: Educational Testing Service.


Pennock-Roman, M. (1992b). Differences among racial and ethnic groups in mean scores on the GRE and SAT: Longitudinal comparisons (GRE Research Report No. 86-9b). Princeton, NJ: Educational Testing Service.

Powers, D. E. (1980). The relationship between scores on the Graduate Management Admission Test and the Test of English as a Foreign Language (TOEFL Research Report No. 5, Educational Testing Service Research Report No. RR-80-31). Princeton, NJ: Educational Testing Service.

Sanchez, G. I. (1932a). Group differences and Spanish-speaking children: A critical review. Journal of Applied Psychology, 16, 549-558.

Sanchez, G. I. (1932b). Scores of Spanish-speaking children on repeated tests. Journal of Genetic Psychology, 40, 223-231.

Sanchez, G. I. (1934). Bilingualism and mental measures: A word of caution. Journal of Applied Psychology, 18, 765-772.

Sinnott, L. T. (1980). Differences in item performance across groups (GMAC Research Report No. 80-1 and Educational Testing Service Research Report No. RR-80-19). Princeton, NJ: Educational Testing Service.

Tuddenham, R. D., Blumenkrantz, J., & Wilken, W. R. (1968). Age changes on AGCT: A longitudinal study of average adults. Journal of Consulting and Clinical Psychology, 32, 659-663.

Wilson, K. M. (1982). GMAT and GRE aptitude test performance in relation to primary language and scores on TOEFL (TOEFL Research Report No. 12 and Educational Testing Service Research Report No. RR-82-28). Princeton, NJ: Educational Testing Service.

Wilson, K. M. (1984). Foreign nationals taking the GRE General Test during 1981-82: Highlights of a study (GRE Board Research Report No. 81-23a and Educational Testing Service Research Report No. RR-84-39). Princeton, NJ: Educational Testing Service.

Wilson, K. M. (1985). Factors affecting GMAT predictive validity for foreign MBA students: An exploratory study (Educational Testing Service Research Report No. RR-85-17). Princeton, NJ: Educational Testing Service.

Wilson, K. M. (1986). The relationship of GRE General Test scores to first-year grades for foreign graduate students: Report of a cooperative study (GRE Board Professional Report No. 82-11P and Educational Testing Service Research Report No. RR-86-44). Princeton, NJ: Educational Testing Service.

Wilson, K. M. (1987a). The GRE Subject Test performance of U.S. and non-U.S. examinees, 1982-84: A comparative analysis (GRE Board Research Report No. 83-20P, Educational Testing Service Research Report No. RR-87-4). Princeton, NJ: Educational Testing Service.

Wilson, K. M. (1987b). Patterns of test-taking and score change for examinees who repeat the Test of English as a Foreign Language (TOEFL Report No. 22 and Educational Testing Service Research Report No. RR-87-3). Princeton, NJ: Educational Testing Service.


Wilson, K. M. (1988). A study of the long-term stability of GRE General Test scores (GRE Board Research Report No. 86-18 and Educational Testing Service Research Report No. RR-88-39). Princeton, NJ: Educational Testing Service.

Wilson, K. M. (1989). Population differences in speed versus level of GRE reading comprehension: An exploratory study (GRE Board Professional Report No. 84009aP and Educational Testing Service Research Report No. RR-89-36). Princeton, NJ: Educational Testing Service.


TABLE 1

Groups Not Having English as Their Best Language: Raw-Scale-Score and Standardized Differences as Compared with the Non-Hispanic White Group Having English as Their Best Language

Group/N        GRE       GRE       GRE       SAT       SAT    Verbal    Quant.
            Verbal    Quant.   Analyt.    Verbal      Math     Diff.     Diff.

Non-Hispanic White (N = 496)
  Mean      461.17    568.21    538.17    436.90    537.60
  SD        104.99    126.01    121.15    108.31    113.26
  RDIFF     -61.90    -11.34    -48.96    -87.34    -20.39
  ZDIFF      -0.56     -0.09     -0.41     -0.82     -0.18      0.26      0.09
  d          -0.59     -0.09     -0.42     -0.84     -0.19      0.25      0.09

Asian American (N = 511)
  Mean      382.66    657.85    513.54    338.69    616.77
  SD         94.30    101.65    111.33    102.81    100.93
  RDIFF    -140.40     78.30    -73.58   -185.54     58.78
  ZDIFF      -1.27      0.61     -0.62     -1.74      0.53      0.47      0.09
  d          -1.40      0.70     -0.66     -1.82      0.57      0.43      0.13

Black (N = 66)
  Mean      334.55    379.39    359.24    320.00    373.48
  SD         71.69    112.85     86.29     77.56     82.51
  RDIFF    -188.52   -200.15   -227.88   -204.23   -184.51
  ZDIFF      -1.70     -1.57     -1.91     -1.91     -1.66      0.21      0.09
  d          -2.08     -1.71     -2.27     -2.27     -1.94      0.19      0.23

Mexican American (N = 23)
  Mean      351.30    443.04    410.00    366.09    425.22
  SD         85.94    114.79    101.64     96.54    126.35
  RDIFF    -171.76   -136.51   -177.12   -158.15   -132.77
  ZDIFF      -1.55     -1.07     -1.48     -1.48     -1.19     -0.07      0.12
  d          -1.78     -1.16     -1.65     -1.60     -1.14     -0.17     -0.02

Puerto Rican (N = 264)
  Mean      398.11    464.77    438.79    362.99    442.54
  SD         88.84    114.76    103.09     91.35     91.87
  RDIFF    -124.96   -114.78   -148.33   -161.24   -115.45
  ZDIFF      -1.13     -0.90     -1.24     -1.51     -1.04      0.38      0.14
  d          -1.28     -0.97     -1.37     -1.68     -1.16      0.40      0.19

Other Hispanic (N = 107)
  Mean      453.70    533.83    488.79    417.66    489.72
  SD        100.57    116.19    118.71    104.37    100.07
  RDIFF     -87.36    -45.72    -98.34   -106.57    -68.27
  ZDIFF      -0.79     -0.36     -0.82     -1.00     -0.61      0.21      0.25
  d          -0.84     -0.39     -0.85     -1.04     -0.66      0.19      0.28

Other (N = 94)
  Mean      412.98    549.57    494.04    376.49    506.70
  SD        108.89    131.43    127.19    117.98    109.10
  RDIFF    -110.09    -29.97    -93.08   -147.74    -51.29
  ZDIFF      -0.99     -0.24     -0.78     -1.38     -0.46      0.39      0.23
  d          -1.02     -0.24     -0.77     -1.35     -0.48      0.32      0.24

Missing (N = 31)
  Mean      430.32    550.00    471.94    383.23    503.23
  SD        155.26    150.25    150.94    151.86    135.37
  RDIFF     -92.74    -29.55   -115.19   -141.01    -54.77
  ZDIFF      -0.84     -0.23     -0.96     -1.32     -0.49      0.48      0.26
  d          -0.70     -0.22     -0.86     -1.09     -0.45      0.40      0.23

Total Sample
  SD        110.73    127.36    119.58    106.90    111.32

Note. RDIFF = raw-scale-score difference in means between a group and the non-Hispanic White group. ZDIFF = raw-scale-score difference divided by the total sample standard deviation for that particular test. d = raw-scale-score difference divided by pooled standard deviations for the two groups.


TABLE 2

Groups Having English as Their Best Language: Raw-Scale-Score and Standardized Differences as Compared with the Non-Hispanic White Group Having English as Their Best Language

Group/N        GRE       GRE       GRE       SAT       SAT    Verbal    Quant.
            Verbal    Quant.   Analyt.    Verbal      Math     Diff.     Diff.

Non-Hispanic White (N = 55386)
  Mean      523.06    579.55    587.12    524.23    557.99         -         -
  SD        106.17    121.05    112.93    100.67    105.88

Asian American (N = 2042)
  Mean      522.89    631.55    588.24    524.12    601.87
  SD        111.34    113.90    117.10    110.53    103.76
  RDIFF      -0.17     52.00      1.12     -0.11     43.88
  ZDIFF      -0.00      0.41      0.01     -0.00      0.39     -0.00      0.01
  d          -0.00      0.44      0.01     -0.00      0.41     -0.00      0.02

Black (N = 2986)
  Mean      392.09    417.86    426.37    394.52    414.56
  SD        100.28    126.33    113.02    107.87    105.84
  RDIFF    -130.98   -161.69   -160.75   -129.71   -143.43
  ZDIFF      -1.18     -1.27     -1.34     -1.21     -1.29      0.03      0.02
  d          -1.27     -1.31     -1.42     -1.24     -1.35     -0.03      0.05

Mexican American
  Mean      463.71    507.17    503.06    459.24    486.27
  SD        103.67    126.23    114.00    104.86    105.56
  RDIFF     -59.36    -72.38    -84.06    -64.99    -71.72
  ZDIFF      -0.54     -0.57     -0.70     -0.61     -0.64      0.07      0.08
  d          -0.57     -0.59     -0.74     -0.63     -0.68      0.07      0.09

Puerto Rican
  Mean      469.63    496.72    514.71    469.39    482.91
  SD        109.17    135.04    124.80    113.23    113.18
  RDIFF     -53.43    -82.83    -72.41    -54.85    -75.08
  ZDIFF      -0.48     -0.65     -0.61     -0.51     -0.67      0.03      0.02
  d          -0.50     -0.65     -0.61     -0.51     -0.69      0.02      0.04

Other Hispanic (N = 582)
  Mean      490.52    533.99    533.21    486.00    511.89
  SD        104.46    127.48    122.00    104.41    109.74
  RDIFF     -32.55    -45.56    -53.91    -38.24    -46.10
  ZDIFF      -0.29     -0.36     -0.45     -0.36     -0.41      0.06      0.06
  d          -0.31     -0.37     -0.46     -0.37     -0.43      0.06      0.06

American Indian (N = 77)
  Mean      476.36    524.55    539.87    482.99    505.06
  SD        108.11    120.62    109.63     98.63    106.68
  RDIFF     -46.70    -55.00    -47.25    -41.25    -52.93
  ZDIFF      -0.42     -0.43     -0.40     -0.39     -0.48     -0.04      0.04
  d          -0.44     -0.46     -0.42     -0.41     -0.50     -0.02      0.04

Other (N = 1253)
  Mean      536.47    575.93    577.66    533.15    553.01
  SD        117.80    127.13    120.90    112.98    112.04
  RDIFF      13.41     -3.62     -9.46      8.92     -4.98
  ZDIFF       0.12     -0.03     -0.08      0.08     -0.04      0.04      0.02
  d           0.12     -0.03     -0.08      0.08     -0.05      0.04      0.02

Missing (N = 689)
  Mean      513.11    568.30    567.05    514.18    546.08
  SD        119.32    139.49    128.98    116.35    122.92
  RDIFF      -9.96    -11.25    -20.07    -10.05    -11.91
  ZDIFF      -0.09     -0.09     -0.17     -0.09     -0.11      0.00      0.02
  d          -0.09     -0.09     -0.17     -0.09     -0.10      0.00      0.02

Total Sample
  SD        110.73    127.36    119.58    106.90    111.32

Note. RDIFF = raw-scale-score difference in means between a group and the non-Hispanic White group. ZDIFF = raw-scale-score difference divided by the total sample standard deviation for that particular test. d = raw-scale-score difference divided by pooled standard deviations for the two groups.


TABLE 3

Results of Within-Group Regressions of GRE Scores on SAT Verbal and SAT Math

Group/Language              R-Squared   S.E.E.  Intcpt.  SAT Verbal  SAT Math
                                                              Raw b     Raw b

DEPENDENT VAR.: GRE Verbal

Non-Hispanic White, EBL         .7172    56.46    35.55      0.8311    0.0929
Asian American, EBL             .7367    57.18    50.78      0.8286    0.0629
Black, EBL                      .7142    53.64    66.16      0.6968    0.1231
Mexican American, EBL           .7171    55.29    60.26      0.7765    0.0964
Puerto Rican, EBL               .7181    58.33    66.71      0.7314    0.1234
Other Hispanic, EBL             .7103    56.38    66.24      0.7785    0.0897
American Indian, EBL            .7562    54.45    -9.35      0.8709    0.1289
Other, EBL                      .7604    57.73    35.05      0.8537    0.0836
Missing, EBL                    .7576    58.88    32.49      0.7759    0.1495

Non-Hispanic White, ENBL        .6673    60.75    55.53      0.6516    0.2250
Asian American, ENBL            .5969    60.04    97.75      0.6750    0.0913
Black, ENBL                     .4768    53.08   104.86      0.4203    0.2549
Puerto Rican, ENBL              .6198    55.09    91.66      0.6753    0.1386
Other Hispanic, ENBL            .6621    59.30    67.65      0.6938    0.1599
Other, ENBL                     .5829    71.47   112.68      0.6619    0.1008

DEPENDENT VAR.: GRE Quantitative

Non-Hispanic White, EBL         .7080    65.41    41.77      0.0049    0.9592
Asian American, EBL             .7198    60.33    76.07     -0.0330    0.9516
Black, EBL                      .7059    68.54    -6.03      0.0899    0.9370
Mexican American, EBL           .6813    71.45    25.91      0.0075    0.9826
Puerto Rican, EBL               .7494    68.02   -11.93      0.0676    0.9876
Other Hispanic, EBL             .7002    69.98    39.32     -0.0172    0.9827
American Indian, EBL            .6767    69.97    12.45      0.2166    0.8068
Other, EBL                      .7451    64.26    37.27     -0.0169    0.9903
Missing, EBL                    .7670    67.49    29.95     -0.0273    1.0116

Non-Hispanic White, ENBL        .7112    67.93    70.19     -0.0325    0.9527
Asian American, ENBL            .6561    59.78   159.78     -0.0337    0.8260
Black, ENBL                     .5761    75.20     6.40     -0.1616    1.1371
Puerto Rican, ENBL              .5747    75.27    43.74      0.0194    0.9355
Other Hispanic, ENBL            .7103    63.44    64.98     -0.0624    1.0106
Other, ENBL                     .6585    78.06    78.07     -0.1268    1.0248

Note. EBL = English best language; ENBL = English not best language.


TABLE 4

ENBL Groups: Predicted GRE Verbal Values Using Own Group Equation Versus Non-Hispanic White EBL Group Equation

                                                Predicted GRE Verbal Values Using Equation from:
Language/Group      SAT Verbal    SAT Math         Own      Non-Hispanic  Difference
                         Value       Value       Group   White EBL Group

Non-Hispanic White (N = 496)
  Means - 1 SD          328.58      424.34      365.11            348.05      -17.05
  At Means              436.90      537.60      461.17            448.59      -12.57
  Means + 1 SD          545.21      650.86      557.23            549.13       -8.09

Asian American (N = 511)
  Means - 1 SD          235.88      515.84      304.07            279.51      -24.56
  At Means              338.69      616.77      382.68            374.33       -8.35
  Means + 1 SD          441.50      717.70      461.29            469.15        7.86

Black (N = 66)
  Means - 1 SD          242.44      290.98      280.92            264.07      -16.85
  At Means              320.00      373.48      334.55            336.20        1.64
  Means + 1 SD          397.56      455.99      388.18            408.32       20.14

Puerto Rican (N = 264)
  Means - 1 SD          271.64      350.67      323.71            293.89      -29.82
  At Means              362.99      442.54      398.13            378.34      -19.79
  Means + 1 SD          454.34      534.41      472.55            462.80       -9.75

Other Hispanic (N = 107)
  Means - 1 SD          313.30      389.65      347.32            332.13      -15.19
  At Means              417.66      489.72      435.73            428.16       -7.57
  Means + 1 SD          522.03      589.79      524.14            524.20        0.06

Other (N = 94)
  Means - 1 SD          258.51      397.60      323.86            287.33      -36.53
  At Means              376.49      506.70      412.95            395.52      -17.43
  Means + 1 SD          494.47      615.80      502.04            503.71        1.67

Note. The equations used here for predicting GRE Verbal are given in Table 3. EBL = English best language; ENBL = English not best language.


TABLE 5

ENBL Groups: Predicted GRE Quantitative Values Using Own Group Equation Versus Non-Hispanic White EBL Group Equation

                                                    Predicted GRE Quant. Values Using Equation from:
                                    SAT Verbal    SAT Math         Own      Non-Hispanic
Language/Group                           Value       Value       Group   White EBL Group  Difference

Non-Hispanic White  Means - 1 SD        328.58      424.34      463.78            450.41      -13.37
N = 496             At Means            436.90      537.60      568.17            559.58       -8.58
                    Means + 1 SD        545.21      650.86      672.55            668.75       -3.80

Asian American      Means - 1 SD        235.88      515.84      577.91            537.73      -40.19
N = 511             At Means            338.69      616.77      657.82            635.04      -22.78
                    Means + 1 SD        441.50      717.70      737.72            732.35       -5.36

Black               Means - 1 SD        242.44      290.98      298.09            322.07       23.98
N = 66              At Means            320.00      373.48      379.38            401.59       22.21
                    Means + 1 SD        397.56      455.99      460.66            481.11       20.45

Puerto Rican        Means - 1 SD        271.64      350.67      377.06            379.47        2.41
N = 264             At Means            362.99      442.54      464.77            468.03        3.26
                    Means + 1 SD        454.34      534.41      552.49            556.60        4.12

Other Hispanic      Means - 1 SD        313.30      389.65      439.21            417.06      -22.15
N = 107             At Means            417.66      489.72      533.82            513.56      -20.27
                    Means + 1 SD        522.03      589.79      628.44            610.06      -18.38

Other               Means - 1 SD        258.51      397.60      452.75            424.42      -28.33
N = 94              At Means            376.49      506.70      549.60            529.65      -19.95
                    Means + 1 SD        494.47      615.80      646.44            634.87      -11.57

Note. The equations used here for predicting GRE Quantitative are given in Table 3. EBL = English best language; ENBL = English not best language.
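
As a rough illustration of how an entry in Table 5 can be reproduced, the sketch below applies a two-predictor linear equation to the SAT means of the Non-Hispanic White ENBL group. It assumes that the three Table 3 values used for each group are, in order, an intercept, a weight for SAT Verbal, and a weight for SAT Math. That reading is an assumption, checked only against the Table 5 entries, and the function and variable names are hypothetical.

```python
def predict_gre(intercept, b_satv, b_satm, satv, satm):
    """Linear prediction: intercept + b_satv * SAT Verbal + b_satm * SAT Math."""
    return intercept + b_satv * satv + b_satm * satm


# Values read from Table 3; treating them as (intercept, SAT-V weight,
# SAT-M weight) for GRE Quantitative is an assumption of this sketch.
OWN_GROUP = (70.19, -0.0325, 0.9527)   # Non-Hispanic White, ENBL
REFERENCE = (41.77, 0.0049, 0.9592)    # Non-Hispanic White, EBL

# SAT means of the Non-Hispanic White ENBL group ("At Means" row, Table 5).
satv_mean, satm_mean = 436.90, 537.60

own_pred = predict_gre(*OWN_GROUP, satv_mean, satm_mean)
ref_pred = predict_gre(*REFERENCE, satv_mean, satm_mean)

# The Difference column is the reference-equation prediction minus the
# own-group prediction, so a negative value means the reference equation
# predicts a lower GRE score than the group's own equation does.
difference = ref_pred - own_pred

print(f"own group: {own_pred:.2f}")
print(f"NH White EBL: {ref_pred:.2f}")
print(f"difference: {difference:.2f}")
```

Under these assumptions the three results agree, to within rounding of the published coefficients, with the "At Means" row for the Non-Hispanic White ENBL group in Table 5 (568.17, 559.58, and -8.58).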


TABLE 6

EBL Groups: Predicted GRE Verbal Values Using Own Group Equation Versus Non-Hispanic White EBL Group Equation

                                                    Predicted GRE Verbal Values Using Equation from:
                                    SAT Verbal    SAT Math         Own      Non-Hispanic
Language/Group                           Value       Value       Group   White EBL Group  Difference

Non-Hispanic White  Means - 1 SD        423.56      452.11      429.57            429.57        0.00
                    At Means            524.23      557.99      523.08            523.08        0.00
                    Means + 1 SD        624.90      663.87      616.58            616.58        0.00

[label missing]     Means - 1 SD        413.59      498.11      424.82            425.56        0.74
                    At Means            524.12      601.87      522.93            527.06        4.13
                    Means + 1 SD        634.65      705.63      621.04            628.56        7.52

Black               Means - 1 SD        286.65      308.72      303.90            302.46       -1.44
N = 2986            At Means            394.52      414.56      392.10            401.95        9.85
                    Means + 1 SD        502.39      520.40      480.29            501.43       21.14

Mexican American    Means - 1 SD        354.38      380.71      372.13            365.44       -6.69
N = 569             At Means            459.24      486.27      463.74            462.40       -1.34
                    Means + 1 SD        564.11      591.84      555.34            559.36        4.02

[label missing]     Means - 1 SD        356.15      369.73      372.82            365.89       -6.93
                    At Means            469.39      482.91      469.61            470.52        0.91
                    Means + 1 SD        582.62      596.09      566.39            575.14        8.75

Other Hispanic      Means - 1 SD        381.58      402.15      399.37            390.04       -9.33
N = 582             At Means            486.00      511.89      490.50            487.01       -3.49
                    Means + 1 SD        590.41      621.63      581.64            583.99        2.35

American Indian     Means - 1 SD        384.36      398.38      376.74            392.00       15.25
N = 77              At Means            482.99      505.06      476.39            483.88        7.49
                    Means + 1 SD        581.61      611.75      576.04            575.76       -0.28

Other               Means - 1 SD        420.17      440.96      430.62            425.72       -4.90
N = 1253            At Means            533.15      553.01      536.44            530.03       -6.41
                    Means + 1 SD        646.13      665.05      642.25            634.33       -7.92

Missing             Means - 1 SD        397.83      423.16      404.43            405.50        1.07
N = 689             At Means            514.18      546.08      513.08            513.61        0.53
                    Means + 1 SD        630.53      669.00      621.74            621.73       -0.00

Note. The equations used here for predicting GRE Verbal are given in Table 3. EBL = English best language; ENBL = English not best language.


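TABLE 7

EBL Groups: Predicted GRE Quantitative Values Using Own Group Equation Versus Non-Hispanic White EBL Group Equation

                                                    Predicted GRE Quant. Values Using Equation from:
                                    SAT Verbal    SAT Math         Own      Non-Hispanic
Language/Group                           Value       Value       Group   White EBL Group  Difference

Non-Hispanic White  Means - 1 SD        423.56      452.11      477.51            477.51        0.00
                    At Means            524.23      557.99      579.57            579.57        0.00
                    Means + 1 SD        624.90      663.87      681.62            681.62        0.00

[label missing]     Means - 1 SD        413.59      498.11      536.43            521.59      -14.83
                    At Means            524.12      601.87      631.51            621.66       -9.86
                    Means + 1 SD        634.65      705.63      726.60            721.72       -4.88

Black               Means - 1 SD        286.65      308.72      309.01            339.30       30.29
N = 2986            At Means            394.52      414.56      417.88            441.35       23.47
                    Means + 1 SD        502.39      520.40      526.75            543.40       16.65

Mexican American    Means - 1 SD        354.38      380.71      402.65            408.69        6.03
N = 569             At Means            459.24      486.27      507.17            510.46        3.29
                    Means + 1 SD        564.11      591.84      611.68            612.23        0.55

[label missing]     Means - 1 SD        356.15      369.73      377.29            398.16       20.87
                    At Means            469.39      482.91      496.72            507.28       10.56
                    Means + 1 SD        582.62      596.09      616.16            616.40        0.25

Other Hispanic      Means - 1 SD        381.58      402.15      427.96            429.39        1.43
N = 582             At Means            486.00      511.89      534.00            535.16        1.16
                    Means + 1 SD        590.41      621.63      640.04            640.93        0.89

American Indian     Means - 1 SD        384.36      398.38      417.12            425.79        8.67
N = 77              At Means            482.99      505.06      524.55            528.60        4.05
                    Means + 1 SD        581.61      611.75      631.99            631.41       -0.58

Other               Means - 1 SD        420.17      440.96      466.86            466.81       -0.05
N = 1253            At Means            533.15      553.01      575.90            574.83       -1.07
                    Means + 1 SD        646.13      665.05      684.95            682.86       -2.09

Missing             Means - 1 SD        397.83      423.16      447.16            449.62        2.46
N = 689             At Means            514.18      546.08      568.33            568.09       -0.24
                    Means + 1 SD        630.53      669.00      689.50            686.57       -2.93

Note. The equations used here for predicting GRE Quantitative are given in Table 3. EBL = English best language; ENBL = English not best language.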


TABLE 8

Percent of Cases Having a Difference in Verbal Z-Scores Exceeding a Given Value of Z-Score

Difference in
GRE Verbal -      Non-Hispanic  Non-Hispanic                        Puerto     Other
SAT Verbal               White         White     Asian     Black     Rican  Hispanic     Other     Total
Z-Scores                   EBL          ENBL      ENBL      ENBL      ENBL      ENBL      ENBL      ENBL

-2.4 < ZD               100.00        100.02    100.02    100.03    100.01     99.98     99.98    100.00
-2.1 < ZD                99.99        100.02     99.82    100.03    100.01     99.98     99.98     99.94
-1.8 < ZD                99.96        100.02     99.82    100.03    100.01     99.98     98.92     99.81
-1.5 < ZD                99.75        100.02     99.82     98.51    100.01     99.98     96.79     99.62
-1.2 < ZD                98.99         98.61     99.23     96.99     98.87     99.98     96.79     98.74
-0.9 < ZD                95.68         95.99     98.25     95.47     98.49     98.11     95.73     97.23
-0.6 < ZD                87.02         90.14     94.92     93.95     95.84     91.57     90.41     92.90
-0.3 < ZD                71.69         81.27     87.09     78.80     90.16     76.62     84.03     84.36
 0.0 < ZD                46.88         63.12     76.72     60.62     74.25     57.93     72.33     69.60
 0.3 < ZD                25.86         41.95     57.35     40.92     51.14     41.11     53.18     48.93
 0.6 < ZD                11.61         27.43     37.19     24.25     29.17     22.42     36.16     30.71
 0.9 < ZD                 3.71         12.51     21.53      7.58     12.12      9.34     18.07     15.38
 1.2 < ZD                 1.22          5.25      9.01      0.00      4.17      3.73      8.50      6.15
 1.5 < ZD                 0.41          3.03      3.53      0.00      1.90      1.86      7.44      2.95
 1.8 < ZD                 0.11          1.01      0.59      0.00      0.76      0.93      1.06      0.75
 2.1 < ZD                 0.04          0.20      0.20      0.00      0.38      0.93      0.00      0.25
 2.4 < ZD                 0.02          0.00      0.00      0.00      0.00      0.00      0.00      0.00
 2.7 < ZD                 0.01          0.00      0.00      0.00      0.00      0.00      0.00      0.00
 3.0 < ZD                 0.01          0.00      0.00      0.00      0.00      0.00      0.00      0.00

TOTAL N                  55386           496       511        66       264       107        94      1592

Note. The cumulative percentages do not add up to 100 exactly because of rounding errors. The total sample size for the ENBL group includes 23 Mexican Americans and 31 persons missing a classification for race/ethnicity. EBL = English best language; ENBL = English not best language.
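
Each entry in Table 8 can be read as the share of examinees in a group whose z-score difference ZD (GRE Verbal z-score minus SAT Verbal z-score) exceeds a given cutoff. A minimal sketch of that tabulation follows. The input values are made up for illustration, the function name is hypothetical, and the way the z-scores themselves were standardized is not addressed here.

```python
def percent_exceeding(gre_z, sat_z, cutoffs):
    """For each cutoff, return the percent of cases with ZD > cutoff,
    where ZD = GRE Verbal z-score - SAT Verbal z-score."""
    if len(gre_z) != len(sat_z) or not gre_z:
        raise ValueError("need two non-empty score lists of equal length")
    zd = [g - s for g, s in zip(gre_z, sat_z)]
    n = len(zd)
    return [100.0 * sum(1 for d in zd if d > c) / n for c in cutoffs]


# Hypothetical z-scores for four examinees (not data from the study).
gre_verbal_z = [0.0, 0.5, 1.0, 2.0]
sat_verbal_z = [0.5, 0.5, 0.5, 0.5]   # ZD = -0.5, 0.0, 0.5, 1.5

cutoffs = [-0.6, -0.3, 0.3, 0.9, 1.8]
percents = percent_exceeding(gre_verbal_z, sat_verbal_z, cutoffs)

for cutoff, pct in zip(cutoffs, percents):
    print(f"{cutoff:4.1f} < ZD  {pct:6.2f}")
```

Because each row counts every case above its cutoff, the percentages fall from the top of the table to the bottom, as they do in Table 8.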


[Figure 1 panels, one per ENBL group: NH White, Asian Am., Black, Mexican Am., Other Hispanic, Other. Each panel plots GREV (vertical axis, 200 to 600) against SATV (horizontal axis, 200 to 600). Legend: Own Group Equation; NH White EBL Group Equation (dashed line).]

FIGURE 1

A Comparison of Own-Group Regression Equations with the Non-Hispanic White EBL Group Equation

