
Census 2000 Topic Report No. 9
TR-9

Issued March 2004

Race and Ethnicity in Census 2000

U.S. Department of Commerce
Economics and Statistics Administration

U.S. CENSUS BUREAU

Census 2000 Testing, Experimentation, and Evaluation Program

Acknowledgments

The Census 2000 Evaluations Executive Steering Committee provided oversight for the Census 2000 Testing, Experimentation, and Evaluations (TXE) Program. Members included Cynthia Z. F. Clark, Associate Director for Methodology and Standards; Preston J. Waite, Associate Director for Decennial Census; Carol M. Van Horn, Chief of Staff; Teresa Angueira, Chief of the Decennial Management Division; Robert E. Fay III, Senior Mathematical Statistician; Howard R. Hogan, (former) Chief of the Decennial Statistical Studies Division; Ruth Ann Killion, Chief of the Planning, Research and Evaluation Division; Susan M. Miskura, (former) Chief of the Decennial Management Division; Rajendra P. Singh, Chief of the Decennial Statistical Studies Division; Elizabeth Ann Martin, Senior Survey Methodologist; Alan R. Tupek, Chief of the Demographic Statistical Methods Division; Deborah E. Bolton, Assistant Division Chief for Program Coordination of the Planning, Research and Evaluation Division; Jon R. Clark, Assistant Division Chief for Census Design of the Decennial Statistical Studies Division; David L. Hubble, (former) Assistant Division Chief for Evaluations of the Planning, Research and Evaluation Division; Fay F. Nash, (former) Assistant Division Chief for Statistical Design/Special Census Programs of the Decennial Management Division; James B. Treat, Assistant Division Chief for Evaluations of the Planning, Research and Evaluation Division; and Violeta Vazquez of the Decennial Management Division.

As an integral part of the Census 2000 TXE Program, the Evaluations Executive Steering Committee chartered a team to develop and administer the Census 2000 Quality Assurance Process for reports. Past and present members of this team include: Deborah E. Bolton, Assistant Division Chief for Program Coordination of the Planning, Research and Evaluation Division; Jon R. Clark, Assistant Division Chief for Census Design of the Decennial Statistical Studies Division; David L. Hubble, (former) Assistant Division Chief for Evaluations and James B. Treat, Assistant Division Chief for Evaluations of the Planning, Research and Evaluation Division; Florence H. Abramson, Linda S. Brudvig, Jason D. Machowski, and Randall J. Neugebauer of the Planning, Research and Evaluation Division; Violeta Vazquez of the Decennial Management Division; and Frank A. Vitrano (formerly) of the Planning, Research and Evaluation Division.

The Census 2000 TXE Program was coordinated by the Planning, Research and Evaluation Division: Ruth Ann Killion, Division Chief; Deborah E. Bolton, Assistant Division Chief; and Randall J. Neugebauer and George Francis Train III, Staff Group Leaders. Keith A. Bennett, Linda S. Brudvig, Kathleen Hays Guevara, Christine Louise Hough, Jason D. Machowski, Monica Parrott Jones, Joyce A. Price, Tammie M. Shanks, Kevin A. Shaw, George A. Sledge, Mary Ann Sykes, and Cassandra H. Thomas provided coordination support. Florence H. Abramson provided editorial review.

This report was prepared by Jorge H. del Pinal of the Population Division, under the direction of John F. Long, Chief. The following authors and project managers prepared Census 2000 experiments and evaluations that contributed to this report:

Demographic Statistical Methods Division:
Sharon R. Ennis
Phyllis Singer

Planning, Research and Evaluation Division:
Michael Bentley
Tracy L. Mattingly
Christine L. Hough

Population Division:
Claudette E. Bennett

Senior Mathematical Statistician:
Elizabeth A. Martin

Statistical Research Division:
Eleanor Gerber
Cleo D. Redline

Independent Contractors:
Susan Berkowitz, Westat
Fred R. Borsa

Greg Carroll and Everett L. Dove of the Administrative and Customer Services Division, and Walter C. Odom, Chief, provided publications and printing management, graphic design and composition, and editorial review for print and electronic media. General direction and production management were provided by James R. Clark, Assistant Division Chief, and Susan L. Rappa, Chief, Publications Services Branch.

U.S. Department of Commerce
Donald L. Evans, Secretary
Vacant, Deputy Secretary

Economics and Statistics Administration
Kathleen B. Cooper, Under Secretary for Economic Affairs

U.S. CENSUS BUREAU
Charles Louis Kincannon, Director


Suggested Citation

Jorge H. del Pinal, Census 2000 Testing, Experimentation, and Evaluation Program Topic Report No. 9, TR-9, Race and Ethnicity in Census 2000, U.S. Census Bureau, Washington, DC 20233

Economics and Statistics Administration
Kathleen B. Cooper, Under Secretary for Economic Affairs

U.S. CENSUS BUREAU
Charles Louis Kincannon, Director
Hermann Habermann, Deputy Director and Chief Operating Officer
Cynthia Z. F. Clark, Associate Director for Methodology and Standards
Preston J. Waite, Associate Director for Decennial Census
Teresa Angueira, Chief, Decennial Management Division
Ruth Ann Killion, Chief, Planning, Research and Evaluation Division

For sale by the Superintendent of Documents, U.S. Government Printing Office

Internet: bookstore.gpo.gov Phone: toll-free 866-512-1800; DC area 202-512-1800

Fax: 202-512-2250 Mail: Stop SSOP, Washington, DC 20402-0001

U.S. Census Bureau Race and Ethnicity in Census 2000 iii

Contents

Foreword
1. Introduction and Background
    1.1 Related reports
    1.2 Past research
    1.3 Research questions
2. Census 2000 Alternate Questionnaire Experiment
    2.1 Study design
    2.2 Limitations
    2.3 Findings in brief
    2.4 Overall race reporting
    2.5 Overall Hispanic-origin reporting
    2.6 Detailed Hispanic-origin reporting
3. Census 2000 Content Reinterview Survey
    3.1 Study design
    3.2 Limitations
    3.3 Findings in brief
    3.4 Consistency of race reporting
    3.5 Consistency of ancestry reporting
    3.6 Consistency of place-of-birth reporting
4. Census Quality Survey to Evaluate Responses to the Census 2000 Question on Race: An Introduction to the Data
    4.1 Study design
    4.2 Limitations
    4.3 Findings in brief
    4.4 Discussion of Census Quality Survey findings
5. Comparing the Race and Hispanic-Origin Data from the American Community Survey and Census 2000
    5.1 Study design
    5.2 Limitations
    5.3 Findings in brief
6. Puerto Rico Census 2000 Race and Ethnicity Questions
    6.1 Study design and limitations
    6.2 Nonresponse to race and Hispanic origin
    6.3 Hispanic-origin reporting
    6.4 Race reporting
7. Conclusion
    7.1 Effects of questionnaire changes
    7.2 Consistency in reporting
    7.3 Sequencing and nonresponse
    7.4 Comparing Census 2000 to other data sources
    7.5 Comparing Census 2000 and the 1990 census
    7.6 Puerto Rico
    7.7 Future research
8. Recommendations
    8.1 Pretest and evaluate all questionnaire changes, reduce uncontrolled variation in the questions that are asked, and conduct more research on mode and methodological influences on the data
    8.2 Use larger sample sizes for tests
    8.3 Avoid overly complex test designs – the simpler the better
    8.4 Explore ways to improve mail response – not only is it less expensive but we may also get more consistently reported race data
    8.5 Explore ways to improve training and monitoring of enumerator and interviewer behavior
    8.6 Explore ways to minimize the differences between, if not standardize, race and ethnicity questions across data collections
    8.7 Within each data collection, minimize or eliminate variation in response categories across forms to avoid introducing data processing differences
    8.8 Consider removing “Other...” check boxes and keeping the write-in area
    8.9 Consider not using “Some other race” in combination with other specified races (e.g., change “White and SOR” responses to “White alone”)
    8.10 Consider using information from other items to improve edit procedures, limit 100-percent data tabulations to PL 94-171 race and Hispanic-origin groups, and derive detailed groups from American Community Survey (ACS) data tabulations
    8.11 Conduct additional analysis of the Census Quality Survey data
References

LIST OF TABLES

Table 2.1 Alternative Questionnaire Experiment (AQE) Item Nonresponse for Hispanic Origin and Race by Hispanic Origin
Table 2.2 Alternative Questionnaire Experiment (AQE) Race Responses by Hispanics
Table 2.3 Alternative Questionnaire Experiment (AQE) Race Responses by Non-Hispanics and Hispanic Origin Not Ascertained
Table 2.4 Alternative Questionnaire Experiment (AQE) Detailed Hispanic Origin Responses by Form Type
Table 2.5 Comparison of Specific Hispanic-Origin Distributions From Census 2000 Long Forms and Simulated Totals Using Supplemental Information on Place-of-Birth and Ancestry
Table 3.1 Aggregate Response Variance Measures for Hispanic Origin (Unedited Data)
Table 3.2 Response Variance Measures for Hispanic Origin (Edited Data)
Table 3.3 Hispanic-Origin Index of Inconsistency: 2000 and 1990
Table 3.4 Hispanic-Origin Question by Questionnaire Type
Table 3.5 "Other Hispanic" Category by Questionnaire Type
Table 3.6 General Hispanic Responses in Census 2000 and Content Reinterview Survey
Table 3.7 Response Variance Measures for Race by Hispanic Origin (Edited Data)
Table 3.8 Race Question by Questionnaire Type
Table 3.9 Single Ancestry Responses From Content Reinterview Survey (CRS) and Census 2000
Table 3.10 "Hispanic" and "Spanish" Single Ancestry Responses in Census 2000 and Content Reinterview Responses
Table 3.11 "Hispanic" and "Spanish" Single Ancestry Responses in Content Reinterview and Census 2000 Responses
Table 3.12 Content Reinterview Survey (CRS) Place-of-Birth Reporting for Central and South America
Table 4.1 Census Quality Survey Data Collection Sequence: Race Instruction by Panel
Table 4.2 Overall Consistency of Race Reporting for Non-Hispanics for Panel A
Table 4.3 Overall Consistency of Race Reporting for Non-Hispanics for Panel B
Table 4.4 Non-Hispanics Reporting Selected Combinations of Two Races in Panel A Initial Interview by Re-contact Response Including Probe
Table 4.5 Non-Hispanics Reporting Selected Combinations of Two Races in Panel B Re-contact Interview by Initial Contact Response
Table 4.6 Percent of Non-Hispanics Reporting Selected Combinations of Two Races Providing or Not Providing One Consistent Race by Panel
Table 4.7 Example "Bridging" Parameters for Non-Hispanics Reporting Selected Combinations of Two Races and One Consistent Race by Panel
Table 5.1 Census 2000 and Census 2000 Supplementary Survey (C2SS) Hispanic Responses (Household


Foreword

The Census 2000 Testing, Experimentation, and Evaluation Program provides measures of effectiveness for the Census 2000 design, operations, systems, and processes and provides information on the value of new or different methodologies. By providing measures of how well Census 2000 was conducted, this program fully supports the Census Bureau’s strategy to integrate the 2010 planning process with ongoing Master Address File/TIGER enhancements and the American Community Survey. The purpose of the report that follows is to integrate findings and provide context and background for interpretation of related Census 2000 evaluations, experiments, and other assessments to make recommendations for planning the 2010 Census. Census 2000 Testing, Experimentation, and Evaluation reports are available on the Census Bureau’s Internet site at: www.census.gov/pred/www/.


1. Introduction and Background

This report discusses many of the key findings regarding race and Hispanic-origin reporting from Census 2000 research. We seek to assess how wording changes, question sequencing, revised instructions, dropping examples, and the option to report more than one race worked for Census 2000 in the United States and Puerto Rico, and to make recommendations for designing the 2010 Census questions on race and ethnicity.

1.1 Related reports

This topic report is related to the Content and Data Quality Topic Report and overlaps with it in the discussion of race and ethnic items – specifically race, Hispanic origin, ancestry, and place of birth.

1.2 Past research

In some ways we have learned a lot from our experience with Census 2000, and in some ways things have not changed all that much. Part of the difficulty is that we are trying to measure what is essentially a social phenomenon. In order to understand what is still occurring to this day, we need to review what has been said in the past. We would do well to reflect on the words of William Alonso and Paul Starr (1987:24-27):

Official statistics do not merely hold a mirror to reality. They reflect the presuppositions and theories about the nature of society. They are products of social, political, and economic interests that are often in conflict with each other. They are sensitive to methodological decisions made by complex organizations with limited resources. Moreover, official numbers... often do not reflect all these factors instantaneously: They echo their past as the surface of a landscape reflects its underlying geology. (1987:1) [emphasis added]

Official statistics directly affect everyday lives of millions of Americans. ...But official statistics also affect society in subtler ways. By the questions asked (and not asked), categories employed, statistical methods used, and tabulations published, the statistical systems change images, perceptions, aspirations. The Census Bureau’s methods of classifying and measuring the size of population groups determine how many citizens will be counted as “Hispanic” or “Native American.” These decisions direct the flow of various federally mandated “preferments,” and they in turn spur various allegiances and antagonisms throughout the population. Such numbers shape society as they measure it. (1987:2) [emphasis added]

Heraclitus noted that “change alone is unchanging” and Charles Dickens noted that “change begets change.” So it is that change affects our work at the Census Bureau. To paraphrase Shakespeare, sometimes we seek change, and sometimes change is thrust upon us. An area of regular change in decennial censuses is the items, categories, and methods used to collect racial and ethnic data (Edmonston and Schultze, 1995:142-143). We have included racial identification, in one form or another, in every census since the first in 1790 (Bennett, 2000:313; Petersen, 1987:193). Hispanic origin did not appear as a distinct question until 1970 (Chapa, 2000:244). Prior to 1960, census enumerators determined the race of respondents through observation (Bennett, 2000:314). Moreover, from 1790 to 1860 enumerators were not given instructions or definitions of racial categories, and were free to determine the race of each person (Petersen, 1987:190).

Our research leading to the introduction of mail- and self-enumeration in the 1960 census showed higher rates of enumerator error compared with self-enumeration error (Baylor, 2000:63). This suggested that census data could be more accurate if self-enumeration were used as much as feasible (Goldfield and Pemberton, 2000:149). Two conflicting issues have arisen in recent times. First, as the nation’s diversity increases, there is growing pressure for revising and expanding the categories included on the census to be inclusive of all groups and identities. Second, there is a growing and documented recognition of the fluidity and ambiguity of racial identities (Edmonston and Schultze, 1995:141). Courts have begun to litigate the classifications because of the different conceptual approaches. “The legal approach views individuals as potential members of protected classes,” while “the statistical approach reflects an effort to provide a comprehensive demographic profile that may extend beyond legal considerations” (Edmonston and Schultze, 1995:141).

According to Becker (2000:157), the 1980 census was the first that required us to produce data by race and Hispanic origin that conformed with the Office of Management and Budget’s (OMB) Statistical Directive No. 15, issued in 1977. The data requirements for the Public Law 94-171 “file included a count of the total population and the population eighteen and older by each of five race groups (White, Negro or Black, Asian and Pacific Islander, American Indian, and Other) in total and for persons who also reported a Hispanic origin. These items were required to meet the data needs of the Voting Rights Act.” As a consequence, these categories received prominent attention on the questionnaire and in early tabulations (Becker, 2000:157).

Another reason for heightened concern over the race and Hispanic-origin data was brought about by our research showing a differential undercount for people in racial and ethnic minority groups (Robinson and West, 2000:165-166). Many interest groups argued that the census should be adjusted for the undercount, but the Census Bureau concluded that the methods to achieve a fair and equitable adjustment were not available. The announcement by the Census Bureau Director that the 1980 census would not be adjusted was followed by numerous lawsuits which occupied census staff well into the 1980s. In the end, the 1980 census was not adjusted for the undercount (Becker, 2000:157).

Our research also shows that there are other factors associated with why people are missed in the census, but many of those factors, such as illiteracy or lack of English proficiency, lack of familiarity with reasons for data collection, and housing units without clear addresses or in high crime areas (Cohen, 2000:100), may disproportionately affect minority populations as well. Our research on the 1990 census shows that a differential undercount still existed but was declining (Bryant, 2000:160-161).

In any case, litigation demanding that census counts be adjusted for undercount also plagued the 1990 census. Numerous lawsuits were filed against the Census Bureau, and in the process they created a negative media environment during the census-taking and the data release. This round of litigation was not settled until the Supreme Court handed down a decision in March 1996 that left the census count as enumerated (Bryant, 2000:15-159).

Census 2000 did not escape the public and private scrutiny and litigation either. The General Accounting Office (GAO) alone issued at least forty-seven reports on the census between January 1995 and March 2003. We used a lot of staff, time, and resources to collect, analyze, and document our decision not to adjust Census 2000, and the GAO concluded that the two coverage measurement programs did not meet their objectives.¹ The U.S. Constitution gives Congress the authority to determine how the census will be conducted, but congressional oversight is influenced by the tension between decisions affecting how the census will be conducted and the political consequences of those decisions. While most technical and operational decisions are made by the Census Bureau, Congress continues to direct specific census operations (Lowenthal, 2000:83).

¹ The GAO references the Accuracy and Coverage Evaluation (A.C.E.) and the Integrated Coverage Measurement (ICM) programs.

1.3 Research questions

The major objectives of this Topic Report are to synthesize results from the Census 2000 Testing, Experimentation, and Evaluation Program research relevant to race and ethnicity, and to find answers to the following questions:

1. What was the overall effect on reporting of race and Hispanic origin engendered by the changes in question sequencing, wording, questionnaire layout, and dropping examples that were included in 1990? Was completeness of reporting adversely affected?

2. Did sequencing of Hispanic origin ahead of race have the desired effect of reducing nonresponse to Hispanic origin? Did the sequencing of Hispanic origin ahead of race result in proportionately fewer “Some other race” responses in race and did Hispanics have more complete reporting of race?

3. How do the decennial data on race compare to data collected in other sources, such as in the Accuracy and Coverage Evaluation (A.C.E.), the American Community Survey (ACS), the Census 2000 Supplementary Survey (C2SS), and the Current Population Survey?

4. Given the changes in the race and Hispanic-origin questions in 2000, how can these data be compared to data from 1990? What are the limitations of such comparisons? What lessons have we learned about bridging the Census 2000 race data so that they are more comparable to those collected prior to 1990 and in other data collections that do not allow for more than one race response?

5. Given that the Census 2000 of Puerto Rico was the first decennial census to ask a question on race in many decades, what were the issues in collecting those data? What were the general attitudes and problems expressed by the Puerto Rican public in terms of the race question? How do the race and ethnic data collected in Puerto Rico compare to those collected state-side for the total population, Hispanics, and Puerto Ricans in the United States?

6. What research and testing should be conducted before the 2010 Census in order to improve upon the Census 2000 questions on race and Hispanic origin?




2. Census 2000 Alternate Questionnaire Experiment

The Alternate Questionnaire Experiment (AQE) was one of the more effective evaluations conducted for Census 2000. Although its main limitation is that it can only inform us about the mail responses, Martin’s (2002a) AQE-based findings are significant for our understanding of the total effect of the changes in the census mail questionnaire from 1990 to 2000. A summary of Martin’s (2002a) findings follows.

2.1 Study design

During Census 2000, the Alternative Questionnaire Experiment 2000 mailed 1990-style short forms to an experimental sample of 10,500 households. The 1990-style form preserves 1990 question wording, categories, order, and format, but incorporates some recognizable elements of the Census 2000 design. Race and Hispanic-origin responses were coded and pre-edited using a simplified version of Census 2000 procedures, but were not fully edited and imputed. A control panel of about 25,000 households received Census 2000 questionnaires. Mail return rates were very similar for both panels (72-73 percent) (Martin, 2002a:iv).

2.2 Limitations

Results of the experiment are generalizable only to the Census 2000 mailout/mailback universe. Excluded are mail nonrespondents enumerated in nonresponse followup, and segments of the population enumerated in other operations (such as American Indians on reservations and Alaska Natives) (Martin, 2002a:iv). Race and Hispanic-origin responses were coded and edited using simplified versions of Census 2000 edit and imputation procedures. For example, reports of more than one race would not have been allowed in the 1990 census but were allowed in the 1990-style panel in the AQE. Furthermore, missing data were not imputed for race or for Hispanic origin.

One limitation listed prominently in Martin’s study is the relatively small sample size – “...so statistical inferences about small differences between forms, or small population groups” may not be reliable (Martin, 2002a:5).
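To illustrate why sample size matters for small differences, the following minimal sketch computes the standard error of a difference between two reported proportions. It is not taken from Martin's study: it assumes two independent simple random samples and ignores the clustering of people within households and any other design effects. The proportions and sample sizes are hypothetical values chosen only for illustration, and the function names `se_difference` and `z_statistic` are our own.

```python
import math


def se_difference(p1, n1, p2, n2):
    """Standard error of (p1 - p2) for two independent simple random samples."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def z_statistic(p1, n1, p2, n2):
    """Difference in proportions divided by its standard error."""
    return (p1 - p2) / se_difference(p1, n1, p2, n2)


# Hypothetical proportions and sample sizes (assumptions, not AQE data).
P_CONTROL, P_EXPERIMENT = 0.020, 0.015
BASE_N_CONTROL, BASE_N_EXPERIMENT = 2500, 1000

for scale in (1, 4, 16):
    n1 = BASE_N_CONTROL * scale
    n2 = BASE_N_EXPERIMENT * scale
    se = se_difference(P_CONTROL, n1, P_EXPERIMENT, n2)
    z = z_statistic(P_CONTROL, n1, P_EXPERIMENT, n2)
    print(f"n1={n1:>6} n2={n2:>6}  SE={se:.5f}  z={z:.2f}")
```

Under these simplifying assumptions, quadrupling both samples halves the standard error, so a difference that cannot be distinguished from sampling noise at one sample size may become detectable at a larger one.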

2.3 Findings in brief

2.3.1 Changes to the Census 2000 questionnaire resulted in “substantially improved completeness of race and Hispanic origin reporting” (Martin, 2002a:iv) as measured by item nonresponse.

• Hispanic origin: Overall item nonresponse to the question on Hispanic origin was 3.33 percent in the Census 2000-style questionnaire, compared with 14.46 percent in the 1990-style questionnaire.

• Race: Overall item nonresponse to the question on race was 3.27 percent in the Census 2000-style questionnaire, and 5.95 percent in the 1990-style questionnaire.

• Race nonresponse by Hispanics: Item nonresponse to the question on race by Hispanics was 20.79 percent in the Census 2000-style questionnaire, compared with 30.53 percent in the 1990-style questionnaire.

• Race nonresponse by non-Hispanics: Item nonresponse to the question on race was 0.60 percent by non-Hispanics in the Census 2000-style questionnaire, and 1.53 percent in the 1990-style questionnaire.

2.3.2 Discussion of item nonresponse

Item nonresponse is one of the main indicators of data quality because, in the absence of a response by the respondent, we must impute the missing information. Traditionally, Hispanic origin had one of the highest allocation rates among the short-form items (Edmonston and Schultze, 1995:150). One of the major changes in Census 2000 was to sequence Hispanic origin ahead of race in order to reduce the nonresponse to the Hispanic-origin question (OMB, 1997:58789). Census Bureau research showed that most people who did not answer the Hispanic-origin question in 1990 were non-Hispanics (Martin, 2002a:1). In addition, Census Bureau research showed that about 6 percent of those who did not respond to the Hispanic-origin item in the 1990 census were reported as Hispanic in the 1990 Content Reinterview Survey compared to about 7 percent of those who answered the question on Hispanic origin (McKenney, Bennett, Harrison, and del Pinal, 1993:5).

The 2000 Content Reinterview Survey (CRS) showed that 25 percent of people who left the question on Hispanic origin blank in Census 2000 but answered it in the CRS were of Hispanic origin (Singer and Ennis, 2002:52). By implication, the large majority of those who did not answer the question on Hispanic origin in the AQE were likely to be non-Hispanic.

Several Census Bureau tests conducted in 1987 and published in 1990 showed that reversing the order of the race and Hispanic-origin items, and adding instructions to answer both questions, resulted in improved Hispanic-origin response rates (Martin, 2002a:1). According to Petersen (1987:207), a Census Advisory Committee had recommended that the Census Bureau reverse the order of race and Hispanic-origin questions for the 1980 census. However, many saw this as an attempt “to raise the maximum the number that would be classified” as Hispanic (Petersen, 1987:207). As we will see in a section below, that fear appears to be unfounded. In any case, this change was implemented only after the Office of Management and Budget (OMB) mandated the sequencing change (Martin, 2002a:1).

From the standpoint of item nonresponse to the Hispanic-origin item, the changes in the Census 2000 questionnaire were highly successful. Compared to the 1990-style form, the 2000-style form may have reduced nonresponse by about eleven percentage points or 77 percent (see Table 2.1). However, as we will discuss later, reporting of specific groups may have been adversely affected by the questionnaire changes. On the other hand, race nonresponse shows a much more moderate level of improvement with the 2000-style form: a change of less than three percentage points or about 45 percent lower. As shown above, race nonresponse varies quite a bit by Hispanic origin. Nonresponse to race by Hispanics was reduced by almost 10 percentage points with the 2000-style form, which represents a change of only 32 percent. By contrast, nonresponse by non-Hispanics was reduced by just 0.93 percentage points with the 2000-style form, but this represents a 61 percent reduction. One downside to the 2000-style form, from the perspective of nonresponse, was a higher race nonresponse by people who did not respond to Hispanic origin either. Race nonresponse when Hispanic origin was missing was higher by 3.5 percentage points or about 36 percent in 2000-style forms compared with the 1990-style forms. However, this was a much smaller group than it was for the 1990-style form.
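The distinction drawn above between a percentage-point difference and a percent difference can be checked with a short calculation. The sketch below is illustrative only and is not part of the evaluation: the rates are the item nonresponse percentages reported in Table 2.1, while the names `RATES` and `compare` are our own labels.

```python
# Item nonresponse rates in percent, as (2000-style, 1990-style),
# copied from Table 2.1.
RATES = {
    "Hispanic origin": (3.33, 14.46),
    "Race": (3.27, 5.95),
    "Race, non-Hispanics": (0.60, 1.53),
    "Race, Hispanics": (20.79, 30.53),
    "Race, origin missing": (13.18, 9.72),
}


def compare(rate_2000, rate_1990):
    """Return (percentage-point difference, percent difference relative to 1990)."""
    difference = rate_2000 - rate_1990                # Table 2.1 column 3 = 1 - 2
    percent_difference = 100.0 * difference / rate_1990  # column 4 = 3 / 2
    return round(difference, 2), round(percent_difference, 1)


for item, (rate_2000, rate_1990) in RATES.items():
    difference, percent_difference = compare(rate_2000, rate_1990)
    print(f"{item:<22} {difference:+7.2f} points  {percent_difference:+6.1f} percent")
```

The same absolute change can look large or small depending on the base rate: a drop of 9.74 points from a base of 30.53 percent is a smaller relative change than a drop of 0.93 points from a base of 1.53 percent.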

2.3.3 Conclusion on item nonresponse

It is worth mentioning again that the previously discussed results may only apply to mail responses. The changes to the 2000 questionnaire appear to have produced a very salutary effect on Hispanic-origin nonresponse, at least in mail returns. Although nonresponse to the question on Hispanic origin is vastly lower in the 2000-style questionnaire (3.3 percent compared with 14.5 percent), it remains on the high side. Nonresponse to the race question is very low for non-Hispanics (0.6 percent on the 2000-style form, and 1.5 percent on the 1990-style form). On the other hand, race nonresponse remains unacceptably high for Hispanics at over 20 percent despite a significant improvement in race reporting by Hispanics in 2000-style forms. Future research is needed to address this persisting issue.

2.4 Overall race reporting

2.4.1 “Changes to the Census 2000 questionnaire also affected race reporting” (Martin, 2002a:iv,12).

• Reporting of Two or more races: In the Census 2000-style questionnaire 2.03 percent of respondents reported Two or more races compared with 0.82 percent in the 1990-style questionnaire.

• Reporting of Native Hawaiian and Other Pacific Islander: In the Census 2000-style questionnaire 0.17 percent of respondents reported Native Hawaiian and Other Pacific Islander compared with 0.05 percent in the 1990-style questionnaire.

• Reporting of Some other race: In the Census 2000-style questionnaire 3.72 percent of respondents reported Some other race compared with 4.42 percent in the 1990-style questionnaire.

Table 2.1
Alternative Questionnaire Experiment (AQE) Item Non-response for Hispanic Origin and Race by Hispanic Origin

Item                  2000-Style   1990-Style   Difference   Percent difference
                         (1)          (2)        (3=1-2)          (4=3/2)
Hispanic origin          3.33        14.46       -11.13           -77.0
Race                     3.27         5.95        -2.68           -45.0
  Non-Hispanics          0.60         1.53        -0.93           -60.8
  Hispanics             20.79        30.53        -9.74           -31.9
  Origin missing        13.18         9.72         3.46            35.6

Note: Bold numbers in Column 3 indicate significant differences at the p<.05 level.

Source: Derived from Martin (2002a:7,11 Table 2 and Table 4)

2.4.2 Discussion of overall race reporting

Compared to the 1990-style, the 2000-style form yields a higher proportion of responses of more than one race. We expected this finding because the 2000-style form allows reporting of more than one race but the 1990-style does not. This was also one of the changes called for by the new OMB standards (OMB, 1997:58789). What is more interesting is that nearly one percent of respondents to the 1990-style form also gave more than one race. While it is well known that people have responded in this manner in past censuses (Edmonston, Goldstein, and Tamayo Lott, 1996:23), our procedures edited multiple race responses into single responses (Cresce, 2003).

Another issue of concern is the reporting of “Some other race,” which is not a standard OMB race category (OMB, 1997:58789). The “Some other race” category was added for respondents who were unable to identify with one or more of the OMB categories (White; Black or African American; American Indian or Alaska Native; Asian; and Native Hawaiian and Other Pacific Islander). One difficulty is that “Some other race” (SOR) has become the third largest category after “White” and “Black or African American” (Grieco and Cassidy, 2001:2-3). Another difficulty is that for all other federal statistical purposes we have reclassified the SOR responses into the OMB categories, and there is “no way to evaluate how this reclassification corresponds to people’s self-perception” (Edmonston, Goldstein, and Tamayo Lott, 1996:39). As in previous censuses, the vast majority of people in the SOR category in Census 2000 were of Hispanic origin (Grieco and Cassidy, 2001:11). For all these reasons, it is important to examine race reporting separately by Hispanic origin (Martin, 2002a:13,14).

2.4.3 Race reporting by Hispanics

• Reporting of Two or more races by Hispanics: In the Census 2000-style questionnaire 7.84 percent of Hispanics reported Two or more races compared with 4.59 percent in the 1990-style questionnaires.

• Reporting of Some other race by Hispanics: In the Census 2000-style questionnaire 39.03 percent of Hispanics reported Some other race compared with 51.47 percent in the 1990-style questionnaire.

• Reporting of White by Hispanics: In the Census 2000-style questionnaire 48.98 percent of Hispanics reported White compared with 39.88 percent in the 1990-style questionnaire.

2.4.4 Discussion of race reporting by Hispanics

There are several significant differences in race reporting by Hispanics in the 2000 and 1990-style forms, as can be seen in Table 2.2. First, Hispanics were much less likely to report as SOR (about 24 percent less), and much more likely to report as White in the 2000-style forms (about 23 percent more). They were also more likely to select more than one race (about 71 percent more) than in the 1990-style, as expected. Other research (del Pinal, Martin, Bennett, and Cresce, 2002:3) shows that much of the Two or more races reporting by Hispanics involves SOR as one of the races reported. Thus, eliminating SOR responses reduces Hispanic reporting of Two or more races to about the same level as non-Hispanics.

Another interesting finding is that Hispanics are more likely (about 106 percent more) to report as American Indian in 2000-style than in 1990-style forms, and much less likely (about 93 percent less) to report as Native Hawaiian and Other Pacific Islander. But overall these differences are not statistically significant. However, Martin (2002a:13) reports that the difference is significant in the low coverage area (LCA) strata (2.08 vs. 0.79 percent, or about a 163 percent difference) but not in the high coverage area (HCA) strata. The 2000-style form captured more Native Hawaiian and Other Pacific Islander responses, although the percentages are small. At this point it may be worth reminding readers that small categories are “more vulnerable to inaccuracies” due to both sampling and non-sampling error (Edmonston, Goldstein, and Tamayo Lott 1996:24,39). Indeed, Martin (2002a:5) prominently lists among the limitations of this study the relatively small sample size – “so statistical inferences about small differences between forms, or small population groups” may not be reliable. In view of this limitation Martin’s (2002a) findings are remarkable indeed.

Table 2.2
Alternative Questionnaire Experiment (AQE) Race Responses by Hispanics

Race                                         2000-Style   1990-Style   Difference   Percent difference
                                                (1)          (2)        (3=1-2)          (4=3/2)
White                                          48.98        39.88         9.10            22.8
Black                                           2.07         2.32        -0.25           -10.8
American Indian and Alaska Native               1.48         0.72         0.76           105.6
Asian                                           0.58         0.88        -0.30           -34.1
Native Hawaiian and Other Pacific Islander      0.01         0.15        -0.14           -93.3
Some other race                                39.03        51.47       -12.44           -24.2
Two or more races                               7.84         4.59         3.25            70.8

Note: Bold numbers indicate significant differences at the p<.05 level.

Source: Derived from Martin (2002a:13 Table 6).

2.4.5 Race reporting by non-Hispanics

• Reporting of Native Hawaiian and Other Pacific Islander by non-Hispanics: In the Census 2000-style questionnaire 0.18 percent of non-Hispanics reported Native Hawaiian and Other Pacific Islander compared with 0.04 percent in the 1990-style questionnaire.

• Reporting of White by non-Hispanics: In the Census 2000-style questionnaire 81.15 percent of non-Hispanics reported White compared with 82.43 percent in the 1990-style questionnaire.

• Reporting of Two or more races by non-Hispanics: In the Census 2000-style questionnaire 1.45 percent of non-Hispanics reported Two or more races compared with 0.48 percent in the 1990-style questionnaire.

2.4.6 Discussion of race reporting by non-Hispanics

There are several significant differences in race reporting by non-Hispanics and respondents who did not report a Hispanic origin in the 2000 and 1990-style forms (see Table 2.3). First, non-Hispanics were slightly less likely to report as White (about 1.6 percent less), and much more likely to report as Pacific Islander in the 2000-style forms (about 350 percent more). As expected, and similar to Hispanics, non-Hispanics were also more likely to select more than one race (about 202 percent more) in the 2000-style form.

Martin (2002a:14) explains the slightly lower reporting of White among non-Hispanics in 2000-style forms as an effect of the option of reporting more than one race, yet there was no measurable downward effect on other categories. If this proposition is true, it suggests that people of more than one race tend to report as White when only one race response is allowed, but report as Two or more races when multiple race responses are allowed. In a later section I examine the propensity to report White among respondents who report more than one race, which may shed light on this issue.

Another interesting finding is that “contrary to what might have been expected, there is little evidence that allowing respondents to report more than one race reduced the single race reporting in the 5 major race categories” (Martin, 2002a:iv). This may allay some fears among those who thought that the reported size of some minority categories may be smaller because of the reporting of more than one race. However, one reason that the non-White categories appear not to be as affected is that the 1990-style forms also had some (0.82 percent) respondents report more than one race despite the instruction to report one. These multiple responses would have been edited into a single race category in 1990. In addition, almost one-third (30.5 percent) of Hispanics did not report a race on the 1990-style form, so it is unknown how their responses would have affected the results.


Table 2.3
Alternative Questionnaire Experiment (AQE) Race Responses by Non-Hispanics and Hispanic Origin Not Ascertained2

Race                                         2000-Style   1990-Style   Difference   Percent difference
                                                (1)          (2)        (3=1-2)          (4=3/2)
White                                          81.15        82.43        -1.28            -1.6
Black                                          12.28        12.02         0.26             2.2
American Indian and Alaska Native               0.38         0.48        -0.10           -20.8
Asian                                           4.39         4.34         0.05             1.2
Native Hawaiian and Other Pacific Islander      0.18         0.04         0.14           350.0
Some other race                                 0.17         0.20        -0.03           -15.0
Two or more races                               1.45         0.48         0.97           202.1

Note: Bold numbers indicate significant differences at the p<.05 level; bold italic numbers indicate significant differences at the .10 level.

Source: Derived from Martin (2002a:14 Table 7)

2 This table included both non-Hispanics and respondents who did not answer the Hispanic-origin question, which makes sense because our previous research suggests that most of the non-responders are not Hispanic (McKenney, Bennett, Harrison, and del Pinal, 1993:5).


The actual effect in published race data may depend on how these responses are allocated. In addition, as Martin (2002a:5) reminds us, these findings are generalizable only to the Census 2000 mailout/mailback universe.

2.5 Overall Hispanic-origin reporting

According to Martin (2002a:v), “despite the reversed sequence of Hispanic origin and race question wording differences, the same percentage (slightly over 11.1 percent) reported as Hispanic in both forms.”

Martin (2002a:7) reports that both the 2000- and 1990-style forms yielded nearly identical proportions of Hispanic respondents – about 11 percent. However, the high rates of missing data create uncertainty about the overall percentage of Hispanics identified by each form. On the other hand, the proportion of non-Hispanics in the 2000-style form was about 85 percent compared to about 74 percent in 1990-style forms. The remaining difference is due to people who did not respond – about 3 percent did not respond in 2000-style forms compared to about 14 percent in 1990-style forms.

2.5.1 Discussion of overall Hispanic-origin reporting

As discussed above, our previous research suggests that in the past non-Hispanics were much more likely to omit answering the Hispanic-origin question. Martin (2002a:7) concludes that “under this assumption, the results suggest the 2000-style questionnaire did not affect reporting as Hispanic, except to reduce the number of non-Hispanics who would have left the item blank in a 1990-style questionnaire.” The ultimate distributional effect would depend on how the missing data are edited and imputed. Martin (2002a:7) notes that the “difference in rates of missing data is very large, and was expected based on previous tests of effects of item sequence and an added instruction.”
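To illustrate why the treatment of missing data matters, the sketch below bounds the Hispanic share under two extreme assumptions: that every nonrespondent is non-Hispanic (lower bound) or that every nonrespondent is Hispanic (upper bound). This is only an illustration and not the Census Bureau's edit and imputation procedure; the function name hispanic_share_bounds is hypothetical, the reported share of 11.1 percent is an approximation of Martin's "slightly over 11.1 percent," and the nonresponse rates are taken from Table 2.1.

```python
def hispanic_share_bounds(reported_hispanic, item_nonresponse):
    """Bound the Hispanic share (percent of all respondents) under extreme assumptions.

    Lower bound: every nonrespondent to the Hispanic-origin item is non-Hispanic.
    Upper bound: every nonrespondent to the Hispanic-origin item is Hispanic.
    """
    lower = reported_hispanic
    upper = round(reported_hispanic + item_nonresponse, 2)
    return lower, upper


# Approximate reported Hispanic share on both forms, and item nonresponse from Table 2.1.
bounds_2000 = hispanic_share_bounds(11.1, 3.33)
bounds_1990 = hispanic_share_bounds(11.1, 14.46)

print("2000-style bounds:", bounds_2000)
print("1990-style bounds:", bounds_1990)
```

Under these assumptions the interval is far wider for the 1990-style form than for the 2000-style form, which is one way to see how the higher rate of missing data creates more uncertainty about the final distribution.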

This finding is very important because of the concerns that sequencing Hispanic origin ahead of race might have the effect of artificially inflating the number of Hispanics (Petersen, 1987:207). The equal proportions of Hispanics in the 2000-style and 1990-style forms (about 11 percent) strongly suggest that this was not the case. This supports Martin’s (2002:v) conclusion that “any changes from 1990 to 2000 in the fraction of the population identifying as Hispanic are not due to changes in the design of mail questionnaire. However, there were questionnaire effects on reporting a detailed Hispanic origin,” as I discuss next. We should note that the high rates of missing data create uncertainty about the overall fraction of Hispanics that would be identified by each form after the data were fully edited and imputed.

2.6 Detailed Hispanic-origin reporting

According to Martin (2002a:v), “the 2000-style questionnaires elicited fewer reports of specific Hispanic groups, and more reports of general Hispanic identity (e.g., Hispanic, Latino, Spanish) than the 1990-style questionnaires.”

Martin’s (2002a:10) AQE findings point out that “about 92 percent of Hispanics reported a specific group in 1990-style forms, compared with 80 percent who filled out 2000-style forms.”3 Martin (2002a:10) broke out these responses into five categories (shown in Table 2.4 below), and reported that the 2000-style forms produced more general or non-specific Hispanic responses (e.g., “Hispanic,” “Latino,” “Spanish”; or “Other Hispanic” without providing a write-in response) and fewer specific groups (“check box groups,”4 “example groups,”5 and all other specific national origin groups). Table 2.4 summarizes the final AQE results regarding Hispanic subgroup reporting.

Table 2.4
Alternative Questionnaire Experiment (AQE) Detailed Hispanic Origin Responses by Form Type

Hispanic origin                                  2000-Style   1990-Style   Difference   Percent difference
                                                    (1)          (2)        (3=1-2)          (4=3/2)
Total people identified as Hispanic (percent)     100.00       100.00
“Check box groups”                                 70.25        73.23        -2.98            -4.1
  Mexican, Mexican American, Chicano               54.26        58.68        -4.42            -7.5
“Example groups”                                    6.41        11.16        -4.75           -42.6
All other specific Hispanic groups                  4.20         8.68        -4.48           -51.6
Write-in general descriptor
  (“Hispanic”/“Latino”/“Spanish”)                  11.90         1.90        10.00           526.3
Other Hispanic, no write-in                         7.25         5.03         2.22            44.1

Note: Bold numbers indicate significant differences at the p<.05 level.

Source: Derived from Martin (2002a:9 Table 9).

3 GAO (2003:14) reported “93 percent of Hispanics given the 1990-style form reported a specific subgroup, compared to 81 percent of Hispanics given the 2000-style form,” but that was based on preliminary AQE findings.

4 Groups with their own specific checkbox included: 1) Mexican, Mexican Am., Chicano; 2) Puerto Rican; and 3) Cuban.

The largest difference between the 2000-style and 1990-style forms is the proportion of general Hispanic responses (“Hispanic,” “Latino,” and “Spanish”). The 2000-style forms produced 10 percentage points or 526 percent more of these responses than did the 1990-style forms. Similarly, the 2000-style form also produced another 2.22 percentage points or 44 percent more “Other Hispanic” responses with no write-in (see Table 2.4). On the other hand, the 2000-style forms produced fewer specific Hispanic groups than the 1990-style forms. The 2000-style form had about 43 percent (4.75 percentage points) fewer of the example groups and about 52 percent (4.48 percentage points) fewer of the specific non-example groups. Although 2000-style forms had 4 percent (2.98 percentage points) fewer specific checkbox groups overall than did the 1990-style form, that difference was not statistically significant. However, when compared separately, the Mexican-origin check box group was 7.5 percent (4.42 percentage points) lower in the 2000-style forms, and that difference is statistically significant.

Martin (2002a:10) concludes:

...the experiment does offer evidence that the questionnaire affected reporting of detailed Hispanic origin. Hispanics who filled out 2000-style mail questionnaires were less likely to report a specific Hispanic group and more likely to report a general descriptor (such as Hispanic, Latino, or Spanish) than those who filled out 1990-style questionnaires. Although the cause of the effect is uncertain, it is probably due to the combined effect of question wording and the elimination of examples in the Census 2000 questionnaire.

2.6.1 Discussion of detailed Hispanic-origin reporting

Are the AQE results just a fluke or is there other evidence of differences in reporting? I believe the AQE results for detailed Hispanic reporting do, in fact, explain much of what was noticed from the Census 2000 data. In a report about the Hispanic population from Census 2000, Guzmán (2001:2) also noted that “17.3 percent (6.1 million) of the total Hispanic population” did not give a specific national origin group; and these responses “were second in size” behind the population that reported Mexican origin.

As additional information from Census 2000 became available at more local levels during the summer of 2001, community advocates, journalists and researchers noted unexpectedly low numbers of specific Hispanic groups. According to Suro (2002:3), two competing explanations emerged: “either a large number of people had chosen to identify themselves with a broad ethnic designation, such as Hispanic or Latino, rather than a specific national origin, such as Dominican or Salvadoran, or these results were a product of changes in the way the census questionnaire asked about Hispanic origin.”

After examining the Hispanic-origin data, Logan (2002:3,4) concluded that “Census 2000 did an excellent job of counting Hispanics, but performed poorly in identifying their origin.” Among the likely causes, Logan noted that “no examples” and a “change in wording of the question itself” in Census 2000 resulted in “a severe underestimate of the numbers of specific Hispanic groups in 2000.” Logan also noted a more dramatic effect in states and metropolitan areas with large concentrations of specific Hispanic groups. Another reason Logan (2001:4) has for finding the Census 2000 results implausible is a comparison with the Census 2000 Supplementary Survey (C2SS). In the C2SS (which was taken the same year), about 9.6 percent of Hispanics did not report a specific national origin, compared with about 17.6 percent in Census 2000. Similarly, Suro (2002:8) finds the distribution of specific Hispanic groups more plausible in C2SS.

Responding to complaints from community groups, local government officials, and researchers, members of Congress asked the U.S. General Accounting Office (GAO) to look into the issue. The GAO (2003a:1) report expresses concerns that the “deletion of Hispanic subgroup examples” from the Census 2000 questionnaire was the cause of lower than expected “counts of Dominicans and other Hispanic subgroups.” GAO concluded that for “Census 2000, the [Census] Bureau removed the subgroup examples as part of a broader effort to simplify the questionnaire and help improve response rates.” GAO (2003a:14), as noted above, found that early AQE results and C2SS data seemed to indicate a problem with the Census 2000 detailed Hispanic distribution.

While the debate about how to identify the Hispanic population dates back to the 1960s (see Choldin, 1986:403), as seen above, the issue in Census 2000 is the distribution by specific national-origin groups. Choldin (1986:404) noted that “national statistics must change in response to sociopolitical changes” and that “the role of the statistician is not simply scientific, but is also conditioned by events in the political environment.” However, up until Census 2000, the differential undercount was controversial, not the distribution of specific groups.

5 Groups given as specific examples in the 1990-style form included: Argentinean, Colombian, Dominican, Nicaraguan, Salvadoran, and Spaniard.

2.6.2 Analysis of general Hispanic responses in Census 2000

At the request of members of Congress, the Census Bureau undertook the task of using information on place of birth and ancestry from the Census 2000 long form to supplement the general Hispanic-origin responses (Cresce and Ramirez, 2003). These new estimates “do not fully reflect self-identification” and are not meant to replace the official Census 2000 figures. Still, this “simulation” produced interesting results: of an estimated 5.7 million individuals who provided a general Hispanic response, 54 percent (3.1 million) also provided more information about their specific origin in either place of birth or ancestry. That left about 2.6 million individuals who gave no additional information about their specific Hispanic origin (Cresce and Ramirez, 2003:9).
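The general idea of supplementing a general Hispanic response with place-of-birth or ancestry information can be sketched as follows. This is a simplified illustration only, not the procedure used by Cresce and Ramirez (2003): the field names, the lookup table SPECIFIC_GROUPS, the sample records, and the rule of checking place of birth before ancestry are all assumptions made for the example.

```python
from collections import Counter

# General descriptors that carry no specific national origin.
GENERAL_TERMS = {"Hispanic", "Latino", "Spanish"}

# Hypothetical lookup from a place-of-birth or ancestry entry to a specific group.
SPECIFIC_GROUPS = {
    "Dominican Republic": "Dominican",
    "Dominican": "Dominican",
    "El Salvador": "Salvadoran",
    "Salvadoran": "Salvadoran",
}


def simulate_origin(record):
    """Reassign a general response to a specific group when supplemental data allow."""
    origin = record["hispanic_origin"]
    if origin not in GENERAL_TERMS:
        return origin  # specific responses are never changed
    # Assumed priority for this sketch: place of birth first, then ancestry.
    for field in ("place_of_birth", "ancestry"):
        group = SPECIFIC_GROUPS.get(record.get(field))
        if group is not None:
            return group
    return origin  # no usable supplemental information; the response stays general


# Hypothetical records for illustration only.
records = [
    {"hispanic_origin": "Hispanic", "place_of_birth": "Dominican Republic", "ancestry": None},
    {"hispanic_origin": "Latino", "place_of_birth": "United States", "ancestry": "Salvadoran"},
    {"hispanic_origin": "Spanish", "place_of_birth": "United States", "ancestry": None},
    {"hispanic_origin": "Mexican", "place_of_birth": "Mexico", "ancestry": "Mexican"},
]

simulated = Counter(simulate_origin(r) for r in records)
print(simulated)
```

As in the simulation described above, the sketch only moves people from general responses into specific groups; it never reduces a specific group and never changes the total number of Hispanic records.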

This simulation suggests that every single specific category (check box and specific write-in groups) could be increased using additional information from place of birth and ancestry (see Table 2.5). The simulation increases the proportion in all specific groups from 84 percent to 93 percent of all Hispanics, an increase of about 10 percent. However, example groups increased by 35 percent (2.4 percentage points) and other specific groups increased by 28 percent (1.4 percentage points). And by design, the general responses declined by 52 percent (5.1 percentage points) and the other non-specific responses declined by 57 percent (3.3 percentage points). These results are very similar to those of Martin (2002), Logan (2001), and Suro (2002), which all use slightly different data sources and methods. Cresce and Ramirez (2003:19) specifically compare the simulation total to Logan (2001) and Suro (2002), and find that the former overshoots and the latter undershoots the simulation totals.

2.6.3 Discussion of general Hispanic responses in Census 2000

Cresce and Ramirez (2003:7) list several limitations to their simulation analysis, some of which also apply to the research by Logan (2001) and Suro (2002). These analyses only add to specific groups by subtracting from the general groups, and don’t use contradictory information to reduce specific groups. All three analyses assume the total Hispanic population is correct and do not add or subtract from that total. The Martin (2002) analysis does not have this limitation, but is limited to a relatively small sample of mail returns. While Cresce and Ramirez (2003) use Census 2000 long form data, Logan (2001) models the distribution of specific groups with Current Population Survey (CPS) data, and Suro (2002) uses C2SS data. For a more detailed discussion of these differences, see Cresce and Ramirez (2003:7-8,19-20). All of these studies seem to indicate that the observed changes in the distribution of Census 2000 detailed Hispanic groups compared with changes seen in other sources were not due entirely to a shift in how people of Hispanic origin define themselves, but rather to some product of the changes in the way we asked the Hispanic origin question. We are left with the question of whether the elimination of examples was the probable cause of the reporting differences in detailed Hispanic groups.

2.6.4 Conclusions about detailed Hispanic responses in Census 2000

Table 2.5
Comparison of Specific Hispanic-Origin Distributions From Census 2000 Long Forms and Simulated Totals Using Supplemental Information on Place-of-Birth and Ancestry

Hispanic origin                                  Census 2000   Simulated   Difference   Percent difference
                                                  Long Form     totals
                                                     (1)          (2)       (3=2-1)          (4=3/1)
Total people identified as Hispanic (percent)      100.0        100.0
“Check box groups,” total                           72.5         77.2         4.6              6.4
  Mexican, Mexican American, Chicano                59.3         63.4         4.1              6.9
“Example groups”                                     6.9          9.2         2.4             34.7
All other specific Hispanic groups                   4.9          6.3         1.4             28.4
Write-in general descriptor
  (“Hispanic”/“Latino”/“Spanish”)                    9.9          4.8        -5.1            -51.8
Other Hispanic no-specific                           5.8          2.5        -3.3            -56.5

Source: Derived from Cresce and Ramirez (2003:11 Table 6).

In discussing the reporting changes in Hispanic groups, Martin (2002a:16) speculates:

Although the cause of the effect is uncertain, it is probably due to the combined effect of question wording and the elimination of examples in the Census 2000 questionnaire. The examples next to the write-in box provided cues about the type of answer intended by the question in the 1990-style form. In the Census 2000 questionnaire, the instruction to “print group” right after the “Yes, other Spanish/Hispanic/Latino” response category may have suggested to some respondents that they should print whichever of these three terms they preferred.

Although the elimination of examples is commonly assumed to be the main cause of this problem (see GAO 2003:2 for example), Martin (2002a:16) argues that “the hypothesis of example effects does not account for the higher reporting of Mexicans in the 1990-style form. This difference requires a different explanation, because the specific examples (Mexican, Mexican Am., Chicano) are identical in both forms.” Similarly, the analysis by Cresce and Ramirez (2003:11) suggests that all check box groups (Mexican, Puerto Rican, and Cuban) may have been affected, which also argues that something other than removing examples was at work.

Martin (2002a:16) goes on to argue that:

The wording change from “Is this person of Spanish/Hispanic origin?” to “Is this person Spanish/Hispanic/Latino?” may have contributed to the reporting difference. The Census 2000 question appears directed to an overarching identification as Hispanic (or Spanish or Latino), and the absence of specific Hispanic examples would reinforce this wording effect.

Reflecting on how examples may have affected both Hispanic and race reporting, Martin (2002b:4) notes:

The apparent contrast between the effects of examples in the Hispanic origin and race items merits further analysis and consideration. The examples in the 1990 Hispanic origin question may have served to clarify that the intent of the question was to collect detailed Hispanic origin, while the race question may not have suffered from the same ambiguity, hence may not need examples. In addition, the examples were placed differently in the two questions. In the 1990 form, the Hispanic examples were prominently placed, just below the “other Spanish/Hispanic” response option, above the write-in space. The race examples were off to the left, below the question and remote from the write-in space, where they were less likely to be seen than the Hispanic examples were. This difference in placement would likely reduce their impact in the race item compared to the Hispanic origin item.

Unfortunately, as Martin (2002a:16) suggests, “the experiment was designed to evaluate the effects of all the wording and design differences between the 1990 and 2000 mail questionnaires, it is not well suited to isolating the causes for this or other differences.” We speculate that in effect we changed the “sense” of the Hispanic-origin question by removing examples, dropping “origin” from the question, using three general terms separated by slash marks (Spanish/Hispanic/Latino), and using a write-in instruction (“Yes, other Spanish/Hispanic/Latino – Print group”) that seems to request that one of these terms be printed. All of these combined changes may have caused respondents to select among the terms listed (or even reject these terms) rather than report their specific origin. In later sections, I will present other evidence to support this contention. In any case, the AQE provides the most important and telling evidence to date on the effect of questionnaire changes in Census 2000.

As reflected in the GAO (2003a:10) report, neither the 1997 OMB revisions to Directive No. 15 nor Public Law 94-311 require us to collect data on detailed Hispanic groups but we have done so in the best effort to get an accurate overall count of the Hispanic population. All evidence points to the achievement of this goal in Census 2000. However, the fact we publish data on detailed Hispanic-origin groups indicates to data users that we have some confidence in the accuracy of the reported data. As GAO (2003a:3) summarized the issue, “while the [Census] Bureau reported what respondents marked on their questionnaires, because of confusion over the wording of the question, the subgroup data could be misleading” [emphasis added]. It may no longer be possible for us merely to publish what respondents provided without a thorough assessment of the data and a decision process about whether to publish or not. However, the public demand for census data, no matter how flawed or inconclusive, may give us no recourse but to make the data available. This point is well illustrated by the demand for group quarters (see GAO, 2003b) and adjustment data (see GAO, 2003c) from Census 2000.

While respondent confusion may play a role in producing differences in detailed Hispanic reporting, it is also likely that our instructions were not clear in communicating what we wanted from the respondent. As Martin (2002c:592) reminds us, “questionnaire changes that seem minor can have important effects” on our data. Therefore, we need to “pretest and evaluate all questionnaire changes,” and although we did conduct tests prior to the changes in the census questionnaire, “perhaps the test design and sample size were not adequate to detect” any effects that would illuminate these complex and important issues. It seems that an inadequate and small sample size, in particular, may limit our ability to detect the effect of changes. I will address these concerns in subsequent sections of this report. The GAO report emphasizes the need for further improvements in the quality of detailed Hispanic data, and highlights the need for consistency among data sets in this regard.

2.6.5 Reporting of detailed Asian and Pacific Islander responses in Census 2000

Given the concern about the effects of dropping examples on the reporting of specific Hispanic groups, Martin (2002b:1) undertook an examination of the AQE data to see how the changes in the questionnaire affected the reporting of specific race groups. Looking first at the race example groups (Hmong, Fijian, Laotian, Thai, Tongan, Pakistani, and Cambodian) taken as a whole, Martin (2002b:2) found a statistically significant difference in the reporting of these specific groups. However, the 2000-style form, which did not list examples, showed a higher proportion of these example groups than the 1990-style form (0.356 vs 0.106 percent). Martin (2002b:3) also notes that “in general, the 2000-style form elicited more reports of both the Asian and the Pacific Islander example groups, although only the overall differences for Asians and for Pacific Islanders are statistically significant at the .05 level.” One difficulty with the analysis was that there were no responses of specific Pacific Islander groups in the 1990-style forms (see Martin 2002b:3, Table 2), indicating that this sample may have been too small to conclude anything about example effects in this case. Martin (2002b:3) also notes that a larger sample is needed, but points out that “the difference is consistent for all the groups, and marginally significant for several (t > 1.645 is significant at p<.10 with a 2-tailed test), despite very small cell frequencies.” Additional research on the use of examples is addressed by Martin, Gerber, and Redline (2003).
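The threshold quoted above (t > 1.645 for p<.10 with a two-tailed test) can be checked with a short calculation. The sketch below assumes a large-sample normal approximation for the test statistic, which may differ from the variance estimation actually used in the AQE analysis; the function name two_tailed_p is illustrative.

```python
import math


def two_tailed_p(z):
    """Two-tailed p-value for a test statistic under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# 1.645 is the critical value cited for the .10 level; 1.96 is the usual .05 value.
for z in (1.645, 1.96):
    print(f"|t| = {z}: two-tailed p = {two_tailed_p(z):.3f}")
```

Under this approximation a statistic of 1.645 sits almost exactly at the .10 level and 1.96 at the .05 level, which is why results between the two values are described as marginally significant.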

Among Martin’s (2002:3-4) other findings was the discovery that there was no difference in overall reporting of the Asian category (4.04 percent in 2000-style and 4.06 percent in the 1990-style forms), but there were significantly more Pacific Islanders in the 2000-style forms (0.17 percent vs 0.05 percent). Martin (2002b:4) concludes that the “results do not indicate that dropping the examples had any negative effects on reporting of the [Asian and Pacific Islander] example groups in 2000-style forms,” but that “differences in reporting probably arise from other design features of the questionnaire, and are probably not a (perverse) effect of examples.”

Martin’s (2002b:4) preliminary conclusions are as follows:

Other questionnaire features are probably influencing the results for Pacific Islanders, in particular, splitting the API [Asian and Pacific Islander] category into two separate categories [“Asian”; “Native Hawaiian and Other Pacific Islander”]. The Pacific Islander category is probably more populated in 2000-style forms because it is easier for Pacific Islanders to report when the Pacific Islander boxes are grouped together rather than interspersed among Asian boxes, as they are in the 1990-style form, and when they have their own “Other Pacific Islander” response box associated with a write-in space.

Both Asian and Pacific Islander respondents may have been confused by the label “Other API” used in the 1990-style form, which requires close attention and skilled reading to decode, and which may have contributed to the difference in write-ins of example groups. I have not yet examined whether there are also form differences in write-ins of non-example Asian groups, which might shed light on whether the revisions made to the 2000-style forms led to a general increase in write-ins of specific Asian groups.



Content reinterview surveys conducted during decennial censuses have traditionally been an important tool in assessing the quality of census data (Thomas, Dingbaum, and Woltman, 1993:5). The Census 2000 Content Reinterview Survey (CRS) is no exception (Singer and Ennis, 2002). The purpose of the Census 2000 CRS was to evaluate the consistency of responses to Census 2000 through a reinterview of a sample of respondents. A summary of the Census 2000 CRS findings follows (Singer and Ennis, 2002).

3.1 Study design

The CRS randomly selected 30,000 households that were scheduled to receive the Census 2000 long form. Upon receipt of the long form from these households, they became eligible for a reinterview. The CRS randomly chose one sample person from each household to be reinterviewed via phone (from the roster collected at the beginning of the CRS) by an experienced census field representative. If a respondent could not be reached by phone, a personal visit interview was attempted. About 78.2 percent of interviews were conducted by telephone and 21.5 percent by personal visit; the remaining interviews utilized both modes, or the mode could not be determined (Singer and Ennis, 2002:3).

The primary goal of the CRS was to evaluate the quality of data collected in Census 2000 using simple response variance as measured by the index of inconsistency (Singer and Ennis, 2002:1). A discussion of interpreting the index of inconsistency appears below in section 3.3.2. While the index of inconsistency is a point estimate, the level of inconsistency was considered low if the index was less than 20, moderate if between 20 and 50, and high if greater than 50. A low level of inconsistency for an item was interpreted as meaning that there is “usually not a major problem,” a moderate level as “somewhat problematic,” and high as “very problematic” (Singer and Ennis, 2002:9).
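The cutoffs above amount to a simple classification rule. The following sketch is one way to express it; the function name is hypothetical, and placing indexes of exactly 20 or 50 in the moderate band is an assumption, since the text only says "between 20 and 50."

```python
def inconsistency_level(index: float) -> str:
    """Classify an index of inconsistency using the CRS cutoffs.

    Assumption: values of exactly 20 or 50 are treated as moderate.
    """
    if index < 20:
        return "low"        # "usually not a major problem"
    if index <= 50:
        return "moderate"   # "somewhat problematic"
    return "high"           # "very problematic"


# Aggregate indexes reported later in this section.
for value in (17.2, 23.1, 86.9):
    print(value, inconsistency_level(value))
```

In the terminology this report adopts, low inconsistency corresponds to good consistency, moderate to moderate, and high to poor.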

Singer and Ennis (2002:9) point out that:

The index of inconsistency may be substantially higher for rare categories [6] when only a few individuals among the small number reporting the characteristic change their response (interview vs. reinterview). This may also be a problem for small sample sizes, even when they don’t have rare characteristics. We may observe high indexes for rare categories in a distribution even though the gross difference rate (the proportion of individuals in the sample changing their minds) may be small.
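A small numerical sketch can make the rare-category effect concrete. The formula used here is the usual two-category form of the index (gross difference rate divided by the disagreement expected from the marginal proportions); it is an assumption in the sense that this report does not print the formula, and all counts are hypothetical.

```python
def gross_difference_rate(a: int, b: int, c: int, d: int) -> float:
    """Percent of cases whose answer changed between census and reinterview.

    a: in category both times; b: census only; c: reinterview only;
    d: in neither.
    """
    n = a + b + c + d
    return 100 * (b + c) / n


def index_of_inconsistency(a: int, b: int, c: int, d: int) -> float:
    """Two-category index: observed over chance-expected disagreement (x100)."""
    n = a + b + c + d
    p_census = (a + b) / n        # share in the category in the census
    p_reint = (a + c) / n         # share in the category in the reinterview
    expected = p_census * (1 - p_reint) + p_reint * (1 - p_census)
    return 100 * ((b + c) / n) / expected


# Hypothetical counts: the same 80 changed answers out of 10,000 cases.
rare = (60, 40, 40, 9860)       # category held by about 1 percent
common = (4960, 40, 40, 4960)   # category held by about 50 percent

print(gross_difference_rate(*rare), index_of_inconsistency(*rare))
print(gross_difference_rate(*common), index_of_inconsistency(*common))
```

Both hypothetical tables have the same gross difference rate, yet the index for the rare category falls in the moderate range while the index for the common category is close to zero.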

Ultimately, the CRS analyzed data for about 20,000 [7] preselected households (Singer and Ennis, 2002:4). The CRS used edit procedures similar to Census 2000 for race, Hispanic origin, and ancestry, but did not go as far as imputing for nonresponse (Singer and Ennis, 2002:4).

3.2 Limitations

This study does not address response bias because, unlike previous census CRS studies, no probing questions were asked. The test-retest response evaluation used in this study measures simple response variance (Singer and Ennis, 2002:10). The fact that no probing questions were asked is not necessarily a limitation because in order to measure bias one must know the “true” value of the characteristic being measured. The presumption had been that the probe or the CRS answer was true. The CRS questionnaire closely followed the enumerator questionnaire for Census 2000, but, unlike Census 2000, most interviews were conducted by telephone.

The mailback universe was overrepresented in the 2000 CRS: about three-quarters of the cases analyzed in CRS completed mailback forms in Census 2000, compared with 58 percent of the preselected households. For a majority of cases, then, there is a difference in the mode of collection between the census and the CRS. As a consequence, this study may underestimate inconsistency in Census 2000 because “data collected by mailback may be less inconsistent than data collected by enumerators” (Singer and Ennis, 2002:xxi,10-11). Additionally, the respondent answering the CRS was not always the census respondent.

3. Census 2000 Content Reinterview Survey

6 For CRS, a characteristic is rare when 5 percent or fewer cases fall in the category.

7 After removing census non-interviews, CRS non-interviews, and non-matches, CRS had 19,554 sample-person matches.


About 68.4 percent of the respondents were the same in CRS and census: 48.2 percent answered for themselves in both and 20.2 percent were proxy [8] respondents in both. About 22 percent were different respondents on CRS and census, and we were not able to determine the respondent in about 9.6 percent of the cases (Singer and Ennis, 2002:11). The data in this report are self-weighted and not weighted up to national estimates. Each housing unit had the same weight because the sample was selected with a single-stage systematic sample. The sampled person was selected at random within each household, so each person had an equal probability of selection within the household. So “sample persons within households of the same size had the same weight” (Singer and Ennis, 2002:11).

The CRS study compares CRS and Census 2000 data before consistency edits and imputations. Race, Hispanic origin, and ancestry were edited based only on the information of the sampled person. Among the possible contributors to response error are the questionnaire design, question wording, interview mode, interviewer effects, inadequate instructions, scanning errors, and deliberate falsification (Singer and Ennis, 2002:11). The CRS questionnaire mimicked the census enumerator questionnaire. Collection of information on race and Hispanic origin may have been affected by administration mode because responses may have been affected by the presence or absence of the flash card (Singer and Ennis, 2002:11-12).


In this topic report, I focus primarily on the consistency of race and Hispanic-origin reporting, and to a lesser extent on place of birth and ancestry, which are also of interest in racial and ethnic research. The remaining population and housing items are covered in the companion Content and Data Quality Topic Report. Of the 58 population characteristics evaluated by the CRS, 16 showed good consistency, 26 moderate consistency, and 16 poor consistency [9]. The CRS report considered Hispanic-origin and place-of-birth reporting to be of good consistency, and race and ancestry reporting to be of moderate consistency.

Over 95 percent of respondents answered both the race and Hispanic-origin question in Census 2000 and CRS. When answering 28 of the 58 population questions, including ancestry, households with non-Hispanic sample persons showed more consistency [10] than households with Hispanic sample persons. From most consistent to least consistent, households with White sample persons showed more consistency than households with Asian sample persons, households with sample persons reporting Two or more races, households with Black sample persons, and households with sample persons reporting other single races. However, households with Hispanic sample persons were more consistent in reporting place of birth than households with non-Hispanic sample persons (Singer and Ennis, 2002:19-20).

3.3.1 Consistency of Hispanic-origin reporting

According to Singer and Ennis (2002:xxii-xxiii), the edited data for the Hispanic-origin question displayed good consistency. But the lack of instructions for Hispanic origin may have caused some respondents to “choose multiple categories” although the intent was to get only one category.

Singer and Ennis (2002:52-53) note that the changes in the Hispanic-origin question, including sequencing it ahead of race, the dropping of examples, changing the question wording and adding “Latino,” and the new instructions to answer both Hispanic origin and race may have influenced consistency. They analyzed the Hispanic-origin responses in two ways. First, they treated each response category as a “Yes/No” question, using the unedited data. Second, they grouped the responses, including write-in entries, into eight categories, using the edited data.

The first analysis suggested good consistency for the “non-Hispanic” and the “Mexican” categories, but only moderate consistency for the “Puerto Rican,” “Cuban,” and “Other Hispanic” categories (see Table 3.1). The second analysis with eight categories (see Table 3.2) also showed good consistency with only about 3.3 percent of respondents changing their answers, and an aggregate index of inconsistency of 17.2. However, as Singer and Ennis (2002:53-54) remind us, all categories were “rare” except the “Non-Hispanic” and “Mexican” categories. They also noted that about 20 percent of those who changed answers went from non-Hispanic in the

8 In this report, “proxy” refers to a respondent who was a household member but not the sample person.

9 For simplicity of expression, the following terms used in the CRS report were modified: 1) low inconsistency = good consistency; 2) moderate inconsistency = moderate consistency; and 3) high inconsistency = poor consistency.

10 The phrase “more consistency” is used in this report instead of “less inconsistency,” and so on from the CRS report, for ease of expression.


census to a mix of non-Hispanic and Hispanic in CRS, and about 53 percent of those chose “non-Hispanic” and “Mexican.” About 16 percent of those who changed answers were “Other Hispanic” in census and “Mexican” in CRS. What is clear is that most of the inconsistency arises in the “Other Hispanic” category and the multiple reports, as can be seen in Table 3.2.

One caution noted by Singer and Ennis (2002:55) was that the “net difference rates for all categories except ‘Puerto Rican’ and ‘Multiple non-Hispanic’ were statistically different from zero suggesting that the CRS was not independent of the census and/or did not replicate the census conditions as well as desired.” Net difference rates (NDRs) give the difference between the original percent in a specific answer category and the reinterview percent in the same category. An NDR that is statistically different from zero suggests that the assumption of replication is not satisfied.
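The NDR definition above reduces to a subtraction of two percentages. The sketch below uses hypothetical counts and a hypothetical function name; it does not reproduce the significance test behind the asterisks in Tables 3.1 and 3.2.

```python
def net_difference_rate(census_count: int, reinterview_count: int, n: int) -> float:
    """Census percent in a category minus reinterview percent, in points."""
    census_pct = 100 * census_count / n
    reinterview_pct = 100 * reinterview_count / n
    return census_pct - reinterview_pct


# Hypothetical example: of 1,000 matched sample persons, 130 fall in a
# category in the census and 139 fall in it in the reinterview.
ndr = net_difference_rate(130, 139, 1000)
print(round(ndr, 1))
```

A negative value means the category was reported more often in the reinterview than in the census. An NDR can be zero even when many individuals change answers, as long as the changes in each direction offset one another.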

Among Singer and Ennis’ (2002:55-56) other findings about Hispanic-origin reporting were:

– households with foreign-born sample persons showed good consistency compared with moderate consistency of households with native-born sample persons.

– respondents who reported on mailback forms and those who reported to enumerators both showed good consistency, and their indexes (17.6 and 16.9, respectively) were not statistically different.

– when the data were analyzed as single response versus multiple response, they showed poor consistency. Giving multiple responses was a “rare” category, which, as stated above, can affect the index of inconsistency. Only about 1.4 percent of responses were multiple.

– about 77 percent of those who changed their answers reported a single response in the census and multiple responses in the CRS; and about 23 percent reported multiple responses in the census and a single response in the CRS.

3.3.2 Discussion of Hispanic-origin reporting

According to Thomas, Dingbaum, and Woltman (1993:8-9), there are several ways to interpret the index of inconsistency, depending on the methodology used to collect reinterview data.

1. If each of the two observations (the census and the reinterview in this case) is regarded as an independent repetition of the same survey procedure under the same general conditions, the index of inconsistency estimates the ratio of simple response variance to the sum of sampling variance and simple response variance. In this case, as noted by Biemer (1985), the index of inconsistency measures the impact of mis-classification errors on total variance of an observation (emphasis added).

Table 3.1 Aggregate Response Variance Measures for Hispanic Origin (Unedited Data)

                                                                Index of inconsistency
Reinterview          Net difference    Consistency                 90-percent
classification       rate              level          Estimate     confidence interval

Not Hispanic         *02               Good           10.2         9.3 to 11.1
Mexican              *-0.9             Good           18.0         16.6 to 19.5
Puerto Rican         *-0.3             Moderate       22.7         19.4 to 26.6
Cuban                *-0.3             Moderate       41.7         34.6 to 50.3
Other Hispanic       0.0               Moderate       42.2         39.0 to 45.7

* NDR significantly different from zero.

Source: Adapted from Singer and Ennis (2002:53 Table 33).

Table 3.2 Response Variance Measures for Hispanic Origin (Edited Data)

                                                                              Index of inconsistency
Reinterview                        Net difference    Consistency                 90-percent
classification                     rate              level          Estimate     confidence interval

Non-Hispanic                       *0.6              Good           10.1         9.2 to 11.0
Mexican                            *-0.3             Good           13.4         12.2 to 14.8
Puerto Rican                       0.0               Good           14.2         11.5 to 17.6
Cuban                              *-0.1             Good           13.7         9.3 to 20.1
Other Hispanic                     *0.4              Moderate       33.8         30.7 to 37.3
Multiple non-Hispanic              0.0               Poor           100.0        42.5 to 100.0
Multiple Hispanic                  *-0.1             Poor           80.5         62.4 to 100.0
Mixed non-Hispanic and Hispanic    *-0.6             Poor           98.6         88.0 to 100.0

Aggregate                                            Good           17.2         16.1 to 18.4

* NDR significantly different from zero.

Source: Singer and Ennis (2002:55 Table 36).


2. The index of inconsistency may also be interpreted as a complement of a measure of agreement between the census and the reinterview responses. Viewed in this way, the index is the ratio of the observed number of response differences to the number that would occur if the cell counts were formed by a random agreement mechanism based on the observed marginal distributions (census and reinterview).

So “when the second observation is not an attempt to repeat the original interview procedure but may represent an ‘improved’ data source,” the first interpretation of the index of inconsistency may be questionable. The second interpretation is appropriate “even when the second observation is not an attempt to repeat the original interview procedure identically” (Thomas, Dingbaum, and Woltman 1993:9). In this regard, it may be more appropriate to regard the 2000 CRS indexes of inconsistency in this fashion rather than as simple response variance estimators.
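The second interpretation can be sketched directly from a census-by-reinterview cross-tabulation: count the observed disagreements and divide by the disagreements expected if cells were filled at random from the two sets of marginal totals. The function name and the 3-by-3 table are hypothetical; this is one reading of the description above, not the CRS production computation.

```python
def aggregate_index(table: list[list[int]]) -> float:
    """Observed disagreements over chance-expected disagreements (x100).

    table[i][j] counts cases in category i in the census and category j
    in the reinterview.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed_disagree = n - sum(table[i][i] for i in range(k))
    row_totals = [sum(row) for row in table]                       # census margins
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]  # reinterview margins
    # Agreements expected if the two responses were independent.
    expected_agree = sum(r * c for r, c in zip(row_totals, col_totals)) / n
    return 100 * observed_disagree / (n - expected_agree)


# Hypothetical three-category table with 200 matched cases.
example = [
    [80, 5, 5],
    [5, 40, 5],
    [5, 5, 50],
]
print(round(aggregate_index(example), 1))
```

Read this way, the index equals 100 times one minus a chance-corrected agreement measure, so it needs no assumption that the reinterview repeats the census procedure.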

How does the 2000 CRS compare to the 1990 CRS? Looking first at the aggregate index of inconsistency [11] in Table 3.3, the 2000 index (17.2) is greater than the 1990 index (12.2), although both are still low.

One reason for the difference in indexes is that more categories were used in the calculation in 2000 than in 1990. As Thomas, Dingbaum, and Woltman (1993:9) remind us, “the level of index is sensitive to the number and detail of categories in a classification system as well as to the distribution of the population over these categories” [emphasis added]. Similarly, as discussed previously, Singer and Ennis (2002:53-54) remind us that all categories were “rare” except the “Non-Hispanic” and “Mexican” categories. Although the total sample size should have no effect on the difference in indexes, the total sample size in the 1990 CRS (n=29,647) was about 52 percent larger than in the 2000 CRS (n=19,554) (see Thomas, Dingbaum and Woltman, 1993:30; Singer and Ennis, 2002:4). A larger sample in 2000 may have yielded a greater number of observations in the rarer categories.
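The sample-size comparison can be verified with a line of arithmetic. The variable names below are illustrative; the two counts are the ones cited in the paragraph above.

```python
n_1990 = 29_647  # 1990 CRS sample size (Thomas, Dingbaum, and Woltman, 1993:30)
n_2000 = 19_554  # 2000 CRS sample-person matches (Singer and Ennis, 2002:4)

# Percent by which the 1990 sample exceeds the 2000 sample.
pct_larger = 100 * (n_1990 - n_2000) / n_2000
print(round(pct_larger, 1))
```

The result rounds to the "about 52 percent" figure in the text.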

Turning to individual categories, we see in Table 3.3 that there was much more consistent reporting in 1990 in the “Mexican” and “Puerto Rican” categories, but about the same consistency in reporting for the “Cuban” and “Other Hispanic” categories. One explanation for the difference in reporting consistency is that the 1990 CRS used exactly the same question in census and CRS (Thomas, Dingbaum, and Woltman, 1993:6), but the 2000 CRS did not, as we will see below. Another reason is that the 2000 CRS used telephone interviews (78 percent; see Singer and Ennis, 2002:3) to a much greater extent than was probably the case in 1990.

Unlike the 1990 CRS, the questions asked in the 2000 CRS differed from the ones used in the census. In the case of Hispanic origin in particular, the CRS question is quite different from the mail form, but more similar to the Census 2000 interviewer form (see Table 3.4).

A reasonable person might conclude that the mailback Hispanic-origin question is really asking if a person is “Spanish/Hispanic/Latino,” whereas the enumerator and CRS questions are asking about specific groups (e.g., “Mexican,” “Puerto Rican,” “Cuban,” or of another Hispanic or Latino group). All had very similar response categories, with the possible exception of the “Other

Table 3.3 Hispanic-Origin Index of Inconsistency: 2000 and 1990

Hispanic-origin category              2000 CRS    1990 CRS

Not Hispanic                          10.1        9.3
Mexican                               13.4        8.5
Puerto Rican                          14.2        8.6
Cuban                                 13.7        13.6
Other Hispanic                        33.8        34.1
Multiple non-Hispanic                 100.0       (X)
Multiple Hispanic                     80.5        (X)
Mixed non-Hispanic and Hispanic       98.6        (X)

Aggregate                             17.2        12.2

(X) Not applicable.

Source: Adapted from Singer and Ennis (2002:55 Table 36) and Thomas, Dingbaum, and Woltman (1993:36 Table 3; 17 Table 4.1).

11 In 1990 the aggregate index was referred to as an L-fold index and was defined as “a weighted average of the individual indexes computed for each category of a distribution” (Thomas, Dingbaum, and Woltman 1993:9).

Table 3.4 Hispanic-Origin Question by Questionnaire Type

Census 2000 questionnaire          Hispanic-origin question

Census 2000 Form D-2               Is this person Spanish/Hispanic/Latino? Mark X the “No”
(mailback long form)               box if not Spanish/Hispanic/Latino.

Enumerator Questionnaire           Are any of the persons that I have listed Mexican, Puerto
Form D-2(E)                        Rican, Cuban, or of another Hispanic or Latino group?

Content Reinterview Survey         (Are you/Is...) Mexican, Puerto Rican, Cuban, or of
Form D-1010 (5-10-2000)            another Hispanic or Latino group?


Hispanic” category (see Table 3.5). Furthermore, the mailback question could also be seen as asking a person to select among the choices “Spanish/Hispanic/Latino.” The “print group” instruction on the mail form may have reinforced this because no examples were listed.

In addition, the instruction “Mark X the ‘No’ box if not Spanish/Hispanic/Latino” on the mailback form may be interpreted as instructing the respondent to mark “No” if he/she does not identify with any or all of the terms. Either of these interpretations could have led to some of the multiple responses and the “switching” observed in CRS.

Consider a hypothetical example of a respondent of Mexican origin who might have reasonably concluded that the “proper” answer to the mailback form was one of the following:

1. “No, not Spanish/Hispanic/Latino” because he/she did not identify with any or all of the terms; or

2. “No...” and “Yes, Mexican, Mexican Am., Chicano” because he/she did not identify with any or all of the general terms, but does identify as Mexican, or it could be because he/she is of mixed heritage; or

3. “Yes, other Spanish/Hispanic/Latino” and a write-in of “Spanish,” “Hispanic,” or “Latino” because he/she identifies with one of the general terms; or

4. “Yes, Mexican ...” and “Yes, other ...” and a write-in of “Spanish,” “Hispanic,” or “Latino” because he/she identifies as Mexican and also identifies with one of the general terms (in essence votes for a favorite rubric).

Yet, during the reinterview the respondent may have selected the “Yes, Mexican ...” category or another inconsistent choice. Dropping examples in Census 2000 may have also led to the impression we were asking respondents to select, or even reject, the general responses (see Martin, 2002:16).

Using the edited data from the 2000 CRS study, Table 3.6 shows the distribution of general Hispanic responses (such as Spanish, Hispanic, or Latino) [12]. These data suggest that some portion of respondents shift between a general Hispanic response and specific responses. Most of the time the shift is towards a specific Hispanic national origin. For example, 205 CRS respondents gave a general response in Census 2000. Of those, about 24 percent (weighted) also gave a general response in CRS, 12 percent switched to non-Hispanic, and 64 percent switched to a specific Hispanic national origin. From the opposite perspective, there were 138 respondents in CRS that gave general Hispanic responses. Of those, about 35 percent gave general responses in the census, 17 percent non-Hispanic, and 48 percent specific Hispanic national-origin responses. In any case, any confusion arising from the issues discussed above would lead to a much poorer consistency,

Table 3.5 “Other Hispanic” Category by Questionnaire Type

Census 2000 questionnaire          Hispanic-origin question

Census 2000 Form D-2               Yes, other Spanish/Hispanic/Latino—Print group.
(mailback long form)

Enumerator Questionnaire           Yes, other Spanish/Hispanic/Latino—What is this group?
Form D-2(E)

Content Reinterview Survey         Yes, other Spanish/Hispanic/Latino—What is this group?
Form D-1010 (5-10-2000)

Table 3.6 General Hispanic Responses in Census 2000 and Content Reinterview Survey

                                 Number of general                      Number of general
                                 Hispanic responses in                  Hispanic responses in
                                 the census question      Weighted      the CRS question        Weighted
Hispanic-origin category         by CRS response          distribution  by census response      distribution

Total                            205                      100.0%        138                     100.0%
Not Hispanic                     32                       12.3%         27                      16.7%
Mexican                          72                       35.8%         35                      28.5%
Puerto Rican                     7                        14.6%         7                       4.4%
Cuban                            5                        6.7%          -                       -
Central and South American       42                       6.9%          22                      15.2%
General responses                47                       23.8%         47                      35.1%

- Represents zero.

Source: Special tabulation of the 2000 CRS micro data.

12 The general terms used included: Hispanic, Latino, Spanish, Spanish American, Other Central American, Other South American, Other Hispanic check box with no write-in, Spaniard (including specific terms), and all other non-specific national origins.


as measured by the CRS. Many of these types of responses are treated as a “change in response” although they may reflect unintended effects of question design changes and methodological differences rather than inaccuracies of reporting.
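The shares cited in the discussion of Table 3.6 can be reproduced by grouping the weighted distribution into general, non-Hispanic, and specific national-origin responses. The dictionary layout and function name below are illustrative; the percentages are copied from Table 3.6, with the zero Cuban cell in the second column entered as 0.0.

```python
# Weighted distributions (percent) from Table 3.6.
census_general_by_crs = {
    "Not Hispanic": 12.3, "Mexican": 35.8, "Puerto Rican": 14.6,
    "Cuban": 6.7, "Central and South American": 6.9,
    "General responses": 23.8,
}
crs_general_by_census = {
    "Not Hispanic": 16.7, "Mexican": 28.5, "Puerto Rican": 4.4,
    "Cuban": 0.0, "Central and South American": 15.2,
    "General responses": 35.1,
}

NON_SPECIFIC = {"Not Hispanic", "General responses"}


def specific_origin_share(distribution: dict[str, float]) -> float:
    """Sum the percentages for specific Hispanic national origins."""
    return sum(pct for name, pct in distribution.items() if name not in NON_SPECIFIC)


share_from_census = specific_origin_share(census_general_by_crs)
share_from_crs = specific_origin_share(crs_general_by_census)
print(round(share_from_census), round(share_from_crs))
```

The two sums round to the 64 percent and 48 percent figures given in the text.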

3.4 Consistency of race reporting

The race questions changed substantially between 1990 and 2000. Among the most significant changes were that in Census 2000 respondents were allowed to select more than one race, whereas in 1990 they were only allowed to select one; in Census 2000 Hispanic origin was sequenced ahead of race, while in the 1990 census it followed, with two other questions in between the two; the 1990 category “Asian and Pacific Islander” was split into separate “Asian” and “Native Hawaiian and Other Pacific Islander” categories; the 1990 categories “American Indian,” “Eskimo,” and “Aleut” were combined into an “American Indian and Alaska Native” category; and the 1990 examples for Asian and Pacific Islander groups were removed (see Singer and Ennis, 2002:56, and Martin, 2002:2).

As with Hispanic origin, Singer and Ennis (2002:56) analyze the race data in two ways. In the first analysis, Singer and Ennis (2002:57) examine only the check box entries and treat them as “Yes/No” responses. They note that “all categories were rare except ‘White,’ ‘Black or African Am. or Negro’ and ‘Some other race,’” and that “the net difference rates for eleven of the fifteen categories were statistically different from zero, suggesting that the CRS was not independent and/or did not replicate the census conditions very well.” Only the “White,” “Black,” “Filipino,” and “Korean” categories have good consistency. Next, Singer and Ennis (2002:57) look at edited (but not imputed) race data grouped into seven categories. The edited data showed moderate consistency, with 7.6 percent of respondents changing their race and an aggregate index of 23.1. “American Indian and Alaska Native (AIAN),” “Native Hawaiian and Other Pacific Islander (NHPI),” and “Two or more races” categories were considered rare. In addition, the net difference rates for the “White,” “Some other race,” and “Two or more races” categories are statistically different from zero, meaning at least one of the model assumptions of independence or replication was not met.

About 14 percent of the respondents who changed their race between the census and the CRS reported as “White” in the census and “Some other race” in CRS. About 32 percent reported just the opposite: “Some other race” in census and “White” in CRS. Analysis of these responses indicated that the “majority of the persons in these two inconsistent categories were of Hispanic origin” (Singer and Ennis, 2002:58).

Singer and Ennis (2002:59) then analyzed the data by Hispanic origin and found that households with non-Hispanic sample persons showed more consistency (good) than households with Hispanic sample persons (poor). Therefore, Singer and Ennis (2002:59) conclude that “this suggests that the Hispanic population are contributing greatly to the variability in the race data.”

3.4.1 Discussion of race reporting

Although the consistency of reporting race leaves much to be desired, it is quite clear that respondents of Hispanic origin are less likely to report consistently

Table 3.7 Response Variance Measures for Race by Hispanic Origin (Edited Data)

                                       Non-Hispanic                                    Hispanic
                                                     Index of inconsistency                          Index of inconsistency
                                       Consistency              90-percent             Consistency              90-percent
Race categories                        level         Estimate   confidence interval    level         Estimate   confidence interval

White                                  Good          9.1        8.4 to 9.8             Poor          88.6       84.8 to 92.8
Black, African Am., or Negro           Good          3.9        3.3 to 4.5             Moderate      47.8       36.6 to 62.4
Am. Indian or Alaska Native            Moderate      32.1       26.1 to 39.5           Poor          72.0       50.5 to 100.0
Asian                                  Good          7.1        5.9 to 8.6             Moderate      30.5       11.7 to 79.8
Native Hawaiian or Pacific Islander    Moderate      38.5       26.0 to 57.0           Poor          100.0      44.4 to 100.0
Some other race                        Poor          90.5       74.5 to 100.0          Poor          90.5       86.2 to 95.2
Two or more races                      Poor          72.9       67.5 to 78.7           Poor          85.5       74.5 to 98.2

Aggregate                              Good          12.6       11.8 to 13.5           Poor          86.9       83.4 to 90.6

Source: Adapted from Singer and Ennis (2002:59 Table 41)


than non-Hispanics. However, among non-Hispanics, only Blacks, Asians, and Whites showed good consistency, while American Indians and Pacific Islanders showed only moderate reporting consistency (see Table 3.7). The “Some other race” and “Two or more races” categories showed poor reporting consistency. As was discussed extensively in the Hispanic-origin reporting section, there are many reasons why we see such inconsistent reporting in race.

Research by Jones and Smith (2003:4) found that the potential number of children in interracial families who could have been reported as more than one race approaches the number of children who were actually reported as more than one race in Census 2000. Thus, Census 2000 does not reflect the potential number of “multiracial” children [13]. This suggests there is, and will be, a substantial proportion of respondents who at any one time may move in and out of the multiple race population, making the exact measurement of this group challenging indeed.

How does race reporting in Census 2000 compare to 1990? Thomas, Dingbaum, and Woltman (1993:21) reported good consistency for Whites (13.5 index of inconsistency), Blacks (3.9), and Asian and Pacific Islanders (9.4); moderate consistency for American Indian, Eskimo, and Aleut (41.2); and poor consistency for “Other Asian and Pacific Islander” (82.9) and “Other race” (70.3).

Similar to Singer and Ennis (2002), Thomas, Dingbaum, and Woltman (1993:21) reported that the majority of the respondents switching between “White” and “Other race” and vice versa were Hispanic. Unlike the Hispanic question in 1990, the race data were “evaluated using a response-bias (probing) type reinterview,” and “the CRS may be viewed as the ‘preferred’ measurement technique” (Thomas, Dingbaum, and Woltman, 1993:6). Given the assumption that the CRS is the preferred measure of race, Thomas, Dingbaum, and Woltman (1993:21) concluded that “the Hispanic population are contributing most of the bias in the race data in the census” by over-reporting as “Other” and under-reporting as “White.” This may have been the result of “respondent confusion” or “interviewer behavior in the reinterview survey.” In any case, it is clear in both studies that Hispanic respondents had trouble answering the race question.

What accounts for the difference in the reporting of race? One, there is some evidence based on observations [14] from nonresponse follow-up (NRFU) interviews that “a significant number of enumerators did not always read questionnaire items as written, and often did not use the flashcards provided,” particularly in the race and Hispanic-origin questions (Hough and Borsa, 2003:39). Two, similar to the Hispanic question, the race question was different in the mailback and CRS forms, as can be seen in Table 3.8. Additionally, the CRS (and enumerator) forms may be perceived by some respondents as suggesting or encouraging reporting of more than one race.

Three, the sample size of CRS may be too small to properly measure differences in reporting patterns, either because of rare categories or because the number of respondents answering a particular question is small (Singer and Ennis, 2002:9). Four, as suggested by Singer and Ennis (2002:56,59), CRS methods do not replicate census methodology well for race and Hispanic origin. Furthermore, as Martin (2002c:592) reminds us, even small questionnaire changes can, and do, affect study results, and “test design and sample size” may not be adequate to detect these effects. It is quite possible that differences in modes of data collection and interviewer effects may account for some of these differences as well.

13 Jones and Smith (2002:24-25) found that more than 1.6 million additional children could have been reported as more than one race based on their interracial parentage. Coupled with the actual number of children (2.1 million) in the four groups examined who were reported as more than one race, the total number of children reported as more than one race could be nearly 4 million or higher. The authors refer to this population as the “potential pool of ‘multiracial’ children.”

14 It should be noted that observations were not based on a scientifically selected sample, and were based on subjective judgments of individual observers.

Table 3.8 Race Question by Questionnaire Type

Census 2000 questionnaire          Race question

Census 2000 Form D-2               What is this person’s race? Mark X one or more races to
(mailback long form)               indicate what this person considers himself/herself to be.

Enumerator Questionnaire           Now choose one or more races for each person. Which race or
Form D-2(E)                        races does each person consider himself/herself to be?

Content Reinterview Survey         Now choose one or more races for (yourself/...). Which race or
Form D-1010 (5-10-2000)            races (do you/does ...) consider (yourself/himself/herself) to be?


3.5 Consistency of ancestry reporting

One of the changes to the ancestry question in Census 2000 was the restructuring of the list of examples from 21 to 16 example ancestries. German, Croatian, Ecuadorian, Cajun, Irish, Thai, and Slovak were dropped from the 1990 list, and Cambodian and Nigerian were added for 2000. In order to analyze these data, we collapsed the ancestry responses into 58 categories. Single ancestry responses were reported with moderate consistency (about 29 percent of respondents changed their answers in CRS; the aggregate index of inconsistency was 30.7). Some of the key findings are:

– respondents who reported on mailback forms showed more consistency than those who reported to enumerators, although both were moderate;

– households with foreign-born sample people showed more consistency than those with native-born sample people (moderate);

– households with non-Hispanic sample people showed more consistency than those with Hispanic sample people (both moderate).
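The report cites the aggregate index of inconsistency without giving its formula. The sketch below assumes the definition commonly used in reinterview studies: 100 times the ratio of observed disagreement to the disagreement expected from the two marginal distributions alone, which is the same as 100 × (1 − Cohen's kappa). The function name and the 2×2 counts are hypothetical illustrations, not CRS data.

```python
def aggregate_index_of_inconsistency(table):
    """Return the aggregate index of inconsistency for a square table.

    table[i][j] is the number of people classified in category i by the
    census and in category j by the reinterview (assumed layout).
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # People who received the same category in both collections.
    agree = sum(table[i][i] for i in range(k))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected if the two reports were statistically independent.
    expected_agree = sum(r * c for r, c in zip(row_totals, col_totals)) / n
    return 100.0 * (n - agree) / (n - expected_agree)


# Hypothetical check-box item (for example, born in or outside the United States).
example = [
    [880, 3],
    [2, 115],
]
index = aggregate_index_of_inconsistency(example)
print(f"Share changing answers: {100 * (3 + 2) / 1000:.1f}%")
print(f"Aggregate index of inconsistency: {index:.1f}")
```

Under this definition a small share of changed answers can still produce a noticeably larger index when one category dominates, because little disagreement would be expected by chance.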

One of the difficulties with ancestry data is that many respondents leave the item blank, and the question was more likely to go unanswered in Census 2000 (n=4,159 or about 21.3 percent) than in CRS (n=1,603 or about 8.2 percent). Leaving ancestry blank may be a result of “perceived redundancy” by many respondents who felt they had already provided this information when they answered the race and Hispanic-origin questions (Martin, Demaio, and Campanelli, 1990:555-556).

3.5.1 Discussion of ancestry reporting

Although ancestry was reported with moderate consistency overall, it was reported less consistently in households with Hispanic sample people and more consistently in households with foreign-born respondents. Yet proportionately more Hispanic households than non-Hispanic households have foreign-born people. How can this be reconciled?

Table 3.9 shows nine specific single-ancestry Hispanic national-origin entries; seven had good levels of consistency, and only two (“Guatemalan” and “Salvadoran”) had moderate levels.15 On the other hand, two general single ancestries (“Hispanic” and “Spanish”) showed poor levels of consistency, meaning that respondents answered differently in the census and CRS.

Of 62 respondents who reported as “Hispanic” in CRS, only 8.1 percent also did so in Census 2000; 25.8 percent had reported as “Spanish” in Census 2000, and 66.1 percent reported other responses (some of which could be other specific Hispanic-origin categories). Similarly, of 102 respondents who reported as “Spanish” in CRS, 29.4 percent also did so in Census 2000; 8.8 percent had reported as “Hispanic” in Census 2000, and 61.8 percent reported other responses. Clearly, “Hispanic” and “Spanish” are not consistently reported.

Table 3.10 shows how respondents who reported “Hispanic” and “Spanish” in the Census 2000 ancestry question reported in the CRS ancestry question. Of 116 “Hispanic” entries in Census 2000, only 4.3 percent reported “Hispanic” in CRS. Nearly two-thirds (62.9 percent) reported “Mexican” in CRS and 13.8 percent reported “Spanish.” About 1.7 percent reported “U.S. or American,” and only 4.3 percent reported “other groups” (some of which could be other specific Hispanic-origin categories). Among 84 who reported as “Spanish” in Census 2000, 35.7 percent also reported “Spanish” in CRS, 10.7 percent reported

Table 3.9 Single Ancestry Responses From Content Reinterview Survey (CRS) and Census 2000

| CRS ancestry response | Number | Same country in Census 2000 | Hispanic or Spanish in Census 2000 | All other responses in Census 2000 | Level of consistency |
|---|---|---|---|---|---|
| Colombian | 28 (100%) | 85.7% | - | 14.3% | Good |
| Cuban | 43 (100%) | 95.3% | 2.3% | 2.3% | Good |
| Dominican | 45 (100%) | 84.4% | 8.9% | 6.7% | Good |
| Ecuadorian | 22 (100%) | 95.5% | - | 4.5% | Good |
| Guatemalan | 32 (100%) | 68.8% | 31.2% | - | Moderate |
| Honduran | 22 (100%) | 77.3% | 13.6% | 9.1% | Good |
| Mexican | 901 (100%) | 92.1% | 5.8% | 2.1% | Good |
| Puerto Rican | 144 (100%) | 80.6% | 15.3% | 4.2% | Good |
| Salvadoran | 36 (100%) | 72.2% | 11.1% | 16.7% | Moderate |
| Hispanic | 62 (100%) | 8.1% | 25.8% | 66.1% | Poor |
| Spanish | 102 (100%) | 29.4% | 8.8% | 61.8% | Poor |

- Represents zero.

Source: Adapted from Singer and Ennis (2002:E11-E15 Table E.29; C15-C16 Table C.29).

15 However, the lack of consistency may be related to switching to general responses such as “Hispanic” or “Spanish,” as shown in Tables 3.9 and 3.10.


“Hispanic,” and 25.0 percent “Mexican.”

These results suggest that much of the inconsistency in the reporting of Hispanic ancestries is related to shifting between general terms (“Hispanic” or “Spanish”)16 and specific terms (“Mexican” or “Puerto Rican”), and between general terms themselves. Table 3.11 shows similar results when comparing “Hispanic” and “Spanish” responses in the CRS ancestry question with the matched Census 2000 ancestry question responses. Clearly, some respondents switch between specific and general Hispanic group terms, but relatively few switch between Hispanic and non-Hispanic ancestries.

3.6 Consistency of place-of-birth reporting

The Census 2000 question on place of birth included: 1) check boxes for respondents to indicate whether they were born in the United States or outside the United States, and 2) write-in spaces to report their state of birth or country of birth. With respect to the check box responses, place of birth was reported very consistently (only about 0.5 percent of respondents reported a different place of birth for the sample person, for an index of inconsistency of 2.7). Among the findings are:

– respondents who reported on mailback forms showed more consistency than those who reported to enumerators, although both were low;

– households with native-born sample persons (as identified by the check box on the citizenship question) showed more consistency than households with foreign-born sample persons;

– households with Hispanic sample persons showed more consistency than households with non-Hispanic sample persons.

3.6.1 Discussion of place-of-birth reporting

Generally speaking, the consistency of place-of-birth reporting (as identified by the write-in response) is quite good (Singer and Ennis, 2002:32). Sample individuals born outside of the United States were asked to report the country of birth. All responses to place of birth were grouped into 68 categories, which included the 50 states, the District of Columbia, United States territories, and other countries and regions. Approximately 3 percent of respondents changed answers in the CRS, yielding an aggregate index of inconsistency of 3.2.

Table 3.10 “Hispanic” and “Spanish” Single Ancestry Responses in Census 2000 and Content Reinterview Responses

| CRS ancestry response | “Hispanic” in Census 2000 ancestry question | “Spanish” in Census 2000 ancestry question |
|---|---|---|
| Colombian | - | 3.6% |
| Cuban | - | 8.3% |
| Dominican | 2.6% | - |
| Mexican | 62.9% | 25.0% |
| Puerto Rican | 7.8% | 3.6% |
| Salvadoran | 2.6% | 1.2% |
| U.S. or American | 1.7% | 1.2% |
| Hispanic | 4.3% | 10.7% |
| Spanish | 13.8% | 35.7% |
| Other groups | 4.3% | 10.7% |
| Total | 100.0% | 100.0% |
| Number | 116 | 84 |

- Represents zero.

Source: Adapted from Singer and Ennis (2002:E11-E15).

16 “Latino” was not tabulated separately and may be tabulated with “Other groups.”

Table 3.11 “Hispanic” and “Spanish” Single Ancestry Responses in Content Reinterview and Census 2000 Responses

| Census 2000 ancestry response | “Hispanic” in CRS ancestry question | “Spanish” in CRS ancestry question |
|---|---|---|
| Cuban | 0.9% | - |
| Dominican | - | 3.9% |
| Guatemalan | 1.7% | 7.8% |
| Honduran | 0.9% | 2.0% |
| Mexican | 25.9% | 21.6% |
| Puerto Rican | 10.3% | 9.8% |
| Salvadoran | - | 3.9% |
| U.S. or American | - | 1.0% |
| Hispanic | 4.3% | 15.7% |
| Spanish | 7.8% | 29.4% |
| Other groups | 1.7% | 6.0% |
| Total | 100.0% | 100.0% |
| Number | 62 | 102 |

- Represents zero.

Source: Adapted from Singer and Ennis (2002:E11-E15).


As shown in Table 3.12, place-of-birth reporting from Central and South America appears to be quite consistent. These results for place of birth and the previously discussed results for ancestry suggest that, at least for Hispanic groups, these questions may be considered reliable supplements to the Hispanic-origin data, as shown by Cresce and Ramirez (2003). However, their use for supplementing race data needs to be explored further.

Table 3.12 Content Reinterview Survey (CRS) Place-of-Birth Reporting for Central and South America

| Area | Consistency level | Index of inconsistency |
|---|---|---|
| Puerto Rico | High | 3.8 |
| Mexico | High | 1.2 |
| Other Central America | High | 1.5 |
| Caribbean | High | 5.0 |
| South America | High | 2.1 |

Source: Adapted from Singer and Ennis (2002:C23 Table C.34).


4. Census Quality Survey to Evaluate Responses to the Census 2000 Question on Race: An Introduction to the Data

The main objective of the Census Quality Survey (CQS) was to assist data users in comparing race data obtained by asking respondents to “mark one or more races” with data obtained by asking respondents to “mark one race.” The CQS collected race data using both methods from the same people, so potentially it could be used to evaluate how respondents reporting multiple races respond when asked to report a single race. For example, the data could be used to determine the proportion of people who report as ‘Black’ when asked to report only one race but report as ‘White and Black’ when asked to report one or more races. This information could be used “to ‘bridge’ the two methods by constructing statistical adjustments to race distributions obtained using one method to make them more comparable to race distributions obtained using the other” (Bentley, Mattingly, Hough and Bennett, 2003:1).

4.1 Study design

According to Bentley, Mattingly, Hough and Bennett (2003:11), sample households were contacted twice during the CQS to provide information on race. Both a “mark one race” 1990 census instruction and a “mark one or more races” Census 2000 instruction were administered in a split panel design. A total sample of 55,000 addresses was selected.

The sample households received a mailed initial questionnaire in June 2001. Households that did not return the initial questionnaire were mailed a second questionnaire in early July 2001. Households that did not respond to the first or second mailings were contacted with nonresponse follow-up (NRFU) procedures similar to those used for Census 2000.

The sample universe was split into two panels (A and B). Panel A, consisting of respondents from about 27,500 housing units (HUs), was asked the Census 2000 race question. Panel B, consisting of respondents from about 27,500 housing units, received a similar questionnaire but the instruction to the question on race was to “mark one race.” During the initial contact, about 54 percent of households in both panels responded by mail and the remainder were interviewed in NRFU personal visits. As in Census 2000, enumerators used flashcards showing the instructions and the categories for the questions on race and ethnicity in CQS initial contact NRFU visits.

Respondents were also asked whether a Census 2000 form had been filled out for the household and, if so, who completed the form. This information was used to assess consistency of reporting when race was reported by the same or a different respondent. The CQS also collected information on the address where each person in the household was living on April 1, 2000 to assist in matching CQS respondents to their Census 2000 data. Four to six weeks after the second mailout, households responding to the initial contact phase of the data collection were then re-contacted by telephone to collect data on race from the alternate race question as well as other data, such as education and income.

In the “re-contact” phase of data collection, Panel A households that received the “mark one or more races” instruction in the initial data collection were asked to “choose one race” in the re-contact interview. Conversely, Panel B households that received the “mark one race” instruction in the initial contact were asked to “choose one or more races” in the re-contact interview.

More than 70 percent of the re-contact interviews were conducted by telephone. Personal interviews were conducted to collect the re-contact information for households that were not contacted by telephone. In both cases, every effort was made to speak with the individual who completed the initial questionnaire. The Panel A questionnaire included a probe for additional information in instances where respondents were reluctant to report a single race when asked to do so. Respondents in both panels were asked to provide additional social and demographic information, such as relationship, veteran’s status, educational attainment, household income, and language spoken at home, which might be relevant to the issue of differential race reporting.

The final sample size of the CQS was approximately 50,000 interviewed housing units and 155,000 respondents. About 25 percent of the sample was allocated to each of the four cells created by crossing panel (A or B) by census form type (short or long). Each state was treated as an independent sampling stratum and four distinct sampling strata were identified within each state.17 In order to maximize the likelihood of contacting households in CQS with individuals reporting more than one race, 90 percent of the initial sample was selected from among households containing at least one individual who reported more than one race in Census 2000.

Because most of the responses that are coded as “Some other race” (SOR) in Census 2000 are Hispanic ethnicities, the CQS focused primarily on the OMB race combinations18. In order to produce greater reliability for the combinations of two OMB race categories, combinations including SOR were sampled at one-third the rate of the other combinations. As a result, 18 percent of the CQS sample consisted of SOR combinations, compared with 42 percent in Census 2000. Finally, Census 2000 records were linked to CQS records in order to facilitate comparisons between CQS and Census 2000 race data. This linking process matched a record in the 100-percent Census Unedited File (HCUF) to records in the CQS file by comparing fields such as first name, last name, middle initial, suffix, sex, date of birth, age, street name, and zip code.
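As a rough plausibility check on the one-third sampling rate for SOR combinations, the sketch below applies a simple proportional calculation to the 42 percent Census 2000 share. It assumes a single pool of combinations with no stratification, which is not how the CQS was actually designed, so only approximate agreement with the reported 18 percent should be expected. The function name is a hypothetical illustration.

```python
def sampled_share(population_share, relative_rate):
    """Share of a group in the sample when it is sampled at
    `relative_rate` times the rate applied to everyone else."""
    kept = population_share * relative_rate
    others = 1.0 - population_share
    return kept / (kept + others)


# SOR combinations: 42 percent in Census 2000, sampled at one-third the rate.
sor_share = sampled_share(0.42, 1 / 3)
print(f"Expected SOR share of sample: {100 * sor_share:.1f}%")
```

The result is roughly 19 percent, in the neighborhood of the reported 18 percent; the remaining gap is plausibly due to the state-level strata and the other design features described above.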

This match also provides another set of observations which can be used to estimate “bridging parameters,” as can be seen in Table 4.1. For example, in Panel A, one would compare the “mark one or more races” response in the CQS initial contact with the single-race response in the CQS re-contact.

4.2 Limitations

According to Bentley, Mattingly, Hough and Bennett (2003:21-22), there are operational and qualitative limitations to this evaluation: 1) the design of the CQS could not repeat the Census 2000 environment; 2) different collection methods were used in the CQS initial contact and re-contact; 3) the response to a subsequent question on race can be influenced or conditioned by the response to the previous question; 4) proxy reporting; 5) effects of movers on the sample;19 and 6) possible error associated with linking Census 2000 data.


4.3.1 What were the response rates for each panel?

After excluding vacant housing units, Bentley, Mattingly, Hough and Bennett (2003:23-24) report that response rates were about 97 percent for the initial contact. In the re-contact, about 87 percent of Panel A housing units responded, compared with about 94 percent of Panel B.

4.3.2 Was the CQS representative of Census 2000 data?

Because “analytical results can be biased if the interviewed sample is not representative of the population of interest,” Bentley, Mattingly, Hough and Bennett (2003:24) compared aggregate CQS distributions with Census 2000 reporting for each panel and concluded (2003:vi):

The results from the question on race suggest that each panel appears to be representative of Census 2000. Aggregated reporting of race among non-Hispanic respondents to the “mark one or more races” instruction closely resembles Census 2000 reporting of race for each panel. No race group appears to be significantly different from Census 2000 (p < 0.1 level) in either panel, including the Two or more races population. Reporting of race for Hispanic respondents is also similar to that in Census 2000, though in Panel A a smaller proportion of Hispanics chose “White” as a single race and a larger proportion chose “Some other race” compared with Census 2000 data.

4.3.3 Persistence of more-than-one-race reporting

The probe question in Panel A reduced reporting of more than one race from 1.4 percent to 0.4 percent. To the authors this indicated “that there is a sizeable


17 For additional information about these four strata see Bentley, Mattingly, Hough and Bennett (2003:15-16).

18 White; Black or African American; American Indian and Alaska Native; Asian; and Native Hawaiian and Other Pacific Islander.

Table 4.1 Census Quality Survey Data Collection Sequence: Race Instruction by Panel

| CQS panel | Census 2000 | CQS initial contact | CQS re-contact |
|---|---|---|---|
| A | “mark one or more races” | “mark one or more races” | “choose one race” |
| B | “mark one or more races” | “mark one race” | “choose one or more races” |

Source: Bentley, Mattingly, Hough and Bennett (2003:10, Table 1).

19 Movers created problems with sample weighting because of differential sampling of racial combinations. For additional information about this issue see Bentley, Mattingly, Hough and Bennett (2003:27-28).

portion of people who will persistently report Two or more races when asked to report only one” (Bentley, Mattingly, Hough and Bennett, 2003:25). The authors also note that “in general, unless a probing question is asked, it appears that about half of all Two or more race respondents do not give a single race response. Nonetheless, the data suggest that the race distributions do not change much with the follow up probe results” (Bentley, Mattingly, Hough and Bennett, 2003:27).

4.3.4 Consistency of race reporting between the CQS and Census 2000 data

Bentley, Mattingly, Hough and Bennett (2003:vi) report a “generally low consistency of reporting more than one race between Census 2000 and the CQS”:

Only 40 percent of the non-Hispanic respondents in Panel A who reported more than one race in Census 2000 also reported more than one race in the initial contact (“mark one or more races” instruction). Similarly, only 41 percent of those in Panel B who reported more than one race in the census also reported more than one race in the re-contact. The other 60 percent reported a single race. In contrast, 97 percent to 98 percent of those who reported a single race of White, Black, or Asian in Census 2000 reported the same race in the Census Quality Survey. For American Indian or Alaska Natives, Native Hawaiian or Other Pacific Islanders, and Some other race respondents, the reporting of race consistency ranges from 55 percent to 58 percent in Panel A, and 72 percent to 78 percent in Panel B.

Tables 4.2 and 4.3 (Bentley, Mattingly, Hough and Bennett, 2003:28,30) show the lack of consistency among non-Hispanics. Among the consequences of the low level of consistency in the reporting of more than one race, the authors list:

• The effective sample size for computing bridging parameters is reduced and the parameters are sensitive to which data are used to compute them.

• The stability of bridging parameters may be unclear given the observed instability in reporting more than one race.
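The 40 and 41 percent figures quoted above can be reproduced from the weighted counts in Tables 4.2 and 4.3. The sketch below is only an arithmetic check; the dictionary layout and function name are hypothetical. Note that the single-race rate computed here measures staying within the single-race group, which is not the same as reporting the same specific race.

```python
# Weighted counts of non-Hispanic respondents, copied from Tables 4.2 and 4.3.
# Outer key: Census 2000 report; inner key: CQS "mark one or more races" report.
TABLES = {
    "Panel A": {
        "single": {"single": 96_987_813, "multiple": 1_286_746},
        "multiple": {"single": 1_089_924, "multiple": 724_686},
    },
    "Panel B": {
        "single": {"single": 89_881_179, "multiple": 935_610},
        "multiple": {"single": 825_761, "multiple": 565_422},
    },
}


def consistency_rate(table, census_report):
    """Percent of people with a given Census 2000 report who gave
    the same kind of report (single or multiple) in the CQS."""
    row = table[census_report]
    return 100.0 * row[census_report] / sum(row.values())


multiple_rates = {p: consistency_rate(t, "multiple") for p, t in TABLES.items()}
single_rates = {p: consistency_rate(t, "single") for p, t in TABLES.items()}

for panel in TABLES:
    print(f"{panel}: multiple {multiple_rates[panel]:.1f}%, "
          f"single {single_rates[panel]:.1f}%")
```

Because these rates use the weighted counts, they describe the estimated population rather than the unweighted sample cases (the n values in the tables).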

4.3.5 Tabulating “mark one race” responses by specific combinations of “mark one or more races”

Bentley, Mattingly, Hough and Bennett (2003:vii,32) find that “even with the ‘mark one race’ instruction, a significant portion of respondents report Two or more races,” and “even with a followup,


Table 4.2 Overall Consistency of Race Reporting for Non-Hispanics for Panel A*

| Census 2000 race | CQS initial contact (“mark one or more races”): Single race | CQS initial contact (“mark one or more races”): Two or more races | Total |
|---|---|---|---|
| Single race | 96,987,813 (n=34,839) | 1,286,746 (n=1,978) | 98,274,559 (n=36,817) |
| Two or more races | 1,089,924 (n=9,089) | 724,686 (n=8,035) | 1,814,610 (n=17,124) |
| Total | 98,077,737 (n=43,928) | 2,011,432 (n=10,013) | 100,089,169 (n=53,941) |

* The data in Table 4.2 were restricted to matched people who did not have an imputed race in Census 2000; that is, only those cases where the final edited race was “as reported,” or where the code was changed “through consistency edit.” The CQS initial-contact Hispanic-origin response was used. Additionally, the weighted data were obtained using the inverse of the original sampling probabilities with no adjustment (Z_WGT1).


Table 4.3 Overall Consistency of Race Reporting for Non-Hispanics for Panel B*

| Census 2000 race | CQS re-contact (“mark one or more races”): Single race | CQS re-contact (“mark one or more races”): Two or more races | Total |
|---|---|---|---|
| Single race | 89,881,179 (n=32,848) | 935,610 (n=1,476) | 90,816,789 (n=34,324) |
| Two or more races | 825,761 (n=8,994) | 565,422 (n=7,148) | 1,391,183 (n=16,142) |
| Total | 90,706,940 (n=41,842) | 1,501,032 (n=8,624) | 92,207,972 (n=50,466) |

* The data in Table 4.3 were restricted to matched people who did not have an imputed race in Census 2000; that is, only those cases where the final edited race was “as reported,” or where the code was changed “through consistency edit.” The CQS initial-contact Hispanic-origin response was used. Additionally, the weighted data were obtained using the inverse of the original sampling probabilities with no adjustment (Z_WGT1).


a significant portion of respondents report Two or more races.” Data users must in the end decide how to deal with the “reluctant cases when computing bridging parameters” which may in turn depend “on the particular purpose and uses.”

4.4 Discussion of Census Quality Survey findings

The CQS is very impressive in four respects:

• large sample size - about 25,000 housing units per panel and 155,000 respondents.

• very high housing unit response rates - about 97 percent for the initial contact in both panels, and re-contact response rates of 87 percent in Panel A and 94 percent in Panel B.

• representativeness - each panel appears to be representative of Census 2000. Aggregated reporting of race by non-Hispanics closely resembles Census 2000 reporting in both panels. Race reporting by Hispanics is also similar to Census 2000, but in Panel A a smaller proportion chose White and a larger proportion chose SOR compared with Census 2000 data.

• high matching rate - about 86 percent of CQS person records were matched to their respective Census 2000 record.

Despite the enviable survey execution described above, for the purposes of studying possible bridging parameters, the CQS has several limitations:

• too few cases reporting more than one race - despite very high housing unit response rates, and a high rate of oversampling of households that reported more than one race, the number of cases that reported more than one race in CQS is quite low. Among Hispanics and non-Hispanics there were 21,501 cases20 (or about 17.8 percent of 120,522 total cases) reporting more than one race in CQS (Panels A and B), and it is those cases that are of most interest for computing bridging parameters.

• fewer cases of Two or more races due to inconsistent race reporting - as mentioned in the results section, there is additional attrition to the cases of major interest due to inconsistent race reporting (Bentley, Mattingly, Hough and Bennett, 2003:28-30). Jones and Smith (2002) also note that there is a substantial pool of children who could have been reported as multiracial but were not, suggesting that there may be some instability associated with measuring this population. However, it may be possible to overcome this limitation by selecting portions of the inconsistent responses and pooling data from both panels.

• fewer cases due to reluctance to select one race - in Panel A about 2.0 percent of non-Hispanics reported more than one race in the initial contact. After the re-contact (which asked for one race) there were still 1.4 percent reporting more than one race. Even after probing for one race, 0.4 percent remained.

• fewer cases due to split panel design - unless there is some statistically valid method to pool Panels A and B, the effective sample size is reduced to the observations available in each panel. An ameliorating factor is that a good portion of the CQS cases were successfully matched to their respective Census 2000 records.

• complex methodology and multiple modes of data collection - in selecting the CQS methodology, a panel design and contact/re-contact methodology were chosen over a method of one instrument with two questions. Study designers were worried about the lack of independence and the conditioning effects of the latter method (see Attachment 3 in Bentley, Mattingly, Hough and Bennett, 2003:54-56 for the six options considered). They believed “that substantial, but unmeasurable, interactions will take place between the collected data for both measurements with both race questions in the same instrument” (Bentley, Mattingly, Hough and Bennett, 2003:56).

In retrospect, it seems that the CQS methodology may have introduced many more sources of bias, such as time lag, mover gains and losses, interviewer effects, mode differences, proxy reporting, and possibly matching problems (all of which may give rise to apparently inconsistent reporting) without entirely eliminating conditioning effects or ensuring the independence of observations.

Tables 4.4 and 4.5 show CQS respondents reporting selected combinations of races21 and how they reported on the alternative measurement.


20 Note, these figures do not include individuals who did not report a Hispanic origin.

21 Most of these combinations are numerically the largest in each panel and are also of policy interest, but were selected primarily for illustrative purposes.

First, we can see that many more respondents did not answer in Panel A (more than 12.0 percent), where the initial contact asked “mark one or more races” and the re-contact (and probe question) asked “mark one race,” than in Panel B (no more than 2.5 percent). This is not surprising because CQS deliberately over-sampled the “Two or more races” population, so it is reasonable to expect that in Panel A respondents may have been reluctant to report only one race. On the other hand, in Panel B, one might have expected that, having been restricted to one race initially, these respondents would have been eager to report more than one race.

In both panels, the proportions giving the same response in both measurements were 10 percent or higher (except for “White and Some other race” and “White and American Indian and Alaska Native”). “White and Black” and “White and Asian” were most likely to provide the same response (about 20 percent in Panel A to about 30 percent in Panel B). Fairly substantial proportions in both panels gave different or inconsistent responses (ranging from 2.9 to 20.6 percent). “White and Black” respondents were particularly susceptible to this (19.3 percent in Panel A and 20.6 percent in Panel B), while “White and American Indian and Alaska Native” respondents were among the least susceptible (2.9 percent) in both panels. Often, when asked to report more than one race, respondents may report their race as “multiracial,” “mixed,” or “biracial,” which in census procedures are coded as “Some other race.” Additional analysis of these responses should be done.

Table 4.6 shows CQS respondents reporting selected combinations of races and whether they reported one consistent race in the alternative measurement. For example, a report of “White and Asian” in one question and “White” or “Asian” in the other is a consistent answer. Although some respondents did report one race in the alternate question, sometimes that race was not consistent (e.g., a report of “White and Asian” in one question and “Black” in the other is an inconsistent answer). Additional research on these inconsistent responses needs to be done.

In general, “White and Black or African American” respondents in both panels were most resistant to selecting one consistent race (54.3


Table 4.4 Non-Hispanics Reporting Selected Combinations of Two Races in Panel A Initial Interview by Re-contact Response Including Probe

| CQS initial contact | Number | First race | Second race | Same combination | Different response | No response |
|---|---|---|---|---|---|---|
| White - Black | 105,222 | 11.9% | 33.8% | 20.5% | 19.3% | 14.5% |
| White - AIAN | 129,101 | 50.1% | 26.7% | 8.1% | 2.9% | 12.2% |
| White - Asian | 175,034 | 36.9% | 24.3% | 18.5% | 6.7% | 13.7% |
| White - SOR | 32,634 | 69.7% | 10.1% | 3.6% | 3.2% | 13.4% |
| Black - AIAN | 20,880 | 56.2% | 10.1% | 11.5% | 9.0% | 13.2% |
| Asian - NHPI | 24,900 | 25.4% | 47.0% | 10.0% | 5.5% | 12.2% |

Source: Derived from Bentley, Mattingly, Hough and Bennett (2003:32, Table 13).

Table 4.5 Non-Hispanics Reporting Selected Combinations of Two Races in Panel B Re-contact Interview by Initial Contact Response

| CQS re-contact | Number | First race | Second race | Same combination | Different response | No response |
|---|---|---|---|---|---|---|
| White - Black | 137,126 | 13.3% | 35.6% | 29.1% | 20.6% | 1.4% |
| White - AIAN | 230,566 | 58.0% | 23.8% | 14.3% | 2.9% | 0.9% |
| White - Asian | 211,546 | 25.2% | 31.5% | 31.0% | 11.8% | 0.4% |
| White - SOR | 171,512 | 76.1% | 14.7% | 2.4% | 6.3% | 0.4% |
| Black - AIAN | 37,927 | 65.8% | 13.4% | 11.8% | 6.5% | 2.5% |
| Asian - NHPI | 35,543 | 34.9% | 26.6% | 25.3% | 12.2% | 0.9% |

Source: Derived from Bentley, Mattingly, Hough and Bennett (2003:33, Table 14).

Table 4.6 Percent of Non-Hispanics Reporting Selected Combinations of Two Races Providing or Not Providing One Consistent Race by Panel

| Combination reported in CQS | Panel A: one consistent race | Panel A: no consistent race | Panel B: one consistent race | Panel B: no consistent race |
|---|---|---|---|---|
| White - Black | 45.7% | 54.3% | 48.9% | 51.1% |
| White - AIAN | 76.8% | 23.2% | 81.8% | 18.2% |
| White - Asian | 61.1% | 38.9% | 56.7% | 43.3% |
| White - SOR | 79.8% | 20.2% | 90.9% | 9.1% |
| Black - AIAN | 66.3% | 33.7% | 79.2% | 20.8% |
| Asian - NHPI | 72.4% | 27.6% | 61.6% | 38.4% |

Source: Derived from Bentley, Mattingly, Hough and Bennett (2003:32-33, Tables 13 and 14).

and 51.1 percent in Panels A and B, respectively), while “White and Some other race” respondents were least resistant (20.2 and 9.1 percent, respectively). The significance of these findings is that substantial proportions of respondents refused or were unable to give us the information we need to calculate “bridging” parameters, which further reduces the number of useful cases.

Considering only those cases which provide the necessary information for computing bridging parameters (that is, race questions are answered in both instruments, a multiple race response is provided in one instrument, and a “consistent” single race response is provided in the other instrument), what proportion of selected combinations select one race over the other? Table 4.7 shows some example bridging parameters computed by ignoring all cases that did not report one consistent race.

For example, in Panel A, among “White and Black or African American” respondents who do select one consistent race, 26.0 percent select “White” and 74.0 percent select “Black or African American.” Despite the different methodologies, Panel B shows very similar proportions: 27.3 percent select “White” and 72.7 percent select “Black or African American.” However, these calculations ignore over half of the “White and Black or African American” respondents, as seen in Table 4.6 above. We see similar consistency between panels for “Black or African American and American Indian and Alaska Native.” About 84.8 percent select “Black or African American” in Panel A and 83.0 percent in Panel B. Among “White and American Indian and Alaska Native” respondents, 65.2 percent selected “White” in Panel A, and 70.9 percent in Panel B. About 87.3 percent (Panel A) and 83.8 percent (Panel B) of “White and Some other race” respondents select “White.” In the case of “White and Asian” and “Asian and Native Hawaiian and Other Pacific Islander,” Panels A and B produce contradictory parameters. In Panel A, 39.7 percent of “White and Asian” respondents select “Asian,” while in Panel B that proportion is 55.6 percent. Similarly, in Panel A, 35.1 percent of “Asian and Native Hawaiian and Other Pacific Islander” respondents select “Asian,” compared with 56.8 percent in Panel B.
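The arithmetic behind these example parameters can be sketched in a few lines of Python. The Panel A percentages are taken from the Table 4.4 rows, on the assumption that its first two percentage columns are laid out like those of Table 4.5 (first race, then second race). The names `PANEL_A` and `bridging_parameters` are illustrative and are not part of the source.

```python
# Panel A responses in percent: (first race, second race).
# Assumption: Table 4.4 uses the same column order as Table 4.5.
PANEL_A = {
    "White - Black": (11.9, 33.8),
    "White - AIAN": (50.1, 26.7),
    "White - Asian": (36.9, 24.3),
    "White - SOR": (69.7, 10.1),
    "Black - AIAN": (56.2, 10.1),
    "Asian - NHPI": (25.4, 47.0),
}


def bridging_parameters(first, second):
    """Split respondents who gave one consistent race between the two races.

    Cases with no consistent single race are ignored, as in Table 4.7.
    Returns (percent choosing first race, percent choosing second race).
    """
    consistent = first + second
    return (
        round(100 * first / consistent, 1),
        round(100 * second / consistent, 1),
    )


for combo, (first, second) in PANEL_A.items():
    print(combo, bridging_parameters(first, second))
```

Because the denominator excludes everyone without a consistent single race, each parameter describes only the subset of respondents shown in the “one consistent race” columns of Table 4.6.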

Although much more analysis needs to be conducted, a question that needs to be answered is which bridging parameter should be used for any race combination. Should it come from Panel A or Panel B, or from a pooled sample of A and B? In addition, matching Census 2000 records to CQS records affords us at least two more possible sources of bridging parameters (Census 2000 to Panel A re-contact and Census 2000 to Panel B initial contact). It is unknown whether these may yield either different parameters or, worse, inconsistent parameters. Unfortunately, at this stage there is no a priori way to decide which approach yields the best bridging parameters. In any event, we lose cases because significant proportions of respondents do not provide one consistent race in the alternate question.
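To make the pooling option concrete, the following Python sketch shows one way a pooled parameter could be formed. It is an illustration under strong assumptions, not a method described in the source: it treats the “Number” column of Tables 4.4 and 4.5 as the base for each panel's percentages and adds the implied counts across panels. The function names are hypothetical.

```python
def implied_counts(number, first_pct, second_pct):
    """Approximate counts choosing the first and second race in one panel."""
    return number * first_pct / 100, number * second_pct / 100


def pooled_first_race_share(panels):
    """Pooled percent choosing the first race.

    panels: iterable of (number, first_pct, second_pct), one tuple per panel.
    Assumption: each percentage is taken over that panel's 'Number' column.
    """
    first_total = 0.0
    second_total = 0.0
    for number, first_pct, second_pct in panels:
        first, second = implied_counts(number, first_pct, second_pct)
        first_total += first
        second_total += second
    return 100 * first_total / (first_total + second_total)


# "White - Black" rows: Panel A (Table 4.4) and Panel B (Table 4.5).
white_black = [(105_222, 11.9, 33.8), (137_126, 13.3, 35.6)]
print(round(pooled_first_race_share(white_black), 1))
```

A pooled value computed this way always falls between the two panel-specific parameters. Whether that compromise is meaningful when the panels disagree, as they do for “White and Asian,” is the open question raised above.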


Table 4.7
Example "Bridging" Parameters for Non-Hispanics Reporting Selected Combinations of Two Races and One Consistent Race by Panel

Combination        Panel A:     Panel A:      Panel B:     Panel B:
reported in CQS    first race   second race   first race   second race
White - Black           26.0%         74.0%        27.3%         72.7%
White - AIAN            65.2%         34.8%        70.9%         29.1%
White - Asian           60.3%         39.7%        44.4%         55.6%
White - SOR             87.3%         12.7%        83.8%         16.2%
Black - AIAN            84.8%         15.2%        83.0%         17.0%
Asian - NHPI            35.1%         64.9%        56.8%         43.2%

Source: Derived from Bentley, Mattingly, Hough and Bennett (2003:32-33, Tables 13 and 14).

5. Comparing the Race and Hispanic-Origin Data From the American Community Survey and Census 2000

One of the main objectives of the American Community Survey (ACS) is to serve as a replacement for the long form in the 2010 Census. Another is to provide a continuously updated source of demographic, socioeconomic, and housing data for small areas and population groups, either as single-year estimates or multi-year averages (Bennett and Griffin 2002:206). In this chapter, I will concentrate on how race and Hispanic origin differ in Census 2000 and the Census 2000 Supplementary Survey (C2SS) based on the work of Bennett and Griffin (2002); Leslie, Raglin, and Schwede (2002); Raglin and Leslie (2002); and Schwede, Leslie, and Griffin (2002).

5.1 Study design

The primary objective of C2SS “was to evaluate the feasibility of collecting long-form data outside the decennial census” during Census 2000 (Bennett and Griffin, 2002:206). The C2SS was a survey of about 700,000 housing units using the ACS methodology. It was an operational feasibility test to learn how to collect long-form data at the same time as, but separately from, Census 2000. The C2SS was the first large-scale national data collection using the ACS methods (Raglin and Leslie, 2002:2826). The C2SS used the questionnaire and methods developed for the ACS to collect demographic, social, economic, and housing data from a national sample of households. C2SS data collection began in January 2000 and ran through December 2000.

The C2SS was conducted in 1,203 counties, and when the original 31 sites were added, the full sample size was large enough to produce data for every state, and most counties and metropolitan areas with populations of 250,000 or more (Bennett and Griffin, 2002:206).

Data were collected in three phases. First, a pre-notice letter was sent to each sampled unit, followed by a questionnaire in the mail a week later. If necessary, the initial mail questionnaire was followed by a reminder card, and after three weeks, a replacement questionnaire was sent. Second, a telephone follow-up was attempted to obtain information from households that did not return the replacement questionnaire. Third, a sample of nonrespondents was selected for a personal visit interview. Nonresponse follow-up (NRFU) interviews were conducted by permanent survey field representatives using computer assisted technology (Bennett and Griffin, 2002:207).

5.2 Limitations

One might have expected differences between Census 2000 and the C2SS because they had different purposes and, therefore, had different design and implementation methods. The C2SS collected data continuously throughout the year using a combination of mail, telephone, and personal visit follow-up which lasted over a three-month period. Census 2000, on the other hand, was a single massive data collection over a very short period from late March 2000 to July 2000 that included an initial mail out mode and subsequent personal visit NRFU interviews in as many non-responding households as possible. As a final resort, Census 2000 allowed proxy responses from non-household respondents, such as neighbors, while the C2SS did not.

There were several other important differences between Census 2000 and the C2SS: the C2SS had follow-up procedures for missing items on mail returns, while Census 2000 did not; questionnaires differed; residence rules and reference periods differed; and some editing and allocation procedures varied. Additionally, followup data were collected in person using paper questionnaires in Census 2000, but by phone or in person using automated instruments in C2SS. In addition, census enumerators were temporary workers and were not as well trained or as experienced as C2SS field representatives (FRs). Finally, the C2SS estimates are subject to sampling error because they are based on a sample of the population, while the short-form census totals are not (Bennett and Griffin, 2002:208). Moreover, comparisons between Census 2000 and C2SS are limited to the household population because by design the C2SS did not include the population living in group quarters.

Although other 100-percent items are available for comparison between Census 2000 and C2SS, the discussion in this chapter focuses on the Hispanic-origin and race variables.

5.3.1 Reporting of Hispanic origin

Bennett and Griffin (2002:210) found no discernible differences in the proportion of Hispanic-origin responses, although there were significant differences in the detailed Hispanic-origin responses. Table 5.1 shows that, compared with Census 2000, C2SS produced about 6.8 percent more Mexicans. On the other hand, the “Other Hispanic” category was about 16.7 percent lower. The proportions of Cubans and Puerto Ricans were not statistically different.
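The “percent more” and “percent lower” figures are relative differences, that is, the C2SS share minus the Census 2000 share, divided by the Census 2000 share (columns 3 and 4 of Table 5.1). A minimal Python sketch of that calculation follows; the function name `percent_difference` and the dictionary `TABLE_5_1` are illustrative labels, not terms from the source.

```python
def percent_difference(census_pct, c2ss_pct):
    """Relative difference of C2SS from Census 2000, in percent (column 4 = 3/1)."""
    difference = c2ss_pct - census_pct  # column 3 = 2 - 1
    return 100 * difference / census_pct


# Shares of the household population from Table 5.1: (Census 2000, C2SS).
TABLE_5_1 = {
    "Mexican": (7.4, 7.9),
    "Puerto Rican": (1.2, 1.3),
    "Cuban": (0.4, 0.5),
    "Other Hispanic or Latino": (3.6, 3.0),
}

for group, (census, c2ss) in TABLE_5_1.items():
    print(f"{group:25s} {percent_difference(census, c2ss):6.1f}")
```

Note that a small absolute gap can produce a large relative difference when the base is small, as with the Cuban row, where a 0.1 point gap is a 25 percent difference that is nonetheless not statistically significant.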

5.3.2 Discussion of Hispanic origin

Presumably the lower proportion in the “Other Hispanic” category in C2SS reported by Bennett and Griffin (2002:210) reflects fewer general Hispanic responses (“Hispanic,” “Spanish,” and “Latino”), as shown in other research (see Cresce and Ramirez, 2003; Logan 2002; and Suro 2002). Bennett and Griffin (2002:210) speculate that the observed differences are due to the use of examples in the C2SS. During telephone and personal visit interviewing, respondents were read or shown examples for the “Other Spanish/Hispanic/Latino” category similar to those used in the 1990 census. These aids were not provided during Census 2000 operations, although one could argue that the Hispanic-origin checkbox groups themselves act as examples. This does not explain why the Mexican percentage is also lower in Census 2000, since these categories were present in all data collections. The Puerto Rican and Cuban proportions also show the same pattern, but the differences were not statistically significant.

Although the format and wording of the Hispanic-origin question on the mail questionnaire used in C2SS and Census 2000 were similar, there were differences in the other instruments (see Table 5.2). The ACS CATI/CAPI instruments had examples for the “other Spanish/Hispanic/Latino” category (e.g., Argentinean, Colombian, Dominican, Nicaraguan, Salvadoran, Spaniard), but the decennial mail and enumerator instruments did not have examples. The basic response categories were similar, but the Census 2000 mail questionnaire categories were double-banked (Bennett and Griffin, 2002:207). Having the one question split into two separate questions in CATI/CAPI would presumably make it easier to ask and answer in interview situations. This effectively avoids the double-negative statement “Mark [X] ‘No’ box if not Spanish...” found on the mail questionnaires (Schwede, 2003, personal communication). It makes sense that the use of experienced interviewers to probe for responses in other data collections may have contributed to getting more detail in C2SS than Census 2000. This argument is explored more vigorously in explaining the differences in race reporting.

Table 5.1
Census 2000 and Census 2000 Supplementary Survey (C2SS) Hispanic Responses (Household Population Only)

Hispanic Origin              Census 2000   C2SS    Difference   Percent difference
                                     (1)    (2)       (3=2-1)              (4=3/1)
Hispanic or Latino:                12.6%   12.6%            -                    -
  Mexican                           7.4%    7.9%          0.5                 6.8%
  Puerto Rican                      1.2%    1.3%          0.1                 8.3%
  Cuban                             0.4%    0.5%          0.1                25.0%
  Other Hispanic or Latino          3.6%    3.0%         -0.6               -16.7%

Note: Bold numbers in Column 3 indicate significant differences at the p<.10 level.

Source: Adapted from Bennett and Griffin (2002:210 Table 6).

Table 5.2
Hispanic-Origin Question by Questionnaire Type

Questionnaire / Hispanic-origin question

Census 2000 Form D-2 (mailback long form), person based or linear layout

American Community Survey Form ACS-1 (2000), matrix layout

Enumerator Questionnaire Form D-2(E)
    Are any of the persons that I have listed Mexican, Puerto Rican, Cuban, or of another Hispanic or Latino group?

American Community Survey CATI/CAPI instrument
    Part 1. Is <name>/ Are you Spanish, Hispanic, or Latino?
    Part 2. Is <he/she>/ Are you of Mexican origin, Puerto Rican, Cuban or some other Spanish/Hispanic/Latino group?

5.3.3 Race reporting

Both Census 2000 and the C2SS allowed respondents to report one or more races. Bennett and Griffin (2002:208-210) found significant differences between C2SS and Census 2000 distributions for both the race alone and race alone or in combination categories.22 While the authors found a number of differences in the race distributions, the percent of respondents reporting “White alone” and “Some other race alone” showed the greatest difference in the distributions. In addition, the C2SS distribution had a significantly lower proportion of respondents reporting “Two or more races.” Small but significant differences also exist for “Black or African American alone,” “American Indian or Alaska Native alone,” and “Asian alone” (Bennett and Griffin 2002:208).

It is important to compare the race distributions for Hispanics and non-Hispanics because reporting patterns tend to be quite different for Hispanic respondents. Table 5.3 shows that the race distribution for non-Hispanics in C2SS is not very different from that of Census 2000. Nevertheless, there were significant differences for all of the race groups, except for “Native Hawaiian and Other Pacific Islander.” The largest difference between Census 2000 and C2SS was for the “Some other race alone” population. Compared with Census 2000, C2SS had slightly more reports of “White alone” (0.4 percent) and “Asian alone” (3.9 percent), and fewer reports of “American Indian and Alaska Native” (9.5 percent) and “Two or more races” (7.9 percent). When “Two or more races” is broken into “Two races which include Some other race” and “All other race combinations,” we see that Census 2000 had proportionately more race combinations that included “Some other race” as one of the races than did C2SS (0.50 versus 0.15 percent). On the other hand, C2SS had proportionately more reports of all other race combinations than did Census 2000 (1.59 and 1.39 percent, respectively).
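The breakdown of “Two or more races” can be checked, and expressed as a share of all multiple-race reports, with a short calculation. The Python sketch below uses the non-Hispanic percentages from Table 5.3; the variable and function names are illustrative assumptions, and the derived shares are simple arithmetic on the published figures rather than results reported by the source.

```python
# Non-Hispanic percentages from Table 5.3: (Census 2000, C2SS).
TWO_OR_MORE = (1.89, 1.74)
WITH_SOR = (0.50, 0.15)   # two races which include Some other race
ALL_OTHER = (1.39, 1.59)  # all other race combinations


def sor_share_of_multiracial(with_sor, all_other):
    """Percent of 'Two or more races' reports that include Some other race."""
    return 100 * with_sor / (with_sor + all_other)


for label, i in (("Census 2000", 0), ("C2SS", 1)):
    # The two components should add up to the published total.
    total = WITH_SOR[i] + ALL_OTHER[i]
    assert abs(total - TWO_OR_MORE[i]) < 0.005
    share = sor_share_of_multiracial(WITH_SOR[i], ALL_OTHER[i])
    print(f"{label:12s} {share:5.1f}")
```

Expressing the components this way shows that the lower “Two or more races” figure in C2SS comes entirely from combinations involving “Some other race,” since the other combinations are more common in C2SS than in Census 2000.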

Table 5.4 shows the race distribution for Hispanics. Compared with Census 2000, C2SS has about 31 percent more reports of “White alone,” about 30 percent fewer “Some other race” reports, and about 24 percent fewer reports of “Two or more races” among Hispanics. When “Two or more races” is broken into “Two races which include Some other race” and “All other race combinations,” Census 2000 had proportionately more two-race combinations that included “Some other race” as one of the races than did C2SS (5 percent versus 3 percent), and proportionately fewer of “All other race combinations.”

22 The race alone categories represent respondents who reported one race (plus a category with all respondents who reported Two or more races). Race alone or in combination categories represent respondents who selected a particular race regardless of the number of other races selected (i.e., “the combination of people who reported one race and people who reported that same race in addition to one or more other races”).

Table 5.3
Census 2000 and Census 2000 Supplementary Survey (C2SS) Selected Race Responses by Non-Hispanics (Household Population Only)

Race                                           Census 2000     C2SS   Difference   Percent difference
                                                       (1)      (2)      (3=2-1)              (4=3/1)
White                                               79.30%   79.58%         0.28                 0.4%
Black or African American                           13.49%   13.21%        -0.28                -2.1%
American Indian and Alaska Native                    0.84%    0.76%        -0.08                -9.5%
Asian                                                4.15%    4.31%         0.16                 3.9%
Native Hawaiian and Other Pacific Islander           0.14%    0.16%         0.02                14.3%
Some other race                                      0.19%    0.25%         0.06                31.6%
Two or more races                                    1.89%    1.74%        -0.15                -7.9%
  Two races which include Some other race            0.50%    0.15%        -0.35               -70.0%
  All other race combinations                        1.39%    1.59%         0.20                14.4%

Note: Bold numbers in column 3 indicate significant differences at the p<.10 level.

Source: Adapted from Bennett and Griffin (2002:208-209 Table 2 and Table 4).

Table 5.4
Census 2000 and Census 2000 Supplementary Survey (C2SS) Selected Race Responses by Hispanics (Household Population Only)

Race                                           Census 2000      C2SS   Difference   Percent difference
                                                       (1)       (2)      (3=2-1)              (4=3/1)
Hispanic or Latino:                                100.00%   100.00%            -                    -
  White                                             47.89%    62.91%        15.02                31.4%
  Some other race                                   42.21%    29.39%       -12.82               -30.4%
  Two or more races                                  6.31%     4.79%        -1.52               -24.1%
    Two races which include Some other race          5.09%     3.35%        -1.74               -34.2%
    All other race combinations                      1.22%     1.45%         0.23                18.9%

Note: Bold numbers in column 3 indicate significant differences at the p<.10 level.

Source: Adapted from Bennett and Griffin (2002:210 Tables 3 and 4).


Question Wording. While the wording and response categories of the mail questionnaires for Census 2000 and C2SS were identical (see Table 5.5), there were differences in the format of the questionnaires. With the exception of the nonresponse followup questionnaire, Census 2000 questionnaires were person based (several questions asked of each individual), while C2SS was matrix based (characteristics of all respondents in a household were collected in a column format). The wording of the race questions used in telephone and personal visits in C2SS and Census 2000 differed from the mail versions and from each other. Some of the differences were needed to accommodate the data collection mode, but other differences did not appear to be necessary. One of the most notable differences was that both the mail and the enumerator decennial questionnaires asked for the race or races that a respondent considers himself/herself to be, while the C2SS CATI/CAPI questionnaire asked for the category or categories that best indicate the respondent’s race; the two wordings may be measuring different cognitive domains. The C2SS CATI/CAPI instruments also had examples for the “Other Asian” (e.g., Cambodian, Hmong, Thai, Indonesian) and “Other Pacific Islander” (e.g., Tahitian, Fijian) categories, while the other three did not (Bennett and Griffin, 2002:207).23

Despite the subtle differences in the methodologies, Schwede, Leslie, and Griffin (2002:3136) note these race questions share a common characteristic:

The response categories for race on the census and ACS present a strange pastiche of skin color (white and black), internal indigenous ethnic groups (e.g., American Indian/Alaska Native), U.S. Island Areas (e.g., Samoa), nationality (e.g., Japanese), and geographical region for many countries (other Asian).

Interviewer Effects. In examining data from Census 2000 and C2SS, Schwede, Leslie, and Griffin (2002:3134) found unexpectedly large differences in the distribution of race, particularly among Hispanics in interviewer-administered data collections. They note that about the same percentage (46 percent) of Hispanics reported a race of “White” as reported “Some other race” in enumerator-collected data in Census 2000. On the other hand, more than twice as many Hispanics reported as “White” (64 percent) as reported “Some other race” (30 percent) in the C2SS data collected by interviewers.

Based on that finding, the Census Bureau conducted two studies. The first was a semi-structured debriefing study of ACS interviewers (Leslie, Raglin, and Schwede, 2002). The authors hypothesize that the race reporting differences may be due to “interviewer behavior” caused by differences in experience and training:

• C2SS interviewers are experienced, well-trained, and long-term interviewers who work on other demographic surveys, but Census 2000 interviewers were hired just for Census 2000.

• Most Census Bureau demographic surveys ask pre-Census 2000 race and Hispanic-origin questions which do not ask for more than one race and do not allow reporting of “Some other race.”

• In some surveys, interviewers “have been trained to mark race by observation if the respondents refuse in certain situations.”

• Unlike the Census 2000, the C2SS flashcard does not include an instruction that respondents may select more than one race.

Although this study occurred well after Census 2000 and is based on reported, not observed, behavior, it suggests the possibility that some interviewers may have used active probes which might have influenced reporting of specific race responses (Leslie, Raglin, and Schwede, 2002:2068). The authors hypothesize that the race reporting differences may be due to “interviewer behavior.” In another study of the debriefing data, Schwede, Leslie, and Griffin (2002) found that fewer years of experience, region of the country, and interviewer interpretation of what the race question was asking were associated with FRs (field representatives) accepting and recording “Hispanic” as a response in “other race.”

23 For a comprehensive list of differences, see Table 2 in Leslie, Raglin, and Schwede (2002:2064).

Table 5.5
Race Question by Questionnaire Type

Questionnaire / Race question

Census 2000 Form D-2 (mailback long form), person based or linear layout

American Community Survey Form ACS-1 (2000), matrix layout

Enumerator Questionnaire Form D-2(E)
    Now choose one or more races for each person. Which race or races does each person consider himself/herself to be?

American Community Survey CATI/CAPI instrument
    {Show respondent flashcard B} I’m going to read you a list of race categories. Please/Using this list, please/choose one or more categories that best indicate {Name}/your race.

What is particularly interesting about this study is that “wide differences in FRs’ interpretations of what the race question is asking for” suggest interviewers’ interpretations of the race question may differ from region to region as well (Schwede, Leslie, and Griffin, 2002:3136). In fact, in focus groups, FRs “pressed... researchers hard to explain just what it is headquarters wants to collect with the race question.” The questions themselves leave some doubt as to what is wanted: in mail questionnaires the race question asks for the race or races the respondent considers him/herself to be, while the ACS CATI and CAPI ask for one or more categories that best indicate the respondent’s race.

The second study examined a matched sample of Census 2000 and C2SS records. Raglin and Leslie (2002:2827) matched respondents interviewed in the C2SS in March, April, and May 2000 to their respective Census 2000 records and compared responses to the race question. The advantage of this study is the “ability to compare the paired responses for people as opposed to looking at totals” (Raglin and Leslie, 2002:2829). The authors found much more consistent race responses among respondents, both Hispanics and non-Hispanics, who answered Census 2000 and C2SS via mail, than those who were interviewed in each data collection (Raglin and Leslie, 2002:2831). In explaining the finding, Raglin and Leslie (2002:2831) note that households were not assigned randomly to mail versus interview, but rather were interviewed because they did not respond to the mail questionnaire. “Therefore, these people are the hardest to collect data from.” Raglin and Leslie (2002:2831) also note that census interviewers were allowed to use proxy respondents outside the household, were inexperienced, and used paper and pencil, as opposed to computer-aided instruments. They also note that C2SS interviewers who did not work on Census 2000 were more likely to probe when “Hispanic” was given in answer to race and that many of these interviewers work on other surveys that do not allow “Some other race” (Raglin and Leslie, 2002:2830).
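The value of paired responses over totals can be illustrated with a toy example. The Python sketch below uses a handful of hypothetical matched records, invented purely for illustration and not drawn from the matched Census 2000 and C2SS file, to show that two collections can have identical race totals while individual people report inconsistently. The function names are likewise illustrative.

```python
from collections import Counter

# Hypothetical matched records: (Census 2000 race, C2SS race) per person.
# These four pairs are invented for illustration only.
MATCHED = [
    ("White", "White"),
    ("White", "Some other race"),
    ("Some other race", "White"),
    ("Black", "Black"),
]


def marginal_totals(pairs):
    """Race totals for each collection, ignoring who gave which answer."""
    census = Counter(c for c, _ in pairs)
    c2ss = Counter(s for _, s in pairs)
    return census, c2ss


def consistency_rate(pairs):
    """Share of matched people who reported the same race both times."""
    same = sum(1 for c, s in pairs if c == s)
    return same / len(pairs)


census_totals, c2ss_totals = marginal_totals(MATCHED)
print(census_totals == c2ss_totals)  # totals agree
print(consistency_rate(MATCHED))     # but only some individuals agree
```

In this toy data the offsetting switches between “White” and “Some other race” cancel in the totals, which is the kind of movement a comparison of distributions alone cannot detect.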

Among non-Hispanics, Raglin and Leslie (2002:2831) also noted good consistency in reporting when both the Census 2000 and C2SS data were collected via mail for White, Black, and Asian respondents. They found only moderate consistency for American Indian and Alaska Native, Some other race, and Two or more races respondents. Raglin and Leslie (2002:2831) conclude:

There is often concern about the consistency of race reporting, but these data indicate that for a large share of the population – non-Hispanics who are willing to fill out the mail forms – race reporting is consistent with the exception of people reporting Two or more races.

According to Raglin and Leslie (2002:2830), there was a notable difference between Census 2000 and C2SS race data for Hispanics collected by interviewers. This suggested that interviewers probably affected the reporting of race by Hispanics. The authors suggest that the reason for this was that many Census 2000 enumerators were temporary employees with little interviewing experience, while C2SS enumerators were permanent Census Bureau employees with more experience.

Thus, it seems likely that enumerators and interviewers may have caused differences in the reporting of “Some other race” alone or in combination with other races. To the extent that C2SS interviewers had experience with other data collection that does not have a “Some other race” category, it is likely that they were less willing to accept “Some other race” responses. As discussed previously, an observation study24 reported by Hough and Borsa (2003:42) showed that some census enumerators had difficulty asking about race. Some did not show the flashcard, read the question as worded, or read all of the race categories.

Processing Differences. A difference in the processing of enumerator forms (which had only one write-in area for race) compared to mail forms (which had three write-in areas for race) led to an overstatement of Some other race responses by 6 percent, and of Two or more races responses by about 15 percent (see Cresce, 2003 for a more detailed discussion).


24 It should be noted that observations were not based on a scientifically selected sample, and were based on subjective judgments of individual observers.

Discussion of Differences. Perhaps the question we should be asking is why there aren’t more differences between Census 2000 and C2SS race distributions, not why there are any differences (to paraphrase sociologist Kingsley Davis). Even if we took two independent decennial censuses at the same time, it would be reasonable to expect differences due to non-sampling error. In comparing Census 2000 and C2SS, we know there are substantial wording and methodological differences, and some processing differences, as discussed above. However, the involvement of interviewers probably had a large effect on race reporting, particularly that of Hispanic respondents. Self-selection by the “difficult-to-enumerate” through not responding via the mail questionnaire may just complicate the task of enumerators. But as noted by Leslie, Raglin, and Schwede (2002:2065):

Probing is one part of the question-and-answer process that cannot be completely standardized and thus, there is an opportunity for interviewers to be inconsistent across respondents and across interviews. That is the one situation in which interviewer-related error can occur (Mangione, Fowler, and Louis, 1992).

We should note that there are many other situations where interviewer-related error can occur, but it is clear that responses tend to be most consistent when collected via mail (Raglin and Leslie, 2002:2830-2831). Thus, it seems imperative that the Census Bureau study ways to maximize mail response, and to ensure that interviewers have a standardized approach to collecting race in all its surveys in order to minimize interviewer effects on data collection.


6. Puerto Rico Census 2000 Race and Ethnicity Questions

Census 2000 was the first time that residents of Puerto Rico were asked to complete and return their questionnaires by mail (Berkowitz, 2001:1). It also marks the first time questions on race and Hispanic origin were asked of individuals in Puerto Rico, although race was collected by enumerators through observation in the 1950 census. The decision to include the race and Hispanic-origin questions “occurred because the government of Puerto Rico requested the same questionnaire content as stateside in order to speed the processing and release of Puerto Rico census data and so that Puerto Rico could be included in statewide statistics” (Christenson, 2003:1).

According to Berkowitz (2001:iv):

Almost everyone had heard something about Census 2000 from television and radio ads, newspapers, schools, or from informal sources such as relatives, neighbors, and “brothers” or “sisters” in their churches. Most had also discussed some aspect of the process with someone else. Many participants indicated they had consulted with family members or neighbors while trying to complete their questionnaires, sometimes in an effort to reach a consensus as to what was being asked or how they should answer.

Because of the newness of the questions, it is probably not surprising that Berkowitz (2001:21-22) found that there were “concerns that some questions were too private,” and that “the race/ethnicity questions inspired the most strenuous negative reactions of any questions on the Census 2000 questionnaire” in more urban coastal communities of Puerto Rico. It is also possible that the role of interviewers in Puerto Rico might be different from that role stateside. About 53 percent of Puerto Rico’s households returned their Census 2000 questionnaires by mail, compared with 65 percent stateside (Berkowitz 2001:1).25 Berkowitz (2001:16) found “a strong preference for the more personal, door-to-door approach taken in the 1990 census. They found the idea of dropping off the questionnaire at the gate too impersonal and bureaucratic for their taste.”

6.1 Study design and limitations

The basic method followed by Christenson (2003:2-3) was to compare race and Hispanic-origin distributions based on Census 2000 100-percent data collected stateside and in Puerto Rico. The main limitation of the evaluation of Puerto Rico’s race and Hispanic-origin data is the “lack of any previous quantitative measures” for comparison. The lack of cognitive studies prevents drawing “definitive conclusions about what led” respondents to answer the way they did. Finally, we may not know “the extent to which the responses of Puerto Ricans were shaped by their understanding of their racial identity as opposed to the way they interpreted and reacted to the question itself.”

6.2 Nonresponse to race and Hispanic origin

Table 6.1 shows that the nonresponse to race is higher in Puerto Rico than in the United States (5.0 and 4.1 percent, respectively), but just the opposite is true of the nonresponse to Hispanic origin (3.4 and 4.8 percent, respectively).

Table 6.2 shows that Hispanics overall and Puerto Ricans in the U.S. are much more likely not to answer the race question than their counterparts in Puerto Rico.

25 Fifty states and the District of Columbia constitute the stateside data.

Table 6.1
Nonresponse to Race and Hispanic Origin in the United States and Puerto Rico

Question           United States   Puerto Rico
Race                        4.1%          5.0%
Hispanic Origin             4.8%          3.4%

Source: Tabulation of Census 2000 Hundred-Percent Data File (HDF).

Table 6.2
Nonresponse to Race by Hispanics and Puerto Ricans in the United States and Puerto Rico

Hispanic group     United States   Puerto Rico
All Hispanics              14.3%          3.4%
Puerto Rican               17.5%          3.4%

Source: Tabulation of Census 2000 Hundred-Percent Data File (HDF).

Nonresponse to race by Hispanics26 in the United States was over 14 percent, and over 17 percent by Puerto Ricans in the United States, compared with under 4 percent each on the island of Puerto Rico.27

6.3 Hispanic-origin reporting

It is not surprising that 98.8 percent of Puerto Rico’s residents were identified as Hispanic or that 95.1 percent were identified as Puerto Rican. Another 1.5 percent were of Dominican origin; 1.4 percent were identified as “Other Hispanic or Latino”; Cubans were about 0.5 percent, and Mexicans 0.3 percent (Christenson, 2003:4). Less than 4 percent of the Hispanic-origin responses in Puerto Rico were write-in entries, and 37.6 percent of those reflected the check box responses (Mexican, Puerto Rican, etc.). Another 52.8 percent of the write-in responses were detailed Hispanic responses; 6.5 percent were multiple responses; and 3.1 percent were other responses. Among the specific Hispanic groups written in, 71.4 percent were “Dominican”; 11.6 percent were South American entries; 5.7 percent were Spaniard; 4.8 percent were Central American; and only 6.4 percent were general descriptors (e.g., Hispanic, Latino, etc.) (Christenson, 2003:5).

6.3.1 Reporting of Hispanic originby enumerators

In general, the distribution of Hispanic-origin responses that were enumerator-filled does not vary much from those that were respondent-filled. Table 6.3 shows no differences in the proportion Hispanic and non-Hispanic in Puerto Rico by mode of collection, but there are some differences in the specific categories. Compared with respondent-filled returns, enumerator-filled returns showed proportionately fewer Puerto Ricans (-0.7 percent), Cubans (-33.0 percent), and all other Hispanic groups (-33.0 percent), but more Dominicans (110 percent) and Mexicans (150 percent).

Table 6.4 shows more striking differences in the proportion of Hispanics and non-Hispanics in the United States by mode of collection. Enumerator-filled returns in the United States showed proportionately fewer non-Hispanic (-6.5 percent) and more Hispanic (52.7 percent) responses than respondent-filled returns. Enumerator-filled returns in the United States showed proportionately fewer Cubans (-20.0 percent), but more Puerto Ricans (45.5 percent), Dominicans (50.0 percent), Mexicans (80.3 percent), and all other Hispanic groups (12.9 percent) than respondent-filled returns.

6.3.2 Discussion of Hispanic-origin reporting

Although there is no benchmark to evaluate the reporting of Hispanic origin in Puerto Rico, the results from Census 2000 appear to be reasonable prima facie, and there appears to be no particular bias in comparisons of respondent-filled returns and enumerator-filled returns in Puerto Rico. In contrast, there are significant differences in the distributions stateside: enumerator-filled returns showed proportionately more Hispanics (with the exception of Cubans among the groups examined). In terms of reporting of detailed Hispanic groups, there did not appear to be excessive reporting of general


26 Hispanics overall and Puerto Ricans whose origin was not edited or imputed.

27 Hereafter, I will refer to the island of Puerto Rico as “the Island.”

Table 6.3 Distribution of Hispanic Origin in Puerto Rico by Mode of Data Collection

Hispanic Origin       Respondent-filled (1)   Enumerator-filled (2)   Percent difference (3) = (2-1)/1
Hispanic              98.8%                   98.8%                   0.0%
  Puerto Rican        95.5%                   94.8%                   -0.7%
  Dominican           1.0%                    2.1%                    110.0%
  Cuban               0.6%                    0.4%                    -33.0%
  Mexican             0.2%                    0.5%                    150.0%
  Other Hispanics     1.5%                    1.0%                    -33.0%
Non-Hispanic          1.2%                    1.2%                    0.0%

Source: Adapted from Christenson (2003:14, Table 9).
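
The percent-difference column in Table 6.3 follows the formula in its header, (2-1)/1: the enumerator-filled share minus the respondent-filled share, divided by the respondent-filled share. The following is a minimal Python sketch of that arithmetic using the shares printed in the table. The function name `percent_difference` and the dictionary layout are illustrative assumptions, not part of the source tabulation.

```python
def percent_difference(respondent: float, enumerator: float) -> float:
    """Relative difference (2-1)/1, expressed in percent.

    respondent  -- share on respondent-filled returns, column (1)
    enumerator  -- share on enumerator-filled returns, column (2)
    """
    return (enumerator - respondent) / respondent * 100.0


# Shares (in percent) as printed in Table 6.3:
# group -> (respondent-filled, enumerator-filled)
table_6_3 = {
    "Hispanic": (98.8, 98.8),
    "Puerto Rican": (95.5, 94.8),
    "Dominican": (1.0, 2.1),
    "Cuban": (0.6, 0.4),
    "Mexican": (0.2, 0.5),
    "Other Hispanics": (1.5, 1.0),
    "Non-Hispanic": (1.2, 1.2),
}

# Recompute column (3), rounded to one decimal place.
differences = {
    group: round(percent_difference(resp, enum), 1)
    for group, (resp, enum) in table_6_3.items()
}

for group, diff in differences.items():
    print(f"{group:16s} {diff:+.1f}%")
```

Because the inputs are already rounded to one decimal place, recomputed values can differ slightly from the printed column. For example, the Cuban and Other Hispanics rows come out near -33.3 percent rather than the printed -33.0 percent.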

Table 6.4 Distribution of Hispanic Origin in the United States by Mode of Data Collection

Hispanic Origin       Respondent-filled (1)   Enumerator-filled (2)   Percent difference (3) = (2-1)/1
Hispanic              11.0%                   16.8%                   52.7%
  Puerto Rican        1.1%                    1.6%                    45.5%
  Dominican           0.2%                    0.3%                    50.0%
  Cuban               0.5%                    0.4%                    -20.0%
  Mexican             6.1%                    11.0%                   80.3%
  Other Hispanics     3.1%                    3.5%                    12.9%
Non-Hispanic          89.0%                   83.2%                   -6.5%


Hispanic terms probably because the overwhelmingly dominant group on the Island (Puerto Rican) appears as a reporting category in all data collections. Cresce and Ramirez (2003:11) suggest that the Puerto Rican and Cuban categories were the least affected by changes in the Hispanic-origin question used in Census 2000 (see Chapter 2 for a more detailed discussion).

6.4 Race reporting

Despite the newness of race reporting in Puerto Rico, reporting was very complete, as seen in the section above, and quite different than might have been expected. Table 6.5 shows the distribution of race in Puerto Rico for all people. About eight in every ten people (80.5 percent) were reported as “White alone,” and 84.0 percent reported “White alone or in combination with one or more other races.” Nearly one in twelve (8.0 percent) reported as “Black or African American alone,” but 10.9 percent reported “Black alone or in combination with one or more other races.” About 6.8 percent reported as “Some other race alone,” and 8.3 percent did so in combination with other races. One in 25 (4.2 percent) reported being of more than one race.

Because most residents of Puerto Rico are Hispanic, it is important to compare their race distribution to that of stateside Hispanics and Puerto Ricans. Table 6.6 shows the race distributions of Hispanics in Puerto Rico and the United States. Compared with the United States, Hispanics in Puerto Rico are much more likely to report “White alone” (68 percent) and “Black alone” (295 percent). On the other hand, Hispanics in Puerto Rico are much less likely to report “American Indian and Alaska Native alone” (75 percent), “Some other race alone” (84 percent), and “Two or more races” (35 percent). Christenson (2003:8) reports a similar pattern when looking at the “race alone or in combination” distribution of race.

How different are the responses of Puerto Ricans on the Island from those stateside? Table 6.7 shows results similar to those for Hispanics overall. Compared with the United States, Puerto Ricans in Puerto Rico are also much more likely to report “White alone” (72 percent) but only somewhat more likely to report “Black alone” (17 percent). On the other hand, Puerto Ricans on the Island are also much less likely to report “American Indian and Alaska Native alone” (50 percent), “Some other race alone” (82 percent), and “Two or more races” (46 percent) than are Puerto Ricans stateside.


Table 6.5 Race Distribution in Puerto Rico

Selected race categories               Race alone   Race alone or in combination with other races
White                                  80.5%        84.0%
Black or African American              8.0%         10.9%
American Indian and Alaska Native      0.4%         0.7%
Asian                                  0.2%         0.5%
Some other race                        6.8%         8.3%
Two or more races                      4.2%         -

Source: Summary File 1, Table P3 and Table P9.
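
Each “alone or in combination” figure in Table 6.5 includes the corresponding “alone” figure, so the gap between the two columns gives the share of the population that reported a race only together with one or more other races. A minimal Python sketch of that subtraction follows, using the table's rounded percentages as inputs. The variable names are illustrative assumptions.

```python
# Shares (in percent) as printed in Table 6.5:
# race -> (race alone, race alone or in combination)
table_6_5 = {
    "White": (80.5, 84.0),
    "Black or African American": (8.0, 10.9),
    "American Indian and Alaska Native": (0.4, 0.7),
    "Asian": (0.2, 0.5),
    "Some other race": (6.8, 8.3),
}

# Share reporting the race only in combination with another race.
in_combination_only = {
    race: round(combined - alone, 1)
    for race, (alone, combined) in table_6_5.items()
}

for race, share in in_combination_only.items():
    print(f"{race:36s} {share:.1f}%")
```

The five gaps sum to more than the 4.2 percent who reported two or more races, because each person reporting more than one race is counted once under every race reported.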

Table 6.6 Race Distribution of Hispanics in Puerto Rico and the United States

Selected race categories                     Puerto Rico (1)   United States (2)   Percent difference (3) = (1-2)/2
White alone                                  80.7%             47.9%               68%
Black or African American alone              7.9%              2.0%                295%
American Indian and Alaska Native alone      0.3%              1.2%                -75%
Some other race alone                        6.9%              42.2%               -84%
Two or more races                            4.1%              6.3%                -35%

Source: Adapted from Christenson (2003:6) Table 3.

Table 6.7 Race Distribution of Puerto Ricans in Puerto Rico and the United States

Selected race categories                     Puerto Rico (1)   United States (2)   Percent difference (3) = (1-2)/2
White alone                                  81.4%             47.4%               72%
Black or African American alone              7.6%              6.5%                17%
American Indian and Alaska Native alone      0.3%              0.6%                -50%
Some other race alone                        6.6%              37.3%               -82%
Two or more races                            4.0%              7.4%                -46%

Source: Adapted from Christenson (2003:9-10) Table 5a and Table 5b.

Christenson (2003:11) reports that 9.2 percent of the responses to race in Puerto Rico were write-ins. Of those, 82.8 percent were classified as “Some other race.” Of the “Some other race” responses, 63.8 percent involved a Hispanic-origin answer (e.g., “Hispanic,” “Puerto Ricans,” etc.), 31.9 percent were a ‘color’ response (e.g., “Moreno,” “Brown,” etc.), 1.8 percent an undefined mixed race response (e.g., “Mixed,” “Mulatto,” “Multiracial,” etc.), and the rest were other responses.

6.4.1 Reporting of race by enumerators

Unlike Hispanic origin, race data collected by enumerators show reporting that is moderately distinct from that in respondent-filled returns. As shown in Table 6.8, enumerator-filled returns for Hispanics in Puerto Rico are proportionately less likely to be “White alone” (-8 percent), “Black alone” (-17 percent), and “American Indian and Alaska Native alone” (-400 percent), but more likely to be “Some other race alone” (54 percent) and “Two or more races” (35 percent).

Enumerator-filled returns for Hispanics in the United States (as shown in Table 6.9) are proportionately less likely to be “White alone” (-8 percent), “Two or more races” (-43 percent), and “American Indian and Alaska Native alone” (-63 percent), but more likely to be “Black alone” (5 percent) and “Some other race alone” (13 percent). Unlike Hispanic origin, race data collected by enumerators showed moderately distinct reporting.
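
The percent differences quoted for Tables 6.8 and 6.9 can be checked against the printed shares. The sketch below recomputes them for Table 6.8 in two ways: relative to the respondent-filled share, as in the (2-1)/1 header of Tables 6.3 and 6.4, and relative to the enumerator-filled share. The second base is an assumption made only for this check. The printed whole-number values happen to match it, but the report does not state which base was used. Function and variable names are illustrative.

```python
def pct_diff(respondent: float, enumerator: float, base: float) -> float:
    """Difference between the two shares as a percent of the chosen base."""
    return (enumerator - respondent) / base * 100.0


# Table 6.8 as printed:
# category -> (respondent-filled %, enumerator-filled %, printed percent difference)
table_6_8 = {
    "White alone": (83.0, 77.1, -8),
    "Black or African American alone": (8.3, 7.1, -17),
    "American Indian and Alaska Native alone": (0.5, 0.1, -400),
    "Some other race alone": (4.7, 10.3, 54),
    "Two or more races": (3.4, 5.2, 35),
}

rows = []
for category, (resp, enum, printed) in table_6_8.items():
    on_respondent = round(pct_diff(resp, enum, base=resp))  # (2-1)/1
    on_enumerator = round(pct_diff(resp, enum, base=enum))  # (2-1)/2, assumed
    rows.append((category, on_respondent, on_enumerator, printed))
    print(f"{category:42s} (2-1)/1 = {on_respondent:5d}   "
          f"(2-1)/2 = {on_enumerator:5d}   printed = {printed:5d}")
```

On the respondent-filled base, the American Indian and Alaska Native figure would be -80 percent rather than -400 percent, so readers comparing these tables with Tables 6.3 and 6.4 may want to keep the apparent difference in base in mind.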


At least in some areas of Puerto Rico (urban coastal areas), the race question28 “elicited the strongest negative reactions” from participants in four focus groups. Berkowitz (2001:17) notes that several participants reported that they “stopped filling out their questionnaire” upon reaching the race question. Some participants felt the questions were discriminatory, divisive, and not appropriate for “the Creole or ‘mixed’ realities of Puerto Rico.” For example, Berkowitz (2001:17-18) reports some participants’ reactions [emphasis added]:

“I have received training on equal employment. I understand that about the races. When I saw the census form and read the race question I thought I am not White or Black or anything else because I am Hispanic and so I was upset and decided not to fill it out.”

“I did not find an alternative answer for my race because we are neither African Blacks nor American Indians. The census did not have the optional answer of ‘Puerto Rican,’ our race. The question upset me because I thought why do we have to be divided as a race, if we have all kinds of races living here: Chinese, Arabs, Dominicans, Cubans. It occurred to me that this question was somewhat racist and I did not want to fill out the form and so I did not.”

“There was no option for Latino, or Puerto Rican, or Hispanic. This badly designed question demonstrated that our culture does not exist. I felt offended and said I would not fill it out. My wife told me I had


Table 6.9 Race Distribution of Hispanics in the United States by Mode of Data Collection

Selected race categories                     Respondent-filled (1)   Enumerator-filled (2)   Percent difference
White alone                                  49.2%                   45.7%                   -8%
Black or African American alone              2.0%                    2.1%                    5%
American Indian and Alaska Native alone      1.3%                    0.8%                    -63%
Some other race alone                        40.0%                   46.1%                   13%
Two or more races                            7.0%                    4.9%                    -43%


Table 6.8 Race Distribution of Hispanics in Puerto Rico by Mode of Data Collection

Selected race categories                     Respondent-filled (1)   Enumerator-filled (2)   Percent difference
White alone                                  83.0%                   77.1%                   -8%
Black or African American alone              8.3%                    7.1%                    -17%
American Indian and Alaska Native alone      0.5%                    0.1%                    -400%
Some other race alone                        4.7%                    10.3%                   54%
Two or more races                            3.4%                    5.2%                    35%


28 Although Berkowitz (2001:17) reports strong reactions to the “race/ethnicity questions,” most of the reactions she reports seem directed solely at race.

to fill it out, according to law. I said let them come and get me and have them put me in jail!”

Despite the strong reactions to the race question cited above, it is clear that the race reporting in Puerto Rico is good in terms of completeness (about 5 percent did not respond). Compared with stateside Hispanics and Puerto Ricans, Island residents were much more likely to respond to race (see Table 6.2). Residents of Puerto Rico were much more likely to report as “White” and much less likely to report as “Some other race” than their stateside counterparts. It is possible that the higher proportion of “Some other race” stateside is partly a function of the much larger race nonresponse among Hispanics stateside, and the consequent imputation. Hispanic residents in Puerto Rico were also more likely to self-report as “Black” on respondent-filled returns than their stateside counterparts (8 percent and 2 percent, respectively; see Tables 6.8 and 6.9), but Puerto Ricans in the United States were only slightly less likely to report “Black” than their Island counterparts (6.5 and 7.6 percent, respectively; see Table 6.7).

Because of the large role that enumerators played in Puerto Rico, there was some concern that enumerators may have affected race reporting. Were enumerators somehow responsible for the large proportion reporting “White” in Puerto Rico? That does not appear to be the case.

As seen in Table 6.8, enumerator-filled returns show slightly less reporting of “White,” and more reporting of “Some other race.” They also show slightly less reporting of “Black” than respondent-filled returns. Interestingly, enumerator-filled returns in the United States showed the opposite: slightly higher proportions of “Black” than respondent-filled returns (see Table 6.9). In any case, it is hard to conclude that enumerators somehow significantly biased or distorted the race data of Hispanics. The race reporting pattern of respondent-filled returns is similar, although certainly not identical, for Hispanic respondents both in the United States and in Puerto Rico.

On the other hand, the race reporting pattern is very different among Hispanic and Puerto Rican respondents in Puerto Rico compared with their stateside counterparts (see Table 6.6 and Table 6.7). It is also clear that these differences are not explained by enumerator behavior. The differences in the race reporting pattern of Puerto Ricans on the Island and in the United States suggest that, despite the controversy, “race” is conceptualized and understood differently on the Island than in the United States.



The major objective of this Topic Report is to synthesize results from the Census 2000 Testing, Experimentation, and Evaluation Program research relevant to race and Hispanic origin and, if possible, to answer some or all of the research questions that guided the report.

7.1 Effects of questionnaire changes

What was the overall effect on reporting of race and Hispanic origin engendered by the changes in question sequencing, wording, questionnaire layout, and dropping examples that were included in 1990? Was completeness of reporting adversely affected?

The lesson we learned, once again, from the Alternative Questionnaire Experiment (AQE)29 (see chapter 2) is that changes in the questionnaire (in this case the mailout form) have unintended consequences. Some of the changes had a perverse effect and did not fully resolve the issues they were designed to address, as explained below.

7.1.1 Sequencing and instructions

In Census 2000, Hispanic origin was sequenced ahead of race and an instruction was added to answer both questions. These changes had two main objectives: a) decrease nonresponse to Hispanic origin; and b) increase reporting in standard race categories by Hispanics. While the AQE could not differentiate exactly what effects were produced by a specific change, there is evidence that all the changes had an effect.

First, nonresponse to the Hispanic-origin question dropped quite dramatically in the 2000-style form compared to the 1990-style form. Second, the 2000-style form elicited better race reporting by Hispanic respondents, although nonresponse to race is still much too high. Proportionately fewer Hispanic respondents reported as “Some other race” in the 2000-style form, but this change did not even come close to eliminating the problem. Also, more Hispanic respondents responded “White” in the 2000-style form than in the 1990-style form.

7.1.2 Two or more races

One of Martin’s (2002a:iv) interesting conclusions is that, “contrary to what might have been expected, there is little evidence that allowing respondents to report more than one race reduced the single race reporting in the 5 major race categories.” However, one reason is that, even with instructions to report one race, some respondents to the 1990-style form reported two or more races anyway. In addition, almost one-third of Hispanic respondents did not report a race. So, the actual impact on published race data depends on how these responses are imputed.

It is important to remember that these findings are generalizable only to the mailout universe.

It is also possible that, with a much larger sample, we might have reached different conclusions because we had few cases in the smaller categories. Small samples are a recurring problem with all research on these types of questions.

7.1.3 Question wording and examples

One unintended effect of redesigning the mail form to be more user-friendly was to change the reporting of specific Hispanic groups. Fortunately, Martin (2002a:v) reports no evidence of any difference in the proportion of people reporting as Hispanic, but this conclusion could change if nonresponses are imputed. Nonetheless, there was probably more complete reporting by non-Hispanics in the 2000-style form. The problem is that the 2000-style form elicited fewer reports of specific Hispanic subgroups, and more reports of general Hispanic identity (Martin 2002:v). Data users were disturbed by the reduced detail for the Hispanic population in Census 2000 (see GAO 2003a, Logan 2002:3, and Suro 2002:8).

Many of our critics blame the problem on the dropping of examples and the change in question wording, but it is not clear that this is the whole explanation. First, Martin (2002a) showed that the “Mexican” category was affected, but logically this category should not have been


7. Conclusion

29 Martin, Elizabeth, 2002, “Questionnaire Effects on Reporting of Race and Hispanic Origin: Results of a Replication of the 1990 Mail Short Form in Census 2000,” Alternate Questionnaire Experiment.

affected since it appeared as a checkbox in both forms. As Martin points out, some of these differences may have resulted from other changes to the form. Second, Cresce and Ramirez’s (2003) work suggests that “Puerto Rican” and “Cuban” groups may have been affected even though both appeared as checkboxes in both forms. Third, Martin (2002b:2) showed the opposite effect among Asian and Pacific Islander categories: the 2000-style form had higher proportions among the example groups than the 1990-style form. However, it is important to consider that these results may be an artifact of the relatively small sample size for the smaller race categories. Had the sample size been much larger, we might have reached different conclusions. Cresce and Ramirez (2003) did not undertake a similar analysis for Asian and Pacific Islander groups, but it should be done for completeness’ sake.

Dropping examples in the question on Hispanic origin may have given some respondents the impression that we were attempting to get them to select among the terms “Spanish,” “Hispanic” or “Latino.” The “print group” instruction may have reinforced that notion, resulting in fewer specific and more general responses. It may have also created inconsistent reporting, as explored below. In any case, the 2003 National Census Test data should be able to shed additional light on the effect of examples and revised instructions on the responses to the Hispanic-origin and race questions.

7.2 Consistency in reporting

The Content Reinterview Survey (CRS)30 (see chapter 3) allows us to assess the consistency of reporting race, Hispanic origin, place of birth and ancestry, among other items. The CRS report considered Hispanic-origin and place-of-birth reporting to be of good consistency, and race and ancestry reporting to be of moderate consistency.31

Over 95 percent of respondents answered both the race and Hispanic-origin question in Census 2000 and CRS.

Hispanic-origin reporting. According to Singer and Ennis (2002:xxii-xxiii), the edited Hispanic-origin data were of good consistency, but the lack of clear instructions on the question may have caused some respondents to report multiple categories when the question was intended to elicit one. Based on unedited data, there was good consistency for the “Not Hispanic” and the “Mexican” checkbox categories, but moderate consistency for the “Puerto Rican,” “Cuban,” and “Other Hispanic” checkbox categories when considered separately. Examining eight categories also showed good consistency – about 3.3 percent of respondents changed their answers. However, as Singer and Ennis (2002:53-54) remind us, all but the “Not Hispanic” and “Mexican” categories were “rare,” which can cause measurement error in the indexes. Singer and Ennis (2002) also noted that some respondents changed answers between Census 2000 and CRS, but what is clear is that most of the inconsistency arises in the “Other Hispanic” category and the multiple reports.

Examining the differences in the questions used in Census 2000 and CRS, a respondent might conclude that the mailback Hispanic-origin question is asking if a person is “Spanish” or “Hispanic” or “Latino,” whereas the enumerator and CRS questions are asking about specific groups (e.g., “Mexican,” “Puerto Rican,” “Cuban,” or of another Hispanic or Latino group). The “print group” instruction on the mailback form may have reinforced the notion that we were asking respondents to select, or even reject, the general responses. If so, a respondent in Census 2000 may have replied “No, not Spanish/Hispanic/Latino” and “Yes, Cuban,” meaning a Cuban who does not identify with the general terms, but during reinterview in CRS the respondent said, “Yes, Cuban,” thus creating an apparently inconsistent response. Similarly, a respondent might have identified as “Latino” in Census 2000, and then identified as “Yes, Puerto Rican” in CRS, also creating an apparent inconsistency.

Race reporting. By examining reporting of Hispanic respondents separately, Singer and Ennis (2002:59) concluded that the Hispanic population contributes greatly to the race data variability. This finding reconfirms that Hispanic respondents have more difficulty answering the race question than do non-Hispanics. However, among non-Hispanics, only Blacks, Asians, and Whites show good consistency, while American Indians and Pacific Islanders show only moderate reporting consistency. “Some other race” and “Two or more races” showed poor reporting consistency. There is some evidence from observations that enumerators did not always read the question as worded and may have failed to show flashcards (Hough and Borsa 2003:42). As with Hispanic origin, there were differences in questionnaires. But one recurring difficulty is a sample size that is insufficient to properly measure differences in reporting, especially for the smaller or rare groups.

30 Singer, Phyllis, and Sharon R. Ennis, 2002, “Census 2000 Content Reinterview Survey: Accuracy of Data for Selected Population and Housing Characteristics as Measured by Reinterview,” Census 2000 Evaluation B5.

31 For simplicity of expression, the following terms used in the CRS report were modified: 1) low inconsistency = good consistency; 2) moderate inconsistency = moderate consistency; and 3) high inconsistency = poor consistency.

Ancestry reporting. One of the interesting findings reported by Singer and Ennis (2002:27) was that responses collected by mail showed more consistency than those collected by enumerators, although both were in the moderate range. In examining the data for specific Hispanic origins of sufficient size, we noted more consistency. There was more inconsistency in reporting “Hispanic” and “Spanish,” and some of the inconsistency came from moving between general Hispanic and specific Hispanic responses.

Place-of-birth reporting. Generally speaking, the consistency of place-of-birth reporting as identified by the write-in response was quite good. But Singer and Ennis (2002:32) warn of evidence that the model assumptions were not met for some categories. Even so, subgroups showed good consistency. When we examined place-of-birth reporting from Central and South America, these responses appeared to be reported consistently. These results suggest, for Hispanic groups at least, that place of birth and ancestry may be considered reliable supplements for Hispanic origin. Their use for supplementing race responses, however, needs to be further explored.

7.3 Sequencing and nonresponse

Did sequencing of Hispanic origin ahead of race have the desired effect of reducing nonresponse to Hispanic origin? Did the sequencing of Hispanic origin ahead of race result in proportionately fewer “Some other race” responses in race and did Hispanics have more complete reporting of race?

There is very clear evidence that sequencing of Hispanic origin ahead of race did reduce nonresponse to Hispanic origin. There is some evidence based on the AQE that sequencing of Hispanic origin ahead of race resulted in proportionately fewer “Some other race” responses. Nonetheless, it is still the third largest race category after “Black or African American,” and shows no indication of disappearing. The AQE also offers some evidence that Hispanics reported race more completely in 2000-style forms, but very large proportions (about 21 percent) of Hispanics still did not answer the race question. In Census 2000, about 17 percent of race responses for Hispanics were imputed.

7.4 Comparing Census 2000 to other data sources

How do the decennial data on race compare with those collected in other sources?

Several recent studies compare Census 2000 data on race and ethnicity to data from other sources. Based on the work of Bennett and Griffin (2002); Leslie, Raglin, and Schwede (2002); Raglin and Leslie (2002); and Schwede, Leslie, and Griffin (2002), we examined how race and Hispanic origin differ in Census 2000 and the Census 2000 Supplementary Survey (C2SS). One of the main objectives of the ACS is to serve as a replacement for the long form for the 2010 Census. Therefore, it is very important to understand how Census 2000 and C2SS differ for race and Hispanic origin, and what revisions to procedures and the questionnaires can reduce these differences.

Hispanic-origin reporting. Bennett and Griffin (2002:210) found no discernible differences in the proportion of Hispanic-origin responses, but found significant differences in the detailed Hispanic-origin responses. Specifically, they found that compared with Census 2000, C2SS produced proportionately more Mexicans. On the other hand, the “Other Hispanic” category declined by about 17 percent. Presumably this reflects proportionately lower reporting of general Hispanic responses, as shown in other research by Cresce and Ramirez, 2003; Logan 2002; and Suro 2002.

Bennett and Griffin (2002:210) speculate that the observed differences are due to the use of examples in the C2SS during telephone and personal visit interviewing. These aids were not provided during Census 2000 operations, although one could argue that the presence of the Hispanic-origin checkbox groups acts as a set of examples. This reasoning does not explain why the Mexican percentage is also lower in Census 2000.

Race reporting. Bennett and Griffin (2002:208-210) found significant differences between C2SS and Census 2000 distributions for both the race alone and race alone or in combination categories. The authors found a number of differences in the race distributions, but the percentage of “White alone” and “Some other race alone” showed the greatest difference. The C2SS showed proportionately more “White alone” responses and fewer “Two or more races” responses. Census 2000 showed proportionately more “Some other race alone” responses, as explained in more detail below. Small but significant differences also existed for “Black or African American alone,” “American Indian or Alaska Native alone,” and “Asian alone” (Bennett and Griffin 2002:208). The authors also examined the race distributions for Hispanics and non-Hispanics because reporting patterns tend to be quite different for Hispanic respondents.

Race reporting by Non-Hispanics. Compared with Census 2000, C2SS had slightly more reports of “White alone” and “Asian alone,” and fewer reports of “American Indian and Alaska Native alone” and “Two or more races.” The largest difference between Census 2000 and C2SS was for the “Some other race alone” population. When “Two or more races” is broken into “Two races which include Some other race” and “All other race combinations,” Census 2000 had proportionately more race combinations that included “Some other race” as one of the races than did C2SS. On the other hand, C2SS had proportionately more reports of all other race combinations than did Census 2000.

Race reporting by Hispanics. Compared with Census 2000, C2SS has about 31 percent more reports of “White alone,” about 30 percent fewer “Some other race” reports, and about 24 percent fewer reports of “Two or more races” among Hispanics. Looking at “Two or more races” broken into “Two races which include Some other race” and “All other race combinations,” Census 2000 had proportionately more two race combinations that included “Some other race” as one of the races than did C2SS, and proportionately fewer that included “All other race combinations.”

Based on the apparent reporting differences, the Census Bureau conducted two studies. The first, a semi-structured study of debriefing data, suggested some C2SS interviewers used active probes that may have influenced reporting of specific race responses (Leslie, Raglin, and Schwede, 2002:2068). In another study of the debriefing data, Schwede, Leslie, and Griffin (2002) found that fewer years of experience, region of the country, and interviewer interpretation of what the race question was asking may have affected the responses. These studies also found differences in interviewers’ interpretations of “what” the race question is asking, and noted that interviewers pressed researchers to explain “just what it is headquarters wants to collect with the race question.”

Raglin and Leslie (2002:2827) matched respondents interviewed in the C2SS to their Census 2000 records and compared responses to the race question. The authors found much more consistent race responses among respondents who answered both Census 2000 and C2SS via mail than among those who were interviewed in both. This was true for both Hispanics and non-Hispanics. In explaining the finding, Raglin and Leslie (2002:2831) note that households were not assigned randomly to mail vs. personal interviews, but rather were interviewed because they did not respond to the mail questionnaire and may represent the hard-to-enumerate population.

Among non-Hispanics, Raglin and Leslie (2002:2831) noted good consistency in reporting when both the Census 2000 and C2SS data were collected via mail for White, Black, and Asian respondents. They found only moderate consistency for American Indian and Alaska Native, Some other race, and Two or more races respondents. Raglin and Leslie (2002:2831) concluded that, for “non-Hispanics who are willing to fill out the mail forms – race reporting is consistent with the exception of people reporting two or more races.”

According to Griffin et al. (2002:63), these studies suggest that differences in interviewing techniques used in Census 2000 and C2SS may have led to more reporting of “White” in the C2SS and more reporting of “Some other race” in Census 2000 for Hispanics. This research did not explain the differences seen for non-Hispanics. These findings led researchers to investigate processing differences between Census 2000 and ACS. One processing difference in the race edits for enumerator forms in Census 2000 may have led to an overstatement of the number of respondents in the “Some other race” and “Two or more races” categories. Other research suggests this may not be the entire explanation, although it may account for some of the differences in distributions. This processing difference may have exaggerated the Two or more races category by about 15 percent (see Cresce, 2003).

7.5 Comparing Census 2000 and the 1990 census

Given the changes in the race and Hispanic-origin question in 2000, how can these data be compared to data from 1990? What are the limitations of such comparisons? What lessons have we learned about bridging the Census 2000 race data so that they are more comparable to data collected previously and to data in other data collections that do not allow for more than one race response?

Hispanic origin. Although there were what turned out to be significant differences between the Census 2000 and the 1990 census Hispanic-origin questions, the overall total Hispanic population data are reasonably comparable. For example, Logan (2002:3,4) concluded that Census 2000 had a good count of Hispanics, but did not do well in identifying their specific origin. Several studies indicate that the observed changes in the distribution of detailed Hispanic groups in Census 2000 were not due entirely to a shift in how people of Hispanic origin define themselves. Rather, the distribution may have been affected by some change in the way we asked the Hispanic-origin question. We are left with the question of whether the elimination of examples was the probable cause of the reporting differences in detailed Hispanic groups. The GAO report (2003a) highlighted the discontent with the reporting of specific Hispanic subgroups in Census 2000. This report marks an important turning point in feedback given to the Census Bureau. Public concern is now focused on a very complete count of specific subgroups within minority categories, rather than, as in previous censuses (e.g., Choldin, 1986), on the differential undercount of minority groups.

Race. The fundamental changes to the race question in Census 2000, which allowed respondents to report more than one race, have complicated comparisons with past collections that allowed only one race. The Census Bureau conducted the Census Quality Survey (CQS) to assist data users in comparing race data obtained under the new schema with that collected under the former format.

The CQS is very impressive because of its large sample size, high response rates, representative sample, and the high matching rate with Census 2000 records. But despite an enviable survey execution, the CQS has several limitations: too few cases reporting more than one race, which are further diminished by inconsistent race reporting, reluctance to select one race, and the split-panel design. The complex methodology and multiple modes of data collection will make it difficult for users to decide how best to “bridge” multiple-race data from Census 2000 to other single-race data collections. But before we dismiss the CQS, we need to conduct additional research and analysis, and we need to explore how to pool the panel data.

In retrospect, it seems that the CQS methodology may have introduced many more sources of bias, such as time lag, mover gains and losses, interviewer effects, mode differences, proxy reporting, and possibly matching problems (all of which may give rise to inconsistent reporting) without entirely eliminating conditioning effects or ensuring the independence of observations. It is quite clear that much more analysis is required to fully explore the CQS data and understand its implications for race reporting and bridging.

7.6 Puerto Rico

Given that the Census 2000 of Puerto Rico was the first to ask race in a decennial census in many decades, what were the issues in collecting those data? What were the general attitudes and problems expressed by the Puerto Rican public in terms of the race question? How do the race and ethnicity data collected in Puerto Rico compare with data collected stateside for the total population, Hispanics, and Puerto Ricans in the United States?

Hispanic origin. Although there is no benchmark for Hispanic origin in Puerto Rico, the results from Census 2000 appear to be reasonable prima facie, and there appears to be no particular bias in comparisons of respondent-filled and enumerator-filled returns in Puerto Rico. In contrast, there are significant differences in the distributions stateside: enumerator returns show proportionately fewer Hispanics (with the exception of Cubans among the groups examined). In terms of reporting of detailed Hispanic groups, there did not appear to be excessive reporting of general Hispanic terms, probably because the overwhelmingly dominant group on the Island (Puerto Ricans) appears as a reporting category in all data collections. Cresce and Ramirez (2003:11) suggest that the Puerto Rican and Cuban categories were the least affected by changes in the Hispanic-origin question used in Census 2000.

Race. Berkowitz (2001:17) reported that, at least in some urban coastal areas of Puerto Rico, the race question elicited strong negative reactions from participants in focus groups. Several participants said that they “stopped filling out their questionnaire” upon reaching the race question. Some participants felt the questions were discriminatory, divisive, and not appropriate for “the Creole or ‘mixed’ realities of Puerto Rico.” There is no evidence on how, or even if, these negative reactions affected response rates. What we do know is that, in spite of these reactions, race reporting in Puerto Rico was quite good in terms of completeness.

Unlike stateside Hispanics and Puerto Ricans, Island residents were much more likely to respond to race. Island residents were also much more likely to report as “White alone” and much less likely to report as “Some other race alone.” The higher proportion of “Some other race alone” responses stateside may be a function of the much larger race nonresponse among Hispanics in the United States, and consequently the need to impute race data for nonrespondents. Hispanic residents of Puerto Rico were also more likely to self-report as “Black alone” than their stateside counterparts, but Puerto Ricans in the United States were only slightly less likely to self-report “Black alone” than their Island counterparts.

There was concern that enumerators may have affected race reporting because of the large role enumerators played in Puerto Rico. Enumerator returns show slightly less reporting of “White,” and more reporting of “Some other race.” They also show slightly less reporting of “Black” than self-reported returns. Interestingly, enumerator returns in the United States showed the opposite: a slightly higher proportion of “Black” than in mail returns. In this case, it is hard to conclude that enumerators somehow significantly biased or distorted the race data of Hispanics. The race reporting pattern of mail returns is similar, although certainly not identical, for Hispanic respondents both in the United States and in Puerto Rico. The differences in race reporting suggest that the understanding and conceptualization of race is different for Puerto Ricans on the Island than in the United States. Despite the controversy that asking race engendered, a higher percentage of Puerto Ricans on the Island reported a race than did Puerto Ricans in the United States.

7.7 Future research

What research and testing should be conducted before the 2010 Census in order to improve the Census 2000 questions on race and Hispanic origin?

The suggestions arising from this review are consistent with the research already under way in the 2003 National Census Test (e.g., examining the role of examples, changing question wording and instructions, dropping “Some other race,” changing response categories, and examining new approaches to collecting data on race and ethnicity).

Examples. We need to test the effect of examples in getting better information about detailed Hispanic-origin and race groups. The detail will help not only to ensure a complete count but also to get the detailed tabulations that data users are expecting us to be able to generate.

Question wording and instructions. We need to test the effect of restoring “origin” in the Hispanic-origin question, improving the instruction for the “Other Hispanic” category, and clarifying the instructions to respondents to answer both questions and to not give an ethnicity response in race.

“Some other race” (SOR). We need to test the feasibility of eliminating the SOR category because it is not very consistently reported or useful, except as a collection category, and because we have to eliminate it for other purposes, such as survey controls and population estimates. However, we must also understand what will happen if respondents, especially Hispanics, continue to report an identity which is not one of the OMB races.

Fewer response categories. We need to continue to test approaches to reduce the “national” origin categories in both race and Hispanic origin. Some of these issues are being explored in the 2003 National Census Test, but additional research needs to be conducted. The CRS findings suggest that detailed categories tend to be less consistently reported. Part of this may be due to the confusion associated with the presence of some Asian and Pacific Islander national-origin groups in the race question, and Hispanic national-origin groups in the Hispanic question. This creates confusion for some respondents about the purpose of both questions. As noted by Schwede, Leslie, and Griffin (2002:3136), our response categories are “a strange pastiche of skin color (white and black), internal indigenous ethnic groups (e.g., American Indian/Alaska Native), U.S. Island Areas (e.g., Samoa), nationality (e.g., Japanese), and geographical region for many countries (other Asian).” The authors also note that even our Census Bureau field representatives expressed some confusion about exactly what headquarters intended to collect with these questions. Previous experience suggests that removing some of the categories would be difficult because many constituents expect that existing groups will be retained on the form, and in fact we have been under pressure to expand the number of categories shown. But, as with other questions, before changes can be made, extensive research and testing needs to be done.


8. Recommendations

Based on the studies reviewed in this report, we make the following recommendations (please note that we do not attach any particular importance to the order in which they appear):

8.1 Pretest and evaluate all questionnaire changes, reduce uncontrolled variation in the questions that are asked, and conduct more research on mode and methodological influences on the data.

It is important that we pretest and evaluate all questionnaire changes prior to implementation. We need to reduce uncontrolled variation in the questions that are asked and we need to understand how mode and other methodological differences affect the data that we collect. Census 2000 had 54 different types of forms, and many forms had different race and Hispanic-origin questions than the “standard” mail form. The AQE showed that even what appear to be minor changes on the Hispanic-origin question produced noticeable differences in the responses we collected. The studies contrasting Census 2000 and C2SS data suggest that we do not understand how mode and other methodological differences affected the responses in each data collection. (See chapter 2 for more discussion.)

8.2 Use larger sample sizes for tests.

As the AQE, CRS, and CQS studies showed, there are many instances where larger sample sizes would have improved our ability to evaluate effects on numerically small groups. While smaller sample sizes may save money in the short run, they may end up costing more in the long run if the tests must be repeated to yield definitive results. On the other hand, large data sets, such as the matched Census 2000 and C2SS data, will often produce too many statistically significant differences to yield definitive results. However, on balance, it is better to have a lot of data rather than too little, particularly when we seek to understand how proposed changes will affect numerically small groups.

8.3 Avoid overly complex test designs – the simpler the better.

It is important that we avoid overly complex test designs. For example, the complex design of the CQS made it difficult to interpret the results and answer the questions the test was designed to explore. Having two panels in the CQS effectively reduced the sample size available for us to analyze. We need to do a lot more analysis of the CQS data, and we need to determine whether we can effectively pool the data in order to obtain larger sample sizes for analysis.

8.4 Explore ways to improve mail response – not only is it less expensive but we may also get more consistently reported race data.

A study of matched Census 2000 and C2SS records found much more consistent race responses among respondents who answered both Census 2000 and C2SS via mail. This was true both for Hispanics and non-Hispanics (Raglin and Leslie, 2002:2831). However, we know that households in Census 2000 and C2SS were not assigned randomly to mail or interview modes. In fact, households who were interviewed did not respond to the mail questionnaire and, therefore, may represent a particular segment of the population for whom it is hard to collect data. The combined benefits of lower cost and potentially more consistent race responses make the mail data collection mode even more desirable.

8.5 Explore ways to improve training and monitoring of enumerator and interviewer behavior.

No matter how much we improve mail response, there will still be a need for enumerators and interviewers to conduct non-response followup and other data collections. Therefore, interviewer behavior will always be an important issue for data collection. Based on a semi-structured debriefing study of ACS interviewers, Leslie, Raglin, and Schwede (2002) speculate that interviewer behavior caused by differences in experience and training may account for race reporting differences in Census 2000 and C2SS. Although this study was based on reported, not observed, behavior, it suggests the possibility that some interviewers used active probes which may have influenced reporting of specific race responses (Leslie, Raglin, and Schwede, 2002:2068).

1. Improve interviewer understanding of race and ethnicity questions. In order to ensure that we collect reliable information on race and Hispanic origin, interviewers must have a good understanding of these concepts. In another study of interviewer debriefing data, Schwede, Leslie, and Griffin (2002) found variability in interviewer interpretations of “what the race question was asking.” They found that this varied by years of experience, type of experience, and region of the country. Recognizing this, we need to explore ways of ensuring a common understanding among interviewers about the race and Hispanic-origin questions.

2. Provide a standardized approach for collecting race and ethnicity data. In order to obtain reliable information on race and Hispanic origin, interviewers must have training and standard methods for data collection. It is important to maintain consistency across data collections within mode so that interviewers have similar experience collecting these data.

3. Improve methods to monitor enumerator and interviewer behavior. In order to reduce the effect of interviewer behavior on the collection of consistent race and Hispanic-origin data, we need to explore ways to monitor interviewer behavior through training, feedback, and reward or punishment of behaviors.

8.6 Explore ways to minimize the differences between, if not standardize, race and ethnicity questions across data collections.

The studies reviewed in this report point out many differences in the methods and materials used for race and ethnicity in our data collections. In order to maximize our ability to collect consistent race and Hispanic-origin data, we need to consider standardizing the questions on race and ethnicity across our data collections as much as possible. We recognize that mode differences may require specific approaches, but the questions should be consistent within mode. This will also reduce variability arising from differences in the type of experience among interviewers.

8.7 Within each data collection, minimize or eliminate variation in response categories across forms to avoid introducing data processing differences.

We have one documented instance in which differences in the number of write-in areas for race responses caused differences in the output data. A difference in the processing of enumerator forms (which had only one write-in area for race) and mail forms (which had three write-in areas for race) led to an overstatement of Some other race by 6 percent, and an overstatement of Two or more races responses by about 15 percent (see chapter 5 for more discussion). Reducing the number of forms and standardizing input fields will reduce the probability of spurious errors in data processing.

8.8 Consider removing “Other...” check boxes and keeping the write-in area.

Davis et al. (2001:III-16) note an inability of many respondents to use the existing categories. One source of incomparability arises when respondents check a box and write in an entry in an inappropriate area. For example, Davis et al. (2001:III-18) noted the instance of a respondent who reported her American Indian tribal affiliation in the “Yes, other Spanish/Hispanic/Latino” write-in area, after marking the “No, not Spanish/Hispanic/Latino” box. By checking both boxes, the respondent created a “mixed Hispanic origin” response which was probably not intended.

Similarly, if a respondent were to mark the “Other Asian” checkbox and write “Irish” in the write-in area, we would have to decide whether to classify the respondent as “Asian and White” or to remove the checkbox and keep the write-in response. Without the “Other...” checkboxes, write-in entries can be automatically evaluated and coded during the automated edit processing, without having to worry about whether the other checkbox marking was intended as an additional response or not.
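The kind of automated write-in coding described in this recommendation might look roughly like the sketch below. It is an illustration only, not the Census Bureau's edit specification: the lookup table, its three entries, and the function name code_write_in are hypothetical, and an actual code list would be far larger.

```python
# Hypothetical code list mapping normalized write-in text to a major
# category. The entries are placeholders chosen for illustration.
WRITE_IN_CODES = {
    "irish": "White",
    "japanese": "Asian",
    "samoan": "Native Hawaiian or Other Pacific Islander",
}


def code_write_in(entry):
    """Return the category for a write-in entry, or None if it is unknown.

    With no "Other..." checkbox on the form, the write-in text is the only
    evidence to evaluate, so no checkbox/write-in conflict can arise.
    """
    key = " ".join(entry.lower().split())  # trim and collapse whitespace
    return WRITE_IN_CODES.get(key)


if __name__ == "__main__":
    for text in ("Irish", "  JAPANESE ", "unlisted entry"):
        print(repr(text), "->", code_write_in(text))
```

In this sketch an entry that is not on the list returns None, which stands in for whatever follow-up (for example, clerical review) the edit process would apply.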

8.9 Consider not using “Some other race” in combination with other specified races (e.g., change “White and SOR” responses to “White alone”).

The CRS report suggests that “Some other race” is not consistently reported. Additionally, this category is not used in other federal government programs, and it is not an official OMB-recognized race category outside of the census. Therefore, we should consider ignoring these responses when they appear in combination with one or more OMB categories. But even if the SOR category is eliminated from the census questionnaire, it is very likely that we will still get responses in other write-in areas that do not fit within the OMB-recognized race categories. What we decide to do with these non-OMB responses will affect the race distribution produced.
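The proposed rule can be stated compactly in code. The following is a minimal sketch under stated assumptions, not an adopted edit procedure: the short category labels and the function name drop_sor_in_combination are this example's own, and a response is represented simply as a set of reported categories.

```python
# The five race categories in the 1997 OMB standards. The short labels
# are abbreviations used only in this sketch.
OMB_RACES = frozenset({"White", "Black", "AIAN", "Asian", "NHPI"})
SOR = "Some other race"


def drop_sor_in_combination(races):
    """Remove SOR only when at least one OMB category was also reported."""
    reported = set(races)
    if SOR in reported and reported & OMB_RACES:
        reported.discard(SOR)
    return reported


if __name__ == "__main__":
    print(drop_sor_in_combination({"White", SOR}))  # SOR is dropped
    print(drop_sor_in_combination({SOR}))           # left unchanged
```

A response of SOR alone passes through unchanged here, which is exactly the case the recommendation leaves open: some further decision is still needed for responses that fit no OMB category.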

8.10 Consider using information from other items to improve edit procedures, limit 100-percent data tabulations to PL 94-171 race and Hispanic-origin groups, and derive detailed groups from American Community Survey data tabulations.

Cresce and Ramirez (2003) showed that information from place of birth and ancestry can be used successfully to supplement “general Hispanic” responses, and produce more detailed information about the respondent’s particular Hispanic origin (e.g., Guatemalan). Cresce and Ramirez (2003) did not use this information to change a respondent from Hispanic to non-Hispanic (or vice versa). We also know that Census 2000 100-percent data show slightly different distributions than those based on Census 2000 sample data and C2SS distributions. In the future, we should consider releasing detailed race and Hispanic-origin tabulations from ACS data only, rather than 100-percent data.

This recommendation arises from three sources: 1) our inability to reconcile the differences between ACS and 100-percent distributions; 2) GAO admonishing us not to release detailed group data unless we can vouch for its accuracy; and 3) data users’ desire to have the most complete data possible for detailed groups. Although the 100-percent data is the largest data collection we undertake, it also has the greatest probability of suffering from non-sampling error because everything that can go wrong will go wrong in the larger endeavor.

Under this proposal, the 100-percent data would be used for constitutionally mandated purposes and for enforcement of the Voting Rights Act; the sample data would be derived from the American Community Survey (once fully implemented). The 100-percent items on the ACS questionnaire could be edited with other data items on the questionnaire. For example, relationship could be edited with the assistance of marital status information. Similarly, place of birth and ancestry could be used to supplement the information about detailed race and Hispanic-origin groups. This does not imply that these items would be used to change a respondent from one major category to another. For example, you would not change a respondent from “Not Hispanic” to “Hispanic” based on their response to the place-of-birth question, but you could change a generic response of “South American” to “Colombian.”
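The constraint in this proposal, that auxiliary items may add detail but never move a respondent across major categories, can be sketched as follows. This is an illustrative assumption about how such an edit might be written, not the procedure used by Cresce and Ramirez (2003): the set of generic responses, the birthplace lookup, and the function name refine_origin are all hypothetical.

```python
# Hypothetical list of responses treated as "general Hispanic" and a
# hypothetical place-of-birth lookup; both are placeholders.
GENERIC_ORIGINS = {"Hispanic", "Latino", "Spanish", "South American"}
BIRTHPLACE_TO_ORIGIN = {"Colombia": "Colombian", "Guatemala": "Guatemalan"}


def refine_origin(is_hispanic, origin, place_of_birth):
    """Return (is_hispanic, origin), refining only generic Hispanic responses.

    The Hispanic/non-Hispanic flag is returned exactly as reported, so the
    edit can add detail but cannot change the major category.
    """
    if is_hispanic and origin in GENERIC_ORIGINS:
        origin = BIRTHPLACE_TO_ORIGIN.get(place_of_birth, origin)
    return is_hispanic, origin


if __name__ == "__main__":
    print(refine_origin(True, "South American", "Colombia"))
    print(refine_origin(False, None, "Colombia"))
```

A respondent who reported “Not Hispanic” is returned unchanged regardless of place of birth, and a specific response is never overwritten by the lookup.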

If we think of the 100-percent data as being a collection effort about the number of the nation’s inhabitants and the race, Hispanic origin, and age of the population of each census block, then it makes sense to publish only the information we are required to at the block level (see footnote 32). The sample or long-form data derived from ACS then become the source of all other demographic, socioeconomic, and housing characteristics of the nation and of the geographic units we feel are appropriate for release, including detailed subgroups of the racial and ethnic populations.

8.11 Conduct additional analysis of the Census Quality Survey data.

As noted in chapter 4, the main objective of the Census Quality Survey (CQS) was to assist data users in comparing race data obtained by asking respondents to “mark one or more races” with data obtained by asking respondents to “mark one race.” However, a great deal of further analysis needs to be conducted to determine how CQS data can be used to develop parameters for race bridging.


32 Public Law 94-171, enacted in 1975, amended section 141 of title 13, United States Code, which directs the Census Bureau to provide redistricting data needed by the 50 states for their use in redrawing districts of the United States Congress and state legislatures.


References

Alonso, William and Paul Starr, eds., 1987, The Politics of Numbers, New York, NY: Russell Sage Foundation, pp. 1-6.

Baylor, Barbara A., 2000, “Census Testing,” in Anderson, Margo J., ed., Encyclopedia of the U.S. Census, Washington, DC: Congressional Quarterly, pp. 62-66.

Becker, Patricia, 2000, “1980 Census,” in Anderson, Margo J., ed., Encyclopedia of the U.S. Census, Washington, DC: Congressional Quarterly, pp. 155-158.

Bentley, Michael, Tracy Mattingly, Christine Hough, and Claudette Bennett, 2003, “Census Quality Survey to Evaluate Responses to the Census 2000 Question on Race: An Introduction to the Data,” Census 2000 Evaluation B.3, Washington, DC: U.S. Census Bureau, April 3.

Bennett, Claudette, 2000, “Race: Questions and Classifications,” in Anderson, Margo J., ed., Encyclopedia of the U.S. Census, Washington, DC: Congressional Quarterly, pp. 313-317.

Bennett, Claudette E. and Deborah H. Griffin, 2002, “Race and Hispanic Origin Data: A Comparison of Results from the Census 2000 Supplementary Survey and Census 2000,” 2002 Proceedings of the American Statistical Association, Section on Survey Research Methods [CD-ROM], Alexandria, VA: American Statistical Association, pp. 206-211.

Berkowitz, Susan, 2001a, “Puerto Rico Focus Groups on the Census 2000 Race and Ethnicity Questions,” Census 2000 Evaluation B.13, Washington, DC: U.S. Census Bureau, June 20.

Berkowitz, Susan, 2001b, “Puerto Rico Focus Groups on Why Households Did Not Mail Back the Census 2000 Questionnaire,” Census 2000 Evaluation A.8, Washington, DC: U.S. Census Bureau, July 17.

Bryant, Barbara Everitt, 2000, “1990 Census,” in Anderson, Margo J., ed., Encyclopedia of the U.S. Census, Washington, DC: Congressional Quarterly, pp. 158-161.

Chapa, Jorge, 2000, “Hispanic/Latino Ethnicity and Identifiers,” in Anderson, Margo J., ed., Encyclopedia of the U.S. Census, Washington, DC: Congressional Quarterly, pp. 243-245.

Choldin, Harvey M., 1986, “Statistics and Politics: the ‘Hispanic Issues’ in the 1980 Census,” Demography 23(3):403-418.

Christenson, Matthew, 2003, “Puerto Rico Census 2000 Responses to the Race and Ethnicity Questions,” Census 2000 Evaluation B.12, Washington, DC: U.S. Census Bureau, July 14.

Cohen, Michael, 2000, “Coverage Evaluation,” in Anderson, Margo J., ed., Encyclopedia of the U.S. Census, Washington, DC: Congressional Quarterly, pp. 95-101.

Cresce, Arthur R., 2003, “Overstatement of More Than One Race in Census 2000,” Internal Memorandum to Nancy M. Gordon and Preston J. Waite, June 10.

Cresce, Arthur R., and Roberto R. Ramirez, 2003, “Analysis of General Hispanic Responses in Census 2000,” Population Division Working Paper No. 72, Washington, DC: U.S. Census Bureau, April 26.

Davis, Diana K., Johnny Blair, E. Lamont Crawley, Kellina M. Craig, Margaret S.B. Rappoport, and Carol Ann Baker, 2001, Census 2000 Quality Survey Instrument Pilot Test, Washington, DC: Development Associates, Inc., March.

del Pinal, Jorge, Elizabeth Martin, Claudette Bennett, and Art Cresce, 2002, “Overview of Results of New Race and Hispanic Origin Questions in Census 2000,” 2002 Proceedings of the American Statistical Association, Survey Research Method Section [CD-ROM], Alexandria, VA: American Statistical Association, pp. 714-719.

Edmonston, Barry, and Charles Schultze, eds., 1995, Modernizing the U.S. Census, Washington, DC: National Academy Press.

Edmonston, Barry, Joshua Goldstein, and Juanita Tamayo Lott, eds., 1996, Spotlight on Heterogeneity, Washington, DC: National Academy Press.

Goldfield, Edwin D. and David M. Pemberton, 2000, “1960 Census,” in Anderson, Margo J., ed., Encyclopedia of the U.S. Census, Washington, DC: Congressional Quarterly, pp. 148-153.

Grieco, Elizabeth M. and Rachel C. Cassidy, 2001, “Overview of Race and Hispanic Origin,” Census 2000 Brief, C2KBR/01-1, Washington, DC: U.S. Census Bureau.

Guzmán, Betsy, 2001, “The Hispanic Population,” Census 2000 Brief, C2KBR/01-3, Washington, DC: U.S. Census Bureau.


Griffin, Deborah Harner, et al., 2003, “Report 3: Comparing General Demographic and Housing Characteristics to Census 2000,” Meeting 21st Century Demographic Data Needs – Implementing the American Community Survey, Draft, Washington, DC: U.S. Census Bureau, February 28.

Hough, Christine L. and Fred R. Borsa, 2003, “Data Collection in Census 2000,” Census 2000 Topic Report, Washington, DC: U.S. Census Bureau.

Hubble, David, James Poyer, and Michael Bentley, 2001, “Study of Responses to the Census 2000 Question Instruction: ‘Mark One or More Races,’” 2001 Proceedings of the American Statistical Association, Section on Government Statistics [CD-ROM], Alexandria, VA: American Statistical Association, pp. 1513-1518.

Jones, Nicholas A. and Amy Symens Smith, 2003, “New Explorations of Race Reporting for Interracial Couples and Their Children: Census 2000,” Paper presented at the Annual Meeting of the Population Association of America, Minneapolis, MN, May 2003.

Jones, Nicholas A. and Amy Symens Smith, 2002, “Who Is ‘Multiracial?’ Exploring the Complexities and Challenges Associated With Identifying ‘the’ Two or More Races Population in Census 2000,” Paper presented at the Annual Meeting of the Population Association of America, Atlanta, GA, May 2002.

Leslie, Theresa, David Raglin, and Laurie Schwede, 2002, “Understanding the Effects of Interviewer Behavior in the Collection of Race Data,” 2002 Proceedings of the American Statistical Association, Section on Survey Research Methods [CD-ROM], Alexandria, VA: American Statistical Association, pp. 2063-2068.

Logan, John R., 2001, “The New Latinos: Who They Are, Where They Are,” Lewis Mumford Center for Comparative Urban and Regional Research, New York, NY: University at Albany, September 10.

Logan, John R., 2002, “Hispanic Populations and Their Residential Patterns in Metropolis,” Lewis Mumford Center for Comparative Urban and Regional Research, New York, NY: University at Albany, May 8.

Lowenthal, Terri Ann, 2000, “Congress and the Census,” in Anderson, Margo J., ed., Encyclopedia of the U.S. Census, Washington, DC: Congressional Quarterly, pp. 95-101.

Martin, Elizabeth, 2002a, “Questionnaire Effects on Reporting of Race and Hispanic Origin: Results of a Replication of the 1990 Mail Short Form in Census 2000,” Census 2000 Alternative Questionnaire Experiment, Washington, DC: U.S. Census Bureau, December 12.

Martin, Elizabeth, 2002b, “A Preliminary Look at Questionnaire Differences in Reporting of ‘Example’ Race Groups,” Draft, Washington, DC: U.S. Census Bureau, September 25.

Martin, Elizabeth, 2002c, “The Effects of Questionnaire Design on Reporting of Detailed Hispanic Origin in Census 2000 Mail Questionnaires,” Public Opinion Quarterly 66(4):582-593.

Martin, Elizabeth, Theresa J. Demaio, and Pamela C. Campanelli, 1990, “Context Effects for Census Measures of Race and Hispanic Origin,” Public Opinion Quarterly 54:551-566.

Martin, Elizabeth, Eleanor Gerber, and Cleo Redline, 2003, Consolidated Report: Census 2000 Alternative Questionnaire Experiment, Census 2000 Alternative Questionnaire Experiment, Washington, DC: U.S. Census Bureau, forthcoming.

McKenney, Nampeo, Claudette Bennett, Roderick Harrison, and Jorge del Pinal, 1993, “Evaluating Racial and Ethnic Reporting in the 1990 Census,” Proceedings of the American Statistical Association, San Francisco, CA, August.

Petersen, William, 1987, “Politics and The Measurement of Ethnicity,” in William Alonso and Paul Starr, eds., The Politics of Numbers, New York, NY: Russell Sage Foundation, pp. 187-233.

Raglin, David, and Theresa F. Leslie, 2002, “How Consistent is Race Reporting Between the Census and the Census 2000 Supplementary Survey,” Proceedings of the American Statistical Association, Joint Statistical Meetings – Section on Survey Research Methods [CD-ROM], Alexandria, VA: American Statistical Association, pp. 2826-2831.


Robinson, J. Gregory and Kirsten West, 2000, “Demographic Analysis,” in Anderson, Margo J., ed., Encyclopedia of the U.S. Census, Washington, DC: Congressional Quarterly, pp. 164-167.

Schwede, Laurie, Theresa F. Leslie, and Deborah H. Griffin, 2002, “Interviewer’s Reported Behavior in the Collection of Race and Hispanic Data,” Proceedings of the American Statistical Association, Joint Statistical Meetings – Section on Survey Research Methods [CD-ROM], Alexandria, VA: American Statistical Association, pp. 3134-3139.

Singer, Phyllis, and Sharon R. Ennis, 2002, “Census 2000 Content Reinterview Survey: Accuracy of Data for Selected Population and Housing Characteristics as Measured by Reinterview,” Census 2000 Evaluation B.5, Washington, DC: U.S. Census Bureau, December 12.

Suro, Roberto, 2002, “Counting the ‘Other Hispanics’: How Many Colombians, Dominicans, Ecuadorians, Guatemalans and Salvadorans Are There in the United States?” Pew Hispanic Center, Washington, DC, May 9.

Thomas, Kathryn F., Tamara L. Dingbaum, and Henry F. Woltman, 1993, Content Reinterview Survey: Accuracy of Data for Selected Population and Housing Characteristics as Measured by Reinterview, 1990 Census of Population and Housing, Evaluation and Research Reports, 1990 CPH-E-1, Washington, DC: U.S. Bureau of the Census, October.

U.S. Census Bureau, 2001, Report of the Executive Steering Committee for Accuracy and Coverage Evaluation Policy, Washington, DC, March 1.

U.S. General Accounting Office (GAO), 2003a, Decennial Census: Methods for Collecting and Reporting Hispanic Subgroup Data Need Refinement, GAO-03-228, Washington, DC, January.

U.S. General Accounting Office (GAO), 2003b, Decennial Census: Methods for Collecting and Reporting Data on the Homeless and Others without Conventional Housing Need Refinement, GAO-03-227, Washington, DC, January.

U.S. Office of Management and Budget (OMB), 1997, “Revisions to the Standards for the Classification of Federal Data on Race and Ethnicity; Notices,” Federal Register, 62(210):58781-59790.


