
Updated Unified Category System for 1960-2000 Census Occupations


Page 1: Updated Unified Category System for 1960-2000 Census Occupations

Updated Unified Category System for 1960-2000 Census Occupations

Peter B. Meyer, OPT
Brown Bag seminar, Oct 25, 2006

Outline
1. Tentative standard categories
2. Users and bug fixes
3. How Census assigns occupation codes
4. Imputation practice

Page 2: Updated Unified Category System for 1960-2000 Census Occupations

Census Occupational Classifications

- Census Bureau staff assign 3-digit occupation codes to respondents of the decennial Census
- The list of codes changes every Census
- Current Population Survey (CPS) uses these codes:
  - 1960 system from 1968-1970
  - 1970 system from 1971-1982
  - 1980 system from 1983-1991
  - 1990 system from 1992-2002
  - 2000 system from 2003-present
- A vast amount of data is available in these categories
- But time series don’t cover the whole period
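The year ranges on this slide can be read as a lookup from CPS survey year to the Census occupation system in use. The following is a minimal Python sketch of that lookup. The names `CPS_SYSTEMS` and `census_system_for_cps_year` are hypothetical, and the open-ended last range reflects "present" as of the 2006 talk.

```python
# Year ranges as listed on the slide:
# (first CPS year, last CPS year or None if open-ended, Census system).
CPS_SYSTEMS = [
    (1968, 1970, 1960),
    (1971, 1982, 1970),
    (1983, 1991, 1980),
    (1992, 2002, 1990),
    (2003, None, 2000),  # "2003-present" on the slide
]


def census_system_for_cps_year(year):
    """Return the Census occupation system used for a CPS survey year."""
    for first, last, system in CPS_SYSTEMS:
        if year >= first and (last is None or year <= last):
            return system
    raise ValueError(f"no classification listed for CPS year {year}")


print(census_system_for_cps_year(1975))  # 1970
print(census_system_for_cps_year(1992))  # 1990
```

A lookup like this makes the discontinuities explicit: any series that crosses 1970/1971, 1982/1983, 1991/1992, or 2002/2003 changes code lists.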

Page 3: Updated Unified Category System for 1960-2000 Census Occupations

Tradeoffs in Classification Systems

- Duration vs. accuracy, precision
  - blacksmith, database admin (short precise series)
  - electrical engineer (long evolving series)
- Number of occupations vs. sample size of each
- Narrow distinctions may be of interest
  - Dental technicians
  - High tech occupations vs. other technical occupations
  - Licensed jobs
- Conformity with other data

Page 4: Updated Unified Category System for 1960-2000 Census Occupations

Desirable Attributes of a Classification

- For each occupation, well-behaved time series of:
  - mean wage
  - wage variance
  - fraction of the population
- New criterion: SPARSENESS
  - One prefers a classification not be sparse, meaning it does not have many empty occ-year cells
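The sparseness criterion can be made concrete by counting occupation-year cells that contain no respondents. The following is a small Python sketch on toy data. The function name `count_empty_cells` and the records are illustrative assumptions, not the author's program.

```python
from collections import Counter


def count_empty_cells(records, occupations, years):
    """Count (occupation, year) cells that contain no observations."""
    observed = Counter(records)
    return sum(
        1
        for occ in occupations
        for year in years
        if observed[(occ, year)] == 0
    )


# Toy data: one (occupation, year) pair per respondent.
records = [
    ("statistician", 1960), ("statistician", 1970), ("statistician", 1980),
    ("actuary", 1970), ("actuary", 1980),
]
occupations = ["statistician", "actuary"]
years = [1960, 1970, 1980]

empty = count_empty_cells(records, occupations, years)
print(empty)  # 1: the (actuary, 1960) cell is empty
```

A classification with fewer empty cells over the same years is less sparse by this measure.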

Page 5: Updated Unified Category System for 1960-2000 Census Occupations

Classification Current Phase

- Earlier working paper (Meyer and Osborne, 2005) defines a unified classification for Census & CPS 3-digit occupation codes from 1960 to present
- It was adapted from the 500+ categories in 1990 Census:
  - 379 categories have same name or almost same as 1990
  - 125 were eliminated to help harmonize with other years (example to follow)
  - 19 categories have expanded (changed name or n.e.c. category)
  - 3 categories added for 1960 data which doesn’t fit in

Page 6: Updated Unified Category System for 1960-2000 Census Occupations

Hard case; combined category here

1970 code: 284
1970 occupation title: Sales workers, except clerks, retail trade

1980 component categories and codes:

1980 code  1980 category                                            Civilian labor force  % of 1970 category
263        Sales workers, motor vehicles and boats                               185,160              37.06%
266        Sales workers, furniture and home furnishings                          98,941              19.80%
267        Sales workers; radio, television, hi fi, and appliances                76,674              15.35%
268        Sales workers, hardware and building supplies                          81,668              16.35%
269        Sales workers, parts                                                   39,120               7.83%
274        Sales workers, other commodities                                       16,008               3.20%
277        Street and door to door sales workers                                   2,082               0.42%
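The percentage column is each 1980 component's share of the 1970 category. The following is a short Python sketch of that arithmetic, using the labor force counts from the slide. It assumes the denominator is the sum of the seven listed components, which the slide does not state. Under that assumption the recomputed shares match the slide's figures to within about 0.01 percentage point.

```python
# Civilian labor force counts for the 1980 components of 1970 code 284,
# copied from the slide.
COMPONENTS_1980 = {
    263: 185160,  # motor vehicles and boats
    266: 98941,   # furniture and home furnishings
    267: 76674,   # radio, television, hi fi, and appliances
    268: 81668,   # hardware and building supplies
    269: 39120,   # parts
    274: 16008,   # other commodities
    277: 2082,    # street and door to door sales workers
}

# Assumption: the 1970 category total is the sum of the listed components.
total = sum(COMPONENTS_1980.values())
shares = {code: 100.0 * n / total for code, n in COMPONENTS_1980.items()}

for code, pct in shares.items():
    print(code, f"{pct:.2f}%")
```

No single component dominates (the largest is about 37%), which is why the 1970 code cannot be assigned to one 1980 code without large error.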

Page 7: Updated Unified Category System for 1960-2000 Census Occupations

Input from users and new data

Corrections from users

Page 8: Updated Unified Category System for 1960-2000 Census Occupations

The information coders have

Page 9: Updated Unified Category System for 1960-2000 Census Occupations

The information coders have

Page 10: Updated Unified Category System for 1960-2000 Census Occupations

Statisticians and Actuaries

Counts of Actuaries and Statisticians in Census Sample

               1960  1970  1980  1990
Actuaries         .    45   129   182
Statisticians   199   237   352   338

- Separate categories in and after 1970
- In 1960 they were all in “statisticians and actuaries”
- When standardizing we put all these in “statisticians”
- Now we try to infer which people in this population were actuaries

Page 11: Updated Unified Category System for 1960-2000 Census Occupations

Statisticians and Actuaries

- Pooled all 1970-1990 statisticians and actuaries
- Ran many logistic regressions predicting the actuaries
- Good predictors of whether respondent is an actuary:
  - Recorded in a later year
  - Employed in insurance, accounting/auditing, or professional services
  - Employed in private sector
  - High salary income
  - High business income, or earning mostly business income
  - Is employed
  - Lives in Connecticut, Minnesota, Nebraska, or Wisconsin

Page 12: Updated Unified Category System for 1960-2000 Census Occupations

Statisticians and Actuaries

- For 1970-1990 a logistic regression can predict occupation right 88% of the time
- Impute a prediction on 1960 data

Revised counts of actuaries and statisticians after imputation

               1960  1970  1980  1990
Actuaries        29    45   129   182
Statisticians   170   237   352   338
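The procedure on these slides has two steps: fit a logistic regression on the years where the two occupations are coded separately, then apply the fitted model to the 1960 records where they are not. The following is a toy Python sketch of that workflow. Everything in it is an assumption made for illustration: the data are invented, the single predictor `in_insurance` stands in for the longer list of predictors above, and the plain gradient-ascent fit replaces whatever estimation routine was actually used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(xs, ys, steps=2000, rate=0.1):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += rate * sum(residuals) / n
        b1 += rate * sum(r * x for r, x in zip(residuals, xs)) / n
    return b0, b1


# TOY labeled data standing in for the pooled 1970-1990 sample:
# (in_insurance, is_actuary) pairs. These numbers are invented.
labeled = [(1, 1)] * 8 + [(1, 0)] * 2 + [(0, 1)] * 1 + [(0, 0)] * 9
xs = [x for x, _ in labeled]
ys = [y for _, y in labeled]
b0, b1 = fit_logit(xs, ys)


def predict_actuary(x, cutoff=0.5):
    """True if the fitted probability of 'actuary' exceeds the cutoff."""
    return sigmoid(b0 + b1 * x) > cutoff


# In-sample accuracy on the labeled years.
accuracy = sum(predict_actuary(x) == bool(y) for x, y in labeled) / len(labeled)
print(f"in-sample accuracy: {accuracy:.2f}")  # 0.85 on this toy data

# TOY unlabeled records standing in for the 1960 combined category.
unlabeled_1960 = [1, 0, 0, 1, 0]
imputed = [predict_actuary(x) for x in unlabeled_1960]
print(imputed)  # [True, False, False, True, False]
```

The in-sample accuracy here plays the role of the 88% figure on the slide: it measures how well the model separates the two occupations in the years where the truth is known.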

Page 13: Updated Unified Category System for 1960-2000 Census Occupations

Statisticians and Actuaries

[Figure: Mean salaries before reassignment. Series: Statisticians, Actuaries. x-axis: year (1960-1990). y-axis: 10000 to 50000.]

[Figure: Mean salaries after reassignment. Series: Statisticians, Actuaries. x-axis: year (1960-1990). y-axis: 10000 to 50000.]

Why work this arcane problem?

- More accurate statistician category, by later definition
- Longer time series for actuaries
- Reduces sparseness
- Builds a technique

Page 14: Updated Unified Category System for 1960-2000 Census Occupations

Lawyers and Judges

- Combine all 1970-1990 lawyers and judges
- Exclude all private sector employees because they are all lawyers (By definition?)
- In the remainder, predictors of judge, not lawyer (judge is 1, lawyer is 0 in the next slide):
  - Older
  - Employed in state government
  - High salary income; low or no business income
  - Educated less than 16 years
  - Employed at time of survey
- Can get 83% accurate predictions from such a regression

Page 15: Updated Unified Category System for 1960-2000 Census Occupations

Logit Regression on 1970-1990 Census Sample

Variable                                           Coefficient  Std error  p-value
Year                                                    -0.005      0.011    0.633
Age                                                      0.155      0.033    0.000
Age-squared                                             -0.001      0.000    0.040
Federal government employee                             -1.440      0.137    0.000
State government                                         0.499      0.263    0.058
Ln(salary)                                              -1.795      3.094    0.562
Ln(salary) squared                                       0.052      0.333    0.877
Ln(salary) cubed                                         0.003      0.012    0.798
Ln(business income)                                     -0.041      0.036    0.261
Fraction of earned income that is business income       -0.714      1.053    0.498
Education less than 16 years                             2.235      0.320    0.000
Years of formal education                               -0.044      0.046    0.336
Is employed at time of survey                            0.224      0.241    0.352
Constant                                                13.017     23.428    0.578

Page 16: Updated Unified Category System for 1960-2000 Census Occupations

Can Use Those Coefficients in Stata

    gen logitindex = -.0046652 * year
        +.1549193 * age
        -.0006942 * age * age
        -1.4405086 * indfed
        +.4986729 * indstate
        -1.795481 * lnwage
        +.0517015 * lnwage * lnwage
        +.0030016 * lnwage * lnwage * lnwage
        -.040749 * lnbus
        -.7140285 * busfrac
        +2.234934 * (educyrs<16)
        -.0442429 * educyrs
        +.2239105 * employed
        +13.0172 /* constant */ ;
    …
    gen logitval=exp(logitindex)/(1.0+exp(logitindex))
    replace logitval=.0001 if !govtemployee /* this is a perfect predictor */
    replace logitval=.0001 if !indfed & !indstate & !indlocal /* this too */
    gen assigned = logitval>.46
    /* Now ‘assigned’ has a 1 for imputed judges */
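For readers who do not use Stata, the same scoring rule can be sketched in Python. The coefficients, the .0001 override, and the .46 cutoff are copied from the slide. The function names and the example record are hypothetical. The slide does not say how `year` is scaled or how `lnbus` is defined when business income is zero, so the example assumes a four-digit year and sets `lnbus` to 0.

```python
import math

# Coefficients copied from the Stata expression on the slide.
COEF = {
    "year": -0.0046652, "age": 0.1549193, "age_sq": -0.0006942,
    "indfed": -1.4405086, "indstate": 0.4986729,
    "lnwage": -1.795481, "lnwage_sq": 0.0517015, "lnwage_cu": 0.0030016,
    "lnbus": -0.040749, "busfrac": -0.7140285,
    "educ_lt16": 2.234934, "educyrs": -0.0442429,
    "employed": 0.2239105, "const": 13.0172,
}
CUTOFF = 0.46  # threshold used on the slide


def logistic(z):
    """Numerically stable exp(z) / (1 + exp(z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def judge_probability(p):
    """Predicted probability of 'judge' for a record p (dict keyed by the slide's variable names)."""
    # Mirror the two 'replace' lines: non-government workers are treated as lawyers.
    in_govt = p["indfed"] or p["indstate"] or p["indlocal"]
    if not p["govtemployee"] or not in_govt:
        return 0.0001
    w = p["lnwage"]
    index = (
        COEF["year"] * p["year"]
        + COEF["age"] * p["age"]
        + COEF["age_sq"] * p["age"] ** 2
        + COEF["indfed"] * p["indfed"]
        + COEF["indstate"] * p["indstate"]
        + COEF["lnwage"] * w
        + COEF["lnwage_sq"] * w ** 2
        + COEF["lnwage_cu"] * w ** 3
        + COEF["lnbus"] * p["lnbus"]
        + COEF["busfrac"] * p["busfrac"]
        + COEF["educ_lt16"] * (1 if p["educyrs"] < 16 else 0)
        + COEF["educyrs"] * p["educyrs"]
        + COEF["employed"] * p["employed"]
        + COEF["const"]
    )
    return logistic(index)


def is_imputed_judge(p):
    return judge_probability(p) > CUTOFF


# Hypothetical record: a 55-year-old state government employee.
example = {
    "year": 1980, "age": 55, "govtemployee": 1,
    "indfed": 0, "indstate": 1, "indlocal": 0,
    "lnwage": math.log(40000), "lnbus": 0.0, "busfrac": 0.0,
    "educyrs": 19, "employed": 1,
}
print(round(judge_probability(example), 3), is_imputed_judge(example))
```

Applying the override before computing the index keeps the rule explicit: only government employees are ever candidates for reassignment to "judge".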

Page 17: Updated Unified Category System for 1960-2000 Census Occupations

Newly Imputed Judges

Respondents in Census samples after imputation

          1960  1970  1980  1990
Lawyers   1971  2570  5082  7603
Judges      82   123   298   331

Page 18: Updated Unified Category System for 1960-2000 Census Occupations

Preliminary Findings

- There are opportunities to impute occupations occasionally with reasonable accuracy
- The resulting records have “better-classified” occupations
  - Slightly more accurate (in four categories)
  - Slightly less sparse (293 empty cells not 295)
- The effect in a substantive regression not focused on these categories is tiny (What does it mean?)

Page 19: Updated Unified Category System for 1960-2000 Census Occupations

Census Bureau's National Processing Center in Jeffersonville, IN

Page 20: Updated Unified Category System for 1960-2000 Census Occupations

Who's Doing the Coding

- There are “Coders” and “Referralists”
- Coders follow carefully documented procedures, most likely from the Census National Headquarters in Suitland, MD
- In many cases there is not enough information to assign industry and occupation codes
- Such unresolved cases are forwarded electronically ("referred") to a “Referralist”
- Coders with two years of experience are expected to produce 94 code assignments an hour, with 95% accuracy (codes are checked)

Page 21: Updated Unified Category System for 1960-2000 Census Occupations

Who's Doing the Coding

- There were about 12 coders and 14 referralists in October 2006
- Referralists have been coders before and usually have 9+ years of experience
- I interviewed three referralists, and a supervisor, but no coders during my October 2006 visit
- The ones I met handled referrals from several surveys: CPS, ATUS, SIPP, NLS, ACS, and others on contract
- All of the above surveys use decennial Census occupation codes
- Industry and occupational codes for Decennial Censuses are assigned by other employees, not the ones who permanently work in Jeffersonville now (???)

Page 22: Updated Unified Category System for 1960-2000 Census Occupations

Information Available to a Coder

- "kind of work"
- "principal duties"
- employer name
- city and state ("PSU") of respondent's home (not workplace)
- industry, already coded
- industry type (manufacturing, service, other)
- years of education, age, sex
- not income, although it was available before Jan '94 software

The industry is normally coded before the occupation.
A referralist can match the employer name to a known employer from the Employer Name List (ENL), possibly the same as SSEL.
Some cases are "autocoded" before a coder sees them.

Page 23: Updated Unified Category System for 1960-2000 Census Occupations

Problems and Problematic Cases

- “Computer work" for occupation (???)
- Too little information from respondent
- Exaggeration (example: dot com businesses)
- Ambiguities:
  - "water company" for industry or employer
  - "surveyor" occupation
  - "boot" vs "boat" in handwriting
- Hurrying
- Referralists confer with each other routinely, but sometimes make different choices from one another
- Does technological change go along with occupational ambiguity? YES. Problems with computer work, biotech. Still no nanotech in classification.

Page 24: Updated Unified Category System for 1960-2000 Census Occupations

What Would Improve Their Coding Accuracy or Speed?

- Information about a job title
- Information about employer's city and state
- [show CPS 1993 questions] (???)
- But! Asking more questions would extend the interview

Retrieved from "http://econterms.net/pbmeyer/research/occs/wiki/index.php?title=Brown_bag_Oct_25"

Page 25: Updated Unified Category System for 1960-2000 Census Occupations

Questions for Occupational Time Series

Hypotheses for time series of consistently-defined occupations:

- Have high tech jobs had rising earnings inequality? [yes]
- Superstars effect? [yes]
- Is nurturing work valued less (England et al)?
- Have mathematical occupations grown in size or pay?
- Measuring payoffs to skills
- Have better job-search technologies reduced inequality within job categories? (as predicted by Stigler (1960))

Researchers sometimes use only industry, not occupation, or limit the time span of a study to keep occupation categories consistent.

Page 26: Updated Unified Category System for 1960-2000 Census Occupations

"What's Next?"

- Make next working paper and program code available
- Publish at IPUMS
- Accumulate more classification systems, techniques, criteria, and expert opinions
- New wiki of all classifications

Page 27: Updated Unified Category System for 1960-2000 Census Occupations

Thank You.

Page 28: Updated Unified Category System for 1960-2000 Census Occupations

Meaning of Occupation

- Worker’s tasks
- Worker’s function (identified e.g. by inputs and outputs)
  - example: blacksmiths vs forging machine operators
  - example: teachers of different subjects and ages of students
- Sometimes other distinctions
  - Hierarchy (apprentices, foremen, supervisors)
  - Certification
  - Skills
  - Industry (activity of the employing organization)

To some extent these are separate labor markets, with separated job search, wage setting, unemployment experiences.

[Diagram: Inputs, Tasks, Outputs]

Page 29: Updated Unified Category System for 1960-2000 Census Occupations

Occupation Attributes I

- Strength (1-5 from DOT)
- Reasoning (1-6 from DOT)
- Mathematical reasoning (1-6 from DOT)
- Language use (1-6 from DOT)
- Duration of specific training (from DOT)
- Nurturing (0/1) (England et al, 1994)
- many others, potentially

Page 30: Updated Unified Category System for 1960-2000 Census Occupations

Occupation Attributes II

- % urban (e.g. doctor in rural area)
- often involves traveling (or required mobility earlier)
- rate of growth
- % of immigrants
- authority (0/1) (England et al, 1994)
- high tech
- regulated
- unionized
- use of machines
- involves advocacy; or repair; or negotiation