Evaluation of Person-based Migration Methodology
Presented to FSCPE Meeting
Internal Migration Processing Team
Local Area Estimates and Migration Processing Branch
U.S. Census Bureau
September 26, 2006
Contents of Presentation
1. Description of Return-Based and Person-Based Methods
2. Summary of Issues and Recommendations
3. Evaluations
4. Future R&D
Return-Based Method
Internal Revenue Service sends tax extract file to Census Bureau
Drop names and assign unique Person Identification Keys (PIK) derived from SSNs
Run edit process and assign county code to each return based on ZIP+4
Two consecutive years of tax data are matched on primary filer’s PIK
Return-Based Method (Cont’d)
Compare county codes on matched returns to define migration
Tally exemptions for in-, out-, and non-migration components
Compute Net Internal Migration Rate (NMR) for Under 65 household population:
NMR = (In-migrants – Out-migrants) / (Non-migrants + Out-migrants)
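The return-based comparison and the NMR formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the function name `return_based_nmr`, and the choice to tally Year-2 exemptions are hypothetical, and the Under 65 household restriction is left out.

```python
from collections import Counter


def return_based_nmr(year1, year2, county):
    """Net internal migration rate for one county (return-based sketch).

    year1, year2: dicts mapping primary filer PIK -> (county_code, exemptions).
    Returns None when the denominator is zero.
    """
    tally = Counter()
    for pik, (county1, _) in year1.items():
        if pik not in year2:
            continue  # return not matched across years: no migration status
        county2, exemptions = year2[pik]  # assumption: tally Year-2 exemptions
        if county1 == county and county2 == county:
            tally["non"] += exemptions
        elif county1 == county:
            tally["out"] += exemptions
        elif county2 == county:
            tally["in"] += exemptions
    base = tally["non"] + tally["out"]
    if base == 0:
        return None
    return (tally["in"] - tally["out"]) / base


# Toy data: p4 has no Year-2 return, so it is not matched.
year1 = {"p1": ("A", 4), "p2": ("A", 2), "p3": ("B", 3), "p4": ("A", 1)}
year2 = {"p1": ("A", 4), "p2": ("B", 2), "p3": ("A", 3)}
nmr_a = return_based_nmr(year1, year2, "A")  # (3 - 2) / (4 + 2)
```

Every exemption on a return takes the migration status of the primary filer here, which is the behavior the return-based method describes.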
Return-Based Method (Cont’d)
Match Year-1/Year-2 matched IRS file to PCF to obtain demographic characteristics for primary filers
Demographic characteristics for the spouse and dependents are imputed based on the characteristics of the primary filer
Migration status is assigned based on the migration status of the primary filer
Produce state and county migration data by age, race, sex, and Hispanic origin
Limitations of Return-Based
Underestimates moves associated with life events (e.g., divorce, marriage, first job)
Demographic characteristics of spouse and dependents are imputed based on the characteristics of the filer
Migration status of spouse and dependents is tied to that of the filer
Person-Based Method
Start with the return-based edited file
Records created for filer, spouse, and all dependents (up to 4); one record per individual on the tax return
Unduplicate the records by applying selection rules
Assign county code to each record
Matched across two consecutive tax years on PIK
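As a rough illustration of the record-creation step, the sketch below turns one edited return into one record per individual. The field names (`filer_pik`, `spouse_pik`, `dependent_piks`, `county`) and the function name are hypothetical; the real extract layout is not described in the slides.

```python
def explode_return(tax_return):
    """Return one person record per individual listed on an edited return."""
    county = tax_return["county"]
    records = [{"pik": tax_return["filer_pik"], "role": "filer", "county": county}]
    spouse = tax_return.get("spouse_pik")
    if spouse is not None:
        records.append({"pik": spouse, "role": "spouse", "county": county})
    # The slides mention up to four dependents per return.
    for pik in tax_return.get("dependent_piks", [])[:4]:
        records.append({"pik": pik, "role": "dependent", "county": county})
    return records


example = {"filer_pik": "F", "spouse_pik": "S",
           "dependent_piks": ["D1", "D2", "D3"], "county": "A"}
people = explode_return(example)
```

Each resulting record carries its own PIK and county code, which is what allows matching across years at the person level rather than the return level.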
Person-Based Method(Cont’d)
Compare county codes on matched returns to define migration
Tally exemptions for in-, out-, and non-migrants
Compute Net Migration Rate (NMR) for Under 65 household population:
NMR = (In-migrants – Out-migrants) / (Non-migrants + Out-migrants)
Person-Based Method(Cont’d)
Match Year-1/Year-2 matched IRS file to PCF to obtain demographic characteristics for filer, spouse, and dependents (No imputation!!)
Migration status is individually assigned to filer, spouse, and dependents based on the assigned county codes (No imputation!!)
Produce state and county migration data by age, race, sex, and Hispanic origin
Issues requiring decision making rules
Issue 1. Duplicate Records/Zero Exemptions:
Multiple records are created for one person if the person’s SSN is claimed on more than one tax return, including zero exemption returns
Need to decide which records to keep
Zero Exemption
Filed when a dependent child has enough income to report to the IRS
The parent separately claims the dependent on his or her own tax return
87 percent of the duplicate records involve zero-exemption returns
Issues requiring decision making rules
Issue 2. Excess Exemptions:
The number of SSNs recorded on a tax return does not match the number of exemptions claimed on the same return
Need to decide whether to create a dummy record for each excess exemption
Summary of Issues: Zero Exemptions
Retain the zero exemption record and drop the dependent record
Addresses on zero exemption returns are likely to be more accurate
Summary of Issues: Other Duplicate Records
Filer record trumps all!
Retain primary filer records and drop spouse and dependent records
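The duplicate-record rules can be read as a priority ordering on the role a person holds on each return. The sketch below is one possible reading, not the Bureau's actual selection code: the record layout is hypothetical, a zero-exemption return is treated as a filer record, and the ordering of spouse over dependent, as well as keeping the first record on ties, are assumptions the slides do not state.

```python
# Lower number wins. Spouse-before-dependent is an assumption.
ROLE_PRIORITY = {"filer": 0, "spouse": 1, "dependent": 2}


def unduplicate(records):
    """Keep one record per PIK, preferring filer records over all others."""
    best = {}
    for rec in records:
        kept = best.get(rec["pik"])
        if kept is None or ROLE_PRIORITY[rec["role"]] < ROLE_PRIORITY[kept["role"]]:
            best[rec["pik"]] = rec
    return list(best.values())


records = [
    {"pik": "c1", "role": "dependent", "county": "A"},  # claimed by a parent
    {"pik": "c1", "role": "filer", "county": "B"},      # own zero-exemption return
    {"pik": "p1", "role": "filer", "county": "A"},
]
kept = {r["pik"]: r for r in unduplicate(records)}
```

In this toy case the child's own return wins, so the child is placed in county B, in line with the note that addresses on zero-exemption returns are likely to be more accurate.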
Summary of Issues: Excess Exemptions
1. Fewer SSNs than exemptions claimed
Exclude excess exemptions
2. More SSNs than exemptions claimed (i.e., negative excess exemptions)
Include the provided SSNs and ignore negative excess exemptions
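Taken together, the two excess-exemption rules mean that the number of person records follows the SSNs on the return, not the exemption count. A minimal sketch, with a hypothetical function name and inputs:

```python
def person_record_count(ssns, exemptions_claimed):
    """Return (records_to_create, excess_exemptions) for one return.

    Positive excess: no dummy records are created, so the excess is excluded.
    Negative excess: ignored, and every provided SSN still gets a record.
    """
    excess = exemptions_claimed - len(ssns)
    return len(ssns), excess


fewer_ssns = person_record_count(["s1", "s2"], 4)        # 2 records, excess +2
more_ssns = person_record_count(["s1", "s2", "s3"], 2)   # 3 records, excess -1
```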
Divorce Scenario
Year 1: 1 Filer, 1 Spouse, 3 Deps in Cty A
Year 2: 1 Filer in Cty A; 1 Filer and 3 Deps in Cty B
Return-Based: 1 Non-Migrant; Non-Match
Person-Based: 1 Non-Migrant; 4 Migrants
Student Scenario
Year 1: 1 Dep in Cty A
Year 2: 1 Filer in Cty B
Return-Based: 1 Non-Match
Person-Based: 1 Migrant
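The two scenarios can be reproduced with a small sketch that classifies the same toy returns both ways. The data layout, the PIK labels, and the function names are hypothetical; the return-based tally counts every person on a Year-2 return under the filer's status, and unmatched returns are counted as returns rather than persons.

```python
def return_based_tally(year1, year2):
    """Classify Year-2 returns by matching on the primary filer only."""
    y1_county = {r["filer"]: r["county"] for r in year1}
    tally = {"non_migrants": 0, "migrants": 0, "unmatched_returns": 0}
    for r in year2:
        people = 1 + len(r["others"])  # everyone inherits the filer's status
        if r["filer"] not in y1_county:
            tally["unmatched_returns"] += 1
        elif y1_county[r["filer"]] == r["county"]:
            tally["non_migrants"] += people
        else:
            tally["migrants"] += people
    return tally


def person_based_tally(year1, year2):
    """Classify each person individually by his or her own county codes."""
    def counties(returns):
        return {pik: r["county"] for r in returns
                for pik in [r["filer"], *r["others"]]}
    c1, c2 = counties(year1), counties(year2)
    tally = {"non_migrants": 0, "migrants": 0, "unmatched_persons": 0}
    for pik, county2 in c2.items():
        if pik not in c1:
            tally["unmatched_persons"] += 1
        elif c1[pik] == county2:
            tally["non_migrants"] += 1
        else:
            tally["migrants"] += 1
    return tally


# Divorce: the spouse files a new return in county B with the three dependents.
div_y1 = [{"filer": "F", "others": ["S", "D1", "D2", "D3"], "county": "A"}]
div_y2 = [{"filer": "F", "others": [], "county": "A"},
          {"filer": "S", "others": ["D1", "D2", "D3"], "county": "B"}]

# Student: a dependent in county A files his or her own return in county B.
stu_y1 = [{"filer": "P", "others": ["C"], "county": "A"}]
stu_y2 = [{"filer": "C", "others": [], "county": "B"}]

divorce = (return_based_tally(div_y1, div_y2), person_based_tally(div_y1, div_y2))
student = (return_based_tally(stu_y1, stu_y2), person_based_tally(stu_y1, stu_y2))
```

Under these assumptions the divorce case gives 1 non-migrant and one unmatched return for the return-based tally, against 1 non-migrant and 4 migrants for the person-based tally, matching the slide.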
Evaluation: Match Rates - Definition
Year-1/Year-2 Match Rate = (Year-1 and Year-2 Matched Record Count) * 100 / Total Year-1 Record Count
PCF Match Rate = (Year-1, Year-2, and PCF Matched Count) * 100 / (Year-1 and Year-2 Matched Count)
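The two match rates are simple ratios, sketched below on sets of PIKs. The function names are hypothetical, and the sketch assumes each PIK appears at most once per file, so that set sizes equal record counts.

```python
def year_match_rate(year1_piks, year2_piks):
    """Percent of Year-1 records that also appear in Year 2."""
    year1, year2 = set(year1_piks), set(year2_piks)
    return len(year1 & year2) * 100 / len(year1)


def pcf_match_rate(year1_piks, year2_piks, pcf_piks):
    """Percent of Year-1/Year-2 matched records that are also found on the PCF."""
    matched = set(year1_piks) & set(year2_piks)
    return len(matched & set(pcf_piks)) * 100 / len(matched)


y1 = ["a", "b", "c", "d", "e"]
y2 = ["a", "b", "c", "d", "z"]
pcf = ["a", "b", "c"]
rate_years = year_match_rate(y1, y2)      # 4 of 5 Year-1 records matched
rate_pcf = pcf_match_rate(y1, y2, pcf)    # 3 of 4 matched records on the PCF
```

Note that the PCF rate uses the matched file as its denominator, so it measures only the additional loss from the PCF match.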
The 10 Lowest Year-1/Year-2 Match Rates from Return-Based Records, 2000 through 2004 (National Average = 90.5%)
County and State          Year   Match Rate (%)
Loving County, TX         2003   76.36
Los Alamos County, NM     2001   77.71
Loving County, TX         2004   77.78
Santa Fe County, NM       2001   78.73
Lincoln County, NM        2001   81.74
Loving County, TX         2000   81.97
Taos County, NM           2001   83.87
Bernalillo County, NM     2001   83.91
Sandoval County, NM       2001   84.62
Rio Arriba County, NM     2001   84.68
The 10 Lowest Year-1/Year-2 Match Rates from Person-Based Records, 2000 through 2004 (National Average = 94%)
County and State: Loving County, TX; Shannon County, SD; Santa Fe County, NM; Lincoln County, NM; Charlton County, GA; Loving County, TX; North Slope Borough, AK; Taos County, NM; San Miguel County, NM; Loving County, TX
Year: 2004; 2001; 2001; 2001; 2004; 2003; 2003; 2001; 2001; 2000
Match Rate (%): 81.97; 88.24; 87.72; 88.53; 88.12; 87.74; 90.06; 87.80; 91.25; 92.79
PCF Match Rates
The match rates from the person-based records were almost the same as the match rates from the return-based records (> 99%).
Total Number of Exemptions and Duplicate Records: 2001-2004
[Figure: duplicate records (in millions; axis 0 to 16) and total exemptions (in millions; axis 243 to 254) by year, 2001 through 2004]
Matched Y1-Y2 Under-Age-65 Exemptions: Percent of Exemptions Migrating by Exemption Status (10 Percent Sample)
[Figure: percent migrating (axis 0 to 35) by Year-1 exemption status (Y1 Filer, Y1 Spouse, Y1 Dependent), with series for Y2 Filer, Y2 Spouse, and Y2 Dependent]
                Y2 Filer           Y2 Spouse          Y2 Dependent
Y1 Filer        7.10% (679,220)    19.16% (38,315)    11.10% (17,024)
Y1 Spouse       16.70% (24,397)    4.28% (158,015)    30.01% (1,119)
Y1 Dependent    11.90% (43,663)    30.24% (3,488)     6.91% (404,213)
Migration Base: Person-based vs. Return-based
[Figure: migration base in thousands (axis 180,000 to 205,000), return-based and person-based series, 1999-2000 through 2003-2004]
Coverage Analysis by State
1. Coverage patterns are consistent across states and years
2. Person-based coverage was consistently lower than return-based coverage
3. The states with the most extreme coverage rates under return-based processing maintained the same pattern under person-based processing
4. The difference in coverage declined for every state between 2000 and 2004. The largest difference was –5.30 in 2000 and –0.48 in 2004
Number of Inter-county Migrants: Person-based vs. Return-based
[Figure: number of inter-county migrants in thousands (axis 11,500 to 14,000), return-based and person-based series, 1999-2000 through 2003-2004]
Inter-county Migration Percent: Person-based vs. Return-based
[Figure: inter-county migration percent (axis 5.4 to 7.2), return-based and person-based series, 1999-2000 through 2003-2004]
Race and Hispanic Origin Distribution: Person-based vs. Return-based
[Figure: percent (axis 0 to 90) by group (White, Black, AIAN, API, Hispanic), return-based and person-based series]
Age Distribution: Person-based vs. Return-based
[Figure: percent (axis 0 to 40) by age group (1-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65+), return-based and person-based series]
Migration Rate Outliers Definition
[Figure: person-based migration rate (%) plotted against return-based migration rate (%), both axes 0 to 9, with a 95% confidence interval and outliers labeled on either side of it]
Findings from Outlier Analysis
The person-based method had a significant effect on migration flows from counties with small populations to counties with large populations
The new method had the largest impact on individuals in their early 20s
Summary of Findings
The person-based method will produce more accurate migration estimates.
The characteristics from the person-based records will be more accurate than those from the return-based records.
Future R&D
1. Integration of Electronic File to enhance the coverage of child dependents
2. Integration of Medicare data at the micro level to produce migration data for the 65+ population