1
Redefining the Unit Nonresponse Adjustment Cells for the Survey of
Residential Alterations and Repairs (SORAR)
Laura T. Ozcoskun and
Katherine Jenny Thompson
Presented By Samson Adeshiyan
2
Outline
• Background
• The Problem
• The Authors’ Recipe for a Solution
• Some Empirical Results Interspersed
3
Survey of Residential Alterations and Repairs (SORAR) Background
• Monthly data collection
• Low unit response rates
• Key item: Total Expenditures
  • Maintenance and Repairs
  • Improvements
• Multi-stage sample of Housing Units (HUs)
  • Privately-owned vacant HUs (Vacant)
  • Rental and 5+ unit properties (Rental)
• Modified Half-Sample Variance Estimator
4
The Problem (Motivation)
• SORAR’s three-stage weighting procedure
  • Duplication control (field subsampling)
  • Unit nonresponse adjustment
  • Post-stratification adjustment
• Suspected that the variables used to define unit nonresponse weighting cells are not highly related to
  • Response propensity or
  • Cell means
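To make the second weighting stage concrete, here is a minimal sketch of a weighting-cell nonresponse adjustment. It assumes the common form in which each cell's factor is the weighted count of sampled units divided by the weighted count of respondents; the slides do not spell out SORAR's exact formula, and the cell labels, weights, and function names below are hypothetical.

```python
from collections import defaultdict

def nonresponse_adjustment_factors(units):
    """Return {cell: factor}; factor = weighted sampled total / weighted respondent total."""
    sampled = defaultdict(float)
    responding = defaultdict(float)
    for cell, weight, responded in units:
        sampled[cell] += weight
        if responded:
            responding[cell] += weight
    # A cell with no respondents has no defined factor and would need collapsing first.
    return {cell: sampled[cell] / responding[cell]
            for cell in sampled if responding[cell] > 0}

def adjust_weights(units, factors):
    """Respondents carry weight * factor; nonrespondents get weight 0."""
    return [(cell, weight * factors[cell] if responded else 0.0, responded)
            for cell, weight, responded in units]

# Hypothetical input: (cell label, weight after duplication control, responded?)
units = [
    ("Northeast/Vacant", 100.0, True),
    ("Northeast/Vacant", 100.0, False),
    ("Northeast/Rental", 50.0, True),
    ("Northeast/Rental", 50.0, True),
    ("Northeast/Rental", 50.0, False),
]
factors = nonresponse_adjustment_factors(units)
adjusted = adjust_weights(units, factors)
```

Within each cell the adjusted respondent weights sum to the original weighted sample total, which is why the choice of cells matters: the adjustment only helps if respondents in a cell resemble the nonrespondents they stand in for.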
5
Response Model
• “Quasi-Randomization” (Oh & Scheuren 1983)
  • Covariate-dependent, missing-at-random (MAR) response mechanism
  • Response propensity (p) is a random variable.
• Minimum requirements for weighting cells:
  1. Heterogeneous response propensities across cells or
  2. Heterogeneous cell means across cells
• Optimal adjustment cells satisfy both conditions.
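A worked equation may help show why these requirements matter. The expression below is the standard large-sample approximation to the bias of a weighting-cell estimator of a mean when unit i responds with propensity p_i; it is not taken from the slides. The notation (N_c units in cell c, N in total, cell mean propensity \bar{p}_c, cell mean \bar{Y}_c) is assumed here for illustration.

```latex
\[
\operatorname{Bias}\bigl(\hat{\bar{Y}}_{\text{cells}}\bigr)
  \;\approx\; \sum_{c} \frac{N_c}{N}\,
  \frac{1}{N_c\,\bar{p}_c} \sum_{i \in c} (p_i - \bar{p}_c)\,(y_i - \bar{Y}_c)
\]
```

Each cell contributes its within-cell covariance between propensity and the survey item, scaled by the cell's mean propensity. The contribution vanishes if either the propensities or the item values are constant within the cell. Cells that capture the differences in propensities or in means between cells therefore leave less within-cell covariation to bias the estimate.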
6
The Authors’ Recipe
• Determine Eligible Sets of Classification Variables
• Determine Uncollapsed Cells (Full Model)
  • Logistic Regression Analysis
• Determine Collapsed Cells (Reduced Model)
  • General Linear Hypothesis Tests
  • Relative Efficiency Diagnostic (MSE Ratios)
  • Time Series Plots of Adjustment Factors
7
Step 1: Find Sets of Classification Variables for Cells
• Respondent requirements per cell:
  • Actual cell size ≥ 5
    • needed for logistic regression
  • Effective “sample” (cell) size ≥ 5
• Categorical variables
8
Cell Sizes
• Effective “Sample” (Cell) Size: $\tilde{r}_p = r_p / \mathrm{DEFF}_p$
  • $r_p$ is the actual cell size of cell p
  • $\mathrm{DEFF}_p$ is the design effect for item Y in cell p
  • $\tilde{r}_p > r_p$ indicates an efficient design for item Y
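The cell-size screen can be sketched in a few lines. This assumes the effective size is the actual respondent count divided by the design effect, and that the value 5 on the Step 1 slide is a lower bound for both sizes. The cell names, counts, and design effects are hypothetical.

```python
def effective_cell_size(actual_size, deff):
    """Effective respondent count: actual count deflated by the design effect."""
    if deff <= 0:
        raise ValueError("design effect must be positive")
    return actual_size / deff

def is_eligible(actual_size, deff, minimum=5):
    """A cell qualifies only if both its actual and effective sizes reach the minimum."""
    return actual_size >= minimum and effective_cell_size(actual_size, deff) >= minimum

# Hypothetical cells: name -> (respondent count, design effect for item Y)
cells = {"cell A": (12, 1.5), "cell B": (6, 2.0), "cell C": (5, 0.8)}
report = {name: (effective_cell_size(r, d), is_eligible(r, d))
          for name, (r, d) in cells.items()}
```

Here cell B passes on actual size but fails on effective size, while cell C's design effect below one gives it an effective size larger than its actual size.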
9
Candidate Cells (SORAR)
• Candidate cell variables (categorical)
  • Region (currently used)
  • Metropolitan Statistical Area (MSA) status (currently used)
  • Tenure (Vacant/Rental)
  • Single-unit vs. Multi-unit
• Candidate cross classifications
  • Region/MSA Status/Single or Multi-Unit
  • Region/Tenure/Single or Multi-Unit
10
Step 2: Uncollapsed Cells (Full Model)
• Response Propensity Modeling
• Logistic Regression
  • Complex survey adaptations of Roberts, Rao, and Kumar (1987) to test statistics
• Full and reduced (nested) models
  • Want all effects to be significant in full model
  • Would like to reject majority of nested models
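The full-versus-reduced comparison can be illustrated with a simplified likelihood-ratio test. This sketch treats units as a simple random sample and compares a saturated cell model (one response rate per full cell) with a collapsed model (one rate per merged cell). It does not implement the Roberts, Rao, and Kumar (1987) survey adjustments, and the counts and names are hypothetical.

```python
import math

def binomial_loglik(responders, sampled, p):
    """Binomial log-likelihood kernel for `responders` out of `sampled` at rate p."""
    ll = 0.0
    if responders > 0:
        ll += responders * math.log(p)
    if sampled - responders > 0:
        ll += (sampled - responders) * math.log(1.0 - p)
    return ll

def lr_statistic(full_cells, collapse_map):
    """full_cells: {cell: (responders, sampled)}; collapse_map: {cell: merged cell}.

    Returns (likelihood-ratio statistic, degrees of freedom).
    """
    pooled = {}
    for cell, (r, n) in full_cells.items():
        pr, pn = pooled.get(collapse_map[cell], (0, 0))
        pooled[collapse_map[cell]] = (pr + r, pn + n)
    ll_full = sum(binomial_loglik(r, n, r / n) for r, n in full_cells.values())
    ll_reduced = 0.0
    for cell, (r, n) in full_cells.items():
        pr, pn = pooled[collapse_map[cell]]
        ll_reduced += binomial_loglik(r, n, pr / pn)
    return 2.0 * (ll_full - ll_reduced), len(full_cells) - len(pooled)

# Hypothetical month: single-unit and multi-unit cells merged into one.
full_cells = {"single": (30, 100), "multi": (60, 100)}
collapse_map = {"single": "all", "multi": "all"}
stat, df = lr_statistic(full_cells, collapse_map)
reject = stat > 3.841  # 5% critical value of chi-square with 1 degree of freedom
```

Rejecting the reduced model in most months is the pattern the slide hopes for: it suggests the classification variable being dropped carries real information about response propensity.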
11
Logistic Regression (SORAR)
• 18 months
• Separate full and reduced models for each month
• Between-cell covariance approximations: correlation = 0 (anti-conservative), -0.25, or -0.50 (conservative)
12
Model 1: Region/MSA/Single or Multi-Unit
Number of months (out of 18) in which each hypothesis was rejected, by assumed between-cell correlation:

Hypothesis                        | Corr. = 0       | Corr. = -0.25   | Corr. = -0.50
                                  | Rej.   Not rej. | Rej.   Not rej. | Rej.   Not rej.
REGION = MSA = HU = 0 (Full)      |  18       0     |  18       0     |  18       0
REGION = MSA = 0 | HU ≠ 0         |  14       4     |  13       5     |  10       8
REGION = HU = 0 | MSA ≠ 0         |  18       0     |  18       0     |  18       0
MSA = HU = 0 | REGION ≠ 0         |  18       0     |  18       0     |  18       0
REGION = 0 | MSA ≠ 0, HU ≠ 0      |  12       6     |  12       6     |   9       9
MSA = 0 | REGION ≠ 0, HU ≠ 0      |   8      10     |   8      10     |   8      10
HU = 0 | REGION ≠ 0, MSA ≠ 0      |  18       0     |  18       0     |  18       0

• Very sensitive to correlation assumptions
• Indicates necessity of including Single/Multi-Unit in weighting cells
• Region and MSA less necessary given Single/Multi-Unit
13
Model 2: Region/Tenure/Single or Multi-Unit
Number of months (out of 18) in which each hypothesis was rejected, by assumed between-cell correlation:

Hypothesis                        | Corr. = 0       | Corr. = -0.25   | Corr. = -0.50
                                  | Rej.   Not rej. | Rej.   Not rej. | Rej.   Not rej.
REGION = TEN = HU = 0 (Full)      |  18       0     |  18       0     |  18       0
REGION = TEN = 0 | HU ≠ 0         |  18       0     |  18       0     |  17       1
REGION = HU = 0 | TEN ≠ 0         |  18       0     |  18       0     |  18       0
TEN = HU = 0 | REGION ≠ 0         |  18       0     |  18       0     |  18       0
REGION = 0 | TEN ≠ 0, HU ≠ 0      |  14       4     |  14       4     |  11       7
TEN = 0 | REGION ≠ 0, HU ≠ 0      |  13       5     |  13       5     |  13       5
HU = 0 | REGION ≠ 0, TEN ≠ 0      |  18       0     |  18       0     |  18       0

• Insensitive to correlation assumptions (change)
• Indicates necessity of including Single/Multi-Unit in weighting cells (unchanged)
• Region and Tenure often necessary (change)
14
Step 3: Collapsed Cells (Reduced Model)
• General Linear Hypothesis Tests
• Relative Efficiency Diagnostic
• Time Series Plots of Estimated Nonresponse Adjustment Factors
15
General Linear Hypothesis Test
Hypothesis Tests
• H0: $y_{11} = y_{21}$ and $y_{12} = y_{22}$ (collapse rows)
• H0: $y_{11} = y_{12}$ and $y_{21} = y_{22}$ (collapse columns)
Not done with SORAR (cell estimates too variable)

                             | Classification variable k
Classification variable k’   | $y_{11}$ (cell 1) | $y_{12}$ (cell 2)
                             | $y_{21}$ (cell 3) | $y_{22}$ (cell 4)
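As an illustration of how such a test could be computed, the sketch below forms a Wald statistic for two linear contrasts among the four cell estimates. The estimates and covariance matrix are hypothetical. In practice the covariance would come from a design-based variance estimator, and the chi-square reference value ignores any degrees-of-freedom correction the design might call for.

```python
def wald_statistic(estimates, cov, contrasts):
    """W = (C y)' (C V C')^{-1} (C y) for exactly two contrasts (2 degrees of freedom)."""
    k = len(estimates)
    cy = [sum(c[j] * estimates[j] for j in range(k)) for c in contrasts]
    # C V C' is a 2 x 2 matrix because there are two contrasts.
    cvc = [[sum(ci[a] * cov[a][b] * cj[b] for a in range(k) for b in range(k))
            for cj in contrasts] for ci in contrasts]
    det = cvc[0][0] * cvc[1][1] - cvc[0][1] * cvc[1][0]
    if det == 0:
        raise ValueError("C V C' is singular")
    inv = [[cvc[1][1] / det, -cvc[0][1] / det],
           [-cvc[1][0] / det, cvc[0][0] / det]]
    return sum(cy[i] * inv[i][j] * cy[j] for i in range(2) for j in range(2))

# Cell order: y11, y12, y21, y22
collapse_rows = [[1, 0, -1, 0], [0, 1, 0, -1]]      # y11 = y21 and y12 = y22
collapse_columns = [[1, -1, 0, 0], [0, 0, 1, -1]]   # y11 = y12 and y21 = y22

# Hypothetical cell estimates and a hypothetical diagonal covariance matrix.
estimates = [10.0, 20.0, 10.5, 19.0]
cov = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

w_rows = wald_statistic(estimates, cov, collapse_rows)
w_columns = wald_statistic(estimates, cov, collapse_columns)
CRITICAL_5PCT_2DF = 5.991  # 5% critical value of chi-square with 2 degrees of freedom
```

With these made-up numbers the row hypothesis is not rejected, so collapsing over classification variable k’ looks acceptable, while the column hypothesis is rejected. Highly variable cell estimates inflate the covariance and push both statistics toward zero, which is consistent with the slide's reason for skipping this test.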
16
Relative Efficiency Diagnostic: MSE Ratios
• Modified from Eltinge and Yansaneh (1997)
• Definitions
  • $\hat{Y}_F$: approximately model-unbiased estimate under full model
  • $\hat{Y}_C$: model-biased estimate under a collapsed weighting procedure
  • $\widehat{MSE}(\hat{Y}_F) = \hat{V}(\hat{Y}_F)$ (under model assumption)
  • $\widehat{MSE}(\hat{Y}_C) = \hat{V}(\hat{Y}_C) + \hat{B}^2(\hat{Y}_C)$
• Mean squared error ratio:
  $R = \dfrac{\hat{V}(\hat{Y}_C) + \hat{B}^2(\hat{Y}_C)}{\hat{V}(\hat{Y}_F)}$
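A short numerical sketch of the MSE ratio follows. It assumes the bias of the collapsed estimate is estimated by its difference from the full-model estimate. The slide does not state the bias estimator, so that choice and all input values are hypothetical.

```python
def mse_ratio(est_full, var_full, est_collapsed, var_collapsed):
    """Estimated MSE of the collapsed estimate divided by the variance of the full one."""
    if var_full <= 0:
        raise ValueError("full-model variance estimate must be positive")
    # Assumption: bias is estimated as the collapsed estimate minus the full estimate.
    bias = est_collapsed - est_full
    return (var_collapsed + bias ** 2) / var_full

# Hypothetical estimates of total expenditures and their variance estimates.
ratio_above_one = mse_ratio(1000.0, 400.0, 1005.0, 380.0)
ratio_below_one = mse_ratio(1000.0, 400.0, 1002.0, 360.0)
```

A ratio above one means the collapsed procedure's added bias outweighs its variance reduction. A ratio below one arises when the collapsed variance estimate is smaller and the estimated bias is small.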
17
SORAR MSE Ratios: Total Expenditures
• Tenure dropped: Median $R_H$ = 1.02
• HU Category dropped: Median $R_T$ = 0.93
• On average, $R_H$ is both greater than one and closer to one than $R_T$
• Not terrifically compelling evidence for either collapsing
• How can values be less than 1?
  • Function of using empirical data
  • Collapsed variances smaller than or equivalent to uncollapsed variances
  • Estimated bias often “negligible”
18
Time Series Plots of Adjustment Factors
• Visual, less statistical
  • Fewer assumptions
• Full procedure and collapsed procedure adjustment factors
  • Within region (SORAR)
  • Inverse of response propensities (SORAR)
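The series behind such plots can be built as sketched below: for each month, one adjustment factor per full cell and one for the collapsed cell, each taken as the inverse of the observed response rate. The sketch uses unweighted counts for simplicity, and the monthly counts and function names are hypothetical.

```python
def adjustment_factor(sampled, responding):
    """Inverse of the observed response rate."""
    if responding <= 0:
        raise ValueError("cell has no respondents")
    return sampled / responding

def monthly_factors(cell_a, cell_b):
    """cell_a, cell_b: lists of (sampled, responding) per month for two full cells.

    Returns one (factor_a, factor_b, collapsed_factor) tuple per month.
    """
    rows = []
    for (na, ra), (nb, rb) in zip(cell_a, cell_b):
        rows.append((adjustment_factor(na, ra),
                     adjustment_factor(nb, rb),
                     adjustment_factor(na + nb, ra + rb)))
    return rows

# Hypothetical two-month series for single-unit and multi-unit cells in one region.
single_unit = [(40, 20), (40, 10)]
multi_unit = [(60, 50), (60, 40)]
series = monthly_factors(single_unit, multi_unit)
```

Plotting the three columns against month shows whether the collapsed factor tracks the original factors or departs from them, which is the visual check the slide describes.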
19
Candidate Cells: Region by Single/Multi for Vacant Properties
• Original adjustment factors very different in scale
• Collapsed factors are far from both original factors
[Figure: time series plot of adjustment factors, vertical axis 0 to 16. Series: Vacant Single-Unit Property Factors; Vacant Multi-Unit Property Factors; Collapsed Vacant Units]
20
Candidate Cells: Region by Single/Multi for Rental Properties
• Original adjustment factors very different in scale
• Collapsed factors are far from both original factors (cf. multi-unit factors)
[Figure: time series plot of adjustment factors, vertical axis 0 to 16. Series: Rental Single-Unit Property Factors; Rental Multi-Unit Property Factors; Collapsed Rental Units]
21
Candidate Cells: Region by Tenure for Single-Unit Properties
• Scale of original factors “similar” (compared to earlier slide)
• Collapsed factors different for single units
[Figure: time series plot of adjustment factors, vertical axis 0 to 16. Series: Vacant Single-Unit Property Factors; Rental Single-Unit Property Factors; Collapsed Single Unit]
22
Candidate Cells: Region by Tenure for Multi-Unit Properties
• Scale of original factors similar
• Collapsed factors similar to original factors
[Figure: time series plot of adjustment factors, vertical axis 0 to 16. Series: Vacant Multi-Unit Property Factors; Rental Multi-Unit Property Factors; Collapsed Multi Unit]
23
Final Recommendation (SORAR)
• Full weighting cells
  • Region/Tenure/Single or Multi-Unit
• Collapsed weighting cells
  • Region/Single or Multi-Unit
  • Region
24
Conclusion
• Started with a recipe
  • Model-development tools
  • Diagnostic tools
• Modified the recipe for our survey
  • Considered and dropped diagnostics (data-based)
• Ended up with a new main course
  • More statistically defensible unit nonresponse adjustment cells.