Total Survey Error: Design, Implementation, and Evaluation
Paul P. Biemer
RTI International and University of North Carolina at Chapel Hill
2
Modern View of Survey Design
Surveys should be designed to maximize total survey quality within timeliness and budget constraints.
But how can this be done when
• survey budgets are severely constrained,
• data must be produced and disseminated in a timely fashion,
• public interest in participating in surveys has been declining worldwide for years, and
• even when participation is obtained, responses may not be accurate?
This is the challenge for survey research in the 21st century.
3
Outline
• What is total survey quality?
• How does it differ from total survey error?
• How can surveys be designed to maximize total survey quality?
• What is the total survey error paradigm, and what does it say about the design, implementation, and evaluation of surveys?
4
User and Producer Have Very Different Perspectives on Survey Quality
Producers place high priority on
• Accuracy – total survey error is minimized
• Credibility – credible methodologies; trustworthy data
Users place higher priority on
• Timeliness – data deliveries adhere to schedules
• Relevance – data satisfy user needs
• Accessibility – access to data is user friendly
• Interpretability – documentation is clear; metadata are well managed
5
Users Also Demand…
Comparability – valid demographic, spatial and temporal comparisons
Coherence – estimates from different sources can be reliably combined
Completeness – data are rich enough to satisfy the analysis objectives without undue burden on respondents
6
TSQ Optimally Balances Producer and User Requirements
[Figure: balance between Producer and User, labeled with the quality dimensions Accuracy, Credibility, Timeliness, Accessibility, Comparability, Coherence, Completeness, Relevance, and Interpretability]
7
The Total Survey Quality Paradigm
Identifies measurable and achievable objectives for each user-defined dimension of quality
Determines costs and resources required to achieve these objectives
Maximizes survey accuracy within remaining budget
Survey Budget = Cost of Accuracy + Cost of User-Defined Quality
8
Accuracy is maximized by minimizing total survey error
Total Survey Error
Sampling Error
• Sampling scheme
• Sample size
• Estimator choice
Nonsampling Error
• Specification
• Nonresponse
• Frame
• Measurement
• Data processing
Systematic errors contribute to bias; variable errors contribute to variance.
Mean Squared Error (MSE)
$\mathrm{MSE} = \mathrm{Bias}^2 + \mathrm{Variance}$
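As a small numerical illustration of the decomposition of MSE into squared bias plus variance, the Python sketch below computes both sides for a set of replicate estimates. The function name `mse_from_components`, the true value, and the estimates are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean, pvariance


def mse_from_components(bias: float, variance: float) -> float:
    """Mean squared error as squared bias plus variance."""
    return bias ** 2 + variance


# Hypothetical replicate estimates of a quantity whose true value is 50.
true_value = 50.0
estimates = [52.0, 54.0, 53.0, 51.0, 55.0]

bias = mean(estimates) - true_value      # systematic component
variance = pvariance(estimates)          # variable component (population form)

# Direct definition: average squared deviation from the true value.
mse_direct = mean([(e - true_value) ** 2 for e in estimates])

print(bias, variance, mse_from_components(bias, variance), mse_direct)
```

The two MSE values agree because the population variance is taken around the mean of the same replicates.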
9
Optimal Design for Total Survey Quality
[Figure: Total Survey Quality comprises Accuracy, Timeliness, Accessibility, Comparability, Credibility, and Interpretability. For Accuracy: minimize total survey error (sampling error, specification error, nonresponse error, frame error, measurement error, data processing error) subject to budget and time constraints]
10
Designing Surveys to Minimize Total Survey Error
Objective – minimum mean squared error (MSE) subject to cost and timeliness constraints
$\mathrm{MSE} = \mathrm{Bias}^2 + \mathrm{Variance}$
Major bias contributors
• concept misspecification
• frame noncoverage
• nonresponse
• measurement bias
• editing errors
Major variance contributors
• sampling error
• measurement unreliability
• interviewer error
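The design objective stated above (minimum MSE subject to cost and timeliness constraints) can be sketched as a simple search over candidate designs. This is only an illustration: the `Design` class, the `best_design` function, and all bias, variance, cost, and schedule figures are hypothetical assumptions, and real designs would need these quantities estimated from prior data.

```python
from dataclasses import dataclass


@dataclass
class Design:
    name: str
    bias: float      # anticipated bias of the key estimate
    variance: float  # anticipated variance of the key estimate
    cost: float      # total cost, in arbitrary budget units
    weeks: float     # time needed to deliver data

    @property
    def mse(self) -> float:
        return self.bias ** 2 + self.variance


def best_design(designs, budget, deadline_weeks):
    """Return the feasible design with the smallest MSE, or None if none is feasible."""
    feasible = [d for d in designs
                if d.cost <= budget and d.weeks <= deadline_weeks]
    return min(feasible, key=lambda d: d.mse, default=None)


# Hypothetical candidate designs.
candidates = [
    Design("A: large sample, no follow-up", 2.0, 0.5, cost=90, weeks=10),
    Design("B: smaller sample, intensive follow-up", 0.5, 1.5, cost=95, weeks=12),
    Design("C: large sample and follow-up", 0.5, 0.5, cost=140, weeks=14),
]

choice = best_design(candidates, budget=100, deadline_weeks=12)
print(choice.name, choice.mse)
```

In this made-up example, design C has the lowest MSE but exceeds the budget and the deadline, so the smaller sample with follow-up (B) is selected over the larger but more biased design A.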
11
Key Design Principles
Design robustness – accuracy does not change appreciably as the survey design features change; i.e. optimum is “flat” over a range of alternate designs
Effect generalizability – design features found to be optimal for one survey are often generalizable to other similar surveys
[Figure labels: optimum accuracy; loss in accuracy]
13
Implications for Design
Compile information on TSE (e.g., quality profiles)
Identify major contributors to TSE
Allocate resources to control these errors
Use results from the literature and other similar surveys to guide the design
Develop an effective process for modifying the design during implementation to achieve optimality
Embed experiments and conduct studies to obtain data on TSE for future surveys
14
Design Implementation Strategies
The initial survey design must be modified or adapted during implementation to control costs and maximize quality.
Four strategies for reducing costs and errors in real time:
• Continuous quality improvement
• Responsive design
• Six Sigma
• Adaptive total design and implementation
[Figure labels: initial quality; final quality]
15
Continuous Quality Improvement (CQI)
1. Prepare a workflow diagram of the process and identify key process variables.
2. Identify characteristics of the process that are critical to quality (CTQ).
3. Develop real-time, reliable metrics for the cost and quality of each CTQ.
4. Continuously monitor costs and quality metrics during the process.
5. Intervene as necessary to ensure that quality and costs are within acceptable limits.
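Steps 3 to 5 above amount to comparing each metric against acceptable limits and flagging those that need intervention. A minimal Python sketch follows, assuming hypothetical metric names and limits; neither the names nor the thresholds come from the presentation.

```python
def out_of_limits(metrics, limits):
    """Return the names of metrics that fall outside their (low, high) limits."""
    flagged = []
    for name, value in metrics.items():
        low, high = limits[name]
        if not (low <= value <= high):
            flagged.append(name)
    return flagged


# Hypothetical acceptable limits for three CTQ metrics.
limits = {
    "response_rate": (0.60, 1.00),
    "cost_per_interview": (0.0, 120.0),
    "item_missing_rate": (0.0, 0.05),
}

# Hypothetical values observed in the current monitoring period.
weekly_metrics = {
    "response_rate": 0.57,
    "cost_per_interview": 110.0,
    "item_missing_rate": 0.03,
}

needs_intervention = out_of_limits(weekly_metrics, limits)
print(needs_intervention)
```

Here only the response rate falls below its lower limit, so it is the one metric flagged for intervention.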
16
Responsive Design Strategy
• Developed for face-to-face data collection (Groves & Heeringa, 2006)
• Similar to CQI but includes three phases:
  • Experimental phase – tests major design options
    e.g., split-sample designs to test incentive levels
  • Main data collection phase – implements design selected in first phase
    Continues until “phase capacity” is reached
  • Nonresponse follow-up (NRFU) phase – special methods implemented to reduce nonresponse bias and control data collection costs
    Nonresponse (NR) double sampling, higher incentives, more intensive follow-up
• Phase capacity – point at which efforts to reduce NR bias under current protocol are no longer cost effective
• Innovative uses of paradata for CQI
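One way to make the idea of phase capacity concrete is to track how much a key estimate still moves per unit of additional cost. The Python sketch below is an assumption about how such a rule could be operationalized, not the rule used by Groves and Heeringa; the function name, the threshold, and the data are hypothetical.

```python
def phase_capacity_wave(estimates, costs, min_change_per_cost):
    """Index of the first wave whose change in the estimate per unit cost
    drops below the threshold, or None if that never happens."""
    for i in range(1, len(estimates)):
        change = abs(estimates[i] - estimates[i - 1])
        if change / costs[i] < min_change_per_cost:
            return i
    return None


# Hypothetical cumulative estimates after each wave of field effort,
# and the cost of each wave in arbitrary units.
cumulative_estimates = [40.0, 44.0, 46.0, 46.5, 46.6]
wave_costs = [10.0, 10.0, 10.0, 10.0, 10.0]

wave = phase_capacity_wave(cumulative_estimates, wave_costs,
                           min_change_per_cost=0.1)
print(wave)
```

With these numbers the estimate changes by 0.4 and 0.2 per unit cost in waves 1 and 2, then by only 0.05 in wave 3, which is where the sketch declares the current protocol no longer cost effective.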
17
Six Sigma
• Developed by Motorola in the 1980s
• Extends ideas of Total Quality Management (TQM) and continuous quality improvement (CQI)
• Has mostly been applied in business and manufacturing
• Definition (from Pande et al., 2000, p. xi):
“A comprehensive and flexible system for achieving, sustaining and maximizing business success,…uniquely driven by a close understanding of customer needs, disciplined use of facts, data, and statistical analysis, and diligent attention to managing, improving, and reinventing business processes.”
18
Strengths of Six Sigma
• Provides a systematic, highly effective approach for quality improvement (DMAIC).
• Focuses on attributes of a process that are most important to the client.
• Emphasizes decision making based on data analysis.
• Strives for verifiable and sustainable improvements for both costs and quality.
• Contains a rich set of techniques and tools for monitoring, controlling, and improving a process.
19
Weaknesses of Six Sigma
• Can be expensive to implement.
• Achieving 3.4 defects per million opportunities is an impossible goal for many survey processes.
• Often requires data that do not exist and cannot be obtained affordably.
• Terminology and some techniques are too business and manufacturing oriented. This obscures its applicability to survey work.
• Uses a lot of jargon.
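To put the 3.4 defects per million opportunities target in perspective, the following Python sketch computes defects per million opportunities (DPMO) for a made-up data entry process. The counts are hypothetical, and treating each questionnaire item as one "opportunity" is an assumption made for illustration.

```python
SIX_SIGMA_TARGET_DPMO = 3.4


def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities."""
    return defects * 1_000_000 / (units * opportunities_per_unit)


# Hypothetical: 2,000 keyed questionnaires with 50 items each,
# and 1,500 item-level keying errors found on verification.
rate = dpmo(defects=1500, units=2000, opportunities_per_unit=50)
meets_target = rate <= SIX_SIGMA_TARGET_DPMO

print(rate, meets_target)
```

An error rate of 1.5 percent, which might be considered acceptable in some survey operations, corresponds to 15,000 DPMO, several orders of magnitude above the Six Sigma target.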
20
Six Sigma’s DMAIC Strategy
• Define the problem.
• Measure key aspects of the process and collect relevant data.
• Analyze the data to determine root causes of the problem.
• Improve the process based upon results from the data analysis.
• Control the process by continuously monitoring metrics from the process.
21
Typical Survey Design and Implementation Process
1. Develop a survey design with design options A, B, etc.
2. Pretest design and options
3. Select and implement best design option
4. Monitor critical-to-quality design attributes (CTQs)
5. Modify design to maximize accuracy while meeting cost and timeliness objectives
6. Budget or schedule exhausted? If no, return to monitoring (step 4); if yes, continue
7. Post-survey processing, adjustment, and file preparation
8. Pre-release quality evaluations
9. Data release
STOP
22
Six Sigma Focuses Primarily on these Activities
[Figure: the survey design and implementation flowchart from the previous slide, with the activities that Six Sigma focuses on highlighted]
23
Adaptive Total Design and Implementation
• An approach for continuously monitoring survey processes to control errors, improve quality, and reduce costs.
• Adaptive in that it combines the real-time error control features of CQI, responsive design, and Six Sigma strategies.
• Total in that it simultaneously monitors multiple sources of error, e.g.,
  • Sampling frame and sampling
  • Response quality
  • Nonresponse bias reduction
  • Field production
  • Costs and timeliness
24
Six Sigma Tools and Concepts
• Workflow diagram
• Common vs. special cause variation
• Process control chart
• Dashboard
• Fishbone diagram
• Pareto chart
• Many others are available (see Breyfogle, 2003)
25
Workflow Diagram for Sampling and Initial Interview Attempt
[Figure: workflow diagram whose boxes are annotated with the critical-to-quality numbers listed in the key below]
Process steps shown in the diagram:
• Compute domain sample size
• Compute current eligibility rate
• Compute required sample per PSU
• Select sample lines to send to field
• Assign case priorities
• Transmit to FS’s in field
• Assign cases to FIs
• Conduct travel efficiency sessions
• Optimize work sequence order
• FI places initial contact attempt
• Contact? (yes/no decision)
• Interview? (yes/no decision)
• Complete ROC log
• Set appointment
Critical to quality key:
1. Achieve target sample sizes for each domain
2. Distribute sample to PSUs to minimize design effects
3. FI workloads must be adequate
4. Ensure high response propensity for high priority cases
5. Minimize FI travel costs through work sequence optimization
6. Ensure that high priority cases are worked fully
7. Ensure good cooperation at first contact
8. Record of calls (ROC) is completed accurately
9. Schedule a firm appointment after each contact
26
CTQs and Metrics for Frame Construction and Sampling
CTQs
• Maximize frame coverage
• Maximize within-unit coverage
• Detect/control duplications and ineligibles
• Effectively post-stratify sample for bias reduction
• Use auxiliary data and efficient estimators to minimize sampling error
• Minimize design effects for key analytic domains
• Achieve target sample allocations for key domains
• Optimally allocate sample to strata and sampling stages
Process Metrics
• Ineligibility rates by domain, PSU, and overall
• Achieved # interviews by domain
• # HUs identified by q.c.
• Projected vs. actual coverage
• Design effects by domain
• Screener mean propensity
• Interview mean propensity
• # active cases
27
CTQs and Metrics for Observation Quality
CTQs
• Detect/control post-survey measurement errors
• Identify/repair problematic survey questions
• Detect/control response errors
• Minimize interviewer biases and variances
Process Metrics
• CARI results by interviewer and overall
• Interviewer exception report
• Missing data item frequency by interviewer
• Replicate measurement analysis summary
• Interview length by interviewer
• CARI refusal rate by FI, by phase
28
CTQs and Metrics for Nonresponse Followup
CTQs
• Maximize response rates
• Minimize nonresponse bias
• Effectively adjust for unit nonresponse
• Effectively impute missing data for key items
Process Metrics
• Overall Phase 3 sampling rate by PSU and overall
• Response rate for high priority cases
• Hours per converted nonrespondent (refusal vs. other)
• Projected WRR by PSU and overall (actual vs. expected)
• Projected design effects by domain
• Budgeted vs. actual hours charged for Phase II
29
CTQs and Metrics for Costs, Production, and Timeliness
CTQs
• Maximize interviewing efficiency
• Maximize effectiveness of refusal conversion attempts
• Complete call histories accurately and completely
• Minimize hours per completed screener
• Minimize hours per completed interview
• Maintain planned costs per quarter
• Maintain planned schedule for sample completion per quarter
Process Metrics
• Cost per interview
• Dollars spent vs. dollars budgeted by interviewer
• Dollars spent vs. value of work conducted by interviewer
• Cost breakdown (by phase and overall)
• Number of cases interviewed (actual vs. budgeted)
• Calls per hour (actual vs. expected)
• Refusal conversion rates by interviewer
• Hours charged (actual vs. expected)
• Level of effort per case by interviewer and overall
• Hours per completed screener
• Hours per completed interview
30
Special vs. Common Cause Variation
• Special causes – assignable to events and circumstances that are extraordinary, rare, and unexpected
  • e.g., frame was not sorted prior to sampling
  • Addressed by actions specific to the cause, leaving the design of the process essentially unchanged
• Common causes – naturally occurring random disturbances that are inherent in any process and cannot be avoided
  • e.g., normal fluctuations of response across regions and months
  • Actions designed to address a common cause are neither required nor advisable; they lead to process “tampering”
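The distinction between special and common cause variation is usually operationalized with control limits. The Python sketch below uses the standard three-sigma limits for a proportion (a p-chart) to flag groups whose response rate falls outside what common cause variation would explain. The county counts are hypothetical, and the function names are illustrative rather than taken from any particular package.

```python
from math import sqrt


def p_chart_limits(successes, sizes, sigmas=3.0):
    """Center line and per-group (lower, upper) control limits for proportions."""
    p_bar = sum(successes) / sum(sizes)
    limits = []
    for n in sizes:
        half_width = sigmas * sqrt(p_bar * (1 - p_bar) / n)
        limits.append((max(0.0, p_bar - half_width),
                       min(1.0, p_bar + half_width)))
    return p_bar, limits


def special_cause_groups(successes, sizes, sigmas=3.0):
    """Indices of groups whose observed proportion lies outside the limits."""
    _, limits = p_chart_limits(successes, sizes, sigmas)
    flagged = []
    for i, (x, n) in enumerate(zip(successes, sizes)):
        low, high = limits[i]
        if not (low <= x / n <= high):
            flagged.append(i)
    return flagged


# Hypothetical completed screenings out of cases fielded in six counties.
completed = [82, 78, 85, 80, 52, 81]
fielded = [100, 100, 100, 100, 100, 100]

flagged = special_cause_groups(completed, fielded)
print(flagged)
```

Only the county with 52 completed screenings falls outside the limits; the remaining differences between counties are within the range expected from common cause variation and, following the slide, should not trigger intervention.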
31
Chart of Screening Response Rates by County
32
Chart of Screening Response Rates by County
Problem counties?
33
Chart of Screening Response Rates by County
34
Process Control Chart with More Extreme Values
35
Process Control Chart with More Extreme Values
Special cause
36
Example of a Dashboard
Interviewer Efficiency - Contacting and Locating
37
Dashboard Showing Weighted Response Rates, Interview Costs, Interviewer Exceptions and Production
38
Other Useful Tools
Cause and effect (fishbone) diagrams
• Help to identify all possible root causes of a problem
• An important component of the measure stage of DMAIC
[Figure: fishbone diagram for the problem “Interviewer turnover”. Branch labels: Reward system, Supervision, Economy, Job characteristics, Personal reasons, Unrealistic employee expectations. Causes shown: Low pay, Job difficulty, Availability of higher paying jobs, Inadequate training, Poor supervision, Misinformation from other FIs, Family situation, Conflict with supervisor, Lack of steady work, Lack of benefits]
39
Other Useful Tools (cont’d)
Pareto chart
• Useful for identifying the “vital few” sources of process deficiencies
[Figure: Pareto chart of error count (y-axis, 0 to 200) by occupation category (x-axis: 614, 311, 140, 638, 720, 420, all others)]
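The "vital few" idea behind a Pareto chart can be sketched as ranking categories by error count and keeping the top ones until they account for a chosen share of all errors. In the Python sketch below, the category codes are the ones that appear on the chart's axis, but the counts are hypothetical (the actual bar heights are not recoverable here), and the 80 percent cutoff is a common convention assumed for illustration.

```python
def vital_few(counts, share=0.8):
    """Top categories, in descending order of count, that together reach
    at least the given share of all errors."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    selected, running = [], 0
    for category, count in ranked:
        selected.append(category)
        running += count
        if running >= share * total:
            break
    return selected


# Hypothetical error counts by occupation category code.
error_counts = {"614": 180, "311": 120, "140": 60,
                "638": 20, "720": 10, "420": 10}

top = vital_few(error_counts)
print(top)
```

With these made-up counts, three of the six categories account for 90 percent of the errors, so improvement effort would be directed at those three first.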
40
Total Survey Error Evaluation
• Addresses several dimensions of total survey quality.
• Essential for optimizing resource allocations to reduce the errors.
• In experimentation, needed to compare the quality of alternative methods.
• Provides valuable information on data quality for gauging uncertainty in estimates, interpreting the analysis results, and building confidence and credibility in the data.
41
Primary Methods
Nonresponse bias studies (required by OMB for some surveys)
• Evaluate differences between respondents and nonrespondents for key survey items
• Frame data or prior waves provide data on nonrespondents
• Model-based approaches for nonignorable nonresponse bias
Measurement bias studies
• Record check studies
• Reconciled reinterviews
• Internal and external consistency checks
• Test-retest reinterview approaches
• Embedded repeated measures analysis (e.g., structural equation modeling, latent class analysis)
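For the nonresponse bias studies listed above, a common starting point is the deterministic expression for the bias of a respondent mean: the nonresponse rate multiplied by the difference between respondent and nonrespondent means. The Python sketch below applies it to hypothetical values; in practice the nonrespondent mean would have to come from frame data, prior waves, or a follow-up study.

```python
def nonresponse_bias(mean_respondents, mean_nonrespondents, response_rate):
    """Bias of the respondent mean relative to the full-sample mean."""
    return (1.0 - response_rate) * (mean_respondents - mean_nonrespondents)


# Hypothetical values: 75% response rate, and a characteristic reported by
# 40% of respondents versus 30% of nonrespondents.
response_rate = 0.75
mean_r, mean_nr = 0.40, 0.30

bias = nonresponse_bias(mean_r, mean_nr, response_rate)

# Cross-check against the full-sample mean computed directly.
full_mean = response_rate * mean_r + (1.0 - response_rate) * mean_nr

print(round(bias, 4), round(mean_r - full_mean, 4))
```

The expression makes clear that a high response rate alone does not guarantee low bias: the bias also depends on how different nonrespondents are from respondents on the item in question.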
42
Primary Methods (continued)
Other methods
• Frame undercoverage evaluations
• Editing error (pre- and post-editing comparisons)
• Cognitive methods for detecting comprehension errors, recall problems, data sensitivity, etc.
• Subject matter expert reviews of concepts vs. question meaning
Process data summaries
• Response rate analysis
• Data entry error rates
• Edit failure rates
• Missing data rates
• Post-survey adjustment factors
43
Major Take-Home Points
• Survey quality is multi-dimensional, including both data user and producer dimensions.
• Accuracy is maximized subject to cost and timeliness constraints.
• Survey design optimization begins with the initial design and extends throughout implementation and post-survey processing.
• Adaptive total design and implementation (ATDI) combines CQI, responsive design, and Six Sigma strategies to provide a comprehensive approach for real-time reduction of total survey error and costs.
• Survey evaluation is an essential component of the total survey error framework.
44
“It is not enough to do your best; you must know what to do, and then do your best.”
– W. Edwards Deming