44
Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

Embed Size (px)

Citation preview

Page 1: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

Total Survey ErrorDesign, Implementation, and Evaluation

Paul P. BiemerRTI International and

University of North Carolina at Chapel Hill

Page 2: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

2

Modern View of Survey Design

Surveys should be designed to maximize total survey quality within timeliness and budget constraints.

But how when… survey budgets are severely constrained, data must be produced and disseminated in a timely fashion, public interest in participating in surveys has been declining

world-wide for years, and even when participation is obtained, responses may not be

accurate. This is the challenge for survey research in the 21st

century

Page 3: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

3

Outline

What is total survey quality? How does it differ from total survey error? How can surveys be designed to maximize total survey

quality? What is the total survey error paradigm and what does it

say about the design, implementation, and evaluation of survey?

Page 4: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

4

User and Producer Have Very Different Perspectives on Survey Quality

Producers place high priority on Accuracy – total survey error is minimized Credibility – credible methodologies; trustworthy data

Users place higher priority on Timeliness – data deliveries adhere to schedules Relevance – data satisfy user needs Accessibility – access to data is user friendly Interpretability – documentation is clear; meta-data are

well-managed

Page 5: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

5

Users Also Demand…

Comparability – valid demographic, spatial and temporal comparisons

Coherence – estimates from different sources can be reliably combined

Completeness – data are rich enough to satisfy the analysis objectives without undue burden on respondents

Page 6: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

6

User

TSQ Optimally Balances Producer and User Requirements

Accuracy

Credibility

Timeliness

Accessibility

Comparability

Producer

Coherence

Completeness

Relevance

Interpretability

Page 7: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

7

The Total Survey Quality Paradigm

Identifies measurable and achievable objectives for each user-defined dimension of quality

Determines costs and resources required to achieve these objectives

Maximizes survey accuracy within remaining budget

Survey Budget = Cost of Accuracy

+ Cost of User-Defined Quality

Page 8: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

8

Accuracy is maximized by minimizing total survey error

Total Survey Error

Sampling Error• Sampling

scheme• Sample size• Estimator choice

Nonsampling Error• Specification• Nonresponse• Frame• Measurement• Data processing

Systematic

Variable

Bias

Variance

Mean Squared Error (MSE)

MSE = Bias2 + Variance

Page 9: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

9

Optimal Design for Total Survey Quality

Total Survey Quality

Accuracy

Timeliness

Accessibility

Comparability

Credibility

Interpretability

Minimize total survey error

Specification error

Nonresponse error

Frame error

Measurement error

Data processing error

budget

time

constraints

Subject toSampling error

Page 10: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

10

Designing Surveys to Minimize Total Survey Error

Objective – minimum mean squared error (MSE) subject to cost and timeliness constraints

2MSE = Bias + VarianceMajor bias contributors• concept

misspecification • frame noncoverage • nonresponse• measurement bias• editing errors

Major variance contributors• sampling error • measurement unreliability• interviewer error

Page 11: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

11

Key Design Principles

Design robustness – accuracy does not change appreciably as the survey design features change; i.e. optimum is “flat” over a range of alternate designs

Effect generalizability – design features found to be optimal for one survey are often generalizable to other similar surveys

optimum accuracy

Page 12: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

12

Key Design Principles

Design robustness – accuracy does not change appreciably as the survey design features change; i.e. optimum is “flat” over a range of alternate designs

Effect generalizability – design features found to be optimal for one survey are often generalizable to other similar surveys

loss in accuracy

Page 13: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

13

Implications for Design

Compile information on TSE (e.g., quality profiles) Identify major contributors to TSE Allocate resources to control these errors

Use results from the literature and other similar surveys to guide the design

Develop an effective process for modifying the design during implementation to achieve optimality

Embed experiments and conduct studies to obtain data on TSE for future surveys

Page 14: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

14

Design Implementation Strategies

The initial survey design must modified or adapted during implementation to control costs and maximize quality.

Four strategies for reducing costs and errors in real-time: Continuous quality improvement Responsive design Six Sigma Adaptive total design and implementation

Initial quality Final quality

Page 15: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

15

Continuous Quality Improvement (CQI)1. Prepare a workflow diagram of the process and identify

key process variables.

2. Identify characteristics of the process that are critical to quality (CTQ).

3. Develop real-time, reliable metrics for the cost and quality of each CTQ.

4. Continuously monitor costs and quality metrics during the process.

5. Intervene as necessary to ensure that quality and costs are within acceptable limits.

Page 16: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

16

Responsive Design Strategy Developed for face to face data collection (Groves &

Heeringa, 2006) Similar to CQI but includes three phases:

Experimental phase – tests major design options For e.g., split sample designs to test incentive levels

Main data collection phase – implements design selected in first phase

Continues until “phase capacity” is reached

NRFU phase – special methods implemented to reduce nonresponse bias and control data collection costs

NR double sampling, higher incentives, more intensive followup

Phase capacity – point at which efforts to reduce NR bias under current protocol are no longer cost effective

Innovative uses of paradata for CQI

Page 17: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

17

Six Sigma Developed by Motorola in the 1980’s Definition (from Pande, et al, 2000, p. xi)

Extends ideas of Total Quality Management (TQM) and continuous quality improvement (CQI)

Has mostly been applied in business and manufacturing.

“A comprehensive and flexible system for achieving, sustaining and maximizing business success,…uniquely driven by a close understanding of customer needs, disciplined use of facts, data, and statistical analysis, and diligent attention to managing, improving, and reinventing business processes.” – Pande, et al (2000, p. xi)

Page 18: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

18

Strengths of Six Sigma Provides a systematic, highly effective approach for quality

improvement (DMAIC). Focuses on attributes of a process that are most important

to the client. Emphasizes decision making based on data analysis. Strives for verifiable and sustainable improvements for

both costs and quality. Contains a rich set of techniques and tools for monitoring,

controlling, and improving a process.

Page 19: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

19

Weaknesses of Six Sigma

Can be expensive to implement. Achieving 3.4 defects per million opportunities is an

impossible goal for many survey processes. Often requires data that do not exist and cannot be

obtained affordably. Terminology and some techniques are too business and

manufacturing oriented. This obscures its applicability to survey work.

Uses a lot of jargon.

Page 20: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

20

Six Sigma’s DMAIC Strategy

Define the problem. Measure key aspects of the process and collect relevant

data. Analyze the data to determine root causes of the

problem. Improve the process based upon results from the data

analysis. Control the process by continuously monitoring metrics

from the process.

Page 21: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

21

Develop a survey design with design options A, B, etc

Pretest design and options

Select and Implement best design option

Monitor critical-to-quality design attributes

(CTQs)

Modify design to maximize accuracy while

meeting cost and timeliness objectives

Budget or schedule

exhausted?

Post-survey processing, adjustment, and file

preparation

Pre-release quality evaluations

STOPyes

Typical Survey Design and Implementation Process

no

1

1

Data release

Page 22: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

22

Develop a survey design with design options A, B, etc

Pretest design and options

Select and Implement best design option

Monitor critical-to-quality design attributes

(CTQs)

Modify design to maximize accuracy while

meeting cost and timeliness objectives

Budget or schedule

exhausted?

Post-survey processing, adjustment, and file

preparation

Pre-release quality evaluations

STOPyes

Six Sigma Focuses Primarily on these Activities

no

1

1

Data release

Page 23: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

23

Adaptive Total Design and Implementation An approach for continuously monitoring survey

processes to control errors, improve quality, and reduce costs.

Adaptive in that it combines the real-time error control features of CQI, responsive design, and Six Sigma strategies.

Total in that it simultaneously monitors multiple sources; for e.g., Sampling frame and sampling Response quality Nonresponse bias reduction Field production Costs and timeliness

Page 24: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

24

Six Sigma Tools and Concepts

Workflow diagram Common vs. special cause variation Process control chart Dashboard Fishbone diagram Pareto chart Many others are available (see Breyfogle, 2003))

Page 25: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

25

Interview?Contact?

Workflow Diagram for Sampling and Initial Interview AttemptCompute domain

sample sizeCompute current

eligibility rate

Select sample lines to send to field

Assign case priorities Transmit to FS’s in field

Assign cases to FIs Conduct travel efficiency sessions

Optimize work sequence order

FI places initial contact attempt

Compute required sample per PSU

Complete ROC log Set appointment

Yes Yes2

Critical to quality key:1. Achieve target sample sizes for each domain2. Distribute sample to PSUs to minimize design effects3. FI workloads must be adequate4. Ensure high response propensity for high priority cases 5. Minimize FI travel costs through work sequence optimization6. Ensure that high priority cases are worked fully7. Ensure good cooperation at first contact8. Record of calls (ROC) is completed accurately9. Schedule an firm appointment after each contact

1 1 1

4

5

32

6

7

8 9

5

32

4 4

3

1

Page 26: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

26

CTQs and Metrics for Frame Construction and SamplingCTQs

•Maximize frame coverage•Maximize within unit coverage•Detect/control duplications and

ineligibles •Effectively post-stratified sample

for bias reduction•Use auxiliary data and efficient

estimators to minimize samplingerror

•Minimize design effects for keyanalytic domains

•Achieve target sample allocationsfor key domains

•Optimally allocate sample to strataand sampling stages

Process Metrics•Ineligibility rates by domain,

PSU,and overall

•Achieved # interviews by domain

•# HU's identified by q.c. •Projected vs. actual coverage•Design effects by domain•Screener mean propensity•Interview mean propensity•# active cases

Page 27: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

27

CTQs and Metrics for Observation QualityCTQs

•Detect/control post-surveymeasurement errors

•Identify/repair problematic surveyquestions

•Detect/control response errors•Minimize interviewer biases and

variances

Process Metrics•CARI results by interviewer and

overall •Interviewer exception report•Missing data item frequency by

interviewer•Replicate measurement analysis

summary•Interview length by interviewer•CARI refusal rate by FI, by

phase

Page 28: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

28

CTQs and Metrics for Nonresponse FollowupCTQs

•Maximize response rates•Minimize nonresponse bias•Effectively adjust for unit

nonresponse•Effectively impute missing data

forkey items

Process Metrics•Overall Phase 3 sampling rate

byPSU and overall

•Response rate for high prioritycases

•Hours per convertednonrespondent (refusal vs. other)

•Projected WRR by PSU and overall(actual vs. expected)

•Projected design effects by domain

•Budgeted vs actual hours chargedfor Phase II

Page 29: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

29

CTQs and Metrics for Costs, Production, and TimelinessCTQs

•Maximizing interviewing efficiency

•Maximize effectiveness of refusalconversions attempts

•Complete call histories accuratelyand completely

•Minimize hours per completedscreener

•Minimizes hours per completedinterview

•Maintain planned costs per quarter

•Maintain planned schedule forsample completion per quarter

Process Metrics•Cost per interview•Dollars spent vs. dollars

budgetedby interviewer

•Dollars spent vs. value of workconducted by interviewer

•Cost breakdown (by phase andoverall)

•Number of cases interviewed(actual vs. budgeted)

•Calls per hour (actual vs. expected)

•Refusal conversion rates byinterviewer

•Hours charged (actual vs.expected)

•Level of effort per case byinterviewer and overall

•Hours per completed screener•Hours per completed interview

Page 30: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

30

Special vs. Common Cause Variation Special causes – assignable to events and

circumstances that are extraordinary, rare and unexpected e.g., frame was not sorted prior to sampling Addressed by actions specific to the cause leaving the design of

the process essentially unchanged Common causes – naturally occurring random

disturbances that are inherent in any process and cannot be avoided. e.g., normal fluctuations of response across regions and months Actions designed to address a common cause is neither

required nor advisable; this lead to process “tampering”

Page 31: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

31

Chart of Screening Response Rates by County

Page 32: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

32

Chart of Screening Response Rates by County

Problem counties?

Page 33: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

33

Chart of Screening Response Rates by County

Page 34: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

34

Process Control Chart with More Extreme Values

Page 35: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

35

Process Control Chart with More Extreme Values

Special cause

Page 36: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

36

Example of a DashboardInterviewer Efficiency - Contacting and Locating

Page 37: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

37

Dashboard Showing Weighted Response Rates, Interview Costs, Interviewer Exceptions and Production

Page 38: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

38

Other Useful Tools

Interviewer turnover

Reward system

SupervisionEconomy

Job characteristics Personal reasons Unrealistic employee expectations

Low pay

Job difficulty

Availability of higher paying jobs

Inadequate training

Poor supervision

Misinformation from other FIsFamily situation

Conflict with supervisor

Lack of steady work

Lack of benefits

Cause and effect (fishbone) diagrams Helps to identify all possible root causes of a problem An important component of the measure stage of DMAIC.

Page 39: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

39

Other Useful Tools (cont’d)

Pareto chart Useful for identify the “vital few” sources of process deficiencies

0

20

40

60

80

100

120

140

160

180

200

614 311 140 638 720 420 Allothers

Occupation category

Err

or

co

un

t

Error count

Page 40: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

40

Total Survey Error Evaluation Addresses several dimensions of total survey quality. Essential for optimizing resource allocations to reduce

the errors. In experimentation, needed to compare the quality of

alternative methods. Provides valuable information on data quality for gauging

uncertainty in estimates, interpreting the analysis results, and building confidence and credibility in the data.

Page 41: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

41

Primary Methods

Nonresponse bias studies (required by OMB for some surveys) Evaluates differences between respondents and

nonrespondents for key survey items Frame data or prior waves provides data on nonrespondents Model-based approaches for nonignorable nonresponse bias

Measurement bias studies Record check studies Reconciled reinterviews Internal and external consistency checks Test-retest reinterview approaches Embedded repeated measures analysis (e.g. structural equation

modeling, latent class analysis)

Page 42: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

42

Primary Methods (continued)

Other methods Frame undercoverage evaluations Editing error (pre- and post-editing comparisons) Cognitive methods for detecting comprehension errors, recall

problems, data sensitivity, etc. Subject matter expert reviews of concepts vs. question meaning Process data summaries

Response rate analysis Data entry error rates Edit failure rates Missing data rates Post-survey adjustment factors

Page 43: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

43

Major Take-Home Points Survey quality is multi-dimensional including both data

user and producer dimensions. Accuracy is maximized subject to cost and timeliness

constraints Survey design optimization begins with the initial design

and extends throughout implementation and post-survey processing.

ATDI combines CQI, responsive design and Six Sigma strategies to provide a comprehensive approach for real-time reduction of total survey error and costs.

Survey evaluation is an essential component of the total survey error framework.

Page 44: Total Survey Error Design, Implementation, and Evaluation Paul P. Biemer RTI International and University of North Carolina at Chapel Hill

44

It is not enough to do your best; you must know what to do, and then do your best.– W. Edwards Deming