Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
1
All official NCHS BSC documents are posted on the BSC website
(https://www.cdc.gov/nchs/about/bsc/bsc_meetings.htm)
Department of Health and Human Services
Board of Scientific Counselors
National Center for Health Statistics
Centers for Disease Control and Prevention
January 9-10, 2020
Meeting Summary
The Board of Scientific Counselors (BSC) convened on January 9-10, 2020, at the National
Center for Health Statistics (NCHS), Centers for Disease Control and Prevention (CDC),
3311 Toledo Road, Hyattsville, MD. The meeting was open to the public.
Board Members Present
Linette T. Scott, M.D., M.P.H., Chair, BSC
Kennon R. Copeland, Ph.D.
Prashila Dullabh, M.D. (in person 1/9, by phone 1/10)
Darrell J. Gaskin, Ph.D. (present 1/9 only)
Robert M. Hauser, Ph.D.
Scott H. Holan, Ph.D.
Helen G. Levy, Ph.D.
R. John Lumpkin, M.D., M.P.H.
Sally C. Morton, Ph.D.
Kristen M. Olson, Ph.D. (by phone)
Andy Peytchev, Ph.D.
Ninez A. Ponce, M.P.P., Ph.D.
Gretchen Van Wye, Ph.D., M.A. (present 1/9 only)
CDC/NCHS Participants
Jennifer Madans, Ph.D., Acting Director, NCHS
Sayeedha Uddin, M.D., M.P.H., Designated Federal Officer, NCHS
Chesley Richards, M.D., M.P.H., F.A.C.P., Deputy Director for Public Health Science and
Surveillance, CDC (present 1/10 only)
NCHS Staff
N. Ahluwalia, Division of Health and Nutrition Examination Surveys (DHANES)
Lara Akinbami, DHANES
Johanna Alfier, Division of Analysis and Epidemiology (DAE)
Josephine Alford, Division of Health Care Statistics (DHCS)
Joyce Abma, Division of Vital Statistics (DVS)/Reproductive Statistics Branch (RSB)
Irma Arispe, DAE
Stephen Blumberg, Division of Health Interview Statistics (DHIS)
Debra Brody, DHANES
Rebecca Bunnell, Office of Science (OS)/CDC
https://www.cdc.gov/nchs/about/bsc/bsc_meetings.htm
2
Anjani Chandra, DVS/RSB
Te-Ching Chen, DHANES
Chinagoza Ugwu, DVS/RSB
Barnali Das, DAE/Population Health Reporting and Dissemination Branch (PHRDB)
Carol DeFrances, DHCS
Denys Lau, DHCS
Brian Ward, DHCS
Tala Fakhouri, DHANES
Lee Anne Flagg, DVS
Cordell Golden, DAE
Jessica Graber, DHANES
Leda Gurley, DAE
Nancy Han, DAE
LaJeana Hawkins, DAE
Kevin Heslin, DAE
David Huang, DAE
Geoff Jackson, DHCS
Sarah Lessem, DHIS
Xianfen Li, DAE
Crescent Martin, DHANES
Massey Merediter, Division of Research and Methodology (DRM)
Gwendolyn Mustaf, Office of the Director (OD)
Kathy O’Connor, DHCS
John R. Pleis, DRM
Gindi Renee, DAE
Lauren Rossen, DRM
P. Savia, Office of Planning, Budget, and Legislation (OPBL)
Iris Shimizu, DRM
Merianne Spencer, DAE
Suresh Srinivasan, DHIS
Susan Queen, OPBL
D. Yu, DHANES
Anjel Vahratian, DHIS
Jean Williams, DAE
S. Willson, DRM
General Audience
Greg Binzer, Westat
Chris Bishop, A-Tek, Inc.
Jay Clark, Westat
Jacquie Hogan, Westat
Susan Kinsey, RTI
Michael Link, Abt
List of Abbreviations
ACASI Audio Computer-Assisted Self-Interview
3
API Application Programming Interface
BSC Board of Scientific Counselors
CAPI Computer-Assisted Personal Interviewing
CDC Centers for Disease Control and Prevention
CHC community health center
CMS Centers for Medicare & Medicaid Services
DAE Division of Analysis and Epidemiology
DHANES Division of Health and Nutrition Examination Surveys
DHCS Division of Health Care Statistics
DHIS Division of Health Interview Statistics
DRM Division of Research and Methodology
DVS Division of Vital Statistics
EHR electronic health records
FSRDC Federal Statistical Research Data Center
GREG Generalized Regression
HP Healthy People
MEC Mobile Examination Center
MEPS Medical Expenditure Panel Survey
MMWR Morbidity and Mortality Weekly Report
NAMCS National Ambulatory Medical Care Survey
NCHS National Center for Health Statistics
NDI National Death Index
NHANES National Health and Nutrition Examination Survey
NHCS National Hospital Care Survey
NHIS National Health Interview Survey
NLP natural language processing
NPALS National Post-Acute and Long-Term Care Study
NSFG National Survey of Family Growth
OD Office of the Director
OPBL Office of Planning, Budget, and Legislation
OS Office of Science
PCORI Patient-Centered Outcomes Research Institute
PCORTF Patient-Centered Outcomes Research Trust Fund
PHRDB Population Health Reporting and Dissemination Branch
PSU primary sampling unit
RDC Research Data Center
RFI Request for Information
RSB Reproductive Statistics Branch
SSN Social Security Number
Action Steps
• The BSC voted unanimously to endorse the findings and opinions provided by the Non-Response Bias Workgroup.
4
• The BSC voted unanimously to convene a workgroup for the National Ambulatory Medical Care Survey (NAMCS); Drs. Lumpkin and Copeland volunteered to serve.
• The BSC voted unanimously to form a workgroup on survey design and data presentation; Drs. Holan, Copeland, Hauser, and Peytchev volunteered to serve.
• Future BSC meeting dates for the remainder of 2020 are May 5-6 and September 17-18.
Thursday, January 9, 2020
Presenters
Jennifer H. Madans, Ph.D., Acting Director, NCHS
Paul Sutton, Ph.D., Deputy Director, DVS
Geoffrey Jackson, M.S., Hospital Care Team Lead, Division of Health Care Statistics
Ken Copeland, Ph.D., Co-Chair, Non-Response Bias Workgroup
Andy Peytchev, Ph.D., Co-Chair, Non-Response Bias Workgroup
Lisa Mirel, M.S., Chief, Data Linkage Methodology and Analysis Branch, DAE
Jennifer Parker, Ph.D., Director, Division of Research Methodology
Ryne Paulose, Ph.D., Acting Director, DHANES
Welcome, Introductions, and Call to Order
Linette T. Scott, M.D., M.P.H., Chair, BSC
Dr. Scott called the meeting to order. She asked BSC members to introduce themselves and state
any conflicts of interest. No one reported a conflict of interest.
NCHS Update
Jennifer H. Madans, Ph.D., Acting Director, NCHS
Administrative & Budget Update
The NCHS FY2020 budget ($160M) is the same as it has been since FY2016. NCHS will also
receive about $14 million from CDC’s Public Health and Scientific Services Account.
Furthermore, the FY2020 appropriations bill includes $50M to CDC for Public Health Data
Surveillance/IT Systems Modernization. The Joint Explanatory Statement noted that the multi-
year plan should include “the innovation strategy for surveys conducted by the NCHS,” but it
remains unclear how that requirement may affect NCHS’ budget.
NCHS surveys are funded in part by reimbursable funds (e.g., ranging from 99% for the National
Survey of Family Growth (NSFG) to 29% the National Health Interview Survey (NHIS)). DVS
is funded entirely by NCHS appropriations.
The amount of funding from other sources in FY2020 remains unknown. NCHS has requested
almost $1M from the Patient-Centered Outcomes Research Institute (PCORI) and another $4M
from the Opioid Response Coordinating Unit.
5
NCHS plans to remodel the main floor by adding a wall graphic stating the NCHS mission, a
screen with rotating content, a mural portraying the history of NCHS, and a collage of covers
from previous issues of Health, United States across a map of the United States.
Program Updates
Division of Vital Statistics (DVS)
On January 30, the final 2018 mortality data will be released and will include official maternal
mortality statistics and three related supplemental reports. NCHS plans to conduct an extensive
communications campaign (e.g., public webinars, social media) regarding maternal mortality.
DVS plans to improve the collection of maternal mortality data by reducing errors resulting from
the pregnancy checkbox and encouraging greater cooperation between state vital statistics offices
and state maternal/child health agencies. New information derived from this process should be
incorporated in the vital statistics system.
Division of Health Interview Statistics (DHIS)
DHIS will release preliminary estimates from the 2019 Redesigned NHIS in March or April.
DHIS will introduce an interactive data query system that is expected to include the 2019
estimates by this summer. The full-year data release for the 2019 Redesigned NHIS is planned
for September 2020. NHIS will pilot in-home collection of blood samples and anthropomorphic
measurements from adults. Those results will inform the redesign of the National Health and
Nutrition Examination Survey (NHANES).
Division of Health and Nutrition Examination Survey (DHANES)
Release of the 2017-18 NHANES was delayed to allow for more in-depth evaluation of
nonresponse bias by the BSC workgroup. The workgroup evaluation will be presented later
today. DHANES has finalized the survey weights and will begin to release those data in
February. DHANES has redesigned its website and updated its tutorial. The new operations
branch chief is Jessica Graber.
Division of Health Care Statistics (DHCS)
The 2017 National Hospital Ambulatory Medical Care Survey Emergency Department dataset
was released in November 2019. During late January, RTI International will begin data
collection for the National Post-Acute and Long-Term Care Study (NPALS)—formerly known
as the National Study of Long-Term Care Providers.
Division of Analysis and Epidemiology (DAE)
On October 30, 2019, the Health, US 2018 report was released online and received a lot of
attention. In November 2019, DAE’s first public webinar attracted more than 100 attendees.
NCHS was asked to brief the House and Senate Appropriations Committee’s Labor, Health and
Human Services subcommittee staff.
The release of Healthy People (HP) 2030 is scheduled for March 31. Additional measures and
indicators will be released on a rolling basis through early 2021. Processing of the final data
updates for HP2020 will be completed early this year, with release of the executive summary in
October 2020 and of the final review in late 2021.
6
In 2019, the Data Linkages program released public-use linked mortality files, updated linked
Medicaid, and linked NHUD files. With funding from PCOR Trust Fund, DAE also linked
National Hospital Care Survey (NHCS) data to the National Death Index (NDI) and data from
the Centers for Medicare & Medicaid Services (CMS).
Division of Research and Methodology (DRM)
Census has two pilot projects related to accessing restricted-use data in the Federal Statistical
Research Data Center (FSRDC). The first involves developing a common application portal for
all statistical agencies. The second will assess the feasibility of a virtual data enclave. Census has
closed the Research Data Center (RDC) at its headquarters but opened another one in the District
of Columbia. Two new FSRDCs will open in February 2020. NCHS signed an agreement with
the Environmental Protection Agency to make certain restricted-use data available via the
FSRDC system.
NCHS Publications and Media Exposure
Today marks the 15th anniversary of QuickStats: a single chart with a short, concise caption that
is featured in the Morbidity and Mortality Weekly Report (MMWR). Since its inception in
January 2005, MMWR has published 694 QuickStats by 222 authors, mostly from NCHS staff.
QuickStats gives NCHS more exposure on social media.
NCHS released 94 publications in FY2019 and already has plans for 18 publications in FY2020.
At the International Conference on Health Policy Statistics, NCHS held an invited session on
linked data for evidence-based policy making and presented several posters on maternal
mortality.
Discussion/Reaction by the Board
Discussion focused on suggestions for increasing social media exposure and questions regarding
the RDCs and storage of biospecimens.
One BSC member noted that QuickStats would be well-suited to social media. Dr. Madans
explained that the publisher takes the lead on social media postings, but acknowledged that
NCHS could do additional social media postings that feature QuickStats.
A board member asked about the procedure to access data at the NCHS RDCs. Another board
member noted that the RDC at the University of California, Los Angeles has no data dictionaries
for the NCHS surveys.
One BSC member asked for how long NCHS will store the NHANES biospecimens that have
been collected since 1988. Dr. Madans noted that cost is the main problem with storage. The
stored specimens have allowed researchers to run tests for substances whose importance was not
known when the specimens were collected; thus, NCHS is reluctant to destroy them.
PCOR Trust Fund (PCORTF) Projects Update and Response to BSC Recommendations
Paul Sutton, Ph.D., Deputy Director, DVS
Geoffrey Jackson, M.S., Hospital Care Team Lead, Division of Health Care Statistics
7
Dr. Sutton reviewed the recommendations from the BSC workgroup on engaging researchers to
ensure that mortality data infrastructure improvements align with end-user needs. The first
overarching theme from the workgroup’s summary report focused on generating interest in the
datasets. To this end, the workgroup first recommended that DVS present a suite of products
rather than single, stand-alone reports. Dr. Sutton would like to learn more about the BSC’s ideas
for these products. Second, the data and methods could be made available via webinars and
websites, which aligns with the growing interest within NCHS to conduct webinars. Third, DVS
could provide concrete examples of how specific audiences can use the data. Dr. Sutton
explained that NCHS is redesigning its website and trying to incorporate more audience-specific
guidance. Finally, the workgroup recommended the use of professional societies to help
disseminate information.
Another theme focused on making the methodology readily available. For example, information
regarding state-to-state variation in mortality data quality would be helpful to end users. Dr.
Sutton explained that some of that information is buried in the technical documentation and
acknowledged that DVS may need to feature it more prominently.
The workgroup provided ideas about documenting and sharing the data. Dr. Sutton agreed with
the workgroup’s recommendations to share the code and to transition from PDF to HTML format
for greater accessibility. The workgroup further recommended providing the data in multiple
formats, including application program interface (API) access. Dr. Sutton noted that DVS
provides API access for provisional data but has made fewer advancements for the final data.
CDC Wonder includes an API, but DVS may need to increase awareness of its existence.
Finally, the workgroup provided ideas for enhancing the datasets and tools for easier access.
With respect to CDC Wonder, DVS plans to add NOT as a logical operator for Boolean
statements, include place of occurrence in addition to place of residence, and provide access to
supplemental drug information. DVS also plans to include demographic and geographic data on
the Vital Statistics Rapid Release interface. Regarding the request to document the thresholds
under which provisional state mortality data are suppressed, Dr. Sutton believes that the
information is likely already in the documentation, but DVS should seek ways to make it more
accessible.
Mr. Jackson began with an update about the PCOR projects related to the DHCS. The FY17
project to link NHCS to the NDI and CMS has been completed. The other two projects are
ongoing: FY18 (i.e., enhancing identification of opioid-involved health outcomes using linked
hospital care and mortality data) and FY19 (i.e., identifying co-occurring mental health disorders
among opioid users using linked hospital and mortality data).
The FY18 and F19 projects are being developed concurrently with a joint methodology. DHCS
is currently in Stage 3 (“annotation process”). Stage 1 involved case definition (i.e., eligible
opioids for FY18 and substance use mental health disorders for FY19). In Stage 2, DHCS
selected the relevant opioid codes from a comprehensive list compiled from a broad range of
sources. Stage 3 (i.e., annotation) uses human-annotated data to train a natural language
processing (NLP) classifier. DHCS will develop an enhanced algorithm in Stage 4 and will
8
perform a validation study (fielded in late 2020) that will use actual hospital records to validate
the algorithm in Stage 5. During a Drug Workgroup meeting on January 15, DHCS will present
the abstraction form that will be used in the validation study and will ask for feedback.
Using funding from PCOR, DHCS has created a variety of linked files accessible through the
RDCs. DHCS is planning additional PCOR products.
In response to the aforementioned BSC workgroup recommendations, DHCS has performed a lot
of outreach in recent months to generate interest in the data and to educate end users. In late
2019, DHCS held a workshop at the University of Kentucky to teach users how to apply for
RDC access. DHCS had a poster at the Data Linkage Data Day (10/19/2019) and a poster and
presentation at the International Conference on Health Policy Statistics (January 6-8, 2020).
DHCS will hold a breakout session at the 2020 National Prescription Drug Abuse and Heroin
Summit. Abstracts have been submitted to the 6th International Conference on Establishment
surveys. DHCS is also exploring making the code publicly available via GitHub.
Discussion/Reaction by the Board
Referring to the BSC’s recommendation about a suite of products, Dr. Scott explained that the
previous discussion had identified all the products related to a particular topic (e.g., suicide) from
various data sources should be grouped together, including the different communication
modalities (web, print, fast facts, etc.). One participant asked whether the NLP classifier is
probabilistic and whether there is something systematic about the cases that cannot be classified.
Another BSC member suggested that a user experience survey of the RDCs might help NCHS
better understand users’ concerns.
Non-Response Bias (NRB) Workgroup Report
Ken Copeland, Ph.D., Co-Chair, Non-Response Bias Workgroup
Andy Peytchev, Ph.D., Co-Chair, Non-Response Bias Workgroup
Drs. Copeland and Peytchev co-chaired this workgroup, while Drs. Holan and Hauser served as
members. The purposes of the in-person meeting on October 21, 2019, were to understand the
non-response bias issue on NHIS and NHANES and comment on NCHS’ efforts to address it. A
conference call on November 13 was to comment on additional work for NHANES after the in-
person meeting.
NHIS Non-Response Bias Analysis
DHIS wanted to quantify the extent of nonresponse bias in the redesigned NHIS and to
determine the optimal weighting method to reduce that bias, taking advantage of improvements
in the availability of auxiliary data and paradata, machine learning methods, and other advanced
statistical models. NCHS awarded a contract to ICF to investigate the problem of non-response
bias, evaluate the tradeoffs (e.g., a reduction in nonresponse bias may increase the complexity of
application and replication), and make recommendations regarding weighting methods. For
nonresponse prediction, ICF considered two types of traditional logistic regression models and
two types of machine learning methods, but narrowed the candidate pool to one model of each
type: multilevel logistic model and the random forest method.
9
Dr. Copeland then summarized the seven main findings from the workgroup:
1. Endogenous variables (e.g., reasons for nonresponse) should be omitted from the model, which may reduce the variance and improve bias adjustment.
2. The methodology should be easy to implement, and thus NCHS should evaluate the performance of the single-level logistic regression approach.
3. Final weighting could benefit from additional calibration using a geographic variable and cross-tabulation of sex with age and with race/ethnicity.
4. NCHS could draw repeated samples for the first quarter 2019 NHIS sample to assess alternative models.
5. The NHIS weighting methodology should retain person-level nonresponse adjustment. 6. Rather than using the inverse of individual response propensities, DHIS should consider
using the inverse of the mean response propensity within each propensity stratum, which
may reduce the loss in precision resulting from the adjustment.
7. The set of predictors for the nonresponse model should be reconsidered annually. Model parameters can be updated quarterly and finalized when the data are complete.
NHANES Non-Response Bias Analysis
According to Dr. Peytchev, the problem for the NHANES is more an issue of sampling variance
than nonresponse bias. In the 2017-18 cycle, there was an unexpected change in some survey
estimates (e.g., obesity among non-Hispanic white men aged 20 and older increased from 38% in
2015-16 to 48% in 2017-18). Such a drastic change (10 percentage points) in population
prevalence between two consecutive cycles is unexpected. Changes to the sampling design did
not appear to explain the increase because they had been implemented in the prior cycle and thus
the problem would have shown then. Nor did nonresponse bias seem to be a credible
explanation, because, although NHANES response rates declined, they had done so in similar
magnitude in previous cycles as well, and the difference between the estimates from NHANES
versus NHIS was much larger than in previous years. Finally, DHANES investigated whether the
result may be a consequence of the clustered sample design. Comparing the primary sampling
units (PSUs) selected for NHANES with the overall stratum means based on data from the
Behavioral Risk Factor Surveillance System, DHANES found that that the 2017-18 NHANES
sample selected PSUs that were skewed toward higher obesity. To address the problem,
DHANES evaluated two alternative survey weights: (1) sequential addition of raking (i.e.,
adjusting weights for the sample based on known population characteristics) to education and
urbanization; and (2) adjustment for income at the PSU-level using generalized regression
(GREG).
Dr. Peytchev then summarized three findings from the October workgroup meeting:
1. The PSU-level GREG adjustment is warranted but is insufficient and has an undesirable impact on the variance estimates.
2. Additional calibration variables at the household or person level (e.g., health insurance status) should be considered. For example, the current 3-dimensional post-stratification
method could be replaced with multiple 2-way cross-classified population distributions.
3. Changes in key estimates (e.g., BMI) should not be used to select the calibration variables; rather, they should be used only to validate the weighting adjustments.
10
To further examine the topic of weighting adjustments, the workgroup requested additional
analyses from DHANES. So DHANES tested three alternative weights. The first reduced the
dimensionality of the cross-classified adjustment variables. The second augmented the GREG
adjustment with additional variables, but the models could not be estimated. The third used
income at the census tract level in place of the GREG adjustment. During a follow-up call
(November 13, 2019) to discuss the additional analyses, the workgroup reached consensus on the
following opinions:
1. The third set of weights are the most suitable because they control for income without requiring a change in methodology, impose a smaller loss in precision, and are not
subject to bias resulting from differences in how income was measured.
2. These new weights could be applied to prior NHANES cycles. 3. Additional adjustments may be beneficial in future years.
Discussion/Reaction by the Board
Themes covered during the discussion included the problem of distinguishing between the
effects of the NHIS redesign and changes to the weighting methodology, the use of education
and income for the purposes of raking, and how the sampling variability problem in NHANES
could be detected and avoided in the future.
Because the NHIS redesign and changes to the weighting methodology are occurring at the same
time, one BSC member questioned how DHIS will distinguish between the two effects. DHIS
will attempt to partition the changes into those resulting from questionnaire changes (e.g., using
the bridge sample) vs. those resulting from changes in weighting (e.g., comparing results
weighted using the new versus old methods).
An Dr. Olson asked why education was not used as a raking variable along with sex, age, and
race on NHIS. Dr. Blumberg explained that education was included. The workgroup was
referring to variables that should be cross-classified, but concluded that education should not be
cross-classified with other variables. NHANES uses education for post-stratification, but the
problem there is income, which suffers from missing data and measurement error. In addition,
the income questions differ between NHANES and the American Community Survey.
A BSC member questioned how DHANES might detect similar problems in the absence of such
a glaring anomaly. DHANES could identify problems of PSU selection when the sample is
drawn, but the direct solution would be a less clustered sample.
In response to the specific finding for NHIS, Dr. Blumberg noted that DHIS dropped the
endogenous variables. DHIS prefers the logistic regression approach, but it has not yet explored
the benefits of a single-level model compared with a multi-level model. DHIS is implementing
cross-classification and will retain the person-level response adjustment. He endorsed the ideas
of drawing repeated samples and using the inverse of the mean response propensity.
Actions
Dr. Scott called for a vote regarding whether the BSC supports advancing the findings provided
by the workgroup as recommendations from the Board. The vote was unanimous in support.
11
New Data Linkage Algorithm
Lisa Mirel, M.S., Chief, Data Linkage Methodology and Analysis Branch, DAE
Ms. Mirel explained that the NDI algorithm was initially calibrated using the NHANES I
Epidemiologic Follow-up Study. Some modifications were later made to that algorithm (e.g., to
accommodate collection of only 4-digits of the social security number (SSN)).
The new Enhanced Linkage Approach first makes a deterministic match to the NDI using the
SSN and other identifiers. This dataset is treated as the “truth deck” for evaluating linkage errors.
Next, DAE uses a probabilistic method to match the survey participant with all likely matches in
the NDI based on identifiers other than the SSN. For each likely match, DAE computes a
probability that the survey and NDI records represent the same person. For survey records linked
with multiple NDI records, DAE selects the match with the highest probability of being a match.
Comparison of the New versus Old Linkage Algorithms
In comparing the enhanced algorithm linking NHIS and NHANES with the NDI to the old
linkage algorithm, DAE found some differences in vital status, i.e., whether an individual was
deceased. Based on the new algorithm, comparisons with the “truth deck” revealed low levels of
Type I (false positives, 1%) and Type II (false negatives, 2%) errors across both surveys. For
NHIS, vital status differed for 1.5% of survey participants, but it was higher in 1986-2006 (1.6%,
when the 9-digit SSN was collected) than in 2007-13 (0.7%, when only 4 digits from the SSN
were obtained). For NHANES 1999-2014, vital status differed between the two algorithms for
0.6% of survey participants
DAE further investigated the cases classified as deceased in the old algorithm but alive in the
new algorithm. Most (93%) had incomplete identifiers or many of the identifiers did not match
(were coded as class 3 or 4).
DAE also evaluated the effect of changes in the linkage algorithm on inference. Using survival
models that included sex, age, race/ethnicity, education, marital status, and region as predictors,
DAE compared results using the old versus the new algorithms. Differences in the estimated
hazard rates for all-cause and cause-specific mortality were no greater than 5% except for among
Hispanics, in which the difference was greater than 10%.
Validation of the New Algorithm
Lacking a “gold standard” for assessment of the NCHS mortality linkage, DAE compared results
with survey-reported death information from the Medical Expenditure Panel Survey (MEPS),
which follows survey participants for five rounds over a 2-year period. Agreement between
MEPS and the new algorithm was generally greater than 90% in all years, but agreement with
the old algorithm declined from 94% in 1996 to 84% in 2005 to just above 75% in 2011. DAE
also compared the results with the discharge status recorded in the NHCS. The new algorithm
captured 97% of the deaths recorded in the NHCS, whereas the old algorithm captured only 93%
of those deaths.
12
Conclusions
Concordance between the two algorithms is high (~94%). The differences are mostly participants
who were previously identified as deceased rather than newly identified deaths. The new
algorithm aligns better with external validation sources, and DAE hopes that it will improve
shortcomings with selected demographic groups. In January 2020, NCHS will begin to use the
new algorithm to link data from household surveys to the 2018 NDI. NCHS seeks advice from
the BSC about how to communicate the changes to users.
Discussion/Reaction by the Board
The discussion focused on communicating the change to users and alternative sources for
validating the linkage.
Comments from the BSC stressed the importance of explaining the change so that mid-level
users can understand how it may affect their results. NCHS should explain to users that they
must document which version of the data was used for their analyses.
Dr. Hauser asked whether NCHS uses non-federal data such as the Health and Retirement Study
for validation. Currently NCHS links only to its own surveys.
Dr. Madans concluded that NCHS plans to move forward with this change and believes that it
will not cause differences in substantive findings. The new files will replace the files that were
linked with the old algorithm.
Data Presentation Standards for Rates: Vital Statistics
Jennifer Parker, Ph.D., Director, Division of Research Methodology
Dr. Parker provided some background information about the criteria for data suppression. In
2002, CDC published an entire report that explained the criteria for data suppression in HP. In
May 2013, NCHS formed a workgroup on data presentation standards—to focus on proportions,
which is the most common statistics produced by NCHS. The NCHS Data Presentation
Standards for Proportions was released in August 2017. In 2018, NCHS formed a second
workgroup on data presentation standards for rates. NCHS commonly computes crude rates, age-
specific rates, and age-adjusted rates (based on a “standard US population”).
Dr. Parker reviewed the four types of rates estimated by NCHS:
1. Random numerators and fixed denominators from vital statistics (e.g., death rates); 2. Random numerators and fixed denominators from surveys; 3. Both numerators and denominators are random (e.g., subnational/subgroup rates from
vital statistics when the population denominator is based on a survey estimate); and
4. Infant mortality rates.
The workgroup decided to first focus on the first type of rates. Under the current presentation
guidelines, crude- and age-specific rates are deemed unreliable and suppressed if there are fewer
than 20 events. Similarly, age-adjusted rates are suppressed if the number of events across all
ages is fewer than 20. The workgroup proposed lowering those thresholds to fewer than 10
events, which is the same as the cutoff for confidentiality.
13
One concern is that lowering the threshold might increase random variation that might be
misinterpreted as “real” change. Because of the nature of a Poisson distribution, less variation
exists around the threshold of 10 than at 20 because as the mean number of events increases, so
does the variance. Consequently, DRM does not believe that reducing the threshold for these
types of rates will cause more erratic estimates.
DRM recommends lowering the presentation threshold from 20 events to 10 events for crude,
age-specific, and age-adjusted rates based on vital statistics with a random numerator and fixed
denominator. In the future, DRM plans to tackle similar rates based on survey data, which are
complicated by survey design effects and potentially multiple visits per person.
Discussion/Reaction by the Board
Issues raised during the discussion included different reasons for suppressing data and the
possibility of overdispersion.
Data are suppressed for two reasons: to protect confidentiality and ensure reliability. One BSC
member noted that CMS uses a threshold of 10 or fewer events (rather than fewer than 10) and
recommended using the same threshold for consistency. Another board member explained that
under the current NCHS standards, estimates for which the relative standard error is between 20
and 30 are not suppressed, but labeled as potentially unstable. DRM opted not to use a tiered
system because NCHS prefers a conservative standard for general publications. As long as a user
does not violate confidentiality, they can use the data to compute their own estimates.
Another BSC member noted that DRM assumes the data follow a pure Poisson distribution, but
wondered whether DRM has considered the possibility of overdispersion.
NHANES: Planning for the Future
Ryne Paulose, Ph.D., Acting Director, DHANES
NHANES is considered a gold standard among national surveys and has made extensive,
longstanding contributions to the health of Americans. Some might ask “Why change it?”
Because the contract for NHANES ends after the 2021-22 cycle, which presents an opportunity
to consider data collection innovations. DHANES would like to address limitations that include a
highly clustered sample, small sample size, substantial time burden on the respondent, and heavy
dependence on external funding.
NCHS issued a Request for Information (RFI) in August 2017 to solicit innovative ideas for the
NHANES redesign. DHANES has concluded that the number of PSUs should be increased to
reduce clustering and increase effective sample size. DHANES also considered replacing Mobile
Exam Centers (MECs) with mobile vans (which could travel to more PSUs) or fixed clinics
(which could remain open for longer periods), but decided to continue using the MECs to
preserve data quality and standardization. However, DHANES could reduce the four-trailer
MECs to two trailers or use self-motorized trailers that are less dependent on hookups to utilities.
Some disadvantages of the self-motorized trailers relative to the current MECs are reduced
14
space, increased complexity with more exam teams working concurrently, greater costs from
more trailers, and noise from generators that may be disruptive.
For NHANES 2023, DHANES wants to define core content to be collected continuously every
year and funded primarily by NCHS, while continuing supplemental exams, although fewer in
number to reduce the burden on participants and dependence on external funding. Many issues
remain to be determined, such as whether to oversample by age or race/ethnicity subgroups, the
number of PSUs to sample, the sample size for exams, whether to aim for a 1- or 2-year sample,
and whether to use a split sample for selected supplemental exams.
The potential design currently under consideration would increase the number of PSUs to 60 but
would continue the 2-year design (i.e., about 10,000 participants over the 2-year cycle). Three or
four teams would collect data each year using self-motorized trailers for the examinations. This
design is advantageous because it would de-cluster sample, reduce burden, increase budget
control, and increase flexibility for doing supplemental exams. However, the design does not
directly address response rates, could affect trend analyses, and would require a reduction in
exam content.
Discussion/Reaction by the Board
The discussion highlighted the need to consider innovative ideas and the short time frame for
decision making around redesign.
The BSC suggested the following innovative ideas: creating a state-level NHANES, using a
network of physicians or other clinics to collect data for NHANES, taking advantage of
nongovernmental resources, and enlisting design schools to redesign the MEC. Dr. Paulose
explained that standardization across clinics becomes problematic when contracting others to
conduct the exam. The information collected by NHIS about in-home exams and phlebotomy
will be useful to NHANES as well.
Dr. Madans stressed that NCHS must make redesign decisions within a very short time frame.
To change the 2023 NHANES, NCHS must submit plans to the Office of Acquisition by the end
of 2020. Alternative options are for NHANES to take a 1-year hiatus from the field or do a
bridge survey. The critical issue for NHANES is sampling rather than design. NHANES must
provide national-level estimates, but for which subgroups does NHANES want to continue to
provide estimates? That answer will determine sample size and cost. NCHS must also be
cautious about introducing more variability, which would reduce confidence in the estimates.
The meeting adjourned for the day at 5:10 p.m.
Friday, January 10, 2020
Presenters
Anjani Chandra, Ph.D., NSFG Team Lead, DVS
Denys Lau, Ph.D., Director, Division of Health Care Statistics
Carol DeFrances, Ph.D., Deputy Director, Division of Health Care Statistics
15
Call to Order
Linette T. Scott, M.D., M.P.H., Chair, BSC
Dr. Scott opened day two of the meeting.
NSFG: Planning for the Future
Anjani Chandra, Ph.D., NSFG Team Lead, DVS
At inception, the NSFG’s core purpose was to explain variations in birth rates using
intermediate/proximate determinants of fertility. Each survey provides a nationally
representative, cross-section of the U.S. reproductive-age population (females only prior to
2002). Cycle 1, which began in 1973, sampled ever-married women aged 15-44. The following
events have occurred since then:
• never-married women were included in 1982;
• the NSFG was linked to the NHIS sampling frame in 1988 and 1995;
• survey collection converted to Computer-Assisted Personal Interviewing (CAPI) and Audio Computer-Assisted Self-Interview (ACASI) and began to use incentives in 1995;
• the ACASI was expanded and men were included in 2002;
• the survey transitioned to a continuous fieldwork design in 2006; and
• and the age range was expanded to 15-49 in 2015.
The response rate has declined from greater than 90% in 1973 to ~65% in 2015-17. As the
NSFG’s purposes evolved over time to extend beyond proximate determinants of fertility, the
questionnaires have become longer and more complex.
NSFG Fieldwork Since 2006 (when the continuous fieldwork design began)
The NSFG uses an in-person screening interview to select one respondent per household for the
main CAPI interview; the most sensitive content is self-administered via ACASI. A responsive
fieldwork design (e.g., within each quarter of the fieldwork for a given year, the last 2 weeks
focuses on nonresponse follow-up and incentives are increased) allows for reasonable cost
control while optimizing sample response rates. In December 2018, the public-use data for 2015-
17 were released with additional data available via the RDCs. During fall 2020, NCHS plans to
release public-use and RDC files for 2017-19.
Plans and Development Work for the Upcoming NSFG
To date, the development work has included ongoing consultations with co-sponsors as well as
subject matter and survey methodology experts within and outside NCHS. In 2018, NCHS
hosted an expert workgroup with survey methodologists to explore the use of alternative survey
modes including the internet. During summer 2019, NCHS issued a RFI, requesting vendors to
consider the use of other survey modes, administrative data linkages to reduce response burden
and enhance the analytic potential of the NSFG, the addition of biomarker collection, and
strategies for reducing disclosure risk. Four vendors were invited to present at NCHS, and their
insights have been incorporated into future plans for NSFG.
16
Baseline Survey Plan
The NSFG plans not only to continue in-person interviews with a self-administered mode for the
most sensitive items, but also to streamline the content by identifying core items that can only be
provided by the NSFG and by determining the minimum detail required to measure and track
key variables over time. An advisory workshop is planned for spring 2020 to solicit other ideas
from subject-matter experts. Finally, the NSFG will conduct various pilot tests (e.g., alternative
survey modes, comparability of estimates across survey modes).
Beyond the Baseline Plan
Other ideas under consideration would require additional funding:
1. Interviewing two respondents per household would be a cost-effective way to bolster sample size and to allow the NSFG to sample couples and/or parent/child dyads, but may
increase risk of disclosure and could diminish precision (given the same sample size);
2. Increasing sample size could improve statistical power, allow more design experiments, and permit more frequent release of the public-use data, but would increase costs;
3. Linking individual records to administrative databases (e.g., Medicaid) could enhance the dataset, but may result in consent bias and reduce response rates; and
4. Collecting biomarkers in home could replace direct measures or supplement self-reports, but would increase cost, impose reporting requirements, and require regulatory clearance.
It could also cause consent bias and reduce response rates.
Discussion/Reaction by the Board
The discussion focused on survey mode, sampling two respondents per household, record
linkage, and the complicated nature of the survey.
Several BSC members questioned whether in-person interviews are the best mode for collecting
these data. Young people may be more comfortable with other survey modes that seem more
anonymous. Some millennials would rather talk with a computer algorithm than with a person.
Sampling two respondents per household may make it difficult to obtain independent interviews.
The choice of design (both parents or parent-child dyad) is complicated. Interviews with
teenagers would require parental permission. A major problem is disclosure; data from one of the
respondents could probably never be included in the public-use data.
Record linkage would expand the available information, but the NSFG does not currently collect
any identifiers. In response to a question about the funding required for linkage, Dr. Madans
explained that linkage itself would not entail a large cost except for staff time to evaluate and
document the linkage process. However, collecting identifiers and obtaining consent would incur
additional costs. Linkage with birth records is not be possible because birth records do not
include identifiers. Linkage with the NDI would yield few deaths in a sample that is so young.
The main linkage would be with Medicaid, but Dr. Madans questioned whether the necessary
data collection would be worthwhile, especially because the survey is already complicated.
Securing accurate timing of events requires detailed life history questions. One BSC member
asked whether respondents’ cognition affects their ability to participate in such a complicated
survey. Staff explained that the interviewer first determines whether the respondent is competent
17
to complete the survey and afterwards provides observations regarding the respondent’s ability to
answer the questions. The BSC member expressed concern that screening out respondents based
on perceived competence may introduce bias and that the complex survey may cause
respondents to terminate the interview prematurely. Dr. Madans noted that cognitive function is
more salient for older samples but can be an issue for the NSFG.
NAMCS: Planning for the Future
Denys Lau, Ph.D., Director, Division of Health Care Statistics
Carol DeFrances, Ph.D., Deputy Director, Division of Health Care Statistics
Dr. Lau explained that unlike most other NCHS surveys, the respondents to DHCS surveys are
health care providers. DHCS conducts a family of surveys, but this presentation focuses on
NAMCS, which covers the provision and use of ambulatory medical care services. NAMCS
began in 1973. The survey includes cross-sectional samples of ~3,000 office-based physicians
and ~104 community health centers (CHCs). In addition to the physician and CHC facility
induction interviews, the survey obtains patient visit data from a randomly selected calendar
week, including up to 30 patient encounters with each sampled physician and each of three
randomly selected CHC providers. Electronic health records (EHRs) data were collected for 2
years (2016-17).
NAMCS is valuable because it yields a nationally representative sample of physicians and
CHCs, collects visit-level data directly from medical records, and includes various clinical
elements and provider characteristics. It is an important source of trends in care utilization and
practice over time and serves as benchmarks for Healthy People (HP) objectives and national,
regional, and state-specific estimates. Unlike the American Medical Association physician
surveys, which include only self-reports, the NAMCS physician survey includes clinical data. It
also provides EHR data for a nationally representative sample that is not limited to particular
vendors or Medicare enrollees.
Dr. DeFrances highlighted changes in health care systems since NAMCS began. The settings for
ambulatory services have changed (e.g., urgent care centers, retail health clinics). Providers have
also changed (i.e., nurse practitioners and physician assistants play a more prominent role,
particularly in rural areas). Physician offices are more complex and often owned by hospitals or
conglomerates. Finally, care is no longer provided only in-person (e.g., telemedicine).
As a consequence, the data have changed. Increased reporting requirements have complicated
efforts to recruit participants into NAMCS, which is voluntary. EHR adoption has affected how
data are stored and collected and has widened range of stakeholders (e.g., EHR vendors, health
IT staff). Increasing concerns about data security and confidentiality results in increased
interaction between survey staff and privacy officers and legal departments.
Planning for the Future
DHCS has several plans for the 2020-21 NAMCS. First, induction interviews will be streamlined
to collect only the minimally necessary information, and DHCS will create promotional videos to
encourage participation. Second, NAMCS will explore both monetary and non-monetary
incentives. Third, NAMCS will target large conglomerates to enhance recruitment. Finally,
18
DHCS will explore alternative sources for sampling physicians and new strategies to maintain
sample frames.
Other questions remain to be answered such as: should ambulatory care be redefined to include
different methods of care delivery as well as other providers and settings. DHCS is considering
whether to reintroduce EHR data collection. NAMCS may need to reconsider whether to sample
individual providers, health care groups, or EHR vendors; employ other databases; collect data
for a full calendar year, quarters, or months rather than 1 week; include overlapping, multi-year
panels of respondents; and collect identifiers to enable data linkage.
Dr. DeFrances closed by presenting a long list of questions for discussion. She asked about the
level of interest in forming a workgroup to discuss the future of NAMCS.
Discussion/Reaction by the Board
The discussion focused on the value of NAMCS, the definition of ambulatory care,
providers/facilities to be sampled, and the inclusion of EHR data.
One BSC member noted that the most important (albeit difficult to answer) question is whether
NAMCS is still needed. For all NCHS surveys, it is important to consider what information is
obtained that cannot be harvested via other means. Dr. Scott noted that although a large amount
of data can be obtained from other databases (e.g., CMS), those sources do not capture the
provider’s full patient panel. Within NCHS, NPALS has the most experience integrating blended
data, but NAMCS is more complicated because it involves more sources and more duplication
across different databases.
Regarding the definition of ambulatory care, the BSC agreed on the importance of including
nurse practitioners, physician assistants, and other forms of delivery (e.g., telemedicine,
videoconferencing). NAMCS should consider other sources that can provide a more complete
sampling frame for providers. The full range of care should be measured, and the ways in which
care is changing should be documented. It is not necessary to collect the same type of data from
all providers.
Because most providers now use EHRs, NAMCS should probably collect EHRs, but one BSC
member worried about potential bias, which may be similar to the issue faced with random digit-
dialing as people shifted from landlines to cell phones. Several participants suggested exploring
partnerships (e.g., health information exchanges, Digital Bridge).
Actions
The BSC voted on whether to convene a workgroup focused on the future of NAMCS including
consideration of whether NAMCS is still needed. Support among the Board was unanimous. Drs.
Lumpkin and Copeland volunteered for the workgroup. BSC members should contact NCHS
with suggestions for external experts to serve on this workgroup. One BSC member suggested
David Mechanic and David Kendig.
19
NCHS Data Modernization
This portion of the meeting was an open discussion of the Board led by Dr. Scott. Dr. Scott
suggested a focus on two of the four topics outlined by Dr. Madans at the September BSC
meeting: Next Generation Survey and Data Systems; and Data Integration, Linkage, and Data
Science.
The discussion spent time on four subjects:
1. NCHS Data Modernization 2. Next Generation Survey and Data Systems 3. Data Integration, Linkage, and Data Science 4. Independence of NCHS as Federal Statistical Agency versus Partnership Needs
During the discussion, Dr. Scott captured themes from the workgroup members on slides. These
themes reflect the discussion but were not refined to be specific recommendations. The
expectation is that the themes will be revisited by NCHS and the Board.
. For the first topic, Dr. Scott asked, “What is our ideal vision for NCHS 5 years in the future?”
In response to this question, the following themes were discussed.
• Five years out – what is the ideal state for NCHS to work towards o Maintain high level of confidence in data and reports produced by NCHS o How to adjust in the context of flat funding o Modernization and Interoperability of Systems and impact on NCHS o Keep end use in mind in designing collection o Make-up of the Board – Specialties to have represented
• Data science / Computer science with expertise in large data use • New players that are integrating data and exchanging data in innovative
ways
Next Generation Survey and Data Systems
In September, Dr. Madans highlighted the following on a slide in her presentation.
• Utilizing electronic health records for healthcare statistics
• Next generation population health surveys
• Vital Statistics – quality to match improved speed
The Board identified the following themes in their discussion.
• What is it we need to know and why do we need to know it o NCHS is critical to supporting the public health and health care understanding o Identifying the unique value add of NCHS o Strip surveys to what is unique and then complement with data otherwise
collected
o What do we know about data use and trends in data use? o Consider focus on interoperability and use of what exists and validating those
sources
20
• Response Rates o What alternative means may substitute for consumer response surveys?
• Resource Requirements / Cost
• Societal changes in the interaction with information
• What would be leap frog changes to approaches to understanding the populations could be used
• Consider sample panels for experimentation to assist with innovating over time
• Balance of historical trending with transitions to new processes
• Partnering with others, such as NRF, regarding research nodes to test specific methods
• NCHS Strategic Plan o Does a plan exist? o What health related policies are be unable to answer and what might arise? o Need to be nimbly responsive to rapidly changing environment o What is the right combination of data collection (survey, administrative)?
Data Integration, Linkage, and Data Science
The BSC recognized that NCHS must maximize interoperability through partnerships with the
states. The BSC also noted that NCHS should determine whether it is possible or even feasible to
obtain the necessary identifiers for data linkage. Most of the remaining discussion focused on
balancing NCHS’ competing roles as a data collector versus an analytic and methodologic
leader. Methodological reports from NCHS carry a lot of credibility, but data collection accounts
for most of the budget and no dedicated funding for methodologic work exists. A Board member
suggested that methodologic work might be incorporated into the redesign process. Dr. Madans
responded that using of part of the sample for methodologic testing could compromise NCHS’
ability to perform subgroup analysis. One BSC member argued that model-based statistics
should play a larger role to help mitigate the losses in efficiency from reduced response rates.
In September, Dr. Madans highlighted the following on a slide in her presentation.
• Expand the NCHS Data Linkage and Integration Program
• Data science methods for health statistics
The Board identified the following themes in their discussion.
• Interoperability o Partnerships with states in which states do linkage and then share with federal
partners o Consider health data system with more in depth data across states and federal
partners
• Linkage o Need to address data gaps in identifiers necessary to link
• Vital Statistics – Role of NCHS in vital statistics versus state role with statistics and civil registration
o Look for other ways of partnering required activities with the function of NCHS data collection/processing (Ex. VBP, quality reporting, etc)
• Role of data collector vs analytic reporting and methodologic leader o What is the lead focus/role for NCHS o The value of methodologic reports from NCHS that have credibility weight
21
• Example of a to be: NHIS and health insurance coverage – NHIS was able to do early releases and stepped into the lead entity and the “first place to look” for health insurance
coverage (other examples VR and opioid epidemic or flu surveillance)
• Funding for methodologic work that extends the bench to academic/industry to support the role of NCHS as a methodologic leader
• Future in model-based statistics (data integration and opportunities to leverage/dependence) (model to area/geographic level) (potential unit level model has
distribution assumptions) o Need expertise in specific topics to help check models
• Recommendation to have increased focus on methodologic work, recognizing trade-off in data collection
• Redesign generally needs set aside funds for innovation
• Parallel methodologic work with redesign process and ongoing data collection to assist in testing during process
• Recognition of outstanding work being done
Independence of NCHS as a Federal Statistical Agency versus Partnership Needs
The BSC expressed support for partnerships, recognizing that innovation may be impossible
without external funding. When NCHS data are necessary for programmatic work, CDC
leadership should help to re-enforce sharing of funds with NCHS. Budgets for researchers who
are funded for a project that requires NCHS data should include funding to NCHS for the data.
The Board identified the following themes in their discussion.
• Recommend strategic planning process / continuous quality improvement process to think about how activities continue to build toward goals
• Support for peer review process and publishing/making it known that it occurs
• Fellowship programs are good
• State partnership related to oversample or linkages
• Connections to mandatory reporting needs to potentially add funding for NCHS
• CDC leadership help re-enforce sharing of funds with NCHS when NCHS data is necessary for programmatic work
• Support for multi-partnerships and funders
BSC Wrap-Up
Linette T. Scott, M.D., M.P.H.
Jennifer Madans, Ph.D.
Dr. Scott thanked everyone for their participation. Regarding the new linkage algorithm, new
presentations standards for rates, and future plans for three specific surveys, Dr. Madans asked
the BSC members about their desired involvement in the decision-making process. One board
member stated that it is easier for the BSC to react to a specific issue because it does not have the
context to understand whether a particular design feature will be seriously considered. If a quick
response is needed, perhaps NCHS could schedule a call 2 weeks in advance for whomever can
participate. Dr. Uddin stated that all BSC input must be publicly available. The BSC could host a
virtual meeting similar to that in 2019 if further discussion of this issue is needed. In contrast,
workgroup deliberations are not open to the public. Although NCHS does not want to
22
overburden survey research experts with multiple concurrent workgroups, some decisions must
be made before the September 2020 BSC meeting. The group decided that the most efficient
option would be to convene a workgroup on survey design and data presentation.
Actions
The BSC voted on creation of a workgroup on survey design and data presentation. Support for
workgroup creation was unanimous. Drs. Holan, Copeland, Hauser, and Peytchev volunteered to
serve on the workgroup. NCHS staff will send further communications regarding specific
questions to the BSC via e-mail with a clear indication of the timeframe to submit feedback.
BSC members should notify Dr. Madans if other issues arise from the presentations that deserve
more attention.
Public Comment
There was no public comment.
The meeting was adjourned at 12:30 pm.
To the best of my knowledge, the foregoing summary of minutes is accurate and complete.
__________/s/___________________________ ___________5/17/2020_______
Linette T. Scott, M.D., M.P.H. DATE
Chair, BSC
Structure BookmarksAll official NCHS BSC documents are posted on the BSC website (