Evaluating the feasibility of using
administrative data in the context
of census population statistics
Kimberley Brett, Office for National Statistics
Census Transformation Programme
Beyond 2021 Research and Design
Overview
• Use of administrative data for population statistics
requires careful consideration of quality issues
- Coverage
- Lags
- Definitions
- Policy
- Administrative and operational process
• Outline how we are developing a framework for
evaluating and improving quality of administrative
data used in SPD construction
• Examples of quality issues in our analysis
Admin Source Reports Beyond 2011
• During Phase 1, Beyond 2011 published a series of
reports on the following data sources:
• Patient Register (PR)
• Customer Information System (CIS)
• Higher Education Statistics Agency (HESA)-Student Records
• English School Census and Welsh School Census
• Electoral Register
• Evaluation framework that focused on each source
separately:
• Coverage
• Plausibility
• Metadata
• Understanding administrative process
Assessing Coverage – Comparison with Census Estimates
• Patient Register 2011
• Customer Information System 2011
Assessing Coverage – Comparison with Census Estimates
• School Census
• Electoral Register
Developing the quality framework
• Our approach is to combine administrative sources to estimate the
population and its characteristics
• Need to evolve the quality framework to identify and improve on
specific issues emerging in the SPD analysis
• Stability of the admin data
• Source conflicts
• Accounting for missing groups
• Established a Data Suppliers Group with government departments
supplying the data:
• Acquisition of additional data to supplement SPD
• Feedback on the quality issues identified in our analysis
• Try to influence how data is collected and maintained for statistical
purposes
• UKSA quality toolkit provides us with a framework to work towards
SPD 5 population counts compared to
the 2011 Census
• 91% of LA total population counts within 3.8% of
Census estimate in 2011
[Figure legend: admin data method lower than 2011 Census;
admin data method higher than 2011 Census]
SPD 5 population counts compared to
the 2011 Census
• 20-24 year old males
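A minimal sketch of how a headline figure such as "91% of LAs within 3.8%" could be computed. The function name, the area codes and all counts below are made up for illustration; this is not the ONS method, only the percentage-difference arithmetic behind that kind of comparison.

```python
def share_within_tolerance(spd_counts, census_counts, tolerance_pct):
    """Share (0 to 1) of areas whose SPD count lies within
    tolerance_pct percent of the census estimate."""
    within = 0
    compared = 0
    for area, census in census_counts.items():
        # Skip areas that cannot be compared.
        if area not in spd_counts or census == 0:
            continue
        compared += 1
        pct_diff = 100.0 * (spd_counts[area] - census) / census
        if abs(pct_diff) <= tolerance_pct:
            within += 1
    return within / compared if compared else float("nan")


# Hypothetical local authorities with invented counts.
census = {"LA_A": 100_000, "LA_B": 50_000, "LA_C": 200_000}
spd = {"LA_A": 102_000, "LA_B": 46_000, "LA_C": 199_000}

share = share_within_tolerance(spd, census, tolerance_pct=3.8)
print(f"{share:.0%} of areas within 3.8% of the census estimate")
```

The same calculation can be repeated for a single age and sex group (for example 20-24 year old males) by passing counts restricted to that group.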
Using linked data to evaluate quality
• Linking data across multiple administrative sources and
Census data provides valuable insight to explain differences
between SPD counts and Official Estimates
• Can measure the accuracy
• How long lags persist for cohorts of the population
• Definitional and collection mode differences
• Volatility of operational processes
Lags of a Statistical Population Dataset
• Counts on the SPD correlate well with Census estimates, but lags
exist in particular locations
Age by single year
Percentage of the population by single year of age that are recorded
in the same location when comparing the SPD to census estimates
Lags on the Patient Register (PR)
• How long does it take for PR records to update location?
Same address information on 2011 PR as Census
Same address information on 2012 PR as Census
Same address information on 2013 PR as Census
Different address information on 2011 PR, 2012 PR and 2013 PR to
Census
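One possible way to tabulate the categories above from linked records is sketched here. It assumes each linked record holds the census address and the PR address for each year; the field names and the rule of taking the earliest matching year are illustrative assumptions, not the analysis actually used.

```python
def first_matching_year(census_address, pr_address_by_year):
    """Earliest PR year whose address equals the census address, else None."""
    for year in sorted(pr_address_by_year):
        if pr_address_by_year[year] == census_address:
            return year
    return None


def lag_distribution(linked_records):
    """Proportion of linked records by the first PR year that agrees with
    the census address (key None = no agreement in any year)."""
    counts = {}
    for rec in linked_records:
        year = first_matching_year(rec["census_address"], rec["pr_address"])
        counts[year] = counts.get(year, 0) + 1
    total = len(linked_records)
    return {year: n / total for year, n in counts.items()}


# Hypothetical linked records; "A" and "B" stand in for address identifiers.
records = [
    {"census_address": "A", "pr_address": {2011: "A", 2012: "A", 2013: "A"}},
    {"census_address": "A", "pr_address": {2011: "B", 2012: "A", 2013: "A"}},
    {"census_address": "A", "pr_address": {2011: "B", 2012: "B", 2013: "A"}},
    {"census_address": "A", "pr_address": {2011: "B", 2012: "B", 2013: "B"}},
]
lags = lag_distribution(records)
```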
Examples of changes in operational
processes
• The National Duplicate Registration Initiative (NDRI) was the
Audit Commission’s periodic exercise that used data matching
techniques to review GPs’ patient lists
• Conducted list cleaning in 1999, 2004 and 2009
Removal of FP69s
• 2004 - 185,000 patient registration deductions
• 2009 - 95,000 patient registration deductions
• At present, this is done on an ad hoc basis by NHS areas
• Subsequently, in some LAs we observe significant reductions of
patient numbers following list cleaning exercises that result in
decreases of population counts on SPDs
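A simple screen for this kind of operational effect is to flag period-on-period falls in an area's count that exceed a chosen threshold. The sketch below assumes a time-ordered series of (period, count) pairs; the threshold and the figures are invented, and a flagged fall is only a prompt to check with the data supplier, not evidence of list cleaning.

```python
def flag_sharp_drops(series, threshold_pct):
    """Return (period, percent change) for each period whose count fell by
    at least threshold_pct relative to the previous period."""
    flagged = []
    for (_, previous), (period, current) in zip(series, series[1:]):
        if previous <= 0:
            continue  # percentage change is undefined
        change = 100.0 * (current - previous) / previous
        if change <= -threshold_pct:
            flagged.append((period, round(change, 1)))
    return flagged


# Hypothetical patient counts for one local authority.
series = [(2008, 210_000), (2009, 212_000), (2010, 198_000), (2011, 199_500)]
drops = flag_sharp_drops(series, threshold_pct=5.0)
```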
Example of list cleaning and impact on
SPD count
Example of operational processing and
impact on SPD count
[Figure: population (axis 0 to 7,000,000) by month (1 to 12)]
Example of operational processing and
impact on SPD count
[Figure: population (axis 0 to 3,500,000) by day of the month (axis 0 to 35)]
Quality of admin data for household statistics
• Census collects information at household level:
• Definition of “household” relates to ...
• Household definition not captured on admin data
• Limited to an address definition when producing household
statistics
• Aims to produce household statistics in 2016:
• Number of households
• Household size
• Household composition
• Challenges in producing these statistics:
• Unable to geo-reference all addresses to an address frame
• Churn at the address results in inflated household sizes
• Relationships between family members not recorded on
admin data
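As a rough illustration of the address-based approach and two of the challenges above, the sketch below counts people per address identifier, tallies records with no identifier, and flags addresses whose size looks implausibly large (a possible sign of churn). The field name address_id and the size threshold are hypothetical choices for this example.

```python
from collections import Counter


def address_sizes(person_records):
    """Count people per address; also count records with no address identifier."""
    sizes = Counter()
    unreferenced = 0
    for rec in person_records:
        address_id = rec.get("address_id")
        if address_id is None:
            unreferenced += 1  # cannot be geo-referenced to the address frame
        else:
            sizes[address_id] += 1
    return sizes, unreferenced


def implausibly_large(sizes, max_plausible):
    """Addresses with more residents than max_plausible, sorted by identifier."""
    return sorted(a for a, n in sizes.items() if n > max_plausible)


# Hypothetical person records.
people = (
    [{"address_id": "ADDR1"}] * 3
    + [{"address_id": "ADDR2"}] * 12
    + [{"address_id": None}] * 2
)
sizes, unreferenced = address_sizes(people)
suspect = implausibly_large(sizes, max_plausible=10)
```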
Missing Address Identifiers (OSAPRs)
Household Composition Estimates
Modes of collection
• Later releases of Research Outputs will explore the potential of
admin data for population characteristics.
• Will depend on data availability and quality.
• Potential topics include:
• Income
• Ethnicity
• Health
• Need to understand how the definitional differences and modes of
collection impact on the statistics produced from admin data.
• Undertaken analysis of linked records between the 2011 Census
and School Census to compare ethnicity.
Census and England School Census Ethnicity
Axis labels: 2011 Census ethnicity; English School Census ethnicity
Columns, in order: White British; Irish; Irish Traveller/Gypsy/Romany; Indian; Bangladeshi; Pakistani; White and Asian; Other Asian; Chinese; African; White and Black African; Caribbean; White and Black Caribbean; Other White; Other Black; Other Mixed; Other Ethnicity; Missing; Total (denominator)
White British 95% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 2% 0.50% 0.50% 0.50% 2% 5,048,672
Irish 41% 47% 1% 0.50% 0.50% 0.50% 1% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 5% 0.50% 3% 0.50% 2% 22,609
Irish Traveller/Gypsy/Romany 35% 2% 54% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 6% 0.50% 0.50% 1% 2% 9,150
Indian 0.50% 0.50% 0.50% 89% 0.50% 1% 1% 5% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 1% 0.50% 2% 169,609
Bangladeshi 0.50% 0.50% 0.50% 0.50% 92% 1% 0.50% 2% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 4% 99,905
Pakistani 0.50% 0.50% 0.50% 1% 0.50% 86% 1% 4% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 1% 3% 4% 252,189
White and Asian 11% 0.50% 0.50% 1% 0.50% 2% 54% 3% 0.50% 0.50% 0.50% 0.50% 0.50% 3% 0.50% 15% 3% 4% 82,152
Other Asian 1% 0.50% 0.50% 12% 0.50% 2% 2% 58% 1% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 4% 17% 2% 84,028
Chinese 2% 0.50% 0.50% 0.50% 0.50% 0.50% 1% 2% 83% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 7% 2% 2% 27,577
African 1% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 1% 0.50% 83% 1% 1% 0.50% 1% 7% 2% 1% 3% 190,489
White and Black African 6% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 8% 55% 1% 3% 3% 3% 14% 2% 4% 38,611
Caribbean 1% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 3% 0.50% 77% 3% 0.50% 9% 3% 1% 4% 71,256
White and Black Caribbean 12% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 2% 3% 62% 1% 2% 12% 1% 4% 108,920
Other White 8% 0.50% 1% 0.50% 0.50% 0.50% 1% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 75% 0.50% 6% 5% 3% 169,626
Other Black 1% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 0.50% 34% 1% 30% 2% 0.50% 20% 6% 1% 4% 27,625
Other Mixed 10% 0.50% 0.50% 1% 0.50% 1% 5% 3% 1% 2% 3% 2% 8% 5% 4% 47% 5% 5% 23,763
Other Ethnicity 5% 0.50% 0.50% 0.50% 0.50% 0.50% 2% 12% 0.50% 2% 1% 0.50% 0.50% 10% 2% 10% 50% 4% 66,760
Missing 69% 0.50% 0.50% 2% 1% 3% 1% 2% 0.50% 4% 0.50% 2% 1% 5% 1% 2% 2% 3% 222,193
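A table of this shape can be built from linked records by cross-tabulating the category recorded in one source against the category recorded in the other and expressing each cell as a percentage of its row total. The sketch below assumes each linked record is reduced to a (row category, column category) pair; which source supplies the rows is left to the caller, and the data are invented.

```python
from collections import Counter, defaultdict


def row_percentages(linked_pairs):
    """Cross-tabulate (row_category, column_category) pairs and return, per
    row, each column's percentage plus the row total used as denominator."""
    cells = defaultdict(Counter)
    for row_category, column_category in linked_pairs:
        cells[row_category][column_category] += 1

    table = {}
    for row_category, counter in cells.items():
        total = sum(counter.values())
        row = {col: 100.0 * n / total for col, n in counter.items()}
        row["Total (denominator)"] = total
        table[row_category] = row
    return table


# Hypothetical linked pairs: 9 records agree, 1 is recorded differently.
pairs = [("Indian", "Indian")] * 9 + [("Indian", "Other Asian")]
crosstab = row_percentages(pairs)
```

The diagonal cells give the agreement rate for each category; off-diagonal cells show where the two sources record the same person differently.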
Admin data quality and how it impacts
SPDs
Risk / Issue | Example | Mitigation
Definitional differences | Definition of residence; definition of household; lack of common address identifier, e.g. UPRN | Harmonisation, combined use with surveys; single address register and UPRN at source
Registration | People not de-registered when abroad or dead; not registering on arrival or moving; lags in registration and update | Using activity data and combined survey use; feedback to data suppliers
Policy or operational change | Patient register list cleaning; benefit change, e.g. Universal Credit | Consultation and engagement with ONS/GSS; through new legislation?
Summary – The potential future of
administrative data
• Need to incorporate more evidence of activity from
admin data to help improve quality of SPDs
• More research to understand whether there are
different characteristics of those people who are
included or are not included on the SPD
• Need to maintain and establish positive relationships
with Data Suppliers
References
• Source reports for administrative datasets can be
found on the ONS website under:
http://www.ons.gov.uk/ons/about-ons/who-ons-are/programmes-and-projects/beyond-2011/reports-and-publications/index.html
Census Transformation Programme
Annual Research Conference
Covering: 2021 Census design, Census topic
consultation, linking admin data, research outputs
26-27th November 2015
Chichester College
To register an interest or for more information
please email:
Any Questions?