Census Interaction Data: Characteristics and Access
John Stillwell
Centre for Interaction Data Estimation and Research (CIDER)
School of Geography, University of Leeds
Presentation at the ‘After the Census’ session of the ‘ESRC Research Methods Festival’
University of Oxford, 3 July, 2012
Census Programme
CIDER staff: John Stillwell, Oliver Duke-Williams, Adam Dennett, Kostas Daras
Service reorganization
UK Data Service
Census Support Service
Other data services
?
Presentation
1. What are the interaction data sets?
2. How are these data sets accessed?
3. How are the interaction data sets used in research?
4. What are the major characteristics of the 2011 Census interaction data?
- same questions/new questions
- SDC
- licensing arrangements
- geographies
- possible tables
5. Conclusions
1. What are the interaction data sets?
• Data on migration that derive from the question in the Census: Where were you living 12 months ago?
- Special Migration Statistics (SMS) in 2001
• Data on commuting that derive from the question in the Census: What is the address of your place of work? (and study in Scotland)
- Special Workplace Statistics (SWS) in 2001
- Special Travel Statistics (STS) (Scotland) in 2001
• These data sets are unique because they have two geographies: origin and destination
SMS/SWS are large and often sparsely populated matrices, particularly for small areas
Interaction flow matrix for Leeds contains 5,948,721 cells that have the potential to contain flow counts
Leeds: 2,439 Output Areas
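The cell count follows directly from the number of zones: every Output Area can be both an origin and a destination, so the matrix has 2,439 × 2,439 cells. A minimal sketch of that arithmetic, together with one common way of holding a sparse matrix (storing only the non-zero flows), is given below. The OA codes and counts are invented placeholders, not census values, and the dictionary layout is an assumption for illustration rather than the format used by CIDER.

```python
# Size of an OA-to-OA interaction matrix, and a sparse way to store it.
n_output_areas = 2439  # Leeds Output Areas, as stated on the slide

# Each origin can pair with each destination (including itself).
potential_cells = n_output_areas ** 2

# Sparse storage: keep only non-zero flows, keyed by (origin, destination).
# Codes and counts are hypothetical placeholders.
flows = {
    ("OA_0001", "OA_0002"): 3,
    ("OA_0001", "OA_0417"): 1,
    ("OA_0002", "OA_0001"): 2,
}


def flow(origin, destination):
    """Return the stored count, or 0 for a cell that holds no flow."""
    return flows.get((origin, destination), 0)


# Share of potential cells that actually hold a flow in this toy example.
fill_rate = len(flows) / potential_cells

print(potential_cells)  # 5948721
```

Storing only the populated cells keeps memory proportional to the number of observed flows rather than to the square of the number of zones.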
Large and multi-dimensional data sets
1991 SMS Set 1 (Wards)
1991 SMS Set 2 (Districts)
Geographies of 2001 SMS/SWS/STS
Country | Level 1 | Level 2 | Level 3
England | London Boroughs (33), Metropolitan Districts (36), Unitary Authorities (46), Other Local Authorities (239) | CAS wards (7,969) | Output areas (165,665)
Wales | Unitary Authorities (22) | CAS wards (881) | Output areas (9,769)
Scotland | Council Areas (32) | ST wards (1,176) | Output areas (42,604)
Northern Ireland | Parliamentary Constituencies (18) | CAS wards (582) | Output areas (5,022)
Total | Districts (426) | Interaction wards (10,608) | Output areas (223,060)
Key point: Interaction data sets are for the UK
CIDER’s interaction data sets
(a) Census data sets
Origin-Destination Statistics
• 1981 SMS Set 2 and SWS Set C (County/region level)
• 1991 SMS Sets 1 and 2, SWS Sets A-C and Table 100 (students)
• 2001 SMS Sets 1 and 2, SWS/STS Levels 1-3 (and postal sectors)
Commissioned Tables
• Set of tables from 2001 Census including, for example:
- C0649: Commuters by religion at district level
- C0711: Migrants by ethnic group and age at district level
- C0723: Migrants by age and ethnic group at region/ward level
CIDER’s interaction data sets
(b) Derived or estimated data for census periods
• SMSGAPS: Counts for 1991 SMS Set 2 Tables 3-10 derived by Rees and Duke-Williams that include estimates of suppressed values
• MIGPOP: Counts for 1991 SMS Set 2 Table 3 derived by Simpson and Middleton that adjust for under-enumeration
• 1981 SMS Set 2 (wards) and SWS Set C (wards): re-estimated for 1991 and 2001 geography by Boyle and Feng
• 1991 SMS Set 1 (wards) and SWS Set C (wards): re-estimated for 2001 geography by Boyle and Feng
CIDER’s interaction data sets
(c) Estimated time series data sets
• Patient register/NHSCR flows between local authority districts in England and Wales, 1999-2008 (rounded) – estimated and supplied by ONS
• Inter-NUTS2 region migration estimates for UK, mid 1999-2000 to mid 2006-2007 – estimated and supplied by Rees and Dennett (DEMIFER project)
• Inter-NUTS2 region migration estimates for UK, calendar year 2000 to 2006-2007 – estimated and supplied by Rees and Dennett (DEMIFER project)
• Inter-region migration by age, sex and ethnicity for Britain, 1991-1999 and 2000-2007, estimated and supplied by Raymer and Giulietti (ESRC project)
• Inter-county migration by age, sex and ethnicity, 1999-2007, estimated and supplied by Raymer and Giulietti (ESRC project)
• Inter-county migration by age, sex and economic activity, 1999-2007, estimated and supplied by Raymer and Giulietti (ESRC project)
2. How are these data sets accessed?
WICID is the online interface to the Census interaction data sets accessible from the CIDER Home Page
Need to be a registered user of census data
WICID Query Interface
Data selection
Tables available in 2001 SMS Level 1
Cells of Table 3 in 2001 SMS Level 1
Origin and destination geography selection
Area selection tools available
List selection of districts
Map Selection Tool
Map Selection Tool (detail)
Postcode based selection
Finalise Screen
Screen Indicating Extraction Completed
Example of simple query and data extracted
The Query: Extract the data on total migrant flows between the countries of the UK from Table MG1010 in 2001 SMS
The Data: Origin by destination matrix of migration flows in 2000-01
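An extraction of this kind is an origin-by-destination matrix, with one row per origin country and one column per destination country. The sketch below shows how such a matrix could be assembled from flow records held in long form and how row and column totals give gross outflows and inflows. The counts are invented placeholders (not values from Table MG1010), and the record layout is an assumption rather than the WICID output format.

```python
countries = ["England", "Wales", "Scotland", "Northern Ireland"]

# (origin, destination, migrants) records; counts are hypothetical.
records = [
    ("England", "Wales", 50),
    ("Wales", "England", 45),
    ("England", "Scotland", 40),
    ("Scotland", "England", 42),
    ("Northern Ireland", "England", 10),
]

# Start from an all-zero matrix so that missing pairs are explicit zeros.
matrix = {o: {d: 0 for d in countries} for o in countries}
for origin, destination, count in records:
    matrix[origin][destination] += count

# Row totals are gross outflows; column totals are gross inflows.
# Within-country moves are absent from this toy input, so the diagonal is 0.
outflows = {o: sum(matrix[o].values()) for o in countries}
inflows = {d: sum(matrix[o][d] for o in countries) for d in countries}

for origin in countries:
    print(origin, [matrix[origin][d] for d in countries])
```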
Analysis functions for use on extracted data
Help System opening inside a new browser window
3. How are interaction data sets used in research?
Interaction data sets used by various researchers:
See some examples in Part 2 of CIDER book:
Stillwell, J., Duke-Williams, O.W. and Dennett, A. (eds.) (2010) Technologies for Migration and Commuting Analysis: Spatial Interaction Data Applications, 357 pp., IGI Global, Hershey.
Example: What processes of white migration are taking place in London at ward level?
Net migration flows within Greater London
Net migration flows between London and rest of England and Wales
Location quotients
Source: 2001 Census Commissioned Table
Stillwell, J. (2010) Ethnic population concentration and net migration in London, Environment and Planning A, 42: 1439-1456.
Are the same processes of migration apparent for Black migrants in London?
Net migration flows within Greater London
Net migration flows between London and rest of England and Wales
Location quotients
Source: 2001 Census Commissioned Table
Are the same processes of migration apparent for Chinese migrants in London?
Net migration flows within Greater London
Net migration flows between London and rest of England and Wales
Location quotients
Source: 2001 Census Commissioned Table
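The two measures mapped in these examples can be computed from a flow matrix and population counts. The sketch below assumes the usual definitions: net migration for an area is inflow minus outflow (ignoring moves within the area), and a location quotient is the group's share of the area population divided by its share of the wider region's population. The slides do not spell out the formulas, so these definitions, the function names and the toy numbers are all assumptions.

```python
def net_migration(flows, area):
    """Inflow minus outflow for one area; within-area moves are ignored.

    flows maps (origin, destination) to a count of migrants.
    """
    inflow = sum(c for (o, d), c in flows.items() if d == area and o != area)
    outflow = sum(c for (o, d), c in flows.items() if o == area and d != area)
    return inflow - outflow


def location_quotient(group_in_area, total_in_area,
                      group_in_region, total_in_region):
    """Group share in the area relative to its share in the region.

    Values above 1 indicate relative concentration in the area.
    """
    area_share = group_in_area / total_in_area
    region_share = group_in_region / total_in_region
    return area_share / region_share


# Hypothetical ward-level flows for one group.
toy_flows = {("A", "B"): 10, ("B", "A"): 4, ("A", "A"): 7}
print(net_migration(toy_flows, "A"))
print(location_quotient(50, 100, 25, 100))
```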
4. What are the major characteristics of the 2011 Census interaction data?
4.1 What interaction questions were asked?
4.2 What about statistical disclosure control?
4.3 What geographies will be used?
4.4 What migration and commuting tables will be available?
4.1 Interaction questions
Main questions for migration and commuting in 2011 are the same as in 2001
Student ‘migration’ picked up by separate questions
Student term time/boarding school address in the UK: enter term time address below
This means that it will be possible to generate flows of:
(i) those who left HE/FE/boarding school and changed usual residence
(ii) those in HE/FE or at boarding school who changed term time address
‘Another address’ question enables further ‘interaction’ data to be generated?
Questions 5 and 6 ask about another address
Potential to produce matrices of interaction flows between usual address and other address – very useful for analyses of mobility (weekly commuting, shared custody of children, second homes, international mobility) hitherto uncaptured
Questions about international immigration
Potential to produce tables of immigrants by country of birth and country of previous usual residence
4.2 Statistical disclosure control?
• Small cell adjustment abandoned in 2011 in favour of record swapping:
- Households swapped
- Targeted to ‘risky’ records
- Construct risk score for every individual; combine to household score
- Imputation considered as part protection
- Households swapped only as far as their risk is considered ‘high’
- Individuals swapped between communal establishments
• Work on SDC on Origin-Destination Tables still ongoing
Source: Spicer, K. (2011) Statistical Disclosure Control for 2011 UK Census, http://www.ons.gov.uk/census/2011-census/consultations/open-consultations/2011-output-consultation---main-statistical-outputs---second-round/index.html
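The general idea of targeted record swapping can be illustrated with a toy example. The sketch below is not the ONS method: the slide does not say how individual scores are combined or how swap partners are matched, so taking the maximum individual score as the household score, matching on household size, and the threshold value are all assumptions made purely for illustration.

```python
def household_risk(individual_scores):
    # Assumption: a household is as risky as its riskiest member.
    return max(individual_scores)


def swap_risky_households(households, threshold):
    """Exchange the areas of high-risk households with matched partners.

    Each household is a dict with 'id', 'area' and 'scores' (one risk
    score per member). A partner must be in a different area and have the
    same number of members. Returns a new list; the input is not modified.
    """
    result = [dict(h) for h in households]
    swapped = set()
    for h in result:
        if h["id"] in swapped or household_risk(h["scores"]) < threshold:
            continue
        for partner in result:
            if (partner["id"] not in swapped
                    and partner["id"] != h["id"]
                    and partner["area"] != h["area"]
                    and len(partner["scores"]) == len(h["scores"])):
                h["area"], partner["area"] = partner["area"], h["area"]
                swapped.update({h["id"], partner["id"]})
                break
    return result


# Hypothetical households and scores.
households = [
    {"id": "H1", "area": "A", "scores": [0.9, 0.2]},
    {"id": "H2", "area": "A", "scores": [0.1, 0.1]},
    {"id": "H3", "area": "B", "scores": [0.3, 0.2]},
    {"id": "H4", "area": "B", "scores": [0.2]},
]
protected = swap_risky_households(households, threshold=0.8)
print({h["id"]: h["area"] for h in protected})
```

Only the household above the threshold and its matched partner change area, which is the sense in which the swapping is targeted rather than applied uniformly.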
Data licensing arrangements
Initial idea
Tier | Data availability
1 Public | Download without restriction
2 Safeguarded | Download with terms and conditions
3 Safeguarded (Approved researcher) | Download only with approved researcher status
4 Approved researcher | Access only with approved researcher status in a secure setting
Current thinking
Tier | Data availability
1 Public | Data available under Open Government Licence
2 Safeguarded | Data available with Special User Licence
3 Secure/VML | Data available to approved researcher only in a secure setting
Key question: Which data will be available at which tier of licensing?
4.3 What geographies will be used?
• Fundamental building blocks for origin-destination migration flows will be output areas (OAs) with data aggregated to wards and districts
• Problem of local government (LG) reorganisation since 2001 – users require ward-level flows so that the old LG districts can be reconstituted for comparison
• Preference for LG districts in Northern Ireland (rather than Parliamentary Constituencies as in 2001)
• New geography for commuting destinations – Workplace Zones (WPZs)
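Aggregating flows from a finer geography to a coarser one only needs a lookup from each small zone to its parent zone, applied to both the origin and the destination. The sketch below shows how ward-to-ward flows could be summed to reconstitute flows between old districts. The ward codes, district names and counts are hypothetical, and the function is an illustration rather than a CIDER or ONS tool.

```python
from collections import defaultdict


def aggregate_flows(flows, lookup):
    """Sum (origin, destination) flows after mapping zones to parent zones."""
    aggregated = defaultdict(int)
    for (origin, destination), count in flows.items():
        aggregated[(lookup[origin], lookup[destination])] += count
    return dict(aggregated)


# Hypothetical ward-level flows and a ward-to-old-district lookup.
ward_flows = {
    ("W1", "W2"): 5,
    ("W1", "W3"): 2,
    ("W3", "W1"): 4,
    ("W2", "W2"): 1,
}
ward_to_old_district = {"W1": "Old A", "W2": "Old A", "W3": "Old B"}

district_flows = aggregate_flows(ward_flows, ward_to_old_district)
print(district_flows)
```

Flows between wards in the same district become within-district flows after aggregation, so the total count is preserved while the number of cells shrinks.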
Workplace Zones (WPZs)
• OAs based on where people live not work – can be unsuitable for workplace statistics
• Some OAs contain no/few businesses; some contain many businesses or large employer, e.g. business parks, City of London
• Workplace Zones project looking at splitting/merging OAs for a new geography constrained to MSOAs
• Pilot areas: Tower Hamlets, City of London, Southampton, Nottingham, Suffolk Coastal
• Disclosure control: Population threshold same as OAs (100 workers min; 625 max; no household threshold)
Source: Spicer, K. (2011) Statistical Disclosure Control for 2011 UK Census, http://www.ons.gov.uk/census/2011-census/consultations/open-consultations/2011-output-consultation---main-statistical-outputs---second-round/index.html
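The worker thresholds suggest a simple first-pass rule for deciding which OAs need attention. The sketch below flags an OA for merging when it has fewer than 100 workers and for splitting when it has more than 625, and estimates how many zones a large OA would need. It ignores adjacency, the MSOA constraint and the actual zone design procedure, so it should be read as an illustration of the thresholds only; the function names are hypothetical.

```python
MIN_WORKERS = 100  # minimum workers per zone, from the slide
MAX_WORKERS = 625  # maximum workers per zone, from the slide


def classify_oa(workers):
    """Flag an OA as a candidate to 'merge', 'split' or 'keep'."""
    if workers < MIN_WORKERS:
        return "merge"
    if workers > MAX_WORKERS:
        return "split"
    return "keep"


def zones_needed(workers):
    """Smallest number of zones so that none exceeds MAX_WORKERS."""
    return -(-workers // MAX_WORKERS)  # ceiling division


for count in (40, 300, 3000):
    print(count, classify_oa(count), zones_needed(count))
```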
4.4 Migration and commuting tables
• ONS still undecided about table specifications for interaction data sets
• 2011 Census Prospectus indicates Migration and Workplace Statistics will be released after October 2013
• ONS currently reviewing the table specifications proposed by Oliver Duke-Williams (UCL):
- Special Migration Statistics (SMS)
- Special Workplace Statistics/Special Travel Statistics (SWS/STS)
- Special Student Statistics (SSS)
- Special Residence Statistics (SRS)
• Important distinction between different types of counts and their relationship with spatial scale and tier of licensing
Three types of tables for each set of SMS/SWS/STS/SSS/SRS
• Likely to be important distinction between:
(i) Flow (or headcount) tables, i.e. origin-destination flows of total persons only
(ii) Univariate tables, i.e. origin-destination flows disaggregated by a single variable, e.g. sex, or age or ethnic group
(iii) Multivariate tables, i.e. origin-destination flows disaggregated by more than one variable, e.g. age by sex or ethnic group by sex
• Each of these flow data sets likely to be produced for flows at different spatial scales: OA-OA; ward-ward; UA/LA-UA/LA with different access/licensing conditions
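The three table types differ only in how many attributes are kept alongside the origin and destination when records are counted. The sketch below derives each type from the same set of individual records. The records are invented, and the variable names and categories are assumptions, since the 2011 table specifications were still under review.

```python
from collections import Counter

# Hypothetical migrant records: (origin, destination, sex, age_group).
migrants = [
    ("A", "B", "F", "16-24"),
    ("A", "B", "M", "16-24"),
    ("A", "B", "F", "25-44"),
    ("B", "A", "M", "25-44"),
]

# (i) Flow (headcount) table: total persons per origin-destination pair.
flow_table = Counter((o, d) for o, d, _, _ in migrants)

# (ii) Univariate table: flows disaggregated by one variable (here sex).
univariate_by_sex = Counter((o, d, sex) for o, d, sex, _ in migrants)

# (iii) Multivariate table: flows disaggregated by age group and sex.
multivariate_age_by_sex = Counter(
    (o, d, age, sex) for o, d, sex, age in migrants
)

print(flow_table[("A", "B")])
print(univariate_by_sex[("A", "B", "F")])
print(multivariate_age_by_sex[("A", "B", "16-24", "F")])
```

Each added variable spreads the same people over more cells, so counts become smaller and sparser, which is one reason why more detailed tables at finer spatial scales may carry stricter access conditions.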
5. Conclusions
• Anticipate substantial demand for access to 2011 Census interaction data sets
• Collaboration underway with ONS about table design as well as joint dissemination strategy
• Interaction data service soon to be part of the Census Support Service (CSS)
• Key advantage of CSS is provision of user access to data from previous censuses
• Recognise the ‘new’ environment – with 2011 Census likely to be the last of its kind and results of ONS ‘Beyond 2011’ project due in September 2014
• Changing focus of data collection from Census to surveys and administrative sources
Contact details
John Stillwell
Email: [email protected]
Oliver Duke-Williams
Email: [email protected]
CIDER Web site: http://cider.census.ac.uk/