Census Interaction Data: Characteristics and Access John Stillwell Centre for Interaction Data Estimation and Research (CIDER) School of Geography, University of Leeds Presentation at the ‘After the Census’ session of the ‘ESRC Research Methods Festival’ University of Oxford, 3 July, 2012


Census Interaction Data: Characteristics and Access

John Stillwell

Centre for Interaction Data Estimation and Research (CIDER)

School of Geography, University of Leeds

Presentation at the ‘After the Census’ session of the ‘ESRC Research Methods Festival’

University of Oxford, 3 July, 2012


Census Programme

CIDER staff:
• John Stillwell
• Oliver Duke-Williams
• Adam Dennett
• Kostas Daras


Service reorganization

UK Data Service

Census Support Service

Other data services

?


Presentation

1. What are the interaction data sets?
2. How are these data sets accessed?
3. How are the interaction data sets used in research?
4. What are the major characteristics of the 2011 Census interaction data?
- same questions/new questions
- SDC
- licensing arrangements
- geographies
- possible tables
5. Conclusions


1. What are the interaction data sets?

• Data on migration that derive from the question in the Census: Where were you living 12 months ago?
- Special Migration Statistics (SMS) in 2001

• Data on commuting that derive from the question in the Census: What is the address of your place of work? (and study in Scotland)
- Special Workplace Statistics (SWS) in 2001
- Special Travel Statistics (STS) (Scotland) in 2001

• These data sets are unique because they have two geographies: origin and destination


SMS/SWS are large and often sparsely populated matrices, particularly for small areas

Interaction flow matrix for Leeds contains 5,948,721 cells that have the potential to contain flow counts

Leeds: 2,439 Output Areas
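The cell count quoted above is the square of the number of zones, which is why sparse storage matters at Output Area level. A minimal Python sketch, assuming flows are held in a dictionary keyed by (origin, destination); the zone codes and counts are invented for illustration and are not census values:

```python
# The number of potential origin-destination cells for n zones is n * n.
def potential_cells(n_zones: int) -> int:
    return n_zones * n_zones

leeds_output_areas = 2439  # figure quoted on the slide
leeds_cells = potential_cells(leeds_output_areas)

# Sparse storage: keep only the non-zero flows, keyed by (origin, destination).
# Zone codes and counts below are hypothetical.
flows = {
    ("OA0001", "OA0002"): 3,
    ("OA0001", "OA0417"): 1,
    ("OA0950", "OA0001"): 2,
}

def flow(origin: str, destination: str) -> int:
    """Return the stored count, or 0 for an empty cell."""
    return flows.get((origin, destination), 0)

# Share of the potential cells that actually hold a count.
fill_rate = len(flows) / leeds_cells
print(leeds_cells)
```

Storing only the populated cells keeps memory proportional to the number of observed flows rather than to the square of the number of zones.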


Large and multi-dimensional data sets

1991 SMS Set 1 (Wards)

1991 SMS Set 2 (Districts)


Geographies of 2001 SMS/SWS/STS

Country | Level 1 | Level 2 | Level 3
England | London Boroughs (33), Metropolitan Districts (36), Unitary Authorities (46), Other Local Authorities (239) | CAS wards (7,969) | Output areas (165,665)
Wales | Unitary Authorities (22) | CAS wards (881) | Output areas (9,769)
Scotland | Council Areas (32) | ST wards (1,176) | Output areas (42,604)
Northern Ireland | Parliamentary Constituencies (18) | CAS wards (582) | Output areas (5,022)
Total | Districts (426) | Interaction wards (10,608) | Output areas (223,060)

Key point: Interaction data sets are for the UK


CIDER’s interaction data sets

(a) Census data sets

Origin-Destination Statistics
• 1981 SMS Set 2 and SWS Set C (County/region level)
• 1991 SMS Sets 1 and 2, SWS Sets A-C and Table 100 (students)
• 2001 SMS Sets 1 and 2, SWS/STS Levels 1-3 (and postal sectors)

Commissioned Tables
• Set of tables from 2001 Census including, for example:
- C0649: Commuters by religion at district level
- C0711: Migrants by ethnic group and age at district level
- C0723: Migrants by age and ethnic group at region/ward level


CIDER’s interaction data sets

(b) Derived or estimated data for census periods

• SMSGAPS: Counts for 1991 SMS Set 2 Tables 3-10 derived by Rees and Duke-Williams that include estimates of suppressed values

• MIGPOP: Counts for 1991 SMS Set 2 Table 3 derived by Simpson and Middleton that adjust for under-enumeration

• 1981 SMS Set 2 (wards) and SWS Set C (wards): re-estimated for 1991 and 2001 geography by Boyle and Feng

• 1991 SMS Set 1 (wards) and SWS Set C (wards): re-estimated for 2001 geography by Boyle and Feng


CIDER’s interaction data sets

(c) Estimated time series data sets

• Patient register/NHSCR flows between local authority districts in England and Wales, 1999-2008 (rounded) – estimated and supplied by ONS

• Inter-NUTS2 region migration estimates for UK, mid 1999-2000 to mid 2006-2007 – estimated and supplied by Rees and Dennett (DEMIFER project)

• Inter-NUTS2 region migration estimates for UK, calendar year 2000 to 2006-2007 – estimated and supplied by Rees and Dennett (DEMIFER project)

• Inter-region migration by age, sex and ethnicity for Britain, 1991-1999 and 2000-2007, estimated and supplied by Raymer and Giulietti (ESRC project)

• Inter-county migration by age, sex and ethnicity, 1999-2007, estimated and supplied by Raymer and Giulietti (ESRC project)

• Inter-county migration by age, sex and economic activity, 1999-2007, estimated and supplied by Raymer and Giulietti (ESRC project)


2. How are these data sets accessed?

WICID is the online interface to the Census interaction data sets, accessible from the CIDER Home Page

Need to be a registered user of census data


WICID Query Interface


Data selection

Tables available in 2001 SMS Level 1

Cells of Table 3 in 2001 SMS Level 1


Origin and destination geography selection

Area selection tools available

List selection of districts


Map Selection Tool


Map Selection Tool (detail)


Postcode based selection


Finalise Screen

Screen Indicating Extraction Completed


Example of simple query and data extracted

The Query: Extract the data on total migrant flows between the countries of the UK from Table MG1010 in 2001 SMS

The Data: Origin by destination matrix of migration flows in 2000-01
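Once an origin by destination matrix of this kind has been extracted, inflows, outflows and net balances follow from its row and column sums. A minimal Python sketch; the counts are invented placeholders rather than values from the 2001 SMS, and the diagonal is set to zero because only flows between countries are considered:

```python
countries = ["England", "Wales", "Scotland", "Northern Ireland"]

# Rows are origins, columns are destinations. All counts are hypothetical.
matrix = [
    [0, 50, 40, 10],
    [45, 0, 5, 1],
    [42, 4, 0, 3],
    [9, 1, 4, 0],
]

def margins(matrix):
    """Return (outflows, inflows, net) per zone, where net = inflow - outflow."""
    n = len(matrix)
    outflows = [sum(matrix[i]) for i in range(n)]                        # row sums
    inflows = [sum(matrix[i][j] for i in range(n)) for j in range(n)]    # column sums
    net = [inflows[k] - outflows[k] for k in range(n)]
    return outflows, inflows, net

outflows, inflows, net = margins(matrix)
for name, o, i, b in zip(countries, outflows, inflows, net):
    print(f"{name}: out={o} in={i} net={b}")
```

Within a closed system of zones the net balances sum to zero, which is a useful check on any extracted matrix.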


Analysis functions for use on extracted data


Help System opening inside a new browser window


3. How are interaction data sets used in research?

Interaction data sets used by various researchers:

See some examples in Part 2 of CIDER book:

Stillwell, J., Duke-Williams, O.W. and Dennett, A. (eds.) (2010) Technologies for Migration and Commuting Analysis: Spatial Interaction Data Applications, IGI Global, Hershey, 357 pp.


Example: What processes of white migration are taking place in London at ward level?

Map panels: net migration flows within Greater London; net migration flows between London and rest of England and Wales; location quotients

Source: 2001 Census Commissioned Table; Stillwell, J. (2010) Ethnic population concentration and net migration in London, Environment and Planning A, 42: 1439-1456.
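The slide names location quotients without defining them. A minimal Python sketch, assuming the conventional definition (a group's share of an area's population divided by its share of the whole study region's population); the ward names and counts are hypothetical and the function is illustrative, not the method used in the cited paper:

```python
def location_quotients(group_counts, total_counts):
    """LQ for each area = (group share in the area) / (group share overall)."""
    group_total = sum(group_counts.values())
    overall_total = sum(total_counts.values())
    overall_share = group_total / overall_total
    return {
        area: (group_counts[area] / total_counts[area]) / overall_share
        for area in group_counts
    }

# Hypothetical ward populations for one ethnic group and for all persons.
group = {"Ward A": 200, "Ward B": 50, "Ward C": 50}
total = {"Ward A": 1000, "Ward B": 1000, "Ward C": 1000}

lq = location_quotients(group, total)
for ward, value in lq.items():
    print(ward, round(value, 2))
```

Under this definition a value above 1 indicates that the group is over-represented in the area relative to the region as a whole, and a value below 1 indicates under-representation.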


Are the same processes of migration apparent for Black migrants in London?

Map panels: net migration flows within Greater London; net migration flows between London and rest of England and Wales; location quotients

Source: 2001 Census Commissioned Table


Are the same processes of migration apparent for Chinese migrants in London?

Map panels: net migration flows within Greater London; net migration flows between London and rest of England and Wales; location quotients

Source: 2001 Census Commissioned Table


4. What are the major characteristics of the 2011 Census interaction data?

4.1 What interaction questions were asked?
4.2 What about statistical disclosure control?
4.3 What geographies will be used?
4.4 What migration and commuting tables will be available?


4.1 Interaction questions

Main questions for migration and commuting in 2011 are the same as in 2001


Student ‘migration’ picked up by separate questions

Student term time/boarding school address in the UK: enter term time address below

This means that it will be possible to generate flows of:
(i) those who left HE/FE/boarding school and changed usual residence
(ii) those in HE/FE or at boarding school who changed term time address


‘Another address’ question enables further ‘interaction’ data to be generated?

Questions 5 and 6 ask about another address

Potential to produce matrices of interaction flows between usual address and other address – very useful for analyses of mobility (weekly commuting, shared custody of children, second homes, international mobility) hitherto uncaptured


Questions about international immigration

Potential to produce tables of immigrants by country of birth and country of previous usual residence


4.2 Statistical disclosure control?

• Small cell adjustment abandoned in 2011 in favour of record swapping:
- Households swapped
- Targeted to ‘risky’ records
- Construct risk score for every individual; combine to household score
- Imputation considered as part protection
- Households swapped only as far as their risk is considered ‘high’
- Individuals swapped between communal establishments

• Work on SDC on Origin-Destination Tables still ongoing

Source: Spicer, K. (2011) Statistical Disclosure Control for 2011 UK Census, http://www.ons.gov.uk/census/2011-census/consultations/open-consultations/2011-output-consultation---main-statistical-outputs---second-round/index.html


Data licensing arrangements

Initial idea:
Tier | Data availability
1 Public | Download without restriction
2 Safeguarded | Download with terms and conditions
3 Safeguarded (Approved researcher) | Download only with approved researcher status
4 Approved researcher | Access only with approved researcher status in a secure setting

Current thinking:
Tier | Data availability
1 Public | Data available under open government license
2 Safeguarded | Data available with Special user license
3 Secure/VML | Data available to approved researcher only in a secure setting

Key question: Which data will be available at which tier of licensing?


4.3 What geographies will be used?

• Fundamental building blocks for origin-destination migration flows will be output areas (OAs) with data aggregated to wards and districts

• Problem of LG reorganisation since 2001 – which means there is a user requirement that flows for wards should be generated so as to be able to reconstitute old LG districts for comparison

• Preference for LG districts in Northern Ireland (rather than Parliamentary Constituencies as in 2001)

• New geography for commuting destinations – Workplace Zones (WPZs)
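Aggregating flows from building-block zones to larger areas amounts to relabelling both ends of each flow through a lookup table and summing. A minimal Python sketch; the zone codes, the lookup and the counts are all hypothetical, and `aggregate` is an illustrative helper rather than part of any ONS or WICID tool:

```python
from collections import defaultdict

# Hypothetical OA-to-ward lookup and OA-level flows.
oa_to_ward = {"OA1": "W1", "OA2": "W1", "OA3": "W2", "OA4": "W2"}
oa_flows = {
    ("OA1", "OA2"): 2,
    ("OA1", "OA3"): 5,
    ("OA2", "OA4"): 1,
    ("OA3", "OA1"): 4,
}

def aggregate(flows, lookup):
    """Sum flows between small zones into flows between the larger zones."""
    result = defaultdict(int)
    for (origin, destination), count in flows.items():
        result[(lookup[origin], lookup[destination])] += count
    return dict(result)

ward_flows = aggregate(oa_flows, oa_to_ward)
print(ward_flows)
```

The same function could be applied with a ward-to-district lookup, for example to rebuild flows for pre-reorganisation districts, provided such a lookup is available.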


Workplace Zones (WPZs)

• OAs based on where people live not work – can be unsuitable for workplace statistics

• Some OAs contain no/few businesses; some contain many businesses or large employer, e.g. business parks, City of London

• Workplace Zones project looking at splitting/merging OAs for a new geography constrained to MSOAs

• Pilot areas: Tower Hamlets, City of London, Southampton, Nottingham, Suffolk Coastal

• Disclosure control: Population threshold same as OAs (100 workers min; 625 max; no household threshold)

Source: Spicer, K. (2011) Statistical Disclosure Control for 2011 UK Census, http://www.ons.gov.uk/census/2011-census/consultations/open-consultations/2011-output-consultation---main-statistical-outputs---second-round/index.html


4.4 Migration and commuting tables

• ONS still undecided about table specifications for interaction data sets

• 2011 Census Prospectus indicates Migration and Workplace Statistics will be released after October 2013

• ONS currently reviewing the table specifications proposed by Oliver Duke-Williams (UCL):
- Special Migration Statistics (SMS)
- Special Workplace Statistics/Special Travel Statistics (SWS/STS)
- Special Student Statistics (SSS)
- Special Residence Statistics (SRS)

• Important distinction between different types of counts and their relationship with spatial scale and tier of licensing


Three types of tables for each set of SMS/SWS/STS/SSS/SRS

• Likely to be important distinction between:
(i) Flow (or headcount) tables, i.e. origin-destination flows of total persons only
(ii) Univariate tables, i.e. origin-destination flows disaggregated by a single variable, e.g. sex, or age or ethnic group
(iii) Multivariate tables, i.e. origin-destination flows disaggregated by more than one variable, e.g. age by sex or ethnic group by sex

• Each of these flow data sets likely to be produced for flows at different spatial scales: OA-OA; ward-ward; UA/LA-UA/LA with different access/licensing conditions
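The three table types are nested: summing a multivariate table over one variable gives a univariate table, and summing over all variables gives the flow table. A minimal Python sketch, assuming unperturbed counts (disclosure-controlled outputs need not sum exactly in this way); the zones, categories and counts are hypothetical:

```python
from collections import defaultdict

# Hypothetical multivariate table keyed by (origin, destination, age group, sex).
multivariate = {
    ("A", "B", "16-24", "F"): 3,
    ("A", "B", "16-24", "M"): 2,
    ("A", "B", "25-44", "F"): 4,
    ("B", "A", "25-44", "M"): 1,
}

def collapse(table, keep):
    """Sum counts over every key position that is not listed in `keep`."""
    out = defaultdict(int)
    for key, count in table.items():
        out[tuple(key[i] for i in keep)] += count
    return dict(out)

# Univariate table: origin, destination and sex only.
univariate_by_sex = collapse(multivariate, keep=(0, 1, 3))

# Flow (headcount) table: origin and destination only.
flow_table = collapse(multivariate, keep=(0, 1))

print(univariate_by_sex)
print(flow_table)
```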


5. Conclusions

• Anticipate substantial demand for access to 2011 Census interaction data sets

• Collaboration underway with ONS about table design as well as joint dissemination strategy

• Interaction data service soon to be part of the Census Support Service (CSS)

• Key advantage of CSS is provision of user access to data from previous censuses

• Recognise the ‘new’ environment – with 2011 Census likely to be the last of its kind and results of ONS ‘Beyond 2011’ project due in September 2014

• Changing focus of data collection from Census to surveys and administrative sources


Contact details

John Stillwell
Email: [email protected]

Oliver Duke-Williams
Email: [email protected]

CIDER Web site: http://cider.census.ac.uk/