Geo-referenced data and DLI aggregate data sources
Chuck Humphrey
University of Alberta
September 29, 2008
Outline
An introduction to geo-referenced data available through the Data Library
Working with the Census of Population
• Need to know the levels of Census geography
• Need to understand the geo-coding of Census data
• Need to know the content from the Census
Other geo-referenced data
Geo-referenced data
You have free access to all of the standard data products that Statistics Canada sells through the University’s DLI subscription with Statistics Canada.
For GIS researchers, these products include aggregate data files and Census spatial data files.
Aggregate data files are summary statistics organized in a data file structure around time, geography and content.
Geo-referenced data
What are geo-referenced data?
• Aggregate data that contain at least one variable representing a specific spatial unit, in which the geo-codes are based on a standard geographic classification and/or have corresponding boundary files using the same geo-coding system.
A spatial unit is the geographic area used as the unit of analysis to structure the data.
[Figure: image labelling a spatial unit and its geo-code]
Geo-referenced data
The unit of analysis makes up the rows in the data file and is the object being described by the other variables in the file. The values for this variable are geo-codes for Census tracts.
Geo-referenced data
This case in the data file represents Census Tract 0023.00, which was shown
in the image two slides earlier.
Geo-referenced data strategies
For GIS use, we want aggregate data files:
• where the variables summarize social and economic characteristics over spatial areas, and
• where the data file is structured with the spatial unit as the unit of analysis.
We want the spatial unit in the data file to correspond with an available boundary file.
We want the variable representing the spatial unit to use the same geo-codes that occur in the boundary file.
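The conditions on this slide amount to a key match between the data file and the boundary file. The following is a minimal sketch in Python, assuming both files have already been read into memory; the function name, the tract codes other than 0023.00, and all values are hypothetical.

```python
def match_geocodes(data_rows, boundary_codes):
    """Split data rows into those with and without a matching boundary."""
    boundary_set = set(boundary_codes)
    matched = {code: row for code, row in data_rows.items() if code in boundary_set}
    unmatched_data = sorted(set(data_rows) - boundary_set)
    unmatched_boundaries = sorted(boundary_set - set(data_rows))
    return matched, unmatched_data, unmatched_boundaries


# Hypothetical aggregate data: one row per Census tract (the spatial unit).
# Geo-codes are kept as strings so leading zeros are not lost.
data_rows = {
    "0023.00": {"population": 4120},
    "0024.01": {"population": 3875},
    "0099.00": {"population": 150},
}

# Hypothetical list of geo-codes found in the boundary file.
boundary_codes = ["0023.00", "0024.01", "0025.00"]

matched, unmatched_data, unmatched_boundaries = match_geocodes(data_rows, boundary_codes)
print(sorted(matched))
print(unmatched_data)
print(unmatched_boundaries)
```

Any code that appears in only one of the two files signals that the data file and the boundary file do not share the same geography or the same geo-coding system.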
DLI aggregate data files
What are some of the DLI aggregate data products that might be of use in GIS research?
• The Census of Population provides the greatest possibilities for various levels of geography.
• The Health Indicators provide statistical summaries of the Census and some health variables at the health region (HR) level of geography.
• Canadian Business Patterns reports company size for industry codes at the Census Division (CD) and Census Subdivision (CSD) levels of Census geography.
DLI aggregate data files
Justice and Education statistics are reported at the Census Metropolitan Area (CMA) level for the largest CMAs.
Justice also releases statistics at the municipal police force jurisdiction level, which is not a Census geographic level.
Notes:
• As indicated with Justice, not all of these products have compatible spatial boundaries with Census geography.
• Some may make reference to metropolitan areas but not use the Census geo-codes for CMAs.
The Census
The Census is one of the most important sources of geo-referenced data. It is the largest survey conducted in Canada and, consequently, is the primary source for small area statistics.
To use geo-referenced data from the Census, you must know:
• the variety of spatial units used to disseminate Census results;
• the codes used to represent the various Census spatial units; and
• the aggregate characteristics from the Census available for the various spatial units.
1: The variety of spatial units
Statistics Canada groups the spatial units associated with the Census into two groups: administrative areas and statistical areas.
Source for the graphics: Illustrated Glossary, 2006 Census Geography, Statistics Canada
Administrative areas
Source: Illustrated Glossary, 2006 Census Geography, Statistics Canada
Statistical areas
Source: Illustrated Glossary, 2006 Census Geography, Statistics Canada
2: Census geo-codes
Statistics Canada has two categories of geo-code systems:
• Standard Geographic Classification (SGC)
• Other geographic entities
Source for the graphic: Illustrated Glossary, 2006 Census Geography, Statistics Canada
Standard geographic classification
Source: Illustrated Glossary, 2006 Census Geography, Statistics Canada
Standard geographic classification, 2006
The link to Definitions, data sources and methods on the main page of the Statistics Canada website provides a link to Standard Classifications, which includes Geography.
Other geographic entities
Let’s add dissemination areas!
Dissemination areas
Source: Illustrated Glossary, 2006 Census Geography, Statistics Canada
Dissemination areas
The geo-codes for DAs use the Standard Geographic Classification plus an added, unique four-digit numeric code.
For Edmonton, one DA code would look like this:
PR CD CSD DA
48 11 061 0001
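As a minimal sketch, the Edmonton example can be split back into its parts with fixed-width slicing. The field widths come from the example above; the class and function names are hypothetical.

```python
from typing import NamedTuple


class DACode(NamedTuple):
    pr: str   # province, 2 digits
    cd: str   # Census Division, 2 digits
    csd: str  # Census Subdivision, 3 digits
    da: str   # dissemination area, 4 digits


def parse_da11(code: str) -> DACode:
    """Split an 11-digit DA geo-code into PR, CD, CSD and DA parts."""
    if len(code) != 11 or not code.isdigit():
        raise ValueError(f"expected an 11-digit code, got {code!r}")
    return DACode(code[0:2], code[2:4], code[4:7], code[7:11])


edmonton = parse_da11("48110610001")
print(edmonton)
```

The code is handled as a string rather than a number so that leading zeros in the CSD and DA parts survive.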
Dissemination areas
The 2001 & 2006 Censuses employ two different geo-coding schemes at the DA level: an 8-digit and an 11-digit code.
The 11-digit code consists of PR (2), CD (2), CSD (3) and DA (4)
The 8-digit code consists of PR (2), CD (2) and DA (4).
The boundary files provided by Statistics Canada use the 8-digit code to identify spatial units.
8-digit DA-level code: PR(2)-CD(2)-DA(4)
11-digit DA-level code: PR(2)-CD(2)-CSD(3)-DA(4)
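When a data file carries the 11-digit code and the boundary file carries the 8-digit code, the two can be lined up by dropping the CSD digits. This is a minimal sketch, assuming the four-digit DA part is identical in both schemes, as the two layouts above suggest; the function name is hypothetical.

```python
def da11_to_da8(code11: str) -> str:
    """Drop the 3-digit CSD part: PR(2) CD(2) CSD(3) DA(4) -> PR(2) CD(2) DA(4)."""
    if len(code11) != 11 or not code11.isdigit():
        raise ValueError(f"expected an 11-digit code, got {code11!r}")
    return code11[:4] + code11[7:]


code8 = da11_to_da8("48110610001")
print(code8)  # 48110001
```

Applying this to the geo-code variable in a DA-level data file gives a key that can be compared directly with the identifiers in the boundary file.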
Other geographic codes
Other coding systems for spatial units include:
• Census metropolitan areas and census agglomerations;
• Economic regions;
• Health regions; and
• Countries.
Source: Illustrated Glossary, 2006 Census Geography, Statistics Canada
Census tracts
3: Aggregate characteristics
Profile series and basic tabulations
• Aggregate Census results are disseminated in two primary products: profile series and topic-based tabulations.
The Profile series is available at the SGC and other levels of geography disseminated by Statistics Canada and consists primarily of counts for all the response categories to questions in the 2B form. In 2006, the 2B form consisted of the eight questions asked on the 2A form plus an additional 53 questions. This series is the most frequently used by GIS researchers on our campus.
2006 profile series breakdown
Spatial Unit        Number of Characteristics
Federal District    2184
CSD                 2175
CMA/CA              2175
CT                  2175
DA                  2172
FSA                 2172
Health Regions      42
Basic tabulations
Basic tabulations are n-way tables showing the results for combinations of Census questions. The more variables included in a table, the higher the level of geography at which it is reported. Few of these tables are available below the CSD or CMA/CT level, although it is always worth checking. For example, in 2001 Religion (13) by Age (8) is available at the DA level.
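One rough way to gauge how detailed an n-way table is: multiply the category counts of its variables. The sketch below assumes the numbers in parentheses in a table title are category counts; it illustrates table size only and is not a rule stated by Statistics Canada. The third variable in the second call is hypothetical.

```python
from math import prod


def cell_count(category_counts):
    """Number of cells in an n-way table, given the category count of each variable."""
    return prod(category_counts)


religion_by_age = cell_count([13, 8])
print(religion_by_age)  # 104 cells for every spatial unit

# Adding a hypothetical third variable with 3 categories triples the table.
three_way = cell_count([13, 8, 3])
print(three_way)  # 312
```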
Aggregate Census data
Want data at the CT level or higher?
• E-STAT has these data in Beyond 20/20, DBF, CSV and tab-delimited formats.
• The Statistics Canada website with level 2 access has text and Beyond 20/20 formats.
• The Data Library site has text and Beyond 20/20 formats.
• Also available through the UT CHASS Census site.
Want data at the DA level?
• The Data Library site has text and Beyond 20/20 formats.
Health
Health Region is the administrative area in which health care is delivered in Canada.
As administrative areas, Health Regions are determined by the provinces. Statistics Canada creates a customized product from the Census aggregating results using Health Region boundaries.
Health Indicators and Community Profiles are the two key sources for Health Region aggregate data.
Health
CIHI is responsible for disseminating statistics about the health care system at the Health Region level. The CIHI site provides maps without the data for a few indicators. The database, Regional Contextual Information for Health Regions with over 75,000 Population, appears to be the only data source on the CIHI site for Health Regions.
Justice
The table may refer to jurisdiction instead of geography.
Justice tables
• Table 253-0004 - Homicide survey, number and rates (per 100,000 population) of homicide victims, by census metropolitan area
• Refer users to http://www.statcan.ca/english/sdds/3315.htm
• Reports homicides according to four population sizes: 500K +, 250-499K, 100-249K and < 100K
• Groups metropolitan areas under these categories
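The grouping by population size can be sketched as a simple classification step. The cut-offs at 100,000, 250,000 and 500,000 are an assumption read off the category labels, and the area names and populations are hypothetical.

```python
from collections import defaultdict


def population_size_group(population):
    """Return the size category label for a population count."""
    if population >= 500_000:
        return "500K +"
    if population >= 250_000:
        return "250-499K"
    if population >= 100_000:
        return "100-249K"
    return "< 100K"


# Hypothetical areas and populations, for illustration only.
areas = {
    "Area A": 1_034_000,
    "Area B": 310_000,
    "Area C": 125_000,
    "Area D": 48_000,
    "Area E": 720_000,
}

# Group the area names under the four size categories.
groups = defaultdict(list)
for name, population in areas.items():
    groups[population_size_group(population)].append(name)

for label in ("500K +", "250-499K", "100-249K", "< 100K"):
    print(label, groups[label])
```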
Justice
Justice tables
• Police Administration Survey - Municipal Police Force Administration Character, 1986 - 2006
• 866 municipal police force jurisdictions
• The geo-code for municipalities consists of the Standard Geographic Classification code for the province (2 digits) followed by a 3-digit code that does not correspond to Census geography but does correspond with the Uniform Crime Report police force codes.
Justice
Justice tables
• Uniform Crime Survey - Crime Statistics, All Police Services, 1977 - 2003
• “There are approximately 1,200 separate police locations responding to the survey, comprising about 220 different police forces.” Canadian Crime Statistics, 85-205-XIE, p. 73.
• This table contains 2,711 police detachments, some no longer operational.
• The geo-code corresponds to the Police Administration Survey: 2-digit province code and 3-digit detachment code.
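Because only the first two digits of these codes follow the Standard Geographic Classification, only the province part can be linked to Census geography. This is a minimal sketch of splitting such a code; the function name and the 3-digit code "017" are hypothetical, while 48 is the SGC province code used in the Edmonton DA example.

```python
# SGC province codes; only the one used in this presentation is listed.
PROVINCE_NAMES = {"48": "Alberta"}


def split_police_code(code):
    """Split a 5-digit police geo-code into (province code, police code)."""
    if len(code) != 5 or not code.isdigit():
        raise ValueError(f"expected a 5-digit code, got {code!r}")
    return code[:2], code[2:]


province, force = split_police_code("48017")  # "017" is a hypothetical police code
print(PROVINCE_NAMES.get(province, "unknown province"), force)
```

The 3-digit part has to be looked up in the survey's own code list; it cannot be matched to a CSD or any other Census boundary file.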
Education
The Education tables on the DLI FTP site provide provincial-level summaries and, for some post-secondary tables, institution names. No Census spatial units other than province are used in these tables.
The Statistics Canada website contains the Report of the Pan-Canadian Education Indicators Program, which uses CMA and non-CMA reporting for some tables. Names, not geo-codes, are used to identify CMAs.
Business
Canadian Business Patterns reports the number of establishments by industrial classification and size of workforce. These aggregate data are available for CD, CSD and CMA/CA levels of Census geography.
The data also provide a time series at these geographic levels since 1998 for both the NAICS and SIC industry classifications.
CANSIM
CANSIM is primarily a time series database but every time series is placed in the context of some level of geography. One can search table titles for geography terms but cannot currently search just the geography field within each series.
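Searching table titles for a geography term is a case-insensitive substring match. The sketch below runs against a local list of titles, not against CANSIM itself; apart from Table 253-0004, the table numbers and titles are hypothetical.

```python
def search_titles(tables, term):
    """Return the table numbers whose title contains the term, ignoring case."""
    term = term.lower()
    return [number for number, title in tables.items() if term in title.lower()]


tables = {
    "253-0004": "Homicide survey, number and rates (per 100,000 population) "
                "of homicide victims, by census metropolitan area",
    "000-0001": "Hypothetical table, by province",       # placeholder entry
    "000-0002": "Hypothetical table, by health region",  # placeholder entry
}

hits = search_titles(tables, "Census Metropolitan Area")
print(hits)
```

A title search of this kind misses any table whose geography is recorded only in the series and not named in the title.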
Odds and ends
Survey of Household Spending
• Equipment (62F0041XDB): 17 metropolitan areas
• Spending (62F0031XDB): 17 metropolitan areas
Canada Revenue Agency
• Provincial-level statistical summaries from tax returns.
Environment Canada data sources use postal codes in some instances
Environment
• Human Activity and the Environment: Annual Statistics Product (16-201-XWE)
• Available in CANSIM series, too