Page 1: Class prep

Class prep

Go to S:\classes\UEP_ENV

Copy whole folder “American Community Survey Error Exploration” to your Desktop

Make writable: right-click on folder => properties => uncheck read-only

Page 2: Class prep

Class prep

1 Using Windows Explorer, go to the following folder: American Community Survey Error Exploration\AFF_data_tables\Median_HH_Income_tract

2 Open:
  ACS_10_SF4_B19013_metadata.csv – this is the metadata file for the ACS data
  ACS_10_SF4_B19013_Med_HH_Income.xlsx – this is the data table (median household income)

Page 3: Class prep

Today

Census mapping basics review and questions

Understanding American Community Survey margins of error

Calculating a reliability index (coefficient of variation or CV)

Visualizing the CV on a map

Page 4: Class prep

Questions about joining tables to geography?

Federal Information Processing Standards (FIPS) Codes

Area     Name            FIPS
State    Massachusetts   25
County   Suffolk         25025
Tract    000601          25025000601

Page 5: Class prep

We JOIN the data table to the geography table using the common ID column
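
As a rough illustration of the join, here is a minimal Python sketch. It assumes both tables carry the tract ID as text. The field names (`GEOID`, `tract_name`, `med_income`) and the income value are hypothetical placeholders, not taken from the class data, and the second tract ID is only an illustrative code that starts with a zero.

```python
# Build the tract ID from its parts (values from the FIPS table above).
state_fips = "25"       # Massachusetts
county_fips = "025"     # Suffolk County; the full county code is 25 + 025 = 25025
tract_code = "000601"
tract_id = state_fips + county_fips + tract_code

# Hypothetical geography table, keyed by the tract ID stored as text.
geography = {tract_id: {"tract_name": "placeholder tract label"}}

# Hypothetical data table; the income value is a placeholder, not real ACS data.
data_table = [{"GEOID": "25025000601", "med_income": 50000}]

# JOIN: attach each data row to the geography row that has the same ID.
joined = []
for row in data_table:
    match = geography.get(row["GEOID"])
    if match is not None:
        joined.append({**match, **row})

# Storing the ID as a number breaks the join when a code starts with zero.
illustrative_id = "01001020100"
as_number = str(int(illustrative_id))  # the leading zero is lost

print(joined)
print(illustrative_id, "->", as_number)
```

The join only works when the IDs match character for character, which is why the ID column should be kept as text in both tables.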

Page 6: Class prep
Page 7: Class prep

Mapping Numbers

Page 8: Class prep

Graduated color

Page 9: Class prep

Graduated Colors…number of renters

Page 10: Class prep

Graduated Symbols…number of renters

Page 11: Class prep

Normalization (“divide by”)

Number of population in rental units normalized by total population in occupied housing units

Page 12: Class prep

Fraction of renters living in each tract out of total population in occupied housing units

Page 13: Class prep

Using “Normalization”

Normalize by means “divide by”

Percentage – e.g., number of renters over total population in occupied housing
  Result is a fraction, e.g., .45
  Fractions are translated into percentages by multiplying by 100
  .45 = 45%

Density – population normalized by area (e.g., sq mi, acre)
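
The "divide by" idea can be sketched in a few lines of Python. The renter count, population, and area below are made-up numbers chosen to reproduce the .45 example; the function name `normalize` is a hypothetical helper, not an ArcGIS function.

```python
def normalize(numerator, denominator):
    """Divide one field by another; return None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator

# Hypothetical tract: 450 renters out of 1,000 people in occupied housing units.
fraction = normalize(450, 1000)
percent = fraction * 100

# Hypothetical density: 3,000 people living on 1.5 square miles.
density = normalize(3000, 1.5)

print(f"fraction = {fraction:.2f}, percent = {percent:.0f}%")
print(f"density = {density:.0f} people per sq mi")
```

Returning `None` for a zero denominator keeps tracts with no occupied housing (or no area) from raising an error; they can then be shown as "no data" on the map.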

Page 14: Class prep

Classes and Classification Methods

Page 15: Class prep

Classes and Classification Methods

Page 16: Class prep

Classification methods

Details from ArcGIS 10.1 Help – standard classification methods

1. Natural breaks – good for skewed data

2. Equal interval, defined interval, and standard deviation – good for evenly distributed data to show differences

3. Quantiles – good for evenly distributed data to show relative difference (e.g., top and bottom 20 percent)

4. Geometric interval – compromise that attempts to have similar number of features in each class with intervals being roughly the same
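
To see why the method matters, here is a small Python sketch that computes class breaks two ways for the same skewed data. These are simplified textbook versions of equal interval and quantile classification, not ArcGIS's own implementation, and the data values are invented for illustration.

```python
import math

def equal_interval_breaks(values, n_classes):
    """Upper break of each class when the data range is cut into equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes + 1)]

def quantile_breaks(values, n_classes):
    """Upper break of each class when each class gets the same number of features."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[math.ceil(i * n / n_classes) - 1] for i in range(1, n_classes + 1)]

# Invented, strongly skewed data: nine small values and one large outlier.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

ei = equal_interval_breaks(values, 5)
qu = quantile_breaks(values, 5)

# Count how many values land in the first (lowest) class under each method.
first_class_equal = sum(1 for v in values if v <= ei[0])
first_class_quantile = sum(1 for v in values if v <= qu[0])

print("equal interval breaks:", ei, "first class holds", first_class_equal)
print("quantile breaks:", qu, "first class holds", first_class_quantile)
```

With this data, equal interval puts nine of the ten values in the lowest class, while quantiles spreads them two per class, so the two maps would look very different.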

Page 17: Class prep

Classification methods

Details from ArcGIS 10.1 Help – standard classification methods
  Equal interval
  Defined interval
  Natural breaks
  Quantiles
  Standard deviation
  Geometric interval

Try them out!

Page 18: Class prep
Page 19: Class prep

Which classification method is best?

Page 20: Class prep

Formatting numeric labels

Page 21: Class prep
Page 22: Class prep

But make it better!

Clutter and data speak!

Clearer and cleaner

Page 23: Class prep

Review

Categories versus numbers

Proportional versus graduated symbols

Understanding classification methods
  No “right” method – explore
  Different methods => very different results
  Number of classes – hard to distinguish over 6

Understanding normalization (“divide by”)

Page 24: Class prep

Mapping a particular area – two selection options:

1. Select the town first, then perform select by location to get all tracts that intersect that town (or have their centroid in that town)

2. Zoom into an area slightly larger than the region you want to map, then interactively select all the tracts in that area (e.g., use the select tool to make a box around them)

Then Create Layer from Selected Features

Page 25: Class prep

Copying and pasting the same layer in your table of contents

If you want to map several variables that are within the same joined table(s), you can simply copy and paste the layer so that you have another copy

Then create maps from a different variable in each layer

Make sure to change title, legend

Page 26: Class prep

American Community Survey

What users need to know

Page 27: Class prep

Test: why do we need to use ACS data in policy / environmental analysis?

Page 28: Class prep

Because it has important information about our communities…

Page 29: Class prep

Because it has important information about our communities…

Page 30: Class prep

So we need to learn to use the information reliably…

And especially to understand the margin of error for ACS estimates

Page 31: Class prep

Review – What is the ACS?

American Community Survey
  A continuous monthly survey of households
  Long set of questions covering many topics
  Data is released once a year

1 Year averages – areas with a population 65,000+
3 Year averages – areas with a population 20,000+
5 Year averages – all other areas (including census tracts and block groups)

E.g., average number of people commuting by bicycle for 2007-2011

Page 32: Class prep

Use Census 2010 data where possible because it is a 100% survey and is therefore not subject to sampling error

Population Counts

Age

Race / Hispanic Ethnicity

Housing Unit Counts and Tenure (rented, owner-occupied)

Household and Family Relationships

Page 33: Class prep

ACS: Use the highest aggregation you can in terms of tables (can be hard to find)

Page 34: Class prep
Page 35: Class prep

ACS and Margin of Error

Means of transportation for commute – Tract Level – ACS 2005-2009 5 year estimates

Universe is workers 16 and over

Workers 16 and Over

Page 36: Class prep

Open the Excel files…

ACS_10_SF4_B19013_Med_HH_Income.xlsx – this is the data table (median household income)

ACS_10_SF4_B19013_metadata.csv – this is the metadata file for the ACS data

Page 37: Class prep

Metadata file and data table…

Page 38: Class prep

So let’s understand the margin of error…

Page 39: Class prep

What is Sampling Error?

Definition

The uncertainty associated with an estimate that is based on data gathered from a sample of the population rather than the full population

Page 40: Class prep

Illustration of Sampling Error

Estimate average number of children per household for a population with 3 households living in a block:

Household A has 1 child
Household B has 2 children
Household C has 3 children

The block average based on the full population is two children per household: (1 + 2 + 3) / 3 = 2

Page 41: Class prep

Conceptualizing Sampling Error

Three different samples of 2 households:

1. Households A and B (1 child, 2 children)
2. Households B and C (2 children, 3 children)
3. Households A and C (1 child, 3 children)

Three different averages based on which sample is used:

1. (1 + 2) / 2 = 1.5 children
2. (2 + 3) / 2 = 2.5 children
3. (1 + 3) / 2 = 2 children
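
The three-household illustration is small enough to enumerate in Python. This sketch uses only the numbers from the slides; the variable names are arbitrary.

```python
from itertools import combinations

# Children per household, from the illustration.
children = {"A": 1, "B": 2, "C": 3}

# Average based on the full population (all three households).
population_mean = sum(children.values()) / len(children)

# Average for every possible sample of two households.
sample_means = {
    pair: sum(children[h] for h in pair) / len(pair)
    for pair in combinations(children, 2)
}

print("full population:", population_mean)
for pair, mean in sample_means.items():
    print("sample", "+".join(pair), "->", mean, "error:", mean - population_mean)
```

Only one of the three possible samples reproduces the true average; the other two miss by half a child in either direction. That spread is the sampling error.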

Page 42: Class prep

Sampling Error

Census 2010 is a 100% survey so has smaller error

ACS data is based on samples – error is larger

The smaller the geography, the larger the error (because the sample is smaller)

Especially true for variables that sample a small number of people, e.g., bike commuters

Page 43: Class prep

ACS and Margin of Error

Means of transportation for commute – Tract Level – ACS 2005-2009 5 year estimates

Universe is workers 16 and over

Workers 16 and Over

Page 44: Class prep

American Community Survey and sampling error

The margin of error is calculated and included with each estimate

Calculated at 90% confidence level

What does that mean?

Page 45: Class prep

ACS and Margin of Error

Means of transportation for commute – Tract Level – ACS 2005-2009 5 year estimates

Universe is workers 16 and over

Workers 16 and Over

Page 46: Class prep

Confidence level of 90%

We don’t know for sure how many people in Tract 3.02 take public transit to work

Based on the ACS sample, our estimate over 5 years is that an average of 747 people take transit, +/- 226 at 90% confidence level

If we drew many, many samples of that same tract, 90% of the resulting ranges (for our sample, 521-973 people) would contain the real number of commuters taking transit.

10% of them would not

Page 47: Class prep

Confidence level of 90%

The confidence level of a margin of error indicates the likelihood that the true population value (real number) falls within the range given by the estimate plus or minus the margin of error

We can be 90% confident that somewhere between 521 and 973 people take transit to work in tract 3.02

Page 48: Class prep

We also know that Tract 3.02 has somewhere between 1958 and 2684 workers

So maybe half the workers take transit, or maybe just a fifth of them do. Ugh!!!
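
The arithmetic behind "half or a fifth" can be checked with a short Python sketch. It uses the Tract 3.02 figures from the slides. Pairing the extremes of the two ranges is only a crude way to bracket the share; it is not the Census Bureau's formula for the margin of error of a proportion.

```python
def interval(estimate, moe):
    """Lower and upper bound of an estimate at its published confidence level."""
    return estimate - moe, estimate + moe

# Transit commuters in Tract 3.02: 747 +/- 226 (90% confidence level).
transit_low, transit_high = interval(747, 226)

# Range for total workers, as given on the slide.
workers_low, workers_high = 1958, 2684

# Crude bracketing: fewest riders among most workers, most riders among fewest.
lowest_share = transit_low / workers_high
highest_share = transit_high / workers_low

print(f"transit commuters: {transit_low} to {transit_high}")
print(f"share of workers: {lowest_share:.0%} to {highest_share:.0%}")
```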

Page 49: Class prep

If using ACS data, pay attention to margin of error!

Page 50: Class prep

ACS table from American FactFinder…

Page 51: Class prep

Use metadata file plus AFF web site

This table shows Educational Attainment for the universe of people 25 years and older

Page 52: Class prep

Use AFF web site plus metadata file

Page 53: Class prep

Bottom line for ACS

More up to date information

Continuous versus point in time measurement

5 year estimates are the most reliable because they have the largest samples

But…

Poorer precision at finer scales (e.g., census tract) or areas of low population (rural areas)

Poorer precision for variables with low numbers (e.g., people who bike to work)

Page 54: Class prep

Don’t go any lower than tracts for mapping ACS data

Page 55: Class prep

Geographic Hierarchy

Page 56: Class prep

Measures associated with sampling error

Page 57: Class prep

Look at Excel file for Med_HH_Income

Page 58: Class prep

Measures Associated with Sampling Error

Standard Error (SE)

Margin of Error (MOE)

Coefficient of Variation (CV)

Page 59: Class prep

Standard Error (SE)

Definition

A measure of the variability of an estimate due to sampling

Depends on variability in the population and sample size

Formula

SE = MOE / 1.645 (for 90% confidence level)

Page 60: Class prep

Margin of Error (MOE)

Definition

A measure of the precision of an estimate at a given level of confidence (90%, 95%, 99%)

MOEs at the 90% confidence level are published for all ACS estimates
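
Because the published MOE is at the 90% level, it can be rescaled to another confidence level by going through the standard error. The sketch below assumes the usual normal-distribution multipliers (1.645, 1.960, 2.576); the function name `rescale_moe` is a hypothetical helper, and the 226 is the transit MOE for Tract 3.02 from the earlier slide.

```python
# Standard normal multipliers for common confidence levels.
Z = {90: 1.645, 95: 1.960, 99: 2.576}

def rescale_moe(moe_90, level):
    """Convert a 90% margin of error to another confidence level."""
    standard_error = moe_90 / Z[90]
    return standard_error * Z[level]

moe_90 = 226
moe_95 = rescale_moe(moe_90, 95)
moe_99 = rescale_moe(moe_90, 99)

print(f"90%: +/- {moe_90}, 95%: +/- {moe_95:.0f}, 99%: +/- {moe_99:.0f}")
```

Asking for more confidence widens the range: the same estimate has a larger margin of error at 95% or 99% than at 90%.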

Page 61: Class prep

Coefficient of Variation (CV)

Definition

The relative amount of sampling error associated with a sample estimate

A measure of reliability

Formula

CV = (Standard Error / Estimate) * 100%

Page 62: Class prep

CV% is a measure of reliability. So what is a good CV %?

No agreement

Depends on purpose

Census case studies:
  Less than 15% – may be reliable
  15-30% – not reliable, be very careful
  Over 30% – not reliable, use with extreme caution
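
The thresholds above translate directly into a small lookup. This is a sketch of the rule of thumb from the slide, not an official Census Bureau classification; the function name and the sample CV values are hypothetical.

```python
def reliability_class(cv_percent):
    """Label a CV (in percent) using the 15% and 30% rule-of-thumb breaks."""
    if cv_percent < 15:
        return "may be reliable"
    if cv_percent <= 30:
        return "not reliable, be very careful"
    return "not reliable, use with extreme caution"

# Hypothetical CV values for three tracts.
for cv in (8.0, 22.5, 61.0):
    print(f"CV {cv:5.1f}% -> {reliability_class(cv)}")
```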

Page 63: Class prep

To calculate CV, we first calculate the SE:

SE = (MOE / 1.645)

Page 64: Class prep

Then the CV% formula is:

CV = (SE / estimate)*100
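
Putting the two steps together, here is a minimal Python sketch of the calculation. It reuses the transit estimate for Tract 3.02 from the earlier slides (747 +/- 226); the function names are hypothetical, and returning `None` for a zero estimate is an assumption about how to handle a case the slides do not cover.

```python
def standard_error(moe_90):
    """SE from a margin of error published at the 90% confidence level."""
    return moe_90 / 1.645

def cv_percent(estimate, moe_90):
    """Coefficient of variation in percent; None when the estimate is zero."""
    if estimate == 0:
        return None
    return standard_error(moe_90) / estimate * 100

se = standard_error(226)
cv = cv_percent(747, 226)

print(f"SE = {se:.1f}, CV = {cv:.1f}%")
```

The same two formulas can be entered as new columns in the Excel table before it is joined in ArcMap.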

Page 65: Class prep

Two examples

Median household income and biking to work

Page 66: Class prep
Page 67: Class prep
Page 68: Class prep

Why do you think median household income generally shows lower CVs (more reliable estimates)?

Page 70: Class prep

Census data table modifications

Preparing data takes understanding and time

Probably best to do it in Excel ahead of time

Always remember to process the GeoID2 field to make it text

To be compatible with shape file:
  Column names – 10 characters max, no spaces or symbols
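
The slides recommend doing this cleanup in Excel. For anyone scripting it instead, the following Python sketch shows the same two fixes. The helper names and the sample column name are hypothetical; the 11-character width assumes tract-level IDs (2 state + 3 county + 6 tract digits), and the second ID is only an illustrative code that starts with a zero.

```python
import re

def shapefile_safe_name(name, max_len=10):
    """Drop spaces and symbols, then cut the name to the field-name limit."""
    cleaned = re.sub(r"[^A-Za-z0-9]", "", name)
    return cleaned[:max_len]

def geoid_as_text(value, width=11):
    """Store the ID as text, restoring any leading zero lost to a numeric type."""
    return str(value).zfill(width)

print(shapefile_safe_name("Median HH Income ($)"))
print(geoid_as_text(25025000601))
print(geoid_as_text(1001020100))
```

Truncating to 10 characters can make two long column names identical, so it is worth checking the cleaned names for duplicates before joining.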

Page 71: Class prep

Close Excel tables before opening ArcMap

Page 72: Class prep

From desktop, open the following mapfile:

American Community Survey Error Exploration\Exploring Error in the American Community Survey.mxd

Page 73: Class prep

Showing in ArcMap

Join the fixed Household Median Income table to Census Tract shape file

Create a map of Household Median income – 5 classes by quantiles

Right-click and copy tract layer

Right-click on Layers and choose Paste Layers

Map CV – 3 classes, with breaks at 15, 30, and max value

Page 74: Class prep

Symbolizing CV with hatch patterns

Page 75: Class prep

Hands on exploration of commute data

Page 76: Class prep

GIS Tools for Mapping ACS Estimates and Data Quality Information

http://gesg.gmu.edu/

Page 77: Class prep

For your census mapping assignment

You need to make 6 maps

6 different census variables (not necessarily from 6 different tables)

At least two of the maps have to show ACS variables

You don’t have to show CV on your maps but if you want to experiment, it’s good practice!

Page 78: Class prep

For your census mapping assignment

You can use census data you find from GIS clearinghouses – e.g., MassGIS

Instructions for clipping coastal tracts on GIS Tips and Tutorials web site

Page 79: Class prep

ACS and Error

Always be aware of error

Have a statement about error if you are making maps

Might be good to visualize the CV as well, at least as an inset?

In tables, include the margin of error

It’s your reputation that’s at stake!