51
Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science [email protected]

Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science [email protected]

Embed Size (px)

Citation preview

Page 1: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

Spatial analysis: a roadmap

David O’SullivanUniversity of AucklandSchool of Geography and Environmental Science

[email protected]

Page 2: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 2 / 51

Overview• A definition of spatial analysis

– Data types and basic questions

• The bad news: classic problems of spatial analysis– Spatial dependence

• The good news: potential of spatial analysis– Some generally useful concepts in the analysis of

spatial data

Page 3: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 3 / 51

One definition of spatial analysis• According to O’Sullivan and Unwin (2003) in

Geographic Information Analysis:

“concerned with investigating the patterns that arise as a result of processes that may be operating in space. Techniques and methods to enable the representation, description, measurement, comparison and generation of spatial patterns are central to the study of geographic information analysis”

O'Sullivan, D. and Unwin, D. J. 2003. Geographic Information Analysis. Wiley:

Hoboken, NJ.

Page 4: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 4 / 51

What else is ‘spatial analysis’?• It depends on your point of view:

1.Spatial data manipulation, often in a GIS, is frequently referred to as ‘spatial analysis’

2.Spatial data analysis is descriptive and exploratory

3.Spatial statistical analysis employs statistical methods to investigate data with respect to some statistical model

4.Spatial modeling is about constructing models to predict or to better understand spatial phenomena

Page 5: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 5 / 51

Auckland schools

Page 6: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 6 / 51

Two key aspects in spatial analysis

• Spatial data– What types of spatial data exist?

• Applying standard statistical ideas to spatial data– What problems does this introduce?– Why is spatial data special?

Page 7: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 7 / 51

Spatial data

• There are, broadly speaking, two main ways of representing the world:

Vector objects Raster fields

Page 8: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 8 / 51

Vector object typesPoints

Cities, people, houses, schools, crimes, disease incidence and mortality…

LinesDrainage, road, communications, power

networks, commutes…

AreasCensus districts, cities, boroughs,

townships, school enrolment zones, police precincts, land cover units…

Page 9: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 9 / 51

• Field data are useful when a phenomenon is measurable at all locations– Field data are common for natural

phenomena—air pressure, wind speed etc. – In social/human/population geography field

data are sometimes used to represent estimated densities, since a density may be considered to be measurable everywhere (e.g., crime mapping)

Fields

Page 10: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 10 / 51

Attribute data types

• At each location, we have measurements of various attributes

• There are two broad classes (familiar from statistics):– Numerical data, which are either ratio or

interval; and– Categorical data, which are either ordinal

or nominal

Page 11: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 11 / 51

The entity-attribute model

Points

Lines

Areas

Fields

Nominal

Ordinal

Interval

Ratio

State highway

Spot height

Electoral districts

Page 12: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 12 / 51

Some possible combinations

Source: O'Sullivan and Unwin. 2003.

Page 13: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 13 / 51

Some reservations … or … not so fast!

• This model is reductive– In particular, it is a statistician’s and GIS

person’s view– How do you represent complex

ethnographic data in this framework? ‘Home’? Photographs? Sound-clips? Hyperlinks?

– In addition scale is often a complicating factor

Page 14: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 14 / 51

OK, but…why the interest in spatial data?

• We assume that location makes a difference, so…– statistical distributions remain relevant…– … but now spatial patterns in the data are

also of interest…– … and the possible relationships between

the two are really what matter

Page 15: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 15 / 51

Source: O'Sullivan and Unwin. 2003.

Statistics and spatial analysis• Statistics is about

– Describing observed data

– Comparing observed data to expectations based on a statistical model

– Inferring from the comparison whether the observed data is compatible with the assumptions

Page 16: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 16 / 51

Simplification in statistics• To make the mathematics work,

assumptions about observed data are made. In particular that–Observations are random samples from a

population–Observations are independent of one

another

• But… these assumptions are never true of spatial data

Page 17: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 17 / 51

The problem with spatial dataor: why “spatial [data] is special”

• In a nutshell: “Everything is related to everything else, but near

things are more related than distant things ”Tobler, W. 1970. A computer movie simulating urban growth

in the Detroit region, Economic Geography 46, 234–40

– This is sometimes called the First Law of Geography … because it is generally true!

– It follows that: spatial data cannot be considered independent random samples from a population

Page 18: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 18 / 51

Why “spatial is special” more specifically

• Some commonly identified ‘problems’ with spatial data are:– Autocorrelation– The modifiable areal unit problem (or

MAUP)– Scale effects– Non-uniformity of space and ‘edge effects’

Page 19: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 19 / 51

Spatial autocorrelation

• This follows directly from the observation that “… near things are more related than distant things”– Spatial data are self-correlated – There is redundancy in spatial data

because observations made at locations near one another tend to be similar

Page 20: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 20 / 51

• Percent Pakeha by meshblock• Positive autocorrelation is much more

common in socio-economic data

Page 21: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 21 / 51

Problem, or opportunity?

• Autocorrelation is only a problem if we choose to see it as one. Equally, we can– Describe or measure the autocorrelation

structure of spatial data, in order to characterize it

– Potentially use the description to improve subsequent analysis (e.g., simple interpolation becomes kriging in Geostatistics)

Page 22: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 22 / 51

Describing autocorrelation

• Two broad effects can be considered:– First order variation

• Large scale variation in the mean value—a trend or background effect

– Second order variation• Local variation perhaps due to interaction

effects between observations• May be isotropic (no directional effects) or

anisotropic (with a directional component)

Page 23: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 23 / 51

First or second order?

• In practice, 1st and 2nd order effects are hard to distinguish

Source: O’Sullivan and Unwin. 2003.

Page 24: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 24 / 51

Autocorrelation statistics

• A number of formal measures exist:– Joins count statistics for binary or classed

data– Moran’s I and Geary’s c for numeric data,

at points or aggregated to areas– Semivariogram and covariogram functions

for point data

Page 25: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 25 / 51

Results

• For the Pakeha (European) population in Auckland City, using Moran’s I, we get:

Page 26: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 26 / 51

The modifiable areal unit problem (MAUP)

• Areas are ‘arbitrary’: they are designed for convenience of data collection, not with respect to underlying patterns– Standard statistical techniques are sensitive to the

choice of units– In one study* it was shown that the correlation

between two variables can be estimated anywhere between –1 and +1, depending on the spatial units used!

*Openshaw, S. and P. J. Taylor. 1979. A million or so correlation coefficients: three experiments on the modifiable areal unit

problem. In N. Wrigley (ed.) Statistical Methods in the Spatial Sciences, Pion: London, 127-44.

Page 27: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 27 / 51

Redistricting

• Perhaps the clearest example of MAUP in practice…

Source: The Economist

Page 28: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 28 / 51

Scale• Geographic scale effects are

fundamental:– Different data types are appropriate at

different scales, so available spatial data may be dependent on scale, ruling out some types of analysis

– Scale is a factor in autocorrelation, and in the distinction between 1st and 2nd order effects

– It is also a factor in MAUP with respect to the level of aggregation used

Page 29: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 29 / 51

Non-uniformity of space• The non-uniformity of space refers to

problems arising from tacit assumptions about the uniform spatial density of ‘background’ populations– For example, ‘clusters’ of crime are expected in

urban areas because more people live there– This leads to numerous analytic complexities

Example: ISO9000 certified firms in the United States

Page 30: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 30 / 51

Edge effects

• Entities on the edge of a study area only have neighbors in one direction (toward the middle)– Unless care is taken, this can distort things– Again, coping with this leads to numerous

analytic complexities

Page 31: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 31 / 51

The bad news summarized• Data are spatially dependent

– Spatial autocorrelation

• Data are also dependent on how you look at them spatially:– Aggregation– Scale– Non-uniformity of space– Edge-effects

Page 32: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 32 / 51

Some good news

• Spatial data do have an intrinsic advantage, however…

• In addition to data we have a record of where the data were observed

• Making the most of this extra information lies at the heart of spatial analysis

Page 33: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 33 / 51

Some useful general concepts

• A number of concepts are frequently invoked in spatial analysis– Distance– Adjacency– Interaction– Neighborhood– Proximity polygons

Page 34: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 34 / 51

Distance

• Easily calculated from two coordinates– Use Pythagoras’s theorem

– This is trickier on a sphere, but for projected data at sub-regional scales the Euclidean approximation is adequate

22 yxd y

x

d

Page 35: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 35 / 51

Other distance metrics

• A variety of non-Euclidean measures

• Network distance on a transport system

• Travel time

• Perceived distance

Page 36: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 36 / 51

Adjacency

• This is a sort of binary distance: two spatial objects are either adjacent or not– Often we use distance to decide: if d = 0,

then two objects are adjacent– This can get complex for some kinds of

object– The meaning is clearest for polygons

Page 37: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 37 / 51

Queens and rooks

• These terms are fairly self explanatory, referring to which types of adjacency we choose to ‘allow’

Page 38: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 38 / 51

A simple example

Page 39: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 39 / 51

Adjacency applied to measuring autocorrelation

• For each pair of neighboring cases, calculate the covariance:

– This produces a positive number when two values are similar, and a negative number when they are different

• Averaged over all neighboring cases, and scaled by dividing by the variance of the data, we get a number between –1 and +1– This is interpreted in the same way as a standard

correlation coefficient

xxxx ji

Page 40: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 40 / 51

Election 2000 again• Those county level results expressed in terms

of the percent share for George W. Bush

The clear spatial pattern in these data is confirmed by a Moran’s I value of around 0.45

Page 41: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 41 / 51

Interaction

• Interaction (often denoted wij) is a

measure of the likely strength of relationship between two entities– The most common form is inverse-distance

ijij d

w1

Page 42: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 42 / 51

Other measures of interaction–Inverse distance powered (usually squared)

kij dw

1

–Negative exponential

–Weighted inverse distance

kdij ew

k

jiij d

AAw

Page 43: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 43 / 51

Matrices and spatial pattern

• Many of these concepts assign a value to describe the relationship between every pair of objects

• This lends itself to being recorded in a matrix:

04510811610141

45060919924

1086006711068

116916705168

1019911051066

36246868660

D

Page 44: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 44 / 51

Spatial weights matrices

• A particularly common matrix is the spatial weights matrix, usually denoted by W

• This records the interaction between each pair of objects, and appears in– autocorrelation, point pattern analysis,

interpolation, spatial regression, geographically weighted regression, spatial interaction modeling…

Page 45: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 45 / 51

Neighborhood

• Neighborhood is a less clear-cut concept– It can mean the region of space around

some object– Or the set of objects considered to be

neighbors of that object

• Some notion of neighborhood is implied by any given weights matrix

Page 46: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 46 / 51

Proximity polygons• Proximity polygons are an increasingly

important example of the neighborhood concept

• A proximity polygon is associated with each spatial object and is the region of space nearer to that object than to any other

• A good demonstration of the idea is Voroglide by: Praktische Informatik VI, FernUniversität Hagen, Christian Icking, Rolf Klein, Peter Köllner, Lihong Ma

Page 47: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 47 / 51

Uses of proximity polygons

• Proximity polygons are commonly used in– Interpolation– Point pattern analysis

• Increasingly they are used throughout spatial analysis, especially in location decision making

Page 48: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 48 / 51

Just to review…

• Distance

• Adjacency

• Interaction

• Neighborhood

Page 49: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 49 / 51

GIS and spatial analysis

• GIS vendors often claim to offer ‘spatial analysis’– This usually doesn’t mean statistical spatial

analysis, but spatial data manipulation—buffering, overlay etc.

– However, GIS has increased the need for spatial analysis, because more people are making maps, and asking questions about them!

Page 50: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 50 / 51

If spatial analysis is so useful, why is it not integrated into GIS?!

• Different perspectives on spatial data– GIS is built around the entity-attribute model.

Spatial analysis uses this data (because that’s the way it comes). Conceptually, spatial analysis sees data as patterns which are the outcomes of processes, which can be quite different.

• Spatial analysis is not widely understood– It has been a specialized field, and therefore hard

to justify incorporating into GIS, as a standard tool

• Spatial analysis can make GIS hard to sell– Spatial analysis is about asking difficult questions,

not about easy answers

Page 51: Spatial analysis: a roadmap David O’Sullivan University of Auckland School of Geography and Environmental Science d.osullivan@auckland.ac.nz

GIS and Population Science - Penn State - June 12, 2006 - David O'Sullivan 51 / 51

Questions?

David O’SullivanUniversity of AucklandSchool of Geography and Environmental Science

[email protected]