A Short Course in Geoinformatics Part I: Science Issues in GeoInformatics Michael F. Goodchild



A Short Course in Geoinformatics
Part I: Science Issues in GeoInformatics

Michael F. Goodchild


Outline
• A short history of GIS
• Basic principles of GIScience
• Uncertainty


A short history of GIS
• Maps in computers
  – for decision-making
    • each map representing one dimension of a decision
  – for managing data
    • aggregating census returns to reporting zones
    • managing the multiple data types of transportation planning
  – to support map-making
    • editing
    • projection change


A model for landscape architecture
• Ian McHarg’s school at the University of Pennsylvania

Ian McHarg 1920-2001

Meteorology, Geology, Hydrology, Plant ecology, Animal ecology, Limnology, Computation, Remote sensing


“For the first time, a department of landscape architecture could recruit a faculty of distinguished natural scientists sharing the ecological view and determined to integrate their perceptions into a holistic discipline applied to the solution of contemporary problems.”

I.L. McHarg, A Quest for Life (Wiley, 1996, p. 192)

Integration of science into action

Frequently emulated as a model for environmental science

But with a weaker intervention component

The social context is missing

Computation and remote sensing do not fit the model


The Canada Geographic Information System
• Roger Tomlinson
  – IBM contracts 1964-68
• 7 layers of land characteristics
  – soil capability for agriculture
  – recreation capability
  – current land use
  – ….
• To assess the current use of Canadian land
  – to measure area, plan new uses


Technical aspects of CGIS
• Manuscript maps at 1:50,000
  – 7 per tile
• Hand-scribing of boundaries
• An optical scanner creating a raster of boundaries
• Vectorization
• Merging with area attributes
• The common boundary between two areas as the basic unit


Flat-file options (tape)
• By face/polygon
  – double recording of internal boundaries
  – spurious differences
• By edge/arc
  – half the data volume
  – compute area in O(vertices)
  – simplify overlay
  – attributes of adjacent polygons
  – no polygon records
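The O(vertices) area computation on the edge/arc layout can be sketched as follows. This is a minimal illustration, not the CGIS code: the record layout `(left_id, right_id, points)`, the function name `polygon_areas`, and the two-square map are all assumptions made for the example. Each arc adds its shoelace terms to the polygon on its left and subtracts them from the polygon on its right, so every stored vertex is visited once and no polygon records are needed.

```python
from collections import defaultdict


def polygon_areas(arcs):
    """Accumulate polygon areas in a single pass over arc records."""
    twice_area = defaultdict(float)
    for left_id, right_id, points in arcs:
        # Shoelace terms for this arc, walked in its stored direction.
        s = sum(x1 * y2 - x2 * y1
                for (x1, y1), (x2, y2) in zip(points, points[1:]))
        twice_area[left_id] += s    # left polygon is traversed counter-clockwise
        twice_area[right_id] -= s   # right polygon is traversed clockwise
    return {pid: abs(total) / 2.0 for pid, total in twice_area.items()}


# Two unit squares "A" and "B" sharing the edge x = 1 (illustrative data).
arcs = [
    ("A", "B", [(1, 0), (1, 1)]),                        # shared boundary, stored once
    ("A", "outside", [(1, 1), (0, 1), (0, 0), (1, 0)]),  # rest of A's outline
    ("B", "outside", [(1, 0), (2, 0), (2, 1), (1, 1)]),  # rest of B's outline
]
areas = polygon_areas(arcs)
```

In this sketch the "outside" entry ends up holding the total mapped area, which is a convenient check on the tabulation.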


Technical aspects…
• Storage on magnetic tape
  – variable-length records
  – leftpolyID, rightpolyID, #points, (x1,y1),…
• Indexing in Morton order
  – a quad-tree index
• Numerical output only
  – tabulations of area
  – no visual display
• Mainframe technology
  – later leased land lines at 300 bps
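Morton order can be illustrated by interleaving the bits of a tile's column and row numbers, so that sorting by the resulting key walks the tiles quadrant by quadrant. The sketch below is a generic illustration; the bit convention (column in the even bits, row in the odd bits) and the name `morton_key` are assumptions, since the slides do not say which convention CGIS used.

```python
def morton_key(col, row, bits=16):
    """Interleave the low `bits` bits of col and row into one integer key."""
    key = 0
    for i in range(bits):
        key |= ((col >> i) & 1) << (2 * i)       # column bit goes to even position
        key |= ((row >> i) & 1) << (2 * i + 1)   # row bit goes to odd position
    return key


# Sorting a 4 x 4 block of tiles by key groups each 2 x 2 quadrant together.
tiles = [(col, row) for row in range(4) for col in range(4)]
ordered = sorted(tiles, key=lambda t: morton_key(*t))
```

Because tiles that are close in the key are usually close on the ground, a sequential medium such as tape can serve neighbouring tiles with little winding.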


The quadtree
• Recursive subdivision
  – variable depth depending on local detail

[Figure: quadrants labeled 0, 1, 2, 3, with quadrant 3 subdivided into 30, 31, 32, 33]
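Recursive subdivision with variable depth can be sketched on a small raster. This is an illustrative sketch only: the function name `build_quadtree`, the digit-to-quadrant assignment, and the 4 x 4 grid are assumptions, and the grid is assumed square with a power-of-two side. A block is split only while it contains more than one value, so detail is stored only where the data need it.

```python
def build_quadtree(grid, x0=0, y0=0, size=None, key=""):
    """Return {leaf_key: value}; keys are strings of quadrant digits."""
    if size is None:
        size = len(grid)
    values = {grid[y][x]
              for y in range(y0, y0 + size)
              for x in range(x0, x0 + size)}
    if len(values) == 1 or size == 1:
        return {key or "root": values.pop()}   # homogeneous block: stop here
    half = size // 2
    leaves = {}
    # Assumed digit order: 0 = upper left, 1 = upper right,
    # 2 = lower left, 3 = lower right.
    offsets = [(0, 0), (half, 0), (0, half), (half, half)]
    for digit, (dx, dy) in enumerate(offsets):
        leaves.update(
            build_quadtree(grid, x0 + dx, y0 + dy, half, key + str(digit)))
    return leaves


grid = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
leaves = build_quadtree(grid)
```

With this grid only the mixed lower-right quadrant is subdivided, giving leaves 0, 1, 2, 30, 31, 32, and 33.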


Other types of maps
• Transportation links
  – linear features
  – networks
  – U.S. Bureau of the Census
  – blocks = 2-cells
  – street segments = 1-cells
  – intersections = 0-cells


Topological data structures
• 1977 conference
  – sponsored by Harvard University
• A unifying structure across many application areas
  – all three of: decision-making, managing data, editing maps
• The birth of ESRI


The relational model
• The map as a collection of arcs, nodes, and faces
  – F - A + N = 2
• Stored in tables with keys
• GIS built on RDBMS
  – INFO
• Vertices left out
  – a hybrid solution
  – ARC/INFO
  – the ARC data structure still proprietary
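The identity F - A + N = 2 gives a cheap consistency check on an arc table. The sketch below assumes a hypothetical table layout `(arc_id, from_node, to_node, left_face, right_face)` and a connected planar map in which the outside world counts as one face; it is not the ARC/INFO schema.

```python
# Two adjacent polygons "A" and "B" meeting at nodes n1 and n2 (illustrative).
arc_table = [
    # (arc_id, from_node, to_node, left_face, right_face)
    (1, "n1", "n2", "A", "B"),
    (2, "n2", "n1", "A", "outside"),
    (3, "n1", "n2", "B", "outside"),
]


def euler_count(arcs):
    """Return F - A + N for the map described by the arc table."""
    nodes = {n for _, start, end, _, _ in arcs for n in (start, end)}
    faces = {f for _, _, _, left, right in arcs for f in (left, right)}
    return len(faces) - len(arcs) + len(nodes)


consistent = euler_count(arc_table) == 2
```

A result other than 2 for a connected map signals a missing arc, a duplicated node, or a mislabelled face.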


Square pegs in round holes
• Cul-de-sacs
  – allow 1-nodes
• Properties of parts of edges
  – dynamic segmentation
  – linear referencing
• Non-planarity
  – overpasses and underpasses
  – turntables
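Linear referencing attaches a property to part of an edge by distance along it rather than by splitting the edge. A minimal sketch, assuming planar coordinates and a hypothetical helper `locate_along`; the route and the event record are made-up examples.

```python
import math


def locate_along(line, measure):
    """Return the (x, y) position at distance `measure` along a polyline."""
    if measure < 0:
        raise ValueError("measure must be non-negative")
    remaining = measure
    for (x1, y1), (x2, y2) in zip(line, line[1:]):
        seg = math.hypot(x2 - x1, y2 - y1)
        if seg > 0 and remaining <= seg:
            t = remaining / seg
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        remaining -= seg
    raise ValueError("measure lies beyond the end of the line")


route = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]          # total length 7
event = {"from": 2.0, "to": 5.0, "surface": "gravel"}  # attribute of part of the edge
start_xy = locate_along(route, event["from"])
end_xy = locate_along(route, event["to"])
```

The edge itself stays whole in the topology; only the event table changes when the property changes, which is the point of dynamic segmentation.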


A 1990s house of cards
• Still no vertices in the RDBMS
• Points
  – coordinates stored in tables
  – no topological relationships with other features
• Does it have to be this hard?
  – simple CAD data model
  – points, lines, and areas in an empty space
  – potentially overlapping
  – no topological relationships
  – compute on the fly


Object-oriented data modeling
• All features are instances of classes
• Classes inherit properties from more general classes
• Features can be aggregates of other features
• Features can be composed of other features
• Features can be associated
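The five statements above map directly onto ordinary class constructs. The sketch below uses hypothetical classes (`Road`, `Lane`, `RoadNetwork`, `Intersection`) chosen only for illustration; no particular GIS product's class library is implied.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:                      # the most general class
    name: str


@dataclass
class Lane(Feature):
    width_m: float = 3.5


@dataclass
class Road(Feature):                # inherits `name` from Feature
    # Composition: lanes exist only as parts of their road.
    lanes: list = field(default_factory=list)


@dataclass
class RoadNetwork(Feature):
    # Aggregation: the network groups roads that also exist on their own.
    roads: list = field(default_factory=list)


@dataclass
class Intersection(Feature):
    # Association: an intersection refers to the roads that meet there.
    meets: list = field(default_factory=list)


main = Road("Main St", lanes=[Lane("northbound"), Lane("southbound")])
oak = Road("Oak Ave", lanes=[Lane("eastbound")])
network = RoadNetwork("downtown", roads=[main, oak])
corner = Intersection("Main and Oak", meets=[main, oak])
```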


• Address
• Agriculture
• Archiving
• Atmospheric
• Basemap
• Biodiversity
• Census-Administrative Boundaries
• Defense-Intel
• Energy Utilities
• Energy Utilities - MultiSpeak TM
• Environmental Regulated Facilities
• Forestry
• Geology
• Groundwater
• Health
• Historic Preservation and Archaeology
• Hydro
• International Hydrographic Organization (IHO) S-57 for ENC
• Land Parcels
• Local Government
• Marine
• Petroleum
• Pipeline
• Raster
• Telecommunications
• Transportation
• Water Utilities


A paradigm shift
• Away from the map metaphor
  – georeferenced events, transactions
  – objects with no georeferences
  – phenomena that were never mapped
• Neogeography
  – customized maps
    • user-centric
    • transitory
• Interactions, flows


Original use case models: Interaction, Minard Napoleon map, Karst flow routes (diagram with multiplicities *, 0..1, 0..2)


Generic Flow Model (diagram with multiplicities *, 0..1, 0..2)


The data modeling cycle

The set of all phenomena in the domain

Adopt a generic solution

Identify inefficiencies and special cases

Find workarounds, violate the data model


Is the process beginning again?
• All features are instances of classes
  – are all phenomena naturally features?
  – is there a pre-feature stage?
• Inherently continuous phenomena
  – roads, rivers
  – topography
  – the pre-patch ecological landscape


Basic principles of GIScience
• The atomic geographic fact
  – the geo-atom
  – <x,z>
  – a pair defining what (z) is where (x)
• Point observations are individual geo-atoms
  – data about lines, areas, volumes can be decomposed into geo-atoms
  – the boundary of California defines an infinite number of statements of the form <x,z>
    • where z = 1 if x is inside the boundary
    • else z = 0
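The decomposition of an area into geo-atoms can be sketched as an indicator function: for any location x the boundary yields the pair <x,z> with z = 1 inside and z = 0 outside. The code uses a standard ray-casting test; the function names and the square "boundary" are illustrative assumptions, not real California coordinates.

```python
def inside(point, boundary):
    """Ray-casting point-in-polygon test on a list of (x, y) vertices."""
    x, y = point
    hit = False
    n = len(boundary)
    for i in range(n):
        (x1, y1), (x2, y2) = boundary[i], boundary[(i + 1) % n]
        if (y1 > y) != (y2 > y):                       # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:                            # crossing lies to the right
                hit = not hit
    return hit


def geo_atom(x, boundary):
    """Return the geo-atom <x, z> implied by an area boundary."""
    return (x, 1 if inside(x, boundary) else 0)


boundary = [(0, 0), (2, 0), (2, 2), (0, 2)]   # stand-in for a real state outline
atoms = [geo_atom(p, boundary) for p in [(1, 1), (3, 1), (0.5, 1.5)]]
```

The boundary is a finite record, yet it can answer the question for any of the infinitely many locations, which is why the atomic form is rarely stored directly.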


A typical kernel function

The result of applying a 150 km-wide kernel to points distributed over California
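Applying a kernel to a set of points turns discrete observations into a continuous density surface. The slides do not say which kernel was used, so the sketch below assumes a quartic (biweight) kernel with radius `h`; how `h` relates to the quoted 150 km width is likewise left open. The kernel is scaled to integrate to 1 over its disk, so the surface integrates to the number of points.

```python
import math


def quartic_kernel(d, h):
    """Quartic kernel value at distance d for radius h (zero beyond h)."""
    u = d / h
    if u >= 1.0:
        return 0.0
    return 3.0 / (math.pi * h * h) * (1.0 - u * u) ** 2


def density(x, y, points, h):
    """Kernel density estimate at (x, y) from a list of (px, py) points."""
    return sum(quartic_kernel(math.hypot(x - px, y - py), h)
               for px, py in points)


points = [(0.0, 0.0), (10.0, 0.0), (200.0, 200.0)]   # hypothetical coordinates in km
near = density(5.0, 0.0, points, h=75.0)             # two points within reach
far = density(500.0, 500.0, points, h=75.0)          # nothing within reach
```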


Discrete objects
• Points, lines, areas, or volumes
  – in an otherwise empty space
  – may overlap
  – countable
• Examples:
  – buildings
  – cars
  – instances of a disease
  – oil wells


Continuous fields
• Variables that can be measured anywhere
  – at any time
  – z = f(x,y), f(x,y,z), or f(x,y,z,t)
• Examples:
  – elevation of the ground surface
  – atmospheric temperature
  – soil pH
  – wind direction
• Variable can be a class
  – soil type
  – land use type


Fields as objects
• Fields discretized as collections of objects
  – sample points
  – isolines
  – triangles of a mesh
  – samples of a Fourier transform
• Methods implied by roles of objects
  – isolines cannot cross
  – polygons must not overlap
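One of the discretizations listed above, triangles of a mesh, shows how a stored object implies a method: within a triangle the field value is recovered by linear interpolation from the three vertices. A minimal sketch using barycentric weights; the function name and the sample triangle are assumptions.

```python
def interpolate_in_triangle(p, triangle):
    """Linearly interpolate z at p inside a triangle of (x, y, z) vertices.

    Returns None when p lies outside the triangle.
    """
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    x, y = p
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0.0:
        return None                      # p is not covered by this triangle
    return w1 * z1 + w2 * z2 + w3 * z3


# One mesh triangle with elevations at its corners (illustrative values).
triangle = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (0.0, 1.0, 30.0)]
z_inside = interpolate_in_triangle((0.25, 0.25), triangle)
z_outside = interpolate_in_triangle((1.0, 1.0), triangle)
```

The rule that mesh triangles must not overlap is what makes the answer at any location unique.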


Mitchell, A., 1999. The ESRI Guide to GIS Analysis. Redlands: ESRI Press


Principle
• There are two fundamentally distinct ways of aggregating geo-atoms
  – into discrete objects
    • all points within an object have the attributes of the object
  – into continuous fields
    • every point is mapped to a value of a variable
• Marginal cases:
  – weather highs, lows, fronts
  – mountain peaks
  – clouds in the sky


Beyond objects and fields
• Discrete objects that move
• Discrete objects that change shape
• Discrete objects that have internal structure


Helix representation

Spine: expresses spatio-temporal 3-D movement of the center of mass.

Prongs: express expansion or collapse of the object’s outline


May Yuan, University of Oklahoma


Hurricane Frances


Hurricane helixes


Spatially binary data
• <x1,x2,z>
  – information about the relationship between two locations
    • flow of migrants
    • distance
    • direction
    • time of travel
  – such information is key to understanding many social processes
  – conventional geographic information is spatially unary
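A spatially binary record <x1,x2,z> is keyed by an ordered pair of locations rather than by one location. The sketch below assumes planar coordinates and made-up place names and flows: distance and direction are derived from the pair, while a flow must be stored because it cannot be computed from the coordinates and need not be symmetric.

```python
import math


def relate(x1, x2):
    """Return (distance, bearing) for the ordered pair of locations (x1, x2)."""
    dx, dy = x2[0] - x1[0], x2[1] - x1[1]
    distance = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0   # clockwise from the +y axis
    return distance, bearing


places = {"P": (0.0, 0.0), "Q": (3.0, 4.0)}     # hypothetical coordinates
migrants = {("P", "Q"): 120, ("Q", "P"): 45}    # hypothetical flows, z per ordered pair

dist_pq, bearing_pq = relate(places["P"], places["Q"])
dist_qp, bearing_qp = relate(places["Q"], places["P"])
```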


1) Spatial dependence principle
• Tobler’s First Law of Geography (TFL)
  – “All things are similar, but nearby things are more similar than distant things”
• Horizontal context
  – geographic facts should be consistent with their surroundings
• Spatial dependence
  – the tendency for nearby observations to be correlated
  – violating an assumption of many statistical tests that observations are independent


Validity
• “Nearby things are less similar than distant things”
  – negative spatial autocorrelation
  – possible at certain scales
    • the checkerboard
    • retailing
  – but negative autocorrelation at one scale requires positive autocorrelation at other scales
  – smoothing processes dominate sharpening processes
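The checkerboard case can be checked numerically with Moran's I, a common index of spatial autocorrelation. The slides do not name an index, so this choice is an assumption, as are rook adjacency (cells sharing an edge) and equal weights. With those choices a checkerboard scores -1, while the same values arranged in two solid halves score positively.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid using rook adjacency, weights of 1."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(map(sum, grid)) / n
    cross, weight_total = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += (grid[r][c] - mean) * (grid[rr][cc] - mean)
                    weight_total += 1
    spread = sum((v - mean) ** 2 for row in grid for v in row)
    return (n / weight_total) * (cross / spread)


checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
two_halves = [[0, 0, 1, 1] for _ in range(4)]

i_checker = morans_i(checkerboard)   # every neighbour differs
i_halves = morans_i(two_halves)      # most neighbours agree
```

Aggregating the checkerboard into 2 x 2 blocks makes every block identical, a small illustration of how the sign of the result depends on the scale of measurement.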


Formalization
• Geostatistics
  – variogram, covariogram
  – measuring how similarity decreases (variance increases) with distance
  – parameters vary by phenomenon
    • does this make TFL less of a law?
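An empirical variogram makes "variance increases with distance" concrete: for each lag, take half the mean squared difference between all pairs of observations separated by that lag. The sketch below assumes evenly spaced samples along a transect, which keeps pairing trivial; the function name and the sample values are illustrative only.

```python
def semivariance(values, lag):
    """Half the mean squared difference between samples `lag` steps apart."""
    pairs = list(zip(values, values[lag:]))
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))


transect = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]     # hypothetical evenly spaced readings
variogram = {lag: semivariance(transect, lag) for lag in (1, 2, 3)}
```

For these readings the semivariance rises with lag (0.5, 2.0, 4.5). How fast it rises, and whether it levels off, are the parameters that vary by phenomenon.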


Utility
• Representation
  – GI is reducible to statements of the form <x,z>
  – the atomic form of GI is unmanageable, encountered only in point samples
  – all other GI data models assume TFL
• Spatial interpolation
  – IDW and Kriging implement TFL
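Inverse distance weighting (IDW) shows how interpolation builds TFL into a formula: each sample's weight falls off with distance, so nearby samples dominate the estimate. A minimal sketch, assuming planar coordinates and a power of 2 (a common default, not something the slides specify); the sample values are made up.

```python
import math


def idw(x, y, samples, power=2.0):
    """Estimate z at (x, y) from (sx, sy, z) samples by inverse distance weighting."""
    weighted_sum = weight_total = 0.0
    for sx, sy, z in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return z                      # exactly on a sample: honour its value
        w = 1.0 / d ** power
        weighted_sum += w * z
        weight_total += w
    return weighted_sum / weight_total


samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]   # hypothetical point samples
midway = idw(1.0, 0.0, samples)                  # equal weights
nearer_first = idw(0.5, 0.0, samples)            # pulled toward the first sample
```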


If TFL weren’t true
• GIS would be impossible
  – a point sample is useful only with interpolation
• Life would be impossible


2) Spatial heterogeneity principle
• The Earth’s surface is fundamentally heterogeneous
  – unlike humans, whose characteristics are distributed around an average
• It is difficult to generalize from a single case study
• The results of any case study depend explicitly on the spatial bounds of the study
• The second law of geography
• Again, problematic for science


Jorge Sifuentes, PhD dissertation


Practical implications of the second law
• A state is not a sample of the nation
  – a country is not a sample of the world
• Classification schemes will differ when devised by local jurisdictions
• Figures of the Earth will differ when devised by local surveying agencies
• Global standards will always compete with local standards


3) A fractal principle
• The closer you look the more you see
  – and for many natural phenomena the rate is orderly
  – Richardson plots
  – lengths of national boundaries
    • Spain and Portugal
    • context of 1920s
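A Richardson plot graphs log(measured length) against log(step size); for a fractal line the points fall on a straight line whose slope is 1 - D, where D is the fractal dimension. The sketch below uses a Koch curve as a stand-in for a boundary (an assumption, chosen because its dimension, log 4 / log 3, is known) and measures the same curve at coarser resolutions by skipping vertices.

```python
import math


def koch(points, levels):
    """Refine a polyline by replacing each segment with the four Koch segments."""
    c, s = math.cos(math.pi / 3), math.sin(math.pi / 3)
    for _ in range(levels):
        refined = [points[0]]
        for (x1, y1), (x2, y2) in zip(points, points[1:]):
            dx, dy = (x2 - x1) / 3, (y2 - y1) / 3
            a = (x1 + dx, y1 + dy)
            b = (x1 + 2 * dx, y1 + 2 * dy)
            apex = (a[0] + dx * c - dy * s, a[1] + dx * s + dy * c)
            refined += [a, apex, b, (x2, y2)]
        points = refined
    return points


def length(points, step):
    """Length measured using only every `step`-th vertex (coarser resolution)."""
    coarse = points[::step]
    return sum(math.dist(p, q) for p, q in zip(coarse, coarse[1:]))


LEVELS = 4
curve = koch([(0.0, 0.0), (1.0, 0.0)], LEVELS)
# Step sizes 3**-k for k = 1..LEVELS and the length seen at each resolution.
plot = [(3.0 ** -k, length(curve, 4 ** (LEVELS - k))) for k in range(1, LEVELS + 1)]
(eps_a, len_a), (eps_b, len_b) = plot[0], plot[-1]
slope = (math.log(len_b) - math.log(len_a)) / (math.log(eps_b) - math.log(eps_a))
dimension = 1.0 - slope
```

Every halving of the step lengthens the measured line, so a length reported without its resolution is not comparable with any other.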


Practical implications
• Indexing schemes, quadtrees
  – partitioning of information at different scales
• Length is a function of spatial resolution
  – and variously under-estimated in GIS
  – as are many other properties
    • slope
    • soil class
    • land cover class
  – spatial resolution should always be explicit in GIS analysis
    • easy in raster
    • much more difficult in vector


4) The uncertainty principle
• No representation of the Earth’s surface can be complete
  – no measurement of position can be perfect
  – a GIS will always leave doubt about the true nature of the Earth’s surface


ArcMap 10.0, Plate Carrée projection


Error-sensitive GIS
• Storing characterizations of uncertainty
• Propagation through GIS operations
• Visualization
• Confidence limits on products
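Propagation and confidence limits can be sketched with a Monte Carlo simulation: perturb the stored coordinates according to their stated uncertainty, rerun the operation many times, and report the spread of the results. Everything specific here is an assumption made for illustration: independent Gaussian error with standard deviation `sigma` on each coordinate, area as the GIS operation, and a 95% interval taken from the sorted results.

```python
import random
import statistics


def shoelace_area(polygon):
    """Unsigned area of a simple polygon given as (x, y) vertices."""
    n = len(polygon)
    total = sum(polygon[i][0] * polygon[(i + 1) % n][1]
                - polygon[(i + 1) % n][0] * polygon[i][1]
                for i in range(n))
    return abs(total) / 2.0


def simulate_area(polygon, sigma, runs=2000, seed=0):
    """Return (mean, lower, upper) area under random vertex displacement."""
    rng = random.Random(seed)               # fixed seed so the sketch is repeatable
    areas = []
    for _ in range(runs):
        noisy = [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
                 for x, y in polygon]
        areas.append(shoelace_area(noisy))
    areas.sort()
    lower = areas[int(0.025 * runs)]        # empirical 2.5th percentile
    upper = areas[int(0.975 * runs) - 1]    # empirical 97.5th percentile
    return statistics.mean(areas), lower, upper


parcel = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]  # 100 m square
mean_area, low, high = simulate_area(parcel, sigma=1.0)
```

The stored `sigma` is the characterization of uncertainty, the repeated runs are the propagation, and the interval is the confidence limit on the product.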


How to build one?
• Augmentation of existing data models
  – new attributes of objects, object classes, data sets
  – metadata
  – the five-fold way
  – Lanter and Veregin, GeoLineus
  – inheritance, object-orientation
