Geographic Theory and Geospatial Knowledge Discovery - Final

  • 8/3/2019 Geographic Theory and Geospatial Knowledge Discovery - Final


    Geographic theory and geospatial knowledge discovery

    Harvey J. Miller
    Department of Geography
    University of Utah
    [email protected]

    International Conference on Data Mining
    Pisa, Italy, 18 December 2008


    GIS trends

    Geospatial technologies
    - High-resolution monitors
    - Location-aware technologies
    - Geosensor networks
    - etc.

    A tsunami of digital geo-data
    - Increased volume: giga- to terabyte and beyond
    - Increased coverage: seamless databases
    - Increased spectrum: text, sound, imagery


    Introduction

    Geospatial knowledge discovery
    - Human-centered process of extracting novel, interesting and useful patterns from geo-referenced data
    - A (very!) special case of KDD

    Why is spatial special? A mini course (after Openshaw 1999)
    - Location is important
    - Observations are not independent
    - Errors are often spatial
    - Relationships are often local
    - Non-linearity is typical
    - Distributions are non-normal
    - Highly multivariate but often redundant
    - Time often interacts with space
    - Many data layers are categorical
    - Data objects often cannot be reduced to points
    - Spatially aggregated data are modifiable


    Introduction

    Geographic theory and data mining
    - There is a rich and underexploited body of geographic theory
    - This can help guide the GKD process: techniques, background knowledge, pattern evaluation, etc.

    Geography: it's not just trivia!


    Great geographic theories

    Spatial dependency
    - Tobler's first law
    - Cartographic transformations

    Spatial heterogeneity
    - Spatial non-stationarity
    - Disaggregate spatial statistics

    Spatial interaction
    - Spatial interaction theory
    - Time geography

    Spatial organization
    - The concept of region
    - Spatial logic


    What I will not talk about

    The map
    - One of the most powerful technologies in the history of civilization
    - Still evolving!
    - Useful in GKD: interfaces, pattern visualization
    - But, why the map?

    Earliest known map of the world - sixth century B.C.E.
    Google Earth
    www.antweb.org
    www.gutenberg.org


    What I will not talk about

    Domain theory
    - Theories about processes specific to particular domains
      - Ecosystems: biology, biogeography
      - Landscapes: geology, geomorphology
      - Cities: economics, political science, sociology, geography
    - These are theories with geospatial components, but are not uniquely geography


    What I am seeking and why

    A theory of geography
    - a unique perspective
    - a coherent way of thinking
    - amenable to formal and computational representation

    Framework for organizing GKD

    Suggest new techniques and strategies


    Spatial dependency

    Tobler's First Law of Geography
    - "Everything is related to everything else, but near things are more related than distant things"
    - "Everything is related to everything else": spatial interdependency
    - "Near things are more related than distant things": interdependency and proximity

    Tobler, W. R. 1970. A computer movie simulating urban growth in the Detroit region. Economic Geography 46: 234-240.

    Waldo Tobler receiving the 1999 ESRI Lifetime Achievement Award (Susanna Baumgart - UCSB)


    Spatial dependency

    Spatial autocorrelation
    - Association based on geospatial proximity
    - Confounding: something to be corrected (e.g., econometrics)
    - Informing: reveals information about spatial process (e.g., spatial autocorrelation statistics, spatial econometrics)

    Body Mass Index in Salt Lake City, USA
    Dr. Ikuho Yamada, University of Utah
    esri.com
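To make "association based on geospatial proximity" concrete, here is a minimal sketch of one widely used spatial autocorrelation statistic, global Moran's I. The slide does not name a specific statistic, so the choice of Moran's I, the five-zone chain, and the sample values are all illustrative assumptions.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weights matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, counted only for neighbouring pairs.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    ss = sum(d * d for d in dev)
    return (n / w_sum) * (cross / ss)


# Hypothetical layout: five zones in a row; adjacent zones are neighbours.
chain = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]

clustered = [1, 2, 3, 4, 5]     # similar values sit next to each other
alternating = [1, 5, 1, 5, 1]   # neighbours are as different as possible

i_clustered = morans_i(clustered, chain)
i_alternating = morans_i(alternating, chain)
```

Positive values indicate that nearby zones hold similar values; negative values indicate that neighbours tend to differ.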


    Spatial dependency

    Spatial interpolation
    - Estimate variables at unobserved locations using values at observed locations
    - Based on modeled proximity relationships
    - e.g., IDW, kriging

    Spatial interpolation of influenza over time in Europe 2004-2005 (www.eurosurveillance.org)
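As an illustration of interpolation from modeled proximity, the following is a minimal sketch of inverse distance weighting (IDW). The function name, the power parameter default of 2, and the two sample observations are assumptions made for the example.

```python
import math


def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    samples: iterable of (sx, sy, value) observations.
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value          # exactly on an observed location
        w = 1.0 / d ** power      # nearer observations weigh more
        num += w * value
        den += w
    return num / den


# Hypothetical observations: (x, y, value)
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint = idw(samples, 1.0, 0.0)     # equidistant from both samples
near_first = idw(samples, 0.5, 0.0)   # pulled toward the first sample
```

Kriging differs in that the weights come from a fitted model of spatial covariance rather than a fixed distance power.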


    Spatial dependency

    Geo-space
    - Proximity is the core of theories of geo-space
    - Two main components: locations and a length metric
    - Formal theory: Beguin and Thisse (1979), others
    - Admits a wide range of length metrics, including semi-metrics

    Miller and Wentz (2003) Annals, AAG

    The fundamental tenet of geography: geo-space is explanatory


    Spatial dependency

    Geo-space does not have to be Euclidean
    - Geographic processes can follow other metrics

    Cartographic transformations
    - Project geo-space based on:
      - Alternative proximity relationships
      - Smooth spatial heterogeneity
    - Why?
      - Visualization
      - Improve explanation

    CartoDraw - Keim, North & Panse
    Swedish migration map - Hagerstrand (cited by Tobler 1963)


    Spatial dependency

    Spatial modeling of disease propagation in an alternative geo-space

    Air passenger flows in Iceland
    Iceland in air passenger space
    Cliff and Haggett (1998)


    Spatial dependency

    Time-space maps
    - Map with separation measured in travel time
    - Why?
      - Exploratory visualization
      - Synoptic summary
      - Greater explanatory power

    Time-space transformation of Salt Lake City, USA
    Based on average daily travel times (vectors represent displacements)
    Nobbir Ahmed and Miller (2007)


    Spatial dependency

    Time-space transformations for 4 periods of the day (morning, midday, afternoon and evening), 1992 and 2001


    Spatial dependency

    Solutions are sometimes > 3D

    Highly stressed solution: western SLC

    Less stressful surface representation


    Spatial dependency

    Comparing alternative spaces

    Bi-dimensional regression
    - Degree of fit between two planar configurations, after transformations, rotations and translations
    - Fit and significance measures, including spatial variation
    - Can be extended to higher dimensions

    Nobbir Ahmed and Miller
    Tobler (1994)
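A minimal sketch of the simplest (Euclidean) form of bi-dimensional regression follows: it fits a translation, rotation and uniform scaling that maps one planar configuration onto another and reports an r-squared style fit measure. Treating coordinates as complex numbers is a convenience chosen for this example, not necessarily how the cited work implements it; the point sets are hypothetical.

```python
def bidimensional_regression(source, target):
    """Least-squares fit of target ~ a + b * source, with points as complex numbers.

    abs(b) is the scale factor, the argument of b is the rotation angle,
    and a is the translation. Returns (a, b, r_squared).
    """
    z = [complex(x, y) for x, y in source]
    w = [complex(u, v) for u, v in target]
    n = len(z)
    z_bar = sum(z) / n
    w_bar = sum(w) / n
    b = (sum((wi - w_bar) * (zi - z_bar).conjugate() for zi, wi in zip(z, w))
         / sum(abs(zi - z_bar) ** 2 for zi in z))
    a = w_bar - b * z_bar
    residual = sum(abs(wi - (a + b * zi)) ** 2 for zi, wi in zip(z, w))
    total = sum(abs(wi - w_bar) ** 2 for wi in w)
    return a, b, 1.0 - residual / total


# Hypothetical configurations: a unit square, and the same square
# scaled by 2, rotated 90 degrees and shifted by (3, 1).
geo_space = [(0, 0), (1, 0), (1, 1), (0, 1)]
time_space = [(3, 1), (3, 3), (1, 3), (1, 1)]
a, b, r_squared = bidimensional_regression(geo_space, time_space)
```

An r-squared near 1 means the second space is close to a rigid, uniformly scaled copy of the first; lower values indicate distortion that a similarity transformation cannot explain.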


    Spatial heterogeneity

    Geographic variation occurs naturally
    - Friction of distance
    - Relative location

    Spatial processes are non-stationary
    - Apparent variation in process with respect to location
    - If it's stationary, it's not spatial!

    www.geovista.psu.edu


    Spatial heterogeneity

    Question: What do Charles Darwin and Paul Krugman have in common?

    Besides beards

    uk.gizmodo.com
    ericblackink.minnpost.com


    Spatial heterogeneity

    Both recognized the power of spatial heterogeneity

    Darwin
    - Observed geographic variation in species
    - Natural selection leads to differences in species diversity and composition among different geographic locations
    - Long distance dispersal results in geographic isolation and evolutionary divergence

    uk.gizmodo.com


    Spatial heterogeneity

    Both recognized the power of spatial heterogeneity

    Krugman
    - New economic geography
    - Geographic variation in productive factors
    - Increasing returns enhance variation and lead to greater heterogeneity

    ericblackink.minnpost.com


    Spatial heterogeneity

    Disaggregate spatial statistics
    - Decompose processes by location
    - Examples
      - Getis-Ord G
      - K-function analysis
      - Geographically weighted regression
    - Unimaginable prior to GIS!
      - Data intensive
      - Visually intensive

    Local clustering of birth defects, Shanxi province, China.
    Wu et al (2004) www.biomedcentral.com
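To show what "decompose processes by location" can look like in practice, here is a minimal sketch of a local Getis-Ord statistic in the commonly used standardized Gi* form, where each zone counts as its own neighbour. The seven-zone layout and values are hypothetical, and the formula is the standard textbook z-score form rather than anything specific to the talk.

```python
import math


def getis_ord_gi_star(values, weights):
    """Local Gi* z-scores; weights[i][j] should include the focal zone (w_ii > 0)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w = weights[i]
        w_sum = sum(w)
        w_sq = sum(wij * wij for wij in w)
        # Observed local sum minus its expectation under no spatial structure.
        num = sum(wij * x for wij, x in zip(w, values)) - mean * w_sum
        den = s * math.sqrt((n * w_sq - w_sum ** 2) / (n - 1))
        scores.append(num / den)
    return scores


# Hypothetical layout: seven zones in a row with a high-value cluster at one end.
values = [1, 1, 1, 1, 1, 9, 9]
weights = [[1 if abs(i - j) <= 1 else 0 for j in range(7)] for i in range(7)]
scores = getis_ord_gi_star(values, weights)
```

Each zone receives its own score, so the output is a map of hot and cold spots rather than a single global summary.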


    Spatial heterogeneity

    Geographically weighted regression
    - Assess spatial variation in model structure
      - Parameter estimates
      - Parameter errors
      - Goodness of fit
      - Influence
    - Determine whether variation is systematic
    - Validate models between data subsets
    - Ask questions about spatial structures in data

    GWR with different spatial lags
    Laffin, S. W. GeoComputation 99
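The idea of location-specific parameter estimates can be sketched with a single predictor: at each focal location, observations are weighted by a distance-decay kernel and an ordinary weighted least-squares line is fitted. The Gaussian kernel, the bandwidth value, and the toy data (a relationship whose slope is 1 in the west and 3 in the east) are assumptions made for illustration.

```python
import math


def gwr_local_fit(points, x, y, focus, bandwidth):
    """Weighted least-squares intercept and slope for y ~ x around one focal location."""
    # Gaussian kernel: observations near the focus get weights close to 1.
    w = [math.exp(-0.5 * (math.dist(p, focus) / bandwidth) ** 2) for p in points]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: ten sites along a west-east line.
points = [(float(i), 0.0) for i in range(10)]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0]
true_slope = [1.0] * 5 + [3.0] * 5          # the process differs by location
y = [b * xi for b, xi in zip(true_slope, x)]

_, west_slope = gwr_local_fit(points, x, y, focus=(0.0, 0.0), bandwidth=1.5)
_, east_slope = gwr_local_fit(points, x, y, focus=(9.0, 0.0), bandwidth=1.5)
```

A single global regression on these data would report one intermediate slope; the local fits recover the fact that the relationship itself varies across space.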


    Spatial heterogeneity

    [Figure: GWR: spatial variation in the effect of social class on voter turnout, Dublin, Ireland. Two map panels: parameter estimates (PARM_4, classed from -1.857220 to 3.123810) and t-tests (TVAL_4, classed from -3.008610 to 5.342070, with breaks at -2.58, -1.96, 1.96 and 2.58).]

    Fotheringham and Demsar (2009)


    Spatial heterogeneity

    GWR and visual insight
    - Use visual analytics to explore parameter space
    - Example
      - SOM clusters based on eight parameters
      - Cartographic visualization of clusters

    Fotheringham and Demsar (2009)


    Spatial heterogeneity

    GWR and visual insight
    - Use visual analytics to explore parameter space
    - Example
      - Cluster selection
      - Parallel coordinate plot of clusters across all eight variables

    Fotheringham and Demsar (2009)


    Spatial interaction

    Spatial interaction theory
    - Linkages and flows between locations
      - Spatial separation (-)
      - Complementarity (+): origin supply, destination demand
    - Can be multidimensional
      - Map multiple variables into a single measure
    - Originally an analogy with Newton's Law of Gravitation, but ...


    Spatial interaction

    ... spatial interaction has a solid theoretical base
    - Entropy maximization: Alan Wilson, 1960s
    - Discrete choice theory: Stewart Fotheringham, 1980s

    ... which has resulted in a wide spectrum of models
    - Flow total constraints and quasi-constraints
    - Spatial association among origins, destinations
    - Behavioral processes
    - etc.
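As one concrete member of this family, here is a minimal sketch of an origin-constrained spatial interaction model with exponential distance decay, the functional form usually associated with the entropy-maximizing derivation. The function name, the decay parameter, and all input numbers are hypothetical.

```python
import math


def origin_constrained_flows(origins, destinations, distance, beta):
    """Predicted flows T_ij = A_i * O_i * D_j * exp(-beta * d_ij).

    The balancing factor A_i forces each origin's predicted flows
    to sum to its known outflow total O_i.
    """
    flows = []
    for i, o_i in enumerate(origins):
        pull = [d_j * math.exp(-beta * distance[i][j])
                for j, d_j in enumerate(destinations)]
        a_i = 1.0 / sum(pull)
        flows.append([a_i * o_i * p for p in pull])
    return flows


# Hypothetical system: two origins, three equally attractive destinations.
origins = [100.0, 50.0]                     # total outflow per origin
attractiveness = [20.0, 20.0, 20.0]         # destination demand
distance = [[1.0, 2.0, 4.0],
            [4.0, 2.0, 1.0]]
flows = origin_constrained_flows(origins, attractiveness, distance, beta=0.5)
```

Spatial separation enters negatively through the decay term and complementarity enters positively through the origin and destination terms, matching the signs listed for the theory.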


    Spatial interaction

    Data mining of spatial interactions
    - Existing techniques
      - Connections
      - Flows
    - Need better techniques
      - Attributed flows
      - Spatial object dyads: origin-destination pairs

    Visualizing social networks (orgnet.com)
    CubeView: detecting outliers in flows (Shashi Shekhar)


    Spatial interaction

    The death of distance?
    - Distance is changing: high mobility, connectivity

    Space-adjusting technologies (Ron Abler)
    - Change the nature of space with respect to the time, cost and effort

    Space-time convergence (Don Janelle)
    - Shrinking of distance due to transport
    - Rate per unit time

    Convergence: Edinburgh and London 1658-1950 (Janelle 1969)


    Spatial interaction

    Telepresence (Don Janelle)
    - Participate in events without physical presence

    Space-time fragmentation (Helen Couclelis)
    - Spatial fragmentation: activities not tightly coupled with place
    - Temporal fragmentation: activities outside standard hours

    Fluid time
    - Short planning horizons
    - Flocking

    Need to expand theories of spatial interaction

    Why let climbing a mountain interfere with business?
    Mt. Olympus, Utah, 18 June 2006


    Spatial interaction

    Time geography
    - Individual in geo-space and time
    - Constraints imposed by:
      - Activity timing
      - Activity locations
      - Mobility resources
      - Ability to trade time for space

    Space-time path
    - Realized movement

    Paths in theory and practice
    Meipo Kwan
    Miller (2005)


    Spatial interaction

    Time geography
    - Individual in geo-space and time
    - Constraints imposed by:
      - Activity timing
      - Activity locations
      - Mobility resources
      - Ability to trade time for space

    Space-time prism
    - Potential movement

    [Figure: Prism in theory and practice. Axes t and x; labelled quantities t_i, t_j, x_i, x_j, t_ij, a_ij, v_ij.]

    Miller and Bridwell (2009)
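A minimal sketch of the prism's feasibility test follows. The reading of the figure symbols is an assumption: x_i and x_j are taken as the anchor locations, t_i and t_j as the times the individual must leave the first and reach the second, v_ij as the maximum travel velocity, and a_ij as the stationary activity time; travel is assumed to follow straight lines in a plane.

```python
import math


def in_potential_path_area(x, x_i, x_j, t_i, t_j, v_ij, a_ij=0.0):
    """True if location x can be visited between the two anchors.

    Feasible when travel time x_i -> x -> x_j at velocity v_ij, plus the
    activity time a_ij, fits in the time budget t_j - t_i.
    """
    travel_time = (math.dist(x_i, x) + math.dist(x, x_j)) / v_ij
    return travel_time + a_ij <= t_j - t_i


# Hypothetical anchors: leave (0, 0) at t = 0, arrive at (4, 0) by t = 10.
home, work = (0.0, 0.0), (4.0, 0.0)
reachable = in_potential_path_area((2.0, 3.0), home, work, 0.0, 10.0, v_ij=1.0)
with_activity = in_potential_path_area((2.0, 3.0), home, work, 0.0, 10.0,
                                       v_ij=1.0, a_ij=3.0)
too_far = in_potential_path_area((2.0, 5.0), home, work, 0.0, 10.0, v_ij=1.0)
```

With a_ij equal to zero, the feasible locations form an ellipse whose foci are the two anchors; a longer activity or a lower velocity shrinks it.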


    Spatial interaction

    Communication modes based on spatio-temporal constraints (Janelle 1995)

                      Spatial
    Temporal          Presence               Telepresence
    Synchronous       SP: Face-to-face       ST: Telephone, TV
    Asynchronous      AP: Post-it notes      AT: Mail, Email, Webpages

    Possible time geographic expressions
    - Relations between paths
    - Temporal events


    Space-time cube: Visual analytic environment for exploratory time geography (Kraak and Huisman)

    Linking space-time paths with attributes

    Visualizing the intersection of multiple space-time prisms


    Spatial interaction

    Interactive, multiscale visualization of space-time paths
    - Explore paths at different levels of spatio-temporal granularity
    - Aggregation based on spatial similarity and attribute similarity (eventually)

    Tetsuo Kobayashi and Miller (in progress)


    Spatial organization

    The concept of region
    - Partitioning of geographic space based on homogeneity

    Two types of region
    - Formal
      - Explicit
      - Land cover, terrain, settlement patterns
    - Functional
      - Implicit
      - Organization, interactions, linkages

    Formal regions based on biogeography (www.desertmuseum.org)


    Spatial organization

    Regions and locational processes
    - Functional regions highlight the interplay between spatial process and spatial pattern

    Von Thunen bid-rent theory
    www.rri.wvu.edu
    teacherweb.ftl.pinecrest.edu


    Spatial organization

    Central Place Theory
    - Theory of the frequency, size and spacing of cities as market centers

    Nesting of market areas and cities (wolf.readinglitho.co.uk)

    Distance, Transport, Administration (Wikipedia)


    Spatial organization

    Spatial logic
    - A route to explanation
      - Spatial logic: pattern suggests process
      - Process logic: process suggests patterns
    - Why?
      - Patterns are integrated manifestations of complex processes
    - Why not?
      - Difficult to distinguish individual processes
      - Equifinality

    Continental drift inferred through spatial logic by Alfred Wegener (1912)


    Spatial organization

    Geography and complexity
    - Can spatial interaction explain intricate geographic patterns?

    Complexity theory
    - Simple, local interactions can generate complex global behavior

    Importance of geographic context
    - Pattern and intensity of interactions


    Spatial organization

    The problem of arbitrary regions
    - Arbitrary regions lead to artifacts
    - Two types of effects
      - Scale
      - Zoning
    - Solutions
      - Design optimal regions
      - No regions!
      - Assess effects

    Modifiable Areal Unit Problem example (after Oliver; www.geog.ubc.ca)

    Original cells (n = 9; mean = 8.89):
        10  15   5
         5  10  15
         5  10   5

    Three-zone aggregations:
        6.67, 11.67, 6.67    (n = 3; mean = 8.34)
        7.5, 11.25, 6.67     (n = 3; mean = 8.47)
        12.5, 8, 7.5         (n = 3; mean = 9.33)
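The zoning effect in this example can be reproduced with a short sketch. The slide gives only the aggregated values, not which cells belong to which zone, so the two zonings below are hypothetical assignments chosen because they yield the zone means 7.5, 11.25, 6.67 and 12.5, 8, 7.5.

```python
GRID = [[10, 15, 5],
        [5, 10, 15],
        [5, 10, 5]]


def mean_of_zone_means(grid, zones):
    """Average each zone's cells, then average those zone means (unweighted)."""
    zone_means = [sum(grid[r][c] for r, c in cells) / len(cells)
                  for cells in zones]
    return zone_means, sum(zone_means) / len(zone_means)


# Every cell as its own zone: the unaggregated data.
cells = [[(r, c)] for r in range(3) for c in range(3)]

# Hypothetical zonings (lists of (row, col) cells), both contiguous.
zoning_a = [[(0, 0), (1, 0)],
            [(0, 1), (0, 2), (1, 1), (1, 2)],
            [(2, 0), (2, 1), (2, 2)]]
zoning_b = [[(0, 0), (0, 1)],
            [(0, 2), (1, 0), (1, 1), (1, 2), (2, 2)],
            [(2, 0), (2, 1)]]

_, mean_cells = mean_of_zone_means(GRID, cells)
means_a, mean_a = mean_of_zone_means(GRID, zoning_a)
means_b, mean_b = mean_of_zone_means(GRID, zoning_b)
```

The underlying nine values never change; only the boundaries do, yet the summary statistic moves from about 8.89 to 8.47 or 9.33.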


    Is there a theory of geography?

    Yes!
    - A unique perspective focusing on the role of spatio-temporal proximity

    Is it formal?
    - Yes: geographers have been building the formal and analytical foundations of their field

    Is it coherent?
    - Yes, but it is not unified
    - Still need a grand unified theory derived from first principles

    Corné van Elzakker, ITC; www.itc.nl


    Opportunities and challenges

    Spatial patterns & relations
    - Potentially large!
    - Geo-theory can guide GKD: background knowledge, pattern evaluation

    Background knowledge: challenges
    - Geographic ontology
      - Concepts can be abstract, vague, multi-level
    - Knowledge extraction
      - Geo-theory: implicit information (equations, algorithms, etc.)
      - KD: explicit information (networks, hierarchies, rules)

    Concept hierarchy for location - based on Han and Kamber (2003)


    Opportunities and challenges

    Spatial pattern evaluation
    - Reality = theory: interesting but not novel
    - Reality = null: not interesting or novel
    - Between theory and null: maybe interesting and novel

    Problems
    - What is a good spatial null?
      - Not Complete Spatial Randomness (CSR)
    - What is the metric?
      - How do we measure spatial departures from theory and null?


    Opportunities and challenges

    Geographic theory as a pattern filter
    - Spatial data mining often generates a large number of spatial and temporal patterns and relationships

    Meta-mining (Roddick 1999)
    - Mining the results of previous mining exercises
    - Derive higher-level patterns and rules


    Opportunities and challenges

    Algorithms and infrastructure
    - Geographic models and techniques can be computationally complex
      - Often involve pairwise distances between all geo-locations
    - Research needs
      - Heuristics
      - High-performance computing

    This is a surprisingly under-researched area! 10 years old!
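One simple heuristic for avoiding all-pairs distance computation, sketched here as an illustration rather than as a technique proposed in the talk, is to bin locations into a uniform grid whose cell size equals the search radius, so that each location is only compared with locations in its own and adjacent cells. The point set and radius are hypothetical.

```python
import math
from collections import defaultdict


def neighbours_within(points, radius):
    """Return index pairs (i, j), i < j, whose planar distance is <= radius."""
    # Bin each point into a square cell with side length equal to the radius.
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(math.floor(x / radius), math.floor(y / radius))].append(idx)

    pairs = set()
    for (cx, cy), members in cells.items():
        # Any point within the radius must lie in this cell or one of its 8 neighbours.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), ()):
                    for i in members:
                        if i < j and math.dist(points[i], points[j]) <= radius:
                            pairs.add((i, j))
    return pairs


# Hypothetical locations in planar coordinates.
points = [(0.0, 0.0), (0.5, 0.5), (3.0, 3.0), (3.2, 3.1), (10.0, 10.0), (-0.4, 0.2)]
close_pairs = neighbours_within(points, radius=1.0)
```

When locations are spread out relative to the radius, the work grows roughly with the number of locations instead of its square; dense clusters remain expensive, which is one reason such heuristics need further research.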


    Conclusion

    Geographic theory
    - Rich, coherent, formal
    - Useful, but underexploited in data mining
    - Waiting to be discovered

    Help fill the blank spots on the map!

    wikimedia.org


    Bibliography

    Ahmed, N. and Miller, H. J. (2007) "Time-space transformations of geographic space for exploring, analyzing and visualizing transportation systems," Journal of Transport Geography, 15, 2-17.

    Beguin, H. and Thisse, J. F. (1979) "An axiomatic approach to geographical space," Geographical Analysis, 11, 325-41.

    Fotheringham, A. S. (1983) "A new set of spatial-interaction models: The theory of competing destinations," Environment and Planning A, 15, 15-36.

    Fotheringham, A. S. and Demsar, U. (2009) "Looking for a relationship? Try GWR," in H. J. Miller and J. Han (eds.) Geographic Data Mining and Knowledge Discovery - second edition, Taylor and Francis, in press.

    Getis, A. and Ord, J. K. (1992) "The analysis of spatial association by use of distance statistics," Geographical Analysis, 24, 189-206.

    Janelle, D. G. (1969) "Spatial organization: A model and concept," Annals of the Association of American Geographers, 59, 348-64.

    Janelle, D. G. (1995) "Metropolitan expansion, telecommuting and transportation," in S. Hanson (ed.) The Geography of Urban Transportation, 407-34. New York: Guilford.


    Kraak, M. J. and Huisman, O. (2009) "Beyond exploratory visualization of space-time paths," in H. J. Miller and J. Han (eds.) Geographic Data Mining and Knowledge Discovery - second edition, Taylor and Francis, in press.

    Miller, H. J. (2004) "Tobler's First Law and spatial analysis," Annals of the Association of American Geographers, 94, 284-289.

    Miller, H. J. (2005) "A measurement theory for time geography," Geographical Analysis, 37, 17-45.

    Miller, H. J. (2005) "Necessary space-time conditions for human interaction," Environment and Planning B: Planning and Design, 32, 381-401.

    Miller, H. J. and Bridwell, S. A. (2009) "A field-based theory for time geography," Annals of the Association of American Geographers, 99 (in press).

    Miller, H. J. and Wentz, E. A. (2003) "Representation and spatial analysis in geographic information systems," Annals of the Association of American Geographers, 93(3), 574-594.

    Tobler, W. R. (1963) "Geographic area and map projections," The Geographical Review, 53, 59-78.

    Tobler, W. R. (1970) "A computer movie simulating urban growth in the Detroit region," Economic Geography, 46, 234-240.

    Tobler, W. R. (1994) "Bi-dimensional regression," Geographical Analysis, 26, 187-212.