Representing Geography Data models and axioms

Source: UBC Blogs, blogs.ubc.ca/advancedgis/files/2019/09/lecture04.pdf


  • Representing Geography: Data models and axioms

  • Road map

    Representing the real world

    Conceptual models: objects vs fields

    Implementation models: vector vs raster

    Vector topological model

    Surfaces and networks

  • Representing the real world in a GIS

    The contents of a spatial database represent a limited view of reality (mandates):

    The world is infinitely complex

    The spatial database is a representation of a model of reality

    Ontological and epistemological issues

  • Representing the real world

    A database may include:

    digital versions of real objects (‘bona fide’), e.g., houses, roads, trees

    digital versions of fictitious (i.e., invented) objects (‘fiat’), e.g., political boundaries, census tracts, forest stand boundaries

    What kinds of objects exist in the real world?

    If I tell you that in the figure on the right there are two cities (Vancouver and Burnaby), what exists that you cannot see?

  • Representing the real world

    Some features are discrete, clearly defined entities (e.g., houses, districts)

    discrete representation is no problem

    Other features exist everywhere and vary continuously (e.g., temperature, pH)

    variation needs to be approximated using discrete representations

    Some things you can hold, and they are clearly delimited. Other things are known through measurements, and they are infinite in scope.

  • Sometimes, the distinction between discrete and continuous is not very clear

    [Figure: reality versus GIS representation of a forest / non-forest boundary; cells of uncertain class are marked “?” (fuzzy logic)]

  • Representing the real world

    Computers are good at storing discrete spatial data, but bad at storing continuous spatial data (since everything in a computer is ultimately represented as a binary [0,1] number) (also, fractals)

  • Objects versus Fields

    GIScientists have been developing a conceptual model of space, independent of the way the data is physically represented in the computer:

    Object or entity view: “empty space littered with objects” (points, lines or areas)

    Field view: a value is defined for every location

    Can you transmute the concepts?

  • Examples of objects

    [Figures: Vancouver / Burnaby census tracts; 3-D view of Prague]

  • Examples of fields

    [Figures: elevation (a DEM); groundwater nitrate concentration]

  • Data model implementation

    Implementing, in a computer, valid representations of fields and objects requires a set of rules. These rules will depend on the particular (physical) data model selected.

    These rules convert (conceptual) geographic models into discrete (physical, computer-based) representations.

    (aka: the spatial database is a representation of a model of reality)

  • GIS data models

    What are the two dominant models?

    The two dominant representations:

    raster or grid data model

    vector data model

  • The raster data model

    Divides the study area into square cells (a grid)

    Register the corners of the grid to the Earth

    Represent discrete objects as collections of one or more cells

    Represent fields by assigning attribute values (NOIR: nominal, ordinal, interval, ratio) to cells

    More commonly used to represent fields than discrete objects
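    Registering the grid to the Earth amounts to fixing an origin and a cell size, after which any ground coordinate maps to a row and column. The sketch below assumes a north-up grid registered at its upper-left corner; the function names and the example origin are hypothetical and not taken from any particular GIS.

```python
import math

def world_to_cell(x, y, x_min, y_max, cell_size):
    """Return the (row, col) of the cell containing ground point (x, y).

    Assumes a north-up grid whose upper-left corner is (x_min, y_max),
    so rows count downward from the top edge.
    """
    col = math.floor((x - x_min) / cell_size)
    row = math.floor((y_max - y) / cell_size)
    return row, col

def cell_centre(row, col, x_min, y_max, cell_size):
    """Return the ground coordinates of the centre of cell (row, col)."""
    x = x_min + (col + 0.5) * cell_size
    y = y_max - (row + 0.5) * cell_size
    return x, y

# Hypothetical grid: origin (1000, 2000), 10 m cells.
row, col = world_to_cell(1025.0, 1985.0, x_min=1000.0, y_max=2000.0, cell_size=10.0)
print(row, col)                                      # 1 2
print(cell_centre(row, col, 1000.0, 2000.0, 10.0))   # (1025.0, 1985.0)
```

    Every point inside the same 10 m cell returns the same (row, col), which is why variation within a cell cannot be recovered afterwards.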

  • The raster data model

    Raster representation: each color represents a different value of a (e.g.) nominal-scale field denoting land cover class.

    Legend

    Mixed conifer

    Douglas fir

    Oak savannah

    Grassland

  • The raster data model

    Pixel size: the size of the cell or picture element, defining the level of spatial detail. All variation within a pixel is lost (mixed pixel problem).

    Assignment schemes: the value of a cell may be an average over the cell (x̄), a total within the cell (∑), the commonest value in the cell (mode), based on a priority (ensure that rare things get recorded), or a Boolean value (present [1] or not [0]).

    It may also be the value found at the cell’s central point (centroid). (There are more rules.)
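    The assignment schemes can be compared on a single cell. This is a minimal sketch, assuming each cell is sampled as a small square block of sub-cell observations; `cell_value` and the sample blocks are hypothetical names and data, not part of any GIS package.

```python
from collections import Counter
from statistics import mean

def cell_value(block, scheme, priority=()):
    """Collapse a square block of sub-cell samples into one cell value."""
    values = [v for row in block for v in row]
    if scheme == "mean":
        return mean(values)
    if scheme == "total":
        return sum(values)
    if scheme == "mode":
        return Counter(values).most_common(1)[0][0]
    if scheme == "priority":
        # The first class in the priority list that occurs anywhere wins,
        # so rare but important classes are not lost.
        for cls in priority:
            if cls in values:
                return cls
        return None
    if scheme == "presence":
        return 1 if any(values) else 0
    if scheme == "centre":
        mid = len(block) // 2
        return block[mid][mid]
    raise ValueError(f"unknown scheme: {scheme}")

land_cover = [
    ["forest", "forest", "grass"],
    ["forest", "wetland", "grass"],
    ["forest", "forest", "grass"],
]
rainfall = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

print(cell_value(land_cover, "mode"))      # forest
print(cell_value(land_cover, "centre"))    # wetland
print(cell_value(land_cover, "priority", ["wetland", "forest", "grass"]))  # wetland
print(cell_value(rainfall, "mean"), cell_value(rainfall, "total"))         # 5 45
```

    The same block yields "forest" or "wetland" depending on the rule, which is the ambiguity behind the mixed pixel problem.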

  • Mixed pixel problem

    Regardless of the assignment rule used, ambiguity will always be present.

    [Figure: four kinds of mixed pixel: sub-pixel, boundary pixel, linear sub-pixel, intergrade]

  • Tessellation data models

    The raster data model is a member of a larger group of field models known as tessellation data models.

    Grid or raster

    Hexagonal

    TIN

    Quadtree

    All completely cover the space, and therefore can be used to represent fields (but not all are raster-based models)

    http://www.mcescher.com/gallery/

  • Tessellation models

    Hexagons

    Quadtree

    TIN

    Cells

    http://desktop.arcgis.com/en/arcmap/latest/tools/data-management-toolbox/generatetesellation.htm

    Spatial ecology: http://www.spatialecology.com/gme/genhexagonsinpolys.htm
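    As one illustration of a tessellation that is not a plain raster, a quadtree can be sketched as a recursive split of a square grid into quadrants until each block is uniform. The sketch assumes a grid whose side length is a power of two; `build_quadtree` and the sample grid are illustrative only.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return a leaf value if the block is uniform, otherwise a list of
    four sub-quadrants in the order NW, NE, SW, SE."""
    if size is None:
        size = len(grid)
    first = grid[row][col]
    uniform = all(
        grid[r][c] == first
        for r in range(row, row + size)
        for c in range(col, col + size)
    )
    if uniform:
        return first
    half = size // 2
    return [
        build_quadtree(grid, row, col, half),
        build_quadtree(grid, row, col + half, half),
        build_quadtree(grid, row + half, col, half),
        build_quadtree(grid, row + half, col + half, half),
    ]

def count_leaves(node):
    """Count the blocks needed to cover the whole grid."""
    if isinstance(node, list):
        return sum(count_leaves(child) for child in node)
    return 1

grid = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
tree = build_quadtree(grid)
print(tree)                # [0, 1, 0, [0, 1, 1, 1]]
print(count_leaves(tree))  # 7 blocks instead of 16 cells
```

    The tree still covers the space completely, so it can represent a field, but it spends detail only where values change.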

  • The vector data model

    Real world objects are represented as points, lines and areas

    Points identify locations (0 D)

    Lines connect points (1 D)

    Areas (polygons) consist of connected line segments (2 D)

  • The vector data model

    [Figure: points, lines, and areas plotted on x and y axes, with nodes and vertices labeled]

  • The vector data model

    Objects are defined by their (x,y) coordinates in a coordinate system (spherical: lat,long or planar: x,y)

    Precision of coordinates virtually infinite (only machine-dependent) (resolution)

    but: accuracy most often limited!

    E.g., the distance from A to B is given as 1458.394958397 m, but the data was digitized off of a 1:20,000 scale map. Is that distance meaningful? (Significant digits)

    http://scienceblogs.com/goodmath/2009/03/04/basics-significant-figures/
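    The significant-digits question can be made concrete with a little arithmetic. The sketch below assumes a positional error of 0.5 mm on the paper map; that figure is a hypothetical illustration, not a value given in the lecture, and the function names are made up for this example.

```python
def ground_uncertainty_m(scale_denominator, map_error_mm=0.5):
    """Convert an error measured on the map (mm) to metres on the ground.

    map_error_mm=0.5 is an assumed value for illustration only.
    """
    return map_error_mm * scale_denominator / 1000.0

def round_to_uncertainty(value, uncertainty):
    """Round a measurement to the nearest multiple of its uncertainty."""
    return round(value / uncertainty) * uncertainty

uncertainty = ground_uncertainty_m(20_000)
reported = round_to_uncertainty(1458.394958397, uncertainty)
print(uncertainty)  # 10.0
print(reported)     # 1460.0
```

    Under that assumption the stored distance carries roughly three meaningful digits; the remaining decimals reflect machine precision, not accuracy.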

  • Resolution: raster & vector

    Some thoughts.

    http://ibis.geog.ubc.ca/%7Ebrian/Course.Notes/gisscale

  • Precision versus Accuracy

    Precision: the smallest difference between adjacent positions that was recorded and stored (e.g., were distances measured to the nearest m or mm?). (Also applies to attributes)

    Accuracy: the freedom from error (what is the truth?)

    Uncertainty: now the more commonly-used term, as it represents the overall ambiguity (think of the map scale)

    Which demonstrates accuracy?

    Which demonstrates precision?

    A good review of the terms

    https://www.inorganicventures.com/accuracy-precision-mean-and-standard-deviation
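    In the repeated-measurement sense used by the linked review, accuracy can be summarized by the bias of the mean and precision by the spread of the readings. This is a rough sketch with invented readings of a quantity whose true value is assumed to be 100.0; none of the numbers come from the lecture.

```python
from statistics import mean, stdev

def bias(measurements, true_value):
    """Accuracy: how far the average reading is from the truth."""
    return mean(measurements) - true_value

def spread(measurements):
    """Precision (repeatability): sample standard deviation of the readings."""
    return stdev(measurements)

true_value = 100.0                                 # assumed known truth
accurate_imprecise = [98.0, 102.0, 99.0, 101.0]    # centred but scattered
precise_inaccurate = [103.0, 103.1, 102.9, 103.0]  # tight but offset

for name, data in [("accurate, imprecise", accurate_imprecise),
                   ("precise, inaccurate", precise_inaccurate)]:
    print(f"{name}: bias={bias(data, true_value):+.2f}, spread={spread(data):.2f}")
```

    A small spread says nothing about bias, which is why a distance stored to many decimals can still be far from the truth.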

  • [Diagram: conceptual models (objects, e.g., spot heights; fields, e.g., elevation) linked to implementation models (vector, e.g., contours, TIN; raster, e.g., DEM) by transmutation]

  • Road map

    Representing the real world

    Conceptual models: objects vs fields

    Implementation models: vector vs raster

    Vector topological model

    Surfaces and networks

  • Levels of GIS data model abstraction

  • Object representations

    [Figure: points, lines, and polygons, each set labeled A, B, and C]

  • Common field representations

    What are some of the ways we can store fields in a GIS?

    [Figure: six panels showing a raster grid, a regular point grid, irregular points, contour lines, a TIN, and polygons, annotated with sample values from 130 to 270]

    Fishnet in ArcGIS: http://desktop.arcgis.com/en/arcmap/latest/tools/data-management-toolbox/how-create-fishnet-works.htm

  • Vector data structures

    There are three main vector data structures:

    Simple (or spaghetti) data structure: no intelligence, duplication of data (within a layer) (KML)

    Point dictionary: no intelligence, no duplication of data (within a layer)

    Topological structure: intelligence, no duplication of data (within a layer)

    http://code.google.com/apis/kml/documentation/kml_tut.html

  • Spaghetti Vector Data Model

    Each point, line, or polygon is stored as a record in a file that consists of that entity’s ID and a list of coordinates that define its geometry.

    For points:

    ID   Coordinates
    1    (2,3)
    2    (5,5)

  • Spaghetti Vector Data Model

    For lines:

    ID   Coordinates
    1    (0,1), (3,4), (5,6)
    2    (3,1), (5,2), (4,3)

  • Spaghetti Vector Data Model

    For polygons:

    ID   Coordinates
    1    (2,4), (4,3), (3,6), (2,4)
    2    (3,1), (5,2), (4,3), (3,2), (3,1)

    KML: https://developers.google.com/kml/documentation/kmlreference#coordinates
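    The difference between the spaghetti structure and a point dictionary can be sketched with the two polygon records from this slide. The helper `to_point_dictionary` and the point numbering are hypothetical; the sketch only shows that shared coordinates are stored once and referenced by ID.

```python
# Spaghetti records: every feature carries its own full coordinate list.
spaghetti = {
    1: [(2, 4), (4, 3), (3, 6), (2, 4)],
    2: [(3, 1), (5, 2), (4, 3), (3, 2), (3, 1)],
}

def to_point_dictionary(records):
    """Store each distinct coordinate once and let features refer to point IDs."""
    point_ids = {}   # coordinate -> point ID, in order of first appearance
    features = {}    # feature ID -> list of point IDs
    for feature_id, coords in records.items():
        ids = []
        for xy in coords:
            if xy not in point_ids:
                point_ids[xy] = len(point_ids) + 1
            ids.append(point_ids[xy])
        features[feature_id] = ids
    points = {pid: xy for xy, pid in point_ids.items()}
    return points, features

points, features = to_point_dictionary(spaghetti)
stored_pairs = sum(len(coords) for coords in spaghetti.values())
print(stored_pairs, len(points))   # 9 coordinate pairs versus 6 distinct points
print(features)                    # {1: [1, 2, 3, 1], 2: [4, 5, 2, 6, 4]}
print(set(features[1]) & set(features[2]))  # {2}: the shared vertex (4,3)
```

    Neither structure records which polygon lies on which side of a boundary; that is the "intelligence" a topological structure adds.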

  • Spaghetti Vector Data Model

    Advantages: simple; efficient for display and plotting

    Disadvantages: inefficient for most types of spatial analysis, especially for generalization

  • Road map

    Representing the real world

    Conceptual models: objects vs fields

    Implementation models: vector vs raster

    Vector topological model

    Surfaces and Networks

  • Topological data model

    Records coordinates (x,y) of spatial features and encodes spatial relations (i.e., which arcs are connected to a node, which polygons lie on either side of an arc, which arcs make up a polygon)

    Also called the “arc-node” data model:

    arc = line

    node = end-point of a line, or a point where two or more lines connect

  • Topological data model

    Planar enforcement:

    No two individual features can overlap.

    There are no ‘holes’ or ‘islands’ that are not themselves features.

    Every feature is represented as a record in the attribute table.

    How many topologically-correct areas?

    4 records: not topological

    6 records: topological

  • Topological data model vs. Spaghetti

    Spaghetti: can encode as 2 or 3 polygons (and have 2 or 3 records in the attribute table)

    Topological: must be encoded as 3 polygons (and have 3 records in the attribute table)

    [Figure: three numbered areas (1, 2, 3)]

  • Topological data model

    Node   X   Y
    I      1   4
    II     4   4
    III    6   4
    IV     4   1

    Line   From   To    Left   Right
    1      I      III   O      A
    2      I      IV    B      O
    3      III    IV    O      C
    4      I      II    A      B
    5      II     III   A      C
    6      II     IV    C      B

    Poly   Lines
    A      1,4,5
    B      2,4,6
    C      3,5,6

    O = “outside” polygon

    [Figure: polygons A, B, and C bounded by lines 1 to 6, which meet at nodes I to IV]
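    The arc-node tables on this slide are enough to answer adjacency and connectivity questions without touching coordinates. The sketch below encodes the line table as a Python dictionary; the function names are hypothetical, and the polygon table is derived from the left/right columns rather than stored.

```python
# Line table from the slide: arc ID -> (from node, to node, left poly, right poly).
arcs = {
    1: ("I", "III", "O", "A"),
    2: ("I", "IV", "B", "O"),
    3: ("III", "IV", "O", "C"),
    4: ("I", "II", "A", "B"),
    5: ("II", "III", "A", "C"),
    6: ("II", "IV", "C", "B"),
}

def polygon_arcs(arcs):
    """Rebuild the polygon table: which arcs bound each polygon."""
    table = {}
    for arc_id, (_, _, left, right) in arcs.items():
        for poly in (left, right):
            table.setdefault(poly, []).append(arc_id)
    return table

def neighbours(arcs, poly):
    """Adjacency: polygons on the other side of any arc bounding `poly`."""
    found = set()
    for _, _, left, right in arcs.values():
        if left == poly:
            found.add(right)
        elif right == poly:
            found.add(left)
    return found

def arcs_at_node(arcs, node):
    """Connectivity: arcs that start or end at `node`."""
    return sorted(a for a, (start, end, _, _) in arcs.items() if node in (start, end))

print(polygon_arcs(arcs)["A"])   # [1, 4, 5]
print(neighbours(arcs, "B"))     # {'O', 'A', 'C'} in some order
print(arcs_at_node(arcs, "II"))  # [4, 5, 6]
```

    The derived lists match the Poly/Lines table, which is a quick consistency check on the left/right columns.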

  • Topological data model

    Can quickly answer these questions:

    Which roads connect to the central square?

    Which roads do I take to get from here to the hospital?

    What are the fertility rates in the neighboring districts?

    Topology: adjacency, connectivity, overlap, intersect

  • Topological data model

    Advantages

    stores spatial relations explicitly

    spatial analysis can be done without accessing coordinates

    very useful for data validation (if topological rules are established)

    Disadvantages

    more complex data structure

    topology needs to be re-established after each update

    Preferred data model for high-end systems

    Topology and map projections: http://desktop.arcgis.com/en/arcmap/latest/manage-data/editing-fundamentals/about-editing-data-in-a-different-projection-projecting-on-the-fly-.htm

    ESRI’s help files on topology: http://desktop.arcgis.com/en/arcmap/latest/manage-data/topologies/an-overview-of-topology-in-arcgis.htm

  • Road map

    Representing the real world

    Conceptual models: objects vs fields

    Implementation models: vector vs raster

    Vector topological model

    Surfaces and Networks

  • Two special spatial entities

    Two additional spatial entities should be introduced since they extend the basic properties of points, lines, areas and raster cells.

    For some groups / agencies, these are the most important routines / data structures in a GIS.

    These are:

    Surfaces

    Networks

  • Surfaces

    As mentioned before, surfaces are examples of field data

    There are both raster and vector approaches to storing / presenting surface data

  • Surface presentations

    3D?

    2.5D

    ArcGlobe

    http://desktop.arcgis.com/en/arcmap/latest/extensions/3d-analyst/3d-analyst-and-arcglobe.htm

  • Surfaces – data collection

    Remote sensing: radar (grid), lidar (dense points), stereo aerial or satellite imagery (grid)

    Photogrammetry (softcopy) (points, lines) (Phodar); orthophotos

    Field collection: surveying (leveling, total stations) (points), GPS (points, lines)

    Existing data: paper maps, contours, digitization (lines, points)

    http://archives.profsurv.com/magazine/article.aspx?i=1770

    http://www.empr.gov.bc.ca/MINING/GEOSCIENCE/MAPPLACE/MAINMAPS/EXPLORATION/Pages/orthophoto.aspx

  • Surfaces – data storage

    Grid data – store as a raster

    Points, lines – can store as is, but generally would want to create a more intelligent data structure.

    TINs (Triangulated Irregular Networks) were developed to intelligently store vector surface data. TINs are based on a Delaunay triangulation or constrained Delaunay triangulation.

    https://en.wikipedia.org/wiki/Delaunay_triangulation
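    The defining property of a Delaunay triangulation is that no point lies inside the circumcircle of any triangle. The sketch below checks that property with the standard in-circle determinant; it tests an existing triangulation rather than building one, and the four sample points are invented for illustration.

```python
def orientation(a, b, c):
    """Positive if a, b, c are in counter-clockwise order."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_circumcircle(a, b, c, d):
    """True if point d lies strictly inside the circle through a, b, c."""
    if orientation(a, b, c) < 0:
        b, c = c, b  # the determinant test expects counter-clockwise order
    rows = []
    for p in (a, b, c):
        dx, dy = p[0] - d[0], p[1] - d[1]
        rows.append((dx, dy, dx * dx + dy * dy))
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    det = (a1 * (b2 * c3 - b3 * c2)
           - a2 * (b1 * c3 - b3 * c1)
           + a3 * (b1 * c2 - b2 * c1))
    return det > 0

def is_delaunay(triangles, points):
    """Check the empty-circumcircle property for every triangle."""
    for tri in triangles:
        a, b, c = (points[i] for i in tri)
        for i, p in enumerate(points):
            if i not in tri and in_circumcircle(a, b, c, p):
                return False
    return True

points = [(0, 0), (4, 0), (0, 4), (3, 3)]
split_one_way = [(0, 1, 2), (1, 3, 2)]    # diagonal between points 1 and 2
split_other_way = [(0, 1, 3), (0, 3, 2)]  # diagonal between points 0 and 3

print(is_delaunay(split_one_way, points))    # False
print(is_delaunay(split_other_way, points))  # True
```

    The same four points can be triangulated two ways; only the second avoids long thin triangles whose circumcircles swallow a neighbour.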

  • Triangulated Irregular Network (TIN)

    Quality TINs are built using VIPs (very important points, such as peaks and pits) and break lines (cliffs, roadways, rivers, lake shores).

    http://desktop.arcgis.com/en/arcmap/latest/manage-data/tin/fundamentals-of-tin-surfaces.htm

  • TIN

    A topological model that stores elevations at the nodes, and associates slope with the edges and triangular (∆) faces.

    [Figure: a TIN face with its slope S, and a contour being threaded across the triangles]
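    Because each TIN face is a plane through three nodes, elevation inside a face and the point where a contour crosses an edge both follow from linear interpolation. The sketch below assumes one made-up triangle with node elevations of 100, 120, and 140; the function names are hypothetical.

```python
def interpolate_z(triangle, x, y):
    """Elevation at (x, y) on the plane through three (x, y, z) nodes,
    using barycentric weights."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    return w1 * z1 + w2 * z2 + w3 * z3

def contour_crossing(p, q, level):
    """Point where a contour at `level` crosses edge p-q, or None if it does not."""
    (x1, y1, z1), (x2, y2, z2) = p, q
    if z1 == z2 or (z1 - level) * (z2 - level) > 0:
        return None  # flat edge, or both ends on the same side of the level
    t = (level - z1) / (z2 - z1)
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))

a, b, c = (0, 0, 100), (10, 0, 120), (0, 10, 140)
print(interpolate_z((a, b, c), 2, 2))  # approximately 112
# Threading the 110 contour: test each edge of the face in turn.
print(contour_crossing(a, b, 110))     # (5.0, 0.0)
print(contour_crossing(a, c, 110))     # (0.0, 2.5)
print(contour_crossing(b, c, 110))     # None
```

    Joining the two crossing points gives the contour segment for this face; repeating the step on the neighbouring face that shares the exit edge threads the contour across the surface.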

  • Surfaces – TINs vs DEMs

    TINs can, most often, represent a complex surface with much less data than a DEM, since in a DEM an elevation must be known for every cell in the DEM.

    The ‘information’ associated with every cell in a high resolution DEM can, in fact, be very low if there is little topographic relief. (Think spatial autocorrelation [e.g., a lake].)

    However, deriving drainage networks, shortest paths, etc., are much easier to compute using DEMs.
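    One reason drainage is easier on a DEM is that every cell has the same eight neighbours, so flow can be routed with a simple local rule. The sketch below uses the common D8 rule (steepest descent to one of eight neighbours), which the slides do not name; the tiny DEM is invented for illustration.

```python
import math

def d8_direction(dem, row, col):
    """Return the neighbouring cell with the steepest downhill drop, or None
    if no neighbour is lower (a pit or flat area)."""
    best, best_drop = None, 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            r, c = row + dr, col + dc
            if 0 <= r < len(dem) and 0 <= c < len(dem[0]):
                # Diagonal neighbours are farther away, so divide by distance.
                drop = (dem[row][col] - dem[r][c]) / math.hypot(dr, dc)
                if drop > best_drop:
                    best, best_drop = (r, c), drop
    return best

def flow_path(dem, row, col):
    """Follow D8 directions downhill until no lower neighbour remains."""
    path = [(row, col)]
    while True:
        nxt = d8_direction(dem, *path[-1])
        if nxt is None:
            return path
        path.append(nxt)

dem = [
    [9, 8, 7],
    [8, 6, 5],
    [7, 5, 3],
]
print(flow_path(dem, 0, 0))  # [(0, 0), (1, 1), (2, 2)]
```

    The path always ends because each step moves to a strictly lower cell; real DEMs need extra handling for pits and flats, which this sketch leaves out.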

  • Surfaces – data conversion

    Converting raster data structures (DEMs) to vector data is both easy (creating contours) and complex (creating quality TINs)

    Converting vector data structures (TINs) to other vector structures (contours) and to raster structures (DEMs) is easy

  • Networks

    Networks are another special case of a vector topological data structure

    Can be looped (e.g. bus routes) or radial (e.g., rivers)

    Special attributes are associated with:

    Arc-node-arc relations: turn tables (can you turn left/right, go straight); stop, no stop

    Arcs: directionality (one way, two way, speed limit)

  • Networks

    Used for route analysis: couriers, emergency vehicles, garbage collection, vehicle navigation systems

    Used for hydrological analysis: flow of water through a river system

  • Networks

    Location-allocation

    Shortest path
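    A shortest-path query over a network with one-way arcs can be sketched with Dijkstra's algorithm, chosen here as a standard technique rather than as the method of any particular GIS. The arc list, node names, and costs are hypothetical, and turn restrictions are left out.

```python
import heapq

def shortest_path(arcs, start, goal):
    """Return (total cost, node list) for the cheapest route, or None.

    Each arc is (from_node, to_node, cost, one_way). Two-way arcs are
    traversable in both directions; one-way arcs only from -> to.
    """
    graph = {}
    for frm, to, cost, one_way in arcs:
        graph.setdefault(frm, []).append((to, cost))
        if not one_way:
            graph.setdefault(to, []).append((frm, cost))

    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, weight in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(queue, (cost + weight, nxt, path + [nxt]))
    return None

arcs = [
    ("A", "B", 4, False),
    ("B", "C", 3, False),
    ("A", "C", 9, False),
    ("C", "D", 2, True),   # one way: C to D only
    ("B", "D", 8, False),
]
print(shortest_path(arcs, "A", "D"))  # (9, ['A', 'B', 'C', 'D'])
print(shortest_path(arcs, "D", "A"))  # (12, ['D', 'B', 'A'])
```

    The return trip costs more because the one-way arc cannot be used in reverse, which is the kind of directionality attribute a network data model stores on its arcs.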

  • Networks

    Planar enforcement?

    http://desktop.arcgis.com/en/arcmap/latest/extensions/network-analyst/what-is-a-network-dataset.htm

  • Networks

    An extended topological database model

    https://gicycle.wordpress.com/2014/08/29/data-quality-topology/

  • Network analysis

    Vector or raster

  • Looking back

    How real can we be?

    Thinking of space (O/F)

    Putting space into a computer (R/V)

    Structuring space (Topology)

    Adding complexity (Surfaces and Networks)
