Spatial Data and GIS
Dept. of Computing Science, University of Aberdeen 2
Introduction
• In many domains we process information in relation to its spatial location
  – E.g., epidemiological studies are dominated by the geographical distribution of infected cases
    • Dr Snow’s study of the London cholera epidemic
  – Engineering designs have a strong spatial basis
    • CAD/CAM systems deal with locations of components in a design
  – Image processing involves segmenting pixel data in relation to their location to identify objects of interest
  – Position-aware devices such as mobile phones allow us to track individual movement
Geospatial Data
• Data that are related to geographic locations are said to be geospatial
  – Dr Snow’s data are geospatial: data about geographic locations of water pumps and deaths due to cholera
  – Weather maps are geospatial: weather measurements such as temperature and rainfall over a region
• In general, data shown on maps are geospatial
• In this course, spatial data = geospatial data
• Many of our decisions are based on geospatial data
  – Weather at a location drives our decision to plan a picnic at that location
  – Supermarkets decide the size and type of a new store after thoroughly analysing the characteristics of the neighbourhood
• Spatial data mining helps to
  – Extract useful information for decision making
Example Data
[Figure: map of % unemployment rates for Aberdeen wards, based on UK Census 2001; legend classes: <4.79, <3.51, <2.23]
GIS
• GIS refers to
  – Geographic Information System
  – Or Geospatial Information System
• GIS offers
  – Generic (application-independent) functionality required for supporting decision making with geospatial data
    • Data storage and retrieval
    • Data analysis
    • Visualization
• GIS combines
  – Data analysis and visualization to help users understand geospatial data
• The focus is on offering generic functionality that helps users understand data, rather than making decisions for them as expert systems do
GIS Modules
[Figure: Main Modules of a GIS: Spatial Database, Spatial Data Analysis, Spatial Visualization (Maps)]
Characteristics of Spatial Data (1)
• Spatial data has two kinds of attributes
  – Spatial attributes: location information
    • E.g. longitude and latitude for points, and boundary information for areas
  – Non-spatial attributes
    • E.g. name, unemployment rates
• We are mainly interested in the non-spatial attributes, e.g. unemployment rates
  – But we want to study them taking their location (spatial attributes) into consideration
• The Location attribute is different from Name and Unemployment Rate
  – Next, we learn why this is so

Table: % Unemployment Rates for Aberdeen Wards based on UK Census 2001

Location (spatial)   Name (non-spatial)     Unemployment Rate (non-spatial)
boundary1            Pitmedden              2.05
boundary2            Bankhead/Stoneywood    1.75
boundary3            Danestone              1.15
…                    …                      …
Characteristics of Spatial Data (2)
• The Location attribute in the table has values
  – boundary1, boundary2, etc.
• Each boundary value (e.g. boundary1) holds the data required to draw a ward (e.g. Pitmedden) in the census map shown earlier
• This means that values of spatial attributes are geometrical (graphical) data
• This is the fundamental difference between spatial attributes and non-spatial attributes
  – Spatial attributes store geometrical information
  – Many of the special characteristics of spatial data can be derived from this
Characteristics of Spatial Data (3)
• Objects with similar attributes are usually located near each other
  – "Everything is related to everything else, but near things are more related than distant things": Tobler’s first law of geography
  – In spatial statistics this property is called spatial auto-correlation
    • Recall auto-correlation from time series data
    • Data values are not independent
• Most geographic locations are unique (spatial heterogeneity)
  – Therefore global parameters do not always accurately describe local values
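
Spatial auto-correlation can be measured numerically. The sketch below uses Moran's I, a commonly used statistic that the slides do not name, so treat it as one possible illustration. The four-area neighbour matrix and the data values are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for values x_i and a neighbour-weight matrix w_ij.

    I = (n / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2,
    where m is the mean of x and W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_total) * num / den


# Hypothetical layout: four areas in a row, adjacent areas are neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [1.0, 1.0, 5.0, 5.0]     # similar values sit next to each other
alternating = [1.0, 5.0, 1.0, 5.0]   # neighbours always differ

i_clustered = morans_i(clustered, chain)      # positive
i_alternating = morans_i(alternating, chain)  # negative
```

A positive value indicates that neighbouring areas tend to have similar values; a negative value indicates that neighbours tend to differ.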
Characteristics of Spatial Data (4)
• Relationships among non-spatial attributes are explicit
  – Foreign keys make explicit links between tables
• But relationships among spatial attributes are implicit
  – boundary1 and boundary2 could be neighbours, but this is not explicitly represented
• Summary of special properties of spatial data
  – Auto-correlation
  – Spatial heterogeneity
  – Implicit spatial relationships
• Modelling spatial data needs to be different from modelling ordinary data
• Data modelling influences data manipulation
  – Querying
  – Analysis
  – Visualization
Concept of Modelling
• Common-sense view
  – Representation of ‘something’ at a level of detail suitable for its purpose
  – For example, an architect’s ‘model’ of a bridge
    • The architect’s model brings the bridge to life even before its construction
• Formal view
  – A modelling function translates some source domain into its corresponding target domain
  – The target domain is used for analysis because it is in some sense simpler than the source domain
  – An inverse modelling function should be available for translating results of analysis from the target domain back to the source domain
Modelling Geographic Space
• It is not possible to directly perceive geographic space in its entirety
  – Representations (or models) are the only sources of information about geographic space
  – Representations of smaller geographic areas are merged to create global representations
• Maps have been the primary models of geographic space
  – From cave drawings to Google Maps
• The field of map making is called ‘Cartography’
Geometries Model Geographic Space
• Several different geometries exist
  – Each is a study of spatial properties that are invariant under certain groups of transformations
  – E.g. in Euclidean geometry, distance and angle are preserved under translation, reflection and rotation
  – Topology is another geometry that, for example, preserves the openness/closedness of a curve under stretching
• Geometries are hierarchically related
  – As you move up the hierarchy, each successive geometry preserves more properties than its subclass
• Different kinds of geographic phenomena are represented by different geometries
Computational Models of Geographical Space
• Two fundamentally distinct views
  – Absolute space
    • Space exists in itself and objects are located in this absolute space
    • You first create space and then put objects in that space
  – Relative space
    • Space is one of the attributes of objects related to other objects
    • You first define objects, and they create space as a result of their relative locations and interactions
• Both these views are used in modelling spatial information computationally
Relational Data Model
• Relational databases model data as a connected set of relations
  – Each relation is a collection of tuples
  – Tuple1 -> (location1, temperature1, rainfall1)
  – Tuple2 -> (location2, temperature2, rainfall2)
• For certain applications, relational models are often criticised for an impedance mismatch between
  – the relational database storing the data
  – the object-oriented code manipulating that data
• For spatial data this mismatch is a problem
  – The inherent structure of spatial data is not captured by the relational model
Relational Model Example

Country
name      capital   population   boundaryId
Germany   Berlin    80           B1
France    Paris     60           B2
…         …         …            …

Boundary
boundaryId   contourId
B1           C1
B2           C2
B2           C3
…            …

Contour
contourId   point_num   pointId
C1          2           P1
C1          1           P2
C1          3           P3
…           …           …

Point
pointId   x     y
P1        452   1000
P2        365   875
P3        386   985
…         …     …
Query - Example
• Return the contours of France
• An SQL query (column names as in the Country, Boundary, Contour and Point tables):

    SELECT Boundary.contourId, x, y
    FROM Country, Boundary, Contour, Point
    WHERE name = 'France'
      AND Country.boundaryId = Boundary.boundaryId
      AND Boundary.contourId = Contour.contourId
      AND Contour.pointId = Point.pointId
    ORDER BY Boundary.contourId, point_num;

• This approach has several problems, discussed next
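
To see what such a four-way join involves in practice, here is a minimal sketch that loads only the rows listed in the example tables into an in-memory SQLite database and runs the query. SQLite, the column types and the helper names `build_db` and `contours_of` are assumptions made for illustration. The point rows for France's contours are elided in the tables, so the sketch queries Germany, whose contour C1 is fully listed.

```python
import sqlite3


def build_db():
    """Create the four example tables and load the rows shown on the slide."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Country  (name TEXT, capital TEXT, population INTEGER, boundaryId TEXT);
        CREATE TABLE Boundary (boundaryId TEXT, contourId TEXT);
        CREATE TABLE Contour  (contourId TEXT, point_num INTEGER, pointId TEXT);
        CREATE TABLE Point    (pointId TEXT, x INTEGER, y INTEGER);
    """)
    con.executemany("INSERT INTO Country VALUES (?, ?, ?, ?)",
                    [("Germany", "Berlin", 80, "B1"), ("France", "Paris", 60, "B2")])
    con.executemany("INSERT INTO Boundary VALUES (?, ?)",
                    [("B1", "C1"), ("B2", "C2"), ("B2", "C3")])
    con.executemany("INSERT INTO Contour VALUES (?, ?, ?)",
                    [("C1", 2, "P1"), ("C1", 1, "P2"), ("C1", 3, "P3")])
    con.executemany("INSERT INTO Point VALUES (?, ?, ?)",
                    [("P1", 452, 1000), ("P2", 365, 875), ("P3", 386, 985)])
    return con


def contours_of(con, country):
    """Return (contourId, x, y) rows for a country, ordered along each contour."""
    sql = """
        SELECT Boundary.contourId, x, y
        FROM Country, Boundary, Contour, Point
        WHERE name = ?
          AND Country.boundaryId = Boundary.boundaryId
          AND Boundary.contourId = Contour.contourId
          AND Contour.pointId = Point.pointId
        ORDER BY Boundary.contourId, point_num
    """
    return con.execute(sql, (country,)).fetchall()


germany = contours_of(build_db(), "Germany")
```

Even for this tiny example the caller has to know how a boundary is decomposed into contours and points, and gets back a flat list of coordinates rather than a geometry.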
Problems with using RDBMS for Spatial Data
• Formulating queries requires knowledge of the spatial object’s structure
  – Queries must be rewritten whenever the structure changes
• Bad performance
  – More space used and more cost in computing joins
• User unfriendly
  – Manipulating points is not easy
• Difficulty in defining new spatial types
• Impossible to express geometric computations
  – Adjacency test, point query or window query
Field-Based Models
• Computational model of absolute space
• Information space is viewed as a collection of fields
  – Temperature field, rainfall field and wind speed field form a weather information space
• Data attribute values are computed by functions of locations
  – Temperature1 = Temperaturefield(location1)
  – Temperature2 = Temperaturefield(location2)
  – RainFall1 = RainFallfield(location1)
• The field is the function, not the set of values
• The field is the first-class entity in this kind of modelling
Field-Based Models (2)
• A field-based model is a function on location
  – So we need location data as the independent variable
• Given a region of space (geography) we need a framework to partition that space into locations
  – Tessellation of space
  – For example, using grids
• A field-based model is then a function that maps each location to its attribute value
• Useful for modelling data from continuous spatial processes
  – Temperature fields, elevation data
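
A field over a grid tessellation can be sketched as a callable object that maps a location to the value of the cell containing it. The class name `GridField`, the cell size and the temperature values below are hypothetical, chosen only to illustrate the idea.

```python
class GridField:
    """A field over a rectangular region tessellated into square cells."""

    def __init__(self, x_min, y_min, cell_size, values):
        self.x_min = x_min
        self.y_min = y_min
        self.cell_size = cell_size
        self.values = values  # values[row][col]; row 0 starts at y_min

    def __call__(self, x, y):
        # Locate the grid cell that contains (x, y).
        col = int((x - self.x_min) // self.cell_size)
        row = int((y - self.y_min) // self.cell_size)
        if not (0 <= row < len(self.values) and 0 <= col < len(self.values[0])):
            raise ValueError("location lies outside the field's region")
        return self.values[row][col]


# Hypothetical 2 x 2 temperature grid with 10-unit cells.
temperature_field = GridField(0.0, 0.0, 10.0, [[4.0, 5.5],
                                               [3.0, 4.5]])
temperature1 = temperature_field(12.0, 3.0)   # cell in row 0, column 1
temperature2 = temperature_field(5.0, 15.0)   # cell in row 1, column 0
```

The object `temperature_field` is the field itself; the individual temperatures are just its values at particular locations.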
Object-based Models
• Computational model of relative space
• One or more tuples from the relational model can be lumped together as data values corresponding to an object
  – All the tuples that have temperatures below zero and rainfall above 10mm describe an object
• The object then has a spatial reference
  – The above weather conditions could be true for a region of geography
• The object is the first-class entity in this kind of modelling
• Useful for modelling data from discrete spatial processes
  – Administrative units, rivers
Object-based Models (2)
• The object-based model maps directly to the object-oriented model we are familiar with in computing science
  – Objects have attributes, some of which happen to be spatial and therefore have values related to space (or geography)
• Field-based models can also be mapped to object-oriented models, but not directly
• Field-based and object-based models are complementary, not competing
  – Each is useful in different contexts
OGC Reference Model
• OGC is the Open Geospatial Consortium
  – Organization for the development of standards for geospatial and location-based services
• The Reference Model is defined in terms of features and themes
• Geographic Object (Feature)
  – An abstraction of a real-world phenomenon
  – E.g. a river, a city and a country
• A feature has two components
  – Descriptive attributes (non-spatial attributes)
    • E.g. name and population of a city
  – Spatial attributes
    • Geometry
• Theme (Feature Collection)
  – Similar to a relation (table) in the relational model
  – E.g. rivers, cities and countries
Coordinate Reference Systems
• Spatial attributes (geometry) of a feature refer to locations on Earth
• A coordinate reference system is used to assign unique values to every location on Earth
• For example, the Geographic Coordinate System uses longitude and latitude to uniquely define locations on Earth
• Geocentric and projected coordinate reference systems are other examples
• A datum defines the origin, orientation and scale of a coordinate system and links it to Earth
  – This is required before the abstract coordinate system can be used in practice to give positions of features
• The coordinate reference system (which includes the datum) is part of the OGC reference model
  – Without a geo-reference system, geometries of features cannot be defined
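
As an illustration of how a datum ties coordinates to Earth, the sketch below converts geographic coordinates (latitude, longitude, height) to geocentric X, Y, Z coordinates using the standard ellipsoid formulas. The WGS 84 ellipsoid constants are an assumption, since the slides do not name a particular datum, and the function name is hypothetical.

```python
import math

# WGS 84 ellipsoid parameters (assumed datum for this sketch).
A = 6378137.0                # semi-major axis in metres
F = 1.0 / 298.257223563      # flattening
E2 = F * (2.0 - F)           # first eccentricity squared


def geographic_to_geocentric(lat_deg, lon_deg, height_m=0.0):
    """Convert latitude/longitude/height to geocentric (X, Y, Z) in metres."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Radius of curvature in the prime vertical at this latitude.
    n = A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - E2) + height_m) * math.sin(lat)
    return x, y, z


equator_point = geographic_to_geocentric(0.0, 0.0)
north_pole = geographic_to_geocentric(90.0, 0.0)
```

The same latitude and longitude yield different X, Y, Z values under a different ellipsoid, which is why coordinates are only meaningful together with their datum.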
Spatial Databases
• Connected set of themes (corresponding to relations/tables in the relational model)
  – Each of these is a collection of geographic objects (features)
• Geographic objects correspond to tuples or rows in the relational model, with
  – non-spatial attributes and
  – spatial attributes
    • Geometric attributes such as location and shape
• Support for geospatial data is available in MySQL (version 4.1 onwards)
• Two example themes:
  – Countries (name, population, geo:region)
  – Languages (language, geo:region)
Countries
Name Population Region
Ireland 3.6 R1
Britain 57.2 R2
France 58 R3
Spain 39 R4
Portugal 10.5 R5
Germany 78.5 R6
Switzerland 6.3 R7
Languages
Language Region
Anglo-Saxon R1
Anglo-Saxon R2
Latin R3
Latin R4
Latin R5
Anglo-Saxon R6
Latin R7
Queries on Spatial Databases
• Familiar operations from relational algebra can be defined on themes
• Theme projection: similar to relational projection
  – Π population, geo (Countries)
• Theme selection: similar to relational selection
  – σ population>50 (Countries)
• Theme union: similar to relational union
• You can work these out yourself
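
The theme operations can be tried out on the Countries theme. This is a minimal sketch in which a theme is a list of dictionaries and the geometry is kept as the opaque region identifier from the table; the function names `select`, `project` and `union` are hypothetical.

```python
countries = [
    {"name": "Ireland", "population": 3.6, "region": "R1"},
    {"name": "Britain", "population": 57.2, "region": "R2"},
    {"name": "France", "population": 58, "region": "R3"},
    {"name": "Spain", "population": 39, "region": "R4"},
    {"name": "Portugal", "population": 10.5, "region": "R5"},
    {"name": "Germany", "population": 78.5, "region": "R6"},
    {"name": "Switzerland", "population": 6.3, "region": "R7"},
]


def select(theme, predicate):
    """Theme selection: keep the features that satisfy the predicate."""
    return [feature for feature in theme if predicate(feature)]


def project(theme, attributes):
    """Theme projection: keep only the named attributes of each feature."""
    return [{a: feature[a] for a in attributes} for feature in theme]


def union(theme_a, theme_b):
    """Theme union: features of both themes, without duplicates."""
    result = list(theme_a)
    for feature in theme_b:
        if feature not in result:
            result.append(feature)
    return result


large = select(countries, lambda f: f["population"] > 50)
population_and_geo = project(countries, ["population", "region"])
```

Here `large` corresponds to σ population>50 (Countries) and `population_and_geo` to Π population, geo (Countries).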
Spatial Join
• In a relational database, join queries help users to connect (link, or join) tables
• Spatial databases allow users to join themes
  – These are called theme overlays
  – An object of one theme is joined with an object of the other theme if their geometries intersect
• In our example, the resulting theme will show all the rows and columns of both the tables
  – You can work it out yourself
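
A theme overlay can be sketched with a deliberately simple geometry. The example below assumes every geometry is an axis-aligned box `(x_min, y_min, x_max, y_max)`, which is a simplification of real region geometries, and the two themes (`districts`, `soils`) are hypothetical.

```python
def box_intersection(a, b):
    """Return the overlapping box of a and b, or None if they do not overlap."""
    x_min, y_min = max(a[0], b[0]), max(a[1], b[1])
    x_max, y_max = min(a[2], b[2]), min(a[3], b[3])
    if x_min < x_max and y_min < y_max:   # require a non-zero overlap area
        return (x_min, y_min, x_max, y_max)
    return None


def overlay(theme_a, theme_b):
    """Spatial join: pair features whose boxes intersect and merge their attributes."""
    result = []
    for fa in theme_a:
        for fb in theme_b:
            common = box_intersection(fa["box"], fb["box"])
            if common is not None:
                row = {k: v for k, v in fa.items() if k != "box"}
                row.update({k: v for k, v in fb.items() if k != "box"})
                row["box"] = common   # geometry of the joined feature
                result.append(row)
    return result


districts = [{"district": "North", "box": (0, 0, 4, 4)},
             {"district": "South", "box": (4, 0, 8, 4)}]
soils = [{"soil": "clay", "box": (2, 1, 6, 3)},
         {"soil": "sand", "box": (9, 9, 10, 10)}]

joined = overlay(districts, soils)
```

Unlike a relational join, the join condition is not equality of key values but a geometric test, and the result carries a new geometry.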
Special Queries
• Some queries to spatial databases are more complicated than relational queries
  – Window query: select the objects that overlap a given window or area
  – Point query: select the objects that contain the given point
  – Clipping: select the objects that overlap the given window, and return for each the exact intersection of its geometry with the window
• To process such queries a GIS needs geometric and topological capabilities
  – We will not go into the details here
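
To give a flavour of the geometric computation behind these queries, here is a sketch of a point query using the ray-casting test for simple polygons, and a window query that compares bounding boxes only (a coarse filter, not an exact overlap test). The feature names and coordinates are hypothetical.

```python
def point_in_polygon(px, py, polygon):
    """Ray casting: count how many edges a horizontal ray from (px, py) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > py) != (y2 > py):            # edge straddles the ray's height
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:                  # crossing lies to the right
                inside = not inside
    return inside


def bounding_box(polygon):
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def point_query(features, px, py):
    """Names of the features whose polygon contains the point."""
    return [name for name, poly in features.items()
            if point_in_polygon(px, py, poly)]


def window_query(features, window):
    """Names of the features whose bounding box overlaps the window."""
    wx1, wy1, wx2, wy2 = window
    hits = []
    for name, poly in features.items():
        x1, y1, x2, y2 = bounding_box(poly)
        if x1 < wx2 and wx1 < x2 and y1 < wy2 and wy1 < y2:
            hits.append(name)
    return hits


features = {
    "square": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "triangle": [(10, 10), (12, 10), (11, 12)],
}
```

None of these tests can be written as a comparison of attribute values, which is why plain relational queries cannot express them.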
Visualization of Spatial Data
• Results of theme operations are not very useful if shown as tables
• They are normally shown as maps in a GIS
• Theme overlay is the main operation for creating maps in a GIS
  – Data belonging to the required themes are retrieved from the database and plotted as overlays in a GIS (you will learn to use overlays in the practical)
• As discussed with other visualizations, geo-visualization (or map drawing) has two aspects
  – Designing the map
  – Rendering the map
Visualization of Spatial Data (2)
• Maps can be rendered using
  – Vector graphics
  – Raster graphics
• This distinction can be traced back to the distinction between
  – Object-based data models (vector models)
  – Field-based data models (raster models)
• Many modern GIS allow mixing and matching these two modes to render maps
  – Google Maps overlays vector-based spatial information on top of raster satellite images
Map Projections
• Earth’s surface is a spheroid
• A globe is the only realistic visualization of Earth’s surface
• Maps project Earth’s surface on to a plane
• The surface of a spheroid cannot be projected on to a plane without distortion
• All large-area maps distort some aspect of Earth’s surface
  – Either shape or size of features
• Several different projections are defined; each retains certain aspects of Earth’s surface
• Each projection uses a different coordinate reference system
• The purpose of a map guides the selection of an appropriate projection
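
The trade-off can be made concrete with one well-known projection. The sketch below uses the spherical Mercator projection, chosen here only as an example: it preserves local shape but inflates size away from the equator. Treating Earth as a sphere with a mean radius of 6371 km is a simplifying assumption.

```python
import math

R = 6371000.0  # mean Earth radius in metres (spherical approximation)


def mercator(lat_deg, lon_deg):
    """Project latitude/longitude to planar (x, y) with spherical Mercator."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = R * lon
    y = R * math.log(math.tan(math.pi / 4.0 + lat / 2.0))
    return x, y


def mercator_scale(lat_deg):
    """Factor by which lengths on the map are stretched at a given latitude."""
    return 1.0 / math.cos(math.radians(lat_deg))


scale_equator = mercator_scale(0.0)    # no stretching at the equator
scale_60 = mercator_scale(60.0)        # lengths doubled, areas quadrupled
```

A map meant for comparing areas would therefore need a different projection from one meant for navigation by compass bearing.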
Spatial Data Types
• Features can be of three types
  – Referenced to a point
    • E.g. location information of a restaurant
  – Referenced to a path
    • E.g. path information from my home to the University
  – Referenced to an area
    • E.g. information about a region bounded by a polygon
• We can transform point data into area data by aggregating values over all the points in an area
• Different data analysis tasks and techniques are employed for each of these data types
Points Data
• Event prediction
  – E.g. given the spatial distribution of crimes in an area, predict the likely location of a future crime
• Given some actual observations, predict unknown values at intermediate locations by interpolation
  – Spatial regression
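
One simple way to interpolate between observed points is inverse distance weighting (IDW). Note that IDW is a stand-in chosen for brevity and is not the spatial regression mentioned above. The sample locations and values are hypothetical.

```python
import math


def idw(samples, x, y, power=2.0):
    """Estimate the value at (x, y) from (sx, sy, value) samples.

    Each sample is weighted by 1 / distance**power, so nearby
    observations influence the estimate more than distant ones.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value          # exactly at an observed location
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical temperature observations at two stations.
stations = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
midway = idw(stations, 5.0, 0.0)       # equally far from both stations
near_first = idw(stations, 1.0, 0.0)   # dominated by the first station
```

The weighting scheme is a direct use of spatial auto-correlation: nearby observations are assumed to be more informative than distant ones.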
Paths Data
• Finding the least ‘cost’ path over a route map
• Navigation systems in modern cars find paths and communicate the path information graphically and by speech
• A navigation system is a good example of the kind of system we are interested in on this course
  – They analyse spatial data to extract important information
  – They also communicate the extracted information in different forms to suit the user
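
Finding a least-cost path over a route map can be sketched with Dijkstra's algorithm, a standard choice for non-negative costs; the slides do not say which algorithm navigation systems use. The road network and its costs below are hypothetical.

```python
import heapq


def least_cost_path(graph, start, goal):
    """Dijkstra's algorithm on a dict-of-dicts graph with non-negative costs.

    Returns (total_cost, path) or None if the goal cannot be reached.
    """
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, edge_cost in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + edge_cost, neighbour, path + [neighbour]))
    return None


# Hypothetical route map: edge values are travel costs (e.g. minutes).
roads = {
    "Home": {"A": 4, "B": 2},
    "A": {"University": 3},
    "B": {"A": 1, "University": 7},
    "University": {},
}

route = least_cost_path(roads, "Home", "University")
```

In this example the cheapest route goes Home, B, A, University with a total cost of 6, even though the direct road from Home to A looks shorter at first glance.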
Area/Lattice Data
• The public domain is flooded with this type of data
  – E.g. census data are available to the public as aggregated values over a census tract
    • SCROL: Scotland’s Census Results Online
  – Weather parameters such as temperature and rainfall are reported as aggregated values over a region such as Grampian or Lothian
  – Disease count data, where counts of a disease are recorded for regions or counties
• Technology to analyse and communicate this type of data has a large impact on public life
Summary
• Building the informational and computational infrastructure to support storing, retrieving, analysing and visualising geospatial data is the job of computer scientists
• Advances in Geographic Information Systems (GIS) and the Global Positioning System (GPS) have allowed us to study most data in relation to their spatial location
• We are now in a position to formulate well-formed spatial queries or hypotheses
• Technology is available to answer such queries or test those hypotheses
• All of us will use more and more spatial data in the future