Spatial Databases

ENVE/CE 424/524

Definitions

• Database – an integrated set of data on a particular subject

• Spatial database – database containing geographic data of a particular subject for a particular area

• Database Management System (DBMS) – software to create, maintain and access databases

Geographic Information System

• Data load

• Editing

• Visualization

• Mapping

• Analysis

Database Management System

• Storage

• Indexing

• Security

• Query

Data

GIS: old and new

GIS used to be monolithic systems

all-in-one, proprietary applications that stored, queried, and visualized data

New systems follow more of a tool-box approach

modularized applications that interoperate

Who can benefit from spatial data management?

Army Commander: Has there been any significant enemy troop movement in the past week?

Insurance Risk Manager: Which houses are most likely to be affected in the next great flood on the Mississippi?

Medical Doctor: Based on this patient’s MRI, have we treated somebody with a similar condition?

Molecular Biologist: Is the topology of the amino acid biosynthesis gene in the genome found in any other sequence feature map in the database?

Astronomer: Find all blue galaxies within 2 arcmin of quasars.

Three classes of users for spatial databases

Major database managers: specialized products for enterprise management

GIS users: analysis of data

Internet users: more generalized requirements

Advantages of Databases over Files

• Avoids redundancy and duplication

• Reduces data maintenance costs

• Applications are separated from the data

– Applications persist over time

– Support multiple concurrent applications

• Better data sharing

• Security and standards can be defined and enforced

Disadvantages of Databases Compared with Files

• Expense

• Complexity

• Performance – especially with complex data types

• Integration with other systems can be difficult

Types of DBMS Model

• Hierarchical

• Network

• Relational – RDBMS

• Object-oriented – OODBMS

• Object-relational – ORDBMS

Characteristics of DBMS

• Data model support for multiple data types

– e.g., MS Access: Text, Memo, Number, Date/Time, Currency, AutoNumber, Yes/No, OLE Object, Hyperlink, Lookup Wizard

• Load data from files, databases and other applications

• Index for rapid retrieval

• Query language – SQL

• Security – controlled access to data

– Multi-level groups

• Controlled update using a transaction manager

• Backup and recovery

Relational DBMS

• Data stored as tuples (pronounced "tup-el"), conceptualized as tables

• Table – data about a class of objects

– Two-dimensional list (array)

– Rows = objects

– Columns = object states (properties, attributes)

Table = object class

Row = object

Column = property

Object classes with geometry are called feature classes

Relational DBMS

• Most popular type of DBMS

– Over 95% of data in DBMS is in RDBMS

• Commercial systems

– IBM DB2

– Informix

– Microsoft Access

– Microsoft SQL Server

– Oracle

– Sybase

Spatial Database Example

Land parcel with boundary id: 1050

Relational Database Example

Four tables needed in the land parcel relational database

Relational database example #2

Relation Rules (Codd, 1970)

• Only one value in each cell (intersection of row and column)

• All values in a column are about the same subject

• Each row is unique

• No significance in column sequence

• No significance in row sequence

SQL

• Structured (Standard) Query Language – pronounced "sequel"

• Developed by IBM in 1970s

• Now standard for accessing relational databases

• Three types of usage

– Stand-alone queries

– High-level programming

– Embedded in other applications (ArcGIS)

Types of SQL Statements

• Data Definition Language (DDL)

– Create, alter and delete data structures such as tables and indexes

– CREATE TABLE, CREATE INDEX

• Data Manipulation Language (DML)

– Retrieve and manipulate data

– SELECT, UPDATE, DELETE, INSERT

• Data Control Language (DCL)

– Control security of data

– GRANT, CREATE USER, DROP USER
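The DDL and DML categories above can be sketched with Python's built-in sqlite3 module and an in-memory database. This is only an illustration: the parcel table, its columns, and the sample values are hypothetical. SQLite has no user accounts, so DCL statements such as GRANT are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# DDL: define the structure (hypothetical parcel table and index)
cur.execute(
    "CREATE TABLE parcel (parcel_id INTEGER PRIMARY KEY, owner TEXT, area_ha REAL)"
)
cur.execute("CREATE INDEX idx_parcel_owner ON parcel (owner)")

# DML: insert, update, delete, and retrieve rows (made-up values)
cur.executemany(
    "INSERT INTO parcel (parcel_id, owner, area_ha) VALUES (?, ?, ?)",
    [(1050, "Smith", 2.5), (1051, "Jones", 4.0), (1052, "Smith", 1.2)],
)
cur.execute("UPDATE parcel SET area_ha = 2.75 WHERE parcel_id = 1050")
cur.execute("DELETE FROM parcel WHERE parcel_id = 1052")
rows = cur.execute(
    "SELECT parcel_id, owner, area_ha FROM parcel ORDER BY parcel_id"
).fetchall()
print(rows)
conn.close()
```

In a multi-user DBMS the same session would typically be preceded by DCL statements that decide who may run each of these commands.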

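Spatial Types – OGC Simple Features

[Figure: OGC Simple Features geometry class hierarchy. Geometry is associated with a SpatialReferenceSystem and has the subtypes Point, Curve, Surface and GeometryCollection. Curve → LineString → Line, LinearRing. Surface → Polygon. GeometryCollection → MultiPoint, MultiCurve (→ MultiLineString), MultiSurface (→ MultiPolygon). The legend distinguishes "composed of" and "type" relationships.]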

Data Model: A set of constructs for representing objects and processes in a digital environment

Spatial Relations

• Equals – are the geometries the same?

• Disjoint – do the geometries share no common point?

• Intersects – do the geometries share at least one point?

• Touches – do the geometries meet only at their boundaries?

• Crosses – do the geometries share some, but not all, interior points, with an intersection of lower dimension than the larger geometry (e.g., a line crossing a polygon)?

• Within – is one geometry completely inside another?

• Contains – does one geometry completely contain another?

• Overlaps – do geometries of the same dimension share some, but not all, points?

• Relate – are there intersections between the interior, boundary or exterior of the geometries?
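Several of these relations can be sketched for the simplest possible geometry, an axis-aligned rectangle with positive area. The Box type, the function names, and the sample rectangles are hypothetical; a real spatial DBMS evaluates the same predicates on arbitrary points, lines and polygons.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle, assumed to have positive width and height."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Box, b: Box) -> bool:
    # The boxes share at least one point (boundaries count).
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def disjoint(a: Box, b: Box) -> bool:
    # The boxes share no point at all.
    return not intersects(a, b)


def interiors_intersect(a: Box, b: Box) -> bool:
    # Strict inequalities exclude contact along edges or corners.
    return (a.xmin < b.xmax and b.xmin < a.xmax
            and a.ymin < b.ymax and b.ymin < a.ymax)


def touches(a: Box, b: Box) -> bool:
    # The boxes meet, but only along their boundaries.
    return intersects(a, b) and not interiors_intersect(a, b)


def within(a: Box, b: Box) -> bool:
    # Every point of a lies inside b.
    return (b.xmin <= a.xmin and a.xmax <= b.xmax
            and b.ymin <= a.ymin and a.ymax <= b.ymax)


def contains(a: Box, b: Box) -> bool:
    # Contains is Within with the arguments swapped.
    return within(b, a)


def overlaps(a: Box, b: Box) -> bool:
    # The interiors meet, but neither box lies inside the other.
    return interiors_intersect(a, b) and not within(a, b) and not within(b, a)


parcel = Box(0, 0, 4, 4)
house = Box(1, 1, 2, 2)
neighbor = Box(4, 0, 8, 4)   # shares the edge x = 4 with the parcel
easement = Box(3, 3, 6, 6)
remote = Box(10, 10, 12, 12)

print(contains(parcel, house), touches(parcel, neighbor),
      overlaps(parcel, easement), disjoint(parcel, remote))
```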

Contains Relation

Touches Relation

Spatial Methods

• Distance – determines the shortest distance between any two points in two geometries

• Buffer – returns a geometry that represents all the points whose distance from the geometry is less than or equal to a user-defined distance

• ConvexHull – returns a geometry representing the smallest convex polygon that encloses another geometry

• Intersection – returns a geometry that contains just the points common to both input geometries

• Union – returns a geometry that contains all the points in either input geometry

• Difference – returns a geometry containing the points of the first geometry that are not in the second

• SymDifference – returns a geometry containing the points that are in either of the input geometries, but not both
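Two of these ideas can be illustrated in a few lines of Python. The first part is a convex hull of a point set using Andrew's monotone chain algorithm, one common choice and not necessarily what any particular DBMS uses. The second part approximates Intersection, Union, Difference and SymDifference by treating each geometry as a set of raster cells. All names and sample data are hypothetical.

```python
def cross(o, a, b):
    # z-component of (a - o) x (b - o); positive means a left turn.
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a right turn or a straight line.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
hull = convex_hull(points)
print(hull)

# Set-based analogy: each geometry is the set of unit cells it covers.
a = {(x, y) for x in range(0, 3) for y in range(0, 2)}  # columns 0..2
b = {(x, y) for x in range(2, 5) for y in range(0, 2)}  # columns 2..4

intersection = a & b    # cells common to both
union = a | b           # cells in either
difference = a - b      # cells of a that are not in b
sym_difference = a ^ b  # cells in exactly one of the two
print(len(intersection), len(union), len(difference), len(sym_difference))
```

The interior points (2, 2) and (1, 3) do not appear in the hull, which is what "without any concave areas" means in practice.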

Convex Hull and Difference Methods

Convex Hull

Difference

Indexing

• Used to locate rows quickly

• Like a book index, it is a special representation of the content that adds order and makes finding items faster

• RDBMS use simple 1-d indexing

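• Spatial DBMS needs 2-d, hierarchical indexing

– Grid

– Quadtree

– R-tree

• Multi-level queries are often used for performance: a coarse test on minimum bounding rectangles (MBRs) first, then an exact geometry test on the surviving candidates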

Grid Index (multi-level)

- Overlay uniform grid

- Assign objects a grid id

Multi-level grids are used for variable sized objects within a database
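The two steps above (overlay a uniform grid, assign objects a grid id) can be sketched as a single-level grid index. The object names, bounding boxes and the cell size of 10 are hypothetical. A multi-level index would keep several such grids with different cell sizes, which is not shown here.

```python
import math
from collections import defaultdict


def cells_for_box(box, cell_size):
    """Return the (column, row) ids of every grid cell a bounding box touches."""
    xmin, ymin, xmax, ymax = box
    c0, c1 = math.floor(xmin / cell_size), math.floor(xmax / cell_size)
    r0, r1 = math.floor(ymin / cell_size), math.floor(ymax / cell_size)
    return [(c, r) for c in range(c0, c1 + 1) for r in range(r0, r1 + 1)]


def build_grid_index(objects, cell_size):
    """Map each cell id to the set of object ids whose box touches that cell."""
    index = defaultdict(set)
    for obj_id, box in objects.items():
        for cell in cells_for_box(box, cell_size):
            index[cell].add(obj_id)
    return index


def candidates(index, window, cell_size):
    """Objects registered in any cell the query window touches.

    These are candidates only; an exact geometry test would follow.
    """
    found = set()
    for cell in cells_for_box(window, cell_size):
        found |= index.get(cell, set())
    return found


# Hypothetical objects as (xmin, ymin, xmax, ymax); a point has zero extent.
objects = {
    "well": (12, 7, 12, 7),
    "parcel": (5, 5, 25, 15),
    "lake": (40, 40, 55, 48),
}
index = build_grid_index(objects, cell_size=10)
print(sorted(candidates(index, (10, 0, 19, 9), cell_size=10)))
```

The large parcel is registered in six cells while the well occupies one, which is the imbalance that multi-level grids are meant to reduce.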

Point and Region Quadtree Indexing

Based on recursive division of space.

Point Quadtree

Region Quadtree
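The recursive division can be sketched for the point quadtree, where each stored point splits its region into four quadrants (NE, NW, SW, SE). The class, its method names and the sample coordinates are hypothetical. A region quadtree differs in that it always splits a cell into four equal quadrants regardless of where the data fall.

```python
class PointQuadtree:
    """Each node stores one point and up to four child subtrees."""

    def __init__(self, point):
        self.point = point
        self.children = {}  # quadrant name ("NE", "NW", "SW", "SE") -> subtree

    def _quadrant(self, p):
        x, y = self.point
        ns = "N" if p[1] >= y else "S"
        ew = "E" if p[0] >= x else "W"
        return ns + ew

    def insert(self, p):
        q = self._quadrant(p)
        if q in self.children:
            self.children[q].insert(p)   # recurse into the occupied quadrant
        else:
            self.children[q] = PointQuadtree(p)

    def range_query(self, xmin, ymin, xmax, ymax, out=None):
        """Collect all stored points inside the query window."""
        out = [] if out is None else out
        x, y = self.point
        if xmin <= x <= xmax and ymin <= y <= ymax:
            out.append(self.point)
        for q, child in self.children.items():
            # Skip quadrants that cannot overlap the window.
            if q[0] == "N" and ymax < y:
                continue
            if q[0] == "S" and ymin >= y:
                continue
            if q[1] == "E" and xmax < x:
                continue
            if q[1] == "W" and xmin >= x:
                continue
            child.range_query(xmin, ymin, xmax, ymax, out)
        return out


tree = PointQuadtree((50, 50))
for pt in [(20, 30), (70, 80), (60, 20), (10, 90), (75, 85)]:
    tree.insert(pt)

hits = tree.range_query(60, 70, 100, 100)
print(sorted(hits))
```

For this window only the NE subtree of the root is visited; the other three quadrants are pruned without looking at their points.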

R-tree

Use minimum bounding rectangle (MBR) or minimum bounding box (MBB)

Add a new object to the MBR that would expand the least to accommodate the object
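The insertion rule above can be sketched as a function that picks the node whose MBR grows least when the new object is added. Rectangles are (xmin, ymin, xmax, ymax) tuples, and all names and coordinates are hypothetical. Breaking ties by choosing the smaller MBR is an assumption borrowed from the classic R-tree insertion heuristic, not something the slide states.

```python
def area(mbr):
    return (mbr[2] - mbr[0]) * (mbr[3] - mbr[1])


def expand(mbr, obj):
    """Smallest rectangle that encloses both the MBR and the new object."""
    return (min(mbr[0], obj[0]), min(mbr[1], obj[1]),
            max(mbr[2], obj[2]), max(mbr[3], obj[3]))


def enlargement(mbr, obj):
    """Extra area the MBR needs in order to accommodate the object."""
    return area(expand(mbr, obj)) - area(mbr)


def choose_node(node_mbrs, obj):
    """Index of the MBR needing least enlargement (ties: smaller area)."""
    return min(
        range(len(node_mbrs)),
        key=lambda i: (enlargement(node_mbrs[i], obj), area(node_mbrs[i])),
    )


node_mbrs = [(0, 0, 10, 10), (20, 0, 30, 10)]  # two existing nodes
new_obj = (11, 2, 13, 4)                       # box of the object to insert

chosen = choose_node(node_mbrs, new_obj)
node_mbrs[chosen] = expand(node_mbrs[chosen], new_obj)
print(chosen, node_mbrs[chosen])
```

Here the first node grows by 30 area units and the second would grow by 90, so the object goes into the first node.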

Study Area

Minimum Bounding Rectangle


Order Dependence of a Query

Query: Select all households within 3 km of a store that have an income greater than $100,000

1. Select all households with an income greater than $100,000; from this selected set, select all households within 3 km of a store

2. Select all households within 3 km of a store; from this selected set, select all households with an income greater than $100,000
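The two plans can be compared with a small sketch. Both return the same households, but they differ in how many rows reach the more expensive spatial test. The store locations, household records and coordinates (in km) are made up for illustration.

```python
import math

stores = [(0.0, 0.0), (10.0, 0.0)]  # hypothetical store locations, in km
households = [
    {"id": 1, "xy": (1.0, 1.0), "income": 120_000},
    {"id": 2, "xy": (2.0, 0.0), "income": 60_000},
    {"id": 3, "xy": (9.0, 1.0), "income": 150_000},
    {"id": 4, "xy": (5.0, 5.0), "income": 200_000},
    {"id": 5, "xy": (11.0, 2.0), "income": 40_000},
    {"id": 6, "xy": (20.0, 20.0), "income": 30_000},
]


def near_store(h, radius_km=3.0):
    # Spatial test: distance to the nearest store (the costly step).
    return any(math.dist(h["xy"], s) <= radius_km for s in stores)


def high_income(h, threshold=100_000):
    # Attribute test: a cheap comparison on one column.
    return h["income"] > threshold


def income_then_distance():
    """Plan 1: attribute filter first, spatial test on the survivors."""
    selected = [h for h in households if high_income(h)]
    result = {h["id"] for h in selected if near_store(h)}
    return result, len(selected)      # rows that needed the spatial test


def distance_then_income():
    """Plan 2: spatial test on every row, attribute filter afterwards."""
    selected = [h for h in households if near_store(h)]
    result = {h["id"] for h in selected if high_income(h)}
    return result, len(households)    # every row needed the spatial test


print(income_then_distance())
print(distance_then_income())
```

With this data the first plan runs the spatial test on three households and the second on all six. Which order is cheaper in practice depends on how selective each condition is and on whether a spatial index is available.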

Distributed Databases

www.midcarb.org

References

Longley et al., Geographic Information Systems and Science, 2001, Chapter 11

Guenther, Environmental Information Systems, 1998, Chapter 3

Final Few Weeks

Lecture: April 15, Metadata and Interoperability

Lab: April 17 (next Thursday), project/problem set work

I’ll spend a few minutes with each of you to get an update on your progress.

• Article review due April 17

Lab: April 22, project lab session.

Lecture: April 24, GIS in decision-making

Project Presentation: May 8
