GIS DATABASES

an overview

2

Contents

– the basics of data storage
– overview of databases
  • the database approach
  • types of databases
  • databases in GIS
– design considerations
– development of an ARC/INFO database

3

Conceptual, logical and physical ...

Conceptual Logical Physical

4

A storage hierarchy ...

– files/tables
  • records
  • fields (types …)
– databases
– information systems
– decision support systems (DSS)
– approaches to storage
  • application/file based
  • databases

[Figure annotation: increasing complexity]

5

Application based approach

[Figure: three applications, each with its own data store]
• Permits: Permit Data
• Tax/Rates Assessment: Assessment Data
• Sewer Maintenance: Sewer Data

Applications using data stored as application-specific data

6

Database approach

[Figure: the applications Permits, Tax/Rates Assessment and Sewer Maintenance reach Permit Data, Assessment Data and Sewer Data through a shared Database Management System]

Database approach and use of shared data - implications for GIS

7

Database … a definition

• A collection of interrelated data stored together with controlled redundancy to serve one or more applications in an optimal fashion.

• A common and controlled approach is used in adding new data and in modifying and retrieving existing data within the database.

8

Databases … objectives/advantages

– centralised data storage and management … global view of data … data dictionary
  • standardisation of all aspects of data management
  • reduced duplication
  • multiple access / retrieval flexibility
  • integrity constraints … validation enforced
  • ...
– database management system (DBMS)
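The "integrity constraints … validation enforced" point can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS. The table, its columns (parcel_id, zoning_code, area_m2) and the permitted zoning codes are hypothetical and do not come from the slides.

```python
import sqlite3

# In-memory database; the schema below is an illustrative assumption.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE parcel (
        parcel_id   INTEGER PRIMARY KEY,
        zoning_code TEXT NOT NULL CHECK (zoning_code IN ('RES', 'COM', 'IND')),
        area_m2     REAL NOT NULL CHECK (area_m2 > 0)
    )
""")
conn.execute("INSERT INTO parcel VALUES (1, 'RES', 650.0)")


def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO parcel VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert((2, "COM", 1200.0))
rejected_code = try_insert((3, "XYZ", 500.0))  # unknown zoning code
rejected_area = try_insert((4, "IND", -10.0))  # non-positive area
rejected_dup = try_insert((1, "RES", 700.0))   # duplicate primary key
row_count = conn.execute("SELECT COUNT(*) FROM parcel").fetchone()[0]
```

Because the rules live in the database rather than in each application, every program that writes to the table is validated in the same way.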

9

Database/s … data dictionary

– the most critical (?) element of a database
– data about data … metadata
– essential for system development
– uses include
  • design - entities and data relationships
  • data capture - entry/validation
  • operations - program documentation
  • maintenance (impact assessment of proposed changes, estimation of effort, cost …)
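The maintenance use of a data dictionary (impact assessment of proposed changes) can be sketched as follows. The field names, types and the applications recorded against each field are hypothetical. A real data dictionary would hold far more metadata than this.

```python
from dataclasses import dataclass, field


@dataclass
class FieldEntry:
    """One data dictionary entry: data about a single field."""
    name: str
    data_type: str
    description: str
    used_by: list = field(default_factory=list)  # applications that read or write it


# Hypothetical entries, loosely following the permit/assessment/sewer example.
data_dictionary = {
    "parcel_id": FieldEntry(
        "parcel_id", "integer", "Unique parcel identifier",
        ["Permits", "Tax/Rates Assessment", "Sewer Maintenance"]),
    "assessed_value": FieldEntry(
        "assessed_value", "decimal", "Assessed value of the parcel",
        ["Tax/Rates Assessment"]),
    "sewer_diameter": FieldEntry(
        "sewer_diameter", "decimal", "Pipe diameter",
        ["Sewer Maintenance"]),
}


def impact_of_change(field_name):
    """List the applications that would need review if this field changed."""
    return sorted(data_dictionary[field_name].used_by)
```

For example, `impact_of_change("parcel_id")` lists all three applications, while a change to `assessed_value` touches only one.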

10

Data dictionary … types of information (general)

GIS Metadata

12

DBMS … key modules

– a data description/definition module
  • defines/creates/restructures
  • enforces rules
– a query module
  • retrieval for queries, ad hoc queries, simple reports
– a report writing program
– a high level language interface
– ...

13

Database … stages of development

– information systems plan for organisation
– system specification … user needs analysis
– conceptual design … data modelling
  • hardware and software independent
– physical design … database design
– database implementation
– monitoring/audit

14

Database … stages of development

15

Organisational strategy and IT
Land Information System (LIS) (i)

– Problems/issues:
  • rationalisation of land related information in government agencies
  • the removal/reduction of duplication
  • introduction of economies in data capture, maintenance and storage
  • better (and wider) access to data

solutions ...

16

Organisational strategy and IT
Land Information System (LIS) (ii)

– Solutions:
  • better data distribution mechanism (data format and location transparent to user)
  • knowledge of data distribution built into the data dictionary
  • reduction of data duplication
  • uniform query language (SQL)
  • coding and data interchange standardisation ( … SDTS)

18

Database types - a history

Evolution of database technology

19

Database types - hierarchical (i)

– lends itself to GIS use as data are often hierarchical in structure, e.g. municipality, province, country
– records divided into logically related fields … connected in a tree-like arrangement
– master field in each group of records … pointers … updates require pointers to be modified
– fast preset queries … ad hoc queries difficult or impossible

20

Database types - hierarchical (ii)

[Figure: tree with the levels below, from top to bottom]
COUNTRY (USA)
  States
    Counties
      Boundaries
        Nodes

Hierarchical structure for a cadastral database
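A minimal sketch of the hierarchical idea, assuming a simple in-memory tree: each record holds one parent pointer, so a preset query that follows the pointers is fast, while an ad hoc query has to search the tree. The `Node` class and the state and county names are illustrative only.

```python
class Node:
    """One record in a hierarchy: a single parent pointer, many children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)  # the parent's pointer list must be updated too

    def path(self):
        """Preset query: follow parent pointers up to the root."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return " > ".join(reversed(names))


def find(root, name):
    """Ad hoc query: no pointer helps, so the whole tree may be searched."""
    if root.name == name:
        return root
    for child in root.children:
        hit = find(child, name)
        if hit is not None:
            return hit
    return None


usa = Node("USA")
ohio = Node("Ohio", usa)
texas = Node("Texas", usa)
franklin = Node("Franklin County", ohio)
travis = Node("Travis County", texas)
```

Here `franklin.path()` walks two pointers to give "USA > Ohio > Franklin County", whereas `find(usa, "Travis County")` visits nodes until it happens on a match.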

23

Database types - network (i)

– similar to hierarchical but have multiple connections between files to accommodate many to many (M:M) relationships
– access to a particular file without searching the entire hierarchy above that file
– linked records … quick preset searches … large overhead in pointer management
– modification after creation difficult
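The many to many case can be sketched with explicit links in both directions. This assumes a hypothetical example in which one boundary is shared by several counties. The `NetworkDB` class and its method names are invented for illustration and do not describe any particular network DBMS.

```python
from collections import defaultdict


class NetworkDB:
    """Records linked by explicit pointers; a record may have several owners."""

    def __init__(self):
        self.boundaries_of = defaultdict(set)  # county -> boundaries
        self.counties_of = defaultdict(set)    # boundary -> counties

    def link(self, county, boundary):
        # Both pointer sets are updated together: this is the maintenance overhead.
        self.boundaries_of[county].add(boundary)
        self.counties_of[boundary].add(county)

    def neighbours(self, county):
        """Counties sharing at least one boundary with the given county."""
        found = set()
        for boundary in self.boundaries_of[county]:
            found |= self.counties_of[boundary]
        found.discard(county)
        return found


db = NetworkDB()
db.link("County A", "b1")
db.link("County B", "b1")  # b1 is shared by A and B
db.link("County B", "b2")
db.link("County C", "b2")  # b2 is shared by B and C
```

The neighbour query follows the links directly without walking any hierarchy, but every insert has to keep two sets of pointers consistent.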

24

Database types - network (ii)

25

Database types - network (ii)

26

Database types - relational (i)

– model developed from mathematics
– records and fields in a 2-dimensional table
– no pointers etc … any field can be used to link one table to another
– normalisation … redundancy/stable structure
– ad hoc queries SQL … modifications easy
– not very efficient for GIS … SQL3
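A short sketch of the relational points above, again using Python's built-in sqlite3 module: two plain tables with no pointers, linked at query time through a shared field and queried ad hoc with SQL. The tables and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcel (parcel_id INTEGER, zoning_code TEXT, area_m2 REAL);
    CREATE TABLE zoning (zoning_code TEXT, description TEXT);
    INSERT INTO parcel VALUES (1, 'RES', 650.0), (2, 'COM', 1200.0), (3, 'RES', 480.0);
    INSERT INTO zoning VALUES ('RES', 'Residential'), ('COM', 'Commercial');
""")

# Ad hoc query: the link between the tables is stated in the query itself.
rows = conn.execute("""
    SELECT z.description, COUNT(*), SUM(p.area_m2)
    FROM parcel AS p
    JOIN zoning AS z ON p.zoning_code = z.zoning_code
    GROUP BY z.description
    ORDER BY z.description
""").fetchall()
```

Nothing in the table definitions anticipates this question; a different join field or grouping needs only a different query, not a restructured database.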

27

Database types - relational (ii)

28

Database types - relational (iii)

Hierarchical structure

Network structure

Relational structure (part…)

30

Centralised vs distributed

– a database does not necessarily mean a centralised arrangement, i.e. all data in one physical place

31

GIS and distributed databases ...

– trend towards open systems ...
  • special hardware and software can be used widely … specific applications optimised
  • system/network communications are easier
– modular implementation from an overall design … incremental change
– unlimited capacity (nodes) … lower risks

32

Approaches to GIS system design

– develop a proprietary system
– develop a hybrid system: proprietary graphics + commercial DBMS for attribute data (e.g. ARC/INFO)
– use commercial DBMS and develop spatial functions and graphics display used in geographic analysis (e.g. siroDBMS, System9)
– develop a spatial DBMS from scratch

33

Approaches to GIS system design

Software linkages

(1) Separate spatial and attribute data

(2) Integrated spatial and attribute data

35

GIS databases … some problems (i)

– centralised risk
  • centralisation demands better quality control, otherwise higher potential for disaster
– cost
  • large DBMSs are expensive to design, implement and operate
  • piecemeal design is difficult
– complexity
  • need to keep track of complex hardware and software
  • need to keep track of graphical as well as attribute data and the links

36

GIS databases … some problems (ii)

Cascading effects of change in a GIS database (ESRI 1989)

GIS Design

38

GIS database design guide

39

Objectives of design

– a good design results in a database which:
  • contains necessary data but no redundant data
  • organises data so that different users access the same data
  • accommodates different views of the data
  • distinguishes applications which maintain data from those that use it
  • appropriately represents, codes and organises geographic features

40

Design methodology (for ARC/INFO)

– conceptual model
  • model the users’ view
  • define entities and their relationships
– logical model
  • identify representation of entities
  • match to ARC/INFO data model
  • organise into geographic data sets
– physical model

41

Design methodology (for ARC/INFO)

– 1. Model the users’ view
– 2. Define entities and their relationships
– 3. Identify representation of entities
– 4. Match to ARC/INFO data model
– 5. Organise into geographic data sets

42

1. Model the users’ view

– create a model of work performed by users for which ‘location’ is a factor
  • identify organisational functions
  • identify the data which supports the functions
– organise data into sets of geographic features
  • data function matrix
    – high level classification of data
    – interdependence of data and function
    – difference between users and creators of data

43

Land development management function

44

Data function matrix … an example
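The matrix shown on the original slide is not recoverable from the extracted text. As a hedged illustration of the idea only, a data function matrix can be held as a mapping from organisational functions to the data sets they create ("C") or merely use ("U"). The functions, data sets and cell values below are hypothetical.

```python
# C = the function creates/maintains the data, U = the function only uses it.
# All entries are illustrative assumptions, not taken from the slide.
matrix = {
    "Permits":              {"Parcel": "U", "Permit": "C", "Sewer": "U"},
    "Tax/Rates Assessment": {"Parcel": "C", "Permit": "U"},
    "Sewer Maintenance":    {"Parcel": "U", "Sewer": "C"},
}


def creators(data_set):
    """Functions responsible for creating or maintaining a data set."""
    return sorted(f for f, row in matrix.items() if row.get(data_set) == "C")


def users(data_set):
    """Functions that depend on a data set without maintaining it."""
    return sorted(f for f, row in matrix.items() if row.get(data_set) == "U")
```

Reading down a column in this way separates the creators of a data set from its users, which is the distinction the matrix is meant to expose.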

45

2. Define entities and their relationships

– entities: distinguishable objects which have a common set of properties
  • identify and describe entities
  • identify and describe the relationships among these entities
  • document the process
    – diagrams
    – data dictionary
  • normalise the data

46

Entity/relationship definition

47

Diagramming … entities

48

Normalisation

– First Normal Form (1NF)
– Second Normal Form (2NF)
– Third Normal Form (3NF)

ASR - Assessor

Underlying entities ...

Parcel Zoning Owner Ownership
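The tables from the original normalisation slides did not survive extraction. The sketch below uses invented attributes and values to show how a flat assessor record could be split into the four underlying entities named above, so that each fact is stored once.

```python
# Flat, unnormalised records: owner and zoning details repeat on every row.
# All attribute names and values are hypothetical.
flat = [
    {"parcel_id": 1, "zone": "RES", "zone_desc": "Residential",
     "owner_id": 10, "owner_name": "A. Smith", "share": 1.0},
    {"parcel_id": 2, "zone": "COM", "zone_desc": "Commercial",
     "owner_id": 10, "owner_name": "A. Smith", "share": 0.5},
    {"parcel_id": 2, "zone": "COM", "zone_desc": "Commercial",
     "owner_id": 11, "owner_name": "B. Jones", "share": 0.5},
]

# Each entity keeps only the attributes that depend on its own key.
parcel = {r["parcel_id"]: {"zone": r["zone"]} for r in flat}
zoning = {r["zone"]: {"zone_desc": r["zone_desc"]} for r in flat}
owner = {r["owner_id"]: {"owner_name": r["owner_name"]} for r in flat}
# Ownership resolves the many to many link between parcels and owners.
ownership = [(r["parcel_id"], r["owner_id"], r["share"]) for r in flat]


def owners_of(parcel_id):
    """Names of all owners of a parcel, found through the Ownership entity."""
    return sorted(owner[o]["owner_name"] for p, o, _ in ownership if p == parcel_id)
```

After the split, changing an owner's name is a single update in `owner` rather than one edit per parcel owned.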

53

3. Identify representation of entities

– determine the most effective spatial representation for geographic features
– consider whether:
  • a feature might be represented on a map
  • the shape of a feature might be significant in performing geographic analysis
  • the feature will have different representations at different map scales
  • textual attributes of the feature will be displayed on map products
  • ...

54

4. Match to ARC/INFO data model

– determine the appropriate ARC/INFO representation for entities
  • points, lines, polygons
– ensure complex feature classes are supported
  • a route is composed of sections, which in turn are based on arcs
  • a region is composed of polygons
  • an event is a point or a line which occurs along a route
– others (e.g. GRID, TIN)
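The idea of an event occurring along a route can be sketched generically: a point event is stored as a distance (measure) along the route, and its coordinates are derived when needed. This is not ARC/INFO's implementation. The function name, the representation of the route as a list of vertices and the sample coordinates are all assumptions.

```python
import math


def locate_event(route, measure):
    """Return the (x, y) position `measure` units along a route given as vertices."""
    if measure < 0:
        raise ValueError("measure must be non-negative")
    remaining = measure
    # Walk the route one straight segment at a time.
    for (x1, y1), (x2, y2) in zip(route, route[1:]):
        length = math.hypot(x2 - x1, y2 - y1)
        if length > 0 and remaining <= length:
            t = remaining / length  # fraction of the way along this segment
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        remaining -= length
    raise ValueError("measure lies beyond the end of the route")


# Hypothetical route with two segments of length 5 and 6.
route = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
```

Storing only the measure means the event needs no geometry of its own and follows the route if the route is re-digitised.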

55

Matching to ARC/INFO data model

[Table; column headings only: Entity | Spatial type | ARC/INFO coverage | Attribute files | Related to | Anno. | LUT]

56

5. Organise into geographic data sets

– to identify and name the geographic data sets that will contain the various entities:
  • define the contents of geographic data sets (coverages, grids etc.)
  • name workspaces, geographic data sets, entities and attributes
  • complete entity definitions
  • add cartographic text and lookup tables

57

5 (i) Define the content of geographic data sets

– data sets supported: coverage, grid, tin, image and drawing
– coverages: several entities can be grouped into a single coverage
– DBMS: stored in a separate database management system

58

5 (ii) Geographic datasets, entities and attributes

– coverage definitions
  • high level summary of the data physically stored in the database
  • required for defining the coverage structure
– file naming conventions in ARC/INFO

59

5 (iii) Complete entity definitions

– background information: coverage name, data source, agency, number of records etc.
– attribute definition
  • attribute name, type, field width
  • validation rules / permitted values
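An attribute definition of this kind (name, type, field width, permitted values) can be written down and checked mechanically. The following is a minimal sketch. The `AttributeDef` class, the `LANDUSE` attribute and its permitted codes are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AttributeDef:
    """Definition of one attribute as it might appear in an entity definition."""
    name: str
    type: type
    width: int
    permitted: Optional[set] = None  # None means any value of the right type


def validate(definition, value):
    """Return a list of rule violations for one value (empty if valid)."""
    problems = []
    if not isinstance(value, definition.type):
        problems.append("wrong type")
    if len(str(value)) > definition.width:
        problems.append("exceeds field width")
    if definition.permitted is not None and value not in definition.permitted:
        problems.append("value not permitted")
    return problems


# Hypothetical attribute: a three-character land use code.
landuse = AttributeDef("LANDUSE", str, 3, {"RES", "COM", "IND"})
```

For instance, `validate(landuse, "RES")` returns an empty list, while `validate(landuse, "FARM")` reports both a width and a permitted-value problem.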

60

5 (iv) Cartographic text & code tables

– annotation (text, placing rules etc.)
– lookup tables
  • predefined set of values
  • description / labels
  • means of creating displays based on attribute values
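A lookup table in this sense is simply a mapping from a predefined code to its label and display properties. The sketch below assumes hypothetical codes, labels and colours. The fallback for unknown codes is a design choice of this example and is not described in the slides.

```python
# Predefined codes with their descriptions and display colours (all hypothetical).
lookup_table = {
    "RES": {"label": "Residential", "colour": "yellow"},
    "COM": {"label": "Commercial", "colour": "red"},
    "IND": {"label": "Industrial", "colour": "purple"},
}


def symbolise(code, lut=lookup_table):
    """Return (label, colour) for an attribute code; unknown codes get a neutral default."""
    entry = lut.get(code, {"label": "Unclassified", "colour": "grey"})
    return entry["label"], entry["colour"]


# Features carry only the short code; display properties come from the table.
parcels = [(1, "RES"), (2, "COM"), (3, "XYZ")]
legend = [(pid, *symbolise(code)) for pid, code in parcels]
```

Changing how a class is drawn then means editing one row of the lookup table rather than every feature.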

61

Robinson (Ch 14): Scale and GIS databases

– (past) a map’s scale greatly influenced map content and data resolution
– GIS data are ‘scaleless’ … yet scale is still a critical factor with digital databases, because of the ways in which we create them
– scale and resolution (Tab 14.1)
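Table 14.1 itself is not reproduced in the extracted text. The arithmetic linking scale to resolution can still be sketched: the ground distance covered by the smallest mark on a map is that mark's width multiplied by the scale denominator. The 0.5 mm minimum mark width and the sample scales are assumptions for illustration, not values taken from the table.

```python
def ground_resolution_m(scale_denominator, min_mark_mm=0.5):
    """Ground distance in metres covered by the smallest mark drawable on the map.

    min_mark_mm is an assumed minimum mark width, not a figure from the source.
    """
    return min_mark_mm * scale_denominator / 1000.0


# Illustrative scales: 1:1,000, 1:24,000 and 1:250,000.
resolutions = {d: ground_resolution_m(d) for d in (1_000, 24_000, 250_000)}
```

Data digitised from a 1:250,000 map therefore carry a much coarser positional resolution than data from a 1:1,000 plan, even though both can be drawn at any scale once they are in the database.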

62

Robinson (Ch 14): Scale and resolution issues

– symbolisation and display problems
– handling databases of different scales
  • join problems (e.g. urban/rural)
  • merge problems (different themes)
  • scale levels
    – in general
    – large scale data (AM/FM etc.)

63

Robinson (Ch 15): Managing large GIS

– data organisation
  • partitioning
  • spatial indexes
  • metadata
– data compression
  • run length encoding (RLE)
  • quadtree encoding
  • others ...
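The two compression schemes named above can be sketched for a small raster. Both functions are minimal illustrations under stated assumptions: the run length encoder works on one row at a time, and the quadtree encoder assumes a square grid whose side is a power of two. The function names and the sample grid are hypothetical.

```python
def rle_encode(row):
    """Run length encode one raster row as [value, run_length] pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([value, 1])  # start a new run
    return runs


def rle_decode(runs):
    """Expand [value, run_length] pairs back into the original row."""
    return [value for value, count in runs for _ in range(count)]


def quadtree(grid, x=0, y=0, size=None):
    """Encode a square 2^n x 2^n grid: a leaf is a value, a node is [NW, NE, SW, SE]."""
    if size is None:
        size = len(grid)
    first = grid[y][x]
    # A uniform block is stored as a single value.
    if all(grid[r][c] == first
           for r in range(y, y + size) for c in range(x, x + size)):
        return first
    half = size // 2
    return [quadtree(grid, x, y, half), quadtree(grid, x + half, y, half),
            quadtree(grid, x, y + half, half), quadtree(grid, x + half, y + half, half)]


grid = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
tree = quadtree(grid)
```

Both schemes pay off when neighbouring cells tend to share a value, as in categorical rasters such as land use. They save little on noisy data.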