Spatial Data Model 2

Lecture 4, Wednesday 27th August 2014

DEPARTMENT OF GEOGRAPHY AND ENVIRONMENT

UNIVERSITY OF DHAKA

Relational data model

The most popular DBMS model for GIS

Based on a set of mathematical principles called relational algebra

More of a concept than a data structure

Internal architecture varies substantially from one RDBMS to another

Can link the complex spatial relationships between objects

Types of relation:

1. One to one

2. One to many

3. Many to many

4. Many to one

Example of Geo-relational data model

Advantage

1. There is no data redundancy

- type of building of an owner can be changed without destroying the relation between type and rate

- a new type of building can be inserted such as “clay”

2. The most flexible data model

Disadvantage

1. Most RDBMS data manipulation languages require the user to know the contents of relations
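The building example under Advantage can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names, owner names and rate values are assumptions, not part of the lecture. It shows that an owner's building type can change, and a new type such as "clay" can be added, without touching the relation between type and rate.

```python
import sqlite3

# Hypothetical schema: all names and rate values below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE building_type (
        type TEXT PRIMARY KEY,
        rate REAL NOT NULL
    );
    CREATE TABLE owner (
        name TEXT PRIMARY KEY,
        type TEXT NOT NULL REFERENCES building_type(type)
    );
    INSERT INTO building_type VALUES ('brick', 10.0), ('wood', 6.0);
    INSERT INTO owner VALUES ('Rahim', 'brick'), ('Karim', 'wood');
""")


def rate_for(owner_name):
    """Look up an owner's rate by joining owner -> building_type (many to one)."""
    row = conn.execute(
        "SELECT bt.rate FROM owner AS o "
        "JOIN building_type AS bt ON o.type = bt.type "
        "WHERE o.name = ?",
        (owner_name,),
    ).fetchone()
    return row[0] if row else None


# Change the building type of one owner; the type/rate relation is untouched.
conn.execute("UPDATE owner SET type = 'wood' WHERE name = 'Rahim'")

# Insert a new building type; no owner record has to change.
conn.execute("INSERT INTO building_type VALUES ('clay', 3.0)")
```

Because each rate is stored once, in building_type, the update and the insert each touch a single row. Note that the query in rate_for has to name both tables and their columns, which is the disadvantage listed above: the user must know the contents of the relations.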

Network data model

In the network model, each individual piece of data can be linked directly to any other piece, anywhere in the database

It was developed in the mid-1960s as part of the work of CODASYL, which proposed the programming language COBOL (1966) and then the network model (1971)

Example:

A hospital database has three record types:

- patient: name, date of admission, etc.

- doctor: name, etc.

- ward: number of beds, name of staff nurse, etc.

• We need to link each patient to a doctor and also to a ward

• Doctor record can own many patient records

• Patient record can be owned by both doctor and ward records
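The hospital example can be sketched as records connected by direct links. This is a minimal illustration using Python dataclasses, not how a CODASYL system is actually implemented; the class names, field names and the admit helper are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# eq=False keeps comparisons identity-based, which suits records that link to each other.
@dataclass(eq=False)
class Doctor:
    name: str
    patients: List["Patient"] = field(default_factory=list)


@dataclass(eq=False)
class Ward:
    name: str
    beds: int
    patients: List["Patient"] = field(default_factory=list)


@dataclass(eq=False)
class Patient:
    name: str
    doctor: Optional[Doctor] = None  # owner of type "doctor" (at most one)
    ward: Optional[Ward] = None      # owner of type "ward" (at most one)


def admit(patient, doctor, ward):
    """Hypothetical helper: link a patient to both owner records."""
    patient.doctor = doctor
    patient.ward = ward
    doctor.patients.append(patient)
    ward.patients.append(patient)


dr_khan = Doctor("Dr. Khan")
ward_a = Ward("Ward A", beds=20)
ward_b = Ward("Ward B", beds=12)

p1 = Patient("Patient 1")
p2 = Patient("Patient 2")
admit(p1, dr_khan, ward_a)
admit(p2, dr_khan, ward_b)
```

One doctor record owns two patient records, and each patient record is owned by one doctor and one ward. The single doctor and ward fields on Patient mirror the restriction that a record cannot have two owners of the same type.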

Advantage

1. Can handle many to many relations

2. Much greater flexibility of search

3. Reduce redundancy of data

Disadvantage

1. Links between records of the same type are not allowed

2. While a record can be owned by several records of different types, it cannot be owned by more than one record of the same type (a patient can have only one doctor and only one ward)

3. Need more storage in the computer

Hierarchical data model

A set of record “types”

- e.g. supplier record type, department record type, part record type

A set of links connecting all record types in one data structure diagram (a “tree”)

At most one link between two record types, hence links need not be named

- e.g. every county has exactly one state, every part has exactly one department

No connections between occurrences of the same record type

- cannot go between records at the same level unless they share the same parent

In a geographic database, a quadtree is an example of the hierarchical data model
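To make the quadtree example concrete, here is a minimal sketch of a region quadtree built from a small raster. It assumes a square grid whose side is a power of two; the function names, quadrant labels and sample grid are illustrative assumptions, not taken from the lecture.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Build a region quadtree for a square grid (side length a power of two).

    A uniform block becomes a leaf holding its cell value; any other block
    becomes a parent with exactly four children (NW, NE, SW, SE).
    """
    if size is None:
        size = len(grid)
    first = grid[row][col]
    uniform = all(
        grid[r][c] == first
        for r in range(row, row + size)
        for c in range(col, col + size)
    )
    if uniform:
        return first
    half = size // 2
    return {
        "NW": build_quadtree(grid, row, col, half),
        "NE": build_quadtree(grid, row, col + half, half),
        "SW": build_quadtree(grid, row + half, col, half),
        "SE": build_quadtree(grid, row + half, col + half, half),
    }


def count_leaves(node):
    """Count leaf blocks by walking the tree from parent to children."""
    if not isinstance(node, dict):
        return 1
    return sum(count_leaves(child) for child in node.values())


# Illustrative 4 x 4 raster: 0 = background, 1 = feature.
grid = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
tree = build_quadtree(grid)
```

Every block has exactly one parent, and a block can only be reached by descending from the root, which is the vertical-only linkage that characterises the hierarchical model. Here the 16 cells are stored as 7 leaves because three of the four top-level quadrants are uniform.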

Advantage

1. High speed of access to large datasets and ease of updating

2. The model is based on one to one and many to one relationships

Disadvantage

1. Linkages are only possible vertically, not horizontally or diagonally; that is, there is no relation between different trees at the same level unless they share the same parent

2. The tree cannot branch into a network, so many to many relationships cannot be represented directly

Object-oriented data model

Uses functions to model the spatial and non-spatial relationships of geographic objects and their attributes

An object is an encapsulated unit which is characterized by attributes, a set of operations and rules

Includes four basic elements:

1. Object oriented user interface

2. Object oriented programming languages

3. Object oriented analysis and design methodologies

4. Object oriented database management

Generic properties: there should be an inheritance relationship

Abstraction: objects, classes and super classes are to be generated by classification, generalization, association and aggregation

Ad hoc queries: users can request spatial operations to obtain spatial relationships of geographic objects using a special query language
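The ideas of encapsulation, inheritance and aggregation can be sketched in a few lines of Python. The class names, attributes and the validation rule below are assumptions chosen for illustration; they do not describe any particular object-oriented GIS.

```python
class GeoObject:
    """Super class: attributes and operations shared by all geographic objects."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{type(self).__name__}: {self.name}"


class Polygon(GeoObject):
    """Generalization: any area feature, encapsulating its vertices and operations."""

    def __init__(self, name, vertices):
        super().__init__(name)
        # Rule (an assumption for this sketch): a polygon needs at least three vertices.
        if len(vertices) < 3:
            raise ValueError("a polygon needs at least three vertices")
        self.vertices = list(vertices)

    def area(self):
        """Planar area by the shoelace formula."""
        total = 0.0
        n = len(self.vertices)
        for i in range(n):
            x1, y1 = self.vertices[i]
            x2, y2 = self.vertices[(i + 1) % n]
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0


class Building(Polygon):
    """Classification: a building inherits geometry and operations from Polygon."""

    def __init__(self, name, vertices, building_type):
        super().__init__(name, vertices)
        self.building_type = building_type


class Block:
    """Aggregation: a block is composed of several buildings."""

    def __init__(self, name, buildings):
        self.name = name
        self.buildings = list(buildings)

    def built_area(self):
        return sum(b.area() for b in self.buildings)


hall = Building("Hall", [(0, 0), (10, 0), (10, 10), (0, 10)], "brick")
shed = Building("Shed", [(20, 0), (24, 0), (24, 5), (20, 5)], "wood")
block = Block("Block 1", [hall, shed])
```

Building never defines area itself; it inherits the operation from Polygon, while Block obtains its built area by aggregating its member objects.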

Data quality

Refers to the fitness for use of data for the intended application

Qualitative criteria for high quality data:

1. Data must be accurate and reliable

2. Current and up to date

3. Complete and precise

4. Concise and intelligible

5. Conventionally handled (maintained, transmitted, distributed, classified, resampled, retrieved and updated)

Other factors:

a. Must be projected to the real world

b. Must be captured at a scale using a classification scheme

c. Cartographic properties

d. Transfer format

Accuracy

Degree to which data agree with the values or descriptions of the real-world features that they represent.

Measure of how “close” data match the true values or descriptions.

Accuracy is related to cost of data acquisition.

Data accuracy is often grouped into three forms:

1. thematic accuracy

2. positional accuracy

3. temporal accuracy

Precision

How “exactly” data are measured and stored

In mathematics, the exactness of representation is the number of significant digits used to record data. For digital geographic data, it is the number of “bits” and the form (long integer, floating point, etc.) used for data capture and storage.

Comparison of the precision of the three data storage formats on a PC

Format                            Bits of storage   Significant digits of precision   True floating point decimals
Long integer                      32                9                                 No
Single precision floating point   32                7                                 Yes
Double precision floating point   64                15                                Yes
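The practical effect of storage precision can be checked with Python's standard struct module, which packs a number into 32-bit or 64-bit IEEE 754 form. The coordinate value and the function names are illustrative assumptions.

```python
import struct


def store_as_single(value):
    """Round-trip a number through 32-bit (single precision) storage."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def store_as_double(value):
    """Round-trip a number through 64-bit (double precision) storage."""
    return struct.unpack("<d", struct.pack("<d", value))[0]


# Hypothetical easting in metres, recorded to the nearest millimetre.
easting = 1234567.891

single = store_as_single(easting)   # 1234567.875: only about 7 digits survive
double = store_as_double(easting)   # unchanged: Python floats are already 64-bit

single_error = abs(single - easting)  # about 0.016 m lost to storage alone
```

At this magnitude, adjacent single precision values are 0.125 apart, so millimetre detail in a seven-digit coordinate cannot be kept. This is why coordinates in large projected systems are usually stored in double precision.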

Error

The deviation between two values:

1. measured value

2. value of the real world feature

Three types of error that may occur in measurement and observation:

1. gross error: blunders and mistakes

2. systematic error

3. random error (normal distribution and least squares adjustment)
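The difference between systematic and random error can be illustrated by comparing measured values with reference ("true") values. The sketch below is an assumption-laden illustration: the sample values are invented, and the root mean square error (RMSE) is a common summary measure that the lecture itself does not name.

```python
import math


def error_summary(measured, true_values):
    """Return (mean error, spread about the mean, RMSE) for paired observations.

    The mean error estimates the systematic component (bias); the spread
    about the mean estimates the random component; RMSE combines both.
    """
    errors = [m - t for m, t in zip(measured, true_values)]
    n = len(errors)
    mean_error = sum(errors) / n
    spread = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / n)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return mean_error, spread, rmse


# Invented check-point data (metres); gross errors are assumed to be removed already.
true_values = [100.0, 200.0, 300.0, 400.0]
measured = [100.5, 200.3, 300.7, 400.5]

bias, spread, rmse = error_summary(measured, true_values)
```

Here every measurement is too large by roughly half a metre, which points to a systematic error that could be corrected by a constant shift. The smaller scatter around that shift is the random part, which can only be reduced statistically.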

Uncertainty

A certain degree of doubt

Lack of confidence in the use of the data

Sources of error

Sources of error can be divided into three basic groups:

1. Original source maps

- Map projection

- Map scale

- Cartographic generalisations

- Cartographic revision

- Feature classification/ coding

- Field survey measurements

- Photogrammetric measurements

- Image analysis

- Sampling design

- Aging maps

2. Data automation and compilation

- digitizing

- attribute data input

- format translation

- map projection transformation

- vectorization of raster data

3. Data processing and analysis

- numerical rounding in computing

- Overlay analysis

- Classification and re-classification

- Generalization and aggregation

- Interpolation

- Inappropriate use of algorithm

In addition, Vitek et al. (1984) grouped errors into two categories:

1. Inherent errors

2. Operational errors

Components of spatial data quality

Lineage of spatial data

Positional accuracy

Attribute accuracy

Error matrix/ confusion matrix

Kappa coefficient

Temporal accuracy

Semantic accuracy
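As an illustration of how attribute accuracy is summarised, the sketch below computes overall accuracy and the Kappa coefficient from an error (confusion) matrix. The matrix values and the function name are invented for the example; rows are taken to be the classified data and columns the reference data.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, kappa) for a square error matrix.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    (diagonal total / grand total) and p_e is the agreement expected by
    chance, computed from the row and column totals.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Invented two-class example with 100 check points.
error_matrix = [
    [35, 5],
    [10, 50],
]
accuracy, kappa = overall_accuracy_and_kappa(error_matrix)
```

For this matrix the overall accuracy is 0.85, but Kappa is about 0.69 because part of the agreement (0.51) would be expected by chance alone.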
