Geodatabase
Database Design
Tomislav Sapic
GIS Technologist
Faculty of Natural Resources Management
Lakehead University
Table Design
• Two types of attribute tables in GIS:
– Feature attribute table
– Non-spatial attribute table
• Table fields (columns) are also referred to as attributes. In GIS they are essentially variables whose values provide characteristics (measurements, descriptions) of the individual features represented by records (rows).
• When designing a table and its fields (attributes), the overall rules are:
– Simple values
– Values are of minimal length
– Individual values – smallest element
– Values are easily understandable and searchable
– Values are unambiguous
– Potential for errors is minimized
– Data redundancy is reduced
– Potential logical inconsistency is avoided
• In order to reduce the overall data size and redundancy a database is often separated into smaller tables that are related and linked to each other, creating a relational database.
• The process of redesigning tables to reduce data redundancy and potential logical inconsistency, including, when needed, creating separate tables linked to each other, is called normalization.
• Database normalization is defined through degrees of normalization – First Normal Form, Second Normal Form, Third Normal Form, Fourth Normal Form, Fifth Normal Form, etc.
• A typical GIS database conforms to the Third Normal Form, and therefore also to the First and Second Normal Forms.
• The goals of normalization:
– Avoid (eliminate) insertion, update and deletion anomalies.
– Avoid redundant data, which wastes space and may cause data integrity problems.
– Avoid possible logical inconsistency.
– Ensure that the separate tables can be maintained and updated separately and linked when necessary.
– Facilitate a distributed database (Chang 2006).
• Normalization of databases was first proposed and explained by E.F. Codd, an IBM researcher, in his 1970 paper “A Relational Model of Data for Large Shared Data Banks” (Codd 1970). Since then the concept has been modified somewhat, but its main components remain valid.
Database Design
• Un-normalized table (multiple and redundant values)
First Normal Form (1NF)
Source: Chang (2006)
Database Normalization
1. There’s no top-to-bottom ordering to the rows.
2. There’s no left-to-right ordering to the columns.
3. There are no duplicated rows (there is a (composite) Primary Key field in the table).
4. Every row-and-field intersection contains exactly one value from the applicable domain (and nothing else).
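As a minimal sketch of rule 4, assume a hypothetical parcel table in which one `owner` cell holds several comma-separated names (the field names and values are illustrative and are not taken from Chang 2006). The Python below splits each multi-valued cell into one row per value and then checks rule 3 against the composite key:

```python
# Hypothetical un-normalized records: the "owner" cell holds several values.
unnormalized = [
    {"pin": "P101", "owner": "Wang, Chang", "zoning": "Residential"},
    {"pin": "P102", "owner": "Smith", "zoning": "Commercial"},
]

def to_first_normal_form(rows, field, sep=","):
    """Return rows in which `field` holds exactly one value per cell."""
    result = []
    for row in rows:
        for value in row[field].split(sep):
            new_row = dict(row)            # copy the other fields unchanged
            new_row[field] = value.strip()
            result.append(new_row)
    return result

first_nf = to_first_normal_form(unnormalized, "owner")

# The composite key (pin, owner) should now identify each row uniquely.
keys = [(r["pin"], r["owner"]) for r in first_nf]
has_duplicate_rows = len(keys) != len(set(keys))
```

The result still repeats `zoning` for parcel P101, which is the kind of redundancy the higher normal forms address.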
Second Normal Form (2NF)
• The table should conform to 1NF.
• All non-prime attributes should be functionally dependent on a primary key (or the whole of a composite primary key).
• A Primary Key is a field with unique values that identifies each record in the table.
• A Foreign Key is a field whose individual values match values in a Primary Key field; its values need not be unique.
• A Foreign Key field is used to connect a table to another table with a Primary Key field and can contain non-unique values.
• Another widely used definition of 2NF is that a non-prime field cannot be dependent on a subset of a composite primary key. Field (attribute) dependence on a subset of a composite primary key is called partial dependency.
Source: Chang (2006)
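A partial dependency can be illustrated with a small sketch. The table below is hypothetical (the fields `pin`, `owner`, `share` and `zoning` are assumptions for illustration), with the composite primary key (`pin`, `owner`). Checking sample rows can only refute a functional dependency, never prove it, so the helper is a diagnostic aid rather than a definition:

```python
# Hypothetical 1NF table with the composite primary key (pin, owner).
rows = [
    {"pin": "P101", "owner": "Wang",  "share": 60, "zoning": "Residential"},
    {"pin": "P101", "owner": "Chang", "share": 40, "zoning": "Residential"},
    {"pin": "P102", "owner": "Smith", "share": 70, "zoning": "Commercial"},
    {"pin": "P102", "owner": "Wang",  "share": 30, "zoning": "Commercial"},
]

def determines(rows, determinant, dependent):
    """True if every combination of `determinant` values maps to one `dependent` value."""
    seen = {}
    for row in rows:
        key = tuple(row[f] for f in determinant)
        if seen.setdefault(key, row[dependent]) != row[dependent]:
            return False
    return True

# zoning is fixed by pin alone: a partial dependency on the composite key.
zoning_on_pin = determines(rows, ["pin"], "zoning")
# share is not fixed by pin alone; it needs the whole key.
share_on_pin = determines(rows, ["pin"], "share")
share_on_key = determines(rows, ["pin", "owner"], "share")

# 2NF decomposition: move zoning to its own table keyed by pin.
parcel_owner = [{"pin": r["pin"], "owner": r["owner"], "share": r["share"]} for r in rows]
parcel = {r["pin"]: r["zoning"] for r in rows}
```

After the split, `zoning` is stored once per parcel, and `pin` in `parcel_owner` acts as a foreign key to `parcel`.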
Third Normal Form (3NF)
• The table should conform to 2NF.
• All non-prime fields can be functionally dependent only on a (composite) primary key field and not on a non-prime field – field (attribute) dependence on a non-prime field is called transitive dependency.
• The above type of functional dependency between non-prime fields can potentially cause logical inconsistency in the data (data quality!).
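The logical inconsistency that a transitive dependency can cause is sketched below. The parcel table and its fields (`pin`, `zone_code`, `zone_name`) are hypothetical; `zone_name` depends on `zone_code`, a non-prime field, rather than on the primary key `pin`:

```python
# Hypothetical 2NF table: pin is the primary key, but zone_name depends on
# zone_code (a non-prime field), which is a transitive dependency.
parcels = [
    {"pin": "P101", "zone_code": 1, "zone_name": "Residential"},
    {"pin": "P102", "zone_code": 2, "zone_name": "Commercial"},
    {"pin": "P103", "zone_code": 1, "zone_name": "Residential"},
]

def inconsistent_codes(rows):
    """Return the zone codes that are paired with more than one zone name."""
    names = {}
    for row in rows:
        names.setdefault(row["zone_code"], set()).add(row["zone_name"])
    return sorted(code for code, found in names.items() if len(found) > 1)

# Editing the name in only one record leaves code 1 with two different names.
parcels[0]["zone_name"] = "Rural Residential"
conflicts = inconsistent_codes(parcels)

# 3NF decomposition: each name is stored once, keyed by zone_code.
zones = {1: "Residential", 2: "Commercial"}
parcels_3nf = [{"pin": r["pin"], "zone_code": r["zone_code"]} for r in parcels]

# A single edit in the zones table now applies to every parcel with that code.
zones[1] = "Rural Residential"
names_for_code_1 = {zones[r["zone_code"]] for r in parcels_3nf if r["zone_code"] == 1}
```

In the decomposed form there is only one place to edit a zone name, so the two parcels with code 1 cannot disagree.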
Database Design - Table Join in ArcGIS
• Often, related data are stored in two or more different tables for reasons of database simplification, easier editing and management, and smaller size.
• Two tables can be joined as long as both have a field in which some or all values are the same as in the corresponding field of the other table, and the values are stored as the same data type.
• The linking field in the origin table is called the primary key, and in the destination table the foreign key.
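The matching logic behind a join can be sketched in plain Python. This is not the ArcGIS join tool, and the tables and field names are hypothetical; the point is that a foreign key value stored as a different data type than the primary key does not match:

```python
# Hypothetical origin table (one record per zone), keyed by its primary key.
zones = {
    1: {"zone_name": "Residential"},
    2: {"zone_name": "Commercial"},
}
# Hypothetical destination table; zone_code is the foreign key.
parcels = [
    {"pin": "P101", "zone_code": 1},
    {"pin": "P102", "zone_code": 2},
    {"pin": "P103", "zone_code": "1"},   # stored as text, not integer
]

def join(destination, origin, foreign_key):
    """Append the matching origin record's fields to each destination record."""
    joined = []
    for record in destination:
        match = origin.get(record[foreign_key], {})   # no match: nothing appended
        joined.append({**record, **match})
    return joined

joined = join(parcels, zones, "zone_code")
unmatched = [r["pin"] for r in joined if "zone_name" not in r]
```

Parcel P103 ends up in `unmatched` because the text value `"1"` is not equal to the integer key `1`, even though the two look the same in a table view.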
• There are three types of relationships between two tables:
– One to One
– One to Many
– Many to Many
[Figures: One to Many and Many to Many relationships between an Origin Table and a Destination Table, with the Primary Key and Foreign Key fields marked]
Figure sources: http://www.kenticosolutions.com/Developer-Tips/Tip/May-2011/Many-to-Many-relationships-in-the-Kentico-CMS-Cont.aspx; ArcGIS 10.1, Help
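One rough way to see which relationship type two tables have is to count how often each linking value repeats on each side. The function below is a hypothetical helper, not an ArcGIS function, and it infers the type from the stored values only, so it describes the data at hand rather than the intended design:

```python
from collections import Counter

def cardinality(origin_keys, destination_keys):
    """Classify a relationship from the linking values stored in each table."""
    origin_repeats = max(Counter(origin_keys).values(), default=0) > 1
    destination_repeats = max(Counter(destination_keys).values(), default=0) > 1
    if origin_repeats and destination_repeats:
        return "many-to-many"
    if destination_repeats:
        return "one-to-many"
    if origin_repeats:
        return "many-to-one"   # one-to-many seen from the other table
    return "one-to-one"

# Hypothetical zone codes: each zone appears once in the origin table,
# while several parcels in the destination table share zone 1.
zones_to_parcels = cardinality([1, 2], [1, 1, 2])
parcels_to_zones = cardinality([1, 1, 2], [1, 2])
```

Swapping the two arguments turns a one-to-many result into many-to-one, which is why the choice of origin and destination table matters.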
Database Design
• In ArcGIS, a relationship is defined by which table is the origin and which is the destination, and by whether the relationship is simple or composite.
• In ArcGIS, it is important to get the origin and destination table right.
Sources:
Chang, Kang-tsung. 2006. Introduction to Geographic Information Systems. McGraw Hill Higher Education. Third Edition.
Codd, E.F. 1970. A relational model of data for large shared data banks. Communications of the ACM, Vol. 13. https://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf
Kent, W. 1983. A Simple Guide to Five Normal Forms in Relational Database Theory. Communications of the ACM, Vol. 26(2). http://www.bkent.net/Doc/simple5.htm