3 PODS Database Concepts and Terms

Presenter
Presentation Notes
30-minute session to introduce students to terms and concepts they will need to understand PODS.

PODS Basics
Unit 3 – A training in PODS data management concepts and terms

Intended Audience

• GIS/IT professionals
• New to pipeline industry
• Little or no exposure to PODS

PODS Training – both PODS Basics and PODS Advanced – creates a better understanding of PODS Standards and PODS implementations through geospatial and relational database applications.

Presenter
Presentation Notes
This workshop expects the attendees to have at least a fundamental understanding of GIS. This workshop will help GIS professionals understand the importance of the PODS data model as a foundation for spatially managing pipeline data and assets. It will teach the basic PODS and GIS terminology necessary to understand pipeline data in a GIS, the basics of data models and relational databases, and how Linear Referencing is used to model the location of pipelines and related assets in ArcGIS.

An introduction to PODS Data Management Concepts and Terms

3 PODS BASICS

Presenter
Presentation Notes
In Unit 2, we learned about Linear Referencing, the method by which we can store, manage, and work with pipeline features and their associated assets spatially. We learned that the linear referencing method stores the relative locations of pipeline assets like meters, valves, etc. along an existing pipeline route. These assets and their associated data are stored as Events and Event Tables and are related to the spatially stored pipeline in GIS. Today, we’ll turn our attention away from the spatial aspects of pipelines and towards the data management and data quality aspects of the PODS data model. The theme of this unit is CONNECTEDNESS.
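
As a reminder of the Unit 2 idea, the sketch below shows how an event stored only as a measure along a route can be turned back into a map position. The route coordinates, the event values, and the `locate_along` helper are hypothetical and only illustrate the concept; a GIS performs this step internally and may do so differently.

```python
import math

# Hypothetical route: a polyline given as (x, y) vertices in map units.
route = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0)]

# Hypothetical event table: each asset stores a route ID and a measure,
# not its own coordinates.
events = [
    {"route_id": "R1", "asset": "valve", "measure": 40.0},
    {"route_id": "R1", "asset": "meter", "measure": 120.0},
]

def locate_along(vertices, measure):
    """Return the (x, y) point found `measure` units along the polyline."""
    if measure < 0:
        raise ValueError("measure must not be negative")
    travelled = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]):
        seg_len = math.hypot(x2 - x1, y2 - y1)
        if seg_len > 0 and travelled + seg_len >= measure:
            t = (measure - travelled) / seg_len  # fraction of this segment
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        travelled += seg_len
    raise ValueError("measure is beyond the end of the route")

positions = {e["asset"]: locate_along(route, e["measure"]) for e in events}
```

Here the valve falls on the first segment and the meter on the second, even though neither event row stores a coordinate.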

Webinar Series Overview

• Unit 1 – PODS Basics
• Unit 2 – Linear Referencing Concepts and Terms
• Unit 3 – PODS Data Management Concepts and Terms
• Unit 4 – The PODS Schema
• Unit 5 – Spatial Analysis of Pipeline Data
• Unit 6 – PODS Implementation

Introduction

How would you convert this into an information system?

Presenter
Presentation Notes
In our last session we discussed linear referencing and the spatial representation of a pipeline and its associated components. To fully understand this concept, we focused on a pipeline route as a singular object that is spatially aware of its location and length. We also discussed linear referencing as a way of locating pipeline components (and other data types) using the pipeline route as a basis for measurement. Today, we’re going to “zoom out” and holistically view the PODS data model and its various components. We’ll expand upon Unit 1’s introduction to the data model by digging deeper into the data management aspects of PODS. But first, examine the image above. What do you see? Pipelines, associated equipment, and other assets. What kinds of assets? Production, monitoring, marketing (sales), transmission, mechanical, etc. Also, geographic boundaries. Holistically, you also see connectivity: a network (system) of pipeline, production, equipment, etc. How could you model this as an information system? What kinds of information would need to be associated with each asset type?

How do you design a container for all this mission-critical data?

• Work history
• Leak survey
• Physical pipeline facilities
• Regulatory compliance
• Risk assessment
• Operating measures
• Site facilities
• Cathodic protection
• Compression
• Geographic boundaries
• Geographic feature crossings
• Inline inspection
• Close interval surveys
• Offshore lines
• External documents, reports

Presenter
Presentation Notes
The assets themselves must be stored, along with descriptive information about each one, AND connectivity between assets must exist.

With logical groups of tables!

What data?

Organization?

Data Quality?

Governance?

You need a data model.

Presenter
Presentation Notes
So the answer to our question “How do you design a container for all this mission-critical data?” is TABLES. You create groupings of logically related tables. OK, now what? What data should be stored? How should these tables be organized? How should the relationships between datasets be established and governed? You need a Data Model.

What is a Data Model?

A data model defines how data is connected, stored, and processed.

It provides the organizational and conceptual model of data relationships.

PODS has 3 types of data models:
1. Conceptual – Overview, low detail
2. Logical – detailed model including rules, standards, etc.
3. Physical – platform-specific, detailed, finished data model with defined data elements

PODS Data Model Example

Presenter
Presentation Notes
This is the conceptual model of PODS version 4.02, one of the most widely used versions of the PODS Data Model. The groups of colored boxes demarcate the various groupings of data tables: Locations, Regulatory Compliance, Sites, Cathodic Protection, Events, Centerline, Pipeline Facilities, etc.

What key drivers shape the PODS Data Model?

Regulatory Compliance

Traceability of pipeline equipment, materials

Quality assurance

Interoperability with other enterprise systems

Project management, monitoring

Industry standards and common language

Data management strategy

Consistency through pipeline phases from design to operation

Presenter
Presentation Notes
Aside from the visual components of the pipeline system, there are other, more intangible associations to be made within the pipeline system. These associations add complexity to our information system model, yet their importance in the management of a pipeline network should not be underestimated. The list above represents a few of the key drivers shaping the components of the PODS data model. This is where we find the PODS Data Model: at the intersection of tangible and intangible aspects of a pipeline network.

What is the PODS Data Model?

The PODS Data Model is…

• a plan defining how all vital pipeline data is connected, stored, and processed.

• “pipeline-centric.”

• designed to reside on spatial (ESRI GIS) and non-spatial platforms.

• GIS neutral and vendor neutral.

Presenter
Presentation Notes
So, what about the PODS data model? It begins with the PIPE. The data model is a collection of related data tables fully describing pipeline assets. Several varieties of the model are available: PODS 6.0, NextGen (7.0), PODS Lite, PODS Relational, and PODS Spatial.

3 PODS Key Concepts and Terms

• The PODS Data Model
  • Provides the database architecture pipeline operators use to
    • store critical information
    • analyze data about their pipeline systems
    • manage this data geospatially in a linear-referenced database which can then be visualized in any GIS platform.

Presenter
Presentation Notes
Data models are abstractions of reality. The PODS Data Model is platform independent, comprehensive, and evolves with members’ input. It is RDBMS and GIS independent, and most versions are GIS agnostic.

PODS Model Data Building Blocks

Now, let’s go a little deeper in our understanding of how entire pipeline systems and ancillary data are managed within the PODS data model.

Relational Databases

Presenter
Presentation Notes
Next, let’s look at the fundamental building block of the PODS Data Model: the Relational Database. As the name suggests, related data tables are the essential underpinning of the Data Model. These databases and the relationships between them are the engine behind the PODS data model. In this unit, we’ll gain a better understanding of relational databases, how they work, and how they function in PODS. Take a moment to examine the graphic. Notice that the pipe segment is only one of several data types in the model. Within the Pipe Segment route, notice the number of ancillary features associated with it. Also notice the data stored for the pipe segment’s casing and coating.

What is a Database?

• A Database is…

• A structured collection of information stored electronically

• Efficient, flexible data management (storage, retrieval, analysis, sharing, and updating)

• A central data repository

Presenter
Presentation Notes
There are various database types such as Relational, Object-oriented, Distributed, etc. For our purposes, we’ll focus on Relational Databases. “Database” in this series refers to this type of database.

What’s the difference between a Data Model and a Database?

Data Model
• Defines the connections between data – how it is defined, stored, processed
• Not the actual data

Database
• Physical implementation of the data model
• A central data repository, i.e. “System of Record”

Presenter
Presentation Notes
PODS – Pipeline Open Data STANDARD. Data models are abstractions of reality.

What is a Database?

A Database is a collection of related data stored in Tables

• Tables are organized into Rows (records) and columns (fields)

• Databases also store prescribed rules, roles, and relationships for data within it

• Controlled by a DBMS (Database Management System)

Records

Fields

Relationships between tables reflect the association of objects in reality.
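
To make the vocabulary concrete, here is a minimal sketch using Python's built-in `sqlite3` module, with SQLite standing in for the DBMS. The `valve` table and its columns are invented for illustration and are not taken from the PODS model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE valve (valve_id INTEGER PRIMARY KEY, name TEXT, size_in REAL)"
)
conn.executemany(
    "INSERT INTO valve (valve_id, name, size_in) VALUES (?, ?, ?)",
    [(1, "MLV-001", 24.0), (2, "MLV-002", 30.0)],
)

cursor = conn.execute("SELECT * FROM valve ORDER BY valve_id")
fields = [col[0] for col in cursor.description]  # column (field) names
records = cursor.fetchall()                      # rows (records)
```

`fields` holds the column names and `records` holds one tuple per row, which is the row/column structure described on the slide.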

Database Types

• Oracle or SQL Server supported relational databases
• Enterprise can also be RDBMS, such as Oracle or SQL Server
• Geodatabases - File vs. Enterprise (ESRI)
• In the pipeline industry, databases are typically connected with some sort of mapping software

Oracle

Microsoft SQL Server

ESRI Geodatabase

Presenter
Presentation Notes
There are various database types such as Relational, Object-oriented, Distributed, etc. For our purposes, we’ll focus on Relational Databases. The term “Database” in this series will refer to this type of database. RDBMS stands for Relational Database Management System. Possible discussion point: compare and contrast PODS Relational and PODS Spatial.

Relational Databases

• Normalization of disparate data into a Relational Database
  • A database design method that organizes information into multiple related tables to minimize data redundancy
  • Information goes into the correct table, free of duplicates
  • Tables are related to each other
  • PODS database structure is normalized

Operator
• Name
• UID
• Contact

Pipeline
• Name
• UID
• OP-ID

Meter
• Name
• UID
• PIPE-ID

Presenter
Presentation Notes
A Relational Database has undergone a “normalization” process during which the tables, relationships, and rules are established in the database. This process makes the database operate efficiently. Why is it a Relational Database? Explain that information about an operator is stored only in the Operator table. The same rule applies to Pipelines and Meters. If I need information about the pipeline that a meter is on, I follow the relationship to that pipeline in the Pipeline table and get it from there.
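
The Operator, Pipeline, and Meter tables from the slide can be sketched with `sqlite3` as follows. Column names follow the slide (UID, OP-ID, PIPE-ID, written here as `uid`, `op_id`, `pipe_id`); the sample rows are made up, and this is an illustration of normalization rather than actual PODS tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE operator (uid INTEGER PRIMARY KEY, name TEXT, contact TEXT);
CREATE TABLE pipeline (uid INTEGER PRIMARY KEY, name TEXT,
                       op_id INTEGER REFERENCES operator(uid));
CREATE TABLE meter    (uid INTEGER PRIMARY KEY, name TEXT,
                       pipe_id INTEGER REFERENCES pipeline(uid));
INSERT INTO operator VALUES (1, 'Example Operator', 'ops@example.com');
INSERT INTO pipeline VALUES (10, 'Line A', 1);
INSERT INTO meter    VALUES (100, 'Meter 1', 10), (101, 'Meter 2', 10);
""")

# Operator details live only in the operator table; a meter reaches them
# by following meter -> pipeline -> operator.
rows = conn.execute("""
    SELECT meter.name, pipeline.name, operator.name
    FROM meter
    JOIN pipeline ON meter.pipe_id = pipeline.uid
    JOIN operator ON pipeline.op_id = operator.uid
    ORDER BY meter.uid
""").fetchall()
```

Because the operator name is stored once, correcting it in the `operator` table changes what every related meter reports.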

Relational Database Key Fields

• Primary Key
  • Unique
  • Not null
  • Never changes

• Foreign Key
  • Not unique
  • Multiple
  • Points to some other Primary Key

Operator
• Name
• UID
• Contact

Pipeline
• Name
• UID
• OP-ID

Meter
• Name
• UID
• PIPE-ID

Presenter
Presentation Notes
The Primary Key (PK) is the unique ID field. It uses a GUID datatype, so the database creates a unique ID for each record. What is a GUID? It stands for Globally Unique Identifier, and it is more robust than a simple unique ID. GUIDs are extremely large (128-bit) numbers that are, for all practical purposes, unique, which is important for identifying every component in a pipeline network. GUIDs ensure the PODS relational databases have unique primary keys. PipelineID is an example of a GUID that permeates nearly all tables. Why? Because almost all tables contain asset information relating back to the pipeline.
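
The key rules above can be tried out in a small sketch. It assumes SQLite as the database, Python's `uuid.uuid4()` as the GUID generator, and invented table and column names; a real PODS implementation may generate and store GUIDs differently.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE pipeline (pipeline_id TEXT PRIMARY KEY NOT NULL, name TEXT);
CREATE TABLE meter (
    meter_id    TEXT PRIMARY KEY NOT NULL,
    name        TEXT,
    pipeline_id TEXT NOT NULL REFERENCES pipeline(pipeline_id)
);
""")

pipeline_id = str(uuid.uuid4())
conn.execute("INSERT INTO pipeline VALUES (?, ?)", (pipeline_id, "Line A"))
# Several meters may point at the same pipeline: foreign key values are not unique.
for name in ("Meter 1", "Meter 2"):
    conn.execute("INSERT INTO meter VALUES (?, ?, ?)",
                 (str(uuid.uuid4()), name, pipeline_id))

def try_insert(sql, params):
    """Return True if the insert succeeds, False if a key constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

# Reusing an existing primary key is rejected.
duplicate_pk_ok = try_insert("INSERT INTO pipeline VALUES (?, ?)",
                             (pipeline_id, "Line B"))
# A foreign key that points at no existing pipeline is rejected.
orphan_fk_ok = try_insert("INSERT INTO meter VALUES (?, ?, ?)",
                          (str(uuid.uuid4()), "Meter 3", str(uuid.uuid4())))
```

Both rejected inserts leave the tables unchanged, so every meter still points at a pipeline that exists.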

Example – Database Linkages

Presenter
Presentation Notes
Here are some other data relationships in PODS.

3 PODS Key Concepts and Terms

• Databases – Essential PODS building blocks
  • Relational Databases are the basic building blocks of the PODS Data Model.
  • Provide connectivity throughout the PODS Data Model.
  • Primary and foreign keys within databases connect tables together.

Database Schema

The implementation of the Database Model on a specific computer system and relational database.
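
One way to picture the step from model to schema is the sketch below. It renders a small, platform-neutral table description into DDL for one specific platform, here SQLite. The `logical_table` structure, the type mapping, and the `to_sqlite_ddl` helper are all hypothetical; PODS publishes its own DDL, and this is not how that DDL is produced.

```python
import sqlite3

# Hypothetical, platform-neutral description of one table (the "model").
logical_table = {
    "name": "valve",
    "columns": [
        ("valve_id", "guid", False),      # (column name, logical type, nullable)
        ("name", "string", True),
        ("install_date", "date", True),
    ],
    "primary_key": "valve_id",
}

# Assumed mapping from logical types to SQLite storage types.
SQLITE_TYPES = {"guid": "TEXT", "string": "TEXT", "date": "TEXT"}

def to_sqlite_ddl(table):
    """Render the logical table description as a SQLite CREATE TABLE statement."""
    parts = []
    for col_name, logical_type, nullable in table["columns"]:
        clause = f"{col_name} {SQLITE_TYPES[logical_type]}"
        if col_name == table["primary_key"]:
            clause += " PRIMARY KEY"
        if not nullable:
            clause += " NOT NULL"
        parts.append(clause)
    return f"CREATE TABLE {table['name']} ({', '.join(parts)})"

ddl = to_sqlite_ddl(logical_table)
conn = sqlite3.connect(":memory:")
conn.execute(ddl)  # the schema now exists on this specific platform
```

A different target platform would keep the same `logical_table` but use its own type mapping, which is the sense in which the schema is platform-specific while the model is not.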

Data Management Terms

Data Model – Conceptual, logical, physical

Relational Database – Connected data tables

Schema – the physical manifestation of the Data Model

Presenter
Presentation Notes
Sometimes you’ll hear or see these terms used interchangeably, but there are differences between them.

Ensuring Data Integrity

How does PODS enforce data integrity and quality?

• Code Lists
• Enumerators
• Domains/SubTypes

Presenter
Presentation Notes
Data integrity enhancements continue with each version of PODS. The addition of code lists in the newest PODS Data Model (7.0) is a good example of the continued strengthening of data quality management. Code lists are managed in the PODS 7.0 Logical Model and work like lookup tables. The PODS logical data model will support domains (for geodatabase implementations), code lookup tables (for SQL DDL implementations), and code lists (for the XML Schema Data Exchange Specification). Code list values and descriptions are (typically) synonymous.

What is a Code List?

There are 3 Types of Code Lists:
1. Enumerators
2. Managed Code Lists
3. Unmanaged Code Lists

Why are Code Lists Important?

• Enforces Data Integrity
• Data Standardization
• Data Entry

Presenter
Presentation Notes
How often have you looked at a field in a spreadsheet or database and noticed missing or inconsistent data? For example, how many versions of a single company name exist? Code lists reduce “fat fingering” and smooth the data entry process by providing a pick list or menu of items to choose from rather than typing an entry. The product is standardized data.
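
The company-name problem can be illustrated with a code lookup table. In this sketch the `company_cl` table, its single code, and the `add_pipeline` helper are invented for the example; only the general pattern (a foreign key into a list of valid values) is the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce the lookup
conn.executescript("""
CREATE TABLE company_cl (code TEXT PRIMARY KEY, description TEXT);
CREATE TABLE pipeline (
    name TEXT,
    operator_code TEXT REFERENCES company_cl(code)
);
INSERT INTO company_cl VALUES ('ACME', 'Acme Pipeline Company');
""")

def add_pipeline(name, operator_code):
    """Insert a pipeline; return False if the operator code is not in the code list."""
    try:
        conn.execute("INSERT INTO pipeline VALUES (?, ?)", (name, operator_code))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = add_pipeline("Line A", "ACME")
rejected_typo = add_pipeline("Line B", "Acme Pipline Co.")  # free-typed variant
```

The free-typed variant never reaches the table, so only one spelling of the company can exist in the data.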

PODS CODE LIST TYPES

Enumerations

All possible values are permanently fixed at the time of standardization.

Presenter
Presentation Notes
PODS Code Lists are maintained in the PODS Pipeline Data Model’s Logical Model. Code lists store valid values for certain attributes of pipeline features. Code lists are matched to the type of database/geodatabase the data resides in. Because values are important to the working of the standard, PODS manages the lifecycle of values. Values can be added, retired, deprecated, or superseded ONLY in very rare and extenuating circumstances.

Enumerator Example

Presenter
Presentation Notes
Fixed values.
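
In code, a fixed set of values is often modeled as an enumeration type. The sketch below uses Python's `enum` module; the `YesNo` list and its members are hypothetical and are not taken from the PODS standard.

```python
from enum import Enum

class YesNo(Enum):  # hypothetical enumeration, not a PODS code list
    YES = "Yes"
    NO = "No"
    UNKNOWN = "Unknown"

def parse_yes_no(text):
    """Return the matching member, or None when the text is not an allowed value."""
    try:
        return YesNo(text)
    except ValueError:
        return None
```

The set of members is fixed when the class is defined, which mirrors the idea that an enumeration's values are fixed at the time of standardization.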

PODS CODE LIST TYPES

Managed Code Lists

Most possible values are permanently fixed at the time of standardization.

Presenter
Presentation Notes
Because values are important to the working of the standard AND/OR are required for interoperability reasons, PODS manages the lifecycle of values. Values can be added, retired, deprecated, or superseded. Modules/Users shall not add or supersede values without submission to a TBD PODS management process.

Managed Code List Example

Presenter
Presentation Notes
Values in Managed Code lists may be changed by PODS.
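
The lifecycle states mentioned in the notes (added, retired, deprecated, superseded) could be tracked per value, as in the sketch below. The `CodeValue` structure, the status names, and the coating codes are assumptions made for illustration; they do not reproduce a PODS code list or the PODS management process.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CodeValue:
    code: str
    description: str
    status: str = "active"  # "active", "retired", "deprecated", or "superseded"
    superseded_by: Optional[str] = None

# Hypothetical managed code list; the codes are invented for this example.
coating_type = {
    "FBE": CodeValue("FBE", "Fusion bonded epoxy"),
    "CTE": CodeValue("CTE", "Coal tar enamel", status="deprecated"),
    "TAPE": CodeValue("TAPE", "Tape wrap", status="superseded",
                      superseded_by="PE_TAPE"),
    "PE_TAPE": CodeValue("PE_TAPE", "Polyethylene tape"),
}

def resolve(code_list, code):
    """Follow supersession links and return the code to use for new data entry."""
    value = code_list[code]
    while value.status == "superseded" and value.superseded_by:
        value = code_list[value.superseded_by]
    return value.code

def selectable(code_list):
    """Codes that should be offered in a pick list."""
    return sorted(c for c, v in code_list.items() if v.status == "active")
```

Keeping retired and superseded values in the list, rather than deleting them, lets existing records remain valid while new entries are steered to current values.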

PODS CODE LIST TYPES

Unmanaged Code Lists

Values in the list are examples and not managed by PODS.

Presenter
Presentation Notes
These values are not important to the working of the standard; therefore, PODS does not manage the lifecycle of values. Values can be added, retired, deprecated, or superseded. Modules/Users can do whatever they like.

Unmanaged Code List Example

Presenter
Presentation Notes
This doesn’t mean no management is needed! It means the PODS model isn’t managing these lists. In each organization, there should always be individuals designated to ensure the code lists are correct and up to date.

Database Subtypes and Domains

A type of code list used in Geospatial versions of PODS (PODS Spatial)

Database rules describing the valid values of a field.

Attribute domains constrain allowable values in a table.

Enforces data integrity.

Presenter
Presentation Notes
Domains and subtypes are another way to enhance data integrity, with results similar to lookup tables and code lists. In some geospatial databases you can choose to employ subtypes and domains for data integrity enforcement. Here’s an example: think of a road database. Subtypes of roads might be local, rural, or highway. Domains would be the individual types of road that fall into each subtype category. So local road domains might be streets, boulevards, and avenues. Rural domains might be dirt, path, or trail. Highway domains might be state highway and interstate.
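
The road example can be written down as a small validation sketch. The `ROAD_DOMAINS` mapping and the `is_valid` helper are hypothetical; a geodatabase stores and enforces subtypes and domains through its own mechanisms rather than application code like this.

```python
# Road example from the notes: each subtype has its own domain of valid values.
ROAD_DOMAINS = {
    "local": {"street", "boulevard", "avenue"},
    "rural": {"dirt", "path", "trail"},
    "highway": {"state highway", "interstate"},
}

def is_valid(subtype, road_type):
    """True when road_type belongs to the domain assigned to the subtype."""
    return road_type.lower() in ROAD_DOMAINS.get(subtype, set())
```

For instance, `is_valid("local", "Avenue")` passes, while `is_valid("rural", "interstate")` fails because the value belongs to a different subtype's domain.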

End of Part 3
Any questions?

In Summary

1. The PODS Data Model(s) provide the architecture for storing and managing pipeline systems, equipment, etc.

2. PODS relational databases are the systems of record for pipeline assets and operating environments.

3. PODS lookup tables, domains, and code lists enforce data integrity and standardization.

End of Unit 3
Any questions?

Resources for This Unit

PODS Association web site
https://www.pods.org

Additional Resources for This Unit

Essential GIS Terms

• Some GIS terms that are important to understanding pipeline modeling
• PODS Databases
  • RDBMS – Relational Database Management System
  • Geodatabases
    • File GDB – ESRI format, stores Feature classes, uses files in a folder and no 3rd party RDBMS
    • Enterprise (ESRI) – Feature classes stored within 3rd party RDBMS
      • Oracle or SQL Server supported relational databases
• Domain / Code Lookup
  • Domain – the file geodatabase (FGDB) way to limit the acceptable values in a column
  • Done with a Code Lookup table in an RDBMS

Presenter
Presentation Notes
File GDB – ESRI format, stores Feature classes, uses files in a folder and no 3rd party RDBMS. Enterprise GDB – Feature classes stored within a 3rd party Relational Database Management System. Domain – the FGDB way to limit the acceptable values in a column; in an RDBMS this is done with a code lookup table.

Essential GIS Terms

Some GIS terms that are important to understanding pipeline modeling:
• Relationship Class – GDB item that preserves a relationship between tables, like a persistent join
• GUID – a Globally Unique Identifier – ensures that the value is not duplicated by any other GUID value in the entire GDB
• Editor Tracking – ESRI enabled mechanism for tracking when a feature is created and last edited
• UTC – Coordinated Universal Time – the agreed-upon world time standard, time zone independent

Presenter
Presentation Notes
Relationship Class – GDB item that preserves a relationship between tables, like a persistent join. GUID – a Globally Unique Identifier – ensures that the value is not duplicated by any other GUID value in the entire GDB. Editor Tracking – ESRI enabled mechanism for tracking when a feature is created and last edited. UTC – Coordinated Universal Time – the agreed-upon world time standard, time zone independent.
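
The sketch below shows how a GUID, editor-tracking fields, and UTC timestamps might fit together on one record. The field names (`global_id`, `created_user`, and so on) and the two helper functions are assumptions for illustration; they are not the actual Esri editor tracking implementation.

```python
import uuid
from datetime import datetime, timezone

def new_feature(name, user):
    """Create a record with a GUID and editor-tracking style fields."""
    now = datetime.now(timezone.utc)  # UTC, so the value is time zone independent
    return {
        "global_id": str(uuid.uuid4()),
        "name": name,
        "created_user": user,
        "created_date": now,
        "last_edited_user": user,
        "last_edited_date": now,
    }

def edit_feature(feature, user, **changes):
    """Apply changes and refresh the last-edited fields; creation fields stay fixed."""
    feature.update(changes)
    feature["last_edited_user"] = user
    feature["last_edited_date"] = datetime.now(timezone.utc)
    return feature

valve = new_feature("MLV-001", "alice")
edit_feature(valve, "bob", name="MLV-001A")
```

Storing timestamps in UTC means edits made by users in different time zones can be ordered without conversion.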

Important Acronyms

• EA – Enterprise Architect by Sparx Systems
• DDL – Data Definition Language
• APR – Esri’s ArcGIS Pipeline Referencing solution
• LRS – Linear Referencing System
• OGC – Open Geospatial Consortium
• SQL – Structured Query Language