The POSTGRES Next-Generation Database Management System

Michael Stonebraker, Greg Kemnitz

Presented by: Nirav S. Sheth

Services offered by DBMS are of 3 kinds

• Traditional Data Management

– Simple data types

• Object Management

– Nontraditional data types, e.g. bitmaps, icons, text, polygons…

• Knowledge Management

– Store and enforce a collection of rules that are part of the semantics of an application

Example

An application that stores and manipulates text and graphics to facilitate the layout of newspaper copy

• Data: customer information

• Object: text, pictures and icons

• Rules: control newspaper layout

The Postgres Data Model and Query Language

• Three criteria guided the design of a new data model and query language for Postgres

- Orientation toward database access from a query language

- Orientation toward multilingual access

- Small number of concepts

Orientation toward database access from a query language

• Postgres users interact with the database through the set-oriented query language POSTQUEL

• Postgres gives each record a unique identifier

• Postgres allows a user to define functions (methods) to the DBMS

Orientation toward multilingual access

• Postgres could have been tightly coupled to the compiler and run-time environment of a specific language

• But most databases are accessed by programs written in several different languages

• Postgres is multilingual: it is programming-language neutral

Small number of concepts

• As few concepts as possible so that users have minimum complexity to contend with

• Four constructs:

• Class

• Inheritance

• Type

• Function

The POSTGRES Data Model - Class

• Class: a named collection of instances of objects

• Instances: all instances of a class have the same collection of attributes, and each attribute is of a specific type

• Users can create a new class by specifying the class name, along with all attribute names and their types

The POSTGRES Data Model - Inheritance

• A class can inherit data elements from other classes

• Multiple inheritance is supported, but only objects that can be created without ambiguity are allowed
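
The class and inheritance rules above can be sketched in Python. This is an illustrative model, not POSTGRES code: `ClassDef`, `create_class`, and the sample classes are hypothetical names, and treating "one attribute name arriving with two different types" as the ambiguous case is an assumption about what the slide means by ambiguity.

```python
from dataclasses import dataclass, field


@dataclass
class ClassDef:
    """A named class: attribute names mapped to type names."""
    name: str
    attributes: dict = field(default_factory=dict)


def create_class(name, own_attributes, parents=()):
    """Combine inherited and locally declared attributes into a new class.

    Assumption: the same attribute name arriving with two different types
    is ambiguous, so the class is rejected.
    """
    merged = {}
    sources = [p.attributes for p in parents] + [own_attributes]
    for attributes in sources:
        for attr, attr_type in attributes.items():
            if merged.get(attr, attr_type) != attr_type:
                raise ValueError(f"ambiguous attribute {attr!r} in class {name!r}")
            merged[attr] = attr_type
    return ClassDef(name, merged)


person = create_class("PERSON", {"name": "text", "age": "int"})
emp = create_class("EMP", {"salary": "float"}, [person])
student = create_class("STUDENT", {"gpa": "float"}, [person])
# Both parents carry name and age from PERSON with identical types: no ambiguity.
studemp = create_class("STUDEMP", {}, [emp, student])
```

Creating a class from two parents that declare `salary` with different types would raise `ValueError` in this sketch.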

Three kinds of classes

• Real: instances are stored in the database

• Derived (view or virtual class): instances are not physically stored but are materialized only when necessary

• Version: stores a differential relative to its base class

– Initially has all instances of the base class

– Updates to the version do not affect the base class

– Updates to the base class are reflected in the version
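
The version-class behaviour listed above can be illustrated with a small Python sketch. It is a toy model under stated assumptions: `VersionClass` and its fields are hypothetical names, instances are keyed by an identifier, and an instance changed in the version keeps the version's value even if the base later changes it (the slide does not say how that case is resolved).

```python
class VersionClass:
    """A version that stores only its differences from a base class."""

    def __init__(self, base):
        self.base = base        # dict: identifier -> record, owned by the base class
        self.changed = {}       # records inserted or replaced in the version
        self.deleted = set()    # identifiers deleted in the version

    def get(self, oid):
        if oid in self.deleted:
            return None
        if oid in self.changed:
            return self.changed[oid]
        return self.base.get(oid)   # fall through to the base class

    def put(self, oid, record):
        self.deleted.discard(oid)
        self.changed[oid] = record  # the base class is never touched

    def delete(self, oid):
        self.changed.pop(oid, None)
        self.deleted.add(oid)

    def instances(self):
        oids = (set(self.base) | set(self.changed)) - self.deleted
        return {oid: self.get(oid) for oid in oids}


base_emp = {1: {"name": "Sam", "salary": 1000}, 2: {"name": "Joe", "salary": 2000}}
trial = VersionClass(base_emp)
trial.put(1, {"name": "Sam", "salary": 1500})      # visible only in the version
base_emp[3] = {"name": "Ann", "salary": 3000}      # shows up in the version too
```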

The POSTGRES Data Model - Types

• Postgres contains an abstract data type(ADT) facility whereby any user can construct arbitrary new base types such as bits, bitstrings, encoded character strings, bitmaps, compressed integers, etc.

• Arrays of base types are supported by Postgres

• Composite types allow the construction of complex objects, i.e. attributes which contain other instances as part or all of their value

Composite Types

• Two kinds are:

- Any class is automatically a composite type: an attribute of that type holds zero or more instances of the class

- Set, whose value is a collection of instances from all classes

POSTGRES notion of Functions

• Three different kinds of Functions are known to Postgres

- C functions

- Operators

- POSTQUEL functions

C Functions

• These are arbitrary C procedures

• Arguments of C functions are base or composite types

• C functions can be defined to POSTGRES while the system is running and are dynamically loaded when required during query execution

• C functions are inherited down the class hierarchy

• C functions cannot be optimized by POSTGRES

Operators

• Operators can utilize indexes

• Operators are functions with one or two operands

• It is imperative that a user be able to construct new access methods to provide efficient access to instances of nontraditional base types. Operators allow this

POSTQUEL

• Set-oriented query language that resembles a superset of a relational query language

• Supports nested queries

• Supports transitive closure

• Supports inheritance

• Supports time travel
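
As an illustration of what a transitive-closure query computes, the Python sketch below reruns one step until the answer stops growing. It is not POSTQUEL and not the POSTGRES executor: the `ancestors` function and the `(older, younger)` pair layout are assumptions chosen for the example.

```python
def ancestors(parent_pairs, person):
    """Return every ancestor of `person` given (older, younger) pairs."""
    answer = set()
    while True:
        # One round: direct parents of `person` plus parents of anyone found so far.
        found = {older for older, younger in parent_pairs
                 if younger == person or younger in answer}
        if found <= answer:      # nothing new: the fixpoint has been reached
            return answer
        answer |= found


pairs = {("Mary", "John"), ("Ann", "Mary"), ("Bob", "Ann"), ("Zed", "Kim")}
john_ancestors = ancestors(pairs, "John")
```

Because the set of pairs is finite and `answer` only grows, the loop terminates even if the data contains cycles.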

Reasons for FAST PATH

• There are a variety of decision support applications in which the end user is given a specialized query language

• It is necessary for the run time system to assign a unique identifier to every persistent object it constructs

The Rules System

• Requirements:

– Referential integrity

– View management

– Triggers

– Integrity constraints

– Protection

– Version control

Implementation of Rules

• Two implementations for Postgres rules:

• Through record level processing

• Query rewrite module

Record Level Processing

• The rules system is called when individual records are accessed, deleted, inserted or modified

• Place a marker on the record: this marker contains the identifier of the corresponding rule and the types of events to which it is sensitive

• Efficient if there are a large number of rules and each covers only a few instances
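
A minimal Python sketch of the marker idea follows. The names (`Marker`, `Record`, `access`) and the event strings are hypothetical, and rules are modelled as plain callables; the point is only that the cost of checking rules is paid per touched record, which suits many narrowly scoped rules.

```python
from dataclasses import dataclass, field


@dataclass
class Marker:
    rule_id: int
    events: frozenset          # events the rule is sensitive to, e.g. {"replace"}


@dataclass
class Record:
    data: dict
    markers: list = field(default_factory=list)


def access(record, event, rules):
    """Touch one record and fire every rule whose marker matches the event."""
    fired = []
    for marker in record.markers:
        if event in marker.events:
            rules[marker.rule_id](record)
            fired.append(marker.rule_id)
    return fired


audit_log = []
rules = {7: lambda rec: audit_log.append(rec.data["name"])}   # hypothetical rule 7
fred = Record({"name": "Fred", "salary": 1000},
              [Marker(7, frozenset({"replace"}))])
joe = Record({"name": "Joe", "salary": 1000})                 # no markers, no rule cost

access(fred, "retrieve", rules)   # marker present but not sensitive to retrieve
access(fred, "replace", rules)    # fires rule 7
access(joe, "replace", rules)     # nothing to check
```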

Query rewrite module

• Performs poorly if there are a large number of small-scope rules

• Performs admirably if there are a small number of large-scope rules

Rule Activation Policies

• Immediate-same transaction

• Immediate-different transaction

• Deferred-same transaction

• Deferred-different transaction

• Postgres only implements the first option

Storage System

• Postgres implements the idea of a “no-overwrite” storage manager

• The old record remains in the database whenever an update occurs and serves the purpose normally performed by a write-ahead log

• The Postgres log is two bits per transaction, indicating whether each transaction committed, aborted, or is in progress
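
The interplay between no-overwrite updates and the per-transaction status log can be sketched in Python. This is a simplified single-writer model, not the POSTGRES storage manager: `Store`, the `created_by` and `deleted_by` fields, and the visibility rule are assumptions made for illustration.

```python
from enum import Enum


class Status(Enum):          # three states fit in two bits per transaction
    IN_PROGRESS = 0
    COMMITTED = 1
    ABORTED = 2


class Store:
    def __init__(self):
        self.status = {}     # transaction id -> Status; this is the entire log
        self.versions = []   # record versions are appended, never overwritten
        self.next_xid = 1

    def begin(self):
        xid = self.next_xid
        self.next_xid += 1
        self.status[xid] = Status.IN_PROGRESS
        return xid

    def commit(self, xid):
        self.status[xid] = Status.COMMITTED

    def abort(self, xid):
        self.status[xid] = Status.ABORTED   # no undo work: versions stay in place

    def insert(self, xid, key, value):
        self.versions.append(
            {"key": key, "value": value, "created_by": xid, "deleted_by": None})

    def update(self, xid, key, value):
        for version in self.versions:
            if version["key"] == key and self._visible(version):
                version["deleted_by"] = xid  # old version is kept, only stamped
        self.insert(xid, key, value)

    def _committed(self, xid):
        return self.status.get(xid) is Status.COMMITTED

    def _visible(self, version):
        return (self._committed(version["created_by"])
                and not self._committed(version["deleted_by"]))

    def read(self, key):
        for version in self.versions:
            if version["key"] == key and self._visible(version):
                return version["value"]
        return None
```

In this model an abort only flips a status entry, which is why recovery needs no log replay: versions written by aborted or unfinished transactions are simply never visible.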

Storage System – contd.

• Two features can be exploited in a no-overwrite system

– Instantaneous crash recovery

– Time travel

• Stable main memory required

• Trade-off is that a POSTGRES database at any given time will have committed instances intermixed with instances that were written by aborted transactions

Storage System – contd.

• To support time travel, Postgres maintains two different physical collections of records, one for current data and one for historical data, each with its own indexes

• A daemon known as the vacuum cleaner runs in the background and moves records that are no longer valid from the current database to the historical database
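
The split between current and historical data can be sketched as follows. The record layout (`valid_from`, `valid_to`) and the functions `vacuum` and `as_of` are hypothetical; the sketch only shows that a historical query has to consult both collections, while a query on current data can skip the archive.

```python
def vacuum(current, archive):
    """Move records whose validity has ended from `current` to `archive`."""
    still_valid = []
    for record in current:
        if record["valid_to"] is None:
            still_valid.append(record)
        else:
            archive.append(record)
    current[:] = still_valid          # update the caller's list in place


def as_of(current, archive, key, when):
    """Return the value `key` had at time `when`, or None if it did not exist."""
    for record in archive + current:
        end = record["valid_to"]
        if (record["key"] == key and record["valid_from"] <= when
                and (end is None or when < end)):
            return record["value"]
    return None


current = [
    {"key": "Sam", "value": 1000, "valid_from": 1970, "valid_to": 1980},
    {"key": "Sam", "value": 2000, "valid_from": 1980, "valid_to": None},
]
archive = []
vacuum(current, archive)
```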

The POSTGRES Implementation

• Aspects of the implementation that deserve special mention are: the process structure; extendibility; dynamic loading; and the rule wake-up

• Process Structure – Postgres runs as one process for each active user. All users share the Postgres code, buffer pool and lock table, but each process has a private data segment

The POSTGRES Implementation-contd.

• Extendibility – has been accomplished by making the parser, optimizer and execution engine entirely table driven

• Data types, operators and functions can be added and removed dynamically, i.e. while the system is running

• Postgres maintains a cache of currently loaded functions, dynamically moves functions into the cache and then ages them out
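
A function cache with aging might look like the Python sketch below. The slide does not say which aging policy POSTGRES uses; least-recently-used eviction is an assumption here, and `FunctionCache`, `loader`, and the sample registry are hypothetical names standing in for dynamic loading of compiled code.

```python
from collections import OrderedDict


class FunctionCache:
    """Keep at most `capacity` loaded functions, evicting the least recently used."""

    def __init__(self, capacity, loader):
        self.capacity = capacity
        self.loader = loader            # hypothetical: maps a name to a callable
        self.loaded = OrderedDict()

    def get(self, name):
        if name in self.loaded:
            self.loaded.move_to_end(name)         # mark as recently used
            return self.loaded[name]
        function = self.loader(name)              # load on first use
        self.loaded[name] = function
        if len(self.loaded) > self.capacity:
            self.loaded.popitem(last=False)       # age out the oldest entry
        return function


registry = {"double": lambda x: 2 * x, "square": lambda x: x * x, "negate": lambda x: -x}
load_log = []


def loader(name):
    load_log.append(name)
    return registry[name]


cache = FunctionCache(2, loader)
cache.get("double")
cache.get("square")
cache.get("double")      # hit: no reload
cache.get("negate")      # cache is full, so "square" is aged out
```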

POSTGRES Performance

• Better than UCB-INGRES


Conclusion

• POSTGRES allows an application designer to trade off performance for data independence

• Imports only specific user functions into its address space