PostgreSQL & Temporal Data
Christopher Browne
Afilias Canada
PGCon 2009
Agenda
What kind of temporal data do we need?
What data types does PostgreSQL offer?
Temporality Representations
Time Travel, Transaction Tables, Serial Numbers
What kind of temporal data do we need?
Databases store facts about objects and events
Interesting times include
When an event took place
When the event was recorded
When someone was charged for the event
More Interesting Times
When you start recognizing income on the event
When you end recognizing income on the event
When an object state begins
When an object state ends
PostgreSQL Data Types
Date
  Problem: Pre-assumes evaluation of cutoff between days!
Time with/without timezone
  Problem: Comparisons of Date+Time turn into hideous SQL
Timestamp
  Combines Date + Time
Timestamp with time zone
  Allows collecting time in ‘local times’ and recognizing that
Interval
  Difference between two times/timestamps
  Very useful for indicating duration of time ranges
Operators
time/timestamp/date +|- interval = time/timestamp/date
timestamp - timestamp = interval
  (likewise for the others)
timestamp <, <=, >, >= timestamp
A BETWEEN B AND C
  A >= B and A <= C
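The operator rules above can be tried directly in psql. This is a small sketch with literal values chosen only for illustration. Two details differ slightly from the general rule in PostgreSQL: subtracting two dates gives an integer number of days rather than an interval, and adding an interval to a date gives a timestamp.

```sql
-- timestamp + interval = timestamp
SELECT timestamp '2009-05-22 09:00' + interval '1 day 2 hours';
-- 2009-05-23 11:00:00

-- timestamp - timestamp = interval
SELECT timestamp '2009-05-23 11:00' - timestamp '2009-05-22 09:00';
-- 1 day 02:00:00

-- date - date is the exception: an integer count of days
SELECT date '2009-06-01' - date '2009-05-01';
-- 31

-- date + interval is promoted to a timestamp
SELECT date '2009-05-22' + interval '1 month';
-- 2009-06-22 00:00:00

-- BETWEEN includes both endpoints (A >= B and A <= C)
SELECT timestamp '2009-05-23 00:00'
       BETWEEN timestamp '2009-05-22' AND timestamp '2009-05-23';
-- true
```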
Variations on “when is it???”
NOW(), transaction_timestamp, current_timestamp
  all providing start of transaction
statement_timestamp
clock_timestamp
transaction commit timestamp - not available!
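A quick way to see the difference between these clocks is to compare them inside one transaction. The sketch below uses only built-in functions; the one-second pg_sleep is there just to make the gap visible.

```sql
BEGIN;

-- All three are close together in the first statement.
SELECT now()                 AS tx_start,
       statement_timestamp() AS stmt_start,
       clock_timestamp()     AS wall_clock;

SELECT pg_sleep(1);

-- After the pause, the transaction-start clocks have not moved,
-- while the statement and wall clocks have.
SELECT now() = transaction_timestamp()           AS same_as_tx_start,  -- true
       now() = current_timestamp                 AS same_as_current,   -- true
       statement_timestamp() > now()             AS stmt_is_later,     -- true
       clock_timestamp() >= statement_timestamp() AS clock_is_latest;  -- true

COMMIT;
```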
Commit Timestamp
Useful representation: Tables record (serverID, ctid)
At COMMIT time, if the transaction has used this, then insert (serverID, ctid, clock_timestamp) into timestamp table
Eliminates Slony-I “SYNC” thread & simplifies queries
Helpful for multimaster replication strategies
Adds a table full of timestamps that needs cleansing :-(
PGTemporal
PgFoundry project implementing (timestamp,timestamp) type + all logical operations
First aspect: Supports inclusive & exclusive periods
[ From, To ], ( From, To ), [ From, To ), ( From, To ]
[ and ] indicate “inclusive” periods beginning and ending at the specified moment
( and ) indicate exclusive periods excluding endpoints
Inclusion & Exclusion
Commonly, [From, To) is the ideal representation
Today’s data easily characterized as [2009-05-22,2009-05-23)
This month’s period: [2009-05-01, 2009-06-01)
Successive periods do not overlap
  [2009-04-01,2009-05-01),[2009-05-01,2009-06-01)
Note that SQL “BETWEEN” is equivalent to [From,To]
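The difference between [From, To) and BETWEEN shows up at the boundary instant. A minimal sketch in plain SQL, without PGTemporal; the events table and its two rows are hypothetical, made up for this illustration.

```sql
CREATE TEMPORARY TABLE events (
    event_id    serial PRIMARY KEY,
    occurred_at timestamp NOT NULL
);

INSERT INTO events (occurred_at) VALUES
    ('2009-05-31 23:59:59'),   -- last second of May
    ('2009-06-01 00:00:00');   -- first instant of June

-- Half-open [From, To): the June 1 row is not counted for May.
SELECT count(*) FROM events
 WHERE occurred_at >= '2009-05-01'
   AND occurred_at <  '2009-06-01';
-- 1

-- BETWEEN is closed [From, To]: the June 1 row is counted for May too,
-- and would be counted again in June's period.
SELECT count(*) FROM events
 WHERE occurred_at BETWEEN '2009-05-01' AND '2009-06-01';
-- 2
```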
A Veritable Panoply of Operators
length(p), first(p), last(p), prior(p), next(p)
contains(p, t), contains(p1, p2), contained_by(t, p), contained_by(p1,p2), overlaps(p1,p2), adjacent(p1,p2), overleft(p1,p2), overright(p1,p2), is_empty(p), equals(p1,p2), nequals(p1,p2), before(p1,p2), after(p1,p2)
period(t), period(t1,t2), empty_period()
period_intersect(p1,p2), period_union(p1,p2), minus(p1,p2)
Core????
Should PGTemporal be in core?
What would be needed for it to head in?
Classical SQL Temporality
Developing Time-Oriented Database Applications in SQL - Richard Snodgrass, available freely as PDF
Uses periods much as in PGTemporal
Standard SQL does not support periods, alas!
Considerable attention to handling insertion of past/future history
Foreign Key Challenges
Nontemporal tables: No temporality, No problem!
Referencing table is temporal, referenced table isn’t: No problem!
Referenced table is temporal: Troublesome!
Referential integrity may be violated simply via passage of time
Referenced & referencing tables may vary independently!
PostgreSQL Time Travel
Take a stateful table
Add triggers to capture (From,To) timestamps on INSERT, UPDATE, DELETE
Sadly, this breaks if you require referential integrity constraints pointing to this table :-(
Time Travel Actions
On INSERT
  (NEW.From, NEW.To) = (NOW(), NULL)
On DELETE
  (OLD.From, OLD.To) = (PrevValue, NOW())
On UPDATE
  Transforms into DELETE old, INSERT new
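The three actions above could be wired up roughly as follows. This is only a sketch under stated assumptions, not the trigger code the talk refers to: the account_state table, its columns and the function name are hypothetical, and the pg_trigger_depth() guard (available in PostgreSQL 9.2 and later) is one way of keeping the trigger's own bookkeeping INSERT from being re-stamped.

```sql
-- Hypothetical stateful table; endtime IS NULL marks the current version.
CREATE TABLE account_state (
    version_id serial PRIMARY KEY,
    account_id integer     NOT NULL,
    balance    numeric     NOT NULL,
    starttime  timestamptz NOT NULL DEFAULT now(),
    endtime    timestamptz
);

CREATE FUNCTION account_state_time_travel() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        -- (NEW.From, NEW.To) = (NOW(), NULL)
        NEW.starttime := now();
        NEW.endtime   := NULL;
        RETURN NEW;
    END IF;

    -- Assumption of this sketch: closed versions are read-only.
    IF OLD.endtime IS NOT NULL THEN
        RAISE EXCEPTION 'version % is historical and cannot be changed',
                        OLD.version_id;
    END IF;

    -- UPDATE and DELETE both preserve the old version as a closed row:
    -- (OLD.From, OLD.To) = (previous value, NOW())
    INSERT INTO account_state (account_id, balance, starttime, endtime)
    VALUES (OLD.account_id, OLD.balance, OLD.starttime, now());

    IF TG_OP = 'UPDATE' THEN
        -- The row being updated becomes the new current version.
        NEW.starttime := now();
        NEW.endtime   := NULL;
        RETURN NEW;
    END IF;

    RETURN OLD;  -- DELETE: remove the current row, the closed copy remains
END;
$$ LANGUAGE plpgsql;

-- The WHEN clause skips statements issued from inside another trigger,
-- including the bookkeeping INSERT above.
CREATE TRIGGER account_state_tt
    BEFORE INSERT OR UPDATE OR DELETE ON account_state
    FOR EACH ROW
    WHEN (pg_trigger_depth() = 0)
    EXECUTE PROCEDURE account_state_time_travel();

-- Usage: each statement leaves a trail of closed versions behind.
INSERT INTO account_state (account_id, balance) VALUES (1, 100);
UPDATE account_state SET balance = 150 WHERE account_id = 1 AND endtime IS NULL;
DELETE FROM account_state WHERE account_id = 1 AND endtime IS NULL;
SELECT account_id, balance, starttime, endtime FROM account_state;
```

Because now() is the transaction start time, versions created and closed inside one transaction share the same timestamp. A side effect of the depth guard is that rows written by other triggers are not stamped either.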
Pulling Specific State
Current state:
  select * from table where endtime is NULL
State at a particular time: Set Returning Function
  select * from table_at_time(ts)
Pulls tuples effective at that time
starttime <= ts
endtime is null or endtime >= ts
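A set-returning function of this shape might look like the sketch below. The price_state table and the function name are hypothetical. One deliberate deviation, flagged as an assumption: the function tests endtime > ts rather than endtime >= ts, following the [From, To) convention recommended earlier, so that the instant at which one version ends and the next begins returns only the newer version.

```sql
-- Hypothetical temporal table with one closed and one current version.
CREATE TABLE price_state (
    item_id   integer     NOT NULL,
    price     numeric     NOT NULL,
    starttime timestamptz NOT NULL,
    endtime   timestamptz            -- NULL means still current
);

INSERT INTO price_state VALUES
    (1,  9.99, '2009-05-01 00:00+00', '2009-05-20 00:00+00'),
    (1, 12.50, '2009-05-20 00:00+00', NULL);

-- Returns the rows that were in effect at the given instant.
CREATE FUNCTION price_state_at_time(ts timestamptz)
RETURNS SETOF price_state AS $$
    SELECT *
      FROM price_state
     WHERE starttime <= $1
       AND (endtime IS NULL OR endtime > $1);   -- half-open: [starttime, endtime)
$$ LANGUAGE sql STABLE;

-- Current state: the 12.50 row.
SELECT * FROM price_state WHERE endtime IS NULL;

-- State on May 10: the 9.99 row.
SELECT * FROM price_state_at_time('2009-05-10 12:00+00');

-- State exactly at the changeover: only the 12.50 row.
SELECT * FROM price_state_at_time('2009-05-20 00:00+00');
```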
Explicit Temporal Tables
Accept that it’s temporal to begin with
Not just a way to get “history for free”
Enables Science Fiction: Declaring future state!
At 9am next Wednesday, state will change
Eliminates need for “batch jobs”
May need to pre-record future-dated events!
Science Fiction....
Problems
Foreign key references into temporal tables are problematic
Overlap?
Reference disappearing?
Fixing problems requires “fabricating a historical story” not just “fixing the state”
Temporality via Tx References
-- assumes a users(user_id) table already exists
create sequence tx_seq;  -- added here: both defaults below need this sequence
create table transactions (
  tx_id integer primary key default nextval('tx_seq'),
  whodunnit integer not null references users(user_id),
  and_when timestamptz not null default NOW());
create table slightly_temporal_object (
  object_id serial primary key,
  tx_id integer not null default currval('tx_seq')
    references transactions(tx_id));
Getting More Temporal - I
Add ON UPDATE trigger that updates tx_id to currval('tx_seq')
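Such a trigger could be sketched as below. The function and trigger names and the payload column are hypothetical, and the whodunnit column is left out so the example runs without a users table. currval only works after nextval has been called in the same session, which here happens when a row is inserted into transactions.

```sql
CREATE SEQUENCE tx_seq;

CREATE TABLE transactions (
    tx_id    integer PRIMARY KEY DEFAULT nextval('tx_seq'),
    and_when timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE slightly_temporal_object (
    object_id serial PRIMARY KEY,
    payload   text,  -- hypothetical column, so there is something to update
    tx_id     integer NOT NULL DEFAULT currval('tx_seq')
              REFERENCES transactions(tx_id)
);

-- Re-stamp the row with the session's current transaction record on UPDATE.
CREATE FUNCTION stamp_tx_id() RETURNS trigger AS $$
BEGIN
    NEW.tx_id := currval('tx_seq');
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER slightly_temporal_object_stamp
    BEFORE UPDATE ON slightly_temporal_object
    FOR EACH ROW
    EXECUTE PROCEDURE stamp_tx_id();

-- Usage
INSERT INTO transactions DEFAULT VALUES;                          -- tx_id 1
INSERT INTO slightly_temporal_object (payload) VALUES ('first');  -- stamped 1
INSERT INTO transactions DEFAULT VALUES;                          -- tx_id 2
UPDATE slightly_temporal_object SET payload = 'second';           -- re-stamped 2
SELECT object_id, tx_id FROM slightly_temporal_object;
-- object_id 1, tx_id 2
```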
More Temporal: History!
Create a “past history” table
Similar schema, but drop all data validation
Add end_tx
UPDATE/DELETE throw obsolete tuples into the “past history table”
Data validation dropped because validation can change over time
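One possible shape for the history table and its trigger, as a sketch only. The gadget tables, the hist_tx_seq sequence and the function name are hypothetical names chosen so the example stands alone. The history table repeats the live table's columns without constraints and adds end_tx.

```sql
CREATE SEQUENCE hist_tx_seq;

CREATE TABLE hist_transactions (
    tx_id    integer PRIMARY KEY DEFAULT nextval('hist_tx_seq'),
    and_when timestamptz NOT NULL DEFAULT now()
);

-- Live table, with validation.
CREATE TABLE gadget (
    gadget_id integer PRIMARY KEY,
    name      text    NOT NULL CHECK (name <> ''),
    tx_id     integer NOT NULL DEFAULT currval('hist_tx_seq')
              REFERENCES hist_transactions(tx_id)
);

-- Past history: same columns, no validation, plus end_tx.
CREATE TABLE gadget_history (
    gadget_id integer,
    name      text,
    tx_id     integer,
    end_tx    integer
);

-- Copy the obsolete tuple into history before it is changed or removed.
CREATE FUNCTION gadget_archive() RETURNS trigger AS $$
BEGIN
    INSERT INTO gadget_history (gadget_id, name, tx_id, end_tx)
    VALUES (OLD.gadget_id, OLD.name, OLD.tx_id, currval('hist_tx_seq'));

    IF TG_OP = 'UPDATE' THEN
        NEW.tx_id := currval('hist_tx_seq');
        RETURN NEW;
    END IF;
    RETURN OLD;  -- DELETE proceeds normally
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER gadget_archive_trg
    BEFORE UPDATE OR DELETE ON gadget
    FOR EACH ROW
    EXECUTE PROCEDURE gadget_archive();

-- Usage
INSERT INTO hist_transactions DEFAULT VALUES;               -- tx_id 1
INSERT INTO gadget (gadget_id, name) VALUES (1, 'sprocket');
INSERT INTO hist_transactions DEFAULT VALUES;               -- tx_id 2
UPDATE gadget SET name = 'cog' WHERE gadget_id = 1;
SELECT * FROM gadget_history;
-- gadget_id 1, name sprocket, tx_id 1, end_tx 2
```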
Serial Number Temporality
Used in DNS
Sets of updates grouped together temporally
A “bump of serial number” indicates common publishing at a common point in time
Object        Value       Zone   From   To
ns1.abc.org   10.2.3.1    org    1      3
ns1.abc.org   10.2.3.2    org    3
ns2.abc.org   10.2.2.1    org    2
ns3.abc.org   10.9.1.2    org    1      3
ns1.abc.org   10.2.3.1    info   17     19
ns2.abc.org   10.2.3.2    info   14     18
ns2.abc.org   10.9.1.2    info   18
ns3.abc.org   141.2.3.4   info   19
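The table above can be queried by serial number much as a timestamped table is queried by time. The sketch below loads the same eight rows into a hypothetical zone_records table (all column names are made up for the example) and assumes the From/To pair is half-open, so a record with To = 3 is no longer published at serial 3. That reading matches the ns1.abc.org rows in the org zone, where one value ends at 3 and its replacement starts at 3.

```sql
CREATE TEMPORARY TABLE zone_records (
    owner_name  text    NOT NULL,
    address     text    NOT NULL,
    zone_name   text    NOT NULL,
    from_serial integer NOT NULL,
    to_serial   integer             -- NULL means still published
);

INSERT INTO zone_records VALUES
    ('ns1.abc.org', '10.2.3.1',  'org',   1,  3),
    ('ns1.abc.org', '10.2.3.2',  'org',   3,  NULL),
    ('ns2.abc.org', '10.2.2.1',  'org',   2,  NULL),
    ('ns3.abc.org', '10.9.1.2',  'org',   1,  3),
    ('ns1.abc.org', '10.2.3.1',  'info', 17, 19),
    ('ns2.abc.org', '10.2.3.2',  'info', 14, 18),
    ('ns2.abc.org', '10.9.1.2',  'info', 18, NULL),
    ('ns3.abc.org', '141.2.3.4', 'info', 19, NULL);

-- Full contents of the org zone as published at serial 2:
-- ns1 10.2.3.1, ns2 10.2.2.1, ns3 10.9.1.2
SELECT owner_name, address
  FROM zone_records
 WHERE zone_name = 'org'
   AND from_serial <= 2
   AND (to_serial IS NULL OR to_serial > 2)
 ORDER BY owner_name;

-- Incremental change in the org zone between serial 1 and serial 3:
-- additions are ns1 10.2.3.2 and ns2 10.2.2.1,
-- removals are ns1 10.2.3.1 and ns3 10.9.1.2
SELECT 'add' AS change, owner_name, address
  FROM zone_records
 WHERE zone_name = 'org' AND from_serial > 1 AND from_serial <= 3
UNION ALL
SELECT 'remove', owner_name, address
  FROM zone_records
 WHERE zone_name = 'org' AND to_serial > 1 AND to_serial <= 3
 ORDER BY change, owner_name;
```

The first query resembles what a full zone transfer needs and the second what an incremental one needs; whether the production system described in the talk is queried this way is not stated in the slides.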
Zone Representation Merits
It’s fast. We extract multimillion record zones in minutes
Arbitrary ability to roll back...
Nicely supports DNS AXFR/IXFR operations
Each serial # represents a sort of “Logical Commit”
Further Merits of this
Rename “zone” to “module” and this is nice for configuration
We already know it supports large amounts of data efficiently
Configuration is smaller (we hope!)
Demerits of zone-like structure
No way to specify a point of time in the future
Serial numbers are intended to just keep rolling along
HOWEVER....
With complex apps & configuration, fancier temporality looks like a misfeature
Conclusions
3 ways to represent temporal information
Timestamps, Transaction IDs, Serial numbers
PostgreSQL changes possible
Should PGTemporal be added to “core”?
Should we try to have temporal foreign key functionality in core?