Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Efficient Ti喘imeiieries Swith PostgreSQL
iTeveiiimps Sont [email protected] FOSDEM PGDay 2018
Overview
Overview
Background
Overview
Background Complexity
Overview
Background Complexity Time Series
Overview
Background
Schema
Indexing
Normalisation
Summarisation
Partitioning
Complexity Time Series
Bafickgrount d
Developers S,iDevelopers S,iDevelopers S
Expent s Sivei喘oys S
Mont iTorint g
Vis SibiliTyiint ToiTheioperaTiont iofihardwareiant dis SofTware
e.g. web site, database, cluster, disk drive
Mont iTorint g
Vis SibiliTyiint ToiTheioperaTiont iofihardwareiant dis SofTware
e.g. web site, database, cluster, disk drive
! Look !! Graphs !
Mont iTorint gi–i喘imeiieries S
ComplexiTy
“Mificros Servifices SiHell”
PostgreSQL
Redis
FrontendMachine
Learning Bit
Login / IAM
Backend APICache
ConfigurationStore
ElasticsearchCassandra
Search APIService
Discovery(consul, etcd)
Container Orchestration EngineDocker Swarm / Kubernetes
MySQL
MoreiCowbelli-iOpent iTafick
Heat HorizonNova Keystone
Conductor
Scheduler
Compute
GlanceCinder
KVM
MySQL
RabbitMQ
...
...
...
...
......
...
...
Ceph
MoreiCowbelli-iOpent iTafick
Heat HorizonNova Keystone
Conductor
Scheduler
Compute
GlanceCinder
KVM
MySQL
RabbitMQ
...
...
...
...
......
...
...
Ceph
AWS
i i
Mont iTorint g
Software
Network
Storage
Servers
MeTrifics S
Logs S
i i
Middleware
Mont iTorint g
Software
Network
Storage Log API
Kafka
Logstash Elastic
InfluxDB
Metric APIAlerting
Grafana
Kibana
MySQL
SQLite
Servers
MeTrifics S
Logs S
Zookeeper
Storm
i i
Reficap
PostgreSQL
Redis
FrontendMachine
Learning Bit
Login / IAM
Backend APICache
ConfigurationStore
ElasticsearchCassandra
Search APIService
Discovery(consul, etcd)
Container Orchestration Engine - Docker Swarm / Kubernetes
MySQL
i i
Reficap
PostgreSQL
Redis
FrontendMachine
Learning Bit
Login / IAM
Backend APICache
ConfigurationStore
ElasticsearchCassandra
Search APIService
Discovery(consul, etcd)
Container Orchestration Engine - Docker Swarm / Kubernetes
MySQL
Heat HorizonNova Keystone
Conductor
Scheduler
Compute
GlanceCinder
KVM
MySQL
RabbitMQ
...
.........
.........
...
Ceph
i i
Reficap
PostgreSQL
Redis
FrontendMachine
Learning Bit
Login / IAM
Backend APICache
ConfigurationStore
ElasticsearchCassandra
Search APIService
Discovery(consul, etcd)
Container Orchestration Engine - Docker Swarm / Kubernetes
MySQL
Heat HorizonNova Keystone
Conductor
Scheduler
Compute
GlanceCinder
KVM
MySQL
RabbitMQ
...
.........
.........
...
Ceph
Middleware
Log API
Kafka
Logstash Elastic
InfluxDB
Metric APIAlerting
Grafana
Kibana
MySQL
SQLite
Zookeeper
Storm
i i
Middleware
Mont iTorint g
Software
Network
Storage Log API
Kafka
Logstash Elastic
InfluxDB
Metric APIAlerting
Grafana
Kibana
MySQL
SQLite
Servers
MeTrifics S
Logs S
Zookeeper
Storm
i i
Middleware
Mont iTorint g
Software
Network
Storage Log API
Kafka
Logstash Elastic
InfluxDB
Metric APIAlerting
Grafana
Kibana
MySQL
SQLite
Servers
MeTrifics S
Logs S
Zookeeper
Storm
i i
Mont iTorint g
Commendable “right tool for the job” attitude, but…
ATileas ST,ificouldis SomeiofiTheipers Sis STent ficeibeiunt ifedd
Fewerifailureimodes S
Feweribafickupis STraTegies S
Feweriredunt dant ficy/replificaTiont iproToficols S
Ont eis SeTiofificont s Sis STent TidaTais Semant Tifics S
Re-us Seiexis STint gioperaTiont aliknt owledge
i i
Ont eis Simpleiques STiont …
Is PostreSQL a Time Series Database?
喘imeiieries S
喘imeiieries SiDaTabas Ses Si–i喘heiChoifice!
● Ganglia (RRDtool)● Graphite (Whisper)● OpenTSDB (HBase)● KairosDB (Cassandra)● InfuxDB● Prometheus● Gnocchi● Atlas● Heroic
● Hawkular (Cassandra)● MetricTank (Cassandra)● Riak TS (Riak)● Bluefood (Cassandra)● DalmatinerDB● Druid● BTrDB● Warp 10 (Hbase)● Tgres (PostgreSQL!)
喘imeiieries SiDaTabas Ses Si–i喘heiChoifice!
● Ganglia [Berkley]● Graphite [Orbitz]● OpenTSDB [Stubleupon]● KairosDB● InfuxDB● Prometheus [SoundCloud]● Gnocchi [OpenStack]● Atlas [Netfix]● Heroic [Spotify]
● Hawkular [Redhat]● MetricTank [Raintank]● Riak TS [Basho]● Bluefood [Rackspace]● DalmatinerDB● Druid● BTrDB● Warp 10● Tgres
喘imeiieries SiDaTabas Ses S
喘imeiieries SiDaTabas Ses S
2000
喘imeiieries SiDaTabas Ses S
2000 2010
喘imeiieries SiDaTabas Ses S
2000 2010
2013 - 2017
喘imeiieries Si–iPeriodific
time value
10:01 15%
10:02 100%
10:03 86%
ColleficTor
10:04 60%
10:05 0%
ColleficTor
喘imeiieries Si–iPeriodific
ColleficTortime value
10:01 15%
MeTrific
MeTa
name dimensions
cpu { host:prod1 }
10:01 1% cpu { host:prod2 }
10:01 24° temp { sensor:rack }
ColleficTor
10:02 100% cpu { host:prod1 }
10:02 87% cpu { host:prod2 }
10:02 26° temp { sensor:rack }
ColleficTor
喘imeiieries Si–iiporadific
ColleficTortime value
10:07 13
MeTrific
MeTa
name dimensions meta
log { host:prod1 }
Event TMeTa
{ msg:… }
11:39 1 log { host:prod2 } { msg:… }
11:50 2 alarm { host:prod1 } { reason:… }
Event Ts S
14:02 1 alarm { host:prod1 } { reason:… }
"metric": { "timestamp": 1232141412, "value": 42, "name": "cpu.percent", "dimensions": { "hostname": "dev-01" }, "value_meta": { … }}
喘imeiieries Si–iInt ges STiDaTa
● JiONiEnt ficoded● 喘imes STampi(Unt ix/IiO/..)● Meas Surement TiValue● MeTrificiIdent TifficaTiont
– Name– Dimensions (or “Tags”)
● OpTiont aliEvent TiDaTa
喘imeiieries Si–iQueries S
10:01 10:02 10:0320
21
22
23
24
25
26
27
28
29
30
{ sensor:rack }
tem
p °
C
time value
10:01 15%
name dimensions
cpu { host:prod1 }
10:01 1% cpu { host:prod2 }
10:01 24° temp { sensor:rack }
10:02 100% cpu { host:prod1 }
10:02 87% cpu { host:prod2 }
10:02 26° temp { sensor:rack }
10:03 100% cpu { host:prod1 }
10:03 50% cpu { host:prod2 }
10:03 27° temp { sensor:rack }
喘imeiieries Si–iQueries S
10:01 10:02 10:030
10
20
30
40
50
60
70
80
90
100
{ host:prod1 } { host:prod2 }
cpu
%
time value
10:01 15%
name dimensions
cpu { host:prod1 }
10:01 1% cpu { host:prod2 }
10:01 24° temp { sensor:rack }
10:02 100% cpu { host:prod1 }
10:02 87% cpu { host:prod2 }
10:02 26° temp { sensor:rack }
10:03 100% cpu { host:prod1 }
10:03 50% cpu { host:prod2 }
10:03 27° temp { sensor:rack }
? ? ?? ? ?? ? ?
? ? ?? ? ?? ? ?? ? ?? ? ?? ? ?
10:0010:0010:00
喘imeiieries Si–iQueries S
10:01 10:02 10:030
10
20
30
40
50
60
70
80
90
100
{ host:prod1 } { host:prod2 }
cpu
%
time value
10:01 15%
name dimensions
cpu { host:prod1 } 10:01 1% cpu { host:prod2 } 10:01 24° temp { sensor:rack } 10:02 100% cpu { host:prod1 } 10:02 87% cpu { host:prod2 } 10:02 26° temp { sensor:rack } 10:03 100% cpu { host:prod1 } 10:03 50% cpu { host:prod2 } 10:03 27° temp { sensor:rack } 10:0410:0410:0410:0510:0510:05
喘imeiieries Si–iificalint g
time value name dimensions ● ReTent Tiont i(Volume)– How long is data kept– Data volume stored
● MeTrificiAmount T– Number of series names– Additional dimensions
● QueryiComplexiTy– Time range size– Number of metrics
10:01 15% cpu {host:prod1}
10:01 1% cpu {host:prod2}
10:01 24° temp {sensor:rack}
10:02 100% cpu {host:prod1}
10:02 87% cpu {host:prod2}
10:02 26° temp {sensor:rack}
10:03 100% cpu {host:prod1}
10:03 50% cpu {host:prod2}
10:03 27° temp {sensor:rack}
RelaTiont aliModel
CREATE TABLE measurements ( timestamp TIMESTAMPTZ, value FLOAT8, name VARCHAR, dimensions JSONB, value_meta JSON);
RelaTiont aliModeli–iDent ormalis Sediifichema
● iToreiint is Sint gleiTable– Trivial mapping of data– Row is a measurement
● NaTiveiformaTiTime– Normalise to UTC timezone– “Just use TIMESTAMPTZ”
● FloaTint gipoint Tivalue– Typical for sensor use cases– Strict accuracy: use NUMERIC
● e.g. Money!
CREATE TABLE measurements ( timestamp TIMESTAMPTZ, value FLOAT8, name VARCHAR, dimensions JSONB, value_meta JSON);
RelaTiont aliModeli–iDent ormalis Sediifichema
● ArbiTraryilent gThint ame– Fixed length: no real efect
● Key/valueidiment s Siont s S– JSONB is binary encoded– Efficient access to children– Best if querying contents
● Key/valueimeTadaTa– JSON just validated text– Best if purely payload
SELECT DISTINCT name, dimensionsFROM measurementsWHERE name = 'cpu.percent' dimensions @> '{"host": "dev-01"}'::JSONB
RelaTiont aliModeli–iieries SiLis STint giQuery
● ieleficTimeTrificimeTadaTa● ihowieafichimeTrificiont fice● OpTiont allyiflTer
– All metrics with name– All metrics for a host
SELECT TIME_ROUND(timestamp, 60), AVG(value)FROM measurementsWHERE timestamp BETWEEN '2015-01-01Z00:00:00' AND '2015-01-01Z01:00:00' AND name = 'cpu.percent' AND dimensions @> '{"host": "dev-01"}'::JSONBGROUP BY 1
RelaTiont aliModeli–iiint gleiieries SiQuery
● Fint dimeas Surement Ts S– Specify time range– Specify metric name– Specify dimension value
● AggregaTeidaTaipoint Ts S– Round to desired interval– Group by that interval– Take average of all data
points in that interval
RelaTiont aliModeli–iPerformant ficeiAnt alys Sis S
Query Duration (seconds)
Data Volume(M/rows)
喘argeT:i<100ms S(QueryiDuraTiont )
Query Duration (seconds)
Time Range(seconds)
RelaTiont aliModeli–iiint gleiieries SiQueryi(vs Si喘imeiRant ge)
0 1000 2000 3000 4000 5000 6000 7000 8000 90000.00
0.05
0.10
0.15
0.20
0.25
0.30
0.35
0.40
3M Rows
2M Rows
1M Rows
Query Time Range (seconds)
Qu
ery
Du
ratio
n (
seco
nd
s)
RelaTiont aliModeli–iiint gleiieries SiQueryi(vs SiDaTaiVolume)
0 1 2 3 4 5 6 7 8 9 100.00
0.20
0.40
0.60
0.80
1.00
1.20
1.40
Data Volume (M-rows)
Qu
ery
Du
ratio
n (
seco
nd
s)
RelaTiont aliModeli-iAnt alys Sis S
✔ QueryiTimeifxediregardles Ss SiofiTimeirant ge
✔ Ont iTargeTifor<i~1Mirows S
✗ QueryiTimeis Sficales Silint earlyiwiThidaTaivolume
✗ Everyiqueryireads Sieveryirow✗ Full table scan
Int dexint g
Int dexint g
● 喘imes STamps Siareies Ss Sent Tiallyiint Tegers S● Pos STgreiQLihas Simant yiint dexiTypes SB喘REE,iHAiH,iBRIN,iGIN,iGIi喘
● B喘REEiexficellent TiforiEqualiTyiant diBeTweent
Index Table
1
63 2
Int dexint gi–iB喘REE
3 three2 two4 four6 six
1 one5 five8 eight7 seven
45
78
Index Table
1
63 2
Int dexint gi–iB喘REE
3 three2 two4 four6 six
1 one5 five8 eight7 seven
45
78
=7
Index Table
1
63 2
Int dexint gi–iB喘REE
3 three2 two4 four6 six
1 one5 five8 eight7 seven
45
78
>=6<=8
SELECT TIME_ROUND(timestamp, 60), AVG(value)FROM measurementsWHERE timestamp BETWEEN '2015-01-01Z00:00:00' AND '2015-01-01Z01:00:00' AND name = 'cpu.percent' AND dimensions @> '{"host": "dev-01"}'::JSONBGROUP BY 1
Int dexint gi–iiint gleiieries SiQuery
● BE喘WEENipredificaTe● Elimint aTes SihugeiporTiont iofiTableificont Tent Ts S– High selectivity
● Exficellent Tificant didaTeiforiint dexificreaTiont
Int dexint gi-i喘imes STamp
CREATE INDEX ON measurements USING BTREE (timestamp);
● ipeficifyiTableiToiint dex● ipeficifyiint dexiType
– Optional: BTREE is default● ipeficifyificolumnt iToiint dex
Int dexint gi–iiint gleiieries SiQueryi(vs SiDaTaiVolume)
0 1 2 3 4 5 6 7 8 9 100.00
0.20
0.40
0.60
0.80
1.00
1.20
1.40
No Index
With Index
Data Volume (millions/rows)
Qu
ery
Du
ratio
n (
seco
nd
s)
Int dexint gi–iiint gleiieries SiQueryi(vs SiDaTaiVolume)
0 1 2 3 4 5 6 7 8 9 100.000
0.005
0.010
0.015
0.020
0.025
9000
8000
7000
Data Volume (millions/rows)
Qu
ery
Du
ratio
n (
seco
nd
s)
Int dexint gi–iiint gleiieries SiQueryi(vs Si喘imeiRant ge,i10MiRows S)
0 1000 2000 3000 4000 5000 6000 7000 8000 9000 100000.00
0.01
0.02
0.03
0.04
0.05
0.06
0.07
0.08
0.09
0.10
1 Metric
10 Metrics
Query Time Range (seconds)
Qu
ery
Du
ratio
n (
seco
nd
s)
Int dexint gi–iiint gleiieries SiQueryi(vs Si喘imeiRant ge,i10MiRows S)
0 1000 2000 3000 4000 5000 6000 7000 8000 9000 100000.00
0.05
0.10
0.15
0.20
0.25
1 Metric
10 Metrics
100 Metrics
Query Time Range (seconds)
Qu
ery
Du
ratio
n (
seco
nd
s)
Int dexint gi-iAnt alys Sis S
✔ DaTaiVolume
✔ To 10M✔ 喘imeiRant ge
✔ To 9000s (10 Metrics)✔ QueryiTimeis STableias SiDaTaiVolumeiint ficreas Ses S
✗ 喘imeiRant ge
✗ Over 4000s (100 Metrics)✗ Nowiapparent TiqueryiduraTiont iint ficreas Ses Sias Si喘imeiRant geigrows S
✗ Int ficreas Sint gint umberiofimeTrifics Sidras STificallyiafeficTs SiqueryiduraTiont
✗ Data for each uninteresting series must be fltered out
SELECT TIME_ROUND(timestamp, 60), AVG(value)FROM measurementsWHERE timestamp BETWEEN '2015-01-01Z00:00:00' AND '2015-01-01Z01:00:00' AND name = 'cpu.percent' AND dimensions @> '{"host": "dev-01"}'::JSONBGROUP BY 1
Int dexint gi–iiint gleiieries SiQuery
● Moreiint dexint gd– name– dimensions
Int dexint gi–iAddiTiont al
CREATE INDEX ON measurements USING BTREE (name);
CREATE INDEX ON measurements USING GIN (dimensions);
● CreaTeint ewiint dexes Siont imeas Surement Ts SiTable
● ipeficifyint ame– Equality: Use BTREE
● ipeficifyidiment s Siont s S– Containment: Use GIN– Find contents of JSON
Int dexint gi–iieries SiQueryi(vs Si喘imeiRant ge,i10MiRows S,i100iMeTrifics S)
0 1000 2000 3000 4000 5000 6000 7000 8000 9000 100000.00
0.05
0.10
0.15
0.20
0.25
Time & Metric
Time Index
Query Time Range (seconds)
Qu
ery
Du
ratio
n (
seco
nd
s)
Int dexint g
✗ Ohidear
✗ 喘ooimufichiint dexint gificant ibeiharmful
Normalis SaTiont
CREATE TABLE measurements ( timestamp TIMESTAMPTZ, value FLOAT8, name VARCHAR, dimensions JSONB, value_meta JSON);
Normalis SaTiont
CREATE TABLE values ( timestamp TIMESTAMPTZ, value FLOAT8, metric_id INT, value_meta JSON);
CREATE TABLE metrics ( id SERIAL, name VARCHAR, dimensions JSONB, UNIQUE (name, dimensions));
Normalis SaTiont
● Values Sis SToredibyiint Tegeriid– References entry in metric table– The name/dimensions for each
metric are only stored once– Eliminates repeated bulky data
in measurements table● MeTrificiTableidefnt es Siid
– SERIAL produces incrementing integers to allot id values
– UNIQUE constraint is useful during normalisation
● Implicitly creates suitable index
CREATE TABLE values ( timestamp TIMESTAMPTZ, value FLOAT8, metric_id INT, value_meta JSON);
CREATE TABLE metrics ( id SERIAL, name VARCHAR, dimensions JSONB, UNIQUE (name, dimensions));
Normalis SaTiont i–iView
CREATE VIEW measurementsAS SELECT timestamp, value, name, dimensions, value_meta FROM values INNER JOIN metrics ON (metric_id = id);
● Mimificimeas Surement Ts S– Views can be queried in
the same way as tables● Defnt ediwiThiiELEC喘
– Query to run which produces contents of view
● Joint int ormalis SediTables S● Cant ire-us Seis Sameiqueries Sias Sibefore
CREATE RULE measurements_insert AS ON INSERT TO measurements DO INSTEAD INSERT INTO values ( timestamp, value, metric_id, value_meta ) VALUES ( NEW.timestamp, NEW.value, create_metric ( NEW.name, NEW.dimensions), NEW.value_meta );
Normalis SaTiont i–iViewiInt s SerT
● Cant ’Tiint s SerTidaTaiint Toiviews SibyidefaulT
● Cant is Speficifyiant iaficTiont iToiperformiont iINiER喘
● Int s SerTiint Toivalues S● HelperiproficedureiToialloficaTeimeTrific_id
● Normalis SaTiont iis SiTrant s Sparent Tiforius Ser
CREATE FUNCTION create_metric ( in_name VARCHAR, in_dims JSONB) RETURNS INT LANGUAGE plpgsql AS $_$DECLARE out_id INT;BEGIN SELECT id INTO out_id FROM metrics AS m WHERE m.name = in_name AND m.dimensions = in_dims; IF NOT FOUND THEN INSERT INTO metrics ("name", "dimensions") VALUES (in_name, in_dims) RETURNING id INTO out_id; END IF; RETURN out_id;END; $_$;
Normalis SaTiont i–iMeTrificiLookup
● iTorediproficedure– Take name/dimensions– Returns metric_id
● Fint diexis STint gimeTrific– Return existing id
● Ifint ew,iThent iINiER喘– Allocates new id– Return the new id
CREATE INDEX ON values USING BTREE (timestamp);
CREATE INDEX ON values USING BTREE (metric_id);
Normalis SaTiont i-iInt dexint g
● 喘imes STampiint dex– Same as before
● Newiint dexiont imeTrific_id– Allow efficient fltering of
metrics during JOIN– Serves similar purpose to
existing metric indexing
Normalis SaTiont i–iieries SiQueryi(vs Si喘ime,i10MiRows S,i100iMeTrifics S)
0 1000 2000 3000 4000 5000 6000 7000 8000 90000.00
0.05
0.10
0.15
0.20
0.25
Normalised
Denormalised (Time Index)
Denormalised (Extra Index)
Query Time Range (seconds)
Qu
ery
Du
ratio
n (
seco
nd
s)
Normalis SaTiont
✔ Normalis SaTiont ielimint aTedioverheadiofiaddiTiont alimeTrificiint dexint g
✗ 喘heimeTrificiint dexint gis STillidoes Snt ’Tihaveiaipos SiTiveiefeficT
time value metric
10:01 . 1
10:01 . 2
10:02 . 1
10:02 . 2
10:03 . 1
10:03 . 2
10:04 . 1
10:04 . 2
A
B
C
D
E
F
G
H
Normalis SaTiont i–iBiTmapiInt dexiificant
timeindex
metricindex
C
E
F
B
D
F
H
D
D
F
2:02:03
time value metric
10:01 . 1
10:01 . 2
10:02 . 1
10:02 . 2
10:03 . 1
10:03 . 2
10:04 . 1
10:04 . 2
A
B
C
D
E
F
G
H
Normalis SaTiont i–iMulTi-Columnt iInt dexint g
timemetricindex
D
F
2:02:03
CREATE INDEX ON values USING BTREE (timestamp);
CREATE INDEX ON values USING BTREE (metric_id);
Normalis SaTiont i–iMulTi-Columnt iInt dexint g
CREATE INDEX ON values USING BTREE (timestamp, metric_id);
CREATE INDEX ON values USING BTREE (metric_id, timestamp);
Normalis SaTiont i–iieries SiQueryi(vs SiRant ge,i10MiRows S,i100iMeTrifics S)
0 1000 2000 3000 4000 5000 6000 7000 8000 90000.00
0.05
0.10
0.15
0.20
0.25
Normalised (Single Index)
Normalised
Denormalised (Time Index)
Denormalised (Extra Index)
Query Time Range (seconds)
Qu
ery
Du
ratio
n (
seco
nd
s)
Normalis SaTiont
● Int ficreas Seivolumei10MiToi100M– @ 1Hz / 100 Metrics: 1M seconds
● Before: ~1.15 days● Now: ~11.5 days
● Int ficreas SeimaxiTimeirant ges Sifromi9000s SiToi90,000s S– Before: 2.5 hours– Now: 1.04 days
Normalis SaTiont i–iieries SiQueryi(vs SiVolume,i10iMeTrifics S)
0 10 20 30 40 50 60 70 80 90 1000
0.02
0.04
0.06
0.08
0.1
0.12
10000
20000
30000
Data Volume (M-rows)
Qu
ery
Du
ratio
n (
seco
nd
s)
Normalis SaTiont i–iieries SiQueryi(vs SiRant ge,i100MiRows S)
0 10000 20000 30000 40000 50000 60000 70000 80000 900000.00
0.02
0.04
0.06
0.08
0.10
0.12
0.14
0.16
1 Metric
10 Metrics
Query Time Range (seconds)
Qu
ery
Du
ratio
n (
seco
nd
s)
Normalis SaTiont i–iieries SiQueryi(vs SiRant ge,i100MiRows S)
0 10000 20000 30000 40000 50000 60000 70000 80000 900000.00
0.10
0.20
0.30
0.40
0.50
0.60
0.70
0.80
0.90
1 Metric
10 Metrics
100 Metrics
Query Time Range (seconds)
Qu
ery
Du
ratio
n (
seco
nd
s)
Normalis SaTiont i–iieries SiQueryi(vs SiRant ge,i100MiRows S)i(+Cont fg)
0 10000 20000 30000 40000 50000 60000 70000 80000 900000.00
0.10
0.20
0.30
0.40
0.50
0.60
0.70
0.80
0.90
1 Metric
10 Metrics
100 Metrics
Query Time Range (seconds)
Qu
ery
Du
ratio
n (
seco
nd
s)
Normalis SaTiont i-iAnt alys Sis S
✔ DaTaiVolume
✔ To 100M✔ 喘imeiRant ge
✔ To 90,000s (10 Metrics)
✗ 喘imeiRant ge
✗ Over 30,000s (100 Metrics)
✗ Over 90,000s✗ NeediaibeTTeris STraTegyiforis Servificint gilargeriTimeirant ges S
iummaris Sint g
iummaris Sint gi-iProblem
● Fori100imeTrifics S,is Someiqueries Simis Ss SiTargeTi100ms S– Over ~40Ks (~11 days)
● QueryimighTibeireTurnt int giupiToi40Kipoint Ts S● Is SiThis SiaficTuallyint efices Ss Saryd
– Especially if data is simply used for visualisation– An average 1080p monitor only has ~2000 pixels
● LeTs Sis Sayi4000ipoint Ts Siareient ough,iorievent i400
values_2
iummaris Sint gi-iExample
time sum metric
10:00 10 1
10:00 2 2
10:02 5 1
10:02 4 2
values
time value metric
10:00 10 1
10:00 2 2
10:01 20 1
10:01 6 2
10:02 5 1
10:02 4 2
10:03 15 1
10:03 1 2
30
8
20
5
30
8
20
5
✔ iummaryiTableijus STiaifraficTiont iofiTheis Size
CREATE TABLE values_10 ( timestamp TIMESTAMPTZ, metric_id INT, sum FLOAT8, count FLOAT8, min FLOAT8, max FLOAT8,
UNIQUE (metric_id, timestamp));
iummaris Sint gi
● CreaTeivalues SiTable– Use for 10:1 summary
● Ont eient Try/Timeiperiod– Per metric– UNIQUE provide indexing
● MulTipleiaggregaTes S– SUM– COUNT– MIN– MAX
CREATE VIEW summary_10 AS SELECT * FROM values_10 INNER JOIN metrics ON (metric_id = id);
iummaris Sint g
● CreaTeiaiviewias Sibefore– Only storing metric_id
● iimplifes Siqueries S● Joint s SimeTrificidefnt iTiont s S
CREATE FUNCTION summarise_10 () RETURNS TRIGGER LANGUAGE plpgsql AS $_$BEGIN :END; $_$;
CREATE TRIGGER summarise_10_t AFTER INSERT ON values FOR EACH ROW EXECUTE PROCEDURE summarise_10 ();
iummaris Sint gi–i喘riggeriDefnt iTiont
● BoilerplaTe● Defnt eiTriggerifunt ficTiont
– Stored procedure– Contents omitted
● 喘riggeriToiexeficuTe…– On INSERT– To values table– Data passed to procedure
INSERT INTO values_10 VALUES ( TIME_ROUND(NEW.timestamp, 10), NEW.metric_id, NEW.value, 1, NEW.value, NEW.value)ON CONFLICT (metric_id, timestamp)DO UPDATE SET sum = sum + EXCLUDED.sum, count = count + EXCLUDED.count, min = LEAST (min,EXCLUDED.min), max = GREATEST(max,EXCLUDED.max);
iummaris Sint gi–i喘riggeriAficTiont
● Int s SerTiint Tois Summary– NEW is inserted data
● Rount diTimeiToiperiod– 10 seconds
● Int iTialiaggregaTeivalues S● Ifient Tryiexis STs Sialready● UpdaTeiint s STead
– EXCLUDED is current row– Combine new value with
existing aggregate value
SELECT TIME_ROUND(timestamp, 60), (SUM(sum) / SUM(count)) AS avgFROM summary_10WHERE timestamp BETWEEN '2015-01-01Z00:00:00' AND '2015-01-01Z01:00:00' AND name = 'cpu.percent' AND dimensions @> '{"host": "dev-01"}'::JSONBGROUP BY 1
iummaris Sint gi–iiint gleiieries SiQuery
● Mos STlyiunt fichant ged● Queryis SummaryiTable,int oTirawimeas Surement Ts S
● HaveiToiaggregaTeiTheiparTialiaggregaTiont s S– MIN: MIN(min)
– MAX: MAX(max)
– SUM: SUM(sum)
– COUNT: SUM(count)
– AVG: SUM(sum)/SUM(count)
iummaris Sint gi–iieries SiQueryi(vs SiRant ge,i100MiRows S)
0 10000 20000 30000 40000 50000 60000 70000 80000 900000.00
0.02
0.04
0.06
0.08
0.10
0.12
1 Metric
10 Metrics
100 Metrics
Query Time Range (seconds)
Qu
ery
Du
ratio
n (
seco
nd
s)
iummaris Sint gi–iieries SiQueryi(vs SiRant ge,i100MiRows S)
0 10000 20000 30000 40000 50000 60000 70000 80000 900000.00
0.02
0.04
0.06
0.08
0.10
0.12
1 Metric
10 Metrics
100 Metrics
1 Metric
10 Metrics
100 Metrics
Query Time Range (seconds)
Qu
ery
Du
ratio
n (
seco
nd
s)
iummaris Sint g
● Int ficreas SeivolumeiToi1BN– @ 1Hz / 100 Metrics: 10M seconds– Before: 11.5 days– Now: ~115 days : 16½ weeks
● Int ficreas SeimaxiTimeirant ges Sifromi90Ks SiToi900Ks S– Before: ~1.04 days– Now: ~10.4 days
iummaris Sint gi–iieries SiQueryi(vs SiVolume;i100M-1BN)
0 100 200 300 400 500 600 700 800 900 10000.000
0.020
0.040
0.060
0.080
0.100
0.120
100000
200000
300000
Data Volume (M-rows)
Qu
ery
Du
ratio
n (
seco
nd
s)
iummaris Sint gi–iieries SiQueryi(vs SiRant ge,i1BNiRows S)
0 100000 200000 300000 400000 500000 600000 700000 800000 9000000
0.02
0.04
0.06
0.08
0.1
0.12
Query Time Range (seconds)
Qu
ery
Du
ratio
n (
seco
nd
s)
✔ DaTaiVolume
✔ To 1BN – ~16 weeks ✔ 喘imeiRant ge
✔ To ~10 days
✔ 喘ois SficaleifurTherdi喘ryi100:1is Summary
iummaris Sint g
Clos Sint giNoTes S
WhaTiNexTd
● ParTiTiont int g– By timestamp and metric_id– Retention window (i.e. keep last 6 months)
● Int ges STiopTimis SaTiont – Compute aggregates in batches (e.g. every 5min)– Non trigger-based summarisation– Possibility to abuse logical replication
Is SiITiWorThiITd
● Yes S– Reduce complexity in terms of number of technologies– Reuse existing infrastructure and operational knowledge– No loss in consistency – ACID all the way
● Nod– Increase complexity in terms of database design– Is an out-of-the box tool already available?
i i 103
喘hant ks S
s STeve@s Smps Snt .nt eT