
TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL


Page 1: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Building a Custom Data Warehouse Using PostgreSQL

TOASTing an Elephant

Illustration by Zoe Lubitz

Page 2: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

David Kohn

Chief Elephant Toaster and Data Engineer at Moat

[email protected]

Page 3: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

We measure attention online, for both advertisers and publishers.

We don't track cookies or IP addresses.

Rather we process billions of events per day that allow us to measure how many people saw an ad or interacted with it.

We are a neutral third party and our metrics are used by both advertisers and publishers to measure their performance and agree on a fair price.

Those billions of events are aggregated in our realtime system and end up as millions of rows per day added to our stats databases.

Page 4: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Moat Interface

Page 5: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Moat Interface

Page 6: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Moat Interface

Page 7: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

tuple: client | date | filter1 … filterN | metric1 … metricN

Partition Keys | Filters (~10 text) | Metrics (~170 int8)

Production queries target a single client.

Production queries sum all of these. Subsets are hierarchical.

Basic Row Structure
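For concreteness, a minimal sketch of what such a table could look like (hypothetical names; only two of the ~10 filters and two of the ~170 metrics shown):

-- Trimmed-down, assumed version of the wide rollup row
CREATE TABLE rollup_table_name (
    client  text   NOT NULL,  -- partition key
    date    date   NOT NULL,  -- partition key
    filter1 text,             -- ~10 text filter columns in production
    filter2 text,
    metric1 bigint,           -- ~170 int8 metric columns in production
    metric2 bigint
);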

Page 8: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

tuple: client | date | filter1 … filterN | metric1 … metricN

Partition Keys | Filters (~10 text) | Metrics (~170 int8)

Production queries target a single client.

Production queries sum all of these. Subsets are hierarchical.

Basic Row Structure

SELECT filter1, filter2 …
       SUM(metric1), SUM(metric2) … SUM(metricN)
FROM rollup_table_name
WHERE client = 'foo' AND date >= 'bar' AND date <= 'baz'
GROUP BY filter1, filter2 …

Typical Query

Page 9: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Moat Interface

Client Filters Date Range

Metrics

Page 10: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Moat Interface

Client Filters Date Range

Metrics (and there are a lot more of them you can choose)

Page 11: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Lots of Data

Page 12: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Sum large amounts of data quickly (but only a small fraction of total data, easily partition-able)

Sum all columns of very wide rows

Compress data (for storage and i/o reasons)

Support medium read concurrency (or at least degrade predictably), i.e. 4-12 requests/second, some of which can take minutes to finish

Data is derivative and structured to meet the needs of the client-facing app: high read/aggregation throughput for clients

ETL quickly, some bulk delete/redo operations, once per day

Requirements

Page 13: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Should we choose a row store or a column store?

Page 14: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Old Systems

Postgres

• 2 masters + 2 replicas each

• Handled last 7 days

• High concurrency

• Highly disk bound

• Heavily partitioned

• Shield for column stores

Vertica

• ~3 mos/cluster (30 TB license - 8 nodes - $$$)

• Fast, but slowed down under concurrency

• Performance degradation unpredictable

• Projections can lead to slow ETL

Redshift

• 1 cluster (8 nodes, spinning disk)

• 2012-Present

• No rollup tables, too big

• Incredibly slow for client-facing queries (many columns)

• Bulk-insert ETL, delete/update hard

Page 15: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Row Store

[Diagram: a table on disk is a collection of pages; each page has a header and holds tuples; each tuple is a header plus its attrs]

A table is a collection of rows, each row split into columns/attrs

Each row must fit into a page.

Page 16: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL


Row Store

• Accesses small subsets of rows quickly

• Little penalty for many columns selected

• Great for individual inserts, updates and deletes

• Often normalized data structure

• OLTP workloads

• High concurrency, less throughput per user

• Data stored uncompressed, unless too large for a block

Page 17: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

[Diagram: a table on disk stored one attr at a time; each attr's pages hold compressed values (possibly with surrogate keys)]

Column Store

A table is a collection of columns.

Each column is split into values; position corresponds to row.

Values in columns often compressed.

Page 18: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL


Column Store

• Scans and aggregates large numbers of rows quickly

• Best when selecting a subset of columns

• Great for bulk inserts, harder to delete or update

• Often denormalized data structure

• OLAP workloads

• Lower concurrency, much higher throughput per user

• Data can be compressed

Page 19: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

[Diagram: a tuple whose last attr is too big to fit in its page]

What happens when an attr is too big to fit in a page?

Page 20: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

TOAST: The Oversized Attribute Storage Technique

[Diagram: the oversized attr is LZ-compressed, split into segments, and stored as (tuple id, compressed attr segment) rows across pages of a separate TOAST table; the main tuple keeps a pointer to it]
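As a rough illustration (standard catalog queries; 'some_table' is a placeholder name), you can see which columns are TOAST-able and how much of a table lives in its TOAST table:

-- Which columns can be compressed and/or moved out of line
-- ('x' = EXTENDED, the default for varlena types such as text and arrays)
SELECT attname, attstorage
FROM pg_attribute
WHERE attrelid = 'some_table'::regclass AND attnum > 0;

-- Heap size vs. size including the associated TOAST table
SELECT pg_size_pretty(pg_relation_size('some_table')) AS heap_only,
       pg_size_pretty(pg_table_size('some_table'))    AS heap_plus_toast;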

Page 21: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Project Marjory

Page 22: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Project Marjory

Page 23: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Moat Interface

Client Filters Date Range

Metrics

Page 24: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

tuple: client | date | filter1 … filterN | metric1 … metricN

Partition Keys | Filters (~10 text) | Metrics (~170 int8)

Original Row

Page 25: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

tuple: client | date | filter1 … filterN | metric1 … metricN

Partition Keys | Filters (~10 text) | Metrics (~170 int8)

Original Row

Subtype

subtype: filter1 … filterN | metric1 … metricN

Page 26: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

tuple: client | date | filter1 … filterN | metric1 … metricN

Partition Keys | Filters (~10 text) | Metrics (~170 int8)

Original Row

Subtype

subtype: filter1 … filterN | metric1 … metricN

MegaRow

tuple: client | date | segment | array of subtype

Partition Keys | Array of Composite Type (~5000 rows/array)
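A minimal sketch of this layout under assumed, trimmed-down names (the real subtype has ~10 filters and ~170 metrics; later slides add more array columns per rollup level):

-- Composite type holding one original row's filters and metrics
CREATE TYPE subtype AS (
    filter1 text,
    filter2 text,
    metric1 bigint,
    metric2 bigint
);

-- One MegaRow per (date, client, segment); the ~5000-element array is what
-- TOAST compresses and stores out of line
CREATE TABLE array_table_name (
    date      date NOT NULL,
    client    text NOT NULL,
    segment   text NOT NULL,
    byfilter1 subtype[]
);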

Page 27: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

INSERT INTO array_table_name
SELECT date, client, segment,
       ARRAY_AGG( (filter1, filter2 … metric1, metric2 … metricN)::subtype )
FROM temp_table_for_etl
GROUP BY date, client, segment

Typical ETL Query

Page 28: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

INSERT INTO array_table_name
SELECT date, client, segment,
       ARRAY_AGG( (filter1, filter2 … metric1, metric2 … metricN)::subtype )
FROM temp_table_for_etl
GROUP BY date, client, segment

Typical ETL Query

Reporting Query

SELECT a.date, a.client,
       s.filter1 … s.filterN,
       SUM(s.metric1) … SUM(s.metricN)
FROM array_table_name a,
     LATERAL UNNEST(subtype[]) s (filter1, filter2, … metricN)
WHERE client = 'foo' AND date >= 'bar' AND date <= 'baz'
GROUP BY a.date, a.client, s.filter1 … s.filterN
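A concrete, runnable form of that reporting query against the trimmed-down hypothetical schema sketched earlier (literal values are placeholders):

SELECT a.date, a.client,
       s.filter1, s.filter2,
       SUM(s.metric1) AS metric1,
       SUM(s.metric2) AS metric2
FROM array_table_name a,
     LATERAL UNNEST(a.byfilter1) AS s   -- expands the composite array back into rows
WHERE a.client = 'foo'
  AND a.date BETWEEN '2017-01-01' AND '2017-01-10'
GROUP BY a.date, a.client, s.filter1, s.filter2;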

Page 29: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

1 Client, 10 days, ~150,000 rows/day (~1.5m rows total)

[Benchmark chart: Marjory vs. Redshift]

Page 30: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

1 Client, 10 days, ~3,000,000 rows/day (~30m rows total)

[Benchmark chart: Marjory vs. Redshift]

Page 31: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

1 Client, 4 months, ~150,000 rows/day (~18m rows total)

[Benchmark chart: Marjory vs. Redshift]

Page 32: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

1 Client, 4 months, ~150,000 rows/day (~18m rows total)

[Benchmark chart: Marjory vs. Redshift]

Page 33: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

1 Client, 4 months, ~150,000 rows/day (~18m rows total)

[Benchmark chart: Marjory vs. Redshift]

Page 34: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

The Good

• Performs quite well on our typical queries (lots of columns, large subset of rows)

• Sort order matters less than in column stores

• Query time scales with the number of rows unpacked and aggregated, and depends only lightly on the number of columns

• Utilizes resources efficiently for concurrency (Postgres' stinginess can serve us well)

• 8-10x compression for our data (with a bit of extra tuning of our composite type)

• All done in PL/pgSQL etc., no C code required

The Not-So-Good

• Doesn't do as well on general SQL queries; you have to unpack all of the rows

• Not gaining much over a column store if you're accessing only a few columns (one might be able to design it differently, though)

• Doesn't dynamically scale the number of workers to the size of the query (Postgres' stinginess doesn't serve us well for more typical BI cases, but that wasn't what we optimized for)

• Isn't going to do as well when scanning very large numbers of rows (i.e. more typical BI)

• All done in PL/pgSQL etc., no C code required

Page 35: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Trade generality for fit to our use case.

Page 36: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

I’ll Drink to That!

Illustration by Zoe Lubitz

Page 37: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Rollups

SELECT filter1, filter2, … filterN,
       SUM(metric1), … SUM(metricN)
FROM temp_table_for_etl
GROUP BY GROUPING SETS (
    (filter1, filter2, … filterN-1, filterN),
    (filter1, filter2, … filterN-1),
    …
    (filter1, filter2),
    (filter1)
)

INSERT INTO byfilter1 … INSERT INTO byfilter2 …
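GROUPING SETS computes every rollup level in one pass; a hedged sketch (hypothetical column lists) of routing one level into its own table with the GROUPING() function, which returns 1 when a column was aggregated away in a given set:

WITH rolled AS (
    SELECT filter1, filter2,
           SUM(metric1) AS metric1,
           GROUPING(filter2) AS filter2_rolled_up  -- 1 for the (filter1)-only set
    FROM temp_table_for_etl
    GROUP BY GROUPING SETS ((filter1, filter2), (filter1))
)
INSERT INTO byfilter1 (filter1, metric1)
SELECT filter1, metric1
FROM rolled
WHERE filter2_rolled_up = 1;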

Page 38: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

MegaRow

tuple: client | date | segment | byfilter1[ ] | byfilter2[ ] | byfilter3[ ] | byfilter4[ ]

Partition Keys, plus one array of subtype per rollup level

Page 39: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

MegaRow

[Diagram: the same partition keys with the byfilter1[ ]-byfilter4[ ] rollup arrays, each holding a different number of subtype elements]

Page 40: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

MegaRow

[Diagram: the same row with some of the byfilterN[ ] rollup arrays NULL]

Page 41: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Summary Statistics

tuple: client | date | segment | rollup arrays | total_rows | metadata

Partition Keys | Rollup Arrays | Summary Statistics (total_rows, metadata)

Page 42: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL


SELECT date, client, SUM(total_rows) AS rows_per_day
FROM array_table_name
GROUP BY date, client

Count Rows/Day by Client

Page 43: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Distinct Filter Values

tuple: client | date | segment | rollup arrays | total_rows | metadata | distinct-value arrays

Partition Keys | Rollup Arrays | Summary Stats | Distinct Lists

Page 44: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Targeted Reporting Query

SELECT a.date, a.client,
       s.filter1, s.filter2, … s.metricN
FROM array_table_name a,
     LATERAL UNNEST(subtype[]) s (filter1, filter2, … metricN)
WHERE client = 'foo' AND date >= 'bar' AND date <= 'baz'
  AND s.filter1 = 'fizz'


Page 45: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Targeted Reporting Query

SELECT a.date, a.client,
       s.filter1, s.filter2, … s.metricN
FROM array_table_name a,
     LATERAL UNNEST(subtype[]) s (filter1, filter2, … metricN)
WHERE client = 'foo' AND date >= 'bar' AND date <= 'baz'
  AND s.filter1 = 'fizz'


SELECT a.date, a.client,
       s.filter1, s.filter2, … s.metricN
FROM array_table_name a,
     LATERAL UNNEST(subtype[]) s (filter1, filter2, … metricN)
WHERE client = 'foo' AND date >= 'bar' AND date <= 'baz'
  AND s.filter1 = 'fizz'
  AND a.distinct_filter1 @> '{fizz}'::text[]
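A hedged sketch of how such a distinct-value list could be maintained at ETL time, reusing the trimmed-down hypothetical schema from earlier and assuming an added array column:

-- Hypothetical column holding the distinct filter1 values present in the row's array
ALTER TABLE array_table_name ADD COLUMN distinct_filter1 text[];

INSERT INTO array_table_name (date, client, segment, byfilter1, distinct_filter1)
SELECT date, client, segment,
       ARRAY_AGG((filter1, filter2, metric1, metric2)::subtype),
       ARRAY_AGG(DISTINCT filter1)
FROM temp_table_for_etl
GROUP BY date, client, segment;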

Page 46: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Stats

• Marjory (all data since 2012) has about the same on-disk footprint as Elmo (last 33ish days)

• ~20x compression compared to normal-format Postgres (~10x from TOAST + ~2x from avoided storage of rollups)

• 5 Marjory instances, each with all of the data for all time (on locally attached spinning-disk drives), have basically taken over what we had on our Vertica and Redshift instances (at least 16 instances)

• The overall tradeoff is I/O for CPU, so we had to do some tuning to get parallel plans chosen and running properly

Page 47: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

ALTER TABLE array_table_name ALTER client SET STATISTICS 10000;
ALTER TABLE array_table_name ALTER byfilter1 SET STATISTICS 0;
ALTER TABLE array_table_name ALTER byfilter2 SET STATISTICS 0;
...
ALTER TABLE array_table_name ALTER byfilterN SET STATISTICS 0;

Only Do Meaningful Statistics (But Make Them Good)

Useful Tuning Tips

Page 48: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

ALTER TABLE array_table_name ALTER client SET STATISTICS 10000;
ALTER TABLE array_table_name ALTER byfilter1 SET STATISTICS 0;
ALTER TABLE array_table_name ALTER byfilter2 SET STATISTICS 0;
...
ALTER TABLE array_table_name ALTER byfilterN SET STATISTICS 0;

Only Do Meaningful Statistics (But Make Them Good)

Useful Tuning Tips

Make Data-Type Specific Functions For Unnest With Proper Stats

CREATE FUNCTION unnest(byfilter4) RETURNS SETOF array_subtype AS $func$ ... $func$ LANGUAGE plpgsql ROWS 5000 COST 5000;
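A hedged sketch of what such a wrapper might look like for the trimmed-down subtype (the body on the slide is elided; names here are assumptions):

-- Wrap the generic unnest so the planner gets realistic row-count and cost
-- estimates for unpacking one ~5000-element MegaRow array
CREATE FUNCTION unnest_subtype(arr subtype[])
RETURNS SETOF subtype AS $func$
BEGIN
    RETURN QUERY SELECT * FROM pg_catalog.unnest(arr);
END;
$func$ LANGUAGE plpgsql IMMUTABLE
ROWS 5000
COST 5000;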

Page 49: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

ALTER TABLE array_table_name ALTER client SET STATISTICS 10000;
ALTER TABLE array_table_name ALTER byfilter1 SET STATISTICS 0;
ALTER TABLE array_table_name ALTER byfilter2 SET STATISTICS 0;
...
ALTER TABLE array_table_name ALTER byfilterN SET STATISTICS 0;

Only Do Meaningful Statistics (But Make Them Good)

Useful Tuning Tips

Make Data-Type Specific Functions For Unnest With Proper Stats

CREATE FUNCTION unnest(byfilter4) RETURNS SETOF array_subtype AS $func$ ... $func$ LANGUAGE plpgsql ROWS 5000 COST 5000;

min_parallel_relation_size, parallel_setup_cost, parallel_tuple_cost, max_worker_processes, max_parallel_workers_per_gather, cpu_operator_cost?

Futz With Parallelization Parameters Until They Work
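Illustrative values only (the right numbers are hardware- and workload-dependent, and min_parallel_relation_size became min_parallel_table_scan_size in PostgreSQL 10):

ALTER SYSTEM SET max_worker_processes = 16;              -- requires a restart
ALTER SYSTEM SET max_parallel_workers_per_gather = 8;
ALTER SYSTEM SET min_parallel_relation_size = '1MB';     -- default 8MB
ALTER SYSTEM SET parallel_setup_cost = 100;              -- default 1000; lower favors parallel plans
ALTER SYSTEM SET parallel_tuple_cost = 0.01;             -- default 0.1
-- cpu_operator_cost (default 0.0025) may also need a look for CPU-bound unpacking
SELECT pg_reload_conf();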

Page 50: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

Yep. CPU Bound

Page 51: TOASTing an Elephant : Building a Custom Data Warehouse Using PostgreSQL

[email protected]

Illustration by Zoe Lubitz

We’re hiring!

http://grnh.se/os4er71