anastasia-lubennikova
www.postgrespro.ru
Compression
✔ TOAST row compression (pg_lzcompress)
✔ GIN posting list compression = effective layout + varbyte encoding
➔ B-tree effective layout
➔ B-tree posting list varbyte encoding
➔ Default values
➔ Compression of leading columns in index
➔ Page compression (dictionary)
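To make "varbyte encoding" concrete, here is a minimal sketch in C. It is not the GIN or B-tree patch code. It assumes item identifiers are plain 64-bit integers sorted in ascending order, so that each one can be stored as a small delta from its predecessor. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write one value using 7 payload bits per byte; the high bit of a byte
 * means "more bytes follow". Returns the number of bytes written
 * (at most 10 for a 64-bit value). */
static size_t varbyte_encode(uint64_t value, uint8_t *out)
{
    size_t n = 0;

    while (value >= 0x80) {
        out[n++] = (uint8_t) (value | 0x80);
        value >>= 7;
    }
    out[n++] = (uint8_t) value;
    return n;
}

/* Read one value written by varbyte_encode. Returns bytes consumed. */
static size_t varbyte_decode(const uint8_t *in, uint64_t *value)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t n = 0;

    for (;;) {
        uint8_t b = in[n++];

        result |= (uint64_t) (b & 0x7F) << shift;
        if ((b & 0x80) == 0)
            break;
        shift += 7;
    }
    *value = result;
    return n;
}

/* Assumption: ids are sorted ascending, so every delta is non-negative. */
static size_t posting_list_encode(const uint64_t *ids, size_t count,
                                  uint8_t *out)
{
    size_t n = 0;
    uint64_t prev = 0;

    for (size_t i = 0; i < count; i++) {
        assert(ids[i] >= prev);
        n += varbyte_encode(ids[i] - prev, out + n);
        prev = ids[i];
    }
    return n;
}

/* Rebuild the original ids by summing the decoded deltas. */
static void posting_list_decode(const uint8_t *in, size_t count,
                                uint64_t *ids)
{
    size_t n = 0;
    uint64_t prev = 0;

    for (size_t i = 0; i < count; i++) {
        uint64_t delta;

        n += varbyte_decode(in + n, &delta);
        prev += delta;
        ids[i] = prev;
    }
}
```

Delta encoding only pays off when the list is sorted, which is why the ordering of TIDs inside a posting list matters for this technique.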
B-tree effective layout
B-tree effective layout. TODO
● The patch itself: https://commitfest.postgresql.org/9/494/
● README: https://goo.gl/50O8Q0
● Bring back microvacuum functionality
● Find a place for the LP_DEAD flag
● Add compression enable/disable
● Varbyte encoding of posting lists
● NOTE: tuples are not sorted by TID
● A lot more testing
Default values
● Problem: Adding a column with a default value causes all rows of the table to be touched at the same time.
● Any ideas?
Compression of leading columns
● Index on (usr_id, comment_id)
● Leading column has a lot of repeated values: (1,1), (1,2), (1,3) ... (1,100)
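One way to picture the idea is to store each distinct leading value once, followed by the trailing values that share it. The C sketch below does this for sorted (usr_id, comment_id) keys. The struct and function names are hypothetical, and the real on-page format is not specified in these slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct IndexKey {
    int32_t usr_id;      /* leading column */
    int32_t comment_id;  /* trailing column */
} IndexKey;

/* One run of keys that share the same leading value. */
typedef struct PrefixRun {
    int32_t usr_id;  /* leading value, stored once per run */
    size_t  first;   /* position of the run's first entry in trailing[] */
    size_t  count;   /* number of keys in the run */
} PrefixRun;

/* Assumption: keys are sorted by (usr_id, comment_id), as in a B-tree.
 * runs[] and trailing[] must each have room for nkeys entries.
 * Returns the number of runs produced. */
static size_t compress_leading_column(const IndexKey *keys, size_t nkeys,
                                      PrefixRun *runs, int32_t *trailing)
{
    size_t nruns = 0;

    for (size_t i = 0; i < nkeys; i++) {
        if (nruns == 0 || runs[nruns - 1].usr_id != keys[i].usr_id) {
            runs[nruns].usr_id = keys[i].usr_id;
            runs[nruns].first = i;
            runs[nruns].count = 0;
            nruns++;
        }
        runs[nruns - 1].count++;
        trailing[i] = keys[i].comment_id;
    }
    return nruns;
}

/* Rebuild the i-th full key of a run from the shared leading value. */
static IndexKey run_get_key(const PrefixRun *run, const int32_t *trailing,
                            size_t i)
{
    IndexKey key;

    assert(i < run->count);
    key.usr_id = run->usr_id;
    key.comment_id = trailing[run->first + i];
    return key;
}
```

For the keys (1,1) ... (1,100), this stores the value 1 once instead of 100 times, at the cost of a small per-run header.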
Page compression
● Data independent
● Fits both heap and index pages that have standard layout
● Dictionary compression
● Dictionary is placed on a page
● Page in memory is compressed
● We can decompress each tuple separately
● Binary compatible
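The property "we can decompress each tuple separately" can be illustrated with a toy substitution scheme. This is an assumption for illustration, not the proposed on-disk format: frequent byte strings live in a per-page dictionary, and each tuple is encoded on its own by replacing matches with a two-byte code. DICT_ESCAPE, DictEntry and both functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DICT_ESCAPE      0xFF  /* marks a dictionary code or a literal 0xFF */
#define DICT_MAX_ENTRIES 255   /* codes 0..254; 0xFF is reserved */

typedef struct DictEntry {
    const uint8_t *bytes;
    size_t         len;
} DictEntry;

/* Compress one tuple against the page dictionary.
 * dst must have room for 2 * len bytes (worst case: all bytes are 0xFF).
 * Returns the compressed length. */
static size_t tuple_compress(const DictEntry *dict, size_t ndict,
                             const uint8_t *src, size_t len, uint8_t *dst)
{
    size_t in = 0, out = 0;

    assert(ndict <= DICT_MAX_ENTRIES);
    while (in < len) {
        size_t best = 0, best_len = 0;

        /* Find the longest dictionary entry matching at this position. */
        for (size_t d = 0; d < ndict; d++) {
            if (dict[d].len > best_len && dict[d].len <= len - in &&
                memcmp(src + in, dict[d].bytes, dict[d].len) == 0) {
                best = d;
                best_len = dict[d].len;
            }
        }
        if (best_len > 2) {
            /* A code costs two bytes, so only longer matches pay off. */
            dst[out++] = DICT_ESCAPE;
            dst[out++] = (uint8_t) best;
            in += best_len;
        } else {
            if (src[in] == DICT_ESCAPE)
                dst[out++] = DICT_ESCAPE;  /* literal 0xFF becomes FF FF */
            dst[out++] = src[in++];
        }
    }
    return out;
}

/* Decompress one tuple; needs only the dictionary, not other tuples.
 * Returns the raw tuple length. */
static size_t tuple_decompress(const DictEntry *dict, size_t ndict,
                               const uint8_t *src, size_t len, uint8_t *dst)
{
    size_t in = 0, out = 0;

    while (in < len) {
        uint8_t b = src[in++];

        if (b != DICT_ESCAPE) {
            dst[out++] = b;
            continue;
        }
        assert(in < len);
        b = src[in++];
        if (b == DICT_ESCAPE) {
            dst[out++] = DICT_ESCAPE;
        } else {
            assert(b < ndict);
            memcpy(dst + out, dict[b].bytes, dict[b].len);
            out += dict[b].len;
        }
    }
    return out;
}
```

Because every tuple is encoded only against the shared dictionary, reading one tuple never requires decompressing its neighbours. The trade-off is that the dictionary must stay valid for as long as any tuple on the page refers to it.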
Page layout
typedef struct PageHeaderData
{
    PageXLogRecPtr pd_lsn;
    uint16         pd_checksum;
    uint16         pd_flags;
    LocationIndex  pd_lower;
    LocationIndex  pd_upper;
    LocationIndex  pd_special;
    uint16         pd_pagesize_version;
    TransactionId  pd_prune_xid;
    ItemIdData     pd_linp[FLEXIBLE_ARRAY_MEMBER];
} PageHeaderData;
Dictionary
typedef struct PageHeaderData
{
    PageXLogRecPtr pd_lsn;
    uint16         pd_checksum;
    uint16         pd_flags;
    LocationIndex  pd_lower;
    LocationIndex  pd_upper;
    LocationIndex  pd_special;
    uint16         pd_pagesize_version;
    TransactionId  pd_prune_xid;
} PageHeaderData;

typedef struct DictionaryData
{
    int32 dictlength;
    /* Dictionary follows at the end of struct */
} DictionaryData;

/* array of line pointers follows the dictionary */
#define PageGetLinp(page) \
    ((char *) (page) + SizeOfPageHeaderData + PageGetDictSize(page))
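The offset arithmetic behind PageGetLinp can be sketched outside the PostgreSQL tree as follows. This is a standalone illustration with hypothetical names. It assumes the fixed header fields occupy 24 bytes (the sum of the field sizes listed above), that the dictionary size includes the dictlength field itself, and it ignores alignment of the line pointer array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE   8192
#define SKETCH_HEADER_SIZE 24  /* fixed page header fields, no pd_linp */

typedef struct SketchDictionary {
    int32_t dictlength;  /* bytes of dictionary payload after this field */
} SketchDictionary;

/* Zero a page and record the dictionary length right after the header. */
static void sketch_page_init(uint8_t *page, int32_t dictlength)
{
    SketchDictionary dict = { dictlength };

    memset(page, 0, SKETCH_PAGE_SIZE);
    memcpy(page + SKETCH_HEADER_SIZE, &dict, sizeof(dict));
}

/* Assumed meaning of PageGetDictSize: length field plus payload. */
static size_t sketch_dict_size(const uint8_t *page)
{
    SketchDictionary dict;

    memcpy(&dict, page + SKETCH_HEADER_SIZE, sizeof(dict));
    assert(dict.dictlength >= 0);
    return sizeof(SketchDictionary) + (size_t) dict.dictlength;
}

/* Byte offset at which the line pointer array would start. */
static size_t sketch_linp_offset(const uint8_t *page)
{
    return SKETCH_HEADER_SIZE + sketch_dict_size(page);
}
```

With a 100-byte dictionary the line pointers start at offset 128 instead of 24, so any code that indexes line pointers has to go through an accessor like this rather than a fixed struct member.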
Questions (1)
1. Can we create a copy of access/nbtree for experimental features?
2. Should the compression feature be on or off by default?
3. Any other thoughts about user settings?
4. Do we want to compress system relations?
Questions (2)
1. When should page compression be triggered?
    ● VACUUM FULL, CLUSTER TABLE
    ● CREATE INDEX
    ● VACUUM
    ● Before bt_split
2. How do we decide whether to compress the page again or go to the next page?
3. When should a page's dictionary be invalidated?
4. Where should we keep info about the raw tuple size?
    ● (tupleheader->t_size)
5. Should we compress the tuple header?
    ● Don't forget about hint bits
6. Alignment of compressed tuples?