Copyright © 2016 NTT DATA Corporation
03/17/2016 NTT DATA Corporation Masahiko Sawada
Introduction VACUUM, FREEZING, XID wraparound
A little about me
• Masahiko Sawada
• Twitter: @sawada_masahiko
• NTT DATA Corporation
• Database engineer
• PostgreSQL hacker
  • Core feature
  • pg_bigm (multi-byte full-text search module for PostgreSQL)
Contents
• VACUUM
• Visibility Map
• Freezing Tuple
• XID wraparound
• New VACUUM feature for 9.6
What is VACUUM?
VACUUM
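[Figure: a table holds rows (1, AAA), (2, BBB) and (3, CCC). UPDATE : BBB->bbb adds a new row version (2, bbb) and leaves the old version (2, BBB) behind as garbage. VACUUM starts while INSERT/DELETE/UPDATE continue concurrently (row (4, DDD) is inserted). When VACUUM is done, the old (2, BBB) version is gone, the table holds (1, AAA), (3, CCC), (2, bbb) and (4, DDD), and the freed space is tracked in the FSM.]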
• PostgreSQL's garbage collection feature
• Acquires a ShareUpdateExclusiveLock on the table, so concurrent reads and writes are not blocked
Why do we need to VACUUM?
• Recover or reuse disk space occupied by updated or deleted rows
• Update the data statistics used by the query planner
• Update the visibility map, which speeds up index-only scans
• Protect against loss of very old data due to XID wraparound
Evolution history of VACUUM
• v8.1 (2005): autovacuum
• v8.4 (2009): Visibility Map, Free Space Map
• v9.5 (2016): vacuumdb parallel option
• v9.6: !?
VACUUM Syntax
-- VACUUM whole database
=# VACUUM;
-- Multiple option, analyzing only col1 column
=# VACUUM FREEZE VERBOSE ANALYZE hoge (col1);
-- Multiple option with parentheses
=# VACUUM (FULL, ANALYZE, VERBOSE) hoge;
Visibility Map
Visibility Map
• Introduced in 8.4
• A bitmap for each table (1 bit per page)
• Each table relation can have a visibility map.
• Keeps track of which pages are all-visible.
• Pages whose bit is not set may contain garbage.
• For a 500GB table, the Visibility Map is less than 10MB.
[Figure: the table file (base/XXX/1234) with Block 0 to Block 4, and its Visibility Map file (base/XXX/1234_vm) holding one bit per block: 11001…]
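The size claim above can be checked with a little arithmetic. The following is a minimal sketch, assuming PostgreSQL's default 8 kB block size and ignoring the page headers inside the map file itself; the function name `visibility_map_bytes` is made up for this illustration.

```python
# Sketch: estimate the visibility map size for a table, assuming the default
# 8 kB block size and 1 bit per heap page (the layout described on this slide).
BLOCK_SIZE = 8192  # bytes per heap page; an assumption (PostgreSQL's default)


def visibility_map_bytes(table_bytes: int, bits_per_page: int = 1) -> int:
    """Approximate bytes of map data needed (hypothetical helper)."""
    pages = -(-table_bytes // BLOCK_SIZE)    # ceiling division: heap pages
    return -(-pages * bits_per_page // 8)    # bits -> bytes, rounded up


table = 500 * 1024**3                        # a 500 GB table
size = visibility_map_bytes(table)
print(f"{size / 1024**2:.1f} MB")            # 7.8 MB, i.e. less than 10 MB
```

Passing `bits_per_page=2` gives the size for a map that stores two bits per page.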
State transition of Visibility Map bit
• 0 (NOT all-visible) → 1 (all-visible): VACUUM
• 1 (all-visible) → 0 (NOT all-visible): INSERT, UPDATE, DELETE
How does VACUUM actually work?
• VACUUM works in two phases:
1. Scan the table to collect the TIDs of garbage tuples
2. Reclaim the garbage (table, index)
[Figure: 1st phase: scan the table and collect garbage TIDs into maintenance_work_mem. 2nd phase: reclaim the garbage from the index and the table.]
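The two phases can be pictured with a toy model. This is only a sketch under simplifying assumptions: the heap and the index are plain Python dictionaries, a tuple is "dead" when a flag says so, and `lazy_vacuum` is a hypothetical name, not a PostgreSQL function.

```python
# Toy model of the two phases; not PostgreSQL's actual data structures.
from typing import Dict, List, Tuple

TID = Tuple[int, int]  # (block number, offset within block)


def lazy_vacuum(heap: Dict[TID, dict], index: Dict[str, List[TID]]) -> List[TID]:
    # 1st phase: scan the table and remember the TIDs of dead tuples.
    # In PostgreSQL this list is kept in maintenance_work_mem.
    dead_tids = [tid for tid, tup in sorted(heap.items()) if tup["dead"]]
    dead = set(dead_tids)

    # 2nd phase: remove index entries that point at dead tuples,
    # then remove the dead tuples from the table itself.
    for key in list(index):
        index[key] = [tid for tid in index[key] if tid not in dead]
        if not index[key]:
            del index[key]
    for tid in dead_tids:
        del heap[tid]
    return dead_tids


heap = {
    (0, 1): {"val": "AAA", "dead": False},
    (0, 2): {"val": "BBB", "dead": True},   # old version left by UPDATE BBB->bbb
    (0, 3): {"val": "CCC", "dead": False},
    (0, 4): {"val": "bbb", "dead": False},
}
index = {"AAA": [(0, 1)], "BBB": [(0, 2)], "CCC": [(0, 3)], "bbb": [(0, 4)]}
removed = lazy_vacuum(heap, index)
print(removed)  # [(0, 2)]
```

Collecting the TIDs first lets the index be cleaned in one pass over all dead tuples instead of one index lookup per dead tuple.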
Performance improvement point of VACUUM
• Scans table pages one by one.
• VACUUM can skip pages only if there are more than 32 consecutive all-visible pages.
• Stores the garbage tuple IDs in maintenance_work_mem.
[Figure: when all-visible blocks form long consecutive runs, VACUUM can skip them efficiently (FAST!); when all-visible and not all-visible blocks are interleaved, VACUUM needs to scan all pages (SLOW!!).]
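The skipping rule can be sketched as follows. This is a simplified illustration, not the real scan loop: it only applies the "more than 32 consecutive all-visible pages" rule from the slide, and the names `SKIP_PAGES_THRESHOLD` and `pages_to_scan` are chosen for this example.

```python
from typing import List

SKIP_PAGES_THRESHOLD = 32  # run length taken from the slide


def pages_to_scan(all_visible: List[bool]) -> List[int]:
    """Block numbers a simplified lazy VACUUM would read (hypothetical helper)."""
    scan: List[int] = []
    n = len(all_visible)
    blk = 0
    while blk < n:
        if not all_visible[blk]:
            scan.append(blk)               # may contain garbage: must be read
            blk += 1
            continue
        # Measure the run of consecutive all-visible pages starting at blk.
        end = blk
        while end < n and all_visible[end]:
            end += 1
        if end - blk <= SKIP_PAGES_THRESHOLD:
            scan.extend(range(blk, end))   # run too short: read it anyway
        blk = end                          # long runs are skipped entirely
    return scan


interleaved = [i % 2 == 0 for i in range(100)]  # all-visible pages scattered
clustered = [True] * 90 + [False] * 10          # one long all-visible run
print(len(pages_to_scan(interleaved)))          # 100 (nothing can be skipped)
print(pages_to_scan(clustered))                 # blocks 90 to 99 only
```

Skipping only long runs keeps the reads sequential; jumping over a handful of pages would save little I/O.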
XID wraparound and freezing tuple
What is the transaction ID (XID)?
• Every tuple has two transaction IDs.
  • xmin : Inserted XID
  • xmax : Deleted/Updated XID
 xmin | xmax | col
------+------+-----
 1810 | 1820 | AAA
 1812 |    0 | BBB
 1814 | 1830 | CCC
 1820 |    0 | XXX
In the REPEATABLE READ transaction isolation level:
• Transaction 1815 can see ‘AAA’, ‘BBB’ and ‘CCC’.
• Transaction 1821 can see ‘BBB’, ‘CCC’ and ‘XXX’.
• Transaction 1831 can see ‘BBB’ and ‘XXX’.
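The three visibility results above can be reproduced with a very small model. It is a sketch under a strong assumption: every transaction with a smaller XID had already committed when the reading transaction took its snapshot, and wraparound is ignored. `visible_rows` is a hypothetical helper, not PostgreSQL's visibility code.

```python
ROWS = [  # (xmin, xmax, col) from the slide; xmax == 0 means "not deleted"
    (1810, 1820, "AAA"),
    (1812, 0, "BBB"),
    (1814, 1830, "CCC"),
    (1820, 0, "XXX"),
]


def visible_rows(xid: int) -> list:
    """Rows visible to transaction `xid` under the assumption stated above."""
    out = []
    for xmin, xmax, col in ROWS:
        inserted = xmin < xid                  # inserted in the past
        deleted = xmax != 0 and xmax < xid     # deleted/updated in the past
        if inserted and not deleted:
            out.append(col)
    return out


for xid in (1815, 1821, 1831):
    print(xid, visible_rows(xid))
```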
What is the transaction ID (XID)?
• Can represent up to 4 billion transactions (uint32).
• XID space is circular with no endpoint.
• There are 2 billion XIDs that are “older”, 2 billion XIDs that are “newer”.
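[Figure: the XID space drawn as a circle from 0 to 2^32-1. Relative to the current XID, one half holds the older XIDs (visible) and the other half holds the newer XIDs (not visible).]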
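The "older" and "newer" halves follow from comparing XIDs with modulo-2^32 arithmetic. Below is a minimal sketch of such a comparison; `xid_precedes` is a name chosen here, and PostgreSQL's special low XIDs, which are compared differently, are ignored.

```python
def xid_precedes(a: int, b: int) -> bool:
    """True if XID a is older than XID b in the circular 32-bit space."""
    diff = (a - b) & 0xFFFFFFFF      # unsigned 32-bit difference
    return diff >= 0x80000000        # equivalent to a signed 32-bit diff < 0


print(xid_precedes(100, 200))                # True: 100 is older than 200
print(xid_precedes(2**32 - 10, 5))           # True: still older after the counter wraps
print(xid_precedes(100, 100 + 2**31 + 1))    # False: more than 2 billion apart
```

The last call shows the danger: once the current XID is more than 2^31 ahead, XID 100 no longer compares as older.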
What is the XID wraparound?
[Figure: three snapshots of the XID circle as the current XID advances. While XID 100 lies in the older half it is visible, and it is still visible as the current XID moves on. Once the current XID is more than 2 billion ahead, XID 100 falls into the newer half and becomes not visible.]
• PostgreSQL could lose very old data due to XID wraparound.
• It can happen when a tuple is more than 2 billion transactions old.
• On a 200 TPS system, that takes about 120 days.
• Note that it can happen even on an INSERT-only table.
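The 200 TPS figure is simple arithmetic. A rough sketch, assuming every transaction consumes one XID and taking 2^31 as the "2 billion" horizon (`days_until_wraparound` is a made-up helper name):

```python
def days_until_wraparound(tps: float, horizon: int = 2**31) -> float:
    """Days needed to consume `horizon` XIDs at a steady `tps` (assumed rate)."""
    seconds = horizon / tps
    return seconds / 86_400          # seconds per day


print(round(days_until_wraparound(200)))   # 124, roughly the 120 days quoted above
```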
Freezing tuple
• Mark the tuple as “Frozen”.
• Marking a tuple as “frozen” means that it will appear to be “in the past” to all transactions.
• Old tuples must be frozen *before* the XID advances by 2 billion.
[Figure: the same three snapshots of the XID circle, but the tuple inserted by XID 100 is marked as FREEZE while it is still in the older half, so it is still visible after the current XID has advanced more than 2 billion.]
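The effect of freezing can be sketched by adding a flag to a circular XID comparison. This is an illustration only: the boolean `frozen` stands in for however the server records the frozen state, commit status is ignored, and both function names are hypothetical.

```python
def xid_precedes(a: int, b: int) -> bool:
    """True if XID a is older than XID b in the circular 32-bit space."""
    return ((a - b) & 0xFFFFFFFF) >= 0x80000000


def xmin_in_the_past(xmin: int, current_xid: int, frozen: bool = False) -> bool:
    """A frozen tuple is 'in the past' for everyone, whatever its xmin."""
    return frozen or xid_precedes(xmin, current_xid)


soon = 100 + 1_000_000_000                 # 1 billion transactions later
late = (100 + 2**31 + 1000) % 2**32        # more than 2 billion later

print(xmin_in_the_past(100, soon))                # True
print(xmin_in_the_past(100, late))                # False: the tuple would be lost
print(xmin_in_the_past(100, late, frozen=True))   # True: freezing keeps it visible
```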
To prevent old data loss due to XID wraparound
• Emits a WARNING log message when 10 million transactions remain.
• Refuses to assign new XIDs when 1 million transactions remain.
• Runs anti-wraparound VACUUM automatically.
Anti-wraparound VACUUM
• Every table has a pg_class.relfrozenxid value.
• All tuples inserted by XIDs older than relfrozenxid have been marked as “Frozen”.
• Same as a forcibly executed VACUUM *FREEZE*.
[Figure: XID timeline starting at pg_class.relfrozenxid, with the current XID advancing along it:
 relfrozenxid + vacuum_freeze_table_age (default 150 million): VACUUM could do a whole table scan
 relfrozenxid + autovacuum_freeze_max_age (default 200 million): anti-wraparound VACUUM is launched forcibly
 relfrozenxid + 2 billion: XID wraparound]
Anti-wraparound VACUUM
At this XID, a normal lazy VACUUM is executed.
[Figure: the same XID timeline; the current XID has not yet reached relfrozenxid + vacuum_freeze_table_age (default 150 million).]
Anti-wraparound VACUUM
If you execute VACUUM at this XID, an anti-wraparound VACUUM (whole table scan) is executed.
[Figure: the same XID timeline; the current XID lies between relfrozenxid + vacuum_freeze_table_age (default 150 million) and relfrozenxid + autovacuum_freeze_max_age (default 200 million).]
Anti-wraparound VACUUM
Once the current XID passes this point, autovacuum forcibly launches an anti-wraparound VACUUM.
[Figure: the same XID timeline; the current XID has passed relfrozenxid + autovacuum_freeze_max_age (default 200 million).]
Anti-wraparound VACUUM
After an anti-wraparound VACUUM, the relfrozenxid value is updated.
[Figure: the new pg_class.relfrozenxid trails the current XID by vacuum_freeze_min_age (default 50 million).]
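The thresholds walked through on the preceding slides can be summarised as a small decision function. This is a sketch only: the table age is taken as current XID minus relfrozenxid, the constants are the defaults quoted on the slides (using the PostgreSQL parameter name autovacuum_freeze_max_age), and `vacuum_behaviour` is a hypothetical helper that ignores any other rules the server applies.

```python
VACUUM_FREEZE_TABLE_AGE = 150_000_000      # default quoted on the slides
AUTOVACUUM_FREEZE_MAX_AGE = 200_000_000    # default quoted on the slides


def vacuum_behaviour(table_age: int) -> str:
    """table_age = current XID - relfrozenxid (wraparound ignored here)."""
    if table_age > AUTOVACUUM_FREEZE_MAX_AGE:
        return "autovacuum forcibly launches an anti-wraparound VACUUM"
    if table_age > VACUUM_FREEZE_TABLE_AGE:
        return "a VACUUM scans the whole table (anti-wraparound VACUUM)"
    return "a VACUUM runs as a normal lazy VACUUM"


for age in (100_000_000, 170_000_000, 250_000_000):
    print(f"{age:>11,}: {vacuum_behaviour(age)}")
```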
Anti-wraparound VACUUM is too slow
• Scanning the whole table is always required to advance relfrozenxid.
• This is because a lazy VACUUM may skip pages that hold visible but not yet frozen tuples.
 Visibility Map | Block # | xmin
----------------+---------+----------------
 0              | 0       | FREEZE, FREEZE
 1              | 1       | FREEZE, FREEZE
 1              | 2       | 101, 102, 103
 0              | 3       | Garbage, 104
[Figure: arrows mark the blocks read by a normal VACUUM and by an anti-wraparound VACUUM.]
How can we improve anti-wraparound VACUUM?
Approaches
• Freeze Map
  • Track which pages need to be frozen.
• 64bit XID
  • Change the size of XID from 32 bits to 64 bits.
• LSN to XID map
  • Map XIDs to LSNs.
Freeze Map
• New feature for 9.6.
• Improve VACUUM FREEZE, anti-wraparound VACUUM performance.
• Brings functionality needed for very large databases (VLDB).
Idea - Add an additional bit
• Not adding new map.
• Add an additional bit to the Visibility Map.
• The additional bit tracks which pages are all-frozen.
• An all-frozen page should be all-visible as well.
[Figure: Visibility Map bits 10110010, where each page has two bits: all-visible and all-frozen.]
State transition of two bits
Bits: (all-visible, all-frozen)
• 00 → 10: VACUUM
• 00 → 11: VACUUM FREEZE
• 10 → 11: VACUUM FREEZE
• 10 → 00: UPDATE/DELETE/INSERT
• 11 → 00: UPDATE/DELETE/INSERT
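The two-bit transitions can be written down as a tiny state machine. The sketch below follows the slide's notation (first bit all-visible, second bit all-frozen), which is not necessarily the on-disk bit layout, and it assumes the page-level conditions for setting each bit are met; the function names are made up.

```python
ALL_VISIBLE = 0b10   # first bit in the slide's notation
ALL_FROZEN = 0b01    # second bit in the slide's notation


def after_vacuum(bits: int, freeze: bool = False) -> int:
    """State of a page's bits after VACUUM (or VACUUM FREEZE) processed it."""
    if freeze:
        return ALL_VISIBLE | ALL_FROZEN    # all-frozen implies all-visible
    return bits | ALL_VISIBLE


def after_modification(bits: int) -> int:
    """Any UPDATE/DELETE/INSERT on the page clears both bits."""
    return 0b00


state = 0b00
state = after_vacuum(state)                # 00 -> 10
print(format(state, "02b"))
state = after_vacuum(state, freeze=True)   # 10 -> 11
print(format(state, "02b"))
state = after_modification(state)          # 11 -> 00
print(format(state, "02b"))
```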
Idea - Improve anti-wraparound performance
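• VACUUM can skip all-frozen pages even when an anti-wraparound VACUUM is required.
 visible | frozen | Block # | xmin
---------+--------+---------+----------------
 1       | 0      | 0       | FREEZE, FREEZE
 1       | 1      | 1       | FREEZE, FREEZE
 1       | 0      | 2       | 101, 102, 103
 0       | 0      | 3       | Garbage, 104
[Figure: arrows mark the blocks read by a normal VACUUM and by an anti-wraparound VACUUM.]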
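Using the four blocks from the table above, the idea can be sketched like this. It is a simplified illustration: a normal VACUUM is assumed to skip every all-visible page and an anti-wraparound VACUUM every all-frozen page, the consecutive-page threshold is ignored, and `blocks_to_scan` is a hypothetical helper.

```python
PAGES = [  # (block number, all_visible, all_frozen), copied from the slide
    (0, 1, 0),
    (1, 1, 1),
    (2, 1, 0),
    (3, 0, 0),
]


def blocks_to_scan(anti_wraparound: bool) -> list:
    if anti_wraparound:
        # Only pages not yet known to be all-frozen have to be read.
        return [blk for blk, _, frozen in PAGES if not frozen]
    # A normal VACUUM only needs the pages that may contain garbage.
    return [blk for blk, visible, _ in PAGES if not visible]


print(blocks_to_scan(anti_wraparound=False))   # [3]
print(blocks_to_scan(anti_wraparound=True))    # [0, 2, 3]
```

Without the all-frozen bit, the anti-wraparound case would have to read all four blocks.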
Pros/Cons
• Pros
• Dramatic performance improvement for VACUUM FREEZE.
• Read-only tables. (future)
• Cons
• Doubles the size of the Visibility Map.
No More Full-Table Vacuums
http://rhaas.blogspot.jp/2016/03/no-more-full-table-vacuums.html#comment-form
Other work
Vacuum Progress Checker
• New feature for 9.6 (under review).
• Report progress information of VACUUM via system view.
Idea
• Add new system view.
• Report detailed, meaningful progress information for each process running VACUUM.
postgres(1)=# SELECT * FROM pg_stat_vacuum_progress ;
-[ RECORD 1 ]-------+--------------
pid                 | 55513
relid               | 16384
phase               | Scanning Heap
total_heap_blks     | 451372
current_heap_blkno  | 77729
total_index_pages   | 559364
scanned_index_pages | 559364
index_scan_count    | 1
percent_complete    | 17
Future works
• Read Only Table
• Report progress information for other maintenance commands.
PostgreSQL git repository
git://git.postgresql.org/git/postgresql.git
VERBOSE option
=# VACUUM VERBOSE hoge;
INFO: vacuuming "public.hoge"
INFO: scanned index "hoge_idx1" to remove 1000 row versions
DETAIL: CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO: "hoge": removed 1000 row versions in 443 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "hoge_idx1" now contains 100000 row versions in 276 pages
DETAIL: 1000 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "hoge": found 1000 removable, 100000 nonremovable row versions in 447 out of 447 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.05u sec elapsed 0.05 sec.
VACUUM
FREEZE option
• Aggressive freezing of tuples
• Same as running normal VACUUM with vacuum_freeze_min_age = 0 and vacuum_freeze_table_age = 0
• Always scans the whole table
ANALYZE option
• Do ANALYZE after VACUUM
• Update data statistics used by the planner
-- VACUUM and ANALYZE with VERBOSE option
=# VACUUM ANALYZE VERBOSE hoge;
INFO: vacuuming "public.hoge"
:
INFO: analyzing "public.hoge"
INFO: "hoge": scanned 452 of 452 pages, containing 100000 live rows and 0 dead rows; 30000 rows in sample, 100000 estimated total rows
VACUUM
FULL option
• Completely different from lazy VACUUM
• Similar to CLUSTER
• Acquires an AccessExclusiveLock
• Takes much longer than lazy VACUUM
• Needs extra disk space: at most twice the table size in total.
• Rebuilds the table and indexes
• Freezes tuples during VACUUM FULL (9.3 and later)