Enhancing recovery and ramp-up performance of DBMS when using an SSD buffer-pool extension. Wang Jiangtao, 2013-10-18


Page 1: Enhancing recovery and ramp-up performance of DBMS when using an SSD buffer-pool extension

Wang Jiangtao

2013-10-18

Page 2: Outline

• Introduction
• SSD-based extension buffer
• Enhancing recovery by SSD
• Two related works
  – Enhancing recovery using… [DaMoN 2011]
  – Fast peak-to-peak … [ICDE 2013]
• Summary

Page 3: Evolution of HDD

• Hard disk drive (HDD)
  – Access rates have been flat for ~13 years
  – Disk density growth projection is bleak
  – Capacity growth is now about to flatten significantly
  – Power savings not realized

Page 4: Solid State Drive

• Solid State Drive (SSD)
  – A semiconductor device
  – No mechanical components
  – 3D NAND flash memory
• Technical merits
  – High IOPS (>50,000)
  – High bandwidth (>500 MB/s)
  – Low power: 0.06 W (idle) to 2.4 W (active)
  – Shock resistance

Page 5: Integrating SSD and HDD

• Background
  – Performance depends heavily on memory, I/O bandwidth, and access latency (e.g., web servers)
  – Deploying SSD at full storage capacity is not going to be a reality
  – Price ($/GB): RAM >> SSD > disk
  – On SSD, read performance >> write performance
  – Only a small amount of data is hot!
  – Cost-effectiveness is the primary factor for large data centers
  – ……


Page 7: SSD as cache buffer

• Basic framework

1. B. Debnath et al. FlashStore: High Throughput Persistent Key-Value Store. VLDB 2010.
2. J. Do et al. Turbocharging DBMS Buffer Pool Using SSDs. SIGMOD 2011.
3. W.-H. Kang et al. Flash-based Extended Cache for Higher Throughput and Faster Recovery. VLDB 2012.
4. J. Do et al. Fast Peak-to-Peak Behavior with SSD Buffer Pool. ICDE 2013.

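Page 8: Applications in Industry

• Intel (Differentiated Storage Services)
  – Intel's SSD caching solution temporarily mirrors the required files into an SSD cache.
• Apple (Fusion Drive)
  – Fusion Drive combines an SSD and a hard disk
  – Frequently used apps, documents, photos, and other files are stored on flash
  – All writes go to the SSD; infrequently used content is moved to the hard disk
• Oracle Exadata (Database Machine)
  – Combines scalable servers and storage, InfiniBand networking, smart storage, PCI flash, smart in-memory caching, and hybrid columnar compression into an integrated hardware and software data-management system.
  – Smart Flash Cache transparently caches frequently accessed hot data in high-speed solid-state storage to address the disk random I/O bottleneck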


Page 10: Recovery for SSD-based cache system

• Problem definition
  – A small amount of SSD can absorb a large fraction of random I/O.
  – Restarting the DBMS after a shutdown or a crash takes a long time.
  – There has not been much emphasis on exploiting the persistence of SSD.

Page 11: Recovery for SSD-based cache system

• Challenge
  – How to improve recovery performance without negatively impacting peak performance.
  – How to ensure the correctness of the DBMS when executing the recovery algorithm.


Page 13: DaMoN 2011

Page 14: Motivation

• Recovery is itself a random-I/O-intensive process.
• The pages that need to be read and written during recovery may be scattered over various parts of the disk.
• Preserve the state of the SSD buffer pool so that it can be used during crash recovery.
• Provide a warm buffer pool restart.

Page 15: TAC (VLDB 2010)

• TAC
  – Write-through
  – Temperature-based data prefetch

M. Canim et al. SSD Bufferpool Extensions for Database Systems. VLDB 2010.

Page 16: Implementing Recovery

• Metadata persistence
  – Store some SSD buffer pool metadata on the persistent SSD storage
• Mapping information synchronization
  – When a new page is admitted to the SSD buffer pool and an old page is evicted, the slot table must be updated.
  – When a dirty page is evicted from the RAM-resident buffer pool, no modifications of the slot table are required.

Page 17: Recovery for TAC

• Correctness for TAC
  – The slot is initially invalidated
  – The slot is updated after the write is finished
  – As a consequence, some valid data on the SSD may be missed after a restart
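The ordering on this slide can be sketched in Python as follows. This is a minimal illustration, not the TAC implementation: `PersistentSlotTable`, `admit_page`, `recover`, and the `crash_after_step` parameter are hypothetical names introduced here, and a dictionary stands in for durable SSD storage.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

INVALID = None  # marker for a slot whose content must not be trusted


@dataclass
class PersistentSlotTable:
    """Hypothetical stand-in for the slot table stored on the SSD."""
    slots: Dict[int, Optional[int]] = field(default_factory=dict)  # slot -> page id

    def persist(self, slot: int, page_id: Optional[int]) -> None:
        # A real system would issue a durable write here.
        self.slots[slot] = page_id


def admit_page(table, ssd_pages, slot, page_id, data, crash_after_step=None):
    """Replace the page in `slot`, following invalidate -> write -> update."""
    table.persist(slot, INVALID)        # step 1: invalidate the slot first
    if crash_after_step == 1:
        return                          # simulated crash
    ssd_pages[slot] = (page_id, data)   # step 2: write the page to the SSD
    if crash_after_step == 2:
        return                          # simulated crash
    table.persist(slot, page_id)        # step 3: publish the new mapping


def recover(table):
    """After a restart, only slots with a valid mapping are reused."""
    return {s: p for s, p in table.slots.items() if p is not INVALID}
```

With this ordering, a crash between steps 2 and 3 leaves a valid page on the SSD that recovery ignores (the "missed" data), but recovery never trusts a slot whose mapping could be stale.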

Page 18: Experiment Results

• Experiment setup
  – 500 warehouses (TPC-C)
  – The RAM was kept at 2.0% of the database size.

[Figures: impact of metadata writes; impact on logging]

Page 19: Experiment Results

[Figures: crash performance; restart performance]

Page 20: Summary for recovery in TAC

• The experiments are thorough, and the analysis is in-depth.
• However, the metadata file is small (23 MB), and the size ratio between SSD and RAM is 3 (3.6 GB / 1.2 GB).
• The cost of synchronization is relatively low.

Page 21: ICDE 2013

Page 22: Motivation

• With an SSD buffer pool, a DBMS still treats the disks as the permanent “home” of data.
• Such schemes have a long “peak-to-peak interval” when restarting a DBMS.
• We need a fast mechanism to reduce the restart and ramp-up time.

Page 23: Background: Two SSD buffer-pool extension designs

• DW (dual-write)
  – Write-through
• LC (lazy-cleaning)
  – Write-back

J. Do et al. Turbocharging DBMS Buffer Pool Using SSDs. SIGMOD 2011.
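The difference between the write-through design (DW) and the write-back design (LC) can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not the design from the cited paper: `Storage`, `evict_dirty_page`, and `lazy_clean` are hypothetical names, and dictionaries stand in for the SSD and the disks.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Storage:
    disk: Dict[int, bytes] = field(default_factory=dict)
    ssd: Dict[int, bytes] = field(default_factory=dict)
    ssd_dirty: Set[int] = field(default_factory=set)  # SSD pages newer than disk


def evict_dirty_page(store: Storage, policy: str, page_id: int, data: bytes) -> None:
    """Handle a dirty page leaving the RAM buffer pool."""
    if policy == "DW":
        # Write-through: SSD and disk both receive the page.
        store.ssd[page_id] = data
        store.disk[page_id] = data
    elif policy == "LC":
        # Write-back: only the SSD is written; the disk copy is now stale.
        store.ssd[page_id] = data
        store.ssd_dirty.add(page_id)
    else:
        raise ValueError(f"unknown policy: {policy}")


def lazy_clean(store: Storage) -> None:
    """Background cleaner for LC: copy dirty SSD pages to disk."""
    for page_id in sorted(store.ssd_dirty):
        store.disk[page_id] = store.ssd[page_id]
    store.ssd_dirty.clear()
```

Under LC the SSD can hold the only up-to-date copy of a page, which is why losing the SSD mapping at restart is more costly for LC than for DW.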

Page 24: Background: Two SSD buffer-pool extension designs

• Data structure
  – SSD buffer table

J. Do et al. Turbocharging DBMS Buffer Pool Using SSDs. SIGMOD 2011.


Page 26: Background: Recovery in SQL Server 2012

• Data structure
  – Transaction log
      Update log (pageID, prepageLSN, …), BUF_WRITE log, …
  – Dirty page table
      Stores information about dirty pages in main memory (pageID, recLSN, lastLSN, …)
  – Transaction table
      Stores information about active transactions (beginLSN, endLSN, …)
  – ……
• Checkpoint

Page 27: Background: Recovery in SQL Server

• Recovery
  – Analysis phase
      Build dirty page table
      Build transaction table
      Build lock table
      ……
  – Redo phase
  – Undo phase
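As a rough illustration of the analysis phase, the following Python sketch rebuilds a dirty page table and a set of active transactions from a log scan. It is a generic textbook-style simplification, not SQL Server's algorithm: the record layout, the `BEGIN`/`COMMIT`/`ABORT`/`UPDATE` kinds, and the function name `analysis_pass` are assumptions made here, and a `BUF_WRITE` record is assumed to mean that the page has reached disk. The lock table is omitted.

```python
from typing import Dict, List, Set, Tuple

# Each log record is (LSN, kind, transaction id or None, page id or None).
LogRecord = Tuple[int, str, object, object]


def analysis_pass(log: List[LogRecord]) -> Tuple[Dict[int, int], Set[int]]:
    """Scan the log once and rebuild the dirty page and transaction tables."""
    dirty_pages: Dict[int, int] = {}  # pageID -> recLSN
    active_txns: Set[int] = set()
    for lsn, kind, txn, page in log:
        if kind == "BEGIN":
            active_txns.add(txn)
        elif kind in ("COMMIT", "ABORT"):
            active_txns.discard(txn)
        elif kind == "UPDATE":
            active_txns.add(txn)
            # recLSN is the first LSN that dirtied the page, so keep the earliest.
            dirty_pages.setdefault(page, lsn)
        elif kind == "BUF_WRITE":
            # Assumed meaning: the page was written to disk and is clean again.
            dirty_pages.pop(page, None)
    return dirty_pages, active_txns
```

In this model the redo phase would start from the smallest recLSN left in the dirty page table, and the undo phase would roll back the transactions that are still active.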

Page 28: Restart design

• Some pitfalls in using the SSD after a restart
  – Different versions of the data on SSD and disk
      In DW, delay modifying the FC until both the SSD write and the disk write have completed.
      In LC, a BUF_WRITE log record is generated after the lazy cleaner finishes copying a dirty SSD page to the disks.
      In LC, oldestDirtyLSN is the oldest recLSN of the dirty pages in RAM and in the SSD buffer pool.
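The LC rule for oldestDirtyLSN amounts to taking a minimum over two sets of dirty pages. A minimal Python sketch, assuming each dirty page table is represented as a hypothetical `pageID -> recLSN` dictionary:

```python
from typing import Dict, Optional


def oldest_dirty_lsn(ram_dirty: Dict[int, int],
                     ssd_dirty: Dict[int, int]) -> Optional[int]:
    """Return the smallest recLSN over dirty pages in RAM and in the SSD buffer pool."""
    rec_lsns = list(ram_dirty.values()) + list(ssd_dirty.values())
    return min(rec_lsns) if rec_lsns else None
```

For example, `oldest_dirty_lsn({1: 120}, {2: 45, 3: 300})` yields 45: a dirty SSD page that has not yet been copied to disk holds the value back even when every dirty page in RAM is newer.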

Page 29: MMR design

• Main idea
  – Stores the mapping table in SSD
  – Synchronously updates the mapping table
• Hardening the FC fields
  – State, pageID, lastusetime, nexttolastusetime
• When to harden
  – When a clean SSD frame is about to be replaced, flush the state change.
  – Minimize the number of flushes
• Recover the SSD buffer table
  – Recover the state of each FC (FREE, CLEAN, or DIRTY)
  – Rebuild the data structures
  – Recover the recLSN of each FC after the analysis phase

Page 30: LBR design

• Main idea
  – Checkpoint the SSD buffer table during a DBMS checkpoint
  – Log updates to the SSD buffer table through SSD log records
  – Define the protocol to checkpoint, log, and recover
• Hardening the FC fields
  – State, pageID, lastusetime, nexttolastusetime
• SSD log records
  – SSD_CHKPT: hardens the states of every 64 FCs
  – SSD_WRITE_INVALIDATE: logged when a clean SSD page is overwritten because there is no free SSD frame available
  – SSD_POST_WRITE: logged after a page is written to SSD
  – SSD_LAZY_CLEANED: logged after a dirty SSD page is cleaned

Page 31: LBR design

• When to harden
  – Only the SSD_PRE_WRITE_INVALIDATE log record must be flushed to disk before the thread that generates the log record can continue.
  – Group writing optimization
• Recovery
  – SSD_CHKPT: if an FC is DIRTY, recover its recLSN field and update the SSD hash table
  – SSD_WRITE_INVALIDATE: invalidate the corresponding FC
  – SSD_POST_WRITE: processed the same way as an SSD_CHKPT log record
  – SSD_LAZY_CLEANED: the FC state is changed from LAZYCLEANING to CLEAN
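The recovery rules listed on this slide can be read as a replay loop over the SSD log records. The Python sketch below is an illustration under simplifying assumptions, not the LBR implementation: each record is a hypothetical dictionary describing a single FC (a real SSD_CHKPT record covers 64 FCs), and `FC`, `replay_ssd_log`, and the field names are introduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

FREE, CLEAN, DIRTY, LAZYCLEANING = "FREE", "CLEAN", "DIRTY", "LAZYCLEANING"


@dataclass
class FC:
    state: str = FREE
    page_id: Optional[int] = None
    rec_lsn: Optional[int] = None


def replay_ssd_log(num_frames: int, records: List[dict]):
    """Rebuild the SSD buffer table and the SSD hash table from SSD log records."""
    table = [FC() for _ in range(num_frames)]
    hash_table: Dict[int, int] = {}  # pageID -> frame index
    for rec in records:
        kind, frame = rec["kind"], rec["frame"]
        fc = table[frame]
        if kind in ("SSD_CHKPT", "SSD_POST_WRITE"):
            # Both record kinds restore the FC and register it in the hash table.
            fc.state, fc.page_id = rec["state"], rec["page_id"]
            fc.rec_lsn = rec.get("rec_lsn") if fc.state == DIRTY else None
            if fc.state != FREE:
                hash_table[fc.page_id] = frame
        elif kind == "SSD_WRITE_INVALIDATE":
            # The frame was about to be overwritten, so its old content is dropped.
            hash_table.pop(fc.page_id, None)
            table[frame] = FC()
        elif kind == "SSD_LAZY_CLEANED":
            # The lazy cleaner finished copying the page to disk.
            fc.state, fc.rec_lsn = CLEAN, None
    return table, hash_table
```

Replaying the records in log order means the last record for a frame decides its recovered state.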

Page 32: LVR design

• Main idea
  – Asynchronously harden the SSD buffer table
  – Deal with invalid SSD buffer table records recovered from the most recent flush
• Ensure two properties
  – The database should be consistent if the design chooses to reuse a page in the SSD buffer pool upon a restart:
      the pageID of an FC may differ from the actual SSD page
  – The database should be consistent if the design chooses to discard a page in the SSD buffer pool upon a restart, even if the SSD page is newer than the disk version:
      oldestDirtyLSN

Page 33: LVR design

• Hardening the FC fields
  – State, pageID, lastUseTime, NextTolastUseTime, blank, beforeHardeningLSN
• The FC flusher thread
  – Repeatedly scans the SSD buffer table in chunks and hardens the FCs
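A small Python sketch may help picture the chunked flusher and why a recovered FC has to be checked before its SSD page is reused. It is a loose illustration only: `flush_chunk` and `verify_on_first_use` are hypothetical names, the hardened table is an in-memory list standing in for durable storage, the live table is assumed to be non-empty, and the check compares pageIDs only, ignoring the LSN fields such as beforeHardeningLSN.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass
class FC:
    state: str = "FREE"
    page_id: Optional[int] = None


def flush_chunk(live: List[FC], hardened: List[FC], start: int, chunk: int) -> int:
    """Harden one chunk of the live table; return where the next chunk starts."""
    end = min(start + chunk, len(live))
    for i in range(start, end):
        hardened[i] = replace(live[i])  # copy, standing in for a durable write
    return end % len(live)              # wrap around after a complete pass


def verify_on_first_use(fc: FC, ssd_page: Tuple[int, bytes]) -> bool:
    """Reuse the SSD page only if it really holds the page the FC claims."""
    stored_page_id, _ = ssd_page
    return fc.state != "FREE" and fc.page_id == stored_page_id
```

Because hardening is asynchronous, a frame can be overwritten on the SSD after its FC was last hardened; the pageID comparison catches that case, and the stale FC is then discarded rather than trusted.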

Page 34: LVR design

• Checkpoint
  – Make sure that the FC flusher thread finishes a complete pass of hardening the SSD buffer table during a checkpoint.
• Recovering from shutdown

Page 35: LVR design

• Checkpoint
  – Make sure that the FC flusher thread finishes a complete pass of hardening the SSD buffer table during a checkpoint.
• Recovering from crash

Page 36: Experiment results

• Experiment setup
  – 24 GB RAM, 140 GB SSD, 200 GB database size
  – SQL Server 2012
  – Dirty fraction: 20%
• Throughput after restart

[Figures: TPC-C; TPC-E]

Page 37: Experiment results

• TPC-C evaluation
  – Peak-to-peak interval

[Figures: restarting from a shutdown; restarting from a crash]

Page 38: Experiment results

• TPC-E evaluation
  – Peak-to-peak interval

[Figures: restarting from a shutdown; restarting from a crash]


Page 40: Summary

• Basic requirements
  – Ensure the consistency and correctness of DBMSs
  – Minimize the cost of hardening mapping information
  – Design different recovery algorithms for different cache policies
• Various pitfalls
  – Log vs. metadata files
      Log-based schemes require more space
      High complexity when designing the recovery algorithm

Page 41: Summary

• Emerging memory technology
  – Harden metadata to PCM synchronously
  – Scan PCM and rebuild the mapping table for SSD
• Design principles
  – Finer-grained access granularity
  – Minimize PCM writes
  – Design an index to reduce the performance loss

[Figure: CPU, L1/L2 cache, DRAM, PCM (metadata), SSD (data)]

Page 42: Summary

• Asynchronous hardening
  – The mapping file is created in SSD
  – Each flash page is responsible for one SSD data area
  – Only harden the updated SSD data areas
  – Reduce the number of I/Os
  – Quickly find the destination FC during recovery
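One way to picture this proposal is a mapping file with one metadata page per fixed-size SSD data area, where only the pages of updated areas are written back. The Python sketch below is a hypothetical reading of the bullets, not a design taken from the slides: `MappingFile`, `frames_per_area`, and `io_count` are names introduced here, and in-memory structures stand in for flash pages.

```python
from typing import Dict, List, Set, Tuple


class MappingFile:
    """Hypothetical mapping file: one metadata page per fixed-size SSD data area."""

    def __init__(self, num_frames: int, frames_per_area: int) -> None:
        self.frames_per_area = frames_per_area
        self.live: List[int] = [-1] * num_frames        # frame -> pageID (-1 = free)
        self.hardened: Dict[int, Tuple[int, ...]] = {}  # area -> hardened mapping
        self.dirty_areas: Set[int] = set()
        self.io_count = 0

    def update(self, frame: int, page_id: int) -> None:
        """Record a mapping change in memory and mark its data area as updated."""
        self.live[frame] = page_id
        self.dirty_areas.add(frame // self.frames_per_area)

    def harden(self) -> None:
        """Write back only the metadata pages whose data area changed."""
        for area in sorted(self.dirty_areas):
            lo = area * self.frames_per_area
            self.hardened[area] = tuple(self.live[lo:lo + self.frames_per_area])
            self.io_count += 1
        self.dirty_areas.clear()

    def locate(self, frame: int) -> Tuple[int, int]:
        """Return (metadata page, offset) for a frame, computed without a scan."""
        return divmod(frame, self.frames_per_area)
```

Two updates that fall into the same data area cost a single metadata write in this model, and the fixed layout lets recovery compute where a frame's entry lives instead of searching for it.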

Page 43: Summary

• Lower the cost of scanning the SSD table
  – Add a checkpoint for mapping information updates
  – A log is used to record the most recent checkpoint
  – Only scan the metadata updates related to that checkpoint

Page 44: Thank You!