SQL Server, Storage and You - Part III: Solid State Storage

Contact Information

• Wesley Brown
• [email protected]
• Twitter @WesBrownSQL
• Blog http://www.sqlserverio.com

Today’s Topic Covers…

• NAND Flash Structure
• MLC and SLC Compared
• NAND Flash Read Properties
• NAND Flash Write Properties
• Wear-Leveling
• Garbage Collection
• Write Amplification
• TRIM
• Error Detection and Correction
• Reliability
• Form Factor
• Performance Characteristics
• Determining What’s Right for You
• Not All SSDs Are Created Equal

Types Of Flash

• Two Main Flavors: NAND and NOR
• NOR
  – Operates like RAM.
  – NOR is parallel at the cell level.
  – NOR reads slightly faster than NAND.
  – Can execute code directly from NOR without copying to RAM.
• NAND
  – NAND operates like a block device, a.k.a. a hard disk.
  – NAND is serial at the cell level.
  – NAND writes significantly faster than NOR.
  – NAND erases much faster than NOR: 4 ms vs. 5 s.

Structure of NAND

• Serial array of transistors.
  – Each transistor holds 1 bit (or more).
• Arrays grouped into pages.
  – 4096 bytes in size.
  – Contains “spare” area for ECC and other ops.
• Pages grouped into blocks.
  – 64 to 128 pages.
  – Smallest erasable unit.
• Blocks grouped into chips.
  – As big as 16 gigabytes.
• Chips grouped onto devices.
  – Usually in a parallel arrangement.
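To put the hierarchy above into numbers, here is a small arithmetic sketch. It uses the page size, pages-per-block figure and chip size from the slide and assumes binary gigabytes; real parts vary by vendor and generation, and the function names are only illustrative.

```python
# Geometry figures taken from the slide; treat them as examples, not a datasheet.
PAGE_BYTES = 4096
PAGES_PER_BLOCK = 128
CHIP_BYTES = 16 * 1024**3  # 16 GiB, assuming binary gigabytes


def block_bytes(page_bytes=PAGE_BYTES, pages_per_block=PAGES_PER_BLOCK):
    """Size of the smallest erasable unit in bytes."""
    return page_bytes * pages_per_block


def blocks_per_chip(chip_bytes=CHIP_BYTES):
    """How many erase blocks fit on one chip (spare area ignored)."""
    return chip_bytes // block_bytes()


print(block_bytes() // 1024, "KiB per erase block")
print(blocks_per_chip(), "erase blocks per chip")
```

The point to notice is the mismatch: the device reads and programs 4 KiB at a time but can only erase half a megabyte at a time.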

NAND Flash Structure. Gates, Cells, Pages and Strings.

MLC vs. SLC, FIGHT!

• MLC (Multi-Level Cell)
  – Higher capacity (two bits per cell).
  – Low P/E cycle count, ~3k to ~10k.
  – Cheaper per gigabyte.
  – High ECC needs.
• SLC (Single-Level Cell)
  – Fast read speed.
    • 25ns vs. 50ns.
  – Fast write speed.
    • 220ns vs. 900ns.
  – High P/E cycle count, ~100k to ~300k.
    • Tend to be conservative numbers.
  – Minimal ECC requirements.
    • 1 bit per 512 bytes vs. ~12 bits per 512 bytes.
  – Expensive.
    • Up to 5x the cost of MLC.

Reading NAND Flash

• It isn’t RAM.
  – Slower access times.
    • ~1 ns vs. ~50 ns.
    • No write in place.
• It isn’t a hard disk.
  – Much faster access times.
    • Nanoseconds vs. milliseconds.
  – No moving parts.

Writing to NAND

• Program/Erase Cycle
  – Erased state: all bits are 1.
  – Programmed bits are 0.
  – Programmed a page at a time.
    • One-pass programming.
  – Erased a block at a time (128 pages).
    • Must erase the entire block to program a single page again.
  – Finite life cycle: ~10k cycles for MLC, ~100k for SLC.
    • A block that has failed to erase may still be readable.
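The program/erase rules above can be sketched as a toy model. This is only an illustration: the `Block` class and its tiny sizes are made up for the example, and a real controller does far more (ECC, bad-block handling, partial programming rules).

```python
class Block:
    """Toy model of one NAND erase block (sizes are illustrative)."""

    def __init__(self, pages=128, page_bytes=4096):
        self.page_bytes = page_bytes
        self.erase_count = 0
        # Erased flash reads back as all 1 bits (0xFF bytes).
        self.pages = [bytes([0xFF]) * page_bytes for _ in range(pages)]
        self.programmed = [False] * pages

    def program(self, page_no, data):
        """Program one page; only allowed once per erase in this model."""
        if self.programmed[page_no]:
            raise RuntimeError("page already programmed; erase the block first")
        if len(data) != self.page_bytes:
            raise ValueError("data must be exactly one page")
        # Programming can only pull bits from 1 to 0, hence the AND.
        self.pages[page_no] = bytes(a & b for a, b in zip(self.pages[page_no], data))
        self.programmed[page_no] = True

    def erase(self):
        """Erase the whole block: every bit returns to 1 and wear goes up."""
        self.pages = [bytes([0xFF]) * self.page_bytes for _ in self.pages]
        self.programmed = [False] * len(self.pages)
        self.erase_count += 1


blk = Block(pages=4, page_bytes=8)
blk.program(0, b"SQLSRVR!")
try:
    blk.program(0, b"NEWDATA!")   # rewriting in place is not possible
except RuntimeError as err:
    print("rejected:", err)
blk.erase()                        # the only way to make page 0 writable again
blk.program(0, b"NEWDATA!")
```

Changing one page costs an erase of the whole block, which is why controllers redirect rewrites to fresh pages instead.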

Data written in pages and erased in blocks. Blocks are becoming larger as NAND Flash die sizes shrink.

Feeding And Care of NAND

• Wear-Leveling
  – Spreads writes across blocks.
  – Ideally, write to every block before erasing any.
  – Data grouped into two patterns.
    • Static: written once and read many times.
    • Dynamic: written often, read infrequently.
  – If you only wear-level data in motion, you quickly burn out the blocks it lands on.
  – If you wear-level static data, you incur extra I/O.
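A minimal sketch of the two wear-leveling ideas above, assuming the controller keeps an erase count per block. The function names and the `threshold` value are hypothetical; actual firmware policies are proprietary and more involved.

```python
def pick_block(erase_counts, free_blocks):
    """Dynamic wear-leveling: write to the free block with the fewest erases."""
    return min(free_blocks, key=lambda b: (erase_counts[b], b))


def static_move_candidate(erase_counts, static_blocks, threshold=50):
    """Static wear-leveling: if wear is uneven, pick a static block to relocate.

    Returns the least-worn block holding static data, or None when the gap
    between the most and least worn blocks is within the threshold.
    """
    if not static_blocks:
        return None
    spread = max(erase_counts.values()) - min(erase_counts.values())
    if spread <= threshold:
        return None
    return min(static_blocks, key=lambda b: (erase_counts[b], b))


# Block 1 holds static data and has barely been erased; the rest are busy.
erase_counts = {0: 120, 1: 5, 2: 118, 3: 119}
print(pick_block(erase_counts, free_blocks=[0, 2, 3]))
print(static_move_candidate(erase_counts, static_blocks=[1]))
```

Relocating block 1 frees a lightly worn block for hot data, at the cost of the extra I/O the slide warns about.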

Keeping Things Fast

• Background Garbage Collection
  – Defers P/E cycle.
  – Pages marked as dirty, erased later.
  – Requires spare area.
  – Incurs additional I/O.
  – Can be put under pressure by frequent small writes.

No Free Lunches

• Write Amplification
  – Ripples in a pond.
  – Device moves blocks around.
  – Incoming I/O is greater than the device has room for.
  – Every write causes additional writes.
    • Small writes can be a real problem.
    • OLTP workloads are a good example.
    • TRIM can help.

Initial Write of 4 pages to a single erasable block.

Four new pages and four replacement pages written. Original pages are now marked invalid.

Garbage collection comes along and moves all valid pages to a new block and erases the other block.
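The three steps in these figures can be replayed in a few lines to see where write amplification comes from. This is a sketch only: `FlashCounter`, the page names and the dictionary standing in for a block are all made up for the illustration.

```python
class FlashCounter:
    """Counts pages the host wrote versus pages the NAND actually programmed."""

    def __init__(self):
        self.host_pages = 0
        self.flash_pages = 0

    def host_write(self, block, names, replaces=()):
        for old in replaces:
            block[old] = "invalid"       # old copy stays put until the block is erased
        for name in names:
            block[name] = "valid"
        self.host_pages += len(names)
        self.flash_pages += len(names)

    def garbage_collect(self, old_block):
        survivors = {n: "valid" for n, state in old_block.items() if state == "valid"}
        self.flash_pages += len(survivors)   # relocation costs extra programs
        old_block.clear()                    # stands in for the block erase
        return survivors

    def write_amplification(self):
        return self.flash_pages / self.host_pages


flash = FlashCounter()
block_x = {}
flash.host_write(block_x, ["A", "B", "C", "D"])                  # initial write
flash.host_write(block_x, ["E", "F", "G", "H"])                  # four new pages
flash.host_write(block_x, ["A'", "B'", "C'", "D'"],
                 replaces=["A", "B", "C", "D"])                  # four replacements
block_y = flash.garbage_collect(block_x)                         # move valid, erase
print(flash.host_pages, flash.flash_pages, round(flash.write_amplification(), 2))
# prints: 12 20 1.67
```

The host asked for 12 page writes, yet the flash programmed 20 pages, because the 8 valid pages had to be copied before the block could be erased.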

Keeping Things Fast

• TRIM
  – Supported out of the box on Windows 7 and Windows Server 2008 R2.
    • Some manufacturers ship a TRIM service that works with their driver.
  – Acts like spare area for garbage collection.
  – OS and file system tell the drive a block is empty.
  – Filling the file system defeats TRIM.
  – File fragmentation can hurt TRIM.
    • Grow your files manually!
    • Don’t run disk defrag!

Detecting Errors and Correcting Them

Many things cause errors on Flash!

• Write Disturb
  – Data cells NOT being written to are corrupted.
    • Fixed with normal erase.
• Read Disturb
  – Repeated reads on the same page affect other pages in the block.
    • Fixed with normal erase.
• Charge Loss/Gain
  – Transistors may gain or lose charge over time.
    • Affects flash devices at rest or rarely accessed data.
    • Fixed with normal erase.

All of these issues are generally dealt with very well using standard ECC techniques.

As cells are programmed other cells may experience voltage change.

As cells are read other cells in same block can suffer voltage change.

If flash is at rest or rarely read, cells can suffer charge loss.

Pure Speed

• Not all drives are benchmarked the same.
• Short-stroking
  – Only using a small portion of the drive.
  – Allows for lots of spare capacity via TRIM.
• Huge queue depths.
  – Increases latency.
  – Can be unrealistic.
• Odd block transfer sizes.
  – Random IO testing.
    • Some use 512 bytes while others use 4k.
  – Sequential IO testing.
    • Most use 128k.
    • Some use 64k to better fit into large buffers.
    • Some use 1 MB and high queue depths.
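Why a huge queue depth inflates latency can be shown with Little's Law (outstanding I/Os = IOPS × latency). The sketch below is plain arithmetic; the queue depths and IOPS figures are invented for illustration and do not describe any particular drive.

```python
def avg_latency_ms(queue_depth, iops):
    """Little's Law rearranged: average latency = outstanding I/Os / IOPS."""
    return queue_depth / iops * 1000.0


# Hypothetical benchmark results for the same drive at different queue depths.
for queue_depth, iops in [(1, 8_000), (32, 64_000), (256, 80_000)]:
    latency = avg_latency_ms(queue_depth, iops)
    print(f"QD {queue_depth:>3}: {iops:>6} IOPS, {latency:.3f} ms average latency")
```

In this made-up series the headline IOPS number keeps climbing while each individual I/O waits longer, which is the trade-off a high-queue-depth benchmark hides.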

How Fast Is It Again?

• Read the numbers carefully.
  – Random IO bench usually 4k.
    • SQL Server works on 8k.
  – Sequential IO bench usually 128k.
    • SQL Server works on 64k to 128 MB.
  – Queue depths set high.
    • SQL Server usually configured for low queue depth.
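One way to read a 4k benchmark number with 8k SQL Server pages in mind is to convert it to throughput and back. This sketch assumes the drive is bandwidth-limited at these sizes, which is not always true, and the 50,000 IOPS figure is a made-up spec-sheet value.

```python
def throughput_mb_s(iops, block_kb):
    """Throughput implied by an IOPS figure at a given transfer size."""
    return iops * block_kb / 1024.0


def iops_at(throughput_mb, block_kb):
    """IOPS ceiling at a given transfer size if throughput is the limit."""
    return throughput_mb * 1024.0 / block_kb


vendor_iops_4k = 50_000                       # hypothetical spec-sheet number
bandwidth = throughput_mb_s(vendor_iops_4k, 4)
print(f"{vendor_iops_4k} IOPS at 4k is {bandwidth:.1f} MB/s")
print(f"At 8k that bandwidth allows at most {iops_at(bandwidth, 8):.0f} IOPS")
```

Under that assumption the same drive delivers roughly half the advertised IOPS once the transfer size doubles, so a 4k figure should not be compared directly with an 8k workload.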

Is It Reliable Enough?

• SLC is ready “out of the box.”
  – Requires much less infrastructure on disk to support robust write environments.
• MLC needs some help.
  – Requires lots of spare area and smarter controllers to handle extra ECC.
  – eMLC has all management functions built onto the chip.
• Both configured similarly.
  – RAID of chips.
  – TRIM, GC and wear-leveling.

He’s Dead, Jim.

• Longevity differences between devices can be huge.
• Consumer grade drives are consumable.
  – Aren’t rated for full drive writes.
    • Desktop drives are usually tested on a fraction of drive capacity!
  – Aren’t rated for continuous writes.
    • It may say three-year life span.
      – Could be much shorter; look at total writes.
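"Look at total writes" can be turned into a rough estimate. The sketch below is back-of-the-envelope arithmetic under simple assumptions (perfect wear-leveling, a constant write amplification factor, steady daily writes); the capacity, cycle count and workload figures are hypothetical.

```python
def endurance_tb(capacity_gb, pe_cycles, write_amplification):
    """Rough host-write endurance in TB, assuming perfectly even wear."""
    return capacity_gb * pe_cycles / write_amplification / 1024.0


def years_of_life(total_writes_tb, host_writes_gb_per_day):
    """How long a total-writes budget lasts at a steady daily write rate."""
    days = total_writes_tb * 1024.0 / host_writes_gb_per_day
    return days / 365.0


# Hypothetical 160 GB MLC drive: 5,000 P/E cycles, write amplification of 3.
budget = endurance_tb(160, 5_000, 3.0)
print(f"about {budget:.0f} TB of host writes")
print(f"desktop load, 20 GB/day:  {years_of_life(budget, 20):.1f} years")
print(f"OLTP load, 500 GB/day:    {years_of_life(budget, 500):.1f} years")
```

The same drive that outlasts its warranty on a desktop can be used up in well under two years by a write-heavy database, which is why the rated total writes matter more than the stated life span.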

You Say SATA, I Say SAS…

• SAS is the king of your heavy workloads.
• Command Queuing
  – SAS supports up to 2^16, usually capped at 64.
  – SATA supports up to 32.
• Error recovery and detection.
  – SMART isn’t.
  – SCSI command set is better.
• Duplex
  – SAS is full duplex and dual ported per drive.
  – SATA is half duplex and single ported.
• Multi-path IO
  – Native to SAS at the drive level.
  – Available to SATA via expanders.

The Shape Of Things.

• Flash comes in lots of form factors.
• Standard 2.5” and 3.5” drives.
• Fibre Attached
  – Texas Memory Systems RamSan-620
  – Violin Memory
• PCIe add-in cards.
  – Few “native” cards.
    • Fusion-io
    • Texas Memory Systems RamSan-20
  – Bundled solutions.
    • LSI SSS6200
    • OCZ Z-Drive
    • OCZ RevoDrive
• PCIe To Disk
  – 2.5” form factor and plugs.
  – Skips SAS/SATA for direct PCIe lanes.

Understand Your Workloads!

• You MUST understand your workloads.
  – Monitor virtual file stats
    • http://sqlserverio.com/2011/02/08/gather-virtual-file-statistics-using-t-sql-tsql2sday-15/
  – Track random vs. sequential
  – Track size of transfers
  – Capture IO Patterns
    • http://sqlserverio.com/2010/06/15/fundamentals-of-storage-systems-capturing-io-patterns/
  – Benchmark!
    • http://sqlserverio.com/2010/06/15/fundamentals-of-storage-testing-io-systems/
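As a sketch of what tracking transfer size from virtual file stats can look like, the function below takes two snapshots of the cumulative counters for one file and reports averages for the interval. The key names match columns of `sys.dm_io_virtual_file_stats`; how the snapshots are collected is left out, and the `io_profile` helper and the sample numbers are made up for the example.

```python
def io_profile(before, after):
    """Summarise I/O between two cumulative counter snapshots for one file."""
    delta = {key: after[key] - before[key] for key in before}

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    reads, writes = delta["num_of_reads"], delta["num_of_writes"]
    return {
        "avg_read_kb": ratio(delta["num_of_bytes_read"], reads) / 1024,
        "avg_write_kb": ratio(delta["num_of_bytes_written"], writes) / 1024,
        "avg_read_ms": ratio(delta["io_stall_read_ms"], reads),
        "avg_write_ms": ratio(delta["io_stall_write_ms"], writes),
        "read_pct": 100 * ratio(reads, reads + writes),
    }


# Made-up snapshots taken some minutes apart for a single data file.
before = {"num_of_reads": 0, "num_of_bytes_read": 0, "io_stall_read_ms": 0,
          "num_of_writes": 0, "num_of_bytes_written": 0, "io_stall_write_ms": 0}
after = {"num_of_reads": 1_000, "num_of_bytes_read": 65_536_000, "io_stall_read_ms": 2_000,
         "num_of_writes": 3_000, "num_of_bytes_written": 24_576_000, "io_stall_write_ms": 1_500}

for name, value in io_profile(before, after).items():
    print(f"{name}: {value:.2f}")
```

A profile like this (mostly 8 KB writes with occasional 64 KB reads) is what should drive the block sizes and read/write mix used when benchmarking a candidate drive.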

I’m Not As Fast As I Used To Be

• From new
  – Best possible performance.
  – Drive will never be this fast again.
• Previous writes affect future reads.
  – Large sequential writes are nice for GC.
  – Small random writes slow GC down.
  – Wait for GC to catch up when benching a drive.
    • Give the GC time to settle in when going from small random to large sequential or vice versa.
    • Steady state is what we are after.
• Performance over time slows.
  – Cells wear out.
    • Causes multiple attempts to read or write.
    • ECC saves you but the IO is still spent.

It’s a Sony on the inside, trust me.

• Not all drives are equal.
• Understand drives are tuned for workloads.
  – Desktop drives don’t favor 100% random writes…
  – Enterprise drives are expected to get punished.
• Fix it with firmware.
  – Most drives will have edge cases.
    • OCZ and Intel suffered poor performance after drive use over time.
• Be wary of updates that erase your drive.
  – Gives you a temporary performance boost.

Takeaways

• Flash read performance is great, sequential or random.
• Flash write performance is complicated, and can be a problem if you don’t manage it.
• Flash wears out over time.
  – Not nearly the issue it used to be, but you must understand your write patterns.
  – Plan for over-provisioning and TRIM support.
    • It can have a huge impact on how much storage you actually buy.
  – Flash can be error prone.
    • Be aware that writes and reads can cause data corruption.