Flash/SSD in a Datacenter
Robert Milkowski

Flash Storage in a Datacenter
Robert Milkowski
Senior Systems Analyst, TalkTalk Group
http://milek.blogspot.com
InfoShare 2010
IOPS Performance Problem
Source: https://www.sun.com/offers/docs/ssd_sun_servers.pdf
IOPS Performance
● During the last 10 years:
● Server performance improved ~100x
● CPU performance improved ~10x
● HDD performance improved ~2x
Solutions
● All-in-memory processing
● NVRAM solutions
● Lots of short-stroked fast HDDs
● SSD as HDD replacement
● SSD as cache
Flash Memory
● Non-volatile memory
● Rapidly growing market adoption
● pendrives, cameras, phones, notebooks, ...
● Available in 2.5” and 3.5” HDD form factors
● FMODs – DIMM-like modules
MLC vs. SLC
● SLC - Single-level cell
● 1 bit per cell
● Lower write latency
● MLC - Multi-level cell
● 2+ bits per cell
● Low write endurance
SSD MTBF
● SLC vs. MLC
● SLC on par with enterprise HDDs
● 2M hours MTBF
● Rated for 3 years of 24x7, 100% write workload
● Spare cells (32GB raw vs 24GB usable)
● Wear Leveling
Flash Wear Leveling
● SLC flash: ~1M program/erase cycles
● MLC: << 100k program/erase cycles
● Virtual block allocation
● Writes spread across all cells
● Spare cells (24GB FMOD has 32GB of cells)
● Bad block relocation
Flash R/W Performance
● Flash Read: ~25µs (~40k IOPS)
● Flash Write: 400–4,000 IOPS
● Erase: ~2ms
● Write to erased block: ~300µs
● TRIM command
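The latency numbers above translate directly into theoretical per-device IOPS (IOPS = 1 second / operation latency). A quick sanity check of the figures quoted on this slide:

```shell
# Theoretical IOPS = 1 second / operation latency (integer arithmetic, in microseconds).
echo $(( 1000000 / 25 ))    # read: 25us -> 40000, i.e. ~40k IOPS
echo $(( 1000000 / 2300 ))  # erase (2ms) + program (300us) -> 434, low hundreds of IOPS
echo $(( 1000000 / 300 ))   # program of a pre-erased block: 300us -> 3333 IOPS
```

This is where the 400–4,000 write IOPS range comes from: the low end when an erase is needed, the high end when TRIM/pre-erase has already freed blocks.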
32GB SLC SSD
● Sequential I/O
● ~250MB/s Read
● ~170MB/s Write
● Random I/O
● ~35,000 4k Read IOPS
● ~3,300 4k Write IOPS
● Power Usage
● 0.1W – 2.5W
● Reliability
● 24x7 for 3 years at 100% write
● BER: 1 sector in 10^15
● MTBF: 2M hours
HDD Random IOPS
● Fastest 15k RPM disk drive: ~300 IOPS
● Writes are not much slower than reads
● Enterprise SLC SSD: ~35,000 read IOPS
● ~100 disks ≈ 1 SSD
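The ~100x claim follows from simple division of the two random-read figures:

```shell
# Random-read IOPS of one enterprise SLC SSD vs the fastest 15k RPM HDD.
echo $(( 35000 / 300 ))  # 116 -> roughly a hundred disks to match one SSD
```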
Power Usage
Device               Approx. power consumption
DRAM 1GB DIMM        5W
15k RPM 300GB HDD    17W
7.2k RPM 750GB HDD   12W
SLC 128GB SSD        2W
Price per GB
● FLASH vs. disk price per GB
Cost of I/O vs. GB
SSD ~15x more expensive per GB, but still cheaper than DRAM
SSD ~70x less expensive per Read IOPS
SSD ~7x less expensive per Write IOPS
[Chart: Cost per I/O ($/Read IOPS, $/Write IOPS) – 3.5” 32GB SLC SSD vs 15k RPM 3.5” 300GB HDD]
[Chart: Cost per GB – DRAM vs 3.5” 32GB SLC SSD vs 15k RPM 3.5” 300GB HDD]
Cost of I/O vs. GB
F5100 ~10x more expensive per GB
F5100 ~75x less expensive per IO
[Chart: Cost per GB – Mid-range disk array (192x 15k 300GB disks) vs Sun F5100 (2TB configuration)]
[Chart: Cost per IO – Mid-range disk array (192x 15k 300GB disks) vs Sun F5100 (2TB configuration)]
Sun F5100 Flash Array
● Up to 80x 24GB SLC FMODs (32GB raw)
● DRAM cache, supercapacitors
● 1.6M 4K Random Read IOPS
● 1.2M 4K Random Write IOPS
● 12.8GB/s Sequential Read
● 9.7GB/s Sequential Write
● <400W Power Usage
Flash as a HDD Replacement
● High cost per GB
● Low capacity per device
● Good option for small datasets
● Still cheaper than memory
FLASH as a Cache
● NetApp PAM
● Sun/Oracle ZFS HybridStoragePool
● EMC FAST (HSM as well)
● IBM, STEC, ...
● Windows ReadyBoost
ZFS – Hybrid Storage Pool
● Transparent Read Cache
● Transparent Write Cache
● DAS, SAN, iSCSI, CIFS, NFS
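With ZFS, a Hybrid Storage Pool is configured simply by adding SSDs to an existing pool as `log` devices (write cache / separate intent log) and `cache` devices (read cache / L2ARC). A sketch, with a hypothetical pool name and Solaris device names:

```shell
# Hypothetical pool (tank) and device names (c1t*d0); adjust for your system.
zpool add tank log c1t1d0            # SSD as separate intent log: caches synchronous writes
zpool add tank cache c1t2d0 c1t3d0   # SSDs as L2ARC: second-level read cache
zpool status tank                    # log and cache vdevs now appear in the pool layout
```

Both additions are online operations and fully transparent to applications using the pool.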
Synchronous I/O
● Latency sensitive
● Databases like MySQL, Oracle
● NFS servers
● iSCSI
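A synchronous write does not return until the data is on stable storage, so every operation pays the full device latency. With GNU dd this is easy to demonstrate (the NFS mount path is hypothetical):

```shell
# Buffered: writes land in the page cache; completes almost instantly.
dd if=/dev/zero of=/mnt/nfs/testfile bs=512 count=1000
# Synchronous: oflag=dsync commits each 512-byte write before issuing the next;
# on NFS or a latency-bound pool this is dramatically slower.
dd if=/dev/zero of=/mnt/nfs/testfile bs=512 count=1000 oflag=dsync
```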
Flash as a Write Cache
● Synchronous I/O is written to SSD first
● Later it is committed to the main pool
● Poor raw flash write latency
● DRAM cache in front of flash
● Performance close to NVRAM
● Supercapacitor or cache flushes
firefox-1.5.0.6-source.tar, ~43,000 small files inside.
The file was unpacked over NFS twice: the first time with no SSD configured, the second time with an SSD.
It took ~20 minutes in the first case, and ~2 minutes in the second: a 10x improvement!
NFS requires that file creation is a synchronous operation.
The tar utility is a single-stream client, which is sensitive to I/O latency.
Source: http://blogs.sun.com/brendan/entry/slog_screenshots
10x NFS clients, 20x threads per client, 200 concurrent streams.
Each thread performs a 512-byte synchronous write in a loop.
A thread is added every minute.
Result: ~7k synchronous IOPS.
Hardware used: Sun Open Storage 7410, 128GB RAM, 4x quad-core AMD Opteron, 136x 1TB SATA disks (RAID-10), no SSDs.
Source: http://blogs.sun.com/brendan/entry/slog_screenshots
10x NFS clients, 20x threads per client, 200 concurrent streams.
Each thread performs a 512-byte synchronous write in a loop.
A thread is added every minute.
Result: ~114k synchronous IOPS (over 16x improvement).
Hardware used: Sun Open Storage 7410, 128GB RAM, 4x quad-core AMD Opteron, 136x 1TB SATA disks (RAID-10), 8x SSDs as a write cache.
Source: http://blogs.sun.com/brendan/entry/slog_screenshots
Flash as a Read Cache
● Cache hierarchy: memory, SSDs, HDDs
● Data checksums
● Warm-up time
● Persistent L2ARC
● Transparent to applications
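Once SSDs are attached as cache devices, their traffic can be watched directly; a sketch with a hypothetical pool name:

```shell
# Per-vdev I/O statistics every 5 seconds; the cache section shows how much
# of the random-read load the L2ARC SSDs are absorbing (hypothetical pool name).
zpool iostat -v tank 5
```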
SSD - Read Cache
Source: https://www.sun.com/offers/docs/ssd_sun_servers.pdf
10x NFS clients, 2x random 8KB reads per client, 20 concurrent streams.
Working set: 500GB; data size: >>500GB.
Result: over 5x more NFS IOPS.
Hardware used: Sun Open Storage 7410, 128GB RAM, 4x quad-core AMD Opteron, 140x 1TB SATA disks (RAID-10), 6x SSDs as a read cache.
Source: http://blogs.sun.com/brendan/entry/l2arc_screenshots
SSD - Read Cache Warm-Up
Source: http://blogs.sun.com/brendan/entry/l2arc_screenshots
Summary
● Price between RAM and HDD
● Performance between RAM and HDD
● Lower Total Power Usage
● Less Rack Space
● Transparent Read/Write Cache (ZFS HSP)
● Working set on SSD Read Cache
Q&A
Performance Tuning for SSD
● HBA custom firmware
● ~50k → ~100k IOPS
● Multiple HBA/port connections
● 4k block alignment
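4k alignment can be checked arithmetically: a partition whose starting offset (counted in 512-byte sectors) is divisible by 8 begins on a 4k boundary. For example:

```shell
# Start sector 2048 (the common 1MiB-aligned default) vs the legacy DOS start of 63.
echo $(( 2048 % 8 ))  # 0 -> 4k-aligned
echo $(( 63 % 8 ))    # 7 -> misaligned: every 4k write straddles two flash pages
```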
ZFS DeDuplication & SSD
● DeDuplication is memory hungry
● DDT (DeDuplication Table) reads are random
● Big performance impact if DDT > memory
● DDT on SSD
● Cost effective
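To see why the DDT outgrows RAM, a rough sizing sketch (the ~320 bytes per DDT entry is a commonly cited ballpark; verify against your ZFS release):

```shell
# 1TB of unique data stored as 128k blocks:
blocks=$(( 1024 * 1024 * 1024 * 1024 / 131072 ))
echo $blocks                          # 8388608 unique blocks
echo $(( blocks * 320 / 1048576 ))    # 2560 -> ~2.5GB of DDT per TB of unique data
```

With smaller block sizes, or many TB of unique data, the table quickly exceeds RAM, which makes SSD-backed caching of the DDT the cost-effective option.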
Useful links
http://en.wikipedia.org/wiki/Flash_memory
http://www.silvertonconsulting.com/newsletterd/SSDf_drives.pdf
http://www.sun.com/storage/flash/SSD_datasheet.pdf
https://www.sun.com/offers/docs/ssd_sun_servers.pdf
http://opensolaris.org/os/community/zfs/