Databases in a Solid State World
How Exadata X3 and Other Database Systems Leverage the Performance of Flash
Gwen Shapira, Senior Consultant
February, 2013
About Me
– Oracle ACE Director
– Member of Oak Table
– 14 years of IT
– Performance Tuning
– Troubleshooting
– Hadoop
– Presents, Blogs, Tweets
– @gwenshap
© 2013 Pythian
About Pythian
• Recognized Leader:
– Global industry-leader in remote database administration services and consulting for Oracle, Oracle Applications, MySQL and Microsoft SQL Server
– Work with over 250 multinational companies such as Forbes.com, Fox Sports, Nordion and Western Union to help manage their complex IT deployments
• Expertise:
– Pythian’s data experts are the elite in their field. We have the highest concentration of Oracle ACEs on staff—9 including 2 ACE Directors—and 2 Microsoft MVPs.
– Pythian holds 7 Specializations under Oracle Platinum Partner program, including Oracle Exadata, Oracle GoldenGate & Oracle RAC
• Global Reach & Scalability:
– Around the clock global remote support for DBA and consulting, systems administration, special projects or emergency response
You Never Forget Your First SSD
Sh*t People Say about SSD:
Fast for reads
Don’t use for writes
Use for random writes
Don’t use for REDO
Used for REDO
Only used in Exadata
Only Sun flash devices are supported
Unreliable
Becomes slower over time
Type of SSD matters
Use SATA SSD
Use PCI SSD
Use SSD in SAN
Too expensive
Is it same as Flash?
Solid State Disk = No Spinning = Low Latency Random IO
We are talking about: NAND FLASH
• As opposed to RAM Flash which is rare but awesome
• SLC
– One bit per cell
– High performance
• MLC
– Two bits per cell
– High capacity
[Diagram: an SLC cell stores 0 or 1; an MLC cell stores 00, 01, 10 or 11]
Will Talk About:
• IO Performance
• Using SSDs for Oracle
• How Exadata and ODA use SSDs
• SSD devices
• Practice: Reading SSD Vendor Specs
Anatomy of a SSD
• Cell: 1 bit
• Page: 4K
• Block: 128 pages = 512K
• Plane = 1024 blocks = 512MB
• Planes are grouped into dies, which are grouped into packages
The Big Catch: We read and write pages, but delete blocks
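The sizes on this slide multiply out as shown in the short sketch below. The constant names are illustrative, and real devices use other page and block sizes; only the numbers come from the slide.

```python
# Geometry taken from the slide; these are the deck's example sizes,
# not a statement about any particular device.
PAGE_BYTES = 4 * 1024        # one page = 4K
PAGES_PER_BLOCK = 128        # one block = 128 pages
BLOCKS_PER_PLANE = 1024      # one plane = 1024 blocks

block_bytes = PAGE_BYTES * PAGES_PER_BLOCK
plane_bytes = block_bytes * BLOCKS_PER_PLANE

print(block_bytes // 1024, "KB per block")            # 512 KB per block
print(plane_bytes // (1024 * 1024), "MB per plane")   # 512 MB per plane
```

The block size is the number that matters later: it is the smallest unit that can be deleted.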
IO Operations
Reads
• CPU registers – 0.3 ns (1 cycle)
• CPU Cache L1 – 1.2 ns
• CPU Cache L2 – 3.0 ns
• CPU Cache L3 – 12-24 ns
• Main Memory (RAM) – 60-100 ns
• SSD – 60,000 ns
• Magnetic Storage (“DISK”) – 3,000,000 ns
• SAN devices ~ 15,000,000 ns
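To put the read latencies in proportion, here is a small sketch that turns them into ratios. The dictionary and function names are made up for illustration, and RAM is taken at the upper end of the 60-100 ns range as a simplifying assumption.

```python
# Read latencies in nanoseconds, copied from the slide.
LATENCY_NS = {
    "RAM": 100,           # assumption: upper end of the 60-100 ns range
    "SSD": 60_000,
    "Disk": 3_000_000,
    "SAN": 15_000_000,
}

def times_slower(device, baseline):
    """How many times slower `device` is than `baseline`."""
    return LATENCY_NS[device] / LATENCY_NS[baseline]

print(times_slower("Disk", "SSD"))   # 50.0
print(times_slower("SSD", "RAM"))    # 600.0
```

So with these figures an SSD read is about 50 times faster than a disk read, but still hundreds of times slower than RAM.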
What about throughput?
• 15K RPM SAS HDD – 120-200MB/s
• PCIe SSD – 1-2GB/s
• But … How many disks do you use?
• Network bandwidth?
• CPU Bus bandwidth?
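The "how many disks" question can be made concrete with rough arithmetic. This sketch assumes throughput scales linearly with the number of striped disks, which is optimistic, and uses decimal units (1GB/s = 1000MB/s); the function name is illustrative.

```python
import math

def disks_to_match(ssd_mb_s, hdd_mb_s):
    """Number of HDDs whose combined sequential throughput reaches one SSD."""
    return math.ceil(ssd_mb_s / hdd_mb_s)

# Best case for the disks: slow SSD (1GB/s) against fast HDDs (200MB/s).
print(disks_to_match(1000, 200))   # 5
# Worst case for the disks: fast SSD (2GB/s) against slow HDDs (120MB/s).
print(disks_to_match(2000, 120))   # 17
```

A striped array of a dozen or so disks can therefore be in the same throughput range as a single PCIe card, which is why latency rather than throughput is the stronger argument for SSD.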
Writes
• Writes on new SSD – 250,000 ns
• Similar to sequential write to disk
How much data can you write to a new 250GB SSD?
Deletes
• Can’t overwrite data without deleting first
• Can only delete blocks of 128*4K pages
• To overwrite a page:
– Read 127 pages
– Write 127 to a free block
– Delete old block
– Perform the write we originally requested
• Takes 2ms
• Each cell can only be written 100K times
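The overwrite steps above can be sketched as a toy model. This is not how any real controller is implemented; `flash`, `overwrite_page` and the block ids are hypothetical names, and the model simply counts page reads and writes.

```python
PAGES_PER_BLOCK = 128

def overwrite_page(flash, block_id, page_no, data, free_block_id):
    """Overwrite one page by relocating its whole block.

    `flash` maps block id -> list of pages, where None means erased.
    Returns (pages_read, pages_written) so the cost is visible.
    """
    old = flash[block_id]
    new = flash[free_block_id]
    if any(page is not None for page in new):
        raise ValueError("target block must be erased")

    pages_read = 0
    pages_written = 0
    for i, page in enumerate(old):
        if i == page_no:
            continue                  # this page is about to be replaced
        new[i] = page                 # read the old page, write it to the free block
        pages_read += 1
        pages_written += 1

    flash[block_id] = [None] * PAGES_PER_BLOCK   # delete (erase) the old block
    new[page_no] = data                          # finally, the write we were asked for
    pages_written += 1
    return pages_read, pages_written

flash = {
    0: [f"page-{i}" for i in range(PAGES_PER_BLOCK)],   # a full block
    1: [None] * PAGES_PER_BLOCK,                        # an erased block
}
reads, writes = overwrite_page(flash, block_id=0, page_no=5, data="new", free_block_id=1)
print(reads, writes)   # 127 128
```

One 4K overwrite costs 127 page reads, 128 page writes and a block erase in this model, which is where the 2ms figure and the wear concern come from.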
The Controller
• Over-provision SSDs
• Maintain free lists
• Delete and cleanup in background
• Balance use of cells (Wear leveling)
• RAM caching
Consequences:
• Write Amplification
– How much data is really written when we write 1MB
– 1 means no overhead
– The closer to 1 the better
• Benchmarks on new SSD are worthless
– Run benchmarks long enough to run out of overprovisioned space
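Write amplification is the ratio of bytes physically written to flash over bytes the host asked to write. The sketch below works through the two extremes using the page and block sizes from the earlier slides; the function name is illustrative, and real controllers land somewhere in between.

```python
def write_amplification(flash_bytes_written, host_bytes_written):
    """Ratio of physical flash writes to writes requested by the host."""
    return flash_bytes_written / host_bytes_written

PAGE = 4 * 1024

# New SSD: the page goes straight into an erased page, no extra work.
best = write_amplification(PAGE, PAGE)

# No free pages left: overwriting one page rewrites the whole 128-page block
# (127 relocated pages plus the page we actually wanted to write).
worst = write_amplification(128 * PAGE, PAGE)

print(best, worst)   # 1.0 128.0
```

A benchmark on a new device only ever measures the first case, which is why it has to run until the over-provisioned space is used up.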
Redo Logs
A: Redo log writes are sequential writes and therefore won’t benefit from SSD
B: Log file sync times are critical to Oracle performance. Therefore placing redo logs on SSD will have dramatic impact on performance.
Don’t use SSD for redo if:
• You don’t have “log file sync” related performance problems
• You have dedicated disks for each redo log
• Even better if multiple disks, striped.
• Your SAN is well configured and has ample caching
• You have RAC and no shared SSDs
SSD can make Redo faster if:
• You are suffering from high “log file parallel write”
• And your storage admin won’t even discuss it
• Redo is on LUN shared with:
– Redo from multiple databases
– Other services (SAP, etc)
• Not enough cache on storage array
• Storage network is a bottleneck
Placing Data on SSD
Should you place data on SSD?
• SSD solves IO latency problems
• If “DB File Sequential Read” is not in your top 5 wait events, you probably don’t need your data on SSD.
• If you don’t maximize RAM use for buffer cache – don’t get SSD (yet)
• If your CPU utilization is high, solve this first.
Not enough space?
• Move most active segments
• Random reads get most benefits from SSD
• Active indexes with unique-scans
• Fewer writes are better
• AWR has IO statistics per segment
• https://github.com/gwenshap/Oracle-DBA-Scripts/blob/master/SSD.sql
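One way to turn these rules of thumb into a shortlist is a simple greedy pass over per-segment IO statistics. The sketch below is an assumed heuristic, not what the linked SSD.sql script does: the `Segment` type, the "skip write-heavy segments" rule and all the numbers are invented for illustration.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    name: str
    size_gb: float
    physical_reads: int
    physical_writes: int

def pick_segments(segments, ssd_capacity_gb):
    """Greedy choice: most physical reads per GB first, while space remains."""
    ranked = sorted(segments, key=lambda s: s.physical_reads / s.size_gb, reverse=True)
    chosen, used_gb = [], 0.0
    for seg in ranked:
        if seg.physical_writes > seg.physical_reads:
            continue                      # assumed rule: write-heavy segments stay on disk
        if used_gb + seg.size_gb <= ssd_capacity_gb:
            chosen.append(seg.name)
            used_gb += seg.size_gb
    return chosen

# Made-up statistics, standing in for what AWR would report per segment.
segments = [
    Segment("ORDERS_PK", 20, 900_000, 10_000),
    Segment("ORDERS", 200, 1_200_000, 400_000),
    Segment("AUDIT_LOG", 50, 5_000, 800_000),
    Segment("CUSTOMERS_IX", 10, 300_000, 2_000),
]
print(pick_segments(segments, ssd_capacity_gb=100))   # ['ORDERS_PK', 'CUSTOMERS_IX']
```

With 100GB of SSD, the two small, read-heavy indexes win; the large table does not fit and the write-heavy log is skipped.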
Why Choose?
• SAN Devices that contain both HDD and SSD
• Smart controllers move most active data to SSD automatically.
• Pros: No need to choose and manually migrate data
• Cons: Your most active data will move without advance notice
Top Mistakes
• Using SSD for production and HDD for Standby
– If production needs SSD…
– Good chance that standby will fall behind
• Database Smart Flash Cache
Database Smart Flash Cache
[Diagram: Disk, SGA, Flash Cache]
• Block read from disk
• Block evicted from SGA is written to SSD cache by DBWR
• If block is needed, it is read from SSD
Database Smart Flash Cache
• Pros:
– Automatically keeps active data in SSD
• Cons:
– Large overhead for managing cache, all taken from SGA
– Overhead for DBWR
– No benefit and some overhead for writes
– Only one SSD device
Using Smart Flash Cache will make your IO faster than using just disks, but smartly placing data on SSD will be even faster.
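The block flow described for Database Smart Flash Cache behaves like a second-level cache behind the buffer cache. The toy model below illustrates only that flow; `TwoTierCache` is a hypothetical class, plain LRU is an assumption, and removing a block from flash when it returns to the SGA is a simplification rather than documented Oracle behaviour.

```python
from collections import OrderedDict

class TwoTierCache:
    """Toy model: an LRU 'SGA' whose evicted blocks spill into an LRU 'flash' tier."""

    def __init__(self, sga_blocks, flash_blocks):
        self.sga = OrderedDict()
        self.flash = OrderedDict()
        self.sga_blocks = sga_blocks
        self.flash_blocks = flash_blocks

    def read(self, block_id):
        """Return where the block was found: 'sga', 'flash' or 'disk'."""
        if block_id in self.sga:
            self.sga.move_to_end(block_id)        # mark as most recently used
            return "sga"
        source = "flash" if block_id in self.flash else "disk"
        self.flash.pop(block_id, None)            # simplification: block moves back to SGA
        self._load_into_sga(block_id)
        return source

    def _load_into_sga(self, block_id):
        self.sga[block_id] = True
        if len(self.sga) > self.sga_blocks:
            evicted, _ = self.sga.popitem(last=False)    # least recently used block
            self.flash[evicted] = True                   # the step done by DBWR on the slide
            if len(self.flash) > self.flash_blocks:
                self.flash.popitem(last=False)           # flash tier is full too

cache = TwoTierCache(sga_blocks=2, flash_blocks=4)
print([cache.read(b) for b in (1, 2, 3, 1)])   # ['disk', 'disk', 'disk', 'flash']
```

The extra bookkeeping on every eviction is the overhead listed under Cons: it costs memory and DBWR work even when the block is never read again.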
Exadata has LOTS of SSD
• Quarter rack has 3 storage cells
• Each with 4 Sun Flash Accelerator F40
• 400GB * 4 * 3 = 4.8TB
• 21.5GB/s throughput
• 375,000 IOPS
• Note that IB will limit you to 4GB/s per DB node
Exadata Smart Flash Logging
• Redo log writes are written to disk and SSD together.
• Log sync is finished when one write is successful.
• Can’t Lose.
• Can’t try that at home
• This improves performance for redo when disks are busy with high throughput operations
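The "write to both, finish on the first success" idea can be illustrated with two concurrent writes. This is only a sketch of the pattern, not how the storage cell software is implemented: `write_redo`, the device functions and the event that stands in for a busy disk are all hypothetical, and error handling is left out.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def write_redo(devices, payload):
    """Send the payload to every device; return the name of the first to finish."""
    pool = ThreadPoolExecutor(max_workers=len(devices))
    futures = {pool.submit(fn, payload): name for name, fn in devices.items()}
    done, _ = wait(list(futures), return_when=FIRST_COMPLETED)
    pool.shutdown(wait=False)          # the slower write completes in the background
    return futures[next(iter(done))]

release_disk = threading.Event()

def flash_write(payload):
    return len(payload)                # stand-in for a fast write to flash

def busy_disk_write(payload):
    release_disk.wait(timeout=5)       # stand-in for a disk stuck behind other IO
    return len(payload)

winner = write_redo({"disk": busy_disk_write, "flash": flash_write}, b"redo record")
release_disk.set()                     # the disk catches up afterwards
print(winner)                          # flash
```

The caller waits only as long as the faster device takes, so a disk that is busy with a large scan no longer determines the commit latency.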
Exadata Smart Flash Cache
• Not same as DB Smart Flash Cache
• SSDs are on storage cells
• SSD on Exadata can also be used as ASM disks and not cache.
Exadata Smart Flash Cache
• Reading un-cached data:
1. Un-cached data is read from disk first
2. Sent to the database
3. and then copied to cache
[Diagram: Disks, SSD Cache, Cellsrv, Database]
Exadata Smart Flash Cache
• Cached reads:
– Read from disk and SSD simultaneously
– Whichever returns first
– Effectively increase read throughput
– Smart scans mostly read from disk
– Except for objects using “cell_flash_cache” KEEP clause.
Exadata Smart Flash Cache
• Writes:
– Write through cache
– Writes go to disk first
– Then copied to cache, sometimes
– Indexes and tables with random IO
– ALTER TABLE customers STORAGE (CELL_FLASH_CACHE KEEP)
Exadata Smart Flash Cache
• Writes:
– Write back cache
– Writes go to SSD first
– Then copied to disk, eventually
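The difference between the two write modes comes down to which copy is written first. The toy store below shows only that ordering; `CachedStore` is a hypothetical class, and it always copies write-through data into the cache, whereas the slide says Exadata does so only sometimes.

```python
class CachedStore:
    """Toy model of a cache in front of disk, in write-through or write-back mode."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.disk = {}
        self.cache = {}
        self.dirty = set()      # keys in cache that are not yet on disk

    def write(self, key, value):
        if self.write_back:
            self.cache[key] = value     # SSD first ...
            self.dirty.add(key)         # ... disk later
        else:
            self.disk[key] = value      # disk first ...
            self.cache[key] = value     # ... then a copy in cache

    def flush(self):
        """Copy dirty data from cache to disk (the 'eventually' on the slide)."""
        for key in self.dirty:
            self.disk[key] = self.cache[key]
        self.dirty.clear()

write_through = CachedStore(write_back=False)
write_through.write("block-1", "v1")
print("block-1" in write_through.disk)   # True

write_back = CachedStore(write_back=True)
write_back.write("block-1", "v1")
print("block-1" in write_back.disk)      # False
write_back.flush()
print("block-1" in write_back.disk)      # True
```

In write-back mode the write is acknowledged at SSD speed, at the price of the cache briefly holding the only current copy of the data.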
ODA and SSD
• “Four 2.5-inch 200 GB SAS-2 SLC SSDs per shelf for database redo logs”
• Allows multiple databases on ODA
• Reduces risk of disk bottlenecks
Interfaces
• SATA
– 32 outstanding IO
– 6Gb/s = 600MB/s
– significant latency
• SAS
– 256 outstanding IO
– 6Gb/s = 600MB/s
– Used on ODA shared storage
Interfaces
• PCIe
– “Flash” “Accelerator”
– Multiple 500 MB/s lanes
– Low latency
– Multiple SAS/SATA controllers on card for extra throughput
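The bandwidth figures on the interface slides follow from simple arithmetic. 6Gb/s SATA and SAS use 8b/10b encoding, so every data byte takes 10 bits on the wire, and PCIe bandwidth is the per-lane rate times the number of lanes. The function names and the 8-lane card are assumptions for illustration; the 500 MB/s per lane comes from the slide.

```python
def sata_sas_mb_per_s(line_rate_gbit):
    """Usable MB/s of a SATA/SAS link with 8b/10b encoding (10 line bits per byte)."""
    return line_rate_gbit * 1000 / 10

def pcie_mb_per_s(lanes, mb_per_lane=500):
    """Aggregate MB/s of a PCIe card with the given number of lanes."""
    return lanes * mb_per_lane

print(sata_sas_mb_per_s(6))   # 600.0
print(pcie_mb_per_s(8))       # 4000
```

A single SATA or SAS port therefore caps one device at about 600MB/s, while a multi-lane PCIe card has room for several controllers working in parallel.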
Interfaces
• Fiber
– Use existing enterprise infrastructures
– Shared storage
– Usual SAN headache
– Mandatory for RAC
Write latency lower than read?
Intel SSD 910
identical read/write latency?
RAMSAN
Quick Recap
• SSDs make random reads wicked fast
• Writes and deletes are complicated
• Place segments with many random reads on SSD
• Exadata uses Smart Flash Cache to increase throughput
• Not all SSDs are the same
• Read specs carefully
Thank you – Q&A
To contact us
1-877-PYTHIAN
To follow us
http://www.pythian.com/blog
http://www.facebook.com/pages/The-Pythian-Group/163902527671
@pythian
http://www.linkedin.com/company/pythian