Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |
CON6527 - Implementing High Availability with Exadata
Deep Dive, Best Practices and Use Cases
Michael Nowak, MAA Solutions Architect, Oracle Server Technologies
Jay Friestad, Director - Database Services, RENT-A-CENTER INC
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Announcing Oracle Database 12c Release 2 on Oracle Cloud
• Available now
– Exadata Express Cloud Service
• Coming soon
– Database Cloud Services
– Exadata Cloud Machine
Oracle is presenting features for Oracle Database 12c Release 2 on Oracle Cloud. We will announce availability of the On-Prem release sometime after Open World.
Program Agenda
1 MAA Reference Architecture
2 Exadata MAA: Five Nines Architecture, Best Practices and Practical Use Cases
3 RENT-A-CENTER: MAA and Exadata snapshots in action
Oracle Maximum Availability Architecture (MAA)
High Availability and Data Protection
• Protect service level: Recovery Time Objective (RTO)
• Protect Oracle data: Recovery Point Objective (RPO)
[Diagram: Production, Copy, Database Replication]
Oracle MAA and the Cloud
Common Platform – On Premises, Cloud, and Hybrid Cloud
[Diagram: Oracle Database, On Premises and On Cloud]
Designed to Address the Complete Range of Business Requirements
Oracle MAA Availability Tiers
Availability Service Levels for Unplanned and Planned Outages
BRONZE
• Basic Service Restart
• Backups plus redo for Oracle data protection
SILVER
• High Availability (HA) for Recoverable Local Outages
• Zero Downtime Rolling Maintenance for Patches and Patch Set Updates
GOLD
• Comprehensive HA and Disaster Protection
• Recovery in seconds with zero or near-zero data loss
PLATINUM
• Zero Outage for Platinum Ready Applications
• Zero data loss
Bronze Reference Architecture
Bronze: Single Instance
RTO of Minutes to Days, RPO From Last Backup
§ HA capabilities included with single-instance Oracle Database
§ Supports all consolidation strategies
– Multiple databases on a single physical machine
– VM for dedicated resources
– Oracle Multitenant for lowest cost
§ For unrecoverable outages:
– Restore from backup
§ For disaster recovery (DR):
– Offsite on-disk backup or tape
[Diagram: Single Instance Database (VMs, Dedicated Databases, Multitenant) with Oracle Restart, Online Maintenance, Corruption Protection, Flashback Technologies, and Recovery Manager; backups and off-site backups (disk / tape / cloud)]
Bronze - Single Instance Oracle Database with ZDLRA
Unplanned Outages and Planned Maintenance
Events | Downtime | Data Loss Potential
Unplanned Outages
Database instance failure | Minutes | Zero
Recoverable server failure | Minutes to hours | Zero
Data corruptions, unrecoverable server failure, database or site failures | Hours to days | Since last backup, or near-zero with ZDLRA redo transport
Planned Maintenance
Online file move, reorganization/redefinition, and patching | Zero | Zero
Hardware or operating system maintenance and database patches that cannot be done online | Minutes to hours | Zero
Database upgrades: patch sets and full database releases | Minutes to hours | Zero
Platform migrations | Hours to a day | Zero
Application upgrades that modify back-end database objects | Hours to days | Zero
Silver Reference Architecture
Silver: High Availability with Fast Failover
RTO of Seconds for Server Failures, RPO near Zero with ZDLRA
§ Active-Active clustering with Oracle RAC
– All nodes active at all times
– Real-time failover
§ Active-passive with RAC One Node
– Automatic failover, fast restart on a second node
§ Zero downtime rolling maintenance across RAC instances
– Hardware and OS maintenance
– Qualified Oracle Database patches
[Diagram: HA Cluster (Dedicated Databases, Multitenant); backups and off-site backups (tape / cloud)]
Silver – High Availability with Fast Failover
Unplanned Outages and Planned Maintenance
Events | Downtime | Data Loss Potential
Unplanned Outages
Database instance failure | Seconds if RAC | Zero
Recoverable server failure | Seconds if RAC | Zero
Data corruptions, database unable to restart, site failure | Hours to days | Since last backup, or near-zero with ZDLRA redo transport
Planned Maintenance
Online file move, reorganization/redefinition, and patching | Zero | Zero
Hardware or O.S. maintenance and database patches that can’t be done online but are qualified for RAC rolling install | Zero | Zero
Database upgrades: patch sets and full database releases | Minutes to hours | Zero
Platform migrations | Hours to a day | Zero
App upgrades that modify back-end database objects | Hours to days | Zero
Gold Reference Architecture
Gold: Comprehensive HA/DR
RTO of Seconds to Minutes, RPO of Zero or Near-Zero
§ Real-time data protection, HA and DR using Active Data Guard
– Best corruption protection
– Zero or near-zero data loss
– Automatic database failover
– Offload read-only and backups
§ Flexible logical replication using Oracle GoldenGate
– Replica open read-write
§ Coordinated site failover using Oracle Site Guard
[Diagram: Site A and Site B, each with an HA Cluster and backups (tape); database replication between sites, sync/async]
Gold – Comprehensive HA and Data Protection
Unplanned Outages and Planned Maintenance
Events | Downtime | Data Loss Potential
Unplanned Outages
Database instance failure | Seconds | Zero
Recoverable server failure | Seconds | Zero
Data corruptions, database unable to restart, site failure | Zero to minutes | Near-zero if ASYNC, zero if SYNC
Planned Maintenance
Online file move, reorganization/redefinition, and patching | Zero | Zero
Hardware or operating system maintenance and database patches that cannot be done online but are qualified for RAC rolling install | Zero | Zero
Database upgrades: patch sets, full database releases | Seconds | Zero
Platform migrations | Seconds | Zero
Application upgrades that modify database objects | Hours to days | Zero
Platinum Reference Architecture
Platinum
Zero Application Outage for Platinum Ready Applications
§ Outages masked from applications, in-flight transactions preserved
– Application Continuity
§ Zero data loss failover, LAN or WAN
– Active Data Guard / Far Sync
§ Bi-directional replication and zero downtime maintenance
– Oracle GoldenGate
§ Online patching for applications
– Edition-based Redefinition
§ Automated workload management
– Oracle Global Data Services
[Diagram: Site A and Site B, each with backups and tape; synchronous database replication and bi-directional replication between the sites; Global Data Services and Application Continuity]
Platinum – Zero Outage for Platinum Ready Applications
Unplanned Outages and Planned Maintenance
Events | Downtime | Data Loss Potential
Unplanned Outages
Database instance failure | Zero application outage | Zero
Recoverable server failure | Zero application outage | Zero
Data corruption, database unable to restart, site failure | Zero application outage | Zero
Planned Maintenance
Online file move, reorganization/redefinition, patching | Zero application outage | Zero
Hardware or operating system maintenance and database patches that cannot be done online but are qualified for RAC rolling install | Zero application outage | Zero
Database upgrades: patch sets, full database releases | Zero application outage | Zero
Platform migrations | Zero application outage | Zero
Application upgrades that modify database objects | Zero application outage | Zero
MAA Best Practices References
• MAA is Oracle’s HA blueprint. It protects your service levels and data. Pick the tier that meets your needs and implement the configuration and operational best practices
• MAA has been validated both internally and by countless customers
– Great entry to key options: RAC, ADG, GG, Exadata, In-Memory
• MAA continues to evolve – keep your eye on http://www.oracle.com/goto/maa for updates
Five Nines Architecture
Exadata is Highly Engineered and Standardized
Less Risk, Better Results
• Less Deployment Risk and Faster to Market
– Delivered assembled, debugged, and ready-to-run
• Less Performance and Availability Risks
– Optimized database-to-disk including firmware, OS, network
– Industry experts at every layer of the stack help design, build and support Exadata. Includes MAA input, bug fixes, and configuration practices.
• Less Operating Risk
– All failure modes tested end-to-end. All systems identical.
– Reduces issue resolution times, reduces vendor management overhead and improves SLAs
– Operational Play Book (including online elasticity)
Highest Database and System Level Availability
Application Service Level Focus
• Exadata has supported very stringent service levels for years, and it keeps improving
– 2011 MAA Exadata HA low brownout video series (https://vimeo.com/62754145)
– 2013 & 2014 OOW presentations (http://education.oracle.com)
– 2015 OOW presentation (http://medianetwork.oracle.com/video/player/4786956541001)
– Exadata documentation (http://docs.oracle.com)
– MAA collateral (http://www.oracle.com/au/products/database/exadata-maa-131903.pdf)
Exadata has Many HA Features Supporting the Most Stringent SLAs
• Fast node and cell death detection
• Fast network failure detection
• Redundancy protection on cellsrv shutdown
• Reduced brownout for instance recovery
• ILOM hang detection and repair
• Redundancy protection on cell shutdown
• Automatic ASM mirror read on IO error corruption
• IO error prevention with Exadata disk scrubbing / ASM corruption repair
• Exadata HARD
• Corruption prevention with HARD support
• Elimination of false positive drive failures
• Redundancy Check during power down
• Blue OK-to-remove LED light notification for redundancy protection
• Active Active IB Network
• Exadata Smart Write Back, Smart Flash Logging, Smart Scan and Reverse Offload
• Fastest Redo Apply and Instance Recovery
• Efficient resilver rebalance after flash failure
• I/O latency capping for reads and writes
• Cell IO timeout threshold
• Smart Write Back Flash Cache persistence
• I/O and Network Resource Management
• Health factor on predictively failed disks
• Disk confinement
• IO hang detection and repair
• Cell to Cell offload for Disk Repair
• Cell-to-Cell Rebalance Preserves Flash Cache
• Exadata Elastic Configuration
• Drop hard disk for replacement
• Drop BBU for Replacement
• Appliance mode support
• Cell Alert Summary
• Flash and Disk Life Cycle Management Alerts
• Automatic LED support for disk removal
• Auto online
• Auto disk management
• Priority rebalance support
• EM failure reporting
• Failure Monitoring on database servers
• Updating database nodes with patchmgr
• Optimized and Faster Exadata Patching
• Custom Diagnostic Package for Cell Alerts
• VLAN support and automation
• Exachk – full stack health check with critical issues alerts
Exadata Maximum Availability Architecture (MAA)
Designed and Tested to Handle All Failure Scenarios
[Diagram: three layers of redundancy]
Within Exadata: Redundant Hardware (Servers, Disks, Flash, Network, Power) and Redundant Software (Active clusters, Disk/flash mirroring; Online patching, reconfiguration, expansion)
Within a Site (LAN): Redundant Systems, Redundant Databases (Local standby for HA Failover)
Across Sites (WAN): Redundant Systems, Redundant Databases (Remote standby for Disaster Recovery)
Redo-based change replication with data consistency checking
Best MAA Database Platform | Fastest RAC Instance and Node Failure Recovery | Fastest Backup - RMAN Offload to Storage | Deep ASM Mirroring Integration | Fastest Data Guard Redo Apply | Complete Failure Testing with Lowest Brownouts
Exadata MAA Features
HA Categories Supporting Stringent Application Service Levels for Years
• Data Protection
• Quality of Service
• Management
• Performance
• Reduced HA Brownout
Exadata MAA Addresses Common HA Pain Points
• Application Brownout on Failure or Planned Maintenance
– Exadata reduces application blackout to sub-second. Brownout from node, instance and storage failures reduced from minutes to potentially seconds. Fastest instance recovery.
• Data Corruptions
– Exadata provides additional corruption prevention, detection, and auto-repair. The net benefit is that it prevents or reduces hours of potential downtime.
• Disruptive Schema Changes
– Exadata accelerates schema changes, index and object rebuilds and reorganizations
• Disaster Recovery System Doesn’t Keep Up with Production
– Exadata provides fastest redo apply, resulting in low RTO
Exadata Maximum Availability Architecture (MAA)
Exadata MAA Delivers 99.999% Uptime per IDC
• For many years HP Integrity (Tandem) NonStop and IBM z-systems have been considered the Gold Standard for high availability
• IDC classifies Exadata, when deployed in a full MAA configuration, in the same category of fault-tolerant systems as the above two [1]
FIVE NINES 5X9 99.999%
[1] Worldwide Fault-Tolerant Servers Market Shares, 2014: Vendors Are Hearing the Customer - More Bold Moves Needed to Grow the Segment, IDC, Peter Rutten, Lloyd Cohen, October 2015
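To put the 99.999% figure in perspective, the yearly downtime budget implied by an availability percentage can be worked out with simple arithmetic. The sketch below is an illustration only and is not taken from the IDC report; the function name and the 365-day year are assumptions.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def allowed_downtime_minutes(availability: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for label, availability in [
        ("three nines", 0.999),
        ("four nines", 0.9999),
        ("five nines", 0.99999),
    ]:
        minutes = allowed_downtime_minutes(availability)
        print(f"{label}: {minutes:.2f} minutes per year")
```

Under these assumptions, five nines leaves a budget of roughly five and a quarter minutes of downtime per year.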
Risk Mitigation – Downtime
Oracle Exadata Database Machine
Metric | Before Oracle Exadata | With Oracle Exadata | Difference | % Benefit
Unplanned Downtime
Number of instances per year | 7.1 | 0.7 | 6.5 | 90%
MTTR (hours) | 2.9 | 0.4 | 2.5 | 86%
Productive hours lost per 100 users per year | 1,021 | 66 | 955 | 94%
Unplanned Downtime – Revenue Impact
Total revenue impact per year | $423,700 | $5,800 | $417,900 | 99%
Planned Downtime
Number of instances per year | 10.9 | 6.0 | 4.9 | 45%
MTTR (hours) | 4.6 | 1.9 | 2.7 | 59%
Productive hours lost per 100 users per year | 68 | 60 | 8 | 12%
Source: IDC
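The "% Benefit" column appears to be the reduction expressed as a share of the "before" value. A minimal sketch of that calculation follows, using the before/after figures from the IDC table; the function name and the rounding to whole percentages are assumptions, not IDC's stated method.

```python
def percent_benefit(before: float, after: float) -> int:
    """Reduction from `before` to `after`, as a whole-number percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return round(100 * (before - after) / before)


# (metric, before Exadata, with Exadata), figures copied from the IDC table
ROWS = [
    ("Unplanned: instances per year", 7.1, 0.7),
    ("Unplanned: MTTR (hours)", 2.9, 0.4),
    ("Unplanned: productive hours lost", 1021, 66),
    ("Unplanned: revenue impact ($)", 423700, 5800),
    ("Planned: instances per year", 10.9, 6.0),
    ("Planned: MTTR (hours)", 4.6, 1.9),
    ("Planned: productive hours lost", 68, 60),
]

if __name__ == "__main__":
    for metric, before, after in ROWS:
        print(f"{metric}: {percent_benefit(before, after)}%")
```

The published difference of 6.5 for unplanned instances (7.1 versus 0.7) suggests the report worked from unrounded inputs, so small discrepancies against the rounded table values are to be expected.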
Best Practices
Development Backed Best Practices
Continuous Improvement, Always a Priority
[Diagram: cycle of Idea, Weekly Expert Review / Testing, Publication (MOS Note 757552.1), Exadata Health Check (Exachk), and Default Exadata deployment, producing an Engineered System with Best Practices. Callouts: "You are here" and "But you are also here!"]
What EXAchk is
• EXAchk provides a configuration-specific, up-to-date health check across the entire DBM stack:
o Continuously evolving configuration checks specific to the Oracle Exadata Database Machine and the software that runs upon it
o Exadata, DB, GI, ASM critical issues list specific for the environment
o Exadata full stack software planner
o MAA scorecard that highlights MAA configuration gaps and provides guidance on MAA and consolidation best practices
o Automated scheduling ability and automatic difference identification between runs with email notification
Exachk saves time and money
View EXAchk Findings
• Check status
• Type of Check
• Check Message
• Where the check was run
• Link to expand details
View Recommendations
• What to do to solve the problem
• Links to relevant Knowledge docs
• Where recommendation applies
• Where problem doesn’t apply
• Example of data the recommendation is based on
Review MAA Score Card
• Critical Issues in MAA Scorecard
o All issues reported in “SOFTWARE MAINTENANCE BEST PRACTICES”
• Software version mapping table
• Installed software versions checked for noncurrent or incompatible feature usage
Practical Use Cases
Node Failure
“Motherboard replacement complete, I am now restoring power.”
Unique Brownout Reduction Features
• Instant Failure Detection for Database and Storage Servers
– If a server disappears from both InfiniBand switches, declare it dead in less than two seconds
– No waiting for long heartbeat timeouts
• Key Benefit: Reducing application blackout from 30+ seconds to less than two seconds
Service Level Blackout/Brownout with Storage Server Failure
Published Results from Oracle and EMC
Service Outage with Storage Server (Brick) Failure
[Chart] Application Blackout (seconds): Exadata 0.8, EMC XtremIO 15
[Chart] Application Brownout (seconds): Exadata 0.8, EMC XtremIO 300
Failed Storage
“Here lies disk drive in slot 11 who lived a long life and served its purpose”
Efficient Rebalance with Service Level Protection
• Intelligent and flexible rebalance power setting
– Testing in MAA labs to find best balance between redundancy restoration timing and service level protection
– MAA best practice default of 4 set at deployment time
– MAA best practice max of 64 available as needed
– MOS note 757552.1 available with more information and guidance
• 12.2 ASM rebalance restores redundancy first
– Drastically reduces secondary failure exposure window
– Exposed via new REBUILD phase in v$asm_operation
Data/Metadata Performance Preservation Across Failures
Smart Re-caching and Sick Disk Avoidance that Preserves Service Level Performance
• Storage index preserved across rebalance
– Minimum Grid Infrastructure software version: Oracle Database 12c Release 1 (12.1) release 12.1.0.2.160119 with patch 22682752
• Caching hints passed during relocation to preserve optimal access timings
• Special “sick disk state” set to avoid reading from slow performing disk
• Special IO tags used to throttle flushing during recovery to avoid hot spots
Sick Storage
“Doc, I don’t feel so good.”
Sick Storage
Corruption Detection, Mirror Read, and Repair
• Database update encounters corruption
• Database reads ASM mirror copy and repairs corruption
• OLTP, Analytics, Consolidation, In-Memory DB continue to run without ever noticing the failure
Just in case the Administrator would like to know, we log the following:
<Database side>
Corrupt block relative dba: 0x16400087 (file 89, block 135)
Bad check value found during multiblock buffer read
Data in bad block:
type: 6 format: 2 rdba: 0x16400087
last change scn: 0x0000.b6702b33 seq: 0x1 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x2b330601
check value in block header: 0xa07a
computed block checksum: 0x3
Reading datafile '+DATA/qs/datafile/c.257.825768683' for corruption at rdba: 0x16400087 (file 89, block 135)
Read datafile mirror 'DATA_CD_08_CELL13' (file 89, block 135) found same corrupt data (no logical check)
Read datafile mirror 'DATA_CD_07_CELL14' (file 89, block 135) found valid data
Hex dump of (file 89, block 135) in trace file /u01/app/oracle/diag/… /qs1_ora_60475.trc
Repaired corruption at (file 89, block 135)
Sick Storage
Quality of Service – IO Latency Capping
[Diagram: Cell1, Cell2, Cell3]
IO Latency Capping
• Flash disk in Cell1, PCI slot 5 is exceeding performance thresholds during a database IO
• If it is a read, it is cancelled and automatically redirected to partner Cell3. Alert log reports “NOTE: ASM has redirected some slow reads to mirror sides to improve performance.”
• If it is a write, it is cancelled and temporarily written to a healthy flash disk on the same cell.
IO latency capping works for both flash and hard disks
IO Hang Comparison with and without Exadata
Service Level Impact
[Chart] LGWR Delay after Hung IO (seconds): Exadata 1, Traditional Storage 30
[Diagram: IO Request path through Cellsrv, OS, and Disk Controller; possible hang causes: 1. IO scheduling, 2. Controller Issues, 3. Bad Disk, 4. Bad Sector]
Sick Storage
Quality of Service – Disk Confinement
[Diagram: Cell1, Cell2, Cell3]
Disk Confinement
• Disk in Cell2, slot 7 becomes sick and is taken offline
• IOs redirected to one of the partner disks on Cell1, slot 3
• Dr. Exadata runs diagnostics on the disk to determine health
• If deemed healthy, disk is returned to online status and resynced
• If deemed unhealthy, health factor drop is performed, and blue LED is lit when rebalance completes
Disaster
“Make the call”
Redo Apply Performance For Different Releases
Range of Observed Apply Rates for Batch and OLTP
Standby Apply Rate in MB/sec
Release | High End - OLTP | High End - Batch
Oracle Database 9i | 18 | 25
Oracle Database 10g | 25 | 50
Oracle Database 11g (non Exadata) | 50 | 100
Oracle Database 11g (Exadata) | 250 | 650
For Oracle DB 12c, slightly higher rates
Best Data Protection Even in the Worst Circumstances
Priority Rebalance – Smarter Redundancy Restoration
• Oracle ownership of the database, volume management and filesystem enables redundancy restoration for the most critical files first, just in case a catastrophic full partner storage loss event occurs.
• MAA validated. Big benefits during dire circumstances. MOS 1968607.1
Priority restore
• Control Files
• Log Files
• SP files
• TDE key stores
• OCR
• Wallets
It’s not just bits and blocks, it’s your business and data.
Performance Problem
“The system is slow today”
One Big Happy Low Latency Family
Service Levels Maintained
• Exafusion provides low latency cache fusion (Smart Fusion Block Transfer)
• Upon commit, LGWR issues IO via low latency RDS call to cell
• NRM gives VIP priority lane to LGWR I/O
• Cell services LGWR IO with lowest possible latency via Flashlog
• LGWR releases commit waiters
Exadata AWR Support
Outlier Detection Example from a Real (Big) Customer
Software Maintenance
“A new software update is available. Would you like to apply?”
Software Maintenance
Infrastructure and Tools
• MAA Exadata Infrastructure enables software updates with no service level downtime using a rolling strategy
• Single tool, patchmgr, can be used to patch all three tiers (database, network, storage)
• MOS Note 888828.1 for guidance
About Rent-A-Center
• Rent-A-Center (NASDAQ: RCII) is one of the largest rent-to-own (“RTO”) operators in the U.S.
– ~4,600 locations across the US, Mexico, Canada and Puerto Rico
– ~2,800 Core U.S. locations
– ~2,000 Kiosks at retailers
– ~150 Mexico locations
– ~180 Franchised stores
• RAC founded in 1986; roots trace back to the 1960s
• Flexible rental purchase agreements that generally allow customers to obtain ownership at the conclusion of the rental term
Rent-A-Center – MAA Architecture w/Snapshots
[Diagram: two datacenters 10 miles apart, Plano, TX and Lewisville, TX. Exadata systems: Prod ½ (X5-2), Prod ¼ (X5-2), Non-Prod 1/8 (X5-2). PROD OLTP feeds PROD OLAP through GoldenGate for reporting and PROD D/R through Active Data Guard for D/R. Two physical Data Guard standbys serve as Test Master 1 (TM1) and Test Master 2 (TM2), the sources for Exadata Storage Snapshots.]
Rent-A-Center – Exadata Storage Snapshots - Overview
• Why Exadata Snapshots
– Exadata snapshots are ideal for creating space-efficient read-only or read-write snapshots of an Oracle database that you can use for development, testing, or other non-production purposes.
– Exadata snapshots can be used by developers and testers who need to validate functionality in a fully functional Exadata environment (for example, Exadata smart flash, smart scan, hybrid columnar compression).
• Types of Exadata Snapshots
– You have a pluggable database (PDB) and want to create a test master from it.
– You have a container database (CDB) and want to create test masters from all its PDBs, or you have a simple non-container database and want to create a test master from it.
** Test Master cannot be used for Disaster Recovery **
– Easier Refresh
– Minimal Impact on Primary
– Easier Cleanup and Re-clone
• Benefits for Rent-A-Center
– RAC provisions 6 cloned environments (Dev, Test, QA, UAT, Perf & Trn) from our 6TB production database.
– Initially it took us 2-3 business days to build one (1) non-prod copy of the production database environment using Data Pump.
– Then we cut that down to 1-2 days per clone using different RMAN processes.
– Ultimately we can now refresh a cloned production database environment in < 1 hour for the first and < 2 hours for the 6 environments that we clone.
• Storage savings
– Initially we would ask for 15+ TB to support each of the six (6) cloned environments. Storage was needed for the DB, file copy space, backups, etc.
– After implementing the Exadata Storage Snapshots, RAC is currently using only 3TB (physical) for 91TB (virtual) storage.
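The storage figures quoted above translate into a virtual-to-physical ratio and a percentage saved. The sketch below only restates that arithmetic; the function name is hypothetical, and the "full copies" figure assumes exactly 15 TB per environment, whereas the slide says "15+ TB".

```python
from typing import Tuple


def snapshot_savings(physical_tb: float, virtual_tb: float) -> Tuple[float, float]:
    """Return (virtual-to-physical ratio, percent of virtual capacity not physically used)."""
    if physical_tb <= 0 or virtual_tb <= 0:
        raise ValueError("sizes must be positive")
    ratio = virtual_tb / physical_tb
    saved_pct = 100 * (1 - physical_tb / virtual_tb)
    return ratio, saved_pct


# Figures from the slide: 3 TB physical backing 91 TB virtual.
PHYSICAL_TB = 3
VIRTUAL_TB = 91
# Assumption: six full copies at 15 TB each (the slide says "15+ TB").
FULL_COPY_TB = 6 * 15

if __name__ == "__main__":
    ratio, saved_pct = snapshot_savings(PHYSICAL_TB, VIRTUAL_TB)
    print(f"virtual-to-physical ratio: {ratio:.1f}x")
    print(f"space not physically consumed: {saved_pct:.1f}%")
    print(f"full-copy request before snapshots: at least {FULL_COPY_TB} TB")
```

On these numbers the snapshots present about thirty times more capacity than they physically consume.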
Rent-A-Center – Convert Physical Standby to Test Master
• Prerequisites
Found at docs.oracle.com (search: "Snapshots" > docs.oracle.com/cd/E50790_01/doc/doc.121/e50471/snapshot.htm#CIHCEDAJ)
• Convert Physical Standby to Test Master
This method walks through the steps necessary to convert an existing Data Guard physical standby database to a Test Master. Steps to create a Data Guard replica can be found in My Oracle Support note 1617946.1, “Creating a Standby using RMAN Duplicate (RAC or Non_RAC)”.
• Defer redo transport and redo apply
When the standby database is at a consistent state and can be opened read-only, stop transport and disable redo apply by executing:
DGMGRL> edit database TESTMASTER set property logshipping=OFF;
DGMGRL> edit database TESTMASTER set state=APPLY-OFF;
• Convert to Data Guard snapshot standby
You can now convert the standby database to a snapshot standby. This allows you to make modifications to the Test Master’s data files, such as data masking of any sensitive data.
SQL> Alter Database Convert to Snapshot Standby;
SQL> Alter Database Open Read Write;
• Generate and execute the necessary scripts to prepare the Test Master database
– Set the ownership of the Test Master data files. Example:
SQL> ALTER DISKGROUP DATA SET OWNERSHIP OWNER='Oracle' FOR FILE '+DATA/TESTMASTER/DATAFILE/system.257.865863315';
Run the following SQL to generate a dynamic SQL script that sets ownership on all the data files:
SQL> spool set_owner.sql
SQL> select 'ALTER DISKGROUP DATA set ownership owner='||''''||'Oracle'||''''||' for file '||''''||name||''''||';' from v$datafile;
SQL> spool off
SQL> @set_owner.sql (as SYSASM on +ASM)
– Set data file permissions to read-only. Example:
SQL> ALTER DISKGROUP DATAC1 SET PERMISSION owner=READ ONLY, group=READ ONLY, other=NONE FOR FILE '+DATAC1/TESTMASTER/DATAFILE/system.257.865863315';
Run the following SQL to generate a dynamic SQL script that sets read-only permissions on all the data files:
SQL> spool set_read_only_permission.sql
SQL> select 'ALTER DISKGROUP DATAC1 set PERMISSION owner=READ ONLY, group=READ ONLY, other=NONE for file '||''''||name||''''||';' from v$datafile;
SQL> spool off
SQL> @set_read_only_permission.sql (as SYSASM on +ASM)
• Close the database and open it read-only
The Test Master database is then ready to support the creation of Exadata Storage Snapshots.
SQL> shutdown immediate;
SQL> startup mount;
SQL> Alter Database Open Read Only;
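The two SQL generators in this procedure emit one ALTER DISKGROUP statement per data file. The same idea can be sketched outside the database, for example to review the statements before they are run. This is an illustrative sketch only: the function names are hypothetical, the file list is passed in rather than read from v$datafile, and file names are assumed to contain no single quotes.

```python
from typing import Iterable, List


def _check(name: str) -> str:
    """Reject names that would break the single-quoted SQL literal."""
    if "'" in name:
        raise ValueError(f"unexpected quote in file name: {name}")
    return name


def ownership_statements(diskgroup: str, owner: str, files: Iterable[str]) -> List[str]:
    """One ALTER DISKGROUP ... SET OWNERSHIP statement per data file."""
    return [
        f"ALTER DISKGROUP {diskgroup} SET OWNERSHIP OWNER='{owner}' FOR FILE '{_check(name)}';"
        for name in files
    ]


def read_only_statements(diskgroup: str, files: Iterable[str]) -> List[str]:
    """One ALTER DISKGROUP ... SET PERMISSION (read-only) statement per data file."""
    return [
        f"ALTER DISKGROUP {diskgroup} SET PERMISSION owner=READ ONLY, "
        f"group=READ ONLY, other=NONE FOR FILE '{_check(name)}';"
        for name in files
    ]


if __name__ == "__main__":
    # Example file name taken from the slide.
    datafiles = ["+DATA/TESTMASTER/DATAFILE/system.257.865863315"]
    for statement in ownership_statements("DATA", "Oracle", datafiles):
        print(statement)
    for statement in read_only_statements("DATA", datafiles):
        print(statement)
```

Generating the statements as plain text first makes it easy to confirm that the disk group name in each statement matches the disk group in the file path.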
Rent-A-Center – Create Storage Snapshot
• Create Storage Snapshots
Create snapshots and use for dev/test.
• Generate necessary scripts for creating snapshots
– Run the backup control file to trace command to place the "create controlfile" command into the trace file.
SQL> alter database backup controlfile to trace as '$ORACLE_HOME/crt_ctlfile.sql';
– Determine the existing file names to rename for your new snapshot.
SQL> spool rename_files.sql
SQL> select 'EXECUTE dbms_dnfs.clonedb_renamefile ('||''''||name||''''||','||''''||replace(replace(replace(name,'.','_'),'TESTMASTER','YourSnap'),'DATA','SPARSE')||''''||');' from v$datafile;
SQL> spool off
– Create, and edit as necessary, a pfile from the Test Master spfile (snap_pfile.ora)
• Complete the snapshot
Final steps to create the snapshot:
– Shut down the Test Master
SQL> Shutdown immediate;
– Create audit_file_dest directory
$ mkdir $ORACLE_HOME/admin/YourSnap
– Start up the database using the snapshot pfile
SQL> startup nomount pfile='snap_pfile.ora'
– Create controlfile from backup controlfile
SQL> @crt_ctlfile.sql
– Rename datafiles using the script created above
SQL> @rename_files.sql
– Open the Exadata snapshot database with the RESETLOGS option
SQL> alter database OPEN resetlogs;
– Add temp file (not sparse)
SQL> alter tablespace TEMP add tempfile '+DATAC1' size 10g;
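The nested replace() calls in the rename_files.sql generator can be traced by hand to see which names they produce. The sketch below mirrors the three substitutions in the same order; Python's str.replace, like Oracle's REPLACE, substitutes every occurrence of the search string. The function names are hypothetical, and the defaults ('TESTMASTER', 'YourSnap', 'DATA', 'SPARSE') are the literals used on the slide.

```python
def snapshot_file_name(
    name: str,
    master: str = "TESTMASTER",
    snapshot: str = "YourSnap",
    source_dg: str = "DATA",
    sparse_dg: str = "SPARSE",
) -> str:
    """Apply the same three substitutions, in the same order, as the SQL generator."""
    renamed = name.replace(".", "_")            # dots to underscores
    renamed = renamed.replace(master, snapshot)  # test master name to snapshot name
    renamed = renamed.replace(source_dg, sparse_dg)  # every occurrence of the disk group text
    return renamed


def rename_call(name: str) -> str:
    """Build the dbms_dnfs.clonedb_renamefile call text for one data file."""
    return (
        "EXECUTE dbms_dnfs.clonedb_renamefile "
        f"('{name}','{snapshot_file_name(name)}');"
    )


if __name__ == "__main__":
    # Example file name taken from the earlier slide.
    print(rename_call("+DATA/TESTMASTER/DATAFILE/system.257.865863315"))
```

Because the text DATA also occurs inside the DATAFILE directory component, this substitution turns that component into SPARSEFILE as well, so it seems prudent to review rename_files.sql before running it.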
Rent-A-Center – Refresh Test Master Database
• Refresh Test Master Database
To refresh the test master database, the main steps are:
• Drop Snapshot Database
Delete the Exadata snapshot databases that are children of the test master database you want to refresh. You can drop them using RMAN.
RMAN> startup mount force;
RMAN> alter system enable restricted session;
RMAN> drop database;
• Change the permissions on Test Master files to read-write
Set data file permissions to read-write. Example:
SQL> ALTER DISKGROUP DATA SET PERMISSION OWNER=READ WRITE, GROUP=READ WRITE, OTHER=NONE FOR FILE '+DATA/TESTMASTER/DATAFILE/system.257.865863315';
Run the following SQL to generate a dynamic SQL script that sets read-write permissions on all the data files:
SQL> startup mount
SQL> spool change_perm.sql
SQL> select 'ALTER DISKGROUP DATA set permission owner=read write, group=read write, other=none for file '||''''||name||''''||';' from v$datafile;
SQL> spool off
SQL> @change_perm.sql (as SYSASM on +ASM)
• Convert Data Guard snapshot standby back to physical standby
To let Data Guard refresh the standby, enable log shipping to the standby and redo apply on the standby:
SQL> alter database convert to Physical standby;
DGMGRL> edit database TESTMASTER set property logshipping=ON;
DGMGRL> edit database TESTMASTER set state=apply-on;