Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Real-World Performance Training: Extreme OLTP Performance
Real-World Performance Team
• Small transactions
• Processing small numbers of rows
• Fast (single-digit millisecond) response times
• Large numbers of users
• It’s all about efficiency
• Get-in-get-out
• Be nice to others
Extreme OLTP Workloads
Agenda
1. Connection Pools
2. Serialization
3. Leaking
4. Database / Middleware Interaction
5. Extreme IO
6. Tools for Diagnostics
7. Analysis
Connection Pools
• The primary reason for escalation of OLTP systems into the Real-World Performance Group is poor connection pooling strategies.
• Symptoms of a poor connection strategy:
– A high number of connections to the database (1,000s)
– A dynamic connection pool with a high rate of logons/logoffs to the database (> 1/sec)
– Periods of acceptable performance followed by unexplainable, undebuggable periods of poor performance and availability
– The inability to determine what is happening in real time
Architectural Challenge
Connection Pools
Performance Data: Connection Pools
The workload is incrementally increased by doubling the load.
System appears scalable up to 65% CPU on the DB server.
Performance Data: Connection Pools
A checkpoint is initiated, creating a spike that results in unpredictable response time
Performance Data: Connection Pools
A slight increase in the workload results in a disproportionate CPU increase, while response time degrades and system throughput becomes unstable. System monitoring tools become unreliable.
Performance Data: Connection Pools
Reducing the connection pool by 50% results in more application server queuing and fewer DB processes in a wait state. Observable improvement in response time and a consistent transaction rate.
Performance Data: Connection Pools
Another slight increase to the workload results in an unstable system, with spikes in response time and throughput.
Performance Data: Connection Pools
Connection pool reduced to 144. Note the improvement in response time and transaction rate. CPU utilization is also reduced.
Connection Pools
• In order to stay in control of an OLTP system, the DBA must be able to:
– Query the system
– Extract trace files
– Debug issues
• Without these tools, when a race condition occurs, it becomes impossible to determine the root cause of any problem or to prevent it from recurring
• Resource Management: learning how to manage and control resources
– RM is not a turbo; it is a gatekeeper
– RM will slow or stop some processes so that other processes can run efficiently
Staying in Control
Control: Connection Pools
By reducing the CPU_COUNT in the resource manager, the database can be throttled back. Note the increase in response time and the wait event resmgr: cpu quantum
Connection Pools
• Connection Storms
• SQL Plan degradations
Race Conditions
Connection Pools
• May be caused by application servers that allow the size of the pool of database connections to increase rapidly when the pool is exhausted
• A connection storm may take the number of connections from hundreds to thousands in a matter of seconds
• The process creation and logon activity may mask the real problem of why the connection pool was exhausted and make debugging a longer process
• A connection storm may render the database server unstable and unusable
Connection Storms
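To make the growth dynamics concrete, here is a small, purely illustrative Python model (not from the original deck; the function name and numbers are invented) of a dynamic pool that doubles whenever it is exhausted — the mechanism by which "hundreds" becomes "thousands" in seconds:

```python
# Toy model of a connection storm: a dynamic pool that doubles its size
# whenever outstanding demand exceeds capacity. Illustrative only.

def storm(initial_pool, demand_per_tick, ticks):
    """Return the pool size after each tick when the pool doubles on exhaustion."""
    pool = initial_pool
    outstanding = 0
    sizes = []
    for _ in range(ticks):
        outstanding += demand_per_tick      # stalled requests pile up
        while outstanding > pool:           # pool exhausted -> grow aggressively
            pool *= 2                       # each growth spawns logons en masse
        sizes.append(pool)
    return sizes

if __name__ == "__main__":
    # 200 connections, 150 stalled requests/sec piling up:
    print(storm(200, 150, 5))   # -> [200, 400, 800, 800, 800]
```

Note that the storm is driven by the backlog, not the steady-state load: the same demand that fit comfortably in 200 connections quadruples the pool once requests start queuing.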
The Anatomy of a Connection Storm
Connection Pools
Connection Pools
• SQL plan changes can be caused by multiple issues:
– Changes to init.ora parameters that influence the optimizer
– RDBMS version changes
– New schema statistics
– Bind peeking
• Poor SQL can also be introduced by:
– Badly tested new application code
– Tools version changes
• All have the same impact: they introduce SQL that executes sub-optimally, resulting in increases in:
– CPU
– I/O
SQL Plan Degradations
Connection Pools
• The increases in CPU and I/O can be orders of magnitude greater than with the optimal SQL plan
• The impact is usually to overload the system beyond its ability to respond
• When the number of concurrent sub-optimal SQL statements exceeds the number of CPUs, a race condition will take place
• Symptoms include:
– System appears hung
– Unable to get keyboard response from the database server, even at the console
– To regain control you may have to pull the power from the hardware
SQL Plan Degradations
Connection Pools
• SQL plan degradations are going to happen; it is important to be able to diagnose them and obtain the debug information before you lose the system. Several things can be done to help:
– Set init.ora CPU_COUNT to 75% of the actual CPUs and invoke the default resource plan, or construct a resource plan that leaves some resources idle
– Limit connections to the DB to 2-3 times the number of cores
– Ensure a DB connection pool cannot grow under load
SQL Plan Degradations
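The last rule above can be sketched in a few lines. This is a hypothetical, driver-agnostic illustration (the `FixedPool` class and `connect` factory are invented, not Oracle code): a fixed-size pool backed by a bounded queue makes borrowers wait in the application tier instead of spawning new database sessions under load:

```python
import queue

class FixedPool:
    """Fixed-size connection pool: borrowers queue, the pool never grows.
    `connect` is a caller-supplied connection factory (hypothetical)."""

    def __init__(self, size, connect):
        self._q = queue.Queue(maxsize=size)
        for _ in range(size):            # all connections created up front
            self._q.put(connect())

    def acquire(self, timeout=None):
        # Blocks (queues in the app tier) rather than opening new sessions;
        # raises queue.Empty on timeout instead of storming the database.
        return self._q.get(timeout=timeout)

    def release(self, conn):
        self._q.put(conn)

if __name__ == "__main__":
    # Sized at roughly 2-3x cores, per the guidance above (4 for a tiny box):
    pool = FixedPool(4, connect=lambda: object())
    c = pool.acquire()
    pool.release(c)
```

The design choice is deliberate: queuing in the middle tier is cheap and observable, whereas queuing as extra processes on the database server consumes the very CPU needed to clear the backlog.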
Serialization
• The current system performance is acceptable
• New requirement: Capture transaction summary statistics and store data in a flattened “history” table for further analysis
• Implementation of the INSERT statement causes a dramatic reduction in system performance
• The accepted solution must scale, as the application will be deployed globally in the new year
Observations
Serialization
Performance Data: Serialization
Transaction rate: 52,000 TPS; response time: 2 ms. Dominant wait events: log file sync and CPU.
Performance Data: Serialization
Transaction rate: 5,300 TPS; response time: 18 ms. Dominant wait events: buffer busy waits and enq: TX - index contention.
Reverse Key Index: Serialization
Transaction rate: 19,000 TPS; response time: 5 ms. Contention reduced while the history table remains small. CPU increases with workload.
Reverse Key Index: Serialization
Transaction rate: 3,000 TPS; response time: 33 ms. Contention shifts to the I/O subsystem when the history table grows larger.
Hash Partition Index: Serialization
Transaction rate: 20,000 TPS; response time: 5 ms. Contention reduced by the hash-partitioned history table index.
Add RAC Node: Serialization
Transaction rate: 21,000 TPS; response time: 9 ms. RAC appears not to scale well, with contention in cluster waits.
Composite Key Index: Serialization
Transaction rate: 35,000 TPS; response time: 5 ms. RAC configuration scales well with no contention or gc waits.
Serialization
• The INSERT statement results in index contention due to a surrogate primary key generated from a sequence
• Solution 1: REVERSE KEY primary key index
– Proves unsatisfactory after the table has grown to a critical size
• Solution 2: HASH PARTITION primary key index
– Proves unsatisfactory after moving to a RAC cluster
• Solution 3: Code the sequence to ensure no index contention
– New sequence structure: instance# | session id | sequence
– Demonstrates optimal scaling and response time characteristics
Data Analysis
The Scalable Key Code
INSERT INTO clo_normal.log_big_clever
VALUES
( 1E+25 * USERENV('INSTANCE')
  + 1E+20 * MOD(USERENV('SID'), 100)
  + clo_normal.big_clever_seq.nextval,
  0, 0, 'card id', 'HallID', 'TID', sysdate, 1, 12, 'WheelPos',
  123456, 123456, 123456, 0, 1 );
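Outside the database, the same weighting can be checked with exact integer arithmetic. This is an illustrative sketch (`scalable_key` is an invented helper that mirrors the slide's 1E+25 / 1E+20 factors), showing that concurrent sessions insert into widely separated key ranges:

```python
def scalable_key(instance, sid, seq):
    """Compose instance | sid mod 100 | sequence into a single key,
    mirroring the INSERT above. Exact ints avoid float rounding at 1E+25."""
    return 10**25 * instance + 10**20 * (sid % 100) + seq

# Two concurrent sessions on the same instance:
k1 = scalable_key(1, 101, 5000)
k2 = scalable_key(1, 102, 5001)

# Their keys are at least one sid-band (1E+20) apart, so their inserts
# land in different index leaf blocks and never contend:
assert abs(k1 - k2) >= 10**20
```

This is why the composite key removes the right-hand hot block: each session owns its own monotonically growing range instead of all sessions racing for the rightmost leaf.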
Performance Testing Results: Serialization
(Chart: Tx rate per sec and response time (ms) for the configurations Normal, Reverse Key Small, Reverse Key Big, Hash Big, Hash Big 2 Nodes, and Smart Big 2 Nodes)
Response Time
• Response time impacted by non-database code
– Application server time
• Code problem
• Capacity planning
• Response time impacted by database internal contention
– Right-growing index problem
• Reverse key index (good for small tables)
• Hash-partitioned index (good for large tables, poor with RAC)
• Non-contentious primary key (good in all scenarios; application code change required)
Debugging by Numbers
Leaking
Leaking
• Intermittent error: “ORA-01000: maximum open cursors exceeded”. The application server fails and must be restarted
• The DBA has suggested that the init.ora parameter open_cursors be raised to 30,000 to make the problem “go away for a while”
• These are symptoms of cursor leaking
Observations
Performance Data: Leaking
Error message: ORA-01000: maximum open cursors exceeded
“SQL*Net break/reset to client”
Session Details: Leaking
SQL*Net break/reset to client
Cursor Data: Leaking
Cursor list with Count > 1 implies “leaked” cursors
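That "Count > 1" heuristic is easy to express as a post-processing step. A hedged sketch (the input shape imitates `(sid, sql_id)` rows as you might pull from v$open_cursor; `suspect_cursors` is an invented helper, not an Oracle tool):

```python
from collections import Counter

def suspect_cursors(open_cursors, threshold=1):
    """Given (sid, sql_id) rows, return the sql_ids that a single session
    holds open more than `threshold` times -- a classic sign of cursors
    opened in a loop and never closed."""
    per_session = Counter(open_cursors)        # (sid, sql_id) -> open count
    return sorted({sql for (sid, sql), n in per_session.items() if n > threshold})

rows = [(17, "a1b2"), (17, "a1b2"), (17, "a1b2"), (17, "c3d4"), (42, "c3d4")]
print(suspect_cursors(rows))   # -> ['a1b2']
```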
Leaking
• After a period of time, the system performance begins to decline and then degrades rapidly
• After rapid degradation, the application servers time out and the system is unavailable
• The DBA claims the database is not the problem and simply needs more connections
– The init.ora parameter processes is increased to 20,000
Observations
Performance Data: Leaking
Load diminishes to zero
Leaking
• Due to coding errors in exception handling, the application leaks connections in the connection pool, making them programmatically impossible to use
• This reduces the effective size of the connection pool
• The remaining connections are unable to keep up with the incoming workload
• The rate of connection leakage accelerates until there are no usable connections left in the pool
Session Leaking
Leaking
• Potential indicators of session leaking:
– Frequent application server resets
– init.ora parameters processes and sessions set very high
– Configuration of large and dynamic connection pools
– Large number of idle connections connected to the database
– Free memory on database server continually reduced
– Presence of idle connection kill scripts or middleware configured to kill idle sessions
Session Leaking
Leaking
• Without warning, the database appears to hang and the application servers time out simultaneously
• The DBA sees that all connections are waiting on a single lock, held by a process that has not been active for a while
• Each time the problem occurs, the DBA responds by running a script to kill the sessions of long-time lock holders, allowing the system to recover
Observations
Performance Data: Leaking
Leaked lock holder causes the entire system to stall
Blocking Tree: Leaking
Blocking tree reveals the lock holder, which can be killed
Performance Data: Leaking
enq: TX – row lock contention
Performance Data: Leaking
System returns to normal when the lock is released. There could have been logical data corruption, and it may happen again.
Leaking
• Lock leaking is usually a side effect of session leaking, with the exception-handling code failing to execute a commit or rollback
• A leaked session may be programmatically lost to the connection pool while holding locks and uncommitted changes in the database
Lock Leaking
Leaking
• Programming error impact:
– Potential system hangs: all connections queue up behind the held lock
– Potential logical database corruption: end users may believe transactions were committed when in fact they were not
– If sessions return to the connection pool while still holding uncommitted changes, it is not deterministic if or when the changes are committed or rolled back. This is a serious data integrity issue
Lock Leaking
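The remedy is structural, not tunable: every code path must end the transaction and give the connection back. A minimal sketch, assuming a DB-API-style connection (commit/rollback methods) and an invented pool object with acquire/release:

```python
def run_transaction(pool, work):
    """Borrow a connection, run `work(conn)`, and guarantee that the
    transaction ends (commit or rollback) and the connection returns to
    the pool -- even when `work` raises. Shapes are DB-API-like."""
    conn = pool.acquire()
    try:
        result = work(conn)
        conn.commit()            # only reached on success
        return result
    except Exception:
        conn.rollback()          # never leave locks or uncommitted changes behind
        raise
    finally:
        pool.release(conn)       # the pool never loses a connection
```

With this shape, the two failure modes above cannot occur: the `finally` block prevents session leaking, and the `except` block prevents a leaked lock or an accidental later commit of half-done work.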
Leaking
• Developer bugs:
– Incorrect/untested exception handling
• Cursor, session and lock leaking
– High values for init.ora parameters (open_cursors, processes, sessions)
– Idle process and lock holder kill scripts
– Oversized connection pools of largely idle processes
How to Develop High Performance Applications
Database / Middleware Interaction
Scenario: Database / Middleware Interaction
• Devices ship files
• Files are read and processed by multiple application servers
• Each application server uses multiple threads that connect to the database through a connection pool, distributed by a SCAN listener over two instances
Problem: Database / Middleware Interaction
• It’s too slow
• It’s a problem with the database
– Look at all those waits
• Need to be able to process an order of magnitude more data
• Obviously need to move to Hadoop
Analysis: Database / Middleware Interaction
• Only a small amount of data is being processed
• Both instances essentially idle, with most processes waiting on RAC and concurrency waits
Solution: Database / Middleware Interaction
• Remove all of those RAC waits by running against a single database instance.
Analysis: Database / Middleware Interaction
• Throughput up by a factor of 2
• RAC waits gone
• CPU time actually visible
• High concurrency waits
– Buffer busy
– Tx index contention
Solution: Database / Middleware Interaction
• Reduce contention waits by processing a file entirely within a single application server
Analysis: Database / Middleware Interaction
• Throughput improved again
• Concurrency events reduced but still present
Solution: Database / Middleware Interaction
• Introduce affinity of a related set of records to a single thread by hashing
• All records for the same primary key are processed by a single thread, so there is no contention in the index for the same primary key value
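The affinity scheme amounts to a deterministic key-to-worker mapping. An illustrative sketch (the `route` helper is invented; `zlib.crc32` is used as a stable hash because Python's built-in `hash()` is randomized per process for strings):

```python
import zlib

def route(primary_key, n_workers):
    """Map a record's primary key to a worker slot. Every record for the
    same key serializes on one thread, eliminating index contention for
    that key across threads."""
    return zlib.crc32(str(primary_key).encode()) % n_workers

# The same key always lands on the same worker; different keys spread out:
assert route("order-42", 8) == route("order-42", 8)
assert 0 <= route("order-43", 8) < 8
```

Any stable hash works; the only requirement is that the mapping is consistent for the lifetime of the processing run, so two threads never hold updates for the same key concurrently.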
Analysis: Database / Middleware Interaction
• More throughput
• Log file sync predominant event
• CPU usage close to core count
Solution: Database / Middleware Interaction
• Reintroduce RAC to add more CPU resource
• Implement separate service for each instance
• Connect application server to one instance
Extreme IO
Performance Data: Extreme IO
Erratic database performance; the database slowdown is magnified in the application server.
Performance Data: Extreme IO
Note the smoothing effect of async logging: log file sync events have been replaced with log buffer space events. Beware of data integrity issues!
Resolution: Extreme IO
With uniform log file sync events, we now see a slight slowdown of other DB I/O. Performance is consistent, as the flash cache protects the system from I/O surges.
Extreme IO
• Average response times for log writes mask outlier values of very high write times, > 1 sec (log file parallel write)
• Investigation reveals surges on shared storage from unrelated workloads
• The business requires full data integrity; asynchronous logging is not an option due to potential data loss
• There are three potential solutions:
– Eliminate the background workloads
– Relocate log files to dedicated devices
– Use Exadata flash logging
Data Analysis
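The first observation deserves a number. With made-up latencies (999 fast writes and one stall; the figures are invented for illustration), a quick sketch shows how a comfortable average coexists with a catastrophic tail:

```python
def mean_and_max(samples_ms):
    """Summarize write latencies: the mean can look fine while the tail
    (the writes that stall log file sync) is catastrophic."""
    return sum(samples_ms) / len(samples_ms), max(samples_ms)

# 999 log writes at 1 ms, one 1,200 ms stall (hypothetical numbers):
writes = [1.0] * 999 + [1200.0]
avg, worst = mean_and_max(writes)
print(round(avg, 1), worst)   # -> 2.2 1200.0
```

A 2.2 ms average would pass most monitoring thresholds, which is why outlier-aware views (max, high percentiles, wait-event histograms) are needed to catch this class of problem.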
Main Tools for OLTP Diagnostics
• AWR Report
– Diagnostic information on the instance or database level
– Information for the system as a whole
• ADDM Report
– Recommendations based on AWR information
• ASH Report
– More granular information such as the session level
Instance- and database-level reports of database activity; HTML reports are preferred
AWR Report
SQL>REM For Current Database Instance
SQL>@?/rdbms/admin/awrrpt.sql
SQL>REM For Full RAC Clusterwide Report
SQL>@?/rdbms/admin/awrgrpt.sql
SQL>REM For SQL Exec History (Detecting Plan Changes)
SQL>@?/rdbms/admin/awrsqrpt.sql
AWR Report
• AWR report can be accessed by navigating
– Database Tab
– Automatic Workload Repository
– Select snapshots
• You may find the command line interface is quicker!
EM Access to Reports
EM Access to Reports: AWR Report
ADDM Report
• Performs performance diagnostic analysis and makes recommendations for improvement.
• The ADDM report should accompany any AWR report as a matter of standard practice
ADDM
SQL>REM For Current Database Instance
SQL>@?/rdbms/admin/addmrpt.sql
ADDM Report
• ADDM report can be accessed by navigating
– Performance Tab
– “Historical” Right hand drop down menu
– Select snapshot
– (Run ADDM) Button
EM Access to Reports
EM Access: ADDM Report
Select “Historical”
EM Access: ADDM Report
Choose analysis icon of interest
Run ADDM Report
ASH Report
• Active Session History reports provide fine-grained, tightly scoped reports, e.g. for a short time period (< AWR interval), for an individual session, or for a particular module
ASH
SQL>REM For Current Database Instance
SQL>@?/rdbms/admin/ashrpt.sql
ASH Report
• ASH report can be accessed by navigating
– Performance Tab
– “Top Activity” Link
– Slide moving window over period of interest
– (Run ASH Report) Button
EM Access to Reports
ASH Report
Select Top Activity Link
ASH Report
Slide to period of Interest
Run ASH Report
AWR Architecture Analysis
• Large amount of data in the AWR report
• Tells us about the way that the system has been architected and designed as well as about how it is performing
• Often see common mistakes
More than just wait events and top SQL
AWR from Online System: Ready for Black Friday?
AWR from Online system
• Testing system for Black Friday readiness
• Cannot generate load expected on test system
• Do you see any problems with this system scaling up from this test?
• Will we survive Black Friday ?
AWR Header
• 32 cores available
• Over-processed
– Sessions is 100x cores
– Session count growing
• Session leak
• Dynamic connection pools
• Cursors per session growing
– Cursor leakage
Load Profile
• ~260 sessions active on average
• ~40 on CPU
– Only have 32 cores
– System CPU limited
• 10 logons per second – In a stable system?
• Session leaks
• Dynamic connection pools
• 40% of user txns are rollbacks
– Coding for failure
Init.ora
• Underscore parameters
• Db_block_size=16384
• Cursor_sharing=FORCE
• Db_file_multiblock_read_count=32
Init.ora
• Db_writer_processes=12
– On a system that supports async I/O?
• Open_cursors=2000
– Per session limit
– Implies cursor leaking
Init.ora
• Optimizer_index_cost_adj=50
– Classic hack parameter
• Processes=5500
• Sessions=8320
Top events
• Concurrency waits > 35% of time
– Library cache: mutex X
– Latch: row cache objects
– Typical of high CPU load
– A symptom, not the problem
• Row lock contention 18% of time
• IO with 7ms avg read time
• CPU only 15% of DB Time
• Log file sync?
Where is the time going?
Top SQL
• Top statement:
SELECT /*SHOP*/ YFS_ORDER_HEADER.*
FROM YFS_ORDER_HEADER
WHERE (ORDER_HEADER_KEY = :1 )
FOR UPDATE
– 12% of load
– 2 million executions
– Average execution 0.1 sec
Where is the time going?
Top SQL
• Top statement:
SELECT /*SHOP*/ YFS_ORDER_HEADER.*
FROM YFS_ORDER_HEADER
WHERE (ORDER_HEADER_KEY = :1 )
FOR UPDATE
– 40% of row lock waits (RLWs) are on that object
Top SQL
• Next two statements
– Calls to DBMS_APPLICATION_INFO
• Application instrumentation
– 14% of load
– 26M executions each
– Instrumentation is a good thing BUT
– Not needed since Oracle 10g
– Use parameters to OCI or Java instead
Online Summary
• System is CPU bound at test load levels
• System seems to be leaking both cursors and sessions (and maybe locks)
• System is running far too many processes
• High overhead application instrumentation
Not looking good for Black Friday
Operating System Diagnostics
• Captures operating system data:
– top
– vmstat
– netstat
– iostat
– mpstat
• Described in MOS Notes 1617454.1 and 301137.1
• On Exadata, a modified version (ExaWatcher) is installed by default on all database nodes and storage cells
ExaWatcher and OSWatcher
Operating System Diagnostics
• Lives in:
– /opt/oracle.ExaWatcher
– /opt/oracle.oswatcher/osw
• The collected data will be under:
– /opt/oracle.ExaWatcher/archive
– /opt/oracle.oswatcher/osw/archive
– One subdirectory per collector
• Should purge data after seven days
ExaWatcher and OSWatcher