Getting Value from Application Performance Metrics Michael Sydor Engineering Services Architect, Author: APM Best practices

Page 1: Getting Value from Application Performance Metrics

Getting Value from Application Performance Metrics

Michael Sydor, Engineering Services Architect

Author: APM Best Practices

Page 2: Getting Value from Application Performance Metrics

2 © 2014 CA. ALL RIGHTS RESERVED.

Agenda

Why so many metrics with APM?
– “Big Data”?

What we are learning with CA-ABA (analytics)

How to find KPIs

What’s new for CA-APM 9.6 Release

Page 3: Getting Value from Application Performance Metrics

Typical APM Cluster

Dozens to hundreds of applications
– 2800 JVMs/CLRs

Up to 5M metrics, every 15 seconds

Large applications span multiple data centers
– 2-8 APM clusters, typical
– 30-70 EM Collectors for a nationwide portal application

12M to 28M metrics, every 15 seconds

… certainly sounds like big data!!!
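To put these figures in perspective, the arithmetic can be sketched as follows. The metric counts and the 15-second interval come from the slide; the function and variable names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60
REPORTING_INTERVAL_S = 15  # APM reporting interval quoted on the slide


def data_points_per_day(metric_count: int, interval_s: int = REPORTING_INTERVAL_S) -> int:
    """Number of metric samples produced in one day at a fixed reporting interval."""
    return metric_count * (SECONDS_PER_DAY // interval_s)


# Metric counts quoted on the slide.
single_cluster = data_points_per_day(5_000_000)
nationwide_low = data_points_per_day(12_000_000)
nationwide_high = data_points_per_day(28_000_000)

print(f"{single_cluster:,} samples/day for one 5M-metric cluster")
print(f"{nationwide_low:,} to {nationwide_high:,} samples/day for a nationwide portal")
```

At 5,760 reporting intervals per day, a single 5M-metric cluster produces about 28.8 billion samples per day.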

Page 4: Getting Value from Application Performance Metrics

What is Big Data???

APM information is “big”… but it is not “big data” without enrichment

[Diagram: 5M metrics that you don’t fully understand, OR 5M metrics that you don’t fully understand combined with ENRICHMENT]

Enrichment sources shown:
– Trouble Management
– Version Control
– Time of ____ Constraints
– Air Traffic Advisories
– Weather Forecast
– AP News Updates
– Marketing Campaigns

Results shown:
– Correlation
– Trends
– Insights
– Anomalies

Page 5: Getting Value from Application Performance Metrics

Challenges for Big Data

Data Variety – different sources give different perspectives. Does your data have a significant perspective?

Validation – is the data source meaningful/predictive?

Consistency – are the values trustworthy?

Data Structure and Nomenclature – Mapping, Transformation

Temporal Impedance Mismatch
– APM: real-time with 15 second reporting interval
– Trouble Management: +15-30 minutes later
– Stock Ticker: +15-30 minutes later
– Air Traffic Advisories: +30-60 minutes later
– Version Control: days to weeks in advance
– Marketing Campaign Assessment: 2-4 weeks later
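One way to picture the temporal mismatch is to map a late-arriving enrichment record back onto the span of APM time it may describe. The sketch below is a minimal illustration, not part of CA APM: the lag ranges are copied from the slide for three of the sources, while the dictionary keys and function names are hypothetical.

```python
from datetime import datetime, timedelta

APM_INTERVAL = timedelta(seconds=15)  # APM reporting interval from the slide

# Reporting lag (min, max) per source, as quoted on the slide.
# The key names are hypothetical labels chosen for this sketch.
SOURCE_LAG = {
    "trouble_management": (timedelta(minutes=15), timedelta(minutes=30)),
    "stock_ticker": (timedelta(minutes=15), timedelta(minutes=30)),
    "air_traffic_advisories": (timedelta(minutes=30), timedelta(minutes=60)),
}


def apm_window(source: str, reported_at: datetime) -> tuple[datetime, datetime]:
    """Span of APM time that a record reported at `reported_at` may refer to."""
    min_lag, max_lag = SOURCE_LAG[source]
    return reported_at - max_lag, reported_at - min_lag


def intervals_in_window(start: datetime, end: datetime) -> int:
    """How many 15-second APM reporting intervals fall inside the window."""
    return (end - start) // APM_INTERVAL


ticket_time = datetime(2014, 6, 1, 10, 30)
start, end = apm_window("trouble_management", ticket_time)
print(start, end, intervals_in_window(start, end))
```

A trouble ticket logged at 10:30 can only be placed somewhere in a 15-minute window, so it has to be compared against 60 APM intervals rather than one.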

Page 6: Getting Value from Application Performance Metrics

KPI Management Maturity

SGCM (Platform): Stalls, GC Settings, Concurrency, Memory Management Trends

APC (Application): Availability, Performance, Capacity

EKB (Transaction): Errors, Key Resource Performance, Business Transaction Survey

[Chart: VALUE plotted against KPI MATURITY]

Page 7: Getting Value from Application Performance Metrics

What We are Learning with CA-ABA

Page 8: Getting Value from Application Performance Metrics

ABA Logical Architecture

[Diagram: APM Cluster (5M Metrics) → 100k Metrics (via RegEx) → Anomaly Engine → Anomalies, Alerts]

Why only 100k Metrics??? Why not 5M???

Page 9: Getting Value from Application Performance Metrics

RegEx == Regular Expression

analytics.metricfeed.process.3 = Custom Metric Host (Virtual)\\|Custom Metric Process (Virtual)\\|Custom Business Application Agent (Virtual)

analytics.metricfeed.metric.3 = By Business Service\\|[^|]+\\|[^|]+\\|[^|]+:.+

Page 10: Getting Value from Application Performance Metrics

RegEx is hard… but easy to validate
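A quick way to validate a metric-feed expression is to run it against a handful of metric paths before deploying it. The sketch below uses the metricfeed.metric.3 expression from the previous slide; the sample metric paths are invented for illustration, and it assumes the product requires the whole path to match, which the slides do not state.

```python
import re

# Expression from analytics.metricfeed.metric.3. In the properties file the
# backslash is doubled ("\\|"); the regex engine itself sees "\|".
METRIC_PATTERN = re.compile(r"By Business Service\|[^|]+\|[^|]+\|[^|]+:.+")

# Hypothetical metric paths, made up for this check.
samples = [
    "By Business Service|Trading|Place Order|Login:Average Response Time (ms)",
    "By Business Service|Trading|Place Order:Average Response Time (ms)",
    "GC Heap:Bytes In Use",
]


def is_fed(metric_path: str) -> bool:
    """True if the whole metric path matches (full-match semantics are assumed)."""
    return METRIC_PATTERN.fullmatch(metric_path) is not None


results = {path: is_fed(path) for path in samples}
for path, matched in results.items():
    print("MATCH   " if matched else "NO MATCH", path)
```

Only the first path matches: the expression requires three pipe-separated segments after "By Business Service" before the colon, so the two-segment path and the GC Heap metric are both rejected.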

Page 11: Getting Value from Application Performance Metrics

Metricfeed.3

[Bar chart: metricfeed.3; categories: File System, JSP, Servlets, Threads, GC Heap, CPU, Heuristics, GC Monitor, RadiantLogic, Java Mail, z/OS Metrics, Enterprise Manager; vertical axis 0 to 200]

Broader collection of metrics but only 87/500 == 17.4% are generally known as useful

Page 12: Getting Value from Application Performance Metrics

Suspects Identified via Baseline Technique

[Bar chart: Suspects via Baseline Techniques, Average RT only; categories: SiteMinder, Backends, JSP, Frontends, JMX, Custom; vertical axis 0 to 18]

100% Useful metrics, ready for validation: 47/43625 == 0.1%

Page 13: Getting Value from Application Performance Metrics

Metric Count TypeView

Page 14: Getting Value from Application Performance Metrics

What is an Application?

Front-ends
– Browser? Webservice? Messaging?

Back-ends
– Databases, Webservices, Messaging, Mainframes, Trading_Partners

Muck-in-the-Middle
– Software quality, stability and scalability

We want to identify KPIs for each of these elements
– helps us build a useful dashboard for Operations
– helps expose what the resources are really doing
– helps us define acceptance criteria, to act proactively
– helps us to triage really effectively

Page 15: Getting Value from Application Performance Metrics

How to Find KPIs

Page 16: Getting Value from Application Performance Metrics

Capacity KPIs – “Tree Rings”

Page 17: Getting Value from Application Performance Metrics

Performance KPIs

High Volume + Significant Response Time
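The selection rule can be sketched as a filter on both dimensions rather than a ranking on response time alone. This is a minimal illustration under stated assumptions: the component names, sample values and thresholds are hypothetical, and ranking by invocations × response time is one plausible ordering, not something the slides prescribe.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    invocations: int        # calls per reporting period (volume)
    avg_response_ms: float  # average response time


def performance_kpis(components, min_invocations, min_response_ms):
    """Keep components that are both high volume and significantly slow."""
    picked = [
        c for c in components
        if c.invocations >= min_invocations and c.avg_response_ms >= min_response_ms
    ]
    # Rank by total time consumed (volume x response time), largest first.
    return sorted(picked, key=lambda c: c.invocations * c.avg_response_ms, reverse=True)


# Hypothetical sample data.
components = [
    Component("ReportServlet", 3, 9000.0),   # slowest, but almost never called
    Component("LoginServlet", 1200, 450.0),
    Component("AccountDAO", 5000, 120.0),
    Component("HealthCheck", 8000, 2.0),     # busiest, but trivially fast
]

kpis = [c.name for c in performance_kpis(components, min_invocations=500, min_response_ms=100.0)]
print(kpis)
```

A plain "top response time" list would put ReportServlet first even though it is called three times; the combined filter surfaces the components whose slowness users actually feel.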

Page 18: Getting Value from Application Performance Metrics

Create a Simple Alert and Threshold (ConnectionStatus)

Page 19: Getting Value from Application Performance Metrics

Create a Simple Alert, Find Restart and threshold (MetricCount)

“UP” – but not actually doing anything!!!
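The two availability alerts combine naturally: ConnectionStatus says whether the agent is attached, and Metric Count says whether it is reporting a normal amount of data. The sketch below is illustrative only. It assumes a ConnectionStatus value of 1 means "connected", and the baseline comparison and the 0.5 ratio are hypothetical choices, not CA APM defaults.

```python
CONNECTED = 1  # assumption: the value the agent reports while connected


def availability_state(connection_status: int, metric_count: int,
                       baseline_count: int, min_ratio: float = 0.5) -> str:
    """Classify an agent from its ConnectionStatus and live Metric Count."""
    if connection_status != CONNECTED:
        return "DOWN"
    # Connected, but reporting far fewer metrics than usual:
    # the agent is "UP" without actually doing anything.
    if metric_count < baseline_count * min_ratio:
        return "UP_BUT_IDLE"
    return "UP"


print(availability_state(1, 2500, 3000))
print(availability_state(1, 200, 3000))
print(availability_state(3, 0, 3000))
```

A sharp drop in Metric Count followed by a climb back toward the baseline is also a practical marker for an application restart.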

Page 20: Getting Value from Application Performance Metrics

Understanding Your Environment

Identify the KPIs
– Availability
  - Agent ConnectionStatus
  - Number Live Metrics (Metric Count)
– Performance
  - High Volume components with significant response time
  - NOT “Top 10 Response Time”
– Capacity
  - Highest Volume Components

Don’t Wait for Production!!!
– Make it part of your pre-production review
– Manage the application lifecycle by trending KPIs

Page 21: Getting Value from Application Performance Metrics

KPI Evolution

Good (Platform: coarse information, but not really APM)
– Stalls
– GC Settings
– Concurrency
– Memory Management (graph)

Better (additional)
– Availability – Connected Status
– Availability – Metric Count
– Suspect Performance
– Suspect Capacity

Best (additional)
– Errors
– Key Resource Performance
– Business Transaction Survey

Application, Transactions, Resources: The APM Advantage

Page 22: Getting Value from Application Performance Metrics

What’s New in CA APM 9.6

Simplified, automated, and built on CA APM strengths.

Faster, Easier APM

• Intelligent Deep Transaction Trace is now dynamic, automated, and requires less developer involvement for deep dives into apps supporting the transactions
• Simplified triage with easier drill-down in the Application Triage Map, including Socket Grouping
• Improved response times with software-based Transaction Impact Monitor (end-user experience)
• Expanding APM’s scope with Java 7 EM & Agents

Seamless Mainframe Awareness

• Increased insight by adding DB2 details to transaction traces
• Greater awareness with CA SYSVIEW MQ alerts & complete status in APM
• Driving further cross-enterprise depth with CTG traces to fully expand backend calls
• Other mainframe-based enhancements

Page 23: Getting Value from Application Performance Metrics

Preparing to Upgrade

HealthCheck the existing cluster prior to any upgrade

Good:
– Do a clean install of the APM Cluster, alongside the existing cluster version
– Manually duplicate management modules, domains.xml, etc.
– Bring down the old version, then bring up the new

Better:
– Install the new version in a separate, reduced-size environment
– Migrate a few applications to the new environment for validation
– Upgrade the primary environment after validation is achieved

Best:
– Install a new GOLD environment in production, separate from the original cluster
– Migrate agents, as schedules permit, until the original cluster may be decommissioned
– This provides an opportunity to introduce pre-production review and generally correct any bad deployment habits

Page 24: Getting Value from Application Performance Metrics

Resources

Community Site
– Cookbook: APM HealthCheck
– Understanding Which Metrics Matter (KPI discussion)
– Cookbook: Application Audit
  - more details on the baseline techniques and process

APM Best Practices – Realizing Application Performance Management
– available on Amazon.com and Apress.com
  - Baselines, Test Plans, App Audits, Triage, Firefighting
  - Organizational Models, Service Catalogs

Page 25: Getting Value from Application Performance Metrics

Questions and Answers