The John Hancock Monitoring Story, FutureStack17

Page 1: The John Hancock Monitoring Story, FutureStack17

We operate as John Hancock in the United States, and Manulife in other parts of the world.

The John Hancock Monitoring Story: Implementation OR Adaptation?

What does it take to succeed with New Relic?

September 2017

Page 2: The John Hancock Monitoring Story, FutureStack17

Navpreet Singh, Head of Technical Resolution at John Hancock

Page 3: The John Hancock Monitoring Story, FutureStack17

Manulife & John Hancock

Source: http://www.manulife.com/Our-Story

A Global company

22 million customers, 35,000 employees, 70,000 agents, thousands of distribution partners

Global Assets Under Management and Administration exceeded $1 trillion in the first quarter of 2017

Page 4: The John Hancock Monitoring Story, FutureStack17

Technology Landscape @ John Hancock

150 Year-old Business

Early IT Adapter

Using mainframe, cloud, and everything in between

Mainframe, COBOL, Micro Focus…

Serverless, Microservices, In Cloud

VB, PB, Progress, VFP…

Java, .NET, Ruby, Node, Angular, React, PHP…

Windows, Linux, Solaris, AIX…

SQL Server, Oracle, DB2, MySQL…

…And every version of these!

Page 5: The John Hancock Monitoring Story, FutureStack17

Technology Landscape @John Hancock

600+ applications developed both in-house and with vendors

Hosted on multiple models

Thousands of IT/IS professionals

Page 6: The John Hancock Monitoring Story, FutureStack17

The Manulife/John Hancock Reality

Before New Relic

Page 7: The John Hancock Monitoring Story, FutureStack17

Disparate Monitoring Solutions

Many different approaches to monitor applications

No monitoring software for many applications

Basic hardware monitoring for ops and vendors

But… applications talk to each other all the time!

Result: Large holes in end-to-end monitoring

Page 8: The John Hancock Monitoring Story, FutureStack17

Example Scenarios: Performance Issues

Web page loading slow

Batch process running slow

Don’t know whether it is a CPU, RAM, disk, SQL, or app issue

Dev team can only access app logs; can’t capture CPU/RAM usage

Need server admin & DBA

Meet service admin to capture CPU/RAM usage

Wait for assigned admins to respond

Takes hours to days just to obtain data before troubleshooting

Page 9: The John Hancock Monitoring Story, FutureStack17

Example Scenarios: Application Errors

Web page errors

App layer / Business layer errors

SQL errors

Dev team uses app logs; limited insight

Need to bring the issue to lower regions and debug the code

Time-consuming exercise; no real-time trace of Web page -> App component -> SQL invoked from App

Lack of detailed thread-level tracing for performance issues

Need architect / admins

Page 10: The John Hancock Monitoring Story, FutureStack17

Increased Priority Incidents = Need for Better Monitoring

Move from reactive to proactive

We needed a central monitoring standard

Resolve issues quickly

Improve understanding of application behavior

Improve visibility into applications in production

Enter New Relic

Page 11: The John Hancock Monitoring Story, FutureStack17

We’re All a Product of Our Environment!

What Else Was Happening When New Relic Was Being Introduced?

Page 12: The John Hancock Monitoring Story, FutureStack17

What Else Was Happening?

Move to Cloud: predominantly Azure IaaS with some PaaS, App Service

Some AWS

Move to Agile: largely Scrum, SAFe with some advanced concepts like TDD+Pairing

Push to DevOps

New Relic push aligns with DevOps and Agile

Page 13: The John Hancock Monitoring Story, FutureStack17

CIO/COO sets a Clear Goal!

An aggressive, clear, & unambiguous goal:

All applications in Production must be monitored by New Relic within one year

Page 14: The John Hancock Monitoring Story, FutureStack17

What’s Next?

What’s the right Team Structure?

Who should own monitoring setup and responsibilities?

Page 15: The John Hancock Monitoring Story, FutureStack17

Monitoring Ownership

Goal: End-to-end monitoring solution which spans tiers, hardware, and software

Monitoring Servers: Ops team has clear ownership

Monitoring Applications: Not so clear

Page 16: The John Hancock Monitoring Story, FutureStack17

Monitoring Ownership options

1. A specialized central monitoring team focused on application monitoring

2. Ops team owns all monitoring, drives it with the application teams

3. Each app team owns setting up monitoring

Page 17: The John Hancock Monitoring Story, FutureStack17

Our Ownership Solution at JH: It’s a Hybrid!

Each app team owns setting up monitoring for their applications

Center of Excellence set up to drive the effort

Culture change – very important. This distinguishes adaptation from a simple software implementation

For one BU with 100+ apps, a central monitoring team established within the BU

Page 18: The John Hancock Monitoring Story, FutureStack17

Engagement Methodology with App Teams

1st set of Meetings: New Relic Buy-in

2nd set of Meetings: App’s Tech

Proposal: App + New Relic = Great Things!

Periodic Check-ins

Page 19: The John Hancock Monitoring Story, FutureStack17

Adaptation: Best Practices & Suggestions

Culture Change: Get Buy-In

Highlight the Wins & Success Stories to Top Leadership

Nurture an Internal Community

Monitoring Maturity Curve: Different types of monitoring

Alerts – Getting them right

Insights – IT Analytics

Insights – Business Analytics

Page 20: The John Hancock Monitoring Story, FutureStack17

Agile mindset to the project

Bias towards action

Don’t sit in a room discussing / researching until you know all the answers

Figure out enough to get started, start executing, find answers in the process – Inspect and Adapt

Page 21: The John Hancock Monitoring Story, FutureStack17

Metrics Highlighted to Track Progress

Progress shared monthly with all Senior IT Leaders

Metrics showed:

# of users

Growth over a period: % Apps by Status

Monthly growth by BU

Agent Type                               Min. Contracted   Apr    May    Jun
APM (Application Performance Monitors)   264               61     98     126
Servers                                  Unlimited         575    675    725
Mobile Apps                              250000            0      0      298
Browser (Million Checks)                 75                1.5    8.3    11
Synthetic* (Million Checks)              1.5               1.4    1.4    0.7

[Chart: apps ‘In Progress’ and ‘Completed’, Jan-17 through Jun-17; series labeled JH and DA]
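One way to read the agent figures on this slide is as usage against the contracted minimum. The sketch below is only an illustration of that arithmetic: it assumes the Apr to Jun columns are usage counts and that "Min. Contracted" is the contracted quantity, which the slide does not state explicitly. The names `CONTRACTED_VS_JUNE`, `utilization_pct`, and `utilization_report` are made up for this example.

```python
# Figures copied from the "Min. Contracted" and "Jun" columns of the slide.
# "Servers" is left out because its contracted quantity is "Unlimited".
# Assumption: the monthly columns are usage counts for each agent type.
CONTRACTED_VS_JUNE = {
    "APM": (264, 126),
    "Mobile Apps": (250000, 298),
    "Browser (million checks)": (75, 11),
    "Synthetic (million checks)": (1.5, 0.7),
}


def utilization_pct(contracted, used):
    """Return usage as a percentage of the contracted quantity (1 decimal)."""
    return round(100.0 * used / contracted, 1)


def utilization_report(figures):
    """Map each agent type to its utilization percentage."""
    return {name: utilization_pct(c, u) for name, (c, u) in figures.items()}


if __name__ == "__main__":
    for name, pct in utilization_report(CONTRACTED_VS_JUNE).items():
        print(f"{name}: {pct}% of contracted quantity")
```

Under these assumptions, APM usage in June works out to a little under half of the contracted quantity, which fits the slide's framing of adoption still being in progress.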

Page 22: The John Hancock Monitoring Story, FutureStack17

Speed Bumps?

Before You Can Live Happily Ever After…

Page 23: The John Hancock Monitoring Story, FutureStack17

Some speed bumps we faced?

Firewall – took a long time to resolve internally

SSL issue with older Java apps

Sweet spot – Great with tech within the last 20-30 years and upcoming technologies

IBM technologies

PMI Metrics with WebSphere

Private Locations Azure deployable image

Server Agents (& breadth)

Page 24: The John Hancock Monitoring Story, FutureStack17

Some Happy Endings…

Page 25: The John Hancock Monitoring Story, FutureStack17

Results - Success Stories

APM: A group improved page performance by 3 seconds per page load by identifying tuning opportunities in a SQL statement that was executed multiple times for every page load

Synthetics: A group identified that a 100+ MB static file was being served by web servers in MA instead of the Akamai CDN

SQL Server Plugin: A team identified that their Page Life Expectancy had deteriorated drastically since the DB moved to a new server, indicating inadequate RAM allocation

Insights: A team identified that uneven load distribution across servers was causing severely degraded performance

Server API+Synthetics: A team uses alerts on memory exhaustion to avoid what used to be definite downtime
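The APM story above turns on spotting a SQL statement that runs several times within a single page load. The following is a minimal sketch of that idea, not the team's actual method and not a New Relic API: it assumes the statements recorded for one page load are already available as a list of strings, and `repeated_statements` and `SAMPLE_TRACE` are hypothetical names with invented sample data.

```python
from collections import Counter


def repeated_statements(statements, threshold=2):
    """Return {sql: count} for statements seen at least `threshold` times.

    Statements are compared after collapsing whitespace and lowercasing,
    which is a crude normalization that is good enough for a sketch.
    """
    normalized = (" ".join(s.split()).lower() for s in statements)
    counts = Counter(normalized)
    return {sql: n for sql, n in counts.items() if n >= threshold}


# Hypothetical statements captured during one page load.
SAMPLE_TRACE = [
    "SELECT * FROM orders WHERE customer_id = ?",
    "SELECT name FROM product WHERE id = ?",
    "select name from product  where id = ?",
    "SELECT name FROM product WHERE id = ?",
]

if __name__ == "__main__":
    for sql, count in repeated_statements(SAMPLE_TRACE).items():
        print(f"{count}x per page load: {sql}")
```

A statement that shows up many times per page load is often a candidate for batching or caching, which is the kind of tuning opportunity the story describes.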

Page 26: The John Hancock Monitoring Story, FutureStack17

Going Forward… The Journey Continues

Recently Acquired Infrastructure Product

NR Software Analysis Review

NR Expert Services

Increased Insights Retention Period

Miles to go…

Page 27: The John Hancock Monitoring Story, FutureStack17

Questions?