BNZ z/Linux Platform: ‘a user experience from across the ditch’
Pieter Schutte & Brett Walker – Senior Technical Specialist - z/VM, z/OS SysProg
Rodger Donaldson – Senior Technical Specialist - Midrange - Linux & Solaris
John Marshall – Senior Technical Specialist - Infrastructure Design & Internet Banking Rewrite Technical Lead
z/VM and Linux on System Z Technical Education Sessions - 21 October 2009

BNZ z/Linux Platform - IBM z/VM | IBM ·  · 2017-05-09BNZ z/Linux Platform ... Tier 1 Platform Required for Application Repatriation ... In house J2EE application using the Spring

Agenda

• The Bank of New Zealand & Group
• IT Challenges & Front End Transformation
• Why z/Linux
• Building The Solution. Our Real World Experiences
• Taking Advantage of the Architecture
    • Infrastructure DR
    • Application Availability
• Roadmap
• Summary and Learning

The National Australia Group

• Asset base of A$250 billion
• More than A$400 billion in assets under administration
• Almost nine million customers
• Ranked as one of the 50 largest banks in the world
• Represented across 4 continents and 15 countries

Bank of New Zealand

• Employs 5000 staff
• Acquired by NAB group in 1992
• >2 million accounts
• 700k customers
• ~100 retail banking products
• 185 Branch Sites
• 300 IT Staff
• First Computer 1966 (an IBM 360/30 with 16k memory)

IT Challenges & Front End Transformation

Our Key IT Challenges

Data Centres
• Power – Can’t get enough power (now monitoring watts)
• Space – One server in -> One server out
• Moving out of our Wellington Data Centre (Production)

Focus on IT Costs
• Challenged with keeping IT costs flat
• Less appetite for IT spend during global downturn

Corporate Values
• Carbon Neutral as an organisation by 2010
• Aligning our IT infrastructure with our corporate ‘green’ values

Tier 1 Platform Required for Application Repatriation
• Repatriate our Tier 1 Applications.

Why z/Linux?

• Strategic Themes – Cultural fit with simplification and carbon footprint organisational goals.
• Availability – Mainframe hardware enables service level metrics to match those of our core systems. (No data loss)
• Simplified Support – The tier-1 applications are built upon an IBM software stack. Running these on the z/Linux platform creates a one-stop shop for any support calls.
• Software Licencing – Currently our lowest cost platform to run software licenced by CPU cores/sockets, e.g. Oracle.
• Architecture Fit – Consistent with long term …

Front End Transformation 2007 -> 2009

• MWF – Middleware / integration layer. XML over MQ front end. WESB / Process Server and WebSphere TX.
• TCS – Teller application, in-house Java app built on top of the IBM BTT framework.
• Interact – CRM application, in-house Java application built on top of the IBM BTT framework.
• Internet Banking – In-house J2EE application using the Spring framework, WebSphere Application Server and Oracle.

[Figure: Current IB Software Stack vs. z/Linux IB Software Stack]

Successful Virtualisation History

[Chart: Physical x86 servers vs. VMs per year, 2005 to 2009; vertical axis 0 to 450]

VMware - Production x86 Servers 50%+ Virtualised

Why z/Series and z/Linux?

Where does z/Linux fit?

• Commodity – Red Hat Enterprise Linux & Windows Server 2003/8 on VMware VI4 and HP blades.
• Midrange – Solaris 10 on Sun M and T series servers running within domains, containers and zones. (Tactical)
• High Value – Red Hat Enterprise Linux on the IBM z Series mainframe running IFL engines within multiple LPARs under the z/VM hypervisor.

High Value – Typically includes customer facing tier-1 applications that are backed by high or continuous availability requirements and tight service level agreements.

BNZ z/Linux History

• Next 6 Months - Revisit MWF Tuning. Building on IBRW. PCBB.
• Sept 09 - Internet Banking applications in Production
• June 09 - Tellers and Interact applications in Production
• Dec 08 - z Platform standup complete. (Ready for tier-1 applications)
• Jul 08 - GCS went into production on the z10 for a couple of non tier-1 apps
• Mar–Apr 08 - GCS Team (Middleware) run Proof of Concept on the z Platform, Dev and Test.
• Aug 07 to Feb 08 (dates on the slide: Feb 08, Jan 08, Dec 07, Oct 07, Aug 07):
    • Held joint IBM / BNZ design workshops
    • Purchased Z10-EC + DS Enterprise Storage DS8100
    • Proof of Concept of Process Server as replacement to ROMA middleware
    • IBM Forum conference in Wellington
    • z/Linux Project officially launched
    • WebSphere Process Server software stack verified for performance
    • Purchased Z9-EC + DS Enterprise Storage DS8100
    • Initial due diligence carried out for a move to the z/Linux platform

Current Utilisation

Building the solution. Our real world experiences.

General Learning & Where to Begin

• Start Simple - Don't Bite Off More Than You Can Chew.

• Treat received wisdom with care.

• Prove it Yourself - Take advice, but test and verify your workloads for yourself.

• IFL Comparisons - Found that 1 IFL was equivalent to one Core 2 core for a specific in-house Java workload.

• System Wide Monitoring - Was like someone turned the lights on!

• Efficient Consolidation – Culture of efficiency – People and Process Tuning

Overview of Our Infrastructure

Mainframe
• Production (Hot): Z10-EC, Single Book, 4 x IFL Engines, 160 GB Storage
• DR (Warm): Z10-EC, Single Book, 3 x IFL Engines, 128 GB Storage

SAN Storage
• DS 8300 SAN, 60TB Disk
• DS 8300 SAN, 80TB Disk
• Metro Mirror (PPRC) Data Replication

LPAR Configurations

• We started out trying to protect against LPAR failure by having 2 Prod LPARs, with all services on one Z. This proved to be unnecessary, and limited our ability to isolate workloads and benefit from licensing savings.

• We also found that PPTE impacted Prod and vice versa, regardless of weightings and dedicating engines to PPTE.

• We decided to create LPARs for workloads. We are working towards this now.

• This also reduces our licensing cost substantially.

Platform Considerations

z/VM

• Monitoring – Establish monitoring and make it available to your support and project teams as early as possible.

• Paging Volumes – Use small paging volumes, Mod 9s at most.

• Balance all paging and user volumes across LCUs.

• Z paging (the Z paging guest memory out to disk) has been shown to have a major impact on performance, e.g. doubling a stress test response time and increasing Z CPU utilisation.

• However, we over commit in test and dev, and performance is acceptable.
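As a rough illustration of what "over commit" means here, the sketch below computes the ratio of memory defined to guests against the real memory of the LPAR. The function names and the guest sizes are hypothetical and are not taken from the BNZ configuration; treat it as a back-of-envelope aid under those assumptions, not a z/VM tool.

```python
def overcommit_ratio(guest_memory_gb, lpar_memory_gb):
    """Return total memory defined to guests divided by LPAR real memory."""
    if lpar_memory_gb <= 0:
        raise ValueError("lpar_memory_gb must be positive")
    return sum(guest_memory_gb) / lpar_memory_gb


def is_overcommitted(guest_memory_gb, lpar_memory_gb):
    """True when the guests could demand more memory than the LPAR really has."""
    return overcommit_ratio(guest_memory_gb, lpar_memory_gb) > 1.0


if __name__ == "__main__":
    # Hypothetical test/dev guest sizes in GB (assumption, for illustration only).
    test_guests = [4, 4, 8, 8, 16]
    print(overcommit_ratio(test_guests, 32))
    print(is_overcommitted(test_guests, 32))
```

A ratio above 1.0 means z/VM may have to page guest memory to disk under load, which is the effect the bullets above warn about for production.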

General Considerations

• Balancing load: If you come from a distributed background it can be tempting to think in terms of balancing CPU use across nodes in a RAC, or across nodes in a WAS cluster. This is the wrong way of looking at it: since the nodes are all running on the same underlying hardware, it doesn't really matter how 'balanced' the load is; seeking a balanced load is an artifact of thinking about discrete hardware.

• One great example is our RAC: we actually achieved better performance at lower utilisation by using it with failover, rather than load-balancing connections, which significantly reduced the contention on the RAC interconnect. Since the CPU and disk come from the same pool, we care about the overall performance, not individual nodes.

Monitoring

ESAMON - Real-time performance monitoring and analysis for z/VM. Really turned the lights on our Zs for us! MONITORING IS ESSENTIAL!

• SMART - Linux Top 10
• ESAUCD2 - Linux Memory & Swap
• ESAMAIN - LPAR Utilisation
• ESAUCD4 - Linux CPU
• ESALPARS - LPAR Summary

• www.velocitysoftware.com

Linux Guests

• CPU Sizing - Start with 1 vCPU, then add CPUs as the guest reaches 70%. Assign Relative Shares based on the number of CPUs: 100 = 1 vCPU, 200 = 2 vCPUs, etc.

• Memory Sizing - Size with as little memory as practicable, so that the guest only ever dips into Linux swap under peak conditions. Beware: Linux will by design try to consume all available memory as cache.

• Linux Swap – Use VDISKs, and beware of running out of swap. VDISK memory is allocated when the disk is utilised. Start with a smaller VDISK, then a larger one, e.g. 256MB, then 512MB chunks.

• Linux and Application Patch Levels - Be as up to date as possible. We have seen between 20% and 70% CPU and I/O optimisation through patching Linux (RHEL 5.3 kernel) and Oracle (DB & CRS patch 10.2.0.4), which dramatically reduced idle guest utilisation (~15%). We use Satellite for our patch management.

• Satellite + Kickstart vs Disk Cloning - Pros and cons either way. On the Z10, Satellite installs and upgrades seem comparable to disk copies, especially across multiple guests, e.g. upgrading 20-odd Production guests took about a quarter of an hour to roll from RHEL 5.2 to RHEL 5.3.
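The CPU sizing rule above can be written down as a small helper. This is a minimal sketch: the constants come from the slide (100 share points per vCPU, grow at 70%), while the function names and the choice to grow by exactly one vCPU at a time are assumptions made for illustration.

```python
SHARE_PER_VCPU = 100        # from the slide: 100 = 1 vCPU, 200 = 2 vCPUs
CPU_GROWTH_THRESHOLD = 70   # from the slide: add CPUs as the guest reaches 70%


def relative_share(vcpus):
    """Relative share for a guest, proportional to its vCPU count."""
    if vcpus < 1:
        raise ValueError("a guest needs at least one vCPU")
    return SHARE_PER_VCPU * vcpus


def recommend_vcpus(current_vcpus, peak_utilisation_pct):
    """Suggest a vCPU count; growing one vCPU at a time is an assumption."""
    if peak_utilisation_pct >= CPU_GROWTH_THRESHOLD:
        return current_vcpus + 1
    return current_vcpus


if __name__ == "__main__":
    vcpus = recommend_vcpus(current_vcpus=1, peak_utilisation_pct=75)
    print(vcpus, relative_share(vcpus))
```

Keeping the share tied to the vCPU count means a guest that grows also gets a proportionally larger claim on the IFL pool, which matches the rule of thumb on the slide.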

Oracle

• Oracle Data Volumes - Increasing the number of addressable disks improves overall performance.

• Patching - RAC 10.2.0.4 decreased idle CPU utilisation from ~15% to ~6% of an IFL.

• Clusterware Daemon - An OPROCD fix / setting update is required after this patch to stop the guests evicting each other and rebooting under high load.

• Licensing – Segregating Oracle to its own LPAR to save money.

• Interconnect - The interconnect can burn CPU. For our application logging we use failover mode rather than load balancing, which reduces contention for resources.

• Swapping – Don’t let Oracle swap out. Size the guest to fit the SGA plus the memory needed to satisfy the required number of connections.

• IB runs approx. 370 txn/s
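The swapping advice above amounts to a simple sum. The sketch below is one way to express it; the per-connection and operating system overhead figures are placeholders that would have to be measured for a given workload, and none of the numbers in the example come from the presentation.

```python
def oracle_guest_memory_mb(sga_mb, connections, per_connection_mb, os_overhead_mb=0):
    """Estimate guest memory so the SGA and all connections fit without swapping.

    per_connection_mb and os_overhead_mb are assumptions to be measured,
    not values given in the source.
    """
    if min(sga_mb, connections, per_connection_mb, os_overhead_mb) < 0:
        raise ValueError("inputs must be non-negative")
    return sga_mb + connections * per_connection_mb + os_overhead_mb


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    print(oracle_guest_memory_mb(sga_mb=4096, connections=200,
                                 per_connection_mb=5, os_overhead_mb=512))
```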

JVM Performance Tuning

• JVM Sizing - The guest should be sized to accommodate the JVM in guest real memory, and tuned via GC.

• Turn on Verbose GC in your JVM. Outputs to /Websphere/AppServer/profiles/<PROFILE>/logs/<APP>/native_stderr.log.

• Java Garbage Collection - Pay close attention to GC activity. A good zone for GC (for IBRW) is a GC every 10-20 secs for <50-200ms, using Gencon.

• Try the Gencon JVM GC policy: -Xgcpolicy:gencon.

• Gencon splits the heap into sections: nursery and survivor for short-lived objects, and tenured for long-lived objects. GC activity is reduced because you are generally just GCing in the nursery.

• By enabling and tuning with Gencon we reduced CPU for our Tellers platform by a third, and tripled user capacity for Internet Banking.

• Analyse all parts of your app stack. For example, we had a canned response
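The "good zone" for GC described above can be checked mechanically once GC events have been extracted from the verbose GC log. The sketch below assumes the events are already available as (timestamp in seconds, pause in milliseconds) pairs; it does not parse native_stderr.log, and it reads "<50-200ms" as an upper bound of 200 ms on pause time, which is an interpretation rather than something the slide states precisely.

```python
from statistics import mean


def summarise_gc(events):
    """Summarise GC events given as (timestamp_seconds, pause_ms), sorted by time."""
    if len(events) < 2:
        raise ValueError("need at least two GC events to compute an interval")
    times = [t for t, _ in events]
    pauses = [p for _, p in events]
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    return {"mean_interval_s": mean(intervals), "max_pause_ms": max(pauses)}


def in_good_zone(summary, min_interval_s=10.0, max_interval_s=20.0, max_pause_ms=200.0):
    """True when GCs occur every 10-20 s on average and no pause exceeds 200 ms."""
    return (min_interval_s <= summary["mean_interval_s"] <= max_interval_s
            and summary["max_pause_ms"] <= max_pause_ms)


if __name__ == "__main__":
    # Hypothetical events for illustration: GCs at 0 s, 12 s and 27 s.
    sample = [(0.0, 80.0), (12.0, 120.0), (27.0, 150.0)]
    summary = summarise_gc(sample)
    print(summary, in_good_zone(summary))
```

The thresholds are keyword arguments because the slide gives the zone for IBRW only; another application may need different bounds.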

WebSphere / J2EE Performance

• JInsight can provide detailed thread and memory profiling, with resource, time to execute and CPU utilisation.

• PPTE Dedicated Engines - For consistency of results, we recommend dedicating engines to PPTE LPARs for stress testing.

• Developers' Performance Tools - Developers should be given early access to performance testing tools, which assists with early pre-stress-test shakedowns. Tests can be performed against a developer's workstation or the development environment to establish early performance bottlenecks and gains.

Taking Advantage of the Architecture

Infrastructure DR

Storage Replication Setup for DR

Detailed Storage Replication View

Network Setup for DR - Colony Model

DR Invocation Process

1. Shut down Prod (or it is off the air due to an event).

2. Update Network switches so VLANS now active at secondary site.

3. Storage ownership is switched to DR site.

4. Bring up DR Z

5. Guests IPL keeping their existing IP addresses and are back online.
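The five steps above form a strictly ordered runbook, which can be sketched as a list of steps executed in sequence and halted at the first failure. The step descriptions paraphrase the slide; the `run_dr` function and the idea of representing each step as a callable returning True or False are assumptions for illustration and do not describe BNZ's actual tooling.

```python
DR_STEPS = [
    "Shut down Prod (or confirm it is off the air)",
    "Update network switches so VLANs are active at the secondary site",
    "Switch storage ownership to the DR site",
    "Bring up the DR Z",
    "IPL guests with their existing IP addresses",
]


def run_dr(actions):
    """Run (description, action) pairs in order; each action returns True on success.

    Stops at the first failing step so later steps never run out of order.
    """
    completed = []
    for description, action in actions:
        if not action():
            raise RuntimeError(
                f"DR step failed: {description}; completed so far: {completed}")
        completed.append(description)
    return completed


if __name__ == "__main__":
    # Placeholder actions that always succeed (hypothetical, for illustration).
    for step in run_dr([(step, lambda: True) for step in DR_STEPS]):
        print("done:", step)
```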

Application Availability

Internet Banking - Rolling Upgrades and High Availability

Normal Operations

• Under normal operations, the Primary Cluster (P0) takes the load.

• Local recovery nodes enable a method of upgrading the application (or infrastructure) without disruption to the existing production workload. This also provides a secondary cluster to facilitate local recovery, should the primary cluster fail.

• Updates can then be tested prior to customers being allowed access to the upgraded system.

Internet Banking - Rolling Upgrades and High Availability

Migrating Workload to P1

• Once the update to P1 is completed and tested, customers can be failed over (drained) onto the secondary cluster.

• New logins are routed to the P1 cluster; existing P0 sessions are allowed to complete.

Internet Banking - Rolling Upgrades and High Availability

Active in P1

• Once all customer sessions are running on P1, the P0 cluster can be upgraded and tested.

• Customers can then be redirected to P0, which remains the primary cluster for no other reason than to keep things simple.

Internet Banking - Rolling Upgrades and High Availability

Normal Operations

• Roll back to Normal Operations.
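The draining behaviour described across these slides (new logins go to the newly active cluster while existing sessions finish where they started) can be modelled in a few lines. This is a toy sketch: the `ClusterRouter` class and its method names are hypothetical, and the real routing would sit in the load balancing layer in front of the application servers rather than in application code.

```python
class ClusterRouter:
    """Toy model of session draining between the P0 and P1 clusters."""

    def __init__(self, active="P0"):
        self.active = active      # cluster that receives new logins
        self.sessions = {}        # session id -> cluster serving that session

    def login(self, session_id):
        """Pin a new session to whichever cluster is currently active."""
        self.sessions[session_id] = self.active
        return self.active

    def route(self, session_id):
        """Existing sessions stay on the cluster they logged in to."""
        return self.sessions[session_id]

    def logout(self, session_id):
        self.sessions.pop(session_id, None)

    def switch_active(self, cluster):
        """Send new logins to another cluster without moving live sessions."""
        self.active = cluster

    def is_drained(self, cluster):
        """A cluster is safe to upgrade once no session is pinned to it."""
        return cluster not in self.sessions.values()


if __name__ == "__main__":
    router = ClusterRouter()
    router.login("alice")                     # lands on P0
    router.switch_active("P1")                # P1 has been upgraded and tested
    router.login("bob")                       # lands on P1
    print(router.route("alice"), router.route("bob"), router.is_drained("P0"))
    router.logout("alice")
    print(router.is_drained("P0"))            # P0 can now be upgraded
```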

Roadmap

Future

• Continue to learn. Peers, Events, Testing.
• Splitting LPARs for effective license costs.
• Satellite upgrade to support PXE on Z.
• RHEL 5.4 Serial Consoles (currently 3270 consoles).
• IBM (VE) Virtual Enterprise for WAS: fewer nodes, better utilisation.
• Oracle – Split RAC into internal and external RAC.
• z/VM 6.x

Summary and other Learning.

Summary Findings

• Start with what you know.

• Prove it for yourself.

• Simplified DR

• Improved Availability

• Reduced License Costs

• Consolidating Servers and Services

Questions?