Business Service Management for Performance
z/Enterprise™: Best Practices
Laura Knapp WW Business Consultant [email protected]
© Applied Expert Systems, Inc. 2011
650-617-2400
Agenda
• z/Enterprise Background
• Networking Details
• Management Issues
• Best Practices
The Problem - Gartner
“Through 2007, organizations with more than 200 servers will waste between $500,000 and $720,000 annually supporting underutilized application/server combinations” Gartner Research, December 2004
• Average processor utilization is 6% to 7%
• Prime time processor utilization is only 15%
• 40% of equipment is over 3 years old
• Firms have 20% to 50% more capacity than actually needed
"Over the last three years, we have seen more change in the datacenter market than the previous fifteen years. Consolidation, virtualization, power and cooling, and the aging datacenter market in mature economies is leading to a proliferation in datacenter options, both inside the walls of the datacenter as well as the construction of the datacenter itself," said Michelle Bailey, research vice-president for IDC's Datacenter Trends program
The Problem - IDC
The Data Center
Operational Issues
• Unlimited demand
• More processing power
• Energy costs 8x
• Management costs 4x
• 70% of IT budget is operational overhead
Business Challenges
• Limited budgets
• Cost containment
• Globalization
• Acquisitions
• Green
Right-Sizing IT Infrastructure
Green-mandated efforts to consolidate entire farms of:
- Servers
- Storage
- Networks
- Etc.
…and dynamically optimize to only consume the resources you need!
…and dynamically optimize to move applications for high availability and performance!
IBM’s hardware answer is the z/Enterprise
IBM z/Enterprise System
Unified management for a smarter system
• Unifies management of resources, extending IBM System z® qualities of service end-to-end across workloads
• Provides platform, hardware and workload management
The world’s fastest and most scalable system: IBM z/Enterprise 196 (z196)
• Ideal for large scale data and transaction serving and mission critical applications
• Most efficient platform for large-scale Linux® consolidation
• Leveraging a large portfolio of z/OS® and Linux on System z applications
• Capable of massive scale up, over 50 Billion Instructions per Second (BIPS)
Scale out to a trillion instructions per second: IBM z/Enterprise BladeCenter® Extension (zBX)
• Selected IBM POWER7® blades and IBM System x® blades for tens of thousands of AIX® and Linux applications
• High performance optimizers and appliances to accelerate time to insight and reduce cost
• Dedicated high performance private network
Think cloud and virtualization in a box!!!
zBX ... Infrastructure to Support More Resources
zBX houses the multiplatform solutions
Between zBX and z196
• Secure connections
  - Potential to eliminate encryption
  - Dramatically reduce firewalls
• Fast 10 Gb Ethernet
  - Latency reduction
  - Fewer hops to get to data
• Private isolated network
  - Traffic on user external networks not impacted
Sharing of resources
• Up to 8 z196 servers can attach to a zBX
Platform and Hardware Management
• Unified Resource Manager
z/Enterprise Ensemble
Each z196 with its optional zBX makes up a NODE.
An ENSEMBLE is a group of one to eight z/Enterprise nodes managed as one single logical virtualized system.
When multiple workloads are running across the ensemble, each can have its own business objectives, and Unified Resource Manager can share the resources to meet all the business objectives.
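The node and ensemble definitions above can be pictured as a small data model. This is only an illustrative sketch: the class and field names are assumptions made for this example and are not part of any Unified Resource Manager interface. The one-to-eight node limit and the optional zBX come from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_NODES = 8  # from the slide: an ensemble groups one to eight nodes


@dataclass
class Node:
    """One z196 plus its optional zBX (hypothetical field names)."""
    z196: str
    zbx: Optional[str] = None


@dataclass
class Ensemble:
    """A group of nodes managed as one logical system (hypothetical model)."""
    name: str
    nodes: List[Node] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        # Refuse to grow past the eight-node limit.
        if len(self.nodes) >= MAX_NODES:
            raise ValueError(f"an ensemble holds at most {MAX_NODES} nodes")
        self.nodes.append(node)

    def is_valid(self) -> bool:
        # An ensemble needs at least one node and at most eight.
        return 1 <= len(self.nodes) <= MAX_NODES
```

A node without a zBX is simply `Node("z196-A")`; the model treats it the same as a node with one attached.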
Unified Resource Manager
• Hypervisor management and creation of virtual networks
• Operational controls, service and support for hardware / firmware
• Network management of private and secure data and support networks
• Energy monitoring and management
• Workload awareness and platform performance management
z/Enterprise Node with Internal Networks
[Figure: IBM z/Enterprise System Overview. A z/Enterprise node consists of a z196 and a zBX, managed from the System z Hardware Management Console with Unified Resource Manager. The z196 shows z/OS images, Linux on System z guests, z/VM, PR/SM, System z CPU, memory and I/O, and the Support Element (SE). The zBX shows x86 blades (Linux on System x, xHypervisor), Power blades (AIX, pHypervisor), optimizers (DataPower, Smart Analytics Optimizer, future) and the BladeCenter Advanced Management Module (AMM).]
Intra Node Management Network (INMN)
• Private network
• Two Top of Rack (TOR) 1GbE switches
• Similar to older network models that separated management and control networks from data flows
Intra Ensemble Data Network (IEDN)
• Dedicated layer 2 10 GbE network
• No layer 3 routing
• Can span 10 km
• IP address (IPv4 or IPv6) customer controlled
• MAC address (prefixes) provisioned by HMC
• Virtual server isolation via VLANs
z/OS Connects to External Router
[Figure: network diagram of a z196 (z/VM with zLin guests and a Vswitch, z/OS A acting as IP router, z/OS B, SE) and a zBX (AIX 25, Linux 5, Linux 7), joined through OSD, OSM and OSX adapters over 1GbE and 10GbE TOR switches.]
External IP Router
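[Figure: network diagram of a z196 (z/VM with zLin guests and a Vswitch, z/OS A, z/OS B, SE) and a zBX (AIX 25, Linux 5, Linux 7), joined through OSD, OSM and OSX adapters over 1GbE and 10GbE TOR switches, with routing handled by an external IP router.]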
Virtual Networks Core to z/Enterprise
• Create the virtual network in HMC (zManager)
• Assign a VLAN ID
• Add/associate servers to the virtual network
Virtual Network = VLAN ID and associated servers
• Name = Customer, VLAN ID = 200, Servers = z/OS A, Lin 25
• Name = Employee, VLAN ID = 300, Servers = z/OS B, Lin 5, zLin 5, AIX 25, Lin 7
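The "virtual network = VLAN ID and associated servers" definition can be sketched as a lookup table. The network names, VLAN IDs and server names below are the ones on the slide. The helper functions are hypothetical, and the rule that two servers can talk only when they share a VLAN is a simplifying assumption based on "virtual server isolation via VLANs".

```python
from typing import Dict, Set

# Virtual network = VLAN ID plus associated servers (values from the slide).
virtual_networks: Dict[str, dict] = {
    "Customer": {"vlan_id": 200, "servers": {"z/OS A", "Lin 25"}},
    "Employee": {
        "vlan_id": 300,
        "servers": {"z/OS B", "Lin 5", "zLin 5", "AIX 25", "Lin 7"},
    },
}


def vlans_for(server: str) -> Set[int]:
    """Return the VLAN IDs of every virtual network the server belongs to."""
    return {
        vn["vlan_id"]
        for vn in virtual_networks.values()
        if server in vn["servers"]
    }


def can_communicate(a: str, b: str) -> bool:
    """Assumption: two servers reach each other only if they share a VLAN."""
    return bool(vlans_for(a) & vlans_for(b))
```

With the slide's values, z/OS A and Lin 25 share VLAN 200, while z/OS A and z/OS B have no VLAN in common and stay isolated from each other.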
Virtualization and Network Access
[Figure: network diagram of a z196 (z/VM with zLin guests and a Vswitch, z/OS A, z/OS B, SE) and a zBX (AIX 25, Linux 5, Linux 7), joined through OSD, OSM and OSX adapters over 1GbE and 10GbE TOR switches.]
z/OS Communications Server and OSM connectivity
• z/OS LPAR must participate in the ensemble
  - VTAM start option: ENSEMBLE=YES
• LPAR must be IPv6 enabled (uses link-local address only)
• Dynamic generation of CHPIDs and TRLEs
• VLAN used in access mode (switch handles all VLAN tagging)
• No information reported to OMPROUTE (cannot add static or dynamic routes)
• Does support stop, start, packet trace, NTA
• Only application that uses this is GPMP
z/OS Communications Server and OSX Connectivity
• Configure with INTERFACE statement
• z/OS LPAR must participate in the ensemble
  - VTAM start option: ENSEMBLE=YES
• TRLE dynamically configured, as done for HiperSockets, or configure one and point to it (useful for VM guest LAN)
• Always use VLAN in trunk mode
• Supports stop, start, packet trace, NTA
Managing z/Enterprise
• Fundamentals of management apply: FCAPS
  - Fault
  - Configuration
  - Availability
  - Performance
  - Security
• Leading to
  - Service Level Achievement
  - Optimum Resource Utilization
  - Highly available systems
  - High performing systems
Monitor → Analyze → Diagnose → Remediate → Report
Approaches to Solving Problems
Top down or bottom up doesn’t matter. Consistency does.
• Applications
• Middleware
• Guest OS
• Hypervisor
• Public and private network
z/Enterprise Challenges Management Ecosystem
• z/Enterprise leads to a data center that is
  - Consolidated: many in one place
  - Optimized: movement to attain SLAs
  - Dynamic: requires less operator intervention
• How does this strain your management ecosystem?
  - Location
  - Inventory
  - Availability
  - Performance
  - Event
  - Associate to Business Service
  - Report
Management Lifecycle - Monitoring
[Figure: management lifecycle cycle with the stages Implement Changes, Gather Configuration and Traffic Information, Observe Statistics, Collect Capacity Data, Analyze Traffic, Solve Problems, Plan Changes and Evaluate, supported by Performance Baselining, What-if Analysis and 3rd Party Services.]
Top Service Management Mistakes when moving to a z/Enterprise System
• Neglecting to collect and compare “Before and After Snapshots” of the conversions for:
  - Physical and virtual infrastructure performance behavior
  - Performance and behavior of single-point applications vs. virtualized or “cloned” applications
  - Network flow and control of specific application conversations
• Neglecting to define established baselines for:
  - CPU performance
  - Storage level usage
  - Network performance and packet/routing error rates
  - “IP Service” application error rates or outage levels
  - Key TCP-based application uptime availability or conversation Round-Trip-Time performance
• Neglecting to periodically and consistently re-examine everything to ensure that all infrastructure components are fully optimized
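A before-and-after snapshot comparison can be as simple as the sketch below. It is a minimal illustration, not a description of any particular monitoring product. The metric names, the 20% tolerance and the assumption that every metric is "lower is better" are all choices made for this example.

```python
from typing import Dict, List


def compare_to_baseline(
    before: Dict[str, float],
    after: Dict[str, float],
    tolerance: float = 0.20,
) -> List[str]:
    """Return the metrics whose 'after' value worsened by more than `tolerance`.

    Assumes every metric is 'lower is better' (utilization, error rate, RTT).
    Metrics missing from the 'after' snapshot are skipped.
    """
    regressions = []
    for name, base in before.items():
        if name not in after:
            continue  # no post-move sample was collected for this metric
        if base == 0:
            # Any non-zero value is a regression from a zero baseline.
            if after[name] > 0:
                regressions.append(name)
            continue
        if (after[name] - base) / base > tolerance:
            regressions.append(name)
    return regressions
```

For example, a baseline of `{"cpu_pct": 40, "rtt_ms": 12}` against a post-move snapshot of `{"cpu_pct": 42, "rtt_ms": 30}` flags only `rtt_ms`, because CPU moved by 5% while round-trip time more than doubled.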
Scenario 1 - Application Behavior after Virtualization
Situation
A major virtualization project involved 5 applications. All 5 applications moved successfully, but one had major performance issues, taking 11 hours to run versus 2 hours in the non-virtualized environment. This application was based on MySQL.
Troubleshooting
• What was different about this application?
• Is this application getting all the resources it needs?
• Is this application using a network resource (a DNS server, for example) different from the other applications?
• Does anyone have a clue of the flows and controls before virtualization?
• How do we isolate the problem to a specific area?
Scenario 1 - Application Behavior after move to z/Enterprise
Solution
Running a trace of critical applications will provide details on the interaction with infrastructure components and between the communicating elements, and will give visibility into changed environments. Multiple outbound paths may exist with z196 and zBX boxes. Having a single tracing product that can span z/OS, Linux, z/VM and AIX will be critical for the operations staff!
• What was different about this application? Look at the sequenced packet flows and response times before and after the move.
• Is this application using a network resource different from the other applications? Look at the status of your IP service devices.
• Does anyone have a clue of the flows and controls before virtualization? Collect a trace of the application at startup, in steady state, and on closing, before and after the move. Compare these side by side to quickly see differences.
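The side-by-side comparison can be sketched with Python's standard `difflib` module, assuming each trace has already been reduced to an ordered list of one-line flow summaries. The summary format and the host names in the usage note are hypothetical. A real trace tool would supply its own export format.

```python
import difflib
from typing import List


def flow_diff(before: List[str], after: List[str]) -> List[str]:
    """Return only the flow lines that differ between two ordered traces.

    Lines prefixed with '- ' appear only in the 'before' trace and lines
    prefixed with '+ ' only in the 'after' trace. Unchanged lines and
    difflib's '? ' hint lines are dropped.
    """
    return [
        line
        for line in difflib.ndiff(before, after)
        if line.startswith(("+ ", "- "))
    ]
```

If the startup trace before the move contains `client -> dns-1 query app-db` and the trace after the move contains `client -> dns-2 query app-db`, the diff reports just those two lines, which points straight at a changed DNS server.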
Scenario 2 - Performance Complaints
Situation
After moving an application to a z/Enterprise node, the application development team complained that their application response times were significantly higher and that the z/Enterprise node was the problem.
Troubleshooting
• What is the throughput on the connection adapter? Can it handle the load? Is there any time sensitivity?
• What is the round-trip TCP session time? How does this compare to your baseline before the move?
• Are your z/Enterprise connection LPAR, Ethernet, and OSA elements optimized?
• Are there any differences in the flows to support systems like DNS, DHCP, or routers?
• Did you change subnets or IP network address space when you moved the system?
Scenario 2 - Performance Complaints
Solution
Baseline information on how the application worked and utilized resources before and after the z/Enterprise move is critical. It needs to be compared against real-time, near-real-time and historical information on the z/Enterprise node.
• What is the throughput on the connection adapter? Can it handle the load? Look at the adapter details. Do you have a baseline for comparison? Remember that some baselines are ‘seasonal’, and you need to take this into consideration.
• What is the round-trip TCP session time? How does this compare to your baseline before the move to the z/Enterprise? There is a lot of emotion behind user complaints over performance. Do you have the background facts to remove emotion from the problem?
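One way to respect a ‘seasonal’ baseline is to compare like with like, for example hour of day against the same hour of day. The sketch below assumes round-trip-time samples tagged with the hour they were taken. The hour-of-day bucketing, the 1.5x threshold and the function names are illustrative assumptions, not features of a specific tool.

```python
from statistics import mean
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # (hour of day 0-23, round-trip time in ms)


def hourly_baseline(samples: List[Sample]) -> Dict[int, float]:
    """Average the round-trip time separately for each hour of the day."""
    by_hour: Dict[int, List[float]] = {}
    for hour, rtt in samples:
        by_hour.setdefault(hour, []).append(rtt)
    return {hour: mean(values) for hour, values in by_hour.items()}


def slow_hours(
    baseline: Dict[int, float],
    current: List[Sample],
    factor: float = 1.5,
) -> List[int]:
    """Return the hours whose current mean RTT exceeds factor x baseline.

    Hours that have no baseline entry are ignored rather than guessed at.
    """
    now = hourly_baseline(current)
    return sorted(
        hour
        for hour, rtt in now.items()
        if hour in baseline and rtt > factor * baseline[hour]
    )
```

A 45 ms round trip at 14:00 is unremarkable if the pre-move baseline for 14:00 was 40 ms, while 30 ms at 09:00 is a real change if that hour used to average 12 ms.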
Scenario 3 - Erratic Application Behavior
Situation
After moving an application to a z/Enterprise node, the users accessing the application had very uneven response times. What was the cause?
Troubleshooting
• How did the application access DNS servers?
• What was the status of the DNS servers?
• Did the z/Enterprise environment begin to overwhelm the system?
• Were the uneven response times occurring at specific times or in a repeatable pattern?
• Were any new applications/images moved or brought up on the z/Enterprise server about the time the erratic behavior occurred?
• Is this behavior reported by end users, the application team, or tools?
  - If end users, is it one segment or are they located at various locations?
  - If the application team, what tool are they using? What is it showing?
  - If a tool reported the behavior, how is it gathering, analyzing and reporting that information?
Scenario 3 - Erratic Application Behavior
Solution
Tracing tools give a clear understanding of how the application environment used DNS servers, along with details on exactly which servers were being utilized. Comparing before and after traces would show any differences in these flows. Monitoring tools with alerting on the availability and performance of your DNS (and other infrastructure elements) would alert operations to overloaded or non-responsive DNS servers.
• How did the application access DNS servers? Running a trace before and after the move would allow you to understand if there were any differences in the flows.
• What was the status of the DNS servers? Monitoring your critical components provides quick status checks by network operations.
• Did the z/Enterprise node begin to overwhelm an individual server? Monitoring the critical components through all layers provides a snapshot of overall status.
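The DNS questions above reduce to two checks on trace data: which servers were queried, and how each one responded. The sketch below assumes the trace has been reduced to (server, response time) pairs, with `None` marking a query that was never answered. The record layout and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple

# (DNS server address, response time in ms, or None if the query went unanswered)
DnsRecord = Tuple[str, Optional[float]]


def dns_summary(records: Iterable[DnsRecord]) -> Dict[str, dict]:
    """Summarize query count, unanswered count and mean response per server."""
    stats = defaultdict(lambda: {"queries": 0, "unanswered": 0, "total_ms": 0.0})
    for server, ms in records:
        entry = stats[server]
        entry["queries"] += 1
        if ms is None:
            entry["unanswered"] += 1
        else:
            entry["total_ms"] += ms
    summary = {}
    for server, entry in stats.items():
        answered = entry["queries"] - entry["unanswered"]
        summary[server] = {
            "queries": entry["queries"],
            "unanswered": entry["unanswered"],
            # No mean is reported when the server never answered.
            "avg_ms": entry["total_ms"] / answered if answered else None,
        }
    return summary


def new_servers(before: Dict[str, dict], after: Dict[str, dict]) -> Set[str]:
    """Return the servers that appear only in the 'after' summary."""
    return set(after) - set(before)
```

Running `dns_summary` on the before and after traces and then `new_servers` on the two results shows whether the move silently pointed the application at a different, slower or unresponsive DNS server.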
Diagnosis in a z/Enterprise Node
• Do not assume the private INMN and IEDN networks will be infallible
• Each supports tracing, which will be a CRITICAL tool to roll out and sustain the ensemble
• There are many items to be learned, and if history repeats itself, having a consistent, easy-to-use tracing tool will be operations’ ‘hammer’
• DOCUMENT DOCUMENT DOCUMENT
Evaluate Many Facets of Virtualization on a z/Enterprise
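Consolidation of applications on existing servers
• Virtualization infrastructure: NO
• Private host OS and VM control: NO
• Reduction in space, power use and cooling: POOR
• Reduction in network and SAN infrastructure: NO
• Consolidated hardware management and KVM: NO
Virtualization of applications on existing servers
• Virtualization infrastructure: YES
• Private host OS and VM control: YES
• Reduction in space, power use and cooling: FAIR
• Reduction in network and SAN infrastructure: FAIR
• Consolidated hardware management and KVM: NO
Virtualization of applications on z/Enterprise
• Virtualization infrastructure: YES
• Private host OS and VM control: YES
• Reduction in space, power use and cooling: EXCELLENT
• Reduction in network and SAN infrastructure: EXCELLENT
• Consolidated hardware management and KVM: YES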
Optimal solution may be achieved by combining the virtual solutions
Have a Report Card
Server Report Card (PASS/FAIL)
• Create a base Server template and create new VMs
• Baseline existing server, application, and infrastructure (DNS, routers, etc.) elements for key KPIs
• Deploy Infrastructure on VMs
• Perform virtual machine “internal move”
• Perform virtual machine “external move”
• Repeatedly perform a physical to virtual operation
• Simulated power supply failure testing
• Failover testing of the management modules
• Failover testing of the network switches
Virtual Infrastructure Report Card (PASS/FAIL)
• Server Consolidation
• Increased host resource utilization
• Zero Downtime Upgrades
• Baseline core network elements before and after
• Datacenter relocation with no moving trucks
• Increased server to administrator ratio
• Reduced server deployment times
• Infrastructure cost savings
• Labor cost savings
• Centralized management of virtual infrastructure
Did you consider
• Architect your z/Enterprise infrastructure carefully
  - Review with network, application, middleware, development and facility teams
  - How many trace tools will you need?
• Construct a roadmap, 3 years out if possible
• Is there a Technical Account Manager?
  - Single point of contact internally
  - Single point of contact for external vendors
• Assess workloads for optimum placement
  - Users and workloads differ by environment
  - Not every application is ready to be virtualized
• Spread the virtualization story internally (to IT!)
  - Virtualization for most is still a new and unknown technology
  - Be sure to include the z/Enterprise story when delivering
The Future
Enhanced Disaster Recovery Solutions
• Replication of individual virtual machines or VM file systems between sites
• Manual / Auto restart of replicated virtual machines at time of test / disaster
• Centralized management of the distributed virtual infrastructure
Enhanced VDI Solutions
• Leverage distributed replicated infrastructure for VDI at time of test or disaster
• Support for multi video head virtual desktops
Enhanced Application GRID
• Geographically dispersed application GRID
QUESTIONS?
[email protected] www.aesclever.com
650-617-2400
Trademarks
The following terms are trademarks or registered trademarks of International Business Machines Corporation in the United States or other countries or both:
• Advanced Peer-to-Peer Networking® • AIX® • alphaWorks® • AnyNet® • AS/400® • BladeCenter® • Candle® • CICS® • DataPower® • DB2 Connect • DB2® • DRDA® • e-business on demand® • e-business (logo) • e business(logo)® • ESCON® • FICON®
• GDDM® • GDPS®
• Geographically Dispersed Parallel Sysplex • HiperSockets • HPR Channel Connectivity • HyperSwap • i5/OS (logo) • i5/OS® • IBM eServer • IBM (logo)® • IBM® • IBM z/Enterprise™ System • IMS • InfiniBand® • IP PrintWay • IPDS • iSeries • LANDP®
• Language Environment® • MQSeries® • MVS • NetView® • OMEGAMON® • Open Power • OpenPower • Operating System/2® • Operating System/400® • OS/2® • OS/390® • OS/400® • Parallel Sysplex® • POWER® • POWER7® • PowerVM • PR/SM • pSeries® • RACF®
• Rational Suite® • Rational® • Redbooks • Redbooks (logo) • Sysplex Timer® • System i5 • System p5 • System x® • System z® • System z9® • System z10 • Tivoli (logo)® • Tivoli® • VTAM® • WebSphere® • xSeries® • z9® • z10 BC • z10 EC
• z/Enterprise • zSeries® • z/Architecture • z/OS® • z/VM® • z/VSE
* All other products may be trademarks or registered trademarks of their respective companies.
• Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. • Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license there from. • Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. • Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. • InfiniBand is a trademark and service mark of the InfiniBand Trade Association.
• Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. • UNIX is a registered trademark of The Open Group in the United States and other countries. • Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. • ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office.
• IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency, which is now part of the Office of Government Commerce.