The Next I/T Transformation – Things You Need to Know
Over the years, the approach to information technology infrastructure has gone through many transformations. It evolved from monolithic centralized systems to distributed computing, to client/server, and then back to server-side computing with lightweight clients. Each transformation provided a better understanding of core needs and resulted in substantial innovation. During each transition, the resulting discontinuity provided a significant advantage to those who were able to foresee the change and leverage it early.
Today, we are embarking on another major transformation founded on the lessons learned over the past few years. Three core issues are driving this change from an I/T perspective:
• Underutilization of I/T assets
• Complexity
• Security
The issues are correlated with overriding business needs to:
• Continue to reduce cost and do more with less
• Compensate for resource limitations and the shortage of skills within the I/T
community overall
• Keep pace with business demand for the applications and services that support the innovation necessary to compete
• Protect assets and ensure the privacy of information
Business can no longer succeed merely by reducing cost and boosting efficiency. The
new century has brought with it the need to innovate: to allow employees and partners to make the most of business information so that new products and services can be delivered quickly. While continuing to be diligent in managing cost, the business must
support a growing collection of projects. It needs to be able to realign human capital and
resources with core skills to facilitate balancing these scarce resources across different
projects that span the department, the business unit and even the enterprise. It’s not about
doing more with fewer people, but more about being able to do more with the people we
have. It’s about being able to quickly reallocate skills and physical assets to address the
next critical business need and to create the flexibility necessary to rapidly accommodate
changing missions.
Underutilization of I/T Assets
Several years ago a radical switch in the cost of I/T systems occurred. Hardware costs
declined rapidly leading to the notion of commodity systems. Part of this trend was the
result of competition facilitated by “open systems”; it was also driven by the automation of manufacturing processes, which dramatically reduced the cost of producing hardware. Since many decisions are based on the cost of acquisition with little consideration of long-term effects, this trend precipitated two fundamental changes. First,
departments could afford to buy their own systems which allowed them to implement
solutions that addressed requirements that were not being satisfied by central I/T in a
timely fashion. Second, it allowed the central I/T organization to adopt the notion that
clusters of low-cost commodity systems would help them meet the cost reduction goals
mandated by the business. This approach solved an immediate need.
Over time, however, as the number of commodity systems expands from tens to hundreds
and on to thousands of servers within many enterprises, high hidden costs begin to
surface. Administration costs associated with managing the large number of discrete server instances escalate. Software that is duplicated across many systems results in
larger expenditures for acquisition, upgrades and support. These factors precipitated a
radical change in the I/T cost model. While 60% of I/T spending was for hardware in
1995, today it is only 20% (figure 1). Contrast this with the increasing cost for people
and software which today account for 70% of spending. In addition, the often overlooked
cost of energy and floor space has suddenly become a significant consideration. Many
enterprises must consider allocating funds for costly structural additions in order to be
able to handle the growth in server farms. Some enterprises are finding that additional
electricity is just not available in their area. The challenges are daunting and expensive to
redress. Unfortunately, many of the operational costs remain hidden behind old cost
allocation models as commodity system support costs are often misallocated to the
central I/T system (mainframe). While initially insignificant, as the number of discrete
systems grows, the costs become substantial. Misallocation of these costs often leads to
suboptimal decisions having a real, yet perhaps unrecognized, impact on the business.
For most large enterprises today, the cost of their distributed infrastructure far surpasses
the cost of their traditional mainframe systems. Yet, the mainframe systems continue to
perform most of the core business operations.
As management surveys this situation, it becomes apparent that most of the distributed
systems are running between 10% and 20% average utilization. This is not surprising and
there are several core reasons for this. First, separate systems are justified by business
unit or on a per-project basis, and the need to isolate workloads often dictates dedicated
systems. So from an administrative and operational perspective, separate and discrete
systems are a natural way to go. Second, the new workload software model is structured
around the commodity hardware. New workloads often require several discrete moving
parts: firewalls to support a DMZ, load balancers, intrusion protection appliances, reverse
proxy authentication and authorization servers, security servers, web servers, application
servers, data base servers and federation engines to name just a few. Each of these
discrete systems needs to be sized in order to meet peak demands. Since peak demand is
often hard to predict, capacity beyond the anticipated peak is typically deployed.
Furthermore, while one component of the infrastructure could benefit from additional
capacity at a given point in time, other components may be blessed with an
overabundance of capacity that cannot be readily shared. The result is an over-engineered
solution consuming vastly more energy and floor space than is necessary. But, it is
argued, this remains acceptable because the hardware costs are deemed to be so low. The
multiplier effect, however, along with rising administrative costs, is beginning to dispel this myth. When the peak demands of each system are tabulated, it
becomes apparent that the sum of the peaks is always substantially higher than the peak of the sum (figure 2). This exposes a fundamental issue with clusters of commodity systems: it is difficult to share resources across different applications, or across the functional components supporting each application, and the result is underutilization of assets.
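To make the arithmetic concrete, consider the following sketch, which compares the two sizing approaches using invented load figures for a three-tier application. The traces and numbers are illustrative assumptions only, not data from figure 2.

    // Illustrative load traces (relative load sampled at four intervals) for
    // three tiers of an application. Figures are invented for this example.
    public class PeakDemand {
        public static void main(String[] args) {
            double[] web = {0.10, 0.30, 0.15, 0.05};
            double[] app = {0.20, 0.10, 0.35, 0.10};
            double[] db  = {0.05, 0.15, 0.10, 0.30};

            // Sizing each tier separately requires the sum of the peaks.
            double sumOfPeaks = peak(web) + peak(app) + peak(db);

            // A shared system need only be sized for the peak of the sum.
            double[] combined = new double[web.length];
            for (int t = 0; t < combined.length; t++) {
                combined[t] = web[t] + app[t] + db[t];
            }
            double peakOfSum = peak(combined);

            System.out.printf("Sum of peaks: %.2f%n", sumOfPeaks); // 0.95
            System.out.printf("Peak of sum:  %.2f%n", peakOfSum);  // 0.60
        }

        static double peak(double[] load) {
            double max = 0.0;
            for (double v : load) max = Math.max(max, v);
            return max;
        }
    }

Dedicated tiers must be provisioned for 0.95 units of capacity while a shared system needs only 0.60, and the gap widens as more components with uncorrelated peaks are added.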
Complexity
Managing arrays or clusters of independent systems can be challenging. Even a simple task that takes only ten or twenty minutes consumes enormous amounts of time when repeated hundreds or thousands of times; a twenty-minute change applied to 1,500 servers, for example, amounts to 500 hours of administrator effort. This leads to growth in the number of support personnel. Even with automation tools, issues related to unique configurations require time to sort out. The result is often a multitude of unique software releases and versions
throughout the enterprise. Indeed, it is not uncommon for large enterprises to discover
that they don’t have a valid inventory of everything that is installed and supporting the
business, nor do they know with confidence the service levels achieved or the exact
composition of support infrastructure products that are deployed.
Ongoing operations, including monitoring to respond to situations, auditing, capacity planning, and service-level and change management, are significantly more challenging in a multi-system environment, especially where a multitude of operating systems and resource managers is employed. In addition, the complexity associated with managing a multitude of interconnections between systems can be substantial.
This uncovers the need for a system that is capable of sharing resources across multiple workloads with an end-to-end operational view of all the component parts supporting the
applications. The aforementioned challenges highlight the key advantage of a
consolidated system, namely that a single operating system has a view across all moving
parts, and can monitor the components, provision resources and adjust dispatching priority in real time, measured in seconds as opposed to minutes.
Security
The multi-system environment presents unique challenges for security. First, it’s difficult
to establish a uniform, consistent and reliable security policy across all systems. For
example, a policy change needs time to replicate across tens, hundreds or thousands of
servers. In addition, the links between systems must be protected often forcing repetitive
decryption and encryption of data affecting both the cost of accelerator hardware
(especially when it can’t be easily shared across physical platforms) and the management
and distribution of security keys (digital certificates).
Often, simplifications are employed; most frequently, trusted internal networks are put in place to minimize this effect. However, security experts advise that nearly half of all
security breaches are initiated from within the business. This has caused organizations
like VISA to begin to advocate encryption of private information across all network
connections (internet and intranet), end-to-end.
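A back-of-the-envelope calculation suggests the scale of the key-management burden. The sketch below assumes a small three-tier deployment in which every server in one tier may talk to every server in the next; the topology and counts are invented for illustration.

    // Counts the inter-tier network links that must be encrypted and keyed
    // when application tiers run on separate physical servers.
    public class SecuredLinks {
        public static void main(String[] args) {
            int[] tierSizes = {3, 4, 2}; // e.g. 3 web, 4 application, 2 database servers

            int links = 0;
            for (int t = 0; t < tierSizes.length - 1; t++) {
                // Assume any server in one tier may call any server in the next.
                links += tierSizes[t] * tierSizes[t + 1];
            }
            System.out.println("Inter-tier links to encrypt and key: " + links); // 20
        }
    }

Even this modest nine-server deployment yields 20 protected links, each with keys and certificates to distribute and renew; co-locating the tiers on one system removes those network hops entirely.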
In addition to the security infrastructure management challenge, there is the issue of
knowing where the servers are, what they are running, what private data may reside on those servers and whether it is being properly protected. Congressional initiatives such as Sarbanes-
Oxley have brought this issue to the forefront.
Virtualization
The latest innovation to address the aforementioned challenges is the concept of
virtualization. While not a new concept (virtualization technology dates back more than
30 years), it has gained renewed interest because of its potential to address many issues
associated with physically separate systems.
Virtualization allows a physical system to become many logical systems with the
isolation characteristic of a separate physical system. The Common Criteria, an international standard for security evaluation, define Evaluation Assurance Levels (EAL) that certify the secure isolation of logical systems. It is critical for systems offering virtualization to be certified against this standard. In addition, resources such as storage devices and
even supporting software infrastructure may be virtualized allowing several
implementations to be treated as one from a management perspective.
There are several approaches to virtualization:
• Hardware logical partitioning
• Virtual Machine partitioning
• Address Space isolation
• Grid
Hardware partitioning, commonly referred to as logical partitioning, allows the
hardware platform to support multiple operating system images. This type of
virtualization is best suited for separating environments such as test and development, for separating lines of business or types of applications that exhibit markedly different runtime behavior (e.g. batch from online), or where separation from an administrative perspective is desirable. Software partitioning allows many virtual
operating system images, commonly referred to as virtual machines, to run efficiently
under the control of a hypervisor which is responsible for dynamically allocating the
underlying hardware assets to the virtual environments based on their relative
importance. This type of virtualization is well suited for both environmental and
application isolation since the number of supported partitions can be extremely large.
Another approach that is unique to the z/OS operating system leverages the application
isolation of the z/OS address space allowing multiple applications to run under a single
operating system image while ensuring the integrity and security of each application.
Storage protection keys and hardware management of virtual address translation
implemented by the System z hardware offer secure separation of workloads providing
them with the efficiency of a shared physical memory. This approach minimizes the
number of operating system images, and hence complexity, by allowing application
environments to be virtualized under the control of a single operating system.
Combining virtualization techniques can eliminate single points of failure and facilitate
horizontal scaling. For example, by distributing work across multiple logical partitions,
each sharing the same physical resources, the operating system can be eliminated as a
single point of failure yet the application can continue to benefit from access to all
resources on the system. By placing logical partitions on separate systems in a tightly
coupled environment such as a Parallel Sysplex (a multi-system configuration, administered as a single system image, that provides high-speed data sharing with integrity), an entire physical system can be brought out of service while the application
continues to serve users. This concept extends naturally to geographically dispersed
systems offering near instantaneous recovery of one Sysplex with another to support
metropolitan and regional disaster recovery.
In the three styles discussed above, the approach is to bring the hardware assets to the
application. Grid, on the other hand, takes the opposite approach by intelligently routing
work and managing the applications running on a collection of systems. Clusters (a set of
servers capable of running the same application) no longer have to be dedicated to one
application and individual systems may be dynamically altered to run different
applications at different points in time. In essence, clusters become dynamically
configured.
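The toy sketch below conveys the flavor of grid-style scheduling: queued work is routed to a node already running the requested application, and an idle node is otherwise repurposed on the fly. The node names, applications and policy are invented, and a real grid scheduler is of course far more sophisticated.

    import java.util.*;

    // Toy grid scheduler: cluster nodes are not dedicated to one application;
    // an idle node is dynamically reassigned to whatever application has work.
    public class ToyGridScheduler {
        public static void main(String[] args) {
            Map<String, String> assignment = new LinkedHashMap<>();
            for (int i = 1; i <= 4; i++) assignment.put("node" + i, null); // all idle

            Deque<String> workQueue = new ArrayDeque<>(
                List.of("payroll", "payroll", "claims", "reporting", "claims"));

            while (!workQueue.isEmpty()) {
                String app = workQueue.poll();
                // Prefer a node already configured for this application...
                String node = assignment.entrySet().stream()
                    .filter(e -> app.equals(e.getValue()))
                    .map(Map.Entry::getKey).findFirst()
                    // ...otherwise repurpose an idle node.
                    .orElseGet(() -> assignment.entrySet().stream()
                        .filter(e -> e.getValue() == null)
                        .map(Map.Entry::getKey).findFirst().orElse(null));
                if (node == null) {
                    System.out.println(app + ": queued, no node available");
                } else {
                    assignment.put(node, app);
                    System.out.println(app + " -> " + node);
                }
            }
        }
    }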
Virtualization goes a long way to address many of the issues outlined previously.
Physical asset utilization can certainly increase because resources are no longer dedicated
to applications. Change management is positively affected by the ability to share
infrastructure and automate changes across virtual environments that are physically co-
located. Complexity is reduced by sharing assets and policies across virtualized instances
of application components. For example, a DMZ or a secure reverse proxy server may
share the same hardware and security policy with the application server and data base
server. Security is improved by eliminating network connections that require SSL endpoints, by sharing key rings, and by providing an end-to-end view of the application environment for authorization control and auditing.
Virtualization with System z
The System z supports all of the virtualization styles mentioned above. It also offers
some unique features that make virtualization a compelling option. Indeed, all System z configurations run in hardware logical partitions today. Logical partitioning is implemented in the microcode and is certified at EAL5. In addition, the hardware provides several important capabilities. First, address translation is handled by the hardware. An
application works with a virtual address which is dynamically mapped to real memory on
demand. Second, the hardware enforces storage protection keys, ensuring that unauthorized applications cannot access protected parts of the shared storage. In fact, this is a multilevel
storage protection scheme designed to ensure the security of the environment. Both
virtual machines running Linux and address spaces running under the control of z/OS
leverage these features. The result is a resilient system that allows multiple applications
to run together without compromising each other.
The System z is ideally suited for virtualization because of its ability to efficiently
manage mixed workloads. Virtualized environments are, by their nature, mixed
workload environments. They require an efficient multi-level cache and fast context
switching in order to run near rated speed under the unique demands of a multi-
application environment. In addition, when multiple applications are sharing resources, it
is essential to have an efficient way to ensure that priority work is serviced first. Without
this balancing act, bottlenecks may appear. Even for a single application that depends
upon multiple runtime components (e.g. the web server, the application server and the
data base server) it is critical that these resources work together and that resources are
dynamically allocated to ensure harmony. A delay within the web server would affect the
other components and the application overall. Of course, in periods of peak demand, it is
essential that the system work on the most important tasks first. This requirement is served by the workload manager built into the z/OS kernel and by z/VM in support of virtualized Linux images.
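The toy sketch below illustrates the principle of importance-driven allocation under peak demand. It is loosely inspired by, and in no way a rendering of, the actual z/OS Workload Manager algorithms; the workload names and numbers are invented.

    import java.util.*;

    // Toy importance-driven allocator: when total demand exceeds capacity,
    // the most important work is serviced first and low-importance work is
    // squeezed. Purely illustrative; not the real WLM algorithm.
    public class ToyWorkloadManager {
        record Workload(String name, int importance, double demand) {}

        public static void main(String[] args) {
            double capacity = 1.0; // normalized total CPU
            List<Workload> workloads = new ArrayList<>(List.of(
                new Workload("online-transactions", 1, 0.5),
                new Workload("web-serving",         2, 0.4),
                new Workload("batch-reporting",     3, 0.6)));

            // Lower number = higher importance; service that work first.
            workloads.sort(Comparator.comparingInt(Workload::importance));

            for (Workload w : workloads) {
                double granted = Math.min(w.demand(), capacity);
                capacity -= granted;
                System.out.printf("%-20s importance=%d granted %.2f of %.2f%n",
                        w.name(), w.importance(), granted, w.demand());
            }
            // Demand totals 1.5 against a capacity of 1.0: online transactions
            // receive 0.5, web serving 0.4, and batch reporting only 0.1.
        }
    }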
System z also benefits from non-disruptive dynamic sparing of components, carried out under the covers by the operating system and microcode. As the name implies (z
stands for zero downtime), redundant hardware components and specialized software
that redirects work around a failing component without application awareness are key
attributes of a virtualized environment. As the components involved in the execution of
the application are virtualized in order to make optimal use of resources, it becomes ever more important that the system is resilient. Relying on fast recovery alone is often not an option, since even a brief outage may cause queuing delays that affect operations for an extended period of time.
The System z provides an architectural design that integrates a variety of technologies to
achieve a heterogeneous network of systems “in a box”. Today, System z augments its
native processors with custom cryptographic accelerators, System Assist Processors and
scores of processors to offload I/O processing from the application processors (figure 3).
Future implementations are expected to integrate the Cell Broadband Engine Architecture, vastly expanding the system’s ability to support extreme graphics and compute-intensive processing through its ability to perform more than 250 billion floating point operations
per second. This may eliminate, for instance, the need to transfer data to another platform
in order to perform a complex analysis saving the operational costs associated with the
transfer task and avoiding the security challenges associated with replicated data. What is
of paramount importance is that this capability will integrate into the system in order to
leverage a cohesive design to ensure security, integrity, high availability and application
transparency. It will open the door to a set of applications that tightly integrate business
processing with compute intensive processing to support financial, insurance, law
enforcement and military applications. It will no longer be necessary to build complex
relationships between different systems in order to serve the needs of diverse workloads
relating to a single application, and the complexity of securing and prioritizing a multi-system environment is avoided altogether.
The multiprocessor design of the System z has been extended to support the notion of
specialty processors. These processors are able to dynamically offload work from the general purpose processors in order to achieve landmark price/performance for specific
critical industry workloads. By adding specialty processors to an existing System z configuration, Linux, Java and database workloads achieve breakthrough price/performance. Consider the advantage of being able to consolidate 20
physical server platforms onto a chipset the size of your hand while not affecting the
execution of existing mission critical applications (figure 4). The reduction in energy
consumption and floor space is obvious. What may not be obvious, at first glance, is the
value of incrementally adding the new workload to an environment that supports the
security, availability and disaster recovery already in place to support existing mission
critical workloads. This value is best summarized in one word: leverage. System z
affords the opportunity to leverage an existing investment and extend the benefits of that
investment to new workloads.
Just by Showing Up
New workloads inherit many benefits “just by showing up” on the system. The existing
disaster recovery infrastructure may be leveraged since the new workload is merely a
collection of common artifacts that may be easily added to the existing policy. Tasks
such as capacity planning, security audit and file backup may be added to the existing
automation policy. The Sysplex Distributor, built into z/OS, performs Workload Manager-driven distribution of requests across a Parallel Sysplex, steering work around systems that are under stress or out of service. DB2 data sharing and shared message queues allow for extremely efficient sharing of resources across multiple systems. Internal propagation of security
identity under the control of a common security policy, administered independently from
the application, that spans multiple systems delivers the necessary protection.
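The following toy sketch illustrates this style of distribution: each target system advertises a weight reflecting its health and available capacity, requests are routed in proportion to the weights, and a system under stress or out of service (weight zero) receives nothing. The system names and weights are invented; the real Sysplex Distributor derives its routing recommendations from the Workload Manager.

    import java.util.*;

    // Toy weighted request router in the style of WLM-driven distribution.
    // Weights are invented; a weight of zero models a system to avoid.
    public class ToyDistributor {
        record Target(String name, int weight) {}

        public static void main(String[] args) {
            List<Target> targets = List.of(
                new Target("SYSA", 60),  // healthy, ample capacity
                new Target("SYSB", 30),  // constrained
                new Target("SYSC", 0));  // under stress or out of service

            int totalWeight = targets.stream().mapToInt(Target::weight).sum();
            Random rng = new Random(42);

            for (int request = 1; request <= 5; request++) {
                int pick = rng.nextInt(totalWeight); // 0..totalWeight-1
                for (Target t : targets) {
                    if ((pick -= t.weight()) < 0) {
                        System.out.println("request " + request + " -> " + t.name());
                        break;
                    }
                }
            }
        }
    }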
From a day-to-day operational perspective, resource managers such as the application
server, the database server and the security server are started tasks managed and
recovered by system automation. Messages routed to the console are logged for post
event analysis and may drive automation tasks to minimize human intervention and the
potential for error.
Since many core business processes execute on System z today, new workloads hosted by the application server may access these resources directly over cross-memory connections. Not only does this provide greater efficiency, it offers improved security since network connections are avoided. There is no need to specify unique user IDs and passwords in the application server since identity propagation may be managed by the z/OS security subsystem and controlled exclusively by security personnel. Security certificates may be managed by the z/OS security subsystem, supporting both clear key and secure key (the key is stored in tamper-resistant hardware and is never visible in
memory) operations, to provide consistent security management across the end-to-end application environment.
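As a minimal sketch of the difference, the fragment below contrasts a local DB2 connection with a remote one using the IBM JDBC driver, assuming the driver is on the classpath; the database name, host, port and credentials are placeholders. With a type 2 (local) connection on z/OS, the driver attaches to the co-located DB2 subsystem directly rather than over a socket, and the identity of the address space can stand in for coded credentials.

    import java.sql.Connection;
    import java.sql.DriverManager;

    // Sketch: local (type 2) versus network (type 4) DB2 connectivity.
    // Database name, host, port and credentials are placeholders.
    public class LocalVersusNetwork {
        public static void main(String[] args) throws Exception {
            Class.forName("com.ibm.db2.jcc.DB2Driver"); // IBM JCC driver

            // Type 2: no host or port; on z/OS this is a cross-memory
            // attachment to the local DB2 subsystem, with no network hop
            // and no user ID or password coded in the application.
            Connection local = DriverManager.getConnection("jdbc:db2:MYDB");

            // Type 4: a DRDA network connection whose traffic crosses a
            // socket that must be separately secured and authenticated.
            Connection remote = DriverManager.getConnection(
                    "jdbc:db2://dbhost.example.com:446/MYDB", "user", "password");

            local.close();
            remote.close();
        }
    }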
The net value proposition is founded on leverage. It is the ability to deploy new workloads, developed using a common, platform-agnostic programming model, onto an existing environment that is resilient, scalable and managed today. The incremental cost
of adding this workload to the existing environment is marginal, given the pricing model
associated with specialty processors, even without accounting for the total cost savings
accrued by leveraging the operational environment. When total cost of ownership is
considered, the savings can be astounding.
Conclusion
The ability of System z to absorb new workloads offers substantial benefits. It delivers “best of breed” virtualization to address underutilization of assets without compromising security; it simplifies the environment by providing control logic which manages the end-to-end application environment, allocating resources on demand while uniformly applying the necessary security constraints. The common programming model addresses the need to
consolidate skills and more effectively leverage them across many projects. The fast
provisioning of new virtualized server instances (minutes instead of days) radically alters
the time to respond to new business opportunities while allowing for physical assets to be
shared and redundant costs to be eliminated.
While some enterprises are consolidating discrete physical systems onto server farms and
physically locating them on a common data center floor in an attempt to gain control of
operational costs, others are eliminating the discrete systems and consolidating onto
virtual server instances that benefit from the mixed workload attributes of System z. They
are benefiting from the ability to share physical resources across applications while
delivering the secure isolation that is essential. They are leveraging their existing
investment in System z, funded by traditional business applications, and taking advantage
of the landmark price/performance offered by the zAAP and zIIP. This is allowing them
to co-locate new workloads with existing applications and data while taking advantage of
the availability, security, and disaster recovery infrastructure in place today. For many,
adding processor chips to their existing Sysplex offers the lowest cost of acquisition, and when management benefits are tabulated, the total cost of ownership advantage can be
extraordinary. Developers are being freed from the task of building and maintaining their
own test systems while avoiding the need to requisition additional assets in order to move
forward with testing in support of new projects. Best of all, they are finding that the
incremental cost of adding the second, third and fourth application is so small that
applications that may not have been deemed candidates for the System z are now being
deployed. Many enterprises have more than 100 new Java applications running on the
Parallel Sysplex alongside their traditional COBOL applications. Better still, the application developers have not had to alter their applications or their development methodology.
Activating a zAAP or zIIP requires no additional floor space and consumes the energy of a standard household light bulb, 65 watts. So, a side benefit is a reduction in the energy bill and, in some instances, the ability to avoid acquiring a new facility.
The new System z delivers the ability to take server consolidation to a new level, one that dramatically lowers cost, simplifies the infrastructure, supports sound environmental policy and facilitates faster delivery of innovative new applications.
The advent of a consolidated system that brings the strength of multiple technologies
“under the covers” of a unified management infrastructure is a transformation that must
be understood.
Figure 1: Personnel expenses have become the dominant component of TCO. Based on IBM Scorpion customer analyses. (Chart shows a spending split of 43%, 27%, 20% and 10%; per the text, people and software now account for 70% of spending while hardware accounts for 20%.)
Figure 2: Variance for a web portal with 13 servers organized into 3 tiers (web, application and database, plus supporting infrastructure and file servers). Comparing relative load at the average, the per-tier peaks and the total peak shows that the peak of the sum is significantly lower than the sum of the peaks.
Figure 3: System z9, an IBM eServer z9-109 superscalar CMOS 54-way SMP, tuned for system utilization, industry-leading RAS, system security and data integrity. Up to 336 PowerPC processors handle I/O operations, with 10 more for offload; cryptographic facilities, channel controllers, cage controllers and service processors run alongside the application processors. Every execution unit has a dedicated “cross-check” processor for system reliability, stability and integrity.
Figure 4: Consolidating discrete datacenter servers onto the mainframe.