Why Automate Data Center Operations? © 2015 Zefflin Systems All Rights Reserved
White Paper
Zefflin Systems LLC
Why Automate Data Center Operations?
1. Introduction
2. What Processes Should I Target for Automation?
3. What Problems Can I Solve?
4. Today’s Software Tools
5. What Are My Peers Doing?
6. How Do I Integrate Data Center Automation into My Organization and Environment?
7. What Value Does Data Center Automation Bring to My Organization?
8. Summary
9. About Zefflin
1. Introduction
As an IT leader, you are faced with supporting a growing business while budgets remain flat. At the
same time you are expected to increase the speed, quality and reliability of service – all as technology
constantly evolves and changes. Virtualization of the IT compute and storage infrastructure has
revolutionized operations, increased productivity and resource utilization, and brought new agility to IT.
But virtualization has also created new challenges and problems. Even after making virtualization an
integral part of operations, many IT organizations find themselves asking “What else can we improve
upon?”
The answer lies with the next logical stage in
virtualization’s evolutionary path: automate IT
operations functions to increase productivity.
Regardless of the cloud architecture chosen (public,
private, hybrid), automation of processes in the areas
of Catalog/Request Management, Approvals,
Chargeback, Provisioning (not only OS, but storage,
network, database and application), Governance and
Compliance yields a significant ROI for today’s IT
organization.
Software tools that are used to automate IT processes have matured significantly in recent years. They
now cost less to implement and are easier to integrate into existing infrastructure and tools. Superior
integration capability means that previous investments in areas like virtualization can be preserved and
a best-of-breed approach can be taken without forcing vendor lock-in. Open source software like
OpenStack™ has put tremendous downward pricing pressure on traditional enterprise software. This
means that the ROI of automating specific parts of IT Operations has changed in favor of the CIO, and
what a short time ago might have been a significant financial commitment with high risk is now much
less in both cost and risk. Automation is feasible, affordable and carries much lower risk than even one
year ago.
Virtualization solves problems, but creates others.
2. What Processes Should I Target for Automation?
Data center processes can be both numerous and complex. Each
process should be evaluated in terms of the cost of automation (i.e.,
implementation and maintenance) versus the labor and other cost
savings gained. There is a set of core processes, however, that
have a large impact on IT service speed, quality and repeatability,
as outlined below.
Process: Request/Catalog Management
Benefit of Automation: Enable self-service requests for complex computing environments, resulting in better control over the standards used in both development and production.
ROI: Faster response time – not waiting for an administrator to analyze environments. Reduction in administration time spent on:
• Analysis of requests
• Demand management
• Capacity planning

Process: Approvals
Benefit of Automation: Creates an audit trail of all requests and approvals. Brings transparency to the process, so requestors can see where approvals are stuck and how long they can expect them to take.
ROI: Reduced time for all stakeholders, because approvals have full visibility – no more not knowing what the hold-up is.

Process: Charge-back/Show-back
Benefit of Automation: Enable cost accounting at a department level, which can be an improvement over public cloud providers by requiring less paperwork, like expense reports and manual chargeback.
ROI:
• Reduced administration and accounting
• Beginning of management of real IT cost

Process: Provisioning (Operating System, Storage, Database, Network, Application)
Benefit of Automation:
• Better control, reduction in human error
• More efficient use of storage and server capacity
• Standardize OS images
• Faster provisioning of complex computing environments
ROI: Increase in Administrator/Engineer productivity for:
• Systems
• Network
• Storage
• Database (DBA)
Step back and objectively look at which
processes to automate.
Process: Governance
Benefit of Automation: Better control over environments, resulting in reduced management cost.
ROI:
• Reduced administration time – no one is required to constantly monitor development environments to see if they are still being used
• Increase in application development resource utilization – development environments are archived and retired per policy
• Reduced infrastructure cost

Process: Compliance
Benefit of Automation:
• Automatically flags out-of-compliance situations for key areas including PCI, internal security and ISO
• Helps to identify previously unknown processes that result in non-compliance (such as application hotfixes)
• Provides flexibility as to how to deal with out-of-compliance situations (like opening a Service Management incident, routing to a person for correction, or automatically correcting, then notifying key personnel)
ROI: Dramatic increase in staff productivity
• Reduces the number of out-of-compliance incidents
• Reduces the resources required to maintain compliance
• Improves security without hiring new staff

Process: General Policy Automation
Benefit of Automation: This category features use of orchestration solutions to automate numerous repeatable IT operations tasks such as:
• Automating password reset policy
• Automating event remediation (e.g., app restart or server reboot)
• Workflow integration with existing systems
ROI: Improve staff productivity
• Redeploy staff from operations to more strategic initiatives
3. What Problems Can I Solve?
There are many day-to-day activities and tasks that are
performed by system administrators and IT support staff.
The opportunities for automation are endless and the following describes just a few.
Opportunities for automation are numerous.
Operating System Provisioning
Today most IT shops run various flavors of Windows and Linux. Manual configuration is often done after
deployment in a virtualized or physical environment. This is time consuming and prone to manual
mistakes. With an automation approach, templates can be built for standard versions of each OS (with
various configurations depending upon the purpose of the server). Those standard templates can then
be presented in a catalog format so administrators can pick which OS versions they want to
deploy. These same templates can also be used post-deployment to validate images against known
standards. This is particularly effective when automating the audit process. Each server is checked
against an approved image template. If there are differences, remediation can also be automated,
either by automatically deploying changes to return the server to compliance, or by notifying
compliance personnel to investigate further. Manual intervention in this process can be advisable,
especially when first automated, to ensure operational continuity and avoid any undesired rollbacks.
Once OS templates are built and the deployment process is automated, the IT infrastructure is better
controlled, standards are more easily enforced and compliance is improved – all while increasing
productivity of existing staff.
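The audit step described above can be sketched in a few lines of Python. This is a minimal illustration, not a product feature: the template keys, the server names and the function names (audit_server, plan_remediation) are hypothetical, and a real tool would collect the actual settings from each server rather than receive them as a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class DriftReport:
    """Differences between one server and its approved image template."""
    server: str
    missing: dict = field(default_factory=dict)     # expected settings absent on the server
    mismatched: dict = field(default_factory=dict)  # setting -> (expected, actual)

    @property
    def compliant(self) -> bool:
        return not self.missing and not self.mismatched


def audit_server(server: str, template: dict, actual: dict) -> DriftReport:
    """Check every setting in the approved template against the server's actual settings."""
    report = DriftReport(server=server)
    for key, expected in template.items():
        if key not in actual:
            report.missing[key] = expected
        elif actual[key] != expected:
            report.mismatched[key] = (expected, actual[key])
    return report


def plan_remediation(report: DriftReport, auto_fix: bool = False) -> str:
    """Pick one of the two remediation paths named in the text."""
    if report.compliant:
        return "none"
    # auto_fix defaults to False so that a person reviews changes first,
    # in line with the advice to keep manual intervention early on.
    return "auto_remediate" if auto_fix else "notify_compliance"
```

Keeping the automatic fix switched off by default reflects the caution above about undesired rollbacks when a process is first automated.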
DevOps and Automation
DevOps encompasses the process involved in moving
from code through build, testing, release, and production
rollout. Traditionally application development
organizations have not communicated or coordinated
with operations and/or support teams. As a result, bug
tracking and feedback on production application
releases/upgrades was often reactive and unstructured.
For example, the helpdesk may have been surprised with
a flood of calls resulting from a new release of which they
were not aware.
Development and Operations historically didn't get along.
DevOps as a discipline has improved the situation. Like
any process, once the workflow of code → build → test
→ release → production is well defined, it can be
automated. Automation should not only strive to reduce manual effort and improve speed and quality;
it should facilitate coordination and communication between development and operations. A
well-automated process can bring tremendous advantages in agility and competitiveness to companies that
invest in it. The build process, consisting of code merge, compiling and packaging, is commonly
automated with development management tools. Once the build is complete, testing can be automated
via integration with orchestration, functional, and performance testing tools. Performance testing is
particularly important for large user applications. It is not uncommon for multiple tests to be run to
check many different infrastructure scenarios. Using orchestration, it is possible to queue up scenarios
automatically, provision the environments, run the tests, record the results, de-provision the
environment and provision the next environment, and so on until all tests are completed and passed.
This accelerates the testing process substantially. Additionally, the results can be forwarded in report
form automatically. Orchestration can then complete the next phase of the process, migration to
production. To facilitate communication and remain ITIL compliant, the orchestration can open/close a
change ticket, recording the release and all cases resolved with the new release (for the support
organization to communicate that to the end users) and notify the help desk (so they can prepare for a
potential increase in support calls).
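The queued test cycle described above (provision, test, record, de-provision, repeat) might look like the following sketch. The provision, run_test and deprovision callables are placeholders for whatever the chosen orchestration and testing tools expose; none of the names come from a specific product.

```python
from collections import deque


def run_scenarios(scenarios, provision, run_test, deprovision):
    """Work through a queue of infrastructure scenarios one at a time.

    provision, run_test and deprovision are caller-supplied callables
    (hypothetical hooks into the orchestration and testing tools).
    """
    queue = deque(scenarios)
    results = []
    while queue:
        scenario = queue.popleft()
        env = provision(scenario)
        try:
            passed = run_test(env, scenario)
        finally:
            # Always release the environment, even if the test itself raises.
            deprovision(env)
        results.append({"scenario": scenario["name"], "passed": passed})
    return results


def all_passed(results) -> bool:
    """True only when every recorded scenario passed."""
    return all(r["passed"] for r in results)
```

The results list is the raw material for the automatic report mentioned above; a release would typically be promoted to production only when all_passed returns True.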
Automated Problem Remediation
In complex IT environments, there are many day-to-day
operational tasks that represent workarounds or temporary
fixes. These tasks are done on a regular basis and take up
significant administrator time. Many are not even tracked –
IT support staff just complete them on an individual,
isolated basis. This means that:
a. It takes an unknown amount of resources;
b. There is no way to measure the impact to IT
or to the business; and
c. The helpdesk never knows about it.
Automation not only frees up staff; it also offers a way
to record, track and measure these kinds of tasks, including how
often they happen, the degree of interruption of service, and how
long it takes to correct them. Consider the example of an
application with a memory leak. Periodically, the server runs out of
memory and crashes, interrupting service. The vendor promises the
issue is “fixed in the next release”, but until then the service has to
be restarted when it comes close to depleting system memory. In
an automation framework, an orchestration tool is integrated with a
monitoring tool, and when system memory gets low, it automatically calls a script to restart the service,
timing the whole process. When the service comes back online, the orchestration tool opens a service
desk ticket and closes it, recording the downtime and the impact, satisfying ITIL audit requirements and
keeping the helpdesk in the loop. As a result, service impact and human intervention are minimized until
the next vendor patch release. Implementation of this kind of workflow is very low cost given the
functionality of today’s orchestration tools.
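As an illustration only, the memory-leak workflow could be expressed as follows. The threshold, the ticket text and the three callables (restart_service, open_ticket, close_ticket) are assumptions standing in for the monitoring, scripting and service desk integrations of a real orchestration tool.

```python
import time


def remediate_low_memory(free_mb, threshold_mb, restart_service,
                         open_ticket, close_ticket, clock=time.monotonic):
    """Restart a leaking service when free memory is low and record the event.

    restart_service, open_ticket and close_ticket are hypothetical hooks;
    clock is injectable so the timing can be tested without waiting.
    """
    if free_mb >= threshold_mb:
        return None  # nothing to do

    started = clock()
    restart_service()  # assumed to return once the service is back online
    downtime = clock() - started

    # Open and immediately close a ticket so the event is on record
    # for audit purposes and the helpdesk can see it.
    ticket_id = open_ticket(
        f"Automatic restart: free memory {free_mb} MB below {threshold_mb} MB"
    )
    close_ticket(ticket_id, downtime_seconds=downtime)
    return {"ticket": ticket_id, "downtime_seconds": downtime}
```

Because every run produces a ticket with a measured downtime, the frequency and impact of the workaround become reportable, which is the point made in items a through c above.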
Manual work-arounds are impossible to
track and measure without automation.
Free up valuable staff from
repeatable tasks.
Virtual Sprawl
Virtual sprawl is a common challenge. Virtualization has
made it easy to deploy computing environments for
development, testing, or production. This has been good
for IT, providing flexibility and ability to deploy servers
quickly. It also creates new problems in keeping track of all
those virtual environments. Monitoring tools can help, but
the management of any development, test or
production computing environment is manual and resource intensive.
If an administrator needs to tear down an environment to
free up resources, they have to find the owner, check if they
still need it, and then back up any data or applications that
need to be preserved. Using automation, this can be achieved by implementing policies up front,
combined with use of orchestration and other automation tools to support the process. In an
automated environment, policies can be built into the request process (which is easier to do if it is
catalog-based). For example, when an individual requests a new computing environment to be
deployed (server, storage, network, database, application), they would pick a time limit (e.g., 30 days, 90
days, or indefinite). Once the system is live, orchestration tools monitor the environment (via
integration to the virtualization platform, OS and network monitoring tools, and system logs) for key
policy parameters such as:
a) last user to log in
b) network, CPU, memory, or IO activity
c) uptime
d) log activity
For time-based policies, it is more straightforward. When a computing environment is past its time
limit, orchestration runs shutdown scripts, initiates backup of data and application (via integration with
those tools), and opens/closes a change request, documenting the event. In the case of no time limit, a
governance policy is enforced. For example, a policy may be that if there is no network traffic to an
application, no one has logged in for 30 days and there is limited log activity in 10 days, automatically
notify the owner and open a change ticket, shut it down, take a snapshot of the environment, back up
the data, and close the change ticket. This can all be done via an orchestration tool integrated into
monitoring, backup/archive and change management systems, with little to no human intervention to
enforce the policy.
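A policy check of this kind can be sketched as below. The field names, the reading of "limited log activity" as a maximum number of log lines, and the use of the same 30-day window for network traffic are assumptions made for the example; the source policy does not define them precisely.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Environment:
    name: str
    owner: str
    created: date
    time_limit_days: Optional[int]   # None means "indefinite"
    last_login: date
    last_network_traffic: date
    log_lines_last_10_days: int


def should_retire(env: Environment, today: date,
                  idle_days: int = 30, max_log_lines: int = 10) -> Optional[str]:
    """Return the reason an environment should be retired, or None to keep it."""
    if env.time_limit_days is not None:
        # Time-based policy: retire once the requested limit has passed.
        if today > env.created + timedelta(days=env.time_limit_days):
            return "time limit expired"
        return None

    # Governance policy for environments with no time limit:
    # no recent logins, no recent traffic, and little log activity.
    idle_since = today - timedelta(days=idle_days)
    if (env.last_login < idle_since
            and env.last_network_traffic < idle_since
            and env.log_lines_last_10_days <= max_log_lines):
        return "idle per governance policy"
    return None
```

An orchestration workflow would call a check like this on a schedule and, for any environment that returns a reason, run the steps listed above: notify the owner, open a change ticket, shut down, snapshot, back up and close the ticket.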
Virtual environments are easier to manage with automation.
Password Reset
Most companies have a specific password reset policy when it comes to root and Administrator access
on virtual and physical servers. A typical policy might dictate that passwords are changed every 90 days
(or immediately if an employee with access leaves the company). This typically involves an
administrator with privileged access going to each server, manually logging in, setting the new password
and manually recording it in a secure location. Often conventions are used for development, staging
and production servers (e.g., password_dev, password_test, password_prod). Manual errors can be
made, resulting in additional administrator time to correct. As the number of managed servers
increases, this process becomes more time consuming and prone to error. Now consider an automated
solution using an orchestration tool. With a small amount of effort, a workflow can be built to initiate
the process for all servers using a seed string that will automatically generate the passwords and add
“_dev”, “_test” or “_prod” for the appropriate servers and record all passwords in a secure location for
authorized access. With automation, you have an efficient, secure solution that eliminates
human error. At 2-3 minutes per server of manual effort, a shop with 1000 servers spends up to
50 man-hours each time passwords are reset. Automating this process would yield a compelling
ROI.
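The sketch below shows one way such a workflow could derive passwords from a seed and reproduces the 50 man-hour estimate. It is an assumption-laden illustration: the use of HMAC-SHA256 is this example's choice rather than something the workflow above prescribes, the dictionary stands in for a secure credential store, and the step that actually sets the password on each server is omitted.

```python
import base64
import hashlib
import hmac

# Environment suffixes taken from the naming convention in the text.
SUFFIXES = {"dev": "_dev", "test": "_test", "prod": "_prod"}


def derive_password(seed: bytes, hostname: str, environment: str, length: int = 20) -> str:
    """Derive a per-server password from a secret seed and the host name."""
    digest = hmac.new(seed, hostname.encode("utf-8"), hashlib.sha256).digest()
    body = base64.urlsafe_b64encode(digest).decode("ascii")[:length]
    return body + SUFFIXES[environment]


def rotate_all(seed: bytes, servers, store: dict) -> int:
    """Generate and record a new password for every (hostname, environment) pair.

    store is a plain dict here; a real workflow would write to a secure vault.
    """
    for hostname, environment in servers:
        store[hostname] = derive_password(seed, hostname, environment)
    return len(servers)


def manual_rotation_hours(server_count: int, minutes_per_server: float) -> float:
    """Labor spent when an administrator resets each server by hand."""
    return server_count * minutes_per_server / 60
```

With 1000 servers at 3 minutes each, manual_rotation_hours returns 50.0, matching the upper figure quoted above; at 2 minutes each the result is roughly 33 hours.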
4. Today’s Software Tools
There are many software tools on the market today, with
new ones emerging regularly. The list below is not a
comprehensive one, but shows some of the industry
leaders. Tools vary greatly in maturity, cost and scalability.
The key is to select the right tools for your organization that
will minimize cost, risk and resource investment, while
enabling your organization to grow the solution as your
company grows.
New automation software comes to market
faster than ever, with seemingly endless
choices.
Software Description
OpenStack™
There are many different OpenStack distributions. All use the core OpenStack code,
then add their own IP, including utilities, architecture and APIs. The architectures
and engineering approaches are different, resulting in significant differences between
distributions, from installation to scalability to user interface.
Major distributions include:
• Hewlett Packard (Helion™)
• Mirantis®
• Piston Cloud®
• Red Hat® (RDO™)
• Canonical®
• cloudscaling®
• MORPHlabs®
Scalr™
Cloud Management Platform, with out-of-box functionality designed to automate the
entire lifecycle of complex computing environments: Service catalog, self-service, cloud
environment management, governance, compliance and analytics.
Red Hat™ CloudForms
Cloud Management Platform, with out-of-box functionality designed to automate the
entire lifecycle of complex computing environments: Service catalog, self-service, cloud
environment management, governance, compliance and analytics.
Puppet™
Automation and orchestration tool, designed for DevOps and configuration management
processes.
Chef™
Automation and orchestration tool, designed for DevOps processes.
OpenCrowbar™
Bare metal provisioning and deployment of OpenStack and other applications.
RightScale™
Cloud Management Platform, with out-of-box functionality designed to automate the
entire lifecycle of complex computing environments: Service catalog, self-service, cloud
environment management, governance, compliance and analytics.
Hewlett Packard
Cloud System Automation Suite™: Server provisioning, automated change detection (for
audit) and orchestration.
• HP Server Automation™ – Server provisioning, application provisioning, patching,
configuration, compliance and governance
• HP Cloud System Automation™ – Catalog, server and storage provisioning
• HP Operations Orchestrator™ - Orchestration
• HP Network Automation™ – Network provisioning and configuration
VMware® vCloud™ Suite
Includes the vRealize™ suite components, also known as:
• vCenter Orchestrator™ (vCO), orchestration tool
• vCloud Automation Center™ (vCAC), for automation of server provisioning and
compliance audits
• vCenter Operations™ (vCOPS), for monitoring of systems
IBM®
• IBM Cloud Orchestrator™
• IBM Cloud Manager™ – server provisioning and virtual environment deployment,
approvals, chargeback
Cisco®
This includes the service catalog, NewScale™, acquired by Cisco in 2011.
BMC®
• BladeLogic Server Automation™ – server provisioning
• BladeLogic Database Automation™ – database provisioning and operations
• BladeLogic Network Automation™ – Network provisioning and configuration
• BladeLogic Middleware Automation™ – Deploys, configures, and troubleshoots Java
EE applications
• Cloud Lifecycle Management™ – Service catalog, server provisioning, governance
and compliance
• Atrium Orchestrator™ - orchestration
CA®
CA Automation Suite™
• CA Server Automation™ – Server provisioning, application provisioning, patching and
OS configuration
• CA Process Automation™ – Orchestration
• CA Configuration Automation™ – Compliance and configuration management
5. What Are My Peers Doing?
IT organizations are now realizing that virtualizing the compute and storage environments is just the
starting point, and in order to continue to reduce the cost of computing, further investment in
automation is necessary. As a result, they are now starting to automate operational processes
surrounding the request, approval, provisioning, monitoring, maintenance, compliance and governance
processes of their complex computing environments. The important thing to remember is that
virtualization lays the groundwork for automation. Virtualization is not, by itself, automation. Also,
having a public, private or hybrid cloud
environment does not eliminate the need for
automation. It is relevant and necessary no
matter what the architecture, because the
business processes around requesting,
configuring, chargeback, provisioning,
governance and compliance are just as relevant
if your applications are running on a public,
private or hybrid cloud. In fact, it should not
matter where your computing resources are
running. A properly implemented automation
framework will serve as an abstraction layer
between the users and the computing infrastructure.
Today’s progressive and forward-thinking IT organizations are well past virtualization and templating of
OS images. They are investing in the next round of productivity increase – because they have to if they
want to stay relevant and their company competitive. Their companies are growing and their IT budget
as a percentage of company revenue is shrinking. If they don’t automate, streamline and enable their
administrators to do more (much more) with less, they know IT will eventually be the organization that
inhibits company growth. No CIO wants to be the subject of an analyst call. Medium to large
organizations are implementing full private cloud environments, from fully defined service catalogs to
automated provisioning, compliance audits and policy-based governance of most computing
environments, especially application development environments. In addition, they are looking at every
opportunity to automate all processes in their environment, including, but not limited to, those
discussed in this white paper.
Forward thinking IT organizations are starting to reap
significant ROI from automation.
6. How Do I Integrate Data Center Automation into My Organization and Environment?
An automation strategy and a plan go hand-in-hand with a cloud strategy. A cloud strategy and
architecture, whether private, public or hybrid is an
essential first step, but is only part of the answer.
Once you can easily and adaptably deploy OS and
storage in a virtual environment, you should think
about how to automate the processes around that
cloud environment. These processes include
service catalog, approvals, chargeback, application
and database provisioning, governance and
compliance.
The following steps are essential in adopting an automation strategy.
1. Cloud strategy, architecture and roadmap. It is important to understand what you will be working
with before considering automation. For example, choosing AWS as your primary platform provider
may affect the choice of automation tools (like orchestration or server provisioning) and processes
(like application provisioning or compliance).
2. Step back and look at all manual processes. It is important to objectively look at any manual
processes. It is equally essential to look at each process from an ROI perspective: How much do I
have to invest in automating this process? How much do I have to invest in maintaining it? How
much labor can I save as a result? Caution: pride of ownership and turf protection can influence the
outcome of this review – it must be strictly objective. Some processes may have to be adjusted or
re-engineered which adds to the cost. Examples of simple processes to automate would be server
root password reset or event remediation. More complex processes might include application
provisioning and configuration.
3. Develop a short, medium and long term strategy and objectives, with ROI expectations for each
stage. This will help prioritize and set expectations. Often it is good to start with short, quick win
types of automation projects to prove the success and generate internal momentum for the idea of
further investment in automation. This planning should be done with a firm understanding of what
is possible, feasible and risk appropriate.
4. Identify software tools. Today there is an overwhelming number and variety of software
tools, from open source to startups and well-established enterprise software companies, that
purport to automate data center processes of all kinds. New tools appear on a weekly basis. It is
important to filter out the noise, cut through the hype and find out what will work for your
organization at a reasonable cost. It is also crucial to determine if you already own some of the
software that can be used, which will dramatically cut cost. For example, if your company has a
EULA with an enterprise software company in place, you may have access to some tools already
under the terms of that EULA. A solid orchestration tool is essential, as orchestration is the
centerpiece of automation of data center processes. It should be flexible, able to support custom
workflows without extensive training and have a large library of plug-ins or APIs that can be used to
integrate with your existing applications such as service desk, change management or DevOps tools.
A structured, incremental approach makes automation manageable. Measure success.
5. Take a baseline for future comparison. A baseline is essential in order to measure progress and
success of future automation efforts. A baseline should encompass metrics for cost and speed of
service and can include measurements like:
a. Average number of admins per server
b. Server utilization (not just systems deployed,
but those that are used)
c. Average time to deploy:
i. A development environment
ii. Production servers and applications
d. Average compliance rate
i. Security
ii. PCI
iii. Internal standards
e. Cost of ensuring compliance, including
manual effort
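The baseline in step 5 and the ROI questions in step 2 can be captured in a small record so that later measurements are compared on the same terms. The sketch below is illustrative: the field names and the simple multi-year ROI formula are assumptions, and an organization would substitute its own cost model.

```python
from dataclasses import dataclass


@dataclass
class Baseline:
    """Snapshot of a few baseline metrics taken before automating."""
    servers: int
    admins: int
    servers_in_use: int          # deployed and actually used
    compliant_servers: int
    avg_deploy_hours_dev: float  # average time to deploy a development environment

    @property
    def admins_per_server(self) -> float:
        return self.admins / self.servers

    @property
    def utilization(self) -> float:
        return self.servers_in_use / self.servers

    @property
    def compliance_rate(self) -> float:
        return self.compliant_servers / self.servers


def simple_roi(implementation_cost: float, annual_maintenance_cost: float,
               annual_labor_savings: float, years: int = 3) -> float:
    """Net savings divided by total cost over the chosen horizon."""
    total_cost = implementation_cost + annual_maintenance_cost * years
    return (annual_labor_savings * years - total_cost) / total_cost
```

For example, a process that costs 100,000 to automate and 20,000 a year to maintain, while saving 80,000 a year in labor, gives simple_roi(100000, 20000, 80000) = 0.5 over three years, that is, a 50% return on the total spend.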
7. What Value Does Data Center Automation Bring to My Organization?
Automation can bring tremendous value if implemented
well. Improvements in agility, speed, control, cost and end
user satisfaction are all attainable, clearly demonstrating
IT’s value as a budget-focused partner to the rest of the
business.
IT is a competitive weapon, as demonstrated through:
• End user satisfaction. End users, who historically
waited for days or weeks for a new server, are now
ecstatic at consistent wait times in hours. For those users who were circumventing IT and are used to
providing a credit card number to public cloud providers and getting instant infrastructure, it is also
a win. Previously they had to go around IT and fill out an expense report. Now they can go to the IT
portal and get the same service while avoiding completion of the expense report or other record
keeping. This reduces friction in getting their jobs done.
• Speed of delivery. With automation and cloud computing, it is common to be able to take a
request, route it for approval, calculate chargeback and provision a complex environment (i.e.,
server/OS, storage, database, network and application) in minutes or hours, rather than days or
weeks. The ability to do that predictably, reliably and in a repeatable way, has tremendous impact
on the business and agility of the entire company.
• Quality of service. When any IT process is standardized and automated, results become predictable
and repeatable, which raises the quality of services. With provisioning, this means that business
users can count on getting a computing environment up and running at a predictable turn-around
time so they can plan their projects more effectively and obtain better business outcomes faster.
With compliance automated, out of compliance situations are flagged much more frequently and
reliably, increasing the rate of compliance. When governance policies are automated, the
computing environment lifecycle is better controlled and resources are more efficiently utilized.
• Cost of service. Cost of delivering IT services drops significantly after automation. When fewer
administrators are required to achieve a higher throughput and resource utilization increases, costs
will go down dramatically.
• Agility. Business agility is increased because users know that they can get turnaround on successful
deployment of complex computing environments within hours. They can plan their business
deliverables around this, which enables the company to react to market changes with more agility
and urgency, potentially before the competition.
• Dramatic increase in productivity. It is not uncommon for organizations to go from one admin for
50 servers to one for 300 when adopting a full automation strategy. This often involves rearranging
skill sets; some resources are diverted to maintaining automation tools and functions while others
focus on developing new ones. Still, the resource investment is much smaller than the labor savings
gained by automating.
8. Summary
Data center automation is not just an option anymore. You, as an IT leader, must continually provide
value at a lower cost. In order for your IT organization to continue supporting a growing business,
remain relevant and prepare for the future, automation has to be an essential part of the strategy. We
have outlined some of the possible approaches, challenges, benefits, risks and returns in this white
paper. Every IT organization is different and should develop an automation strategy and plan in line
with the objectives, resources and constraints of their particular business.
9. About Zefflin
Zefflin’s focus is exclusively on Data Center Automation and Cloud Management solutions
implementation and integration. As a world-class, agile, center of excellence, our aim is to work with
best of breed software, combined with the industry's best technical consulting and integration talent.
We cut through the hype, identifying which tools can be implemented and integrated to effectively
automate application development and IT operations. We offer high quality, cost effective solutions
addressing the automation of the entire lifecycle of complex computing environments, from
request/catalog management, automated provisioning (OS, application, database, storage, network), to
policy governance and compliance. Our vision is to bring to market consulting/software solutions that
enable the lights-out data center. This will allow our customers to implement fully automated, private,
public and hybrid cloud systems, delivering low cost, high quality services to their customers while
minimizing personnel cost.