EMC Proven Professional Knowledge Sharing 2010
The Art and Science of a Data Center Migration
Michael E. Breshears
Practice Team Lead, EMC
[email protected]
Contents
Introduction
Audience
Scenario
Why EMC
Our Methodology
    Program Management Office
    Discovery and Analysis
        Current State Environment
        Bundling
        Scheduling
    Future State Architecture
    Detailed Planning
        Pre-Migration
        Night of Deployment
        Application Test Plans
        Contingency Plans
        Post Migration
    Execution
Conclusion

Disclaimer: The views, processes or methodologies published in this compilation are those of the authors. They do not necessarily reflect EMC Corporation’s views, processes, or methodologies.
Introduction
There are many reasons why a CIO would consider a data center migration or consolidation.
• Through the course of doing business with mergers, acquisitions, and divestitures, many
organizations have found their IT organizations sprawled across multiple regions and
data centers, many containing a mixed set of infrastructure and technologies. The rising
cost and resource demands of managing multiple data centers are, to say the least,
inefficient and illogical.
• According to Gartner research, “half of all data centers will need to overhaul their power
and cooling solutions within the next few years” (May 2007). With older facilities
approaching their limits of power, cooling, and floor space, organizations are confronted
with the fact that it is not cost effective or, in some cases, even possible to upgrade
current facilities.
• Efforts to become eco-friendly with GREEN initiatives are driving organizations to make
substantial changes in the way they do business with regard to data center operations.
Data center relocation, migration, and consolidation continue to emerge as primary IT
initiatives for organizations that must reduce costs, improve operational efficiency, and meet
their growing service demands while striving to be better world citizens. The May 2007 Gartner
research study estimates that nearly three-quarters of the Global 1000 organizations will either
need to move or significantly modify their data center operations in the coming years. This
presents a tremendous opportunity for EMC.
EMC is well positioned as the leading service provider in this space. During the past several
years, EMC Consulting has focused extensively on developing methodologies and processes to
manage data center migrations, virtualization, and consolidations.
Audience
This article, using a scenario-based example, will explain a methodology to manage the
complexities of a Data Center Migration (DCM) project. It will be useful to the EMC sales
community to help them better understand the components of a DCM project so they can
position this service offering within established accounts and use it to open the doors of
prospective accounts that EMC has previously been unable to penetrate. It is also a good
reference tool for those fearless EMC Consultants who risk life and limb in the pursuit of the
perfect data center migration.
Scenario
An EMC customer has determined that the current state of its data center does not meet the
needs of the company or its customers. Power, space, and cooling issues are
causing service interruptions more and more frequently. The company’s Information
Technology (IT) resources are frustrated because they spend the majority of their time in
firefighter mode instead of developing new functionality. External customers are looking for other
service providers that are more reliable and less expensive. The business is frustrated at the
rising cost of operational expenses needed to support failing infrastructure, and the
environmentalists are picketing outside the CEO’s office because, although the company claims
to be moving in a greener direction, its actions are not aligned.
The company has decided to undertake a strategic initiative to build a new, green data center
facility that will meet the needs of the organization for many years to come. Great effort is taken
to design the building, the cooling systems, and the power supply. When it is completed, the
Corporate Officers are truly impressed with what they have accomplished. However, they now
find themselves in a precarious situation. Until the current facility is decommissioned, the
company is paying for two leases, electricity to cool and service two buildings, as well as all
related maintenance costs. This is not what the business had in mind. The CIO directs the IT
organization to start moving applications ASAP.
While it seems a fairly straightforward task, it soon becomes apparent that the scope of this
project is beyond the capability of the organization’s IT teams. Resources are overburdened
with their daily tasks in the “do more with less” economy. The business units are refusing to
accept the outages that the IT department requires. The application team is trying to squeeze in
major upgrades to the application during the project. By the way, there will be no migration
windows available during the upcoming retail season. The list goes on and on.
During his monthly visit with the customer, the local EMC Account Representative meets with
the CIO. While dining on a preposterously over-priced steak and a nice bottle of wine, the CIO
vents about the company’s inability to successfully move its applications into this new data
center. The Account Rep empathizes with the CIO, and then shifts the conversation to more
important matters: “So, Bob, how does the budget look for new storage hardware next quarter?”
Wait, rewind that. Maybe this is how that conversation could actually go:
“Bob, you have been a great EMC customer for many years. You know that we have been able
to provide solutions for numerous data and storage related problems that you have had in the
past. Are you aware that EMC is the industry leader in data center migrations and
consolidations? Our consulting organization has spent the past several years developing a
methodology to do exactly what you are looking for and we bring talent to the table that is
unmatched in the industry. Let me send over our EMC Perspective paper on Data Center
Consolidation and Migrations as well as some customer referrals. I think you will be pleasantly
surprised.”
Why EMC
Stop me if you’ve heard this one: three people are on an elevator: a system engineer, an
application owner, and an EMC Consultant. The CIO of the company steps onto the elevator
and asks the system engineer what he is working on, to which the system engineer replies, “I’m
just trying to move some servers.” The CIO looks at the application owner. “So then, what are
you working on?” The application owner says, “Well, I am just trying to move my applications.”
The CIO shakes his head, looks at the EMC Consultant, and says, “So tell me, what is EMC
working on?” The EMC Consultant smiles and says, “Sir, we are building you a world-class data
center.”
Managing and executing a DCM project requires the ability to conceptualize the project as a
whole and to visualize the end result. We don’t just do the work. We are not here to just lay
brick, or to pour the foundation. In other words, we don’t just move servers or migrate
applications. We are here to build a data center.
The thought process behind the Art and Science of a Data Center Migration is based on the
EMC Consulting aspiration “to be our clients’ trusted advisors by providing thought leadership
and guidance to define their information infrastructure vision.” The transformational journeys
that we have already taken with our customers in the data center space have been incredible.
The journeys that we are embarking on in the coming months and years will be even more so.
"Many of our clients are going on a transformational journey right now and we have to be able to
advise them as to how they can take that journey, in the appropriate steps, most efficiently" - Sandra Hamilton, VP EMC Infrastructure Consulting
“But at the end of the day, we’re just moving hardware, right?” This is one of the most common
misconceptions that we hear at the inception of a DCM project. It can be difficult for an
organization that has never attempted one of these projects to understand the need for
“Consulting Services.” Many IT managers feel that they can architect a technology solution to
accomplish their migration goals. However, we have found that technology can only solve so
many problems. Yes, if the goal is to move a single system or application, companies would be
able to mitigate all of their risks and issues by throwing money at the problem.
The outage window and the amount of data loss are inversely related to the cost of the solution
used to migrate. We can seamlessly migrate an application to a new data center location with
zero downtime, zero data loss, and zero negative customer impact as long as the company is
willing to pay for it. But what happens when you are moving two, three, or a thousand or more
applications? No amount of money will be able to purchase a technology solution that can
analyze the interdependencies between all of the applications, their infrastructure, and the
services that are required to run those applications and systems in the new data center. And
what about the human resources needed to actually do the work? I have yet to find a technology
stack that can tell a human exactly what to do, when to do it, identify the scheduling conflicts
that can cause downstream congestion and delayed tasks, and then re-allocate other humans
to resolve the conflict.
Data Center Migrations are a complex, multi-step process, and it takes a unique group of
individuals to navigate this process effectively and efficiently.
So why choose EMC?
• Our Core Competency
• Our People
• Our Methodology
The following is an excerpt from the EMC Consulting Data Center Migration and Consolidation
Differentiation Letter:
EMC is the world’s premier information infrastructure provider. For nearly thirty years, our core
competency has been researching, designing, and building the most complex and robust
information infrastructure in the industry. Key to our success are our strategic relationships with
all the major industry vendors, including IBM, Microsoft, HP, Cisco, Sun, and HDS, and the
deep knowledge we have of their product sets.
Our ability to help clients move their workloads and data to new technology platforms is central
to our business, and so data migration in complex, multi-vendor environments is an EMC core
competency. We move over ten petabytes of client data every month, with one third of that data
residing on non-EMC equipment. This means that we understand [our client’s] IT environment
and we know how to move [the] information and the systems on which it resides.
EMC’s second differentiator is our people. Our experience in delivering hundreds of migration
projects has taught us that a critical success factor is the right combination of business
management consultants and infrastructure experts. This consistent, stable program
organization ensures the right level of business engagement and executive governance. Our
team will include expertise in compliance, regulatory and business risk management programs,
alongside deeply skilled information infrastructure principals. In addition, our team is very
seasoned, averaging over thirteen years in the industry.
Finally, EMC’s Data Center Consolidation and Migration Methodology is an approach to moving
your most critical data and environment that is unique in our industry. Developed over eight
years and through hundreds of successful engagements, our methodology is business- and
application-centric. This approach has proven so successful that it has recently been emulated
by our competitors, who traditionally have relied on heavy staffing requirements during the
migration effort, rather than the upfront discovery and planning.
As you can see, EMC provides a strong, compelling story as to why we are the perfect partner
for any organization undergoing a production IT migration. We understand that our customers’
applications and data are their most critical assets in the data center. For this reason, EMC’s
DCM methodology begins with an application-centric approach.
Our Methodology
EMC takes a “Top Down” application-based approach to data center migration and
consolidation, and a “bottom-up” approach from the hardware platform. While we completely
understand that the application resides on a technology stack, it is either the application that the
business or customer sees or the availability and performance of the service provided to the end
user that will determine the migration’s customer impact. We typically spend most of our time
with the business unit owners during the analysis and detailed planning phases of our project.
We speak to them in a language that they understand. We determine what overall impact
moving an application or system has on the company and the end user. We do this by
determining the baseline performance, measuring against current SLAs, and converting the
results into dollars lost or saved, and customer satisfaction ratings.
If you were to approach a business unit manager and tell them that this coming Saturday we will
be moving all of their Power 595s, DMXs, and the Centera® to the new data center, you would
probably see their eyes glaze over. If you tell them we are going to raise their customer
satisfaction rating by 20% and save the department $1.5 million per year, they are not as
confused. For this reason, we start by understanding the customer’s business processes, then the
applications and data supporting those processes. Once we have a solid understanding of how
the business works, we can then focus on what makes it work and what we need to do to
ensure that it continues to work above and beyond expectations once the project is complete.
In order to successfully migrate a customer to a new data center, EMC implements a multi-
phase approach that is aligned with the Plan-Build-Manage lifecycle philosophy.
Here we see the EMC Consulting Data Center Consolidation Solutions Framework:
Figure 1 - Data Center Consolidation Solutions Framework
Program Management Office
One of the first steps is to establish a Program Management Office (PMO) as there will be
multiple work streams operating within each DCM project. The PMO is the management
organization that oversees subsequent phases of the project. The PMO can be implemented
from the customer’s organization, by EMC, or a combination. If the customer already has a
mature PMO in house, they may choose to fill that role within the project. This may also be the
case if the DCM project is but one work stream within a larger portfolio of projects that the
organization currently has underway. As always, EMC is ready to step in and provide PMO
leadership should the customer choose that direction.
The PMO’s objective is to establish project standards, to understand critical business
processes, to help define processes where they may not yet exist, and to define the reporting
structure for the project.
The PMO aligns resources based on work stream and manages the flow of information between
each work stream appropriately. One of the critical goals of the PMO is to look for and find
synergies between work streams and departments so that the project objectives can be
executed as efficiently as possible. The PMO defines the timelines and milestones of the project
roadmap, and it gathers and maintains a list of issues and risks, along with their associated
contingency and mitigation plans. The PMO documents and maintains the Lessons Learned
database that can be referenced by all work streams throughout the project and be used in
future projects by both the customer and EMC.
Here are a few of the critical tasks that are typically performed by the PMO:
• Project Charter
• Project Work Plan
• Action Item / Risk / Issue Registers
• Provide Weekly Status (or as required)
o Dashboards
o Leadership meetings
• Communication Plan
• Quality Assurance
• Database Change Control
Once the PMO is established and engaged, we can focus on the next phase of our DCM
project: Discovery and Analysis.
Discovery and Analysis
“There are known knowns. There are things we know that we know. There are known
unknowns. That is to say there are things that we now know we don't know. But there are also
unknown unknowns. There are things we do not know we don't know.” - Donald Rumsfeld, From a Press Conference at NATO Headquarters, Brussels, Belgium, June 6, 2002
While it is difficult to say emphatically that one phase of a DCM project is more critical than
another, we can argue that discovery and analysis are the keys to a successful migration
project. Organizations with limited or no DCM experience may do a great job in documenting
everything that could possibly be known about a system and create the perfect migration plan
for the night of deployment. However, when they actually execute the migration, cutover, or lift-
and-ship, they tend to discover that they have broken
many other systems within the environment. This is not
necessarily their fault.
DCM projects require tribal knowledge gained from
performing migrations. A typical IT resource might only
participate in a data center relocation project once in their
career. EMC has this experience, and it has taught us that
the discovery and analysis phase is critical to the project.
We need to go down the rabbit hole. This is where we will
discover dependencies that will otherwise be missed.
Within a company’s operating environment, there are
numerous interdependencies that must be accounted for in
order to successfully plan and execute a migration. Very seldom will you find a standalone
application that you can lift-and-ship to the new data center and expect it to work as designed
once you plug it back in. This is what we consider “Low Hanging Fruit”, those applications that
can be moved early to test the environment, migration process, timeline, etc. If you have these
types of applications to move, consider yourself lucky. Then again, if they were all this easy, the
customer wouldn’t need us, would they?
In order to find out “what we do not know”, we undertake a methodical process to uncover those
hidden obstacles that may appear unexpectedly. EMC leverages existing tools to automate the
discovery and analysis process, or in some cases EMC can be asked to deploy a solution to
assist with automated discovery. The EMC Ionix™ Application Suite is one example of such a
tool. EMC Ionix Application Discovery Manager provides continuous discovery and mapping of
applications, their dependencies, and configurations with respect to their underlying
infrastructure in data center environments. With Ionix Application Discovery Manager, we can
get accurate, real-time visibility into the data center from an application standpoint that, again, is
critical for planning data center consolidations and migrations. The data gathered from the tool
is used to populate our Migration Access Database.
Regardless of whether we are using an automated tool, or doing manual discovery, the process
must be as real-time as possible. We cannot depend on a point-in-time snapshot of the
customer environment as the operating environment is dynamic and can change rapidly,
invalidating the data. A DCM project will typically span several months to a year or longer. The
key here is to baseline the data and then implement a mechanism to keep it fresh. We use a
central data repository to store the data that we have collected so that it can be analyzed,
updated, and reported on as needed.
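
As a rough illustration of "baseline the data and then keep it fresh", the sketch below compares a stored baseline with a later discovery pass and reports what has changed. The record layout (server name mapped to a dictionary of attributes), the sample data, and the function name diff_snapshots are assumptions made for this example; they do not represent the Ionix or MAD data model.

```python
def diff_snapshots(baseline, current):
    """Compare two discovery snapshots keyed by server name.

    Each snapshot is a dict: server name -> dict of attributes.
    Returns (added, removed, changed) as sorted lists of server names.
    """
    added = sorted(set(current) - set(baseline))
    removed = sorted(set(baseline) - set(current))
    changed = sorted(
        name for name in set(baseline) & set(current)
        if baseline[name] != current[name]
    )
    return added, removed, changed


# Hypothetical data for illustration only.
baseline = {
    "web01": {"os": "RHEL 5", "apps": ["Storefront"]},
    "db01": {"os": "AIX 6.1", "apps": ["Orders DB"]},
}
current = {
    "web01": {"os": "RHEL 5", "apps": ["Storefront", "Reporting"]},
    "app07": {"os": "Windows 2003", "apps": ["Payroll"]},
}

added, removed, changed = diff_snapshots(baseline, current)
print("new since baseline:", added)
print("gone since baseline:", removed)
print("changed since baseline:", changed)
```

Running a comparison like this on a regular schedule is one way to notice that the environment has drifted away from the data the migration plans were built on.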
Current State Environment
In order to know where we are going, we must first understand where we are coming from.
Knowing the customer’s current state environment and documenting it enables us to draw a
map to where we want to go with our DCM project. It is not enough to say you are going from
point A to point B. You need to know everything about point A or you will fail in your planning
efforts. Documenting the current state environment is essential to creating the design of the
future state, or end game, as well as defining the mechanisms that will be used to move from
current to future state. Two of the key pieces of information that the consulting team needs in
order to get started are the customer’s approved application list and a current list of servers
already deployed in the environment. This is the foundation of our discovery effort and is the
first step in documenting the customer’s current state environment.
The Approved Applications List
Every DCM project will have a list of In-Scope applications that will be migrated. If you’re lucky,
the customer will have an extensive list of all applications that are deployed in the environment
regardless of whether they are in-scope for this project or not. This list is scrubbed for duplicates
and aliases and will become the master application list in our project database. The master
application list is a lookup table within the database. No application name will be allowed
in any other table unless it is first validated, verified, and added to the master application table.
This referential integrity is key to ensuring a high level of data integrity. Our detailed migration
plans will only be as good as the data used to create them.
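
The lookup-table rule can be sketched with a foreign key constraint. The example below uses SQLite from the Python standard library purely as a stand-in for the project database; the table names, column names, and sample values are hypothetical and are not taken from the actual project database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Master application list: the single source of valid application names.
conn.execute("CREATE TABLE master_application (app_name TEXT PRIMARY KEY)")

# Any other table may only use names that already exist in the master list.
conn.execute(
    """
    CREATE TABLE app_server (
        app_name    TEXT NOT NULL REFERENCES master_application(app_name),
        server_name TEXT NOT NULL,
        PRIMARY KEY (app_name, server_name)
    )
    """
)

conn.execute("INSERT INTO master_application VALUES ('Order Entry')")
conn.execute("INSERT INTO app_server VALUES ('Order Entry', 'srv-001')")

# An alias that was never validated and added to the master list is refused.
try:
    conn.execute("INSERT INTO app_server VALUES ('OrderEntry v2', 'srv-002')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("unvalidated name rejected:", rejected)
```

The point of the constraint is that an alias or misspelling cannot slip into a planning table unnoticed; it has to be resolved against the master list first.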
Current Server List
While it is not an absolute that a current server list must be provided up front, it definitely will
make getting started that much easier. The project team will start mapping out dependencies
and linking applications to their underlying infrastructure as quickly as possible. If the customer
can provide a server list up front, the Phase I timeline can be pulled in and more time can be
dedicated to the detailed planning phase. Once the server list is received, the project team will
analyze it for accuracy. If they find that the server list is accurate, they will update the project
database and move forward. If they find that the customer provided inaccurate data, they will
focus more on the discovery tasks to get the accurate server information that they require.
Physical Inventory
In many environments, the consulting team will either ask for a current physical inventory of the
source data center, or they will request that such an inventory be taken. Usually this is a
customer-provided deliverable and should be specified in the Statement of Work (SOW). The
physical inventory helps us validate the data that we already have and fill in gaps of
missing data, and it is a great method for discovering unknown systems. A physical inventory may not
be as critical in a situation where there is little to no lift-and-ship migration. However, having a
physical inventory is a best practice recommended by the consulting team.
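
One way to picture this cross-check is as set arithmetic over three sources: the customer-provided server list, the discovery results, and the physical inventory. The sketch below is illustrative only; the function name reconcile, the category labels, and the sample host names are assumptions for this example.

```python
def reconcile(customer_list, discovered, physical_inventory):
    """Cross-check three sources of server names (case-insensitive)."""
    listed = {s.strip().lower() for s in customer_list}
    found = {s.strip().lower() for s in discovered}
    on_floor = {s.strip().lower() for s in physical_inventory}
    return {
        # Present in every source: the data we already have is validated.
        "confirmed": sorted(listed & found & on_floor),
        # On the customer's list but seen nowhere else: possibly stale data.
        "listed_but_not_seen": sorted(listed - found - on_floor),
        # Seen by discovery or on the floor but never listed: unknown systems.
        "unknown_systems": sorted((found | on_floor) - listed),
    }


# Hypothetical host names for illustration only.
report = reconcile(
    customer_list=["WEB01", "db01", "old-fax"],
    discovered=["web01", "db01", "app07"],
    physical_inventory=["web01", "db01", "app07", "lab-box"],
)
for category, names in report.items():
    print(category, names)
```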
Storage and Databases
Similar to the applications and servers, the customer may be able to provide a list of local and
network storage devices and databases. If we have upfront access to this information, we are
much further along. If not, we need to focus on this area when we are in the discovery phase.
Central Data Repository
“That's all you need in life, a little place for your stuff.” - George Carlin
Data integrity is critical in any project where decisions are made or actions taken based on
information analysis. Ensuring that data is accurate and as complete as possible is imperative to
our success. Invalid data is not only useless, but can be harmful and can potentially cause data
loss and system outages. We will use the data repository in subsequent phases of the same
project, as well as future projects for our customer and customer-driven initiatives such as
creating a Configuration Management Database (CMDB). EMC and the customer will make
decisions based upon the validity of this data. Decisions based on inaccurate, incomplete, or
erroneous data are worse than decisions based on no data at all.
The central data repository is the project database of record. This
is the location where collected data is uploaded, normalized, and
used for further discovery, analysis, and reporting. EMC
Consulting typically uses a little home-grown utility affectionately
known as the MAD, or Migration Access Database, as the
project’s central data repository. As the name indicates, it is a
Microsoft Access database that is lightweight, portable, and can
be easily customized to the customer’s environment. It also has a
tendency to drive the administrator insane.
MAD stores discovered data and details about the configuration items associated with the
customer’s infrastructure and applications. The MAD helps to define the relationships between
application and infrastructure layers and all the interdependencies. It also has multiple inputs
from the discovery and analysis phase as well as many outputs used to create planning
documents, runbooks, and reports. We will also use the MAD to produce Bundling reports to
help align our in-scope applications with their target move events.
Application to Server Mapping
When it comes to planning a migration, we can break a system down into four macro
components:
• Hardware
• Software
• Services or middleware
• Data
Depending on the type of move that we are planning to execute, all four components can move
together, or they can be de-coupled and moved separately. Before we can decide however, we
need a solid understanding of how these components exist in the current state environment.
We do this by creating linking tables within MAD that define each system by its application and
the server or servers that host it. A system in this sense is a physical box, virtual machine, or
LPAR with an application installed and local or network storage devices attached. The same
tables form the basis of our dependency mapping, where an application can have hard or soft
dependencies, upstream or downstream, that are not necessarily tied to the architecture of the system.
We want to take each of the known servers, and link them to any and all known applications that
are installed on that server. There can be many servers that are dedicated to one application,
for example a Citrix farm. You might also see one physical server that hosts many applications
or virtual instances such as an ESX server hosting multiple virtual machines (VMs), an IBM
server with multiple Logical Partitions (LPARs), or possibly a team server that has an enterprise
application installed on it along with multiple supporting applications that are not necessarily
“company approved” but are used in the performance of the team’s day-to-day activities.
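
The many-to-many relationship described above can be sketched as a list of application-to-server links that is indexed in both directions. The link data and the variable names below are hypothetical and exist only to illustrate the idea of a linking table.

```python
from collections import defaultdict

# Each row of the linking table pairs one application with one server.
# Hypothetical data: a Citrix farm spread over three servers, and a team
# server hosting an enterprise application plus an unofficial helper tool.
links = [
    ("Citrix", "ctx01"),
    ("Citrix", "ctx02"),
    ("Citrix", "ctx03"),
    ("Payroll", "team-srv"),
    ("Expense Tracker", "team-srv"),
]

servers_by_app = defaultdict(set)
apps_by_server = defaultdict(set)
for app, server in links:
    servers_by_app[app].add(server)
    apps_by_server[server].add(app)

print("Citrix runs on:", sorted(servers_by_app["Citrix"]))
print("team-srv hosts:", sorted(apps_by_server["team-srv"]))
```

Keeping one row per link, rather than a list of applications stored on each server record, lets the same data answer both questions: which servers an application needs, and which applications a server carries.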
Each of the individual components that make up the application system has its own lifecycle.
We need to know what the system looks like as a whole, and also be able to separate the
components and possibly map them to their final destination separately. Applications typically
outlive the hardware platform on which they are initially installed. As the hardware lifecycle
comes to an end, an application may be getting a face-lift and need to be re-installed on a new
hardware platform. The original server can be returned to the leasing company, or, if it still has
some time left on the lease or maintenance agreement, it can be re-purposed as a swing server
to help with the migration of another application. As we begin to migrate applications and plan
for the decommissioning of the server, or plan to re-purpose the server to support other
migration work streams, we will want to know what remains on the box so that we do not
shut the server down prematurely. This matters whenever the applications resident on the server
are not all bundled together. This scenario would also indicate that at least one of the applications
on the server will be a logical migration, meaning that the application is de-coupled from the
server and migrated separately. This is the case if the application is being consolidated to virtual
infrastructure.
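
The "what is remaining on the box" check can be sketched as a simple comparison between the applications mapped to a server and the applications already migrated. The function names and sample data below are assumptions made for illustration.

```python
def remaining_apps(apps_on_server, migrated_apps):
    """Applications still resident on a server after the migrations so far."""
    return sorted(set(apps_on_server) - set(migrated_apps))


def safe_to_shut_down(apps_on_server, migrated_apps):
    """A server is a decommission candidate only when nothing remains on it."""
    return not remaining_apps(apps_on_server, migrated_apps)


# Hypothetical example: two applications share a server but sit in
# different move bundles, so only one has migrated so far.
apps_on_team_srv = ["Payroll", "Expense Tracker"]
migrated_so_far = ["Payroll"]

print(remaining_apps(apps_on_team_srv, migrated_so_far))
print(safe_to_shut_down(apps_on_team_srv, migrated_so_far))
```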
Once we know which applications reside on what servers, we can begin to map our current
state environment to the future state environment.
NOTE: While many companies will see this as an opportunity to perform upgrades in
both hardware and software, this is not a Best Practice. We are trying to limit the
number of changes to the environment (except for the whole “moving to a new building”). As
applications are migrated to the new data center, and the occasional problem occurs, we need
to troubleshoot. The more changes that we make in the environment during the migration, the
more difficult it will be to identify the root cause and resolve any problems. I am not saying to
disallow any changes; rather, limit your exposure by limiting unnecessary changes. More
changes equals more risk.
Workshops
Our workshop session is a key tool in the discovery and analysis phase of the project. This is
where we sit down with the different teams in the organization and ask them to tell us what they
know about their own systems. We want to do this after we have already done our due diligence
and we have a solid understanding of the environment. This way we can judge the accuracy of
the data that we have, discover components that we missed, and fill in gaps in our data. This is
also when we start to discover those pesky dependencies that I keep referring to. Until you ask
the question “what is going to break if I take this application down or this server offline?” you will
not know what you do not know.
Identifying Dependencies
Dependencies are those items that are affected by changes to another application or system.
For example, if we migrate a Microsoft Exchange Server to the new data center, we might break
Outlook Web Access, BlackBerry Enterprise Server, or ActiveSync. Those applications are
dependent on Exchange.
An upstream dependency is anything that an application relies on to function properly. If you
move a T1 circuit, and I lose network connectivity, you are my upstream dependency.
A downstream dependency is anything that is impacted if my application moves or fails. If I
move my FTP server and your Crystal Reporting server stops sending reports, you are my
downstream dependency.
Of course, an application can be both an upstream and a downstream dependency, and it can
have both upstream and downstream dependencies of its own.
The key is to find out which machines are talking to which other machines, what ports they are
communicating over, and what protocols they are using. We also need to identify whether we
are dealing with hard or soft dependencies. A hard dependency would be a system that is
physically cabled to another, or one that has latency requirements that must be considered prior
to the move. A soft dependency might be a web server that pulls content from a data repository
over HTTP (TCP port 80), located through a DNS entry. It is possible to move the target web server
without moving the upstream data repository simultaneously. Although the connection between
the two will be down during the migration, the systems can still communicate on the network
once both systems are back online. If you move the target system with the hard dependency,
the system will be down during the move, and will not come back online without some major
intervention, if at all.
Knowing and documenting this will help us to identify our application bundles. It will also help us
to communicate to the network team which firewall rules we may need to open in the new data
center.
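One way to document what the workshops uncover is a simple record per dependency. The sketch below is an assumption about how such records could look, not the format of any EMC tool; the system names and ports are placeholders, and it assumes the consumer is the side that opens the connection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    consumer: str   # the system that relies on the provider
    provider: str   # the system being relied upon
    protocol: str   # e.g. "tcp"
    port: int
    hard: bool      # True for cabling or latency constraints


# Hypothetical records gathered during workshops.
deps = [
    Dependency("web01", "content-repo", "tcp", 80, hard=False),
    Dependency("reports01", "ftp01", "tcp", 21, hard=False),
    Dependency("app01", "db01", "tcp", 1433, hard=True),
]


def upstream_of(system, records):
    """Everything `system` relies on to function."""
    return sorted({d.provider for d in records if d.consumer == system})


def downstream_of(system, records):
    """Everything that is impacted if `system` moves or fails."""
    return sorted({d.consumer for d in records if d.provider == system})


def firewall_rules(records):
    """Candidate rules for the network team: (source, destination, protocol, port)."""
    return sorted({(d.consumer, d.provider, d.protocol, d.port) for d in records})


if __name__ == "__main__":
    print(upstream_of("web01", deps))
    print(downstream_of("ftp01", deps))
    for rule in firewall_rules(deps):
        print(rule)
```

Filtering the same records on the `hard` flag gives a first cut of the systems that have to travel together.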
Bundling
If you are moving several applications and potentially several hundred servers during your DCM
project, the relocation may be spread over many months. If this is the case, you will want to
define logical move groups and bundle your applications appropriately.
A bundle is a group of applications that will migrate together as a unit. The bundle will be
assigned to a target move date called a move event.
Once we have a thorough understanding of the components that make up the customer’s
current state environment, the critical dependencies of all applications and systems, and we
have uploaded all the data to our central data repository, we can begin to run reports that will
give us bundling recommendations based on criteria that the project team and customer have
previously set.
The criteria for bundling applications may include:
• Application Criticality
o Recovery Time Objective
o Customer impact felt after 4 hours
• Similar Functionality
o All e-mail systems
o All payment systems
• Dependencies
o All systems dependent on SAP
o All systems needed to support PeopleSoft
• Business Unit
o Finance systems move together
o HR systems move together
• Region
o Houston (US) servers move first
o Then Cleveland (US) servers
• Insurance and Risk
o How much insurance does the physical mover carry per truck?
o Can the company recover if a truck full of servers is damaged or destroyed?
• Available Outage Window
• Resource Availability
• Distance
MAD will capture and report on the quantifiable and objective bundling criteria such as
dependencies, criticality, business unit, etc. However, it is never that simple; not when we have
customers to think about. Once the bundle report is generated, we present it to the customer
who, in turn, will try to break all the logical rules that we put into place to manage our risk by
applying their subjective requirements to the bundling process. This is where we sit, smile, and
act as good business partners.
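To make the dependency criterion concrete, here is a minimal sketch that groups applications so that anything joined by a hard dependency lands in the same bundle. It is a plain connected-components pass and is not how MAD produces its recommendations; the application names and links are hypothetical, and the other criteria (criticality, region, outage window, and so on) are left out.

```python
from collections import defaultdict


def bundles_from_dependencies(apps, hard_links):
    """Group applications so that hard-linked applications share a bundle."""
    neighbours = defaultdict(set)
    for a, b in hard_links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    bundles = []
    for app in sorted(apps):
        if app in seen:
            continue
        # Walk everything reachable from this application.
        group, stack = set(), [app]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        bundles.append(sorted(group))
    return bundles


# Hypothetical applications and hard dependencies.
apps = ["sap", "sap-reporting", "peoplesoft", "hr-portal", "email"]
hard_links = [("sap", "sap-reporting"), ("peoplesoft", "hr-portal")]

if __name__ == "__main__":
    for number, bundle in enumerate(bundles_from_dependencies(apps, hard_links), 1):
        print(f"Bundle {number}: {bundle}")
```

A grouping like this is only a starting point for the conversation with the customer; the subjective requirements still get layered on top.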
Scheduling
This part is pretty simple so we usually let a Practice Manager handle it. Once we have approval
from the customer on the bundle report, it is time to schedule our move events. This takes just a
couple of pieces of critical information like a current Gregorian wall calendar, the bundle report,
and a consultant to ensure the Practice Manager does it right.
Accuracy is the key to scheduling. If you need 120 days to migrate an application from start to
finish, you don’t schedule your event to happen in 90 days. Don’t set yourself up for failure.
There will be plenty of challenges in hitting your 120-day target. The implementation timeline
drives the scheduling; try not to let politics drive your schedule. You will not always win the fight,
but make your voice heard. If you do not win this battle, make sure your risk register reflects the
fact that the condensed timeline is a major risk to the project.
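The 120-versus-90-day example can be checked mechanically. The sketch below assumes a planning start date and a required number of days; the function names and the wording of the risk note are invented for illustration.

```python
from datetime import date, timedelta


def earliest_move_event(planning_start, days_needed):
    """Earliest realistic move date given the implementation timeline."""
    return planning_start + timedelta(days=days_needed)


def schedule_risk(planning_start, days_needed, proposed_event):
    """Return a risk-register note if the proposed date compresses the timeline."""
    earliest = earliest_move_event(planning_start, days_needed)
    if proposed_event >= earliest:
        return None
    shortfall = (earliest - proposed_event).days
    return f"Condensed timeline: event is {shortfall} days earlier than the plan supports"


if __name__ == "__main__":
    start = date(2010, 1, 4)  # hypothetical planning start
    print(earliest_move_event(start, 120))
    print(schedule_risk(start, 120, start + timedelta(days=90)))
```

If politics wins and the earlier date stands, the returned note is the kind of entry that belongs in the risk register.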
Future State Architecture
So, what does future state architecture mean? This is what our application systems will look like
after any and all changes that will be made before and during the migration. The future state
architecture is our end game for each application. We need to know what our future state
architecture is going to be so we can plan accordingly.
In a perfect world, the future state architecture would closely resemble that of the current state.
So, let’s try a little experiment. Everyone who is currently living in a perfect world, please raise
your hand. I didn’t think so.
Although we do try to limit changes during DCM projects because of the amount of additional
risk involved, we also know that change is inevitable. There is always some amount of up-lifting
that occurs during migration planning. It is not best practice, but at the same time it doesn’t
make much sense to plan and execute a data center relocation as a lift-and-ship if you have to
initiate a hardware refresh as soon as the servers hit the ground. We have to look at the overall
impact to the business. Sometimes it is preferable to take a single outage with a longer duration
instead of trying to get multiple outages approved. In this case, it might be better for the
business to move the application and refresh the hardware at the same time versus causing the
application to be unavailable multiple times.
Future state architecture design can be done in parallel to analysis, bundling, and scheduling
activities. However, it should be complete for all applications in a bundle before the detailed
planning for that bundle starts. Keep in mind that changes to the architecture might require
some long lead-time purchases such as the new hardware platform, software, or licenses. We
need to ensure that we have taken this into account while planning our overall timeline. Once a
new architecture design is recommended, a proof of concept test needs to be completed in
order to sanity check the solution and validate compatibility (e.g., new software on old hardware,
old software on new hardware, new software on new hardware, etc.).
If there is any level of transformation taking place during the DCM project, it is incumbent upon
the organization to perform full functionality, performance, and regression testing on these new
configurations prior to migration to the new data center, so the SOW should include an
assumption that the company has a test environment to fully vet these changes. It is nearly
impossible to troubleshoot issues during a migration window if there is a lot of transformation
taking place during the move. The SOW for the project should reflect an extended test cycle for
any system that is being upgraded either in hardware, software, or services.
Here are a few items that might change during a data center relocation or migration project:
• Network
o Layer 2 to Layer 3 switching
o Physically segmented LANs to VLANs
o Add PCI-compliant DMZ for Payment Systems
• Servers
o Virtualization of physical Windows servers
o Consolidation of multiple under-utilized servers to fewer servers in order to
maximize resources
• Application
o OS upgrades (AIX 5.3 to 6.1)
o New application code releases to add functionality
o Software version is end-of-life and no longer supported
The more changes that the customer tries to roll into the project, the longer the detailed
planning phase and the test cycle should be. Changes to the environment increase risk
exponentially. All changes that the company is planning to implement should be captured early
in the process, and the SOW should reflect this with extended timelines.
Detailed Planning
The detailed planning phase is very interactive. Typically, the EMC project team is not
responsible for architecting the migration solution. As much as we like to think we do, we do not
know the application or system as well as the teams who work on it every day. For this reason,
it is our job to facilitate the discussions and workshops to get the detailed migration plan written
(although it is acceptable for us to give our opinion on how things should be done; after all, we
are the experts, right?).
We document our planning in the Migration Planning Workbook. The Migration Planning
Workbook, or runbook as it is referred to in this article, is a Microsoft Excel workbook comprised
of multiple tabs that provide information pertinent to the planning and execution of a move
event. Typically there will be one runbook per bundle. The runbook includes exports from the
MAD along with various administrative tabs that are used to manage and execute a move event,
pre-migration related tasks, and post migration tasks. The runbook is the document that is used
during the deployment by the EMC team running the command center.
Our detailed planning focuses on the following areas: Pre-migration, Night of Deployment,
Application Testing, Contingency, and Post Migration.
Pre-Migration
Within the runbook is the pre-migration task list, or PMTL (clever, right?). The pre-migration tab
is based on a T-minus schedule. The Time of execution (T) is the day of the migration. Pre-
migration tasks are listed as happening x-number of days before execution, or T-minus. The
Pre-migration Task List contains those tasks that must be completed prior to executing the
event. Pre-migration tasks may be broken down into Administrative, Application, Server,
Storage, and Backup tasks along with a list of required meetings. You can modify these
sections to accommodate your customer. For example, if your customer is using Legato as its
backup solution, there are specific tasks that must happen to prepare the clients to be backed
up in the target data center. If you are moving from an older storage array to a new one, you will
want to ensure that servers are remediated, new storage is provisioned, and fiber cables are run
in the target data center. These are all examples of pre-migration tasks.
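A T-minus schedule is easy to turn into calendar dates. The runbook itself is an Excel workbook, so the sketch below is only an illustration of the arithmetic; the task rows, offsets, and function names are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical PMTL rows: (days before T, category, task).
pmtl = [
    (30, "Storage", "Provision new storage in target data center"),
    (14, "Server", "Remediate servers"),
    (7, "Backup", "Prepare backup clients for target data center"),
    (1, "Administrative", "Final Go/No Go meeting"),
]


def due_dates(move_day, tasks):
    """Convert T-minus offsets into calendar due dates, earliest first."""
    rows = [
        (move_day - timedelta(days=offset), f"T-{offset}", category, task)
        for offset, category, task in tasks
    ]
    return sorted(rows)


def overdue(move_day, tasks, today):
    """Tasks whose due date has passed (completion tracking is omitted here)."""
    return [row for row in due_dates(move_day, tasks) if row[0] < today]


if __name__ == "__main__":
    move_day = date(2010, 6, 19)  # hypothetical T
    for due, label, category, task in due_dates(move_day, pmtl):
        print(due, label, category, task)
```

Keeping the offsets relative to T means that if the move event slips, every due date moves with it.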
The standard template is pre-populated with tasks for many types of systems and applications,
including standard tasks that are administrative in nature. During the detailed planning phase,
the EMC team will work with the different customer teams to scrub the PMTL and pare it down
to those tasks that are required for that migration. While not all tasks in the template version will
be used, it is a good tool to jog the memory of the team members and remind them of tasks that
they may have otherwise missed.
The pre-migration task list is difficult. It can get extremely detailed, take a lot of time to maintain,
and the resources hate it when you remind them that they have tasks that are due. All that being
said, use it!
Night of Deployment
The night of deployment is managed by a tab in the runbook referred to as the Hour-by-Hour
(HBH) plan. The HBH plan is a chronological list of tasks that must be executed during the
migration event. The HBH denotes the task, owner, estimated start time, and estimated
duration. The purpose of this document is to give the Migration Lead a checklist that they can
follow during the migration to ensure that it is on track. It will also give them a warning when
they start to vary from the plan so that corrections can be made, as needed, to get back on track
or to notify resources that they will start their tasks earlier or later than expected.
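The fields of an HBH row and the "warning when you vary from the plan" idea can be sketched as follows. This is a simplified model, assuming that any slip on the last completed task shifts every remaining task by the same amount; the task names, owners, and times are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class HbhTask:
    name: str
    owner: str
    planned_start: datetime
    planned_minutes: int


def slip_and_notifications(tasks, completed_count, actual_finish_of_last):
    """Compare the last completed task to plan and shift the remaining tasks.

    `completed_count` must be at least 1. A negative slip means the
    migration is running ahead of plan.
    """
    last = tasks[completed_count - 1]
    planned_finish = last.planned_start + timedelta(minutes=last.planned_minutes)
    slip = actual_finish_of_last - planned_finish
    remaining = [
        (t.owner, t.name, t.planned_start + slip) for t in tasks[completed_count:]
    ]
    return slip, remaining


# Hypothetical HBH rows.
hbh = [
    HbhTask("Shut down application", "App team", datetime(2010, 6, 19, 18, 0), 30),
    HbhTask("Final backup", "Backup team", datetime(2010, 6, 19, 18, 30), 90),
    HbhTask("Unrack servers", "Movers", datetime(2010, 6, 19, 20, 0), 60),
]

if __name__ == "__main__":
    slip, notify = slip_and_notifications(hbh, 2, datetime(2010, 6, 19, 20, 25))
    print("Slip:", slip)
    for owner, name, new_start in notify:
        print(f"Notify {owner}: '{name}' now starts at {new_start}")
```

A real event has parallel work streams and slack between tasks, so treat the uniform shift as a first approximation only.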
To develop the HBH, the consulting team works with the customer teams to capture the
procedures that will be used to migrate the bundle to the new data center based upon the future
state architecture and the migration strategy that was decided upon in the Discovery and
Analysis phase of the project. Go into detail. The HBH itself is not intended to be a micro-level
task list, and the EMC Migration Lead does not need to know 100% of the details of each task.
For the purpose of the HBH plan, you simply need to know what will happen, who is
responsible, and how long it will take. That being said, during the creation of this document, you
cannot take a high-level approach. If you do not anticipate every possible scenario, you will get
caught in a roll-back situation. It is absolutely necessary to get those detailed plans. You can
keep the detailed plans and supporting documents on a project share and reference them in the
runbook but they cannot be overlooked.
FRIENDLY HINT: Don’t forget to download all of your migration documents and make hard
copies before your migration starts, especially if the SharePoint server is in your migration
bundle!
Application Test Plans
Testing the application after the deployment is critical to getting sign-off from the business and
support teams. During your detailed planning workshops, drive each team to come up with
testing scenarios for the night of deployment. You will also want to capture baseline data to
compare your test results to. You need to line up resources to conduct the tests for the night of
deployment. If you have external facing applications, get external resources to test for you. As
the teams contemplate and write their application test plans, have them review the project
Lessons Learned reports from the PMO. Take advantage of all resources available to you. It is
far better to run into an issue during the deployment window than to have everyone sign off on
the migration as a success and then run into a production impacting issue the next day.
Contingency Plans
If there is a possibility of several things going wrong, the one that will cause the most damage
will be the one to go wrong.
- Murphy
If there is a worse time for something to go wrong, it will happen then.
- Murphy
If you perceive that there are four possible ways in which a procedure can go wrong, and
circumvent these, then a fifth way, unprepared for, will promptly develop.
- Murphy
Anything that can go wrong will go wrong.
- Murphy
Shall I continue? Hopefully you get the point. Plan, plan, plan!
Plan for the unthinkable. Your outage window must take into account all possible contingency
scenarios, including:
• Application Failure
• Hardware Failure
• Data Migration Failure
• Transport breakdown
• Weather
• Resources spending too much time at the pub and forgetting that they have a migration
this weekend.
The Contingency Plan Tab in the runbook outlines the scenarios where contingency plans may
need to be invoked, and the response to that scenario should it occur. We also outline the
timeline for each scenario from the last possible point of detection, to the point where the
decision to invoke the contingency plan must be made, to the point where the application or
system must be turned back over to the end users for production.
For the purpose of planning possible outage windows, you must understand that the
contingency scenarios are cumulative, meaning that your plan must take into account the timing
should issues happen within each of the areas where contingency plans are made. For
example, in one bundle, you may have failed backups, slower than expected data migration, a
flat tire on the truck carrying servers, and a failed power supply on a server. Your contingency
plan has to be written so that a series of issues such as this does not push you out the back end
of your migration window.
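Because the scenarios are cumulative, the window check is a simple sum. The sketch below assumes each area has a planned duration and a contingency allowance in minutes; all of the figures and names are hypothetical.

```python
def worst_case_minutes(steps):
    """Sum planned time plus contingency allowance across every area."""
    return sum(planned + allowance for _, planned, allowance in steps)


def fits_window(steps, window_minutes):
    """True if the plan still ends inside the window when every allowance is used."""
    return worst_case_minutes(steps) <= window_minutes


# Hypothetical figures in minutes: (area, planned, contingency allowance).
steps = [
    ("Backups", 120, 60),
    ("Data migration", 360, 120),
    ("Transport", 180, 90),
    ("Hardware power-up", 60, 120),
    ("Application testing", 120, 60),
]

if __name__ == "__main__":
    total = worst_case_minutes(steps)
    print(f"Worst case: {total} minutes ({total / 60:.1f} hours)")
    print("Fits a 24-hour window:", fits_window(steps, 24 * 60))
    print("Fits a 20-hour window:", fits_window(steps, 20 * 60))
```

If the worst case does not fit, either the window has to grow or the bundle has to shrink.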
You must include several decision points during the creation of your HBH plan. A decision point
is basically a Go or No Go meeting that follows a critical milestone in the migration, or at a
specific time in the plan. The decision point is critical to the migration. During the migration
event, do not waver on a LAST POSSIBLE decision time. This is a hard stop. As the migration
lead, if the migration is experiencing issues, once you hit the point where a decision to invoke
contingency MUST be made, you must step up and make that decision regardless of a popular
vote to continue troubleshooting. Systems Engineers will always say “give me one more hour.”
Set your Point of Decision time and do not go past it. If you, as a migration lead, do not have the
authority to invoke a contingency plan, you must ensure that the decision maker is available and
that you add time to the plan to get their sign-off.
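The LAST POSSIBLE decision time can be worked backward from the end of the outage window. This sketch assumes you know how long the roll-back, the re-test, and any sign-off will take; the parameter names and the durations are illustrative, not prescribed values.

```python
from datetime import datetime, timedelta


def last_possible_decision(window_end, rollback_minutes, retest_minutes, signoff_minutes=0):
    """Latest time a No Go can be called and still hand systems back on time."""
    needed = timedelta(minutes=rollback_minutes + retest_minutes + signoff_minutes)
    return window_end - needed


def must_decide_now(now, decision_time):
    """Hard stop: once the decision time arrives, troubleshooting ends."""
    return now >= decision_time


if __name__ == "__main__":
    window_end = datetime(2010, 6, 21, 6, 0)  # hypothetical hand-back to end users
    decide_by = last_possible_decision(
        window_end, rollback_minutes=240, retest_minutes=60, signoff_minutes=30
    )
    print("Decide by:", decide_by)
    print(must_decide_now(datetime(2010, 6, 21, 0, 45), decide_by))
```

The `signoff_minutes` parameter is where the extra time goes when the migration lead is not the one with authority to invoke contingency.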
Post Migration
The Post Migration Task List contains tasks that take place after the bundle is migrated to the
new data center and must be completed prior to the hand-off of the Runbook to the customer as
a project deliverable. The post migration tasks are based on a T-plus schedule, or x-number of
days after migration completion.
Typically, EMC will own the Runbook until a set time after the migration to ensure post migration
tasks are complete. Post migration tasks that are not complete by the previously determined
hand-off date are assigned to a resource and given a completion date. The hand-off is complete
when the customer accepts the document along with any outstanding post migration tasks.
Typically, the hand-off date for a completed migration is around T+14, or 14 days after the
successful migration of a bundle. This allows EMC to maintain ownership of the bundle and
manage any migration related issues that may arise. After the hand-off of the Runbook to the
customer, all issues that arise will be considered production issues not related to the migration
and handled as business as usual according to the customer’s processes.
Execution
On move day, everything must go like clockwork to avoid unplanned down time. No two
migrations are ever the same so I would like to give you a few of my own Lessons Learned:
1. Arrive early. You will be using technology to move technology. Your conference room
must have working phones, Internet access, and a well-stocked bathroom (OK, that last
one isn’t really a technology issue, but it is important nonetheless).
2. Arrange security access to the data center for everyone who might show up. You would be
surprised how many times the system engineer from the source data center realizes on
the day of the migration that his badge does not give him access to the new data center.
3. Don’t let them show up! There is a core group of individuals who will need to be onsite
and there are those who make more money than you and can be onsite if they want to
be. Everyone else needs to be able to justify their presence. You will have your hands full
managing this beast. You need to control the environment, including limiting bystanders
if necessary.
4. Keep your command center quiet. As people start getting bored and tired, they tend to
get silly. As you can probably tell, I am one of those people. Not in the command center!
Do not hesitate to ask people to leave the room if they are getting loud.
5. Update your contact lists. You will accumulate a large list of contact numbers during the
project. If something goes wrong during your migration, you might need every one of
them.
6. Over-communicate. This is a good rule for DCMs in general. You cannot provide too
much status. Communicate when things are going well; communicate more when things
start to go off the rails.
7. Supply food. If you have a long event, you do not want to lose all of your resources three
times a day for an hour each. Feed them there. They will think you are being a good
host.
8. Don’t burn them out. People will hit a point of diminishing returns after 8-10 hours. Plan
for multiple shifts if necessary. Also, plan to have backup resources available should
your migration go into contingency. Give people a rest if needed. They will be more
productive when they come back.
9. Don’t plan travel within 24 hours of your scheduled migration end time. That is the surest
way to predict that something is going to go wrong.
10. If you don’t swear in the command center, you won’t need to look down in horror to see if
the conference bridge is on mute. Likewise, don’t call the customer PM names when the
conference bridge is not on mute; they really don’t like it.
Conclusion
Each Data Center Migration is unique. Most consultants became consultants to avoid the late
night and weekend grunge work. I know; I was one of them. However, I have really come to
appreciate the amount of work that it takes to successfully execute a Data Center Migration
project. From planning to execution, to the post-mortem where everyone sits around and
congratulates you for a job well done, I cannot imagine doing anything else. It is always a
challenge, and, although there are more than a few evenings and weekends away from the
concierge lounge at my favorite hotel, I do not think that I would give this up even if someone
offered me enough money to buy two or three brand new EMC Symmetrix® V-Max arrays.
"You were serious about that?" - Vincent LaGuardia Gambini, My Cousin Vinny