Combined Presentations. Business Continuity Program (BCP) & Disaster Recovery Plan (DRP). Louis Shallal. Aim of Business Continuity Program (BCP): to improve the municipal organizational resilience and capacity to respond to and recover from a loss of its operational capability.
Business Continuity Program (BCP)
&
Disaster Recovery Plan (DRP)
Louis Shallal
Combined Presentations
Aim of Business Continuity Program BCP
• To improve the municipal organizational resilience and capacity to respond to and recover from a loss of its operational capability
What is Operational Resilience?
An umbrella term that covers events ranging from recovery through to assessment and on-going prevention.
[Figure: Operational Resilience. Labels recovered from the diagram: Event; Short-Term Response; Long-Term Response; Damage Assessment (Reputation, Physical Infrastructure, People, IT Systems); Crisis Management; Disaster Recovery (Applications, Technical Infrastructure, Data Recovery); Business Resumption (People, Systems / Infrastructure, Location / Supplies, Location / Resources, Procedures); Business Resumption Plan Testing; Return to Normal; Issues Assessment; Recovery Review; Insurance; Legal; Risk Assessment & Mitigation; Risk Management; Service Management; Supply Chain Assessment; Mitigation; Plan Development / Improvement; Plan Maintenance and Testing.]
BCP/DRP in context
• BCP: The advance preparations necessary to identify the impact of potential business interruptions; formulate recovery strategies; develop business continuity plans; and administer a training, exercise and maintenance process.
• DR: The technology aspects of a business continuity plan. Its focus is the restoration, at an alternate location, of data centre services and computer processing capabilities.
• Emergency Response: An organisation’s coordinated response to a disaster in an effective and timely manner. The goal is to avoid or minimise injury to personnel and/or damage to company assets.
• Disaster: An event, anticipated or unanticipated, that seriously disrupts normal business operations and prevents the company from delivering essential services for a period of time.
Business Continuity vs. Emergency Management
Business Continuity is a process to develop the capacity to respond effectively to, and recover in an orderly manner from, unplanned interruptions that disrupt critical operational functions (e.g., water damage in the main administrative building preventing access).
Emergency Management comprises the actions taken to manage emergencies in the community, including prevention, mitigation, preparedness, response and recovery (e.g., a hazardous material spill, tornado or infectious outbreak).
Focus of Business Continuity vs. Emergency Management
Business Continuity is INWARD looking: internal to the organization
Emergency Management is OUTWARD looking: external focus
Business Continuity vs. Emergency Management
Business Continuity is not mandatory for municipalities, just for Ontario ministries!
Emergency Management is governed by Legislation.
Ontario Emergency Management Legislation
• The Emergency Management and Civil Protection Act requires the Region to create an Emergency Management Program adopted by a By-Law of Council.
• The Program must include:
– An Emergency Plan based on identified risks and hazards
– Annual training programs and exercises
– Public education on risks to public safety and emergency preparedness
Ontario Emergency Management Legislation
• Ontario Regulation 380/04 under the EMCPA requirements include:
– Trained Emergency Management Program Co-ordinator (CEMC) to develop & implement the program
– Emergency Management Program Committee
– A current by-law adopting the program and plan
– A current community risk profile
– Designated Emergency Operations Centre with appropriate communications systems
– Designated Public Information Officer
• It is all about Emergency Training and Public Awareness!
What types of bad stuff…
Natural Events
Human-Caused
Technical Failures
Case Study: Blackout Northeastern North America 2003
– 4:11 pm, Aug 14th
– System instability
– Domino “failure”
– 50 million affected
– Outage of minutes to days
– “Perfect” timing
– Variable business disruption
York Region Blackout!
Case Study: 1998 Ice Storm
– Death and injuries
– Loss of livestock
– Damage to power grid
– Damage to environment
– Damage to maple sugar industry
– Loss of business
– High cost of response
– Economic impacts
Case Study: SARS
But…. Other minor emergencies happen!
• Knocked-down hydro and telephone poles
• Flooding
Where is this? York Region
© 2004 John Newton Associates Inc. 416.929.3621
Some Sobering Statistics
• 50% of organizations never recover if critical business systems are out for more than 10 days
• 93% of organizations that experience a disaster close within 5 years
• Average impact of a system shutdown in a large organization: $96,000 per hour or $4 million a week
• ½% of market share is lost every 8 hours
• It takes 3 years to recover a lost ½% of market share
• One year of consequences for every 6 hours of downtime
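The hourly and weekly shutdown figures quoted above can be related with a small calculation. The sketch below is illustrative only: the slide does not say how the weekly figure was derived, so the 40-hour business week is an assumption, and `downtime_cost` is a hypothetical helper name.

```python
HOURLY_COST = 96_000       # dollars per hour, the figure quoted in the statistics above
BUSINESS_WEEK_HOURS = 40   # ASSUMPTION: a five-day, eight-hour business week


def downtime_cost(hours: float, hourly_cost: float = HOURLY_COST) -> float:
    """Return the estimated dollar impact of `hours` of downtime."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * hourly_cost


weekly = downtime_cost(BUSINESS_WEEK_HOURS)
print(f"One business week of downtime: ${weekly:,.0f}")
```

Under that assumption one week of downtime comes to $3,840,000, which is consistent with the rounded "$4 million a week" on the slide. A 24/7 service would need a different hours figure.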
Forces Driving Need for BCP
1. Stakeholder expectations
– Public
– Political
2. Regulatory concerns
3. Legislated requirements
4. Critical infrastructure
5. Protection of reputation
Business Continuity Strategies – Phase 2
• Decide on the preferred strategy to develop resumption plans using existing in-house facilities or outsourcing the service
• Ensure equipment, services and facilities are in place to allow full, timely implementation of resumption plans
• Explore opportunities for risk mitigation to reduce the likelihood of resumption plan activation
IT Business Continuity Plan
• The IT BC Plan is to be developed by the IT department in partnership with Emergency Management folks as part of the overall Corporate Business Continuity Plan.
IT Business Continuity Plan
• Purpose/Objective:
– First, it supports IT's commitment to swiftly and effectively bring any emergency situation under control
– Second, it leverages the IT DR Plan
– Third, it serves as a guide to an effective response in a crisis situation
IT Business Continuity Plan
• Assumption:
– This plan assumes that the Worst Case Scenario is defined as follows:
• Administration Centre of the municipality is destroyed at 3:00pm on a Tuesday.
• 30% of staff located at the Administration Centre are not available to work.
IT Business Continuity Plan
• Components of the Plan:
– Resource plan – indicating primary and two alternate resources responsible for a function during a crisis along with detailed contact information
– Resumption team – a team lead and three coordinators who will be called to action to ensure critical services continue.
– Plan identifies a primary recovery location and two alternative recovery locations with detailed maps to each location.
– Identify Key Responsibilities and Key Actions for all positions responsible for the various elements of the plan.
IT Business Continuity Plan
• Components of the Plan:
– Resource Requirements – desktops, phones, and other equipment required to set up a recovery office.
– Vital Recovery Records – location of procedures and system passwords.
– External Contact Information – vendors and partners.
– Internal Contact Information – key staff in other departments.
– Personnel Location Control Form – who has been contacted and who has not.
– Critical Assessment Forms – assessment of equipment and office damage.
– Application List – the applications that need to be recovered.
– Personnel Notification Procedures
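The resource plan component (a primary and two alternate resources per function) maps naturally onto a small data structure. The following is a minimal sketch, not the plan's actual format: the class names, fields, and sample contacts are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Contact:
    name: str
    phone: str


@dataclass
class FunctionAssignment:
    """One row of a resource plan: a function and the people who cover it."""
    function: str
    primary: Contact
    alternates: List[Contact]  # the plan calls for two alternates

    def __post_init__(self) -> None:
        if len(self.alternates) != 2:
            raise ValueError("resource plan expects exactly two alternates")

    def call_order(self) -> List[Contact]:
        """Primary first, then the alternates in listed order."""
        return [self.primary, *self.alternates]


def first_available(row: FunctionAssignment, unavailable: Set[str]) -> Optional[Contact]:
    """Return the first contact not known to be unavailable, or None."""
    for contact in row.call_order():
        if contact.name not in unavailable:
            return contact
    return None


# Hypothetical example data, for illustration only.
help_desk = FunctionAssignment(
    function="Help desk",
    primary=Contact("A. Primary", "555-0100"),
    alternates=[Contact("B. Alternate", "555-0101"), Contact("C. Alternate", "555-0102")],
)
```

Keeping two alternates per function matters under the plan's worst-case assumption that 30% of staff are unavailable: `first_available(help_desk, {"A. Primary"})` falls through to the first alternate.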
Lessons
• Executive-level approval and involvement is key
• Plan an effective Business Impact Analysis (BIA)
• Follow a logical investigative sequence
• Identify critical functions
• Create plans only for critical functions
• Strategy selection: develop resumption plans
• Move from paper to capacity, and from theory to practice
– Test and re-test systems
– Train and re-train people
Lessons : Crisis Communications
• Crisis communications plan must dovetail into crisis management plan
• Stakeholders must be known
• Speak with one voice (consistency)
• Truth and timeliness are essential
• Silence is not golden
• Perception becomes reality
• AND DON’T FORGET SOCIAL MEDIA!
Strategies for effective and rapid recovery
Disaster Recovery Planning (DRP)
Presentation Outline
• Challenge
• Background: DRP principles
• Best Practices in DRP
• DRP phases
• What Results to Achieve
Challenge of Municipalities
Continue to enable business units to provide service to citizens, business, and the community in the case of any unplanned computing services interruption.
It is all about protecting services to citizens, businesses, & community.
Challenge
It is not the delivery of IT services that we need to protect; it is the delivery of municipal services that are highly dependent on technology and IT on a 24/7 basis.
Examples of mission-critical apps:
• App to serve the transportation needs of people with physical disabilities
• App for our water and wastewater distribution
• App for social housing services
• App for social services
• App for child care services
• App for our financial and human resource services
• App for managing EMS operations
• App for health services
Background: DRP Principles
A “Cold Site” is essentially a computer room facility that is ready for build-out. The site has no hardware and is sometimes called a shell site. Recovery times are measured in weeks.
A “Warm Site” is a computer room that is in a ready state but is not up to date in terms of readiness for immediate failover. Recovery times are measured in several hours to days.
A “Hot Site” is one in which the requisite hardware and software are operating and active. Typically, hot sites and high-availability (“HA”) sites are designed for mission-critical operations and businesses, such as financial services and health care, where downtime is not an acceptable option and network failover can be performed rapidly. Recovery times are measured in minutes.
Recovery time is directly related to the degree of maintenance and scheduled effort that Municipal IT commits to.
[Figure: recovery-time continuum from Cold Site (weeks; no hardware, quick ship) through Warm Site (days / hours; dated to relatively current OS and data) to Hot Site / HA (hours / minutes; real-time OS / data and failover).]
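The cold, warm and hot definitions can be turned into a simple lookup: given a required restore time, pick the least elaborate site type that still meets it. This is a sketch under stated assumptions. The slides only give orders of magnitude (weeks, hours to days, minutes), so the numeric bounds below are illustrative guesses, and `cheapest_site_for` is a hypothetical name.

```python
from datetime import timedelta
from typing import Optional

# ASSUMED worst-case recovery times per site type, ordered from least to
# most elaborate. The source gives only "weeks", "several hours to days"
# and "minutes"; these exact values are illustrative.
SITE_TIERS = [
    ("cold", timedelta(weeks=4)),
    ("warm", timedelta(days=3)),
    ("hot", timedelta(hours=1)),
]


def cheapest_site_for(required_restore_time: timedelta) -> Optional[str]:
    """Return the least elaborate site type that recovers in time, or None."""
    for name, worst_case in SITE_TIERS:
        if worst_case <= required_restore_time:
            return name
    return None  # even a hot site, as modelled here, is too slow


print(cheapest_site_for(timedelta(days=7)))
```

With these assumed bounds, a service that can tolerate a week of downtime is matched to a warm site, since a cold site's assumed four-week worst case is too slow.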
DRP Principles: Recovery Times vs. Relative Cost
[Figure: relative cost ($) plotted against timeframe to recovery (months, weeks, < 1 week, < 1 day, 0 mins) for Cold Site, Warm Site and Hot Site.]
DRP Principles: What drives the “Temp Site” Recovery Strategy?
A formal Business Impact Analysis (BIA) quantifying loss is required to decide whether a Cold, Warm or Hot Site is required.
The BIA will recommend the preferred recovery strategy.
The municipality must decide on the right balance between underfunding DRP (as with a Cold Site) and overfunding it (as with a Hot Site).
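One way to picture the balance between underfunding and overfunding is to add each site's yearly cost to the loss expected while waiting for it to recover, then pick the smallest total. This is a simplified sketch, not the BIA method itself: every number, the cost model, and the names `SiteOption`, `expected_annual_cost` and `preferred_site` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SiteOption:
    name: str
    annual_cost: float     # hypothetical yearly cost of keeping the site ready
    recovery_hours: float  # hypothetical time to restore service


def expected_annual_cost(option: SiteOption, hourly_loss: float,
                         outages_per_year: float) -> float:
    """Site cost plus expected loss incurred while recovery is under way."""
    return option.annual_cost + outages_per_year * option.recovery_hours * hourly_loss


def preferred_site(options: Iterable[SiteOption], hourly_loss: float,
                   outages_per_year: float) -> SiteOption:
    """Return the option with the lowest expected annual cost."""
    return min(options, key=lambda o: expected_annual_cost(o, hourly_loss, outages_per_year))


# Hypothetical figures, for illustration only.
OPTIONS = [
    SiteOption("cold", annual_cost=50_000, recovery_hours=336),
    SiteOption("warm", annual_cost=250_000, recovery_hours=48),
    SiteOption("hot", annual_cost=1_000_000, recovery_hours=0.5),
]

print(preferred_site(OPTIONS, hourly_loss=10_000, outages_per_year=0.1).name)
```

In this toy model the answer shifts with the quantified loss: a low hourly loss favours the cold site, a very high one favours the hot site. That sensitivity is why the loss figures from the BIA have to come first.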
Best Practices in DRP
Common Failings in Disaster Planning
• Many organizations underestimate the effort required to do proper disaster planning. A big-bang approach often fails under its own weight.
• Many organizations put the cart before the horse and get into trouble that way. Conduct a BIA (Business Impact Analysis) first!
• Many organizations look at DRP as a one-time event. DRP testing must be ongoing, at least annually!
• Many organizations fail to see the moving target of DRP. For example, most municipalities undergo a lot of change in staff and resources, as well as growth. Things will become obsolete if not continuously updated.
Best Practices in DRP
• The solution is to break the work down into phases
• Must build on successes and momentum
• Most organizations cannot sustain the adoption rate
• Set expectations: DR is not a “quick fix” but a long-term commitment to running multiple data centres (or an SLA for cloud services)
DRP phases: BIA (Business Impact Analysis)
• Should be the first work package conducted
• Verifies risk and impact factors:
– Customer service
– Productivity
– Loss of revenue
– Public image
– Financial accountability
– Loss of data
– Collective bargaining agreement compliance
– Employee health and safety compliance
– Public safety
• Confirms restore time requirements
• Leads to the DRP recovery strategy (confirms Cold, Warm or Hot Site)
DRP Implementation phases: Phase 1
This phase delivers the base infrastructure for the Disaster Recovery site (which could be on premises or in the cloud). For an on-premises solution, it includes a build-out of the computer room at the site and equipping the room with appropriate hardware and software for recovery.
A cloud solution (IaaS) will eliminate most of the needs identified above.
This phase includes a successful test of the “Priority One” applications.
DRP phases: Phase 2 (on premises)
This phase builds on the success of Phase 1. It includes adding enough disk space and additional servers to perform a full recovery. It adds a complete copy of the user base and full permissions from a security perspective. This phase culminates in a successful first user test of the site, including all mission-critical applications.
Phases 1 and 2 should be built in an isolated environment in order to minimize risk to the production systems. The DRP domain is a logically separate portion of the network. Any recovery is limited to the computers and systems located at the DR site.
DRP phases: Phase 3
Assess how well the first two phases meet current DR needs before expanding them on a much larger scale.
The separate DRP domain is integrated into the production network, i.e., the DRP domain can be “seen” by all computers in the municipality.
Incorporating the DRP site into the production environment requires a large-scale re-architecture of that environment. This MAY include an upgrade to the OS (Windows) core.
DRP phases
Phase 4
Phase 4 focuses on extending the underlying network infrastructure to build redundancy into the system. This will make the recovery site available to a larger number of municipal sites.
What results are you after
• Complete the BIA
• Build infrastructure for the right “temp site”
• Successfully test priority one applications
• Successfully recover and test all mission-critical applications. Shoot for a success rate of over 95%.
• Relocate DRP equipment to an alternate site or to the cloud (data centre)
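The "over 95%" target in the results list above is easy to track once each recovery test is recorded as a pass or fail. A minimal sketch, assuming one boolean result per application; the application names and helper functions are hypothetical.

```python
from typing import Dict

TARGET_RATE = 0.95  # the "over 95%" target from the results list


def success_rate(results: Dict[str, bool]) -> float:
    """Fraction of applications whose recovery test passed."""
    if not results:
        raise ValueError("no test results recorded")
    return sum(results.values()) / len(results)


def meets_target(results: Dict[str, bool], target: float = TARGET_RATE) -> bool:
    """True only when the success rate is strictly above the target."""
    return success_rate(results) > target


# Hypothetical test run: 40 applications, one failed recovery.
results = {f"app-{i}": True for i in range(1, 41)}
results["app-7"] = False

print(f"{success_rate(results):.1%}", meets_target(results))
```

The comparison is strict because the slide asks for over 95%, so a run of exactly 19 passes out of 20 would not meet the target.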