Business Continuity Program (BCP) & Disaster Recovery Plan (DRP) Louis Shallal Combined Presentations


Page 1: Combined Presentations

Business Continuity Program (BCP)

&

Disaster Recovery Plan (DRP)

Louis Shallal

Combined Presentations

Page 2: Combined Presentations

Aim of Business Continuity Program BCP

• To improve the municipal organizational resilience and capacity to respond and recover from a loss of its operational capability

Page 3: Combined Presentations

What is Operational Resilience?

An umbrella term that covers events ranging from recovery through to assessment and on-going prevention.

[Diagram: Operational Resilience. Labels recoverable from the figure: Risk Management; Risk Assessment & Mitigation; Supply Chain Assessment; Service Management; Plan Development / Improvement; Plan Maintenance and Testing; Event; Crisis Management; Damage Assessment (Reputation, Physical Infrastructure, People, IT Systems); Disaster Recovery (Applications, Technical Infrastructure, Data Recovery); Business Resumption (People, Systems / Infrastructure, Location / Resources, Location / Supplies, Procedures, Plan Testing); Return to Normal (Issues Assessment, Recovery Review, Insurance, Legal); Short-Term Response; Long-Term Response.]

Page 4: Combined Presentations

BCP/DRP in context

• BCP: The advance preparations necessary to identify the impact of potential business interruptions; formulate recovery strategies; develop business continuity plans; and administer a training, exercise and maintenance process.

• DR: The technology aspects of a business continuity plan. Its focus is the restoration, at an alternate location, of data centre services and computer processing capabilities.

• Emergency Response: An organisation’s coordinated response to a disaster in an effective and timely manner. The goal is to avoid or minimise injury to personnel and/or damage to company assets.

• Disaster: An event, anticipated or unanticipated, that seriously disrupts normal business operations and prevents the company from delivering essential services for a period of time.

Page 5: Combined Presentations

Business Continuity vs. Emergency Management

Business Continuity is a process to develop the capacity to respond effectively to, and recover in an orderly manner from, unplanned interruptions that disrupt critical operational functions (e.g., water damage in the main administrative building preventing access).

Emergency Management is the set of actions taken to manage emergencies in the community, including prevention, mitigation, preparedness, response and recovery (e.g., a hazardous material spill, tornado or infectious outbreak).

Page 6: Combined Presentations

Focus of Business Continuity vs. Emergency Management

Business Continuity is INWARD looking … internal to the organization

Emergency Management is OUTWARD looking … external focus

Page 7: Combined Presentations

Business Continuity vs. Emergency Management

Business Continuity is not mandatory for municipalities

… just Ontario ministries!

Emergency Management is governed by Legislation.

Page 8: Combined Presentations

Ontario Emergency Management Legislation

• The Emergency Management and Civil Protection Act requires the Region to create an Emergency Management Program adopted by a By-Law of Council.

• The Program must include:
– An Emergency Plan based on identified risks and hazards
– Annual training programs and exercises
– Public education on risks to public safety and emergency preparedness

Page 9: Combined Presentations

Ontario Emergency Management Legislation

• Requirements of Ontario Regulation 380/04 under the EMCPA include:
– Trained Emergency Management Program Co-ordinator (CEMC) to develop & implement the program
– Emergency Management Program Committee
– A current by-law adopting the program and plan
– A current community risk profile
– Designated Emergency Operations Centre with appropriate communications systems
– Designated Public Information Officer

• It is all about Emergency Training and Public Awareness!

Page 10: Combined Presentations

What types of bad stuff…

Natural Events

Human-Caused

Technical Failures

Page 11: Combined Presentations

Case Study: Blackout Northeastern North America 2003

– 4:11 pm, Aug 14th
– System instability
– Domino “Failure”
– 50 million affected
– Outage of minutes to days
– “Perfect” timing
– Variable business disruption

Page 12: Combined Presentations

York Region Blackout!

Page 13: Combined Presentations

Case Study: 1998 Ice Storm

– Death and injuries
– Loss of livestock
– Damage to power grid
– Damage to environment
– Damage to maple sugar industry
– Loss of business
– High cost of response
– Economic impacts

Page 14: Combined Presentations
Page 15: Combined Presentations
Page 16: Combined Presentations
Page 17: Combined Presentations

Case Study: SARS

Page 18: Combined Presentations

But…. Other minor emergencies happen!

• Knocked-down hydro and telephone poles
• Flooding

Page 19: Combined Presentations
Page 20: Combined Presentations

Where is this? York Region

Page 21: Combined Presentations

Some Sobering Statistics

• 50% of organizations never recover if critical business systems are out for more than 10 days
• 93% of organizations that experience a disaster close within 5 years
• Average impact of a system shutdown in a large organization: $96,000 per hour or $4 million a week
• ½% of market share is lost every 8 hours
• It takes 3 years to recover a lost ½% of market share
• One year of consequences for every 6 hours of downtime

© 2004 John Newton Associates Inc. 416.929.3621
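The hourly and weekly impact figures quoted on this slide can be related with simple arithmetic. The following is a minimal sketch, assuming the weekly figure refers to a 40-hour business week; that is an assumption made here for illustration, not something the slide states, and the function name `downtime_cost` is hypothetical.

```python
HOURLY_IMPACT = 96_000  # dollars per hour, as quoted on the slide


def downtime_cost(hours: float, hourly_impact: float = HOURLY_IMPACT) -> float:
    """Estimate the cost of an outage lasting `hours` hours."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * hourly_impact


# Assumption: a 40-hour business week, which lands close to the quoted $4 million.
business_week = downtime_cost(40)

# A full 168-hour calendar week at the same hourly rate gives a far larger total.
calendar_week = downtime_cost(24 * 7)

print(f"40-hour business week:  ${business_week:,.0f}")
print(f"168-hour calendar week: ${calendar_week:,.0f}")
```

Under the 40-hour assumption the hourly rate gives roughly the weekly figure; under a 24/7 assumption it does not, so the basis of the weekly number is worth confirming before reusing it.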

Page 22: Combined Presentations

Forces Driving Need for BCP

1. Stakeholder expectations
– Public
– Political
2. Regulatory concerns
3. Legislated requirements
4. Critical infrastructure
5. Protection of reputation

Page 23: Combined Presentations

Business Continuity Strategies – Phase 2

• Decide on the preferred strategy for developing resumption plans: using existing in-house facilities or outsourcing the service

• Ensure equipment, services and facilities are in place to allow full, timely implementation of resumption plans

• Explore opportunities for risk mitigation to reduce the likelihood of resumption plan activation

Page 24: Combined Presentations

IT Business Continuity Plan

• The IT BC Plan is to be developed by the IT department in partnership with Emergency Management folks as part of the overall Corporate Business Continuity Plan.

Page 25: Combined Presentations

IT Business Continuity Plan

• Purpose/Objective:

– First, it supports IT’s commitment to swiftly and effectively bring any emergency situation under control.
– Second, it leverages the IT DR Plan.
– Third, it serves as a guide to an effective response in a crisis situation.

Page 26: Combined Presentations

IT Business Continuity Plan

• Assumption:

– This plan assumes that the Worst Case Scenario is defined as follows:

• Administration Centre of the municipality is destroyed at 3:00pm on a Tuesday.

• 30% of staff located at the Administration Centre are not available to work.

Page 27: Combined Presentations

IT Business Continuity Plan

• Components of the Plan:

– Resource plan – indicates the primary and two alternate resources responsible for each function during a crisis, along with detailed contact information.

– Resumption team – a lead and three coordinators who will be called to action to ensure critical services continue.

– Recovery locations – a primary recovery location and two alternative recovery locations, with detailed maps to each location.

– Key Responsibilities and Key Actions – identified for all positions responsible for the various elements of the plan.
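The resource plan component (a primary and two alternates per function) maps naturally onto a small data structure. The sketch below is illustrative only: the class names, fields, phone numbers and the `responsible` helper are hypothetical and not taken from the presentation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Contact:
    name: str
    phone: str
    available: bool = True  # updated as staff are reached during a crisis


@dataclass
class FunctionAssignment:
    function: str
    primary: Contact
    # The plan calls for two alternates per function.
    alternates: List[Contact] = field(default_factory=list)

    def responsible(self) -> Optional[Contact]:
        """Return the first available person, checking the primary first."""
        for contact in [self.primary, *self.alternates]:
            if contact.available:
                return contact
        return None  # nobody reachable: escalate to the resumption team lead


# Hypothetical example: the primary cannot be reached.
assignment = FunctionAssignment(
    function="Restore email service",
    primary=Contact("Primary", "555-0100", available=False),
    alternates=[
        Contact("Alternate 1", "555-0101"),
        Contact("Alternate 2", "555-0102"),
    ],
)
print(assignment.responsible().name)
```

Keeping the alternates in an ordered list makes the fallback order explicit, which matters when 30% of staff are assumed unavailable.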

Page 28: Combined Presentations

IT Business Continuity Plan

• Components of the Plan:

– Resource Requirements – desktops, phones, and other equipment required to set up a recovery office.

– Vital Recovery Records – location of procedures and system passwords.

– External Contact Information – vendors and partners.
– Internal Contact Information – key staff in other departments.
– Personnel Location Control Form – who has been contacted and who has not.
– Critical Assessment Forms – assessment of equipment and office damage.
– Application List – the applications that need to be recovered.
– Personnel Notification Procedures

Page 29: Combined Presentations

Lessons

• Executive Level approval & involvement is Key
• Plan an Effective Business Impact Analysis (BIA)
• Follow a logical investigative sequence
• Identify critical functions
• Create plans only for critical functions
• Strategy Selection: Develop Resumption Plans
• Move from Paper to Capacity… and from theory to practice…
– Test and re-test Systems
– Train and re-train People

Page 30: Combined Presentations

Lessons : Crisis Communications

• Crisis communications plan must dovetail into crisis management plan
• Stakeholders must be known
• Speak with one voice: consistency
• Truth and timeliness are essential
• Silence is not golden
• Perception becomes reality
• AND DON’T FORGET THE SOCIAL MEDIA!

Page 31: Combined Presentations

Strategies for effective and rapid recovery

Disaster Recovery Planning (DRP)

Page 32: Combined Presentations

Presentation Outline

• Challenge
• Background: DRP principles
• Best Practices in DRP
• DRP phases
• What Results to Achieve

Page 33: Combined Presentations

Challenge of Municipalities

Continue to enable business units to provide service to citizens, business, and the community in the case of any unplanned computing services interruption.

It is all about protecting services to citizens, businesses, & community.

Page 34: Combined Presentations

Challenge

It is not the delivery of IT services that we need to protect; it is the delivery of municipal services that are highly dependent on technology and IT on a 24/7 basis …

Examples of mission critical apps:
• App to serve the transportation needs of people with physical disabilities
• App for the delivery of our water and wastewater distribution
• App for social housing services
• App for social services
• App for child care services
• App for our financial and human resource services
• App for managing EMS operations
• App for Health Services

Page 35: Combined Presentations

Background: DRP Principles

A “Cold Site” is essentially a computer room facility which is ready for build-out. The site has no hardware and is sometimes called a shell site. Recovery times are measured in weeks.

A “Warm Site” is a computer room that is in a ready state but is not up to date in terms of readiness for immediate failover. Recovery times are measured in several hours to days.

A “Hot Site” is one in which the requisite hardware / software is operating and is active. Typically, hot sites and “HA” sites are designed for mission critical operations / businesses, such as financial and health care, where downtime is not an acceptable option and network failover can be performed rapidly. Recovery times are measured in minutes.

Recovery time is directly related to the degree of maintenance and scheduled efforts desired / committed by Municipal IT.

Recovery spectrum, from slowest to fastest:

Cold Site (weeks): no HW; quick ship
Warm Site (days / hours): dated to relatively current OS & data
Hot Site / HA (hours / minutes): real-time OS / data & failover

Page 36: Combined Presentations

DRP Principles: Recovery Times vs. Relative Cost

[Chart: Relative Cost ($) against Timeframe to Recovery (Months, Weeks, < 1 Week, < 1 Day, 0 mins) for Cold Site, Warm Site and Hot Site.]

Page 37: Combined Presentations

DRP Principles: What drives the “Temp Site” Recovery Strategy?

A formal Business Impact Analysis (BIA) quantifying loss is required to decide whether a Cold, Warm or Hot Site is required.

The BIA will recommend the preferred recovery strategy.

The municipality must decide what is the right balance between underfunding (as in a Cold Site) and overfunding (as in a Hot Site) the DRP.
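The balance between underfunding and overfunding can be sketched as picking the least costly site type whose recovery time still meets what the BIA requires. This is a minimal sketch under stated assumptions: the hour thresholds and cost ranks below are illustrative values loosely based on the ranges quoted earlier (weeks, hours to days, minutes), and `recommend_site` is a hypothetical helper, not part of any BIA methodology.

```python
from typing import List, NamedTuple


class SiteType(NamedTuple):
    name: str
    worst_case_recovery_hours: float  # illustrative assumption
    relative_cost: int                # 1 = cheapest, 3 = most expensive


SITE_TYPES: List[SiteType] = [
    SiteType("Cold Site", 4 * 7 * 24, 1),  # assumed: up to four weeks
    SiteType("Warm Site", 3 * 24, 2),      # assumed: up to three days
    SiteType("Hot Site", 1, 3),            # assumed: within the hour
]


def recommend_site(required_recovery_hours: float) -> SiteType:
    """Return the cheapest site type that recovers within the required time."""
    candidates = [
        site for site in SITE_TYPES
        if site.worst_case_recovery_hours <= required_recovery_hours
    ]
    if not candidates:
        raise ValueError("no site type in this sketch meets the requirement")
    return min(candidates, key=lambda site: site.relative_cost)


# Hypothetical requirements a BIA might produce, in hours.
for hours in (24 * 30, 96, 2):
    print(hours, "->", recommend_site(hours).name)
```

A real decision would weigh the quantified loss from the BIA against actual site costs; the ranking here only captures the ordering Cold < Warm < Hot.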

Page 38: Combined Presentations

Best Practices in DRP

Common Failings in Disaster Planning

• Many organizations underestimate the effort required to do proper Disaster Planning. A big bang approach often fails under its own weight.

• Many organizations put the “cart before the horse” and get into trouble that way. Conduct a BIA (Business Impact Analysis) first!

• Many organizations look at DRP as a “one-time” event. DRP testing must be ongoing, at least annually!

• Many organizations fail to see the “moving target” of DRP. For example, most municipalities undergo a lot of change in staff and resources, as well as growth. Things will become obsolete if not continuously updated.

Page 39: Combined Presentations

Best Practices in DRP

• Solution is to break it down into phases
• Must build on successes and momentum
• Most organizations cannot sustain adoption rate
• Set expectations: DR is not a “quick fix” – long term commitment to running multiple data centres (or SLA for cloud services)

Page 40: Combined Presentations

DRP phases: BIA (Business Impact Analysis)

• Should be the first work package conducted
• Verifies Risk and Impact Factors:
– Customer Service
– Productivity
– Loss of revenue
– Public Image
– Financial accountability
– Loss of data
– Collective bargaining agreement compliance
– Employee health and safety compliance
– Public Safety
• Confirms restore time requirements
• Leads to the DRP recovery strategy (confirms Cold, Warm or Hot Site)
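One simple way to use the risk and impact factors is to rate each business function against them and rank functions by total impact. The sketch below assumes a 0 to 5 rating per factor and an unweighted sum; both are illustrative choices, and the example functions and ratings are hypothetical.

```python
from typing import Dict, List

# Factor names taken from the slide, written as identifiers.
FACTORS = [
    "customer_service", "productivity", "loss_of_revenue", "public_image",
    "financial_accountability", "loss_of_data",
    "collective_agreement_compliance", "health_and_safety_compliance",
    "public_safety",
]


def impact_score(ratings: Dict[str, int]) -> int:
    """Sum the ratings (assumed 0-5); factors left out count as 0."""
    unknown = set(ratings) - set(FACTORS)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    return sum(ratings.get(factor, 0) for factor in FACTORS)


def rank_functions(bia: Dict[str, Dict[str, int]]) -> List[str]:
    """Order business functions from highest to lowest total impact."""
    return sorted(bia, key=lambda name: impact_score(bia[name]), reverse=True)


# Hypothetical ratings for three functions.
bia = {
    "Internal newsletter": {"productivity": 1, "public_image": 1},
    "Water distribution": {"public_safety": 5, "customer_service": 5,
                           "public_image": 4},
    "Payroll": {"financial_accountability": 4, "productivity": 3,
                "collective_agreement_compliance": 5},
}
print(rank_functions(bia))
```

The highest-ranked functions are the candidates for tighter restore times, which in turn feeds the choice of recovery strategy.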

Page 41: Combined Presentations

DRP Implementation phases: Phase 1

This phase delivers the base infrastructure for the Disaster Recovery site (which could be on premises or in the cloud). For an on-premises solution, it includes a build-out of the computer room at the site and equipping the room with appropriate hardware and software for recovery.

A cloud solution (IaaS) will eliminate most of the needs identified above.

This phase includes a successful test of the “Priority One” applications.

Page 42: Combined Presentations

DRP phases: Phase 2 (on premises)

This phase builds on the success of Phase one. It includes adding enough disk space and additional servers to perform a full recovery. It adds a complete copy of the user base and full permissions from a security perspective. This phase culminates in a successful first user test of the site, including all mission critical applications.

Phases one and two should be built in an isolated environment in order to minimize risk to the production systems. The DRP domain is a logically separate portion of the network. Any recovery is limited to the computers and systems located at the DR site.

Page 43: Combined Presentations

DRP phases: Phase 3

Assess how the first two phases are able to meet current DR needs before expanding them on a much larger scale.

The separate DRP domain is integrated into the production network, i.e., the DRP domain can be “seen” by all computers in the Municipality.

To incorporate the DRP site into the production environment, a large-scale re-architecture of the production environment is required. This MAY include an upgrade to the OS (Windows) core.

Page 44: Combined Presentations

DRP phases

Phase 4

Phase four is focused on extending the underlying network infrastructure to build redundancy into the system. This will make the recovery site available to a wider number of municipal sites.

Page 45: Combined Presentations

What results are you after?

• Complete the BIA.
• Build infrastructure for the right “temp site.”
• Successfully test priority one applications.
• Successfully recover and test all mission critical applications. Shoot for a success rate over 95%.
• Relocate DRP equipment to an alternate site or to the cloud (data centre).
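Checking a recovery test against the 95% target is a one-line calculation. The sketch below assumes each application's test outcome is recorded as pass or fail; the application names and results are hypothetical.

```python
from typing import Dict

TARGET_PERCENT = 95.0  # the slide asks for a success rate over 95%


def success_rate(results: Dict[str, bool]) -> float:
    """Percentage of applications that were recovered and tested successfully."""
    if not results:
        raise ValueError("no test results recorded")
    return 100.0 * sum(results.values()) / len(results)


# Hypothetical test run: 20 applications, one failed recovery.
results = {f"app-{i:02d}": True for i in range(1, 21)}
results["app-07"] = False

rate = success_rate(results)
meets_target = rate > TARGET_PERCENT
print(f"{rate:.1f}% recovered, target met: {meets_target}")
```

Note that 19 of 20 is exactly 95%, which does not clear a target stated as "over 95%"; with a small application list, a single failure can be enough to miss it.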

Page 46: Combined Presentations

Thank you

[email protected]

Questions/Comments Are Welcome