7/28/2019 BC-DR
1/61
Business Continuity
& Disaster Recovery
Business Impact Analysis
RPO/RTO
Disaster Recovery
Testing, Backups, Audit
Acknowledgments
Material is sourced from:
CISA Review Manual 2009, 2008, ISACA. All rights reserved. Used by permission.
CISA Certified Information Systems Auditor All-in-One Exam Guide, Peter H Gregory, McGraw-Hill
Author: Susan J Lincke, PhD
Univ. of Wisconsin-Parkside
Reviewers/Contributors: Todd Burri & Megan Reid
Funded by National Science Foundation (NSF) Course, Curriculum and Laboratory Improvement (CCLI) grant 0837574: Information Security: Audit, Case Study, and Service Learning.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and/or source(s) and do not necessarily reflect the views of the National Science Foundation.
Objectives
Define: Business Continuity Plan (BCP), Business Impact Analysis (BIA), RAID, Disaster Recovery Plan (DRP)
Define: Hot site, warm site, cold site, reciprocal agreement, mobile site
Define and analyze: Recovery point objective (RPO), Recovery time objective (RTO)
Define and give order of: Desk-based or paper test, preparedness test, fully operational test
Define tests and give order of: checklist, structured walkthrough, simulation test, parallel test, full interruption, pretest, post-test
Define and give examples for: Diverse routing, alternative routing
Define and analyze examples for: Incremental backup, differential backup
Define: cloud computing, Infrastructure as a Service, Platform as a Service, Software as a Service, private cloud, community cloud, public cloud, hybrid cloud
Develop a Business Continuity Plan
Perform a Business Impact Analysis
Imagine a company
Bank with 1 million accounts, social security numbers, credit cards, loans
Airline serving 50,000 people on 250 flights daily
Pharmacy system filling 5 million prescriptions per year, some of the prescriptions are life-saving
Factory with 200 employees producing 200,000 products per day using robots
Imagine a system failure
Server failure
Disk System failure
Hacker break-in
Denial of Service attack
Extended power failure
Snow storm
Spyware
Malevolent virus or worm
Earthquake, tornado
Employee error or revenge
How will this affect each business?
First Step: Business Impact Analysis
Which business processes are of strategic importance?
What disasters could occur?
What impact would they have on the organization financially? Legally? On human life? On reputation?
What is the required recovery time period?
Answers obtained via questionnaire, interviews, or meeting with key users of IT
Event Damage Classification
Negligible: No significant cost or damage
Minor: A non-negligible event with no material or financial impact on the business
Major: Impacts one or more departments and may impact outside clients
Crisis: Has a major material or financial impact on the business
Minor, Major, & Crisis events should be documented and tracked to repair
Workbook: Disasters and Impact
(Assumes a university)
Problematic Event or Incident | Affected Business Process(es) | Impact Classification & Effect on finances, legal liability, human life, reputation
Fire | Class rooms, business departments | Crisis, at times Major; Human life
Hacking Attack | Registration, advising | Major; Legal liability
Network Unavailable | Registration, advising, classes, homework, education | Crisis
Social engineering/Fraud | Registration | Major; Legal liability
Server Failure (Disk/server) | Registration, advising, classes, homework, education | Major, at times Crisis
Recovery Time: Terms
Interruption Window: Time duration organization can wait between point of failure and service resumption
Service Delivery Objective (SDO): Level of service in Alternate Mode
Maximum Tolerable Outage: Max time in Alternate Mode
[Figure: timeline. Regular Service, then Interruption, then Alternate Mode (at the SDO level), then Regular Service. The Interruption Window ends when the Disaster Recovery Plan is implemented; the Maximum Tolerable Outage ends when the Restoration Plan is implemented.]
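The recovery-time terms can be read as two checks on an outage timeline. A minimal Python sketch, assuming hypothetical field names and hour values that do not come from the slides:

```python
from dataclasses import dataclass


@dataclass
class OutageTimeline:
    """Hours measured from the point of failure (illustrative values only)."""
    alternate_mode_start: float   # Disaster Recovery Plan implemented
    regular_service_start: float  # Restoration Plan implemented


def check_timeline(t: OutageTimeline, interruption_window: float,
                   max_tolerable_outage: float) -> list:
    """Return the names of the objectives that were violated (empty if none)."""
    problems = []
    # Interruption Window: time between failure and service resumption.
    if t.alternate_mode_start > interruption_window:
        problems.append("interruption window exceeded")
    # Maximum Tolerable Outage: longest time allowed in Alternate Mode.
    time_in_alternate_mode = t.regular_service_start - t.alternate_mode_start
    if time_in_alternate_mode > max_tolerable_outage:
        problems.append("maximum tolerable outage exceeded")
    return problems
```

For example, `check_timeline(OutageTimeline(3, 50), 4, 72)` returns an empty list: service resumed in Alternate Mode within the 4-hour window and stayed there 47 hours, under the 72-hour limit.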
Definitions
Business Continuity: Offer critical services in event of disruption
Disaster Recovery: Survive interruption to computer information systems
Alternate Process Mode: Service offered by backup system
Disaster Recovery Plan (DRP): How to transition to Alternate Process Mode
Restoration Plan: How to return to regular system mode
Classification of Services
Critical $$$$: Cannot be performed manually. Tolerance to interruption is very low
Vital $$: Can be performed manually for very short time
Sensitive $: Can be performed manually for a period of time, but may cost more in staff
Nonsensitive: Can be performed manually for an extended period of time with little additional cost and minimal recovery effort
Determine Criticality of Business Processes
[Figure: hierarchy of business processes, each ranked by priority. Corporate: Sales (1), Shipping (2), Engineering (3). Lower levels include Web Service (1), Sales Calls (2); Product A (1), Product B (2), Product C (3); Orders (1), Inventory (2).]
RPO and RTO
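Recovery Point Objective: How far back can you fail to? One week's worth of data?
Recovery Time Objective: How long can you operate without a system? Which services can last how long?
[Figure: timeline centered on the Interruption. Recovery Point Objective is measured backward from the interruption (1 hour, 1 day, 1 week); Recovery Time Objective is measured forward from the interruption (1 hour, 1 day, 1 week).]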
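The two objectives can be checked against a recorded incident. The sketch below is illustrative only: the function names and the timestamps in the usage note are hypothetical, not part of the course material.

```python
from datetime import datetime, timedelta


def data_loss_window(last_good_copy: datetime, failure: datetime) -> timedelta:
    """How far back the organization had to fail to (compare with RPO)."""
    return failure - last_good_copy


def downtime(failure: datetime, service_resumed: datetime) -> timedelta:
    """How long the service was unavailable (compare with RTO)."""
    return service_resumed - failure


def meets_objectives(last_good_copy: datetime, failure: datetime,
                     service_resumed: datetime,
                     rpo: timedelta, rto: timedelta) -> dict:
    """Report whether each objective was met for one incident."""
    return {
        "rpo_met": data_loss_window(last_good_copy, failure) <= rpo,
        "rto_met": downtime(failure, service_resumed) <= rto,
    }
```

With a last good copy at 10:30, a failure at 12:00, and service resumed at 15:00, an RPO of 2 hours and an RTO of 4 hours are both met; tightening the RPO to 1 hour would fail, because 1.5 hours of data are lost.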
Recovery Point Objective
Mirroring: RAID
Backup Images
Orphan Data: Data which is lost and never recovered.
RPO influences the Backup Period
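One way to read "RPO influences the Backup Period": in the worst case, a failure just before the next backup orphans a full backup period of data, so the period should not exceed the RPO. A minimal sketch under that assumption (function names are hypothetical; the pairing of a zero RPO with mirroring follows the slide):

```python
def worst_case_orphan_hours(backup_period_hours: float) -> float:
    """Worst case: failure occurs just before the next backup would run."""
    return backup_period_hours


def rpo_is_met(backup_period_hours: float, rpo_hours: float) -> bool:
    """True if even the worst-case data loss stays within the RPO."""
    return worst_case_orphan_hours(backup_period_hours) <= rpo_hours


def suggest_rpo_control(rpo_hours: float) -> str:
    """Map an RPO to the kind of control named on the slide."""
    if rpo_hours < 0:
        raise ValueError("RPO cannot be negative")
    if rpo_hours == 0:
        # No data loss tolerated: periodic backups alone cannot achieve this.
        return "mirroring (RAID)"
    return f"backup images at least every {rpo_hours:g} hours"
```

For instance, daily backups satisfy a one-day RPO (`rpo_is_met(24, 24)` is true) but not a two-hour RPO.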
Business Impact Analysis Summary
Workbook
Service | Recovery Point Objective (Hours) | Recovery Time Objective (Hours) | Critical Resources (Computer, people, peripherals) | Special Notes (Unusual treatment at specific times, unusual risk conditions)
Registration | 0 hours | 4 hours | SOLAR, network, Registrar | High priority during Nov-Jan, March-June, August.
Personnel | 2 hours | 8 hours | PeopleSoft | Can operate manually for some time
Teaching | 1 day | 1 hour | D2L, network, faculty files | During school semester: high priority.
Partial BIA for a university
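The partial BIA can be held as simple records and queried. The sketch below copies the RPO/RTO values from the table (1 day is written as 24 hours); the class name and the idea of ordering recovery by shortest RTO are illustrative assumptions, not something the workbook prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiaEntry:
    service: str
    rpo_hours: float
    rto_hours: float


# Values taken from the partial BIA table for a university.
BIA = [
    BiaEntry("Registration", rpo_hours=0, rto_hours=4),
    BiaEntry("Personnel", rpo_hours=2, rto_hours=8),
    BiaEntry("Teaching", rpo_hours=24, rto_hours=1),
]


def recovery_order(entries):
    """Services sorted so the shortest RTO is recovered first (an assumption)."""
    return [e.service for e in sorted(entries, key=lambda e: e.rto_hours)]


def needs_mirroring(entries):
    """Services whose RPO of zero rules out periodic backups alone."""
    return [e.service for e in entries if e.rpo_hours == 0]
```

Here `recovery_order(BIA)` puts Teaching first (1-hour RTO), then Registration, then Personnel, while `needs_mirroring(BIA)` singles out Registration.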
Network Disaster Recovery
Redundancy: Includes routing protocols, fail-over, multiple paths
Alternative Routing: >1 medium or >1 network provider
Diverse Routing: Multiple paths, 1 medium type
Last-mile circuit protection: E.g., Local: microwave & cable
Long-haul network diversity: Redundant network providers
Voice Recovery: Voice communication backup
Disruption vs. Recovery Costs
[Figure: Cost versus Time. One curve shows the cost of Service Downtime, another the cost of Alternative Recovery Strategies (Hot Site, Warm Site, Cold Site); the Minimum Cost point is marked.]
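The trade-off between downtime cost and recovery-strategy cost can be sketched as a small minimization. Every number below is a made-up placeholder (strategy prices, recovery hours, and the assumption of a single outage per planning period); only the three site names come from the slide.

```python
# name: (hypothetical cost of the strategy, hypothetical hours until operational)
STRATEGIES = {
    "hot site": (500_000, 4),
    "warm site": (200_000, 72),
    "cold site": (50_000, 336),
}


def total_cost(strategy_cost: float, recovery_hours: float,
               downtime_cost_per_hour: float) -> float:
    """Strategy cost plus the cost of being down while recovering."""
    return strategy_cost + recovery_hours * downtime_cost_per_hour


def cheapest_strategy(downtime_cost_per_hour: float, strategies=STRATEGIES) -> str:
    """Pick the strategy with the minimum total cost for one outage."""
    return min(
        strategies,
        key=lambda name: total_cost(*strategies[name], downtime_cost_per_hour),
    )
```

Under these placeholder figures, a business losing 10,000 per hour of downtime is best served by the hot site, one losing 2,000 per hour by the warm site, and one losing 100 per hour by the cold site.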
Alternative Recovery Strategies
Hot Site: Fully configured, ready to operate within hours
Warm Site: Ready to operate within days: no or low power main computer. Does contain disks, network, peripherals.
Cold Site: Ready to operate within weeks. Contains electrical wiring, air conditioning, flooring
Duplicate or Redundant Info. Processing Facility: Standby hot site within the organization
Reciprocal Agreement with another organization or division
Mobile Site: Fully- or partially-configured trailer comes to your site, with microwave or satellite communications
What is Cloud Computing?
[Figure: Cloud Computing, connecting Laptop and PC to Database, App Server, Web Server, and VPN Server.]
Introduction to Cloud
This would cost $200/month.
NIST Visual Model of Cloud Computing Definition
National Institute of Standards and Technology, www.cloudstandards.org
Cloud Service Models
Software (SaaS): Provider runs own applications on cloud infrastructure.
Platform (PaaS): Consumer provides apps; provider provides system and development environment.
Infrastructure (IaaS): Provides customers access to processing, storage, networks or other fundamental resources
[Figure: service model stack. SaaS: cloud's software & apps. PaaS: your application on, e.g., the cloud's DB and OS. IaaS: cloud's computer, OS, networks.]
Cloud Deployment Models
Private Cloud: Dedicated to one organization
Community Cloud: Several organizations with shared concerns share computer facilities
Public Cloud: Available to the public or a large industry group
Hybrid Cloud: Two or more clouds (private, community or public clouds) remain distinct but are bound together by standardized or proprietary technology
Major Areas of Security Concerns
Multi-tenancy: Your app is on same server with other organizations. Need: segmentation, isolation, policy
Service Level Agreement (SLA): Defines performance, security policy, availability, backup, location, compliance, audit issues
Your Coverage: Total security = your portion + provider portion
Responsibility varies for IaaS vs. PaaS vs. SaaS
You can transfer security responsibility but not accountability
Hot Site
Contractual costs include: basic subscription, monthly fee, testing charges, activation costs, and hourly/daily use charges
Contractual issues include: other subscriber access, speed of access, configurations, staff assistance, audit & test
Hot site is for emergency use, not long term
May offer warm or cold site for extended durations
Reciprocal Agreements
Advantage: Low cost
Problems may include:
Quick access
Compatibility (computer, software, …)
Resource availability: computer, network, staff
Priority of visitor
Security (less a problem if same organization)
Testing required
Susceptibility to same disasters
Length of welcomed stay
RPO Controls
Workbook
Data File and System/Directory Location | RPO (Hours) | Special Treatment (Backup period, RAID, File Retention Strategies)
Registration | 0 hours | RAID. Mobile Site?
Teaching | 1 day | Daily backups. Facilities Computer Center as redundant info processing center
Business Continuity Process
Perform Business Impact Analysis
Prioritize services to support critical business processes
Determine alternate processing modes for critical and vital services
Develop the Disaster Recovery plan for IS systems recovery
Develop BCP for business operations recovery and continuation
Test the plans
Maintain plans
Question
The amount of data transactions that are allowed to be lost following a computer failure (i.e., duration of orphan data) is the:
1. Recovery Time Objective
2. Recovery Point Objective
3. Service Delivery Objective
4. Maximum Tolerable Outage
Question
When the RTO is large, this is associated with:
1. Critical applications
2. A speedy alternative recovery strategy
3. Sensitive or nonsensitive services
4. An extensive restoration plan
Question
When the RPO is very short, the best solution is:
1. Cold site
2. Data mirroring
3. A detailed and efficient Disaster Recovery Plan
4. An accurate Business Continuity Plan
Disaster Recovery
Disaster Recovery Testing
An Incident Occurs
Security officer declares disaster
Call Security Officer (SO) or committee member
SO follows pre-established protocol
Emergency Response Team: Human life: First concern
Phone tree notifies relevant participants
IT follows Disaster Recovery Plan
Public relations interfaces with media (everyone else quiet)
Mgmt, legal counsel act
Concerns for a BCP/DR Plan
Evacuation plan: People's lives always take first priority
Disaster declaration: Who, how, for what?
Responsibility: Who covers necessary disaster recovery functions
Procedures for Disaster Recovery
Procedures for Alternate Mode operation
Resource Allocation: During recovery & continued operation
Copies of the plan should be off-site
Disaster Recovery Responsibilities
General Business: First responder (evacuation, fire, health); Damage Assessment; Emergency Mgmt; Legal Affairs; Transportation/Relocation/Coordination (people, equipment); Supplies; Salvage; Training
IT-Specific Functions: Software; Application; Emergency operations; Network recovery; Hardware; Database/Data Entry; Information Security
BCP Documents
Focus: IT | Business
Event Recovery
IT: Disaster Recovery Plan: Procedures to recover at alternate site
IT: IT Contingency Plan: Recovers major application or system
IT: Cyber Incident Response Plan: Malicious cyber incident
Business: Business Recovery Plan: Recover business after a disaster
Business: Occupant Emergency Plan: Protect life and assets during physical threat
Business: Crisis Communication Plan: Provide status reports to public and personnel
Business Continuity
Business Continuity Plan
Continuity of Operations Plan: Longer duration outages
Workbook
Business Continuity Overview
Classification (Critical or Vital) | Business Process | Incident or Problematic Event(s) | Procedure for Handling (Section 5)
Vital | Registration | Computer Failure | If total failure, forward requests to UW-System. Otherwise, use 1-week-old database for read purposes only
Critical | Teaching | Computer Failure | Faculty DB Recovery Procedure
MTBF = MTTF + MTTR
Mean Time to Failure (MTTF)
Mean Time to Repair (MTTR)
Mean Time Between Failure (MTBF)
Measure of availability:
5 9s = 99.999% of time working = about 5 minutes of failure per year.
[Figure: timeline alternating works, repair, works, repair, works; intervals labeled 1 day and 84 days.]
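A short worked calculation ties these numbers together. It assumes the common definition availability = MTTF / MTBF, and it reads the figure as 84 days of working followed by 1 day of repair; both are assumptions, since the slide does not state them explicitly.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def mtbf(mttf: float, mttr: float) -> float:
    """Mean Time Between Failure, as on the slide: MTBF = MTTF + MTTR."""
    return mttf + mttr


def availability(mttf: float, mttr: float) -> float:
    """Fraction of time working (assumed definition: MTTF / MTBF)."""
    return mttf / mtbf(mttf, mttr)


def downtime_minutes_per_year(avail: float) -> float:
    """Expected minutes of failure per year at a given availability."""
    return (1 - avail) * MINUTES_PER_YEAR
```

With MTTF = 84 days and MTTR = 1 day, MTBF is 85 days and availability is 84/85, roughly 98.8%. At five 9s, `downtime_minutes_per_year(0.99999)` gives roughly 5.26 minutes, which matches the slide's figure of about 5 minutes per year.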
Disaster Recovery Test Execution
Always tested in this order:
Desk-Based Evaluation/Paper Test: A group steps through a paper procedure and mentally performs each step.
Preparedness Test: Part of the full test is performed. Different parts are tested regularly.
Full Operational Test: Simulation of a full disaster
Business Continuity Test Types
Checklist Review: Reviews coverage of plan: are all important concerns covered?
Structured Walkthrough: Reviews all aspects of plan, often walking through different scenarios
Simulation Test: Execute plan based upon a specific scenario, without alternate site
Parallel Test: Bring up alternate off-site facility, without bringing down regular site
Full-Interruption: Move processing from regular site to alternate site.
Testing Objectives
Main objective: Verify that existing plans will result in successful recovery of infrastructure & business processes
Also can:
Identify gaps or errors
Verify assumptions
Test time lines
Train and coordinate staff
Testing Procedures
Tests start simple and become more challenging with progress
Include an independent 3rd party (e.g. auditor) to observe test
Retain documentation for audit reviews
[Flowchart: Develop test objectives, then Execute Test, then Evaluate Test, then Develop recommendations to improve test effectiveness, then Follow-Up to ensure recommendations implemented.]
Test Stages
PreTest: Set the Stage
Set up equipment
Prepare staff
Test: Actual test
PostTest: Cleanup
Returning resources
Calculate metrics: Time required, % success rate in processing, ratio of successful transactions in Alternate mode vs. normal mode
Delete test data
Evaluate plan
Implement improvements
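The post-test metrics can be computed from simple counts. A minimal sketch; the function names and the sample counts in the usage note are hypothetical:

```python
def success_rate_percent(successful: int, attempted: int) -> float:
    """Percent of attempted transactions processed successfully."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return 100.0 * successful / attempted


def alternate_to_normal_ratio(alternate_successful: int,
                              normal_successful: int) -> float:
    """Ratio of successful transactions in Alternate mode vs. normal mode,
    measured over comparable time periods."""
    if normal_successful <= 0:
        raise ValueError("normal_successful must be positive")
    return alternate_successful / normal_successful
```

If 950 of 1,000 test transactions succeed, the success rate is 95%; if Alternate mode completes 800 transactions in the time normal mode completes 1,000, the ratio is 0.8.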
Gap Analysis
Comparing Current Level with Desired Level
Which processes need to be improved?
Where is staff or equipment lacking?
Where does additional coordination need to occur?
Insurance
IPF & Equipment | Data & Media | Employee Damage
Business Interruption: Loss of profit due to IS interruption | Valuable Papers & Records: Covers cash value of lost/damaged paper & records | Fidelity Coverage: Loss from dishonest employees
Extra Expense: Extra cost of operation following IPF damage | Media Reconstruction: Cost of reproduction of media | Errors & Omissions: Liability for error resulting in loss to client
IS Equipment & Facilities: Loss of IPF & equipment due to damage | Media Transportation: Loss of data during transport |
IPF = Information Processing Facility
Auditing BCP
Includes:
Is BIA complete with RPO/RTO defined for all services?
Is the BCP in line with business goals, effective, and current?
Is it clear who does what in the BCP and DRP?
Is everyone trained, competent, and happy with their jobs?
Is the DRP detailed, maintained, and tested?
Are the BCP and DRP consistent in their recovery coverage?
Are people listed in the BCP/phone tree current and do they have a copy of BC manual?
Are the backup/recovery procedures being followed?
Does the hot site have correct copies of all software?
Is the backup site maintained to expectations, and are the expectations effective?
Was the DRP test documented well, and was the DRP updated?
Summary of BC Security Controls
RAID
Backups: Incremental backup, differential backup
Networks: Diverse routing, alternative routing
Alternative Site: Hot site, warm site, cold site, reciprocal agreement, mobile site
Testing: checklist, structured walkthrough, simulation, parallel, full interruption
Insurance
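The difference between the two backup types shows up at restore time. The sketch below relies on the standard definitions, which this section names but does not spell out: an incremental backup holds changes since the previous backup of any type, and a differential backup holds all changes since the last full backup. The function name and backup labels are hypothetical.

```python
def restore_chain(backups: list, scheme: str) -> list:
    """Return the backups needed to restore to the most recent state.

    backups: labels in time order, starting with the last full backup,
             e.g. ["full", "mon", "tue", "wed"].
    scheme:  "incremental" or "differential".
    """
    if not backups:
        raise ValueError("at least a full backup is required")
    full, later = backups[0], backups[1:]
    if scheme == "incremental":
        # Each later backup holds only the changes since the one before it,
        # so every one of them must be applied in order.
        return [full] + later
    if scheme == "differential":
        # The latest backup holds all changes since the full backup.
        return [full] + later[-1:]
    raise ValueError(f"unknown scheme: {scheme}")
```

For a full backup followed by three daily backups, an incremental scheme restores from all four sets, while a differential scheme needs only the full backup and the latest differential. Incrementals are typically smaller to take; differentials are quicker to restore.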
Question
The FIRST thing that should be done when you discover an intruder has hacked into your computer system is to:
1. Disconnect the computer facilities from the computer network to hopefully disconnect the attacker
2. Power down the server to prevent further loss of confidentiality and data integrity.
3. Call the manager.
4. Follow the directions of the Incident Response Plan.
Question
During an audit of the business continuity plan, the finding of MOST concern is:
1. The phone tree has not been double-checked in 6 months
2. The Business Impact Analysis has not been updated this year
3. A test of the backup-recovery system is not performed regularly
4. The backup library site lacks a UPS
Question
The first and most important BCP test is the:
1. Fully operational test
2. Preparedness test
3. Security test
4. Desk-based paper test
Question
When a disaster occurs, the highest priority is:
1. Ensuring everyone is safe
2. Minimizing data loss by saving important data
3. Recovery of backup tapes
4. Calling a manager
Question
A documented process where one determines the most crucial IT operations from the business perspective
1. Business Continuity Plan
2. Disaster Recovery Plan
3. Restoration Plan
4. Business Impact Analysis
Question
The PRIMARY goal of the Post-Test is:
1. Write a report for audit purposes
2. Return to normal processing
3. Evaluate test effectiveness and update the response plan
4. Report on test to management
Question
A test that verifies that the alternate site successfully can process transactions is known as:
1. Structured walkthrough
2. Parallel test
3. Simulation test
4. Preparedness test
Interactive Crossword Puzzle
To get more practice with the vocabulary from this section, click on the picture below. For a word bank, look at the previous slide.
Definitions adapted from:
All-In-One CISA Exam Guide
HEALTH FIRST CASE STUDY
Business Impact Analysis & Business Continuity
Jamie Ramon MD, Doctor
Chris Ramon RD, Dietician
Terry, Licensed Practicing Nurse
Pat, Software Consultant
Step 1: Define Threats Resulting in Business Disruption
Problematic Event or Incident | Affected Business Process(es) | Impact Classification & Effect on finances, legal liability, human life, reputation
Fire | |
Hacking incident | |
Network Unavailable (E.g., ISP problem) | |
Social engineering, fraud | |
Server Failure (E.g., Disk) | |
Power Failure | |
Step 2: Define Recovery Objectives
[Figure: timeline centered on the Interruption. Recovery Point Objective is measured backward from the interruption (1 hour, 1 day, 1 week); Recovery Time Objective is measured forward from the interruption (1 hour, 1 day, 1 week).]
Business Process | Recovery Time Objective (Hours) | Recovery Point Objective (Hours) | Critical Resources (Computer, people, peripherals) | Special Notes (Unusual treatment at specific times, unusual risk conditions)
Business Continuity
Step 3: Attaining Recovery Point Objective (RPO)
Step 4: Attaining Recovery Time Objective (RTO)
Classification (Critical or Vital) | Business Process | Problem Event(s) or Incident | Procedure for Handling (Section 5)