Disaster Recovery Sudath Wijeratne 15-Sep-06


Disaster Recovery

Sudath Wijeratne 15-Sep-06


Information Services

Agenda

• Background
• Methodology
• Our DR Strategy
• Learning Management System (Blackboard) DR implementation
• Discussions


Background

• Griffith has 5 campus locations from Brisbane's Southbank to the Gold Coast

• Servers have been installed local to each campus

• Nathan-centric corporate systems and servers


Background

• The University AUQA audit in 2003 identified risk management as a priority, and the impact of failure of electronic infrastructure was identified as a risk requiring mitigation. A loss of the core Nathan Data Centre has the potential to cripple the University’s ability to deliver its core services.

• Achieving sufficient involvement and sponsorship from core business units to undertake a traditional top-down Business Impact Analysis (BIA) approach to disaster recovery has proven difficult over the last few years.


Background …

• In the absence of a BCP plan/strategy, the ICTS management team commissioned a disaster recovery project to jumpstart the process via a bottom-up approach for three critical University systems:
– Learning@Griffith
– Staff Email
– Corporate Web Services


Methodology

• Dozens of systems
• Hundreds of IT components (building blocks)
• How do they relate to each other, and what is the real impact of failure of any particular building block?
• We needed a process / methodology to guide us in approaching DR system by system


Methodology

• We have developed a methodology called “Building Block”
• The methodology is about
– Systems / Services
– Architectural components
– And their dependencies


Methodology

[Figure: component fact sheets, relating Systems / Services to Architectural components]

• Systems / Services: System-1, System-2, System-3, System-4
• Architectural components: DBMS Services, Application Services, Storage Services, Authentication Services, Directory Services, Network, Web Services, Middleware Services, DNS
• System-1 entries: S1-APS, S1-DBS, S1-AUS, S1-NW, S1-DNS


Building Block methodology

• The top-down approach is to analyse IT services, iteratively decomposing them into key technological components and dependencies.

• As a result of this process, the key building blocks required to deliver the service are identified.

• Recovery solutions for each building block are then developed.

• A second round of analysis identifies common building blocks that are reusable for other systems, or as base components for holistic disaster recovery (e.g. DNS services, LDAP).
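
The second round of analysis can be illustrated with a small sketch. The Python example below assumes each system is recorded as a set of the building blocks it depends on; the system names and block identifiers are hypothetical placeholders, not the actual Griffith inventory.

```python
from collections import defaultdict

# Hypothetical inventory: system -> building blocks it depends on.
SYSTEMS = {
    "System-1": {"S1-APS", "S1-DBS", "LDAP", "DNS"},
    "System-2": {"S2-APS", "LDAP", "DNS"},
    "System-3": {"S3-APS", "S3-DBS", "DNS"},
}


def common_building_blocks(systems, min_users=2):
    """Return blocks used by at least `min_users` systems, with their users."""
    users = defaultdict(set)
    for system, blocks in systems.items():
        for block in blocks:
            users[block].add(system)
    return {
        block: sorted(names)
        for block, names in users.items()
        if len(names) >= min_users
    }


shared = common_building_blocks(SYSTEMS)
for block in sorted(shared):
    print(block, "->", ", ".join(shared[block]))
```

Blocks that appear for several systems (here DNS and LDAP) are the candidates for reusable recovery solutions.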


Understanding Environment Interdependencies

System-1 (S1)
• Web/Apps Services (S1-APS)
• Database Services (S1-DBS)
• Authentication (S1-AUS)
• Network Services (S1-NW)


Understanding Environment Interdependencies

System-1 (S1)
• Web/Apps Services (S1-APS)
– Apps Servers (S1-APS-SRV)
– Storage Services (S1-APS-SS)
– LBS (S1-APS-NW)
• Database Services (S1-DBS)
• Authentication (S1-AUS)
• Network Services (S1-NW)
• DNS01


Understanding Environment Interdependencies

System-1 (S1)
• Web/Apps Services (S1-APS)
– Apps Servers (S1-APS-SRV)
– Storage Services (S1-APS-SS)
– LBS (S1-APS-NW)
• Database Services (S1-DBS)
• Authentication (S1-AUS)
– LBS (S1-AUS-NW)
– LDAP servers (S1-AUS-SVR)
• Network Services (S1-NW)
• DNS01


Understanding Environment Interdependencies

System-1 (S1)
• Web/Apps Services (S1-APS)
– Apps Servers (S1-APS-SRV)
– Storage Services (S1-APS-SS)
– LBS (S1-APS-NW)
• Database Services (S1-DBS)
– Database Server (S1-DBS-SVR)
– Storage Service (S1-DBS-SS)
• Authentication (S1-AUS)
– LBS (S1-AUS-NW)
– LDAP servers (S1-AUS-SVR)
• Network Services (S1-NW)
• DNS01
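
The identification codes on the System-1 diagram appear to encode the hierarchy in their segments (for example S1-APS-SRV sits under S1-APS). The following Python sketch rebuilds the tree from the codes alone. The rule that a parent code is the child code minus its last segment is an assumption made for illustration; the slides do not state it.

```python
# Identification codes as they appear on the System-1 diagram.
CODES = [
    "S1-APS", "S1-DBS", "S1-AUS", "S1-NW",
    "S1-APS-SRV", "S1-APS-SS", "S1-APS-NW",
    "S1-AUS-NW", "S1-AUS-SVR",
    "S1-DBS-SVR", "S1-DBS-SS",
]


def parent_of(code):
    """Assumed convention: the parent code is the code minus its last segment."""
    head, _, _ = code.rpartition("-")
    return head or None


def build_tree(codes):
    """Map each parent code to the sorted list of its direct children."""
    tree = {}
    for code in codes:
        tree.setdefault(parent_of(code), []).append(code)
    return {parent: sorted(children) for parent, children in tree.items()}


tree = build_tree(CODES)
for parent in sorted(tree):
    print(parent, "->", tree[parent])
```

A lookup such as `tree["S1-APS"]` then lists the building blocks whose failure affects the Web/Apps service directly.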


Methodology application to Blackboard


Methodology Summary for Blackboard

Learning@Griffith BOS breakdown structure

Blackboard
• Application Services (BB-AS)
– Collaboration servers (BB-AS-COLSVR), with Storage (BB-AS-COLSVR-SS)
– Application servers (BB-AS-APPSVR), with Storage (BB-AS-APPSVR-SS)
• Authentication Services (BB-AUS)
– LDAP Servers (BB-AUS-SVR)
• Network Services (BB-NW)
– LBS (BB-NWS-LBS)
– DNS (BB-NWS-DNS)
– BB-DHCP
• Database Services (BB-DBS)
– Server (BB-DBS-SVR)
– Storage (BB-DBS-SS)

Other labels on the diagram:
• Digital Repository
• In-House Applications
• Border Router (NW-BDR-RTR)
• LG Campus Router (NW-LG-RT)
• NA Campus Router (NW-NA-RT)
• GC Campus Router (NW-GC-RT)
• SB Campus Router (NW-SB-RT)
• Links: Fibre, ATM
• Sites: Nathan, MG, GC, SB, LG


Fact sheets

INFORMATION SERVICES
INFORMATION & COMMUNICATION TECHNOLOGY SERVICES
Date: July 2005

Field | Value | Guidance
Architectural Layer | Database Services | E.g. Application, DBMS, Storage, Network Services etc.
CMDB Reference | NA | Reference to the relevant CMDB for the purpose of linking to an existing configuration management process, if applicable (E.g.: DBMS-CI008)
Identification Code | BB-DBS | Unique identification code. E.g.: DB01
Description | Blackboard Database Services | Brief description of the building block. E.g.: Learning@Griffith database.
Detailed Description | Oracle based DBMS services | Detailed description of the building block. E.g.: Server details, Vendor, SAN storage, anything relevant to identify the building block
Position Responsible | Manager, DBMS | Position that is responsible for the disaster recovery aspects of the building block. E.g.: Manager, Database & Storage Services.
Version | 1.1 | Document version.
Last Updated | 04/07/2005 | Date when this document was last updated.
Dependent Components | | Building blocks on which this building block is dependent. E.g.: SRV01 (Server), STOR01 (Storage).
DR Level Objective | TBD | Minimum level of services required while in disaster mode. E.g.: Web access to email.
DR Time Objective | TBD | The maximum amount of time before service must be made available. E.g.: System must be available within 16 hours of disaster.
DR Point Objective | TBD | Maximum amount of data loss acceptable as a result of a disaster. E.g.: maximum of 24 hours data loss.
Current DR Contingency | Database files are being replicated to MG | What is in place now.
Future DR Contingency | TBD | What is planned for the future. E.g.: An alternative balance switching method needs to be provided (possibly an LBS switch needs to be supplied off Nathan).
Issues Register Reference | Nil | Building Block DR contingency issues listed in the issues register. E.g.: Present (see issue register), Nil (resolved or nonexistent)
DR Contingency Status | In Planning | What is the current status of the building block with respect to the ‘Future DR Contingency’. E.g.: Not Started, In Planning, In Progress, Complete.
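
A fact sheet of this shape maps naturally onto a small record type. The Python sketch below is one possible representation, not the format the project used: the class name, the field names, and the choice of `None` to stand for "TBD" are assumptions made for illustration. The values are those of the BB-DBS fact sheet above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FactSheet:
    """One building block fact sheet; field names mirror the template rows."""
    architectural_layer: str
    identification_code: str
    description: str
    detailed_description: str
    position_responsible: str
    version: str
    last_updated: str
    cmdb_reference: Optional[str] = None
    dependent_components: List[str] = field(default_factory=list)
    dr_level_objective: Optional[str] = None  # None stands for "TBD"
    dr_time_objective: Optional[str] = None
    dr_point_objective: Optional[str] = None
    current_dr_contingency: str = ""
    future_dr_contingency: Optional[str] = None
    issues_register_reference: str = "Nil"
    dr_contingency_status: str = "Not Started"

    def undefined_objectives(self):
        """List the DR objectives that are still TBD."""
        names = ["dr_level_objective", "dr_time_objective", "dr_point_objective"]
        return [name for name in names if getattr(self, name) is None]


bb_dbs = FactSheet(
    architectural_layer="Database Services",
    identification_code="BB-DBS",
    description="Blackboard Database Services",
    detailed_description="Oracle based DBMS services",
    position_responsible="Manager, DBMS",
    version="1.1",
    last_updated="04/07/2005",
    current_dr_contingency="Database files are being replicated to MG",
    dr_contingency_status="In Planning",
)
print(bb_dbs.identification_code, "still TBD:", bb_dbs.undefined_objectives())
```

Keeping the sheets as structured records makes it easy to report which building blocks still lack level, time, or point objectives.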


Learning@Griffith Recovery Order

Blackboard
• Server Management Services (BB-SMS)
– Collaboration servers (GU-SMS-BB-COLSVR)
– Application servers (GU-SMS-BB-APPSVR)
– LDAP Servers (GU-SMS-AUS-SVR)
– DNS (GU-SMS-DNS)
• Network Services (GU-NCS)
– SLB (GU-NCS-SLB)
– GU-DHCP (GU-NCS-DHCP)
– Network Connectivity (GU-NCS-NW)
• Database Management Services (BB-DBMS)
– BB Oracle (BB-DBMS-SVR)
• Storage Services (GU-SAN)
– Storage (GU-SAN-BB-SS)

[Figure: recovery order, with steps 1 to 13 marked on the building blocks and one item marked optional]
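
A recovery order of this kind can be derived from the dependencies between building blocks: a block is recovered only after everything it depends on. The Python sketch below uses the standard library's `graphlib.TopologicalSorter` for this. The dependency edges are assumptions chosen for illustration, and the resulting order is not the numbering shown on the slide.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Assumed dependencies: block -> blocks that must be recovered first.
DEPENDS_ON = {
    "GU-NCS-NW": set(),
    "GU-SAN-BB-SS": {"GU-NCS-NW"},
    "GU-SMS-DNS": {"GU-NCS-NW"},
    "GU-SMS-AUS-SVR": {"GU-SMS-DNS"},
    "BB-DBMS-SVR": {"GU-SAN-BB-SS", "GU-SMS-DNS"},
    "GU-SMS-BB-APPSVR": {"BB-DBMS-SVR", "GU-SMS-AUS-SVR"},
    "GU-NCS-SLB": {"GU-SMS-BB-APPSVR"},
}

# static_order() yields each block after all of its prerequisites.
recovery_order = list(TopologicalSorter(DEPENDS_ON).static_order())
for step, block in enumerate(recovery_order, start=1):
    print(step, block)
```

Several valid orders usually exist; blocks with no dependency between them can be recovered in parallel or in either sequence.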


Our DR Strategy is based around …

• Two physically separated primary Data Centres

• Distributed operation of major systems between these Campuses

• Near real-time data replication capabilities between these Data Centres


Key Dependencies for DR Project

• SAN Infrastructure provisioning at Gold Coast campus

• Network server campus virtualisation (Between Nathan and Gold Coast)


SAN Infrastructure design

• Tiered storage capabilities
• Provide inter-campus (inter-SAN) copy (TRUE Copy) and snapshot (Shadow Image) capabilities
• Tier-to-tier copy capabilities
• Central management of storage
• Allow for DR implementation using the above capabilities


Virtual server campus network design


Virtual server campus network design

• Provide equal access to servers from anywhere
• Well-defined access points into the server subnets
• Greater server-to-server connectivity
• Allow for DR planning by allowing a shared layer 2 between sites
• Easy to migrate servers and services between sites


Learning@Griffith Distributed DR Architecture


Disaster Recovery Architecture for BlackBoard: Before DR Implementation
S. Wijeratne, 19/05/2006

[Figure: architecture diagram with the following labels]
• Clients
• Layer 3 Switch
• Nathan: App Server 1, App Server 12, App Server 13, App Server 24, Collaboration Server (Active)
• SAN - NATHAN: BB Prod Database, File System (NFS mount), Shared File System, NFS
• SAN - MT Gravatt: BB Prod Database, File System (NFS mount), Shared File System, NFS
• Async. Copy (SAN TRUE COPY)
• ORACLE BACKUP (shown twice)
• Tape Backup
• LDAP Server1 to LDAP ServerN
• Primary DNS
• F15K


Disaster Recovery Architecture for BlackBoard: After DR Implementation
S. Wijeratne, 18/05/2006

[Figure: architecture diagram with the following labels]
• Clients
• Virtual Campus Network
• Layer 7 Switch (two)
• Sites: Nathan, Gold Coast
• App Server 1, App Server 12, App Server 13, App Server 24
• Collaboration Server (Active), Collaboration Server (Stand by)
• SAN - NATHAN: BB Prod Database, File System (NFS mount), Shared File System, NFS
• SAN - GOLD COAST: BB Prod Database, File System (NFS mount), Shared File System, NFS
• Async. Copy (SAN TRUE COPY)
• ORACLE BACKUP (shown twice)
• Shadow Image
• Tape Backup
• LDAP Server1 to LDAP ServerN, LDAP ServerN+1 to LDAP ServerZ
• Primary DNS
• Primary DNS (Backup) @South Bank
• F15K (two)


Disaster Recovery Architecture for BlackBoard: Disaster at NA
S. Wijeratne, 18/05/2006

[Figure: architecture diagram with the following labels]
• Clients
• Virtual Campus Network
• Layer 7 Switch (two)
• Sites: Nathan, Gold Coast
• App Server 1, App Server 12, App Server 13, App Server 24
• Collaboration Server (Active), Collaboration Server (Stand by)
• SAN - NATHAN: BB Prod Database, File System (NFS mount), Shared File System, NFS
• SAN - GOLD COAST: BB Prod Database, File System (NFS mount), Shared File System, NFS
• Async. Copy (SAN TRUE COPY)
• ORACLE BACKUP (shown twice)
• Shadow Image
• Tape Backup
• LDAP Server1 to LDAP ServerN, LDAP ServerN+1 to LDAP ServerZ
• Primary DNS
• Primary DNS @South Bank
• F15K (two)


DR Plans

• DR Management Framework

• DR Plans for each building block

• Resumption plans


Lessons learnt

• Resource issues and priorities with multiple projects
• Distributed environment
– Benefits and challenges
• Resources at 2nd primary site
• External audit University Audit committees
• BCP awareness


Discussions