
High availability and site redundancy with Exchange 2007: Notes from the field Gareth Ireland Infrastructure Consultant


Session Objectives and Takeaways

• Session Objectives:
– Understanding the High Availability requirements and objectives of a business
– Understanding what to protect in an Exchange Server 2007 environment
– Understanding Exchange Server 2007 features and solutions for protecting services and data
– Understanding issues to consider in site resiliency solutions
– Comparing an Exchange 2003 geo-cluster deployment with an Exchange 2007 solution

Session Objectives and Takeaways (cont.)

• Session Objectives (cont.):
– Practical demonstration of an Exchange Server 2007 High Availability deployment

• Key Takeaways:
– New High Availability features and solutions reduce the chance of disaster
– New Disaster Recovery features and solutions reduce the time of recovery when disasters do occur
– Demystify the concepts of High Availability features in Exchange Server 2007


High Availability Requirements of Business

Types of Failures

Small-Scale
• Accidentally deleted items
• Deleted mailbox
• Disk failure
• Disk controller failure
• Database corruption
• Log corruption
• Storage failure (DAS)

Mid-Scale
• Full server failure
• Complete cluster failure
• Large storage failure, e.g., SAN failure

Large-Scale
• Total site failure

Exchange Server 2007 High Availability: Overview

[Diagram: an Exchange organization connected to the Internet, showing the Edge Transport, Hub Transport, Client Access, and Unified Messaging server roles, plus a CCR cluster of two Mailbox server role nodes (Active and Passive).]

Exchange Server 2007 High Availability Solutions Matrix


What to Protect and How

• Exchange Server 2003
– Requires shared storage
– SMTP, OWA, and Mailbox are cluster-aware
– Single copy of mailbox data
– Up to 8-node Active/Passive
– 2-node Active/Active
– Geo-clusters required synchronized storage replication
– Split-brain scenarios

• Exchange Server 2007
– Requires shared storage
– Mailbox only
   • Simple redundancy for other roles
– Single copy of mailbox data
– Up to 8-node Active/Passive
– Active/Active cut
– Improvements in install, management, behavior

[Diagrams: an Exchange Server 2003 cluster with Q, DB, and Logs on shared storage hosting SMTP, MB, and OWA; an Exchange Server 2007 cluster with Q, DB, and Logs hosting MB only.]

Limitations

• Deployment/operational cost and complexity
• Recovery time varies based on backup technology, but can be lengthy and painful
• Data redundancy requires integration of partner technology

[Diagram: shared storage (Q, DB, Logs) with Mailbox (MB).]

Local Continuous Replication (LCR)

• Standalone server data availability
– Data outages expensive to recover
– Significant data loss (hours?)
– Previous versions of Exchange required partner products for replication

• What is LCR?
– Data replication on a single server in a single datacenter
   • Enabled per storage group
   • Easy to configure

Local Continuous Replication

• Key things to know:
– Per storage group, manual configuration
– Adds overhead to server
– Some configuration limitations

• Benefits
– Enables recovery in minutes
– Enables recovery without data loss
– Enables large mailboxes
– Variety of storage and backup options
   • Decreases TCO by enabling I/O offload
– Within reach of broad set of customers

[Diagram: database (DB) and log (Logs) copies, a file share, and a "Service Pack 1" label.]

Standby Continuous Replication

[Diagram labels: CCR, Passive Node, SCC, Q, MBX.]

Standby Continuous Replication

• Designed for datacenter recovery
• Enables standby configurations out of the box
– No clustering required between servers
– No single subnet requirement
– Spans multiple AD sites
• Granular configuration
• Flexible configuration
– Many-to-many
• Manual activation

• Two-node Active/Passive failover cluster
– File Share Witness (MNS Quorum)
– No shared storage
– Witness on Hub Transport
– Automatic recovery
• Continuous data replication
• Full redundancy
• One or two datacenter solution

[Diagram: two nodes, each with its own DB and Logs, plus a file share.]
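
The role of the File Share Witness in a two-node Majority Node Set cluster can be pictured as a simple vote count. The sketch below is a simplified model for illustration only, not the Windows cluster service's actual arbitration logic; the function name `has_quorum` and the one-vote-per-participant rule are assumptions made for this example.

```python
def has_quorum(node1_up: bool, node2_up: bool, witness_reachable: bool) -> bool:
    """Simplified majority model: two nodes plus a file share witness.

    Assumption: each participant contributes one vote, and the cluster
    stays up only while a strict majority (2 of 3) is available.
    """
    votes = sum([node1_up, node2_up, witness_reachable])
    return votes >= 2


# Both nodes up: a majority exists even if the witness share is unreachable.
print(has_quorum(True, True, False))
# One node down: the surviving node relies on the witness for the second vote.
print(has_quorum(True, False, True))
# A single node without the witness cannot form a majority.
print(has_quorum(True, False, False))
```

In this model the witness only matters when one node is missing, which is why it acts as a tie-breaker rather than as a third data-bearing node.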

• Outage management
– Easy-to-use scheduled outage support
– Automatic recovery of unscheduled outages
• Symmetric failover
• Resource requirements
• Variety of backup options
• Reduced backup TCO
• Configuration limitations

[Diagram: two nodes, each with DB and Logs, and a File Share Witness (KB 921181).]

Benefits

• Fast recovery from data problems on active node
• No single point of failure
• Simplified hardware requirements
• Simplified storage requirements
• Simplified deployment
• Exchange-provided replication solution
• Enables Mailbox server failover to second datacenter
• Improved management experience
• Ability to offload VSS-based backups

CCR failover behavior

• Cluster service monitors the resources
– Failure detection is not instantaneous
• IP Address or Network Name resource failures cause failover
– A machine, or network access to it, has failed completely
• Exchange service failure or timeout doesn’t cause failover
– The service is restarted on the same node
• Database failure doesn’t cause failover
– Don’t want to move 49 databases because 1 failed
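
The failover rules above amount to a small decision table. The following Python sketch restates them as a lookup; the `Failure` and `Action` enums and the `ccr_response` function are hypothetical names chosen for this illustration and do not correspond to any Exchange or cluster API.

```python
from enum import Enum


class Failure(Enum):
    IP_ADDRESS_RESOURCE = "IP Address resource failure"
    NETWORK_NAME_RESOURCE = "Network Name resource failure"
    EXCHANGE_SERVICE = "Exchange service failure or timeout"
    DATABASE = "database failure"


class Action(Enum):
    FAILOVER = "fail over to the passive node"
    RESTART_SERVICE = "restart the service on the same node"
    NO_FAILOVER = "no failover; the node stays active"


def ccr_response(failure: Failure) -> Action:
    """Map a failure type to the behavior described on the slide."""
    if failure in (Failure.IP_ADDRESS_RESOURCE, Failure.NETWORK_NAME_RESOURCE):
        # The machine, or network access to it, is treated as completely failed.
        return Action.FAILOVER
    if failure is Failure.EXCHANGE_SERVICE:
        return Action.RESTART_SERVICE
    # A single failed database does not move every other database.
    return Action.NO_FAILOVER


for failure in Failure:
    print(f"{failure.value}: {ccr_response(failure).value}")
```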

Available configurations

[Diagram: LCR on a standalone server and CCR on a cluster. In each configuration the Store writes to the DB and the Replication Service maintains a copy; in CCR, logs are pulled by the passive node from the active node.]

Basic architecture

• A ‘pull’ model
• Exchange server creates log files normally
• Log files are copied by Replication service
– Share created on the active node
– Exxnnnnnnnn.log files copied as they appear
• Replication service keeps a copy of the database up-to-date
– Inspects, and replays log files
• Exx.log is copied for handoff/failover
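
The pull model can be sketched as a loop in which the passive side copies closed log generations from the active node's share, inspects each one, and replays it into its own database copy. The Python below is a toy model under stated assumptions: `LogFile`, `close_log`, and `PassiveCopy` are hypothetical names, a CRC32 stands in for the real log inspection, and a list of payloads stands in for the database.

```python
import zlib
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class LogFile:
    generation: int   # stands in for the nnnnnnnn part of Exxnnnnnnnn.log
    payload: bytes    # stands in for the logged transactions
    checksum: int     # recorded when the active node closes the file


def close_log(generation: int, payload: bytes) -> LogFile:
    """Active side: close a log generation and record its checksum."""
    return LogFile(generation, payload, zlib.crc32(payload))


@dataclass
class PassiveCopy:
    replayed_generation: int = 0
    database: List[bytes] = field(default_factory=list)

    def pull(self, share: List[LogFile]) -> None:
        """Copy closed logs from the share, inspect them, and replay in order."""
        for log in sorted(share, key=lambda entry: entry.generation):
            if log.generation != self.replayed_generation + 1:
                continue  # already replayed, or beyond a gap that cannot be skipped
            if zlib.crc32(log.payload) != log.checksum:
                raise ValueError(f"generation {log.generation} failed inspection")
            self.database.append(log.payload)  # replay into the database copy
            self.replayed_generation = log.generation


share = [close_log(1, b"tx-a"), close_log(2, b"tx-b")]
passive = PassiveCopy()
passive.pull(share)
share.append(close_log(3, b"tx-c"))  # a new closed log appears on the share
passive.pull(share)
print(passive.replayed_generation, passive.database)
```

The current log (Exx.log) is deliberately absent from the model: it is still being written, so it is only copied at handoff or failover.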

Cluster Continuous Replication

[Diagram: Node1 (Active) and Node2 (Passive), each with Database and Logs. After an online seed, Node2 copies and verifies logs (E0000000011.log, E0000000012.log) from the share \\node1\GUID, alongside E00.log on Node1, and advances its DB by playing logs to produce an updated DB.]

Cluster Continuous Replication Failover Scenarios

• Scheduled outage
• Scheduled outage to correct corruption (logs available)
• Scheduled outage to correct corruption (no logs available)
– Transport Dumpster
• Store crash
• OS blue screen
– Incremental Replay
• Active network failure
– Logs copied
• Geographically dispersed cluster: single machine failure
• Geographically dispersed cluster: datacenter failure

DEMO: Useful CCR cmdlets

• Get-ClusteredMailboxServerStatus
– Status information of the cluster
• Get-StorageGroupCopyStatus
– Complete status information of CCR or LCR copy
• Move-ClusteredMailboxServer
– Scheduled (lossless) move of Exchange resource
• Update-StorageGroupCopy
– Initiate or resync a CCR or LCR copy (use the Suspend-StorageGroupCopy and Resume-StorageGroupCopy cmdlets as required)
• Get-TransportConfig and Set-TransportConfig
– Get and set transport dumpster configuration

Scheduled outage

• Passive node copies log files
– Exx.log is in use
• On move, Exx.log is copied
• Designations are now reversed

[Animated diagram: log streams on Node 1 and Node 2. Closed logs E00 0000 0001 through E00 0000 0005 and the current log E00 (shown at generations 2 through 6) illustrate the copy sequence; an "Active" label marks the active node.]

• Failover without copying all log files is called “lossy”
• Passive DB is not completely up-to-date
• Log generation numbers are reused
• Log files have different content!
• Database might be different!

[Animated diagram: log streams on Node 1 and Node 2. Closed logs E00 0000 0001 through E00 0000 0005 and the current log E00 (shown at generations 2 through 6) appear on both nodes, with E00 0000 0004 and E00 (Gen 4/Gen 5) highlighted; an "Active" label marks the active node.]
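
Why reused generation numbers are a problem can be shown with a small comparison. In this Python sketch the generation numbers, payloads, and the helper `diverged_generations` are all made up for illustration; it is a conceptual model, not a description of how the Replication service itself detects divergence.

```python
from typing import Dict, List


def diverged_generations(node1_logs: Dict[int, bytes],
                         node2_logs: Dict[int, bytes]) -> List[int]:
    """Return generations present on both nodes whose content differs."""
    shared = node1_logs.keys() & node2_logs.keys()
    return sorted(gen for gen in shared if node1_logs[gen] != node2_logs[gen])


# Assumed scenario: Node 1 was active and wrote generations 1-5,
# but only generations 1-3 reached Node 2 before the failure.
node1 = {1: b"a", 2: b"b", 3: b"c", 4: b"d", 5: b"e"}
node2 = {1: b"a", 2: b"b", 3: b"c"}

# Lossy failover: Node 2 becomes active and reuses generations 4 and 5
# for new transactions, so the same file names now hold different content.
node2[4] = b"x"
node2[5] = b"y"

print(diverged_generations(node1, node2))
```

Any generation reported here exists on both nodes under the same name but cannot simply be replayed on the other side, which is why the old active node's copy may need to be resynchronized after a lossy failover.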

Transport Dumpster

• Transport Dumpster is a feature that is only enabled for use by Cluster Continuous Replication
• The transport dumpster resubmits recently delivered mail from the Hub Transport servers after an unscheduled outage
• It is enabled by default and should always be turned on when using CCR
• The transport dumpster is enabled organization-wide by setting the amount of storage available per storage group and setting the time to retain mail in the dumpster
• What it does:
– The Hub Transport server maintains a queue of mail that was recently delivered to a clustered mailbox server
– In the event of an unplanned failover, CCR automatically requests every Hub Transport server in the site to redeliver mail from the transport dumpster queue
– The information store automatically discards the duplicates and delivers the mail that was lost
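
The redelivery-with-deduplication idea can be modeled in a few lines. This Python sketch is an illustration under assumptions: `HubTransport`, its item cap, and duplicate detection by message ID are simplifications invented for the example, standing in for the real per-storage-group size and retention-time settings.

```python
from collections import deque
from typing import Dict


class HubTransport:
    """Keeps recently delivered mail so it can be redelivered on request."""

    def __init__(self, max_items: int) -> None:
        # Assumption: a simple item cap stands in for the size and time limits.
        self.dumpster = deque(maxlen=max_items)

    def deliver(self, message_id: str, body: str, mailbox: Dict[str, str]) -> None:
        mailbox[message_id] = body
        self.dumpster.append((message_id, body))

    def redeliver(self, mailbox: Dict[str, str]) -> int:
        """Resubmit dumpster contents; the store side drops duplicates."""
        recovered = 0
        for message_id, body in self.dumpster:
            if message_id not in mailbox:  # duplicate detection by message ID
                mailbox[message_id] = body
                recovered += 1
        return recovered


hub = HubTransport(max_items=100)
active_store: Dict[str, str] = {}
for i in range(1, 6):
    hub.deliver(f"msg-{i}", f"body {i}", active_store)

# Lossy failover: the passive copy only contains the first three messages.
passive_store = {key: active_store[key] for key in ("msg-1", "msg-2", "msg-3")}
recovered = hub.redeliver(passive_store)
print(recovered, sorted(passive_store))
```

Mail older than the dumpster's retention window would not be recoverable this way, which is why the storage and time settings matter when sizing for CCR.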


Types of Failures 2003 vs. 2007 Reviewed

Exchange Server 2007 Cluster Continuous Replication

• Stretch CCR on Windows 2003
– 1 node per datacenter
– Integrated data & server redundancy
– Separate storage for each node in each site
– Flexible hardware options
– Mailbox server failover and switchover (manual & automatic)
– File Share Witness quorum
– Requirements
   • AD fix up for other Exchange roles on site failover
   • Windows 2003 still requires single subnet
   • Network pipe between datacenters must carry wide range of traffic

[Diagram: stretch CCR on Windows 2003 across a WAN using a VLAN. Key: Active Directory logical site, physical datacenter site, Active Directory domain.]

Stretch CCR with Windows 2003

[Diagram: Cluster Continuous Replication (CCR) stretched across two datacenters on the same subnet, with public and private networks between the nodes. Primary Data Center (AD Site: Redmond): DC/GC 1 and 2, CAS 1 and 2, HUB 1 and 2, Edge 1 and 2, FSW (fswsvr), Internet connection, and MX record. Standby Data Center (AD Site: Quincy): DC/GC 3 and 4, CAS 3 and 4, HUB 3 and 4, Edge 3 and 4, FSW (fswsvr), Internet connection, and MX record. Mailbox nodes MBX 1 and MBX 2 host the CMS; clients connect to //mail.tailspin.com/…. Network load = Replication + HUB + CAS + Heartbeats + AD Access + Client Access + AD Replication. Question raised on the slide: dedicated or non-dedicated?]


TOTAL SITE DISASTER

Demo


Demo Lab Setup


Questions & Answers


Blogcasts, Webcasts, & Whitepapers

Support Webcast Microsoft Exchange 2007 Disaster Recovery

http://support.microsoft.com/kb/937563/en-us

CCR: http://msexchangeteam.com/archive/2006/08/09/428642.aspx

SCR: http://msexchangeteam.com/archive/2007/02/23/435699.aspx

LCR: http://msexchangeteam.com/archive/2006/05/24/427788.aspx

Resources

Technical Communities, Webcasts, Blogs, Chats & User Groups: http://www.microsoft.com/communities/default.mspx

Microsoft Developer Network (MSDN) & TechNet: http://microsoft.com/msdn http://microsoft.com/technet

Trial Software and Virtual Labs: http://www.microsoft.com/technet/downloads/trials/default.mspx

Microsoft Learning and Certification: http://www.microsoft.com/learning/default.mspx

Database Portability: http://technet.microsoft.com/en-us/library/bb123954.aspx

File Share Witness for Cluster Continuous Replication: http://support.microsoft.com/kb/921181

Dial Tone Recovery Using an Alternate Server: http://technet.microsoft.com/en-us/library/bb310785.aspx


Thank you

http://www.microsoft.com/southafrica/ucs/2007