Role-Based High Availability with Exchange 2007
Jim McBee
http://www.ithicos.com
Who is Jim McBee!!??
• Consultant, Writer, MCSE, MVP and MCT – Honolulu, Hawaii
• Principal clients (Dell, Microsoft, SAIC, Servco Pacific)
• Author – Exchange 2003 Advanced Administration (Sybex)
• Contributor – Exchange and Outlook Administrator
• Blog – http://mostlyexchange.blogspot.com – http://www.directory-update.com
Agenda
• High availability versus fault tolerance
• Resiliency versus high availability
• Server roles
• Providing higher availability
• Continuous replication technologies
Fault tolerance
• Designing and building a server that is resistant to failure
• All servers should be fault tolerant
• RAID disks
• ECC memory
• Redundant power supplies
• UPS systems
• Active Directory and DNS
High availability
• Components of your system that allow quicker recovery from a failure
• Examples include…
– Clustering
– Load balanced hosts
– Built-in redundancy or load balancing
– DNS / application redundancy or load balancing
Resiliency
• Solutions that allow for continuity of operations
• Recovery in the event of a serious disaster
• Not solutions that are invoked when applying a service pack or during a brief power outage
• Usually not automatic failover
• Examples include…
– Standby Continuous Replication
– Local Continuous Replication
Server roles
Roles configured at installation
• Simplify installation
– Optimize the server for the jobs it performs
– Increase availability through the most efficient and economic means
– Manage the servers more intuitively
Exchange 2007 Server Roles
By defining well-described roles, we can:
– Remove unnecessary functionality
  • Benefit: optimize server performance
– Reduce the attack surface
  • Benefit: reduced exposure in the perimeter
[Diagram: the five server roles (Edge Transport, Hub Transport, Mailbox, Client Access, Unified Messaging) placed across the perimeter network and the protected network]
Server Roles 1/5
• Edge Transport
– Must be on its own separate physical machine
– No other roles installed
– May be workgroup member or joined to an Active Directory domain
– Uses Active Directory Application Mode (ADAM) for configuration and recipient information
– Perimeter policy enforcement
– Message hygiene
  • Anti-spam
  • Transport anti-virus
• Not Required
Server Roles 2/5
• Client Access Server (CAS)
– Supports Outlook Web Access, Exchange ActiveSync, Outlook Anywhere (formerly RPC/HTTPS), POP3 and IMAP4 protocols, Autodiscover, Availability, and Web services
– At least one CAS in each Active Directory site and domain where mailbox servers exist
– Requires good network connection (low latency) to mailbox servers
– Uses RPC communication to mailbox server
Server Roles 3/5
• Hub Transport
– Handles message delivery and routing (see EX03)
– Applies policies to incoming and outgoing mail
– Can handle message hygiene functions
– Reduces cost and complexity
  • Provides more predictable routing
  • Reduces downtime
Server Roles 4/5
• Mailbox
– Responsible for serving mailbox databases and public folders
– Mailbox access through MAPI
– Possible to require MAPI encryption
– Possible to run without public folders
Server Roles 5/5
• Unified Messaging
– Placed in the protected corporate network
– Requires that Mailbox and Hub Transport roles exist
– Check with your phone vendor to see if their phone system will work with UM server
  • May require PBX gateway
Things to Consider
• Interdependencies
– Mailbox servers require the Hub Transport role for message delivery – even to the same database
– The CAS role provides OWA, ActiveSync, RPC over HTTP, the Availability Service, Autodiscover, and more
– The Edge role requires a Hub Transport server
• Fault tolerance
– Mailbox servers can only talk to Hub Transport servers in the same Active Directory site
– Mailbox servers will talk to Hubs on the same server before other Hubs in the same Active Directory site
– For proxy & re-direct scenarios CAS connects to "best" CAS
• CAS not the same as FE servers
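The Hub Transport preference described above (same Active Directory site only, with a Hub on the same server tried first) can be sketched in a few lines of Python. This is an illustration of the ordering rule only, not Exchange's actual selection logic; the `Server` record, the role strings, and `pick_hub_candidates` are hypothetical names invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Server:
    name: str
    site: str          # Active Directory site (hypothetical field)
    roles: frozenset   # role names installed on this server

def pick_hub_candidates(mailbox: Server, servers: list) -> list:
    """Return usable Hub Transport servers for a mailbox server, best first."""
    # Only Hubs in the mailbox server's own Active Directory site qualify.
    hubs = [s for s in servers
            if "HubTransport" in s.roles and s.site == mailbox.site]
    # A Hub role on the mailbox server itself sorts first (False < True).
    return sorted(hubs, key=lambda s: (s.name != mailbox.name, s.name))

servers = [
    Server("MBX1", "Honolulu", frozenset({"Mailbox", "HubTransport"})),
    Server("HUB1", "Honolulu", frozenset({"HubTransport"})),
    Server("HUB2", "Seattle", frozenset({"HubTransport"})),
]
candidates = [s.name for s in pick_hub_candidates(servers[0], servers)]
print(candidates)  # HUB2 is excluded because it is in another site
```

If the list comes back empty, the mailbox server in this sketch has no Hub Transport server in its site, which is the interdependency the slide warns about.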
High availability
Focus on Availability and Resiliency
• Improve data availability and resiliency
– Protect mailbox data from failures and corruptions
– Reduce time required to restore mailbox data
– Provide data redundancy
• Service availability
– Make mailbox data more available
– Make cluster failover less painful
– Make cluster management easier
– Support for ‘stretch’ or ‘geo-clusters’
– Allow large mailboxes inexpensively
Hub Transport Server High Availability Options
• Use redundant hardware
• Automatically load balanced and redundant with multiple Hub Transport servers
• Inbound SMTP mail
– Direct delivery to Hub Transport from Internet
– Direct delivery to Hub Transport from 3rd party SMTP system
– Load balancing
  • Third party load balancing
  • Windows Network Load Balancing (NLB)
• Server failure will result in failure of current connections
• May result in some data loss for any messages in the Hub Transport Server queue database
Client Access Server High Availability Options
• Redundant hardware
– Windows NLB or third party load balancing
– Round robin DNS (not the best solution)
• Server failure will result in current connections being lost
– User may need to re-establish connection
Unified Messaging Server High Availability Options
• Redundant hardware
– Windows NLB or third party load balancing
– Round robin DNS
• PBX or Gateway redundancy
– Some PBXs may have load balancing options for multiple UM servers
• Server failure will result in the loss of any current connections or call transfers in progress
Mailbox Server High Availability and Resiliency Options
• Resiliency and recoverability
– Local continuous replication (LCR)
– Standby continuous replication (SCR)
  • Requires Exchange 2007 SP1
• High availability
– Cluster continuous replication (CCR)
– Single copy clusters (SCC)
• CCR and SCC require dedicated servers
– No other roles can exist on a clustered node except Mailbox
– Other roles must be on their own hardware
• Changes to transaction log files
– 1MB in size
– Log file is completely written after 15 minutes
– Checkpoint depth is still 20MB / Storage Group
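The co-location rules stated in this deck (a clustered CCR or SCC node holds only the Mailbox role, and Edge Transport shares its machine with no other role) lend themselves to a simple pre-deployment check. The Python sketch below is a hedged illustration: `role_conflicts` and the role strings are hypothetical names, and it covers only the two rules named on the slides, not every Exchange 2007 setup constraint.

```python
def role_conflicts(roles: set, clustered: bool = False) -> list:
    """Return the rule violations for a proposed set of roles on one server."""
    problems = []
    # Edge Transport must be the only role on its machine.
    if "EdgeTransport" in roles and len(roles) > 1:
        problems.append("Edge Transport cannot share a server with other roles")
    # A clustered node (CCR or SCC) may hold the Mailbox role and nothing else.
    if clustered and roles != {"Mailbox"}:
        problems.append("A clustered node may hold only the Mailbox role")
    return problems

# A typical non-clustered multi-role server passes both rules.
print(role_conflicts({"Mailbox", "HubTransport", "ClientAccess"}))
# Adding Hub Transport to a clustered mailbox node is flagged.
print(role_conflicts({"Mailbox", "HubTransport"}, clustered=True))
```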
Single Copy Clusters
• Requires Microsoft Cluster Services
• Benefits
– Improved Exchange Cluster setup
– Traditional clustering used today
– Failovers use the same data copy
• Exchange Virtual Server = Clustered Mailbox Server
• 2 to 8 node Active / Passive clusters
SCC Caveats
• Requires expensive hardware with shared storage
• Can be complicated for admins to learn
• Doesn’t protect from storage/data issues
• Servers must be on same IP subnet
– Data redundancy provided through partners
• Hardware must be in the Windows Server Catalog
Local Continuous Replication
• Additional copy of the logs and database
– On the same server
– On a different volume
• Benefits
– Easy configuration
– Single datacenter
– Doesn’t require expensive hardware
– Online backups
– Very quick restoration of service
• Caveats
– Adds additional CPU/memory/disk overhead
– Initial seeding required
– Manual activation
– Additional storage requirements
– One database per storage group
[Diagram: Local Continuous Replication. With LCR enabled, logs in D:\SG1\Logs (E00.log, E0000000011.log, E0000000012.log) are copied to D:\SG1\Copy\Logs and verified, then played to advance the database copy.]
Local Continuous Replication Tips
• One database per storage group
• Plan for additional hardware resources
– Minimum 20% additional CPU overhead
– Additional 1GB of RAM
– Will more than double IOPS requirements
• Maximum database size approximately 200GB
• Separate storage into LUNs
– Do not break LUNs into separate partitions
– Put each database on a separate LUN
– Isolate active and passive LUNs
• Use battery backed up storage controllers
– Configure caching controllers for 75% write / 25% read
• LCR activation is manual
– Use Restore-StorageGroupCopy cmdlet
– Use backup copy “in place” or move it
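The hardware planning figures above (20% more CPU, 1GB more RAM, more than double the IOPS) can be turned into a quick back-of-the-envelope calculation. The Python sketch below is only that: `lcr_resource_plan` is a hypothetical helper, the 20% is assumed to be relative to current CPU utilisation, and the doubled IOPS figure is a floor rather than a target.

```python
def lcr_resource_plan(cpu_percent: float, ram_gb: float, iops: float) -> dict:
    """Apply the slide's rule-of-thumb overheads to a baseline without LCR."""
    return {
        # Assumption: "20% additional CPU" is relative to current utilisation.
        "cpu_percent": cpu_percent * 1.20,
        # Additional 1 GB of RAM on top of the current sizing.
        "ram_gb": ram_gb + 1.0,
        # "More than double" IOPS: treat 2x as the minimum to plan for.
        "min_iops": iops * 2.0,
    }

plan = lcr_resource_plan(cpu_percent=40.0, ram_gb=8.0, iops=1500.0)
for name, value in plan.items():
    print(f"{name}: {value:.1f}")
```

Measured numbers from a pilot server should replace these estimates before any hardware is ordered.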
Local continuous replication
Clustered Continuous Replication
• Benefits
– Potentially no single point of failure
– Two copies of the data on separate servers
– No need for shared / SAN storage.
– Full redundancy with automatic recovery
– Backup mailboxes without disturbing production
– Doesn’t require validation for clustered configuration
[Diagram: two CCR nodes, each with its own database (DB) and logs, plus a file share witness; see KB 921181]
CCR Advantages
• No single point of failure
• Fast recovery
• Simplified hardware and storage requirements
• Simplified deployment
• Out-of-the-box replication solution
• Can “stretch” the cluster to a second data center
• Ability to offload VSS-based backups to passive node
• Can integrate with SCR
CCR Caveats
• Requires Microsoft Cluster Services
– Majority Node Set cluster
– Requires a third “voting” node - uses a shared folder
• Two-node, Active/Passive only
• Backup:
– Streaming backup against production storage groups
– VSS backup against production and replica storage groups
• Limit of one database per storage group
• Can be used for PF database if it is the only PF database in the organization
• Initial database seeding required
• Servers must be on same IP subnet
• Transaction logs pulled over SMB shares
• Some scenarios require log validation, replay
• Database failure does not cause failover
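The reason a two-node CCR cluster needs a third "voting" participant follows from majority arithmetic: with only two votes, losing either node leaves one of two, which is not a majority. A minimal Python sketch of that rule, assuming three equal votes (two nodes plus the shared-folder witness); `has_quorum` is a hypothetical name and the real cluster service does considerably more than count votes.

```python
def has_quorum(votes_online: int, total_votes: int = 3) -> bool:
    """Majority rule: more than half of all votes must be reachable."""
    return votes_online > total_votes // 2

# Two nodes plus the witness share: three votes in total.
print(has_quorum(3))  # everything up
print(has_quorum(2))  # one node lost, survivor still sees the witness
print(has_quorum(1))  # survivor cannot see the witness either

# Without a witness there are only two votes, so one survivor is not enough.
print(has_quorum(1, total_votes=2))
```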
Standby Continuous Replication
• Coming in Service Pack 1
• Source and target machines can be
– Stand-alone
– In two different MSCS clusters
– On different subnets
• Controlled per storage group
• Many-to-one and one-to-many supported
• Manually activated
[Diagram: replication to a standby server; the source and the standby target each hold a database (DB) and logs]
LCR versus CCR versus SCR
• LCR
– Focused towards resiliency
– Improve restore time
– Administrator has to initiate restore manually
– Single data-center solution
– Implements log shipping and replay out of the box
  • Log files are copied locally and replayed
• CCR
– Targeted towards site resiliency
– Automatic failovers
– Single or two-data center solution
– Supports “stretch” option
– Implements log shipping and replay out of the box
  • Log files are copied to remote server and replayed
– Simplifies cluster deployment
  • No SAN or shared storage
• SCR
– Provides site and server resiliency
– “Cold spare” approach cuts hardware costs
– Can be combined with LCR, CCR, and SCC for maximum flexibility
Continuous Replication Basics
• Exchange store runs normally
• Replication service keeps a copy of the database up-to-date
– Copies, inspects, and replays log files
• In CCR, Cluster service provides failover
– Move network identity (client transparency)
• LCR activation is manual
– Restore-StorageGroupCopy task
Continuous Replication Basics
• A ‘pull’ model
• Exchange server creates log files normally
• Log files are copied by Replication service
– Exxnnnnnnnn.log files copied as they appear
– Exx.log is copied for handoff/failover
– If it can’t be copied, the loss setting (AutoDatabaseMountDial) is consulted
  • Lossless (0 logs lost)
  • GoodAvailability (3 logs lost)
  • BestAvailability (6 logs lost – default setting)
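The AutoDatabaseMountDial thresholds listed above amount to a single comparison at failover time. The Python sketch below uses the three values from the slide; `can_auto_mount` is a hypothetical name, and treating the threshold as inclusive (exactly 6 missing logs still mounts under BestAvailability) is an assumption of this example rather than something the slide states.

```python
# Maximum number of lost logs each dial setting tolerates (values from the slide).
LOGS_LOST_ALLOWED = {
    "Lossless": 0,
    "GoodAvailability": 3,
    "BestAvailability": 6,  # default setting
}

def can_auto_mount(logs_missing: int, dial: str = "BestAvailability") -> bool:
    """Decide whether the passive copy may mount automatically after a failover."""
    # Assumption: the threshold is inclusive.
    return logs_missing <= LOGS_LOST_ALLOWED[dial]

for dial in LOGS_LOST_ALLOWED:
    print(dial, can_auto_mount(4, dial))
```

With four logs missing, only BestAvailability allows the mount in this sketch; under the other two settings an administrator would have to step in.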
[Diagram: Continuous Replication. The Replication service copies logs from the source log directory to the inspector directory and then to the target log directory, where they are replayed into the database copy. Markers shown: LastLogCopyNotified, LastLogCopied, LastLogInspected, LastLogReplayed.]
Continuous Replication Monitoring
• LastLogCopyNotified
– Last generation seen in the source directory
• LastLogCopied
– Last generation copied to Inspector directory by Replication service
• LastLogInspected
– Last generation inspected
– Moved to log file directory
• LastLogReplayed
– Last generation replayed into the database copy
• Available through Performance Monitor
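Because the four counters are log generation numbers, subtracting neighbouring pairs shows where logs are piling up in the pipeline. The Python sketch below illustrates that arithmetic; `replication_queues` and the key names are hypothetical, the sample generation numbers are made up, and the queue figures Exchange itself reports may be defined differently.

```python
def replication_queues(copy_notified: int, copied: int,
                       inspected: int, replayed: int) -> dict:
    """Derive per-stage backlogs from the four log generation counters."""
    return {
        "waiting_to_copy": copy_notified - copied,      # seen at source, not yet copied
        "waiting_for_inspection": copied - inspected,   # copied, not yet inspected
        "waiting_for_replay": inspected - replayed,     # inspected, not yet replayed
    }

# Made-up sample values for illustration only.
queues = replication_queues(copy_notified=120, copied=118,
                            inspected=118, replayed=110)
print(queues)
```

A growing copy backlog points at the network or the source server, while a growing replay backlog points at disk I/O on the copy.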
Divergence
• When the copy has information not in the original it is diverged
• Divergence may be in database or log files
• Lossy failover will produce a divergence
• ‘Split-brain’ on a cluster also causes divergence
• Even if clients can’t connect, background maintenance still modifies the database
• Administrator error can cause divergence!
– e.g. running eseutil /r
Recovering from Divergence
• Re-seed will always work
– Expensive for large databases
• Look at the common case
– Lossy failover
– Only a few log files are lost
• Built-in solutions
– Decreased log file size to reduce data loss
– Lost Log Resilience (LLR)
Transport Dumpster
• Feature built into the Hub Transport server role
• Runs to redeliver mail to CMSs in its site
• Uses the creation time of the last log file copied
• CCR only in RTM
• Use Set-TransportConfig to change default settings (setting is organization-wide)
– Set MaxDumpsterSizePerStorageGroup to be 1.5 times the size of the maximum message that can be sent (default value is 18MB)
– Recommend MaxDumpsterTime be 7.00:00:00, which is seven days (default value)
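The dumpster sizing guidance (1.5 times the largest message that can be sent) is simple arithmetic, shown here as a small Python helper. `recommended_dumpster_mb` is a hypothetical name and the 10MB message limit is an example value, not a figure from the presentation.

```python
def recommended_dumpster_mb(max_message_mb: float) -> float:
    """Apply the 1.5x rule to the organization's maximum message size."""
    return 1.5 * max_message_mb

# Example: an organization that allows messages of up to 10 MB.
size_mb = recommended_dumpster_mb(10)
print(f"MaxDumpsterSizePerStorageGroup: {size_mb:g}MB")
```

The result is the value to supply for MaxDumpsterSizePerStorageGroup through Set-TransportConfig, and it applies organization-wide.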
Backups from Passive Database
• Backing up the passive moves the performance hit off the active
• Backup the active or the passive?
– Remember, they can change designations
• Passive backup is VSS only
– Data Protection Manager v2
• Active backup can be VSS or streaming ESE
Questions?
Thanks for attending!
Book giveaway and e-mail notice
• Please give me a piece of paper with your name for drawing
• Include your e-mail address or give me a business card if you want:
– 20% discount code for Directory Update software
– Notification e-mail when Mastering Exchange Server 2007 is available