Infrastructure Reliability
Common Systems Group Experience @ UW Madison
Roger Hanson, 5 Jan 2005
Overview
• Basics – Redundant Hardware
• Test Environments
• Change Management
• Version control
• Testing processes
• Collaboration
• Service Management
Background
• MyUW Portal
• WiscMail campus Mail Service
• In Production in 2001
• New Complex Environments
– Layer 4 Switching
– Directory Enabled Systems
– ES Storage Area Networks
• Campus Portal
• Access to over 130 modules
• 1.8M logins in Sept. 2004
• 49K+ unique logins in Sept. 2004
Hardware - Portal
[Diagram: portal hardware]
• Layer 4 Switch 1 and Layer 4 Switch 2
• Apache Web Server 1 and 2 (Dell), each with the WLS Plugin
• WLS Node 1 and 2 (Sun E280), each with My UWMadison and EFS
• WLS Administrative Server
• Oracle Application Database (Sun E450)
• Campus Mail system
• Nearly 90K accounts
• Daily Message Peak over 3M messages
• Service objective
– Never down
– Message delivery in less than 2 minutes
Hardware - Email
[Diagram: WiscMail Service Design, Spring 2004 (Steve Kohlbeck, 12/16/04)]
• Clients: WiscMail clients and WiscNet clients (POP, IMAP, Web)
• Layer 4 switches: public side and private side
• SPAM/Filtering Cluster: SPAM-1 through SPAM-6
• SMTP/AV Cluster: SMTP1 through SMTP5 (SMTP and SMTP_AUTH)
• Multiplexor Cluster: MMP1, MMP2
• LDAP Cluster: LDAP1 through LDAP4
• Store Cluster: MAILST1 through MAILST4
• Lists: lists.services.wisc.edu (kodos, kang)
• Mail flows: 1. SMTP Inbound (MX); 2. SMTP Outbound; 3. SMTP Inbound (POP/IMAP Client); 4. SMTP Inbound (WebMail); 5. SMTP Post Filtering Loop; 6. SMTP Post Lists AV Scan
Basics – Redundant Hardware
• Clustered Server Environment
• Spares (Hot/Warm/Cold)
• Automated Load Balancing
• Automated failover
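As a rough illustration of how automated load balancing and failover fit together, the sketch below rotates requests across a small cluster and skips any node marked unhealthy. It is a minimal Python sketch, not the Layer 4 switch behavior used at UW; the class names and the node names `node1` and `node2` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """One server in the cluster (names here are hypothetical)."""
    name: str
    healthy: bool = True


@dataclass
class RoundRobinBalancer:
    """Round-robin over backends, skipping any marked unhealthy."""
    backends: list
    _next: int = 0

    def pick(self) -> Backend:
        # Try each backend at most once, starting at the rotation pointer.
        for offset in range(len(self.backends)):
            index = (self._next + offset) % len(self.backends)
            candidate = self.backends[index]
            if candidate.healthy:
                self._next = (index + 1) % len(self.backends)
                return candidate
        raise RuntimeError("no healthy backend available")


nodes = [Backend("node1"), Backend("node2")]
balancer = RoundRobinBalancer(nodes)

# While both nodes are up, requests alternate between them.
first = [balancer.pick().name for _ in range(4)]

# Simulate a node failure: traffic fails over to the surviving node.
nodes[0].healthy = False
after_failure = [balancer.pick().name for _ in range(3)]
```

In a real deployment the `healthy` flag would be set by periodic health checks rather than by hand, and a hot spare would simply be another backend that stays in the rotation.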
Test Environments
• Test Cycle
– Test
– Development
– QA
– Production
• QA (also called Integrated Test Environment)
Change Management
• Use of Change Information System
– Tracking
– Notification
• Use of Code Migration Request process
– Files promoted
– Configuration steps
– Test process
– Backout plans
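The four parts of a Code Migration Request listed above can be modeled as a record that is checked for completeness before promotion. This is a hypothetical Python sketch: the class, field names, and the rule that every section must be filled in are assumptions for illustration, not a description of the actual UW process or tooling.

```python
from dataclasses import dataclass, field


@dataclass
class CodeMigrationRequest:
    """Hypothetical record; the four sections mirror the slide's sub-bullets."""
    summary: str
    files_promoted: list = field(default_factory=list)
    configuration_steps: list = field(default_factory=list)
    test_process: str = ""
    backout_plan: str = ""

    def missing_sections(self) -> list:
        """Return the names of sections that are still empty."""
        sections = {
            "files_promoted": self.files_promoted,
            "configuration_steps": self.configuration_steps,
            "test_process": self.test_process,
            "backout_plan": self.backout_plan,
        }
        return [name for name, value in sections.items() if not value]

    def ready_for_promotion(self) -> bool:
        # Assumption: a request is promotable only when nothing is missing.
        return not self.missing_sections()


# Example values are made up for illustration.
draft = CodeMigrationRequest(
    summary="Example change",
    files_promoted=["app/example.jar"],
    test_process="Run the written test plan in QA",
)
print(draft.missing_sections())
print(draft.ready_for_promotion())
```

Requiring a backout plan before promotion is the main design point: the request is rejected while it is still cheap to fix, not after the change reaches production.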
Version Control
• Use CVS
– http://www.gnu.org/software/cvs/
– Develop in private or shared environments
– Code is published into repository
– Code is then copied to environment (dev, test, qa, and prod)
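The slide does not say how code is copied from the repository to each environment. One plausible way, assuming a tagged release and the standard `cvs export` command, is sketched below in Python. The repository path, module name, and tag are hypothetical, and the function only builds the command line; it does not run it.

```python
ENVIRONMENTS = ("dev", "test", "qa", "prod")  # environment names from the slide


def cvs_export_command(cvsroot: str, module: str, tag: str, environment: str) -> list:
    """Build a `cvs export` command that copies one tagged release.

    The target directory name (module-environment) is an assumption.
    """
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment}")
    target_dir = f"{module}-{environment}"
    # Global option -d sets CVSROOT; export's own -r and -d select the
    # tag and the output directory.
    return ["cvs", "-d", cvsroot, "export", "-r", tag, "-d", target_dir, module]


# Hypothetical repository path, module, and tag.
command = cvs_export_command("/var/cvsroot", "portal", "REL_1_0", "qa")
print(" ".join(command))
```

Exporting by tag, rather than copying a developer's working directory, means every environment receives exactly the files that were published into the repository. The list can be passed to `subprocess.run(command, check=True)` to execute it.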
Testing Process
• Unit testing
• Integrated Testing (QA)
• Log analysis from testing
• Written test plans
• Load Tests
• Testing tools (Empirix)
• System Monitoring (Wiley Introscope)
Collaboration
• Wiki
• Document Repository/Sharing
• Email Lists
• IM
Service Management
• Major direction at UW to improve reliability
• CIO asking for five nines (99.999% availability) on key systems
• Consulting assistance
• Manage the service not the servers
• Adopt customer’s perspective
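To put the five-nines request in concrete terms, the availability target can be converted into a yearly downtime budget. The Python sketch below assumes a 365-day year and counts all downtime, planned or unplanned, against the budget; both are simplifying assumptions.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; ignores leap years


def allowed_downtime_minutes(nines: int) -> float:
    """Yearly downtime budget for an availability target of N nines."""
    unavailability = 10 ** -nines  # five nines leaves 0.00001 of the year
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4, 5):
    print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes per year")
```

Five nines leaves roughly 5.26 minutes of downtime per year, compared with about 525.6 minutes (under nine hours) for three nines.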
Service Management
• Models
– Information Technology Infrastructure Library (ITIL)
– Based on processes from the British Central Computer and Telecommunications Agency (CCTA)
– Service Support processes
• Incident management
• Problem management
• Change management
• Release management
• Configuration management
Service Management
• Models
– Microsoft Operations Framework
• Combines ITIL processes with recommendations for technical processes
• http://www.microsoft.com/mof
Next steps
• Define service level objectives for key services
• Determine how to measure service reliability
• Engage Data Center staff
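One way to measure a service level objective from the customer's side is to sample the service and report the fraction of samples that meet the objective. The Python sketch below uses the WiscMail objective of message delivery in less than 2 minutes; the function name and the sample delivery times are made up for illustration and are not measurements.

```python
def fraction_within_objective(delivery_seconds, objective_seconds=120.0):
    """Fraction of sampled deliveries that finished inside the objective."""
    samples = list(delivery_seconds)
    if not samples:
        raise ValueError("no samples to measure")
    met = sum(1 for seconds in samples if seconds < objective_seconds)
    return met / len(samples)


# Hypothetical delivery times, in seconds, for five test messages.
samples = [4.0, 12.5, 30.0, 119.0, 240.0]
compliance = fraction_within_objective(samples)
print(f"{compliance:.0%} of sampled messages met the 2-minute objective")
```

Measuring end-to-end delivery time, rather than the uptime of individual servers, matches the idea of managing the service and not the servers.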
Observations
• Infrastructure complexity
– Teams of specialists
• Funding for environments
• Staffing
• Process costs
Questions
Roger Hanson, Internet Infrastructure Applications