Maintaining Large Vista Installations
Amy Edwards, Ezra Freelove, & George Hernandez
July 12, 2007
2
Agenda
• Comparisons
• Who is USG
• Automation
• Monitoring
• Maintenance
• More Tricks
• Questions?
3
Informal Poll - Number of nodes
• (All prod clusters) now:
– 1-10
– 11-20
– 21-50
– 50-70
– 70+
• Ours in bold
• (All prod clusters) by December:
– 1-10
– 11-20
– 21-50
– 50-70
– 70+
4
Informal Poll – Number of DB Instances
Including secondary and non-production
• 1-2
• 3-6
• 7-10
• 10+
• Ours in bold
5
Vista Architecture
6
GeorgiaVIEW Project
• University System of Georgia (USG)
• Vista 3.0.7
• Host 32 institutions & multiple consortial programs
• >150,000 active students
– Active is 100+ actions
• >11,000 active sections / term
7
Issues
• Handling performance issues
• Capacity planning
• Upgrades
• Replication
• JMS sensitivity
• Integration
8
Automation
• Rolling Restarts
– Managed nodes restarted weekly, except JMS
• Log cleanup to preserve space
• Error reporting
– application, tracking, vulnerabilities
• Thread dumps
• Sync admin node with backup
• LDIS batch integration
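A rolling restart of the kind listed above can be sketched in Python. This is a minimal illustration, not the presenters' script: the `restart` and `is_healthy` callables, the node names, and the polling parameters are all assumptions supplied by the caller.

```python
import time
from typing import Callable, Iterable, List


def restart_order(nodes: Iterable[str], jms_nodes: Iterable[str]) -> List[str]:
    """Return the managed nodes to restart, skipping every JMS node."""
    excluded = set(jms_nodes)
    return [node for node in nodes if node not in excluded]


def rolling_restart(
    nodes: Iterable[str],
    jms_nodes: Iterable[str],
    restart: Callable[[str], None],      # hypothetical: restarts one managed node
    is_healthy: Callable[[str], bool],   # hypothetical: True once the node serves again
    poll_seconds: float = 30.0,
    max_polls: int = 20,
) -> List[str]:
    """Restart nodes one at a time and halt at the first node that does not recover."""
    restarted: List[str] = []
    for node in restart_order(nodes, jms_nodes):
        restart(node)
        for _ in range(max_polls):
            if is_healthy(node):
                break
            time.sleep(poll_seconds)
        else:
            # Never take down another node while one is still unhealthy.
            raise RuntimeError(f"{node} did not come back; halting the rolling restart")
        restarted.append(node)
    return restarted
```

Restarting one node at a time and waiting for it to recover keeps the rest of the cluster serving users; excluding the JMS node matches the exception noted on the slide.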
9
Monitoring
• Nagios
– http://www.nagios.org/
– Sends alerts
• Stats
– Custom AJAX web app
– Watch changes over time
• AWStats
– http://www.awstats.org/
10
Nagios Example
11
Nagios Monitors
• OS / Hardware
– Load
– Temperature
– Free space
• Database
– Tablespace free space
– Listener
– Oracle processes
• Application
– Direct-login
– Weblogic processes
– Java MBeans:
• Default/Primary Pending Requests Current Count
• Java Heap Current
• JDBC Waiting for Connection Current Count
• Multicast Messages Lost
• Primary count
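As an illustration of how one of these monitors might look, here is a minimal free-space check written in the style of a Nagios plugin. It relies on the standard plugin convention that the exit status carries the result (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The thresholds and function names are assumptions, not the checks USG actually runs.

```python
import shutil
from typing import Tuple

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify_free_space(free_pct: float, warn_pct: float = 20.0, crit_pct: float = 10.0) -> int:
    """Map a free-space percentage to a status code (thresholds are illustrative)."""
    if free_pct < crit_pct:
        return CRITICAL
    if free_pct < warn_pct:
        return WARNING
    return OK


def check_path(path: str) -> Tuple[int, str]:
    """Return (status code, one-line message) for the filesystem holding `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - no size reported for {path}"
    free_pct = 100.0 * usage.free / usage.total
    status = classify_free_space(free_pct)
    return status, f"DISK {LABELS[status]} - {free_pct:.1f}% free on {path}"
```

A wrapper script would print the message and call `sys.exit` with the status code so that Nagios can raise the alert.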
12
Stats
• Short and long term analysis
– 21 months of data
• Graphs all Nagios data collected
• Flexible creation of reports
• Built with AJAX
13
Stats Examples I of III
14
Stats Examples II of III
15
Stats Examples III of III
16
AWStats
• Records data from web server logs
• Custom script grabs data from webserver.log files
• Runs daily
17
AWStats Examples I of II
18
AWStats Examples II of II
19
Specialized Nodes
• Admin
• JMS
• Institutional Admin
– Integration
• Chat
20
JMS Node
• Provides special services
– Mail, LC creation, chat
• Failure or migration of JMS node hinders usage
• Services do not migrate well
– Allow targeted migration
– OTHERS: Pin JMS to a specific node
21
Integration
• Batched LDIS data files
• Cron runs nightly
• Files broken up by:
– type
– “reasonable” number of records
• Done on Inst node
– Issues with import can kill node
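Breaking a nightly batch up by type and by a capped record count could look roughly like the sketch below. The `(record_type, payload)` tuple shape, the type names in the test data, and the `max_records` limit are hypothetical; the slides do not say what a "reasonable" number is or how the LDIS files are laid out.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (record_type, payload); an assumed shape for illustration


def split_batches(records: Iterable[Record], max_records: int) -> Iterator[Tuple[str, List[str]]]:
    """Group records by type, then yield chunks of at most `max_records` per type."""
    if max_records < 1:
        raise ValueError("max_records must be at least 1")
    by_type: Dict[str, List[str]] = defaultdict(list)
    for record_type, payload in records:
        by_type[record_type].append(payload)
    # Dicts keep insertion order, so types come out in first-seen order.
    for record_type, payloads in by_type.items():
        for start in range(0, len(payloads), max_records):
            yield record_type, payloads[start:start + max_records]
```

Keeping each file small limits how much work a single import does, which matters when a failed import can take down the node running it.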
22
Touching Nodes
• ssh & dsh
– Touch groups of nodes at once
– Useful for:
• Installs
• Gathering logs
• Locating a session
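The idea of touching a group of nodes at once can be approximated in Python by fanning one command out over ssh. This is a sketch under assumptions, not a replacement for dsh: the host names are hypothetical, key-based login is assumed to be set up already, and the worker count and timeout are arbitrary.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Sequence


def ssh_command(host: str, remote_command: str) -> List[str]:
    """Build the argv for one ssh call; BatchMode makes ssh fail rather than prompt."""
    return ["ssh", "-o", "BatchMode=yes", host, remote_command]


def run_on_nodes(
    hosts: Sequence[str],
    remote_command: str,
    timeout: float = 60.0,
) -> Dict[str, subprocess.CompletedProcess]:
    """Run the same command on every host in parallel and collect the results per host."""

    def run(host: str) -> subprocess.CompletedProcess:
        return subprocess.run(
            ssh_command(host, remote_command),
            capture_output=True,
            text=True,
            timeout=timeout,
        )

    with ThreadPoolExecutor(max_workers=8) as pool:
        return dict(zip(hosts, pool.map(run, hosts)))
```

For example, `run_on_nodes(["node01", "node02"], "uptime")` returns each node's output keyed by host, which covers the log-gathering and session-locating cases with a suitable remote command.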
23
Maintenance Page
• Hosted on opposite f5
• Two versions
– Scheduled maintenance
– Unscheduled outage
• In an f5 outage, move DNS to other f5 so message still appears
24
Installs and Upgrades
• Silent install scripts
• Test in both development environments
– Create against a small database
– Get results of time to complete against a full size copy of production
• Install to production
25
Powerlinks and Custom Development
• Test in development
• Try to break
• Pilot in production
• Release to all
26
Questions?
27
Want More?
• To view my resources and references for this presentation, visit www.scholar.com
• Simply click “Advanced Search” and search by ezrafreelove and tag: ‘bbworld07’
28
Contact Information
• Ezra Freelove [email protected]
• Amy Edwards [email protected]
• George Hernandez [email protected]