Managing RightScale on RightScale

February 1, 2011

Your Panel Today

Presenting

• Rafael H. Saavedra – VP, Engineering at RightScale

• Chris Horne – Director, Product Marketing at RightScale

• Douglas Johnson – Operations Manager at RightScale

Please use the questions window to ask questions any time!

Topics

• Managing RightScale on RightScale (Dev, Staging, Prod & Meta)

• RightScale Meta manages RightScale Production

• Production System Overview

• Monitoring Production – Quis Custodiet Ipsos Custodes

• Our Favorite RightScale Features

• Our Not-so-favorite Features

• Deploying RightScale – Cloud Best Practices

Managing RightScale on RightScale

[Diagram: RightScale Production, RightScale Staging, and two RightScale Development systems, shown with Customers A, B, C, and D]

RS Production is managed by RS Meta

[Diagram: RightScale Meta (Production), RightScale Staging, and two RightScale Development systems, shown with Customers A and D]

A multitude of RightScale systems

• Meta Production manages the Production system

• Meta currently lives outside the cloud containing production

• Meta is extremely secure, accessible only by a handful of operations folks

• The Production system is my.rightscale.com

• We are reaching 200 servers with a large fraction in EC2 US-East

• Servers are located in every cloud to achieve high availability

• Servers are allocated in well defined availability zones

• A few staging systems are used for integration and QA

• Ad hoc systems for performance testing, demos, betas, etc.

• Many development systems with simplified configurations

• Development systems are available at the click of a button

Significant increase in cloud usage

[Chart: cloud usage by month, November 2008 (N-08) through October 2010 (O-10)]

Some interesting RightScale numbers

• 2M servers launched by RightScale

• RightScale continuously monitors more than 70k servers

• Every day at RightScale:

    • 2,000 array resize actions are executed

    • 35,000 alert escalations are triggered

    • 20,000 escalation emails are sent to users

    • 9.0TB of monitoring data is exchanged with our servers

    • 1.6TB of logging data is sent to our servers
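To put the daily figures above into per-minute and per-second terms, a quick back-of-the-envelope conversion can help. This is only a sketch: it assumes the load is spread evenly over 24 hours and that 1 TB means 10^12 bytes, neither of which the slide states. The function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
MINUTES_PER_DAY = 24 * 60        # 1,440

# Daily figures quoted on the slide.
DAILY_COUNTS = {
    "array resize actions": 2_000,
    "alert escalations": 35_000,
    "escalation emails": 20_000,
}
DAILY_TERABYTES = {
    "monitoring data": 9.0,
    "logging data": 1.6,
}


def per_minute(count_per_day: float) -> float:
    """Average events per minute, assuming an even spread over the day."""
    return count_per_day / MINUTES_PER_DAY


def megabytes_per_second(terabytes_per_day: float) -> float:
    """Average MB/s, assuming decimal units (1 TB = 1,000,000 MB)."""
    return terabytes_per_day * 1_000_000 / SECONDS_PER_DAY


def summary_lines() -> list:
    """Build one human-readable line per daily figure."""
    lines = []
    for name, count in DAILY_COUNTS.items():
        lines.append(f"{name}: about {per_minute(count):.1f} per minute")
    for name, tb in DAILY_TERABYTES.items():
        lines.append(f"{name}: about {megabytes_per_second(tb):.1f} MB/s")
    return lines
```

Calling `print("\n".join(summary_lines()))` lists the averages. Real traffic is bursty, so peaks will be higher than these averages.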

RightScale production (simplified)

[Diagram: Front Ends, Main App, DB Master, DB Slave]

What do our users do?

• Dashboard, API, monitoring graphs & event notifications

• Most requests are monitoring updates: 85% of requests (70% of bandwidth)

• Dashboard and API calls are heavier requests; they represent 7% of requests but 26% of bandwidth

Distribution by Requests: Monitoring 85%, Notifications 8%, Dashboard 1%

Distribution by Bandwidth: Monitoring 70%, Notifications 4%, API 15%, Dashboard 11%
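The request and bandwidth shares can be cross-checked with a little arithmetic. In the sketch below, the API share of requests is not legible in the chart, so it is inferred as the remainder of 100%; that inference is an assumption, although it agrees with the stated 7% combined Dashboard and API share. The "relative weight" ratio is an illustrative measure, not one used in the presentation.

```python
# Shares read off the two charts, in percent.
REQUESTS = {"Monitoring": 85, "Notifications": 8, "Dashboard": 1}
BANDWIDTH = {"Monitoring": 70, "Notifications": 4, "API": 15, "Dashboard": 11}

# Assumption: the API slice of requests is whatever remains of 100%.
REQUESTS["API"] = 100 - sum(REQUESTS.values())


def combined_share(categories):
    """Return (request share, bandwidth share) summed over the categories."""
    requests = sum(REQUESTS[c] for c in categories)
    bandwidth = sum(BANDWIDTH[c] for c in categories)
    return requests, bandwidth


def relative_weight(category):
    """Bandwidth share divided by request share.

    A value above 1 means a request of this kind uses more bandwidth
    than the average request; below 1 means less.
    """
    return BANDWIDTH[category] / REQUESTS[category]
```

With these inputs, `combined_share(["Dashboard", "API"])` reproduces the 7% of requests and 26% of bandwidth quoted above, and `relative_weight` shows why those calls count as heavier requests: monitoring updates come out below 1, while Dashboard and API calls come out well above it.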

We eat our own dog food

• Production servers are organized into independent deployments

• Core servers: frontends, core/api servers, databases, daemons

• We use security groups extensively to isolate servers

• ServerTemplates are versioned for each major release

• This preserves the ability to launch exact configurations of past versions

Monitoring, alerts & escalations

• We monitor as much relevant data as possible and display it in insightful ways to quickly detect patterns and abnormalities

• We proactively eliminate the conditions that raise critical alerts

• No broken windows policy. No critical alerts can remain unresolved.

[Graphs: API Network Activity; Dashboard Network Activity]

How to monitor hundreds of servers?

• We leverage a monitoring data warehouse to develop heat maps & stacked graphs

Quis Custodiet Ipsos Custodes?*

• We monitor the monitoring and alerting systems

• We extensively use alerts to monitor the responsiveness of all RightScale servers

• When you have hundreds of cloud servers, you statistically see more instance failures. Instance and EBS failures can cause headaches. Be prepared to grab a new instance.

• The meta & production monitoring and alerting systems are fully decoupled from each other

* Who watches the watchmen?
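One way to picture "monitoring the monitoring" is an independent watchdog that flags any monitoring or alerting system that has stopped checking in. The sketch below is a minimal illustration of that idea, not RightScale's implementation; the `Heartbeat` type, the system names, and the 120-second threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Heartbeat:
    system: str       # hypothetical name, e.g. "production-monitoring"
    last_seen: float  # Unix timestamp of the last successful check-in


def stale_systems(
    heartbeats: Iterable[Heartbeat],
    now: float,
    max_age_seconds: float = 120.0,  # hypothetical threshold
) -> List[str]:
    """Return the names of systems whose last heartbeat is too old.

    The watchdog should run outside the systems it watches, so that a
    failure in one of them cannot also silence the watchdog.
    """
    return sorted(
        hb.system
        for hb in heartbeats
        if now - hb.last_seen > max_age_seconds
    )
```

In a decoupled setup like the one described above, a check of this kind could run on the meta side against production heartbeats and on the production side against meta heartbeats, so neither system depends on the other to report its own failure.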

Our favorite RightScale features

• RightImages – Resist the temptation to build custom images. Leverage pure, base images to avoid introducing surprises.

• Input Inheritance – Makes it easy to keep configurations in sync for dozens of servers

• ServerTemplates – Makes it very easy to reproduce configurations across production, staging and development. You have to fully automate configuration to manage a high number of servers.

• Component Library – There are always new assets (RightScripts, ServerTemplates, Macros, etc.) that can be adapted to our needs

• Monitoring – It’s easy to make collectd plugins to monitor just about anything
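As an illustration of the Monitoring point, one lightweight way to feed a custom metric to collectd is its Exec plugin, which runs a script and reads `PUTVAL` lines from the script's standard output. The sketch below is not a RightScale plugin: the `queue_depth` plugin name, the `read_queue_depth` stub, and its fixed return value are hypothetical. `COLLECTD_HOSTNAME` and `COLLECTD_INTERVAL` are the environment variables the Exec plugin sets for the script.

```python
import os
import sys
import time


def format_putval(host, plugin, type_, value, interval, type_instance=""):
    """Build one line of collectd's plain-text PUTVAL command.

    The identifier has the form host/plugin/type[-type_instance];
    "N" asks collectd to timestamp the value with the current time.
    """
    type_part = f"{type_}-{type_instance}" if type_instance else type_
    return f'PUTVAL "{host}/{plugin}/{type_part}" interval={interval} N:{value}'


def read_queue_depth():
    """Hypothetical stub: replace with a real measurement."""
    return 42


def run(iterations, out=sys.stdout, sleep=time.sleep):
    """Emit one gauge reading per interval, `iterations` times."""
    host = os.environ.get("COLLECTD_HOSTNAME", "localhost")
    # collectd may pass the interval as a decimal string such as "10.000".
    interval = int(float(os.environ.get("COLLECTD_INTERVAL", "10")))
    for _ in range(iterations):
        line = format_putval(host, "queue_depth", "gauge",
                             read_queue_depth(), interval)
        print(line, file=out, flush=True)  # flush so collectd sees it promptly
        sleep(interval)
```

A script registered with the Exec plugin would normally loop for as long as collectd keeps it running; `run` takes an iteration count only to keep the sketch easy to try by hand.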

Our not-so-favorite features

• ServerTemplate Inputs – Powerful, but too many of them make templates difficult to use. Document them well for others.

• Revision Management – Still a ways to go to make users aware of new versions and how to update

• Component Library – Finding new resources in the library is neither easy nor intuitive

• Alerts – They work pretty well but they are not easy to configure, particularly custom ones

Best practices for upgrading RightScale

• In the cloud, the cost of duplicating servers is minimal

• Avoid upgrading existing servers (a non-cloud approach).

Launch fresh ones with new software instead (fail forward).

• Old servers can take over in case something goes wrong

• Launch additional slaves to capture recovery points

    • One slave continues to replicate in case of master failure

    • Another slave is frozen at the upgrade point – can roll back by failing over

• Don’t forget to take snapshots in case of major failure

[Diagram: Front Ends, Main App, and Databases (DB Master, DB Slave)]

Upgrading RightScale Step-by-Step

1) Servers with current code

2) Servers with new code

3) Add second slave

4) Cut access to site

5) Stop all access to databases

6) Stop replication

7) Take snapshot at cutoff

8) Update schema

9) Reconnect all servers

10) Open access to site
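The ten numbered steps on this slide can be read as an ordered runbook in which each step must succeed before the next one starts. The sketch below models only that ordering; the `Step` type, the `noop` action, and `run_upgrade` are hypothetical placeholders, not RightScale tooling, and real actions would call whatever automation performs each step.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Step:
    number: int
    description: str
    action: Callable[[], bool]  # returns True on success


def noop() -> bool:
    """Placeholder action that always succeeds."""
    return True


# Step descriptions follow the slide; the actions are placeholders.
UPGRADE_STEPS = [
    Step(1, "Servers with current code", noop),
    Step(2, "Servers with new code", noop),
    Step(3, "Add second slave", noop),
    Step(4, "Cut access to site", noop),
    Step(5, "Stop all access to databases", noop),
    Step(6, "Stop replication", noop),
    Step(7, "Take snapshot at cutoff", noop),
    Step(8, "Update schema", noop),
    Step(9, "Reconnect all servers", noop),
    Step(10, "Open access to site", noop),
]


def run_upgrade(steps: Iterable[Step]) -> List[int]:
    """Run steps in numeric order and stop at the first failure.

    Returns the numbers of the steps that completed, so an operator
    can see how far the upgrade got before deciding how to recover.
    """
    completed = []
    for step in sorted(steps, key=lambda s: s.number):
        if not step.action():
            break
        completed.append(step.number)
    return completed
```

Stopping at the first failure matches the fail-forward approach described earlier: the old servers, the frozen slave, and the cutoff snapshot remain available if the upgrade cannot be completed.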

[Diagram: Front Ends, Main App, Databases (DB Master, DB Slave), Cutoff Snapshot, Servers with new code, Servers with old code]

Have a project and want to discuss how RightScale can help?

Contact sales@rightscale.com or (866) 720-0208

Ready to get started?

Sign up for our Free Edition: www.RightScale.com/Free

Call us for a VIP trial of our paid editions

Need to learn more?

TCO calculator: www.RightScale.com/tco-calculator

User Conference Videos: www.RightScale.com/conference

Webinar archive: www.RightScale.com/webinars

White papers: www.RightScale.com/whitepapers

Q&A / Getting Started

Thank You!
