Astricon - Realities of Global Infrastructure in the Cloud

@cvonwallenstein from @DynInc

Global Infrastructure in the Cloud

Cory von Wallenstein, Chief Technology Officer, Dyn Inc.

@cvonwallenstein

http://www.flickr.com/photos/notaperfectpilot/8119088205/

“Wired people should know something about wires” - Neal Stephenson, quoted in Andrew Blum’s TED Talk “What is the Internet, Really?”

http://www.ted.com/talks/andrew_blum_what_is_the_internet_really.html






Going Global in the Cloud

• Never been easier
• Never been more affordable
• Why should or shouldn’t you?
• If so, how?


A Word on Costs and Value

• Unlikely to save you raw dollars
• Likely to spend the same or more
• But here’s what you gain:
  – Flexibility (can’t really screw this up)
  – Performance (many caveats here)
  – Reliability (if you do it right)
  – Efficiency (if your team embraces it)
• Are those worthwhile to you?


Why go from 1 to N?

Reason 1: Disaster Recovery

http://maps.google.com


http://www.cogentco.com/files/images/network/network_map/networkmap_global_large.png

Speed of light: 299,792.458 km/second

Theoretical RTT: ~40 ms

Real RTT: ~90 ms
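The theoretical figure is plain arithmetic: round-trip time is twice the one-way path length divided by the propagation speed. A minimal sketch follows. The talk does not state the path length behind its ~40 ms, so the 6,000 km used here is a hypothetical value chosen for illustration, and the 0.67 fiber velocity factor is a common rule of thumb rather than something from the slides.

```python
# Speed of light in a vacuum, as quoted on the slide.
SPEED_OF_LIGHT_KM_PER_S = 299_792.458


def theoretical_rtt_ms(path_km: float, velocity_factor: float = 1.0) -> float:
    """Best-case round-trip time in milliseconds for a one-way path of path_km.

    velocity_factor scales the vacuum speed of light. Roughly 0.67 is a
    rule-of-thumb value for optical fiber (an assumption, not from the talk).
    """
    if path_km < 0 or not 0 < velocity_factor <= 1:
        raise ValueError("path_km must be >= 0 and velocity_factor in (0, 1]")
    one_way_s = path_km / (SPEED_OF_LIGHT_KM_PER_S * velocity_factor)
    return 2 * one_way_s * 1000


if __name__ == "__main__":
    # Hypothetical 6,000 km one-way path, for illustration only.
    print(round(theoretical_rtt_ms(6000), 1))        # light in a vacuum
    print(round(theoretical_rtt_ms(6000, 0.67), 1))  # fiber rule of thumb
```

Real paths add routing detours, queuing, and equipment delay on top of this floor, which is why the measured figure is larger than the theoretical one.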


• Things don’t work as well at 90ms RTT latency as they do at 9ms RTT latency

• Where can you go to get out of the way of a disaster but not create latency headaches?

http://www.globaldatavault.com/natural-disaster-threat-maps.htm


http://www.datacenterknowledge.com/archives/2012/07/09/outages-surviving-electric-squirrels-ups-failures/

“A frying squirrel took out half of our Santa Clara data center two years back,” - Mike Christian, Yahoo


http://blog.level3.com/level-3-network/the-10-most-bizarre-and-annoying-causes-of-fiber-cuts/

“Squirrel chews account for a whopping 17% of our damages so far this year! But let me add that it is down from 28% just last year and it continues to decrease since we added cable guards to our plant.” - Fred Lawler, Level(3)

Reason 2: Get closer to users

http://www.akamai.com/html/technology/dataviz1.html

Reason 3: “Sorry, we’re full”

http://www.theregister.co.uk/2010/10/12/capgemini_merlin_data_center/

How: Figure out who and where

• Figure out what your motivations are
  – Disaster recovery
  – Get closer to users
  – Future scaling

• Take a latency inventory of your apps
  – To end users
  – To other dependencies

• Get out the maps! Fire up traceroute!
  – EC2: US East (Northern Virginia), US West (Oregon), US West (Northern California), EU (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), South America (Sao Paulo), and GovCloud.
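One rough way to start a latency inventory is to time TCP handshakes from where the app runs to each thing it talks to. The sketch below is an assumption about how that could be done, not a method from the talk; the function names and the target names in the usage note are hypothetical. A TCP connect takes about one round trip, plus name resolution when a hostname is given.

```python
import socket
import statistics
import time


def tcp_connect_ms(host: str, port: int, timeout: float = 2.0) -> float:
    """Time one TCP connect to host:port in milliseconds.

    Includes name resolution when host is a hostname rather than an IP.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000


def latency_inventory(targets, samples: int = 5):
    """Return the median connect time per named target.

    targets maps a label to a (host, port) pair. Targets that cannot be
    reached are reported as float("inf").
    """
    report = {}
    for name, (host, port) in targets.items():
        try:
            timings = [tcp_connect_ms(host, port) for _ in range(samples)]
            report[name] = statistics.median(timings)
        except OSError:
            report[name] = float("inf")
    return report
```

Usage would look like `latency_inventory({"database": ("db.example.internal", 5432)})`, where the label and hostname are placeholders. Running it from each candidate region gives a simple table of which dependencies tolerate the move.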


How: Deploy and manage w/ sanity

• Software defined datacenters
  – Fancy term for “I defined the architecture in code instead of Microsoft Visio”

• Configuration management
  – Orchestrate the cloud APIs, and the config of systems
  – Chef
  – Puppet
  – CFEngine, and more

• Huge loss if you don’t take advantage of this
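To make “architecture in code” concrete, here is a minimal sketch of the idea: the desired footprint is declared as data, and a small function diffs it against what is running. This is not Chef, Puppet, or CFEngine syntax, and it calls no cloud API. The region names, role names, and the `plan` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    region: str
    role: str
    delta: int  # positive = launch this many, negative = terminate this many


# Desired architecture, declared as data. All names are hypothetical.
DESIRED = {
    ("us-east", "web"): 4,
    ("us-west", "web"): 4,
    ("eu-west", "web"): 2,
    ("us-east", "db"): 2,
}


def plan(desired, running):
    """Diff desired vs. running instance counts and list the changes needed."""
    actions = []
    for key in sorted(set(desired) | set(running)):
        delta = desired.get(key, 0) - running.get(key, 0)
        if delta:
            region, role = key
            actions.append(Action(region, role, delta))
    return actions
```

Because the definition is data under version control, adding a region is a reviewed one-line change, and the same plan can be re-run until it comes back empty.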


How: Coordinating global traffic

• What’s the app?
  – Application agnostic, like DNS Global Server Load Balancing
    • Fancy term for “DNS servers monitor your servers and change DNS answers when events are detected”
  – Application specific, like DUNDi
    • Decentralized coordination and fault tolerance

• Avoid SPOFs like the plague
  – Keep it simple, keep it scalable
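The “monitor your servers and change DNS answers” idea can be sketched in a few lines. This is an illustration of the concept only, not how any particular GSLB product works. The function names are hypothetical and the addresses come from the reserved documentation ranges. Falling back to the full pool when every check fails is a design assumption made here, not something from the talk.

```python
import socket


def tcp_check(addr: str, port: int = 80, timeout: float = 1.0) -> bool:
    """Treat a server as healthy if a TCP connection to it succeeds."""
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return True
    except OSError:
        return False


def dns_answers(pool, is_healthy=tcp_check):
    """Return the addresses to hand out in DNS answers right now.

    Unhealthy servers are dropped. If every check fails, the full pool is
    returned, on the assumption that a broken monitor is more likely than
    every site being down at once.
    """
    pool = list(pool)
    healthy = [addr for addr in pool if is_healthy(addr)]
    return healthy or pool


# Hypothetical pool, one address per site.
POOL = ["192.0.2.10", "198.51.100.10", "203.0.113.10"]
```

The monitor itself must not become the SPOF, so in practice the checks would run from more than one vantage point.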


What can you expect?

• Flexibility
  – Deploy new servers in new locations in hours instead of weeks

• Performance
  – If horizontally scalable on commodity hardware, you win. Else, be careful.
  – If closer to users and site-to-site latency not an issue or data is distributed/eventually consistent, you win. Else, be careful.


What can you expect?

• Reliability
  – If you understand “regions” and “availability zones”, you win. Else, be careful.

http://joyent.com/blog/if-i-was-your-cloud-provider-i-d-never-let-you-down
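One practical consequence of understanding availability zones is spreading instances across them, so that losing a single zone leaves capacity standing. A minimal sketch, assuming zones fail independently. The zone names and both function names are hypothetical.

```python
def spread_across_zones(instance_count: int, zones):
    """Distribute instances across zones as evenly as possible."""
    zones = list(zones)
    if not zones:
        raise ValueError("at least one zone is required")
    base, extra = divmod(instance_count, len(zones))
    # The first `extra` zones each take one additional instance.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


def worst_case_survivors(placement) -> int:
    """Instances left running if the most heavily loaded zone is lost."""
    return sum(placement.values()) - max(placement.values())
```

For example, seven instances over three zones are placed 3/2/2 and four survive the worst single-zone loss, while seven instances in one zone leave none.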

What can you expect?

• Efficiency
  – Automation
  – More instrumentation -> reduced MTTD
  – More scalable
  – Most important: More focus on what delivers your business’s core competitive advantage.


Thank you (and we’re hiring!)

VP Technical Operations, Director of Engineering, Director of Security, Network Engineers, Software Engineers, System Engineers, System Administrators (and more!)

Reach out to me: dyn.com, [email protected], @cvonwallenstein