Agility Through Infrastructure Automation
George Miranda
Austin Agile Conference 2012
November 16, 2012
Friday, November 16, 12
Introductions
# finger $(whoami)
Login: gmiranda                 Name: George Miranda
Directory: /home/gmiranda       Shell: /bin/bash
On since Mon 14 Apr 1997 18:01 (GMT) on tty1 from :0
No mail on [email protected]:
twitter:    gmiranda23
github:     gmiranda23
irc:        gmiranda23 (irc.freenode.net - #chef)
community:  gmiranda23 (community.opscode.com)
role:       consultant, evangelist, trainer, *:*
Scope
Automation + Culture = Agility
• Infrastructure Automation Approaches
• Infrastructure & Automation Best Practices
• Cultural Pitfalls
• Making more awesome
What this talk is not
• Chef vs. Puppet
• Cloud All The Things!!!
• How to structure your Organization
• Which Development Model to adopt
System Build Approaches
http://www.flickr.com/photos/dancedaoc/3083836988/sizes/z/in/photostream/
Complications
• “That one host” you know you can’t rebuild
• Untracked configuration change
• Collections of Bash, Perl, Python, ???
• Rebuild from: wiki, cheatsheets, folklore
http://www.flickr.com/photos/humblog/4996661110/sizes/l/in/photostream/
Unprecedented Growth
[Chart: Virtual Nodes vs. Physical Hardware, 1980 to 2015. Eras: 1980 Mainframe, 1990 Client/Server, 2000 Datacenter, 2010+ Cloud]
The things that got us here…
…must change to get us here!
The Rise of Configuration Management
http://www.flickr.com/photos/24375810@N06/6611017007
We have a problem at scale
Here’s a hint
Cabling?
Close...
http://www.flickr.com/photos/michaelheiss/3090102907/
Complexity
Infrastructure
Items we manipulate
• Routes
• Users
• Groups
• Tasks
• Packages
• Software
• Services
• Nodes
• Networking
• Files
• Directories
• Symlinks
• Mounts
• Ruby Gems
• Python Modules
• Java Artifacts
• Disks
• Volumes
• Filesystems
• Firewall Rules
See Node
Application
See Nodes
Application
Application Database
See Nodes Grow
Application
Application Databases
Grow Nodes
App Servers
Application Databases
...Grow
Load Balancer
Application Databases
App Servers
Grow Nodes, Grow
Load Balancers
Application Databases
App Servers
Grow Nodes, Grow
Load Balancers
App DB Cache
App Servers
Application Databases
Infrastructure has a Topology
Load Balancers
App DB Cache
App Servers
Application Databases
Infrastructure IS a Snowflake
Load Balancers
App DB Cache
App Servers
Application Databases
Floating IP?
Complexity Increases Quickly
App LBs
App Servers
NoSQL
DB slaves
Cache
DB Cache
DBs
... Increases Very Quickly
DC1 DC3
DC2
Configuration Management
http://www.flickr.com/photos/philliecasablanca/3354734116/
Sysadmins
The Past
Manual Configuration
• Labor intensive
• Error prone
• Hard to reproduce
• Unsustainable
http://www.flickr.com/photos/pureimaginations/4805330106/
Scripting
• Typically very brittle
• Throwaway, one-off scripts
• grep, sed, awk, perl
• curl | bash
http://www.flickr.com/photos/40389360@N00/2428706650/
File Distribution
• NFS mounts
• rdist
• scp-on-a-for-loop
• rsync on cron
http://www.flickr.com/photos/walkadog/4317655660
This used to be awesome.
for i in `cat servers.txt` ; do scp ntp.conf root@$i:/etc/ntpd.conf ; done
for i in `cat servers.txt` ; do ssh root@$i /etc/init.d/ntpd restart ; done
for i in `cat servers.txt` ; do ssh root@$i chkconfig ntpd on ; done
• ^ does not scale
http://www.flickr.com/photos/alexerde/3479006495
Execution Management
• Cluster SSH
• ISConf
• Golden Images
Typical Boring Infrastructure
Jboss App
Memcache
Postgres Slaves
Postgres Master
Nagios
Graphite
• Move SSH off port 22
• Let’s put it on 2022
• edit /etc/ssh/sshd_config
[Diagram callouts numbered 1 to 6]
Maintenance Window
Jboss App
Memcache
Postgres Slaves
Postgres Master
Nagios
Graphite
[Diagram callouts numbered 1 to 12]
• Launch, Delete
• Repeat
• Typically manually
• Don’t break anything!
• Bob just got fired =(
Different IP Addresses?
Jboss App
Memcache
Postgres Slaves
Postgres Master
Nagios
Graphite
• Invalid configs!
Systems Integration
• Keep a list of current resources
• Collect vast amounts of data on those resources
• Quickly search through stacks of current resource data
• Generate your Infrastructure Topology from a current source of truth
http://www.flickr.com/photos/fotos_medem/3399096196/
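A minimal sketch of the last bullet, assuming a hypothetical plain-text inventory (one `name role ip` triple per line) stands in for the source of truth. The function name `render_backends`, the inventory format, port 8080, and the HAProxy-style `server` lines are all illustrative assumptions, not something shown in the talk.

```shell
# render_backends INVENTORY ROLE
# Print one "server NAME IP:8080" line for every node in INVENTORY whose
# role matches ROLE. The inventory format (name role ip) and the port
# number are assumptions made for this sketch.
render_backends() {
  local inventory="$1" role="$2"
  local name node_role ip
  # The "|| [ -n ... ]" keeps a final line that lacks a trailing newline.
  while read -r name node_role ip || [ -n "$name" ]; do
    # Skip blank lines and comments.
    case "$name" in ''|'#'*) continue ;; esac
    if [ "$node_role" = "$role" ]; then
      printf 'server %s %s:8080\n' "$name" "$ip"
    fi
  done < "$inventory"
}
```

Regenerating the load balancer section from the inventory on every run, for example `render_backends nodes.txt app > backends.cfg`, means that launching or deleting an app server changes the config without hand edits.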
So when this...
Jboss App
Memcache
Postgres Slaves
Postgres Master
Nagios
Graphite
... becomes this...
Jboss App
Memcache
Postgres Slaves
Postgres Master
Nagios
Graphite
That can happen automatically
Jboss App
Memcache
Postgres Slaves
Postgres Master
Nagios
Graphite
Copyright © 2010 Opscode, Inc - All Rights Reserved
Managing Complexity Today
How Do We Manage This at Cloud Scale?
• Thousands of infrastructure dependencies and configurations needed for each change.
• Huge Amounts of Time
• Increased Cost of Correction of Manual Errors
• Huge Need for Talent
• Risk of Critical Skills Shortage
Google, Amazon, Microsoft, Yahoo built their own tools
but it was “secret sauce”
everyone else was here
... inexperienced & poorly equipped for the world they must now operate in.
Infrastructure
“It is common to think in terms of individual machines rather than view an entire infrastructure as a combined whole”
“A good infrastructure, whether departmental, divisional, or enterprise-wide, is a single loosely-coupled virtual machine, with hundreds or thousands of hard drives and CPU's.”
-- Bootstrapping an Infrastructure, USENIX LISA ’98
Infrastructure as Code
• Programmatically provision and configure
• Treat like any other code base
• Gives you tools to manage complexity while being flexible enough to evolve with your Infrastructure
• Reconstruct the business from a code repository, a data backup, and bare-metal resources
Declarative Syntax
• Define policy
• Say what, not how
• Abstraction between platforms
• Many positive side effects
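One way to picture "say what, not how": the desired state is plain data, and a single generic routine works out what, if anything, has to change. The sketch below assumes a hypothetical policy file of `Key value` lines and an sshd-style config file. The function name `converge` and both file formats are illustrative only; this is not Chef's syntax.

```shell
# converge POLICY CONFIG
# For every "Key value" line in POLICY, make CONFIG carry exactly that
# setting: rewrite an existing "Key ..." line, or append one if missing.
# POLICY says *what* is wanted; this function alone knows *how*.
converge() {
  local policy="$1" config="$2"
  local key value tmp
  while read -r key value || [ -n "$key" ]; do
    # Skip blank lines and comments in the policy.
    case "$key" in ''|'#'*) continue ;; esac
    if awk -v k="$key" '$1 == k { found = 1 } END { exit !found }' \
         "$config" 2>/dev/null; then
      # Key present: rewrite that line with the declared value.
      tmp=$(mktemp)
      awk -v k="$key" -v v="$value" \
        '$1 == k { print k " " v; next } { print }' "$config" > "$tmp" &&
        cat "$tmp" > "$config"
      rm -f "$tmp"
    else
      # Key absent (or file missing): append the declared setting.
      printf '%s %s\n' "$key" "$value" >> "$config"
    fi
  done < "$policy"
}
```

With this split, moving SSH to port 2022 is a one-line change to the policy (`Port 2022`); the policy never mentions awk, temp files, or the order of operations.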
Idempotence
• You’ll hear this a lot
• Property of declarative interface
• Eliminates brittleness of scripting
• Applying it twice gives the same result as applying it once: f(f(x)) = f(x)
• Safe to repeat
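To make "safe to repeat" concrete, here is a small sketch contrasting a blind append with an idempotent one. The helper names `append_line` and `ensure_line` are hypothetical, and the `Port 2022` setting is borrowed from the SSH example earlier in the talk; real configuration management tools handle far more cases than this.

```shell
# append_line FILE LINE
# NOT idempotent: every run adds another copy of LINE.
append_line() {
  printf '%s\n' "$2" >> "$1"
}

# ensure_line FILE LINE
# Idempotent: appends LINE only when an identical line is not already
# present, so running it twice leaves the same file as running it once.
ensure_line() {
  local file="$1" line="$2"
  # -q quiet, -x whole-line match, -F fixed string (no regex surprises).
  if ! grep -qxF -e "$line" "$file" 2>/dev/null; then
    printf '%s\n' "$line" >> "$file"
  fi
}
```

Calling `ensure_line sshd_config 'Port 2022'` ten times leaves a single `Port 2022` line, while ten calls to `append_line` leave ten, which is the brittleness the slide refers to.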
Chef is a Tool
http://www.flickr.com/photos/wessexandy/7690486884/sizes/c/in/pool-96164123@N00/
Wax Philosophical
• We are artists & masters of our craft
• Everyone needs great tools
• Nobody remembers Picasso’s paintbrush
http://www.flickr.com/photos/vgm8383/2686128924/sizes/l/
The core ideas in this talk:
Automation + Culture = AGILITY!
Pitfalls
http://www.flickr.com/photos/nesposit/2787559303/sizes/o/in/photostream/
This should sound familiar...
Traditional thinking
Dev’s job is to add new features
Ops’ job is to keep the site stable and fast
http://www.flickr.com/photos/stewart/461099066/
Slide Courtesy of John Allspaw - http://www.slideshare.net/jallspaw/10-deploys-per-day-dev-and-ops-cooperation-at-flickr
Ops’ job is NOT to keep the site stable and fast
Dev’s job is NOT to add new features
OUR job is to ENABLE our business
Our business REQUIRES change
BUT CHANGE IS THE CAUSE OF MOST OUTAGES!
Choose:
Discourage change in the interests of stability
OR
Allow change to happen as often as it needs to
http://www.flickr.com/photos/gsfc/6795048198/sizes/o/in/photostream/
The Great Abyss
The right culture is a requirement for survival & success at web scale.
Lessons Learned:
Every Post-mortem
Ever...
Root Cause:
“Bad Luck... it was a perfect storm of impossible events”
Lesson #1
“We have a bunch of manual processes which we need to automate”
Lesson #2
“We introduced too many changes at once”
Slide Courtesy of John Allspaw - http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change
Image Courtesy of John Allspaw - http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change
RAAAWR!!! I’m SCARY!
Images Courtesy of John Allspaw - http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change
I can haz cuddle?
MAKE MORE AWESOME!!!
Continuous Delivery
Faster Time to Value
Higher Availability
Happier Teams
More Cool Stuff
Stuff Suits Care About
• Visibility & Accountability
• Reduce Risk
• Business Agility
Stuff Engineers Care About
• Change when we need it
• Innovate Faster
• Constant Improvements
• Application & Site Resiliency
Recap
• Step 1) Automate your Infrastructure
• Step 2) Bridge the Cultural Divide
• Step 3) Profit!
• Automation + Culture = Agility
Try it out!
• Hosted Chef is a SaaS product hosted by Opscode
• http://manage.opscode.com
• Our wiki: http://wiki.opscode.com
• Fast start guide:
• http://wiki.opscode.com/display/chef/Fast+Start+Guide
• Our Community site: http://community.opscode.com
• Cookbooks in our Github account: http://github.com/opscode/cookbooks
• The materials for our 3-day Chef Fundamentals class are online:
• https://github.com/opscode/chef-fundamentals
Supported Platforms
• Ubuntu (10.04, 10.10, 11.04, 11.10, 12.04)
• Debian (5.x, 6.x)
• RHEL & CentOS (5.x, 6.x)
• Fedora 10+
• SUSE Enterprise (11.2)
• openSUSE (12.1)
• Solaris (5.9, 5.10, 5.11 -- x86 and SPARC)
• Mac OS X (10.4, 10.5, 10.6, 10.7)
• Windows 7
• Windows Server 2003 R2, 2008, 2008 R2
Additional Resources
• Opscode YouTube Channel:
• http://www.youtube.com/opscode
• Jesse Robbins, Changing Culture & Being a force for Awesome
• http://www.youtube.com/watch?v=OU8ihx3nT6I
• Matt Ray on Automating Continuous Deployment
• http://www.opscode.com/blog/2012/11/13/automating-continuous-deployment-wchef/
• Continuous Delivery by Jez Humble & David Farley
• http://continuousdelivery.com/
Thanks!
• George Miranda
• @gmiranda23
Questions?
• On freenode: #chef and #chef-hacking
• http://lists.opscode.com
• http://tickets.opscode.com
• http://help.opscode.com
• @opscode and @opscode_status on Twitter