The Upside of Downtime (Velocity 2010)

http://gapingvoid.com/

Sunday, June 20, 2010

The Upside of Downtime: Turning disaster into opportunity

Who’s had a site go down?

Who hasn’t had a site go down?

There’s always that one guy!

Downtime sucks

Source: http://www.motivatedphotos.com/?id=8080

Why downtime sucks

Business

[Chart: Sales; y-axis $0 to $3,000, x-axis 0 to 22]

Brand

You

Users

Downtime = Bad! (Duh)

Approach #1: Don’t fail

Source: http://kansansforlife.files.wordpress.com/2009/12/titanic.jpg

“Everything fails all the time”

-- Werner Vogels (Amazon, CTO)

Your site will fail

Why?!?

Why Failure Happens

Risk Homeostasis (Source: http://joshuahind.files.wordpress.com/2009/09/bicycle-crash.jpg)

Black Swan (Source: Amazon.com)

Unknown unknowns (Source: http://www.apoliticus.com/wp-content/uploads/2009/01/6_21_080306_rumsfeld.jpg)

Change (Source: http://bozark.net/wordpress/wp-content/uploads/2008/09/barack_obama_change_fairey.jpg)

Many small failures (Source: http://www.biojobblog.com/uploads/image/dominos.jpg)

Humans (Source: http://www.librarian.net/talks/clc/CLC.key/SJ_Shoulder_Shrug.jpg)

Polisher blocked: Not unusual

Moisture leaks into air system: Not expected

Flow of cold water stopped: Not good

Backup disabled

Indicator blocked: Doh!

Relief valve broken: Dammit

Gauge broken: WTF

Meltdown

Source: http://www.gladwell.com/1996/1996_01_22_a_blowup.htm

“accidental power failure”

Source: http://www.datacenterknowledge.com/archives/2010/06/16/power-failure-kos-intuit-sites-for-24-hours/

“traffic accident damaged a nearby utility transformer”

Source: http://www.datacenterknowledge.com/archives/2007/11/13/truck-crash-knocks-rackspace-offline/

“unfortunate code change”

Source: http://www.datacenterknowledge.com/archives/2010/06/11/errant-code-change-crashes-10-million-blogs/

“Unhappy customers may get some attention, but unhappy networked customers can quickly impact your business”

-- Clay Shirky

Source: http://happenupon.files.wordpress.com/2009/02/technology-guru-clay-shir-001.jpg, http://scholarlykitchen.sspnet.org/2010/03/02/shirky-at-nfais-how-abundance-breaks-everything/

http://labs.webmetrics.com/crowdsourceduptime

Recap

Your site will fail + Downtime is bad + Everyone will find out = Screw it, I’ll become a lumberjack

Source: http://sbadrinath.files.wordpress.com/2009/03/different26rqcu3.jpg

“Embrace fear of outages and degradation. Use it to guide your architecture, your code, your infrastructure. So lean into it.”

-- John Allspaw, VP Tech. Ops at Etsy

Approach #2: Prepare for downtime

Disclaimer: Try hard to avoid downtime

Learning by example...

Case Study #1: Facebook

“The larger issue here isn't just that a portion of Facebook's platform has gone down - numerous web services have issues from time to time, including everything from Gmail to Twitter. An outage of this length, however, with no official communication from the company itself is disturbing.”

-- N.Y. Times

Downtime Disturbing

Facebook

Case Study #2: Google App Engine

Downtime Kudos

Google App Engine

Case Study #3: Atlassian

Downtime

Atlassian

Bravo

http://atlassian.com/

Downtime: Opportunity to Build Trust

Downtime: Opportunity to Destroy Trust

How To: Prepare for Downtime

Something > Nothing

Upside of Downtime Framework 1.0

Time: Life is good -> Oh crap -> That sucked

Phases: Prepare -> Communicate -> Explain

Prepare

1. Communication channel

Without a communication channel:

Something is wrong

Can’t tell if it’s me or you

I’ll assume it’s you

You suck

With a communication channel:

Something is wrong

I know it’s you

Tell me when you’re back

You suck a lot less

- Easy to find

- Hosted off-site

- Real-time / automated

7 keys for public health dashboards

1. Must show current status for each “service”

2. Data must be accurate and timely

3. Must be easy to find

4. Must provide details for events in real time

5. Provide historical uptime and performance data

6. Provide a way to be notified of status changes

7. Provide details on how the data is gathered

Source: http://www.transparentuptime.com/2008/11/rules-for-successful-public-health.html
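
A few of the keys above (current status per service, details per event, historical uptime) can be sketched as a small data model. This is only an illustration, not anything shown in the talk: the `Event` and `Service` classes, their fields, and the assumption that events on one service never overlap are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Event:
    """One incident on a service (hypothetical model)."""
    started: datetime
    ended: Optional[datetime]  # None while the incident is still ongoing
    details: str               # key 4: details for the event


@dataclass
class Service:
    name: str
    events: List[Event] = field(default_factory=list)

    def current_status(self) -> str:
        """Key 1: current status for this service."""
        return "down" if any(e.ended is None for e in self.events) else "up"

    def uptime_percent(self, window_start: datetime, now: datetime) -> float:
        """Key 5: historical uptime over [window_start, now].

        Assumes events do not overlap; overlapping events would be
        counted twice.
        """
        total = (now - window_start).total_seconds()
        down = 0.0
        for e in self.events:
            start = max(e.started, window_start)
            end = min(e.ended or now, now)
            if end > start:
                down += (end - start).total_seconds()
        return 100.0 * (total - down) / total
```

Keeping the event log as the single source of truth means the current status and the historical numbers cannot disagree with each other.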

2. Process

- Authority

- Mean-Time-To-Communicate (MTTC)

- On-call/drills/escalations/etc.
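
The slides name MTTC without defining how to measure it. One plausible reading, assumed here, is the average delay between the start of an incident and the first public update about it. The function name and the input shape below are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import List, Tuple


def mean_time_to_communicate(
    incidents: List[Tuple[datetime, datetime]],
) -> timedelta:
    """Average delay from incident start to first public update.

    Each item is a (started, first_update) pair. This definition of
    MTTC is an assumption; the talk does not spell one out.
    """
    if not incidents:
        raise ValueError("need at least one incident")
    delays = [
        (first_update - started).total_seconds()
        for started, first_update in incidents
    ]
    return timedelta(seconds=mean(delays))
```

Tracking this number alongside time-to-repair gives on-call drills a concrete target to practice against.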

Your servers

Communicate

1. Communicate

- Use communication channel

- MTTC

- Who/what is affected

- When the incident started

- ETA

- Update regularly

2. Fix it!
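
The items in the Communicate checklist map naturally onto the fields of a status update. The sketch below is one hypothetical way to make sure none of them is forgotten under pressure; the `StatusUpdate` class and its field names are illustrative, not from the talk.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class StatusUpdate:
    """One public update during an incident (hypothetical structure)."""
    affected: str              # who/what is affected
    started: datetime          # when the incident started
    eta: Optional[datetime]    # ETA to resolution, None if not yet known
    next_update_minutes: int   # commitment to update regularly

    def render(self) -> str:
        """Format the update as plain text for the communication channel."""
        if self.eta is not None:
            eta = self.eta.strftime("%H:%M UTC")
        else:
            eta = "unknown, still investigating"
        return (
            f"Affected: {self.affected}\n"
            f"Started: {self.started.strftime('%Y-%m-%d %H:%M UTC')}\n"
            f"ETA: {eta}\n"
            f"Next update in {self.next_update_minutes} minutes"
        )
```

Saying when the next update will arrive, even without an ETA, covers the "update regularly" item explicitly.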

Phew, close one!

Explain

1. Postmortem

- Admit failure (Source: http://en.blog.wordpress.com/2010/02/19/wp-com-downtime-summary/)

- Sound like a human (Source: http://www.bureauofcommunication.com/compose/apology)

“We apologize for any inconvenience this may have caused”

- Start time and end time (Source: https://groups.google.com/group/google-appengine/browse_thread/thread/a7640a2743922dcf)

- Who/what was impacted (Source: http://techcrunch.com/2009/11/02/large-scale-downtime-at-rackspace-cloud/)

- What went wrong (Source: http://www.zendesk.com/2010/03/tuesday-double-whammy.html)

- Lessons learned (Source: http://graysky.org/2010/02/downtime-postmortem/)

“I was completely overwhelmed by the amount of positive feedback and support I received.”

2. Improve for the future
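
The postmortem checklist can double as a pre-publish check. The sketch below assumes a draft is stored as a plain dictionary keyed by section name, which is a hypothetical convention. "Sound like a human" is left out on purpose, since it cannot be verified mechanically.

```python
from typing import Dict, List

# Sections taken from the postmortem checklist; "Sound like a human"
# is omitted because a script cannot judge tone.
POSTMORTEM_SECTIONS = [
    "Admit failure",
    "Start time and end time",
    "Who/what was impacted",
    "What went wrong",
    "Lessons learned",
]


def missing_sections(postmortem: Dict[str, str]) -> List[str]:
    """Return checklist sections that are absent or blank in the draft."""
    return [
        section
        for section in POSTMORTEM_SECTIONS
        if not postmortem.get(section, "").strip()
    ]
```

A draft would be ready for human review once `missing_sections` returns an empty list.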

“Google is not just saying sorry, they are actually implementing serious changes which probably represents millions of dollars of development to help make sure this doesn't happen again.”

Source: http://news.ycombinator.com/item?id=1168493

Source: https://groups.google.com/group/google-appengine/browse_thread/thread/a7640a2743922dcf

Be human

Be authentic

Be transparent

Accept responsibility

Learn and improve

Trust

Upside of Downtime Framework 1.0

Prepare

1. Communication channel - Easy to find - Off-site - Real-time

2. Process - Give authority - M.T.T.C. - On-call/escalations

Communicate

1. Communicate - Use channel - M.T.T.C. - Who/what affected - When started - ETA to resolution - Update regularly

2. Fix it!

Explain

1. Post-mortem - Admit failure - Sound like a human - Start time and end time - Who/what was impacted - What went wrong - Lessons learned

2. Learn and improve

Be Prepared + Be Transparent + Be Human = Trust

Disclaimer: Don’t screw up too often

Downtime Prisoner’s Dilemma

             Transparent   Not Transparent
Caught       Big Win       Big Loss
Not Caught   Win           Win

Benefits

Gain trust

Reduce churn, increase loyalty

Reduce support costs

Ability to control the message

Competitive advantage

More time to focus on the actual problem

Reduce stress

Change != Easy

Change != Impossible

Keys to Adoption

Getting past a culture of “hide the problem”

Overriding commitment to want to improve

Available resources to improve

Pain

Buy-in

Product Management

Support

Sales/Marketing

Engineering/Operations

Default: Let’s wait for complaints
Reality: Proactiveness => Forgiveness

Default: Too much work
Reality: More upfront, less when it matters

Default: Don’t want to look bad
Reality: Opportunity to learn/improve

Default: I don’t want my customers to know
Reality: They’ll find out, better from us

Source: http://delicious.com/lennysan/healthdashboard

Simple as that!

Your site will still fail!

“The measure of a society is how well it transforms pain and suffering into something worthwhile.”

-- Friedrich Nietzsche

“The measure of a company is how well it transforms pain of downtime into something worthwhile.”

-- Lenny Rachitsky

Source: Original quote inspired by Friedrich Nietzsche

Bare minimum: Register a Twitter account

Lenny Rachitsky, @lennysan, http://www.transparentuptime.com/

Webmetrics/Neustar, @webmetrics, http://www.webmetrics.com/

Slides: http://bit.ly/upside-of-downtime

Thank You

Bonus

“Unlikely that an accidental surface or subsurface oil spill would occur from the proposed activities”

-- Exploration and environmental impact plan

Source: http://en.wikipedia.org/wiki/Deepwater_Horizon_drilling_rig_explosion

“Be not afraid of transparency; some are born transparent, some achieve transparency, and others have transparency thrust upon them.”

-- Borrowed from William Shakespeare

Making change

1. Find the bright spots - (this presentation has a bunch)

2. Script the critical moves - (framework)

3. Point to the destination - (W.W.G.D.)

4. Find the feeling - (how would you feel?)

5. Shrink the change - (start small)

6. Grow your people - (everyone is learning as they go)

7. Tweak the environment - (create a simple process)

8. Build habits - (build process organically)

9. Rally the herd - (get buy in, rest will follow)