7/25/14
@smithclay
CLAY SMITH, FORWARDJS 2014, JULY 25, 2014
Embracing failure on the front-end
What this talk covers
NOT COVERED: MY RECIPE FOR TEXAS-STYLE BEEF CHILI. FIND ME AFTER TO TALK ABOUT IT.
The inevitability that JavaScript apps will break.
Borrowing good ideas about failure from operations teams.
A bit about the theory of complex systems failure.
Open-source tools and services that help make apps more resilient.
Why talking about failure in the front-end is important.
GOOGLE TRENDS ALL THE THINGS
One trend is twice as popular as the other trend on average.
DR. COOK IS MY HERO
RICHARD I. COOK, MD. HOW COMPLEX SYSTEMS FAIL.
“Complex systems are intrinsically hazardous systems.”
SOME THEORY, PART 1
“Exception” tracking with window.onerror
MAY YOU NEVER HAVE TO SEE THIS DIALOG AGAIN
DANGER: THIS GETS PRETTY UGLY.
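A minimal sketch of the kind of handler this slide refers to, assuming you want to collect errors yourself rather than use a third-party service. The names `buildErrorReport`, `installErrorTracking`, and `send` are hypothetical, not part of any library. It takes a window-like object as an argument so it can be exercised outside a browser.

```javascript
// Sketch only: all function names here are hypothetical.

// Normalize the arguments the browser passes to window.onerror.
function buildErrorReport(message, source, lineno, colno, error) {
  return {
    message: String(message),
    source: source || '(unknown)', // cross-origin scripts often report no URL
    line: lineno || 0,
    column: colno || 0,            // older browsers omit colno and error
    stack: error && error.stack ? String(error.stack) : null,
    timestamp: Date.now()
  };
}

// Attach a handler to a window-like object and pass each report to `send`.
function installErrorTracking(target, send) {
  var previous = target.onerror;
  target.onerror = function (message, source, lineno, colno, error) {
    try {
      send(buildErrorReport(message, source, lineno, colno, error));
    } catch (e) {
      // Never let the reporter itself throw from inside onerror.
    }
    // Chain to any handler that was already installed.
    if (typeof previous === 'function') {
      return previous.apply(this, arguments);
    }
    return false; // false lets the browser's default error handling continue
  };
}
```

In a page this might be wired up as `installErrorTracking(window, report => navigator.sendBeacon('/errors', JSON.stringify(report)))`, where `/errors` is an assumed endpoint on your own server.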
So you want to use a 3rd party service…
SERIOUSLY, PAUL IRISH APPEARS IN ALL MY TALKS.
THERE ARE LOTS: HTTPS://PLUS.GOOGLE.COM/+PAULIRISH/POSTS/12BVL5EXFJN
NS_TOO_MUCH_NOISE. NOT REALLY SURE WHY I REDACTED THE URLS.
FURTHER READING: HTTP://BLOG.MELDIUM.COM/HOME/2013/9/30/SO-YOURE-THINKING-OF-TRACKING-YOUR-JS-ERRORS
Example window.onerror output
DOES THIS SOUND LIKE COMMON SENSE YET?
“Change introduces new forms of failure.”
RICHARD I. COOK, MD. HOW COMPLEX SYSTEMS FAIL.
SOME THEORY, PART 2
Monitor change with phantomas
CREEPY PICTURE, NO? I BET HE WRITES ERLANG. I ALSO DON'T KNOW HOW TO SAY PHANTOMAS.
HTTPS://GITHUB.COM/MACBRE/PHANTOMAS
JEAN MARAIS AS FANTÔMAS IN THE 1964 FILM.
Phantomas is a “PhantomJS-based web performance metrics collector and monitoring tool”.
phantomas --cookie '_session=<redacted>' \
  --reporter=statsd \
  --statsd-host 127.0.0.1 --statsd-prefix stg \
  --runs 5 \
  http://staging-web.com
How to get super-detailed site metrics…if you’re lazy and cheap.
5 HABITS OF HIGHLY LAZY FRONT-END PERFORMANCE ENGINEERS
Cloud server/your laptop with phantomas installed
Cron job that runs phantomas with statsd output
DataDog Lite Account + Install DataDog Agent on Server
Configure Alerting (I recommend PagerDuty)
Get woken up at 3am
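To make the "statsd output" step less of a black box, here is a rough sketch of what pushing one gauge to a statsd daemon looks like from Node.js. This is not phantomas's own reporter code. The helper names `formatGauge` and `sendGauge` are hypothetical, the metric name in the example is only illustrative, and 8125 is statsd's default UDP port.

```javascript
// Sketch only: helper names are hypothetical.
var dgram = require('dgram'); // Node.js built-in UDP module

// Build one statsd gauge line: "<prefix>.<name>:<value>|g".
function formatGauge(prefix, name, value) {
  return prefix + '.' + name + ':' + value + '|g';
}

// Fire-and-forget a gauge to a statsd daemon over UDP.
function sendGauge(host, port, prefix, name, value, done) {
  var socket = dgram.createSocket('udp4');
  var payload = Buffer.from(formatGauge(prefix, name, value));
  socket.send(payload, 0, payload.length, port, host, function (err) {
    socket.close();
    if (done) done(err);
  });
}

// Example call (not executed here):
// sendGauge('127.0.0.1', 8125, 'stg', 'requests', 42);
```

Because statsd speaks UDP, a missing or slow collector does not block the job that emits the metric, which suits a cron-driven setup like the one above.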
Make the metrics understandable and actionable
THIS LOOKS IMPRESSIVE WHILE YOU READ HACKER NEWS ON YOUR OTHER MONITOR
TESTING DASHBOARD FOR STAGING ENVIRONMENT IN DATADOG.
EVEN FANCIER: INTEGRATE IT INTO YOUR WEB APP: HTTPS://GITHUB.COM/BLOG/1252-HOW-WE-KEEP-GITHUB-FAST
Get alerted as things happen
YOU'LL BE ANGRY AT ME WHEN THIS WAKES YOU UP AT 3AM
CREATING A NEW METRIC ALERT IN DATADOG
Choose a phantomas metric
Define conditions
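As an illustration of what "define conditions" amounts to, the sketch below evaluates one plausible rule: alert when the average of the last few samples of a metric exceeds a threshold. It is not how Datadog implements alerts. `shouldAlert` and its option names are hypothetical.

```javascript
// Sketch only: `shouldAlert`, `window`, and `threshold` are hypothetical names.
// Returns true when the mean of the most recent `window` samples is above
// `threshold`; returns false while there are fewer than `window` samples.
function shouldAlert(samples, options) {
  var recent = samples.slice(-options.window);
  if (recent.length < options.window) return false; // not enough data yet
  var sum = recent.reduce(function (a, b) { return a + b; }, 0);
  return sum / recent.length > options.threshold;
}
```

Averaging over several runs rather than alerting on a single sample is one way to cut down on 3am pages caused by a one-off slow run.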
SAY THIS THE NEXT TIME YOU BLOW SOMETHING UP.
“Failure free operations require experience with failure.”
RICHARD I. COOK, MD. HOW COMPLEX SYSTEMS FAIL.
See also: https://blog.pagerduty.com/2013/11/failure-friday-at-pagerduty/
SOME THEORY, PART 3
Inject chaos into your front-end
ORIGINAL GRAPHIC SLIGHTLY REDACTED
HTTPS://GITHUB.COM/TRAVIS-HILTERBRAND/CHAOS-MONKEY-BROWSER
HTTPS://GITHUB.COM/MIKL/NODE-CHAOS-MONKEYWARE
var props = {
  probability: 0.5,
  allowedMethods: ['GET'],
  mischiefTypes: [
    ChaosMonkey.MischiefTypes.delay,
    ChaosMonkey.MischiefTypes.http403
  ]
};
ChaosMonkey(props);
CONFIGURING CHAOS-MONKEY-BROWSER (*JQUERY REQUIRED)
With a 50% probability, this configuration will cause jQuery ajax GET requests to slowly fail with a 403 response.
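For readers without jQuery, the same idea can be sketched as a wrapper around any request function. This is not the chaos-monkey-browser implementation or API. `withChaos` and its options (`delayMs`, `random`) are hypothetical, and the random source is injectable so the behavior can be tested deterministically.

```javascript
// Sketch only: `withChaos` and its option names are hypothetical and are not
// the chaos-monkey-browser API.
// Wraps `request(method, url)`; with the configured probability, an allowed
// method is delayed and then rejected as if the server had answered 403.
function withChaos(request, options) {
  var random = options.random || Math.random;
  return function (method, url) {
    var affected = options.allowedMethods.indexOf(method) !== -1 &&
                   random() < options.probability;
    if (!affected) return request(method, url); // pass through untouched
    return new Promise(function (resolve, reject) {
      setTimeout(function () {
        var err = new Error('Chaos: simulated 403 for ' + url);
        err.status = 403;
        reject(err);
      }, options.delayMs);
    });
  };
}
```

Wrapping the request function rather than patching a global keeps the mischief easy to switch off outside development and staging builds.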
Prepares for:
CDN Failure
API Failure
Connection Failure
Bad SSL certificates
And more!
Other possible strategies
HOW TO ANNOY PEOPLE DURING CODE REVIEW
1. DISABLE/SLOW DOWN NETWORK CONNECTION (IN CHROME CANARY DEVTOOLS):
2. WHAT HAPPENS WHEN YOU DISABLE JS? (USING PLUGIN RECOMMENDED):
AMAZON.COM ISN’T HAPPY WITHOUT JAVASCRIPT
Lessons learned in failure
SERIOUSLY, REMEMBER ONE OF THESE THINGS
Measure errors and key performance metrics over time
Bad performance = failure
Annoy yourself to fix the broken things with alerting
Find remediation steps to make sure it doesn’t happen again
Get experience with failure before 7pm on a Friday
Thanks!
Additional resources (more reading):
• https://info.aiaa.org/tac/SMG/SOSTC/Shared%20Documents/How%20Complex%20Systems%20Fail.pdf
• http://blog.meldium.com/home/2013/9/30/so-youre-thinking-of-tracking-your-js-errors
• https://blog.pagerduty.com/2013/11/failure-friday-at-pagerduty/