elizabeth-leddy
This talk covers a basic methodology for finding and fixing problems in a live system: general techniques for finding the source of issues quickly, workarounds, patching, digging into code, and when and how to get help.
Un able managing “disasters” without losing your cool
@eleddy
This talk is for the Develadminisystemators who have to constantly deal with UNKNOWNS
Three Commands
‣ Know thy system
‣ Know thy tools
‣ Know thy neighbors
Stairway to Freedom
Prepare
Isolate
Damage Control
Diagnose
Patch
Clean
Fix
Document
Horizon of Intervention
Communicate
Prepare Isolate Control Diagnose Patch Clean Fix Document
Dear Magic Makers -
As some of you may already know, customers are having trouble retrieving their historical records because our archive server is not responding. I am investigating the issue now and will send an update in 20 minutes.
Please fence calls in the meantime. If someone could get me a Red Bull and some nacho cheese corn nuts, that would be stellar.
Thanks!
coworkers
Mayday! High Priority
bossman
Prepare for the Worst
‣ Backups
‣ Local Data.fs
‣ Set a time limit
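Before touching anything, a timestamped copy of the data file gives you something to roll back to. The following is only a minimal sketch, assuming a plain file copy is acceptable for your setup; the function name `snapshot` and the `.bak` naming are hypothetical, and your existing backup tooling may be the better choice.

```python
import shutil
import time
from pathlib import Path


def snapshot(data_file, backup_dir):
    """Copy data_file into backup_dir under a timestamped name; return the new path."""
    src = Path(data_file)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.name}.{stamp}.bak"
    shutil.copy2(src, dest)  # copy2 also preserves file timestamps
    return dest
```

Calling `snapshot('var/filestorage/Data.fs', 'backups')` before you start means every later step can be undone by copying the file back.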
Disable Interference
Disabled all backups and packing
Opened up port 8080 to outside network
Moved logs to temporary disk
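Every temporary change made here has to be undone later, so it helps to write each one down as you make it. A minimal sketch of such a list follows; the class `InterventionLog` and its method names are hypothetical, and a text file or a sticky note does the same job.

```python
class InterventionLog:
    """Remember every temporary change so it can be reverted during clean up."""

    def __init__(self):
        self._entries = []

    def record(self, description):
        """Note a temporary change and return its index."""
        self._entries.append({"what": description, "reverted": False})
        return len(self._entries) - 1

    def mark_reverted(self, index):
        """Mark a recorded change as undone."""
        self._entries[index]["reverted"] = True

    def pending(self):
        """Return the descriptions of changes that still need undoing."""
        return [e["what"] for e in self._entries if not e["reverted"]]


log = InterventionLog()
log.record("Disabled all backups and packing")
port = log.record("Opened up port 8080 to outside network")
log.record("Moved logs to temporary disk")

log.mark_reverted(port)  # port closed again
print(log.pending())
```

Whatever `pending()` still returns at the end of the incident is your clean-up checklist.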
Isolation by Elimination
‣ Network: works for me
‣ Hardware: obvious, sporadic crazy shit
‣ Software: everything else
‣ Data: not recreatable locally
Zopesplosion 3000 Architecture
[Diagram: Apache, Varnish, HAProxy, CDN, APIs; seven Zope instances; ZEO 1-4, ZEO 5-8, ZEO 9-12; MySQL, MongoDB, SPARQL]
WTF mate
How Zeo Cache Works
Machine A: Zope and its Mem. Cache. Machine B: Zeo.
‣ Zope: I Want X
‣ Mem. Cache: I Need X
‣ Zeo, Mem. Cache, Zope: X
‣ Modified X: X becomes X′, and the cache is told (Modified X)
How Zeo Cache Works
Machine A: Zope and its Disk Cache. Machine B: Zeo.
‣ Zope: I Want X
‣ Disk Cache: X
‣ Modified X: X becomes X′
‣ RESTART
Inconsistent State!
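The two cache diagrams can be played through in a few lines. This is a toy model, not ZEO's implementation: `Server` and `Client` are hypothetical stand-ins, and the model assumes a cache that survives a restart is reused without being checked against the server.

```python
class Server:
    """Stand-in for the storage server: holds the authoritative objects."""

    def __init__(self):
        self.objects = {}
        self.clients = []

    def load(self, key):
        return self.objects[key]

    def store(self, key, value):
        self.objects[key] = value
        # Tell every connected client that its cached copy is stale.
        for client in self.clients:
            client.invalidate(key)


class Client:
    """Stand-in for an application server with a local cache."""

    def __init__(self, server, cache=None):
        self.server = server
        self.cache = {} if cache is None else cache

    def connect(self):
        self.server.clients.append(self)

    def disconnect(self):
        self.server.clients.remove(self)

    def invalidate(self, key):
        self.cache.pop(key, None)

    def get(self, key):
        if key not in self.cache:  # "I Need X"
            self.cache[key] = self.server.load(key)
        return self.cache[key]


server = Server()
server.objects["X"] = "X"

# Memory cache: the client stays connected, so the invalidation arrives.
mem = Client(server)
mem.connect()
mem.get("X")
server.store("X", "X'")
fresh = mem.get("X")

# Disk cache: the contents outlive the process, but an invalidation
# sent while the process is down never reaches it.
disk_cache = {}
disk = Client(server, cache=disk_cache)
disk.connect()
disk.get("X")
disk.disconnect()              # process goes down
server.store("X", "X''")       # X is modified meanwhile
restarted = Client(server, cache=disk_cache)
restarted.connect()            # RESTART
stale = restarted.get("X")

print(fresh, stale, server.load("X"))
```

In this model the restarted client keeps serving the old copy while the server holds the new one, which is the inconsistent state from the slide.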
Hot damn!
Take time to make time
‣ Minimize customer angst
‣ Hang out in custom
‣ Acquisition is your friend
‣ Remember request and response
Unique or Just Not Obvious?
‣ Zope, zeo, system logs
‣ System stats/monitoring
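When the logs are long, a quick tally of which component is complaining can show whether the problem is unique or just not obvious. The sketch below assumes a line shape of timestamp, level, logger name, message; that shape, the logger names, and the sample lines are all invented for illustration, so adjust the pattern to whatever your logs actually look like.

```python
import re
from collections import Counter

# Assumed line shape: "<timestamp> <LEVEL> <logger> <message>".
LINE = re.compile(r"^\S+\s+(?P<level>ERROR|WARNING|CRITICAL)\s+(?P<logger>\S+)")


def summarize(lines):
    """Count ERROR/WARNING/CRITICAL lines per (level, logger) pair."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match:
            counts[(match["level"], match["logger"])] += 1
    return counts


# Invented sample lines, not output from any real system.
sample = [
    "2010-10-27T10:00:01 INFO app.server started",
    "2010-10-27T10:00:05 ERROR app.views KeyError: 'news-item-id'",
    "2010-10-27T10:00:09 ERROR app.views KeyError: 'news-item-id'",
    "2010-10-27T10:00:12 WARNING storage.client reconnecting",
]

for (level, logger), count in summarize(sample).most_common():
    print(level, logger, count)
```

To run it over a real file, pass the open file object: `summarize(open(path))`.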
Test Case
Sarcoidosis!
Probably not...
Estimate Fix Time
+
Horizon of Intervention
Can I handle this problem? Yes / No
Can I do it in a timely manner? Yes / No
IRC, Plone-users, Friends, Colleagues
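One way to read the two questions is as a simple rule. The sketch below assumes that a "no" to either question means asking for help (IRC, Plone-users, friends, colleagues); the function name, parameters, and return strings are hypothetical.

```python
def next_step(can_handle, estimated_fix_minutes, time_limit_minutes):
    """Answer the two horizon-of-intervention questions.

    Returns "fix it yourself" only when both answers are yes;
    otherwise returns "ask for help".
    """
    timely = estimated_fix_minutes <= time_limit_minutes
    if can_handle and timely:
        return "fix it yourself"
    return "ask for help"


print(next_step(can_handle=True, estimated_fix_minutes=15, time_limit_minutes=20))
print(next_step(can_handle=True, estimated_fix_minutes=90, time_limit_minutes=20))
```

The time limit is the one you set while preparing, which is why the estimate matters before you start patching.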
Front End Errors
Take the performance hit
Disable the malfunctioning piece
temporary patch
full patch
Have I mentioned the importance of working with BACKUPS yet?
Especially when unfucking data...
Clean up
Disabled all backups and packing
Opened up port 8080 to outside network
Moved logs to temporary disk
Disabled zopes 5-10
Delete extra/bad files
Scripts in version control
Communicate
I’ve got a fever, and the only solution... is
MORE PATCH!
‣ Update/Close Tickets
‣ Integrate Test Cases
‣ Document Processes
Handling Data Errors
‣ Network: works for me
‣ Hardware: obvious, sporadic crazy shit
‣ Software: everything else
‣ Data: not recreatable locally
How Data is Stored
root (app)
    acl_users (users, roles)
    temp_folder
    Plone
        acl_users (users, roles)
        News (news.2010.09.08, news.2010.06.13)
        Members
        Events
The Basics
‣ ./bin/instance debug
‣ app
‣ dir, __dict__
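In the debug prompt, `dir` and `__dict__` answer two different questions: what can this object do, and what is actually stored on it. A small sketch with a hypothetical stand-in class (not a real Zope object) shows the difference; `describe` is an illustrative helper, not part of any library.

```python
def describe(obj):
    """Return (public names from dir(), names stored on the instance itself)."""
    public = [name for name in dir(obj) if not name.startswith("_")]
    own = sorted(getattr(obj, "__dict__", {}))
    return public, own


class Folder:
    """Hypothetical stand-in for an object you would reach from `app`."""

    title = "class-level default"

    def __init__(self):
        self.id = "news"

    def object_ids(self):
        return []


public, own = describe(Folder())
print(public)  # everything reachable: class attributes, methods, instance data
print(own)     # only what is stored on this particular instance
```

When data looks wrong, the instance `__dict__` is usually the place to check first, because class-level defaults can hide the fact that a value was never stored.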
Direct Connect
>>> from ZODB.FileStorage import FileStorage
>>> from ZODB.DB import DB
>>> storage = FileStorage('var/filestorage/Data.fs')
>>> db = DB(storage)
>>> connection = db.open()
>>> root = connection.root()
>>> # Or connect through ZEO instead of opening the file directly
>>> from ZEO import ClientStorage
>>> from ZODB import DB
>>> address = '10.0.1.5', 8001
>>> storage = ClientStorage.ClientStorage(address)
>>> db = DB(storage)
>>> connection = db.open()
>>> root = connection.root()
>>> root['app'] = PloneSite()
>>> root['status'] = 'Running'
>>> import transaction
>>> del app.Plone.news['news-item-id']
>>> transaction.commit()
_p_changed
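`_p_changed` matters because a persistent object notices attribute assignment but not in-place changes to a plain list or dict it holds. The toy class below imitates that behavior with the standard library only; `Tracked` is a hypothetical stand-in, not the real `persistent.Persistent`.

```python
class Tracked:
    """Toy stand-in for a persistent object: flags itself on attribute assignment."""

    def __init__(self):
        object.__setattr__(self, "_p_changed", False)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        if name != "_p_changed":
            # Any ordinary assignment marks the object as modified.
            object.__setattr__(self, "_p_changed", True)


item = Tracked()
item.tags = ["news"]          # assignment: the change is noticed
noticed = item._p_changed
item._p_changed = False       # pretend we just committed

item.tags.append("2010")      # in-place mutation of a plain list: not noticed
missed = item._p_changed

item._p_changed = True        # flag it by hand before committing
print(noticed, missed, item._p_changed)
```

With real persistent objects the equivalent fix is to set `obj._p_changed = True` (or reassign the attribute) before calling `transaction.commit()`.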
When in doubt...
‣ PDB is your friend
‣ The source is your friend
‣ Throw a party for your friends
‣ Know your System
‣ Understand the Tools
‣ Be Nice to your Neighbors