WHEN THE CEPH HITS THE FAN
Dr. Wolfgang Schulze, Director, Global Storage Consulting Practice, Red Hat, October 20, 2016
CAN THE CEPH EVEN HIT THE FAN?
• After all…
• Architecture has no single point of failure
• Codebase is very solid and has had many years to mature
• Designed from the ground up to accommodate failures
• Supposed to be self-healing and self-managing
• It simplifies day-to-day data center operations
WHAT IS “HITTING THE FAN”, ANYWAY?
• Example scenarios:
  • Heavy storm takes out data center; cluster fails to restart automatically
  • Increased workload makes cluster unstable
  • Performance is fine when the cluster is empty to moderately filled, but write performance drops when getting close to physical capacity
  • Nearly full cluster has become unresponsive and denies writes
  • Bulk deletion of objects takes so long that the client application times out
  • Rebalancing after a partial electric outage impacts clients with slow/blocked requests
• Result in each case: customer files a support ticket
  • Sev 1: Production is down
  • Sev 2: Production is impacted
TICKET QUEUE IN RED HAT SUPPORT
Real screenshot, dated 2016-10-19. Customer names removed. Many of these tickets could have been avoided if best practices had been followed.
A SAD, BUT TRUE STORY
• Customer bought Red Hat Ceph Storage subscriptions
• They were sure they had enough experience on their team and specifically declined offers for training and consulting
• They designed and deployed the Ceph cluster without guidance
  • Originally for a feasibility study, but everything seemed to work fine, so they put it into production
• Nobody noticed that the journal size was configured to only 100 MB instead of the best-practice size of 5 GB
• A couple of months later, after a power failure, the Ceph cluster failed to recover
  • Support ticket went on for several weeks; at the end, some permanent data loss
• End result: partial data loss, unhappy management, unhappy customers
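For reference, the journal size in this story maps to a single FileStore setting in ceph.conf. A minimal sketch (the 5120 MB value reflects the best-practice 5 GB above; it only takes effect for OSDs created after the change):

```ini
# /etc/ceph/ceph.conf — FileStore journal sizing, value in MB
[osd]
# 5 GB journal per the best practice above;
# the cluster in the story ran with only 100 MB
osd journal size = 5120
```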
SOME COMMON MISCONCEPTIONS
• The new tools make Ceph easy to set up
• You don’t need detailed planning or architecture design
• Ceph works on any hardware, and you can mix & match hardware
• Storage infrastructure people will know how to handle the product
• Server people will know how to handle the product
• Ceph community bits are just fine (“We use a stable release”)
• Using community bits is more “cutting edge”
COMMON TROUBLE #1 UPSTREAM BITS FOR PRODUCTION SYSTEMS
Observation
• User is running upstream bits
• This happens even with users who are paying for a Red Hat Support subscription
• People misinterpret the phrase “stable release” in community release notes
Problem
• Red Hat Support won’t be able to help
• Red Hat only supports long-term stable releases
• What could be a safe and fully documented upgrade to a newer LTS version suddenly becomes a “migration” with risks and pitfalls
Mitigation
• Use supported bits, stay informed about the roadmap, get involved
COMMON TROUBLE #2 USE OF UNSUPPORTED FEATURES
Observation
• User deploys system into production using features which are not (yet) supported
• Examples: CephFS, BlueStore
Problem
• Red Hat Support won’t be able to help
  • Unless you have a support exception, the conversation may end quickly
• Red Hat Engineering will not build hotfixes for you
Mitigation
• Try to get a support exception from Red Hat
• Don’t use the feature
COMMON TROUBLE #3 USE OF UNSUPPORTED CONFIGURATIONS
Observation
• User deploys Ceph in a way that is not approved and has not been tested
• Examples:
  • Running Ceph on unsupported operating system versions (e.g. Gentoo, Debian)
  • Deploying
Problem
• Red Hat Support won’t be able to help
  • Unless you have a support exception, the conversation may end quickly
• Red Hat Engineering will not build hotfixes for you
Mitigation
• Read the documentation, consider a health check before go-live
COMMON TROUBLE #4 POORLY MANAGED CLUSTER GROWTH
Observation
• Adding disks (or even entire nodes) to clusters of relatively small total capacity
• Backfill/recovery starves client I/O
Problem
• In older versions of Ceph, default configuration values are not ideal for this (osd_max_backfills, osd_recovery_max_active, osd_recovery_op_priority)
• If you fail to adjust these before you change the physical configuration, you will indeed have huge impact
Mitigation
• Know your stuff, think ahead, estimate impact, gradually weigh in
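The tunables named above can be throttled at runtime before the hardware change, using Ceph's standard injectargs mechanism. A hedged sketch (the values are illustrative, not a recommendation, and injected values do not persist across OSD restarts):

```shell
# Throttle backfill/recovery before adding disks or nodes,
# so rebuild traffic does not starve client I/O
ceph tell osd.* injectargs \
  '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 1'

# Record the current values first (run on the node hosting osd.0),
# so you know what to restore once the cluster is HEALTH_OK again
ceph daemon osd.0 config show | grep -E 'osd_(max_backfills|recovery)'
```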
COMMON TROUBLE #5 POOR SKILLS AND OPERATIONAL PRACTICES
Observations
• Subject matter experts who brought Ceph to the organization were hired guns, or employees who have since left
• Team that ends up managing the cluster considers it some sort of black art
Problem
• Operators who don’t know what they are doing put your data at risk
• The built-in safety/durability may be compromised
Mitigation
• Make sure users receive proper training, and avoid staff SPOF
• Conduct controlled emergency drills to practice for outages
• Maintain a separate cluster with the same version for experiments and dry runs, or learn how to do it with a cloud-based environment
COMMON TROUBLE #6 RISKY CONFIGURATION CHOICES
Observations
• Users read somewhere that mounting XFS OSDs with the ‘nobarrier’ option will result in performance gains
Problem
• While the performance gets noticeably better, you are introducing a risk of data corruption during power outages
• The built-in safety/durability may be compromised
Mitigation
• Do not use the ‘nobarrier’ mount option unless you fully understand what hardware you have, and unless you know what you are doing
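Write barriers are the XFS default, so staying safe just means not adding the option. A sketch of a conservative fstab entry (device, mount point, and the noatime/inode64 options are illustrative assumptions, not a mandated layout):

```
# /etc/fstab — OSD data partition; write barriers stay enabled
# simply because 'nobarrier' is not listed
/dev/sdb1  /var/lib/ceph/osd/ceph-0  xfs  noatime,inode64  0 0
```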
COMMON TROUBLE #7 POOR NETWORK CONFIGURATION
Observations
• Users don’t pay enough attention to network configuration
• Network inconsistencies (e.g. jumbo frames) and bottlenecks go undetected …until Ceph performs poorly
Problem
• Troubleshooting networking issues is difficult, and experts are hard to find
• Ceph heavily relies on proper network configuration
Mitigation
• Invest in your team and network maintenance skills
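One quick consistency check for the jumbo-frame case above: send a do-not-fragment ping sized to the jumbo payload between cluster nodes. Hostname and interface name are placeholders:

```shell
# Verify every hop really carries 9000-byte frames:
# 8972 = 9000 MTU - 20 (IP header) - 8 (ICMP header); -M do forbids
# fragmentation, so an undersized link makes the ping fail loudly
ping -M do -s 8972 -c 3 osd-node-2.example.com

# Confirm the local interface MTU matches what you think it is
ip link show dev eth0 | grep mtu
```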
WHAT TO DO WHEN THINGS WENT WRONG
1. Stay calm and don’t make it worse!
  • Poorly skilled operators may turn a problem into a catastrophe
2. Contact Red Hat Support immediately
  • Sev 1 and Sev 2 issues are handled with top priority
  • Chances are that they will be able to help right away and get your cluster humming again
3. Contact your trusted Red Hat Services or Sales contacts
  • If problems persist or you feel you need extra help, you might want to get a Ceph expert from Red Hat Professional Services
GOOD PRACTICES TO AVOID PROBLEMS
1. Don’t stumble into implementation/deployment without careful planning
  • Capture and document requirements, do a POC, do an actual design
  • Engage experts early to help with cluster design and hardware choices
2. Unless you love to take risks, use supported bits
3. Stay close to the recommended reference architectures from Red Hat partners
4. Make sure your staff receives proper training
  • Red Hat Global Learning provides excellent training for Gluster and Ceph
5. Plan for growth
6. Don’t let things linger. Ceph does not like it when the cluster is 90% full
7. Have an expert perform regular Storage Health Checks to detect problems while they are still small
STORAGE DESIGN CONSULTING
• Specialists from Red Hat Consulting will help plan your Ceph deployment
• Start: Storage Discovery Session
  • We can help discover requirements and design a storage solution that matches
• You will receive a detailed Storage Solution architecture document which will articulate design choices and lay out a step-by-step plan for implementation
STORAGE HEALTH CHECKS
• Standard 3-day engagement done by Red Hat storage experts
• Comprehensive top-to-bottom analysis of your software-defined storage platform
• Six focus areas:
  1. Lifecycle
  2. Configuration
  3. Organization
  4. Use Case
  5. Hardware
  6. Operational
• Clear read-out of issues
• Actionable recommendations
POSITIVE NOTE
• I asked my consultants for feedback on this presentation. Here is one comment:
WHERE TO GO NEXT
RED HAT SUBSCRIPTIONS
https://access.redhat.com/subscription-value
Evaluation, Pre-production, and Production subscriptions available

CONSULTING
http://www.redhat.com/en/services/consulting/storage

TRAINING
https://www.redhat.com/en/services/training

TEST DRIVE
http://red.ht/cephtestdrive
To engage a Territory Service Manager in your area, ask for a local Red Hat Storage sales professional at: NORTH AMERICA: 1 (888) REDHAT-1; LATIN AMERICA: 54 (11) 4329-7300; EMEA: 00800 7334 2835; APJ: 65 6490 4200; Brazil: 55 (11) 3529-6000; Australia: 1800 733 428; New Zealand: 0800 733 428