Distributed Caching in an Ephemeral World
Rahul Singh
Founder & CEO, distelli.com
June 7th 2017

Distributed? Ephemeral? Huh?

…because Kubernetes

How did we get here?

The good ole’ days…

[Diagram: Apache FCGI, Oracle]

•  A single C++ binary running in FCGI workers
•  Under Apache, talking to an Oracle database

•  Caching? Who cares…

...when you’re trying to get the thing to link without running out of memory

•  Scaling?

Just add more boxes! The Oracle DB can handle it.

DB getting to be a problem? ...Oracle RAC to the rescue!

Scaling up was the solution to every problem

Why bother with caching when you can burn $400M in 3 years?

Eventual lesson learned…

You can’t scale up forever

You have to scale out!

That’s the distributed part

Add more smaller boxes
Lots of small databases behind REST APIs

Split up that FCGI binary into microservices

Microservices

[Diagram: Load Balancer]

Multiple backend service instances in multiple individual datacenters

Service clients (website) connect via a LB

One service backed by one small DB

Hundreds of frontend clients

Hundreds of backend services

Lots of small pieces of software running on many small boxes

All scaled out independently

Difficult to architect and difficult to operate

But… once you get it going it’s great! The machines weren’t going anywhere

Microservices, and then Docker happened

Docker

Package it as a container
Run it anywhere! Really? Anywhere?
Run it any time! Really? Any time?

How?

Kubernetes + Docker

Kubernetes will schedule your containers anywhere on your pool of servers, any time.

What happened to all that stability? Where is my stuff?

Kubernetes can shut down a container at any time.

Kubernetes can start a container at any time.

That’s ephemeral

Distributed and Ephemeral

Microservices packaged as Docker containers scheduled by Kubernetes
Distributed. Ephemeral.

What’s this got to do with caching?

If services and processes are long-lived and have stable IP addresses, then caching is relatively easy

If there is a single endpoint to retrieve a specific object, then caching is relatively easy

Distributed caching in an Ephemeral world

In a distributed cache the cache keys are spread across multiple machines

Retrieving a specific object depends on finding the machine that’s caching the object for that key

If machines (or containers) move around, then it becomes difficult to keep track of what is cached where

Distributed caching has many challenges

•  Finding nodes that hold a particular key
•  What happens when nodes fail or are restarted
•  How does cache invalidation work

Distributed caching has many advantages

•  Cache lookup traffic is distributed across many nodes
•  Cache size is effectively equal to the sum of the caches on every node
•  No single point of failure
•  Scalable and performant

Designing a distributed cache

[Diagram: Load Balancer]

Remember this microservice?

Single dedicated cache box

Not really scalable

Distribute the cache across the service nodes themselves

OR... Have a dedicated fleet of cache nodes and distribute the cache across them

Your cache is now spread across multiple nodes

If there are N nodes in your cache fleet... then…

Each node holds 1/N of the cache

The math from Captain Obvious

If there are 20 nodes in your cache fleet then N = 20
If your cache contains 100 keys then each node should hold approximately 5 keys

Depends on how you distribute the keys

Hashing keys into buckets (aka across nodes)

It depends on the hashing algorithm you use

Simple hash algorithm: mod hashing
For a set of N nodes, key K is on node b, identified by: b = K mod N

N = 10, K = 22, so b = 22 mod 10 = 2

0 1 2 3 4 5 6 7 8 9

Important: The value of N changes as nodes fail and new nodes are started, most often because Kubernetes starts/stops a container.

A problem with mod hashing: When the number of nodes changes, nearly every element is rehashed.

N = 10, K = 22

0 1 2 3 4 5 6 7 8 9

If node 8 dies then N = 9
K mod 9 != K mod 10 for most keys

N = 9, K = 22: 22 mod 10 = 2, but 22 mod 9 = 4
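The effect of a changing N on mod hashing can be checked with a few lines of Python. This is a minimal sketch, assuming integer keys and modelling the loss of one node simply as N dropping from 10 to 9, as the slide does; the helper name `mod_bucket` and the key range 0..9999 are illustrative choices, not part of the talk.

```python
def mod_bucket(key: int, n_nodes: int) -> int:
    """Return the node index for an integer key under mod hashing (b = K mod N)."""
    return key % n_nodes


# The slide's example: K = 22 with N = 10, then with N = 9 after a node dies.
before = mod_bucket(22, 10)
after = mod_bucket(22, 9)

# Count how many of the (hypothetical) keys 0..9999 change node when N goes 10 -> 9.
keys = range(10_000)
moved = sum(1 for k in keys if mod_bucket(k, 10) != mod_bucket(k, 9))
moved_fraction = moved / len(keys)

print(f"K=22: node {before} -> node {after}")
print(f"fraction of keys that moved: {moved_fraction:.2%}")
```

For this key range roughly nine keys in ten land on a different node, which is why a single node failure under mod hashing behaves almost like flushing the whole cache.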

If every key in the cache moves to a different node when a single node fails, then cache miss rates go up often and affect performance.

Question: How do you hash keys so that when a single node fails only the keys on that node are rehashed?

More generally: How do you hash keys so that f failures rehash only about f/N of the total cache?

Consistent Hashing

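[Diagram: nodes 0 through 9 placed around a circle]

1.  Place nodes on a circle
2.  Place keys on the same circle
3.  A key K hashes to the first node N > K, going around the circle

Programmatically
1.  Calculate a score for each node N
2.  Maintain an ordered list of scores
3.  For key K compute a score for K
4.  Find the first N where N > K (wrapping around to the start of the list if none is larger)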
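The programmatic steps above can be sketched in Python as follows. This is one possible reading rather than the speaker's implementation: the score function (the first 8 bytes of SHA-256), the class name `HashRing`, and the node names `node-0` to `node-9` are all assumptions made for illustration.

```python
import bisect
import hashlib


def score(value: str) -> int:
    """Position on the circle. SHA-256 is an assumed choice; Python's built-in
    hash() is avoided because it is randomized per process for strings."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, node_ids):
        # Step 1: calculate a score for each node.
        self._nodes = {score(n): n for n in node_ids}
        # Step 2: maintain an ordered list of scores.
        self._scores = sorted(self._nodes)

    def remove(self, node_id: str) -> None:
        """Take a dead node off the circle."""
        s = score(node_id)
        del self._nodes[s]
        self._scores.remove(s)

    def node_for(self, key: str) -> str:
        if not self._scores:
            raise LookupError("ring has no nodes")
        # Steps 3 and 4: score the key, then find the first node score greater
        # than it, wrapping around to the lowest score past the end of the list.
        idx = bisect.bisect_right(self._scores, score(key))
        return self._nodes[self._scores[idx % len(self._scores)]]


ring = HashRing([f"node-{i}" for i in range(10)])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.remove("node-8")  # node 8 dies
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved")
```

Only the keys that were on `node-8` change owner; every other key keeps its node. With a single point per node the key share per node can be uneven, which is why many implementations place each node at several points on the circle.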

When node N = 8 dies, only the keys on that node are rehashed. The rest of the cache stays the same, so a node failure only causes cache misses for the keys that lived on that node.

Addressing

If nodes move around, how do you keep track of which nodes are where and what their addresses are?

Addressing, Liveness & Node Health

Assign each node a unique ID and an IP:PORT combination
Every node communicates its health to all other nodes.
Gossip is most efficient for this

Gossip:
•  Every node heartbeats to a random node every 10 seconds
•  Membership data is shared with all other nodes on every iteration
•  When missing heartbeats are detected the node is considered dead and its keys are rehashed.
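A toy, in-process model of the gossip scheme described above might look like this. It is a sketch under several assumptions: there is no networking, time is passed in explicitly, a peer is declared dead after 30 seconds without its heartbeat counter advancing (the talk gives the 10 second interval but no timeout), and the names `GossipNode`, `merge`, `gossip` and `dead_peers` are hypothetical.

```python
import random

HEARTBEAT_SECONDS = 10   # interval from the slide
FAIL_AFTER_SECONDS = 30  # assumption: three missed heartbeats means dead


class GossipNode:
    def __init__(self, node_id, now=0):
        self.node_id = node_id
        # Membership table: node ID -> (heartbeat counter, local time it last advanced).
        self.table = {node_id: (0, now)}

    def merge(self, remote_table, now):
        """Adopt any entry whose heartbeat counter is newer than ours."""
        for peer, (count, _) in remote_table.items():
            if peer not in self.table or count > self.table[peer][0]:
                self.table[peer] = (count, now)

    def gossip(self, peers, now, rng=random):
        """Bump our own heartbeat and push the whole table to one random peer."""
        count, _ = self.table[self.node_id]
        self.table[self.node_id] = (count + 1, now)
        rng.choice(peers).merge(self.table, now)

    def dead_peers(self, now):
        """Peers whose heartbeat counter has not advanced recently."""
        return {
            peer
            for peer, (_, seen) in self.table.items()
            if peer != self.node_id and now - seen > FAIL_AFTER_SECONDS
        }
```

Each node would call `gossip` once every `HEARTBEAT_SECONDS` with the peers it currently believes are alive. Because every push carries the whole table, news about a node reaches peers that never talk to it directly, and a node that stops heartbeating eventually shows up in everyone's `dead_peers`, at which point its keys can be rehashed.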

Key requirement: When rehashing because a node dies, it’s important to give the new node a new unique ID
Never reuse IDs
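One simple way to satisfy this requirement is to generate a random ID each time a node process starts, rather than deriving the ID from its address. The sketch below assumes UUIDs are acceptable as IDs; the `CacheNode` class and the example address are hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CacheNode:
    address: str  # IP:PORT, may be reused when a container is rescheduled
    # A fresh random ID per instance, so a replacement never inherits an old ID.
    node_id: str = field(default_factory=lambda: uuid.uuid4().hex)


old = CacheNode("10.0.0.8:7000")
replacement = CacheNode("10.0.0.8:7000")  # same address, different identity
print(old.node_id, replacement.node_id)
```

Even if the replacement comes back on the same IP:PORT, the rest of the fleet sees it as a new member rather than as the dead node returning.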

Caching is an extensive field

We haven’t even talked about cache invalidation
Backup caching: Should I cache the same key on multiple nodes for redundancy?
Short answer: No. It’s not worth it.
Longer answer: it depends.

Recap

Caching is relatively easy: just use LRU caches with TTL
Things get more difficult at scale.
Things get more difficult with distributed microservices
Things get more difficult when your cache nodes are ephemeral
Handling cache node failures is important if you want high cache hit rates
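For the single-node case, an LRU cache with a TTL fits in a few dozen lines. The following is a minimal sketch, not a production cache: the class name `LRUTTLCache`, its parameters, and the injectable `clock` are assumptions, and it is not thread-safe. Hit and miss counters are included so the hit rate can be observed.

```python
import time
from collections import OrderedDict


class LRUTTLCache:
    def __init__(self, max_items, ttl_seconds, clock=time.monotonic):
        self._max_items = max_items
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = OrderedDict()  # key -> (expires_at, value), least recently used first
        self.hits = 0
        self.misses = 0

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None or entry[0] <= self._clock():
            # Missing or expired: drop any stale entry and count a miss.
            self._items.pop(key, None)
            self.misses += 1
            return default
        self._items.move_to_end(key)  # mark as most recently used
        self.hits += 1
        return entry[1]

    def put(self, key, value):
        self._items[key] = (self._clock() + self._ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)  # evict the least recently used entry

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A caller would typically try `get` first and, on a miss, load the value from the backing service and `put` it. Passing a fake `clock` makes expiry easy to exercise in tests.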

Measure, Measure, Measure

Caching lends itself well to measurement
Very satisfying to see high cache hit rates
Very painful to see high cache miss rates

We’re hiring a full stack / frontend developer
React, Java, Distributed Systems

Make a big impact on a small team

https://www.distelli.com/kubernetes
