
OVERVIEW - Cornell University: overview, LBFS, motivation, introduction, challenges, advantages of LBFS, how LBFS works, related work

Page 2:

OVERVIEW

LBFS

  MOTIVATION

  INTRODUCTION

  CHALLENGES

  ADVANTAGES OF LBFS

  HOW LBFS WORKS?

  RELATED WORK

  DESIGN

  SECURITY ISSUES

  IMPLEMENTATION

  SERVER IMPLEMENTATION

  CLIENT IMPLEMENTATION

  EVALUATION

SHARK

Page 3:

MOTIVATION

  Users rarely consider running NFS over slow or wide-area networks.

  If bandwidth is low, performance is unacceptable.

  Data transfers saturate bottleneck links and cause unacceptable delays.

  Interactive applications are slow in responding to user input.

  Remote login is frustrating.

  Solution? Run interactive programs locally and manipulate remote files through the file system.

  The network file system should consume less bandwidth: LBFS.

Page 4:

INTRODUCTION

  LBFS is used for slow or wide-area networks.

  Exploits similarities between files or versions of the same file.

  It also uses conventional compression and caching.

  In LBFS, interactive programs run locally and access remote data through the file system.

  LBFS requires 90% less bandwidth than traditional network file systems.

Page 5:

CHALLENGES? ADVANTAGES

  Provides traditional file system semantics

  Local cache

  Exploits cross-file similarities

  Variable-size chunks

  Indexes chunks by hash values

HOW LBFS WORKS?

  Provides close-to-open consistency

Page 6:

RELATED WORK

  AFS uses server callbacks to reduce network traffic.

  Leases are callbacks with an expiration date.

  CODA supports slow networks and even disconnected operation through optimistic replication.

  CODA saves bandwidth as it avoids transferring files to the server.

  Bayou and OceanStore investigate conflict resolution for optimistic updates.

  Spring and Wetherall: use large client and server caches.

  Rsync exploits similarities between directory trees.

Page 7:

DESIGN

  LBFS uses a large persistent file cache at the client. It assumes the client has enough cache.

  It exploits similarities between files and file versions. It divides files into chunks and only transmits data chunks containing new data.

  To save chunk transfers, LBFS relies on the SHA-1 hash.

  LBFS uses "gzip" compression.

  Central challenges in the design are: keeping the index a reasonable size, and dealing with shifting offsets.

Page 8:

PROBLEMS WITH FIXED-SIZED BLOCKS

  A single-byte insertion shifts all the block boundaries.

[Figure: block boundaries in the original file and after inserting a byte x.]

  Possible solution: index files by the hashes of all overlapping 8 KB blocks at all offsets.

  Rsync: considers only two files at a time. The existence of a file is found using the file name.
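The boundary shift can be illustrated with a short sketch. It is not taken from the slides: the 8-byte block size is a deliberately tiny stand-in for the 8 KB blocks mentioned above, and `block_hashes` and `shared_blocks` are hypothetical helper names.

```python
import hashlib

BLOCK_SIZE = 8  # tiny block size for illustration; the slide mentions 8 KB blocks


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash each fixed-size, non-overlapping block of data."""
    return [
        hashlib.sha1(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def shared_blocks(old: bytes, new: bytes) -> int:
    """Count blocks of `new` whose hash already appears among the blocks of `old`."""
    known = set(block_hashes(old))
    return sum(1 for h in block_hashes(new) if h in known)


original = bytes(range(64))   # 8 distinct blocks of 8 bytes
appended = original + b"x"    # insertion at the end keeps old blocks aligned
prepended = b"x" + original   # insertion at the start shifts every boundary

print(shared_blocks(original, appended))   # 8
print(shared_blocks(original, prepended))  # 0
```

With one byte inserted at the front, none of the new blocks matches an old one, so a scheme based on fixed-size blocks would retransmit the whole file.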

Page 9:

LBFS SOLUTION FOR BLOCK SIZE

  LBFS only looks for non-overlapping chunks in files.

  It avoids sensitivity to shifting file offsets by setting chunk boundaries based on file contents.

  To divide a file into chunks, LBFS examines every (overlapping) 48-byte region of the file.

  LBFS uses Rabin fingerprints to select boundary regions called breakpoints.

  Fingerprints are efficient to compute on a sliding window in a file.

Page 10:

RABIN FINGERPRINTS

  Polynomial representation of the data in a 48-byte region, modulo an irreducible polynomial: Fingerprint = f(x) mod p(x)

  Probability of collision = max(|r|, |s|) / 2^(w-1)

  Boundary regions have the 13 least significant bits of their fingerprint equal to an arbitrary predefined value.

  The method is reasonably fast.
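A minimal sketch of content-defined chunking follows. It keeps the 48-byte window and the 13-bit test from the slides, but swaps in a simple polynomial rolling hash for a true Rabin fingerprint; `BASE`, `MOD`, and `MAGIC` are assumptions chosen for illustration, not values from LBFS.

```python
import random

WINDOW = 48            # bytes per sliding window, as on the slides
MASK = (1 << 13) - 1   # compare the 13 least significant bits
MAGIC = 0x78           # arbitrary predefined value (hypothetical choice)
BASE = 257             # rolling-hash base (assumption)
MOD = (1 << 61) - 1    # rolling-hash modulus (assumption, not a Rabin polynomial)
TOP = pow(BASE, WINDOW - 1, MOD)  # weight of the byte that leaves the window


def breakpoints(data: bytes) -> list[int]:
    """Return offsets just past every window whose hash marks a boundary."""
    cuts = []
    h = 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * TOP) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                 # shift in the newest byte
        if i >= WINDOW - 1 and (h & MASK) == MAGIC:
            cuts.append(i + 1)
    return cuts


def chunks(data: bytes) -> list[bytes]:
    """Split data at the content-defined breakpoints."""
    edges = [0] + breakpoints(data)
    if edges[-1] != len(data):
        edges.append(len(data))
    return [data[a:b] for a, b in zip(edges, edges[1:])]


random.seed(0)
data = bytes(random.randrange(256) for _ in range(100_000))
old = chunks(data)
new = chunks(b"x" + data)  # one byte inserted at the very start
reused = len(set(old) & set(new))
print(f"{reused} of {len(old)} original chunks survive the insertion")
```

Because each boundary depends only on the 48 bytes before it, the insertion can affect only the first chunk; every later chunk has the same contents and therefore the same hash. With 13 bits tested, a boundary is expected roughly once every 2^13 = 8192 bytes on random data.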

Page 11:

CHUNK BOUNDARIES AFTER A SERIES OF EDITS?

  The figure shows the file divided into variable-length chunks, with breakpoints determined by the hash of each 48-byte region.

  Effect of inserting some text into the file at chunk C4: transfer only the new chunk C8.

  Effect of inserting data in C5 that contains a breakpoint: the chunk splits into two new chunks (C9 and C10). Transfer only the two new chunks C9 and C10.

  One of the breakpoints is eliminated: C2 + C3 -> C11. Transfer only C11.

Page 12:

PATHOLOGICAL CASES

  Variable-size chunks can lead to pathological behavior.

  For example, every 48 bytes of a file could happen to be a breakpoint.

  Very large chunks would be too difficult to send in a single RPC.

  Arbitrary-size RPC messages would be somewhat inconvenient.

  Chunk sizes must be between 2 KB and 64 KB.

  Artificially insert chunk boundaries if the file is full of repeated sequences.
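One way to enforce the size limits is to post-process the candidate breakpoints, as in the sketch below. The function name `bounded_cuts` and its interface are assumptions made for illustration; only the 2 KB and 64 KB limits come from the slide.

```python
MIN_CHUNK = 2 * 1024    # 2 KB lower bound from the slide
MAX_CHUNK = 64 * 1024   # 64 KB upper bound from the slide


def bounded_cuts(candidates: list[int], length: int,
                 min_size: int = MIN_CHUNK, max_size: int = MAX_CHUNK) -> list[int]:
    """Turn candidate breakpoints into chunk end offsets within the size bounds."""
    cuts = []
    start = 0
    for cut in sorted(candidates) + [length]:
        # Force boundaries while the gap to the next candidate is too large.
        while cut - start > max_size:
            start += max_size
            cuts.append(start)
        # Accept the candidate only if the chunk is large enough
        # (the end of the file always closes the final chunk).
        if cut - start >= min_size or (cut == length and cut > start):
            cuts.append(cut)
            start = cut
    return cuts


print(bounded_cuts([48, 96, 3000], 200_000))
# [3000, 68536, 134072, 199608, 200000]
```

The breakpoints at 48 and 96 are ignored because they would create chunks under 2 KB, and boundaries are forced every 64 KB where the contents supply none. In this sketch only the final chunk of a file may be shorter than the minimum.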

Page 13:

CHUNK DATABASE

  The chunk database indexes chunks by the first 64 bits of their SHA-1 hash.

  The database maps keys to (file, offset, count) triples.

  LBFS never relies on the correctness of the chunk database.

  How to keep this database up to date? It must be updated whenever a file is updated.

  There can still be problems with local updates at the server site. Crashes can corrupt database contents.
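The idea of an index that is consulted but never trusted can be sketched as follows. `ChunkDatabase` and its in-memory `files` dictionary are hypothetical stand-ins, not the LBFS implementation; the point is that every lookup recomputes the SHA-1 hash before the data is used.

```python
import hashlib
from typing import Optional


def chunk_key(chunk: bytes) -> bytes:
    """Index key: the first 64 bits (8 bytes) of the chunk's SHA-1 hash."""
    return hashlib.sha1(chunk).digest()[:8]


class ChunkDatabase:
    """Maps keys to (file, offset, count) triples; entries may be stale."""

    def __init__(self, files: dict[str, bytes]):
        self.files = files  # stand-in for the file system (hypothetical)
        self.index: dict[bytes, tuple[str, int, int]] = {}

    def add(self, name: str, offset: int, count: int) -> None:
        chunk = self.files[name][offset:offset + count]
        self.index[chunk_key(chunk)] = (name, offset, count)

    def lookup(self, sha1: bytes) -> Optional[bytes]:
        """Return the chunk with this full SHA-1 digest, or None."""
        entry = self.index.get(sha1[:8])
        if entry is None:
            return None
        name, offset, count = entry
        chunk = self.files.get(name, b"")[offset:offset + count]
        # Never trust the index: recompute the hash before using the data.
        if hashlib.sha1(chunk).digest() != sha1:
            del self.index[sha1[:8]]
            return None
        return chunk


files = {"a.txt": b"hello world"}
db = ChunkDatabase(files)
db.add("a.txt", 0, 5)
digest = hashlib.sha1(b"hello").digest()
print(db.lookup(digest))          # b'hello'
files["a.txt"] = b"HELLO world"   # local update the database did not see
print(db.lookup(digest))          # None
```

A stale or corrupted entry therefore costs only a missed optimization: the chunk is reported as unknown and would be transferred normally.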

Page 14:

FILE CONSISTENCY

  The LBFS client currently performs whole-file caching.

  LBFS uses a three-tiered scheme to determine if a file is up to date.

  OPEN A FILE: IF lease not expired / IF lease expired

  The client gets a lease the first time a file is opened for read.

  The client renews an expired lease by requesting the file attributes.

  It is the job of the client to check if the cached copy is still current.
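One reading of the open path is sketched below. It assumes that the three tiers are a valid lease, an attribute check, and a full transfer; `CachedFile`, `open_cached`, the two callables, and the 60-second lease are all hypothetical names and values chosen for illustration.

```python
import time
from dataclasses import dataclass


@dataclass
class CachedFile:
    data: bytes
    mtime: float          # modification time recorded when the file was cached
    lease_expiry: float   # time at which the read lease runs out


LEASE_SECONDS = 60.0  # hypothetical lease length


def open_cached(entry: CachedFile, get_attributes, fetch, now=None) -> bytes:
    """Return current file contents, contacting the server only when needed."""
    now = time.time() if now is None else now
    if now < entry.lease_expiry:
        return entry.data                    # tier 1: lease still valid
    server_mtime = get_attributes()          # asking for attributes renews the lease
    entry.lease_expiry = now + LEASE_SECONDS
    if server_mtime == entry.mtime:
        return entry.data                    # tier 2: cached copy still current
    entry.data = fetch()                     # tier 3: transfer the new contents
    entry.mtime = server_mtime
    return entry.data
```

Here `get_attributes` stands for the attribute request that renews the lease and `fetch` for the transfer of new file contents; a valid lease needs no messages at all.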

Page 15:

FILE READS

  LBFS uses additional calls that are not in NFS: GETHASH for reads.
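A hedged sketch of how a read built on GETHASH could proceed: the client asks for the hashes of the file's chunks and fetches only the chunks it does not already hold. The `Server` class, its method signatures, and the dictionary cache are assumptions for illustration, not the LBFS wire protocol.

```python
import hashlib


class Server:
    """Toy server holding one file that is already split into chunks."""

    def __init__(self, chunks: list[bytes]):
        self.chunks = chunks
        self.bytes_sent = 0

    def gethash(self) -> list[tuple[bytes, int]]:
        """Return (SHA-1 digest, size) for each chunk of the file."""
        return [(hashlib.sha1(c).digest(), len(c)) for c in self.chunks]

    def read(self, offset: int, count: int) -> bytes:
        data = b"".join(self.chunks)[offset:offset + count]
        self.bytes_sent += len(data)
        return data


def read_file(server: Server, cache: dict[bytes, bytes]) -> bytes:
    """Rebuild the file, fetching only chunks missing from the local cache."""
    parts = []
    offset = 0
    for digest, size in server.gethash():
        if digest not in cache:
            cache[digest] = server.read(offset, size)
        parts.append(cache[digest])
        offset += size
    return b"".join(parts)


server = Server([b"alpha", b"beta", b"gamma"])
cache = {hashlib.sha1(b"alpha").digest(): b"alpha"}  # chunk already held locally
print(read_file(server, cache))  # b'alphabetagamma'
print(server.bytes_sent)         # 9
```

Only the 9 bytes of the two missing chunks cross the network; the cached chunk is reused.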

Page 16:

FILE WRITES

  The LBFS server updates files atomically at close time.

  It uses temporary files. Four RPCs are used in the update protocol:

  1. MKTMPFILE

  2. CONDWRITE

  3. TMPWRITE

  4. COMMITTMP
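The four RPCs can be sketched as a toy exchange. The RPC names follow the slide; the method bodies, argument lists, and the in-memory `Server` are assumptions, and the sketch appends chunks in order rather than writing them at explicit offsets.

```python
import hashlib


class Server:
    """Toy LBFS-like server; RPC names follow the slide, bodies are assumptions."""

    def __init__(self):
        self.files: dict[str, bytes] = {}
        self.chunk_db: dict[bytes, bytes] = {}   # SHA-1 digest -> chunk data
        self.tmp: dict[int, bytearray] = {}

    def mktmpfile(self, fd: int) -> None:
        self.tmp[fd] = bytearray()

    def condwrite(self, fd: int, digest: bytes) -> bool:
        """Append the chunk if the server already has it; report success."""
        chunk = self.chunk_db.get(digest)
        if chunk is None:
            return False                         # client must send the data
        self.tmp[fd] += chunk
        return True

    def tmpwrite(self, fd: int, chunk: bytes) -> None:
        self.chunk_db[hashlib.sha1(chunk).digest()] = chunk
        self.tmp[fd] += chunk

    def committmp(self, fd: int, name: str) -> None:
        self.files[name] = bytes(self.tmp.pop(fd))   # replace the file in one step


def write_file(server: Server, name: str, chunks: list[bytes]) -> int:
    """Send a file chunk by chunk; return the number of data bytes transmitted."""
    sent = 0
    fd = 1  # hypothetical temporary-file descriptor chosen by the client
    server.mktmpfile(fd)
    for chunk in chunks:
        if not server.condwrite(fd, hashlib.sha1(chunk).digest()):
            server.tmpwrite(fd, chunk)
            sent += len(chunk)
    server.committmp(fd, name)
    return sent


server = Server()
print(write_file(server, "f", [b"aa", b"bbb"]))         # 5
print(write_file(server, "g", [b"aa", b"cc", b"bbb"]))  # 2
```

The second write sends only the one chunk the server has not seen, and the target file changes only at the final commit.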

Page 17:

SECURITY

  LBFS uses the security infrastructure from SFS. All servers have public keys. Access control.

IMPLEMENTATION

  mkdb utility

  If file size < 8 KB

  Trash directory

Page 18:

EVALUATION - REPEATED DATA IN FILES

  Bandwidth consumption and network utilization are measured under several common workloads.

  LBFS is compared with CIFS, NFS version 3, and AFS.

Page 19:

EVALUATION (Cont) - BANDWIDTH UTILIZATION

•  Used 3 workloads.

•  (MS Word 1.4 MB file, gcc -> compiled emacs 20.7, ed -> perl)

Page 20:

EVALUATION (3) - APPLICATION PERFORMANCE

Page 21:

OVERVIEW

SHARK

  MOTIVATION

  INTRODUCTION

  CHALLENGES

  ADVANTAGES OF SHARK

  HOW SHARK WORKS?

  DESIGN

  IMPLEMENTATION

  EVALUATION

  CONCLUSION

  DISCUSSION (QUESTIONS)

Page 22:

MOTIVATION

Current systems

  1. Replicating the execution environment

  2. Multiple clients copying the same files

[Figure: a server holding Program 1, its libraries, programs P2 to P4, and data; clients C2 to C6 each launch their own copy of a program.]

  Users replicate their execution environment on each machine before launching a distributed application -> WASTES RESOURCES + DEBUG

[Figure: Servers 1 to 3 each hold Data1, Data2, and Data3; Client 1 requests Data3 while clients C2 to C4 update the replicas (Update DB).]

  Replicating data and serving the same copy of data to multiple clients -> INCONSISTENCIES

Page 23:

INTRODUCTION

  Shark is a distributed file system.

  Designed for large-scale, wide-area deployment.

  Scalable.

  Provides a novel cooperative-caching mechanism.

  Reduces the load on an origin file server.

Page 24:

CHALLENGES

  Scalability: what if a large program (say, several MB) is being executed from a file server by many clients?

  Because of limited bandwidth, the server might deliver unacceptable performance.

  As the model is similar to P2P file systems, administration, accountability, and consistency need to be considered.

Page 25:

HOW SHARK WORKS?

  Shark clients find nearby copies of data by using a distributed index.

  A client avoids transferring the file, or chunks of the file, from the server if the same data can be fetched from a nearby client.

  Shark is compatible with existing backup and restore procedures.

  By sharing reads among clients, Shark greatly reduces server load and improves client latency.

Page 26:

DESIGN - PROTOCOL (1/2)

1.  The Shark server divides the file into chunks by using the Rabin fingerprint algorithm.

2.  Shark interacts with the local host using an existing NFSv3 interface and runs in user space.

3.  When the first client reads a particular file, it gets the file and registers as a replica proxy for the chunks of the file in the distributed index.

4.  When a second client wants to access the same file, it discovers the proxies of the file chunks by querying the distributed index, establishes a secure channel to one or more such proxies, and downloads the file chunks in parallel.

Page 27:

DESIGN - PROTOCOL (2/2)

5.  After fetching, the client then registers itself as a replica proxy for these chunks.

6.  The distributed index exposes two API calls.

    Put: a client executes put to declare that it has something.

    Get: a client executes get to get the list of clients who have something.

SECURE DATA SHARING

7.  Data is encrypted by the sender and can be decrypted only by a client with appropriate read permissions.

8.  Clients cannot download large amounts of data without proper read authorization.
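The put and get calls of item 6, together with the registration in steps 3 to 5, can be sketched with an in-memory stand-in. `DistributedIndex`, `read_chunk`, and the two fetch callables are hypothetical names for illustration and say nothing about how the real index is distributed.

```python
from collections import defaultdict


class DistributedIndex:
    """In-memory stand-in for the distributed index (hypothetical class)."""

    def __init__(self):
        self._proxies: dict[bytes, list[str]] = defaultdict(list)

    def put(self, key: bytes, client: str) -> None:
        """Declare that `client` holds the data identified by `key`."""
        if client not in self._proxies[key]:
            self._proxies[key].append(client)

    def get(self, key: bytes) -> list[str]:
        """Return the clients that have announced `key`."""
        return list(self._proxies.get(key, []))


def read_chunk(index, key, client, fetch_from_proxy, fetch_from_server):
    """Fetch one chunk from a proxy if any exists, else from the origin server."""
    proxies = index.get(key)
    data = fetch_from_proxy(proxies[0], key) if proxies else fetch_from_server(key)
    index.put(key, client)  # step 5: register as a replica proxy
    return data
```

The first reader finds no proxies and falls back to the server; every later reader can be served by an earlier one, which is how load moves away from the origin.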

Page 28:

SECURE DATA SHARING

  How? Cross-file-system sharing.

  Shark uses a token generated by the file server as a shared secret between client and proxy.

  The client can verify the integrity of the received data.

  For a sender client (proxy client) to authorize a requester client, the requester client must prove knowledge of the token.

  Once authorized, the receiver client will establish the read permission of the file.
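The token idea can be sketched with Python's standard `hmac` module. The deck's cooperative-caching diagram gives the token as an HMAC keyed by r over the file; the SHA-1 digest and the nonce-based challenge and response below are assumptions added for illustration, not the exchange Shark specifies.

```python
import hashlib
import hmac
import os


def make_token(r: bytes, content: bytes) -> bytes:
    """Server side: token = HMAC keyed by r over the content (SHA-1 assumed)."""
    return hmac.new(r, content, hashlib.sha1).digest()


def prove_knowledge(token: bytes, nonce: bytes) -> bytes:
    """Requester side: answer a proxy's challenge without revealing the token."""
    return hmac.new(token, nonce, hashlib.sha1).digest()


def proxy_authorizes(token: bytes, nonce: bytes, proof: bytes) -> bool:
    """Proxy side: accept only a proof derived from the same token."""
    return hmac.compare_digest(prove_knowledge(token, nonce), proof)


r = os.urandom(20)                      # server-chosen key (hypothetical size)
token = make_token(r, b"chunk contents")
nonce = os.urandom(16)                  # fresh challenge from the proxy
print(proxy_authorizes(token, nonce, prove_knowledge(token, nonce)))        # True
print(proxy_authorizes(token, nonce, prove_knowledge(b"wrong", nonce)))     # False
```

A client that never received the token from the file server cannot produce a valid proof, so the proxy can refuse to serve it.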

Page 29:

FILE CONSISTENCY

  Shark uses two network file system techniques: leases and AFS-style whole-file caching.

  When a client makes a read RPC to the file server, it gets a read lease on the particular file.

  In Shark, the default lease duration is 5 minutes.

  To exploit file commonalities, it uses the LBFS approach.

Page 30:

COOPERATIVE CACHING

[Figure: message sequence between client C1, the server, and clients C2, C3, C4.]

  C1: file not cached or lease expired.

  C1 -> Server: GETTOK (fh, offset, count)

  Server: 1. T_F = tok(F) = HMAC_r(F)  2. Split data into chunks  3. Compute tokens for chunks

  Server: issues a series of read calls to the kernel NFS server. Caches the tokens for future reference.

  Server -> C1: 1. file attributes  2. file token  3. (chunktok1, offset1, size1) (chunktok2, offset2, size2) (chunktok3, offset3, size3)

  C1: determines whether the local cache is the latest. If it is not, creates K threads (t1, t2, t3).

  C1 -> C2, C3, C4: multicast request for chunk Fi. Fetches K chunks in parallel (chunks F1, F2, F3).

  C1: PUT() -> announces itself as a proxy for Fi.
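Fetching K chunks in parallel can be sketched with a thread pool. The function and its callables (`proxies_for`, `download`, `fetch_from_server`) are hypothetical; the sketch shows only the fan-out and the fallback to the origin server.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_chunks(tokens, proxies_for, download, fetch_from_server, k=3):
    """Fetch every chunk, up to k at a time, preferring proxies over the server."""

    def fetch_one(token):
        for proxy in proxies_for(token):
            data = download(proxy, token)
            if data is not None:
                return data
        return fetch_from_server(token)  # no proxy could supply the chunk

    with ThreadPoolExecutor(max_workers=k) as pool:
        return list(pool.map(fetch_one, tokens))  # results keep token order


store = {"t1": b"a", "t2": b"b", "t3": b"c"}
proxies = {"t1": ["C2"], "t2": [], "t3": ["C4"]}
result = fetch_chunks(
    ["t1", "t2", "t3"],
    lambda t: proxies[t],
    lambda proxy, t: store[t] if proxy == "C2" else None,  # only C2 answers
    lambda t: store[t],
)
print(b"".join(result))  # b'abc'
```

`pool.map` returns results in the order of the tokens, so the chunks can be concatenated directly even though they arrive out of order.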

Page 31:

DISTRIBUTED INDEXING

  Shark uses a global distributed index for all Shark clients.

  The system maps opaque keys onto nodes by hashing the value onto a key ID.

  Assigning IDs to nodes allows lookups in O(log(n)).

  Shark stores only a small amount of information about which clients store what data.

  Shark uses Coral as its distributed index.

  Coral provides a distributed sloppy hash table (DSHT).

  Coral caches key/value pairs at nodes whose IDs are close to the key.
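The mapping of keys onto nodes can be sketched by hashing both into one ID space and picking the nearest node IDs. The XOR distance and SHA-1 used here are assumptions for illustration, and the linear scan does not model the O(log(n)) routing that a real overlay provides.

```python
import hashlib


def to_id(value: bytes) -> int:
    """Hash an opaque key or a node name onto the shared ID space."""
    return int.from_bytes(hashlib.sha1(value).digest(), "big")


def closest_nodes(key: bytes, node_names: list[bytes], count: int = 1) -> list[bytes]:
    """Return the nodes whose IDs are closest to the key's ID (XOR distance assumed)."""
    key_id = to_id(key)
    return sorted(node_names, key=lambda n: to_id(n) ^ key_id)[:count]


nodes = [b"node-a", b"node-b", b"node-c", b"node-d"]
print(closest_nodes(b"chunk-token", nodes, count=2))
```

Any client that hashes the same key reaches the same nodes, so a put by one client and a get by another meet without a central directory.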

Page 32:

IMPLEMENTATION

  Shark consists of 3 main components: a server-side daemon, a client-side daemon, and a Coral daemon.

  They are implemented in C++ and built using the SFS toolkit.

  Client-side daemon:

  Biggest component of Shark.

  Handles user requests.

  Transparently incorporates whole-file caching.

  Code is ~12,000 lines.

Page 33:

EVALUATION (1/2)

  Shark is evaluated against NFSv3 and SFS.

  Read tests are performed both within the controlled Emulab LAN environment and in the wide area on the PlanetLab v3.0 test-bed.

  The server required 0.9 seconds to compute chunks for a 10 MB random file, and 3.6 seconds for a 40 MB random file.
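The two timings imply the same chunking rate, which a short calculation makes explicit. The helper name `throughput` is hypothetical; the sizes and times are the ones quoted on the slide.

```python
def throughput(size_mb: float, seconds: float) -> float:
    """Chunking rate in MB per second."""
    return size_mb / seconds


for size_mb, seconds in [(10, 0.9), (40, 3.6)]:  # measurements from the slide
    print(f"{size_mb} MB in {seconds} s: {throughput(size_mb, seconds):.1f} MB/s")
# 10 MB in 0.9 s: 11.1 MB/s
# 40 MB in 3.6 s: 11.1 MB/s
```

Both files are chunked at about 11.1 MB/s, so for these two measurements the cost grows linearly with file size.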

Page 34:

EVALUATION - MICROBENCHMARKS

  For the local-area micro-benchmarks, local machines at NYU are used as Shark clients.

  In this micro-benchmark, Shark's chunking mechanism reduces redundant data transfers by exploiting data commonalities.

Page 35:

QUESTIONS

THANK YOU