19
D2.7 – Final Secure Block Device 1 Final secure block device D2.7 Project reference no. 653884 February 2018

Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 1

Final secure block device

D2.7

Project reference no. 653884

February 2018

Page 2: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 2

DocumentinformationScheduleddelivery 28.02.2018Actualdelivery 28.02.2018Version 1.2ResponsiblePartner UniNE

DisseminationlevelPublic

RevisionhistoryDate Editor Status Version Changes 20.02.2018 D.Burihabwa Draft 0.1 Initialversion27.02.2018 D.Burihabwa Draft 0.2 Amendedversion28.02.2018 D.Burihabwa Final 1.0 Finalversion 28.02.2018 D.Burihabwa Final 1.1 Internalreviewerscorrection28.02.2018 H.Mercier Final 1.2 FinalversionwithC&Hcomments

ContributorsD.Burihabwa(UniNE)H.Mercier(UniNE)

InternalreviewersJ.Paulo(INESCTEC)S.Schmerler(Cloud&Heat)

AcknowledgementsThis project is partially funded by the European Commission Horizon 2020 workprogrammeundergrantagreementno.653884.

MoreinformationAdditional information and public deliverables of SafeCloud can be found athttp://www.safecloud-project.eu

Page 3: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 3

TableofcontentsDocumentinformation................................................................................................................................2Disseminationlevel......................................................................................................................................2Revisionhistory.............................................................................................................................................2Contributors....................................................................................................................................................2Internalreviewers........................................................................................................................................2Acknowledgements.......................................................................................................................................2Moreinformation..........................................................................................................................................2Tableofcontents...........................................................................................................................................3Executivesummary.......................................................................................................................................41 SafeCloudarchivalusingdataentanglement.............................................................................51.1 Todeleteornottodelete.................................................................................................................................6

2 Architecture...........................................................................................................................................72.1 Design......................................................................................................................................................................72.2 Implementation...................................................................................................................................................7

3 Deployment............................................................................................................................................93.1 Localdeployment................................................................................................................................................93.1.1 Buildthedockerimages........................................................................................................................93.1.2 Deploylocally.............................................................................................................................................9

3.2 Distributeddeployment...................................................................................................................................93.2.1 Createadockerswarm..........................................................................................................................93.2.2 Build,pushanddownloadthedockerimages...........................................................................103.2.3 Deployontheswarm............................................................................................................................10

4 Configuration.......................................................................................................................................114.1 Entanglementparameters............................................................................................................................114.2 Storageparameters.........................................................................................................................................12

5 Basicusage...........................................................................................................................................135.1 Insertadocument.............................................................................................................................................135.2 Readadocument..............................................................................................................................................135.3 Repairadocument...........................................................................................................................................13

6 Replicamanagement.........................................................................................................................146.1 Replicamanagementbyreferencecounting........................................................................................146.2 Replicamanagementbyslidingwindow................................................................................................14

7 Metadatamanagement.....................................................................................................................167.1 Metadatastorageoverhead.........................................................................................................................167.2 Metadatareconstruction...............................................................................................................................16

8 Summary...............................................................................................................................................189 References............................................................................................................................................19

Page 4: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 4

ExecutivesummaryThe past few years have seen the emergence of competitive commercial storagesolutions.Individualsandcompanieslookingforstoragespacetosharedocumentswiththeworldorhosttheirremotebackupsnowhaveplentyofcloudalternativestoturnto.Butwheneverauserchoosesoneofthesesolutions,itentruststhecloudproviderwiththedurabilityoftheirdata.Todosoatareasonablecost,mostprovidersperformmulti-site replication. While this enables recovery from accidental loss it does not preventtamperingfrommaliciousoutsidersandeventheprovideritself.Tosolvethisproblem,theSafeCloudprojectprovidessolutionsaspartofWP2.

Tosatisfydifferentstorageneeds,WP2buildsuponexistingself-hostedandcommercialstoragecomponents(SS1)tooffertwosolutions:thesecuredataarchive(SS2)andthesecurefilesystem(SS3).SS3isaPOSIX-compliantdistributedfilesystemontopofcloudstorageprovidersthatfocusesonthesecurityofthedata.Incontrast,SS2isaRESTfulobjectstorefocusingondurability.Withdifferentobjectives,SS2andSS3bothleverageexisting bricks to split the data and trust in such a way that they can recover fromindependentfailures.Inparticular,SS2canrecoverfromfailuresbeyondthecapabilitiesoftypicalreplicationanderasurecodingschemes.Bybuildingrecursivelinksbetweendocuments through data entanglement, the secure data archive can detect and repairtargetedcensoringattempts.IntherestofthisdocumentwepresentRECAST,thefinalprototypeversionofthesecuredataarchive.Moreprecisely,wedescribehowlong-termprotection (D2.2) and short-term protection (D2.5) are blended together in a finalimplementationbuiltontopofplaycloud(D2.2).ThisdocumentservesasausermanualforthedeploymentandexperimentationofRECAST.Partsofthecodebasearepubliclyavailableathttps://github.com/safecloud-project.

This document is organized in two parts. In Section 1, we summarize STEP-archivesintroducedfirstinD2.2andextendedinD2.5.FromSection2andon,wepresentauser-oriented manual describing how to deploy, configure and manage an instance ofRECAST.

SecurestorageSS1

SecureblockstorageSS2

SecuredataarchiveSS3

SecurefilesystemTable1:securestoragesolutions.

Page 5: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 5

1 SafeCloudarchivalusingdataentanglementAspartofSafeCloud,weintroduceSTEP-archives,astoragesystemforarchivingcodeddocuments. STEP-archives were first presented in [MAL16], extensively discussed inD2.2,furtherdiscussedinD7.10andfinallyextendedinD2.5.Wesummarizethemhereforcompleteness.Usingdataentanglementanderasure-correctingcodes,wedevelopadatastoragearchitecturewhereastoreddocumentcanonlybedeletedormodifiedbycompromisingtheintegrityofotherdocumentsinthesystem.

Therearetwomainobjectivesbehindthiswork.Thefirstobjectiveisdataintegrity.Wewant to provide guarantees to users that their data cannot be deleted or corruptedwithout compromising other data stored by themselves or other users. The secondobjective is toprovidecensorshipresistancebyforcingacensorwhowantstotamperwith data to do so noisily, i.e., being forced to corrupt a large number of otherdocuments in the system. An ancillary result deriving from the two objectives isincreased protection against failures, which can be seen as attacks from random orfailure-specificcensors.Definition 1. A (s,t,e,p)-archive is a storage system where each archived documentconsistsofacodewordwiths sourceblocks, t tangledblocks,pparityblocksand thatcancorrecte=p-sblockerasures.

When a document is archived, it is split into s ≥ 1 source blocks. Using the s sourceblocks with t distinct old blocks already archived, a systematic maximum distanceseparable (MDS) code [LC04] is used to create p ≥ s parity blocks which are thenarchivedonthesystem.

Anarchiveddocumentcanberecoveredfroms+tormoreofitsblocks.Thecodecancorrectp blockerasuresperdocument codeword,but since the sourceblocks arenotarchivedandareconsideredaserased,atmost𝑒 = 𝑝 − 𝑠blockerasuresperdocumenton the storage medium can be corrected. Note that increasing t does not increasestorageutilizationoverheadorerror-correctingcapabilitybutdoesincreasecodinganddecodingcomplexity.

Anattackercancensoradocument𝑑' byerasingmorethaneofitsblocks.However,byentangling newdocumentswith documents already archived, itmight be possible forthesystemtorecoverthedeletedblocksbyrecursivelydecodingotherdocumentsthatusethem.Thechallengingpartofourapproachisthustochoosethepointerstoentangledblocks.InD2.5we improved uponD2.2 by combining uniform random selection and normaldistribution-basedselectioncentredonthetailofthearchive.Asaresult,wemaintainedrandomness in the structure preventing the attacker from planning the attack inadvance while insuring a short-term protection of the newer documents throughreplication.

FordetailedinformationonSTEP-archives,pleaserefertoD2.2andD2.5.

Page 6: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 6

1.1 Todeleteornottodelete

Theconflictbetweenlegitimateandillegitimatedeletionisalwayspresent:ifitiseasytodeletefilesforalegitimatereason,thentheprotectionofferedbyentangleddataislostandillegitimatetamperingbecomeseasier.Thisisunavoidable,andinthissensesimilartodistributedledgerssuchasblockchains:tampering,includingdeletion,mustbeverydifficult by design. Throughout the project, we studied mechanisms to legitimatelydelete entangled files from a STEP-archive. The conclusion is that although deletionmechanisms are indeed possible, they require major design changes, are technicallyexpensive, and always decrease the anti-tampering properties of a STEP-archive. Thisundesirabletradeoffwasnotworththebenefitsofanimplementation,thuswedecidednotto implementanyofthesedeletiontechniques.ThisclearlysignalsonceagainthatourSTEP-archivesshouldonlybeusedtostoreimmutabledata.

Page 7: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 7

2 Architecture2.1 Design

Figure2ThearchitectureofRECAST.

Our implementation of the step-archive, named RECAST, is based on playcloud(introduced in D2.2 and used for experimental evaluation in D2.5). As such itsarchitecture is similar to playcloud’s with the addition of a few components formetadata management and coordination. The main components of the system are: aproxy/encoder, a metadata server, a coordination service and one or more storagenodes.The proxymediates interactions between clients and the system communicating overHTTPthroughaRESTinterface.UsersmayinsertnewdocumentsusingaPUTrequestandretrievethembyissuingatGETrequest.The encoder entangles the new documents with older ones already present in thesystem. Note that depending on the use cases and resources available, the proxy cantakecareoftheencodingwithoutdelegatingtoadedicatedencoderinstance.Themetadataserverkeepstrackofthestoreddocument.Theinformationhelpslookingup blocks and checking their integrity as they are fetched from the storage nodes. Inpractice,themetadataisstoredinacentralizedkey-valuestore.Thecoordinationserviceenables separateprocesses suchas theproxyand the repairdaemontooperatesimultaneouslyinthesameliveinstance.Finally, storage nodes serve as backends. They are considered independent andleveragedbytheproxytobalancethereadingandwritingloadevenly(astheblocksareplacedrandomly).

2.2 Implementation

The actual implementation of RECAST is a combination of existing pieces of softwareandnewcode.Thefollowingparagraphisabriefpresentationofthesoftwareused.The proxy listens through a REST API written in Python(2.7) [python] exposed byuwsgi 2.0.15 [uwsgi] application server. The entanglement is implemented usingPyEClib[pyec]and liberasurecode[libec]and Intel ISA-L [isa].Communicationbetweenseparateproxyandencoderinstancesisdoneusinggrpc[grpc].The metadata server is a Redis[redis] server, an in-memory key-value store, whosewritesare loggedondisk.Thecoordinationservice isZookeeper[zkp].Finally, storagenodescaneitherberedisorminio[min]nodes.

Page 8: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 8

Each component is packaged as a separate docker [docker] container and can bedeployed either locally, using docker-compose [dc] or on multiple machines usingdockerswarmmode[ds](seeDeploymentsection).

Page 9: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 9

3 DeploymentThissectiondetailsthedeploymentprocedureforRECAST.AsRECASTmostlyreliesondockerforeaseofbuildanddeployment,pleasemakesurethatyouhavethefollowingrequirementsinstalledonyourmachine.

3.1 Localdeployment

Alocaldeploymentonlyrequiresdocker[docker]anddocker-compose[dc].Thisprocessfollows2steps:

• Buildthedockerimages• Deploylocally

3.1.1 Buildthedockerimages

Atthetopoftheproject,makesurefiledocker-compose.ymlispresentandthenrun:

Thiscommandwillbuildall the imagesthatarenotpre-built in thedockerhub.At theendoftheprocessyoushouldbereadytodeployrecast.

3.1.2 Deploylocally

Havingmadesurethattheimageswerebuiltproperly,run

You should now have a running instance of RECAST listening on port 3000. You caninteractwith this instance by trying to upload or read files as described in the basicusagesection.Tostopyourinstance,run:

3.2 Distributeddeployment

RECASTcanbedeployedinadistributedmodewherethecomponentsarespreadovermultiplemachines. To deploy tomultiplemachines, RECASTuses docker anddocker-swarm[ds].Atypicaldeploymentfollows3steps:

• Createadockerswarm• Build,pushanddownloadthedockerimages• Deployontheswarm

3.2.1 Createadockerswarm

Firstensurethatall themachinesthataretobeaddedtotheclusterhavethedocker-engineinstalled.Oneofthesemachinesistobechosenastheleaderoftheswarm.Onthatmachine,run:

docker-compose build

docker-compose up --detach=true

docker-swarm init

docker-compose down

Page 10: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 10

Takegoodnoteoftheleadertokengivenasoutputofthecommandasitisgoingtobeused by the other machines to join the cluster. On the other machines, run docker-swarmjoinwiththetokentocreatetheswarm.

Bytheendoftheprocess,theswarmisassembledandreadytomoveonimagebuildinganddeployment.

3.2.2 Build,pushanddownloadthedockerimages

Inorderruninswarmmode,thedockerimagesusedbyRECASTmustbebuiltandthenpushed to a docker registry. More specifically, the proxy image must be pushed forRECAST towork.For thisprocess,makesure thatyouhaveanaccounton thedockerhub.First,buildtheimage:

Enteryourdockercredentialsifyouhavenotyetbeenauthenticated.

Thenpushtheimagetothedockerhub.

Finally, edit the docker-compose-production.yml to replace the username prefix in theproxyservice.

3.2.3 Deployontheswarm

Aftersuccessfullybuildingandpushingthedockerimagetothedockerhub,itistimetogetRECASTrunning.Atthetopoftheproject,run:

Whileyouwillgetcontrolofyourconsolepromptquicklyafterrunningthecommandabove, thecompletedeploymentofRECASTmighttakesometime. Indeed,onthefirstdeploymentovertheswarmtheimagesrequiredtothecontainermustbepulledbythemachinesrunningthematchingservice.

docker stack deploy --compose-file docker-compose-production.yml recast

docker swarm join <leader-token-here>

docker build --file pyproxy/Dockerfile –-tag <your-docker-username>/proxy

docker push <your-docker-username>/proxy

docker login

… version: "3" services: proxy: … image: <your-docker-username>/proxy …

Figure3Sectiontoeditindocker-compoe-production.yml.

Page 11: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 11

4 ConfigurationRECASTcanbe launchedwithoutchanges to theoriginalconfigurationbut thedefaultsettings may not suit all use cases. Further fine-tuning may be needed to match thedesiredconfiguration.Inparticular,RECAST’ssettingscanbetweakedalongtwoaxes:

1. Entanglementparameters2. Storageparameters

The tweakedparametersultimatelydetermine the storageoverhead (in termsofdiskutilization)toexpectwhenlaunchingaRECASTinstance:

𝑠𝑡𝑜𝑟𝑎𝑔𝑒𝑜𝑣𝑒𝑟ℎ𝑒𝑎𝑑 = 𝑝𝑠∗ 𝑟𝑒𝑝𝑙𝑖𝑐𝑎𝑡𝑖𝑜𝑛𝑓𝑎𝑐𝑡𝑜𝑟

The systemoperatormust thereforebe careful that the chosenRECAST configurationmatchestheresourcesavailable.

4.1 Entanglementparameters

RECAST implements the (s,t,e,p)-archive scheme and lets the operator set the codingparameters.Thenumberofsourceblocks(s),ofpointers(t)andofparities(p)canbetuned by editing the content of configuration.json (at the top of the project) in theentanglementsection.

Thefileconfiguration.jsonservesasthecentralconfigurationfile.Tospreadthechangestootherconfigurationfiles,theoperatorneedstorunthefollowingcommandfromthetopoftheproject.

The configuration changes should now have spread to the other configuration files:pyproxy/dispatcher.jsonandpyproxy/pycoder.cfg.

{ … "entanglement": { "type": "step", "configuration": { "s": 1, "t": 10, "p": 3 } } … }

Figure4Contenttomodifyinconfiguration.json.

./scripts/configure.py

{ … "entanglement": { "configuration": { "p": 3, "s": 1, "t": 10 }, "type": "step", … }

[main] … splitter = entanglement … [entanglement] type = step … [step] s = 1 t = 10 p = 3

Figure5Excerptsofpyproxy/dispatcher.jsonandpyproxy/pycoder.cfgafterchangesinentanglementconfiguration.

Page 12: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 12

4.2 Storageparameters

Forlocalordistributedexperimentation,thenumberandtypeofstoragenodes(minio[min] or redis[redis]) can be specified in configuration.json. You can also change thereplicationfactor,whichwillimpactthestorageoverheadbeforereplicamanagement.

Thechangecanthenbepropagatedtotheotherconfigurationfilesbyrunning:

This will result in a matching change in files docker-compose.yml, docker-compose-production.ymlandpyproxy/dispatcher.json.

Pleasenotethatpyproxy/dispatcher.jsonistheactualconfigurationfilereadbyRECASTatstartuptime(configuration.jsonisignoredatruntimeanditschangesmustbespreadby running ./scripts/configure.pybeforehand). In consequence,pyproxy/dispatcher.jsoncan be manually configured to connect to other storage backends that may not becontainerized,deployedlocallyorevenpartoftheswarm.Theoperatorshouldtakecareof deleting the unnecessary containers fromdocker-compose.yml anddocker-compose-production.ymlbeforestartingRECASTwhentheirstoragebackendsarenotmanagedbydocker.

scripts/configure.py

… storage-node-42: image: redis:3.2.8 deploy: placement: constraints: - node.role == worker …

… storage-node-42: image: redis:3.2.8 volumes: - ./volumes/storage-node-42/:/data/ container_name: storage-node-42 …

Figure6Excerptsofdocker-compose.yml,docker-compose-production.ymlandpyproxy/dispatcher.jsonafterchangesinstorageconfiguration.

… "replication_factor": 2, "providers": { … "storage-node-42": { "host": "storage-node-42", "type": "redis", "port": 6379 }, …, }, …

… "storage": { "nodes": 42, "type": "redis", "replication_factor": 2 }, …

Page 13: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 13

5 BasicusageUserscan interactwithaRECAST instance through theREST interfaceusinganHTTPclientsuchascurl[curl]oranyothersoftwarelibraryoftheirchoice.

5.1 Insertadocument

Clientscanuploadfilestothearchiveusingasimplecommand.

Inthiscase,theclientuploadsthedocumentlocatedatpath/to/fileunderthekeyname.Onasuccessfulrequest,RECASTrepliestotherequestwiththechosendocumentname.In case of the failure RECAST replieswith the appropriate HTTP error code (400 onemptyrequestsor409whentryingtooverwriteadocument).Inthecasewhereadocumentcannotbestoredunderauser-preferedname,theclientcanchoosetouploadwithoutafilenameandletRECASTpickoneforthem.

RECASTwill then pick a randomUUIDv4 as the name and send it in the reply to theclient.

5.2 Readadocument

OnceuploadedtoRECAST, filescanbereadusing the filenamepickedby theclientortheuseratuploadtime.

Withthiscommand,ausercanrecoveracopyofadocumentstoredinRECASTunderthekeynameinpath/to/file.

Ifauserisonlyinterestedinreadingmetadataaboutagivendocument,theycanissuearequestonthedocumentandgetresultasaJSONobjectdetailinginformationaboutthedocument,itsblocksandthepointersusedforentanglement.

5.3 Repairadocument

In case of loss of one or several blocks, the repair can be operated by attaching to aproxyinstanceandrunningthefollowingcommand:

Whenauditingtheentirearchivetocheckforcorruptionofblocks,run:

curl --request PUT http://proxy-ip:3000/name --upload-file path/to/file

curl --request PUT http://proxy-ip:3000/ --upload-file path/to/file

curl --request GET http://proxy-ip:3000/name --output path/to/file

curl --request GET http://proxy-ip:3000/name/__meta

docker exec --interactive --tty proxy ./repair.py <block ids>

docker exec --interactive --tty proxy ./repair.py

Page 14: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 14

6 ReplicamanagementTherightselectionofpointersforentanglementisparamounttolong-termprotectionofdocumentsinthearchive.D2.5describedhowtoachievebetterselectionresultsusingamixedapproachbypickingonepartofthepointersovertheentirearchiveandtheotherinaslidingwindowoverthemostrecentdocumentsthatenteredthearchive(akathetail).Inaddition,weproposedasimilarwindow-basedapproachtoprovideshort-termprotectiontodocumentsinthetailofthearchivethroughreplication.Todealwiththestorageoverheadof this replication (seeSection4),our implementationprovides twostrategies for replicamanagement: reference-counting andwindow-based.The rest ofthissectiondescribesthesestrategiesandtheirpracticaluses.

6.1 Replicamanagementbyreferencecounting

Anewdocumententeringthearchiveisentangledandsplitintoblocks.Theseblocksareimmediately replicated, and the copies are spread across the storage nodes forredundancy.Fromthereon,theblockscanbeusedaspointersfortheentanglementofnew documents. Thus, the reference count, or the number of times a block has beenselected as a pointer, is bound to increase over time. As this reference count alsoinforms us on the possibility of recursive reconstruction in case of failure, we canleverageittodecidehowtodealwiththeexplosioninstorageoverhead.Using a threshold value th, defined by the system operator, the replica managementscriptcancrawl themetadata looking forblocksthathavebeenpointedatthormoretimesanddeletetheircopiestolowerthestorageoverheadacrossthestoragenodes.Thisstrategyguaranteesrecoveryofallblocksincaseofblocklossbutrequiresexpertfine-tuning.Indeed,poorsettingscanleadtoanarchivewherethestorageoverheadisnotmitigatedasbestaspossibleorworsewheresomedocumentscannotberecoveredincaseoffailure.To run the replica management script using a reference-counting base in a RECASTinstanceonDocker,run:

Orasadaemon,runningatfixedintervals:

6.2 Replicamanagementbyslidingwindow

A new document entering the archive is entangled and split into blocks. As a recentdocument,itisstillpartofthetailofthearchivemadeofthewmostrecentdocumentsinthearchive.Inasystemthatusesmixedpointerselection,andisthusbiasedtowardsdocumentsinthetail,wecanassumethatasdocumentsexitthewindow,theycanhavebeenusedaspointers and copiesof theirblocks canbediscarded (seeD2.5 formoredetails).

dockerexec--interactive--ttyproxy./scrub.py--pointers<th>

dockerrun--interactive--ttyproxy--entrypoint./scrub.py--pointers<th>\--interval<seconds>

Page 15: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 15

The slidingwindow strategy offers away tomaintain a predictable storage overheadandevenguaranteesthat thesizeoccupiedbyreplicasdecreasesovertimerelativetothe size of the archive. But it does sowith little regard to the actual recoverability ofblocks themselves. Indeed, a document could exit the window without all his blocksbeingpointedatleastonce.TorunthereplicamanagementscriptusingaslidingwindowbaseinaRECASTinstanceonDocker,run:

Orasadaemon,runningatfixedintervals:

dockerexec--interactive–-ttyproxy./scrub.py--window<w>

dockerrun--interactive--ttyproxy--entrypoint./scrub.py--window<w>\--interval<seconds>

Page 16: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 16

7 MetadatamanagementBoth long-term and short-term protection schemes require the maintenance ofmetadatabythesystemoperator.Randomormixedentanglementimplykeepingalistofpointersandblocks location.Replicationmanagementcanonlybeperformedwhencopies of blocks can be identified and located. In our implementation, a centralizedmetadata server is set up to keep track of all this information. Please note that acentralizedmetadataserverisonlynecessarybecauseblockplacementisrandom.Iftheblocks were placed in a deterministic fashion, we could remove this component.Howeveras(s,t,e,p)-archivesbuilduponrandomnesstopreventpre-computedattacks,wehavechosennottoimplementsuchablockplacementmodel.In the rest of this section, we describe metadata management in our prototype, itsstrengthandlimitations,andhowtheycanbemitigated.Foroperationalpurposes,thesecurearchiveneedstokeeptrackofblocks’locationandthe entanglement graph. Thismetadata ismaintained in a Redis [redis] database andservesasthesourceoftruthforthesystem.Componentsofthesystemthatneedtoreador write metadata interact with the database using a Python [python] client. Suchcomponentsincludetheproxyorthereplicamanagementandblockrepairscripts.

7.1 Metadatastorageoverhead

In terms of growth, the storage overhead is predictable. Variations can be observeddependingontheconfigurationofSTePbutstoring1documentaddsaround1kBtothemetadataserverregardlessofthesizeofthedocumentsstoredinthearchive.

Figure7Metadatastorageoverheadwithanincreasingnumberofdocumentsinthe

archivewithdifferentSTePconfigurations.The storageoverheadof themetadata can furtherbe reduced through severalmeans.Betternormalizationwouldreducetheoverallsizebyavoidingrepetitionindocumententries.Thiswouldbeachievableatthecostoflongerandmorecomplexqueries.

7.2 Metadatareconstruction

Our system leverages randomness in the selection of tangled blocks and in theplacementofnewblocksintostoragenodes.Ifmetadataisunavailable,damagedorlost,storedblocksbecomemeaningless.Tomitigatethis issue,we implementamechanismtoreducerisksofcompletelossofaccesstothemetadataserver.Thisprocedureenablesthesystemtoreconstruct themetadata fromthedata itself.Under theassumptionsof

Page 17: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 17

availableandhoneststoragenodesaswellaspristinedatablocks,wescanthestoragenodes, examining the hosted blocks and reconstruct the associated metadata. Thissolution is possible becausewe prepend the entanglement information to each blockbeforesending it to thestoragenodes.Morespecifically, givenablock, thismetadata-overheadincludesareferencetoallthepointersselectedduringtheentanglementofthedocument. In our prototype, it consists of a fixed 80 Bytes per block (erasure codinginformation) and an additional t * (average block name length). As blocks are namedaccording to the document they belong to and their position in the codeword, thisaveragelengthdependsonthenamingpatternsinthearchive.Figure 7 shows the availability of documents duringmetadata reconstruction in a 16nodes archive with different configurations and replication factors. If all blocks arereplicated,wecanexpecttobeabletoservealldocumentsbeforereadingfromallthestorage nodes as depicted by the full points. In contrast, if none of the blocks arereplicated,manystoragenodeswillhavetobecrawledthroughuntilwegatherenoughknowledgeabouttheentiresystem.Whenrunninganinstanceofoursystemwithmixedentanglementandreplicationmanagementenabled,thenumberofreplicatedblocksisconstantasthearchivegrows.Followingthisprinciple,thenumberofnodesthatneedtobeexploredtorebuildafunctionalarchiveincreases.

Torebuildthemetadataafteralossofthemetadataserver,someinformationabouttheRECASTinstancemustbegathered.Thehostname(orIPaddress)andtheportnumberusedbythemetadataserveraswellasthepathtodispatcher.json.IfyouhavearunningRECASTusingDocker,attachtotheproxycontainerandrunthefollowingcommand:

docker exec -–interactive --tty proxy ./rebuild_metadata.py -–host <host> --port <port> --conf </path/to/dispatcher>

Figure8Availabilityofdocumentsduringmetadatareconstructionina16nodesarchivewithdifferentSTePconfigurationsandreplicationfactors.

Page 18: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 18

8 SummaryThis deliverable presents the final implementation of the secure block device whichleverages thesystem introduced inD2.2andextended toprotect thewholearchive inD2.5.Re-introducing(s,t,e,p)-archivesasthecornerstoneofthisprototype,itcoversthearchitecture of the final system, RECAST, and presents a user-orientedmanual for itsdeployment,useandmaintenance.

Page 19: Final secure block device D2 - SafeCloud Project · secure file system (SS3). SS3 is a POSIX-compliant distributed file system on top of cloud storage providers that focuses on the

D2.7–FinalSecureBlockDevice 19

9 References[curl] https://curl.haxx.se/[docker] https://www.docker.com/[dc] https://docs.docker.com/compose/[ds] https://docs.docker.com/engine/swarm/[grpc] http://www.grpc.io/[isa] https://github.com/01org/isa-l[LC04] ShuLinandDanielJ.Costello.ErrorControlCoding.PaersonPrenticeHall,

secondedition,2004.[libec] https://github.com/openstack/liberasurecode/[MAL16] Hugues Mercier, Maxime Augier and Arjen K. Lenstra, STEP-archival:

Storage Integrity and Tamper Resistance using Data Entanglement,SubmittedtotheIEEETransactionsonInformationTheory,2016(revised2017).Availableuponrequest.

[min] https://minio.io/[pyec] https://github.com/openstack/pyeclib[python] https://www.python.org/[redis] https://redis.io[uwsgi] https://github.com/unbit/uwsgi[zkp] https://zookeeper.apache.org/