INDIGO DataCloud
DESY Cloud, The Scientific Data Cloud
Managed Shared Storage
At the “ownCloud Connects Business” workshop
Dr. Patrick Fuhrmann, Quirin Buchholz, Tigran Mkrtchyan, Peter van der Reest, Lusine Yakovleva
June 1, 2016, Frankfurt. Patrick Fuhrmann et al. The Scientific Data Cloud @ ownCloud Connects Business
Content
• Storage @ DESY?
• Sync’n Share at DESY
  • Motivation
  • Requirements
  • Implementation
  • Setup
• Requirements from science communities
• dCache for Dummies
• The ownCloud – dCache hybrid system
• Summary and outlook

Storage @ DESY
• PETRA III [Tier 0] (2012 …)
  • Synchrotron radiation
  • 14 beamlines
  • Beamline guest scientists
  • 1 PB/year – 5 PB/year
• European XFEL [Tier 0] (2017 …)
  • 3.4 km (linear)
  • 2017 (first beamline)
  • Beamline guest scientists
  • 10 – 100 PB/year
• HERA [Tier 0] (1992 – 2007)
  • Particle accelerator (proton – electron)
  • 6.3 km (ring)
  • Some hundred scientists
  • 5 PB in total
• LCG [WLCG Tier 2] (2008, 2009 …)
  • Particle accelerator (proton – proton)
  • 26.7 km (ring)
  • About 10,000 scientists
  • 15 PB/year
[Timeline axis: 1992 … 2020; label at 2020: 100 PBytes]

More storage at DESY
• The DESY data management team has quite some experience in managing huge amounts of data.
  • In collaboration with other ‘big data’ sites, we are providing a data management system, ‘dCache’, deployed at 70 sites around the world.
  • See later.
• So, why are we running ownCloud?

Motivation
• DESY has no experience in sophisticated data sharing.
  • Data sharing was done in the traditional way, with ACLs and ‘group’ directories.
• However: young scientists start their careers at universities and labs with Sync’n Share in their blood (the Dropbox generation).
• Public IT departments, for a very long time, didn’t regard Sync’n Share as being their problem, as many commercial solutions were around.
• It essentially became an issue after Snowden.
  • Legal requirement: data had to be stored ‘on site’ or at least in Germany.
• Consequence: CC needed to provide Sync’n Share-like mechanisms.

Requirements
• Fine-grained sharing of files and directories with individuals and groups.
• Sharing via intuitive Web 2.0 mechanisms (apps or browser).
• Sharing with ‘the public’, with or without password protection.
• Sharing of space to upload data (protected).
• Expiration of shares.
• Automatic bidirectional synchronization of data between mobile devices and the central repository.

Typical Application
[Figure labels: Your Cloud Space; Sync; File up and download]

Steps taken by DESY
• Evaluated possible solutions in 2013.
• Decided to go for ownCloud:
  • Provides most of the features needed.
  • Open source.
  • Was in use by many institutes and universities in Germany.
  • Used by colleagues at SURFsara (Amsterdam) and CERN.
• Evaluation showed:
  • Very good Sync’n Share feature set.
  • Very good in planning ahead (roadmap).
  • Plans for cross-site federated access (now in place).
  • A bit weak in data management.
• Started a prototype installation at DESY at the beginning of 2014.

What should the DESY setup look like?
(Actually, what it will look like in July)

The Infrastructure
• Authentication: Kerberos
• User management: Registry, LDAP
• Monitoring
• Local and wide area network: load balancing, firewalls
• Virtualization
• Accounting
• Unlimited persistent storage

Infrastructure Integration
[Diagram labels: Postgres DB; ownCloud (four instances); F5 load balancer; automatic failover]

More Integration
[Diagram labels: DESY Kerberos; DESY LDAP; ownCloud; unlimited central storage; data life cycle engine]

Horizontally Scaling Backend
[Diagram labels: Web load balancer (F5); ownCloud (four instances); NFS 4.1/pNFS; pool nodes (six shown); 200 TBytes RAID 6 (three shown)]

Some Statistics
[Charts: files in/out in 7 days; files in/out per hour. Axis labels: 10.000, 70.000]
Users total       490
Users active      277
Space available   567 TBytes
Space used        2 × 30 TBytes
Files             10 million
Current default quality: two replicas on different storage nodes.

Is that sufficient for scientists?

Typical Workflow
[Diagram labels: Raw; Derived; Publication; Sharing]

Data Categories
Amount        Category      Typical application
1 – 100 PB    Raw           LHC detector data, raw X-ray images, brain scans
10 – 100 TB   Derived       Reconstructed (Ntuples), purified images, brain maps
1 TB          Publication   Papers, presentations, histograms

What do we need to support ‘science workflows’?

More Requirements
• Storage must be manageable: defined QoS and data lifecycle.
• Different types of data must have different QoS attached, regarding access latency (performance) and data durability (how safe is my data?):
  • Spinning disk for streaming
  • SSD for fast random access
  • Tape for archive
  • Multiple copies in different locations on different media for long-term data preservation
• Moving data between different QoS types has to be performed
  • without service interruption
  • transparently to the user
  • without changes in the namespace
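The requirements above can be illustrated with a minimal Python sketch. The names `Media`, `QosClass` and `StoredFile`, the example path and the copy counts are hypothetical and chosen only for illustration; they are not taken from the talk or from any DESY policy. The sketch only shows the key property: a QoS transition changes the media, not the namespace entry.

```python
from dataclasses import dataclass
from enum import Enum


class Media(Enum):
    SPINNING_DISK = "spinning disk"  # suited to streaming
    SSD = "ssd"                      # suited to fast random access
    TAPE = "tape"                    # suited to archiving


@dataclass(frozen=True)
class QosClass:
    """Hypothetical QoS description: a media type plus a number of copies."""
    name: str
    media: Media
    copies: int


# Illustrative classes only; the values are assumptions.
STREAMING = QosClass("streaming", Media.SPINNING_DISK, copies=2)
ANALYSIS = QosClass("analysis", Media.SSD, copies=1)
ARCHIVE = QosClass("archive", Media.TAPE, copies=2)


@dataclass
class StoredFile:
    path: str       # namespace entry as seen by the user
    qos: QosClass   # current quality of service

    def transition(self, target: QosClass) -> None:
        """Change the QoS; the namespace path is deliberately left untouched."""
        self.qos = target


raw = StoredFile("/example/beamline/run42/raw.h5", STREAMING)
raw.transition(ARCHIVE)
print(raw.path, "->", raw.qos.name)
```

In a real system the transition would trigger a background copy to the new media; here it is reduced to a single assignment to keep the namespace invariant visible.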

Quality of Service
• Raw: long-term preservation (legal requirement)
• Derived: SSD; low latency (HPC, analysis)
• Publication: SSD; fast, multi-stream access

Even more Requirements
• Different access protocols for different applications:
  • POSIX-mounted FS (NFS 4.1/pNFS) for fast analysis
  • FTP dialects (GridFTP) for wide-area transfers with Globus, WLCG FTS
  • HTTP/WebDAV, mostly for browser-based applications, visualization, ..
• Different authentication mechanisms must be available:
  • Username/password for web applications
  • SAML to support traditional IdPs
  • OpenID Connect for Google/Facebook-like IdPs
  • Certificates for HTTPS or Grid applications
• Different credentials must be mappable to the same identity.
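The last requirement, mapping several credentials to one identity, might look like the following Python sketch. The table `IDENTITY_MAP`, the function `resolve_identity` and all credential values are made up for illustration; a production system would back this with LDAP or a similar registry rather than an in-memory dictionary.

```python
from typing import Dict, Optional, Tuple

# (mechanism, credential) -> internal user name. All entries are invented.
Credential = Tuple[str, str]

IDENTITY_MAP: Dict[Credential, str] = {
    ("password", "alice"): "alice",
    ("saml", "alice@idp.example.org"): "alice",
    ("oidc", "sub-1234"): "alice",
    ("x509", "/C=DE/O=Example/CN=Alice Example"): "alice",
}


def resolve_identity(mechanism: str, credential: str) -> Optional[str]:
    """Return the single internal identity for a credential, or None if unknown."""
    return IDENTITY_MAP.get((mechanism, credential))


# Whatever protocol the user arrives with, the same identity is resolved.
print(resolve_identity("oidc", "sub-1234"))
print(resolve_identity("x509", "/C=DE/O=Example/CN=Alice Example"))
```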

Scientific Data Cloud
• High-speed data ingest
• Fast analysis: NFS 4.1/pNFS
• Wide-area transfers (Globus Online, FTS) by GridFTP
• Sync’ing and sharing with ownCloud

What would that look like from the user’s perspective?

My DESY XXL Home: QoS support
[Screenshot label: Patrick’s home]

My DESY XXL Home: Protocol Support
• Multi-protocol: NFS 4.1/pNFS, GridFTP, WebDAV, SRM
• My ownCloud Home: Sync, Share
• Web 2.0: ownCloud

How do we achieve those goals?
OR
Choosing dCache as the storage backend for ownCloud!
The scientific data cloud

Side Track
What’s dCache?

dCache in a nutshell (cont.)
• Started in 2000
• International collaboration (DESY, Fermilab, NDGF)
• About 10 members: developers, deployment, support, management
• Software deployed at about 70 sites in Europe, the US, Asia, and Russia
• Largest deployments in the order of 20 PBytes on tape and disk
• Total storage close to 200 PBytes
• Geographically largest installation spans 4 countries
• Largely funded by INDIGO-DataCloud, DESY, Fermilab and NDGF

dCache Design
• Protocol and authentication engines: GridFTP, NFS/pNFS, HTTP, WebDAV
• Virtual file-system namespace layer
• Media transfer engine and pool management: automatic and manual media transition
• Media: SSDs, spinning disks, tape, Blu-ray …

Namespace Design
[Diagram labels: Namespace (Name); Location manager (Disk1, Disk2, Tape1); Physical storage (Disk, Tape, External system)]

Design Consequence
• Files are stored as objects on various data back-ends:
  • Random-access devices: hard disk, SSD
  • Removable media: tape
  • Object stores: Ceph
• Back-ends can be highly distributed (even across countries).
• The file namespace engine is independent of the data storage itself.
• Internal and external services can move data around without service interruption.
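The separation of namespace and storage described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not dCache code: the class `Namespace`, its methods and the location names are hypothetical. It shows why data can move between back-ends while the file name stays stable and at least one replica remains readable.

```python
from typing import Dict, Set


class Namespace:
    """Toy namespace: maps a file name to the set of locations holding its data."""

    def __init__(self) -> None:
        self._locations: Dict[str, Set[str]] = {}

    def add_replica(self, name: str, location: str) -> None:
        self._locations.setdefault(name, set()).add(location)

    def move(self, name: str, source: str, target: str) -> None:
        """Register the target first, then drop the source, so a copy is always listed."""
        replicas = self._locations[name]
        if source not in replicas:
            raise ValueError(f"{name} has no replica on {source}")
        if source == target:
            return
        replicas.add(target)
        replicas.discard(source)

    def locations(self, name: str) -> Set[str]:
        return set(self._locations[name])


ns = Namespace()
ns.add_replica("/data/run42.h5", "Disk1")
ns.add_replica("/data/run42.h5", "Disk2")
ns.move("/data/run42.h5", "Disk1", "Tape1")  # the name is unchanged
print(sorted(ns.locations("/data/run42.h5")))
```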

dCache features supporting our idea of a scientific data cloud
• Multi-protocol support (transfer and authentication)
  • Transfer protocols: NFS/pNFS, HTTP, WebDAV
  • Multiple authentication credential support (OpenID Connect, Kerberos, passwd)
• Sophisticated data management
  • Multi-media support (tape, spinning disk, SSD, …)
  • Automatic and manual media transitions
  • Adding and removing data nodes without service interruption
  • Automatic replica management
    • Enforces n < x < m copies of data files.
  • External storage support (e.g. tape systems: TSM, HPSS, OSM, DMF)
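The replica rule "n < x < m copies" can be expressed as a small decision function. The sketch below is an assumption about how such a rule could be evaluated, with the hypothetical name `replica_adjustment`; it reads the bounds as strict, exactly as written on the slide, and does not reproduce dCache's actual replica manager.

```python
def replica_adjustment(current: int, n: int, m: int) -> int:
    """Return copies to add (positive) or remove (negative) so that n < x < m holds."""
    if m - n < 2:
        raise ValueError("no integer x satisfies n < x < m")
    if current <= n:
        return (n + 1) - current   # too few copies: create replicas
    if current >= m:
        return (m - 1) - current   # too many copies: remove replicas
    return 0                       # already within bounds


# With bounds n=1 and m=4, valid replica counts are 2 and 3.
for copies in (1, 2, 5):
    print(copies, replica_adjustment(copies, n=1, m=4))
```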

In particular: The QoS Interface

dCache QoS Interfaces
[Diagram labels: Web service; CDMI service; Cloud; RESTful; QoS module; dCache]

The QoS Web Interface
[Screenshot labels: DISK; TAPE. Click to get a file back from tape.]

Putting pieces together

The Data Path
[Diagram labels: Web load balancer (F5); ownCloud (four instances); NFS 4.1/pNFS; dCache; spinning disks; SSDs; tape]

Future Work: The Namespace Path
[Diagram labels: Namespace; dCache namespace; Sharing DB; Share API; Namespace proxy]

dCache – ownCloud hybrid
• Data path is the easiest part. Works nicely.
• Namespace synchronization is/was very difficult.
  • Important to let all protocols see a synchronized namespace.
  • ownCloud didn’t expect the underlying storage system to change the namespace tree.
  • Manually triggered synchronization took too long.
  • ownCloud 9 provides a first attempt at an API for external namespaces.
• Exposing ‘shares’ to external components is not yet in ownCloud.
  • Important to allow all protocols to use ownCloud-defined shares.
  • Prerequisites:
    • ownCloud: needs an API to expose ‘shares’.
    • dCache: needs to have a ‘share’ object implemented.
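The namespace synchronization problem above comes down to reconciling two views of the same tree. As a rough illustration only, the Python sketch below compares two path listings; `namespace_diff` and the example paths are hypothetical, and it does not use any ownCloud or dCache API. A full scan like this is also the kind of approach that becomes slow on large trees, which matches the observation that manually triggered synchronization took too long.

```python
from typing import Iterable, Set, Tuple


def namespace_diff(
    owncloud_paths: Iterable[str], storage_paths: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (paths missing in ownCloud's view, paths stale in ownCloud's view)."""
    oc, st = set(owncloud_paths), set(storage_paths)
    return st - oc, oc - st


# A file written over NFS appears only in storage; a file deleted there lingers in ownCloud.
missing, stale = namespace_diff(
    owncloud_paths=["/home/a.txt", "/home/old.dat"],
    storage_paths=["/home/a.txt", "/home/new.h5"],
)
print(sorted(missing), sorted(stale))
```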

ownCloud and QoS
[Diagram labels: ownCloud GUI (Web); I/O (NFS); dCache Namespace API; Share API; QoS plugin (server-side app); QoS module; REST services]

Summary
• An ownCloud - dCache hybrid is a perfect system for providing managed shared storage to scientists.
  • Sync’n Share is provided by ownCloud.
  • Access protocols and authentication mechanisms used in science are provided by dCache.
  • Unlimited storage space (via removable media, e.g. tape)
  • Quality of Service support
    • Automatic and manual media transitions
    • Automatic replica management, resulting in high availability and data durability
  • Reduced downtimes due to transparent data migration

Outlook
• The current version of the ownCloud-dCache hybrid satisfies the need for
  • Sync’n Share
  • Highly scalable and manageable back-end storage
• For a full integration
  • The namespaces of the two systems need to be synchronized (OC9).
  • The ownCloud ‘shares’ need to be exposed to have them visible in all protocols (NFS, GridFTP, …).
  • We need to provide an ownCloud plugin (server-side app) to make the dCache QoS storage types visible in ownCloud.