29
Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 http://jclouds.apache.org/ http://gaul.org/ 1 / 29

Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Cloud storage withApache jclouds

Andrew Gaul

19 November 2014

http://jclouds.apache.org/http://gaul.org/

1 / 29

Page 2: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

OverviewWhat is Apache jcloudsWhat is object storageBasic conceptsHow to put and get objectsAdvanced topics

2 / 29

Page 3: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

What is Apache jcloudsInterface with public andprivate cloud services via JavaProvider-specificimplementations, e.g.,Amazon S3, OpenStack SwiftCross-provider, portableabstractions, e.g., BlobStoreSupports all major providers

3 / 29

Page 4: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Why object storageScale out to billions of objectsand petabytes of dataLower cost, higherperformance, better reliabilityApplications and tenants canshare one object storeMany competitive public andprivate cloud offerings

4 / 29

Page 5: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Why not object storageMust rewrite existingapplications against newinterfacesDistributed systems introducecomplexity and additionalfailure modesLacks features that databasesand filesystems have

5 / 29

Page 6: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

What is object storageDistributed system of storagenodesSupports narrow interface:

Create and deletecontainersPut, get, and delete objectsList objects in a containerModify and retrievemetadata 6 / 29

Page 7: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Example applicationsStore photos and videos for asocial networking siteArchive large data sets, e.g.,scientific measurements

7 / 29

Page 8: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Basic conceptsAPIs and ProvidersRegionsContainersObjects

8 / 29

Page 9: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

APIs and Providers (1)API defines how tocommunicate with a cloudProvider specializes a givenAPI, e.g., regionsSupport major APIs: Atmos,Azure, Google, S3, SwiftSupport many providers:Glacier, HP, Rackspace

9 / 29

Page 10: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

APIs and Providers (2)

10 / 29

Page 11: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

RegionsMost public providers offermultiple geographic regionsUsually associate a containerwith a given regionSome providers can replicatebetween regionsautomatically, others offer aper-object cross-region copy

11 / 29

Page 12: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

ContainersContainers can contain manynamed objectsSome providers have partialdirectory supportSome providers allow public-read access containers

12 / 29

Page 13: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

ObjectsObjects have a name, streamof bytes, and metadata mapMust write entire stream ofbytes but can read rangesMetadata map includesstandard HTTP headers, e.g.,Content-Type: text/plain andarbitrary user headers, e.g., x-amz-meta-foo: bar 13 / 29

Page 14: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Example put and gettry (BlobStoreContext context = ContextBuilder .newBuilder("transient") .credentials("identity", "credential") .buildView(BlobStoreContext.class)) { BlobStore blobStore = context.getBlobStore(); ByteSource payload = ByteSource.wrap( new byte[] { 1, 2, 3, 4 }); Blob blob = blobStore.blobBuilder(blobName) .payload(payload) .contentLength(payload.size()) .build(); blobStore.putBlob(containerName, blob);

blob = blobStore.getBlob( containerName, blobName); try (InputStream is = blob.getPayload().openStream()) { ByteStreams.copy(is, os); }} 14 / 29

Page 15: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Consistency (1)Some blobstores have weaksemantics and can returnstale data for some timeputBlob(name, data1),putBlob(name, data2),getBlob(name) can returndata1 or data2

15 / 29

Page 16: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Consistency (2)Applications may need retryloops or other fallback logicS3 (standard region) and Swifthave various forms ofeventual consistencyAtmos, Azure, and GCS havestrong consistency, more likea file system

16 / 29

Page 17: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Data integrity (1)Provider guarantees objectintegrity at-rest via contenthashApplication can guaranteeobject integrity on-the-wire byproviding and verifyingContent-MD5 headerGuava has useful helpers

17 / 29

Page 18: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Data integrity (2)Guarantee integrity duringputBlob:

ByteSource payload = ByteSource.wrap( new byte[] { 1, 2, 3, 4 });Blob blob = blobStore.blobBuilder(blobName) .payload(payload) .contentLength(payload.size()) .contentMD5(payload.hash(Hashing.md5())) .build();blobStore.putBlob(containerName, blob);

This reads InputStream twice 18 / 29

Page 19: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Data integrity (3)Guarantee integrity duringgetBlob:

Blob blob = blobStore.getBlob(containerName, blob);HashCode md5 = HashCode.fromBytes(blob.getMetadata() .getContentMetadata().getContentMD5());try (HashingInputStream his = new HashingInputStream( blob.getPayload().openStream()), Hashing.md5()) { ByteStreams.copy(his, os); if (md5.equals(his.hash())) { throw new IOException(); }}

19 / 29

Page 20: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Large object sizesSome objects may not fit inmemory or in a byte[]Non-repeatable payloads, e.g.,InputStreamRepeatable payloads, e.g.,Guava ByteSource, can retryafter network failure

20 / 29

Page 21: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Multi-part uploadSome objects are too large fora single put operation, e.g.,AWS S3 > 5 GB, Azure > 64 MBjclouds abstracts these detailsvia:

Blob blob = ... blobStore.putBlob(containerName, blob, new PutOptions().multipart());

21 / 29

Page 22: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Many objectsAmazon S3 supports anarbitrary number of objectsper containerSwift recommends max500,000 objects per containerAtmos recommends max65,536 objects per directorySolution: shard over multiplecontainers or directories 22 / 29

Page 23: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

URL signing (1)Most object stores allowcreating a URL which yourapplication can vend toexternal clients

client

jclouds1. request URL

blobstore3. access object

2. vend signed URL

23 / 29

Page 24: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

URL signing (2)URL is cryptographicallysigned; allows time-limitedaccess to a single objectClients interact directly withthe object store, removingyour application as thebottleneck

24 / 29

Page 25: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Amazon GlacierOptimized for storage, notretrievalRead requests can takeseveral hours to completeLeast expensive public cloudprovider, often cheaper thanprivate cloud

25 / 29

Page 26: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Local blobstoresIn-memory (transient)appropriate for unit testingFilesystem has productionuses with limited number ofobjects or quantity of dataRemote clients requirerunning a HTTP server andimplementing authorization

26 / 29

Page 27: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

jclouds-cliCommand-line tool useful foradministrative and debuggingtasks:

jclouds blobstore container-list \ --provider aws-s3 \ --identity $IDENTITY \ --credential $CREDENTIALjclouds blobstore write \ --provider aws-s3 \ --identity $IDENTITY \ --credential $CREDENTIAL \ $CONTAINER_NAME $OBJECT_NAME $FILE_NAME

27 / 29

Page 28: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Referenceshttps://github.com/jclouds/jclouds-exampleshttps://github.com/andrewgaul/s3proxyhttp://gaul.org/object-store-comparison/

28 / 29

Page 29: Cloud storage with · 2017-12-14 · Cloud storage with Apache jclouds Andrew Gaul 19 November 2014 ... Advanced topics 1 What is Apache jclouds Interface with public and private

Thank you!http://jclouds.apache.org/

http://gaul.org/

29 / 29