Digital Media Ingest and Storage Options on AWS Henry Zhang Amazon Web Services

Ingest and storage options


Page 1: Ingest and storage options

Digital Media Ingest and Storage Options on AWS

Henry Zhang, Amazon Web Services

Page 2: Ingest and storage options

Content has Gravity and is getting heavier …

…it’s easier to move processing to the content

4K/8K Content

Page 3: Ingest and storage options

Where is the problem?

More Bandwidth: $$$$$

More Powerful Compute: $$$$$

Way More Storage: $$$$$

Some Progress (ABR, HEVC, VP10)

Page 4: Ingest and storage options

Where is the sliding scale on my Infrastructure?

Page 5: Ingest and storage options

AWS Storage options for digital media

• File: Amazon EFS

• Block: Amazon EBS, Amazon EC2 instance storage

• Object: Amazon S3, Amazon Glacier

Page 6: Ingest and storage options

A Concept - the Content Lake

Inspired by the Data Lake (term coined by James Dixon in 2010)

A single store of all the digital content that you create and acquire, in any form or factor

• Don’t assume any resolutions/formats (for now or future)

• It is up to the consumer (the application consuming the content) to use the appropriate infrastructure for processing

Page 7: Ingest and storage options

Amazon S3: the Content Lake

• Durable, cost-effective and fast

• Highly scalable front-end

  – Multi-part uploads (parallel writes)

  – Range-gets (parallel reads)

• No need for capacity planning or provisioning

• Use Amazon S3 with on-premises storage in a hybrid model

• Secure
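Multi-part uploads and range-gets both depend on cutting one large object into byte ranges that can be moved in parallel. The sketch below shows only that splitting step and makes no AWS calls. The function names and the 4-byte part size in the usage note are illustrative assumptions, not part of the original slides; a real client would pass each range to its upload or download call.

```python
from typing import List, Tuple


def split_into_ranges(object_size: int, part_size: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte offsets that cover the whole object."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    return [
        (start, min(start + part_size, object_size) - 1)
        for start in range(0, object_size, part_size)
    ]


def range_header(start: int, end: int) -> str:
    """Format an HTTP Range header value for one inclusive byte range."""
    return f"bytes={start}-{end}"
```

For example, `split_into_ranges(10, 4)` yields three ranges, the last one shorter than the others, and each can be fetched or written by a separate worker.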

Page 8: Ingest and storage options

S3 scalability: buckets and objects

Page 9: Ingest and storage options

Hydrating the Content Lake

Amazon S3

Amazon S3 (multi-part upload)

Direct Connect

N x 1G | 10G

Massively Scalable Front-end

Page 10: Ingest and storage options

Introducing AWS Import/Export Snowball

Scale and Speed

• Up to 50TB Capacity per device

• 10Gbps and 1Gbps connectivity

• Parallel data transfer enables PBs transferred in a week

Secure

• Tamper-resistant enclosure

• 256-bit encryption with KMS

• Secure data erasure

Simple

• Manage entire process through AWS Console

• Lightweight data transfer client

• Notifications

Page 11: Ingest and storage options

What is Snowball? Petabyte scale data transport

• E-ink shipping label

• Ruggedized case, “8.5G Impact”

• All data encrypted end-to-end

• 50 TB

• 10G network

• Rain & dust resistant

• Tamper-resistant case & electronics

Page 12: Ingest and storage options

Can I drop it?

• No (please don’t)

• Snowball is its own box

• Has had many drop tests already

• Can handle 8.5G impacts

• Designed for shipping

Page 13: Ingest and storage options

How it works

Page 14: Ingest and storage options

What does it cost?

• $200 / job plus shipping

• Includes 10 days to fill the device at your site

• $15/day after the tenth day on site

• Standard Amazon S3 charges apply

• $0.03/GB to transfer data out

• $0.00/GB to transfer data in

Page 15: Ingest and storage options

How fast is that truck full of drives?

• Less than 1 day to transfer 250TB via 5x10G connections with 5 Snowballs, less than 1 week including shipping

• Number of days to transfer 250TB via the Internet at typical utilizations:

                 Internet connection speed
Utilization   1Gbps   500Mbps   300Mbps   150Mbps
25%           95      190       316       632
50%           47      95        158       316
75%           32      63        105       211
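The transfer-time figures on this slide can be reproduced with simple arithmetic. The sketch below is an inference from the numbers, not something the slides state: the table matches if 1 TB is taken as 1024 GB and 1 GB as 10^9 bytes. The function name is illustrative.

```python
GB_PER_TB = 1024          # assumption inferred from the slide's figures
BYTES_PER_GB = 10**9      # assumption inferred from the slide's figures
SECONDS_PER_DAY = 86400


def transfer_days(terabytes: float, link_bps: float, utilization: float) -> float:
    """Days needed to move `terabytes` over a link running at the given utilization."""
    bits = terabytes * GB_PER_TB * BYTES_PER_GB * 8
    seconds = bits / (link_bps * utilization)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for utilization in (0.25, 0.50, 0.75):
        row = [round(transfer_days(250, bps, utilization))
               for bps in (1e9, 500e6, 300e6, 150e6)]
        print(f"{utilization:.0%}", row)
```

Under the same assumptions, five 10G links running flat out move 250TB in under half a day, which is consistent with the "less than 1 day" claim for five Snowballs.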

Page 16: Ingest and storage options

What does it cost?

Example 1:

• 250TB loaded on to 5 Snowballs

• 8 days at your site

• 5 * $200 = $1,000 plus shipping

Example 2:

• 30TB exported on to 1 Snowball

• 8 days at your site

• $200 + 30TB * $0.03/GB = $1,121.60 plus shipping
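The two examples follow directly from the price list on the earlier cost slide. The sketch below encodes that arithmetic using the slide's prices. It assumes that each Snowball counts as one job and that the extra-day fee applies per device (both are assumptions on my part), and it treats 1 TB as 1024 GB, which is what Example 2's total implies. Shipping and standard Amazon S3 charges are left out.

```python
JOB_FEE = 200.00          # dollars per job (slide price)
FREE_DAYS = 10            # on-site days included per job
EXTRA_DAY_FEE = 15.00     # dollars per day after the tenth day
EXPORT_PER_GB = 0.03      # dollars per GB transferred out
GB_PER_TB = 1024          # implied by Example 2's total


def snowball_job_cost(devices: int, days_on_site: int, export_tb: float = 0.0) -> float:
    """Service cost in dollars, excluding shipping and standard S3 charges."""
    extra_days = max(0, days_on_site - FREE_DAYS)
    service = devices * (JOB_FEE + extra_days * EXTRA_DAY_FEE)
    transfer_out = export_tb * GB_PER_TB * EXPORT_PER_GB
    return round(service + transfer_out, 2)
```

Calling `snowball_job_cost(5, 8)` reproduces Example 1, and `snowball_job_cost(1, 8, export_tb=30)` reproduces Example 2.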

Page 17: Ingest and storage options

Regional Lakes …

[Map of the AWS global infrastructure. Legend: Edge Locations, Availability Zone, Region. Cities labelled on the map: Dallas (2), St. Louis, Miami, Jacksonville, Los Angeles (2), Seattle, Ashburn (3), Newark, New York (3), South Bend, San Jose, Palo Alto, Hayward, Sao Paulo, Dublin, London (2), Amsterdam (2), Stockholm, Frankfurt (2), Paris (2), Milan, Madrid, Singapore (2), Hong Kong (2), Tokyo (2), Osaka, Seoul, Sydney, Mumbai, Chennai]

Page 18: Ingest and storage options

S3 cross-region replication

Automated, fast, and reliable asynchronous replication of data across AWS regions

Source (Virginia) → Destination (Oregon)

• Only replicates new PUTs. Once S3 is configured, all new uploads into a source bucket will be replicated

• Entire bucket or prefix based

• 1:1 replication between any 2 regions

Use cases

• Compliance: store data hundreds of miles apart

• Lower latency: distribute data to remote customers/partners
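A replication rule of the kind described here (whole bucket or a prefix, one destination) can be expressed as a small configuration document. The sketch below only builds that document as a Python dictionary, in the shape I understand the S3 PutBucketReplication API to accept; check the field names against the current documentation before relying on them. The role ARN, bucket name and rule ID are hypothetical, and versioning must already be enabled on both buckets.

```python
def replication_config(role_arn: str, dest_bucket: str, prefix: str = "") -> dict:
    """Build a single-rule replication configuration.

    An empty prefix replicates the entire bucket; a non-empty prefix
    limits replication to keys that start with it.
    """
    return {
        "Role": role_arn,  # IAM role S3 assumes to replicate objects
        "Rules": [
            {
                "ID": "replicate-new-uploads",  # hypothetical rule name
                "Prefix": prefix,
                "Status": "Enabled",
                "Destination": {"Bucket": f"arn:aws:s3:::{dest_bucket}"},
            }
        ],
    }


# Hypothetical names, for illustration only.
example = replication_config(
    "arn:aws:iam::123456789012:role/example-replication-role",
    "example-content-lake-oregon",
    prefix="masters/",
)
```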

Page 19: Ingest and storage options

Consuming the Content Lake

• Amazon S3 (range-gets): Massively Scalable S3 Front-end

• Direct Connect: N x 1G | 10G

• On-Prem Apps

• Massively Scalable Compute on AWS Cloud: EBS, Instance Store

Page 20: Ingest and storage options

Object life cycle from hot to cold

S3 Standard

• Primary data

• 11 9’s of durability

• 2.75c – 3c per GB/month, $338 – $369 per TB/year

S3 – Infrequent Access

• Active archives

• Mezzanine files

• 11 9’s of durability

• 1.25c per GB/month, $154 per TB/year

• 1c per GB for retrievals

Glacier

• Deep/offline archives

• WORM-compliant data

• 11 9’s of durability

• 0.7c per GB/month, $86 per TB/year

Data tiering using Life Cycle Policies

Actual customer quote: “$0.0125?! OMG I will take all your storage!!!”
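The per-TB yearly figures on this slide are the monthly per-GB prices scaled up. The sketch below shows the conversion. It assumes 1 TB = 1024 GB, which is the convention that reproduces the slide's rounded numbers; the function name is illustrative.

```python
GB_PER_TB = 1024       # assumption that matches the slide's rounded figures
MONTHS_PER_YEAR = 12


def dollars_per_tb_year(cents_per_gb_month: float) -> float:
    """Convert a price in cents per GB per month to dollars per TB per year."""
    return cents_per_gb_month / 100 * GB_PER_TB * MONTHS_PER_YEAR


if __name__ == "__main__":
    tiers = [("S3 Standard (low)", 2.75), ("S3 Standard (high)", 3.0),
             ("S3 Infrequent Access", 1.25), ("Glacier", 0.7)]
    for tier, cents in tiers:
        print(tier, round(dollars_per_tb_year(cents)))
```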

Page 21: Ingest and storage options

S3 capacity pricing: pay only for what you use!

• 1 PB raw storage

• 800 TB usable storage

• 600 TB allocated storage

• 400 TB application data

AWS Cloud Storage

Page 22: Ingest and storage options

Securing your data on S3

• AWS alignment with the latest MPAA cloud-based application guidelines for content security (August 2015)

• VPC private endpoint for Amazon S3 – enables a true private workflow capability

• Encryption & key management capabilities

• Amazon Glacier Vault for high-value media/originals

Page 23: Ingest and storage options

S3 versioning

• Preserve, retrieve, and restore every version of every object stored in your bucket

• S3 automatically adds new versions and preserves deleted objects with delete markers

• Easily control the number of versions kept by using lifecycle expiration policies

• Easy to turn on in the AWS Management Console

[Diagram: Versioning Enabled. PUT Key = photo.gif. Stored versions: Key = photo.gif, ID = 121212; Key = photo.gif, ID = 111111]
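The behaviour described on this slide (every PUT adds a version, a delete adds a marker rather than erasing data) can be illustrated with a toy in-memory model. This is not the S3 API. The class, its methods and the `v1`, `v2` style version IDs are hypothetical, chosen only to show the semantics.

```python
import itertools
from typing import Dict, List, Optional, Tuple


class VersionedBucket:
    """Toy model of a versioning-enabled bucket (illustration only)."""

    def __init__(self) -> None:
        # key -> list of (version_id, body); body None means a delete marker
        self._versions: Dict[str, List[Tuple[str, Optional[bytes]]]] = {}
        self._ids = itertools.count(1)

    def _append(self, key: str, body: Optional[bytes]) -> str:
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def put(self, key: str, body: bytes) -> str:
        """Store a new version; earlier versions are kept."""
        return self._append(key, body)

    def delete(self, key: str) -> str:
        """Add a delete marker instead of removing stored data."""
        return self._append(key, None)

    def get(self, key: str, version_id: Optional[str] = None) -> Optional[bytes]:
        """Return the latest version, or a specific one if version_id is given."""
        history = self._versions.get(key, [])
        if version_id is None:
            return history[-1][1] if history else None
        for vid, body in history:
            if vid == version_id:
                return body
        return None
```

After two `put` calls and a `delete` on the same key, a plain `get` returns nothing, while a `get` with the first version ID still returns the original bytes.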

Page 24: Ingest and storage options

Amazon S3 event notifications

Delivers notifications to Amazon SNS, Amazon SQS, or AWS Lambda when events occur in Amazon S3

S3 Events → Notifications → SNS topic | SQS queue | Lambda function

• Support for notification when objects are created via Put, Post, Copy, or Multipart Upload.

• Support for notification when objects are deleted, and for filtering on prefixes and suffixes for all types of notifications.
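A notification rule with prefix and suffix filtering can be written down as a configuration document. The sketch below builds one for a Lambda target, in the shape I understand the S3 bucket notification API to accept; verify the field names against the current documentation. The function ARN and the filter values are hypothetical.

```python
def lambda_notification_config(function_arn: str, prefix: str, suffix: str) -> dict:
    """Notify a Lambda function when matching objects are created."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                # Covers Put, Post, Copy and completed multipart uploads
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": prefix},
                            {"Name": "suffix", "Value": suffix},
                        ]
                    }
                },
            }
        ]
    }


# Hypothetical values, for illustration only.
example_notification = lambda_notification_config(
    "arn:aws:lambda:us-east-1:123456789012:function:example-start-transcode",
    prefix="ingest/",
    suffix=".mov",
)
```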

Page 25: Ingest and storage options

Reference Architecture – Content Processing Pipeline (Using Lambda)

• Ingest: S3 multi-part API

• S3 as backend storage for content files, accessible to other processing tasks

• S3 Notification: trigger a Lambda function to start a transcoding job (Amazon Elastic Transcoder)

• S3 Notification: Lambda function to generate a signed URL to share the file

• Update CMS or metadata
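The Lambda step in this pipeline starts by reading the bucket name and object key out of the S3 notification event. The sketch below shows that parsing and returns a job description. The job dictionary, the `transcoded/` output prefix and the bucket name in the usage note are hypothetical, and the call that would submit the job to a transcoding service is deliberately left out. Object keys arrive URL-encoded in the event, so they are decoded first.

```python
from typing import List, Tuple
from urllib.parse import unquote_plus


def objects_from_event(event: dict) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs from an S3 notification event."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Keys are URL-encoded in the event payload (spaces arrive as '+')
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def handler(event: dict, context: object) -> List[dict]:
    """Lambda-style entry point: describe one transcoding job per new object."""
    jobs = []
    for bucket, key in objects_from_event(event):
        # Hypothetical job description; a real function would submit it
        # to a transcoding service at this point.
        jobs.append({"input": f"s3://{bucket}/{key}", "output_prefix": "transcoded/"})
    return jobs
```

For an event that names the bucket `example-ingest` and the key `masters/my+clip.mov`, the handler returns one job whose input is `s3://example-ingest/masters/my clip.mov`.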

Page 26: Ingest and storage options

Elastic File System - Rendering in the Cloud

• Designed to support petabyte scale file systems

• Throughput scales linearly with storage

• Same latency spec across each AZ

• Thousands of concurrent NFS connections

• Works great for large I/O sizes

• Pay only for what you use, not what you provision

• Managed with multi-copy durability

Page 27: Ingest and storage options

Media Workloads (redefined)

[Diagram labels]

• Amazon S3

• Process: Amazon EBS/EFS/EC2 Instance Store

• Archive: Amazon Glacier (Life Cycle Policies)

• Content Access Transfer: Direct Connect

• On-Prem Apps

• Partner/Affiliate/Service Provider

• User Delivery/Consumption

• VFX/Production

• Disposable Infrastructure

• Auto-scaling

• Workload specific

Page 28: Ingest and storage options

Q&A

Learn more at: http://aws.amazon.com/s3/

http://aws.amazon.com/glacier/

http://aws.amazon.com/importexport/
