30
Digital Media Ingest and Storage Options on AWS Erik Durand Amazon Web Services

Digital Media Ingest and Storage Options on AWS

Embed Size (px)

Citation preview

Digital Media Ingest and Storage Options on AWS

Erik Durand

Amazon Web Services

Content has Gravity and is getting heavier …

…it’s easier to move processing to the content

4k/8k

Content

Where is the problem?

More Bandwidth

$$$$$

More Powerful

Compute

$$$$$

Way more Storage

$$$$$

Some Progress

(ABR, HEVC, VP10)

Where is the sliding scale on my Infrastructure?

Amazon EFS

File

Amazon EBSAmazon EC2

Instance Store

Block

Amazon S3 Amazon Glacier

Object

Data Transfer

AWS Direct Connect

AWS Snowball

ISV Connectors Amazon Kinesis

Firehose

S3 TransferAcceleration

StorageGateway

AWS Storage is a platform

A Concept - the Content LakeInspired from Data Lake (Coined by James Dixon in 2010)

A single store of all of digital content that you create and

acquire in any form or factor

•Don’t assume any resolutions/formats (for now or future)

•It is up to the consumer (application consuming the content) to use the appropriate infrastructure for processing

Amazon S3 : the Content Lake

• Durable, cost-effective and fast

• Highly scalable front-end – Multi-part uploads (parallel writes)

– Range-gets (parallel reads)

• No need for capacity planning or provisioning

• Use Amazon S3 with on-premises storage in a hybrid model

• Secure

S3 scalability: buckets and objects

1 PB raw storage

800 TB usable storage

600 TB allocated storage

400 TB application data

S3 capacity pricing—pay only for what you use

AWS Cloud

Storage

Object Storage Options

S3 Standard S3 Standard - Infrequent

Access

Amazon Glacier

Active data Archive dataInfrequently accessed data

- Transition Standard to Standard-IA

- Transition Standard-IA to Amazon Glacier

- Expiration lifecycle policy

- Versioning support

- Prefix support

Data Lifecycle Management

T T+3 days T+5 days T+ 15 days T + 25 days T + 30 days T + 60 days T + 90 days T + 150 days T + 250 days T + 365 days

Data access frequency over time

Save money on storage

58% saving over S3 Standard

44% saving over S3 Standard-IA

* Assumes the highest public pricing tier

Securing your data on AWS

• AWS alignment with the latest MPAA cloud based application guidelines for content security – August 2015

• VPC private endpoint for Amazon S3 – enables a true private workflow capability

• Encryption & key management capabilities

• Amazon Glacier Vault for high-value media/originals

Hydrating the Content Lake

Amazon S3 Amazon S3

(multi-part Upload)

Direct Connect

N x 1G | 10G

Massively Scalable Front-end

S3 Transfer

Acceleration AWS Snowball

Storage Gateway

Avere - Demonstrated M&E Success

• Who uses Avere? Movie studios for the top-20 blockbusters of 2015 for special effects

Customer Challenges• Scale rendering and transcoding performance

• Cost, space & power

• Managing storage silos

• High latency of data access over WAN

• Add compute resources at peak times

• Need for 2-3 months, no long-term commitment

• Do NOT want to rewrite applications

Avere Benefits• Hot data stored on RAM & SSD within FXT cache

• Bulk of data can remain on-prem or on inexpensive

S3

• Caching of remote data eliminates WAN latency

• Clustering provides scalable NAS performance

• Hybrid model - FXT filer on-prem and/or vFXT on EC2

• Pay only for what is used

Optimizing Data Transfer Method #1: The Highway

What is AWS Snowball? Petabyte-scale data transport

E-ink shipping label

Ruggedized case“8.5G impact”

All data encrypted end-to-end

Rain- and dust-resistant

Tamper-resistant case and

electronics

80 TB10 GE network

How it works

Use cases: AWS Import/Export Snowball

Cloud

Migration

Disaster

Recovery

Data Center

Decommission

Content

Distribution

How fast is Snowball?

• Less than 1 day to transfer 200TB via 3x10G connections with 3 Snowballs, less than 1 week including shipping

• Number of days to transfer 200TB via the Internet at typical utilizations

Internet Connection Speed

Utilization 1Gbps 500Mbps 300Mbps 150Mbps

25% 71 141 236 471

50% 36 71 118 236

75% 24 47 225 157

Optimizing Data Transfer Method #2: The Internet

ComplicatedSetup and Management

Hard to OptimizePerformance

Expensive

Optimizing Internet Performance Is Hard

Prioprietary

Introducing Amazon S3 transfer acceleration

S3 BucketAWS EdgeLocation

Uploader

Optimized

Throughput!

Typically 50%–400% faster

Change your endpoint, not your code

54 global edge locations

No firewall exceptions

No client software required

Rio DeJaneiro

Warsaw New York Atlanta Madrid Virginia Melbourne Paris LosAngeles

Seattle Tokyo Singapore

Tim

e [h

rs]

500 GB upload from these edge locations to a bucket in Singapore

Public Internet

How fast is S3 transfer acceleration?

S3 Transfer Acceleration

We have customers uploading large files from all over

the world.

We’ve seen performance improvements in excess of

500% in some cases.

- Emery Wells, Cofounder/CTO ”

Use case: media uploads

Regional Lakes…

AWS is available today in the U.S., Brazil, Europe, Japan,

Singapore, Australia, and China. Additional regions in India, Korea,

the UK, and Ohio are expected to come online over the next 12 –

18 months.

Over 1 million active customers across

190 countries

2,000 government agencies

5,000 educational institutions

12 regions

32 availability zones

54 edge locationsRegion

Edge Location

Source

(Virginia)

Destination

(Oregon)

• Only replicates new PUTs. Once

S3 is configured, all new uploads

into a source bucket will be

replicated

• Entire bucket or prefix based

• 1:1 replication between any 2

regions

Use cases

Compliance - store data hundreds of miles apart

Lower latency - distribute data to remote customers/partners)

S3 cross-region replicationAutomated, fast, and reliable asynchronous replication of data across AWS regions

Amazon S3

Amazon S3 (range-gets)

Direct Connect

N x 1G | 10G

Massively Scalable S3 Front-end

EBS

Instance

Store

cMassively Scalable

Compute on AWS Cloud

On-Prem Apps

Consuming the Content Lake

Media Workloads (redefined)

EBSInstance

Store

Amazon EBS/EFS/EC2 Instance Store

Process

Partner/Affiliate/

Service Provider

User Delivery/ConsumptionVFX/Production

On-Prem Apps

Archive

Amazon Glacier (Life Cycle Policies)

c

c

Direct Connect

Content Access Transfer

Disposable Infrastructure

Auto-scalingWorkload specific

Amazon S3

EFS

Q&A

Learn more at: http://aws.amazon.com/s3/

http://aws.amazon.com/glacier/

http://aws.amazon.com/importexport/

[email protected]