Aspera OnDemand: S3-direct Michelle Munson President, CEO and Co-founder of Aspera Oct 2011

Aspera OnDemand: S3-direct · d36cz9buwru1tt.cloudfront.net/aws-media-summit-2011/Aspera.pdf


Page 1

Aspera OnDemand: S3-direct

Michelle Munson

President, CEO and Co-founder of Aspera

Oct 2011

Page 2

Cloud Computing – Why is it so compelling?

1. The potential of infinite computing resources, on demand

– Eliminates the need to plan ahead

– Meet demand - without the lead-time bottleneck

2. The elimination of an up-front commitment

– Reduce capital outlay and investment risk

– Start small & increase h/w resources to match need

– Auto-scale to meet demand

3. Pay-for-use resource model

– CPUs by the hour

– Storage by the day

– Bandwidth by the GB

– No commitment: Release assets & remove costs as needed

Source: Above the Clouds: A Berkeley View of Cloud Computing, P1, 2009

Page 3

AWS for Media Production & Distribution

• Content Creation

– Compute Intensive: EC2 (10s, 100s, 1000s of CPUs)

• Transcoding, encoding, watermarking, video editing

• Rendering & HPC applications

• Mission Critical Storage and Distribution

– Long-term archive & backup

– Near-line storage for compute

– B2B/B2C media ingest & distribution

• Monetization & Play Out

– Release, project and event specific marketing & social media

– Brand awareness & franchise continuity

– CDN and Delivery

• Amazon CloudFront

Page 4

M&E Big-Data: Big & Getting Bigger

A single digital cinema production can be 800K–1M 2K/4K frames

Page 5

Large Data Freighting: Underpins Content Supply & Creation

Page 6

Characterizing & Understanding

Big-Data Cloud Transfer Bottlenecks

Page 7

Two Major Bottlenecks: WAN Transfer & Local HTTP I/O

1st Bottleneck – WAN

Transfers over the WAN are TCP-based (FTP, SCP, HTTP, etc.)

2nd Bottleneck – Data Center

“Last-foot” local transfers from EC2 to S3 can use multiple HTTP connections

[Diagram labels: WAN, Server (EC2), S3]

Page 8

1st Bottleneck: WAN Transfers

#1 WAN Transfer: Local machine to EC2 | Effective throughput
Single HTTP transfer, typical internet conditions (50-250 ms latency & 0.1-3% packet loss) | 0.5 to 5 Mbps
15 parallel HTTP streams | 7 to 75 Mbps
Aspera fasp transfer | Up to 700 Mbps
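The single-stream figures above are broadly in line with a common rule of thumb for steady-state TCP throughput under loss, the Mathis et al. approximation: throughput ≈ (MSS / RTT) × C / √p, with C ≈ √(3/2). That model is not part of the original slides; it is used here only as an illustrative assumption, and the 1460-byte MSS is likewise an assumed value. A minimal sketch evaluating it at both ends of the "typical internet conditions" range:

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss: float) -> float:
    """Approximate steady-state TCP throughput in bits/s (Mathis et al. model)."""
    c = math.sqrt(3.0 / 2.0)  # model constant, roughly 1.22
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss)


ASSUMED_MSS = 1460  # bytes; a typical Ethernet MSS, an assumption not taken from the slides

# Best and worst ends of the range quoted on the slide.
best = mathis_throughput_bps(ASSUMED_MSS, rtt_s=0.050, loss=0.001)
worst = mathis_throughput_bps(ASSUMED_MSS, rtt_s=0.250, loss=0.03)

print(f"50 ms / 0.1% loss: {best / 1e6:.1f} Mbps")
print(f"250 ms / 3% loss:  {worst / 1e6:.2f} Mbps")
```

Under these assumptions the model gives roughly 9 Mbps at the favorable end and about 0.3 Mbps at the unfavorable end, the same order of magnitude as the 0.5 to 5 Mbps quoted for a single HTTP transfer.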

Page 9

Aspera Solution – fasp Optimal Transfer Performance

• Optimal end-to-end throughput efficiency

– Full utilization of commodity Internet bandwidth

– Highly bandwidth efficient

• Low Overhead

– Less than 0.1% overhead on 30% packet loss

– Full utilization of storage throughput

– Equal performance with large files or large collections of small files

• Congestion Avoidance and Policy Control

– Real-time policy based bandwidth control

– Congestion avoidance (WAN, LAN, Disk)

• The Result:

– Transfers up to thousands of times faster than FTP

– Precise and predictable transfer times

– End-to-end policy control over transfer priority and speed

Page 10

fasp – Built-in Security & Reliability

• Secure user/endpoint authentication

– Standard secure shell (SSH)

– Standard system authentication and user access control (LDAP, AD)

• AES-128 encryption in transit and at rest

– Real-time in transit packet encryption

• GUI, API, Web & Mobile

– Encryption at rest per recipient (secured storage of transferred content)

• Data integrity verification

– For each transmitted data block

• Automatic resume of partial or failed transfers

– GUI/CLI/API: Stop/ Start/ Pause/ Resume

• Automatic HTTP fallback in restricted networks

Page 11

fasp – Management & Control

• Extraordinary bandwidth control

– Automatic, full utilization of available bandwidth

– Protection of other network traffic

– On-the-fly, per flow, user and job prioritization

– Highly-concurrent transfer stacking

• Scalable, system-wide monitoring and reporting

– Real-time progress and performance analysis

– Real-time bandwidth utilization

– Detailed transfer history, logging and manifest

• Centralized, network-wide command and control

– Per transfer, user, group and node

– Manage and create global transfer policies

– Remotely initiate, schedule and automate transfer jobs

Page 12

2nd Bottleneck: Data Center/ Local HTTP I/O

2nd Bottleneck – Data Center

“Last-foot” local transfers from Server (EC2) to Storage (S3) using one/many HTTP connections

[Diagram labels: Server (EC2), S3]

Page 13

AWS S3: 449 Billion objects and counting

Amazon’s S3 is the premier cloud storage system

• 449 Billion objects and counting

• 1,440 objects for every resident of the US

• 64 objects for each person on Planet Earth

• About as many objects as there are stars in the Milky Way

http://aws.typepad.com/aws/2011/07/amazon-s3-more-than-449-billion-objects.html

Page 14

Some History: 2006-2009 Big-Data & AWS

March 2006:

– AWS Launches the S3 object storage system

June 2009:

– AWS announces physical import/export for AWS

September 2009:

– Aspera launches Aspera OnDemand for AWS

[Graphic labels, big-data types: Video & Graphics, Genetic Sequencing, Photos & Imaging, Computer Modeling, Music / Audio, PDFs]

Page 15

Sept 2009: Aspera Launches On-Demand for AWS

• On-Demand: powered by Aspera’s patented fasp™ technology

– Next-generation transport protocol for digital media

– Eliminates the latency & packet loss bottlenecks of TCP

– Reliable & secure asset delivery system

– Replaces FTP, HTTP, NFS, CIFS, tape and disks

– Seamless integration w/ all Aspera Clients and Console Management

• Aspera On-Demand lowered the network barrier to AWS adoption:

– Solved Bottleneck #1

– But didn’t transfer data directly to/from S3 – Bottleneck #2

Page 16

Aspera On-Demand Version 1 – No S3 Support…

[Diagram labels: Aspera Client, Aspera Connect Browser Plugin, Aspera Mobile; WAN; fasp; Aspera On-Demand Server (EC2); Local HD; Elastic Block Store – EBS (NAS); EC2 HTTP; S3???]

Aspera On-Demand v.1 did not read/write data to S3

Page 17

2011: Big-Data REALLY Meets AWS…

Dec 2010: AWS Announces Major S3 Upgrade

– S3 object size increased

• 5GB to 5TB

• AWS introduces multipart HTTP uploader

• APIs available in Java, .NET, PHP & REST

Fantastic… but now what?

• Still HTTP over the WAN (SLOW)

• Still have to “glue” any fasp high-speed transfer to S3 I/O in custom s/w – big speed bump!

• Find an expert s/w team

• Build upon the multipart API

• Concurrently stream data to S3

• Integrate into operations

[Graphic labels, big-data types: Video & Graphics, Genetic Sequencing, Photos & Imaging, Computer Modeling, Music / Audio, PDFs]

Page 18

S3 Big-Data i/o: API or Application

Multi-Part Uploader API (from AWS): S3 multi-part upload over HTTP

Step | Action
1 | Initiate multipart upload by providing your AWS credentials
2 | Provide required bucket name and key name
3 | Save the upload ID for each subsequent multipart upload operation
4 | Upload parts providing part upload information (upload ID, bucket name, part number)
5 | Save the responses (ETag value and the part number)
6 | Repeat tasks 4 and 5 for each part of your object
7 | Execute a final call to complete the multipart upload

Commercial Tools
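The seven steps above can be sketched in Python as follows. This is a minimal illustration, not Aspera's or AWS's implementation. The three method names and keyword arguments mirror the boto3 S3 client calls `create_multipart_upload`, `upload_part`, and `complete_multipart_upload`; `FakeS3` is a hypothetical in-memory stand-in so the example runs without credentials, and the tiny part size is for demonstration only (real S3 requires parts of at least 5 MB, except the last).

```python
import hashlib
import io
from typing import BinaryIO, Dict, List


class FakeS3:
    """Hypothetical in-memory stand-in for the three S3 client calls used below."""

    def __init__(self) -> None:
        self.uploads: Dict[str, Dict[int, bytes]] = {}
        self.objects: Dict[str, bytes] = {}

    def create_multipart_upload(self, Bucket: str, Key: str) -> dict:
        upload_id = f"upload-{len(self.uploads) + 1}"
        self.uploads[upload_id] = {}
        return {"UploadId": upload_id}

    def upload_part(self, Bucket: str, Key: str, UploadId: str,
                    PartNumber: int, Body: bytes) -> dict:
        self.uploads[UploadId][PartNumber] = Body
        return {"ETag": hashlib.md5(Body).hexdigest()}

    def complete_multipart_upload(self, Bucket: str, Key: str, UploadId: str,
                                  MultipartUpload: dict) -> dict:
        stored = self.uploads.pop(UploadId)
        ordered = [stored[p["PartNumber"]] for p in MultipartUpload["Parts"]]
        self.objects[f"{Bucket}/{Key}"] = b"".join(ordered)
        return {"Bucket": Bucket, "Key": Key}


def multipart_upload(client, bucket: str, key: str,
                     stream: BinaryIO, part_size: int) -> List[dict]:
    # Steps 1-3: initiate the upload for a bucket/key and keep the upload ID.
    upload_id = client.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]

    parts: List[dict] = []
    part_number = 1
    while True:
        chunk = stream.read(part_size)
        if not chunk:
            break
        # Step 4: upload one part, identified by upload ID and part number.
        response = client.upload_part(Bucket=bucket, Key=key, UploadId=upload_id,
                                      PartNumber=part_number, Body=chunk)
        # Step 5: save the ETag together with its part number.
        parts.append({"ETag": response["ETag"], "PartNumber": part_number})
        part_number += 1  # Step 6: repeat for every remaining part.

    # Step 7: final call that assembles the parts into one object.
    client.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id,
                                     MultipartUpload={"Parts": parts})
    return parts


# Demonstration with made-up bucket and key names.
s3 = FakeS3()
payload = bytes(range(256)) * 10  # 2560 bytes of sample data
uploaded = multipart_upload(s3, "example-bucket", "clip.mov",
                            io.BytesIO(payload), part_size=1000)
print(len(uploaded), "parts uploaded")
```

Against the real service, a boto3 S3 client could be passed in place of `FakeS3`, with a part size that meets the service minimum.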

Page 19

Why These Challenges of Storing Big Files in the Cloud?

• Designed as scalable distributed object stores

– Target applications require simple read/write operations of binary “blobs”, indexed by a single primary key

– Should work well for storing large numbers of media files, compared to traditional file systems

• BUT

– “Blob” sizes are small (<64 MB) => large media files must be “chunked”

– Data I/O use the standard HTTP protocol – VERY SLOW at distance

– API for managing data requires a team of experts

• M&E/ Big-Data services require high-speed software bridge over the WAN

– Large files to be moved at full bandwidth capacity w/ global access

– Must overcome the WAN and the I/O bottleneck

– Must allow for writing media files of any size

– Must be transparent to the end user uploading / downloading (GUI, command line, browser, etc.)

Page 20

Solving the Big-Data i/o at Scale

With the help of AWS, Aspera did a full characterization of AWS S3 i/o:

• Upload/Download performance vs. thread count

• Upload/Download performance vs. chunk size

• 24hr upload stability w/ fixed thread count

• 24hr download stability w/ fixed chunk size

• Upload/Download performance vs. duration

• DNS lookup performance

• Performance w/ concurrent access to single S3 bucket

• Performance w/ max connections per host

Page 21

The Result? Aspera On-Demand S3-direct

[Diagram labels: Aspera Client, Aspera Connect Browser Plugin, Aspera Mobile; WAN; fasp; Aspera On-Demand Server (EC2); fasp-S3 Gateway; Server RAM; HTTP S3 Parts API; Optimized S3 i/o; S3]

Aspera On-Demand S3-direct:

• Full client-side r/w of S3

• Synchronous transfer from Client to S3 (via EC2 Aspera On-Demand)

• Real-time optimization of HTTP threads

• Real-time optimization of chunk size
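To illustrate the idea of feeding S3 through several HTTP connections at once, with thread count and chunk size as tunable knobs, here is a minimal sketch. It is not Aspera's gateway: `upload_part` is a hypothetical callable standing in for whatever performs one part upload, and the thread count and part size are fixed arguments rather than values tuned in real time.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_parts(data: bytes, part_size: int) -> List[Tuple[int, bytes]]:
    """Cut an in-memory buffer into (part_number, chunk) pairs, numbered from 1."""
    return [(offset // part_size + 1, data[offset:offset + part_size])
            for offset in range(0, len(data), part_size)]


def upload_parts_concurrently(upload_part: Callable[[int, bytes], str],
                              data: bytes, part_size: int,
                              threads: int) -> List[Tuple[int, str]]:
    """Upload all parts using a pool of worker threads.

    Returns (part_number, etag) pairs in part order; map() preserves input
    order even when the uploads finish out of order.
    """
    parts = split_parts(data, part_size)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        etags = list(pool.map(lambda part: upload_part(*part), parts))
    return [(number, etag) for (number, _), etag in zip(parts, etags)]


def demo_upload_part(part_number: int, chunk: bytes) -> str:
    # Hypothetical uploader used only for demonstration: returns a fake tag.
    return f"{part_number}:{len(chunk)}"


result = upload_parts_concurrently(demo_upload_part, b"x" * 25,
                                   part_size=10, threads=4)
print(result)
```

Keeping `threads` and `part_size` as plain parameters makes it straightforward to sweep them and measure throughput for each combination.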

Page 22

Overcoming Both Bottlenecks - Transferring Data to S3 over WAN

#1 - Transfer Data to EC2 over WAN | Effective throughput
HTTP transfer over WAN (single stream), typical internet conditions (50-250 ms latency & 0.1-3% packet loss) | 0.5 to 5 Mbps
15 parallel HTTP streams | 7.5 to 100 Mbps
Aspera fasp transfer over WAN to EC2 | up to 700 Mbps

#2 - Transfer Data from EC2 to S3 | Effective throughput
Standard single stream HTTP | 20 to 100 Mbps
Aspera S3 Proxy with parallel I/O HTTP streams | up to 700 Mbps

fasp™ | 45 Mbps | 100 Mbps | 200 Mbps | 1 Gbps | 5 Gbps | 10 Gbps
1 GB | 3.2 min | 1.4 min | 42 sec | 8.4 sec | 1.6 sec | 0.8 sec
10 GB | 32 min | 14 min | 7 min | 1.4 min | 16 sec | 8.2 sec
100 GB | 5.3 hrs | 2.3 hrs | 1.2 hrs | 14 min | 2.7 min | 82 sec
1 TB | 2.1 days | 23 hrs | 11.7 hrs | 2.3 hrs | 28 min | 14 min
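The transfer times in the table follow from simple arithmetic: time = size × 8 / line rate. The sketch below assumes binary units (1 GB = 2^30 bytes) and perfect link utilization. Both are assumptions, so its results only approximate the slide's figures rather than reproduce them exactly.

```python
GIB = 2 ** 30  # assumed: binary gigabyte; the slides do not state which unit is used


def ideal_transfer_seconds(size_bytes: int, rate_bps: float) -> float:
    """Time to move size_bytes at rate_bps, assuming 100% link utilization."""
    return size_bytes * 8 / rate_bps


def humanize(seconds: float) -> str:
    """Render a duration in the largest unit that fits."""
    for unit, span in (("days", 86400), ("hrs", 3600), ("min", 60)):
        if seconds >= span:
            return f"{seconds / span:.1f} {unit}"
    return f"{seconds:.1f} sec"


rates_bps = {"45 Mbps": 45e6, "100 Mbps": 100e6, "200 Mbps": 200e6,
             "1 Gbps": 1e9, "5 Gbps": 5e9, "10 Gbps": 10e9}
sizes = {"1 GB": GIB, "10 GB": 10 * GIB, "100 GB": 100 * GIB, "1 TB": 1024 * GIB}

for size_label, size_bytes in sizes.items():
    row = [humanize(ideal_transfer_seconds(size_bytes, rate))
           for rate in rates_bps.values()]
    print(size_label.ljust(7), " | ".join(row))
```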

Page 23

Aspera On-Demand S3-direct

[Diagram labels: Aspera Client &/or Server, Aspera Connect Browser Plugin, Aspera Mobile; WAN; fasp; Aspera On-Demand Server (EC2); Local HD; Elastic Block Store – EBS (NAS); fasp; S3; EC2 HTTP]

Page 24

Applications: 2K/4K Global Freighting

Page 25

Applications: 2K/4K Global Freighting (faspframes)

• Native 2K/4K frame transport software

• Designed for 10Gbps WANs

• Millions of frame files

• 60 min of footage (1 TB) transferred globally in under 20 minutes!

• 8 Gbps at 200 ms / 2%

Page 26

faspframes – Ultra Simple, Ultra Fast s/w for 2K/4K Transfers

Aspera faspframes Transfer Times (10 Gbps Global WANs)

Distance | Speed | Transfer Time for 1 TB (~60 min of Film)
LA-NY (100 ms / 1%) | 8.1 Gbps | 18.1 minutes
LA-London (200 ms / 2%) | 7.9 Gbps | 18.6 minutes
LA-Mumbai (300 ms / 5%) | 6.3 Gbps | 23.3 minutes

Compare To: HW Appliance for 2K/4K Transfers – Highest Capacity Model

Distance | Speed | Transfer Time for 1 TB (~60 min of Film)
LA-NY (100 ms / 1%) | 3.6 Gbps | 42 minutes
LA-London (200 ms / 2%) | No data | ??
LA-Mumbai (300 ms / 5%) | No data | ??

Page 27

faspframes – Ultra Simple, Ultra Fast s/w for 2K/4K Transfers

What is it?

• An ultra-simple software tool for ultra-fast (fully reliable) transfers of 2K/4K frame files

• Max speed in-order transfer of 2K/4K frame files over WAN (any distance, any bandwidth)

• Available for users of Aspera Point-to-Point and Server

Advantages?

• Software-only application; integrates easily with any workflow

• No clunky brute force hardware appliances to integrate

• Full 10 Gbps performance; 2X the best speeds published by appliances

• Comprehensive bandwidth management and congestion control

• Seamlessly integrates with Aspera transfer and management tools

Platforms?

• Linux 32/64-bit

• Other platforms coming

Page 28

Big-Data: Accessed & Delivered by Aspera

Page 29

For more information contact:

[email protected]