EE616 Technical Project Video Hosting Architecture By Phillip Sutton

Problem Description

• Need to store and serve massive amounts of video data.

• Solution must be:
  – Scalable
  – Reliable
  – Relatively fast

Complications

• Oh yeah….

• Have relatively little cash.

• SO, need minimal startup costs!

Options

• YouTube… believe it or not.

• Build it yourself.

• Managed or dedicated hosting.

• Content Delivery Network (CDN).

• Amazon Simple Storage Service.

YouTube

• Free to use.

• 100 million videos served daily.

• Hosted on Google’s reliable and scalable infrastructure.

Video Sharing Site Comparison

Website | YouTube | Yahoo Video | Veoh | Vimeo
Unique Visitors per year | 205,593,000 | 48,026,000 | 11,476,000 | 569,000
Max Video Bit Rate (kbps) | ~200 [1] | 300 [3] | 1,500 | 1,600
Max Upload File Size (MB) | 100 [2] | 150 | 250 | 500/wk
Max Length (min) | 10 | N/A | N/A | N/A
Max Screen Size(s) | 320x240 | 320x240 | 640x480 | 1280x720 [4]
Host Format (streaming) | FLV | FLV | FLV | FLV
Processing Time | Up to several hours | Up to several hours | Few hours | Minutes [5]

[1] estimated  [2] increasing to 1 GB  [3] upcoming 700 kbps  [4] claims this capability

Drawbacks

• Limited file size
  – Need 4.7 GB.

• Limited bitrate
  – Implies relatively low quality.

• For higher bitrate sites
  – Still suffer from limited file size.

• No real options to manage library.

• No real options to monetize.
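
The file-size and bitrate limits can be put in rough numbers. The sketch below is only back-of-the-envelope arithmetic: the 4.7 GB figure, the ~200 kbps estimate, and the 10-minute limit come from the slides, while the two-hour runtime for a DVD-sized video is an assumption made purely for illustration.

```python
def average_bitrate_kbps(size_bytes: float, duration_min: float) -> float:
    """Average bitrate in kilobits per second for a file of the given size."""
    return size_bytes * 8 / (duration_min * 60) / 1000


def size_mb(bitrate_kbps: float, duration_min: float) -> float:
    """File size in megabytes (10^6 bytes) at a constant bitrate."""
    return bitrate_kbps * 1000 * duration_min * 60 / 8 / 1e6


DVD_BYTES = 4.7e9           # single 4.7 GB video, from the slides
ASSUMED_RUNTIME_MIN = 120   # assumption: two-hour runtime, not stated in the slides

# Bitrate needed to deliver a full DVD-sized file over its assumed runtime.
dvd_kbps = average_bitrate_kbps(DVD_BYTES, ASSUMED_RUNTIME_MIN)

# Size of a clip at the estimated ~200 kbps and the 10-minute maximum length.
clip_mb = size_mb(200, 10)

print(f"DVD-quality stream: about {dvd_kbps:.0f} kbps")
print(f"10-minute clip at 200 kbps: about {clip_mb:.0f} MB")
```

Under that assumption, a DVD-sized video averages several thousand kbps, well above every bitrate in the comparison table.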

Build It Yourself

• Have almost complete and utter control.

• No messy CDN contracts to deal with.

• Scalable, depending on your budget.

Drawbacks

• Expensive to start.

• Expensive to grow.

• Requires space, power, and resources.

• Requires knowledgeable manpower to maintain and support.

Drawbacks

http://www.acadweb.wwu.edu/dbrunner/P7040180.JPG

Managed/Dedicated Hosting

• Let someone else deal with it – setup, maintenance, and support.

• Mostly reliable
  – Many claim 99.9% uptime.

• Affordable to start
  – 500 GB of storage and 2,500 GB of bandwidth.
  – Costs about the same as a small efficiency apartment on Southside.
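
To make the 99.9% uptime claim concrete, it can be converted into the downtime it still allows. This is plain arithmetic rather than any provider's actual service terms; the 365-day year and 30-day month are simplifying assumptions.

```python
def allowed_downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Hours of downtime permitted in a period at the given uptime percentage."""
    return (1 - uptime_percent / 100) * period_hours


# Assumption: a 365-day year and a 30-day month.
per_year_hours = allowed_downtime_hours(99.9, 365 * 24)
per_month_minutes = allowed_downtime_hours(99.9, 30 * 24) * 60

print(f"99.9% uptime allows about {per_year_hours:.2f} hours of downtime per year")
print(f"or about {per_month_minutes:.1f} minutes per month")
```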

Drawbacks

• Can’t scale with you.

• Overage costs will get you!!!

• Can’t control hardware.

• Can’t make favorable networking agreements.

Content Delivery Networks

• Multiple data centers.

• Most have direct internet backbone access.

• Designed for performance.

• Replicate content.

Drawbacks

• Traditionally marketed to enterprises
  – Apple iTunes uses Akamai.

• Hard to figure costs without signing an agreement.

• Prepay for chunks of storage and bandwidth.

• Exceeding allocation can be costly.

• Pay for idle storage and bandwidth.

Amazon Simple Storage Service

• New kid on the block.

• Same infrastructure as Amazon.com
  – Scalable, high availability, low latency.

• Unlimited storage.

• Unlimited bandwidth.

• Pay only for what you use.

• No contracts; zero cost to start up.

Drawbacks

• New kid on the block.

• Latency perhaps not as good as CDNs.

• Bandwidth costs may still be an issue.

• No server side processing.

Comparing Costs

• Build library of 5,000 4.7 GB DVDs.

• Deliver 100 TB per month.

Cost | Hosted | CDN | Amazon S3
Storage | $241,000 | $23,552 | $3,523
Bandwidth | $141,312 | $29,696 | $15,153
Total Per Month | $382,312 | $53,248 | $18,676
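
The comparison can be reproduced from the storage and bandwidth figures in the table. The sketch below only sums the numbers given on the slide and compares the totals; it does not try to rederive the underlying per-GB prices, which the slides do not state. The dictionary layout and function name are illustrative choices.

```python
# Monthly costs in US dollars, copied from the slide.
MONTHLY_COSTS = {
    "Hosted": {"storage": 241_000, "bandwidth": 141_312},
    "CDN": {"storage": 23_552, "bandwidth": 29_696},
    "Amazon S3": {"storage": 3_523, "bandwidth": 15_153},
}


def monthly_total(option: str) -> int:
    """Storage plus bandwidth cost per month for one option."""
    return sum(MONTHLY_COSTS[option].values())


totals = {name: monthly_total(name) for name in MONTHLY_COSTS}
cheapest = min(totals, key=totals.get)

# How many times more expensive each option is than the cheapest one.
ratios = {name: totals[name] / totals[cheapest] for name in totals}

# Size of the library described on the slide: 5,000 DVDs of 4.7 GB each.
library_gb = 5000 * 4.7

for name, total in totals.items():
    print(f"{name}: ${total:,} per month ({ratios[name]:.1f}x the cheapest)")
print(f"Library size: {library_gb:,.0f} GB")
```

On these figures the hosted option costs roughly 20 times as much per month as S3, and the CDN a little under 3 times as much.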

S3 Overview

• Store objects up to 5 GB in size with metadata.

• Objects stored in buckets.

• Unlimited number of objects per bucket.

• Each bucket is owned by an Amazon Web Service (AWS) account.

• Each object is identified by a unique key.

• Use REST-style HTTP, SOAP, or HTTP GET/PUT interfaces.

• Supports BitTorrent protocol.

• Authorize requests with ACLs.

• Authenticated URLs can be created with time-bounded validity.

Oversimplified Architecture

[Diagram: Web Client, Web Server / CMS, S3]

Oversimplified Architecture

• Use S3’s online storage service together with an economy hosting/bandwidth provider.

• Use a content management system to track all assets stored on S3.

• Web client communicates with CMS and S3.

Upload Content

• Web client requests authentication keys from CMS.

• Once keys are received, client can send files directly to S3.

• Or send files to CMS without access keys.

• Then CMS forwards to S3.
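
The two upload paths can be sketched with in-memory stand-ins. Everything here is hypothetical: `FakeStore` stands in for S3, `CMS` for the content management system, and the random upload keys are a simplification of whatever credentials a real deployment would issue.

```python
import secrets


class FakeStore:
    """In-memory stand-in for the storage service (not the S3 API)."""

    def __init__(self):
        self.objects = {}
        self.valid_keys = set()

    def put(self, object_key, data, upload_key):
        if upload_key not in self.valid_keys:
            raise PermissionError("unknown upload key")
        self.objects[object_key] = data


class CMS:
    """Tracks assets and either issues upload keys or forwards files itself."""

    def __init__(self, store):
        self.store = store
        self.catalog = {}  # object key -> "pending" or "stored"
        self._own_key = self._new_key()

    def _new_key(self):
        key = secrets.token_hex(8)
        self.store.valid_keys.add(key)
        return key

    def issue_upload_key(self, object_key):
        """Path 1: the client gets a key and uploads directly to the store."""
        self.catalog[object_key] = "pending"
        return self._new_key()

    def confirm_upload(self, object_key):
        """Mark a direct upload as stored once the object is present."""
        if object_key in self.store.objects:
            self.catalog[object_key] = "stored"

    def receive_file(self, object_key, data):
        """Path 2: the client sends the file to the CMS, which forwards it."""
        self.store.put(object_key, data, self._own_key)
        self.catalog[object_key] = "stored"


store = FakeStore()
cms = CMS(store)

# Path 1: direct upload with a key issued by the CMS.
key = cms.issue_upload_key("videos/a.flv")
store.put("videos/a.flv", b"...video bytes...", key)
cms.confirm_upload("videos/a.flv")

# Path 2: upload through the CMS, no key handed to the client.
cms.receive_file("videos/b.flv", b"...video bytes...")
print(cms.catalog)
```

The direct path keeps large files off the web server's bandwidth, while the forwarding path keeps credentials away from the client, at the cost of moving every byte twice.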

Get Content

• Web client requests content from CMS.

• CMS issues authenticated URL with limited time to live.

• Client then has preset amount of time to retrieve file directly from S3.

Issues

• In addition to the drawbacks mentioned earlier:

• No server-side processing of scripts.

• Need to better handle read/write failures.

• Need to build your own software.
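
One common way to handle transient read/write failures is to retry with exponential backoff. The sketch below shows that general technique under stated assumptions; it is not something the slides prescribe, and the function name, attempt count, delays, and the choice of exceptions treated as transient are all illustrative.

```python
import random
import time


def with_retries(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying transient failures with exponential backoff.

    Waits base_delay, then 2x, then 4x (plus up to 50% random jitter) between
    attempts, and re-raises the last error once the attempts are exhausted.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise
            delay = base_delay * 2 ** attempt
            sleep(delay + random.uniform(0, delay / 2))


# Usage: a hypothetical upload that fails twice before succeeding.
state = {"calls": 0}


def flaky_upload():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("transient network error")
    return "uploaded"


result = with_retries(flaky_upload, sleep=lambda seconds: None)
print(result, "after", state["calls"], "calls")
```

The jitter spreads retries out so that many clients recovering from the same outage do not all hit the service at the same instant.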

Next

• Lots of work left to do.

• Create more detailed architecture.

• Work out code details.

• Implement and test scalability and performance.

Future

• Integrate with a content management system.

• Integrate with Amazon’s EC2 service.

• Explore BitTorrent protocol for increased throughput.

QUESTIONS
