Cloud Economics: design, capacity and operational concerns for e-Commerce Platforms Marcos Garcia Senior IT Architect


What is Cloud?

Cloud is not Dropbox or iCloud. Those are Cloud Storage services.

Cloud Computing is defined as [1]

1. On-demand self-service - customers can get compute/storage/networking resources with just an email ID and a payment option (credit card)

2. Broad network access - resources can be accessed anywhere, anytime

3. Resource pooling - access to vast amounts of shared resources

4. Rapid elasticity - scale up or down, immediately

5. Measured service - pay as you go

[1] http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf

Cloud 101

Cloud 101: Storage and Network Resources

- Cloud Storage services
- Cloud Network services (including streaming)

Cloud 101: Compute Resources

Pre-2006 [2]

- Web Hosting
- Virtual Private Servers
- Virtualized Servers
- Dedicated Server Hosting
- Collocation services

Post-2006 - same as before, plus:

- Infrastructure as a Service
- Platform as a Service
- (Software as a Service)
- Containerized IaaS
- Lambda - code execution service [3]

[2] https://en.wikipedia.org/wiki/Amazon_Web_Services#History
[3] https://aws.amazon.com/blogs/aws/run-code-cloud/

Cloud 101: Structure of a web application

A web application is described using layers:

● The website Code and Images are executed by a Web Server
● Code is written in a language (PHP, Java, Python or Ruby) that requires Libraries to run
● It needs a Database to save customer data
● Everything runs on top of an Operating System (Windows Server, Linux, Solaris)
● The OS uses Drivers to abstract the Compute, Network and Storage resource details

[Diagram: web application layers: Website Code, Website Images, Web Server, Database, Libraries, Operating System, Drivers, and the hardware underneath (CPU, RAM, Network Card, Storage)]

Web Hosting

It is simply a Web Server where we can upload our website Code and Images, and use a shared Database.

Typical cost: $3-5/month

Examples:

● 1and1.com
● godaddy.com
● ehost.com
● peer1.com

[Diagram: Web Hosting subsystem shared by Customer A and Customer B: per-customer Website Code and Website Images; shared Web Server, Shared Libraries and Shared Database; one Operating System, Drivers, CPU, RAM, Network Card and Storage]

Virtual Private Server

It offers a dedicated Operating System that runs on a server shared with many other customers, isolated from each other by special VPS software. It comes with pre-installed Web Servers and Databases. Each customer manages everything above the OS, and the provider manages the OS and the layers below.

Typical cost: $5-10/month

Examples:

● DigitalOcean
● Softlayer VPS
● OVH VPS

[Diagram: VPS subsystem with Customer A and Customer B, each with its own Web Server, Website Code, Website Images, Database and Libraries, over a common Operating System, Drivers and hardware]

Virtualized Server

It offers an isolated and dedicated Operating System that runs on a physical server shared with a few customers, isolated thanks to a hardware feature called VT (Virtualization Technology). Each Virtualized Server (or VM) starts as an empty guest Operating System, which can be different from the host's. Each customer manages everything inside their guest OS instance, and the provider manages the host OS and the layers below.

Typical cost: fixed price, $30 to $300/month (per guest *)

Examples:

● peer1 virtual cloud servers
● Rackspace Managed vCloud
● OVH vSphere as a Service
● VMWare vCloud Air

* Often, a minimum number of guests is required (e.g. 10)

[Diagram: Customer A's empty guest with Libraries; Operating System 1 and Operating System 2, each on its own Virtual Drivers]

Dedicated Server

It offers an isolated and dedicated physical server, that is collocated next to another customer’s server, so the only shared resource is the network traffic, that has to be isolated using the provider’s network features.

Typical cost: $80 to $500/month

Examples:

● peer1 virtual cloud servers
● Rackspace Managed vCloud
● OVH vSphere as a Service
● VMWare vCloud Air

[Diagram: four separate physical servers belonging to Customer A and Customer B, each with its own full stack (Website Code, Website Images, Web Server, Database, Libraries, Operating System, Drivers, CPU, RAM, Net. Card, Storage), connected through the Provider Network Infrastructure]

Infrastructure as a Service

‘Invented’ in 2006 by Amazon Web Services, under the name Elastic Compute Cloud (EC2). It’s an improved Virtualized Server, split into 3 core resources, all managed through a dedicated API (Application Programming Interface) and billed with multiple pricing metrics. Customers who register with an email address and a credit card have access to unlimited resources, as long as they can pay for them.

● Compute instance (hours per month)
● Storage
  ○ Operating System image (free)
  ○ Block space (Gigabytes used/hour)
  ○ Object (# files and GB transferred/hour)
● Network
  ○ Private IP settings (free)
  ○ Public IP settings (# of IPs per month)
  ○ DNS as a Service (changes per month)
  ○ Load Balancer as a Service (Gbps transferred/month)
  ○ Firewall as a Service (Gbps transferred/month)

[Diagram: a Customer Instance (VM) with the full web application stack on virtual Drivers, managed through the Compute Instance API Service, Network API Service and Storage API Service, with Metered Usage (used hours/GB per month)]

Virtualization vs Cloud/IaaS

Virtualization is for Virtual Machines that will last many months; when they fail, we need to repair them as quickly as possible.

Typically, only one VM contains a particular component of the Web Application (SPOF - single point of failure). So when the VM fails, a module of our website fails with it, maybe even causing total downtime.

Virtualization offers best-of-breed protection mechanisms for VMs, to reduce the probability of an infrastructure failure.

It is priced as a fixed amount per month.

Cloud/IaaS is for Virtual Machines that will last hours, days or weeks; whenever they fail, we just launch a new one.

The Web Application software is deployed across dozens of small VMs that cover for the faulty one (no SPOFs). The application is designed to expect failure, so the service is not affected when a VM goes down.

IaaS offers no protection to the VMs, but the API signals the failure immediately to our software, so it can react accordingly.

Pricing is variable and subject to usage metrics.

Read this: http://www.theregister.co.uk/2013/03/18/servers_pets_or_cattle_cern/

Public vs Private Cloud

Public cloud is what we’ve already seen: IaaS offered by a huge service provider, with millions of resources available. It also offers extra services on top of IaaS that are much appreciated by software developers.

The use of shared resources involves extra auditing effort (e.g. PCI-DSS) for the customer.

Although some say it has more advantages than disadvantages, it can quickly become more expensive than expected if it’s not used where it’s appropriate.

Private cloud is an IaaS deployment on a limited amount of resources, owned by a private entity and used exclusively inside the company’s perimeter. It is often seen as the most secure option due to that isolation.

It has no economy of scale, as commercial servers are more expensive than those used by public cloud providers (see OCP [4]).

Furthermore, employees need new skills to operate private clouds, which means there is a learning curve that may cause the private cloud to be less reliable than the public cloud.

[4] http://www.opencompute.org/

Cloud revolution

By dynamically matching capacity to demand, the infrastructure now allows a lean growth model, key enabler of the startup economy

[Diagram: Infrastructure as a Service and its five characteristics: on demand self-service, broad access, resource pooling, rapid elasticity, measured service]

Image: http://www.dynco.co.uk/wp-content/uploads/2015/09/business-growth-1024x640.png

e-Commerce Hosting Requirements

e-Commerce Infrastructure

[Diagram: same as before, the Web Application layers: Website Code, Website Images, Web Server, Database, Libraries, Operating System, Drivers, CPU, RAM, Network Card, Storage]

But what kind of Infrastructure do we need?

Typical answers

● Web Hosting is OK for basic eCommerce
● Virtualized Servers are OK - better isolation than VPS
● Collocated / Dedicated Servers make PCI-DSS compliance harder
● IaaS is OK for complex eCommerce
● PaaS, SaaS, Containers, Lambda: often too complex and very new, only for big companies

e-Commerce Traffic Analysis

A complex Web Application, with 2 main functions:

● Display our products
  ○ Show pictures and detailed information
  ○ Customer reviews
  ○ Intelligent tracking of customer preferences (based on browsing history)
  ○ Uses SEO techniques to attract visitors from other sites (e-Marketing)
● Allow customers to purchase our products
  ○ Shopping Cart function
  ○ Integration with Credit Card processing systems, PayPal, or any other B2B systems
  ○ Storage of sensitive customer information, subject to government or industry regulation (SOX, PCI-DSS, PIPED Act, etc.)
  ○ It’s a common target for hacker attacks, phishing and other threats.

Sidenote: PCI-DSS

Annual audits are required to prove that a company that stores Credit Card information:

● Builds and maintains a secure network
● Protects cardholder data
● Maintains a vulnerability management program
● Implements strong access control measures
● Regularly monitors and tests networks
● Maintains an information security policy

e-Commerce Demand Analysis

[Charts: Daily Variation (night vs day); Yearly Variation (high-season)]

Note the 2 kinds of traffic, aligned with the 2 functions from earlier:

● visitors only browse our product listing while they decide whether to buy or not
● buyers click on the ‘order’ button and enter their credit card information to make the purchase

A slow website (>3 sec per page) loses money [5]

You need to provide enough resources to your website.

[5] http://www.peer1.ca/knowledgebase/how-slow-website-impacts-your-visitors-and-sales

Case Study

A model for Web Site performance

Let’s suppose the following:

● Compute Unit (CU): the amount of server resources (CPU, RAM, etc.) required to properly serve a website visitor for 10 minutes
  ○ A small server can handle 60 CUs per hour, or 10 every 10 minutes.
● Visitor: a regular visitor that browses our website, clicks on images, reads the descriptions, etc.
  ○ Browsing our website catalog requires 1 CU.
● Buyer: the most important kind of visitor, one browsing our Shopping Cart section, which means they’re halfway through the purchase process, where they give us their personal details and credit card information
  ○ Going through the purchase process requires 5 CUs.
● It takes more CUs to serve a buyer than to serve a visitor (here, 5 times more), due to the storage of personal data, credit card validation, checkout process, etc.

Are you smarter than a 5th grader?

Remember: 60 CUs per server per hour. 1 visitor = 1 CU. 1 buyer = 5 CUs.

● How many visitors can 1 server serve per hour (on average)?

Answer: 60

● How many buyers can 1 server serve per hour (on average)?

Answer: 12

● What is the maximum number of buyers 1 server can serve in 1 day?

Answer: 288 (12 buyers per hour x 24 hours)
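The three answers can be checked with a few lines of Python. This is only a sketch of the slide's model: the constant names and the `served_per_hour` helper are made up for illustration, and the numbers are the ones assumed above (60 CUs per server per hour, 1 CU per visitor, 5 CUs per buyer).

```python
# Constants taken from the model on the slides.
CU_PER_SERVER_HOUR = 60
CU_PER_VISITOR = 1
CU_PER_BUYER = 5


def served_per_hour(cu_per_person: int, servers: int = 1) -> int:
    """People of one kind that a fleet can serve in one hour (hypothetical helper)."""
    return (CU_PER_SERVER_HOUR * servers) // cu_per_person


visitors_per_hour = served_per_hour(CU_PER_VISITOR)
buyers_per_hour = served_per_hour(CU_PER_BUYER)
buyers_per_day = buyers_per_hour * 24  # a server running all 24 hours

print(visitors_per_hour, buyers_per_hour, buyers_per_day)  # 60 12 288
```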

Daily Demand: Low vs High season
(example hourly values - best case)

| Period | Visitors/h | Buyers/h |
|---|---|---|
| Night time min, low-season | 60 | 6 |
| Day time max, low-season | 200 | 20 |
| Night time min, high-season | 100 | 10 |
| Day time max, high-season | 1000 | 100 |

Peak Demand
(best-case average vs worst-case)

| Period | Visitors/h | Visitors/10min | Buyers/h | Buyers/10min | Total CU equivalent / 10 min | Total CU per hour | Worst-case total servers (ALL visits in 10 min) | Best-case total servers (hourly average) |
|---|---|---|---|---|---|---|---|---|
| Night time min, low-season | 60 | 10 | 6 | 1 | 10+(1*5) = 15 | 60+(6*5) = 90 | 90/10 = 9 | (60+6*5)/60 = 1.5 |
| Day time max, low-season | 200 | 33 | 20 | 3.3 | 50 | 300 | 30 | 5.0 |
| Night time min, high-season | 100 | 16.7 | 10 | 1.7 | 25 | 150 | 15 | 2.5 |
| Day time max, high-season | 1000 | 167 | 100 | 16.7 | 250 | 1500 | 150 | 25 |

Remember: 60 CUs per server per hour. 1 visitor = 1 CU. 1 buyer = 5 CUs.
Equivalent to: 10 CUs per 10 minutes means 10/1 = 10 visitors every 10 minutes, or 10/5 = 2 buyers every 10 minutes.

[Chart: Buyers’ demand over one hour, in 10-minute intervals from 10min to 60min (similar for visitors)]
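The two server columns of the Peak Demand table can be reproduced with a short Python sketch. It assumes the same model (1 CU per visitor, 5 CUs per buyer, 60 CUs per server per hour, so 10 CUs per 10 minutes); the `DEMAND` dictionary and the function names are illustrative only.

```python
CU_PER_SERVER_HOUR = 60
CU_PER_SERVER_10MIN = CU_PER_SERVER_HOUR / 6  # 10 CUs every 10 minutes

# (visitors per hour, buyers per hour), copied from the Daily Demand table.
DEMAND = {
    "Night time min, low-season": (60, 6),
    "Day time max, low-season": (200, 20),
    "Night time min, high-season": (100, 10),
    "Day time max, high-season": (1000, 100),
}


def cu_per_hour(visitors: int, buyers: int) -> int:
    """Hourly demand in Compute Units: 1 CU per visitor, 5 CUs per buyer."""
    return visitors * 1 + buyers * 5


def best_case_servers(visitors: int, buyers: int) -> float:
    """Demand spread evenly over the hour."""
    return cu_per_hour(visitors, buyers) / CU_PER_SERVER_HOUR


def worst_case_servers(visitors: int, buyers: int) -> float:
    """All of the hour's demand arrives within a single 10-minute window."""
    return cu_per_hour(visitors, buyers) / CU_PER_SERVER_10MIN


for period, (v, b) in DEMAND.items():
    print(period, cu_per_hour(v, b), worst_case_servers(v, b), best_case_servers(v, b))
```

The worst case is always 6 times the best case, because one hour holds six 10-minute windows.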

3-year Budget

With all variables in hand, can we prepare a budget for the infrastructure needed for the next 3 years?

We’ll look at 2 scenarios: private cloud vs public cloud

Option 1: Private Cloud
How many servers to buy?

More information

We’ve supposed a basic server is able to deliver 60 CUs per hour.

We know, thanks to our providers, that the average server sold nowadays can deliver 480 CUs per hour, thanks to multi-core technology.

The average price is $3000 (CAPEX).

Collocation, electricity and other maintenance fees amount to $2000 for the first three years (OPEX).

Our accountant will amortize the servers over 3 years; after that OPEX will increase, so renewing the servers every 3 years is recommended.

Right-scaling issues

When purchasing a Fixed Capacity, we risk undersizing our infrastructure, which means we’re not able to serve our customers during the peak hours, losing potential revenue and maybe damaging our website’s reputation (too slow, unresponsive, faulty...)

Furthermore, we may also be oversizing our infrastructure, which means we’ve spent too much, risking our financial health.

We’ll see later how public cloud offers Elasticity: as long as our software supports it, we can closely adjust the infrastructure, adding or removing capacity according to the real-time demand.
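To make the undersizing and oversizing risk concrete, here is a minimal Python sketch. It assumes the slide's model (60 CUs per basic server per hour, 1 CU per visitor, 5 CUs per buyer) and a hypothetical fixed fleet of 10 basic servers; the function names are illustrative and do not correspond to any cloud provider's autoscaling API.

```python
import math

CU_PER_SERVER_HOUR = 60


def desired_servers(visitors_per_hour: int, buyers_per_hour: int, min_servers: int = 1) -> int:
    """Basic servers an elastic setup would run for the current hourly demand."""
    cu = visitors_per_hour * 1 + buyers_per_hour * 5
    return max(min_servers, math.ceil(cu / CU_PER_SERVER_HOUR))


def fixed_capacity_gap(fixed_servers: int, visitors_per_hour: int, buyers_per_hour: int) -> dict:
    """Compare a fixed fleet with what the demand actually needs."""
    needed = desired_servers(visitors_per_hour, buyers_per_hour)
    return {
        "needed": needed,
        "idle": max(0, fixed_servers - needed),   # oversized: paid for but unused
        "short": max(0, needed - fixed_servers),  # undersized: customers not served
    }


# Hypothetical fixed fleet of 10 basic servers.
print(fixed_capacity_gap(10, 60, 6))       # low-season night: mostly idle
print(fixed_capacity_gap(10, 1000, 100))   # high-season day: not enough servers
```

With a fixed fleet, the same purchase is both too large at night and too small at the seasonal peak; an elastic setup would simply run `desired_servers(...)` machines at each moment.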

Deciding the size of our Private Cloud

Remember: a $3000 server with 8 CPU cores is 8x more powerful than the basic server we used before (480 vs 60 CUs per hour).

| Period | Visitors/10min | Buyers/10min | Worst-case total servers (6x peak) | Worst case, 8-CPU servers | Best-case total servers (no peaks) | Best case, 8-CPU servers |
|---|---|---|---|---|---|---|
| Night time min, low-season | 10 | 1.0 | 9 | 1.13 | 1.5 | 0.19 |
| Day time max, low-season | 33 | 3.3 | 30 | 3.75 | 5.0 | 0.63 |
| Night time min, high-season | 16.7 | 1.7 | 15 | 1.88 | 2.5 | 0.31 |
| Day time max, high-season | 167 | 16.7 | 150 | 18.75 | 25 | 3.13 |

How many servers do we buy?

Between 3.13 and 18.75, we need to compromise. We’re going to size only for instant peaks of 2x the average hourly rate (2 x 3.13 = 6.25), so we pick 6 servers. Using a $3000 server (CAPEX) that costs $2000 over 3 years to maintain (OPEX), we need 6 x $5,000 = $30,000 for the 3-year period (we are supposing the same demand every year).
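The sizing decision can be written down as a short Python sketch. The figures come from the slides ($3000 CAPEX and $2000 of 3-year OPEX per server, 8x the basic server, a 2x peak factor over the high-season daytime average of 25 basic servers); the variable names are illustrative. Note that the slides round 6.25 down to 6 servers; rounding up with `math.ceil` would give 7, which is a more conservative choice.

```python
SERVER_CAPEX = 3000         # purchase price per 8-CPU server
SERVER_OPEX_3Y = 2000       # collocation, electricity, maintenance over 3 years
BIG_SERVER_FACTOR = 8       # one 8-CPU server does the work of 8 basic servers
PEAK_FACTOR = 2             # size for instant peaks of 2x the hourly average

best_case_basic_servers = 25  # high-season daytime, hourly average

avg_big_servers = best_case_basic_servers / BIG_SERVER_FACTOR  # 3.125
servers_to_buy = round(PEAK_FACTOR * avg_big_servers)          # 6.25 rounds to 6
three_year_cost = servers_to_buy * (SERVER_CAPEX + SERVER_OPEX_3Y)

print(servers_to_buy, three_year_cost)  # 6 30000
```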

Option 2: Public Cloud
Can we forecast the cost?

More information

On average, a basic server able to deliver 60 CUs per hour costs around $20 per month. But it is billed per hour, so assume $0.028 per hour ($20 over 720 hours; we’re including storage and network costs).

We’re assuming our software can leverage the Elasticity and Autoscaling features of the public cloud, so the number of servers running will be almost exactly the number required to properly serve the visitors/buyers at any given time.

* The sum of hours is 8640; 8640 / 24 h = 360 days in a year (approximation)

Forecasting the size of our Public Cloud

In this case, we need to calculate the number of hours per year our basic servers will be powered on.

● Night time length: 12 h
● Day time length: 12 h
● Low-season: 9 months
● High-season: 3 months

| Period (using 360 days/year) | Hours/y | Visitors/h | Buyers/h | CU/h equivalent | Total CU/y |
|---|---|---|---|---|---|
| Night time min, low-season | 3240 * | 60 | 6 | 90 | 291600 |
| Day time max, low-season | 3240 | 200 | 20 | 300 | 972000 |
| Night time min, high-season | 1080 | 100 | 10 | 150 | 162000 |
| Day time max, high-season | 1080 | 1000 | 100 | 1500 | 1620000 |

Total CU: 3,045,600. If a basic server can do 60 CU/h and costs about $0.028/h, how much will it cost over 3 years?

We need 3,045,600 / 60 = 50,760 server hours per year. At about $0.028/h ($20 / 720 h), that is $1,410 per year, or $4,230 over 3 years.
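The forecast can be reproduced in Python as a sanity check. The sketch assumes the $20 per month price is spread over 720 hours (30 days), which is the hourly rate that yields the $1,410 per year quoted on the slide; the `SEGMENTS` list and variable names are illustrative.

```python
CU_PER_SERVER_HOUR = 60
MONTHLY_PRICE = 20          # dollars per basic server per month
HOURS_PER_MONTH = 720       # assumption: 30 days x 24 hours

# (hours per year, visitors per hour, buyers per hour), using 360 days/year.
SEGMENTS = [
    (3240, 60, 6),       # night, low-season
    (3240, 200, 20),     # day, low-season
    (1080, 100, 10),     # night, high-season
    (1080, 1000, 100),   # day, high-season
]

# 1 CU per visitor and 5 CUs per buyer, summed over every hour of the year.
total_cu = sum(hours * (visitors + 5 * buyers) for hours, visitors, buyers in SEGMENTS)
server_hours = total_cu / CU_PER_SERVER_HOUR
yearly_cost = server_hours * MONTHLY_PRICE / HOURS_PER_MONTH
three_year_cost = 3 * yearly_cost

print(total_cu, server_hours, round(yearly_cost), round(three_year_cost))
```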

Extra Option: CDN
Add a Content Delivery Network

What is a CDN?

A CDN is a key service that companies contract to help scale their web services and deal with traffic peaks, especially for image-heavy content: users get a better browsing experience when they download it from a CDN instead of a central server. It can only offload ‘read-only’ or non-transactional requests. Examples include Akamai, Amazon CloudFront, etc.

It is priced with a fixed portion plus a variable portion that depends on the traffic volume.

Example for our case (2 million visitors/year):

● Fixed fee: $200/month = $2,400 per year
● Variable fee: $0.50 per 1,000 visitors = $1,015 per year

3y fees: $10,245
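The CDN fees follow from the yearly visitor count, which the sketch below derives from the same demand table (about 2.03 million visitors per year). The prices are the example values from the slide; the variable fee is rounded to whole dollars, as the slide appears to do, and the names are illustrative.

```python
FIXED_FEE_PER_MONTH = 200        # dollars
VARIABLE_FEE_PER_1000 = 0.5      # dollars per 1,000 visitors

# (hours per year, visitors per hour), using 360 days/year.
VISITOR_SEGMENTS = [
    (3240, 60),      # night, low-season
    (3240, 200),     # day, low-season
    (1080, 100),     # night, high-season
    (1080, 1000),    # day, high-season
]

visitors_per_year = sum(hours * visitors for hours, visitors in VISITOR_SEGMENTS)
fixed_fee_per_year = FIXED_FEE_PER_MONTH * 12
variable_fee_per_year = round(visitors_per_year * VARIABLE_FEE_PER_1000 / 1000)
three_year_fees = 3 * (fixed_fee_per_year + variable_fee_per_year)

print(visitors_per_year, fixed_fee_per_year, variable_fee_per_year, three_year_fees)
```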

Offloading visitors to the CDN

It can effectively remove all the load related to Visitors, leaving only the Buyers to be handled by our servers (either Private or Public).

Public cloud (Visitors now in CDN):

| Period | Visitors | Buyers/h | CU/h | Total CU/y |
|---|---|---|---|---|
| Night time min, low-season | - | 6 | 30 | 97200 |
| Day time max, low-season | - | 20 | 100 | 324000 |
| Night time min, high-season | - | 10 | 50 | 54000 |
| Day time max, high-season | - | 100 | 500 | 540000 |

Total 3 year cost: $1,410 ($2,820 savings)

Private cloud (Visitors now in CDN):

| Period | Worst-case total servers | Worst case, 8-CPU servers | Best-case total servers | Best case, 8-CPU servers |
|---|---|---|---|---|
| Night time min, low-season | 3 | 0.38 | 0.5 | 0.06 |
| Day time max, low-season | 10 | 1.25 | 1.7 | 0.21 |
| Night time min, high-season | 5 | 0.63 | 0.8 | 0.10 |
| Day time max, high-season | 50 | 6.25 | 8.3 | 1.04 |

Total cost, 3y, 2 servers (can handle 2x peaks): $10,000 ($20,000 savings)

We’ve effectively offloaded ⅔ of our traffic to the CDN, so we’re saving 66% in infrastructure costs, but we still need to cover the $10,245 for the CDN fees.

Looks like public cloud is the winner

There is quite a difference between the 3-year cost of Elastic Public Cloud ($4,230) and the 3-year server cost of Private Cloud at max. capacity ($30,000, or $20,245 with CDN).

Even with a CDN, public cloud offers the lowest costs
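A small Python sketch puts the 3-year server costs side by side. The inputs are the totals computed on the previous slides; the "public cloud + CDN" figure is not stated in the deck and is derived here by adding the CDN fees to the reduced public cloud cost, so treat it as an assumption.

```python
CDN_FEES_3Y = 10245

# 3-year costs in dollars, taken from the earlier slides.
costs = {
    "private cloud": 30000,                     # 6 servers
    "private cloud + CDN": 10000 + CDN_FEES_3Y,  # 2 servers plus CDN fees
    "public cloud": 4230,
    "public cloud + CDN": 1410 + CDN_FEES_3Y,    # derived, not on the slides
}

for option, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{option}: ${cost:,}")

cheapest = min(costs, key=costs.get)
```

Under these example numbers the CDN pays for itself only in the private cloud case, where it removes the need for 4 servers; on the public cloud the CDN fees exceed the compute savings.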

But it’s not that simple...

Is server cost the only attribute of TCO?

Other CAPEX factors

| Factor | Private Cloud | Public Cloud |
|---|---|---|
| Procurement (HW) | Expensive | Zero-cost |
| Software development | Moderate, more expensive than traditional virtualization due to lack of platform’s High-Availability | Expensive: it’s harder to write an Elastic software than a traditional one |
| Auditing | Cheap when network is well designed. Otherwise, expensive | Moderate |
| Systems design & architecture | Moderate (not so different to Traditional Virtualization) | Expensive |
| CDN setup | Cheap | Cheap although network fees may increase if different provider |

Other OPEX factors

| Factor | Private Cloud | Public Cloud |
|---|---|---|
| Salaries | Moderate (skills are almost the same as traditional virtualization) | Expensive (high-demand) |
| HW Operating costs | Moderate: power, cooling, replacement parts, etc | Zero-cost |
| SW maintenance | Cheap, we can apply security by isolation, as our servers are inside a secure perimeter. | Moderate, APIs may change over time and we’ll be forced to update (lock-in factor) |
| HW Maintenance | Moderate, but cheaper than virtualization, no need for emergency repairs | Zero-cost |
| Tech Support | Cheap, less than traditional virtualization | Expensive, extra SaaS services may be needed (capacity optimization, security and performance monitors, etc) |

Other aspects to be considered

Financing

● Borrowing or raising money often requires a budget estimate, which makes the purchase of servers more attractive than outsourcing to public cloud.
● Physical servers can also be leased instead of purchased, at very attractive rates.

Accounting

● Some may prefer to own assets (like actual servers) instead of outsourcing their IT servers
● Tax deductions may be available when buying servers (private cloud)
● Some servers can be amortized over up to 5 years, reducing the CAPEX burden

Uncertainty and resilience

● There is no doubt that public cloud offers a zero-engagement model that allows companies to cut back on their fixed costs and turn them into variable costs
● This makes the company more resilient to market fluctuations.

Hybrid Cloud

Hybrid is the most complex solution

It is often used as a way to combine the best of both worlds. For instance, we store all sensitive information on our premises (private cloud) but we keep the non-sensitive parts of the website in the public cloud.

This way, the PCI-DSS and other audits are smaller in size and in complexity.

We can use the private cloud to do Dev/Test/QA and save on costs that otherwise would have increased the Public Cloud expense.

Other IT compliance rules and regulations may force the use of private cloud and forbid CDN techniques, leaving us with public cloud as the only scale-out option for the times of the year when demand exceeds our private cloud resources.

When to use Hybrid?

As a rule of thumb, keep the important things (mission critical) close to your business, and let others deal with less important workloads.

http://blogs.vmware.com/vcloud/files/2012/11/bluelockwebinar2.png

In conclusion

Public cloud is the best option for some (e.g. variable and bursty demand); for others it’s Private cloud (e.g. more constant demand).

You need to analyze your particular case and consider all quantitative aspects (CAPEX, OPEX, time to market) as well as qualitative ones (page load, flexibility, risk management, security and vendor lock-in). Think before you make any final decision, such as buying servers or re-architecting your software to run on public clouds.

Remember: not every workload is suited to run on a cloud, either public or private. If it works, wherever it is hosted (virtualization, bare metal), don’t change it. There’s often no need to jump on new technology unless you need the extra competitive differentiation that a new architecture can bring.

Doubts?

Thank you!