Cloud Computing for a Smarter Planet
Cloud Computing Platform Services

Prof. Dr. Kristof Kloeckner, CTO and General Manager, Technology, Innovation and Automation, IBM Global Technology Services

November 9, 2015

© 2009 IBM Corporation



Agenda

Recap

Origin of Cloud Platforms
– Brief Overview of Commercial Platforms

Programming Models and Platforms

References

Company web sites: Amazon, Microsoft, Google, IBM, Salesforce.com
Tech blogs, for instance techblog.netflix.com
http://wiki.developerforce.com/page/Multi_Tenant_Architecture
Alan Brown, Enterprise Software Delivery, Addison-Wesley 2013
Gregor Hohpe, Bobby Woolf, Enterprise Integration Patterns, Addison-Wesley 2004
Jez Humble, David Farley, Continuous Delivery, Addison-Wesley 2010
Gene Kim et al., The Phoenix Project
Craig Larman, Bas Vodde, Scaling Lean & Agile Development, Addison-Wesley 2009
Open Group web site: www.opengroup.org/cloudcomputing
Mary and Tom Poppendieck, Lean Software Development: An Agile Toolkit, Addison-Wesley 2003
Eric Ries, The Lean Startup
George Reese, Cloud Application Architectures, O’Reilly 2009
John W. Rittinghouse, James F. Ransome, Cloud Computing: Implementation, Management and Security, CRC Press 2009
Andrew Tanenbaum, Maarten van Steen, Distributed Systems: Principles and Paradigms, Prentice-Hall 2009
Rich Schiesser, IT Systems Management, Prentice-Hall 2002
Jim Rymarczyk, Virtualization, Pre-Print 2009
Tivoli Service Automation Manager Solution Guide
Adam Wiggins, The Twelve-Factor App, 12factor.net
Bill Wilder, Cloud Architecture Patterns: Using Microsoft Azure, O’Reilly 2012


References – Downloads from Web

Michael Armbrust et al., Above the Clouds: A Berkeley View of Cloud Computing, Feb. 2009, http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf

Cloud Computing: Platform as a Service. InformationWeek Analytics, October 2, 2009

CSA. Top Threats to Cloud Computing V1.0 https://cloudsecurityalliance.org/topthreats/csathreats.v1.0.pdf

Cloud Use Cases White Paper Version 4, http://cloudusecases.org

DMTF: Architecture for Managing Clouds, Version 1.0.0, 2010-06-18

DMTF: Interoperable Clouds, Version 1.0.0, 2009-11-11

Luiz André Barroso and Urs Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Synthesis Lectures on Computer Architecture, 2009, http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006?cookieSet=1

Scott Crowder, Introduction to Workload Optimized Approach & Workload Market Segmentation, IBM White Paper, December 2009

David Chappell, A short introduction to Cloud, http://www.davidchappell.com/CloudPlatforms--Chappell.pdf

David Chappell, Cloud Platforms Today: A Perspective, April 2009 http://www.davidchappell.com/CloudPlatformsToday--APerspective--Chappell.pdf

Jeffrey Dean and Sanjay Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, labs.google.com/papers/mapreduce-osdi04.pdf

DeCandia et al., Dynamo: Amazon’s highly available key-value store, SOSP 2007, http://portal.acm.org/citation.cfm?id=1294281&dl=ACM&coll=ACM&CFID=47859964&CFTOKEN=98797782

European Network and Information Security Agency (ENISA), Cloud Computing: Benefits, risks and recommendations for information security, Nov 2009 (http://www.enisa.europa.eu)

Gregor Hohpe, Programming the Cloud, November 2009, http://www.enterpriseintegrationpatterns.com/docs/HohpeProgrammingCloudKeynote.pdf

Anna Liu, Architecting Cloud Applications – the essential checklist, AAF Keynote 2009

National Institute of Standards and Technology, Definition of Cloud Computing, http://csrc.nist.gov/groups/SNS/cloud-computing/

National Institute of Standards and Technology, NIST Cloud Computing Reference, Special Publication 500-292

Ning Duan et al., Tenant Behavior Analysis in Software as a Service Environment, ICSOC 2009

Daniel Nurmi et al., The Eucalyptus Open-source Cloud-computing System, http://www.cca08.org/papers/Paper32-Daniel-Nurmi.pdf

Open Cloud Manifesto, http://www.opencloudmanifesto.org/

OpenNebula.org – Various papers

B. Rochwerger et al., The Reservoir Model and Architecture for Open Federated Cloud Computing, IBM Journal of Research and Development, April 2009, http://www8.cs.umu.se/~elmroth/papers/ibmjrd2009.pdf

Werner Vogels, Eventually Consistent, ACM Queue, October 2008

Kees van Gelder, Elastic Data Warehousing in the Cloud, Vrije Univ. Amsterdam

Ying Huang et al., A Framework for Building a Low Cost, Scalable and Secured Platform for Web-Delivered Business Services, IBM Systems Journal, November 2009

Michael Yuan, Java PaaS Shootout, 4/5/11, IBM developerWorks

Agenda

Recap

Origin of Cloud Platforms
– Brief Overview of Commercial Platforms

Programming Models and Platforms

Cloud Services Spectrum

[Figure: spectrum from Cloud Enabled Workloads (existing middleware workloads; compatibility with existing systems) to Cloud Centric Workloads (emerging platform workloads; exploitation of new environments). Attribute labels: scalable, virtualized, elastic, multi-tenant; heterogeneous infrastructure, standardized infrastructure; automated lifecycle, integrated lifecycle.]

Changes happening at the intersection of workload, application and infrastructure lifecycle models

[Figure: stack of Analytics, Mobile, Social Applications; Services Platform (Micro Services); Orchestration & Automation; Software-defined Infrastructure; Delivery Organization. Vertical labels: DevOps; Service Management; Resiliency & Compliance. Annotation text: Digitization drives Systems of Engagement; DevOps & infrastructure flexibility; depend on; Value migration to LoBs; Infrastructure and delivery innovation enables Hybrid Delivery Models.]

Next Generation Cloud Platform

[Figure: layered platform architecture.
– API Economy / Software as a Service: External Ecosystem; Marketplace; Solutions; App; API-exposed services for Analytics, Commerce, Collaboration, Location, Data Services
– Services & Composition Patterns; API & Integration Services: Development, Big Data & Analytics, Security, Integration, Mobile, Social; Traditional Workloads
– Workload Definition, Optimization & Orchestration
– Resource Abstraction & Optimization: Software Defined Compute, Software Defined Storage, Software Defined Networking]

Next Generation Cloud Platform Architecture Built on Open Technologies

[Figure: three layers, each paired with a platform role.
– Software as a Service (SaaS): API Economy
– Platform as a Service (PaaS): Cloud Operating Environment
– Infrastructure as a Service (IaaS): Software Defined Environment
Open technologies named in the figure: OAuth, OpenShift, cloudfoundry.org, TOSCA, OSLC]

Developer Centric Platform, Marketplace & Services in a Cloud Operating Environment

Capability and value:
• OPEN ecosystem of composable services
• Optimized workload deployment
• Integration patterns with systems of record
• Fast, automated composition of services
• Repeatable patterns-of-expertise

[Figure: Cloud Operating Environment (data, mobile, development, operational, application and security services; Services & Composition Patterns; API & Integration Services; Traditional Workloads; cloudfoundry.org) on top of the Software Defined Environment (Workload Definition, Optimization & Orchestration; Software Defined Compute, Software Defined Storage, Software Defined Networking; Resource Abstraction & Optimization)]

Agile Service Composition

TASK: Create a secure application that analyses sentiment about certain topics in social media

[Figure: iterative cycle (ITERATE) of six numbered steps: Create app; Add database service; Extract social media data into database; Add social analytics service; Add monitoring service instance; Secure the service]

Services Fabric

[Figure: a services fabric that connects API-exposed enterprise capabilities (Bank, Telco, Retail) with API-exposed customer interaction channels (Social, Commerce, Mobile) and value-added services (Loyalty, Promotion, Payment). Further labels: Enterprise Patterns; Services Pattern. API Service Management covers Throttling, API Catalog, Monitoring and Governance.]

API Economy

Capability and value:
• Composition of services
• Marketplace of internal & external services
• Rapid application development & delivery
• API-accessible applications
• Multi-channel integration

[Figure: the API Economy layer (External Ecosystem; Marketplace; Solutions; App; API-exposed services for analytics, commerce, collaboration, location and data; OAuth) on top of the Cloud Operating Environment (data, mobile, dev, ops, application and security services; Services & Composition Patterns; API & Integration Services; Traditional Workloads) and the Software Defined Environment (Workload Definition, Optimization & Orchestration; Software Defined Compute, Software Defined Storage, Software Defined Networking; Resource Abstraction & Optimization)]

Next Generation Cloud Platform

[Figure: layered platform architecture.
– Software as a Service (SaaS) / API Economy: External Ecosystem; Marketplace; Solutions; Application; API-exposed services for Collaboration, Commerce, Analytics, Location, Data Services
– Platform as a Service (PaaS) / Cloud Operating Environment: Services and Composition Patterns; API and Integration Services; Middleware, Mobile, Datastore Services, Security, Ops, Dev’t; Traditional Workloads
– Infrastructure as a Service (IaaS) / Software Defined Environment: Workload Definition, Optimization and Orchestration; Software Defined Compute, Software Defined Storage, Software Defined Network; Resource Abstraction and Optimization]

Agenda

Recap

Origin of Cloud Platforms
– Brief Overview of Commercial Platforms

Programming Models and Platforms

Platform Services drive Eco-System Evolution

[Figure: Platform Components positioned between Infrastructure (IaaS) and Software (SaaS, BPaaS)]

Vendors develop platform technologies to differentiate IaaS and SaaS offerings, which evolve to include "* as a service" capability, ultimately enabling the build of a substantial ecosystem of ISVs and developers.

1. What platform services are required to increase attraction and loyalty of customers and partners to IaaS offerings?

2. What platform services are required to efficiently deliver SaaS and BPaaS offerings and to attract a substantial ecosystem?

3. What platform services and underlying components are common and serve both purposes? Do they support different deployment options?

The PaaS Market

PaaS is often presented as the fastest growing cloud segment
o $2.9 b in 2016, 30 % annual growth (Source: Gartner)
o 26 % AGR through 2014 (Source: CMSWire)

But it is still a very small market
o PaaS accounts for 1 % of the $109 b cloud industry (Source: Gartner)
o It is expected to reach 2 % of $209 b in 2016 (Source: Gartner)

It is difficult to get real data about PaaS revenues
o PaaS providers (which often provide other services as well) don’t give detailed numbers; some (Azure, AWS) give combined IaaS/PaaS figures
o Google App Engine announced 250,000 active users, up from 100,000 in May 2011 (Source: Google)
o Heroku (a Salesforce acquisition), one of the biggest PaaS providers, claims to have deployed 2.3 million apps (Source: Salesforce)
o However, this only accounts for a 0.27 % market share in the Alexa top 1M (Source: Datanyze)

The elements of a Cloud Application Platform

A cloud application platform combines multi-tenant hosting facilities, runtime and DevOps services and a marketplace of composable software parts.

[Figure: platform elements: DevOps; Vendor and Community Services; Fabric and Container Services; Runtime Services; Composable Software. Other labels: Hosting Techniques; Web Hosting; Linux Container; Warden; Enterprise Grade Middleware; Java App Server; J2EE; Web Services; SOA; API Economy.]

The PaaS ecosystem

[Figure: PaaS Ecosystem stack, ranging from technology to business, used by developers.
– Hosting: Data Center or IaaS
– Executable: Application Server, Frameworks, Language Run-Time, Container (Linux)
– Technical APIs: Relational Database, NoSQL Database, Message Queues, Rules, Big Data, Analytics (managed or not)
– Business APIs: API Economy, Marketplace, Integration API, Business Software APIs
– Application]

o This diagram is a breakdown of PaaS functionalities

o “Executable” relates to the technologies required to provision, deploy and run the software components

o “Technical APIs” refers to the set of middleware typically used to write software. This middleware can be offered as a managed service or not (for example, in Bluemix, Cloudant is a fully managed database whereas the MySQL service is not)

o “Business APIs” refers to the business-related APIs (for example, Salesforce CRM APIs) exposed to the application developer

Origin of Platform Services

Infrastructure as a Service
– Amazon
– VMware

Software as a Service
– force.com
– Microsoft Azure

Business Process (Solution) as a Service
– IBM Watson
– IBM Commerce

SAP Hana

o HTML5, JavaScript, SQLScript
o Extended App Service – XS (App Server)

o HANA DB
o Analytics (Predictive, Text mining …)
o Mobile, Big Data, Collaboration, Integration, Business Rules

SAP Infrastructure

No

Differentiators
SAP has built a Cloud Platform on its HANA in-memory database engine. It is used mainly to extend SAP core packages with specific analytics, reporting, visualization or integration modules.
There is a marketplace for third parties to offer their applications.

Google App Engine

o Python
o Java
o PHP
o Go

o Cloud SQL, Cloud Storage, BigQuery
o Google Cloud Endpoint
o Mail, SMS, Voice
o Translation API
o Task Queues, XMPP
o Search

Google Infrastructure

No

Differentiators
General purpose PaaS to build web applications on Google infrastructure. Fast deployment, simple administration and seamless scalability.
Developers can compose many Google services (translation, search, etc.) within their application.

Microsoft Azure

o C#
o Java
o PHP
o Ruby

o Visual Studio, Integration
o SQL Database, Big Data (HDInsight)
o Storage, Backup, Recovery
o API management
o Media services (live streaming, CDN …)
o Mobile App Backend
o HPC with broad partner ecosystem

Microsoft Infrastructure

No

Differentiators
Windows Azure started in 2008 with its PaaS service, before launching its compute / storage service in 2012 to counter Amazon. Initially Azure was targeting Microsoft developers (.NET model).

Heroku

o Python
o Java
o Node.js
o Ruby

o Data Stores (Postgres, Mongo, Redis …)
o Mobile (Push, SMS, MQTT …)
o Search, Logging, Queueing, Caching …
o Analytics services
o Payments
o Monitoring, Utilities
o Media (Encoder, streaming …)

Amazon AWS

Yes, build packs compatible with Cloud Foundry

Differentiators
Heroku has been acquired by Salesforce but has not been merged with Force.com. It is a general purpose PaaS running on Amazon Cloud.
A large network of partners is contributing to the “Heroku Add-ons” rich set of composable building blocks.

AWS Elastic Beanstalk

o .NET
o Java
o PHP, Node.js, Python, Ruby
o Docker

o All AWS Services
o Database (RDS, DynamoDB …)
o Analytics (EMR, RedShift …)
o Storage (S3, Glacier …)
o Media (Encoding …)
o Amazon Marketplace

Amazon AWS

No
Portability with Docker

Differentiators
AWS Elastic Beanstalk automatically handles the deployment details, capacity provisioning, load balancing and application health monitoring. It is built on top of AWS components like EC2.
As part of AWS, Elastic Beanstalk lets developers combine and leverage all AWS services (> 30).
The service is free and the user only pays for the underlying AWS cloud components.

CloudBees

o Java, Scala and other JVM based runtimes
o PHP, JavaScript
o Node.js

o Managed MySQL
o Integration
o Partner Services (Cloudant, MongoHQ, RabbitMQ …)
o DevOps services (Continuous integration with Jenkins)

o Amazon AWS
o HP Cloud, Openstack
o On premise

o Jenkins for continuous integration
o Tomcat, J2EE as runtime

Differentiators
Created by the former JBoss CTO, CloudBees provides a cloud based continuous integration platform.
DEV@Cloud provides the continuous integration environment and RUN@Cloud provides the runtime platform to host the application.
Focus is on best practices around continuous multi-branch build, test and deployment by leveraging the most successful open source tools like Jenkins, Maven, Ant, Git, etc.

Salesforce1 Platform (force.com)

o Proprietary
o Apex
o Visualforce (Graphical IDE)
o Mobile SDK

o Cloud Database with schema builder
o Salesforce APIs
o Oracle and SAP backend
o Analytics
o Workflows

o Salesforce data centers

o No

Differentiators
Salesforce’s first development platform, meant to create an ecosystem around its core CRM offering.
Apps can be exposed on the AppExchange. 220,000+ apps have been created so far.
There is a high focus on mobile development and on graphical “point and click” development. As such, Salesforce APIs and back end APIs for SAP / Oracle are exposed in the development studio in order for companies to create mobile or web interfaces, integration points or specific reports.

Pivotal Web Services

o Based on Cloud Foundry

o MySQL, MongoDB, PostgreSQL
o MemCached
o Message queues
o Search
o Load testing
o Email

o Pivotal data centers

o Yes (Cloud Foundry)

Differentiators
Pivotal is an EMC / VMware spinoff at the heart of the Open Source Cloud Foundry Project.
All powered by Cloud Foundry, Pivotal provides an Enterprise aPaaS to be run on private clouds and operates Pivotal Web Services as a public aPaaS.
The services marketplace is embryonic at this stage, Pivotal being more focused on the agile development paradigm (Pivotal Labs).
However, Pivotal recently invested heavily in Big Data components with Hadoop, an analytic database as well as an in-memory & real-time data store.

Red Hat OpenShift

o Java, Java EE (JBoss EAP)
o Ruby, PHP, Node.js
o Python
o PERL

o OpenShift marketplace
o Messaging
o Data Stores
o Monitoring
o Email
o Search

o Amazon Web Services

o Yes

Differentiators
OpenShift originated from a Red Hat open source PaaS initiative. Based on this foundation, Red Hat is now offering OpenShift Enterprise, which is a private application platform for on-premises deployment, and OpenShift Online, which is Red Hat’s operated PaaS.
The OpenShift service marketplace exposes the usual database, messaging, search and other services. There is no differentiation here when compared to Heroku, Bluemix or Pivotal.

IBM Bluemix

o Java, Java EE (Liberty WAS)
o SDK for Node.js
o Ruby on Rails, Ruby Sinatra
o All Cloud Foundry compatible build packs

o Mobile Services (Push, Quality Assurance)
o Web: Workflow, Rules, Messaging, Cache
o Databases: MySQL, DB2, Cloudant, Mongo
o Big Data: Warehouse, Hadoop
o Security, monitoring, integration
o Internet of Things

o SoftLayer

o Yes (Cloud Foundry)

Differentiators
From a development point of view, IBM Bluemix can be complemented with DevOps Services for Bluemix (formerly JazzHub) and solutions from Service Engage (Application Performance Management), delivering a full set of Application Lifecycle Management tools.
On the services side, IBM is aggressively including many of its software portfolio flagships like the DB2 database, rules engine, workflow engine, Hadoop powered by BigInsights, Analytics powered by IBM BLU Acceleration and an Internet of Things framework.

The PaaS Diagram

It is helpful to classify the PaaS offerings along 2 axes: focus on development and focus on composable services.

o Many service providers provide PaaS with a specific focus.
o CloudBees clearly focuses on development
o Salesforce force.com clearly focuses on service composition (around CRM functionalities)
o The horizontal axis represents the breadth of composable services
o The vertical axis represents the breadth of development services
o Another distinction is made between general purpose PaaS and SaaS-related PaaS (PaaS designed to complement, integrate, enrich core SaaS)

[Figure: axes “Breadth of composable services” and “Breadth of development supporting services”, with CloudBees and Force.com plotted. Legend: general purpose aPaaS; aPaaS as a SaaS add-on service.]

The PaaS Diagram

[Figure: PaaS offerings plotted by breadth of composable services and breadth of development services (run times, frameworks, lifecycle management, continuous integration). Offerings shown: SAP HANA, Salesforce Force.com, IBM Bluemix, Google App Engine, AWS Elastic Beanstalk, Heroku, MS Azure, CloudBees, Pivotal Web Services, Red Hat OpenShift, HP Cloud Application Platform, Oracle PaaS, dotCloud PaaS. Legend: general purpose aPaaS; aPaaS as a SaaS add-on service. Region labels: focus on development; focus on composable services; general purpose aPaaS leaders; business integration and personalization aPaaS; development visionaries.]

Conclusion

General

o PaaS is a balanced combination (with different weights for different competitors) of DevOps services (development, test, deployment, auto scaling, monitoring, etc.) and API-based building blocks (Technical APIs like databases or rules and Business APIs like CRM or SAP components)
o Many services are common to all the competitors (it is in third parties' interest to be part of most of the API marketplaces)
o Differentiation on services comes from the aPaaS provider’s own portfolio

Application portability

o Application portability between different PaaS is still difficult … and almost impossible if your application is built around a provider’s specific building blocks!

Agenda

Recap

Origin of Cloud Platforms
– Brief Overview of Commercial Platforms

Programming Models and Platforms

Agenda: Programming Models and Platforms

Evolving Programming Models – Overview

Extensions to traditional programming models – Middleware patterns in the cloud

Loosely coupled, relaxed consistency
– Amazon Web Services
– Microsoft Azure
– Google, NoSQL

Content centric
– Hadoop, Apache Spark

Database centric
– Pangoo
– Salesforce

Cloud Reference Architecture – Focus on PaaS

The capabilities required in a PaaS stack map to SOA

Core elements of the software stack have not changed, the delivery platform has
The stacks we will be looking at expose virtualization at different levels

Five Emerging Cloud Architectures

Virtualized Traditional - Extensions of Java Application Servers, Support for ‘Traditional’ Transactional Workloads (Cloud enabled)
– Moving existing workloads to the cloud
– Requires best practices, patterns, tooling

Database Centric - data driven + small computation on small data
– With multi-tenancy attractive for enterprise and service providers

Content Centric - computation needs to be close to data + large computation on large data
– Data Mining, Analytics, Data Warehouse

Loosely Coupled - computation and data are separate
– Can be addressed by existing middleware, but ‘relaxed consistency’ models emerging
– Cloud-centric approaches

Storage Analytics - Data and Storage Integration

Agenda

Evolving Programming Models – Overview

Extensions to traditional programming models – Middleware patterns in the cloud

Loosely coupled, relaxed consistency
– Amazon Web Services
– Microsoft Azure
– Google, NoSQL

Content centric
– Hadoop, Apache Spark

Database centric
– Pangoo
– Salesforce.com

Private Cloud Evolution – Starting Point for Cloud Middleware Patterns

1. Hardware Virtualization (Infrastructure Management)
• Virtualization of hardware resources in the data center
• Management of virtualized infrastructure

2. Image Virtualization (Image Management)
• Virtualized infrastructure leads to creation of “virtual” software images
• Proliferation of virtual software images leads to management challenges

3. Workload Virtualization (Integrated Middleware)
• Images are combined into patterns representing middleware workloads
• Workloads encapsulate well defined combinations of integrated middleware

Key Differentiators for Integrated Middleware

Awareness and optimizations for specific workloads
– Integrated stacks of middleware optimized for particular workloads

Consolidating workloads under a simplified management system
– Expose radically simplified management model optimized for specific workloads
– Pattern based deployments for most common workloads

Full lifecycle management
– Go beyond provisioning to full lifecycle (update, failure recovery, growth, problem determination)

Elastic, efficient, multi-tenant and automated management and execution of application workloads
– Integrated monitoring, metering, logging, security, caching, etc.
– Automated policies for resource consumption and balancing
– Optimized resource utilization of middleware in virtualized environments


The Simplicity of Workload-Centric Cloud

Virtualized Middleware can be deployed in different ways

[Figure: three deployment approaches: Image Management; Automated provisioning of middleware; Integrated middleware with cloud capabilities]

Patterns of Expertise: Proven best practices and expertise for complex tasks learned from decades of client and partner engagements that are captured, lab tested and optimized into a deployable form

What is a Pattern?
• The pre-defined architecture of an application
• For each component of the application (e.g. database, web server, etc.)
  • Pre-installation on an operating system
  • Pre-integration across components
  • Pre-configured & tuned
  • Pre-configured Monitoring
  • Pre-configured Security
  • Lifecycle Management
• In a deployable form, resulting in repeatable deployment with full lifecycle management
• Delivering superior results:
  • Agility: Faster time-to-value
  • Efficiency: Reduced costs and resources
  • Simplicity: Simpler skills requirements

[Figure labels: Monitoring; Lifecycle Management]

Companies typically approach the Cloud One Step at a Time

VMware View of Cloud Adoption

Adoption stages: Cloud Interested; Cloud Ready; Early Private Cloud; Mature Private Cloud; Hybrid Cloud; Public Cloud Experimentation; Public Cloud Adoption and Commitment

vCloud Virtualized: for VM hosting
• Service runs on VMware vSphere

vCloud Express: software developer-focused cloud service
• Credit card billed pay-for-use

vCloud Datacenter: enterprise IT focused cloud service
• Globally consistent, VMware certified, to meet enterprise security and performance requirements

vCloud Powered: VMware compatible cloud service
• Service runs on vSphere and vCloud Director
• Delivers increased agility, reduced costs, IT control, application portability

VMware Cloud Offerings extending Basic Virtualization

[Figure: four service layers, with integration across services through vCloud & CloudFoundry.
– Infrastructure Services: ESX, vCenter, vCloud
– Operational Services: Monitoring, HA/DR, Chargeback, Capacity
– Development Services: Spring, Java, Python, PHP, Ruby etc.
– Application Services: Cloud Centric, Cloud Enabled, Existing]

VMware Cloud Foundry PaaS

Agenda

Evolving Programming Models – Overview

Extensions to traditional programming models – Middleware patterns in the cloud

Loosely coupled, relaxed consistency
– Amazon Web Services (Amazon material, best practices from A. Trossman, IBM)
– Microsoft Azure
– Google, NoSQL

Content centric
– Hadoop, Apache Spark

Database centric
– Pangoo, Salesforce.com

Critical elements of a loosely coupled model

Application services accessed via REST/SOAP messages
• Storage services
• Data services
• Queuing/messaging services
• Execution services (virtualized hardware)

Design to minimize operational costs up front
• e.g. recognize that some part of the platform will fail (storage, DB, application) and design for this in the application
• Don’t debug: kill/freeze the execution instance

Eventual consistency for data handling and replication: sometimes the data storage service or database service will return the wrong answer

Message queue: will deliver messages at least once, possibly more than once

Asynchronous: scale is achieved by recognizing components that can operate in parallel
• Session/state information stored outside the application components

Commodity “parts” can come and go; the rest of the system does not fail
• Both for infrastructure parts and for application parts

Redundant (idempotent) execution is fine for infrastructure working AND for application semantics
• Without that, very strict guarantees on application state will be required, making the cost of execution very high
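
The combination of at-least-once delivery, externally stored state and idempotent execution can be illustrated with a small Python sketch. It is a minimal illustration under stated assumptions, not the API of any particular queue service: the Message and ExternalStore types are hypothetical, and the in-memory store stands in for a storage service that would live outside the worker.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    message_id: str  # unique id assigned by the producer (assumption)
    account: str
    amount: int


@dataclass
class ExternalStore:
    """Stand-in for a storage service outside the application component."""
    balances: dict = field(default_factory=dict)
    processed_ids: set = field(default_factory=set)


def handle(message: Message, store: ExternalStore) -> bool:
    """Apply a message at most once; return False for a redelivery."""
    if message.message_id in store.processed_ids:
        return False  # duplicate delivery: ignoring it is safe
    current = store.balances.get(message.account, 0)
    store.balances[message.account] = current + message.amount
    store.processed_ids.add(message.message_id)
    return True


store = ExternalStore()
deliveries = [
    Message("m1", "alice", 10),
    Message("m2", "alice", 5),
    Message("m1", "alice", 10),  # the queue delivers m1 a second time
]
results = [handle(m, store) for m in deliveries]
print(store.balances, results)
```

Because the worker keeps no state of its own, any worker instance can be killed and replaced, and a redelivered message does not change the result. A production version would also need to make the balance update and the bookkeeping of processed ids atomic, which this sketch leaves out.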

Page 50: 03_Cloud Computing D

Eventual Consistency (see Vogels or DeCandia et al.)

Eric Brewer’s CAP Theorem
– Of 3 properties of a shared data system (consistency, availability, tolerance to network partitioning/failure) only 2 can be achieved simultaneously

Strategies for availability all depend on data replication
– Quorum approaches with N = number of replicas, R = read quorum, W = write quorum guarantee consistency if R + W > N
– Systems focusing on fault tolerance often use N=3, W=R=2

Other requirements (e.g. high load) require large N. If few writes, often R=1

To minimize likelihood of lost writes, choose W>1

Very large distributed systems have to live with network partitioning

If read and write set don’t overlap, we cannot achieve strong consistency, but this is often combined with a ‘lazy’ update approach to eventually update all nodes
– Good example: Shopping cart
– Amazon shopping cart prioritizes availability for write

Other considerations: Failure detection
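The quorum rule R + W > N can be checked by brute force on small clusters. The sketch below is only an illustration; the function names are made up for this example. It compares the arithmetic rule with an exhaustive test that every possible read set shares at least one replica with every possible write set.

```python
from itertools import combinations


def quorum_guarantees_overlap(n: int, r: int, w: int) -> bool:
    """The rule from the slide: reads see the latest write if R + W > N."""
    return r + w > n


def every_read_sees_latest_write(n: int, r: int, w: int) -> bool:
    """Exhaustively check that any R replicas intersect any W replicas."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


for n, r, w in [(3, 2, 2), (3, 1, 1), (3, 1, 3)]:
    print(n, r, w, quorum_guarantees_overlap(n, r, w),
          every_read_sees_latest_write(n, r, w))
```

With N=3 and W=R=2 the sets always overlap; with R=W=1 they may not, which is the case where a system falls back on lazy propagation and eventual consistency.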

Page 51: 03_Cloud Computing D

The ‘new ACID’ (Gregor Hohpe, Google 2009)

Old ACID – predictive and accurate
– Atomic
– Consistent
– Isolated
– Durable

New ACID – flexible and redundant
– Associative (grouping)
– Commutative (order)
– Idempotent (repetition)
– Distributed
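A minimal sketch of the first three "new ACID" properties, assuming replica state is modelled as a set of items and merged with set union (an assumption for illustration, not something the slide prescribes). Union is associative, commutative and idempotent, so updates can be grouped, reordered or repeated without changing the result.

```python
from functools import reduce
from itertools import permutations


def merge(a: frozenset, b: frozenset) -> frozenset:
    """Merge two replica states; set union is the assumed merge operation."""
    return a | b


def merge_all(states) -> frozenset:
    return reduce(merge, states, frozenset())


updates = [frozenset({"book"}), frozenset({"pen"}), frozenset({"book", "lamp"})]
expected = frozenset({"book", "pen", "lamp"})

# Commutative: any delivery order gives the same state.
order_independent = all(merge_all(p) == expected for p in permutations(updates))
# Idempotent: delivering every update twice changes nothing.
repeat_safe = merge_all(updates + updates) == expected
# Associative: grouping of merges does not matter.
grouping_independent = (
    merge(merge(updates[0], updates[1]), updates[2])
    == merge(updates[0], merge(updates[1], updates[2]))
)
print(order_independent, repeat_safe, grouping_independent)  # True True True
```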

Page 52: 03_Cloud Computing D

The Twelve Factors for aaS Applications (12factor.net, Adam Wiggins)

“The twelve-factor app is a methodology for building software-as-a-service apps that:
• Use declarative formats for setup automation, to minimize time and cost for new developers joining the project;
• Have a clean contract with the underlying operating system, offering maximum portability between execution environments;
• Are suitable for deployment on modern cloud platforms, obviating the need for servers and systems administration;
• Minimize divergence between development and production, enabling continuous deployment for maximum agility;
• And can scale up without significant changes to tooling, architecture, or development practices.

The twelve-factor methodology can be applied to apps written in any programming language, and which use any combination of backing services (database, queue, memory cache, etc).”

(Quote from web site)

Page 53: 03_Cloud Computing D


The Twelve Factors for aaS Applications

I. Codebase

One codebase tracked in revision control, many deploys

II. Dependencies

Explicitly declare and isolate dependencies

III. Config

Store config in the environment

IV. Backing Services

Treat backing services as attached resources

V. Build, release, run

Strictly separate build and run stages

VI. Processes

Execute the app as one or more stateless processes

VII. Port binding

Export services via port binding

Page 54: 03_Cloud Computing D

The Twelve Factors for aaS Applications (continued)

VIII. Concurrency

Scale out via the process model

IX. Disposability

Maximize robustness with fast startup and graceful shutdown

X. Dev/prod parity

Keep development, staging, and production as similar as possible

XI. Logs

Treat logs as event streams

XII. Admin processes

Run admin/management tasks as one-off processes
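Factor III (store config in the environment) is the easiest of these to show in code. The following is a sketch under assumptions: the variable names `DATABASE_URL` and `PORT` and the `Settings` class are illustrative choices, not part of the twelve-factor text. The same build can then run unchanged in development, staging and production, which also supports factor X.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str  # a backing service, attached via configuration (factor IV)
    port: int          # the port the service binds to (factor VII)


def load_settings(env=None) -> Settings:
    """Read all deploy-specific values from the environment, never from code."""
    env = os.environ if env is None else env
    return Settings(
        database_url=env["DATABASE_URL"],   # required: fail fast if missing
        port=int(env.get("PORT", "8080")),  # optional, with an assumed default
    )


# A dict stands in for the process environment so the example is repeatable.
settings = load_settings({"DATABASE_URL": "postgres://localhost/dev", "PORT": "5000"})
print(settings)
```

Passing the mapping explicitly keeps the function testable; in a deployed process `load_settings()` would read `os.environ`.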

Page 55: 03_Cloud Computing D

Microservices

The term "Microservice Architecture" has sprung up over the last few years to describe a particular way of designing software applications as suites of independently deployable services. While there is no precise definition of this architectural style, there are certain common characteristics around organization around business capability, automated deployment, intelligence in the endpoints, and decentralized control of languages and data.

From: http://martinfowler.com/articles/microservices.html

Page 56: 03_Cloud Computing D

Moving from monolithic applications to micro-services

[Figure: a monolithic app and a set of micro-services, each shown with the way it scales]

Page 57: 03_Cloud Computing D

Properties of a micro-service architecture

• Compartmentalized business capability
• Cross-functional teams
• Communication via API ONLY!!
• Use messaging to remove peer-to-peer dependencies
• REST communication
• Decentralized data
• Design for failure
• Pluggable architecture
• Enables continuous delivery

Page 58: 03_Cloud Computing D

Micro-services do have a cost

Simple services but complex distributed systems

IT overhead
– Configuration management
– HA/DR for each service
– Capacity
– High degree of automation

API management is a must

The asynchronous nature of communication is difficult

DevOps skills are a must

Page 59: 03_Cloud Computing D

Good reads!!

• Automate deployments using production-like environments and accelerate delivery cycles
• A view into the cultural challenges of adopting DevOps and best practices
• Patterns for building resilient and robust applications

Page 60: 03_Cloud Computing D

Core Concepts

Cloud drives changes to business models – economies of sharing and consumption-based pricing. Being fast is more important than getting it completely right

Hybrid clouds – systems of engagement and systems of record

API Economy

Components fail, deal with it – focus on recoverability

CAP Theorem – can’t have consistency, availability and (network) partition tolerance all at once

Relaxed (eventual) consistency, actual implementations driven by data replication strategies

Microservices

Containers

Patterns and Orchestration

DevOps – software delivery lifecycle as an accelerated feedback loop

Page 61: 03_Cloud Computing D

AWS

• History and Evolution
• Main Elements
• Best Practices

Page 62: 03_Cloud Computing D

Since going public in 1997, Amazon has launched several new businesses to grow annual revenues from $148M to $61B

May 1997 IPO
– Split-adjusted stock price: $2
– Market cap: $438M
– 1997 revenue: $150M, $1B in TTM revenue by EOY 1998

June 2013
– Stock price: $277
– Market cap: $126B
– 2012 revenue: $61B
– Operating margin: 1.04%
– Operating FCF: $4.25B
– Employees: 88,400

[Chart: Amazon.com revenue ($B) and stock price ($) over time, annotated with the milestones below]

1997-2004: New retail categories - Apparel, Jewelry, Wedding Registry, Health & Beauty, Home Décor, Sports and Outdoors, Office Supplies, Electronics, Mobile
2005: Amazon Prime
2006: AWS launch; Grocery; Webstore & Fulfillment; Unbox Video Download
2007: AWS EC2 & S3 for Europe; Kindle; Amazon MP3; Direct to Kindle Publishing
2008: CloudFront; AWS Elastic Block Store; Audible acquisition
2009: AWS enters Asia; Kindle 2 + Kindle iPhone app; Zappos.com acquisition
2010: Amazon Studios; Living Social investment; Kindle for Blackberry, Mac, iPad, Android
2011: CloudDrive; AWS CloudFormation; App store for Android; Prime Instant Videos; more Kindle than print books; Virtual Private Cloud
2012: AWS re:Invent; AWS Marketplace; AWS Glacier; AWS Redshift; 6 original series pilots

Source: Amazon 10Qs

Page 63: 03_Cloud Computing D

Amazon started by monetizing the under-utilized Amazon.com infrastructure to lay the foundation of AWS

• The nature of Amazon.com’s business required them to build capacity sufficient to handle peak holiday shopping + 15% headroom
• This resulted in over 76% excess capacity on an annualized basis
• Amazon saw an opportunity in this excess capacity and began leasing simple compute and storage services
• AWS has now grown to ~$2B in revenue with operating margin between 7% and 14%

[Chart: Typical annualized infrastructure utilization at Amazon.com, showing annualized utilization, annualized idle infrastructure and the 15% headroom]

Source: 2013 AWS Summit keynote speech, Andy Jassy – SVP AWS

Page 64: 03_Cloud Computing D

Since its launch in 2006, Amazon Web Services has grown geographically, expanded offerings, and attracted major clients

2006
– Amazon launched AWS
– Launched S3 and EC2
– IMDb added as a client

2007
– SimpleDB launched
– Expanded into Europe

2008
– Collaborated with Sun Microsystems for open source enterprise offering
– Launched Elastic Block Storage
– Amazon.com began using AWS to monitor website performance

2009
– Amazon.com began website migration to EC2
– Flexible Payments Service, Virtual Private Cloud, and Relational Database launched
– Zynga added as major client

2010
– Amazon.com completed full migration to EC2
– Added data center in Asia to reduce local latency
– Netflix added as a client

2011
– Suffered major 4-day outage, disrupting many customers
– EC2, S3, VPC obtained FISMA accreditation
– Elastic Beanstalk (PaaS) launched
– Added data center in South America

2012-13
– Launched AWS Marketplace
– AWS achieves FedRAMP certification
– Christmas Eve outages affect Netflix and other large clients

Acronyms
– EC2: Elastic Compute Cloud
– S3: Simple Storage Service
– VPC: Virtual Private Cloud

Sources: Company Materials (website, AR), Morgan Keegan research, UBS research, Cowen & Company research, Bain analysis

Page 65: 03_Cloud Computing D

AWS Services – LOOSELY COUPLED STYLE

S3 – Storage

What is it?
• Uses standards-based REST and SOAP interfaces designed to work with any Internet-development toolkit.
• S3 is built on a distributed architecture; data is stored redundantly
• Each object is stored in a bucket & retrieved via a unique, developer-assigned key.
• A bucket can be located in the United States or in Europe. All objects within the bucket will be stored in the bucket’s location, but the objects can be accessed from anywhere.

What’s different about it
• S3 will fail on reads/writes as a component, but the system remains reliable.
  – Apps are expected to be designed “loosely coupled” to take this into account
• Not a filesystem. Objects are not files
• Not for transaction processing
• Data redundancy takes minutes: you cannot be assured that an object you created/updated in S3 will be immediately available to other S3 applications
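The "not immediately available" behaviour described on this slide is usually handled by retrying reads with a growing delay. The sketch below does not call S3: `EventuallyConsistentStore` is a hypothetical in-memory stand-in whose writes only become visible after a few reads, and `get_with_retry` is an illustrative helper, not an AWS API.

```python
import time


class EventuallyConsistentStore:
    """In-memory stand-in: a write becomes visible only after `lag_reads` reads."""

    def __init__(self, lag_reads: int):
        self._lag = lag_reads
        self._pending = {}  # key -> [value, reads remaining before visible]
        self._visible = {}

    def put(self, key, value):
        self._pending[key] = [value, self._lag]

    def get(self, key):
        if key in self._pending:
            entry = self._pending[key]
            if entry[1] == 0:
                self._visible[key] = entry[0]
                del self._pending[key]
            else:
                entry[1] -= 1
        return self._visible.get(key)


def get_with_retry(store, key, attempts=5, base_delay=0.0):
    """Retry a read with exponential backoff; return (value, attempts used)."""
    for attempt in range(attempts):
        value = store.get(key)
        if value is not None:
            return value, attempt + 1
        time.sleep(base_delay * (2 ** attempt))
    raise LookupError(f"{key!r} not visible after {attempts} attempts")


store = EventuallyConsistentStore(lag_reads=2)
store.put("video-42", "payload")
value, tries = get_with_retry(store, "video-42")
print(value, tries)  # payload 3
```

The caller tolerates a stale "not found" instead of treating it as an error, which is what "designed loosely coupled" means for a reader of such a store.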

Page 66: 03_Cloud Computing D

AWS Services – LOOSELY COUPLED STYLE

EC2 - virtual computing environment

What is it?
• Provides “instances”: virtual machines/hardware that run in EC2; based on XenSource
• Images can be shared, or rented out to others (Paid AMI through DevPay)

What’s different about it
• Application instances & data are coupled: EC2 does not automatically save data outside its environment
• Instance rebooted: transient data is not lost. Instance shuts down or fails: data is lost
• Can recycle images to avoid runtime bugs/problems such as memory leaks, race conditions, etc., and freeze images for off-line debugging.
• From the beginning, a developer needs to factor long-term persistence into the application design for when apps fail for whatever reason (S3 down, network connection down, etc.)
• Automated management of EC2 images is in an early phase. Most applications have rolled their own

Page 67: 03_Cloud Computing D

AWS Services

SQS - Simple Queue Service

What is it?
• Access to SQS through SOAP services
• Highly scalable, distributed, hosted queue to reduce/eliminate app-to-app dependencies
• All messages are stored redundantly across multiple servers and data centers
• Developers can create an unlimited number of Amazon SQS queues, each of which can send & receive an unlimited number of messages.
• Message body can contain up to 8 KB of text in any format.
• A message is “locked” while a computer is processing it, keeping other computers from trying to process it simultaneously. If processing fails, the lock will expire and the message will again be available.

What’s different about it?
• It’s more than a simple queue: applications interact by telling SQS the estimated processing time = workflow
• Message may not be delivered immediately
• Load balancing model is asynchronous: lots of instances could be taking work off the queue, in different data centers
• Asynchronous: state/session information is stored in SQS where possible
• Messages will end up being delivered more than once in some cases; the application has to deal with it.
• Workload (the number of messages on the queue for an application) is estimated mathematically from sampled queues
• Pricing still a drawback to broader adoption
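The message "lock" that expires if processing fails can be modelled in a few lines. This is a simplified in-memory sketch, not the SQS API: the `VisibilityQueue` class and its methods are hypothetical, and the clock is passed in explicitly so the behaviour is deterministic.

```python
class VisibilityQueue:
    """Toy queue: a received message is hidden until its lock times out."""

    def __init__(self, visibility_timeout: float):
        self._timeout = visibility_timeout
        self._messages = {}  # message id -> (body, time at which it is visible again)
        self._next_id = 0

    def send(self, body) -> int:
        msg_id = self._next_id
        self._messages[msg_id] = (body, 0.0)
        self._next_id += 1
        return msg_id

    def receive(self, now: float):
        """Return (id, body) of the first visible message and lock it, else None."""
        for msg_id, (body, visible_at) in self._messages.items():
            if now >= visible_at:
                self._messages[msg_id] = (body, now + self._timeout)
                return msg_id, body
        return None

    def delete(self, msg_id: int) -> None:
        """Called by a worker only after it has finished processing."""
        self._messages.pop(msg_id, None)


queue = VisibilityQueue(visibility_timeout=30)
queue.send("transcode video-42")

first = queue.receive(now=0)     # worker A takes the message and then crashes
second = queue.receive(now=10)   # still locked: worker B sees nothing
third = queue.receive(now=31)    # lock expired: the message is redelivered
queue.delete(third[0])           # worker B finishes and deletes it
fourth = queue.receive(now=100)  # nothing left
print(first, second, third, fourth)
```

The redelivery at `now=31` is the source of the "delivered more than once" behaviour: if worker A had only been slow rather than dead, both workers would have processed the same message.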

Page 68: 03_Cloud Computing D

AWS’ source of differentiation is “good enough” technology delivered at the lowest prices owing to scale

Panels: AWS achieving scale across three dimensions; AWS focus on customer value; AWS scale and innovation

• Transfer value to customers through price reductions
• Drove greater innovation through ecosystem and scale
• Be proactive through infrastructure audits to increase customer satisfaction and value
• Basic compute price is 40% lower than next cheapest competitor*

[Diagram: cycle of Reduce Prices → More Customers → More Usage → More Infrastructure → Economies of Scale → Lower Costs, annotated “31 AWS price reductions since 2006”]

Source: 2013 AWS Summit keynote speech, Andy Jassy – SVP AWS; BCG Server Count Estimate Model
*Price per ~2GB RAM Linux on-demand instance hour

Page 69: 03_Cloud Computing D

The AWS technology stack has expanded from the original EC2 Compute and S3 Storage offerings

Amazon terminology, arranged along the value chain from IaaS to PaaS:

Applications
– CloudSearch
– SES (Simple Email Service)
– SNS (Simple Notification Svc)
– SQS (Simple Queue Service)
– SWF (Simple Workflow)
– Elastic Transcoder

Deployment & Management
– Elastic Beanstalk
– CloudWatch
– Data Pipeline
– CloudFormation
– IAM (Identity & Access Mgmt.)
– OpsWorks

Database
– DynamoDB
– RDS (Relational DB Service)
– ElastiCache
– SimpleDB
– Redshift

Storage & Content Delivery
– S3 (Simple Storage Service)
– EBS (Elastic Block Storage)
– Glacier
– CloudFront

Compute & Networking
– EC2 (Elastic Compute Cloud)
– Elastic MapReduce
– Route 53 (DNS Service)
– Direct Connect
– VPC (Virtual Private Cloud)

Callouts
• EC2 (Compute) + S3 (Storage) are the original and foundational offerings of AWS
• Redshift: petabyte scale data warehouse service
• CloudFront: content management and delivery
• Glacier: archive and backup
• VPC: a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you define
• Applications: “Apps” in Amazon parlance, more accurately PaaS/Middleware. Either way, moving up the value chain.

Source: Amazon Website

Page 70: 03_Cloud Computing D

AWS is moving up the IT value chain over time as they introduce higher value services in PaaS and SaaS

[Chart: AWS Services Moving to Higher Value Over Time; x-axis: 2006 to 2013; y-axis: Increasing Value]

Source: AWS Company Website

Page 71: 03_Cloud Computing D

Amazon Elastic Beanstalk

[Diagram: an AWS Elastic Beanstalk app reachable at http://myapp-staging.elasticbeanstalk.com. Labels shown: Running Application, Environment, Version, Elastic Load Balancer, Autoscaling, EC2 Instances, Amazon Linux AMI, Apache, Tomcat, Elastic Beanstalk Host Manager, S3]

Page 72: 03_Cloud Computing D

Amazon re:Invent 2015 New Offering Announcements

Announcement | Value | Category | Available | IBM Equivalent | Impact

• AWS QuickSight (Category: Analytics; Available: Preview)
  – Value: Business Intelligence service with visualization support
  – IBM Equivalent: Cognos
  – Impact: Potential major threat to IBM’s new installs of BI

• Kinesis Firehose (Category: Analytics; Available: GA)
  – Value: Loads streaming data from Kinesis to S3 or Redshift data stores
  – IBM Equivalent: Stream Analytics, InfoSphere Streams
  – Impact: AWS making it easy to transform transient data and have it persist in their cloud

• Kinesis Streams (Category: Analytics; Available: GA)
  – Value: New feature extends temporary storage from 24 hours to 7 days
  – IBM Equivalent: InfoSphere Streams

• Kinesis Analytics (Category: Analytics; Available: Coming soon)
  – Value: A way to run standard SQL queries against streaming data

• Snowball (Category: DB / Storage; Available: GA in selected regions)
  – Value: 50 TB hardened enclosure used to ship customer data to AWS
  – IBM Equivalent: No equivalent in SoftLayer / Bluemix
  – Impact: Unique offering from AWS removes barrier to their cloud

• MariaDB Support (Category: DB; Available: GA)
  – Value: Open source, MySQL compatible database
  – IBM Equivalent: None
  – Impact: Advantage?

• Database Migration Services (Category: DB; Available: Preview)
  – Value: Migrates customer databases to AWS
  – IBM Equivalent: IBM Database Conversion WorkBench / IBM InfoSphere CDC
  – Impact: AWS lowering the barrier to adoption of their public cloud

• Schema Conversion Tool (Category: DB; Available: Preview)
  – Value: Converts proprietary database schemas, stored procedures, views to AWS
  – IBM Equivalent: IBM InfoSphere Change Data Capture (formerly DataMirror)
  – Impact: AWS lowering the barrier to adoption of their public cloud

• Amazon Inspector (Category: Compliance; Available: Preview)
  – Value: Automated security assessment in AWS cloud
  – IBM Equivalent: Security Compliance Service
  – Impact: AWS continuing to expand their gov’t presence

• Accenture AWS Business Group (Category: Partnership; Available: announcement of partnership / no specific details)
  – IBM Equivalent: GTS, GBS
  – Impact: Instant AWS consulting unit

Page 73: 03_Cloud Computing D

Announcement | Value | Category | Available | IBM Equivalent | Impact

• AWS WAF (Category: Security & Identity; Available: GA)
  – Value: Web Application Firewall
  – IBM Equivalent: NetScaler VPX Application Delivery Controller on SL

• EC2 Dedicated Host (Category: Compute; Available: Coming soon)
  – Value: Visibility and control over how instances are placed on physical server
  – IBM Equivalent: Bare Metal
  – Impact: IBM seems to still lead in this area, however bare metal is “old-world”

• Config Rules (Category: Compliance; Available: Preview)
  – Value: A set of cloud governance capabilities that allow IT Administrators to define best practices for provisioning and configuring AWS resources and then continuously monitor compliance with those guidelines.

• CloudWatch Dashboards (Category: Management Tools; Available: GA)
  – Value: Console enables you to create re-usable graphs of AWS resources and custom metrics so you can quickly monitor operational status and identify issues at a glance.

• ElasticSearch Service (Category: Analytics; Available: GA)
  – Value: Managed service for deploying, operating ElasticSearch on AWS
  – IBM Equivalent: ElasticSearch

Page 74: 03_Cloud Computing D

Announcement | Value | Category | Available | IBM Equivalent | Impact

• Architecture Whitepaper (Category: Training / Education; Available: Now)
  – Value: All experiences from talking with customers and identifying best practices

• EC2 VMs – X1 (Category: Compute; Available: 1H2016)
  – Value: Intel Xeon E7 V3 - 2TB data
  – IBM Equivalent: None
  – Impact: Reduces performance advantage of IBM Bare Metal

• EC2 VMs – t2.nano (Category: Compute; Available: Later this year)
  – Value: 512 MB. Small, easy, quick
  – Impact: AWS lowering costs even further for developers with this tiny instance

• Amazon EC2 Container Registry (Category: Compute; Available: GA)
  – Value: A secure, fully-managed Docker container registry. Manages Docker container images, making it easier to store and deploy them
  – IBM Equivalent: Docker Registry

• AWS Lambda Enhancements (Category: Compute; Available: GA)
  – Value: Access to services running in a Virtual Private Cloud; functions written in Python; long running functions; scheduled functions; custom retry logic
  – IBM Equivalent: None
  – Impact: Strengthened advantage in server-less services

• AWS Mobile Hub (Category: Mobile; Available: Beta now)
  – Value: Offers a quick and easy way to create mobile apps that use certain AWS services
  – IBM Equivalent: Bluemix DevOps, MobileFirst Platform Foundation
  – Impact: AWS catching up with Bluemix Mobile services

• AWS IoT (Category: IoT; Available: Beta now)
  – Value: End to end IoT services
  – IBM Equivalent: IoT Foundation
  – Impact: AWS catching up with IoT at the developer level by offering free SDKs and partnerships with device manufacturers.

Page 75: 03_Cloud Computing D

Best Practices (Andrew Trossman)

Image management
– Launch parameters
– S3, CVS, SVN
– Image Style Management

Release upgrades

Cluster everything (redundancy)

Dynamically respond
– Faults
– Demand

Processing Pipeline of Loosely Coupled Services

Conclusions

Page 76: 03_Cloud Computing D

Image Management

Change makes 100% images impractical

Boot scripts combined with a homogeneous environment work

Image + Launch Parameters ~= Image
– Extremely repeatable and reliable
– Less storage
– Tolerates change better

Example template
– Builds server from script
– Pulls content/code from repository
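The "image + launch parameters" idea can be sketched as a boot-script template that is filled in at launch time. Everything specific here is an assumption made for illustration: the parameter names (`role`, `repo_url`, `revision`), the `/opt/app` path and the `bin/start` command are hypothetical, and SVN is used only because the outline slide lists it as one possible repository.

```python
from string import Template

# Hypothetical boot script baked into a generic image. Only the launch
# parameters differ between servers, so one image serves many roles.
BOOT_SCRIPT = Template("""#!/bin/sh
set -e
# Pull content/code from the repository at a pinned revision.
svn export -r $revision $repo_url /opt/app
# Start the role this instance was launched for (hypothetical command).
/opt/app/bin/start --role $role
""")


def render_boot_script(launch_params: dict) -> str:
    """Fill the template; a missing parameter raises KeyError instead of
    producing a half-configured server."""
    return BOOT_SCRIPT.substitute(launch_params)


script = render_boot_script({
    "role": "crawler",
    "repo_url": "https://svn.example.com/app/trunk",
    "revision": "1234",
})
print(script)
```

Because the script is regenerated from parameters on every boot, a code change means bumping `revision` rather than rebuilding and storing a new full image.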

Page 77: 03_Cloud Computing D

Image Style Management

Avoid Heisenbugs – cycle VMs regularly

Simple patches update “image”
– Automatically rolled out via regular cycling

Never “fix” by hand

Always “replace” the image

Page 78: 03_Cloud Computing D

Release Upgrades

Completely rebuild parallel environment
– Test
– Cut over data
– Change DNS
– Decommission old when confident

Cheaper to “replace” than “fix”

Traditional “fix” process with staging etc.
– IBM identified 2/3 of human effort dedicated to this process

Page 79: 03_Cloud Computing D

Cluster Everything

• Everything fails – applications must accommodate
• Transparent redundancy
• Seamless failover
• Monitoring & events

Page 80: 03_Cloud Computing D

Scalr: Dynamic Response to Demand & Availability

Page 81: 03_Cloud Computing D

Always Respond By Cloning

Resist the urge to “fix” in place

Most bugs are application bugs

Traditional QA is good at removing all but the Heisenbugs

Clone instance brings a “fresh” server to replace the faulty one.
– This gets past Heisenbugs
– Enables “off-line” problem determination

“Roll Forward” in the cloud

Page 82: 03_Cloud Computing D


Scalr Process Flow

Page 83: 03_Cloud Computing D


Page 84: 03_Cloud Computing D

Pipeline Loosely Coupled Services

A Simplified Example: Video Transcoding Web Site

Client assumed to be: Web Application Layer

1. (S3) End users submit videos to be transcoded to the website
2. (SQS) Request message is placed in the Amazon SQS incoming queue with a pointer to the video and to the target video format in the message
3. (EC2) The transcoding engine, running on a set of Amazon EC2 instances, reads the request message from the incoming queue
4. The engine retrieves, transcodes, and returns the video to S3
5a. (SimpleDB) Metadata about the video (e.g., format, date created and length) can be indexed into Amazon SimpleDB for easy query
5b. Response message is placed in the outgoing queue and sent to user with a pointer to the converted video

Sources: Amazon.com, MI Analysis
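The five steps of the transcoding pipeline can be traced in a small simulation. No AWS service is called: plain dictionaries and deques stand in for S3, SimpleDB and the two SQS queues, and `transcode` is a placeholder. All names are assumptions chosen for this sketch.

```python
from collections import deque

storage = {}         # stands in for S3
metadata_index = {}  # stands in for SimpleDB
incoming = deque()   # stands in for the SQS incoming queue
outgoing = deque()   # stands in for the SQS outgoing queue


def submit(video_id: str, data: str, target_format: str) -> None:
    storage[video_id] = data                                       # step 1
    incoming.append({"video": video_id, "format": target_format})  # step 2


def transcode(data: str, target_format: str) -> str:
    """Placeholder for the real transcoding work."""
    return f"{target_format}:{data}"


def worker_step() -> bool:
    """One iteration of a transcoding worker; returns False when idle."""
    if not incoming:
        return False
    request = incoming.popleft()                                   # step 3
    source = storage[request["video"]]
    out_key = f'{request["video"]}.{request["format"]}'
    storage[out_key] = transcode(source, request["format"])        # step 4
    metadata_index[out_key] = {                                    # step 5a
        "format": request["format"],
        "length": len(source),
    }
    outgoing.append({"video": out_key})                            # step 5b
    return True


submit("clip1", "rawbytes", "mp4")
worked = worker_step()
print(worked, outgoing[0], storage["clip1.mp4"])
```

The messages carry only pointers (keys) to the video, not the video itself, so the queues stay small and any number of workers can run `worker_step` in parallel against the shared storage.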

Page 85: 03_Cloud Computing D

Service Oriented Platform of Amazon’s Architecture

http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf

Page 86: 03_Cloud Computing D


Examples

Page 87: 03_Cloud Computing D

Frontend servers (x 3)
– Medium instance (IO/Memory)
– App & Cache servers

MySQL servers (x 6)
– Medium instance (IO/Memory)
– MySQL 5.1 w/ replication
– Backup to S3 every 4 hours

Index servers (x 2)
– X-Large (CPU/IO)
– EBS volumes for IO throughput
– EBS snapshots for backup

Infrastructure servers (x 3)
– Dist. Logger (Medium – IO)
– Analytics Server (Medium – IO)
– Messaging Server (Small)

Crawlers (x ~70)
– Small instance (Network IO)
– Automated build & boot

Staging (x 3)
– Medium / Small instances
– Scratch space for internal use

Page 88: 03_Cloud Computing D


Soocial

Page 89: 03_Cloud Computing D


Page 90: 03_Cloud Computing D

Observations from 6 startups on AWS (12 – 100s of AMIs)

• Everyone deployed monitoring
• All but one used open source monitoring (the other used home grown)
• NONE have humans watching/waiting
• All use image & boot script for repeatable deployments
• All have scripted fault prevention / resolution
• All Throw Away, rather than Fix
• All redeploy entire production for release upgrades

Page 91: 03_Cloud Computing D

Scaling a Single Application

[Figure: stages of scaling a single application: Single System → Tiered System → Clustered Middleware, Tiered System (with Partitioned DB) → Loosely Coupled Services → Dynamic Massively Parallel Application. Each stage is annotated with Vertical Scaling and/or Horizontal Scaling. Annotations along the progression: “Significant Development Required” and “Development Discontinuity (new application architecture)”]

Conclusions

Divide Complex Monolith
– Several simpler problems

IaaS simplifies self-managed apps

Cost of IaaS + Apps < Monolithic App

PaaS _is_ an Application

Storage _is_ an Application

General principle
– We have lots of small problems (apps)
– We have one big problem (IaaS)

Page 93: 03_Cloud Computing D


Microsoft Windows Azure

Page 94: 03_Cloud Computing D


Microsoft’s Cloud OS: Focus on Hybrid

Page 95: 03_Cloud Computing D

How Microsoft presents itself: 2013-July SEC filing

“Unique to Microsoft, we continue to design and deliver cloud solutions that allow our customers to use both the cloud and their on-premise assets however best suits their own needs. For example, a company can choose to deploy Office or Microsoft Dynamics on premise, as a cloud service, or a combination of both. With Windows Server 2012, Windows Azure, and System Center infrastructure, businesses can deploy applications in their own datacenter, a partner’s datacenter, or in Microsoft’s datacenter with common security, management, and administration across all environments, with the flexibility and scale they desire. These hybrid capabilities allow customers to fully harness the power of the cloud so they can achieve greater levels of efficiency and tap new areas of growth.”

Page 96: 03_Cloud Computing D

Ancient history

Initial virtualization platform dates back to 1997

Hyper-V
– Completely new platform
– Released in 2008
– Designed to “leap-frog” VMware's platform
– Latest version has significant enhancements
– Built into Windows 8 desktops also

Bing
– Started as MSN Search back in late ’90s

Windows Live
– Originated as MSN services which date back to 1995

Hotmail
– One of the Web-browser based email pioneers
– Started in 1996, acquired by Microsoft in 1997

Xbox Live
– Started in 2002

MSNBC
– Founded in 1996

Page 97: 03_Cloud Computing D

Early Windows Azure Platform history

October 2008
• Announced the Windows Azure Platform
• First CTP of the Windows Azure Platform

March 2009
• Announced SQL Azure Relational DB
• Updated Windows Azure CTP
• Enabled Full Trust & PHP, Java, etc. applications

November 2009
• Announced VM Role, Project Sydney, and Windows Azure Platform pricing and SLAs
• Project “Dallas” CTP

February 2010
• Windows Azure Platform generally available

June 2010
• Windows Azure Update
  – .NET Framework 4
  – OS Versioning
  – CDN
• SQL Azure Update
  – 50GB databases
  – Spatial data support
  – DAC support

Page 98: 03_Cloud Computing D

Recent Changes and tweaks

Windows Azure initially focused exclusively on PaaS
– “Scared” their developer base (radical change)
– Too far ahead of its time?
– The market was much more comfortable with Amazon’s IaaS focus

Added a strange stateless “VM role” to Azure as a stop-gap
– Is now deprecated

Major shift in 2012:
– Added full IaaS role support to Azure
– Shifted definition of “Azure” to mean “Microsoft’s Public Cloud”
– PaaS platform naming shifted to “Azure Cloud Services”

Windows Azure Appliance
– Microsoft’s first attempt at a “cloud in a box”
– OEM-specific product that included thousands of servers
– More of a “public cloud in a box”
– HP, Dell and Fujitsu
– Fujitsu was only vendor to announce a product (which now seems dead)

Windows Azure Pack (stay tuned…)
– Azure Pack + Windows Server 2012 R2 = Azure Appliance

Page 99: 03_Cloud Computing D

[Diagram: Windows Azure (IaaS + PaaS). Development: .NET, Visual Studio, TFS + Git | Java, NodeJS, PHP, Python, Ruby, C++. Data: SQL Databases, NoSQL Tables, Blob Storage, HDInsight]

Page 100: 03_Cloud Computing D

Microsoft Cloud OS Vision

1 Consistent Platform across Private Cloud, Service Providers and Public Cloud

[Diagram: Windows Azure Services and Azure Virtual Machines spanning DEVELOPMENT, MANAGEMENT, IDENTITY, VIRTUALIZATION and DATA]

Page 101: 03_Cloud Computing D


Consistent Experience with Common Tools

Page 102: 03_Cloud Computing D

Windows Azure™ PaaS

Similar design points as AWS...
• Application services accessed via REST/SOAP messages
• SQL Services for data & storage
• Azure OS has messaging service
• Azure OS platform for app deployment
• Data & storage: eventual consistency
• Queued messages may be delivered more than once

...with key differences
• Applications deployed, not images
  – VMs baked into OS
• Application provides declarative description for scalability, reliability & availability of application components
  – e.g. developer or service owner specifies how pieces are to be distributed under what circumstances
• System automatically replicates code & data
  – Queuing/messaging services
• SQL Database (fka SQL Azure) services

Like Amazon, expecting it to be priced (high) based on operation costs.

[Diagram: Windows Azure (IaaS + PaaS). Development: .NET, Visual Studio, TFS + Git | Java, NodeJS, PHP, Python, Ruby, C++. Data: SQL Databases, NoSQL Tables, Blob Storage, HDInsight]

Page 103: 03_Cloud Computing D

Microsoft Platform as a Service

Windows Azure (compute & simple/scalable storage)

SQL Database (fka SQL Azure)
– SQL Server as a Service

AppFabric (Cloud-based services)
– Access Control Service (Azure Active Directory)
– Enterprise Service Bus
– Distributed Object Caching

Traffic Manager
– Global traffic management/routing (performance)

Azure Connect (“VPN” between cloud and on-premise services)

Azure Portals
– Web-based Service Lifecycle Management tools
– SQL Database management
– RESTful APIs also available (non-Browser-based tools)

Azure Media Services

Azure Content Delivery Network (CDN)

Page 104: 03_Cloud Computing D

Windows Azure Storage

Cloud Storage - Anywhere and anytime access
– Blobs, Disks, Tables and Queues

Highly Durable, Available and Massively Scalable
– Easily build “internet scale” applications
– 8.5 trillion stored objects
– 900K requests/sec on average (2.3+ trillion per month)

Pay for what you use

Exposed via easy and open REST APIs

Client libraries in .NET, Java, Node.js, Python, PHP, Ruby

Page 105: 03_Cloud Computing D


Abstractions – Tables and Queues

Page 106: 03_Cloud Computing D


Abstractions – Blobs and Disks

Page 107: 03_Cloud Computing D

Azure support for “Open” and “Interoperable” tools and platforms

Windows Azure Tools for Eclipse/Java
– One Click cloud deployment
– Supports Windows Azure Storage & SQL Azure
– Support for Windows Azure Platform SDKs & Drivers
– AppFabric SDK supports Service Bus & Access Control from Java
– Provided by 3rd party

Windows Azure SDK for PHP
– Supports Windows Azure Storage & Service Management infrastructure
– PHP apps can be deployed in an Azure Web Role
– Simple Cloud API

Windows Azure Companion
– Simplifies installing and configuring open source components and apps on Azure
– Examples: Drupal and PHP apps

Embraces Open Source with NuGet, Node.js, Git, GitHub, Dropbox, …

Page 108: 03_Cloud Computing D

Recent additions/enhancements to Windows Azure

Microsoft has been making consistent, regular enhancements to Azure, especially over the past couple of years

August 2013
– SQL Server AlwaysOn (HA/DR features for hybrid infrastructure)
– Notification Hubs (broadcast push notifications for Win8/RT, Windows Phone, iOS and Android devices)
  • Used by Bing News app (built into Win8/RT and Windows Phone devices)
– AutoScale (schedule-based rules) (beta)
  • Web sites, Cloud Services (PaaS), Virtual Machines (IaaS) and Mobile Services
  • History tracking, proactive notification for AutoScale events (e.g. email)
– Automated VM load balancing (net traffic) management (free)
– Portal extensions for operational logs and alerts

September 2013
– Dedicated Cache Service (high perf distributed caches for Windows/Linux/ASP/Web sites)
  • Azure Mobile Services to be integrated in near future
– Scheduled AutoScale (time schedule rules for using AutoScale features)
– Azure Web Sites logging to Azure Storage (blobs)

Page 109: 03_Cloud Computing D


Commercial Cloud Services

Page 110: 03_Cloud Computing D

Agenda

Evolving Programming Models – Overview

Extensions to traditional programming models – Middleware patterns in the cloud

Loosely coupled, relaxed consistency
– Amazon Web Services
– Microsoft Azure
– Google

Content centric
– Hadoop, Apache Spark
– NoSQL

Database centric
– Pangoo

Page 111: 03_Cloud Computing D

A “Content-Centric” model runs infrastructure, data and computation all on the same nodes

[Diagram: Infrastructure, Persistence and Programming layers, each with its own Mgmt Model; annotations: “Loosely coupled starts here” and “Real innovation occurs here”]

Page 112: 03_Cloud Computing D


Critical elements of a content centric model

“Restricted” programming model
• Think Batch: Redux
• Enables parallelized, distributed, fault tolerant computations without programming complexity
• No new programming experience required; framework hides details of parallelization, fault tolerance, load balancing, etc. from developer
• Offers simplicity of deployment & scalability - no application knowledge of runtime or OS or cloud necessary

Can be deployed on native hardware or virtualized
• Underlying map/reduce runtime automatically parallelizes the computation across large-scale clusters of (virtual) machines

Storage & data - Leverages “hybrid” distributed storage system & file systems designed to handle petabytes of data - i.e. not to be confused with an OS file system
• Data Handling & Replication: map/reduce is implemented through a software framework that handles data distribution

Designed to minimize operational costs
• The “master” pings every worker periodically. If there is no response within a certain amount of time, the master marks the worker as failed.
• The runtime handles machine failures and schedules inter-machine communication to make efficient use of the network and disks
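The master's liveness check can be sketched in a few lines. This is a minimal single-process illustration, not the actual runtime: the `Master` class, its method names and the 10-second timeout are all hypothetical, and time is passed in explicitly so the logic is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """Toy master that tracks worker liveness from ping responses (hypothetical)."""
    timeout_s: float                       # assumed threshold, not from the slides
    last_seen: dict = field(default_factory=dict)
    failed: set = field(default_factory=set)

    def record_response(self, worker, now):
        # A worker answered a ping: remember when, and clear any failed mark.
        self.last_seen[worker] = now
        self.failed.discard(worker)

    def check_workers(self, now):
        # Mark every worker that has been silent for longer than the timeout.
        for worker, seen in self.last_seen.items():
            if now - seen > self.timeout_s:
                self.failed.add(worker)
        return sorted(self.failed)


master = Master(timeout_s=10)
master.record_response("worker-1", now=0)
master.record_response("worker-2", now=0)
master.record_response("worker-1", now=8)   # worker-2 stays silent
failed_workers = master.check_workers(now=12)
```

In a real runtime the master would also reschedule the tasks that were assigned to a failed worker; that part is omitted here.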

Page 113: 03_Cloud Computing D


Apache Project: Hadoop Core

Open source project to recreate Google’s capabilities (led by Yahoo) with improvements
• Portable – can run as a native or virtualized system
• Additional pluggable runtime components for crawling (structured & unstructured data), query languages (Pig Latin, JAQL, Hive, etc.)

Provides a Java framework for large scale parallel processing map/reduce apps
• Offers simplicity of “programming” - looks like a simple single threaded app model for developers
• Today - setting up, coding Hadoop jobs in Java, etc. is the domain of skilled Java engineers

Awareness & adoption growing
• Could become foundation of new generation of easily customizable web analytic applications – at web scale
• Yahoo – used in production for indexing content
• Facebook – analyze logs, analytics
• New York Times

Not as scalable as Google – but does it need to be?

Page 114: 03_Cloud Computing D


Hadoop, an open source implementation of map-reduce

Map-reduce runtime
• Partitions input data
• Schedules program’s execution across set of machines
• Manages inter-machine communication
• And more

Programming using Map-reduce:
• Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key.
• Processes and generates large data sets
• Automates program recovery in case of a failure
• Supports functional style programming
• Parallelism is an inherent feature
• Critical to keeping costs down
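To make the programming model concrete, here is a minimal sketch of the map / group-by-key / reduce contract. It runs sequentially in one process and is not the Hadoop API; `map_reduce`, `count_words_map` and `count_words_reduce` are hypothetical names, and word counting is used only as a familiar example.

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn):
    """Run map, group-by-key and reduce sequentially (a real runtime distributes these)."""
    groups = defaultdict(list)
    for key, value in records:
        # The map function may emit zero or more intermediate pairs per input pair.
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    # The reduce function merges all intermediate values that share a key.
    return {key: reduce_fn(key, values) for key, values in sorted(groups.items())}


def count_words_map(offset, line):
    # Input pair: (character offset, line of text). Output pairs: (word, 1).
    for word in line.lower().split():
        yield word, 1


def count_words_reduce(word, counts):
    return sum(counts)


lines = [(0, "the cloud runs the job"), (23, "the job runs")]
word_counts = map_reduce(lines, count_words_map, count_words_reduce)
```

The developer writes only the two small functions; partitioning, scheduling and recovery are what the runtime contributes.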

Page 115: 03_Cloud Computing D


Conceptual flow with Map Reduce

Conceptually, Map and Reduce functions are identical
They both operate on and transform key value pairs
The idea is to diminish (reduce) the amount of data as it passes through this flow
This is a very human idea

(k_i, v_i), i = 1, 2, 3, …, N  →  Map(k, v), f_m()  →  (k'_i, v'_i), i = 1, 2, 3, …, M

You specify the input data set and the Map function
Note that the values are transformed and change, but so do the keys
The number of keys changes; some input records could be discarded
Multiple values for the same key may appear in the Map output

Page 116: 03_Cloud Computing D


Conceptual flow with Map Reduce

Conceptually, Map and Reduce functions are identical
They both operate on and transform key value pairs
The idea is to reduce the amount of data as it passes through this flow
This is a very human idea

(k_i, v_i), i = 1, 2, 3, …, N  →  Map(k, v), f_m()  →  (k'_i, v'_i), i = 1, 2, 3, …, M  →  Sort by key, aggregate by key  →  (k'_i, (v_1 … v_mi)'_i), i = 1, 2, 3, …, M' (M' < M)

Multiple values for the same key may appear in the Map output
Multiple values for the same key should appear in the aggregated output
Let’s focus on the unique keys, so we need to sort and aggregate by key

Page 117: 03_Cloud Computing D


Conceptual flow with Map Reduce

Now we apply another transformational function, just like before
The idea to reduce is a very human one
That’s really all there is to this.

(k_i, v_i), i = 1, 2, 3, …, N  →  Map(k, v), f_m()  →  (k'_i, v'_i), i = 1, 2, 3, …, M  →  Sort by key, aggregate by key  →  (k'_i, (v_1 … v_mi)'_i), i = 1, 2, 3, …, M' (M' < M)  →  Reduce(k', v'), f_r()  →  (k''_i, v''_i), i = 1, 2, 3, …, P

Multiple values for the same key may appear in the Map output
Multiple values for the same key should appear in the aggregated output

Page 118: 03_Cloud Computing D


Map Reduce: Simple / Sample problem

Using an NCDC data set, find the average precipitation in the US, by year

Use this format: ftp://ftp.ncdc.noaa.gov/pub/data/cdo/samples/PRECIP_15_sample_ascii.dat

STATION      STATION_NAME       ELEVATION  LATITUDE  LONGITUDE  DATE            QPCP  Units
COOP:311564  CATALOOCHEE NC US  798.9      35.61667  -83.1      19840101 00:15  0     HI
COOP:311564  CATALOOCHEE NC US  798.9      35.61667  -83.1      19840104 22:45  1     HI
COOP:311564  CATALOOCHEE NC US  798.9      35.61667  -83.1      19840105 00:30  1     HI
COOP:311564  CATALOOCHEE NC US  798.9      35.61667  -83.1      19840105 01:30  1     HI
COOP:311564  CATALOOCHEE NC US  798.9      35.61667  -83.1      19840105 02:15  1     HI

Callouts on the sample rows: January 1st, 1984; January 4th; multiple times…

QPCP: The amount of precipitation recorded at the station for the 15 minute period ending at the time specified for DATE above, given in tenths or hundredths of inches depending on the value given in the Units element (see definition for Units below). Prior to January 1996 QPCP was the only observational element in this data set. The values 9999 or 99999 mean the data value is missing. The maximum number of characters for this field is 8. This element is selectable when using the Climate Data Online interface for creating a data output file.

Units (Flag/Attribute): HI indicates data values (QGAG or QPCP) are in hundredths of inches. HT indicates data values (QGAG or QPCP) are in tenths of inches.

Page 119: 03_Cloud Computing D


The Map operation example

(k1, v1) => Map(data, f(x)) => (k2, v2)

Consume a line, output year and precipitation
(key = char offset, value = line) => Map() => (key = year, value = QPCP)

Map inputs:

STATION      STATION_NAME       ELEVATION  LATITUDE  LONGITUDE  DATE            QPCP
COOP:311564  CATALOOCHEE NC US  798.9      35.61667  -83.1      19840101 00:15  0
COOP:311564  CATALOOCHEE NC US  798.9      35.61667  -83.1      19840104 22:45  1
COOP:311564  CATALOOCHEE NC US  798.9      35.61667  -83.1      19840105 00:30  1
COOP:311564  CATALOOCHEE NC US  798.9      35.61667  -83.1      19840105 01:30  1
COOP:311564  CATALOOCHEE NC US  798.9      35.61667  -83.1      19840105 02:15  1

Map outputs:

YEAR  QPCP
1984  0
1984  1
1984  1
1984  1
1984  1


Page 121: 03_Cloud Computing D


Then, sorting / aggregation happens

Sort data by key
Aggregate values with same key into one aggregate value

inputs:

YEAR  QPCP
1984  0
1984  1
1985  4
1984  1
1984  1
1984  1
1985  2
1985  3

outputs:

YEAR  QPCP
1984  0 1 1 1 1
1985  2 3 4

Page 122: 03_Cloud Computing D


The Reduce operation example

(k2, v2) => Reduce(data, f(x)) => (k3, v3)

Input: year, list of precipitation values
Output: year, average precipitation value
Reduce function f(x) = average(x)

Reduce inputs:

YEAR  QPCP
1984  0,1,0,2,0
1985  1,4,5,0,7
1986  1,0,2,3,1
1987  1,6,2,3,4
1988  1,5,2,5,4

Reduce outputs:

YEAR  QPCP
1984  1
1985  3
1986  2
1987  3
1988  3
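The three steps of the precipitation example (map a line to year and QPCP, sort and aggregate by year, reduce with an average) can be sketched end to end as follows. This is a single-process illustration under stated assumptions, not a Hadoop job: the function names are hypothetical, fields are parsed by position from the end of each line, and the Units flag is ignored because every sample row uses HI.

```python
from collections import defaultdict

# The five sample rows from the NCDC excerpt shown earlier (header row omitted).
SAMPLE = """\
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840101 00:15 0 HI
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840104 22:45 1 HI
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840105 00:30 1 HI
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840105 01:30 1 HI
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840105 02:15 1 HI
"""

MISSING = {"9999", "99999"}  # markers for a missing QPCP value


def map_line(offset, line):
    """(char offset, line) -> (year, QPCP); missing or short records emit nothing."""
    fields = line.split()
    if len(fields) < 4:
        return
    date, qpcp = fields[-4], fields[-2]   # ..., DATE, TIME, QPCP, Units
    if qpcp in MISSING:
        return
    yield date[:4], int(qpcp)


def sort_and_aggregate(pairs):
    """Collect all values that share a key, with keys in sorted order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_year(year, values):
    """(year, list of QPCP values) -> (year, average QPCP)."""
    return year, sum(values) / len(values)


def average_by_year(text):
    pairs, offset = [], 0
    for line in text.splitlines():
        pairs.extend(map_line(offset, line))
        offset += len(line) + 1
    grouped = sort_and_aggregate(pairs)
    return dict(reduce_year(year, values) for year, values in grouped.items())


averages = average_by_year(SAMPLE)
```

For the five sample rows the only key is 1984 with values 0, 1, 1, 1, 1. A production job would also have to normalize HI and HT units before averaging.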

Page 123: 03_Cloud Computing D


Large Financial Institution wanting to do fraud analytics

A platform that can cost effectively manage PBs of data and support fraud and marketing analytics
Must be efficient for structured data
Integration with enterprise assets: warehouse, BI tools

[Figure: Transactional System (Transactional Credit Card Risk Management System; decision to authorize charge) and New Analytics Platform (Analytics), linked by models of normal and fraudulent card usage.]

Data Sizes and Performance

Requirement: analyze 7 years – total 250TB at a rate of 100M transactions a day (transaction rate expected to grow substantially)

Problem 1 (1 year of data):
– Today, w/o MSA, takes > 1 week
– With MSA – 3 hr!

Problem 2 (1 month of data):
– Customer goal: 1 day = “a win”; 10 minutes = “great”; 1 minute = “awesome”
– MSA at “great” (~10 mins), moving to “awesome”

Page 124: 03_Cloud Computing D


Brief History of Spark

2002 – MapReduce @ Google
2004 – MapReduce paper
2006 – Hadoop @ Yahoo
2010 – Spark paper
2011 – Hadoop 1.0 GA
2014 – Apache Spark top-level
2014 – 1.2.0 release in December
2015 – 1.3.0 release in March

Spark is HOT!!!
Most active project in Hadoop ecosystem
One of top 3 most active Apache projects
Databricks founded by the creators of Spark from UC Berkeley’s AMPLab

[Chart: Activity for 6 months in 2014 (from Matei Zaharia – 2014 Spark Summit)]

Page 125: 03_Cloud Computing D


Apache Spark is a fast, general purpose, easy-to-use cluster computing system for large-scale data processing

– Fast
  Leverages aggressively cached in-memory distributed computing and JVM threads
  Faster than MapReduce
– Generality
  Covers a wide range of workloads
  Provides SQL, streaming and complex analytics
– Ease of use (for programmers)
  Spark is written in Scala, an object oriented, functional programming language
  Scala, Python and Java APIs
  Scala and Python interactive shells
  Runs on Hadoop, Mesos, standalone or cloud

[Chart: Logistic regression in Hadoop and Spark, from http://spark.apache.org]

Page 126: 03_Cloud Computing D


Spark Resilient Distributed Datasets - like in-memory hash partitions

[Figure: partitions of RDD1, RDD2 and RDD3 spread across Slave node 1, Slave node 2 and Slave node 3. Labels: Spark RDD, in-memory distribution; HDFS, on-disk distribution.]

Page 127: 03_Cloud Computing D


Directed Acyclic Graph Computation – much more efficient framework than MapReduce

[Figure: an example of a typical workload. As MapReduce it consists of 4 MR jobs with 6 intermediate-step Distributed File System (DFS) IOs. As a Spark DAG with lazy evaluation there is no intermediate-step DFS IO.]
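Lazy evaluation is the idea that lets a DAG engine skip intermediate writes: transformations are only recorded, and nothing runs until a result is requested. The sketch below is a toy illustration of that idea in plain Python, not the Spark API; `LazyDataset` and its methods are hypothetical names.

```python
class LazyDataset:
    """Toy dataset that records transformations and runs them only on collect()."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)

    def map(self, fn):
        # Record the step; do not touch the data yet.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, predicate):
        return LazyDataset(self._source, self._steps + (("filter", predicate),))

    def collect(self):
        # Each item flows through every recorded step in one pass,
        # so no intermediate data set is ever materialized.
        results = []
        for item in self._source:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                results.append(item)
        return results


calls = []


def square(x):
    calls.append(x)          # lets us observe when the work really happens
    return x * x


pipeline = LazyDataset(range(6)).map(square).filter(lambda x: x % 2 == 0)
calls_before_collect = len(calls)
result = pipeline.collect()
```

Building the pipeline triggers no calls to `square`; the work happens only inside `collect()`.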

Page 128: 03_Cloud Computing D


Spark Extensions – a common API for data ingest, streaming analytics, machine learning, graph processing, and more

Extension of the core Spark API
Improvements made to the core are passed to these libraries
Little overhead to use with the Spark core

Page 129: 03_Cloud Computing D


Spark in the real world ...

Workloads: Batch, Interactive, Machine Learning, Data Integration, Data Wrangling, Streaming, Graph Processing, SQL

Industries: Healthcare, Telco, Financial Services, Media, Manufacturing, National Security, Insurance, Retail, Banking

Page 130: 03_Cloud Computing D


Spark to Improve Health Care for Millions of Patients

Independence Blue Cross will leverage Spark as the analytic platform for projects to improve the lifestyle of those who are ill

• Map complex referral networks to identify cost-efficient providers
• Identify patients who are at the highest risk of being re-hospitalized within a short period of time
• Enhance the efficacy of managing chronic disease, such as early detection of diabetes
• Analyze clinical data and scanned images to identify hip implant patients who have high-risk exposure to complications such as metallosis, infection and dislocation

Page 131: 03_Cloud Computing D


Spark to hunt for presence of intelligent extraterrestrial life

IBM, NASA, and the SETI Institute are collaborating to analyze complex deep space radio signals using Spark in a hunt for patterns that might betray the presence of intelligent extraterrestrial life

• SETI Institute's mission is to explore, understand and explain the origin and nature of life in the universe
• With Apache Spark as a Service on Bluemix, SETI is able to work with IBM on a global scale to explore new ways to analyze signal data and build on each other’s innovations
• A Spark application is being developed to analyze the 100 million radio events detected by the Allen Telescope Array (ATA) over several years.

Page 132: 03_Cloud Computing D


Bluemix Genomics
Human Genome Sequencing using Spark

A single genome is 200GB; 1M people per year are tested

Leverage Bluemix and Spark services to provide powerful processing and analytics for massive genome data

Demonstrated using Spark to search and compare variants from a standard chromosome repository, with visualization of chromosomes to explore variants easily, e.g. Chro1-22, chroX, chroY

Page 133: 03_Cloud Computing D


But --- there are Gaps in NoSQL Data Stores …. Can Enterprises Live with these?

Not good for multi-user, complex apps

Difficulties with data integration tools that require understanding of the structure to support movement of the data to other systems

Difficulty joining entities when relationships are not stored pre-joined; no multi-object transactions

Limited data management tool set and ecosystem

No data integration with enterprise data

Requires a highly skilled team to deliver and manage a NoSQL based deployment
– As complexity increases, investment in enterprise software is less expensive than engineering ad-hoc solutions.

Page 134: 03_Cloud Computing D


Google Software Stack: One View

Google File System
• Non-virtualized storage component – specialized distributed file system designed for Google workloads
• Two types of servers: masters (network coordinators) & workers (operating on data as requested)
• Chunk size is 64 MB – not a typical file system block size; chosen to reduce worker interaction with the master

Bigtable
• Distributed column oriented data store (not a relational DB) on top of GFS (covered in Google TT last year)

Work Queue
• Distributed batch processing component & job scheduler

Map Reduce – details
• Framework/library component in C++
• Utilizes Work Queue to distribute computations to clusters
• ~10,000 Map Reduce programs today
• In 2004 ran 29,000 jobs – 2007, 2,200,000 jobs
• Google runs ~100,000 jobs per day crunching through 20 petabytes
• Runs across ~100,000 node servers
• Indexing, AdWords, Analytics, etc.

Sawzall
• Query language, type-safe scripting language
• Factor of 10 simpler to code up (and shorter) than in C++

Page 135: 03_Cloud Computing D


NoSQL defined…

Emergence of a growing number of non-relational, distributed data stores for massive scale data

http://nosql-database.org

Page 136: 03_Cloud Computing D


Application developers are using NoSQL to rapidly prototype and deploy

Page 137: 03_Cloud Computing D


Categories of NoSQL Use-case Patterns

Scalability for web-apps: single record access based on key (think NoSQL OLTP)
– Large data scale
– High read concurrency
– Ratio of value to number of records is high

Rapid development of web-scale solutions (think NoSQL Agility)
– Chosen for flexible schema
– Simple queries (key-lookup)
– Lifespan of apps is short, and rapid iteration is required

Scalable Analytics (think NoSQL BigData Analytics)
– Scalable fault tolerant framework for storing and processing MASSIVE data sets (Hadoop)
– Lower cost, available hardware
– Gives you point access to data in MR, not just sequential access
– Ratio of value to number of records is low

Page 138: 03_Cloud Computing D


HBase

A NoSQL data store
The Hadoop Database
– Included in Apache Hadoop
An industry leading implementation of Google’s BigTable design
HBase powers some of the leading sites on the Web (e.g. Facebook, Yahoo, …)

Page 139: 03_Cloud Computing D


There is no single NoSQL --- it’s a landscape

• Simple Key Value Stores
  o Data is stored in a hash-table of keys. Values are opaque binary objects.
  o Layout: Key → Binary Data

• Document Stores
  o Data is stored as documents with tagged elements
  o Layout: Key → Document (collection of key-values)

• Column Family
  o Data attributes are grouped into column sets. Each storage block contains data from only one column set.
  o Layout: Key → ColumnFamily1: C1, ColumnFamily1: C2, ColumnFamily2: C1, ColumnFamily3: C1

• Graph Store
  o Data is stored in nodes and edges of a graph and accessed using graph traversal
  o Layout: Node 1 (Key, Properties) and Node 2 (Key, Properties) joined by Relationship 1 (Key, Properties)

[Figure label: eXtremeScale]

Page 140: 03_Cloud Computing D


Understanding the CAP Theorem

A distributed system can only achieve two out of the three following properties:

– Consistency – all clients see the same data -- the database is truthful!
  Can be Strong (e.g. atomic and immediate), Sequential, Causal, Eventual or Weak.
– Availability – The system is always on so the data is available
– Partition Tolerance – The system functions when a network failure creates two disconnected groups OR “The network will be allowed to lose arbitrarily many messages sent from one node to another.”
  The exact meaning of this property is debated

CAP Positioning

HBase is strongly consistent for single-row operations and implements Consistency and Partition Tolerance
(e.g. a Region Server failure is recoverable but the data will be unavailable for a period of time)

Page 141: 03_Cloud Computing D


Development Characteristics of NoSQL Systems

Schema Flexibility and Development Agility
– Quick application development
– No schema first
– Sparse schemas
– Data models that are native to the application space
  JSON is very dominant
– Fewer negotiations with IT
  Application defines the schema and access paths
  Rapid response to feedback and changing requirements.

[Figure labels: Data, JSON]

Page 142: 03_Cloud Computing D


Runtime Characteristics of NoSQL Systems

Low Latency, Non-Durable Writes: store objects as they arrive, no shredding and multi-table storage
– Capture machine generated data where risk of loss is low and not fatal

Low-latency Reads: Objects match application access, no joins required
– Online analytics and web-facing applications where stored object matches the web-facing application

Dynamic Elasticity:
– Rapid horizontal scalability (10’s or 100’s of servers)
– Ability to add or delete nodes dynamically
– Application transparent elasticity and cloud compatibility (scale in AND out)
– Allows IT expense to scale with usage

Fast initial deployment: Commonly available hardware, open source software, (perceived) lack of need for database administration and skills
– Low barrier to entry for early exploration and rapid development iteration

[Figure: Sharding – a data set ABC split into shards A, B and C. Scale labels: Petabytes, Zettabytes.]

Page 143: 03_Cloud Computing D


Approaches to Elastic Scale on Commodity Hardware

BigTable
Variants: HBase
Partitioning: range-aware data-blocks; automatically split when size is exceeded; N data-blocks mapped to physical node (master).
Replication: leverages Distributed File System
Durability: Sync. writes

Dynamo
Variants: Cassandra, Couchbase, Riak
Partitioning: Consistent hashing. Hash keys are maintained in a ring; N hash-keys map to a virtual node in the ring, and multiple virtual nodes map to the same physical node (so a single physical node maps to multiple slots in the ring). The scheme reduces the number of keys that need to be remapped when nodes are added.
Replication: Async replication to N slots, or use logging
Durability: Synchronous write to quorum

Replica-Sets
Variants: MongoDB, OracleDB
Partitioning: range-aware shards distributed on available nodes; config server keeps map of shards to nodes; shards split automatically
Replication: replica sets with configurable (async/sync) consistency.
Durability: configurable journaling

[Figure labels: Client, Client, Client; Master; Master (standby); GFS]
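The consistent hashing scheme described for the Dynamo family can be sketched as a ring of virtual nodes. This is a minimal illustration, not the implementation used by Cassandra, Couchbase or Riak: the `HashRing` class, the choice of MD5 and the figure of 64 virtual nodes per physical node are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a position on the ring (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with several virtual nodes per physical node."""

    def __init__(self, vnodes_per_node=64):
        self.vnodes_per_node = vnodes_per_node
        self._slots = []  # sorted list of (ring position, physical node)

    def add_node(self, node):
        # One physical node occupies many slots spread around the ring.
        for i in range(self.vnodes_per_node):
            bisect.insort(self._slots, (_hash(f"{node}#{i}"), node))

    def node_for(self, key):
        if not self._slots:
            raise LookupError("ring is empty")
        # First slot at or after the key's position, wrapping around the ring.
        index = bisect.bisect_right(self._slots, (_hash(key), ""))
        return self._slots[index % len(self._slots)][1]


ring = HashRing()
for name in ("node-a", "node-b", "node-c"):
    ring.add_node(name)

keys = [f"key-{i}" for i in range(1000)]
before = {key: ring.node_for(key) for key in keys}

ring.add_node("node-d")
after = {key: ring.node_for(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

Every key in `moved` now belongs to `node-d`; keys never move between the existing nodes, which is the property that keeps remapping small when the ring grows.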

Page 144: 03_Cloud Computing D


RDBMSs achieve scale and HA without sacrificing ACID

Run massive low-latency systems in production
– E.g. stock trading, shipping, credit card authorization, currency exchange systems
– 10.3M tpm TPC-C on IBM Power 780; 3M tpm TPC-C on Intel x3850 X5

Advanced optimization for object types, such as XML
– The current “format du jour” is JSON, which is starting to gain a lot of attention from RDBMS vendors.

Enterprise applications will need joins
– Reference data
– Need ability to optimize data access without changing the application
– RDBMSs have been doing this for years. NoSQL systems will need to introduce this capability in the application tier.

RDBMSs support scalability for enterprise deployments
– IT works with app developers to manage IT growth

[Chart: PureScale Elasticity – throughput vs 1 member (y-axis, 0 to 12) against #members (x-axis, 0 to 15)]

Page 145: 03_Cloud Computing D


‘Content-Centric’ is really about “Big Data” AND “New Analytics”

[Figure labels: Text; Logs & Transactions; Clickstream Data; Biological Sequences; Statistical Model Building; Text Analytics]

Page 146: 03_Cloud Computing D


Agenda

Evolving Programming Models – Overview

Extensions to traditional programming models – Middleware patterns in the cloud

Loosely coupled, relaxed consistency
– Amazon Web Services
– Microsoft Azure
– Google, NoSQL

Content centric
– Hadoop, Apache Spark

Database centric
– Pangoo
– Salesforce.com

Page 147: 03_Cloud Computing D


A “Database-Centric” model runs infrastructure and database on the same nodes

[Figure: Infrastructure, Persistence and Programming layers, each with its Mgmt Model. Callout: “Real innovation at this layer”.]

Page 148: 03_Cloud Computing D


Critical elements of a database centric model

• The database layer needs to multiplex multiple applications
  • Database model needs to be flexible if different apps share the database
  • For cloud economics to work out, mgmt cost of database layer << #apps × mgmt cost of a single database for an app

• Programming model
  • A focus on schema configuration as opposed to schema design
  • Constrain enough to keep cloud economics yet not reduce the market significantly

• Higher bandwidth within a “group of nodes”
  • For scaling the database within an app (could use larger SMPs)

• Database nodes are the “keystone”; they need “HA” in some form (so the previous two architectures are not exactly the right fit)

Page 149: 03_Cloud Computing D


From Single-Tenant to Multi-Tenant Application

MMT common service provides:

Support for cost-effective resource sharing, isolation, diverse SLAs, etc., across different tenants
Management of database resource pool, lifecycle of applications & tenant subscriptions; monitor, analyze, and optimize system operations
Highly on-demand availability and scalability with the number of tenants & offerings
Minimize application development or transformation effort for SaaS ISVs
MT data access mockup package for local testing

[Figure labels: App 1, App 2, App …, each with user1 … user100; multi-tenant App with user1,1 … user10000,100; App 1 … 10; MMT Common Service; MMT Meta Repository; Operator; Database Resource Pool (1 … 5); few shards in MT; 10,000]

Page 150: 03_Cloud Computing D


Database Multi-Tenancy for the Cloud

[Figure: three database multi-tenancy options, each with a multi-tenant app on an app server serving Tenant A and Tenant B:
– Separate Instances/Databases (deluxe/advanced)
– Separate Tables (intermediate)
– Shared Tables (economic)
Axis labels: “Higher multitenancy, better resource utilization” and “Higher query optimization/runtime complexity, higher security worries”.]

Page 151: 03_Cloud Computing D


Multi-tenancy Challenges

Isolation, Scalability, Performance, Customization, Resource Utilization, Metering …

[Figure labels: Virtual Multi-Tenant Layer; DB Multi-Tenant Layer]

Page 152: 03_Cloud Computing D


MT DB Tradeoffs

Criterion | Isolated Databases | Separate Schemas | Shared Tables
Simplicity | simple | simple (but need mechanism to avoid name collisions (3-part name or mapping)) | hard
Customizability (schema) | high | high | low (might require migration)
Rigorous Isolation (regulatory law) | best | moderate | lowest
Resource Cost/tenant | high | low | lowest
#Tenants | low | large | largest
Operational Cost/tenant (backup, patches, etc.) | high | low (but point in time recovery not easily possible) | lowest (but point in time recovery even harder)
Tools | Need tools to deal w/ large number of instances/databases | Need tools to deal w/ large number of tables | n/a
DB implementation cost | Lowest (qry routing and simple mapping layer) | Low (qry routing, simple mapping layer and qry mapping) | High (qry routing, simple mapping layer, qry mapping, row-level isolation)
Scalability | Per tenant | Need some data/load balancing w/ dynamic migration | Need some data/load balancing w/ dynamic migration
Query Optimization | Less critical | Less critical | Critical (wrong plan over very large tables is disastrous)
Per Tenant Query Performance | As usual | Need qry governance | Need qry governance and tenant-specific statistics

Page 153: 03_Cloud Computing D


Get tenant id via Tenant Identity propagation (ThreadLocal).

Retrieve tenant profile (database, username, password, etc.) according to tenant id.

Connect to underlying database based on tenant profile
– If shared tables, set tenant id in connection; pass down the SQL to target DB.
– If separate tables, get tenant specific schema name (assigned during tenant onboarding) from tenant profile, and set current schema before each statement is created.
– If separate DB, pass down the SQL to target DB.
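The routing decision above can be sketched as a small function. This is an illustration only and not the MMT JDBC wrapper: the profile table, the `TenantProfile` fields and the shape of the returned routing plan are hypothetical, and no real database connection is opened.

```python
import threading
from dataclasses import dataclass

# Thread-local storage stands in for the ThreadLocal tenant identity propagation.
_tenant_context = threading.local()


@dataclass(frozen=True)
class TenantProfile:
    database: str
    isolation: str      # "shared_tables", "separate_tables" or "separate_db"
    schema: str = ""    # only used for separate tables


# Hypothetical stand-in for the tenant profiles kept in the metadata repository.
PROFILES = {
    "t1": TenantProfile("pool_db_1", "shared_tables"),
    "t2": TenantProfile("pool_db_1", "separate_tables", schema="TENANT_T2"),
    "t3": TenantProfile("tenant_t3_db", "separate_db"),
}


def set_tenant(tenant_id):
    """Called once per request, e.g. by a servlet filter in the original design."""
    _tenant_context.tenant_id = tenant_id


def route(sql):
    """Return where and how the statement should run for the current tenant."""
    tenant_id = getattr(_tenant_context, "tenant_id", None)
    if tenant_id is None:
        raise RuntimeError("no tenant id has been propagated to this thread")
    profile = PROFILES[tenant_id]
    plan = {"database": profile.database, "sql": sql, "session": {}}
    if profile.isolation == "shared_tables":
        plan["session"]["tenant_id"] = tenant_id            # row-level isolation
    elif profile.isolation == "separate_tables":
        plan["session"]["current_schema"] = profile.schema  # schema-level isolation
    # separate_db: the SQL is passed down to the tenant's own database unchanged
    return plan


set_tenant("t2")
plan_for_t2 = route("SELECT * FROM orders")
```

The application SQL stays identical for all three isolation modes; only the session settings and the target database differ.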

[Figure: Dynamic Routing in the MMT JDBC Wrapper, with steps numbered 1 to 6. Components: SaaS Application (requests DB connection; tenant identity propagation; get tenant id); MMT JDBC Wrapper with REST Client and Cache; MMT Master App with REST Service; MMT Metadata Repository (tenant info, offering info, physical DB info, catalog info, SLA, etc.); DB2 JDBC Driver; Tenant DB (DB2MMT or non-db2mmt). Flows: REST request w/ tenant id; REST response w/ tenant profile: DB info, SLA (only once); JDBC connection w/ tenant id; result set.]

Page 154: 03_Cloud Computing D


Bringing an Application to MMT for DB2

[Figure: three stages.
– MT App (Offering) development/transform: ISV, IDE, MMT Sandbox, multi-tenant application
– Operation Management: Service Provider, MMT Admin Console (tenant management, offering management, resource management; monitoring, governance, …)
– Runtime: Tenant Users, Multi-tenant App, MMT Common Service, MMT Meta Repository, Operator, Database Resource Pool (1 … 5, shards in MT)]

Page 155: 03_Cloud Computing D


MT Application development/transformation

[Figure: Develop & Transformation, Local & Runtime Environment. ISV Local Env.: ISV App, MMT Local Sandbox, Simulated Meta File, DB2. On-boarding into the DB2 MMT Runtime Env.: Application, MMT Runtime Agent, MMT Common Service, MT Meta Repository, Operator, MT Database Pool (DB2).]

Supported J2EE environments
– JDBC, Spring, iBatis/Hibernate, JPA
– WAS/Tomcat, DB2

Example of offering transformation
1. Embed tenant identification in application
– Modify Web.xml to include the Filter servlet TenantID for propagation through thread local
2. Configure the application to use MMT data access package
– Modify Spring data source config to use MMT data source
3. Provide offering metadata file (XML) of application
– Data source info, DDL, shared tables info, config info, …

Page 156: 03_Cloud Computing D


Operation Management (MMT Admin Console)

1. Offering onboarding
2. Tenant on-boarding/subscription
3. Offering upgrade
4. Offering & tenant topological view

Page 157: 03_Cloud Computing D


Architecture of MMT for DB2

[Figure: components and connections.
– A J2EE SaaS Application with MMT JDBC Wrapper and DB2 JDBC Driver
– MMT Master App (WAS cluster for HA & LB) with MMT REST Services
– MMT Admin Console App
– MMT Metadata Repository
– Database Resource Pool made of Tenant Data Nodes holding tenants T1 to T5
– Connections: REST w/ tenant context; JDBC w/ tenant context; REST; JDBC; RXA / JDBC]


Page 159: 03_Cloud Computing D


KingDee’s Exploitation of Pangoo

[Figure: components.
– Application with Tenant Context, using JDBC, SDO, Hibernate Agent, SQL through the MT-JDBC Driver, or REST/SOAP object queries (LinQ, SOQL, GQL etc.)
– VirtualDataStore (static schema, dynamic schema); Data Object; Data Model Mapping Module; RDB Model Adapter; Object Model Adapter
– MT Runtime Data Access Service (runtime resource sharing/isolation, dynamic routing, SLA tracking, …)
– MT Operational & Management Service (HA, scalability, SLA tracking, optimization, OLC etc.)
– Multi-tenant Metadata Repository
– High Available & Scalable Data Resources Pool]

DB-CENTRIC CLOUD

Page 160: 03_Cloud Computing D


Salesforce.com PaaS

Page 161: 03_Cloud Computing D


While Salesforce started with CRM, it and its partners run 1000’s of other transactional apps on force.com

[Figure: Pod1, Pod2, Pod3 (10 pods in total). Each pod runs apps (CRM, HR, Travel, Mktg) on a Multitenant Optimization Layer over a 4-way Oracle RAC. Labels: ~TB of managed DB; ~40,000 tenants; ~400,000 custom objects; Mileage Object.]

Take 20 standard objects (Accounts, Orders, …)
Customize or create new ones
Add workflow or business logic
Get app
Service multiple tenants

DB-CENTRIC CLOUD

Page 162: 03_Cloud Computing D


A Critical Innovation is the Multi-Tenant Database Architecture

Custom Objects are forced into a very limited number of Oracle Tables

Organization_id | Key_prefix | Id | Name, (Others) | Val0 | Val1 | … | ValN
org1 | a01 | a01…1
org1 | a01 | a01…2
org1 | a02 | a02…1
org1 | a02 | a02…2
org2 | a01 | a01…3
org2 | a01 | a01…4
org2 | a02 | a02…3

• Key_prefix subsetting
  ● Still partitioning by organization_id
• Smart primary keys (key prefix)
  ● Re-use across organizations
• GUID primary keys
• ValN flex fields

Opex at database and platform level is dominated by #objects [backups, stats, tuning, schema evolution, app design] for most databases. SFDC reduces this by forcing all disparate objects into a fixed set of tables (as rows) -- trading off opex for platform development costs. Consequently, it is able to store ~400,000 different objects in a couple of dozen tables
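The flex-field idea can be illustrated with one generic table plus a metadata map from each tenant's field names to the ValN columns. This is a toy sketch using SQLite, not Salesforce's schema: the table layout, the `FIELDS` map, the object names and the two flex columns are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One shared table holds rows for every organization and every custom object.
conn.execute(
    """CREATE TABLE data (
           org_id TEXT, key_prefix TEXT, id TEXT PRIMARY KEY, name TEXT,
           val0 TEXT, val1 TEXT)"""
)

# Hypothetical metadata: which flex column stores which field, per org and object.
# The same key prefix can describe different objects in different organizations.
FIELDS = {
    ("org1", "a01"): {"miles": "val0", "vehicle": "val1"},
    ("org2", "a01"): {"invoice_total": "val0"},
}


def insert_object(org_id, key_prefix, obj_id, name, **fields):
    mapping = FIELDS[(org_id, key_prefix)]
    slots = {"val0": None, "val1": None}
    for field_name, value in fields.items():
        slots[mapping[field_name]] = str(value)   # flex columns are untyped text
    conn.execute(
        "INSERT INTO data VALUES (?, ?, ?, ?, ?, ?)",
        (org_id, key_prefix, obj_id, name, slots["val0"], slots["val1"]),
    )


def load_objects(org_id, key_prefix):
    mapping = FIELDS[(org_id, key_prefix)]
    rows = conn.execute(
        "SELECT id, name, val0, val1 FROM data "
        "WHERE org_id = ? AND key_prefix = ? ORDER BY id",
        (org_id, key_prefix),
    )
    objects = []
    for obj_id, name, val0, val1 in rows:
        slots = {"val0": val0, "val1": val1}
        fields = {field: slots[column] for field, column in mapping.items()}
        objects.append({"id": obj_id, "name": name, **fields})
    return objects


insert_object("org1", "a01", "a01-0001", "Trip to Berlin", miles=120, vehicle="car")
insert_object("org2", "a01", "a01-0003", "Invoice 7", invoice_total=99)
org1_objects = load_objects("org1", "a01")
```

Adding a new custom object only adds metadata and rows, never a new table, which is where the opex saving described above comes from.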

DB-CENTRIC CLOUD

Page 163: 03_Cloud Computing D


Key PaaS Services

Platforms: Amazon AWS, Salesforce.com, Cloud Foundry, Microsoft Azure, IBM SOA

Key Services
• Application Environments
• Relational DB as a Service
• Messaging
• Collaboration
• Security / User Management

Page 164: 03_Cloud Computing D


References

Page 165: 03_Cloud Computing D


References – Downloads from Web

Michael Armbrust et al., Above the Clouds: A Berkeley View of Cloud Computing, Feb. 2009, http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf

Cloud Computing: Platform as a Service. InformationWeek Analytics, October 2, 2009

Brooks, Carl: "How to build an application for the cloud", http://searchcloudcomputing.techtarget.com/feature/How-to-build-an-application-for-the-cloud, Feb 2010 (last accessed 10/27/2011)

Ellis, John: "How To Design a Scalable Cloud Application", http://blog.bluelock.com/blog/cumulus-knowledge/how-to-design-a-scalable-cloud-application, Jan 2011 (last accessed 10/27/2011)

Cloud Use Cases White Paper Version 4, http://cloudusecases.org

DMTF: Architecture for Managing Clouds, Version 1.0.0, 2010-06-18

DMTF: Interoperable Clouds, Version 1.0.0, 2009-11-11

Luiz André Barroso and Urs Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Synthesis Lectures on Computer Architecture, 2009, http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006?cookieSet=1

Scott Crowder, Introduction to Workload Optimized Approach & Workload Market Segmentation, IBM White Paper, December 2009

David Chappell, A short introduction to Cloud, http://www.davidchappell.com/CloudPlatforms--Chappell.pdf

David Chappell, Cloud Platforms Today: A Perspective, April 2009 http://www.davidchappell.com/CloudPlatformsToday--APerspective--Chappell.pdf

Jeffrey Dean and Sanjay Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, – labs.google.com/papers/mapreduce-osdi04.pdf

DeCandia et al. Dynamo: Amazon’s highly available key-value store, SOSP 2007, http://portal.acm.org/citation.cfm?id=1294281&dl=ACM&coll=ACM&CFID=47859964&CFTOKEN=98797782

Page 166: 03_Cloud Computing D

References – Downloads from Web

European Network and Information Security Agency (ENISA), Cloud Computing: Benefits, risks and recommendations for information security, Nov 2009 (http://www.enisa.europa.eu)

Gregor Hohpe, Programming the Cloud, November 2009, http://www.enterpriseintegrationpatterns.com/docs/HohpeProgrammingCloudKeynote.pdf

Anna Liu, Architecting Cloud Applications – the essential checklist, AAF Keynote 2009

National Institute of Standards and Technology, Definition of Cloud Computing, http://csrc.nist.gov/groups/SNS/cloud-computing/

National Institute of Standards and Technology, NIST Cloud Computing Reference, Special Publication 500-292

Ning Duan et al., Tenant Behavior Analysis in Software as a Service Environment, ICSOC 2009

Daniel Nurmi et al., The Eucalyptus Open-source Cloud-computing System, http://www.cca08.org/papers/Paper32-Daniel-Nurmi.pdf

Open Cloud Manifesto, http://www.opencloudmanifesto.org/

OpenNebula.org – Various papers

B. Rochwerger et al., The Reservoir Model and Architecture for Open Federated Cloud Computing, IBM Journal of Research and Development, April 2009, http://www8.cs.umu.se/~elmroth/papers/ibmjrd2009.pdf

Werner Vogels, Eventually Consistent, ACM Queue, October 2008

Ying Huang et al., A Framework for Building a Low Cost, Scalable and Secured Platform for Web-Delivered Business Services, IBM Systems Journal, November 2009

Michael Yuan, Java PaaS Shootout, 4/5/11, IBM developerWorks

Raphael, JR: "The 10 worst cloud outages (and what we can learn from them)", http://www.infoworld.com/d/cloud-computing/the-10-worst-cloud-outages-and-what-we-can-learn-them-902?page=0,3, June 2011 (last accessed 10/27/2011)

Page 167: 03_Cloud Computing D

References

Company Web Sites: Amazon, Microsoft, Google, IBM, Salesforce.com

Tech blogs, for instance techblog.netflix.com

http://wiki.developerforce.com/page/Multi_Tenant_Architecture

Alan Brown, Enterprise Software Delivery, Addison-Wesley 2013

Gregor Hohpe, Bobby Woolf, Enterprise Integration Patterns, Addison-Wesley 2004

Jez Humble and David Farley, Continuous Delivery, Addison-Wesley 2010

Gene Kim et al., The Phoenix Project

Kristof Kloeckner, Middleware for Distributed Systems, Lecture Notes 2004

Kristof Kloeckner, The IBM Cloud Agenda, White Paper 2009

Craig Larman, Bas Vodde, Scaling Lean & Agile Development, Addison-Wesley 2009

Web site of the Open Group: www.opengroup.org/cloudcomputing

Mary and Tom Poppendieck, Lean Software Development. An Agile Toolkit, Addison-Wesley 2003

George Reese, Cloud Application Architectures, O’Reilly 2009

John W. Rittinghouse, James F. Ransome, Cloud Computing. Implementation, Management and Security, CRC Press 2009

Andrew Tanenbaum, Maarten van Steen, Distributed Systems. Principles and Paradigms, Prentice-Hall 2009

Rich Schiesser, IT Systems Management, Prentice-Hall 2002

Jim Rymarczyk, Virtualization, Pre-Print 2009

Tivoli Service Automation Manager Solution Guide

Adam Wiggins, The Twelve-Factor App, 12factor.net

Bill Wilder, Cloud Architecture Patterns: Using Microsoft Azure, O’Reilly 2012

Page 168: 03_Cloud Computing D

Cloud Business Support System (BSS) Overview

Page 169: 03_Cloud Computing D

© 2011 IBM Corporation

IBM Cloud Computing Reference Architecture: Architecture Overview | IBM Confidential

Cloud Computing Reference Architecture (CC RA) – Overall drill-down

Governance

Security, Resiliency, Performance & Consumability

Cloud Service Provider

Cloud Services

IaaS

PaaS

SaaS

BPaaS

Common Cloud Management Platform

Cloud Service Integration Tools

Consumer In-house IT

Infrastructure

Middleware

Applications

Business Processes

OSS – Operational Support Services

BSS – Business Support Services

Customer Account Management

Service Offering Catalog

Service Offering Management

Transition Manager

Deployment Architect

Operations Manager

Service Provider Portal & API

Consumer Administrator

Consumer Business Manager

Consumer End User

Service Creation Tools

Service Management Development Tools

Service Runtime Development Tools

Software Development Tools

Image Creation Tools

Service Component Developer

Infrastructure

Security & Risk Manager

Customer Care

Service Manager

Business Manager

Service Composer

Offering Manager

Service Integrator

Service Management

Service Consumer Portal & API

Service Development Portal & API

API (×4)

Existing & 3rd party services, Partner Ecosystems

Provisioning

Incident & Problem Management

IT Service Level Management

Service Automation Management

Service Delivery Catalog

Platform & Virtualization Management

Infrastructure Mgmt Interfaces

Platform Mgmt Interfaces

Software Mgmt Interfaces

BP Mgmt Interfaces

Page 170: 03_Cloud Computing D

Business Support System (BSS)

Services:

1. Offering Management & Service Offering Catalog

2. Customer & Subscriber Management

3. Contract Management

4. Entitlements

5. Order Management

6. Pricing & Rating

7. Accounting, Billing & Invoicing

8. Peering & Settlement

9. Analytics & Reporting

Processes:

Business Support Systems (BSS) are the components that a Service Provider uses to run its business operations towards customers

Page 171: 03_Cloud Computing D

CCMP R1.0 and R1.1 BSS Functionality

Sales
– Face to face using ePricer/eConfig tools

Customer Management
– Bulk import of customer onboarding information by Business Office
– UI for user management with various roles
– Web Identity support

Subscriber Management
– Map customer admin and users to a contract

Offering Management
– Bulk upload of Catalog data with list price and cost information

Service Offering Catalog
– UI for display of catalog item details like Images, VM Sizes, 32/64 Bit, Block Storage, Reserved IP Address, VLAN
– UI for submitting a provisioning request for a VM on a public or private network with appropriate IP address and attached storage

Contract Entitlements
– Service Catalog entitlement information by customer and contract loaded by the Business Office

Reporting and Analytics
– Display of usage via BIRT reports
– Royalty Reports for Red Hat and SUSE

Contract Pricing and Rating
– Pricing information by customer and contract loaded by the Business Office
– Simple ETL-based price x quantity pricing model

Billing
– Usage based by the hour, monthly recurring and one-time charge
– Flexible billing calendar (monthly, quarterly & yearly) for a Geo
– Billing adjustments, incidental charges
– Generating CFT/S spreadsheet feed file
– “Green Dollar” Revenue back to SWG Products

Metering
– Rollup of VM, IP address and storage block usage information via DataStage

Costing
– Usage based costing using offering wide (non-contract) cost rate
– Generating CIF/SSC spreadsheet feed file

API
– APIs for Image, Instance and Key Management
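The metering rollup and the price x quantity rating step listed above can be sketched as follows. This is only an illustration under stated assumptions: the record layout, customer names, resource names and prices are hypothetical, and the real CCMP pipeline (an ETL job via DataStage) is not shown in the slides.

```python
from collections import defaultdict

# Hypothetical metered usage records: (customer, resource type, quantity used).
usage_records = [
    ("acme", "vm_hours", 10),
    ("acme", "vm_hours", 14),
    ("acme", "storage_block_hours", 100),
    ("globex", "ip_address_hours", 24),
]

# Hypothetical contract prices per unit, in cents to avoid float rounding.
contract_prices = {
    ("acme", "vm_hours"): 12,
    ("acme", "storage_block_hours"): 1,
    ("globex", "ip_address_hours"): 2,
}


def rollup(records):
    """Metering: sum the quantities per (customer, resource type)."""
    totals = defaultdict(int)
    for customer, resource, quantity in records:
        totals[(customer, resource)] += quantity
    return dict(totals)


def rate(totals, prices):
    """Rating: charge = contract price x rolled-up quantity."""
    return {key: prices[key] * quantity for key, quantity in totals.items()}


totals = rollup(usage_records)
charges = rate(totals, contract_prices)
for (customer, resource), cents in sorted(charges.items()):
    print(f"{customer:8} {resource:22} {cents / 100:8.2f}")
```

Keeping rollup and rating as separate steps mirrors the slide's split between Metering and Contract Pricing and Rating: the same rolled-up quantities could also be fed into costing with a different rate table.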

Page 172: 03_Cloud Computing D

Pricing Models

One Time Setup Charges
– Setup
– Enterprise Onboarding

Monthly Recurring Charges
– Rate Buy Down
– VPN/VLAN

Per Hour Usage-based Charges
– Virtual Machines: Images (software stack), OS, Standardized (BR, SL, GD, PT, 32, 64) Compute
– IP Address Reservation
– Standardized (SM, MD, LG) Persistent Storage
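How the three charge types combine into one bill can be shown with a small sketch. The `Contract` class, the item names and all amounts are hypothetical; the slide only names the charge categories.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    """Hypothetical contract terms; all amounts are in cents."""
    one_time_setup: int
    monthly_recurring: int
    hourly_rates: dict  # item name -> price per hour


def monthly_invoice(contract, usage_hours, first_month):
    """Combine one-time, monthly recurring and per-hour usage charges."""
    usage = sum(contract.hourly_rates[item] * hours
                for item, hours in usage_hours.items())
    one_time = contract.one_time_setup if first_month else 0
    return {
        "one_time": one_time,
        "monthly_recurring": contract.monthly_recurring,
        "usage": usage,
        "total": one_time + contract.monthly_recurring + usage,
    }


contract = Contract(
    one_time_setup=50_000,                       # enterprise onboarding
    monthly_recurring=2_000,                     # e.g. VPN/VLAN
    hourly_rates={"vm_sm": 10, "ip_reservation": 1},
)
usage = {"vm_sm": 100, "ip_reservation": 720}

print(monthly_invoice(contract, usage, first_month=True))
print(monthly_invoice(contract, usage, first_month=False))
```

The one-time charge appears only on the first invoice, while the recurring and usage-based parts are billed every period.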

Page 173: 03_Cloud Computing D

Public API (REST & SOAP)

REST & SOAP

Web Browser

JavaScript & CSS

Customer Admin

Customer User

Image Provider

Developer

Eclipse Plug-in

Reporting (BIRT)

Data Warehouse (DB2)

Data Acquisition (DataStage)

Web Identity LDAP

TAM

Web Seal

AAA

Order to Cash

Billing (CFT/S)

Costing (SSC)

CSV Files

Billing

Cost

Rylty

Invoicing (Geos, IOL)

Financials (CLS, CARS)

Offering Manager

Create Customer Users & set resource limits

Request & use VM, Storage, IP Address

Upload Catalog & List Prices

Onboard Customers, Billing, Adjustments

Enterprise User Mgmt

BSS Extensions

Resource Mapping

Audit & Compliance

OSS Adapter

Cloud UI

Cloud BSS

ECWDB

BSS Detailed Component Diagram

Pricing & Rating

Abstraction Layer

Portal

Image Meta-data & Scripts

Rational Asset Manager (RAM)

Event Messaging

Subscriber Management REST

Service Offering Catalog

WDP BSS

Entitlements WDP BSS

BSS for Dev Test

Business Office

Create Images

Page 174: 03_Cloud Computing D

Layered Architecture

Page 175: 03_Cloud Computing D

Operational Model

Page 176: 03_Cloud Computing D

Backup Slides

Page 177: 03_Cloud Computing D

What is Docker

Simple APIs and readable Dockerfiles promote forking and sharing of code

Git/Maven style repositories

Layered images promote Continuous Delivery processes and sharing

Lightweight images lend themselves to productive local environments to test distributed scenarios

Page 178: 03_Cloud Computing D

What is Docker?

Page 179: 03_Cloud Computing D

Containers vs. VMs

VM: Server, Host OS, Hypervisor (Type 2); each app (App A, App A’, App B) runs on its own Guest OS with its own Bins/Libs

Container: Server, Host OS, Docker; containers (App A, App B) run on top of shared Bins/Libs

Containers are isolated, but share OS and, where appropriate, bins/libraries

…result is significantly faster deployment, much less overhead, easier migration, faster restart

Page 180: 03_Cloud Computing D

Why are Docker containers lightweight?

VMs: Every app, every copy of an app, and every slight modification of the app requires a new virtual server (App A, copy of App A and App A’ each with their own Guest OS and Bins/Libs)

Containers:

Original App (App A, Bins/Libs): No OS to take up space, resources, or require restart

Copy of App (App A): No OS. Can share bins/libs

Modified App (App Δ): Copy on write allows us to only save the diffs between container A and container A’
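The copy-on-write idea can be sketched with Python's `collections.ChainMap`, which reads through a stack of mappings and writes only to the topmost one. This is an analogy for a layered filesystem, not Docker's actual storage driver; the file paths and contents are hypothetical.

```python
from collections import ChainMap

# Hypothetical read-only base image layer shared by all containers.
base_layer = {
    "/bin/sh": "shell v1",
    "/lib/libc.so": "libc v1",
    "/app/main": "App A",
}

# Each container gets its own empty writable layer on top of the shared base.
container_a = ChainMap({}, base_layer)
container_a_prime = ChainMap({}, base_layer)

# Modifying the app in A' writes only to the writable layer of A'.
container_a_prime["/app/main"] = "App A'"

print(container_a["/app/main"])        # read falls through to the base layer
print(container_a_prime["/app/main"])  # read is served from the diff layer
print(container_a_prime.maps[0])       # the only data stored for A' is the diff
```

Both containers still share the unmodified `/bin/sh` and `/lib/libc.so` entries, and the base layer itself is never changed.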

Page 181: 03_Cloud Computing D

What are the basics of the Docker system?

Components: Source Code Repository, Dockerfile for A, Docker Engine, Docker Registry

Hosts: Host 1 OS (Linux); Host 2 OS (Linux) running Docker with Container A, Container B, Container C

Operations: Build, Push, Search, Pull, Run

Page 182: 03_Cloud Computing D

Changes and Updates

Diagram labels: Docker Engine (on two hosts), Docker Registry, Push, Update; Base Container Image (App A, Bins/Libs); Container Mod A’; Container Mod A’’ (App Δ and Bins/ diffs on top of the base image)

Host running A wants to upgrade to A’’. Requests update. Gets only diffs.

Host is now running A’’.
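The "gets only diffs" update can be sketched as a comparison of layer lists. The registry contents, layer names and `pull` function below are hypothetical stand-ins for illustration and do not use the Docker API.

```python
# Hypothetical registry: each image is an ordered list of layer names.
registry = {
    "A": ["base-bins-libs", "app-a"],
    "A''": ["base-bins-libs", "app-a", "app-delta", "bins-delta"],
}


def pull(image, local_layers):
    """Transfer only the layers of `image` that the host does not have yet."""
    missing = [layer for layer in registry[image] if layer not in local_layers]
    local_layers.update(missing)
    return missing


host_layers = set()
first_pull = pull("A", host_layers)     # initial deployment: all layers
upgrade = pull("A''", host_layers)      # upgrade: only the diff layers

print(first_pull)
print(upgrade)
```

Because the base layers are already on the host, the upgrade transfers only the two diff layers.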

Page 183: 03_Cloud Computing D

Marketecture

Phases: App Creation, App Deployment, App Management

App creation: Dockerfile, Source Code Repository, CI/CD, Dev Machine (Linux OS, Docker), Mac/Win Dev Machine (Boot2Docker)

Docker Hub (public/private): Users, Provenance, Mgmt UI, Policy, Docker Hub API; Registries (Public, Private, Curated)

Deployment targets: Physical (Prod Boxes: Linux OS, Docker), Virtual (VMs running Docker), Cloud/DaaS (GCE, RACK, IBM); Infrastructure Mgt

Operations between hosts and Docker Hub: Search, Pull, Push

Legend: Grey items are non-Docker, Inc. items. Items in italics will not be ready until 2H 2014 or later. Green is open source.

Page 184: 03_Cloud Computing D

Docker Ecosystem

Page 185: 03_Cloud Computing D

What are Containers and Docker???

Docker Stats: Community Activity

Container Downloads                                 +1.2 Million
Trained Developers                                  +45K
Dev Repos publishing containers to Docker Index     +14K
% total contributors who work outside Docker        ~95 %
Active Meetups                                      Over 70 cities in 27 countries
Integrations & growing                              OpenStack, RHEL, Ubuntu, Chef, Puppet, Salt, VMware, Google Cloud, Amazon, etc.

18 months!

Top Community Members

• Containers provide isolation similar to VMs
• High performance due to lack of hypervisor overhead
• High density due to much smaller memory footprint allows greater cloud efficiency
• Near-instant startup time accelerates DevOps cycle
• Docker Images provide portability across Linux environments

Page 186: 03_Cloud Computing D

Four major use cases

Continuous Integration/Continuous Delivery:
– Go from developer’s laptop, through automated test, to production, and through scaling without modification

Alternative form of virtualization for multi-tenant services

Scale-out/Big Data:
– Rapidly scale same application across hundreds or thousands of servers…and scale down as rapidly

Cross Cloud Deployment:
– Move the same application across multiple clouds (public, private, or hybrid) without modification or noticeable delay

Page 187: 03_Cloud Computing D

The Growth of Docker

Microsoft plans support for both Kubernetes & Docker on the Microsoft Azure platform

VMware plans 5 sessions and keynote content on containers at VMworld US

Google and Mesosphere join to bring together Mesosphere, Kubernetes and GCP

The community and vendors are quickly developing tooling: Fig consumed by Docker, Atlassian automated builds, Travis CI automated builds and Chef supported images

AWS Elastic Beanstalk adds Docker support for building and deploying containers

Page 188: 03_Cloud Computing D

BlueMix

Rich ecosystem of current and future IBM & 3rd Party services

“A platform where developers can act like kids in a sandbox - except this box is enterprise-grade.”

Page 189: 03_Cloud Computing D

BlueMix Cloud Platform Services

IBM, Open Source and Third Party APIs

Runtimes: Java Liberty, Ruby on Rails, Node.js, “Bring Your Own Buildpack”

Data Management Services: IBM Relational Database, IBM JSON Database, MongoDB, PostgreSQL, MySQL, BLU Data Warehouse, MapReduce

Mobile Services: Mobile Data, Mobile Sync, CloudCode, Mobile App Mgmt, Mobile Quality Assurance, Push

Mobile App Management

DevOps

Internet Of Things: MQTT, Historian

Other service labels on the slide: Twilio, Data Cache, Session Cache, Elastic MQ, Web & App Application, Decision, SSO, Redis, RabbitMQ, Log Analysis

Page 190: 03_Cloud Computing D

BlueMix DevOps experience

Page 191: 03_Cloud Computing D

BlueMix Application Creation & Run Flow

APPLICATION

Application Source Code (e.g. Liberty, Node): lives in JazzHub, GitHub, local file

Deployed and runs on: SoftLayer VM, OS (Ubuntu), Cloud Foundry DEA, Warden Container, Liberty Env (installed as Buildpack), Running App Code (installed as Droplet); OpenStack

Services to secure & manage the App

SERVICE

Service (e.g. Queuing service): runs anywhere (could even be a CF app with API)

Service Instance (e.g. Queue), created by:
• Call from CF CLI/ACE UI
• Auto-created from manifest
• Externally or manually from marketplaces

API: Use Services API (e.g. Put in Queue)

API / UI: Configure the service instance
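As an illustration of how a running app finds a bound service instance: Cloud Foundry exposes the credentials of bound services to the application in the `VCAP_SERVICES` environment variable as JSON. The slide does not mention this variable; the helper name `find_credentials`, the service label, the instance name and the URI below are hypothetical sample values.

```python
import json
import os


def find_credentials(instance_name, environ=os.environ):
    """Look up the credentials of a bound service instance by its name."""
    services = json.loads(environ.get("VCAP_SERVICES", "{}"))
    for instances in services.values():       # one list per service label
        for instance in instances:
            if instance.get("name") == instance_name:
                return instance.get("credentials", {})
    return None


# Hypothetical environment as the platform might set it for a bound queue.
sample_env = {
    "VCAP_SERVICES": json.dumps({
        "queue-service": [{
            "name": "my-queue",
            "label": "queue-service",
            "plan": "free",
            "credentials": {"uri": "amqp://user:secret@queue.example.com/vhost"},
        }]
    })
}

creds = find_credentials("my-queue", sample_env)
print(creds["uri"])
```

Reading credentials from the environment keeps the application code independent of where the service actually runs, which fits the slide's point that a service "runs anywhere".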

Page 192: 03_Cloud Computing D

Services Interfaces

BlueMix Service Gateway & Implementation

Lifecycle Interfaces for the Service:
– Create/Bind/Unbind/Delete Service Instance
– Change Service Plan
– Service Instance Operations (Start, Stop)
– Monitor Service Instance Status & KPIs
– Service Usage Metering & BSS
– Scale/Auto-Scale Service Instance
– Security for Service Instance Access
– Backup Service Instance

Functional Interfaces for the Service:
– Service Specific Functional Interfaces (UI/API)
– Notification Interfaces
– Admin UI for Service Instance

Consumers: BlueMix Application, New RoCApps

SERVICE INTERFACES
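The lifecycle operations named on this slide (create, bind, unbind, delete, change plan) can be sketched as an abstract interface with a toy in-memory implementation. The class and method names are hypothetical and do not reproduce the BlueMix or Cloud Foundry service broker API.

```python
import uuid
from abc import ABC, abstractmethod


class ServiceLifecycle(ABC):
    """Hypothetical lifecycle interface a service would expose to the platform."""

    @abstractmethod
    def create(self, plan): ...

    @abstractmethod
    def bind(self, instance_id, app): ...

    @abstractmethod
    def unbind(self, instance_id, app): ...

    @abstractmethod
    def delete(self, instance_id): ...

    @abstractmethod
    def change_plan(self, instance_id, plan): ...


class InMemoryQueueService(ServiceLifecycle):
    """Toy implementation that only tracks instances, plans and bindings."""

    def __init__(self):
        self.instances = {}

    def create(self, plan):
        instance_id = str(uuid.uuid4())
        self.instances[instance_id] = {"plan": plan, "bindings": set()}
        return instance_id

    def bind(self, instance_id, app):
        self.instances[instance_id]["bindings"].add(app)

    def unbind(self, instance_id, app):
        self.instances[instance_id]["bindings"].discard(app)

    def delete(self, instance_id):
        if self.instances[instance_id]["bindings"]:
            raise RuntimeError("unbind all applications before deleting")
        del self.instances[instance_id]

    def change_plan(self, instance_id, plan):
        self.instances[instance_id]["plan"] = plan


service = InMemoryQueueService()
queue_id = service.create(plan="free")
service.bind(queue_id, "my-app")
service.change_plan(queue_id, "standard")
service.unbind(queue_id, "my-app")
service.delete(queue_id)
print(service.instances)
```

Separating this lifecycle interface from the service's functional interface (for example, putting a message in a queue) matches the slide's split between lifecycle and functional interfaces.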