© 2009 IBM Corporation
Cloud Computing for a Smarter Planet
Prof. Dr. Kristof Kloeckner
CTO and General Manager, Technology, Innovation and Automation
IBM Global Technology Services
November 9, 2015
Cloud Computing Platform Services
Agenda
Recap
Origin of Cloud Platforms
– Brief Overview of Commercial Platforms
Programming Models and Platforms
References
Company Web Sites: Amazon, Microsoft, Google, IBM, Salesforce.com
Tech blogs, for instance techblog.netflix.com
http://wiki.developerforce.com/page/Multi_Tenant_Architecture
Alan Brown, Enterprise Software Delivery, Addison-Wesley 2013
Gregor Hohpe, Bobby Woolf, Enterprise Integration Patterns, Addison-Wesley 2004
Jez Humble, David Farley, Continuous Delivery, Addison-Wesley 2010
Gene Kim et al., The Phoenix Project
Craig Larman, Bas Vodde, Scaling Lean & Agile Development, Addison-Wesley 2009
Web site of the Open Group: www.opengroup.org/cloudcomputing
Mary and Tom Poppendieck, Lean Software Development: An Agile Toolkit, Addison-Wesley 2003
Eric Ries, The Lean Startup
George Reese, Cloud Application Architectures, O’Reilly 2009
John W. Rittinghouse, James F. Ransome, Cloud Computing: Implementation, Management and Security, CRC Press 2009
Andrew Tanenbaum, Maarten van Steen, Distributed Systems: Principles and Paradigms, Prentice-Hall 2009
Rich Schiesser, IT Systems Management, Prentice-Hall 2002
Jim Rymarczyk, Virtualization, Pre-Print 2009
Tivoli Service Automation Manager Solution Guide
Adam Wiggins, The Twelve-Factor App, 12factor.net
Bill Wilder, Cloud Architecture Patterns: Using Microsoft Azure, O’Reilly 2012
References – Downloads from Web
Michael Armbrust et al., Above the Clouds: A Berkeley View of Cloud Computing, Feb. 2009, http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf
Cloud Computing: Platform as a Service. InformationWeek Analytics, October 2, 2009
CSA. Top Threats to Cloud Computing V1.0 https://cloudsecurityalliance.org/topthreats/csathreats.v1.0.pdf
Cloud Use Cases White Paper Version 4, http://cloudusecases.org
DMTF: Architecture for Managing Clouds, Version 1.0.0, 2010-06-18
DMTF: Interoperable Clouds, Version 1.0.0, 2009-11-11
Luiz André Barroso and Urs Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Synthesis Lectures on Computer Architecture, 2009, http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006?cookieSet=1
Scott Crowder, Introduction to Workload Optimized Approach & Workload Market Segmentation, IBM White Paper, December 2009
David Chappell, A short introduction to Cloud, http://www.davidchappell.com/CloudPlatforms--Chappell.pdf
David Chappell, Cloud Platforms Today: A Perspective, April 2009 http://www.davidchappell.com/CloudPlatformsToday--APerspective--Chappell.pdf
Jeffrey Dean and Sanjay Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, labs.google.com/papers/mapreduce-osdi04.pdf
DeCandia et al. Dynamo: Amazon’s highly available key-value store, SOSP 2007, http://portal.acm.org/citation.cfm?id=1294281&dl=ACM&coll=ACM&CFID=47859964&CFTOKEN=98797782
European Network and Information Security Agency (ENISA), Cloud Computing: Benefits, risks and recommendations for information security, Nov 2009 (http://www.enisa.europa.eu)
Gregor Hohpe, Programming the Cloud, November 2009, http://www.enterpriseintegrationpatterns.com/docs/HohpeProgrammingCloudKeynote.pdf
Anna Liu, Architecting Cloud Applications – the essential checklist, AAF Keynote 2009
National Institute of Standards and Technology, Definition of Cloud Computing, http://csrc.nist.gov/groups/SNS/cloud-computing/
National Institute of Standards and Technology, NIST Cloud Computing Reference, Special Publication 500-292
Ning Duan et al., Tenant Behavior Analysis in Software as a Service Environment, ICSOC 2009
Daniel Nurmi et al., The Eucalyptus Open-source Cloud-computing System, http://www.cca08.org/papers/Paper32-Daniel-Nurmi.pdf
Open Cloud Manifesto, http://www.opencloudmanifesto.org/
OpenNebula.org – Various papers
B. Rochwerger et al., The Reservoir Model and Architecture for Open Federated Cloud Computing, IBM Journal of Research and Development, April 2009, http://www8.cs.umu.se/~elmroth/papers/ibmjrd2009.pdf
Werner Vogels, Eventually Consistent, ACM Queue, October 2008
Kees van Gelder, Elastic Data Warehousing in the Cloud, Vrije Univ. Amsterdam
Ying Huang et al., A Framework for Building a Low Cost, Scalable and Secured Platform for Web-Delivered Business Services, IBM Systems Journal, November 2009
Michael Yuan, Java PaaS Shootout, 4/5/11, IBM developerWorks
Agenda
Recap
Origin of Cloud Platforms
– Brief Overview of Commercial Platforms
Programming Models and Platforms
Cloud Services Spectrum
[Figure: spectrum of cloud services, from Cloud Enabled Workloads to Cloud Centric Workloads]
Attributes: Scalable, Virtualized, Elastic, Multi-tenant
Infrastructure: Standardized Infrastructure, Heterogeneous Infrastructure
Workloads: Existing Middleware Workloads, Emerging Platform Workloads
Lifecycle: Automated Lifecycle, Integrated Lifecycle
Goals: Compatibility with existing systems, Exploitation of new environments
Changes happening at the intersection of workload, application and infrastructure lifecycle models
[Figure: layered delivery model with cross-cutting functions]
Layers: Analytics, Mobile, Social Applications; Services Platform (Micro Services); Orchestration & Automation; Software-defined Infrastructure; Delivery Organization
Cross-cutting functions: DevOps; Service Management; Resiliency & Compliance
Digitization drives
Systems of Engagement
DevOps&
infrastructure flexibility
depend on
Value migration to LoBs
Infrastructure and delivery innovation
enables
Hybrid Delivery Models
Next Generation Cloud Platform
[Figure: layered cloud platform, from software defined infrastructure up to the API economy]
API Economy: External Ecosystem; Marketplace; Solutions; App; Software as a Service
Services exposed via API: Analytics, Commerce, Collaboration, Location, Data Services
Platform services: Development, Big Data & Analytics, Security, Integration, Mobile, Social
Services & Composition Patterns; API & Integration Services; Traditional Workloads
Workload definition, Optimization & Orchestration
Resource Abstraction & Optimization: Software Defined Compute, Software Defined Storage, Software Defined Networking
Next Generation Cloud Platform Architecture Built on Open Technologies
[Figure: service models shown alongside platform layers and open technologies]
Service models: Software as a Service (SaaS), Platform as a Service (PaaS), Infrastructure as a Service (IaaS)
Platform layers: API Economy, Cloud Operating Environment, Software Defined Environment
Open technologies shown: OAuth, OpenShift, cloudfoundry.org, TOSCA, OSLC
Developer Centric Platform, Marketplace & Services in a Cloud Operating Environment
Capability / Value:
OPEN ecosystem of composable services
Optimized workload deployment
Integration patterns with systems of record
Fast, automated composition of services
Repeatable patterns-of-expertise
[Figure: Cloud Operating Environment on top of the Software Defined Environment]
Cloud Operating Environment: data, mobile, development, operational, application services, security, … (cloudfoundry.org); Services & Composition Patterns; API & Integration Services; Traditional Workloads
Workload definition, Optimization, & Orchestration
Software Defined Environment: Software Defined Compute, Software Defined Storage, Software Defined Networking; Resource Abstraction & Optimization
Agile Service Composition
TASK: Create a secure application that analyses sentiment about certain topics in social media
[Figure: iterative cycle of six numbered steps, labelled ITERATE]
Steps shown:
Create app
Add database service
Extract social media data into database
Add social analytics service
Add Monitoring service instance
Secure the service
SERVICES FABRIC
[Figure: services fabric connecting enterprises and channels through APIs]
Channels: Social, Commerce, Mobile
Value-added Services: Loyalty, Promotion, Payment
Enterprises (BANK, TELCO, RETAIL): Customer Interaction, Enterprise Patterns, Enterprise Capabilities, each exposed via API
Services Pattern
API Service Management: Throttling, API-Catalog, Monitoring, Governance
API Economy
Capability / Value:
Composition of services
Marketplace of internal & external services
Rapid application development & delivery
API-accessible applications
Multi-channel integration
[Figure: platform stack with the API Economy layer highlighted]
API Economy: External Ecosystem; Marketplace; Solutions; App; services exposed via API (analytics, commerce, collaboration, location, data); OAuth
Cloud Operating Environment: data, mobile, dev, ops, application services, security, …; Services & Composition Patterns; API & Integration Services; Traditional Workloads
Workload definition, Optimization, & Orchestration
Software Defined Environment: Software Defined Compute, Software Defined Storage, Software Defined Networking; Resource Abstraction & Optimization
Next Generation Cloud Platform
[Figure: complete platform stack with service models alongside]
Service models: Software as a Service (SaaS), Platform as a Service (PaaS), Infrastructure as a Service (IaaS)
Platform layers: API Economy, Cloud Operating Environment, Software Defined Environment
External Ecosystem: Marketplace, Solutions, Application; Collaboration, Commerce, Analytics, Location, Data Services (exposed via API)
Services and Composition Patterns; API and Integration Services; Traditional Workloads
Platform services: Middleware, Mobile, Datastore Services, Security, Ops, Dev’t
Workload definition, Optimization and Orchestration
Resource abstraction and optimization: Software Defined Compute, Software Defined Storage, Software Defined Network
Agenda
Recap
Origin of Cloud Platforms
– Brief Overview of Commercial Platforms
Programming Models and Platforms
Platform Services drive Eco-System Evolution
[Figure: Platform Components positioned between Infrastructure (IaaS) and Software (SaaS, BPaaS)]
Vendors develop platform technologies to differentiate IaaS and SaaS offerings… which evolve to include *as a service capability, ultimately enabling the build of a substantial ecosystem of ISVs and developers.
1. What platform services are required to increase attraction and loyalty of customers and partners to IaaS offerings?
2. What platform services are required to efficiently deliver SaaS and BPaaS offerings and to attract a substantial ecosystem?
3. What platform services and underlying components are common and serve both purposes? Do they support different deployment options?
The PaaS Market
PaaS is often presented as the fastest growing cloud segment
o $2.9b in 2016
o 30% annual growth (Source: Gartner)
o 26% AGR through 2014 (Source: CMSWire)
But it is still a very small market
o PaaS accounts for 1% of the $109b cloud industry (Source: Gartner)
o It is expected to reach 2% of $209b in 2016 (Source: Gartner)
It is difficult to get real data about PaaS revenues
o PaaS providers (which often provide other services as well) don’t give detailed numbers; some (Azure, AWS) give combined IaaS/PaaS figures
o Google App Engine announced 250,000 active users, up from 100,000 in May 2011 (Source: Google)
o Heroku (a Salesforce acquisition), one of the biggest PaaS providers, claims to have deployed 2.3 million apps (Source: Salesforce)
o However, this only accounts for a 0.27% market share in the Alexa top 1M (Source: Datanyze)
The elements of a Cloud Application Platform
A cloud application platform combines multi-tenant hosting facilities, runtime and DevOps services and a marketplace of composable software parts
[Figure: elements of a cloud application platform and the techniques they build on]
Platform elements: DevOps; Vendor and Community Services; Fabric and Container Services; Runtime Services
Hosting Techniques: Web Hosting, Linux Container, Warden
Enterprise Grade Middleware: Java App Server, J2EE
Composable Software: Web Services, SOA, API Economy
The PaaS ecosystem
[Figure: PaaS Ecosystem stack used by Developers, with a side axis running between Technology and Business]
Hosting: Data Center or IaaS
Executable: Application Server, Frameworks; Language Run-Time; Container (Linux)
Technical APIs: Relational Database, NoSQL Database, Message Queues; Rules, Big Data, Analytics; Managed or Not
Business APIs: API Economy, Marketplace; Integration API; Business Software APIs
Application
o This diagram is a breakdown of PaaS functionalities
o “Executable” relates to the technologies required to provision, deploy and run the software components
o “Technical APIs” refers to the set of middleware typically used to write software. This middleware may or may not be offered as a managed service (for example, in Bluemix, Cloudant is a fully managed database whereas the MySQL service is not)
o “Business APIs” refers to the business-related APIs (for example, Salesforce CRM APIs) exposed to the application developer
Origin of Platform Services
Infrastructure as a Service
– Amazon
– VMware
Software as a Service
– force.com
– Microsoft Azure
Business Process (Solution) as a Service
– IBM Watson
– IBM Commerce
SAP Hana
o HTML5, JavaScript, SQLScript
o Extended App Service – XS (App Server)
o HANA DB
o Analytics (Predictive, Text mining …)
o Mobile, Big Data, Collaboration, Integration, Business Rules
SAP Infrastructure
No
SAP has built a Cloud Platform on its HANA in-memory database engine. It is used mainly to extend SAP core packages with specific analytics, reporting, visualization or integration modules.
There is a marketplace for third parties to offer their applications.
Differentiators
Google App Engine
o Python
o Java
o PHP
o Go
o Cloud SQL, Cloud Storage, BigQuery
o Google Cloud Endpoint
o Mail, SMS, Voice
o Translation API
o Task Queues, XMPP
o Search
Google Infrastructure
No
General purpose PaaS to build web applications on Google infrastructure. Fast deployment, simple administration and seamless scalability.
Developers can compose many Google services (translation, search, etc.) within their application.
Differentiators
Microsoft Azure
o C#
o Java
o PHP
o Ruby
o Visual Studio, Integration
o SQL Database, Big Data (HDInsight)
o Storage, Backup, Recovery
o API management
o Media services (live streaming, CDN…)
o Mobile App Backend
o HPC with broad partner ecosystem
Microsoft Infrastructure
No
Windows Azure started in 2008 with its PaaS Service, before launching its compute / storage service in 2012 to counter Amazon. Initially Azure was targeting Microsoft developers (.NET model).
Differentiators
Heroku
o Python
o Java
o Node.js
o Ruby
o Data Stores (Postgres, Mongo, Redis…)
o Mobile (Push, SMS, MQTT…)
o Search, Logging, Queueing, Caching …
o Analytics services
o Payments
o Monitoring, Utilities
o Media (Encoder, streaming …)
Amazon AWS
Yes, build packs compatible with Cloud Foundry
Heroku has been acquired by Salesforce but has not been merged with Force.com. It is a general purpose PaaS running on Amazon Cloud.
A large network of partners is contributing to the “Heroku Add-ons” rich set of composable building blocks.
Differentiators
AWS Elastic Beanstalk
o .NET
o Java
o PHP, Node.js, Python, Ruby
o Docker
o All AWS Services
o Database (RDS, DynamoDB …)
o Analytics (EMR, RedShift …)
o Storage (S3, Glacier …)
o Media (Encoding …)
o Amazon Marketplace
Amazon AWS
No
Portability with Docker
AWS Elastic Beanstalk automatically handles the deployment details, capacity provisioning, load balancing and application health monitoring. It is built on top of AWS components like EC2.
As part of AWS, Elastic Beanstalk lets developers combine and leverage all AWS services (> 30).
The service is free and the user only pays for the underlying AWS cloud components.
Differentiators
CloudBees
o Java, Scala and other JVM based runtimes
o PHP, JavaScript
o Node.js
o Managed MySQL
o Integration
o Partner Services (Cloudant, MongoHQ, RabbitMQ …)
o DevOps services (Continuous integration with Jenkins)
o Amazon AWS
o HP Cloud, OpenStack
o On premise
o Jenkins for continuous integration
o Tomcat, J2EE as runtime
Created by the former JBoss CTO, CloudBees provides a cloud based continuous integration platform. DEV@Cloud provides the continuous integration environment and RUN@Cloud provides the runtime platform to host the application.
Focus is on best practices around continuous multi-branch build, test and deployment by leveraging the most successful open source tools like Jenkins, Maven, Ant, Git, etc.
Differentiators
Salesforce1 Platform (force.com)
o Proprietary
o Apex
o Visualforce (Graphical IDE)
o Mobile SDK
o Cloud Database with schema builder
o Salesforce APIs
o Oracle and SAP backend
o Analytics
o Workflows
o Salesforce data centers
o No
Salesforce’s first development platform, meant to create an ecosystem around its core CRM offering. Apps can be exposed on the AppExchange. 220,000+ apps have been created so far.
There is a high focus on mobile development and on graphical “point and click” development. As such, Salesforce APIs and back end APIs for SAP / Oracle are exposed in the development studio in order for companies to create mobile or web interfaces, integration points or specific reports.
Differentiators
Pivotal Web Services
o Based on Cloud Foundry
o MySQL, MongoDB, PostgreSQL
o MemCached
o Message queues
o Search
o Load testing
o Email
o Pivotal data centers
o Yes (Cloud Foundry)
Pivotal is an EMC / VMware spinoff at the heart of the Open Source Cloud Foundry Project.
All powered by Cloud Foundry, Pivotal provides an Enterprise aPaaS to be run on private clouds and operates Pivotal Web Services as a public aPaaS.
The services marketplace is embryonic at this stage, Pivotal being more focused on the agile development paradigm (Pivotal Labs). However, Pivotal recently invested heavily in Big Data components with Hadoop, an analytic database as well as an in-memory & real-time data store.
Differentiators
Red Hat OpenShift
o Java, Java EE (JBoss EAP)
o Ruby, PHP, Node.js
o Python
o Perl
o OpenShift marketplace
o Messaging
o Data Stores
o Monitoring
o Email
o Search
o Amazon Web Services
o Yes
OpenShift originated from a Red Hat open source PaaS initiative. Based on this foundation, Red Hat is now offering OpenShift Enterprise, which is a private application platform for on-premises deployment, and OpenShift Online, which is Red Hat’s operated PaaS.
The OpenShift service marketplace exposes the usual database, messaging, search and other services. There is no differentiation here when compared to Heroku, Bluemix or Pivotal.
Differentiators
IBM Bluemix
o Java, Java EE (Liberty WAS)
o SDK for Node.js
o Ruby on Rails, Ruby Sinatra
o All Cloud Foundry compatible build packs
o Mobile Services (Push, Quality Assurance)
o Web: Workflow, Rules, Messaging, Cache
o Databases: MySQL, DB2, Cloudant, Mongo
o Big Data: Warehouse, Hadoop
o Security, monitoring, integration
o Internet of Things
o SoftLayer
o Yes (Cloud Foundry)
From a development point of view, IBM Bluemix can be complemented with DevOps Services for Bluemix (formerly JazzHub) and solutions from Service Engage (Application Performance Management), delivering a full set of Application Lifecycle Management tools.
On the services side, IBM is aggressively including many of its software portfolio flagships like DB2 database, rules engine, workflow engine, Hadoop powered by BigInsights, Analytics powered by IBM BLU Acceleration and an Internet of Things framework.
Differentiators
The PaaS Diagram
It is helpful to classify the PaaS offerings along two axes: focus on development and focus on composable services.
o Many service providers provide PaaS with a specific focus.
o CloudBees clearly focuses on development
o Salesforce force.com clearly focuses on service composition (around CRM functionalities)
o The horizontal axis represents the breadth of composable services
o The vertical axis represents the breadth of development services
o Another distinction is made between general purpose PaaS and SaaS related PaaS (PaaS designed to complement, integrate, enrich core SaaS)
[Figure: two-axis diagram. Horizontal axis: Breadth of composable services. Vertical axis: Breadth of development supporting services. Plotted: CloudBees, Force.com. Legend: General purpose aPaaS; aPaaS as a SaaS add on service]
The PaaS Diagram
[Figure: PaaS offerings plotted on two axes]
Horizontal axis: Breadth of composable services (Focus on Composable Services)
Vertical axis: Breadth of development services (Focus on Development): Run times, Frameworks, Lifecycle Mgt, Continuous Integration
Offerings plotted: SAP HANA, Salesforce Force.com, IBM Bluemix, Google App Engine, AWS Elastic Beanstalk, Heroku, MS Azure, CloudBees, Pivotal Web Services, Red Hat OpenShift, HP Cloud Application Platform, Oracle PaaS, dotCloud PaaS
Legend: General purpose aPaaS; aPaaS as a SaaS add on service
Regions: General purpose aPaaS leaders; Business integration and personalization aPaaS; Development visionaries
Conclusion
General
o PaaS is a balanced combination (with different weights for different competitors) of DevOps services (development, test, deployment, auto scaling, monitoring, etc.) and API based building blocks (Technical APIs like databases or rules and Business APIs like CRM or SAP components)
o Many services are common to all competitors (it is in third parties' interest to be part of most of the API marketplaces)
o Differentiation on services comes from the aPaaS provider’s own portfolio
Application portability
o Application portability between different PaaS is still difficult … and almost impossible if your application is built around a provider’s specific building blocks!
Agenda
Recap
Origin of Cloud Platforms
– Brief Overview of Commercial Platforms
Programming Models and Platforms
Agenda: Programming Models and Platforms
Evolving Programming Models – Overview
Extensions to traditional programming models – Middleware patterns in the cloud
Loosely coupled, relaxed consistency
– Amazon Web Services
– Microsoft Azure
– Google, NoSQL
Content centric
– Hadoop, Apache Spark
Database centric
– Pangoo
– Salesforce
Cloud Reference Architecture – Focus on PaaS
The capabilities required in a PaaS stack map to SOA
Core elements of the software stack have not changed, the delivery platform has
The stacks we will be looking at expose virtualization at different levels
Five Emerging Cloud Architectures
Virtualized Traditional - Extensions of Java Application Servers, Support for ‘Traditional’ Transactional Workloads (Cloud enabled)
– Moving existing workloads to the cloud
– Requires best practices, patterns, tooling
Database Centric - data driven + small computation on small data
– With multi-tenancy attractive for enterprise and service providers
Content Centric - computation needs to be close to data + large computation on large data
– Data Mining, Analytics, Data Warehouse
Loosely Coupled - computation and data are separate
– Can be addressed by existing middleware, but ‘relaxed consistency’ models emerging
– Cloud-centric approaches
Storage Analytics - Data and Storage Integration
Agenda
Evolving Programming Models – Overview
Extensions to traditional programming models – Middleware patterns in the cloud
Loosely coupled, relaxed consistency
– Amazon Web Services
– Microsoft Azure
– Google, NoSQL
Content centric
– Hadoop, Apache Spark
Database centric
– Pangoo
– Salesforce.com
Private Cloud Evolution – Starting Point for Cloud Middleware Patterns
1. Hardware Virtualization
• Virtualization of hardware resources in the data center
• Management of virtualized infrastructure
2. Image Virtualization
• Virtualized infrastructure leads to creation of “virtual” software images
• Proliferation of virtual software images leads to management challenges
3. Workload Virtualization
• Images are combined into patterns representing middleware workloads
• Workloads encapsulate well defined combinations of integrated middleware
Corresponding focus: Infrastructure Management, Image Management, Integrated Middleware
Key Differentiators for Integrated Middleware
Awareness and optimizations for specific workloads
– Integrated stacks of middleware optimized for particular workloads
Consolidating workloads under a simplified management system
– Expose radically simplified management model optimized for specific workloads
– Pattern based deployments for most common workloads
Full lifecycle management
– Go beyond provisioning to full lifecycle (update, failure recovery, growth, problem determination)
Elastic, efficient, multi-tenant and automated management and execution of application workloads
– Integrated monitoring, metering, logging, security, caching, etc.
– Automated policies for resource consumption and balancing
– Optimized resource utilization of middleware in virtualized environments
The Simplicity of Workload-Centric Cloud
Virtualized Middleware can be deployed in different ways
Image Management
Automated provisioning of middleware
Integrated middleware with cloud capabilities
Patterns of Expertise: Proven best practices and expertise for complex tasks learned from decades of client and partner engagements that are captured, lab tested and optimized into a deployable form
[Figure labels: Monitoring, Lifecycle Management]
What is a Pattern?
• The pre-defined architecture of an application
• For each component of the application (i.e. database, web server, etc.)
  • Pre-installation on an operating system
  • Pre-integration across components
  • Pre-configured & tuned
  • Pre-configured Monitoring
  • Pre-configured Security
  • Lifecycle Management
• In a deployable form, resulting in repeatable deployment with full lifecycle management
• Delivering superior results:
  • Agility: Faster time-to-value
  • Efficiency: Reduced costs and resources
  • Simplicity: Simpler skills requirements
Companies typically approach the Cloud One Step at a Time
VMware View of Cloud Adoption
Private cloud path: Cloud Interested; Cloud Ready; Early Private Cloud; Mature Private Cloud; Hybrid Cloud
Public cloud path: Public Cloud Experimentation; Public Cloud Adoption and Commitment
vCloud Virtualized: For VM Hosting
• Service runs on VMware vSphere
vCloud Express: Software developer-focused cloud service
• Credit card billed pay-for-use
vCloud Datacenter: Enterprise IT focused cloud service
• Globally consistent, VMware certified, to meet enterprise security and performance requirements
vCloud Powered: VMware compatible cloud service
• Service runs on vSphere and vCloud Director
• Delivers increased agility, reduced costs, IT control, application portability
VMware Cloud Offerings extending Basic Virtualization
Infrastructure Services: ESX, vCenter, vCloud
Operational Services: Monitoring, HA/DR, Chargeback, Capacity
Development Services: Spring, Java, Python, PHP, Ruby etc.
Application Services: Cloud Centric, Cloud Enabled, Existing
Integration across services with vCloud & CloudFoundry
VMware Cloud Foundry PaaS
Agenda
Evolving Programming Models – Overview
Extensions to traditional programming models – Middleware patterns in the cloud
Loosely coupled, relaxed consistency
– Amazon Web Services (Amazon material, best practices from A. Trossman, IBM)
– Microsoft Azure
– Google, NoSQL
Content centric
– Hadoop, Apache Spark
Database centric
– Pangoo, Salesforce.com
Critical elements of a loosely coupled model
Application services accessed via REST/SOAP messages
• Storage services
• Data services
• Queuing/messaging services
• Execution services (virtualized hardware)
Design to minimize operational costs - up front
• e.g. recognize that some part of the platform will fail (storage, DB, application) and design this into the application
• Don’t debug - kill/freeze the execution instance
Eventual Consistency for Data Handling & Replication - sometimes the data storage service or database service will return the wrong answer
Message queue - will deliver messages at least once, possibly more than once
Asynchronous - scale is achieved by recognizing components that can operate in parallel
• Session/state information stored outside the application components
Commodity “parts” can come and go, the rest of the system does not fail
• Both for infrastructure parts, as well as for application parts
Redundant (idempotent) execution is fine for infrastructure working AND for application semantics
• Without that, very strict guarantees on application state will be required, making the cost of execution very high
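The points on at-least-once delivery and idempotent execution can be illustrated with a minimal Python sketch. It is not taken from the slides: the `Message` and `IdempotentConsumer` names and the in-memory `seen_ids` set are hypothetical. A real deployment would keep the processed IDs in an external store, in line with keeping session/state information outside the application components.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    message_id: str  # hypothetical unique ID assigned by the sender
    amount: int


@dataclass
class IdempotentConsumer:
    """Applies each message at most once, even if the queue redelivers it."""
    balance: int = 0
    seen_ids: set = field(default_factory=set)  # assumption: in-memory for illustration only

    def handle(self, msg: Message) -> bool:
        # A redelivered message is acknowledged but not applied a second time.
        if msg.message_id in self.seen_ids:
            return False
        self.seen_ids.add(msg.message_id)
        self.balance += msg.amount
        return True


consumer = IdempotentConsumer()
# The queue delivers "m1" twice, as an at-least-once queue may do.
deliveries = [Message("m1", 10), Message("m2", 5), Message("m1", 10)]
applied = [consumer.handle(m) for m in deliveries]
```

Because the duplicate is detected by ID, the consumer can safely be restarted or run redundantly without strict guarantees from the queue.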
Eventual Consistency (see Vogels or DeCandia et al.)
Eric Brewer’s CAP Theorem
– Of 3 properties of a shared data system (consistency, availability, tolerance to network partitioning/failure) only 2 can be achieved simultaneously
Strategies for availability all depend on data replication
– Quorum approaches with N = Number of Replicas, R = Read Quorum, W = Write Quorum guarantee consistency if R + W > N
– Systems focusing on fault tolerance often use N=3, W=R=2
Other requirements (e.g. high load) require large N. If few writes, often R=1
To minimize likelihood of lost writes, choose W>1
Very large distributed systems have to live with network partitioning
If read and write set don’t overlap, we cannot achieve strong consistency, but this is often combined with a ‘lazy’ update approach to eventually update all nodes
– Good example: Shopping cart
– Amazon shopping cart prioritizes availability for write
Other considerations: Failure detection
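The quorum condition R + W > N can be checked mechanically. The following Python sketch is an illustration only; the function names are hypothetical and it models only the counting argument (how many replicas every read set must share with every write set), not a real replication protocol.

```python
def min_overlap(n: int, r: int, w: int) -> int:
    """Smallest possible number of replicas shared by any read set and any write set."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return max(0, r + w - n)


def is_strongly_consistent(n: int, r: int, w: int) -> bool:
    """True when every read quorum is guaranteed to see the latest write (R + W > N)."""
    return min_overlap(n, r, w) >= 1


# Configurations mentioned on the slide, plus a read-optimized variant.
fault_tolerant = is_strongly_consistent(n=3, r=2, w=2)   # N=3, W=R=2
read_optimized = is_strongly_consistent(n=3, r=1, w=3)   # R=1 needs W=N to stay consistent
eventual_only = is_strongly_consistent(n=3, r=1, w=1)    # sets may not overlap
```

With R=1 and W=1 on three replicas a read can hit a replica that the last write never reached, which is the case where a lazy update is needed to bring all nodes up to date eventually.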
The ‘new ACID’ (Gregor Hohpe, Google 2009)
Old ACID – predictive and accurate
– Atomic
– Consistent
– Isolated
– Durable
New ACID – flexible and redundant
– Associative (grouping)
– Commutative (order)
– Idempotent (repetition)
– Distributed
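As a hedged illustration of the associative, commutative and idempotent properties, the Python sketch below merges shopping-cart updates with set union. The choice of set union and the `merge` name are assumptions made for this example, not something the slides prescribe; the point is that replicas applying the same updates in any grouping, any order, and any number of times end up in the same state.

```python
from functools import reduce
from itertools import permutations


def merge(a: frozenset, b: frozenset) -> frozenset:
    """Set union: associative, commutative and idempotent."""
    return a | b


# Hypothetical cart updates arriving at different replicas.
updates = [frozenset({"book"}), frozenset({"pen"}), frozenset({"book", "lamp"})]

# Commutative: every delivery order yields the same final state.
results = {reduce(merge, order, frozenset()) for order in permutations(updates)}

# Idempotent: replaying all updates a second time changes nothing.
replayed = reduce(merge, updates + updates, frozenset())

# Associative: grouping of the merges does not matter.
a, b, c = updates
grouping_is_irrelevant = merge(merge(a, b), c) == merge(a, merge(b, c))
```

Operations such as "subtract 10 from the balance" do not have these properties, which is why they need IDs or other deduplication when delivery is at-least-once.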
The Twelve Factors for aaS Applications (12factor.net, Adam Wiggins)
“The twelve-factor app is a methodology for building software-as-a-service apps that:
Use declarative formats for setup automation, to minimize time and cost for new developers joining the project;
Have a clean contract with the underlying operating system, offering maximum portability between execution environments;
Are suitable for deployment on modern cloud platforms, obviating the need for servers and systems administration;
Minimize divergence between development and production, enabling continuous deployment for maximum agility;
And can scale up without significant changes to tooling, architecture, or development practices.
The twelve-factor methodology can be applied to apps written in any programming language, and which use any combination of backing services (database, queue, memory cache, etc).”
(Quote from web site)
The Twelve Factors for aaS Applications
I. Codebase
One codebase tracked in revision control, many deploys
II. Dependencies
Explicitly declare and isolate dependencies
III. Config
Store config in the environment
IV. Backing Services
Treat backing services as attached resources
V. Build, release, run
Strictly separate build and run stages
VI. Processes
Execute the app as one or more stateless processes
VII. Port binding
Export services via port binding
The Twelve Factors for aaS Applications (continued)
VIII. Concurrency
Scale out via the process model
IX. Disposability
Maximize robustness with fast startup and graceful shutdown
X. Dev/prod parity
Keep development, staging, and production as similar as possible
XI. Logs
Treat logs as event streams
XII. Admin processes
Run admin/management tasks as one-off processes
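Factors III (Config), IV (Backing Services) and VII (Port binding) can be sketched together in a few lines of Python. This is an illustrative sketch only: the variable names `DATABASE_URL`, `PORT` and `DEBUG` and the `Settings` class are assumptions for the example, not names mandated by the methodology.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str  # backing service attached via a URL (factor IV)
    port: int          # port the app binds to (factor VII)
    debug: bool


def load_settings(env=None) -> Settings:
    """Read all deploy-specific config from the environment (factor III)."""
    env = os.environ if env is None else env
    return Settings(
        database_url=env["DATABASE_URL"],  # required: raises KeyError if missing
        port=int(env.get("PORT", "8080")),
        debug=env.get("DEBUG", "false").lower() == "true",
    )


# The same code runs unchanged in every deploy; only the environment differs.
# A plain dict stands in for the process environment here.
settings = load_settings({"DATABASE_URL": "postgres://localhost/dev", "PORT": "5000"})
```

Swapping the local database for a managed one then means changing one environment variable, with no code change and no rebuild.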
Microservices
The term "Microservice Architecture" has sprung up over the last few years to describe a particular way of designing software applications as suites of independently deployable services. While there is no precise definition of this architectural style, there are certain common characteristics around organization around business capability, automated deployment, intelligence in the endpoints, and decentralized control of languages and data.
From: http://martinfowler.com/articles/microservices.html
Moving from monolithic applications to micro-services
[Figure: two panels, “Monolithic app” and “Micro services”, each with a “Scaling” illustration]
Properties of a micro-service architecture
Compartmentalized business capability
Cross-functional teams
Communication via API ONLY!!
Use messaging to remove peer-to-peer dependencies
REST communication
Decentralized data
Design for failure
Pluggable architecture
Enables continuous delivery
Micro-services do have a cost
Simple services but complex distributed systems
IT overhead
– Configuration management
– HA/DR for each service
– Capacity
– High degree of automation
API management is a must
The asynchronous nature of communication is difficult
DevOps skills are a must
Good reads!!
Automate deployments using production-like environments and accelerate delivery cycles
A view into the cultural challenges of adopting DevOps and best practices
Patterns for building resilient and robust applications
Core Concepts
Cloud drives changes to business models – economies of sharing and consumption-based pricing. Being fast is more important than getting it completely right
Hybrid clouds – systems of engagement and systems of record
API Economy
Components fail, deal with it – focus on recoverability
CAP Theorem – can’t have consistency, availability and (network) partition tolerance all at once
Relaxed (eventual) consistency, actual implementations driven by data replication strategies
Microservices
Containers
Patterns and Orchestration
DevOps – software delivery lifecycle as an accelerated feedback loop
AWS
History and Evolution Main Elements Best Practices
Since going public in 1997, Amazon has launched several new businesses to grow annual revenues from $148M to $61B
Source: Amazon 10Qs
May 1997 IPO
– Split-adjusted stock price: $2
– Market cap: $438M
– 1997 revenue: $150M
– $1B in TTM revenue by EOY 1998
June 2013
– Stock price: $277
– Market cap: $126B
– 2012 revenue: $61B
– Operating margin: 1.04%
– Operating FCF: $4.25B
– Employees: 88,400
[Chart: Amazon.com revenue ($B) and stock price ($) over time, annotated with the milestones below]
1997-2004: New retail categories - Apparel, Jewelry, Wedding Registry, Health & Beauty, Home Décor, Sports and Outdoors, Office Supplies, Electronics, Mobile
2005: Amazon Prime
2006: AWS launch, Grocery, Webstore & Fulfillment, Unbox Video Download
2007: AWS EC2 & S3 for Europe, Kindle, Amazon MP3, Direct to Kindle Publishing
2008: CloudFront, AWS Elastic Block Store, Audible acquisition
2009: AWS enters Asia, Kindle 2 + Kindle iPhone app, Zappos.com acquisition
2010: Amazon Studios, LivingSocial investment, Kindle for BlackBerry, Mac, iPad, Android
2011: Cloud Drive, AWS CloudFormation, App store for Android, Prime Instant Videos, more Kindle than print books
Virtual private cloud
2012: AWS re:Invent, AWS Marketplace, AWS Glacier, AWS Redshift, 6 original series pilots
Amazon started by monetizing the under-utilized Amazon.com infrastructure to lay the foundation of AWS
– The nature of Amazon.com’s business required them to build capacity sufficient to handle peak holiday shopping + 15% headroom
– This resulted in over 76% excess capacity on an annualized basis
– Amazon saw an opportunity in this excess capacity and began leasing simple compute and storage services
– AWS has now grown to ~$2B in revenue with operating margin between 7% and 14%
[Chart: Typical annualized infrastructure utilization at Amazon.com, showing annualized utilization, annualized idle infrastructure, and 15% headroom]
Source: 2013 AWS Summit keynote speech, Andy Jassy, SVP AWS
Since its launch in 2006, Amazon Web Services has grown geographically, expanded offerings, and attracted major clients
[Timeline spanning 2006, 2007, 2008, 2009, 2010, 2011, and 2012-13, with the following milestones]
– Amazon.com completed full migration to EC2
– Added Data Center in Asia to reduce local latency
– Netflix added as a client
– Collaborated with Sun Microsystems for open source enterprise offering
– Launched Elastic Block Storage
– Amazon.com began using AWS to monitor website performance
– Amazon launched AWS
– Launched S3 and EC2
– IMDb added as a client
– Suffered major, 4 day outage, disrupting many customers
– EC2, S3, VPC obtained FISMA accreditation
– Elastic Beanstalk (PaaS) launched
– Added Data Center in South America
– Amazon.com began website migration to EC2
– Flexible Payments service, Virtual Private Cloud, and Relational Database launched
– Zynga added as major client
– SimpleDB launched
– Expanded into Europe
– Launched AWS Marketplace
– AWS achieves FedRAMP certification
– Christmas Eve outages affect Netflix and other large clients
Acronyms
– EC2: Elastic Compute Cloud
– S3: Simple Storage Service
– VPC: Virtual Private Cloud
Sources: Company Materials (website, AR), Morgan Keegan research, UBS research, Cowen & Company research, Bain analysis
AWS Services: loosely coupled style
S3 – Storage
What is it?
• Uses standards-based REST and SOAP interfaces designed to work with any Internet-development toolkit.
• S3 is built on a distributed architecture; data is stored redundantly.
• Each object is stored in a bucket and retrieved via a unique, developer-assigned key.
• A bucket can be located in the United States or in Europe. All objects within the bucket will be stored in the bucket’s location, but the objects can be accessed from anywhere.
What’s different about it?
• S3 will fail on individual reads/writes as a component, but the system remains reliable.
• Apps are expected to be designed “loosely coupled” to take this into account.
• Not a filesystem. Objects are not files.
• Not for transaction processing.
• Data redundancy takes minutes: you cannot be assured that an object you created/updated in S3 will be immediately available to other S3 applications.
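The last point, that a freshly written object may not be visible right away, is what a loosely coupled client has to plan for. The following is a minimal sketch under stated assumptions: `ToyObjectStore` is a hypothetical in-memory stand-in (not the S3 API) that hides a new object for a fixed number of reads, and `get_with_retry` is an illustrative helper that retries with exponential backoff instead of assuming read-after-write visibility.

```python
import time


class ToyObjectStore:
    """Hypothetical in-memory bucket/key store that simulates replication lag."""

    def __init__(self, lag_reads=2):
        self._objects = {}          # (bucket, key) -> [data, reads until visible]
        self._lag_reads = lag_reads

    def put(self, bucket, key, data):
        self._objects[(bucket, key)] = [data, self._lag_reads]

    def get(self, bucket, key):
        entry = self._objects.get((bucket, key))
        if entry is None:
            raise KeyError(key)
        if entry[1] > 0:
            entry[1] -= 1
            raise KeyError(key)     # written, but not yet visible to readers
        return entry[0]


def get_with_retry(store, bucket, key, attempts=5, delay=0.0):
    """Retry a read with exponential backoff; re-raise after the last attempt."""
    for attempt in range(attempts):
        try:
            return store.get(bucket, key)
        except KeyError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay * (2 ** attempt))


if __name__ == "__main__":
    store = ToyObjectStore(lag_reads=2)
    store.put("videos", "clip-1", b"raw bytes")
    print(get_with_retry(store, "videos", "clip-1"))
```

The caller treats "not found yet" as a transient condition with a bounded number of retries, rather than as proof that the write failed.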
AWS Services: loosely coupled style
EC2 – virtual computing environment
What is it?
• Provides “instances”: virtual machines/hardware that run in EC2; based on XenSource
• Images can be shared or rented out to others (Paid AMI thru DevPay)
What’s different about it?
• Application instances & data are coupled: EC2 does not automatically save data outside its environment
• Instance rebooted: transient data not lost. Instance shut down or failed: data lost
• Can recycle images to avoid runtime bugs/problems such as memory leaks, race conditions, etc., and freeze images for off-line debugging.
• From the beginning, a developer needs to factor long-term persistence into the application design for when apps fail for whatever reason (S3 down, network connection down, etc.)
• Automated management of EC2 images is in an early phase. Most applications have rolled their own
AWS Services
SQS – Simple Queue Service
What is it?
• Access to SQS thru SOAP services
• Highly scalable, distributed, hosted queue to reduce/eliminate app-to-app dependencies
• All messages are stored redundantly across multiple servers and data centers
• Developers can create an unlimited number of Amazon SQS queues, each of which can send & receive an unlimited number of messages.
• Message body can contain up to 8 KB of text in any format.
• A message is “locked” while a computer is processing it, keeping other computers from trying to process it simultaneously. If processing fails, the lock will expire and the message will again be available.
What’s different about it?
• It’s more than a simple queue: applications interact by telling SQS the estimated processing time = workflow
• Message may not be delivered immediately
• Load balancing model is asynchronous: lots of instances could be taking work off the queue, in different data centers
• Asynchronous: state/session information is stored in SQS where possible
• Messages will end up being delivered more than once in some cases; the application has to deal with it.
• Workload (the number of messages on the queue for an application) is estimated mathematically from sampled queues
• Pricing still a drawback to broader adoption
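The “locked while processing” behavior and the at-least-once delivery described above can be illustrated with a small, self-contained sketch. `ToyQueue` is a hypothetical in-memory model, not the SQS API; the clock is passed in explicitly as `now` so the behavior is deterministic, and `process_once` shows one simple way for a consumer to tolerate duplicate deliveries by remembering handled message ids.

```python
import itertools


class ToyQueue:
    """Hypothetical in-memory queue with a visibility timeout (not the SQS API)."""

    def __init__(self, visibility_timeout=30.0):
        self._timeout = visibility_timeout
        self._messages = {}             # message id -> [body, locked_until]
        self._ids = itertools.count(1)

    def send(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, 0.0]
        return msg_id

    def receive(self, now):
        """Return (id, body) of one visible message and lock it, else None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self._timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """Acknowledge successful processing; the message never reappears."""
        self._messages.pop(msg_id, None)


def process_once(queue, now, handled, handler):
    """Idempotent consumer: run the handler only for ids not handled before."""
    received = queue.receive(now)
    if received is None:
        return False
    msg_id, body = received
    if msg_id not in handled:
        handler(body)       # if this raises, the lock simply expires later
        handled.add(msg_id)
    queue.delete(msg_id)
    return True
```

If a worker crashes between `receive` and `delete`, the message becomes visible again once the lock expires, which is why the consumer side needs to be idempotent.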
AWS’ source of differentiation is “good enough” technology delivered at the lowest prices owing to scale
[Panels: AWS focus on customer value; AWS scale and innovation; AWS achieving scale across three dimensions]
– Transfer value to customers through price reductions
– Drove greater innovation through ecosystem and scale
– Be proactive through infrastructure audits to increase customer satisfaction and value
– Basic compute price is 40% lower than next cheapest competitor*
Cycle: Reduce Prices → More Customers → More Usage → More Infrastructure → Economies of Scale → Lower Costs
31 AWS price reductions since 2006
Source: 2013 AWS Summit keynote speech, Andy Jassy, SVP AWS; BCG Server Count Estimate Model
*Price per ~2GB RAM Linux on-demand instance hour
The AWS technology stack has expanded from the original EC2 Compute and S3 Storage offerings
Source: Amazon Website
Amazon terminology, from the top of the value chain (PaaS) to the bottom (IaaS):
– Applications: CloudSearch; SES (Simple Email Service); SNS (Simple Notification Service); SQS (Simple Queue Service); SWF (Simple Workflow); Elastic Transcoder
– Deployment & Management: Elastic Beanstalk; CloudWatch; Data Pipeline; CloudFormation; IAM (Identity & Access Management); OpsWorks
– Database: DynamoDB; RDS (Relational DB Service); ElastiCache; SimpleDB; Redshift
– Storage & Content Delivery: S3 (Simple Storage Service); EBS (Elastic Block Storage); Glacier; CloudFront
– Compute & Networking: EC2 (Elastic Compute Cloud); Elastic MapReduce; Route 53 (DNS Service); Direct Connect; VPC (Virtual Private Cloud)
Callouts:
– EC2 (Compute) + S3 (Storage) are the original and foundational offerings of AWS
– Redshift: petabyte scale data warehouse service
– CloudFront: content management and delivery
– Glacier: archive and backup
– VPC: a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you define
– Applications: “Apps” in Amazon parlance, more accurately PaaS/Middleware. Either way, moving up the value chain.
AWS is moving up the IT value chain over time as they introduce higher value services in PaaS and SaaS
[Chart: AWS Services Moving to Higher Value Over Time; x-axis: 2006 to 2013; y-axis: increasing value]
Source: AWS company website
Amazon Elastic Beanstalk
[Diagram: AWS Elastic Beanstalk app. Labels: Environment (http://myapp-staging.elasticbeanstalk.com), Version, Running Application, Elastic Load Balancer, Autoscaling, EC2 Instances (Amazon Linux AMI, Apache, Tomcat, Elastic Beanstalk Host Manager), S3]
Amazon re:Invent 2015 New Offering Announcements
Announcement | Value | Category | Available | IBM Equivalent | Impact
AWS Quicksight | Business Intelligence service with visualization support | Analytics | Preview | Cognos | Potential major threat to IBM’s new installs of BI
Kinesis Firehose | Loads streaming data from Kinesis to S3 or Redshift data stores | Analytics | GA | Stream Analytics, InfoSphere Streams | AWS making it easy to transform transient data and have it persist in their cloud
Kinesis Streams | New feature extends temporary storage from 24 hours to 7 days | Analytics | GA | InfoSphere Streams |
Kinesis Analytics | A way to run standard SQL queries against streaming data | Analytics | Coming soon | |
Snowball | 50 TB hardened enclosure used to ship customer data to AWS | DB / Storage | GA (selected regions) | No equivalent in SoftLayer / Bluemix | Unique offering from AWS removes barrier to their cloud
MariaDB Support | Open source, MySQL compatible database | DB | GA | None | Advantage?
Database Migration Services | Migrates customer databases to AWS | DB | Preview | IBM Database Conversion Workbench / IBM InfoSphere CDC | AWS lowering the barrier to adoption of their public cloud
Schema Conversion Tool | Converts proprietary database schemas, stored procedures, views to AWS | DB | Preview | IBM InfoSphere Change Data Capture (formerly DataMirror) | AWS lowering the barrier to adoption of their public cloud
Amazon Inspector | Automated security assessment in AWS cloud | Compliance | Preview | Security Compliance Service | AWS continuing to expand their gov’t presence
Accenture AWS Business Group | | Partnership | Announcement of partnership / no specific details | GTS, GBS | Instant AWS consulting unit
Announcement | Value | Category | Available | IBM Equivalent | Impact
AWS WAF | Web Application Firewall | Security & Identity | GA | NetScaler VPX Application Delivery Controller on SL |
EC2 Dedicated Host | Visibility and control over how instances are placed on physical server | Compute | Coming Soon | Bare Metal | IBM seems to still lead in this area, however bare metal is “old-world”
Config Rules | A set of cloud governance capabilities that allow IT Administrators to define best practices for provisioning and configuring AWS resources and then continuously monitor compliance with those guidelines. | Compliance | Preview | |
CloudWatch Dashboards | Console enables you to create re-usable graphs of AWS resources and custom metrics so you can quickly monitor operational status and identify issues at a glance. | Management Tools | GA | |
ElasticSearch Service | Managed service for deploying, operating ElasticSearch on AWS | Analytics | GA | ElasticSearch |
Announcement | Value | Category | Available | IBM Equivalent | Impact
Architecture Whitepaper | All experiences from talking with customers and identifying best practices | Training / Education | Now | |
EC2 VMs – X1 | Intel Xeon E7 V3 - 2TB data | Compute | 1H2016 | None | Reduces performance advantage of IBM Bare metal
EC2 VMs – t2.nano | 512 MB. Small, easy, quick | Compute | Later this year | | AWS lowering costs even further for developers with this tiny instance
Amazon EC2 Container Registry | A secure, fully-managed Docker container registry. Manages Docker container images, making it easier to store and deploy them | Compute | GA | Docker Registry |
AWS Lambda Enhancements | Access to services running in a Virtual Private Cloud; functions written in Python; long running functions; scheduled functions; custom retry logic | Compute | GA | None | Strengthened advantage in server-less services
AWS Mobile Hub | Offers a quick and easy way to create mobile apps that use certain AWS services | Mobile | Beta now | Bluemix DevOps, MobileFirst Platform Foundation | AWS catching up with Bluemix Mobile services
AWS IoT | End to end IoT services | IoT | Beta now | IoT Foundation | AWS catching up with IoT at the developer level by offering free SDKs and partnerships with device manufacturers.
Best Practices (Andrew Trossman)
Image management
– Launch parameters
– S3, CVS, SVN
– Image Style Management
Release upgrades
Cluster everything (redundancy)
Dynamically respond
– Faults
– Demand
Processing Pipeline of Loosely Coupled Services
Conclusions
Image Management
Change makes 100% images impractical
Boot scripts combined with a homogeneous environment work
Image + Launch Parameters ~= Image
– Extremely repeatable and reliable
– Less storage
– Tolerates change better
Example template
– Builds server from script
– Pulls content/code from repository
Image Style Management
Avoid Heisenbugs – cycle VMs regularly
Simple patches update “image”
– Automatically rolled out via regular cycling
Never “fix” by hand
Always “replace” the image
Release Upgrades
Completely rebuild parallel environment
– Test
– Cut over data
– Change DNS
– Decommission old when confident
Cheaper to “replace than fix”
Traditional “fix” process with staging etc.
– IBM identified 2/3 human effort dedicated to this process
Cluster Everything
Everything fails – applications must accommodate
Transparent redundancy
Seamless failover
Monitoring & Events
Scalr: Dynamic Response to Demand & Availability
Always Respond By Cloning
Resist urge to “fix” in place
Most bugs are application bugs
Traditional QA is good at removing all but the Heisenbugs
Clone instance brings a “fresh” server to replace the faulty one.
– This gets past Heisenbugs
– Enables “off-line” problem determination
“Roll Forward” in the cloud
Scalr Process Flow
Pipeline of Loosely Coupled Services
A Simplified Example: Video Transcoding Web Site
Client assumed to be: Web Application Layer
1. End users submit videos to be transcoded to the website (S3)
2. Request message is placed in the Amazon SQS incoming queue with a pointer to the video and to the target video format in the message
3. The transcoding engine, running on a set of Amazon EC2 instances, reads the request message from the incoming queue
4. The engine retrieves, transcodes, and returns the video to S3
5a. Metadata about the video (e.g., format, date created and length) can be indexed into Amazon SimpleDB for easy query
5b. Response message is placed in the outgoing queue and sent to user with a pointer to the converted video
Sources: Amazon.com, MI Analysis
Service Oriented Platform of Amazon’s Architecture
http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf
Examples
Frontend servers (x 3) - Medium instance (IO/Memory) - App & Cache servers
MySQL servers (x 6) - Medium instance (IO/Memory) - MySQL 5.1 w/ replication - Backup to S3 every 4 hours
Index servers (x 2) - X-Large (CPU/IO) - EBS volumes for IO throughput - EBS snapshots for backup
Infrastructure servers (x 3) - Dist. Logger (Medium – IO) - Analytics Server (Medium – IO) - Messaging Server (Small)
Crawlers (x ~70) - Small instance (Network IO) - Automated build & boot
Staging (x 3) - Medium / Small instances - Scratch space for internal use
Soocial
Observations from 6 startups on AWS (12 – 100s of AMIs)
– Everyone deployed monitoring
– All but one used open source monitoring (the other used home grown)
– NONE have humans watching/waiting
– All use image & boot script for repeatable deployments
– All have scripted fault prevention / resolution
– All Throw Away, rather than Fix
– All redeploy entire production for release upgrades
Scaling a Single Application
[Figure: stages of scaling a single application, moving from vertical scaling to horizontal scaling: Single System; Tiered System; Clustered Middleware, Tiered System; Partitioned DB; Loosely Coupled Services; Dynamic Massively Parallel Application. Annotations: “Development Discontinuity (new application architecture)” and “Significant Development Required”.]
Conclusions
Divide Complex Monolith
– Several simpler problems
IaaS simplifies self-managed apps
Cost of IaaS + Apps < Monolithic App
PaaS _is_ an Application
Storage _is_ an Application
General principle
– We have lots of small problems (apps)
– We have one big problem (IaaS)
Microsoft Windows Azure
Microsoft’s Cloud OS: Focus on Hybrid
How Microsoft presents itself: 2013-July SEC filing
“Unique to Microsoft, we continue to design and deliver cloud solutions that allow our customers to use both the cloud and their on-premise assets however best suits their own needs. For example, a company can choose to deploy Office or Microsoft Dynamics on premise, as a cloud service, or a combination of both. With Windows Server 2012, Windows Azure, and System Center infrastructure, businesses can deploy applications in their own datacenter, a partner’s datacenter, or in Microsoft’s datacenter with common security, management, and administration across all environments, with the flexibility and scale they desire. These hybrid capabilities allow customers to fully harness the power of the cloud so they can achieve greater levels of efficiency and tap new areas of growth.”
Ancient history
Initial virtualization platform dates back to 1997
Hyper-V
– Completely new platform
– Released in 2008
– Designed to “leap-frog” VMware's platform
– Latest version has significant enhancements
– Built into Windows 8 desktops also
Bing
– Started as MSN Search back in late ’90s
Windows Live
– Originated as MSN services which date back to 1995
Hotmail
– One of the pioneers of Web-browser-based email
– Started in 1996, acquired by Microsoft in 1997
Xbox Live
– Started in 2002
MSNBC
– Founded in 1996
Early Windows Azure Platform history
October 2008
• Announced the Windows Azure Platform
• First CTP of the Windows Azure Platform
March 2009
• Announced SQL Azure Relational DB
• Updated Windows Azure CTP
• Enabled Full Trust & PHP, Java, etc. applications
November 2009
• Announced VM Role, Project Sydney, and Windows Azure Platform pricing and SLAs
• Project “Dallas” CTP
February 2010
• Windows Azure Platform generally available
June 2010
• Windows Azure Update: .NET Framework 4, OS Versioning, CDN
• SQL Azure Update: 50GB databases, Spatial data support, DAC support
Recent Changes and tweaks
Windows Azure initially focused exclusively on PaaS
– “Scared” their developer base (radical change)
– Too far ahead of its time?
– The market was much more comfortable with Amazon’s IaaS focus
Added a strange stateless “VM role” to Azure as a stop-gap
– Is now deprecated
Major shift in 2012:
– Added full IaaS role support to Azure
– Shifted definition of “Azure” to mean “Microsoft’s Public Cloud”
– PaaS platform naming shifted to “Azure Cloud Services”
Windows Azure Appliance
– Microsoft’s first attempt at a “cloud in a box”
– OEM-specific product that included thousands of servers
  More of a “public cloud in a box”
  HP, Dell and Fujitsu
– Fujitsu was only vendor to announce a product (which now seems dead)
Windows Azure Pack (stay tuned…)
– Azure Pack + Windows Server 2012 R2 = Azure Appliance
[Diagram: Windows Azure, IaaS + PaaS. Languages and tools: .NET, Visual Studio, TFS + Git | Java, NodeJS, PHP, Python, Ruby, C++. Data: SQL Databases, NoSQL Tables, Blob Storage, HDInsight]
Microsoft Cloud OS Vision
[Diagram: 1 consistent platform across Private Cloud, Service Providers, and Public Cloud. Layers: Development, Management, Identity, Virtualization, Data. Labels: Windows Azure Services, Azure Virtual Machines]
Consistent Experience with Common Tools
Windows Azure™ PaaS
Similar design points as AWS...
• Application services accessed via REST/SOAP messages
• SQL Services for data & storage
• Azure OS has messaging service
• Azure OS platform for app deployment
• Data & storage: eventual consistency
• Queued messages may be delivered more than once
...with key differences
• Applications deployed, not images
  • VMs baked into OS
• Application provides declarative description for scalability, reliability & availability of application components
  • e.g. developer or service owner specifies how pieces are to be distributed under what circumstances
• System automatically replicates code & data
• Queuing/messaging Services
• SQL Database (fka SQL Azure) Services
Like Amazon, expecting it to be priced (high) based on operation costs.
Microsoft Platform as a Service
Windows Azure (compute & simple/scalable storage)
SQL Database (fka SQL Azure)
– SQL Server as a Service
AppFabric (Cloud-based services)
– Access Control Service (Azure Active Directory)
– Enterprise Service Bus
– Distributed Object Caching
Traffic Manager
– Global traffic management/routing (performance)
Azure Connect (“VPN” between cloud and on-premise services)
Azure Portals
– Web-based Service Lifecycle Management tools
– SQL Database management
– RESTful APIs also available (non-browser-based tools)
Azure Media Services
Azure Content Delivery Network (CDN)
Windows Azure Storage
Cloud Storage - Anywhere and anytime access
– Blobs, Disks, Tables and Queues
Highly Durable, Available and Massively Scalable
– Easily build “internet scale” applications
– 8.5 trillion stored objects
– 900K requests/sec on average (2.3+ trillion per month)
Pay for what you use
Exposed via easy and open REST APIs
Client libraries in .NET, Java, Node.js, Python, PHP, Ruby
Abstractions – Tables and Queues
Abstractions – Blobs and Disks
Azure support for “Open” and “Interoperable” tools and platforms
Windows Azure Tools for Eclipse/Java
– One Click cloud deployment
– Supports Windows Azure Storage & SQL Azure
– Support for Windows Azure Platform SDKs & Drivers
– AppFabric SDK Supports Service Bus & Access Control from Java
– Provided by 3rd party
Windows Azure SDK for PHP
– Supports Windows Azure Storage & Service Management infrastructure
– PHP apps can be deployed in an Azure Web Role
– Simple Cloud API
Windows Azure Companion
– Simplifies installing and configuring open source components and apps on Azure
– Examples: Drupal and PHP apps
Embraces Open Source with NuGet
Node.js
Git, GitHub, Dropbox, …
Recent additions/enhancements to Windows Azure
Microsoft has been making consistent, regular enhancements to Azure, especially over the past couple of years
August 2013
– SQL Server AlwaysOn (HA/DR features for hybrid infrastructure)
– Notification Hubs (broadcast push notifications for Win8/RT, Windows Phone, iOS and Android devices)
  • Used by Bing News app (built into Win8/RT and Windows Phone devices)
– AutoScale (schedule-based rules) (beta)
  • Web sites, Cloud Services (PaaS), Virtual Machines (IaaS) and Mobile Services
  • History tracking, proactive notification for AutoScale events (e.g. email)
– Automated VM load balancing (net traffic) management (free)
– Portal extensions for operational logs and alerts
September 2013
– Dedicated Cache Service (high perf distributed caches for Windows/Linux/ASP/Web sites)
  • Azure Mobile Services to be integrated in near future
– Scheduled AutoScale (time schedule rules for using AutoScale features)
– Azure Web Sites logging to Azure Storage (blobs)
Commercial Cloud Services
Agenda
Evolving Programming Models – Overview
Extensions to traditional programming models – Middleware patterns in the cloud
Loosely coupled, relaxed consistency
– Amazon Web Services
– Microsoft Azure
– Google
Content centric
– Hadoop, Apache Spark
– NoSQL
Database centric
– Pangoo
A “Content-Centric” model runs infrastructure, data and computation all on the same nodes
[Diagram: three “Mgmt Model” stacks over the Infrastructure, Persistence, and Programming layers. Callouts: “Loosely coupled starts here” and “Real innovation occurs here”.]
Critical elements of a content centric model
“Restricted” programming model
• Think Batch: Redux
• Enables parallelized, distributed, fault tolerant computations without programming complexity
• No new programming experience required; framework hides details of parallelization, fault tolerance, load balancing, etc. from developer
• Offers simplicity of deployment & scalability - no application knowledge of runtime or OS or cloud necessary
Can be deployed on native hardware or virtualized
• Underlying map/reduce runtime automatically parallelizes the computation across large-scale clusters of (virtual) machines
Storage & data - Leverages “hybrid” distributed storage system & file systems designed to handle petabytes of data, i.e. not to be confused with an OS file system
• Data Handling & Replication: map/reduce is implemented through a software framework that handles data distribution
Designed to minimize operational costs
• The “master” pings every worker periodically. If there is no response in a certain amount of time, the master marks the worker as failed.
• The runtime handles machine failures and schedules inter-machine communication to make efficient use of the network and disks
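The master/worker failure handling described above can be sketched as two small functions. This is an illustrative model only, not Hadoop or Google code: the function names, the worker ids, the `timeout` value, and the round-robin rescheduling policy are all assumptions made for the example.

```python
def find_failed_workers(last_response, now, timeout):
    """Return workers whose last ping response is older than `timeout` seconds."""
    return sorted(w for w, t in last_response.items() if now - t > timeout)


def reschedule(assignments, failed, healthy):
    """Move tasks from failed workers to healthy ones, round-robin."""
    if not healthy:
        raise RuntimeError("no healthy workers left")
    # Keep the tasks of workers that are still alive.
    new_assignments = {
        w: list(tasks) for w, tasks in assignments.items() if w not in failed
    }
    for w in healthy:
        new_assignments.setdefault(w, [])
    # Tasks of failed workers are re-executed elsewhere.
    orphaned = [task for w in failed for task in assignments.get(w, [])]
    for i, task in enumerate(orphaned):
        new_assignments[healthy[i % len(healthy)]].append(task)
    return new_assignments


if __name__ == "__main__":
    last_response = {"w1": 100.0, "w2": 70.0, "w3": 99.0}
    failed = find_failed_workers(last_response, now=105.0, timeout=30.0)
    tasks = {"w1": ["split-0"], "w2": ["split-1", "split-2"], "w3": []}
    print(failed, reschedule(tasks, failed, ["w1", "w3"]))
```

Because the restricted programming model makes each task a deterministic function of its input split, re-running an orphaned task on another worker is safe, which is what keeps the operational cost of failures low.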
Apache Project: Hadoop Core
Open source project to recreate Google’s capabilities (led by Yahoo) with improvements
• Portable – can run as a native or virtualized system
• Additional pluggable runtime components for crawling (structured & unstructured data), query languages (Pig Latin, JAQL, Hive, etc.)
Provides a Java framework for large scale parallel processing map/reduce apps
• Offers simplicity of “programming” - looks like a simple single threaded app model for developers
• Today - setting up, coding Hadoop jobs in Java, etc. is the domain of skilled Java engineers
Awareness & Adoption Growing
• Could become foundation of new generation of easily customizable web analytic applications – at web scale
• Yahoo – used in production for indexing content
• Facebook – analyze logs, analytics
• New York Times
Not as scalable as Google – but does it need to be?
Hadoop, an open source implementation of map-reduce
Map-reduce runtime
• Partitions input data
• Schedules program’s execution across set of machines
• Manages inter-machine communication
• And more
Programming using Map-reduce:
• Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key.
• Processes and generates large data sets
• Automates program recovery in case of a failure
• Supports functional style programming
• Parallelism is an inherent feature
• Critical to keeping costs down
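The programming model above, a user-supplied map function and reduce function with the runtime doing the grouping, can be sketched in a single process. This is not the Hadoop API: `map_reduce` is a hypothetical driver that runs everything locally, and the word-count functions are an illustrative workload that does not appear on the slides.

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn):
    """Run map, group-by-key, and reduce in one process (no distribution)."""
    groups = defaultdict(list)
    for key, value in records:
        # map_fn may emit zero, one, or many intermediate (key, value) pairs.
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    # reduce_fn merges all intermediate values that share a key.
    return {key: reduce_fn(key, values) for key, values in sorted(groups.items())}


def count_words_map(_offset, line):
    for word in line.lower().split():
        yield word, 1


def count_words_reduce(_word, counts):
    return sum(counts)


if __name__ == "__main__":
    lines = enumerate(["the cloud", "The platform and the cloud"])
    print(map_reduce(lines, count_words_map, count_words_reduce))
```

A real runtime would partition `records` across machines, run `map_fn` on each partition in parallel, and shuffle intermediate pairs by key before reducing; the user-written functions would stay the same.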
Conceptual flow with Map Reduce
Conceptually, Map and Reduce functions are identical
They both operate on and transform key value pairs
The idea is to diminish (reduce) the amount of data as it passes through this flow
This is a very human idea
$$(k_i, v_i),\ i = 1, 2, 3, \dots, N \;\xrightarrow{\ \mathrm{Map}(k, v),\ f_m()\ }\; (k'_i, v'_i),\ i = 1, 2, 3, \dots, M$$
You specify the input data set and the Map function
Note that the values are transformed and change, but so do the keys
The number of keys changes: some input records could be discarded
Multiple values for the same key may appear in the Map output
Conceptual flow with Map Reduce
Conceptually, Map and Reduce functions are identical
They both operate on and transform key value pairs
The idea is to reduce the amount of data as it passes through this flow
This is a very human idea
$$(k_i, v_i),\ i = 1, 2, 3, \dots, N \;\xrightarrow{\ \mathrm{Map}(k, v),\ f_m()\ }\; (k'_i, v'_i),\ i = 1, 2, 3, \dots, M$$
$$(k'_i, v'_i),\ i = 1, 2, 3, \dots, M \;\xrightarrow{\ \text{sort by key, aggregate by key}\ }\; \bigl(k'_i, (v_1 \dots v_{m_i})'_i\bigr),\ i = 1, 2, 3, \dots, M' \quad (M' < M)$$
Let’s focus on the unique keys, so we need to sort and aggregate by key
Multiple values for the same key may appear in the Map output
Multiple values for the same key should appear in the aggregated output
Conceptual flow with Map Reduce
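Now we apply another transformational function, just like before
The idea to reduce is a very human one
That’s really all there is to this.
$$(k_i, v_i),\ i = 1, 2, 3, \dots, N \;\xrightarrow{\ \mathrm{Map}(k, v),\ f_m()\ }\; (k'_i, v'_i),\ i = 1, 2, 3, \dots, M$$
$$(k'_i, v'_i),\ i = 1, 2, 3, \dots, M \;\xrightarrow{\ \text{sort by key, aggregate by key}\ }\; \bigl(k'_i, (v_1 \dots v_{m_i})'_i\bigr),\ i = 1, 2, 3, \dots, M' \quad (M' < M)$$
$$\bigl(k'_i, (v_1 \dots v_{m_i})'_i\bigr),\ i = 1, 2, 3, \dots, M' \;\xrightarrow{\ \mathrm{Reduce}(k', v'),\ f_r()\ }\; (k''_i, v''_i),\ i = 1, 2, 3, \dots, P$$
Multiple values for the same key may appear in the Map output
Multiple values for the same key should appear in the aggregated output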
Map Reduce: Simple / Sample problem
Using an NCDC data set, find out the average precipitation in the US, by year
Use this format: ftp://ftp.ncdc.noaa.gov/pub/data/cdo/samples/PRECIP_15_sample_ascii.dat
STATION STATION_NAME ELEVATION LATITUDE LONGITUDE DATE QPCP Units
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840101 00:15 0 HI
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840104 22:45 1 HI
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840105 00:30 1 HI
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840105 01:30 1 HI
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840105 02:15 1 HI
Annotations on the sample: January 1st, 1984; January 4th; multiple times …
QPCP: The amount of precipitation recorded at the station for the 15 minute period ending at the time specified for DATE above, given in tenths or hundredths of inches depending on the value given in the Units element (see definition for Units below). Prior to January 1996 QPCP was the only observational element in this data set. The values 9999 or 99999 mean the data value is missing. The maximum number of characters for this field is 8. This element is selectable when using the Climate Data Online interface for creating a data output file.
Units (Flag/Attribute): HI indicates data values (QGAG or QPCP) are in hundredths of inches. HT indicates data values (QGAG or QPCP) are in tenths of inches.
The Map operation example
(k1, v1) => Map(data, f(x)) => (k2, v2)
Consume a line, output year and precipitation: (key = char offset, value = line) => Map() => (key = year, value = QPCP)
Map inputs:
STATION STATION_NAME ELEVATION LATITUDE LONGITUDE DATE QPCP
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840101 00:15 0
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840104 22:45 1
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840105 00:30 1
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840105 01:30 1
COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840105 02:15 1
Map outputs:
YEAR QPCP
1984 0
1984 1
1984 1
1984 1
1984 1
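A possible map function for this data, as a minimal Python sketch. The function name `map_precipitation` and the parsing strategy are assumptions made for illustration: because station names may contain spaces, the fields are counted from the right-hand end of the line, and an optional trailing units flag (HI or HT) is dropped first. Unit conversion is left out.

```python
from typing import Iterator, Tuple

UNIT_FLAGS = {"HI", "HT"}           # units flag that may trail each record
MISSING_VALUES = {"9999", "99999"}  # documented markers for a missing value


def map_precipitation(offset: int, line: str) -> Iterator[Tuple[str, int]]:
    """Emit (year, QPCP) for one data line; emit nothing for other lines."""
    tokens = line.split()
    if tokens and tokens[-1] in UNIT_FLAGS:
        tokens = tokens[:-1]
    if len(tokens) < 3:
        return
    # After dropping the flag, a record ends with: DATE, time of day, QPCP.
    date, qpcp = tokens[-3], tokens[-1]
    if not (len(date) == 8 and date.isdigit() and qpcp.isdigit()):
        return  # header line or malformed record
    if qpcp in MISSING_VALUES:
        return
    yield date[:4], int(qpcp)


lines = [
    "STATION STATION_NAME ELEVATION LATITUDE LONGITUDE DATE QPCP Units",
    "COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840101 00:15 0",
    "COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840104 22:45 1 HI",
    "COOP:311564 CATALOOCHEE NC US 798.9 35.61667 -83.1 19840105 00:30 99999 HI",
]
# The list index stands in for the character offset used as the input key.
map_outputs = [
    pair for offset, line in enumerate(lines)
    for pair in map_precipitation(offset, line)
]
print(map_outputs)  # [('1984', 0), ('1984', 1)]
```

The input key (the offset) is ignored, which is typical for line-oriented map functions: only the value carries information.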
Then, sorting / aggregation happens
Sort data by key
Aggregate values with the same key into one aggregate value
Inputs:
YEAR QPCP
1984 0
1984 1
1985 4
1984 1
1984 1
1984 1
1985 2
1985 3
Outputs:
YEAR QPCP
1984 0 1 1 1 1
1985 2 3 4
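The sort/aggregate step for the values on this slide, as a small Python sketch. The helper name `sort_and_aggregate` is illustrative; sorting the values inside each group is done here only to match the slide, a framework does not have to guarantee it.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def sort_and_aggregate(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    # Collect every value under its key, then return the keys in sorted order.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sorted(groups[key]) for key in sorted(groups)}


map_outputs = [
    ("1984", 0), ("1984", 1), ("1985", 4), ("1984", 1),
    ("1984", 1), ("1984", 1), ("1985", 2), ("1985", 3),
]
grouped = sort_and_aggregate(map_outputs)
print(grouped)  # {'1984': [0, 1, 1, 1, 1], '1985': [2, 3, 4]}
```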
The Reduce operation example
(k2, v2) => Reduce(data, f(x)) => (k3, v3)
Input: year, list of precipitation values
Output: year, average precipitation value
Reduce function f(x) = average(x)
Reduce inputs:
YEAR QPCP
1984 0,1,0,2,0
1985 1,4,5,0,7
1986 1,0,2,3,1
1987 1,6,2,3,4
1988 1,5,2,5,4
Reduce outputs:
YEAR QPCP
1984 1
1985 3
1986 2
1987 3
1988 3
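The reduce function f(x) = average(x) applied to the inputs listed on this slide, as a Python sketch. The name `reduce_average` is illustrative. The sketch returns the exact arithmetic mean in raw QPCP units, whereas the slide shows whole numbers; converting tenths or hundredths of inches is not attempted.

```python
from typing import Dict, List


def reduce_average(year: str, values: List[int]) -> float:
    # A reducer is only called for keys that have at least one value.
    return sum(values) / len(values)


reduce_inputs: Dict[str, List[int]] = {
    "1984": [0, 1, 0, 2, 0],
    "1985": [1, 4, 5, 0, 7],
    "1986": [1, 0, 2, 3, 1],
    "1987": [1, 6, 2, 3, 4],
    "1988": [1, 5, 2, 5, 4],
}
averages = {
    year: reduce_average(year, values) for year, values in reduce_inputs.items()
}
print(averages)
```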
Large Financial Institution wanting to do fraud analytics
A platform that can cost-effectively manage PBs of data and support fraud and marketing analytics
Must be efficient for structured data
Integration with enterprise assets: warehouse, BI tools
[Diagram: Transactional System (Transactional Credit Card Risk Management System, decision to authorize charge); Analytics (New Analytics Platform, models of normal and fraudulent card usage)]
Data Sizes and Performance
Requirement: analyze 7 years, 250 TB in total, at a rate of 100M transactions a day (transaction rate expected to grow substantially)
Problem 1 (1 year of data):
– Today, w/o MSA, takes > 1 week
– With MSA: 3 hr!
Problem 2 (1 month of data):
– Customer goal: 1 day = “a win”; 10 minutes = “great”; 1 minute = “awesome”
– MSA at “great” (~10 mins), moving to “awesome”
Brief History of Spark
2002 – MapReduce @ Google
2004 – MapReduce paper
2006 – Hadoop @ Yahoo
2010 – Spark paper
2011 – Hadoop 1.0 GA
2014 – Apache Spark top-level
2014 – 1.2.0 release in December
2015 – 1.3.0 release in March
Spark is HOT!!!
Most active project in the Hadoop ecosystem
One of the top 3 most active Apache projects
Databricks founded by the creators of Spark from UC Berkeley’s AMPLab
[Chart: activity for 6 months in 2014 (from Matei Zaharia, 2014 Spark Summit)]
Apache Spark is a fast, general-purpose, easy-to-use cluster computing system for large-scale data processing
– Fast
  Leverages aggressively cached in-memory distributed computing and JVM threads
  Faster than MapReduce
– Generality
  Covers a wide range of workloads
  Provides SQL, streaming and complex analytics
– Ease of use (for programmers)
  Spark is written in Scala, an object-oriented, functional programming language
  Scala, Python and Java APIs
  Scala and Python interactive shells
  Runs on Hadoop, Mesos, standalone or cloud
[Chart: logistic regression in Hadoop and Spark, from http://spark.apache.org]
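Spark Resilient Distributed Datasets – like in-memory hash partitions
[Diagram: Spark RDD (in-memory distribution) compared with HDFS (on-disk distribution) across three slave nodes. RDD1, RDD2 and RDD3 are split into partitions held on the nodes (node 1: partition3, partition1, partition2; node 2: partition1, partition3; node 3: partition2, partition2, partition1). Blocks on the nodes: node 1: c3, d2, a2, b1; node 2: c2, d1, a1, b2; node 3: c1, d2, a3, b3.]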
Directed Acyclic Graph Computation – much more efficient framework than MapReduce
An example of a typical workload consists of 4 MR jobs with 6 intermediate step Distributed File System (DFS) IO
Spark DAG with lazy evaluation (no intermediate step DFS IO)
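Lazy evaluation can be illustrated without Spark. The sketch below is plain Python and is not the Spark API: the class `LazyDataset` is invented for this example and only handles a linear chain of steps rather than a general DAG. It shows the core idea: `map` and `filter` only record what should happen, and nothing is computed, or written to intermediate storage, until `collect` is called.

```python
class LazyDataset:
    """Records transformations; runs them in one pass when collect() is called."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)

    def map(self, fn):
        # Transformation: returns a new plan, does no work yet.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, predicate):
        # Transformation: returns a new plan, does no work yet.
        return LazyDataset(self._source, self._steps + (("filter", predicate),))

    def collect(self):
        # Action: stream every item through all recorded steps in one pass,
        # so no intermediate result set is materialized between steps.
        results = []
        for item in self._source:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                results.append(item)
        return results


calls = []


def square(x):
    calls.append(x)  # record when the function actually runs
    return x * x


plan = LazyDataset(range(6)).map(square).filter(lambda x: x % 2 == 0)
calls_before_collect = list(calls)  # still empty: nothing has run yet
result = plan.collect()
print(calls_before_collect, result)  # [] [0, 4, 16]
```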
Spark Extensions – a common API for data ingest, streaming analytics, machine learning, graph processing, and more
Extension of the core Spark API
Improvements made to the core are passed to these libraries
Little overhead to use with the Spark core
Spark in the real world ...
Workloads: Batch, Interactive, Machine Learning, Data Integration, Data Wrangling, Streaming, Graph Processing, SQL
Industries: Healthcare, Telco, Financial Services, Media, Manufacturing, National Security, Insurance, Retail, Banking
Spark to Improve Health Care for Millions of Patients
Independence Blue Cross will leverage Spark as the analytic platform for projects to improve the lifestyle of those who are ill
• Map complex referral networks to identify cost-efficient providers
• Identify patients who are at the highest risk of being re-hospitalized within a short period of time
• Enhance the efficacy of managing chronic disease, such as early detection of diabetes
• Analyze clinical data and scanned images to identify hip implant patients who have high-risk exposure to complications such as metallosis, infection and dislocation
Spark to hunt for presence of intelligent extraterrestrial life
IBM, NASA, and the SETI Institute are collaborating to analyze complex deep space radio signals using Spark in a hunt for patterns that might betray the presence of intelligent extraterrestrial life
• The SETI Institute's mission is to explore, understand and explain the origin and nature of life in the universe
• With Apache Spark as a Service on Bluemix, SETI is able to work with IBM on a global scale to explore new ways to analyze signal data and build on each other’s innovations
• A Spark application is being developed to analyze the 100 million radio events detected by the Allen Telescope Array (ATA) over several years.
Bluemix Genomics: Human Genome Sequencing using Spark
A single genome is 200 GB; 1M people tested per year
Leverage Bluemix and Spark services to provide powerful processing and analytics for massive genome data
Demonstrated using Spark to search and compare variants from a standard chromosome repository, with visualization of chromosomes to explore variants easily (e.g. Chro1-22, chroX, chroY)
But there are gaps in NoSQL data stores … Can enterprises live with these?
Not good for multi-user, complex apps
Difficulties with data integration tools that require understanding of the structure to support movement of the data to other systems
Joining entities when relationships are not stored pre-joined; no multi-object transactions
Limited data management tool set and ecosystem
No data integration with enterprise data
Requires a highly skilled team to deliver and manage a NoSQL-based deployment
– As complexity increases, investment in enterprise software is less expensive than engineering ad-hoc solutions.
Google Software Stack: One View
Google File System
• Non-virtualized storage component – specialized distributed file system designed for Google workloads
• Two types of servers: masters (network coordinators) & workers (operating on data as requested)
• Chunk size is 64 MB, larger than a typical file system block size, to reduce how often workers interact with the master
Bigtable
• Distributed column-oriented data store, but not a relational DB, on top of GFS (covered in Google TT last year)
Work Queue
• Distributed batch processing component & job scheduler
Map Reduce – details
• Framework/library component in C++
• Utilizes Work Queue to distribute computations to clusters
• ~10,000 Map Reduce programs today
• In 2004 ran 29,000 jobs; in 2007, 2,200,000 jobs
• Google runs ~100,000 jobs per day crunching through 20 petabytes
• Runs across ~100,000 node servers
• Indexing, AdWords, Analytics, etc.
Sawzall
• Query language, type-safe scripting language
• Factor of 10 simpler to code up (and shorter) than in C++
NoSQL defined…
emergence of a growing number of non-relational, distributed data stores for massive scale data
http://nosql-database.org
Application developers are using NoSQL to rapidly prototype and deploy
Categories of NoSQL Use-case Patterns
Scalability for web apps: single-record access based on key (think NoSQL OLTP)
– Large data scale
– High read concurrency
– Ratio of value to number of records is high
Rapid development of web-scale solutions (think NoSQL agility)
– Chosen for flexible schema
– Simple queries (key lookup)
– Lifespan of apps is short, and rapid iteration is required
Scalable analytics (think NoSQL Big Data analytics)
– Scalable, fault-tolerant framework for storing and processing MASSIVE data sets (Hadoop)
– Lower-cost, available hardware
– Gives you point access to data in MR, not just sequential access
– Ratio of value to number of records is low
HBase
A NoSQL data store
The Hadoop Database
– Included in Apache Hadoop
An industry-leading implementation of Google’s BigTable design
HBase powers some of the leading sites on the Web (e.g. Facebook, Yahoo, …)
There is no single NoSQL --- it’s a landscape
• Simple Key-Value Stores
o Data is stored in a hash table of keys. Values are opaque binary objects.
o [Diagram: Key → Binary Data]
• Document Stores
o Data is stored as documents with tagged elements
o [Diagram: Key → Document (collection of key-values)]
• Column Family
o Data attributes are grouped into column sets. Each storage block contains data from only one column set.
o [Diagram: Key → ColumnFamily1: C1, ColumnFamily1: C2, ColumnFamily2: C1, ColumnFamily3: C1]
• Graph Stores
o Data is stored in nodes and edges of a graph and accessed using graph traversal
o [Diagram: Node 1 (Key, Properties), Relationship 1 (Key, Properties), Node 2 (Key, Properties)]
Product label shown in the figure: eXtremeScale
Understanding the CAP Theorem
A distributed system can only achieve two out of the following three properties:
– Consistency – all clients see the same data: the database is truthful! Can be strong (e.g. atomic and immediate), sequential, causal, eventual or weak.
– Availability – the system is always on, so the data is available
– Partition Tolerance – the system functions when a network failure creates two disconnected groups, OR “The network will be allowed to lose arbitrarily many messages sent from one node to another.”
The exact meaning of this property is debated
CAP Positioning
HBase is strongly consistent for single-row operations and implements Consistency and Partition Tolerance
(e.g. a Region Server failure is recoverable, but the data will be unavailable for a period of time)
Development Characteristics of NoSQL Systems
Schema Flexibility and Development Agility
– Quick application development
– No schema first
– Sparse schemas
– Data models that are native to the application space
  JSON is very dominant
– Fewer negotiations with IT
  Application defines the schema and access paths
  Rapid response to feedback and changing requirements
[Diagram: Data, JSON]
Runtime Characteristics of NoSQL Systems
Low-Latency, Non-Durable Writes: store objects as they arrive, no shredding and multi-table storage
– Capture machine-generated data where risk of loss is low and not fatal
Low-Latency Reads: objects match application access, no joins required
– Online analytics and web-facing applications where the stored object matches the web-facing application
Dynamic Elasticity:
– Rapid horizontal scalability (10’s or 100’s of servers)
– Ability to add or delete nodes dynamically
– Application-transparent elasticity and cloud compatibility (scale in AND out)
– Allows IT expense to scale with usage
Fast initial deployment: commonly available hardware, open source software, (perceived) lack of need for database administration and skills
– Low barrier to entry for early exploration and rapid development iteration
[Diagram: Sharding (ABC split into A, B, C); scale labels: Petabytes, Zettabytes]
Approaches to Elastic Scale on Commodity Hardware
BigTable
Variants: HBase
Partitioning: range-aware data blocks; a block is split automatically when its size limit is exceeded; N data blocks are mapped to a physical node (master).
Replication: leverages the distributed file system
Durability: synchronous writes
Dynamo
Variants: Cassandra, Couchbase, Riak
Partitioning: consistent hashing. Hash keys are maintained in a ring; N hash keys map to a virtual node in the ring, and multiple virtual nodes map to the same physical node (so a single physical node maps to multiple slots in the ring). The scheme reduces the number of keys that need to be remapped when nodes are added.
Replication: async replication to N slots, or use logging
Durability: synchronous write to quorum
Replica Sets
Variants: MongoDB, OracleDB
Partitioning: range-aware shards distributed on available nodes; a config server keeps the map of shards to nodes; shards split automatically
Replication: replica sets with configurable (async/sync) consistency
Durability: configurable journaling
[Diagram: clients, master and standby master, GFS]
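The consistent hashing scheme described for the Dynamo family can be sketched as follows. This is a minimal single-process Python illustration, not the code of Cassandra, Riak or any other product; the class `HashRing`, the node names and the choice of 64 virtual nodes per physical node are assumptions for the example.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring in which each physical node owns many ring slots."""

    def __init__(self, nodes=(), vnodes=64):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (slot_hash, physical_node)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(text):
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        # One physical node is placed on the ring at `vnodes` virtual positions.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def lookup(self, key):
        if not self._ring:
            raise LookupError("hash ring is empty")
        # The first slot at or after the key's hash owns the key; wrap at the end.
        index = bisect.bisect_left(self._ring, (self._hash(key),))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {key: ring.lookup(key) for key in keys}

ring.add_node("node-d")
after = {key: ring.lookup(key) for key in keys}

# Only keys that now belong to the new node change owner; the rest stay put.
moved = [key for key in keys if before[key] != after[key]]
print(len(moved), "of", len(keys), "keys moved, all of them to node-d")
```

With plain `hash(key) % number_of_nodes` partitioning, adding a fourth node would remap most keys; on the ring only the keys falling into the new node's slots move.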
RDBMSs achieve scale and HA without sacrificing ACID
Run massive low-latency systems in production
– E.g. stock trading, shipping, credit card authorization, currency exchange systems
– 10.3M tpm TPC-C on IBM Power 780; 3M tpm TPC-C on Intel x3850 X5
Advanced optimization for object types, such as XML
– The current “format du jour” is JSON, which is starting to gain a lot of attention from RDBMS vendors.
Enterprise applications will need joins
– Reference data
– Need the ability to optimize data access without changing the application
– RDBMSs have been doing this for years. NoSQL systems will need to introduce this capability in the application tier.
RDBMSs support scalability for enterprise deployments
– IT works with app developers to manage IT growth
[Chart: PureScale Elasticity; x-axis: #members (0 to 15); y-axis: throughput vs. 1 member (0 to 12)]
‘Content-Centric’ is really about “Big Data” AND “New Analytics”
Text
Logs & Transactions
Clickstream Data
Statistical Model Building
Text Analytics
Biological Sequences
Agenda
Evolving Programming Models – Overview
Extensions to traditional programming models – Middleware patterns in the cloud
Loosely coupled, relaxed consistency
– Amazon Web Services
– Microsoft Azure
– Google, NoSQL
Content centric
– Hadoop, Apache Spark
Database centric
– Pangoo
– Salesforce.com
A “Database-Centric” model runs infrastructure and database on the same nodes
[Diagram: layers Infrastructure, Persistence and Programming, each with a Mgmt Model; annotation: real innovation at this layer]
Critical elements of a database centric model
• The database layer needs to multiplex multiple applications
  • Database model needs to be flexible if different apps share the database
  • For cloud economics to work out, mgmt cost of the database layer << #apps x mgmt cost of a single database for an app
• Programming model
  • A focus on schema configuration as opposed to schema design
  • Constrain enough to keep cloud economics, yet not reduce the market significantly
• Higher bandwidth within a “group of nodes”
  • For scaling the database within an app (could use larger SMPs)
• Database nodes are the “keystone”; they need “HA” in some form (so the previous two architectures are not exactly the right fit)
From Single-Tenant to Multi-Tenant Application
MMT common service provides:
Support for cost-effective resource sharing, isolation, diverse SLAs, etc., across different tenants
Management of the database resource pool and the lifecycle of applications & tenant subscriptions; monitor, analyze, and optimize system operations
Highly on-demand availability and scalability with the number of tenants & offerings
Minimize application development or transformation effort for SaaS ISVs
MT data access mockup package for local testing
[Diagram: App 1, App 2, App … (each with user1 … user100) on top of the MMT Common Service with MMT Meta Repository and Operator, backed by the Database Resource Pool (1 … 5); few shards in MT; scale labels: App 1 … 10, user1,1 … user10000,100, 10,000]
Database Multi-Tenancy for the Cloud
[Diagram: three options, each showing Tenant A and Tenant B served by a multi-tenant app on an app server: Separate Instances/Databases (deluxe/advanced), Separate Tables (intermediate), Shared Tables (economic). Axis labels: “Higher multitenancy, better resource utilization” and “Higher query optimization/runtime complexity, higher security worries”.]
Multi-tenancy Challenges
Isolation, Scalability, Performance, Customization, Resource Utilization, Metering …
Virtual Multi-Tenant Layer
DB Multi-Tenant Layer
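MT DB Tradeoffs
Options compared: Isolated Databases, Separate Schemas, Shared Tables
Simplicity
– Isolated Databases: simple
– Separate Schemas: simple (but needs a mechanism to avoid name collisions: 3-part names or mapping)
– Shared Tables: hard
Customizability (schema)
– Isolated Databases: high
– Separate Schemas: high
– Shared Tables: low (might require migration)
Rigorous Isolation (regulatory law)
– Isolated Databases: best
– Separate Schemas: moderate
– Shared Tables: lowest
Resource Cost/tenant
– Isolated Databases: high
– Separate Schemas: low
– Shared Tables: lowest
#Tenants
– Isolated Databases: low
– Separate Schemas: large
– Shared Tables: largest
Operational Cost/tenant (backup, patches, etc.)
– Isolated Databases: high
– Separate Schemas: low (but point-in-time recovery not easily possible)
– Shared Tables: lowest (but point-in-time recovery even harder)
Tools
– Isolated Databases: need tools to deal w/ a large number of instances/databases
– Separate Schemas: need tools to deal w/ a large number of tables
– Shared Tables: n/a
DB implementation cost
– Isolated Databases: lowest (qry routing and simple mapping layer)
– Separate Schemas: low (qry routing, simple mapping layer and qry mapping)
– Shared Tables: high (qry routing, simple mapping layer, qry mapping, row-level isolation)
Scalability
– Isolated Databases: per tenant
– Separate Schemas: need some data/load balancing w/ dynamic migration
– Shared Tables: need some data/load balancing w/ dynamic migration
Query Optimization
– Isolated Databases: less critical
– Separate Schemas: less critical
– Shared Tables: critical (wrong plan over very large tables is disastrous)
Per Tenant Query Performance
– Isolated Databases: as usual
– Separate Schemas: need qry governance
– Shared Tables: need qry governance and tenant-specific statistics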
Get tenant id via tenant identity propagation (ThreadLocal).
Retrieve the tenant profile (database, username, password, etc.) according to tenant id.
Connect to the underlying database based on the tenant profile
– If shared tables, set the tenant id in the connection; pass the SQL down to the target db.
– If separate tables, get the tenant-specific schema name (assigned during tenant onboarding) from the tenant profile, and set the current schema before each statement is created.
– If separate db, pass the SQL down to the target db.
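The routing decision in these steps can be sketched in Python. This is not the MMT JDBC wrapper itself: `contextvars` stands in for Java's ThreadLocal, the `PROFILES` dictionary stands in for the metadata repository lookup, and all names (`TenantProfile`, `route`, the tenant ids and database names) are hypothetical.

```python
import contextvars
from dataclasses import dataclass

# Stand-in for tenant identity propagation via ThreadLocal.
current_tenant = contextvars.ContextVar("current_tenant")


@dataclass(frozen=True)
class TenantProfile:
    database: str
    isolation: str  # "shared_tables", "separate_tables" or "separate_db"
    schema: str = ""


# Hypothetical stand-in for the tenant profiles held in the metadata repository.
PROFILES = {
    "t1": TenantProfile(database="pool-db-1", isolation="shared_tables"),
    "t2": TenantProfile(database="pool-db-1", isolation="separate_tables", schema="T2_SCHEMA"),
    "t3": TenantProfile(database="tenant3-db", isolation="separate_db"),
}


def route(sql: str) -> dict:
    """Decide where a statement goes and which session settings it needs."""
    tenant_id = current_tenant.get()   # step 1: get tenant id
    profile = PROFILES[tenant_id]      # step 2: retrieve tenant profile
    plan = {"database": profile.database, "session": {}, "sql": sql}
    if profile.isolation == "shared_tables":
        plan["session"]["tenant_id"] = tenant_id            # row-level filter key
    elif profile.isolation == "separate_tables":
        plan["session"]["current_schema"] = profile.schema  # set before each statement
    elif profile.isolation != "separate_db":
        raise ValueError(f"unknown isolation mode: {profile.isolation}")
    return plan                        # step 3: SQL is passed down unchanged


current_tenant.set("t2")
plan_t2 = route("SELECT * FROM orders")
print(plan_t2)
```

The application SQL is never rewritten in this sketch; isolation comes from the choice of database and from session settings.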
[Diagram: Dynamic Routing, steps numbered 1 to 6. Components: SaaS Application (request db connection); MMT JDBC Wrapper (get tenant id via tenant identity propagation, REST client, cache); MMT Master App (REST service) with the MMT Metadata Repository (tenant info, offering info, physical DB info, catalog info, SLA, etc.); DB2 JDBC Driver; Tenant DB (DB2MMT or non-db2mmt). Links: REST request w/ tenant id; REST response w/ tenant profile (DB info, SLA), marked “only once”; JDBC connection w/ tenant id; JDBC result set.]
Bringing an Application to MMT for DB2
[Diagram: three phases. MT App (Offering) development/transform: ISV, IDE, MMT Sandbox, multi-tenant application. Operation Management: Service Provider, MMT Admin Console (tenant management, offering management, resource management, monitoring, governance, …). Runtime: tenant users, multi-tenant app, MMT Common Service, MMT Meta Repository, Operator, Database Resource Pool (1 … 5), shards in MT.]
MT Application development/transformation
Develop & Transformation
– Embed tenant identification
– Configure/modify the application to use the DB2 MMT access package
– Provide the offering metadata file (XML) of the application
Supported J2EE environments
– JDBC, Spring, iBatis/Hibernate, JPA
– WAS/Tomcat, DB2
Example of offering transformation
1. Embed tenant identification in the application
– Modify Web.xml to include the Filter servlet TenantID for propagation through thread local
2. Configure the application to use the MMT data access package
– Modify the Spring data source config to use the MMT data source
3. Provide the offering metadata file (XML) of the application
– Data source info, DDL, shared tables info, config info, …
[Diagram: Local & Runtime Environment. ISV Local Env.: ISV app, MMT Local Sandbox, simulated meta file, DB2. On-boarding. DB2 MMT Runtime Env.: application, MMT Common Service, MT Meta Repository, Operator, MMT Runtime Agent, MT Database Pool (DB2 instances).]
Operation Management (MMT Admin Console)
1. Offering onboarding
2. Tenant on-boarding/subscription
3. Offering upgrade
4. Offering & tenant topological view
Architecture of MMT for DB2
[Diagram. Components: a J2EE SaaS application with the MMT JDBC Wrapper on top of the DB2 JDBC Driver; MMT Master App (WAS cluster for HA & LB) with MMT REST Services and the MMT Admin Console App; MMT Metadata Repository; Database Resource Pool made up of tenant data nodes holding tenants T1 to T5. Links: JDBC w/ tenant context; REST w/ tenant context; REST; JDBC; RXA / JDBC.]
KingDee’s Exploitation of Pangoo
[Diagram. Components: Application; Data Object access via JDBC, SDO, Hibernate and Agent; Tenant Context; MT-JDBC Driver (SQL); REST/SOAP object query (LinQ, SOQL, GQL etc.); RDB Model Adapter; Object Model Adapter; Data Model Mapping Module; Virtual DataStore (static schema, dynamic schema); Highly Available & Scalable Data Resources Pool; Multi-tenant Metadata Repository; MT Runtime Data Access Service (runtime resource sharing/isolation, dynamic routing, SLA tracking, …); MT Operational & Management Service (HA, scalability, SLA tracking, optimization, OLC etc.).]
DB-CENTRIC CLOUD
Salesforce.com PaaS
While Salesforce started with CRM, it and its partners run 1000’s of other transactional apps on force.com
[Diagram: 10 pods in total (Pod1, Pod2, Pod3, …); each pod runs apps such as CRM, HR, Travel and Mktg on a Multitenant Optimization Layer over a 4-way Oracle RAC; ~TB of managed DB; ~40,000 tenants; ~400,000 custom objects]
Take 20 standard objects (Accounts, Orders, …); customize or create new ones (example: Mileage object)
Add workflow or business logic
Get app
Service multiple tenants
DB-CENTRIC CLOUD
A Critical Innovation is the Multi-Tenant Database Architecture
Organization_id  Key_prefix  Id      Name, (Others)  Val0  Val1  …  ValN
org1             a01         a01…1
org1             a01         a01…2
org1             a02         a02…1
org1             a02         a02…2
org2             a01         a01…3
org2             a01         a01…4
org2             a02         a02…3
…
Custom objects are forced into a very limited number of Oracle tables
• Key_prefix subsetting
  ● Still partitioning by organization_id
• Smart primary keys (key prefix)
  ● Re-use across organizations
• GUID primary keys
• ValN flex fields
Opex at database and platform level is dominated by #objects [backups, stats, tuning, schema evolution, app design] for most databases. SFDC reduces this by forcing all disparate objects into a fixed set of tables (as rows), trading off opex for platform development costs. Consequently, it is able to store ~400,000 different objects in a couple of dozen tables
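The idea of storing many tenants' custom objects as rows of one shared table with flex columns can be tried out with SQLite. This is only a sketch of the pattern, not Salesforce's actual schema: the table name `data`, the two flex columns, the `FIELD_SLOTS` metadata and the sample rows are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data ("
    " org_id TEXT NOT NULL, key_prefix TEXT NOT NULL, id TEXT PRIMARY KEY,"
    " name TEXT, val0 TEXT, val1 TEXT)"
)

# Metadata: for each organization and object type (key prefix), which flex
# column holds which logical field. The same prefix can mean different
# objects in different organizations.
FIELD_SLOTS = {
    ("org1", "a01"): {"miles": "val0", "trip_date": "val1"},
    ("org2", "a01"): {"candidate": "val0"},
}

rows = [
    ("org1", "a01", "a01-0001", "Trip to airport", "42", "2015-11-09"),
    ("org1", "a01", "a01-0002", "Client visit", "17", "2015-11-10"),
    ("org2", "a01", "a01-0003", "Open position", "J. Doe", None),
]
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?, ?)", rows)


def select_field(org_id, key_prefix, field):
    """Read one logical field of a custom object for a single tenant."""
    slot = FIELD_SLOTS[(org_id, key_prefix)][field]
    if slot not in ("val0", "val1"):  # only known column names reach the SQL text
        raise ValueError(f"unknown flex column: {slot}")
    query = (
        f"SELECT name, {slot} FROM data"
        " WHERE org_id = ? AND key_prefix = ? ORDER BY id"
    )
    return conn.execute(query, (org_id, key_prefix)).fetchall()


miles = select_field("org1", "a01", "miles")
print(miles)  # [('Trip to airport', '42'), ('Client visit', '17')]
```

Every query carries the organization id, so tenants never see each other's rows even though they share one physical table; the price is a metadata lookup and untyped flex columns.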
DB-CENTRIC CLOUD
Key PaaS Services
Amazon AWS
Salesforce.com
Cloud Foundry
Microsoft Azure
Key Services
•Application Environments
•Relational DB as a Service
•Messaging
•Collaboration
•Security / User Management
IBM SOA
References
References – Downloads from Web
Michael Armbrust et al., Above the Clouds: A Berkeley View of Cloud Computing, Feb. 2009– http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf
Cloud Computing: Platform as a Service. InformationWeek Analytics, October 2, 2009
Brooks, Carl: „How to build an application for the cloud”, http://searchcloudcomputing.techtarget.com/feature/How-to-build-an-application-for-the-cloud, Feb 2010 (last accessed 10/27/2011)
Ellis, John: „How To Design a Scalable Cloud Application” http://blog.bluelock.com/blog/cumulus-knowledge/how-to-design-a-scalable-cloud-application Jan 2011 (accessed 10/27/2011)
Cloud Use Cases White Paper Version 4, http://cloudusecases.org
DMTF: Architecture for Managing Clouds, Version 1.0.0, 2010-06-18
DMTF: Interoperable Clouds, Version 1.0.0, 2009-11-11
Luiz André Barroso and Urs Hölzle, The Datacenter as a Computer: An Introduction to the Design ofWarehouse-Scale Machines, Synthesis Lectures on Computer Architecture, 2009, http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006?cookieSet=1
Scott Crowder, Introduction to Workload Optimized Approach & Workload Market Segmentation, IBM WhitePaper, December 2009
David Chappell, A short introduction to Cloud, http://www.davidchappell.com/CloudPlatforms--Chappell.pdf
David Chappell, Cloud Platforms Today: A Perspective, April 2009 http://www.davidchappell.com/CloudPlatformsToday--APerspective--Chappell.pdf
Jeffrey Dean and Sanjay Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, – labs.google.com/papers/mapreduce-osdi04.pdf
DeCandia et al. Dynamo: Amazon’s highly available key-value store, SOSP 2007, http://portal.acm.org/citation.cfm?id=1294281&dl=ACM&coll=ACM&CFID=47859964&CFTOKEN=98797782
References – Downloads from Web
European Network and Information Security Agency (ENISA), Cloud Computing, Benefits, risksand recommendations for information security, Nov 2009 (http://www.enisa.europa.eu)
Gregor Hohpe, Programming the Cloud, November 2009, http://www.enterpriseintegrationpatterns.com/docs/HohpeProgrammingCloudKeynote.pdf
Anna Liu, Architecting Cloud Applications – the essential checklist, AAF Keynote 2009,
National Institute of Standards and Technology, Definition of Cloud Computing, http://csrc.nist.gov/groups/SNS/cloud-computing/
National Institute of Standard and Technology, NIST Cloud Computing Reference, SpecialPublication 500-292
Ning Duan et al., Tenant Behavior Analysis in Software as a Service Environment, ICSOC 2009
Daniel Nurmi et al., The Eucalyptus Open-source Cloud-computing System, http://www.cca08.org/papers/Paper32-Daniel-Nurmi.pdf
Open Cloud Manifesto, http://www.opencloudmanifesto.org/
OpenNebula.org – Various papers
B. Rochwerger et al., The Reservoir Model and Architecture for Open Federated CloudComputing, IBM Journal of Research and Development, April 2009 http://www8.cs.umu.se/~elmroth/papers/ibmjrd2009.pdf
Werner Vogels, Eventually Consistent, ACM Queue, October 2008
Ying Huang et al., A Framework for Building a Low Cost, Scalable and Secured Platform for Web-Delivered Business Services, IBM Systems Journal, November 2009
Michael Yuan, Java PaaS Shootout, 4/5/11, IBM developerWorks
Raphael, JR: „The 10 worst cloud outages (and what we can learn from them)” http://www.infoworld.com/d/cloud-computing/the-10-worst-cloud-outages-and-what-we-can-learn-them-902?page=0,3, June 2011 (last accessed 10/27/2011)
References
Company Web Sites: Amazon, Microsoft, Google, IBM, Salesforce.com Tech blogs, for instance techblog.netflix.com http://wiki.developerforce.com/page/Multi_Tenant_Architecture Alan Brown, Enterprise Software Delivery, Addison Wesley 2013 Gregor Hohpe, Bobby Woolf, Enterprise Integration Patterns, Addison-Wesley 2004 Jez Humble and David Farley: Continuous Delivery, Addison Wesley 2010 Gene Kim et al: The Phoenix Project Kristof Kloeckner, Middleware for Distributed Systems, Lecture Notes 2004 Kristof Kloeckner, The IBM Cloud Agenda, White Paper 2009 Craig Larman, Bas Vodde: Scaling Lean & Agile Development, Addison-Wesley 2009 Web Site der Open Group: www.opengroup.org/cloudcomputing Mary and Tom Poppendieck: Lean Software Development. An Agile Toolkit, Addison Wesley
2003
George Reese: Cloud Application Architectures, O’Reilly 2009 John W. Rittinghouse, James F. Ransome, Cloud Computing. Implementation, Management
and Security, CRC Press 2009 Andrew Tanenbaum, Maarten van Steen: Distributed Systems. Principles and Paradigms,
Prentice-Hall 2009 Rich Schiesser: IT Systems Management, Prentice-Hall 2002 Jim Rymarczyk, Virtualization, Pre-Print 2009 Tivoli Service Automation Manager Solution Guide Adam Wiggins, The Twelve-Factor App, 12factor.net Bill Wilder, Cloud Architecture Patterns: Using Microsoft Azure, O’Reilly 2012
Cloud Business Support System (BSS) Overview
© 2011 IBM Corporation
IBM Cloud Computing Reference Architecture: Architecture Overview | IBM Confidential
Cloud Computing Reference Architecture (CC RA) – Overall drill-down
[Diagram: Cloud Computing Reference Architecture.
Cross-cutting: Governance; Security, Resiliency, Performance & Consumability.
Cloud Service Provider: Cloud Services (IaaS, PaaS, SaaS, BPaaS; existing & 3rd party services, partner ecosystems); Infrastructure; Common Cloud Management Platform.
OSS – Operational Support Services: Service Delivery Catalog, Service Automation Management, Provisioning, Incident & Problem Management, IT Service Level Management, Platform & Virtualization Management.
BSS – Business Support Services: Customer Account Management, Service Offering Catalog, Service Offering Management.
Portals and interfaces: Service Provider Portal & API; Service Consumer Portal & API; Service Development Portal & API; Service Management; APIs; Infrastructure, Platform, Software and BP Mgmt Interfaces.
Consumer side: Cloud Service Integration Tools; Consumer In-house IT (Infrastructure, Middleware, Applications, Business Processes).
Service Creation Tools: Service Management Development Tools, Service Runtime Development Tools, Software Development Tools, Image Creation Tools.
Roles: Consumer Administrator, Consumer Business Manager, Consumer End user, Service Integrator, Service Component Developer, Service Composer, Offering Manager, Transition Manager, Deployment Architect, Operations Manager, Security & Risk Manager, Customer Care, Service Manager, Business Manager.]
Business Support System (BSS)
Business Support Systems (BSS) are the components that a Service Provider uses to run its business operations towards customers
Services:
1. Offering Management & Service Offering Catalog
2. Customer & Subscriber Management
3. Contract Management
4. Entitlements
5. Order Management
6. Pricing & Rating
7. Accounting, Billing & Invoicing
8. Peering & Settlement
9. Analytics & Reporting
Processes:
CCMP R1.0 and R1.1 BSS Functionality
Sales
– Face to face using ePricer/eConfig tools
Customer Management
– Bulk import of customer onboarding information by Business Office
– UI for user management with various roles
– Web Identity support
Subscriber Management
– Map customer admin and users to a contract
Offering Management
– Bulk upload of Catalog data with list price and cost information
Service Offering Catalog
– UI for display of catalog item details such as Images, VM Sizes, 32/64 Bit, Block Storage, Reserved IP Address, VLAN
– UI for submitting a provisioning request for a VM on a public or private network with an appropriate IP address and attached storage
Contract Entitlements
– Service Catalog entitlement information by customer and contract loaded by the Business Office
Reporting and Analytics
– Display of usage via BIRT reports
– Royalty reports for Red Hat and SUSE
Contract Pricing and Rating
– Pricing information by customer and contract loaded by the Business Office
– Simple ETL-based price x quantity pricing model
Billing
– Usage based by the hour, monthly recurring and one-time charge
– Flexible billing calendar (monthly, quarterly & yearly) for a Geo
– Billing adjustments, incidental charges
– Generating CFT/S spreadsheet feed file
– “Green Dollar” revenue back to SWG Products
Metering
– Rollup of VM, IP address and storage block usage information via DataStage
Costing
– Usage-based costing using offering-wide (non-contract) cost rate
– Generating CIF/SSC spreadsheet feed file
API
– APIs for Image, Instance and Key Management
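The metering rollup mentioned on this slide can be illustrated with a small sketch. It is not the CCMP implementation (which the slide says uses DataStage); the record layout, the resource names and the `rollup_usage` helper are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical raw metering records: (contract_id, resource_type, quantity).
# The resource types loosely mirror the slide: VMs, IP addresses, storage blocks.
RAW_USAGE = [
    ("C-001", "vm_hours", 24),
    ("C-001", "vm_hours", 12),
    ("C-001", "ip_addresses", 2),
    ("C-002", "storage_blocks", 5),
    ("C-002", "vm_hours", 8),
]


def rollup_usage(records):
    """Sum the metered quantities per (contract, resource type)."""
    totals = defaultdict(int)
    for contract_id, resource_type, quantity in records:
        totals[(contract_id, resource_type)] += quantity
    return dict(totals)


if __name__ == "__main__":
    for key, total in sorted(rollup_usage(RAW_USAGE).items()):
        print(key, total)
```

The rolled-up totals are what a rating step would then multiply by a price or cost rate.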
Pricing Models
One Time Setup Charges
– Setup
– Enterprise Onboarding
Monthly Recurring Charges
– Rate Buy Down
– VPN/VLAN
Per Hour Usage-based Charges
– Virtual Machines
  Images (software stack)
  OS
  Standardized (BR, SL, GD, PT, 32, 64) Compute
– IP Address Reservation
– Standardized (SM, MD, LG) Persistent Storage
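The three charge types on this slide can be combined into one invoice total, as in the sketch below. All item names and prices are invented for illustration, and `monthly_invoice` is a hypothetical helper, not part of any IBM billing system.

```python
from decimal import Decimal

# Hypothetical rate card; every price here is an assumption.
ONE_TIME = {"setup": Decimal("500.00")}
MONTHLY = {"vpn_vlan": Decimal("50.00")}
HOURLY = {
    "vm_sm_32": Decimal("0.10"),
    "ip_reservation": Decimal("0.01"),
    "storage_sm": Decimal("0.02"),
}


def monthly_invoice(one_time_items, monthly_items, hourly_usage):
    """Return one-time + monthly recurring + hourly usage charges."""
    total = sum((ONE_TIME[item] for item in one_time_items), Decimal("0"))
    total += sum((MONTHLY[item] for item in monthly_items), Decimal("0"))
    for item, hours in hourly_usage.items():
        total += HOURLY[item] * hours  # usage charge = price x quantity
    return total


if __name__ == "__main__":
    print(monthly_invoice(["setup"], ["vpn_vlan"],
                          {"vm_sm_32": 100, "ip_reservation": 720}))
```

`Decimal` is used instead of floating point so that currency amounts add up exactly.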
BSS Detailed Component Diagram
[Figure: BSS detailed component diagram. Recoverable labels:]
– Roles: Customer Admin, Customer User, Image Provider, Developer, Offering Manager, Business Office
– Role activities: Create Customer Users & set resource limits; Request & use VM, Storage, IP Address; Create Images; Upload Catalog & List Prices; Onboard Customers, Billing, Adjustments
– Access channels: Web Browser (JavaScript & CSS), Eclipse Plug-in, Public API (REST & SOAP), Portal, Abstraction Layer
– Cloud UI and Cloud BSS: Service Offering Catalog, Subscriber Management (REST), Entitlements, Pricing & Rating, Enterprise User Mgmt, Resource Mapping, Audit & Compliance, BSS Extensions, OSS Adapter, Event Messaging, WDP BSS, BSS for Dev Test, ECWDB
– Identity (AAA): Web Identity, LDAP, TAM, WebSEAL
– Images: Image Meta-data & Scripts in Rational Asset Manager (RAM)
– Reporting: Data Acquisition (DataStage), Data Warehouse (DB2), Reporting (BIRT)
– Order to Cash: CSV files (Billing, Cost, Royalty) feeding Billing (CFT/S), Costing (SSC), Invoicing (Geos, IOL), Financials (CLS, CARS)
Layered Architecture
Operational Model
Backup Slides
What is Docker
Simple APIs and readable Dockerfiles promote forking and sharing of code in Git/Maven-style repositories
Layered images promote Continuous Delivery processes and sharing
Lightweight images lend themselves to productive local environments for testing distributed scenarios
What is Docker?
Containers vs. VMs
[Figure: two stacks side by side. VM: Server, Host OS, Hypervisor (Type 2), then a separate Guest OS and Bins/Libs for each app (App A, App A’, App B). Container: Server, Host OS, Docker, then shared Bins/Libs under the containers App A, App A’, App B and several copies of App B’.]
Containers are isolated, but share the OS and, where appropriate, bins/libraries.
The result is significantly faster deployment, much less overhead, easier migration and faster restart.
Why are Docker containers lightweight?
[Figure: VMs vs. Containers for an original app, a copy of the app, and a modified app.]
VMs
– Every app, every copy of an app, and every slight modification of the app requires a new virtual server (App, Bins/Libs and Guest OS each time)
Containers
– Original App: no OS to take up space, resources, or require restart
– Copy of App: no OS; can share bins/libs
– Modified App: copy-on-write allows us to save only the diffs between container A and container A’
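The copy-on-write idea on this slide can be modelled in a few lines. This is a toy model, not how Docker is implemented: real containers rely on layered filesystems, while this sketch uses Python's `collections.ChainMap`. The file paths and contents are invented.

```python
from collections import ChainMap

# Read-only base image shared by all containers (file path -> content).
base_image = {
    "/bin/sh": "shell v1",
    "/lib/libc.so": "libc v1",
    "/app/main": "app A",
}


def new_container(image):
    """Create a container view: an empty writable layer on top of the image."""
    return ChainMap({}, image)


container_a = new_container(base_image)
container_a_prime = new_container(base_image)

# Modifying A' writes only to its own top layer; the base stays untouched.
container_a_prime["/app/main"] = "app A'"

# The writable layer is all that has to be stored for A'.
diff = container_a_prime.maps[0]

if __name__ == "__main__":
    print(diff)
```

Reads that miss the writable layer fall through to the shared base, which is why a copy or a small modification costs almost nothing extra.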
What are the basics of the Docker system?
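[Figure: Docker workflow across two Linux hosts and a registry.]
– Host 1 OS (Linux): a Source Code Repository holds the Dockerfile for A; the Docker Engine builds container A from it (Build) and pushes it to the Docker Registry (Push)
– Docker Registry: holds Container A, Container B and Container C
– Host 2 OS (Linux): the Docker Engine searches the registry and pulls from it (Search, Pull), then runs the containers (Run)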
Changes and Updates
[Figure: a base container image (App A, Bins/Libs) with the modified containers A’ and A’’ stored as deltas (App Δ, Bins/Libs Δ). One Docker Engine pushes the update to the Docker Registry; another Docker Engine requests the update.]
Host running A wants to upgrade to A’’. It requests the update and gets only the diffs.
Host is now running A’’.
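The "gets only diffs" update can be sketched as a comparison of layer lists. The layer identifiers and the `layers_to_pull` helper are hypothetical; the sketch only illustrates the principle that a host fetches the layers it is missing.

```python
def layers_to_pull(local_layers, image_layers):
    """Return the layers of the target image the host lacks, in image order."""
    have = set(local_layers)
    return [layer for layer in image_layers if layer not in have]


# Hypothetical layer identifiers for image A and the updated image A''.
IMAGE_A = ["base-bins-libs", "app-a"]
IMAGE_A2 = ["base-bins-libs", "app-a", "app-delta-1", "bins-delta-1"]

if __name__ == "__main__":
    # A host that already runs A only needs the two delta layers.
    print(layers_to_pull(IMAGE_A, IMAGE_A2))
```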
Marketecture
[Figure: Docker "marketecture" diagram. Recoverable labels:]
– App creation: Dockerfile, Source Code Repository, CI/CD, Dev Machine (Linux OS with Docker), Mac/Win Dev Machine (Boot2Docker)
– Docker Hub (public/private): Users, Provenance, Mgmt UI, Policy, Registries (Public, Private, Curated), Docker Hub API; hosts Search, Pull and Push against the hub
– App deployment and app management: Physical (PROD BOX: Linux OS with Docker), Virtual (VMs with Docker on GCE, RACK, IBM), Cloud/DaaS; Infrastructure Mgt
– Legend: grey items are non-Docker, Inc. items; italic items will not be ready until 2H 2014 or later; green is open source
Docker Ecosystem
What are Containers and Docker?
Docker Stats: Community Activity
– Container Downloads: +1.2 Million
– Trained Developers: +45K
– Dev Repos publishing containers to Docker Index: +14K
– % of total contributors who work outside Docker: ~95%
– Active Meetups: over 70 cities in 27 countries
– Integrations (& growing): OpenStack, RHEL, Ubuntu, Chef, Puppet, Salt, VMware, Google Cloud, Amazon, etc.
Top Community Members
• Containers provide isolation similar to VMs
• High performance due to lack of hypervisor overhead
• High density due to much smaller memory footprint allows greater cloud efficiency
• Near-instant startup time accelerates the DevOps cycle
• Docker images provide portability across Linux environments
18 months!
Four major use cases
Continuous Integration/Continuous Delivery:
– Go from developer’s laptop, through automated test, to production, and through scaling without modification
Alternative form of virtualization for multi-tenant services
Scale-out/Big Data:
– Rapidly scale the same application across hundreds or thousands of servers, and scale down as rapidly
Cross-Cloud Deployment:
– Move the same application across multiple clouds (public, private, or hybrid) without modification or noticeable delay
The Growth of Docker
Microsoft plans support for both Kubernetes & Docker on the Microsoft Azure platform
VMware plans 5 sessions and keynote content on containers at VMworld US
Google and Mesosphere join to bring together Mesosphere, Kubernetes and GCP
The community and vendors are quickly developing tooling: Fig consumed by Docker, Atlassian automated builds, Travis CI automated builds and Chef-supported images
AWS Elastic Beanstalk adds Docker support for building and deploying containers
BlueMix
Rich ecosystem of current and future IBM & 3rd Party services
“A platform where developers can act like kids in a sandbox - except this box is enterprise-grade.”
BlueMix Cloud Platform Services
IBM, Open Source and Third Party APIs
[Figure: BlueMix service catalog. Recoverable labels:]
– Categories: Runtimes, DevOps, Mobile Services, Data Management Services, Web & Application, Internet of Things, Mobile App Management
– Runtimes: Java Liberty, Ruby on Rails, Node.js, “Bring Your Own Buildpack”
– Services shown: IBM Relational Database, IBM JSON Database, MongoDB, PostgreSQL, MySQL, BLU Data Warehouse, MapReduce, Mobile Data, Mobile Sync, Push, MQTT, Cloud Code, Mobile App Mgmt, Mobile Quality Assurance, Twilio, Data Cache, Session Cache, Elastic MQ, RabbitMQ, Redis, Decision, SSO, Log Analysis, Historian
BlueMix DevOps experience
BlueMix Application Creation & Run Flow
[Figure: application creation and run flow. Recoverable content:]
APPLICATION
– Application source code (e.g. Liberty, Node) lives in JazzHub, GitHub or a local file
– Deployed and runs on: SoftLayer VM, OpenStack, OS (Ubuntu), Cloud Foundry DEA, Warden container, Liberty environment (installed as buildpack), running app code (installed as droplet)
– Uses the service's API (e.g. put in queue)
SERVICE
– A service (e.g. a queuing service) runs anywhere (it could even be a CF app with an API) and offers an API and a UI
– Services to secure & manage the app
– A service instance (e.g. a queue) is created by:
  • a call from the CF CLI/ACE UI
  • auto-creation from the manifest
  • externally or manually from marketplaces
– API/UI: configure the service instance
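The slide does not say how application code finds a bound service instance at run time. In Cloud Foundry, the credentials of bound services are exposed to the application through the `VCAP_SERVICES` environment variable as JSON. The sketch below reads that variable; the service label, instance name and credentials in `SAMPLE_ENV` are invented, and `find_credentials` is a hypothetical helper.

```python
import json
import os


def find_credentials(instance_name, environ=os.environ):
    """Return the credentials dict of a bound service instance, or None."""
    services = json.loads(environ.get("VCAP_SERVICES", "{}"))
    # VCAP_SERVICES maps each service label to a list of bound instances.
    for instances in services.values():
        for instance in instances:
            if instance.get("name") == instance_name:
                return instance.get("credentials")
    return None


# Invented example value, used so the sketch runs outside Cloud Foundry.
SAMPLE_ENV = {
    "VCAP_SERVICES": json.dumps({
        "queue-service": [
            {
                "name": "my-queue",
                "label": "queue-service",
                "credentials": {"uri": "amqp://user:secret@queue.example.com:5672"},
            }
        ]
    })
}

if __name__ == "__main__":
    print(find_credentials("my-queue", SAMPLE_ENV))
```

Reading credentials from the environment keeps the application code unchanged when it is bound to a different service instance.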
© 2009 IBM Corporation192
Services Interfaces
[Figure: service interfaces between a BlueMix application, BlueMix, and the service gateway & implementation. Recoverable labels:]
– Interface groups: lifecycle interfaces for the service; functional interfaces for the service
– Create/Bind/Unbind/Delete Service Instance
– Change Service Plan
– Service Instance Operations (Start, Stop)
– Monitor Service Instance Status & KPIs
– Service Usage Metering & BSS
– Scale/Auto-Scale Service Instance
– Security for Service Instance Access
– Backup Service Instance
– Service-Specific Functional Interfaces (UI/API)
– Notification Interfaces
– Admin UI for Service Instance
– Other labels: BlueMix Application, New RoC Apps, Service Gateway & Implementation
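The lifecycle operations named on this slide (create, bind, unbind, delete) can be sketched as an abstract interface with a toy implementation. The class names, method signatures and credential format are assumptions for illustration; they are not the BlueMix or Cloud Foundry API.

```python
from abc import ABC, abstractmethod


class ServiceLifecycle(ABC):
    """Hypothetical lifecycle interface a service implementation provides."""

    @abstractmethod
    def create_instance(self, instance_id, plan): ...

    @abstractmethod
    def bind(self, instance_id, app_id): ...

    @abstractmethod
    def unbind(self, instance_id, app_id): ...

    @abstractmethod
    def delete_instance(self, instance_id): ...


class InMemoryQueueService(ServiceLifecycle):
    """Toy implementation that only tracks state in a dictionary."""

    def __init__(self):
        self.instances = {}

    def create_instance(self, instance_id, plan):
        self.instances[instance_id] = {"plan": plan, "bound_apps": set()}

    def bind(self, instance_id, app_id):
        self.instances[instance_id]["bound_apps"].add(app_id)
        # Credentials handed back to the application (invented format).
        return {"queue": instance_id, "token": f"{app_id}-token"}

    def unbind(self, instance_id, app_id):
        self.instances[instance_id]["bound_apps"].discard(app_id)

    def delete_instance(self, instance_id):
        if self.instances[instance_id]["bound_apps"]:
            raise RuntimeError("unbind all applications before deleting")
        del self.instances[instance_id]


if __name__ == "__main__":
    service = InMemoryQueueService()
    service.create_instance("q1", plan="free")
    print(service.bind("q1", "app1"))
```

Keeping lifecycle calls behind one interface lets the platform manage any service the same way, while each service keeps its own functional API.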