Smuggling Multi-Cloud Support into Cloud-native Applications using Elastic Container Platforms

Prof. Dr. rer. nat. Nane Kratzke, Computer Science and Business Information Systems


The next 30 minutes are about ...

• What are Cloud-native Applications?

• Elastic Container Platforms and why they should be considered for multi-cloud research.

• A control loop to scale Elastic Container Platforms across Cloud Service Providers

• Some data of our evaluation

• 7 Lessons Learned and Conclusion



Cloud Application Maturity Model (CAMM)

Maturity Criteria

Level 3: Cloud Native

• Application can dynamically migrate across infrastructure providers without interruption of service.
• Application can elastically scale out/in appropriately based on stimuli.

Level 2: Cloud Resilient

• Services are stateless.
• Application is unaware and unaffected by failure of dependent services.
• Application is infrastructure agnostic and can run anywhere.

Level 1: Cloud Friendly

• Application is composed of loosely coupled services.
• Application services are discoverable by name.
• Application deployment units are designed according to cloud patterns (e.g. 12-factor app principles).
• Application compute and storage are separated.
• Application consumes one or more cloud services: compute, storage, network.

Level 0: Cloud Ready

• Application runs on virtualized infrastructure.
• Application can be instantiated from an image or script.

According to Open Data Center Alliance Best Practices (Architecting Cloud-Aware Applications), 2014, with add-ons by practitioner Mario-Leander Reimer (QAWare).

Covered by a lot of SOA and cloud deployment approaches.

This contribution's focus ...

Research Surveillance of Practitioners


Docker Swarm: Swarm Mode (since Docker 1.12) "copies" the idea of Kubernetes-like control processes but integrates them in just one component. Secure by default (control and data plane). Hides operational complexity.

Google (Kubernetes): Control processes that continuously drive the current state of container-based applications towards an intended desired state. Makes Google's experience of running large scale production workloads available as open source (especially from the Google-internal Borg system).

Mesosphere: Apache Mesos based datacenter operating system for fine-grained resource allocation. Frameworks to operate containers and data services. Datacenter focused. Mesos has successfully operated large scale datacenters for years (Twitter, Netflix, ...).

Practitioners ask for simple solutions (elastic platforms) ...

The very basic idea ...


1. Operate application on current provider.

2. Scale cluster into prospective provider.

3. Shut down nodes on current provider. Cluster reschedules lost containers.

4. Migration finished.

Quint, P.-C., & Kratzke, N. (2016). Overcome Vendor Lock-In by Integrating Already Available Container Technologies - Towards Transferability in Cloud Computing for SMEs. In Proceedings of CLOUD COMPUTING 2016 (7th International Conference on Cloud Computing, GRIDS and Virtualization).

Avoiding Vendor Lock-In:

• Make use of elastic container platforms to operate elastic services being deployable to any IaaS cloud infrastructure.

• Transfer of these services from one private or public cloud infrastructure to another would be possible at runtime.

But the idea provides more options ...


Simply stop „a transfer“ somewhere in between and you get ...

One Control Loop for All



Control Loop Example: Deploy a cluster

Definition of an intended state.

{
  "type": "cluster",
  "platform": "Swarm",
  "deployments": [
    {
      "district": "gce-europe",
      "flavor": "small",
      "role": "master",
      "quantity": 1
    },
    {
      "district": "gce-europe",
      "flavor": "small",
      "role": "worker",
      "quantity": 9
    },
    {
      "district": "aws-europe",
      "flavor": "small",
      "role": "worker",
      "quantity": 0
    }
  ]
}

Derive a prioritized action list.

|| Create secgroup for gce-europe
-- Create master in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe
|| Create worker in gce-europe

|| executed in parallel
-- executed sequentially
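The step from an intended state to a prioritized action list can be sketched as a diff between desired quantities and existing resources. The following Python sketch is an illustration only, not the project's implementation: the function name `plan_actions`, the `kind` field of a resource record, and the rule that a security group is created once per district with a non-zero quantity are assumptions made for this example.

```python
from collections import Counter

# Intended state from the slide (gce-europe: 1 master, 9 workers; aws-europe: 0 workers).
INTENDED = {
    "type": "cluster",
    "platform": "Swarm",
    "deployments": [
        {"district": "gce-europe", "flavor": "small", "role": "master", "quantity": 1},
        {"district": "gce-europe", "flavor": "small", "role": "worker", "quantity": 9},
        {"district": "aws-europe", "flavor": "small", "role": "worker", "quantity": 0},
    ],
}


def plan_actions(intended, current):
    """Diff the intended state against current resources (hypothetical helper).

    `current` is a list of dicts such as {"kind": "node", "district": ...,
    "role": ...} or {"kind": "secgroup", "district": ...}; this record layout
    is an assumption. Returns (mode, action) pairs with mode "||" (parallel)
    or "--" (sequential), following the legend on the slide.
    """
    nodes = Counter((r["district"], r["role"]) for r in current if r["kind"] == "node")
    secgroups = {r["district"] for r in current if r["kind"] == "secgroup"}

    creates = {"secgroup": [], "master": [], "worker": []}
    deletes = []
    for dep in intended["deployments"]:
        district, role, want = dep["district"], dep["role"], dep["quantity"]
        have = nodes[(district, role)]
        if want > 0 and district not in secgroups:
            secgroups.add(district)
            creates["secgroup"].append(("||", f"Create secgroup for {district}"))
        mode = "--" if role == "master" else "||"
        creates[role] += [(mode, f"Create {role} in {district}")] * max(want - have, 0)
        deletes += [("--", f"Delete {role} in {district}")] * max(have - want, 0)

    # Assumed priority: security groups, then masters, then workers, then deletions.
    return creates["secgroup"] + creates["master"] + creates["worker"] + deletes


if __name__ == "__main__":
    for mode, action in plan_actions(INTENDED, current=[]):
        print(mode, action)
```

Called with an empty list of current resources, the sketch yields one security group, one master, and nine workers for gce-europe, and nothing for aws-europe because its quantity is 0.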

Updated resources.

- Secgroup for gce-europe
- Master node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe

All detail data like IP addresses, identifiers, etc. omitted for better readability.

Control Loop Example: Transfer of five worker nodes

{
  "type": "cluster",
  "platform": "Swarm",
  "deployments": [
    {
      "district": "gce-europe",
      "flavor": "small",
      "role": "master",
      "quantity": 1
    },
    {
      "district": "gce-europe",
      "flavor": "small",
      "role": "worker",
      "quantity": 9
    },
    {
      "district": "aws-europe",
      "flavor": "small",
      "role": "worker",
      "quantity": 0
    }
  ]
}

Changed quantities: gce-europe workers 9 → 4, aws-europe workers 0 → 5.

Derived action list:

|| Create secgroup for aws-europe
|| Create worker in aws-europe
|| Create worker in aws-europe
|| Create worker in aws-europe
|| Create worker in aws-europe
|| Create worker in aws-europe
-- Delete worker in gce-europe
-- Delete worker in gce-europe
-- Delete worker in gce-europe
-- Delete worker in gce-europe
-- Delete worker in gce-europe

|| executed in parallel
-- executed sequentially

Resources:

- Secgroup for gce-europe
- Secgroup for aws-europe
- Master node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in gce-europe
- Worker node in aws-europe
- Worker node in aws-europe
- Worker node in aws-europe
- Worker node in aws-europe
- Worker node in aws-europe
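The "||" and "--" markers suggest how such an action list could be executed. The sketch below is one possible reading, not the authors' implementation: it assumes that consecutive "||" actions form a batch that runs concurrently and that "--" actions run one after another. The names `execute` and `fake_run` are hypothetical, and `fake_run` merely stands in for real provider calls.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby

# Action list of the transfer example as (mode, action) pairs.
ACTIONS = (
    [("||", "Create secgroup for aws-europe")]
    + [("||", "Create worker in aws-europe")] * 5
    + [("--", "Delete worker in gce-europe")] * 5
)


def fake_run(action):
    """Placeholder for a provider call; a real one would talk to an IaaS API."""
    return f"done: {action}"


def execute(actions, run):
    """Run actions in list order; results are returned in the same order."""
    results = []
    # groupby only merges neighbours, so the priority order of the list is kept.
    for mode, group in groupby(actions, key=lambda pair: pair[0]):
        names = [name for _, name in group]
        if mode == "||":
            # Assumed semantics: a run of "||" actions executes concurrently.
            with ThreadPoolExecutor(max_workers=len(names)) as pool:
                results.extend(pool.map(run, names))
        else:
            # "--" actions execute strictly one at a time.
            results.extend(run(name) for name in names)
    return results


if __name__ == "__main__":
    for line in execute(ACTIONS, fake_run):
        print(line)
```

Under this reading, the security group and the new workers would start in the same batch. A real planner would probably need a barrier between a security group and the nodes that depend on it; the slides do not say how this is handled.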

Resulting Architecture (Domain Model)

Extension point for elastic platforms

Currently supported: Kubernetes, Swarm

Extension point for IaaS infrastructures

Currently supported: AWS, GCE, Azure, OpenStack

Evaluation: 5 Experiments (with a 1 Master and 9 Worker Cluster)

Infrastructures: Google Compute Engine (GCE, n1-standard-2) and Elastic Compute Cloud (EC2, m3.large). The same experiments have been done with OpenStack as well.

E1: Launch a 10 node cluster.

E2: Terminate a 10 node cluster.

E3: Transfer one node of the cluster.

E4: Transfer 5 nodes of the cluster.

E5: Transfer all nodes of the cluster.

The cluster was Docker Swarm (operating a Sock Shop reference application and a Redis-based guestbook).

Different elastic container platforms (Kubernetes, Docker Swarm) had no significant impact on the runtimes. Therefore, data is only presented for Docker Swarm.

Evaluation (Single Cloud): Deploying and terminating clusters

[Figure: runtimes of Experiment E1 and Experiment E2. Annotation: "10 times longer ???"]

Evaluation (Multi-Cloud): Transfer GCE ⇠⇢ AWS

[Figure: runtimes of Experiment E3, Experiment E4, and Experiment E5. Annotation: "Comparable with a shutdown."]

Node termination times seem to dominate the transfer times massively.

Why these (dramatic) differences?

Analysis showed:

1. The GCE API works synchronously (a node termination call blocks until termination is completed).

2. The AWS API works asynchronously (a node termination call does not block until termination is completed: fire and forget).

3. GCE SDN-related processing times take far longer than AWS SDN-related processing times.
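The effect of a blocking versus a fire-and-forget termination call on a sequential delete phase can be illustrated with a small simulation. This is a sketch under stated assumptions: it does not call any real GCE or AWS SDK, the function names are hypothetical, and the sleep duration is an arbitrary stand-in for the time a provider needs to terminate a node.

```python
import threading
import time


def terminate_blocking(node, duration):
    """Simulated synchronous API: returns only after termination completed."""
    time.sleep(duration)


def terminate_fire_and_forget(node, duration):
    """Simulated asynchronous API: termination continues in the background."""
    threading.Thread(target=time.sleep, args=(duration,), daemon=True).start()


def measure(terminate, nodes, duration):
    """Time how long the caller is blocked while terminating nodes one by one."""
    start = time.perf_counter()
    for node in nodes:
        terminate(node, duration)
    return time.perf_counter() - start


if __name__ == "__main__":
    nodes = [f"worker-{i}" for i in range(5)]
    print("blocking:       ", measure(terminate_blocking, nodes, 0.05))
    print("fire and forget:", measure(terminate_fire_and_forget, nodes, 0.05))
```

With the blocking variant the caller waits roughly the number of nodes times the per-node duration, while the fire-and-forget variant returns almost immediately. That matches the observation that sequentially executed deletions dominate the measured runtimes when the API blocks.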


Conclusion

• Elastic container platforms provide often overlooked multi-cloud opportunities.

• We successfully demonstrated multi-cloud transfers between AWS, GCE, Azure and OpenStack using a simple control loop (scaling Kubernetes and Docker Swarm Mode).

• The control loop is designed so that it can be integrated into a MAPE loop as the execution phase.

• A cybernetic understanding (intended state vs. current state) makes a lot of multi-cloud workflows easier.

• On the downside: The solution is limited to container-based applications (CNMM Level 3) and services (but that seems to be becoming a dominating architectural style).

• New research opportunities and future research directions:
  • Making the solution available as open source
  • P2P-based elastic platforms would make deployments even easier (no worker/master roles)
  • There is room for improvement (e.g. resource-efficient action planning)

Acknowledgement

This research is funded by the German Federal Ministry of Education and Research (03FH021PX4). I would like to thank Peter Quint, Christian Stüben, and Arne Salveter for their hard work and their contributions to the project Cloud TRANSIT.

Picture Reference

• Elastic Straps: Pixabay (CC0 Public Domain, PublicDomainPictures)
• Definition: Pixabay (CC0 Public Domain, PDPics)
• Class room: Pixabay (CC0 Public Domain, Unsplash)
• Railway: Pixabay (CC0 Public Domain, Fotoworkshop4You)
• Air Transport: Pixabay (CC0 Public Domain, WikiImages)

About


Nane Kratzke

CoSA: http://cosa.fh-luebeck.de/en/contact/people/n-kratzke

Blog: http://www.nkode.io

Twitter: @NaneKratzke

GooglePlus: +NaneKratzke

LinkedIn: https://de.linkedin.com/in/nanekratzke

GitHub: https://github.com/nkratzke

ResearchGate: https://www.researchgate.net/profile/Nane_Kratzke

SlideShare: http://de.slideshare.net/i21aneka

Backup Slides

Elastic Platforms and Multi-cloud requirements

Multi-cloud requirements and the platform concepts contributing to them:

Transferability:
• Integration of nodes into one logical cluster
• Designed for failure
• Cross-provider deployable

Data location awareness:
• Pod concept (Kubernetes)
• Volume orchestrator (Flocker for Docker)

Geolocation awareness, pricing awareness, legislation/policy awareness, local resources awareness:
• Tagging of nodes with geolocation, pricing, policy or on-premise information
• Platform schedulers have selectors (Swarm) / affinities (Kubernetes) / constraints (Mesos/Marathon) to evaluate these taggings

Security requirements:
• Encrypted data / control plane (Swarm)
• Encrypted overlay networks (e.g. Weave for Kubernetes)

Several transferability, awareness and security requirements come along with multi-cloud approaches. Already existing elastic container platforms contribute to fulfilling these requirements.
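How tagging and selectors fit together can be shown with a minimal, platform-independent sketch. It does not use the Swarm, Kubernetes, or Marathon APIs; the label keys (`provider`, `region`, `on_premise`), the node names, and the function `select_nodes` are hypothetical and chosen only to mirror the awareness requirements listed above.

```python
# Hypothetical node inventory; labels play the role of the "taggings".
NODES = [
    {"name": "node-1", "labels": {"provider": "gce", "region": "europe", "on_premise": "false"}},
    {"name": "node-2", "labels": {"provider": "aws", "region": "europe", "on_premise": "false"}},
    {"name": "node-3", "labels": {"provider": "openstack", "region": "europe", "on_premise": "true"}},
    {"name": "node-4", "labels": {"provider": "aws", "region": "us", "on_premise": "false"}},
]


def select_nodes(nodes, constraints):
    """Return names of nodes whose labels satisfy every key/value constraint."""
    return [
        node["name"]
        for node in nodes
        if all(node["labels"].get(key) == value for key, value in constraints.items())
    ]


if __name__ == "__main__":
    # A legislation-style constraint: only schedule on European, on-premise nodes.
    print(select_nodes(NODES, {"region": "europe", "on_premise": "true"}))
```

Real schedulers offer richer matching (for example set-based or soft preferences), but the principle is the same: requirements are expressed as constraints over node labels rather than as provider-specific deployment logic.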

Cloud-native Application

What? Be IDEAL

• Isolated State
• Distributed
• Elastic
• Automated management
• Loosely coupled

Why? There is a need for ...

• Speed (delivery)
• Safety (fault tolerance, design for failure)
• Scalability
• Client diversity

How? Integrate ...

• (Micro)service oriented architectures (M)SOA
• Use API-based collaboration
• Consider cloud-focused pattern catalogues
• Use self-service agile platforms

C. Fehling, F. Leymann, R. Retter, W. Schupeck, and P. Arbitter, Cloud Computing Patterns: Fundamentals to Design, Build, and Manage Cloud Applications. Springer, 2014.

M. Stine, Migrating to Cloud-Native Application Architectures. O’Reilly, 2015

A. Balalaie, A. Heydarnoori, and P. Jamshidi, “Migrating to Cloud-Native Architectures Using Microservices”, CloudWay 2015, Taormina, Italy

S. Newman, Building Microservices. O’Reilly, 2015.

Often heard by practitioners: „A cloud-native application is an application intentionally designed for the cloud.“ True, but helpful?

Cloud-native Application Definition

[KQ2017a] Kratzke, N., & Quint, P.-C. (2017). Understanding Cloud-native Applications after 10 Years of Cloud Computing - A Systematic Mapping Study. Journal of Systems and Software, 126 (April).

We need some guidance ...

ClouNS – Cloud-native Application Reference Model

[KP2016] Kratzke, N., & Peinl, R. (2016). ClouNS - a Cloud-Native Application Reference Model for Enterprise Architects. In 2016 IEEE 20th International Enterprise Distributed Object Computing Workshop (EDOCW) (pp. 1–10).

Did you know?

[Figure: relation of considered services (0% to 100%) per year, 2006 to 2016. Series: considered by CIMI, OCCI, CDMI, OVF, OCI, TOSCA; not considered.]

Cloud standards improved over the last 10 years. However, cloud standardization coverage decreased (in relation to all available services).

Analyzed using over 2300 official release notes of Amazon Web Services (AWS). Data for other providers like Google, Azure, Rackspace, etc. is not presented. Basic conclusions for these providers are the same.

[KQP+2016] Kratzke, N., Quint, P.-C., Palme, D., & Reimers, D. (2016). Project Cloud TRANSIT - Or to Simplify Cloud-native Application Provisioning for SMEs by Integrating Already Available Container Technologies. In V. Kantere & B. Koch (Eds.), European Project Space on Smart Systems, Big Data, Future Internet - Towards Serving the Grand Societal Challenges.

Research Methodology

Main focus of this contribution

CNA == Cloud-native Application

Evaluation: Virtual Machine Type Selection

[KQ2015] Kratzke, N., & Quint, P.-C. (2015). About Automatic Benchmarking of IaaS Cloud Service Providers for a World of Container Clusters. Journal of Cloud Computing Research, 1(1), 16–34.

We searched for the most similar machine types of different public cloud service providers. The similarity indicator maps processing, memory, network, and disk I/O performance to just one similarity value (1 means identical, 0 means no similarity at all).

This reference model guides our research

Developing a description language for cloud-native applications.

Developing a standardized way of deploying a clustered container runtime environment for cloud-native applications (CNMM Level 3 conform deployment/operation).

Make use of commodity services of public cloud service providers only (IaaS).

Research Surveillance of Practitioners

Practitioners often prefer layer-based reference models ...

Jason Lavigne, ”Don’t let aPaaS you by - What is aPaaS and why Microsoft is excited about it”, see https://atjasonunderscorelavigne.wordpress.com/2014/01/27/dont-let-apaas-you-by/ (accessed 4th August 2016)

Johann den Haan, ”Categorizing and Comparing the Cloud Landscape”, see http://www.theenterprisearchitect.eu/blog/categorize-compare-cloud-vendors/ (accessed 4th August 2016)

Josef Adersberger, Andreas Zitzelsberger, Mario-Leander Reimer, ”Der Cloud-Native-Stack: Mesos, Kubernetes und Spring Cloud”, see http://www.qaware.de/fileadmin/user_upload/QAware-Cloud-Native-Artikelserie-Java_Magazin-1.pdf (accessed 4th August 2016)

MEKUNS Cloud Landscape Model