(C) 2017 HEIG-VD | TSM-ClComp-EN Cloud Computing
CNA — Cloud-native applications
Prof. Dr. Marcel Graf, Prof. Dr. Thomas Michael Bohnert
TSM-ClComp-EN Cloud Computing | Cloud-native applications | Academic year 2017/2018
Cloud-native applications
■ Cloud-native application
■ Service Oriented Architecture / Microservice Architecture
■ Twelve-factor applications
■ Cloud Native Application Design Patterns
■ Serverless applications
Benefits & Drawbacks of Cloud Computing
Benefits for application providers
■ Obtain IT resources on demand (compute, storage, network, services, ...)
→ speed up the development / deployment cycle → improve time-to-market
■ Pay-as-you-go pricing model → optimize costs

Drawbacks for application providers
■ Cloud infrastructures are built on commodity hardware to leverage economies of scale
→ increased failure rate
■ The cloud infrastructure is shared by its customers (resource pooling)
→ negative influence on performance (i.e. the Noisy Neighbour problem)
Goal: run applications in an economically efficient way → avoid
■ over-provisioning (paying for unused resources)
■ under-provisioning (degrading QoS, violating SLAs)
Characteristics of CNA
A cloud-native application is an application optimized for running on a cloud infrastructure (IaaS or PaaS), having the essential characteristics of being scalable and resilient.*
One of the main goals of a cloud-native application is to use its underlying cloud resources as economically efficiently as possible.
* Each phase in the application life-cycle has to be adapted and optimized for running in a cloud environment.
A cloud-native application is typically designed as a distributed application built up from stateless components (employing asynchronous communication).
It is also possible to get there by migrating (re-designing) an already existing application. → It is just semantics, but in that case the application cannot be called "cloud-native".
Cloud-native application implementation
■ To implement CNA, the following topics need to be addressed:
■ Architecture: the application needs to be designed for scalability and resilience
→ move towards a Service Oriented Architecture / Microservices Architecture
→ use CNA-specific patterns
■ Organization: teams need to be (re)organized into
→ agile teams organized around business capabilities
→ incorporating DevOps principles and methodologies
■ Process: tools & technologies need to be adapted/extended
→ automated software development / deployment / management pipeline
■ Methodologies and guidelines to support cloud-native application development:
■ Service Oriented Architecture (SOA) / Microservices Architecture
■ Twelve-Factor Application
■ CNA Patterns
Service Oriented Architecture (SOA)
Open group definition: “Service-Oriented Architecture (SOA) is an architectural style that supports service-orientation. Service-orientation is a way of thinking in terms of services and service-based development and the outcomes of services.”
Service: a self-contained unit of functionality that can be accessed in a remote, standardized, technology-independent fashion
SOA principles:
■ Standardized protocols (e.g., SOAP, REST)
■ Abstraction (from service implementation)
■ Loose coupling
■ Reusability
■ Composability
■ Stateless services
■ Discoverable services
Microservices: Introduction
■ The microservices architecture is an SOA architectural style for developing applications
■ as a suite of small services,
■ each running in its own process
■ and communicating with lightweight mechanisms (REST APIs).
■ They are built around business capabilities, following the "do one thing well" principle.
■ They are independently deployable in a fully automatic way.
■ There is some minimal centralized management of the services.
■ The term was coined in 2011 at a software architects' workshop and formalized by Martin Fowler and others in 2014, but the approach emerged much earlier among cloud-oriented companies:
■ Amazon in the early 2000s moved from the monolithic Obidos application to a service-oriented architecture.
■ Netflix (whose infrastructure is completely cloud-based) did a major redesign of their system along a microservice architecture.
■ Pivotal is one of the main promoters of microservices today.
→ For more details about Microservices see appendix
Twelve-Factor Applications: Introduction
■ A collection of best practices for cloud-native application architectures, originally developed by engineers at Heroku.
■ A methodology for building software-as-a-service apps that
■ are suitable for deployment on modern cloud platforms, obviating the need for servers and systems administration → supports the development / deployment workflow
■ can scale up without significant changes to tooling, architecture, or development practices → scalability
■ Published in 2011
■ http://12factor.net/
Twelve Factors: I. Codebase
One codebase tracked in revision control, many deploys
■ A twelve-factor app is always tracked in a version control system (Git, Mercurial, Subversion, ...).
■ There is only one codebase per app, but there will be many deploys of the app.
■ The codebase is the same across all deploys, although different versions may be active in each deploy.
The twelve factors:
I. Codebase
II. Dependencies
III. Config
IV. Backing services
V. Build, release, run
VI. Processes
VII. Port binding
VIII. Concurrency
IX. Disposability
X. Dev/prod parity
XI. Logs
XII. Admin processes
Twelve Factors: II. Dependencies
Explicitly declare and isolate dependencies
■ A twelve-factor app never relies on the implicit existence of system-wide packages.
■ It declares all dependencies, completely and exactly, via a dependency declaration manifest.
■ It uses a dependency isolation tool to ensure that no implicit dependencies "leak in" from the surrounding system.
■ Advantage of explicit dependency declaration: it simplifies setup for developers new to the app. They just run a build command which sets up everything deterministically.
Twelve Factors: III. Config
Store config in the environment
■ An app's config is everything that is likely to vary between deploys (staging, production, developer environments, etc.). This includes:
■ Resource handles to the database, Memcached, and other backing services
■ Credentials to external services such as Amazon S3 or Twitter
■ Per-deploy values such as the canonical hostname for the deploy
■ Twelve-factor requires strict separation of config from code and stores config in environment variables.
■ Env vars are easy to change between deploys without changing any code.
■ Unlike config files, there is little chance of them being checked into the code repo accidentally.
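As an illustrative sketch (the variable names and defaults are hypothetical, not part of the twelve-factor text), reading per-deploy config from environment variables in Python could look like this:

```python
import os

# Hypothetical per-deploy settings, read from the environment at startup.
# The defaults here are suitable only for a local development deploy.
DATABASE_URL = os.environ.get("DATABASE_URL", "postgres://localhost:5432/devdb")
CANONICAL_HOST = os.environ.get("CANONICAL_HOST", "localhost")

def require(name: str) -> str:
    """Fail fast at startup if a mandatory config value is missing."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"missing required environment variable {name}")
    return value
```

Changing `DATABASE_URL` between staging and production then requires no code change, only a different environment.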
Twelve Factors: IV. Backing services
Treat backing services as attached resources
■ A backing service is any service the app consumes over the network as part of its normal operation (databases / datastores, messaging / queueing systems, SMTP services, caching, ...).
■ The code for a twelve-factor app makes no distinction between local and third-party services.
■ To the app, both are attached resources, accessed via a URL or other locator/credentials stored in the config.
■ At runtime, resources can be attached to and detached from deploys at will, without any code changes.
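A sketch of the "resource handle" idea: the app only sees a URL from the config and splits it into connection parameters, so swapping a local database for a third-party one is a config change (function and field names are illustrative):

```python
from urllib.parse import urlparse

def parse_backing_service(url: str) -> dict:
    """Split a resource handle from the config into connection parameters.
    The app does not care whether the URL points to a local or a
    third-party service; both are just attached resources."""
    parts = urlparse(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "user": parts.username,
        "password": parts.password,
        "database": parts.path.lstrip("/"),
    }
```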
Twelve Factors: V. Build, release, run
Strictly separate build and run stages
■ The twelve-factor app uses strict separation between the build, release, and run stages.
■ For example, it is impossible to make changes to the code at runtime, since there is no way to propagate those changes back to the build stage.
■ Builds are initiated by the app's developers whenever new code is deployed. Runtime execution, by contrast, can happen automatically in cases such as a server reboot, or a crashed process being restarted by the process manager.
Twelve Factors: VI. Processes
Execute the app as one or more stateless processes
■ Twelve-factor processes are stateless and share-nothing.
■ The twelve-factor app never assumes that anything cached in memory or on disk will be available on a future request.
■ Any data that needs to persist must be stored in a stateful backing service, typically a database.
■ Sticky sessions are a violation of twelve-factor and should never be used or relied upon. Session state data is a good candidate for a datastore that offers time-expiration, such as Memcached or Redis.
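A toy sketch of the time-expiration semantics a twelve-factor app expects from a session datastore. Note the in-process dict is for illustration only; real session state must live in a shared backing service such as Memcached or Redis, otherwise the process is no longer stateless:

```python
import time

class ExpiringStore:
    """Toy key-value store with time-expiration, mimicking the TTL
    semantics of Memcached or Redis (illustration only)."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp)

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, time.time() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expiry = entry
        if time.time() >= expiry:
            del self._data[key]  # lazily evict expired entries
            return None
        return value
```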
Twelve Factors: VII. Port binding
Export services via port binding
■ The twelve-factor app is completely self-contained and does not rely on runtime injection of a webserver into the execution environment to create a web-facing service. The web app exports HTTP as a service by binding to a port, and listening to requests coming in on that port.
■ One app can become the backing service for another app, by providing the URL to the backing app as a resource handle in the config for the consuming app.
Twelve Factors: VIII. Concurrency
Scale out via the process model
■ Apps handle diverse workloads by assigning each type of work to a process type. For example, HTTP requests may be handled by a web process, and long-running background tasks by a worker process.
■ Scale out app processes horizontally and independently for each process type.
■ The array of process types and the number of processes of each type is known as the process formation.
Twelve Factors: IX. Disposability
Maximize robustness with fast startup and graceful shutdown
■ The twelve-factor app's processes are disposable, meaning they can be started or stopped at a moment's notice.
■ Minimize startup time.
■ Shut down gracefully when receiving a termination signal.
■ This facilitates fast elastic scaling, rapid deployment of code or config changes, and robustness of production deploys.
■ Processes should also be robust against sudden death, in the case of a failure in the underlying hardware.
Twelve Factors: X. Dev/prod parity
Keep development, staging, and production as similar as possible
■ Continuous delivery and deployment are enabled by keeping development, staging, and production environments as similar as possible.
■ Avoid gaps between development and production:
■ Avoid the time gap: write code and have it deployed hours or even just minutes later.
■ Avoid the personnel gap: the developers who wrote the code are closely involved in deploying it and watching its behavior in production.
■ Avoid the tools gap: use the same tools in development and production.
Twelve Factors: XI. Logs
Treat logs as event streams
■ All running processes and backing services produce logs, which are commonly written to a file on disk.
■ A twelve-factor app never concerns itself with routing or storage of its output stream.
■ In staging or production deploys, each process' stream will be captured by the execution environment, collated together with all other streams from the app, and routed to one or more final destinations for viewing and long-term archival.
■ The stream can be sent to a log indexing and analysis system such as Splunk. This allows for great power and flexibility for introspecting an app's behavior over time, including:
■ Finding specific events in the past.
■ Large-scale graphing of trends (such as requests per minute).
■ Active alerting according to user-defined heuristics (such as an alert when the quantity of errors per minute exceeds a certain threshold).
Twelve Factors: XII. Admin processes
Run admin/management tasks as one-off processes
■ Administrative or management tasks, such as database migrations, should be kept in source control and packaged with the application.
■ They are executed as one-off processes in environments identical to the app's long-running processes.
Design patterns for CNA: Introduction
■ After companies started embracing the cloud and building applications for it, best practices and new design patterns for such applications emerged.
■ They all came into existence because of the need to deal with this new environment and its new challenges.
■ The aim of most of the patterns is to make the application more scalable and/or resilient.
Service Registry / Configuration Management
Problem
■ All components/services in a cloud are dynamic (think of the lifecycle): every resource/component/service can be provisioned and disposed of at any time.
■ How do clients of a service and/or routers know about the available instances of a service?
Main ideas
■ The service registry manages information about available services:
■ Configuration of the services
■ Endpoints of the services / how to reach a service
■ Status of a service
■ Service instances are registered on startup and deregistered on shutdown.
■ Can be polled/used by other components/services (e.g. a load balancer).
■ Mostly implemented as a highly available key-value store.
Implementations
■ Etcd
■ Consul
■ Zookeeper
■ Netflix Eureka
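The register/deregister/lookup cycle above can be sketched as a toy in-memory registry. Real implementations (etcd, Consul, Zookeeper, Eureka) are distributed, highly available and support watches and TTLs; the names below are illustrative:

```python
class ServiceRegistry:
    """Toy in-memory service registry (illustration only)."""

    def __init__(self):
        self._instances = {}  # service name -> {instance id: endpoint}

    def register(self, service, instance_id, endpoint):
        # Called by a service instance on startup.
        self._instances.setdefault(service, {})[instance_id] = endpoint

    def deregister(self, service, instance_id):
        # Called on (graceful) shutdown.
        self._instances.get(service, {}).pop(instance_id, None)

    def lookup(self, service):
        """Return all currently known endpoints, e.g. for a load balancer."""
        return list(self._instances.get(service, {}).values())
```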
Service Request Flow
Source: http://techblog.netflix.com/2015/02/a-microscope-on-microservices.html
A complex chain of services calling one or more other services → may result in high latency.
Circuit Breaker
Problem
■ In a distributed environment remote services or shared resources might become unavailable at some point. This may affect the response time of depending services.
Main idea
■ A failed service should not influence other parts of a distributed system.
■ Fail fast and allow the invoking service to react.
Implementation
■ A circuit breaker is a proxy that sits between your application and a remote service or shared resource that your application accesses.
■ If a request to the remote service or shared resource is highly likely to fail, the circuit breaker proxy will not forward the request and immediately responds with an alternative default result or an error message.
This achieves two things:
■ The remote service or shared resource will not be flooded by requests, which could amplify the problem.
■ The application won't waste resources (RAM, CPU) on requests that in the end will time out or never return.
Circuit Breaker: State Machine
■ The circuit breaker is implemented as a state machine with three states, mimicking an electric circuit breaker → Closed, Open and Half-Open.
■ Closed:
■ The circuit breaker passes every request.
■ If a request fails, a failure counter is increased.
■ If the failure counter surpasses a certain threshold, the circuit breaker changes its state to Open.
■ The failure counter is reset after a certain amount of time (this prevents the state from changing to Open if only sporadic failures occur).
■ Open:
■ The circuit breaker immediately returns an error when asked to forward a request.
■ After a certain amount of time it changes to the Half-Open state.
■ Another possibility is that the circuit breaker itself sporadically sends some requests/pings to see whether the service is responsive again, and changes to the Half-Open state after confirming it.
■ Half-Open:
■ The circuit breaker lets some requests pass while still responding with an immediate error to most requests. The reason is to avoid overwhelming a possibly recuperating service. It counts the successful requests and changes its state to Closed once a certain number of successful requests have passed.
■ If a request fails, the state immediately changes to Open again.
Source: Cloud Design Patterns: Prescriptive Architecture Guidance For Cloud Applications
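The state machine above can be sketched in a few lines of Python. This is a minimal single-threaded illustration, not a production implementation; thresholds, timeouts and names are illustrative:

```python
import time

class CircuitBreaker:
    """Minimal sketch of the Closed / Open / Half-Open state machine."""

    def __init__(self, failure_threshold=3, retry_after=30.0, close_after=2):
        self.failure_threshold = failure_threshold  # failures before opening
        self.retry_after = retry_after              # seconds before half-open
        self.close_after = close_after              # successes before closing
        self.state = "closed"
        self._failures = 0
        self._successes = 0
        self._opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state == "open":
            if time.time() - self._opened_at >= self.retry_after:
                self.state = "half-open"  # let trial requests through
                self._successes = 0
            else:
                raise RuntimeError("circuit open: failing fast")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        if self.state == "half-open":
            self._trip()  # a failure while probing reopens immediately
        else:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._trip()

    def _on_success(self):
        if self.state == "half-open":
            self._successes += 1
            if self._successes >= self.close_after:
                self.state = "closed"
                self._failures = 0
        else:
            self._failures = 0

    def _trip(self):
        self.state = "open"
        self._opened_at = time.time()
        self._failures = 0
```

A caller wraps every remote invocation in `breaker.call(...)` and treats the fast `RuntimeError` as a signal to use a fallback result.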
Service Load Balancer
Problem
■ A single cloud service implementation has a finite capacity, which leads to runtime exceptions, failures and performance degradation when its processing thresholds are exceeded.
Main idea
■ Redundant deployments of the cloud service are created and a load balancing system is added to dynamically distribute workloads across the cloud service implementations.
■ A service load balancer is a piece of software logic which receives information and forwards this information to a multitude of recipients.
■ There are multiple ways (algorithms) to take the forwarding decision. Some examples:
■ Round-Robin: information is distributed in a round-robin fashion
■ Least-Connection: information is sent to the recipient with the fewest current connections
■ Source: information is sent to the same recipient as on former connections → used for sticky sessions
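The first two forwarding algorithms can be sketched as follows (toy illustration; names are ours, and a real balancer would track connection counts itself):

```python
import itertools

class RoundRobinBalancer:
    """Hands out service instances in turn."""

    def __init__(self, instances):
        self._cycle = itertools.cycle(instances)

    def next_instance(self):
        return next(self._cycle)

def least_connection(connections):
    """Pick the instance with the fewest open connections.
    `connections` maps instance -> current connection count."""
    return min(connections, key=connections.get)
```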
Service Load Balancer variants

Dedicated / central load balancer (example: HAProxy):
■ Runs in a separate process
■ Shared by clients
■ Often shared for multiple services
■ Possible bottleneck

Client-side load balancer (example: Netflix Ribbon):
■ Implemented and used as a library
■ Runs in the client process
■ The client decides which service instance to connect to
■ Required to keep in sync (needs a registry)

[Diagram: in both variants the load balancer forwards client requests to the service instances and learns the available instances from a service registry.]
API Gateway
Problem
■ A large application may consist of many endpoints, especially in a microservices architecture. If the client (e.g. a browser) called these directly, it would cause high network traffic and increased latency. It might even need to know internal structures or support multiple protocols. This makes it difficult to change the internal structure or implementation.
Main idea
■ Provide clients with a single, simplified entry point to the backend.
■ May provide client-specific API variants (browser, mobile app for iOS or Android, IoT devices)
■ May access tens (or hundreds) of internal services (microservices) concurrently
■ Aggregates the responses and transforms them to meet the client's needs
■ An implementation of the facade pattern
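The aggregation step can be sketched as a single fan-out function. This is a hypothetical example (service names and the callable stand-ins are ours); a real gateway would issue concurrent HTTP requests to the internal services:

```python
def gateway_product_page(product_id, services):
    """Fan one client request out to several internal microservices and
    merge the answers into a single, client-friendly response.
    `services` maps a service name to a callable standing in for a
    remote call."""
    return {
        "product": services["catalog"](product_id),
        "reviews": services["reviews"](product_id),
        "stock": services["inventory"](product_id),
    }
```

The client makes one request to the gateway instead of three to the internal services, and the internal structure can change without breaking clients.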
API Gateway - problem
Source: https://spring.io/blog/2014/11/24/springone2gx-2014-replay-developing-microservices-for-paas-with-spring-and-cloud-foundry
API Gateway - solution
Source: https://spring.io/blog/2014/11/24/springone2gx-2014-replay-developing-microservices-for-paas-with-spring-and-cloud-foundry
Health Endpoint Monitoring
Problem
■ How to detect and report the status (health) of your application? Even if the process is still running, the application could have crashed, be stuck, or have some other problem.
Main idea
■ Implement functional checks within an application that external tools can access through exposed endpoints at regular intervals. This pattern can help to verify that applications and services are performing correctly.
Implementation
A health monitoring check typically combines two factors:
■ The checks performed by the application/service in response to the request to the health verification endpoint.
■ Analysis of the result by the tool or framework that is performing the health verification check.
Typical checks that can be performed by the monitoring tools include:
■ Validating the (HTTP) response code
■ Checking the content of the response to detect errors
■ Measuring the response time, which indicates a combination of the network latency and the time that the application took to execute the request
■ Checking resources or services located outside the application, such as a content delivery network (CDN) used by the application to deliver content from global caches
■ Checking for the expiration of SSL certificates
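The monitoring-side analysis (response code, content, latency) can be sketched like this. The `fetch` callable is a hypothetical stand-in for the HTTP request to the health endpoint:

```python
import time

def check_health(fetch, max_latency=2.0):
    """Validate a health endpoint response: status code, body content
    and response time. `fetch` returns a (status_code, body) pair."""
    start = time.time()
    try:
        status, body = fetch()
    except Exception as exc:
        return {"healthy": False, "reason": f"request failed: {exc}"}
    latency = time.time() - start
    if status != 200:
        return {"healthy": False, "reason": f"unexpected status {status}"}
    if b"error" in body.lower():
        return {"healthy": False, "reason": "error reported in body"}
    if latency > max_latency:
        return {"healthy": False, "reason": f"too slow: {latency:.2f}s"}
    return {"healthy": True, "latency": latency}
```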
Health Endpoint Monitoring
Source: Cloud Design Patterns: Prescriptive Architecture Guidance For Cloud Applications
Health Manager
Problem
■ Applications may fail at any time: hardware defects, VM failures, programming errors like memory leaks, etc. How to make sure that the desired number of instances of each program and service is operational?
Main idea
■ Monitor the application processes and the status (health) of applications (e.g. by checking the health endpoints).
■ Assures that the set of deployed resources for a service is:
■ consistent with the desired state of the service and
■ functioning correctly
Implementation
■ It uses a specification of the "desired state" and compares it with the "actual state".
■ It automatically restarts failed components.
■ Examples:
■ Fleet (Docker)
■ Kubernetes
■ CloudFoundry HM9000
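The desired-state vs. actual-state comparison at the heart of a health manager can be sketched as a reconciliation function (a simplification of what e.g. Kubernetes controllers do in a continuous loop; names are illustrative):

```python
def reconcile(desired, actual):
    """Compare desired instance counts with the observed state and
    return the corrective actions a health manager would take.
    Both arguments map a service name to an instance count."""
    actions = []
    for service, want in desired.items():
        have = actual.get(service, 0)
        if have < want:
            actions.append(("start", service, want - have))
        elif have > want:
            actions.append(("stop", service, have - want))
    return actions
```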
And more Cloud Design Patterns ...
■ Valet Key
■ Use a token or key that provides clients with restricted direct access to a specific resource or service, in order to offload data transfer operations from the application code.
■ Throttling
■ Manage the consumption of resources used by an instance of an application, an individual tenant, or an entire service, by throttling access under high load, with the goal of staying operational.
■ Queue-Based Load Levelling
■ Use a queue that acts as a buffer between a task and a service that it invokes, in order to smooth intermittent heavy loads.
■ Competing Consumers
■ Enable multiple concurrent consumers to process messages received on the same messaging channel, to optimize throughput.
■ Event Sourcing
■ Rather than storing just the current state, Event Sourcing records the full series of events that describe actions taken on data in a domain; these events are replayed to reconstruct the current state.
■ Command Query Responsibility Segregation (CQRS)
■ Split the system into two parts: the command side handles create, update and delete requests; the query side handles queries using one or more materialized views of the application's data.
See details in the Appendix.
And even more → https://msdn.microsoft.com/en-us/library/dn568099.aspx
Serverless computing: Introduction
■ Serverless computing refers to applications that significantly depend on custom code run in ephemeral containers in the cloud (Function-as-a-Service, FaaS).
■ FaaS was first introduced by AWS Lambda (2014)
■ Followed by Google Cloud Functions (2016)
■ FaaS characteristics:
■ Stateless compute containers
■ Event-triggered
■ Ephemeral (a container may last only for one invocation)
■ Fully managed by the cloud service provider (including autoscaling)
■ Pricing based on the number of requests and the time the code executes (in 100-millisecond increments)
■ When combining a FaaS backend with a single-page application front-end, one obtains a serverless web app.
■ "Serverless" means that the organization building and supporting a "serverless" application is not looking after server hardware or IaaS virtual machine instances or PaaS instances, instead delegating everything to the cloud provider.
Serverless computing: AWS Lambda
■ Originally designed for use cases such as image upload, responding to website clicks, or reacting to sensor readings from an IoT device.
Serverless computing: AWS Lambda
■ The code that runs on AWS Lambda is called a "Lambda function."
■ Lambda functions must be stateless.
■ Maximum execution time is 300 seconds (configurable timeout, default 3 seconds).
■ Inbound network connections are blocked, but outbound network connections are allowed.
■ Supported languages:
■ Java
■ Node.js
■ C#
■ Python
■ Code can include custom libraries, even native ones.
■ This makes it possible to run arbitrary executables (on Amazon Linux).
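A minimal Python Lambda function follows the handler convention: AWS invokes a function taking the triggering event and a context object. The event fields below are illustrative; any state that must survive the invocation has to live in an external service, since the container is ephemeral:

```python
def handler(event, context):
    """Sketch of a stateless Lambda handler: it derives the response
    purely from the incoming event."""
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"Hello, {name}!"}
```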
Serverless computing: AWS Lambda — Triggers
■ A Lambda function can be invoked via HTTPS, synchronously or asynchronously, using Amazon API Gateway.
■ One can schedule Lambda functions to be invoked at a specific time.
■ The service is tightly integrated with other AWS services, which can act as event sources to trigger Lambda functions. A Lambda function can:
■ respond to changes in an Amazon S3 (object store) bucket
■ respond to updates in an Amazon DynamoDB (NoSQL DBaaS) table
■ process records in an Amazon Kinesis (publish/subscribe messaging) stream
■ respond to notifications sent by Amazon Simple Notification Service (SNS)
■ respond to emails sent to Amazon Simple Email Service (SES)
■ respond to Amazon CloudWatch (monitoring) alarms
■ process CloudWatch log events
■ respond to changes in user or device data managed by Amazon Cognito
■ respond to changes in a CodeCommit (version control) repository
■ respond to changes in AWS Config resource configurations
■ respond to Alexa voice assistant events
■ respond to Lex (conversational interface) events
■ ...
Serverless computing: A serverless web app
■ Example of a traditional web application transformed into a serverless application.
1. Part of the app executes client-side, using JavaScript. Code, HTML and CSS are downloaded from a static file server.
2. Authentication logic is moved to a third-party authentication service.
3. Client accesses product listings directly from product database.
4. Client-side code keeps track of user session, does page navigation, etc.
5. Compute-heavy search is implemented as Lambda function.
6. Purchase is implemented as Lambda function as well, kept server-side for security reasons.
[Diagram: on the left, the traditional architecture: a client (browser) talking to a web app on a server, VM or instance, backed by a database. On the right, the serverless version: the client (browser) downloads HTML, JavaScript and CSS from a static file server; the client-side logic talks to an authentication service, reads the product database directly, and calls the Search and Purchase functions through an API Gateway; the Purchase function uses a purchase database. The numbers 1-6 mark the steps listed above.]
References
■ Matt Stine, Migrating to Cloud-Native Application Architectures, O'Reilly Media, February 2015.
■ Martin Fowler, Microservices, March 2014. http://www.martinfowler.com/articles/microservices.html
■ Irakli Nadareishvili, Ronnie Mitra, Matt McLarty, Mike Amundsen, Microservice Architecture, O'Reilly Media, July 2016.
■ Alex Homer, John Sharp, Larry Brader, Masashi Narumoto, Trent Swanson, Cloud Design Patterns, Microsoft Press, February 2014.
Appendix
Life-cycle of a cloud-application
Microservices: Monolithic vs. Microservices architecture
[Diagram comparing a monolithic application with a microservices architecture]
Source: http://www.martinfowler.com/articles/microservices.html
Microservices: Componentization around services
■ Any well-architected system is based on modular components.
■ Traditionally, components have been encapsulated into libraries.
■ Interfaces are defined using programming language mechanisms.
■ Microservices componentize around services.
■ Services are accessed over the network.
■ Interfaces are defined by RPC mechanisms (e.g. REST APIs).
■ Services are independently deployable (no need to deploy the whole monolith when a component changes).
■ Services are independently scalable (able to accommodate individual processing load).
[Diagram: a monolithic application (UI, Shipment, Accounting, Supply and Warehouse components sharing a single database) compared with a microservices version in which each component is a separate service with its own database, accessed via GET requests.]
Microservices: Organized around Business Capabilities
■ Conway's law:
■ "Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure."
■ It is common to see teams in large organizations split along UI, server-side logic and database (silos).
■ Because simple changes require burdensome cross-team approval, each team starts to develop code workarounds.
■ In the microservice approach teams are split along business capabilities.
■ Teams are cross-functional: skills across the whole stack.
■ Team size follows the "two pizza" principle: the whole team can be fed by two pizzas.

Siloed functional teams lead to a siloed application architecture.
Cross-functional teams are organized around business capabilities.
Microservices: Monolithic vs. microservices architecture
■ Architecture
  Monolithic: built as a single logical executable (typically the server-side part of a three-tier client-server-database architecture)
  Microservices: built as a suite of small services, each running separately and communicating with lightweight mechanisms
■ Modularity
  Monolithic: based on language features
  Microservices: based on business capabilities
■ Agility
  Monolithic: changes to the system involve building and deploying a new version of the entire application
  Microservices: changes can be applied to each service independently
■ Scaling
  Monolithic: the entire application is scaled horizontally behind a load balancer
  Microservices: each service is scaled independently when needed
■ Implementation
  Monolithic: typically written in one language
  Microservices: each service implemented in the language that best fits the need
■ Maintainability
  Monolithic: a large code base is intimidating to new developers
  Microservices: smaller code bases are easier to manage
■ Persistence
  Monolithic: one single database holding all the data
  Microservices: each service has its own independent database
More patterns
etcd: An example of a service registry
■ Distributed key-value store
■ Designed for shared configuration and service discovery
■ Implements the Raft consensus algorithm
■ Handles machine failures, leader election, etc.
■ Actions: read, write, listen
■ Data structure: hierarchical keys
  ■ /folder
  ■ /folder/key
■ REST API with an easy-to-use client, etcdctl:
  Read/write a value:
  ■ etcdctl get /folder/key
  ■ etcdctl set /folder/key value
  Read/create a directory:
  ■ etcdctl mkdir /folder
  ■ etcdctl ls /folder
  Listen for changes:
  ■ etcdctl watch /folder/key
  ■ etcdctl exec-watch /folder/key -- /bin/bash -c "touch /tmp/test"
Slide credit: Martin Blöchlinger “Migrating an Application into the Cloud with Docker and CoreOS”
Retry Pattern
Problem
■ In a distributed environment like the cloud you have to deal with transient errors: errors that occur only for a very short time and are resolved by the system itself.
Main idea
■ If a service or resource responds with an error and the error message indicates that it might be transient, retry the request before assuming the service/resource is down.
■ If the fault indicates that the failure is not transient, or is unlikely to succeed if repeated, the application should abort the operation and report a suitable exception.
Possible reasons for transient errors
■ Temporary overload of the service
■ Network interruption
■ Corrupted packet
Different implementation strategies, depending on the kind of application:
■ Retry with no time delay (immediate)
■ Retry with a fixed time delay
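A minimal sketch of the fixed-delay retry strategy; names such as `TransientError` and `flaky` are illustrative, not from a specific library:

```python
import time

class TransientError(Exception):
    """An error that may resolve itself after a short time (e.g. temporary overload)."""

def retry(operation, attempts=3, delay=0.1):
    """Call operation(); on a transient error, wait a fixed delay and retry."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # not resolved in time: report the failure to the caller
            time.sleep(delay)  # fixed time delay between retries

# Example: an operation that fails twice with a transient error, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("temporary overload")
    return "ok"

result = retry(flaky, attempts=5, delay=0.01)
```

A production version would typically inspect the error to decide whether it is transient, as described above, and cap the total time spent retrying.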
Source: Cloud Design Patterns: Prescriptive Architecture Guidance For Cloud Applications
Valet Key Pattern
Problem
■ When serving static data like images or videos, uploading or downloading that media through the application can consume a large amount of resources, resulting in (unnecessarily) high costs.
Main idea
■ Use a token or key that gives clients restricted, direct access to a specific resource or service, in order to offload data-transfer operations from the application code.
■ Example: http://docs.openstack.org/juno/config-reference/content/object-storage-tempurl.html
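A sketch of how such a time-limited "valet key" URL can be signed, loosely following the OpenStack temporary-URL scheme referenced above; the key and object path are made up for illustration:

```python
import hmac
import time
from hashlib import sha1

def make_temp_url(method, path, key, ttl):
    """Sign a time-limited URL that lets the client fetch the object directly
    from the object store, bypassing the application servers."""
    expires = int(time.time()) + ttl
    body = f"{method}\n{expires}\n{path}".encode()
    sig = hmac.new(key.encode(), body, sha1).hexdigest()
    return f"{path}?temp_url_sig={sig}&temp_url_expires={expires}"

# Hypothetical account, container and object names:
url = make_temp_url("GET", "/v1/AUTH_demo/videos/intro.mp4", "secret-key", ttl=3600)
```

The storage service, which shares the secret key, recomputes the signature and serves the object only if it matches and the expiry time has not passed.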
Source: Cloud Design Patterns: Prescriptive Architecture Guidance For Cloud Applications
Throttling Pattern
Problem
■ A CNA always has a limit on immediately available (allocated) resources (e.g. CPU, RAM, storage).
■ Bursts in user requests can overwhelm the application (poor performance).
■ Throttling a service can mitigate the problem until scaling has been performed or the situation has normalized.
Main idea
■ Control the consumption of resources used by an instance of an application, an individual tenant, or an entire service. This pattern allows the system to continue to function and meet service-level agreements, even when an increase in demand places an extreme load on resources.
Throttling strategies
■ Reject requests from an individual user who has already accessed system APIs more than n times per second over a given period of time.
■ Disable or degrade the functionality of selected nonessential services so that essential services can run unimpeded with sufficient resources. E.g. if the application is streaming video output, it could switch to a lower resolution.
■ Defer operations performed on behalf of lower-priority applications or tenants. These operations can be suspended or curtailed, with an exception generated to inform the tenant that the system is busy and that the operation should be retried later.
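The first strategy (rejecting a user who exceeds n requests per second) is commonly implemented as a token bucket; the sketch below is illustrative, not taken from a specific framework:

```python
import time

class TokenBucket:
    """Per-user rate limiter: allow at most `rate` requests/second on average,
    with bursts of up to `capacity` requests."""
    def __init__(self, rate, capacity):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens in proportion to the elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # throttle: e.g. respond with HTTP 429 Too Many Requests

bucket = TokenBucket(rate=10, capacity=5)
results = [bucket.allow() for _ in range(8)]  # a burst of 8 back-to-back requests
```

The first 5 requests (the burst capacity) pass; subsequent ones are rejected until the bucket refills.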
Throttling Pattern: Example
Source: Cloud Design Patterns: Prescriptive Architecture Guidance For Cloud Applications
Figure 1 - Graph showing resource utilization against time for applications running on behalf of three users
Figure 2 - Graph showing the effects of combining throttling with autoscaling
Queue-Based Load Leveling Pattern
Problem
■ A service might experience peaks in demand that cause it to become overloaded and unable to respond to requests in a timely manner. Flooding a service with a large number of concurrent requests may also cause the service to fail if it cannot handle the contention these requests create.
Main idea
■ Use a queue as a buffer between a task and the service it invokes, in order to smooth intermittent heavy loads that might otherwise cause the service to fail or the task to time out.
■ The queue can also serve as an indicator for the auto-scaling mechanism: instantiate more service instances when the message queue contains a certain number of messages (see also the Competing Consumers pattern).
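A minimal sketch of the pattern, with an in-process queue standing in for a real message broker:

```python
import queue
import threading
import time

task_queue = queue.Queue()
processed = []

def worker():
    """The consumer drains the queue at its own pace, smoothing the burst."""
    while True:
        item = task_queue.get()
        if item is None:          # sentinel: stop the worker
            break
        time.sleep(0.01)          # simulate slow processing
        processed.append(item)
        task_queue.task_done()

t = threading.Thread(target=worker)
t.start()

# The producer emits a burst of 20 requests at once; the queue buffers them,
# so the service is never handed more work than it can take.
for i in range(20):
    task_queue.put(i)

task_queue.join()                 # wait until the backlog is drained
task_queue.put(None)
t.join()
```

In a real deployment the queue would be an external, durable messaging system, and the queue depth would feed the auto-scaler as noted above.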
Source: Cloud Design Patterns: Prescriptive Architecture Guidance For Cloud Applications
Competing Consumers Pattern
Problem
■ An application running in the cloud may be expected to handle a large number of requests. Rather than process each request synchronously, a common technique is for the application to pass them through a messaging system to another service (a consumer service) that handles them asynchronously. Using a single instance of the consumer service might cause that instance to become flooded with requests, or the messaging system may be overloaded by the influx of messages coming from the application.
Main idea
■ Enable multiple concurrent consumers to process messages received on the same messaging channel (queue). This pattern lets a system process multiple messages concurrently to optimize throughput, improve scalability and availability, and balance the workload.
Considerations
■ Consumers must be coordinated to ensure that each message is delivered to only a single consumer.
■ The workload also needs to be load-balanced across consumers to prevent an individual instance from becoming a bottleneck.
55
Source: Cloud Design Patterns: Prescriptive Architecture Guidance For Cloud Applications
TSM-ClComp-EN Cloud Computing | Cloud-native applications | Academic year 2017/2018
Competing Consumers Pattern (continued)
Benefits
■ Enables a load-leveled system that can handle wide variations in the volume of requests sent by application instances.
  ■ The queue acts as a buffer between the application instances and the consumer service instances.
  ■ A message that requires long-running processing does not prevent other messages from being handled concurrently by other instances of the consumer service.
■ Improves reliability.
  ■ The message queue ensures that each message is delivered at least once.
■ Makes a solution easily scalable.
Issues and considerations
■ Messages may be processed multiple times.
  ■ Message processing should therefore be idempotent.
■ Detecting "poison messages".
  ■ Malformed messages must be detected and handled.
■ Result handling.
  ■ If the consumer generates a result that is needed by the producer of the message, there must be a way for the producer to access this result.
■ Scaling the messaging system.
  ■ The messaging system (queue) itself may become overwhelmed by the number of messages it has to manage. In that case the messaging system also needs to be scalable.
■ Ensuring reliability of the messaging system.
  ■ A reliable messaging system is needed to guarantee that, once the application enqueues a message, it will not be lost. This is essential to ensure that all messages are delivered at least once.
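A sketch of the pattern with an in-process queue standing in for the messaging channel; the deliberately duplicated message illustrates why processing should be idempotent under at-least-once delivery:

```python
import queue
import threading

jobs = queue.Queue()
done = {}                     # message id -> result (the idempotency record)
lock = threading.Lock()

def consumer():
    """One of several competing consumers draining the shared queue."""
    while True:
        msg = jobs.get()
        if msg is None:       # sentinel: stop this consumer
            break
        msg_id, payload = msg
        with lock:
            if msg_id not in done:     # idempotent: skip already-seen messages
                done[msg_id] = payload * 2
        jobs.task_done()

workers = [threading.Thread(target=consumer) for _ in range(4)]
for w in workers:
    w.start()

# Delivering message 7 twice simulates at-least-once semantics.
for msg in [(i, i) for i in range(10)] + [(7, 7)]:
    jobs.put(msg)

jobs.join()                   # wait until every message has been handled
for _ in workers:
    jobs.put(None)
for w in workers:
    w.join()
```

Here the queue itself coordinates the consumers (each message is taken by exactly one worker), while the `done` record makes redelivery harmless.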
Event Sourcing
Problem
■ The typical approach for applications is to maintain the current state of the data by updating it directly in the data store (the typical CRUD model).
■ Limitations:
  ■ Because CRUD systems perform update operations directly against a data store, the required processing overhead may hinder performance and responsiveness and limit scalability.
  ■ In a collaborative domain with many concurrent users, data-update conflicts are more likely to occur because the update operations take place on a single item of data.
  ■ Unless there is an additional auditing mechanism that records the details of each operation in a separate log, history is lost.
Main idea
■ Use an append-only store to record the full series of events that describe actions taken on data in a domain, rather than storing just the current state.
■ At any point in time it is possible to read the history of events and use it to materialize the current state of an entity. This may happen on demand, to materialize a domain object, or through a scheduled task, so that the state of the entity can be stored as a materialized view to support the presentation layer.
■ This pattern can simplify tasks in complex domains by avoiding the need to synchronize the data model and the business domain; improve performance, scalability and responsiveness; provide consistency for transactional data; and maintain full audit trails and history that may enable compensating actions.
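A minimal sketch of the idea, using a plain list as the append-only event store and a bank-account domain invented for illustration:

```python
events = []  # append-only store: the full history of actions, never updated in place

def record(event_type, amount):
    """Commands append events; nothing is ever overwritten, so history is kept."""
    events.append({"type": event_type, "amount": amount})

def current_balance():
    """Materialize the current state by replaying the whole event history."""
    balance = 0
    for e in events:
        if e["type"] == "deposited":
            balance += e["amount"]
        elif e["type"] == "withdrawn":
            balance -= e["amount"]
    return balance

record("deposited", 100)
record("withdrawn", 30)
record("deposited", 5)

balance = current_balance()   # derived from the events, not stored directly
```

Replaying on every read is only practical for short histories; real systems cache the result as a materialized view or periodic snapshot, as described above.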
Source: Cloud Design Patterns: Prescriptive Architecture Guidance For Cloud Applications
An overview and example of the Event Sourcing pattern
Command and Query Responsibility Segregation Pattern (CQRS)
Problem
■ In traditional data-management systems, both commands (updates to the data) and queries (requests for data) are executed against the same set of entities in a single data repository.
  ■ Often there is a mismatch between the read and write representations of the data, such as additional columns or properties that must be updated correctly even though they are not required as part of an operation.
  ■ This risks data contention in a collaborative domain when records are locked, or update conflicts caused by concurrent updates when optimistic locking is used. These risks increase as the complexity and throughput of the system grow.
  ■ It can make managing security and permissions more cumbersome, because each entity is subject to both read and write operations, which might inadvertently expose data in the wrong context.
Main idea
■ Split the system into two parts. The command side handles create, update and delete requests and stores the results in a write data store. The query side handles queries using one or more materialized views of the application's data.
  ■ The read store can be a read-only replica of the write store, or the read and write stores may have a different structure altogether. Using multiple read-only replicas of the read store can considerably increase query performance.
  ■ Separating the read and write stores also allows each to be scaled appropriately to match its load. For example, read stores typically encounter a much higher load than write stores.
■ CQRS is often used in combination with Event Sourcing.
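A toy sketch of the split, with dictionaries standing in for the write store and the materialized read view (all names are illustrative):

```python
# Write store: normalized rows, mutated only by commands.
write_store = {}        # product id -> {"name": ..., "price": ...}
# Read store: a denormalized materialized view, rebuilt after each command.
read_view = {}          # product name -> price

def handle_command(cmd, product_id, name=None, price=None):
    """Command side: mutate the write store, then refresh the read view."""
    if cmd in ("create", "update"):
        write_store[product_id] = {"name": name, "price": price}
    elif cmd == "delete":
        write_store.pop(product_id, None)
    rebuild_view()

def rebuild_view():
    read_view.clear()
    for row in write_store.values():
        read_view[row["name"]] = row["price"]

def query_price(name):
    """Query side: reads only the materialized view, never the write store."""
    return read_view.get(name)

handle_command("create", 1, name="widget", price=9.5)
handle_command("update", 1, name="widget", price=8.0)
price = query_price("widget")
```

In this toy version the view is updated synchronously; real CQRS systems typically propagate changes asynchronously (often via the events of an event-sourced write side), accepting eventual consistency between the two stores.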