© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
David Dooling & Ryan Richt
October 2015
Cloud First: New Architecture for New Infrastructure
@ddgenome & @ryan_richt
ARC401
What to Expect from the Session
Theory of Cloud
Scientists Turned Developers Turned Architects
Ryan
David
Scientists Turned Developers Turned Architects
Monsanto
Theory of Cloud
Automated: Software-defined everything
Elastic: Unlimited scale + pay-as-you-go
Highly Available: Multi-AZ/region + shards/replicas
Security: “Do over” + Correct by construction
Horizontally Scalable: Provision more like things any time
Theory of Cloud ⇒ Cloud Architecture
Automated ⇒ Higher-Order Automation
Elastic ⇒ Ephemeral Environments
Highly Available ⇒ Fault Tolerant
Security ⇒ Secure by Construction
Horizontally Scalable ⇒ Parallel, Commodity
Higher-Order Automation
Automated Tests
Continuous Integration
Continuous Delivery
Automated Infrastructure
Automated Fault Detection
Automated Recovery
…and automated tools to build more automation!
Fallacies of Internal Apps
1. The hardware is reliable
2. The network is reliable
3. The database is reliable
4. Other services are available
5. Inside the network is secure
6. …
Fault Tolerant
Fallacies of 1st Generation Cloud
1. Other people’s fault-tolerant code is actually fault tolerant
2. Everything is stateless
3. Everything can be retried
4. Applications should handle all faults
5. Data is magically handled by someone else
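Fallacy 3 in the list above ("Everything can be retried") can be illustrated with a small sketch. It is not from the talk: the names `retry` and `TransientError` and the `idempotent` flag are hypothetical. The idea is that a retry helper should only repeat operations that are safe to repeat.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that may succeed on a later attempt."""


def retry(operation, *, idempotent, attempts=3, delay_seconds=0.0):
    """Run `operation`, retrying transient failures only when it is idempotent.

    A non-idempotent operation (for example, "charge the card") is attempted
    once, because repeating it could apply the side effect twice.
    """
    max_attempts = attempts if idempotent else 1
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise
            time.sleep(delay_seconds)


calls = {"count": 0}


def flaky_read():
    """Fails twice, then succeeds; safe to repeat because it only reads."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("temporary network failure")
    return "value"


result = retry(flaky_read, idempotent=True)
```

Whether an operation is idempotent is something only the application can know, which is one reason a platform cannot retry everything on the application's behalf.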
Elastic, Ephemeral, Cost-Effective
[Figure: cost over time, Cloud vs. On Prem]
Dynamic Env Replication
[Figure: cost over time, Cloud vs. On Prem]
Experiments
A Do-Over for Secure by Construction
Secure by Assumption
Secure by Design
Security Automation
Horizontally Scalable
1. The overhead of scaling grows at most linearly with additional nodes
2. Reads and writes both scale out
3. The system can continue to provide this scalability under loss of any node
* This (CAP) requires apps to understand conflicts
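One way to read "apps must understand conflicts" is that the application supplies a merge rule for replicas that diverged. The sketch below is not from the talk; it uses a grow-only counter (a G-Counter CRDT) as a stand-in technique, with hypothetical node names. Each node increments only its own slot, and merging takes the per-node maximum.

```python
def increment(counter, node_id, amount=1):
    """Return a new counter with `node_id`'s slot increased by `amount`."""
    updated = dict(counter)
    updated[node_id] = updated.get(node_id, 0) + amount
    return updated


def merge(left, right):
    """Resolve a conflict by taking the per-node maximum of two replicas."""
    return {node: max(left.get(node, 0), right.get(node, 0))
            for node in set(left) | set(right)}


def value(counter):
    """The counter's value is the sum over all node slots."""
    return sum(counter.values())


# Two replicas accept writes independently while they cannot see each other.
replica_a = increment(increment({}, "node-a"), "node-a")  # node-a wrote twice
replica_b = increment({}, "node-b")                        # node-b wrote once

# Merging in either order yields the same state, so no write is lost.
merged = merge(replica_a, replica_b)
```

Because the merge is commutative and idempotent, any node can be lost or rejoin later without coordination, which is the property item 3 asks for.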
Infrastructure Automation
Federation – 1000 VPCs
[Figure: a grid of Amazon VPC icons]
Cloud Architecture
AWS
CloudFormation
"IPAddress" : {
  "Type" : "AWS::EC2::EIP",
  "DependsOn" : "AttachGateway",
  "Properties" : { "Domain" : "vpc", "InstanceId" : { "Ref" : "WebServerInstance" } }
},
"InstanceSecurityGroup" : {
  "Type" : "AWS::EC2::SecurityGroup",
  "Properties" : {
    "VpcId" : { "Ref" : "VPC" },
    "GroupDescription" : "Enable SSH access via port 22",
    "SecurityGroupIngress" : [
      {"IpProtocol":"tcp","FromPort":"22","ToPort":"22","CidrIp" : { "Ref" : "SSHLocation"}},
      {"IpProtocol":"tcp","FromPort":"80","ToPort":"80","CidrIp" : "0.0.0.0/0"}
    ]
  }
},
"WebServerInstance" : {
  "Type" : "AWS::EC2::Instance",
  "DependsOn" : "AttachGateway",
  "Metadata" : {
    "Comment" : "Install a simple application", …
Cloud Architecture
CloudFormation Template Generator
https://github.com/MonsantoCo/cloudformation-template-generator
CloudFormation Template Generator
Referential Integrity
Auto Scaling Group
CFTG: Security Groups
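The referential-integrity idea can be sketched as follows. In hand-written CloudFormation JSON, a reference such as { "Ref" : "VPC" } is just a string, so a typo is found only when the template is validated or deployed. CFTG is a Scala library and this Python sketch does not reproduce its API; `Template`, `add`, and `ref` are hypothetical names, and the CIDR block is an illustrative value.

```python
import json


class Template:
    """Hypothetical builder that only hands out references to known resources."""

    def __init__(self):
        self.resources = {}

    def add(self, name, resource_type, properties):
        """Register a resource and return its logical name."""
        self.resources[name] = {"Type": resource_type, "Properties": properties}
        return name

    def ref(self, name):
        """Build a {"Ref": ...} only if `name` was already added."""
        if name not in self.resources:
            raise KeyError(f"unknown resource: {name}")
        return {"Ref": name}

    def to_json(self):
        """Serialize the collected resources as a CloudFormation-style document."""
        return json.dumps({"Resources": self.resources}, indent=2)


template = Template()
# The CIDR block below is an assumption made for this example.
template.add("VPC", "AWS::EC2::VPC", {"CidrBlock": "10.183.0.0/16"})
template.add(
    "InstanceSecurityGroup",
    "AWS::EC2::SecurityGroup",
    {
        "VpcId": template.ref("VPC"),  # checked while the template is built
        "GroupDescription": "Enable HTTP access via port 80",
        "SecurityGroupIngress": [
            {"IpProtocol": "tcp", "FromPort": "80", "ToPort": "80",
             "CidrIp": "0.0.0.0/0"},
        ],
    },
)
```

Calling `template.ref("Vpc")` here raises `KeyError` at build time. A statically typed generator can move the same check to compile time, which is presumably the appeal of generating templates from Scala.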
Stax
$ ./stax --help
Usage: stax [OPTIONS] COMMAND [COMMAND_ARGS]
  add               Add functionality to an existing VPC
  auto-services     Lanch multiple services on fleet using template/NAME.services file
  check             Run various tests against an existing stax
  clean             Remove keys and buckets of non-existant stacks
  connect [TARGET]  Connect to bastion|gateway|service in the VPC stax over SSH
  create            Create a new VPC stax in AWS
  describe          Describe the stax created from this host
  delete            Delete the existing VPC stax
  dockerip-update   Fetch docker IP addresses and update related files
  fleet             Run various fleetctl commands against the fleet cluster
  help              Output this message
  history           View history of recently created/deleted stax
  list              List all completely built and running stax
  rds PASSWORD      Create an RDS instance in the DB subnet
  rds-delete RDSIN  Delete RDS instance RDSIN
  remove ADD        Remove the previously added ADD
  services          List servers that are available to run across a stax
  slack             Post usage report to Slack, define hook in stax.config
  sleep             Turn on/off bastion host which allows ssh access into the VPC
  start SERVICE     Start service SERVICE in the fleet cluster
  test              Automated test to exercise functionality of stax
  update            Update an existing VPC with changes from Cloudformation
  validate          Validate CloudFormation template
For more help, check the docs: https://github.com/MonsantoCo/stax
Create and manage CloudFormation stacks in AWS
$ ./stax create
[ ---- ] creating stax
[ NAME ] vpc-stax-36918-outfitting
[ ---- ] creating parameter file
[ ---- ] checking for valid json file format
[ ---- ] creating ssh key pair in aws
[ ---- ] creating key pair
[ ---- ] create bucket
[ ---- ] creating bucket vpc-stax-36918-outfitting
[ ---- ] uploading template
[ ---- ] validate template
[ ---- ] validating template https://s3.amazonaws.com/…
[ ---- ] uploading vpc assets
[ ---- ] creating stax in aws
[ ---- ] stax creation complete
[ ---- ] querying aws
[ ---- ] query complete
[ ---- ] see run/vpc-stax-36918-outfitting.json for details
$ ./stax connect
[ ---- ] checking if stax build is complete
[ ---- ] describe stax
[ NAME ] vpc-stax-36918-outfitting
[ ---- ] querying aws
[ ---- ] query complete
[ ---- ] see run/vpc-stax-36918-outfitting.json for details
[ ---- ] stack vpc-stax-36918-outfitting build complete
[ ---- ] connecting to stax: bastion

       __|  __|_  )
       _|  (     /   Amazon Linux AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-ami/2014.09-release-notes/
[ec2-user@ip-10-183-1-195 ~]$
Stax as a Service – Create
Stax as a Service – List
Stax as a Service – Describe
Stax as a Service – Services
Monsanto
Microservices Lifecycle
Microservices: Cupcakes, Not Wedding Cakes
A modern language for software engineering
Algebraic Data Types (ADTs): Scala, Haskell, Swift, OCaml, SML
Enforced Immutability: Scala, Haskell, Clojure, Erlang, OCaml, SML
Pattern Matching & Destructuring Assignment: CoffeeScript, Scala, Haskell, Swift, OCaml, Erlang, SML
Type-Level Programming: Haskell, Scala, C++
Futures, Actors, Async: Erlang, Scala, Java
Type classes: Haskell, Scala, ~OCaml
Hybrid OO/FP
Provides transition from and backward compatibility with Java
Advanced Abstractions
Algebraic Data Types (ADTs)
Enforced Immutability
Pattern Matching & Destructuring Assignment
Type-Level Programming
Futures, Actors, Async
Type classes
Scala: A Modern Language for Software Engineering
Advanced Type Constraints
Advanced Generics & Variance
Higher Kinds
F-bounded Polymorphism
Self-Types
Type Projections
Type Members
Path Dependent Types
Type Refinements
Turing-complete!
Project-as-a-Service 1 – Create Code Repo/Wiki/Issues
Project-as-a-Service 2 – Simple Service Template
Runs giter8 to create a fully functional service written in Scala, based on our current best practices:
• Standard libraries (Slick, Spray, Akka, etc.) for microservices
• Automated tests with ScalaTest
• Administrative REST endpoints
• Built-in (remote) logging and metric capabilities
• Auto-Docker-ization
• Local Vagrant environment
Project-as-a-Service 3 – CI & Dockerization
New check-in → Test and Build → Dockerize
Project-as-a-Service 4 – Continuous Deployment
[Diagram components: fleet, Router, Route Updater, Registrator]
1. A commit is made to GitHub.
https://github.com/MonsantoCo/etcd-aws-cluster
https://github.com/MonsantoCo/docker-aws
https://github.com/MonsantoCo/fleet-client
2. GitHub notifies Jenkins that new code is available. Jenkins runs automated tests to validate that the code is functional.
3. Jenkins builds a Docker container and pushes it to our private Docker registry.
[Diagram: registry image service-1:1]
4. Jenkins registers the service with etcd, our key/value store, because it is not registered yet.
[Diagram: registry image service-1:1; etcd service-1 => 1 (name-version => revision)]
5. Jenkins calls fleet to deploy the container running our service.
[Diagram: registry image service-1:1; etcd service-1 => 1; running service v1 rev1 at 10.183.0.100:8080]
6. Registrator notices the service is deployed and registers the location in etcd.
[Diagram: registry image service-1:1; etcd service-1 => 1, service-1-1 => [10.183.0.100:8080]; running service v1 rev1 at 10.183.0.100:8080]
7. When a request is received, the router determines the current revision for the service as well as the location of the service.
[Diagram: registry image service-1:1; etcd service-1 => 1, service-1-1 => [10.183.0.100:8080]; running service v1 rev1 at 10.183.0.100:8080]
8. When the next commit (rev 2) is received, Jenkins tests, builds, and pushes it, then looks up the revision in etcd. The new revision is newer, so Jenkins continues but does not update the current revision.
[Diagram: registry images service-1:1, service-1:2; etcd service-1 => 1, service-1-1 => [10.183.0.100:8080]; running service v1 rev1 at 10.183.0.100:8080]
9. Jenkins deploys the new container to fleet. It runs side-by-side with the previous revision at a different location.
[Diagram: registry images service-1:1, service-1:2; etcd service-1 => 1, service-1-1 => [10.183.0.100:8080]; running service v1 rev1 at 10.183.0.100:8080 and service v1 rev2 at 10.183.0.100:8081]
10. Registrator notices the new service is deployed and registers the location in etcd under a different key.
[Diagram: registry images service-1:1, service-1:2; etcd service-1 => 1, service-1-1 => [10.183.0.100:8080], service-1-2 => [10.183.0.100:8081]; running service v1 rev1 at 10.183.0.100:8080 and service v1 rev2 at 10.183.0.100:8081]
11. Traffic continues to flow to the old service as the current revision has not changed.
[Diagram: registry images service-1:1, service-1:2; etcd service-1 => 1, service-1-1 => [10.183.0.100:8080], service-1-2 => [10.183.0.100:8081]; running service v1 rev1 at 10.183.0.100:8080 and service v1 rev2 at 10.183.0.100:8081]
12. Traffic can be directed to a particular version by using a header for testing purposes.
[Diagram: request header X-Service-Revision: 2; registry images service-1:1, service-1:2; etcd service-1 => 1, service-1-1 => [10.183.0.100:8080], service-1-2 => [10.183.0.100:8081]; running service v1 rev1 at 10.183.0.100:8080 and service v1 rev2 at 10.183.0.100:8081]
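The router lookup described in steps 7 and 12 can be sketched as below. This is an illustration only, not the authors' router: a plain dict stands in for etcd, the key layout is read off the diagrams ("service-1" maps to the current revision, "service-1-2" to the locations of revision 2), and `resolve` is a hypothetical name.

```python
def resolve(store, name, version, headers=None):
    """Return the backend locations for a request to `name` at `version`.

    `store` stands in for etcd: "<name>-<version>" maps to the current
    revision, and "<name>-<version>-<revision>" maps to a list of locations.
    """
    headers = headers or {}
    service_key = f"{name}-{version}"
    # A test request may pin a revision; otherwise use the current one.
    revision = headers.get("X-Service-Revision", store[service_key])
    return store.get(f"{service_key}-{revision}", [])


store = {
    "service-1": 1,
    "service-1-1": ["10.183.0.100:8080"],
    "service-1-2": ["10.183.0.100:8081"],
}

default_route = resolve(store, "service", 1)
pinned_route = resolve(store, "service", 1, {"X-Service-Revision": "2"})
```

Keeping the current revision in one key means that switching traffic is a single write, while every deployed revision stays individually addressable for testing.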
13. Periodically, Route Updater queries etcd to look for cases where there is a revision deployed that is newer than the current route.
[Diagram: registry images service-1:1, service-1:2; etcd service-1 => 1, service-1-1 => [10.183.0.100:8080], service-1-2 => [10.183.0.100:8081]; running service v1 rev1 at 10.183.0.100:8080 and service v1 rev2 at 10.183.0.100:8081]
14. If there is a newer revision, Route Updater calls its smoketest endpoint (/admin/smoketest). If the call returns true, Route Updater updates the current route.
[Diagram: registry images service-1:1, service-1:2; etcd service-1 => 2, service-1-1 => [10.183.0.100:8080], service-1-2 => [10.183.0.100:8081]; running service v1 rev1 at 10.183.0.100:8080 and service v1 rev2 at 10.183.0.100:8081]
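Steps 13 and 14 amount to a small promotion rule, sketched below under stated assumptions. A dict again stands in for etcd, the function names are hypothetical, and `smoketest` is a caller-supplied callable standing in for an HTTP call to /admin/smoketest. Requiring every location of the candidate revision to pass is a choice made for this sketch.

```python
def newest_deployed_revision(store, service_key):
    """Highest revision of `service_key` with at least one registered location."""
    prefix = f"{service_key}-"
    revisions = [
        int(key[len(prefix):])
        for key, locations in store.items()
        if key.startswith(prefix) and key[len(prefix):].isdigit() and locations
    ]
    return max(revisions, default=None)


def promote_if_healthy(store, service_key, smoketest):
    """Point the current route at a newer revision if its smoke test passes.

    Returns True when the current revision was updated, False otherwise.
    """
    candidate = newest_deployed_revision(store, service_key)
    if candidate is None or candidate <= store[service_key]:
        return False
    locations = store[f"{service_key}-{candidate}"]
    if all(smoketest(location) for location in locations):
        store[service_key] = candidate
        return True
    return False


store = {
    "service-1": 1,
    "service-1-1": ["10.183.0.100:8080"],
    "service-1-2": ["10.183.0.100:8081"],
}
promoted = promote_if_healthy(store, "service-1", smoketest=lambda location: True)
```

If the smoke test fails, the store is left untouched and traffic keeps flowing to the old revision, so a bad build never receives production traffic.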
15. Now traffic will start flowing to the new revision of the service automatically.
[Diagram: registry images service-1:1, service-1:2; etcd service-1 => 2, service-1-1 => [10.183.0.100:8080], service-1-2 => [10.183.0.100:8081]; running service v1 rev1 at 10.183.0.100:8080 and service v1 rev2 at 10.183.0.100:8081]
16. Route Updater will notice that there is a stale revision running. It will instruct the service to cleanly exit by making a call to the /admin/shutdown endpoint.
[Diagram: registry images service-1:1, service-1:2; etcd service-1 => 2, service-1-1 => [10.183.0.100:8080], service-1-2 => [10.183.0.100:8081]; running service v1 rev1 at 10.183.0.100:8080 and service v1 rev2 at 10.183.0.100:8081]
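The stale-revision cleanup in step 16 can be sketched in the same style. As before, a dict stands in for etcd and all function names are hypothetical; `shutdown` is a caller-supplied callable standing in for a call to the /admin/shutdown endpoint.

```python
def stale_locations(store, service_key):
    """Locations of deployed revisions older than the current revision."""
    current = store[service_key]
    prefix = f"{service_key}-"
    stale = []
    for key, locations in store.items():
        suffix = key[len(prefix):]
        if key.startswith(prefix) and suffix.isdigit() and int(suffix) < current:
            stale.extend(locations)
    return stale


def shut_down_stale(store, service_key, shutdown):
    """Ask each stale instance to exit cleanly and return what was asked.

    The store is deliberately not modified here: in this flow, Registrator
    removes a location once it sees that the container has stopped.
    """
    locations = stale_locations(store, service_key)
    for location in locations:
        shutdown(location)
    return locations


store = {
    "service-1": 2,
    "service-1-1": ["10.183.0.100:8080"],
    "service-1-2": ["10.183.0.100:8081"],
}
asked = []
stopped = shut_down_stale(store, "service-1", shutdown=asked.append)
```

Separating "ask to stop" from "deregister" keeps the store describing what is actually running, not what something intended to stop.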
17. Registrator will notice the container is no longer running and remove its location from etcd.
[Diagram: registry images service-1:1, service-1:2; etcd service-1 => 2, service-1-1 => [10.183.0.100:8080] (the entry Registrator removes), service-1-2 => [10.183.0.100:8081]; running service v1 rev2 at 10.183.0.100:8081]
18. The system continues as-is until a new revision is deployed.
[Diagram: registry images service-1:1, service-1:2; etcd service-1 => 2, service-1-2 => [10.183.0.100:8081]; running service v1 rev2 at 10.183.0.100:8081]
Comprehensive
Service – log4j
Container – logspout
CoreOS – journal forwarder
Bastion/NAT – rsyslog
ELB – S3 (ELK coming soon)
S3 – S3 (ELK coming soon)
CloudTrail – S3 → TrailDash
RDS – (coming soon)
Logging with ScalaLogging and ELK
Easy to use
• Standard ScalaLogging interface
• Auto custom formats (stack traces)
• JSON-format log messages
• Direct-to-ELK writing
• Standard Fields (container ID, code version, service name, etc.)
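The bullets above (JSON-format messages, stack traces, standard fields) can be illustrated with a minimal sketch. The talk's implementation is built on ScalaLogging and log4j; this Python version using the standard `logging` module is only an analogy, and the class name, field names, and field values are hypothetical.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with a fixed set of standard fields."""

    def __init__(self, standard_fields):
        super().__init__()
        self.standard_fields = dict(standard_fields)

    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            **self.standard_fields,
        }
        if record.exc_info:
            # Keep the stack trace inside the same JSON document.
            entry["stack_trace"] = self.formatException(record.exc_info)
        return json.dumps(entry)


formatter = JsonFormatter({
    "service_name": "service",   # illustrative values only
    "code_version": "1.2",
    "container_id": "abc123",
})
record = logging.LogRecord(
    name="demo", level=logging.INFO, pathname="demo.py", lineno=0,
    msg="started on port %d", args=(8080,), exc_info=None,
)
line = formatter.format(record)
```

Attaching the formatter to a handler with `handler.setFormatter(formatter)` would apply it to every message, so individual services would not need to add the standard fields themselves.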
Instrumentation & Shipping
• Kamon to Prometheus Exporter, preserves more metrics than Prometheus JVM
• Improved tracing
• Improved complex data mapping
• Periodically collect and push Spray metrics to Kamon
Automating Kamon and Prometheus: Auto-discovery, Dashboards, Alerts
• Custom Docker containers with more automation – etcd discovery
• Custom default dashboards
• Auto EC2/EBS/RDS standup
• OAuth integration
• SNS notification integration
• Default Alerts
https://github.com/MonsantoCo/spray-kamon-metrics
What’s Next
Improvements & Evolution
AWS Service Catalog – API?
EC2 Container Service
AWS IAM
• EC2 CS Roles
• RDS Roles – per VPC/DB Subnet Groups
Amazon API Gateway
VPC Flow Logs – CloudFormation support?
Inverting control for deployment
CloudFormation update predictability
IAM role
Amazon RDS
Amazon EC2 Container Service
Higher-Order Automation
Automated Tests
Continuous Integration
Continuous Delivery
Automated Infrastructure
Automated Fault Detection
Automated Recovery
…and automated tools to build more automation!
Monsanto IT
Acknowledgements
Larry Anderson
Chris Coffman
TJ Corrigan
Phil Cryer
Dave D’Alessandro
Daniel Solano Gómez
Justin Honold
Kyle Jones
Jessica Kerr
Kevin Meredith
Jorge Montero
Brian Rodgers
Chris Shafer
Niranjan Vengavasi
Dick Wall
Russ Wilson
Stuart Wong
Thank you!
engineering.monsanto.com
@MonsantoPlatformEng
@ddgenome @ryan_richt
Remember to complete your evaluations!
Related Sessions
ARC309 - From Monolithic to Microservices: Evolving Architecture Patterns in the Cloud
Thursday, Oct 8, 4:15 PM - 5:15 PM – Palazzo N
MBL203 - From Drones to Cars: Connecting the Devices in Motion to the Cloud
Friday, Oct 9, 10:15 AM - 11:15 AM – Delfino 4005
http://engineering.monsanto.com/code
@MonsantoPlatformEng
https://github.com/MonsantoCo/cloudformation-template-generator
https://github.com/MonsantoCo/docker-aws
https://github.com/MonsantoCo/etcd-aws-cluster
https://github.com/MonsantoCo/fleet-client
https://github.com/MonsantoCo/spray-kamon-metrics
https://github.com/MonsantoCo/stax
More to come…
@ddgenome @ryan_richt