Java application delivery
Deploying without going offline
Me
Frederic Jambukeswaran, Founder, rapidcloud.io
Previously:
VP, Technology EXPO Communications
CTO Pronto/IAC
VP, Engineering Sportnet
Focus on Deployments
I’m going to focus almost exclusively on deploying your application
My assumption is that you have a separate build process which packages your artifact (jar, war, etc.) and prepares your static assets. Maven, SBT, Ant, Grunt, etc. are all tools to prepare your application
Deploying works by picking up prebuilt artifacts and handling the delivery to your servers. Yes, you can build your artifacts on each (server) instance if you need to; however, that quickly becomes a bottleneck if you have many instances or a complex environment.
Deployment Expectations
Zero downtime (at least never offline)
Fast (not slow)
100% automated
Reliable
Simplest Basic Deployment (stop/start)
The simplest deployment strategy in any environment is to shut down the application, copy the assets, and start up again. Unfortunately this leaves our app offline and inaccessible.
Acceptable for some use cases, maybe on dev or a lightly changed/used application, but it creates a poor user experience and a hesitancy to deploy (BAD)
Parallel Deployments to the Rescue!
Parallel deployments allow you to maintain availability by launching your new application while your old one is still handling requests
You are able to verify the new instance is healthy and ready to receive traffic before activating it
You can give your old application time to finish processing existing requests
Downside
They are more difficult to setup and may require application changes in how static assets, databases or sessions are handled
Some gotchas
HTTP sessions: Since we are launching new instances, HTTP sessions will be lost unless you:
Setup session replication
Use a centralized session store (memcache, etc)
Don’t use sessions (how cool is it that Play Framework doesn’t use http sessions?)
Database schemas: Since your old app and new app will be running at the same time, your schema changes must be backwards compatible
Static assets: You may be serving requests for two different versions at the same time, so your static assets will get confused.
Use asset fingerprinting (asset auto versioning) to prevent cache poisoning. (OK/Kinda)
Separate your static assets and deliver them for any version (GREAT!)
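Asset fingerprinting (the "OK/Kinda" option above) can be sketched as a helper that embeds a content hash into the filename, so each build's assets get unique URLs and old and new versions cache side by side. The class and method names here are hypothetical, not from any particular framework:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical fingerprinting helper: "app.css" becomes "app-<md5>.css",
// so two live versions of the app never fight over one cached URL.
class AssetFingerprint {

    // Assumes the filename has an extension (a dot); returns name-<hash>.ext
    static String fingerprint(String filename, byte[] contents) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md5.digest(contents)) {
                hex.append(String.format("%02x", b));
            }
            int dot = filename.lastIndexOf('.');
            return filename.substring(0, dot) + "-" + hex + filename.substring(dot);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always available on the JVM
        }
    }
}
```

Frameworks like Play and Rails ship this built in; the point is only that the URL changes whenever the content does.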
Server Configurations
We are going to cover how to execute this in the following environments
AWS / Autoscaling (dynamic instances)
Single server (static instance)
Multiple servers (static instances)
AWS : Overview
AWS allows you to “instruct” them on how to launch a server. Think of your application in terms of dynamic instances which hold everything:
JVM, WWW server, etc
Don’t think about just deploying a war file, think about deploying an entire server instance
How do we keep track of them all? Amazon’s load balancers work together with the auto scaling group to automatically add new instances into rotation
Overall…. Very elastic, very cloudy, very awesome.
AWS : Deploying (Mindset)
Think about it as deploying new instances, not deploying new code. These new instances are configured to run your new code
Then we “undeploy” the old code by terminating the older instances
AWS : Auto Scaling
A very powerful service to manage instances by configuration. Set up rules to tell Amazon how many instances you want, what type they are, and under what conditions it should create/destroy instances.
Example: You can scale up to 10 instances during high load and down to 2 when load is low.
If an instance dies or gets unhealthy, it can automatically be replaced.
What does it look like
Launch the new server group
Deregister the old server group
After the old instances have cooled down, terminate them
[Diagram: a Load Balancer fronting App 1 (Group 1); App 2 (Group 2) is launched alongside it, the load balancer shifts traffic to Group 2, then Group 1 is terminated]
AWS : Alternatives (which I don’t like)
Launch a new ELB for each deployment and change the DNS settings. Tinkering with DNS is not something I like to do
New ELB may need time to scale up in the case of high traffic
Have your autoscaling groups pull the “latest” production build and kill old instances one by one (allowing autoscaling to rebuild with new instances). If your new production build is unstable for some reason, you’ve “broken” your autoscaling group
What's the catch?
Have to learn Amazon’s environment: powerful, flexible, but complicated
Have to think about your instances differently No more www1, www2, frey, Aryn, Baratheon, etc
You have “groups” and you won’t always know how many instances each group holds
To deploy a new application you’ll need to handle orchestrating and managing AWS services
Example: Through the AWS API
To take advantage of autoscaling, AWS needs to be able to launch, configure and deploy your application to an instance completely by itself
http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/as-scale-based-on-demand.html
AWS Network Components
Key parts of AWS’s networking infrastructure:
Regions
Availability Zones
VPCs
Subnets
Security Groups
[Diagram: a Subnet sits within an Availability Zone, within a VPC, within a Region; you configure these network services]
Regions and Availability Zones
Regions are separate geographic areas and are independent of each other:
us-east-1 – Virginia
us-west-1 – Northern California
us-west-2 - Oregon
Availability zones exist within a region and are isolated locations. Communication across zones within a region is low latency/fast
Ex: us-east-1a, us-east-1b
Virtual Private Cloud (VPC)
Create your own logically isolated section of the Amazon cloud. You can set up subnets, IP ranges, etc.
AWS automatically creates a VPC for you (for all accounts created after Dec 4, 2013)
VPC exists within a region and across availability zones
Subnet
A subnet exists within a VPC and an availability zone (do not span zones)
Public subnet – Traffic is routed to the internet
Private subnet – Traffic doesn’t have a route to the internet
Security Groups
Act like firewalls controlling inbound and outbound network traffic
Associate one or more security groups to each of your instances
Defined at the region level
Security Groups
You can limit access to another security group
You can limit access to an ip range
AWS Network Components (Repeated)
Key parts of AWS’s networking infrastructure:
Regions
Availability Zones
VPCs
Subnets
Security Groups
[Diagram: a Subnet sits within an Availability Zone, within a VPC, within a Region; you configure these network services]
Configuring AWS to Launch Instances
AMIs
Launch Configurations
Auto Scaling Groups
Load Balancers
Amazon Machine Image (AMI)
These images are the foundation of any server instance you create. They specify the OS, installed software, etc.
Amazon provides some good ones to use as a starting point: RedHat, Ubuntu, etc
AMIs
Customizing an AMI
To speed up launching instances you can take one of Amazon’s AMIs and launch an instance with that.
Customize that instance with all your key software (Java 8, Log Entries, etc)
Create your own AMI from that image
Use your customized AMI for future instances
It’s really that easy….
Launch Configurations
What is it? Set of rules to define how to launch an instance
Key elements:
Security groups
Type of instance (small, medium, t2, m3)
User data (meta data field)
Startup script: Launch configurations can be set up to run a bash script on startup
Configured through the user data attribute
It’s an ideal way to bootstrap your instance, since the script carries your instance settings (i.e. application, environment, role and build_id)
I like to use a boot.sh file built into my AMI and have my launch configuration script just delegate to it (keeps my launch config meta data small)
Code: Create Launch Configuration
AmazonAutoScalingClient autoScalingClient = new AmazonAutoScalingClient();
CreateLaunchConfigurationRequest request = new CreateLaunchConfigurationRequest();
request.withImageId("abc-123-ami");
request.withIamInstanceProfile("application-server");
request.withKeyName("default-ssh-key");
request.withLaunchConfigurationName( "myapplication-staging-cron-lc-" + System.currentTimeMillis());
request.withSecurityGroups(new String[] { "sg-123-abc", "sg-123-def" });
request.withInstanceType("t2.small");
request.withAssociatePublicIpAddress(true);
request.setUserData(toBase64("#!/bin/bash -e\n/opt/boot/boot.sh myapplication staging cron build_1232"));
autoScalingClient.createLaunchConfiguration(request);
Auto Scaling Groups
Auto scaling groups define how you want to manage a group of instances
Key elements:
Capacity of your group
Subnet
Load balancer
Healthcheck to use to monitor the instances
Meta data (tags)
Auto scaling policies
Code: Create Autoscaling Group
CreateAutoScalingGroupRequest createGroup = new CreateAutoScalingGroupRequest();
createGroup.setAutoScalingGroupName("myapplication-staging-cron-asg-" + System.currentTimeMillis());
createGroup.setDesiredCapacity(2);
createGroup.setHealthCheckGracePeriod(300); /* how much time before healthchecks have to pass */
createGroup.setHealthCheckType("ELB");
List<String> elbNames = new ArrayList<String>();
elbNames.add("myapplication-staging-elb");
createGroup.setLoadBalancerNames(elbNames);
createGroup.setLaunchConfigurationName(launchConfigurationName);
createGroup.setMaxSize(10);
createGroup.setMinSize(2);
createGroup.setVPCZoneIdentifier("subnet"); /* subnet id for the instances */
createGroup.setDefaultCooldown(300); /* seconds between scaling activities */
autoScalingClient.createAutoScalingGroup(createGroup);
Elastic Load Balancer
Elastic Load Balancer (ELB): a high-availability load balancer which automatically scales up with traffic
Capable of determining the state of an instance with http healthchecks
Deregister and check the status of instances in the ELB using the AWS SDK
Code: Check Instances in ELB
private Integer getReadyCount(List<String> newInstances, String elbName) {
AmazonElasticLoadBalancingClient elbClient = new AmazonElasticLoadBalancingClient();
DescribeInstanceHealthRequest request = new DescribeInstanceHealthRequest(elbName);
DescribeInstanceHealthResult describeInstanceHealth = elbClient.describeInstanceHealth(request);
Integer readyCount = 0;
for(InstanceState state : describeInstanceHealth.getInstanceStates()) {
if(state.getState().equals("InService")) {
if(newInstances.contains(state.getInstanceId())) {
logger.info("{} instance is ready", state.getInstanceId());
readyCount = readyCount +1;
}
}
}
return readyCount;
}
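In practice you call something like the getReadyCount() above in a loop, with a timeout so a bad build fails the deploy instead of hanging forever. A minimal, self-contained sketch (the class name and Supplier-based wiring are my own, not part of the AWS SDK):

```java
import java.util.function.Supplier;

// Hypothetical polling wrapper: keep asking a readiness check (such as a
// method like getReadyCount() against the ELB) until the expected number of
// new instances are InService, or give up after a timeout.
class ReadyPoller {

    static boolean waitUntilReady(Supplier<Integer> readyCount,
                                  int expected,
                                  long timeoutMillis,
                                  long intervalMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (readyCount.get() >= expected) {
                return true; // enough new instances are serving traffic
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return readyCount.get() >= expected; // one last check at the deadline
    }
}
```

Usage would be along the lines of `waitUntilReady(() -> getReadyCount(newInstances, elbName), 2, 10 * 60 * 1000, 15_000)`; if it returns false, abort the deploy and leave the old group in rotation.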
Tip: Name your entities carefully
Use very consistent naming; it can get confusing quickly
Key all entities by:
Environment – prod, stage, dev
Application – sso, datafactory, etc
Role – www, cron, worker, etc
Creation timestamp (for uniqueness)
Example:
prod-sso-www-asg-1411586408
prod-sso-cron-asg-1411586408
prod-sso-cron-lc-1411562401
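The convention above is trivial to centralize so every launch configuration, auto scaling group, and ELB is named the same way. A one-method sketch (names are mine, to illustrate the convention):

```java
// Build entity names as environment-application-role-type-timestamp,
// matching names like prod-sso-www-asg-1411586408 above.
class EntityNames {

    static String name(String env, String app, String role, String type, long epochSeconds) {
        return String.join("-", env, app, role, type, Long.toString(epochSeconds));
    }
}
```

Using one helper everywhere also makes it easy to find "all ASGs for prod/sso/www" later by prefix match, which the cleanup step relies on.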
S3 : Great place to store assets
Scalable data store
Ideal place to hold application artifacts
static assets
S3 : Storing Artifacts (jar/war)
Use S3 to house your jar/war files for deployment. Set up a bucket for the artifacts.
repository.mycompany/myapplication/$build_id/myapplication.war
The $build_id will allow you to have multiple versions active (staging, dev, prod, etc)
If Jenkins is hosted on AWS you can leverage AWS’s command line tools to copy your assets up to a bucket after the build step (lots of other ways to move assets to S3 as well)
aws s3 cp target/universal/myapplication-1.0.zip "s3://<my bucket>/versions/$BUILD_ID/myapplication-1.0.zip" ;
S3: Managing static assets
You can store your static assets (css/js/images) in a public S3 bucket: static.mycompany/myapplication/$build_id/(js|css|img)/…
Serve static assets from your public bucket http://static.mycompany.s3-us-east1.amazonaws.com/
OR front them with a CDN like cloudfront or akamai
Self Initializing (I know I’m repeating)
It’s important to understand that with AWS auto scaling, your instances need to be able to fully and completely initialize themselves in a repeatable and reliable manner.
AWS is going to launch your instance relying on your AMI and launch configuration. Use your launch configuration script to fetch any assets and complete any instance initialization.
Example Deployment Steps
Identify the build_id to deploy (argument to Jenkins, detect latest build on s3, etc)
Using the AWS SDK/API you then
Create new launch configuration (with meta data defining which app/build/role/env it is for)
Create autoscaling group (direct it to use your new launch config)
Wait for the group to be ready (at least 1 member is “InService”)
Poll the ELB till the new instances are ready
Get the list of old instances from the ELB
Clean up
Deregister old instances from the ELB
Give your instance time to cooldown
Find all the old auto scaling groups set their counts to 0 and wait for them to terminate their instances
Remove old autoscaling groups and launch configuration
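The cleanup step needs to distinguish "old groups for this app/env/role" from the group just created. Since the naming convention keys everything by a common prefix plus a timestamp, that selection can be a simple prefix match. A self-contained sketch (the class is hypothetical; the real version would feed these names into the AWS SDK delete calls):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical cleanup helper: given every auto scaling group name in the
// account and the group we just created, return the older groups for the
// same application/environment/role so they can be scaled to 0 and deleted.
class OldGroupFinder {

    static List<String> oldGroups(List<String> allGroups, String newGroup, String prefix) {
        List<String> old = new ArrayList<>();
        for (String g : allGroups) {
            // Same naming prefix, but not the group we just launched.
            if (g.startsWith(prefix) && !g.equals(newGroup)) {
                old.add(g);
            }
        }
        return old;
    }
}
```

This is why consistent naming matters: without the strict prefix convention you cannot safely automate the teardown.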
Orchestrating with Jenkins
Jenkins is a great way to pull it all together
Separate build jobs from deployment jobs:
Build job (one per application)
Checkout application code, package, copy artifact and static assets to S3
Deploy job (one per application/role/environment)
Instruct AWS to launch a new auto scaling group with new application version and tear down old ones
Naming
Suggestion: [application]-[environment]-[role]-[task]
Example:
rapidcloud-build-ci
rapidcloud-prod-worker-deploy
rapidcloud-prod-cron-deploy
rapidcloud-stage-worker-deploy
rapidcloud-stage-cron-deploy
Watch Out
You’ll have new and old instances working side by side for some time. Static assets which have changed and are not delivered via S3 will be delivered inconsistently
You’ll use a lot of instances. Since each deployment spins up an entirely new application layer, you’ll create and destroy more instances
Make sure you have enough allocation to handle the occasional spikes
If you use S3 to store assets/artifacts, you’ll need to clean them out periodically
This takes some time to set up; give yourself a few days to write the scripts/configs to manage the deployment to AWS
Non-AWS Environments
We’ll cover single server and multiple server deployments now
Single Server
A single server anywhere, cloud or physical hardware.
Very low cost server setup (ex: digital ocean 2gb ram $20/month)
Still need to deploy without going offline
No more rebuilding whole instances now….
What does it look like
Start the new application instance on 8081
Once the new application is ready, switch nginx to 8081
After the old application has cooled down, stop the instance on 8080
[Diagram: www (nginx) fronting App:8080 and App:8081; traffic moves from 8080 to 8081 across the three steps]
Environment Configuration
Use a reverse proxy to accept www requests and route them to the application: Nginx, Apache, Lighttpd
Configure the application to run on either of two ports (ex: 8080 and 8081)
With each deployment, alternate the port used for the new application and point the reverse proxy to this new port
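The port alternation itself is a one-liner worth pinning down, since every other step (copy, start, verify, switch) hangs off it. A trivial sketch with a hypothetical class name:

```java
// The two-port flip: whichever port is live behind the reverse proxy,
// the next deployment targets the other one.
class PortToggle {

    static int targetPort(int activePort) {
        return activePort == 8080 ? 8081 : 8080;
    }
}
```

The bash snippet later in the deck does the same thing by grepping nginx.conf for the active port.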
Reverse Proxy Configuration (nginx)
Nginx proxy configuration:
location / {
proxy_buffering off;
proxy_pass http://127.0.0.1:8080;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header Host $http_host;
}
…
When switching ports, deploy an updated configuration (or change the port) and reload:
Ex: sudo /usr/local/bin/nginx -s reload;
Since there is a proxy between the client and the app, we need to forward along the client’s IP address (the X-Real-IP / X-Forwarded-For headers above)
The proxy_pass address is where nginx will send the request; we’ll change this port with each deployment
How To: Tomcat on 2 ports
For Tomcat install it across two instances with a central catalina home:
CATALINA_HOME=/opt/tomcat
Instances
/opt/tomcat/instances/8080
/opt/tomcat/instances/8081
Modify the server.xml in /opt/tomcat/instances/8081/conf to change or disable the default ports (8080, 8443, 8009, 8005)
To startup the 8080 instance:
Execute: /opt/tomcat/bin/catalina.sh run
With environment variables:
CATALINA_HOME=/opt/tomcat
CATALINA_BASE=/opt/tomcat/instances/8080
These environment variables tell Tomcat where to look for the webapp and its binaries
How To: Play on 2 ports
For Play, install the application in one of two locations:
/opt/play/9000/
and
/opt/play/9001/
To startup the 9000 instance:
/opt/play/9000/bin/start -Dhttp.port=9000 -Dpidfile.path=/opt/play/9000/play.pid
Example Deployment Steps
Find the running instance (8080 or 8081) by checking which port the reverse proxy is using (the source of truth)
Copy the application package (war, etc) into the “other” instance: 8081 (if 8080 is running)
Startup the new instance (supervisor is great for this)
Verify the new instance (curl, etc)
Activate the new application by switching nginx to the new port
Wait for old requests to cool down (sleep, etc)
Shutdown old application
Bash Snippet – Check Port
NGINX_PORT=8080;
TARGET_PORT=8081;
echo '[echo]: Checking /etc/nginx/nginx.conf to find the active nginx application port.';
if grep "8081" /etc/nginx/nginx.conf
then
NGINX_PORT=8081;
TARGET_PORT=8080;
fi
Bash Snippet - Verify new application
RETURN_CODE=`curl -m30 -s -o /dev/null -w "%{http_code}" http://localhost:$TARGET_PORT/am/i/healthy`;
if [ "$RETURN_CODE" == '200' ]
then
echo '[echo]: Url check returned 200 status, success.';
else
echo "[echo]: Url check returned $RETURN_CODE status, failure.";
exit 1;
fi
Watch Out
You’ll need enough memory, since two copies of the application will be running
Log files will go to two different locations, and keep swapping with each deployment. Use a log manager to help, or just check both files
Multi-server Setup
It gets a little more interesting with multiple servers
You’ll need either a load balancer or a reverse proxy to distribute traffic to your application instances
In this walkthrough, I’ll assume:
Two or more servers as reverse proxies
Round robin DNS to route requests to the proxies and DNS failover to handle unexpected “bumps”
Diagram – www / app servers separate
[Diagram: DNS round robins across the www proxy servers, which each route to the app servers]
Proxy Configuration (nginx)
http {
upstream application {
server srv1.example.com:8080;
server srv2.example.com:8080;
}
server {
listen 80;
location / {
proxy_pass http://application;
}
}
}
This is how nginx knows where to route traffic; by default it’ll round robin through the list (the behavior is configurable)
Example Deployment Steps
For each application instance (cycle through them):
Take the application offline (yes, offline: signal the app to stop accepting new requests)
Wait for the application to cooldown
Shutdown the application
Copy the new assets
Start the application up again.
Verify the application started up successfully
Taking your application offline
When deploying we want to “take an application offline” so nginx stops sending requests, but allow it to continue processing existing ones
Add an “offline” state to your application: once offline, all new calls return an HTTP status 500
Can be built into a ServletFilter or Play action annotation
Can be triggered via url, file, etc
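A minimal sketch of that offline state, assuming the real version lives in a ServletFilter (or a Play action) that short-circuits new requests while in-flight ones drain. The class and method names are hypothetical:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical offline switch: a filter would consult this per request and
// return the error status for new calls while the app drains existing ones.
class OfflineSwitch {

    private final AtomicBoolean offline = new AtomicBoolean(false);

    void goOffline() { offline.set(true); }   // triggered via url, file, etc.
    void goOnline()  { offline.set(false); }

    // Status a filter would send for a new request: 500 when draining,
    // otherwise pass the request through (modeled here as 200).
    int statusForNewRequest() {
        return offline.get() ? 500 : 200;
    }
}
```

Once nginx sees the error responses it marks the upstream down and routes around it, which is exactly the window in which you swap the assets.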
Watch out
Nginx latency: nginx identifies upstream health as it executes requests, meaning increased latency during deployments
Definitely expect multiple versions live at the same time
Rapidcloud.io
Rapidcloud.io is a cloud management and application deployment service for Java developers.
Join our Beta and get our starter package free for 12 months
Appendix
Supervisord
Tomcat Parallel Deployment
Amazon SDK
Supervisord
Supervisord is a great service for managing instances.
Uses a declarative configuration to define services
Supervisor Example
[program:tomcat_8080]
command = /opt/tomcat/bin/catalina.sh run
user = tomcat
directory = /opt/tomcat/
process_name = tomcat_8080
autorestart = true
startsecs = 30
stopwaitsecs = 10
redirect_stderr = false
priority = 600
startretries = 1
stdout_logfile=/var/log/tomcat/tomcat_8080.out.log
stderr_logfile=/var/log/tomcat/tomcat_8080.err.log
stdout_logfile_maxbytes = 50MB
stdout_logfile_backups = 10
stderr_logfile_maxbytes = 50MB
stderr_logfile_backups = 10
environment=CATALINA_HOME=/opt/tomcat,CATALINA_BASE=/opt/tomcat/instances/8080,@ENVIRONMENT_VARS@
Tomcat Parallel Deployment
Tomcat has a great built-in parallel deployment engine. Once you turn it on, use it by versioning each war:
ROOT##00001.war
ROOT##00002.war
Tomcat will deploy the highest version (string sorted)
Pros
Super easy
No log file confusion
Doesn’t activate if the new war fails
Cons
Not sure when the new war is completely active
Some applications don’t play nice (memory leaks, or port conflicts)
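The "highest version, string sorted" selection is worth seeing concretely, since it's why the version suffixes above are zero-padded. A sketch mirroring that selection (not Tomcat's actual code):

```java
import java.util.Collections;
import java.util.List;

// Mirrors Tomcat parallel deployment's version pick: the war whose
// ##-suffixed name sorts highest (as a plain string) wins.
class WarVersionPicker {

    static String latest(List<String> warNames) {
        return Collections.max(warNames); // natural (lexicographic) string order
    }
}
```

Zero-padding matters because the comparison is lexicographic: "##2" would sort above "##10", while "##00002" sorts below "##00010" as intended.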
Amazon SDK
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk</artifactId>
<version>1.9.4</version>
</dependency>
http://aws.amazon.com/sdk-for-java/