CONFIDENTIAL | FOR INTERNAL USE ONLY
OpenShift at Point72: A view into our journey
About Point72
Who is Point72
Values
Ethics and Integrity: We are professionals who conduct ourselves ethically and with integrity at all times.
Firm First: We operate as one firm, dedicated to succeeding together, with mutual respect and commitment.
Innovation & Excellence: We are not satisfied with the status quo and are committed to pursuing innovation and excellence.
Growth and Development: We work together to advance our professional and personal development.
Community: We are exemplary citizens of the world and contribute to the communities in which we live and work.
Point72 Asset Management, L.P. is a family office investment company. We manage the assets of our founder Steve Cohen and eligible employees. The firm operates with approximately $11B AUM. Our primary office is located in Stamford, CT, with offices in London, Hong Kong, Tokyo, and Singapore. We have a world-class team of approximately 1,000 employees.
Mission: To be the industry's premier asset management firm by delivering superior risk-adjusted returns, adhering to the highest ethical standards, and offering the greatest opportunities to the industry's brightest talent.
Who we are
Billy Shaw (@360linux), Director of Systems Engineering at Point72 Asset Management, has worked with Linux and Unix for the last 24 years. Since 2004 he has been the primary person responsible for Linux services at Point72, growing it from a handful of servers to the majority server operating system at the firm. Prior to Point72, Billy worked for JPMorgan Chase, Travelocity, Organic Inc, and was a Cryptologic Technician in the United States Navy.
Dan Foley (@djfoley0), Systems Engineer at Agio, has worked with Linux and Unix for the past 7 years. In 2013 he joined Agio's Linux support and implementation team, working on projects to deploy new software and support client environments. Since March 2016 he has been a dedicated resource for the Point72 Linux Team, working on several projects including the deployment and development of the OpenShift Enterprise environment.
How we got started with OpenShift
As we started working on the next-generation platform for our trade processing
applications, we did a lot of planning. As a team we came up with principles to guide us
on our journey to a microservices architecture.
• Open Source
• Cloud First
• Reactive
• CI/CD pipelines
• Elastic Scale
• Resilient
• Evolvable
• Secure
• Everything is streaming
• Everything is distributed
• Big things from small things
• Test Driven Development
• Move code not data
• Anything can fail
• Documentation
Principles for development
Open Source: As a team we promote the use of open source software tools and technologies. It is important to us that we contribute back to the open source projects we use.
Cloud First: Always deliver new services and workloads using a cloud platform for IaaS.
Reactive: Reactive systems rely on asynchronous message-passing to establish a boundary between components that ensures loose coupling, isolation, and location transparency.
CI/CD: Development processes are automated wherever possible.
Principles for development (continued)
Elastic Scale: Systems respond to workload requirements, expanding and contracting based on business needs.
Resilient: Systems stay responsive, meeting SLAs under varying workloads.
Evolvable: Applications and systems are decoupled, and we are not locked to specific providers at any layer in the stack.
Secure: Security is not optional and is applied appropriately to every layer of the CI/CD pipelines.
Principles for development (continued)
Everything is streaming: The services to the business cannot pause or stop. It is far easier to treat a batch as a stream process than to treat a stream as a batch process.
Everything is distributed: Presume code will run on multiple pods simultaneously across multiple servers.
Big things from small things: Focus on the details, delivering components in small deliverable cycles following agile principles.
Test Driven Development: Follow test-driven development principles to maintain quality by having full test coverage of all modules.
Principles for development (continued)
Move code not data: Code is usually smaller than data. With a distributed, containerized infrastructure we are able to ensure the code is moved, not the data.
Anything can fail: Hardware will fail and software will crash. We embrace that in our design principles; with distributed, orchestrated containers and streaming it becomes easier to achieve.
Documentation: Everything we do is clearly written up as comments in code, README markdown in git, and wiki pages with appropriate supporting details.
Monitoring: All vital functions are monitored, managed, and logged in a consistent manner across the entire platform.
OpenShift adoption timeline (June 2016 to March 2017)
• PaaS POCs start
• Decide to use Origin
• OpenShift Enterprise purchased
• OCP installed with Red Hat consulting
• POC for microservices starts
• Sprints start for MVP
• Jenkins deployed
• CloudForms deployed
• EFK stack deployed
• MVP complete
OpenShift deployment strategy
Early on we knew that we wanted to empower our developers to have as much control as possible over their deployments into OpenShift. To facilitate that, we put together a technique using our existing resources in a new way.
All git repositories contain an OpenShift template in JSON format maintained as part of the application repository. Each template has a corresponding answer file with definitions of values which can be overridden whenever a developer needs to (any valid OpenShift configuration can be specified, and a build will fail if it is incorrect).
At build time the code is compiled as needed and merged with our docker images from our docker registry; a new image is created containing the application and artifacts, deployed via an API call into OpenShift.
As we focused on OpenShift we started with a strong deployment strategy from day one.
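A minimal sketch of that flow with standard oc commands, assuming illustrative file paths (newer oc clients accept --param-file; older 3.x clients take repeated -v key=value flags instead):

```shell
# render the repo's OpenShift template with the answer file, then apply the
# resulting objects through the API
oc process -f openshift/template.json --param-file=openshift/answers.env \
  | oc apply -f -
```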
Reference architecture
• All connectivity is done with dedicated encrypted circuits.
• Using node selectors during a deployment, applications are deployed only to the nodes approved for that type of work.
• A multi-master design provides HA for scheduling and API calls.
• Router sharding is used to provide isolated environments for Sandbox, DEV, QA, and UAT.
• Deployments can also guarantee compute resources for applications with well-defined SLAs.
Monitoring and Reporting
Tools for understanding the state of OpenShift
• Cloudforms
• EFK
• Prometheus
• Grafana
CloudForms
Requirements:
• Chargeback or showback reports
• Support for multiple cloud providers
• Provisioning capabilities
• Consolidated view of cloud resources
• Consolidated view of OpenShift resources
Implementation: CloudForms is installed and running inside OpenShift in its own namespace. We have had success using it with both Origin and OCP. We have been able to use it to gain insight into resources across our cloud providers and give an "elevator pitch" view of resources and cost.
Example of our implementation
EFK
Requirements:
• UI for generating dashboards
• Easy expansion
• Fast queries
• Does not run alongside the applications being monitored
• Access control
• Can accept API queries
• Data can come from sources other than OpenShift
Implementation: We set up EFK outside of the cluster, using a fluentd daemonset inside OpenShift to get messages out. This allows us to keep up with the frequent Elastic.co release cycle and install any plugins we require for any component of the stack.
Example of our implementation
Prometheus
Requirements:
• Open Source
• Pull metrics from nodes, containers, and applications
• Create custom metrics for specific applications
• Export data
• Provide a live data feed for external applications via a RESTful API
• Pull metrics from Jenkins
Implementation: Prometheus is deployed in a custom container, which gives us more control over the application in our environment. Currently it is deployed within OpenShift, pulling metrics from Hawkular. Our microservices in OpenShift also provide data for Prometheus. We do not expose Prometheus data to users; instead we expose the data via Grafana.
Implementation example
Grafana
Requirements:
• Open Source
• Handle multiple data feeds
• Active Directory integration
• User access control
• Export graphs and data
• Ability to create custom views and dashboards
Implementation: Grafana is deployed in a container we built. To reach the data feeds while keeping Prometheus inaccessible from outside, we removed the Prometheus route and use the internal DNS name: prometheus.devops.svc.cluster.local. In Grafana we use multiple data feeds and are able to control user access, create custom dashboards and queries, and export data for analysis.
Example of our implementation
Troubleshooting
Techniques and tools we use for addressing issues
• Cluster
• Routes
• Services
• Pods
• Docker
• Networking
• Persistent Storage
• Backups
Cluster
Scenario
We use node selectors as part of our deployments, and we have had to change cloud instance types to match the business workloads running on OpenShift. We found there were times when node labels were no longer applied after the instance type was changed.
How we handled it
After some investigation we found all we had to do
was add the label back to the node after the instance
type change.
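Re-applying the label is a one-liner; a sketch with an illustrative node name and label key:

```shell
# put the label the deployments' node selectors expect back on the node
oc label node node2.example.com computenode=true --overwrite
oc get nodes --show-labels    # confirm the selector target is back
```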
Cluster recovery
Scenario
In order to reduce cost we turn off the POC, and now also the DEV, environment when not in use.
This also allowed us to evaluate, over a long period of time, how OpenShift behaves when servers "just go away" and "come back".
How we handled it
The servers were stopped and started using cloud provider APIs.
Over the course of 9 months we saw only a small number of issues, each of which was straightforward to resolve.
Cluster node reports ‘NotReady’
Scenario
At some point you will encounter a node which
reports a status of NotReady.
How we handled it
Start with 'oc describe node <nodename>'. It provides details about the node including pods, their status, allocated resources, and events. We review these and check the node's health if anything looks amiss.
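A sketch of those triage steps, with an illustrative node name (on OCP 3.x hosts the node service is atomic-openshift-node):

```shell
oc get nodes
oc describe node node3.example.com
# then, on the node itself, check the node service and docker:
systemctl status atomic-openshift-node docker
journalctl -u atomic-openshift-node --since today
```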
Routes
Scenario
When adding or removing routers we would see
inconsistent results when requests would come in for
routes.
How we handled it
This was an easy one to fix but took a little while to diagnose; we spent a long time looking for the issue inside OpenShift.
Eventually we realized that the subdomain entries for the OpenShift environment on our DNS servers had not been updated.
Routes and DNS
Scenario
It is important to know how SkyDNS resolves names in order to keep specific connections within OpenShift without having to use pod IP addresses, which can and will change over time.
How we handled it
SkyDNS internal name formats:
Default: <pod_namespace>.cluster.local
Services: <service>.<project / namespace>.svc.cluster.local
Endpoints: <name>.<namespace>.endpoints.cluster.local
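A quick way to confirm one of these names resolves from inside a pod, with illustrative pod and service names:

```shell
# resolve a service name from within a running pod; no extra tools needed
# beyond getent, which most base images carry
oc rsh mypod-1-abcde getent hosts prometheus.devops.svc.cluster.local
```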
Services
Scenario
We found that applications expose ports other than http or https, and developers require access to the applications on those ports.
Examples include crate.io, ZooKeeper, Kafka, and various custom applications.
How we handled it
In order to resolve this issue we took advantage of node ports.
Node ports operate by reserving a port on each instance. The port range available is 30000 to 32000.
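A minimal sketch of a NodePort service for one of those workloads, assuming an illustrative Kafka selector and port (the nodePort value must fall inside the configured range):

```shell
oc create -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: kafka-external    # illustrative name
spec:
  type: NodePort
  selector:
    app: kafka            # illustrative selector
  ports:
  - port: 9092
    targetPort: 9092
    nodePort: 30092       # must be within the configured node port range
EOF
```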
Pods
Scenario
We wanted to ensure anti-affinity for pods so they would spread evenly across nodes. Prior to 3.4, node anti-affinity was an issue, so we created a workaround using multiple labels for groups of servers.
How we handled it
Example: say we have 3 pods we want to run on high-compute nodes, but they must be evenly distributed. To do this you can have 3 or 6 nodes, split into subgroups. Give each node a group label, computenodeg1 / g2 / g3, and tag one or more compute nodes in the different groups, while also keeping a main label of "computenode".
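A sketch of the labeling, with illustrative node names (the group labels follow the computenodeg1/g2/g3 scheme above):

```shell
oc label node node1.example.com computenode=true computenodeg1=true
oc label node node2.example.com computenode=true computenodeg2=true
oc label node node3.example.com computenode=true computenodeg3=true
# each of the 3 deployments then sets a nodeSelector on its own group label,
# e.g. computenodeg1=true, so the pods land on different nodes
```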
Networking
Scenario
We often need to troubleshoot network connectivity at many different layers, but do not want to install tools like ping, netcat, nmap, curl, wget, etc. everywhere (especially in the containers which run our pods).
How we handled it
Using bash's built-in /dev/tcp pseudo-device we can open a TCP connection on any port we need for testing (the same works for /dev/udp).
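A sketch of a reusable check built on that, wrapped in a timeout so filtered ports fail fast instead of hanging (host and port below are illustrative):

```shell
# open (and immediately discard) a TCP connection using bash's /dev/tcp;
# "timeout" keeps us from hanging on filtered ports
check_tcp() {
    host=$1
    port=$2
    if timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
        echo "open"
    else
        echo "closed"
    fi
}

check_tcp 127.0.0.1 1    # port 1 is almost certainly closed
```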
Docker registry
Scenario
While it is great that OpenShift can act as our primary docker registry, we wanted to be able to leverage our images across multiple OpenShift installations and standalone docker daemons, and to ensure we had full control over the images coming into the firm.
How we handled it
Using the docker registry in Artifactory we ensured
that all our /etc/sysconfig/docker files were set up to
use our registry as the primary source for images.
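A sketch of the corresponding /etc/sysconfig/docker entries on RHEL's docker build (the registry hostname is illustrative):

```shell
# prefer the internal Artifactory registry and block direct docker.io pulls
ADD_REGISTRY='--add-registry artifactory.example.com:5000'
BLOCK_REGISTRY='--block-registry docker.io'
```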
Docker networking
Scenario
From vanilla docker on RHEL 7 or CentOS 7 up through Origin and OCP, the IP range used by the default docker configuration conflicted with network ranges used at Point72.
How we handled it
Since we currently use 172.17.0.0/16 for production network segments, we had to make sure that all docker configurations were set up to use something else. We opted to ensure that all docker settings (/etc/sysconfig/docker) use a default network in the 192.168.0.0/16 CIDR block.
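A sketch of the /etc/sysconfig/docker option that moves the docker bridge off 172.17.0.0/16 (the exact subnet below is illustrative):

```shell
# --bip sets the docker0 bridge address/CIDR used for container IPs
OPTIONS='--selinux-enabled --bip=192.168.5.1/24'
```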
Docker
Scenario
We found that over time the docker daemon would hold onto images which no longer had a repository associated with them. Over time this would fill the volume allocated to docker, preventing deployments from running correctly.
How we handled it
We started to regularly go through and clean up the
docker images which were no longer used.
If for some reason an image was removed by
mistake on the next deployment it would be pulled
from our internal docker registry.
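The cleanup can be scripted as a one-liner over dangling images; a sketch:

```shell
# remove images with no repository/tag; anything removed by mistake is
# simply re-pulled from the internal registry on the next deployment
docker rmi $(docker images --filter dangling=true -q) 2>/dev/null || true
```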
Persistent Storage
Scenario
Running out of space on your EBS persistent
storage. It needs to be expanded.
How we handled it
Expanding an EBS volume can be done by creating a
snapshot, then creating a new larger volume from the
snapshot.
The next step is updating the PV object in OpenShift with the new volume ID and size. Finally, the tricky part: start the pod so the volume mounts to a node, then log into that node and run xfs_growfs (or the appropriate command for your filesystem).
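A sketch of the snapshot-and-replace flow with the AWS CLI (IDs, size, and zone are illustrative):

```shell
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
    --description "pre-expansion snapshot"
aws ec2 create-volume --snapshot-id snap-0123456789abcdef0 \
    --size 100 --availability-zone us-east-1a --volume-type gp2
# update the PV object with the new volume ID and size, start the pod so the
# volume mounts, then on that node:
xfs_growfs /path/where/the/volume/mounted   # or your filesystem's equivalent
```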
Persistent Storage
Scenario
Prior to v3.4, using the AWS API for storage management was in technical preview. We found issues with persistent volumes detaching and attaching correctly when pods came up on different nodes.
How we handled it
To address the issue we created our own API to interface between OpenShift and AWS. The API requires the volume size and name, along with a valid authorization token from one of the masters, as input. It creates the EBS volume in AWS in the correct zone and returns the volume ID; with the volume ID, the API then creates the Persistent Volume object in OpenShift using the name, size, and returned volume ID.
Backups
Scenario
Our ability to recover from failure was an important requirement.
Backups had to occur daily and be easy to use for restores.
How we handled it
To address this we looked at each part of the cluster
and came up with techniques to store the
configuration on a daily basis.
Our backup system would pick up the files during the
scheduled backups on each master and node.
tar cf ${BACKUPDIR}/certs-and-keys-$(hostname).tar *.key *.crt

etcdctl backup --data-dir $ETCD_DATA_DIR --backup-dir ${BACKUPDIR}/etcd-$(hostname).bak

oc login -u <user> -p <passwd>
oc get projects -o name
oc export dc <project> --as-template=projectBackup -o json > yourProjectTemplate.json
oc get rolebindings --export=true --as-template=roleBindingsBackup -o json > yourRolebindingsTemplate.json
oc get serviceaccount --as-template=serviceaccountBackup -o json > yourServiceaccountTemplate.json
oc get secrets --as-template=secretsBackup -o json > yourSecretsTemplate.json
oc get pvc --as-template=pvcBackup -o json > yourPVCsTemplate.json
OpenShift and Docker best practices
Guidelines and practices we adhere to
Our docker best practices
Reuse images: New images should be based off an existing image. Use the FROM statement in your Dockerfile; it ensures that updates to the upstream image are available in your new image.
Maintain compatibility within tags: If you tag your image as p72image:v1, then stick with that for updates to the image. If an update is no longer compatible with p72image:v1, move to v2.
Limit services running in containers: Keep your containers as simple as possible. Include only what is necessary for the container to do the work it needs to do. THINK LIGHTWEIGHT.
Use exec in wrapper scripts: Always use exec. Keep in mind that a Docker container's main process runs as PID 1, so when the exec'd process dies, the container (and pod) dies with it.
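A minimal sketch of such a wrapper, written to /tmp here only so it can be exercised standalone (the setup step is illustrative):

```shell
# entrypoint.sh: do setup, then exec the real command so it replaces the
# shell and runs as the container's PID 1, receiving signals directly
cat > /tmp/entrypoint.sh <<'EOF'
#!/bin/bash
set -e
export APP_HOME=/opt/myapp    # illustrative setup step
exec "$@"
EOF
chmod +x /tmp/entrypoint.sh

RESULT=$(/tmp/entrypoint.sh echo "running as PID 1")
echo "$RESULT"
```

In a Dockerfile this becomes ENTRYPOINT ["/entrypoint.sh"] with the real command supplied via CMD.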
Our docker best practices
Remove temporary files: Always remove temporary files to keep bloat out of your image. For example, if you install RPMs via yum, put the yum commands plus a 'yum clean all' on the same line (otherwise each command becomes its own layer).
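A sketch of that single-layer pattern, with an illustrative base image and package names:

```dockerfile
FROM registry.access.redhat.com/rhel7
# install and clean in one RUN so the yum cache never lands in a layer
RUN yum install -y httpd mod_ssl && \
    yum clean all && \
    rm -rf /var/cache/yum
```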
Order your docker instructions properly: If your Dockerfile starts getting complex, think it through. Docker processes instructions from top to bottom, so put the steps least likely to change at the top. Through layer caching, subsequent builds will be faster.
Always expose important ports: Expose only what is needed. Pay attention to the software you run; if a port is not needed by your application, it is not important, so don't expose it.
Set environment variables: Using the ENV instruction, set your environment variables. Always include the version of your project, making it easy for others to know which version of your code is running.
OpenShift best practices
Be ready for any user id: Containers in pods will run with an arbitrary user id; it is a small measure of security. If you need known permissions, use a default group of root:
RUN chgrp -R 0 /some/directory
RUN chmod -R g+rw /some/directory
RUN find /some/directory -type d -exec chmod g+x {} +
Use services: Communication between pods must use services. The service provides a static endpoint for access and will not change as pods come up and down.
More environment variables: Include environment variables in your OpenShift deployments. Control groups can be used too; for example, the Java HEAP size can be derived dynamically from the cgroup memory values.
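One way to sketch that derivation in an entrypoint script, assuming the cgroup v1 limit path and a 50% heap ratio (both illustrative):

```shell
# derive the JVM max heap from the container's cgroup memory limit,
# falling back to a default when no limit file is present
LIMIT_FILE=/sys/fs/cgroup/memory/memory.limit_in_bytes
DEFAULT_BYTES=$((512 * 1024 * 1024))
MEM_BYTES=$(cat "$LIMIT_FILE" 2>/dev/null || echo "$DEFAULT_BYTES")
HEAP_MB=$(( MEM_BYTES / 1024 / 1024 / 2 ))   # give the JVM half the limit
JAVA_OPTS="-Xmx${HEAP_MB}m"
echo "$JAVA_OPTS"
# a real entrypoint would then: exec java $JAVA_OPTS -jar /app/app.jar
```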
Use image metadata: Metadata will assist everyone using the deployments and images you create. The spirit is to ensure enough information is available for others down the road to know what was intended.
OpenShift best practices
Clustering: There will be applications which need to be clustered (think ZooKeeper). Pod IPs will change over time, and the underlying clustered application needs to be able to handle this in its election process.
Logging: Send all logging to STDOUT. This data is collected by OpenShift and sent via a fluentd forwarder to the EFK stack.
Liveness and readiness probes: A liveness probe is a simple way to check whether your container is still running and restart it based on policy. A readiness probe checks whether a pod is ready to service requests; if it fails, the endpoint controller removes the pod from the service.
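Probes can be attached to an existing deployment with 'oc set probe'; a sketch with illustrative names, port, and endpoints:

```shell
oc set probe dc/myapp --liveness \
    --get-url=http://:8080/healthz --initial-delay-seconds=30
oc set probe dc/myapp --readiness --get-url=http://:8080/ready
```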
Templates: Always think of deployments as templates. This is how all our deployments are done, from an application's git repository.
Thank you! Q&A time.